Steps for uploading a file with Python requests


Posted in Python on September 15, 2020

Official documentation: https://2.python-requests.org//en/master/

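At work I needed to upload an attachment to an API with the following parameters:

Submit the attachment via HTTP POST in multipart/form-data format, url: http://test.com/flow/upload

Field list:
md5:      // MD5 hash of (random value_current timestamp)
filesize: // file size
file:     // file content (must include the file name)
Return value: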
{"success":true,"uploadName":"tmp.xml","uploadPath":"uploads\/201311\/758e875fb7c7a508feef6b5036119b9f"}
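
As a minimal sketch of how the two non-file fields in the list above could be computed, the helper below builds md5 and filesize with the standard library only. The function name build_upload_fields, its now and nonce parameters, and the range of the random value are assumptions for illustration; the interface description does not define how the random value is chosen.

```python
import hashlib
import os
import random
import time


def build_upload_fields(path, now=None, nonce=None):
    """Return the md5 and filesize form fields for the file at `path`."""
    # Assumption: any integer is acceptable as the "random value".
    nonce = random.randint(0, 999999) if nonce is None else nonce
    now = int(time.time()) if now is None else int(now)
    token = '%d_%d' % (nonce, now)  # "<random value>_<timestamp>"
    return {
        # hashlib.md5 needs bytes, so the token is encoded first.
        'md5': hashlib.md5(token.encode('utf-8')).hexdigest(),
        # Size on disk in bytes, without reading the file into memory.
        'filesize': os.path.getsize(path),
    }
```

Passing now and nonce explicitly makes the result reproducible, which is convenient when checking the value against what the server expects.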

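Since I mainly use Python at work and the project already uses the requests library, I planned to implement this with requests. I expected it to be a trivial feature, but it ended up taking a lot of time: the official requests example only covers uploading a file and does not pass any extra parameters: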

https://2.python-requests.org//en/master/user/quickstart/#post-a-multipart-encoded-file

>>> url = 'https://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}

>>> r = requests.post(url, files=files)
>>> r.text
{
 ...
 "files": {
  "file": "<censored...binary...data>"
 },
 ...
}

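However, when extra parameters must be sent along with the file, two arguments of requests come into play: data and files. Pass the file to upload in files and the other parameters in data; the requests library merges the two into a single multipart body and sends it to the server.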
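
One way to check this merging without contacting any server is to build a Request and prepare it locally, then look at the resulting headers and body. This is only a sketch: the field values 'abc123', 5 and the in-memory file content b'hello' are made-up placeholders.

```python
import requests

# Build the request but do not send it; prepare() only encodes it.
req = requests.Request(
    'POST',
    'http://test.com/flow/upload',
    data={'md5': 'abc123', 'filesize': 5},   # placeholder form fields
    files={'file': ('tmp.xml', b'hello')},   # (filename, content) tuple
)
prepared = req.prepare()

content_type = prepared.headers['Content-Type']
body = prepared.body  # bytes containing every form field and the file part

print(content_type)
print(body.decode('utf-8'))
```

The printed body shows one part per entry in data followed by the file part, all separated by the boundary named in the Content-Type header.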

The final implementation looks like this:

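import hashlib
import time

import requests

# file_name and request_url are defined elsewhere; MyLogger is the project's own logging helper.
with open(file_name, 'rb') as f:
    content = f.read()  # read as bytes so len() is the real file size

request_data = {
    # MD5 of "<random value>_<current timestamp>"; the random value is fixed to 0 here
    'md5': hashlib.md5(('%d_%d' % (0, int(time.time()))).encode('utf-8')).hexdigest(),
    'filesize': len(content),
}
files = {'file': (file_name, content)}
MyLogger().getlogger().info('url:%s' % request_url)
resp = requests.post(request_url, data=request_data, files=files)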

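The final code may look simple, but it took me quite some effort to confirm that this is correct, including reading the requests source. The walk through the source is recorded below: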

First, find the implementation of the post method, in requests/api.py:

def post(url, data=None, json=None, **kwargs):
  r"""Sends a POST request.

  :param url: URL for the new :class:`Request` object.
  :param data: (optional) Dictionary, list of tuples, bytes, or file-like
    object to send in the body of the :class:`Request`.
  :param json: (optional) json data to send in the body of the :class:`Request`.
  :param \*\*kwargs: Optional arguments that ``request`` takes.
  :return: :class:`Response <Response>` object
  :rtype: requests.Response
  """

  return request('post', url, data=data, json=json, **kwargs)

It calls the request method, so let's follow that, also in requests/api.py:

def request(method, url, **kwargs):
  """Constructs and sends a :class:`Request <Request>`.

  :param method: method for the new :class:`Request` object: ``GET``, ``OPTIONS``, ``HEAD``, ``POST``, ``PUT``, ``PATCH``, or ``DELETE``.
  :param url: URL for the new :class:`Request` object.
  :param params: (optional) Dictionary, list of tuples or bytes to send
    in the query string for the :class:`Request`.
  :param data: (optional) Dictionary, list of tuples, bytes, or file-like
    object to send in the body of the :class:`Request`.
  :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
  :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
  :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
  :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
    ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
    or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
    defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
    to add for the file.
  :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
  :param timeout: (optional) How many seconds to wait for the server to send data
    before giving up, as a float, or a :ref:`(connect timeout, read
    timeout) <timeouts>` tuple.
  :type timeout: float or tuple
  :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
  :type allow_redirects: bool
  :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
  :param verify: (optional) Either a boolean, in which case it controls whether we verify
      the server's TLS certificate, or a string, in which case it must be a path
      to a CA bundle to use. Defaults to ``True``.
  :param stream: (optional) if ``False``, the response content will be immediately downloaded.
  :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
  :return: :class:`Response <Response>` object
  :rtype: requests.Response

  Usage::

   >>> import requests
   >>> req = requests.request('GET', 'https://httpbin.org/get')
   <Response [200]>
  """

  # By using the 'with' statement we are sure the session is closed, thus we
  # avoid leaving sockets open which can trigger a ResourceWarning in some
  # cases, and look like a memory leak in others.
  with sessions.Session() as session:
    return session.request(method=method, url=url, **kwargs)

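This method has a long docstring. The docstring already shows that the files parameter is used to send files, but it still does not say what to do when parameters and a file must be sent together. Follow session.request, in requests/sessions.py: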

def request(self, method, url,
      params=None, data=None, headers=None, cookies=None, files=None,
      auth=None, timeout=None, allow_redirects=True, proxies=None,
      hooks=None, stream=None, verify=None, cert=None, json=None):
    """Constructs a :class:`Request <Request>`, prepares it and sends it.
    Returns :class:`Response <Response>` object.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query
      string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
      object to send in the body of the :class:`Request`.
    :param json: (optional) json to send in the body of the
      :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the
      :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the
      :class:`Request`.
    :param files: (optional) Dictionary of ``'filename': file-like-objects``
      for multipart encoding upload.
    :param auth: (optional) Auth tuple or callable to enable
      Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send
      data before giving up, as a float, or a :ref:`(connect timeout,
      read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Set to True by default.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol or protocol and
      hostname to the URL of the proxy.
    :param stream: (optional) whether to immediately download the response
      content. Defaults to ``False``.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
      the server's TLS certificate, or a string, in which case it must be a path
      to a CA bundle to use. Defaults to ``True``.
    :param cert: (optional) if String, path to ssl client cert file (.pem).
      If Tuple, ('cert', 'key') pair.
    :rtype: requests.Response
    """
    # Create the Request.
    req = Request(
      method=method.upper(),
      url=url,
      headers=headers,
      files=files,
      data=data or {},
      json=json,
      params=params or {},
      auth=auth,
      cookies=cookies,
      hooks=hooks,
    )
    prep = self.prepare_request(req)

    proxies = proxies or {}

    settings = self.merge_environment_settings(
      prep.url, proxies, stream, verify, cert
    )

    # Send the request.
    send_kwargs = {
      'timeout': timeout,
      'allow_redirects': allow_redirects,
    }
    send_kwargs.update(settings)
    resp = self.send(prep, **send_kwargs)

    return resp

Skimming this method: it first prepares the request, and the last step calls send, which presumably sends the request. So we need to follow prepare_request, in requests/sessions.py:

def prepare_request(self, request):
    """Constructs a :class:`PreparedRequest <PreparedRequest>` for
    transmission and returns it. The :class:`PreparedRequest` has settings
    merged from the :class:`Request <Request>` instance and those of the
    :class:`Session`.

    :param request: :class:`Request` instance to prepare with this
      session's settings.
    :rtype: requests.PreparedRequest
    """
    cookies = request.cookies or {}

    # Bootstrap CookieJar.
    if not isinstance(cookies, cookielib.CookieJar):
      cookies = cookiejar_from_dict(cookies)

    # Merge with session cookies
    merged_cookies = merge_cookies(
      merge_cookies(RequestsCookieJar(), self.cookies), cookies)

    # Set environment's basic authentication if not explicitly set.
    auth = request.auth
    if self.trust_env and not auth and not self.auth:
      auth = get_netrc_auth(request.url)

    p = PreparedRequest()
    p.prepare(
      method=request.method.upper(),
      url=request.url,
      files=request.files,
      data=request.data,
      json=request.json,
      headers=merge_setting(request.headers, self.headers, dict_class=CaseInsensitiveDict),
      params=merge_setting(request.params, self.params),
      auth=merge_setting(auth, self.auth),
      cookies=merged_cookies,
      hooks=merge_hooks(request.hooks, self.hooks),
    )
    return p
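
The same two steps (build a Request, then let the session prepare it) can be reproduced by hand, which makes it easy to inspect what would be sent. The sketch below stops before send, so no network access is needed; the field values are placeholders.

```python
import requests

session = requests.Session()

# Step 1: describe the request, as session.request does internally.
req = requests.Request(
    'post',
    'http://test.com/flow/upload',
    data={'filesize': 5},                    # placeholder form field
    files={'file': ('tmp.xml', b'hello')},   # placeholder file content
)

# Step 2: merge session settings (headers, cookies, auth) and encode the body.
prepared = session.prepare_request(req)

print(prepared.method)                    # upper-cased by prepare_request
print(prepared.headers['Content-Type'])   # set while the body was encoded
print(prepared.headers['User-Agent'])     # merged in from the session defaults

# Step 3 (not executed here): resp = session.send(prepared, timeout=10)
```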

prepare_request creates a PreparedRequest object and calls its prepare method. Follow prepare, in requests/models.py:

def prepare(self,
      method=None, url=None, headers=None, files=None, data=None,
      params=None, auth=None, cookies=None, hooks=None, json=None):
    """Prepares the entire request with the given parameters."""

    self.prepare_method(method)
    self.prepare_url(url, params)
    self.prepare_headers(headers)
    self.prepare_cookies(cookies)
    self.prepare_body(data, files, json)
    self.prepare_auth(auth, url)

    # Note that prepare_auth must be last to enable authentication schemes
    # such as OAuth to work on a fully prepared request.

    # This MUST go after prepare_auth. Authenticators could add a hook
    self.prepare_hooks(hooks)

Many prepare_xx methods are called here; we only care about the one that handles data, files, and json. Follow prepare_body, in requests/models.py:

def prepare_body(self, data, files, json=None):
    """Prepares the given HTTP body data."""

    # Check if file, fo, generator, iterator.
    # If not, run through normal process.

    # Nottin' on you.
    body = None
    content_type = None

    if not data and json is not None:
      # urllib3 requires a bytes-like body. Python 2's json.dumps
      # provides this natively, but Python 3 gives a Unicode string.
      content_type = 'application/json'
      body = complexjson.dumps(json)
      if not isinstance(body, bytes):
        body = body.encode('utf-8')

    is_stream = all([
      hasattr(data, '__iter__'),
      not isinstance(data, (basestring, list, tuple, Mapping))
    ])

    try:
      length = super_len(data)
    except (TypeError, AttributeError, UnsupportedOperation):
      length = None

    if is_stream:
      body = data

      if getattr(body, 'tell', None) is not None:
        # Record the current file position before reading.
        # This will allow us to rewind a file in the event
        # of a redirect.
        try:
          self._body_position = body.tell()
        except (IOError, OSError):
          # This differentiates from None, allowing us to catch
          # a failed `tell()` later when trying to rewind the body
          self._body_position = object()

      if files:
        raise NotImplementedError('Streamed bodies and files are mutually exclusive.')

      if length:
        self.headers['Content-Length'] = builtin_str(length)
      else:
        self.headers['Transfer-Encoding'] = 'chunked'
    else:
      # Multi-part file uploads.
      if files:
        (body, content_type) = self._encode_files(files, data)
      else:
        if data:
          body = self._encode_params(data)
          if isinstance(data, basestring) or hasattr(data, 'read'):
            content_type = None
          else:
            content_type = 'application/x-www-form-urlencoded'

      self.prepare_content_length(body)

      # Add content-type if it wasn't explicitly provided.
      if content_type and ('content-type' not in self.headers):
        self.headers['Content-Type'] = content_type

    self.body = body
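
The branches of prepare_body can be observed from the outside by preparing a few requests and comparing the Content-Type header each one ends up with. This is a small sketch; the helper name content_type_for and the field values are made up for illustration.

```python
import requests

URL = 'http://test.com/flow/upload'


def content_type_for(**kwargs):
    """Prepare a POST request locally and return its Content-Type header."""
    prepared = requests.Request('POST', URL, **kwargs).prepare()
    return prepared.headers.get('Content-Type')


# data only: encoded by _encode_params as a URL-encoded form
form_only = content_type_for(data={'filesize': 5})

# json only: serialized with json.dumps
json_only = content_type_for(json={'filesize': 5})

# data + files: handed to _encode_files, which builds a multipart body
with_files = content_type_for(
    data={'filesize': 5},
    files={'file': ('tmp.xml', b'hello')},
)

print(form_only)
print(json_only)
print(with_files)
```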

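This function is fairly long. The part to focus on is the `if files:` branch under the "Multi-part file uploads" comment, which calls the _encode_files method. Let's follow that method: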

def _encode_files(files, data):
    """Build the body for a multipart/form-data request.

    Will successfully encode files when passed as a dict or a list of
    tuples. Order is retained if data is a list of tuples but arbitrary
    if parameters are supplied as a dict.
    The tuples may be 2-tuples (filename, fileobj), 3-tuples (filename, fileobj, contentype)
    or 4-tuples (filename, fileobj, contentype, custom_headers).
    """
    if (not files):
      raise ValueError("Files must be provided.")
    elif isinstance(data, basestring):
      raise ValueError("Data must not be a string.")

    new_fields = []
    fields = to_key_val_list(data or {})
    files = to_key_val_list(files or {})

    for field, val in fields:
      if isinstance(val, basestring) or not hasattr(val, '__iter__'):
        val = [val]
      for v in val:
        if v is not None:
          # Don't call str() on bytestrings: in Py3 it all goes wrong.
          if not isinstance(v, bytes):
            v = str(v)

          new_fields.append(
            (field.decode('utf-8') if isinstance(field, bytes) else field,
             v.encode('utf-8') if isinstance(v, str) else v))

    for (k, v) in files:
      # support for explicit filename
      ft = None
      fh = None
      if isinstance(v, (tuple, list)):
        if len(v) == 2:
          fn, fp = v
        elif len(v) == 3:
          fn, fp, ft = v
        else:
          fn, fp, ft, fh = v
      else:
        fn = guess_filename(v) or k
        fp = v

      if isinstance(fp, (str, bytes, bytearray)):
        fdata = fp
      elif hasattr(fp, 'read'):
        fdata = fp.read()
      elif fp is None:
        continue
      else:
        fdata = fp

      rf = RequestField(name=k, data=fdata, filename=fn, headers=fh)
      rf.make_multipart(content_type=ft)
      new_fields.append(rf)

    body, content_type = encode_multipart_formdata(new_fields)

    return body, content_type
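
To see what the last few lines of _encode_files do, the same urllib3 helpers can be called directly. The sketch below mirrors the idea with placeholder values; the fixed boundary 'demo-boundary' and the 'text/xml' content type are assumptions chosen only to make the output easy to read.

```python
from urllib3.fields import RequestField
from urllib3.filepost import encode_multipart_formdata

# Plain form fields, already converted to bytes as _encode_files does.
fields = [('md5', b'abc123'), ('filesize', b'5')]

# The file becomes a RequestField carrying its own filename and content type.
rf = RequestField(name='file', data=b'hello', filename='tmp.xml')
rf.make_multipart(content_type='text/xml')
fields.append(rf)

# Form fields and file parts are encoded together into one body.
body, content_type = encode_multipart_formdata(fields, boundary='demo-boundary')

print(content_type)
print(body.decode('utf-8'))
```

Form fields and file parts end up in the same list, which is why data and files arrive at the server as parts of one multipart body.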

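OK, that is the end of the trail. After reading this code carefully, the roles of the data and files arguments to requests.post become clear: requests merges the two here and uses the result as the body of the POST request.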

