Steps for Uploading a File with Python requests


Posted in Python on September 15, 2020

Official documentation: https://2.python-requests.org//en/master/

At work I needed to implement a feature that uploads an attachment to an API. The interface is specified as follows:

Submit the attachment via HTTP POST in multipart/form-data format. URL: http://test.com/flow/upload

Fields:
md5:      // MD5 hash of (random value_current timestamp)
filesize: // file size
file:     // file content (must include the file name)
Return value:
{"success":true,"uploadName":"tmp.xml","uploadPath":"uploads\/201311\/758e875fb7c7a508feef6b5036119b9f"}
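As a quick check of the return format, the sample response can be parsed with the standard json module. This is a minimal sketch that uses only the sample string shown above; note that the escaped \/ in uploadPath decodes to a plain /.

```python
import json

# Sample response body copied from the interface description above.
raw = r'{"success":true,"uploadName":"tmp.xml","uploadPath":"uploads\/201311\/758e875fb7c7a508feef6b5036119b9f"}'

result = json.loads(raw)

# "success" becomes a Python bool, and "\/" is unescaped to "/".
print(result["success"])
print(result["uploadName"])
print(result["uploadPath"])
```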

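Since I mainly use Python at work and the project already uses the requests library, I planned to implement this with requests. I expected it to be a trivial task, but it ended up taking a lot of time: the example in the official requests documentation only covers uploading a file and does not show how to pass extra parameters: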

https://2.python-requests.org//en/master/user/quickstart/#post-a-multipart-encoded-file

>>> url = 'https://httpbin.org/post'
>>> files = {'file': ('report.xls', open('report.xls', 'rb'), 'application/vnd.ms-excel', {'Expires': '0'})}

>>> r = requests.post(url, files=files)
>>> r.text
{
 ...
 "files": {
  "file": "<censored...binary...data>"
 },
 ...
}

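When extra parameters are involved, you need two arguments of requests: data and files. Pass the file to upload in files and the other parameters in data; the requests library merges the two into a single multipart body and sends it to the server.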

The final code looks like this:

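import hashlib
import time

import requests

# Read in binary mode so that len(content) is the real size in bytes.
with open(file_name, 'rb') as f:
  content = f.read()
request_data = {
  'md5': hashlib.md5(('%d_%d' % (0, int(time.time()))).encode('utf-8')).hexdigest(),
  'filesize': len(content),
}
# Pass the bytes already read so that no second file handle is left open.
files = {'file': (file_name, content)}
MyLogger().getlogger().info('url:%s' % (request_url))
resp = requests.post(request_url, data=request_data, files=files)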
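The same idea can be wrapped in a function that only prepares the request, so the result can be inspected before anything is sent. This is a sketch under assumptions: the function name build_upload_request is hypothetical, and the random part of the md5 token is a guess at what the interface means by "random value", since the server-side check is not described.

```python
import hashlib
import os
import random
import time

import requests


def build_upload_request(url, path):
    """Prepare (but do not send) the multipart upload request."""
    with open(path, "rb") as f:
        content = f.read()

    # "random value_timestamp" as described by the interface; how the server
    # validates this token is not known, so this is only an assumption.
    token = "%d_%d" % (random.randint(0, 99999), int(time.time()))
    data = {
        "md5": hashlib.md5(token.encode("utf-8")).hexdigest(),
        "filesize": len(content),
    }
    # A 2-tuple (filename, content) makes requests send the file name too.
    files = {"file": (os.path.basename(path), content)}
    return requests.Request("POST", url, data=data, files=files).prepare()
```

The returned object can be examined (prepared.headers, prepared.body) and then sent with requests.Session().send(prepared, timeout=10).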

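The final code may look simple, but it took me quite a bit of effort to confirm that it is correct, including reading the requests source code. The walk through the source is recorded below.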

First, find the implementation of post, in requests/api.py:

def post(url, data=None, json=None, **kwargs):
  r"""Sends a POST request.

  :param url: URL for the new :class:`Request` object.
  :param data: (optional) Dictionary, list of tuples, bytes, or file-like
    object to send in the body of the :class:`Request`.
  :param json: (optional) json data to send in the body of the :class:`Request`.
  :param \*\*kwargs: Optional arguments that ``request`` takes.
  :return: :class:`Response <Response>` object
  :rtype: requests.Response
  """

  return request('post', url, data=data, json=json, **kwargs)

It calls request, so let's follow that function, also in requests/api.py:

def request(method, url, **kwargs):
  """Constructs and sends a :class:`Request <Request>`.

  :param method: method for the new :class:`Request` object: ``GET``, ``OPTIONS``, ``HEAD``, ``POST``, ``PUT``, ``PATCH``, or ``DELETE``.
  :param url: URL for the new :class:`Request` object.
  :param params: (optional) Dictionary, list of tuples or bytes to send
    in the query string for the :class:`Request`.
  :param data: (optional) Dictionary, list of tuples, bytes, or file-like
    object to send in the body of the :class:`Request`.
  :param json: (optional) A JSON serializable Python object to send in the body of the :class:`Request`.
  :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
  :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
  :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
    ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
    or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
    defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
    to add for the file.
  :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
  :param timeout: (optional) How many seconds to wait for the server to send data
    before giving up, as a float, or a :ref:`(connect timeout, read
    timeout) <timeouts>` tuple.
  :type timeout: float or tuple
  :param allow_redirects: (optional) Boolean. Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to ``True``.
  :type allow_redirects: bool
  :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
  :param verify: (optional) Either a boolean, in which case it controls whether we verify
      the server's TLS certificate, or a string, in which case it must be a path
      to a CA bundle to use. Defaults to ``True``.
  :param stream: (optional) if ``False``, the response content will be immediately downloaded.
  :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
  :return: :class:`Response <Response>` object
  :rtype: requests.Response

  Usage::

   >>> import requests
   >>> req = requests.request('GET', 'https://httpbin.org/get')
   <Response [200]>
  """

  # By using the 'with' statement we are sure the session is closed, thus we
  # avoid leaving sockets open which can trigger a ResourceWarning in some
  # cases, and look like a memory leak in others.
  with sessions.Session() as session:
    return session.request(method=method, url=url, **kwargs)

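This function has a long docstring, which already shows that the files parameter is used to send files, but it still does not say what to do when parameters and a file must be sent together. Next, follow session.request, in requests/sessions.py: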

def request(self, method, url,
      params=None, data=None, headers=None, cookies=None, files=None,
      auth=None, timeout=None, allow_redirects=True, proxies=None,
      hooks=None, stream=None, verify=None, cert=None, json=None):
    """Constructs a :class:`Request <Request>`, prepares it and sends it.
    Returns :class:`Response <Response>` object.

    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query
      string for the :class:`Request`.
    :param data: (optional) Dictionary, list of tuples, bytes, or file-like
      object to send in the body of the :class:`Request`.
    :param json: (optional) json to send in the body of the
      :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the
      :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the
      :class:`Request`.
    :param files: (optional) Dictionary of ``'filename': file-like-objects``
      for multipart encoding upload.
    :param auth: (optional) Auth tuple or callable to enable
      Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send
      data before giving up, as a float, or a :ref:`(connect timeout,
      read timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Set to True by default.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol or protocol and
      hostname to the URL of the proxy.
    :param stream: (optional) whether to immediately download the response
      content. Defaults to ``False``.
    :param verify: (optional) Either a boolean, in which case it controls whether we verify
      the server's TLS certificate, or a string, in which case it must be a path
      to a CA bundle to use. Defaults to ``True``.
    :param cert: (optional) if String, path to ssl client cert file (.pem).
      If Tuple, ('cert', 'key') pair.
    :rtype: requests.Response
    """
    # Create the Request.
    req = Request(
      method=method.upper(),
      url=url,
      headers=headers,
      files=files,
      data=data or {},
      json=json,
      params=params or {},
      auth=auth,
      cookies=cookies,
      hooks=hooks,
    )
    prep = self.prepare_request(req)

    proxies = proxies or {}

    settings = self.merge_environment_settings(
      prep.url, proxies, stream, verify, cert
    )

    # Send the request.
    send_kwargs = {
      'timeout': timeout,
      'allow_redirects': allow_redirects,
    }
    send_kwargs.update(settings)
    resp = self.send(prep, **send_kwargs)

    return resp

A quick look at this method: it first prepares the request, and the last step calls send, which presumably sends the request. So we need to follow prepare_request, in requests/sessions.py:

def prepare_request(self, request):
    """Constructs a :class:`PreparedRequest <PreparedRequest>` for
    transmission and returns it. The :class:`PreparedRequest` has settings
    merged from the :class:`Request <Request>` instance and those of the
    :class:`Session`.

    :param request: :class:`Request` instance to prepare with this
      session's settings.
    :rtype: requests.PreparedRequest
    """
    cookies = request.cookies or {}

    # Bootstrap CookieJar.
    if not isinstance(cookies, cookielib.CookieJar):
      cookies = cookiejar_from_dict(cookies)

    # Merge with session cookies
    merged_cookies = merge_cookies(
      merge_cookies(RequestsCookieJar(), self.cookies), cookies)

    # Set environment's basic authentication if not explicitly set.
    auth = request.auth
    if self.trust_env and not auth and not self.auth:
      auth = get_netrc_auth(request.url)

    p = PreparedRequest()
    p.prepare(
      method=request.method.upper(),
      url=request.url,
      files=request.files,
      data=request.data,
      json=request.json,
      headers=merge_setting(request.headers, self.headers, dict_class=CaseInsensitiveDict),
      params=merge_setting(request.params, self.params),
      auth=merge_setting(auth, self.auth),
      cookies=merged_cookies,
      hooks=merge_hooks(request.hooks, self.hooks),
    )
    return p
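The preparation step can also be run by hand, without sending anything, which is a convenient way to see what it produces. The sketch below assumes an in-memory file and a made-up header name (X-Demo) that is used only to show that session settings are merged into the prepared request.

```python
import requests

session = requests.Session()
# Hypothetical header, used only to show that session settings are merged in.
session.headers["X-Demo"] = "1"

req = requests.Request(
    "post",
    "http://test.com/flow/upload",
    data={"filesize": 4},
    files={"file": ("tmp.xml", b"<a/>")},
)

# Same call that Session.request makes internally; nothing is sent here.
prepared = session.prepare_request(req)

print(prepared.method)                   # method is upper-cased during preparation
print(prepared.headers["X-Demo"])        # header merged from the session
print(prepared.headers["Content-Type"])  # set while the body is prepared
```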

prepare_request creates a PreparedRequest object and calls its prepare method. Follow prepare, in requests/models.py:

def prepare(self,
      method=None, url=None, headers=None, files=None, data=None,
      params=None, auth=None, cookies=None, hooks=None, json=None):
    """Prepares the entire request with the given parameters."""

    self.prepare_method(method)
    self.prepare_url(url, params)
    self.prepare_headers(headers)
    self.prepare_cookies(cookies)
    self.prepare_body(data, files, json)
    self.prepare_auth(auth, url)

    # Note that prepare_auth must be last to enable authentication schemes
    # such as OAuth to work on a fully prepared request.

    # This MUST go after prepare_auth. Authenticators could add a hook
    self.prepare_hooks(hooks)

This calls a number of prepare_xx methods. We only care about the one that handles data, files and json, so follow prepare_body, in requests/models.py:

def prepare_body(self, data, files, json=None):
    """Prepares the given HTTP body data."""

    # Check if file, fo, generator, iterator.
    # If not, run through normal process.

    # Nottin' on you.
    body = None
    content_type = None

    if not data and json is not None:
      # urllib3 requires a bytes-like body. Python 2's json.dumps
      # provides this natively, but Python 3 gives a Unicode string.
      content_type = 'application/json'
      body = complexjson.dumps(json)
      if not isinstance(body, bytes):
        body = body.encode('utf-8')

    is_stream = all([
      hasattr(data, '__iter__'),
      not isinstance(data, (basestring, list, tuple, Mapping))
    ])

    try:
      length = super_len(data)
    except (TypeError, AttributeError, UnsupportedOperation):
      length = None

    if is_stream:
      body = data

      if getattr(body, 'tell', None) is not None:
        # Record the current file position before reading.
        # This will allow us to rewind a file in the event
        # of a redirect.
        try:
          self._body_position = body.tell()
        except (IOError, OSError):
          # This differentiates from None, allowing us to catch
          # a failed `tell()` later when trying to rewind the body
          self._body_position = object()

      if files:
        raise NotImplementedError('Streamed bodies and files are mutually exclusive.')

      if length:
        self.headers['Content-Length'] = builtin_str(length)
      else:
        self.headers['Transfer-Encoding'] = 'chunked'
    else:
      # Multi-part file uploads.
      if files:
        (body, content_type) = self._encode_files(files, data)
      else:
        if data:
          body = self._encode_params(data)
          if isinstance(data, basestring) or hasattr(data, 'read'):
            content_type = None
          else:
            content_type = 'application/x-www-form-urlencoded'

      self.prepare_content_length(body)

      # Add content-type if it wasn't explicitly provided.
      if content_type and ('content-type' not in self.headers):
        self.headers['Content-Type'] = content_type

    self.body = body
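The branches in prepare_body can be observed from the outside by preparing requests offline and looking at the resulting Content-Type header. This is only a sketch: the helper name content_type and the sample field values are made up for illustration.

```python
import requests

URL = "http://test.com/flow/upload"


def content_type(**kwargs):
    """Prepare a POST request offline and return its Content-Type header."""
    prepared = requests.Request("POST", URL, **kwargs).prepare()
    return prepared.headers.get("Content-Type")


# json only: the "if not data and json is not None" branch.
json_type = content_type(json={"a": 1})
# data only: the _encode_params branch.
form_type = content_type(data={"a": 1})
# data and files: the _encode_files branch.
multipart_type = content_type(data={"a": 1}, files={"file": ("tmp.xml", b"<a/>")})

print(json_type)
print(form_type)
print(multipart_type)
```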

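This function is fairly long. The part to focus on is the "if files:" branch under the "Multi-part file uploads" comment, which calls _encode_files. Let's follow that method: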

def _encode_files(files, data):
    """Build the body for a multipart/form-data request.

    Will successfully encode files when passed as a dict or a list of
    tuples. Order is retained if data is a list of tuples but arbitrary
    if parameters are supplied as a dict.
    The tuples may be 2-tuples (filename, fileobj), 3-tuples (filename, fileobj, contentype)
    or 4-tuples (filename, fileobj, contentype, custom_headers).
    """
    if (not files):
      raise ValueError("Files must be provided.")
    elif isinstance(data, basestring):
      raise ValueError("Data must not be a string.")

    new_fields = []
    fields = to_key_val_list(data or {})
    files = to_key_val_list(files or {})

    for field, val in fields:
      if isinstance(val, basestring) or not hasattr(val, '__iter__'):
        val = [val]
      for v in val:
        if v is not None:
          # Don't call str() on bytestrings: in Py3 it all goes wrong.
          if not isinstance(v, bytes):
            v = str(v)

          new_fields.append(
            (field.decode('utf-8') if isinstance(field, bytes) else field,
             v.encode('utf-8') if isinstance(v, str) else v))

    for (k, v) in files:
      # support for explicit filename
      ft = None
      fh = None
      if isinstance(v, (tuple, list)):
        if len(v) == 2:
          fn, fp = v
        elif len(v) == 3:
          fn, fp, ft = v
        else:
          fn, fp, ft, fh = v
      else:
        fn = guess_filename(v) or k
        fp = v

      if isinstance(fp, (str, bytes, bytearray)):
        fdata = fp
      elif hasattr(fp, 'read'):
        fdata = fp.read()
      elif fp is None:
        continue
      else:
        fdata = fp

      rf = RequestField(name=k, data=fdata, filename=fn, headers=fh)
      rf.make_multipart(content_type=ft)
      new_fields.append(rf)

    body, content_type = encode_multipart_formdata(new_fields)

    return body, content_type
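To see the effect of the two loops in _encode_files, one can prepare a request and print the resulting body. In the sketch below the field names md5 and tag and their values are placeholders, and the file content is passed as in-memory bytes, which _encode_files accepts in place of a file object.

```python
import requests

prepared = requests.Request(
    "POST",
    "http://test.com/flow/upload",
    # A list value is expanded into one form field per element.
    data={"md5": "abc", "tag": ["x", "y"]},
    # 3-tuple: (filename, content, content_type).
    files={"file": ("tmp.xml", b"<a/>", "text/xml")},
).prepare()

body = prepared.body

# The data fields are written first, followed by the file part.
print(body.decode("utf-8"))
```

Printing the body shows the plain form fields and the file part sharing one boundary, which is the merge described in this article.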

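OK, that's it. After reading this code carefully, the roles of the data and files arguments to requests.post become clear: _encode_files merges the two into one multipart body, which becomes the body of the POST.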

That is all for this article; I hope it helps with your learning.
