python并发和异步编程实例


Posted in Python onNovember 15, 2018

关于并发、并行、同步阻塞、异步非阻塞、线程、进程、协程等这些概念,单纯通过文字恐怕很难有比较深刻的理解,本文就通过代码一步步实现这些并发和异步编程,并进行比较。解释器方面本文选择python3,毕竟python3才是python的未来,并且python3用原生的库实现协程已经非常方便了。

1、准备阶段

下面为所有测试代码所需要的包

#! python3
# coding:utf-8

import socket
from concurrent import futures
from selectors import DefaultSelector,EVENT_WRITE,EVENT_READ
import asyncio
import aiohttp
import time
from time import ctime

在进行不同实现方式的比较时,实现场景就是在进行爬虫开发的时候通过向对方网站发起一系列的http请求访问,统计耗时来判断实现方式的优劣,具体地,通过建立通信套接字,访问新浪主页,返回源码,作为一次请求。先实现一个装饰器用来统计函数的执行时间:

def tsfunc(func):
  def wrappedFunc(*args,**kargs):
    start = time.clock()
    action = func(*args,**kargs)
    time_delta = time.clock() - start
    print ('[{0}] {1}() called, time delta: {2}'.format(ctime(),func.__name__,time_delta))
    return action
  return wrappedFunc

输出的格式为:当前时间,调用的函数,函数的执行时间。

2、阻塞/非阻塞和同步/异步

这两对概念不是很好区分,从定义上理解:

阻塞:在进行socket通信过程中,一个线程发起请求,如果当前请求没有返回结果,则进入sleep状态,期间线程挂起不能做其他操作,直到有返回结果,或者超时(如果设置超时的话)。
非阻塞:与阻塞相似,只不过在等待请求结果时,线程并不挂起而是进行其他操作,即在不能立刻得到结果之前,该函数不会阻挂起当前线程,而会立刻返回。
同步:同步和阻塞比较相似,但是二者并不是同一个概念,同步是指完成事件的逻辑,是指一件事完成之后,再完成第二件事,以此类推…
异步:异步和非阻塞比较类似,异步的概念和同步相对。当一个异步过程调用发出后,调用者不能立刻得到结果。实际处理这个调用的部件在完成后,通过状态、通知和回调来通知调用者,实现异步的方式通俗讲就是“等会再告诉你”。

1)阻塞方式

回到代码上,首先实现阻塞方式的请求函数:

def blocking_way():
  sock = socket.socket()
  sock.connect(('www.sina.com',80))
  request = 'GET / HTTP/1.0\r\nHOST:www.sina.com\r\n\r\n'
  sock.send(request.encode('ascii'))
  response = b''
  chunk = sock.recv(4096)
  while chunk:
    response += chunk
    chunk = sock.recv(4096)
  return response

测试线程、多进程和多线程

# 阻塞无并发
@tsfunc
def sync_way():
  res = []
  for i in range(10):
    res.append(blocking_way())
  return len(res)
@tsfunc
# 阻塞、多进程
def process_way():
  worker = 10
  with futures.ProcessPoolExecutor(worker) as executor:
    futs = {executor.submit(blocking_way) for i in range(10)}
  return len([fut.result() for fut in futs])
# 阻塞、多线程
@tsfunc
def thread_way():
  worker = 10
  with futures.ThreadPoolExecutor(worker) as executor:
    futs = {executor.submit(blocking_way) for i in range(10)}
  return len([fut.result() for fut in futs])

运行结果:

[Wed Dec 13 16:52:25 2017] sync_way() called, time delta: 0.06371647809425328
[Wed Dec 13 16:52:28 2017] process_way() called, time delta: 2.31437644946734
[Wed Dec 13 16:52:28 2017] thread_way() called, time delta: 0.010172946070299727

可见与非并发的方式相比,启动10个进程完成10次请求访问耗费的时间最长,进程确实需要很大的系统开销,相比多线程则效果好得多,启动10个线程并发请求,比顺序请求速度快了6倍左右。

2)非阻塞方式

实现非阻塞的请求代码,与阻塞方式的区别在于等待请求时并不挂起而是直接返回,为了确保能正确读取消息,最原始的方式就是循环读取,知道读取完成为跳出循环,代码如下:

def nonblocking_way():
  sock = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
  sock.setblocking(False)
  try:
    sock.connect(('www.sina.com', 80))
  except BlockingIOError:
    pass
  request = 'GET / HTTP/1.0\r\nHost: www.sina.com\r\n\r\n'
  data = request.encode('ascii')
  while True:
    try:
      sock.send(data)
      break
    except OSError:
      pass

  response = b''
  while True:
    try:
      chunk = sock.recv(4096)
      while chunk:
        response += chunk
        chunk = sock.recv(4096)
      break
    except OSError:
      pass

  return response

测试单线程异步非阻塞方式:

@tsfunc
def async_way():
  res = []
  for i in range(10):
    res.append(nonblocking_way())
  return len(res)

测试结果与单线程同步阻塞方式相比:

[Wed Dec 13 17:18:30 2017] sync_way() called, time delta: 0.07342884475822574
[Wed Dec 13 17:18:30 2017] async_way() called, time delta: 0.06509009095694886

非阻塞方式起到了一定的效果,但是并不明显,原因肯定是读取消息的时候虽然不是在线程挂起的时候而是在循环读取消息的时候浪费了时间,如果大部分时间读浪费了并没有发挥异步编程的威力,解决的办法就是后面要说的【事件驱动】

3、回调、生成器和协程

a、回调

class Crawler():
  def __init__(self,url):
    self.url = url
    self.sock = None
    self.response = b''

  def fetch(self):
    self.sock = socket.socket()
    self.sock.setblocking(False)
    try:
      self.sock.connect(('www.sina.com',80))
    except BlockingIOError:
      pass
    selector.register(self.sock.fileno(),EVENT_WRITE,self.connected)

  def connected(self,key,mask):
    selector.unregister(key.fd)
    get = 'GET {0} HTTP/1.0\r\nHost:www.sina.com\r\n\r\n'.format(self.url)
    self.sock.send(get.encode('ascii'))
    selector.register(key.fd,EVENT_READ,self.read_response)

  def read_response(self,key,mask):
    global stopped
    while True:
      try:
        chunk = self.sock.recv(4096)
        if chunk:
          self.response += chunk
          chunk = self.sock.recv(4096)
        else:
          selector.unregister(key.fd)
          urls_todo.remove(self.url)
          if not urls_todo:
            stopped = True
        break
      except:
        pass

def loop():
  while not stopped:
    events = selector.select()
    for event_key,event_mask in events:
      callback = event_key.data
      callback(event_key,event_mask)
 @tsfunc
def callback_way():
  for url in urls_todo:
    crawler = Crawler(url)
    crawler.fetch()
  loop1()

这是通过传统回调方式实现的异步编程,结果如下:

[Tue Mar 27 17:52:49 2018] callback_way() called, time delta: 0.054735804048789374

b、生成器

class Crawler2:
  def __init__(self, url):
    self.url = url
    self.response = b''

  def fetch(self):
    global stopped
    sock = socket.socket()
    yield from connect(sock, ('www.sina.com', 80))
    get = 'GET {0} HTTP/1.0\r\nHost: www.sina.com\r\n\r\n'.format(self.url)
    sock.send(get.encode('ascii'))
    self.response = yield from read_all(sock)
    urls_todo.remove(self.url)
    if not urls_todo:
      stopped = True

class Task:
  def __init__(self, coro):
    self.coro = coro
    f = Future1()
    f.set_result(None)
    self.step(f)

  def step(self, future):
    try:
      # send会进入到coro执行, 即fetch, 直到下次yield
      # next_future 为yield返回的对象
      next_future = self.coro.send(future.result)
    except StopIteration:
      return
    next_future.add_done_callback(self.step)

def loop1():
  while not stopped:
    events = selector.select()
    for event_key,event_mask in events:
      callback = event_key.data
      callback()

运行结果如下:

[Tue Mar 27 17:54:27 2018] generate_way() called, time delta: 0.2914336347673473

c、协程

def nonblocking_way():
  sock = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
  sock.setblocking(False)
  try:
    sock.connect(('www.sina.com', 80))
  except BlockingIOError:
    pass
  request = 'GET / HTTP/1.0\r\nHost: www.sina.com\r\n\r\n'
  data = request.encode('ascii')
  while True:
    try:
      sock.send(data)
      break
    except OSError:
      pass

  response = b''
  while True:
    try:
      chunk = sock.recv(4096)
      while chunk:
        response += chunk
        chunk = sock.recv(4096)
      break
    except OSError:
      pass

  return response
@tsfunc
def asyncio_way():
    tasks = [fetch(host+url) for url in urls_todo]
    loop.run_until_complete(asyncio.gather(*tasks))
    return (len(tasks))

运行结果:

[Tue Mar 27 17:56:17 2018] asyncio_way() called, time delta: 0.43688060698484166

到此终于把并发和异步编程实例代码测试完,下边贴出全部代码,共读者自行测试,在任务量加大时,相信结果会大不一样。

#! python3
# coding:utf-8

import socket
from concurrent import futures
from selectors import DefaultSelector,EVENT_WRITE,EVENT_READ
import asyncio
import aiohttp
import time
from time import ctime

def tsfunc(func):
  def wrappedFunc(*args,**kargs):
    start = time.clock()
    action = func(*args,**kargs)
    time_delta = time.clock() - start
    print ('[{0}] {1}() called, time delta: {2}'.format(ctime(),func.__name__,time_delta))
    return action
  return wrappedFunc

def blocking_way():
  sock = socket.socket()
  sock.connect(('www.sina.com',80))
  request = 'GET / HTTP/1.0\r\nHOST:www.sina.com\r\n\r\n'
  sock.send(request.encode('ascii'))
  response = b''
  chunk = sock.recv(4096)
  while chunk:
    response += chunk
    chunk = sock.recv(4096)
  return response

def nonblocking_way():
  sock = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
  sock.setblocking(False)
  try:
    sock.connect(('www.sina.com', 80))
  except BlockingIOError:
    pass
  request = 'GET / HTTP/1.0\r\nHost: www.sina.com\r\n\r\n'
  data = request.encode('ascii')
  while True:
    try:
      sock.send(data)
      break
    except OSError:
      pass

  response = b''
  while True:
    try:
      chunk = sock.recv(4096)
      while chunk:
        response += chunk
        chunk = sock.recv(4096)
      break
    except OSError:
      pass

  return response


selector = DefaultSelector()
stopped = False
urls_todo = ['/','/1','/2','/3','/4','/5','/6','/7','/8','/9']


class Crawler():
  def __init__(self,url):
    self.url = url
    self.sock = None
    self.response = b''

  def fetch(self):
    self.sock = socket.socket()
    self.sock.setblocking(False)
    try:
      self.sock.connect(('www.sina.com',80))
    except BlockingIOError:
      pass
    selector.register(self.sock.fileno(),EVENT_WRITE,self.connected)

  def connected(self,key,mask):
    selector.unregister(key.fd)
    get = 'GET {0} HTTP/1.0\r\nHost:www.sina.com\r\n\r\n'.format(self.url)
    self.sock.send(get.encode('ascii'))
    selector.register(key.fd,EVENT_READ,self.read_response)

  def read_response(self,key,mask):
    global stopped
    while True:
      try:
        chunk = self.sock.recv(4096)
        if chunk:
          self.response += chunk
          chunk = self.sock.recv(4096)
        else:
          selector.unregister(key.fd)
          urls_todo.remove(self.url)
          if not urls_todo:
            stopped = True
        break
      except:
        pass

def loop():
  while not stopped:
    events = selector.select()
    for event_key,event_mask in events:
      callback = event_key.data
      callback(event_key,event_mask)


# 基于生成器的协程
class Future:
  def __init__(self):
    self.result = None
    self._callbacks = []

  def add_done_callback(self,fn):
    self._callbacks.append(fn)

  def set_result(self,result):
    self.result = result
    for fn in self._callbacks:
      fn(self)

class Crawler1():
  def __init__(self,url):
    self.url = url
    self.response = b''

  def fetch(self):
    sock = socket.socket()
    sock.setblocking(False)
    try:
      sock.connect(('www.sina.com',80))
    except BlockingIOError:
      pass

    f = Future()
    def on_connected():
      f.set_result(None)

    selector.register(sock.fileno(),EVENT_WRITE,on_connected)
    yield f
    selector.unregister(sock.fileno())
    get = 'GET {0} HTTP/1.0\r\nHost: www.sina.com\r\n\r\n'.format(self.url)
    sock.send(get.encode('ascii'))

    global stopped
    while True:
      f = Future()
      def on_readable():
        f.set_result(sock.recv(4096))
      selector.register(sock.fileno(),EVENT_READ,on_readable)
      chunk = yield f
      selector.unregister(sock.fileno())
      if chunk:
        self.response += chunk
      else:
        urls_todo.remove(self.url)
        if not urls_todo:
          stopped = True
        break


# yield from 改进的生成器协程
class Future1:
  def __init__(self):
    self.result = None
    self._callbacks = []

  def add_done_callback(self,fn):
    self._callbacks.append(fn)

  def set_result(self,result):
    self.result = result
    for fn in self._callbacks:
      fn(self)

  def __iter__(self):
    yield self
    return self.result

def connect(sock, address):
  f = Future1()
  sock.setblocking(False)
  try:
    sock.connect(address)
  except BlockingIOError:
    pass

  def on_connected():
    f.set_result(None)

  selector.register(sock.fileno(), EVENT_WRITE, on_connected)
  yield from f
  selector.unregister(sock.fileno())

def read(sock):
  f = Future1()

  def on_readable():
    f.set_result(sock.recv(4096))

  selector.register(sock.fileno(), EVENT_READ, on_readable)
  chunk = yield from f
  selector.unregister(sock.fileno())
  return chunk

def read_all(sock):
  response = []
  chunk = yield from read(sock)
  while chunk:
    response.append(chunk)
    chunk = yield from read(sock)
  return b''.join(response)

class Crawler2:
  def __init__(self, url):
    self.url = url
    self.response = b''

  def fetch(self):
    global stopped
    sock = socket.socket()
    yield from connect(sock, ('www.sina.com', 80))
    get = 'GET {0} HTTP/1.0\r\nHost: www.sina.com\r\n\r\n'.format(self.url)
    sock.send(get.encode('ascii'))
    self.response = yield from read_all(sock)
    urls_todo.remove(self.url)
    if not urls_todo:
      stopped = True


class Task:
  def __init__(self, coro):
    self.coro = coro
    f = Future1()
    f.set_result(None)
    self.step(f)

  def step(self, future):
    try:
      # send会进入到coro执行, 即fetch, 直到下次yield
      # next_future 为yield返回的对象
      next_future = self.coro.send(future.result)
    except StopIteration:
      return
    next_future.add_done_callback(self.step)

def loop1():
  while not stopped:
    events = selector.select()
    for event_key,event_mask in events:
      callback = event_key.data
      callback()


# asyncio 协程
host = 'http://www.sina.com'
loop = asyncio.get_event_loop()

async def fetch(url):
  async with aiohttp.ClientSession(loop=loop) as session:
    async with session.get(url) as response:
      response = await response.read()
      return response

@tsfunc
def asyncio_way():
    tasks = [fetch(host+url) for url in urls_todo]
    loop.run_until_complete(asyncio.gather(*tasks))
    return (len(tasks))

@tsfunc
def sync_way():
  res = []
  for i in range(10):
    res.append(blocking_way())
  return len(res)

@tsfunc
def process_way():
  worker = 10
  with futures.ProcessPoolExecutor(worker) as executor:
    futs = {executor.submit(blocking_way) for i in range(10)}
  return len([fut.result() for fut in futs])

@tsfunc
def thread_way():
  worker = 10
  with futures.ThreadPoolExecutor(worker) as executor:
    futs = {executor.submit(blocking_way) for i in range(10)}
  return len([fut.result() for fut in futs])

@tsfunc
def async_way():
  res = []
  for i in range(10):
    res.append(nonblocking_way())
  return len(res)

@tsfunc
def callback_way():
  for url in urls_todo:
    crawler = Crawler(url)
    crawler.fetch()
  loop1()

@tsfunc
def generate_way():
  for url in urls_todo:
    crawler = Crawler2(url)
    Task(crawler.fetch())
  loop1()

if __name__ == '__main__':

  #sync_way()
  #process_way()
  #thread_way()
  #async_way()
  #callback_way()
  #generate_way()
  asyncio_way()

以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持三水点靠木。

Python 相关文章推荐
python使用ctypes模块调用windowsapi获取系统版本示例
Apr 17 Python
Python中的index()方法使用教程
May 18 Python
举例讲解如何在Python编程中进行迭代和遍历
Jan 19 Python
python 批量修改/替换数据的实例
Jul 25 Python
python使用response.read()接收json数据的实例
Dec 19 Python
django-crontab 定时执行任务方法的实现
Sep 06 Python
关于Pytorch的MLP模块实现方式
Jan 07 Python
Python工程师必考的6个经典面试题
Jun 28 Python
在CentOS7下安装Python3教程解析
Jul 09 Python
如何快速一次性卸载所有python包(第三方库)呢
Oct 20 Python
Python就将所有的英文单词首字母变成大写
Feb 12 Python
python绘图模块之利用turtle画图
Feb 12 Python
Numpy截取指定范围内的数据方法
Nov 14 #Python
python numpy元素的区间查找方法
Nov 14 #Python
python爬虫之urllib库常用方法用法总结大全
Nov 14 #Python
Python3爬取英雄联盟英雄皮肤大图实例代码
Nov 14 #Python
python 顺时针打印矩阵的超简洁代码
Nov 14 #Python
Python 实现取矩阵的部分列,保存为一个新的矩阵方法
Nov 14 #Python
Python实现常见的回文字符串算法
Nov 14 #Python
You might like
火影忍者:这才是千手柱间和扉间的真正死因,角都就比较搞笑了!
2020/03/10 日漫
解决了Ajax、MySQL 和 Zend Framework 的乱码问题
2009/03/03 PHP
PHP 5.3.0 安装分析心得
2009/08/07 PHP
PHP获取数组中某元素的位置及array_keys函数应用
2013/01/29 PHP
php过滤HTML标签、属性等正则表达式汇总
2014/09/22 PHP
CI框架AR数据库操作常用函数总结
2016/11/21 PHP
PHP迭代器和迭代的实现与使用方法分析
2018/04/19 PHP
laravel自定义分页的实现案例offset()和limit()
2019/10/15 PHP
广告显示判断
2006/08/31 Javascript
用JavaScript获取网页中的js、css、Flash等文件
2006/12/20 Javascript
JS解密入门 最终变量劫持
2008/06/25 Javascript
javascript弹出层输入框(示例代码)
2013/12/11 Javascript
原生js实现百叶窗效果及原理介绍
2016/04/12 Javascript
Jquery实现$.fn.extend和$.extend函数
2016/04/14 Javascript
jQuery实现ToolTip元素定位显示功能示例
2016/11/23 Javascript
JS实现json的序列化和反序列化功能示例
2017/06/13 Javascript
详解Vuejs2.0 如何利用proxyTable实现跨域请求
2017/08/03 Javascript
vue实现城市列表选择功能
2018/07/16 Javascript
利用Blob进行文件上传的完整步骤
2018/08/02 Javascript
Vue 与 Vuex 的第一次接触遇到的坑
2018/08/16 Javascript
vue强制刷新组件的方法示例
2019/02/28 Javascript
vue2.* element tabs tab-pane 动态加载组件操作
2020/07/19 Javascript
Ant design vue table 单击行选中 勾选checkbox教程
2020/10/24 Javascript
python实现端口转发器的方法
2015/03/13 Python
实例讲解Python爬取网页数据
2018/07/08 Python
使用Python进行中文繁简转换的实现代码
2019/10/18 Python
Django框架序列化与反序列化操作详解
2019/11/01 Python
Python模拟登录和登录跳转的参考示例
2020/10/30 Python
自定义html标记替换html5新增元素
2008/10/17 HTML / CSS
英国第一独立滑雪板商店:The Snowboard Asylum
2020/01/16 全球购物
写一个函数,要求输入一个字符串和一个字符长度,对该字符串进行分隔
2015/07/30 面试题
财务主管自我鉴定
2014/01/17 职场文书
通信研究生自荐信
2014/02/01 职场文书
2014年敬老院工作总结
2014/12/08 职场文书
整脏治乱工作简报
2015/07/21 职场文书
解决ObjectMapper.convertValue() 遇到的一些问题
2021/06/30 Java/Android