使用Python的Twisted框架编写非阻塞程序的代码示例


Posted in Python onMay 25, 2016

先来看一段代码:

# ~*~ Twisted - A Python tale ~*~

from time import sleep

# Hello, I'm a developer and I mainly setup Wordpress.
def install_wordpress(customer):
  # Our hosting company Threads Ltd. is bad. I start installation and...
  print "Start installation for", customer
  # ...then wait till the installation finishes successfully. It is
  # boring and I'm spending most of my time waiting while consuming
  # resources (memory and some CPU cycles). It's because the process
  # is *blocking*.
  sleep(3)
  print "All done for", customer

# I do this all day long for our customers
def developer_day(customers):
  for customer in customers:
    install_wordpress(customer)

developer_day(["Bill", "Elon", "Steve", "Mark"])

运行一下,结果如下所示:

$ ./deferreds.py 1
------ Running example 1 ------
Start installation for Bill
All done for Bill
Start installation
...
* Elapsed time: 12.03 seconds

这是一段顺序执行的代码。四个消费者,为一个人安装需要3秒的时间,那么四个人就是12秒。这样处理不是很令人满意,所以看一下第二个使用了线程的例子:

import threading

# The company grew. We now have many customers and I can't handle the
# workload. We are now 5 developers doing exactly the same thing.
def developers_day(customers):
  # But we now have to synchronize... a.k.a. bureaucracy
  lock = threading.Lock()
  #
  def dev_day(id):
    print "Goodmorning from developer", id
    # Yuck - I hate locks...
    lock.acquire()
    while customers:
      customer = customers.pop(0)
      lock.release()
      # My Python is less readable
      install_wordpress(customer)
      lock.acquire()
    lock.release()
    print "Bye from developer", id
  # We go to work in the morning
  devs = [threading.Thread(target=dev_day, args=(i,)) for i in range(5)]
  [dev.start() for dev in devs]
  # We leave for the evening
  [dev.join() for dev in devs]

# We now get more done in the same time but our dev process got more
# complex. As we grew we spend more time managing queues than doing dev
# work. We even had occasional deadlocks when processes got extremely
# complex. The fact is that we are still mostly pressing buttons and
# waiting but now we also spend some time in meetings.
developers_day(["Customer %d" % i for i in xrange(15)])

运行一下:

$ ./deferreds.py 2
------ Running example 2 ------
Goodmorning from developer 0Goodmorning from developer
1Start installation forGoodmorning from developer 2
Goodmorning from developer 3Customer 0
...
from developerCustomer 13 3Bye from developer 2
* Elapsed time: 9.02 seconds

这次是一段并行执行的代码,使用了5个工作线程。15个消费者每个花费3s意味着总共45s的时间,不过用了5个线程并行执行总共只花费了9s的时间。这段代码有点复杂,很大一部分代码是用于管理并发,而不是专注于算法或者业务逻辑。另外,程序的输出结果看起来也很混杂,可读性也天津市。即使是简单的多线程的代码同样也难以写得很好,所以我们转为使用Twisted:

# For years we thought this was all there was... We kept hiring more
# developers, more managers and buying servers. We were trying harder
# optimising processes and fire-fighting while getting mediocre
# performance in return. Till luckily one day our hosting
# company decided to increase their fees and we decided to
# switch to Twisted Ltd.!

from twisted.internet import reactor
from twisted.internet import defer
from twisted.internet import task

# Twisted has a slightly different approach
def schedule_install(customer):
  # They are calling us back when a Wordpress installation completes.
  # They connected the caller recognition system with our CRM and
  # we know exactly what a call is about and what has to be done next.
  #
  # We now design processes of what has to happen on certain events.
  def schedule_install_wordpress():
      def on_done():
        print "Callback: Finished installation for", customer
    print "Scheduling: Installation for", customer
    return task.deferLater(reactor, 3, on_done)
  #
  def all_done(_):
    print "All done for", customer
  #
  # For each customer, we schedule these processes on the CRM
  # and that
  # is all our chief-Twisted developer has to do
  d = schedule_install_wordpress()
  d.addCallback(all_done)
  #
  return d

# Yes, we don't need many developers anymore or any synchronization.
# ~~ Super-powered Twisted developer ~~
def twisted_developer_day(customers):
  print "Goodmorning from Twisted developer"
  #
  # Here's what has to be done today
  work = [schedule_install(customer) for customer in customers]
  # Turn off the lights when done
  join = defer.DeferredList(work)
  join.addCallback(lambda _: reactor.stop())
  #
  print "Bye from Twisted developer!"
# Even his day is particularly short!
twisted_developer_day(["Customer %d" % i for i in xrange(15)])

# Reactor, our secretary uses the CRM and follows-up on events!
reactor.run()

运行结果:

------ Running example 3 ------
Goodmorning from Twisted developer
Scheduling: Installation for Customer 0
....
Scheduling: Installation for Customer 14
Bye from Twisted developer!
Callback: Finished installation for Customer 0
All done for Customer 0
Callback: Finished installation for Customer 1
All done for Customer 1
...
All done for Customer 14
* Elapsed time: 3.18 seconds

这次我们得到了完美的执行代码和可读性强的输出结果,并且没有使用线程。我们并行地处理了15个消费者,也就是说,本来需要45s的执行时间在3s之内就已经完成。这个窍门就是我们把所有的阻塞的对sleep()的调用都换成了Twisted中对等的task.deferLater()和回调函数。由于现在处理的操作在其他地方进行,我们就可以毫不费力地同时服务于15个消费者。
前面提到处理的操作发生在其他的某个地方。现在来解释一下,算术运算仍然发生在CPU内,但是现在的CPU处理速度相比磁盘和网络操作来说非常快。所以给CPU提供数据或者从CPU向内存或另一个CPU发送数据花费了大多数时间。我们使用了非阻塞的操作节省了这方面的时间,例如,task.deferLater()使用了回调函数,当数据已经传输完成的时候会被激活。
另一个很重要的一点是输出中的Goodmorning from Twisted developer和Bye from Twisted developer!信息。在代码开始执行时就已经打印出了这两条信息。如果代码如此早地执行到了这个地方,那么我们的应用真正开始运行是在什么时候呢?答案是,对于一个Twisted应用(包括Scrapy)来说是在reactor.run()里运行的。在调用这个方法之前,必须把应用中可能用到的每个Deferred链准备就绪,然后reactor.run()方法会监视并激活回调函数。
注意,reactor的主要一条规则就是,你可以执行任何操作,只要它足够快并且是非阻塞的。
现在好了,代码中没有那么用于管理多线程的部分了,不过这些回调函数看起来还是有些杂乱。可以修改成这样:

# Twisted gave us utilities that make our code way more readable!
@defer.inlineCallbacks
def inline_install(customer):
  print "Scheduling: Installation for", customer
  yield task.deferLater(reactor, 3, lambda: None)
  print "Callback: Finished installation for", customer
  print "All done for", customer

def twisted_developer_day(customers):
  ... same as previously but using inline_install() instead of schedule_install()

twisted_developer_day(["Customer %d" % i for i in xrange(15)])
reactor.run()

运行的结果和前一个例子相同。这段代码的作用和上一个例子是一样的,但是看起来更加简洁明了。inlineCallbacks生成器可以使用一些一些Python的机制来使得inline_install()函数暂停或者恢复执行。inline_install()函数变成了一个Deferred对象并且并行地为每个消费者运行。每次yield的时候,运行就会中止在当前的inline_install()实例上,直到yield的Deferred对象完成后再恢复运行。
现在唯一的问题是,如果我们不止有15个消费者,而是有,比如10000个消费者时又该怎样?这段代码会同时开始10000个同时执行的序列(比如HTTP请求、数据库的写操作等等)。这样做可能没什么问题,但也可能会产生各种失败。在有巨大并发请求的应用中,例如Scrapy,我们经常需要把并发的数量限制到一个可以接受的程度上。在下面的一个例子中,我们使用task.Cooperator()来完成这样的功能。Scrapy在它的Item Pipeline中也使用了相同的机制来限制并发的数目(即CONCURRENT_ITEMS设置):

@defer.inlineCallbacks
def inline_install(customer):
  ... same as above

# The new "problem" is that we have to manage all this concurrency to
# avoid causing problems to others, but this is a nice problem to have.
def twisted_developer_day(customers):
  print "Goodmorning from Twisted developer"
  work = (inline_install(customer) for customer in customers)
  #
  # We use the Cooperator mechanism to make the secretary not
  # service more than 5 customers simultaneously.
  coop = task.Cooperator()
  join = defer.DeferredList([coop.coiterate(work) for i in xrange(5)])
  #
  join.addCallback(lambda _: reactor.stop())
  print "Bye from Twisted developer!"

twisted_developer_day(["Customer %d" % i for i in xrange(15)])
reactor.run()

# We are now more lean than ever, our customers happy, our hosting
# bills ridiculously low and our performance stellar.
# ~*~ THE END ~*~

运行结果:

$ ./deferreds.py 5
------ Running example 5 ------
Goodmorning from Twisted developer
Bye from Twisted developer!
Scheduling: Installation for Customer 0
...
Callback: Finished installation for Customer 4
All done for Customer 4
Scheduling: Installation for Customer 5
...
Callback: Finished installation for Customer 14
All done for Customer 14
* Elapsed time: 9.19 seconds

从上面的输出中可以看到,程序运行时好像有5个处理消费者的槽。除非一个槽空出来,否则不会开始处理下一个消费者的请求。在本例中,处理时间都是3秒,所以看起来像是5个一批次地处理一样。最后得到的性能跟使用线程是一样的,但是这次只有一个线程,代码也更加简洁更容易写出正确的代码。

PS:deferToThread使同步函数实现非阻塞
wisted的defer.Deferred (from twisted.internet import defer)可以返回一个deferred对象.

注:deferToThread使用线程实现的,不推荐过多使用
***把同步函数变为异步(返回一个Deferred)***
twisted的deferToThread(from twisted.internet.threads import deferToThread)也返回一个deferred对象,不过回调函数在另一个线程处理,主要用于数据库/文件读取操作

..

# 代码片段

  def dataReceived(self, data):
    now = int(time.time())

    for ftype, data in self.fpcodec.feed(data):
      if ftype == 'oob':
        self.msg('OOB:', repr(data))
      elif ftype == 0x81: # 对服务器请求的心跳应答(这个是解析 防疲劳驾驶仪,发给gps上位机的,然后上位机发给服务器的)
        self.msg('FP.PONG:', repr(data))
      else:
        self.msg('TODO:', (ftype, data))
      d = deferToThread(self.redis.zadd, "beier:fpstat:fps", now, self.devid)
      d.addCallback(self._doResult, extra)

下面这儿完整的例子可以给大家参考一下

# -*- coding: utf-8 -*-

from twisted.internet import defer, reactor
from twisted.internet.threads import deferToThread

import functools
import time

# 耗时操作 这是一个同步阻塞函数
def mySleep(timeout):
  time.sleep(timeout)

  # 返回值相当于加进了callback里
  return 3 

def say(result):
  print "耗时操作结束了, 并把它返回的结果给我了", result

# 用functools.partial包装一下, 传递参数进去
cb = functools.partial(mySleep, 3)
d = deferToThread(cb) 
d.addCallback(say)

print "你还没有结束我就执行了, 哈哈"

reactor.run()
Python 相关文章推荐
wxPython中listbox用法实例详解
Jun 01 Python
Python操作Excel之xlsx文件
Mar 24 Python
Python3.4 tkinter,PIL图片转换
Jun 21 Python
python打包生成的exe文件运行时提示缺少模块的解决方法
Oct 31 Python
详解【python】str与json类型转换
Apr 29 Python
Python常用模块logging——日志输出功能(示例代码)
Nov 20 Python
python循环嵌套的多种使用方法解析
Nov 29 Python
通过Turtle库在Python中绘制一个鼠年福鼠
Feb 03 Python
Python IDE环境之 新版Pycharm安装详细教程
Mar 05 Python
python基于exchange函数发送邮件过程详解
Nov 06 Python
python opencv实现图像配准与比较
Feb 09 Python
Python3压缩和解压缩实现代码
Mar 01 Python
Python的Twisted框架中使用Deferred对象来管理回调函数
May 25 #Python
使用Python的Twisted框架构建非阻塞下载程序的实例教程
May 25 #Python
Python的Twisted框架上手前所必须了解的异步编程思想
May 25 #Python
Python的re模块正则表达式操作
May 25 #Python
Python的for和break循环结构中使用else语句的技巧
May 24 #Python
Python3连接MySQL(pymysql)模拟转账实现代码
May 24 #Python
用Python写一个无界面的2048小游戏
May 24 #Python
You might like
smtp邮件发送一例
2006/10/09 PHP
PHP+MySQL5.0中文乱码解决方法
2006/11/20 PHP
php代码中使用换行及(\n或\r\n和br)的应用
2013/02/02 PHP
深入分析使用mysql_fetch_object()以对象的形式返回查询结果
2013/06/05 PHP
php实现可以设置中奖概率的抽奖程序代码分享
2014/01/19 PHP
PHP针对JSON操作实例分析
2015/01/12 PHP
Yii2框架自定义验证规则操作示例
2019/02/08 PHP
解决Laravel 不能创建 migration 的问题
2019/10/09 PHP
javascript同步Import,同步调用外部js的方法
2008/07/08 Javascript
关于可运行代码无法正常执行的使用说明
2010/05/13 Javascript
javascript 函数调用的对象和方法
2010/07/01 Javascript
JavaScript 用Node.js写Shell脚本[译]
2012/09/20 Javascript
JAVASCRIPT模式窗口中下载文件无法接收iframe的流
2013/10/11 Javascript
node.js中的fs.exists方法使用说明
2014/12/17 Javascript
JS实现去除数组中重复json的方法示例
2017/12/21 Javascript
谈谈JavaScript令人迷惑的==与+
2020/08/31 Javascript
vuex页面刷新导致数据丢失的解决方案
2020/12/10 Vue.js
Python探索之爬取电商售卖信息代码示例
2017/10/27 Python
python async with和async for的使用
2019/06/20 Python
Python 把序列转换为元组的函数tuple方法
2019/06/27 Python
Python Sphinx使用实例及问题解决
2020/01/17 Python
python 实现数据库中数据添加、查询与更新的示例代码
2020/12/07 Python
OpenCV+python实现膨胀和腐蚀的示例
2020/12/21 Python
摩顿布朗英国官方网上商店:奢华沐浴、身体和头发护理
2016/10/29 全球购物
字符串str除首尾字符外的其他字符按升序排列
2013/03/08 面试题
标记环介质访问控制协议
2016/03/27 面试题
介绍下Java中==和equals的区别
2013/09/01 面试题
文秘专业大学生求职信
2013/11/10 职场文书
教研处工作方案
2014/05/26 职场文书
坚守艰苦奋斗精神坚决反对享乐主义整改措施
2014/09/17 职场文书
小学感恩节活动策划方案
2014/10/06 职场文书
2014最新预备党员思想汇报范文:中国梦,我的梦
2014/10/25 职场文书
军训心得体会范文(2016最新篇)
2016/01/11 职场文书
使用springboot暴露oracle数据接口的问题
2021/05/07 Oracle
Python下opencv使用hough变换检测直线与圆
2021/06/18 Python
Python matplotlib安装以及实现简单曲线的绘制
2022/04/26 Python