使用Python的Twisted框架编写非阻塞程序的代码示例


Posted in Python onMay 25, 2016

先来看一段代码:

# ~*~ Twisted - A Python tale ~*~

from time import sleep

# Hello, I'm a developer and I mainly setup Wordpress.
def install_wordpress(customer):
  # Our hosting company Threads Ltd. is bad. I start installation and...
  print "Start installation for", customer
  # ...then wait till the installation finishes successfully. It is
  # boring and I'm spending most of my time waiting while consuming
  # resources (memory and some CPU cycles). It's because the process
  # is *blocking*.
  sleep(3)
  print "All done for", customer

# I do this all day long for our customers
def developer_day(customers):
  for customer in customers:
    install_wordpress(customer)

developer_day(["Bill", "Elon", "Steve", "Mark"])

运行一下,结果如下所示:

$ ./deferreds.py 1
------ Running example 1 ------
Start installation for Bill
All done for Bill
Start installation
...
* Elapsed time: 12.03 seconds

这是一段顺序执行的代码。四个消费者,为一个人安装需要3秒的时间,那么四个人就是12秒。这样处理不是很令人满意,所以看一下第二个使用了线程的例子:

import threading

# The company grew. We now have many customers and I can't handle the
# workload. We are now 5 developers doing exactly the same thing.
def developers_day(customers):
  # But we now have to synchronize... a.k.a. bureaucracy
  lock = threading.Lock()
  #
  def dev_day(id):
    print "Goodmorning from developer", id
    # Yuck - I hate locks...
    lock.acquire()
    while customers:
      customer = customers.pop(0)
      lock.release()
      # My Python is less readable
      install_wordpress(customer)
      lock.acquire()
    lock.release()
    print "Bye from developer", id
  # We go to work in the morning
  devs = [threading.Thread(target=dev_day, args=(i,)) for i in range(5)]
  [dev.start() for dev in devs]
  # We leave for the evening
  [dev.join() for dev in devs]

# We now get more done in the same time but our dev process got more
# complex. As we grew we spend more time managing queues than doing dev
# work. We even had occasional deadlocks when processes got extremely
# complex. The fact is that we are still mostly pressing buttons and
# waiting but now we also spend some time in meetings.
developers_day(["Customer %d" % i for i in xrange(15)])

运行一下:

$ ./deferreds.py 2
------ Running example 2 ------
Goodmorning from developer 0Goodmorning from developer
1Start installation forGoodmorning from developer 2
Goodmorning from developer 3Customer 0
...
from developerCustomer 13 3Bye from developer 2
* Elapsed time: 9.02 seconds

这次是一段并行执行的代码,使用了5个工作线程。15个消费者每个花费3s意味着总共45s的时间,不过用了5个线程并行执行总共只花费了9s的时间。这段代码有点复杂,很大一部分代码是用于管理并发,而不是专注于算法或者业务逻辑。另外,程序的输出结果看起来也很混杂,可读性也天津市。即使是简单的多线程的代码同样也难以写得很好,所以我们转为使用Twisted:

# For years we thought this was all there was... We kept hiring more
# developers, more managers and buying servers. We were trying harder
# optimising processes and fire-fighting while getting mediocre
# performance in return. Till luckily one day our hosting
# company decided to increase their fees and we decided to
# switch to Twisted Ltd.!

from twisted.internet import reactor
from twisted.internet import defer
from twisted.internet import task

# Twisted has a slightly different approach
def schedule_install(customer):
  # They are calling us back when a Wordpress installation completes.
  # They connected the caller recognition system with our CRM and
  # we know exactly what a call is about and what has to be done next.
  #
  # We now design processes of what has to happen on certain events.
  def schedule_install_wordpress():
      def on_done():
        print "Callback: Finished installation for", customer
    print "Scheduling: Installation for", customer
    return task.deferLater(reactor, 3, on_done)
  #
  def all_done(_):
    print "All done for", customer
  #
  # For each customer, we schedule these processes on the CRM
  # and that
  # is all our chief-Twisted developer has to do
  d = schedule_install_wordpress()
  d.addCallback(all_done)
  #
  return d

# Yes, we don't need many developers anymore or any synchronization.
# ~~ Super-powered Twisted developer ~~
def twisted_developer_day(customers):
  print "Goodmorning from Twisted developer"
  #
  # Here's what has to be done today
  work = [schedule_install(customer) for customer in customers]
  # Turn off the lights when done
  join = defer.DeferredList(work)
  join.addCallback(lambda _: reactor.stop())
  #
  print "Bye from Twisted developer!"
# Even his day is particularly short!
twisted_developer_day(["Customer %d" % i for i in xrange(15)])

# Reactor, our secretary uses the CRM and follows-up on events!
reactor.run()

运行结果:

------ Running example 3 ------
Goodmorning from Twisted developer
Scheduling: Installation for Customer 0
....
Scheduling: Installation for Customer 14
Bye from Twisted developer!
Callback: Finished installation for Customer 0
All done for Customer 0
Callback: Finished installation for Customer 1
All done for Customer 1
...
All done for Customer 14
* Elapsed time: 3.18 seconds

这次我们得到了完美的执行代码和可读性强的输出结果,并且没有使用线程。我们并行地处理了15个消费者,也就是说,本来需要45s的执行时间在3s之内就已经完成。这个窍门就是我们把所有的阻塞的对sleep()的调用都换成了Twisted中对等的task.deferLater()和回调函数。由于现在处理的操作在其他地方进行,我们就可以毫不费力地同时服务于15个消费者。
前面提到处理的操作发生在其他的某个地方。现在来解释一下,算术运算仍然发生在CPU内,但是现在的CPU处理速度相比磁盘和网络操作来说非常快。所以给CPU提供数据或者从CPU向内存或另一个CPU发送数据花费了大多数时间。我们使用了非阻塞的操作节省了这方面的时间,例如,task.deferLater()使用了回调函数,当数据已经传输完成的时候会被激活。
另一个很重要的一点是输出中的Goodmorning from Twisted developer和Bye from Twisted developer!信息。在代码开始执行时就已经打印出了这两条信息。如果代码如此早地执行到了这个地方,那么我们的应用真正开始运行是在什么时候呢?答案是,对于一个Twisted应用(包括Scrapy)来说是在reactor.run()里运行的。在调用这个方法之前,必须把应用中可能用到的每个Deferred链准备就绪,然后reactor.run()方法会监视并激活回调函数。
注意,reactor的主要一条规则就是,你可以执行任何操作,只要它足够快并且是非阻塞的。
现在好了,代码中没有那么用于管理多线程的部分了,不过这些回调函数看起来还是有些杂乱。可以修改成这样:

# Twisted gave us utilities that make our code way more readable!
@defer.inlineCallbacks
def inline_install(customer):
  print "Scheduling: Installation for", customer
  yield task.deferLater(reactor, 3, lambda: None)
  print "Callback: Finished installation for", customer
  print "All done for", customer

def twisted_developer_day(customers):
  ... same as previously but using inline_install() instead of schedule_install()

twisted_developer_day(["Customer %d" % i for i in xrange(15)])
reactor.run()

运行的结果和前一个例子相同。这段代码的作用和上一个例子是一样的,但是看起来更加简洁明了。inlineCallbacks生成器可以使用一些一些Python的机制来使得inline_install()函数暂停或者恢复执行。inline_install()函数变成了一个Deferred对象并且并行地为每个消费者运行。每次yield的时候,运行就会中止在当前的inline_install()实例上,直到yield的Deferred对象完成后再恢复运行。
现在唯一的问题是,如果我们不止有15个消费者,而是有,比如10000个消费者时又该怎样?这段代码会同时开始10000个同时执行的序列(比如HTTP请求、数据库的写操作等等)。这样做可能没什么问题,但也可能会产生各种失败。在有巨大并发请求的应用中,例如Scrapy,我们经常需要把并发的数量限制到一个可以接受的程度上。在下面的一个例子中,我们使用task.Cooperator()来完成这样的功能。Scrapy在它的Item Pipeline中也使用了相同的机制来限制并发的数目(即CONCURRENT_ITEMS设置):

@defer.inlineCallbacks
def inline_install(customer):
  ... same as above

# The new "problem" is that we have to manage all this concurrency to
# avoid causing problems to others, but this is a nice problem to have.
def twisted_developer_day(customers):
  print "Goodmorning from Twisted developer"
  work = (inline_install(customer) for customer in customers)
  #
  # We use the Cooperator mechanism to make the secretary not
  # service more than 5 customers simultaneously.
  coop = task.Cooperator()
  join = defer.DeferredList([coop.coiterate(work) for i in xrange(5)])
  #
  join.addCallback(lambda _: reactor.stop())
  print "Bye from Twisted developer!"

twisted_developer_day(["Customer %d" % i for i in xrange(15)])
reactor.run()

# We are now more lean than ever, our customers happy, our hosting
# bills ridiculously low and our performance stellar.
# ~*~ THE END ~*~

运行结果:

$ ./deferreds.py 5
------ Running example 5 ------
Goodmorning from Twisted developer
Bye from Twisted developer!
Scheduling: Installation for Customer 0
...
Callback: Finished installation for Customer 4
All done for Customer 4
Scheduling: Installation for Customer 5
...
Callback: Finished installation for Customer 14
All done for Customer 14
* Elapsed time: 9.19 seconds

从上面的输出中可以看到,程序运行时好像有5个处理消费者的槽。除非一个槽空出来,否则不会开始处理下一个消费者的请求。在本例中,处理时间都是3秒,所以看起来像是5个一批次地处理一样。最后得到的性能跟使用线程是一样的,但是这次只有一个线程,代码也更加简洁更容易写出正确的代码。

PS:deferToThread使同步函数实现非阻塞
wisted的defer.Deferred (from twisted.internet import defer)可以返回一个deferred对象.

注:deferToThread使用线程实现的,不推荐过多使用
***把同步函数变为异步(返回一个Deferred)***
twisted的deferToThread(from twisted.internet.threads import deferToThread)也返回一个deferred对象,不过回调函数在另一个线程处理,主要用于数据库/文件读取操作

..

# 代码片段

  def dataReceived(self, data):
    now = int(time.time())

    for ftype, data in self.fpcodec.feed(data):
      if ftype == 'oob':
        self.msg('OOB:', repr(data))
      elif ftype == 0x81: # 对服务器请求的心跳应答(这个是解析 防疲劳驾驶仪,发给gps上位机的,然后上位机发给服务器的)
        self.msg('FP.PONG:', repr(data))
      else:
        self.msg('TODO:', (ftype, data))
      d = deferToThread(self.redis.zadd, "beier:fpstat:fps", now, self.devid)
      d.addCallback(self._doResult, extra)

下面这儿完整的例子可以给大家参考一下

# -*- coding: utf-8 -*-

from twisted.internet import defer, reactor
from twisted.internet.threads import deferToThread

import functools
import time

# 耗时操作 这是一个同步阻塞函数
def mySleep(timeout):
  time.sleep(timeout)

  # 返回值相当于加进了callback里
  return 3 

def say(result):
  print "耗时操作结束了, 并把它返回的结果给我了", result

# 用functools.partial包装一下, 传递参数进去
cb = functools.partial(mySleep, 3)
d = deferToThread(cb) 
d.addCallback(say)

print "你还没有结束我就执行了, 哈哈"

reactor.run()
Python 相关文章推荐
使用python检测手机QQ在线状态的脚本代码
Feb 10 Python
Python环境变量设置方法
Aug 28 Python
Python的时间模块datetime详解
Apr 17 Python
python机器学习之神经网络(二)
Dec 20 Python
vue.js实现输入框输入值内容实时响应变化示例
Jul 07 Python
numpy.ndarray 交换多维数组(矩阵)的行/列方法
Aug 02 Python
python模块之subprocess模块级方法的使用
Mar 26 Python
centos 安装Python3 及对应的pip教程详解
Jun 28 Python
Tensorflow实现多GPU并行方式
Feb 03 Python
python爬虫看看虎牙女主播中谁最“顶”步骤详解
Dec 01 Python
解决PDF 转图片时丢文字的一种可能方式
Mar 04 Python
Python matplotlib 利用随机函数生成变化图形
Apr 26 Python
Python的Twisted框架中使用Deferred对象来管理回调函数
May 25 #Python
使用Python的Twisted框架构建非阻塞下载程序的实例教程
May 25 #Python
Python的Twisted框架上手前所必须了解的异步编程思想
May 25 #Python
Python的re模块正则表达式操作
May 25 #Python
Python的for和break循环结构中使用else语句的技巧
May 24 #Python
Python3连接MySQL(pymysql)模拟转账实现代码
May 24 #Python
用Python写一个无界面的2048小游戏
May 24 #Python
You might like
PHP新手上路(七)
2006/10/09 PHP
PHP教程 基本语法
2009/10/23 PHP
PHP访问数据库集群的方法小结
2016/03/14 PHP
PHP带节点操作的无限分类实现方法详解
2016/11/09 PHP
js 匿名调用实现代码
2009/06/19 Javascript
浅谈Javascript鼠标和滚轮事件
2012/06/27 Javascript
使用GruntJS链接与压缩多个JavaScript文件过程详解
2013/08/02 Javascript
jQuery获得IE版本不准确webbrowser的解决方法
2014/02/23 Javascript
AngularJS入门教程之学习环境搭建
2014/12/06 Javascript
判断是否存在子节点的实现代码
2016/05/18 Javascript
angularjs封装bootstrap时间插件datetimepicker
2016/06/20 Javascript
微信小程序 两种滑动方式(横向滑动,竖向滑动)详细及实例代码
2017/01/13 Javascript
Vue内容分发slot(全面解析)
2017/08/19 Javascript
jquery手机触屏滑动拼音字母城市选择器的实例代码
2017/12/11 jQuery
jQuery实现点击DIV同时点击CheckBox,并为DIV上背景色的实例
2017/12/18 jQuery
vue组件实践之可搜索下拉框功能
2018/11/25 Javascript
基于VUE实现判断设备是PC还是移动端
2020/07/03 Javascript
Js on及addEventListener原理用法区别解析
2020/07/11 Javascript
微信小程序实现选项卡滑动切换
2020/10/22 Javascript
python将ip地址转换成整数的方法
2015/03/17 Python
Python使用django搭建web开发环境
2017/06/09 Python
python实现批量图片格式转换
2020/06/16 Python
Python小工具之消耗系统指定大小内存的方法
2018/12/03 Python
简单了解Python3里的一些新特性
2019/07/13 Python
Python 分享10个PyCharm技巧
2019/07/13 Python
Python tkinter三种布局实例详解
2020/01/06 Python
Python openpyxl模块原理及用法解析
2020/01/19 Python
Pytorch 使用 nii数据做输入数据的操作
2020/05/26 Python
CSS3 :not()选择器实现最后一行li去除某种css样式
2016/10/19 HTML / CSS
HTML+CSS+JavaScript实现图片3D展览的示例代码
2020/10/12 HTML / CSS
交通事故委托书范本(2篇)
2014/09/21 职场文书
工作批评与自我批评范文
2014/10/16 职场文书
教师群众路线学习心得体会
2014/11/04 职场文书
幼儿园园长个人总结
2015/03/02 职场文书
房租涨价通知
2015/04/23 职场文书
JS数组方法some、every和find的使用详情
2021/10/05 Javascript