Exploring How Variables Are Shared Between Processes in Python Multiprocessing


Posted in Python on May 05, 2015

1. The problem:

Someone in a chat group posted the following code and asked why the list printed at the end is empty.
 

from multiprocessing import Process, Manager
import os

manager = Manager()
vip_list = []  # plain list: every child process works on its own copy
#vip_list = manager.list()  # proxy to a list held by the manager process

def testFunc(cc):
  vip_list.append(cc)
  print('process id: %d' % os.getpid())

if __name__ == '__main__':
  processes = []

  for ll in range(10):
    t = Process(target=testFunc, args=(ll,))
    t.daemon = True
    processes.append(t)

  for t in processes:
    t.start()

  for t in processes:
    t.join()

  print("------------------------")
  print('process id: %d' % os.getpid())
  print(vip_list)

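If you understand Python's threading model and the GIL, and know how threads and processes differ, this question is not hard to answer. If you don't, that's fine too: run the code above and the cause becomes clear.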
 

python aa.py
process id: 632
process id: 635
process id: 637
process id: 633
process id: 636
process id: 634
process id: 639
process id: 638
process id: 641
process id: 640
------------------------
process id: 619
[]

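Every append ran under a different process id, so each child modified its own copy of vip_list and the parent's list was never touched. Now uncomment line 6 of the listing (vip_list = manager.list()) and you will see the following result: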
 

process id: 32074
process id: 32073
process id: 32072
process id: 32078
process id: 32076
process id: 32071
process id: 32077
process id: 32079
process id: 32075
process id: 32080
------------------------
process id: 32066
[3, 2, 1, 7, 5, 0, 6, 8, 4, 9]
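
The listing above relies on the children inheriting the module-level manager.list(). As a minimal sketch of the same idea, the proxy can instead be passed to each process explicitly. This version is written in Python 3 syntax, and the names record and collect are hypothetical helpers introduced only for illustration.

```python
from multiprocessing import Manager, Process


def record(shared, item):
    # `shared` is a list proxy; append() is forwarded to the manager's server process.
    shared.append(item)


def collect(n):
    # The manager is created inside the function, so nothing starts at import time.
    with Manager() as manager:
        shared = manager.list()
        workers = [Process(target=record, args=(shared, i)) for i in range(n)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Copy into a plain list before the manager shuts down.
        return list(shared)


if __name__ == '__main__':
    # The order of items depends on scheduling, so sort before printing.
    print(sorted(collect(10)))
```

Passing the proxy as an argument makes the dependency visible in the function signature instead of hiding it in a global.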

2. Ways to share variables between Python processes:
(1) Shared memory:
Data can be stored in a shared memory map using Value or Array, as in the following example from the Python documentation:

http://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes
 

from multiprocessing import Process, Value, Array

def f(n, a):
  # Both writes go straight to shared memory, so the parent sees them.
  n.value = 3.1415927
  for i in range(len(a)):
    a[i] = -a[i]

if __name__ == '__main__':
  num = Value('d', 0.0)        # 'd': double-precision float
  arr = Array('i', range(10))  # 'i': signed integer

  p = Process(target=f, args=(num, arr))
  p.start()
  p.join()

  print(num.value)
  print(arr[:])

Output:
 

3.1415927
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9]

(2) Server process:

A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies.
A manager returned by Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue, Value and Array.
For the code, see the example at the beginning of this post.

http://docs.python.org/2/library/multiprocessing.html#managers
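
Besides list, the same proxy mechanism covers the other types named above. The following is a minimal sketch, in Python 3 syntax, of sharing a dict and a Namespace through a manager. The names worker, run, stats and last_key are hypothetical and exist only for this illustration.

```python
from multiprocessing import Manager, Process


def worker(stats, ns, key):
    # Both arguments are proxies; each operation is a call to the server process.
    stats[key] = key * key
    ns.last_key = key


def run(keys):
    with Manager() as manager:
        stats = manager.dict()
        ns = manager.Namespace()
        ns.last_key = None
        procs = [Process(target=worker, args=(stats, ns, k)) for k in keys]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # copy() returns an ordinary dict that outlives the manager.
        return stats.copy(), ns.last_key


if __name__ == '__main__':
    squares, last = run([1, 2, 3])
    print(squares)
    # Which key was written last depends on scheduling.
    print(last)
```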
3. The problems with multiple processes go well beyond this: data synchronization

Consider a short piece of code, a simple counter:
 

from multiprocessing import Process, Manager
import os

manager = Manager()
sum = manager.Value('i', 0)  # 'i': signed integer typecode

def testFunc(cc):
  # Not atomic: a read and a write, each a separate call to the manager.
  sum.value += cc

if __name__ == '__main__':
  processes = []

  for ll in range(100):
    t = Process(target=testFunc, args=(1,))
    t.daemon = True
    processes.append(t)

  for t in processes:
    t.start()

  for t in processes:
    t.join()

  print("------------------------")
  print('process id: %d' % os.getpid())
  print(sum.value)

Output:
 

------------------------
process id: 17378
97

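You may well ask: WTF? This problem has been around since the days of multithreading and simply repeats itself with multiple processes. The statement sum.value += cc is a read followed by a write, so two processes can read the same old value and one of the increments is lost. The fix is a Lock: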
 

from multiprocessing import Process, Manager, Lock
import os

lock = Lock()
manager = Manager()
sum = manager.Value('i', 0)  # 'i': signed integer typecode


def testFunc(cc, lock):
  # Holding the lock makes the read and the write one indivisible step.
  with lock:
    sum.value += cc


if __name__ == '__main__':
  processes = []

  for ll in range(100):
    t = Process(target=testFunc, args=(1, lock))
    t.daemon = True
    processes.append(t)

  for t in processes:
    t.start()

  for t in processes:
    t.join()

  print("------------------------")
  print('process id: %d' % os.getpid())
  print(sum.value)

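How does this code perform? Every access to sum.value is a round trip to the manager's server process, and the lock serializes the increments. Run it and see, or raise the loop count and try again.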
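
One alternative worth measuring is a counter kept in shared memory rather than in a server process. The sketch below, in Python 3 syntax, uses multiprocessing.Value together with the lock it carries, available through get_lock(). The names add and count are hypothetical helpers, and no timing is claimed here: whether it is faster on your machine is something to check yourself.

```python
from multiprocessing import Process, Value


def add(counter, amount):
    # get_lock() returns the lock attached to this shared value; holding it
    # makes the read-modify-write a single indivisible step.
    with counter.get_lock():
        counter.value += amount


def count(n):
    counter = Value('i', 0)  # a C int stored in shared memory
    procs = [Process(target=add, args=(counter, 1)) for _ in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == '__main__':
    print(count(100))
```

Because the lock travels with the value, there is no separate Lock object to pass around.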
4. Final advice:

    Note that usually sharing data between processes may not be the best choice, because of all the synchronization issues; an approach involving actors exchanging messages is usually seen as a better choice. See also Python documentation: As mentioned above, when doing concurrent programming it is usually best to avoid using shared state as far as possible. This is particularly true when using multiple processes. However, if you really do need to use some shared data then multiprocessing provides a couple of ways of doing so.
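
As a minimal sketch of the message-passing style recommended above, each worker below sends its result back through a Queue instead of writing to shared state. It is written in Python 3 syntax, and the names worker and gather, as well as the doubling "work", are hypothetical placeholders.

```python
from multiprocessing import Process, Queue


def worker(task_id, results):
    # Send the result back as a message instead of mutating shared state.
    results.put((task_id, task_id * 2))


def gather(n):
    results = Queue()
    procs = [Process(target=worker, args=(i, results)) for i in range(n)]
    for p in procs:
        p.start()
    # Read every message before join(), so no worker is left waiting
    # to flush its data into the queue.
    collected = [results.get() for _ in range(n)]
    for p in procs:
        p.join()
    return dict(collected)


if __name__ == '__main__':
    print(gather(5))
```

Only the parent assembles the final dict, so no lock is needed around it.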

5. References:

http://stackoverflow.com/questions/14124588/python-multiprocessing-shared-memory

http://eli.thegreenplace.net/2012/01/04/shared-counter-with-pythons-multiprocessing/

http://docs.python.org/2/library/multiprocessing.html#multiprocessing.sharedctypes.synchronized
