解决python ThreadPoolExecutor 线程池中的异常捕获问题


Posted in Python onApril 08, 2020

问题

最近写了涉及线程池及线程的 python 脚本,运行过程中发现一个有趣的现象,线程池中的工作线程出现问题,引发了异常,但是主线程没有捕获异常,还在发现 BUG 之前一度以为线程池代码正常返回。

先说重点

这里主要想介绍 python concurrent.futuresthread.ThreadPoolExecutor 线程池中的 worker 引发异常的时候,并不会直接向上抛起异常,而是需要主线程通过调用concurrent.futures.Future.exception(timeout=None) 方法主动获取 worker 的异常。

问题重现及解决

引子

问题主要由这样一段代码引起的:

def thread_executor():
 logger.info("I am slave. I am working. I am going to sleep 3s")
 sleep(3)
 logger.info("Exit thread executor")


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  try:
   # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
   thread_obj.start() # 这一行会报错,同一线程不能重复启动
  except Exception as e:
   logger.error("Master start thread error", exc_info=True)
   raise e

  logger.info("Master is going to sleep 5s")
  sleep(5)

上面这段代码的功能如注释中解释的,主要要实现类似生产者消费者的功能,工作线程一直去生产资源,主线程去消费工作线程生产的资源。但是工作线程由于异常推出了,想重新启动生产工作。显然,这个代码会报错。

运行结果:

thread: MainThread [INFO] Master starts thread worker
thread: Thread-1 [INFO] I am slave. I am working. I am going to sleep 3s
thread: MainThread [INFO] Master is going to sleep 5s
thread: Thread-1 [INFO] Exit thread executor because of some exception
thread: MainThread [INFO] Master starts thread worker
thread: MainThread [ERROR] Master start thread error
Traceback (most recent call last):
File "xxx.py", line 47, in main
 thread_obj.start()
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once
Traceback (most recent call last):
File "xxx.py", line 56, in <module>
 main()
File "xxx.py", line 50, in main
 raise e
File "xxx.py", line 47, in main
 thread_obj.start()
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once

切入正题

然而脚本还有其他业务代码要运行,所以需要把上面的资源生产和消费的代码放到一个线程里完成,所以引入线程池来执行这段代码:

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
  # 没有想到这里会有异常
  thread_obj.start() # 这一行会报错,同一线程不能重复启动

  logger.info("Master is going to sleep 5s")
  sleep(5)


def thread_pool_main():
 thread_obj = ThreadPoolExecutor(max_workers=1, thread_name_prefix="WorkExecutor")
 logger.info("Master ThreadPool Executor starts thread worker")
 thread_obj.submit(main)

 while True:
  logger.info("Master ThreadPool Executor is going to sleep 5s")
  sleep(5)

if __name__ == '__main__':
 thread_pool_main()

代码运行结果如下:

INFO [thread: MainThread] Master ThreadPool Executor starts thread worker
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: Thread-1] I am slave. I am working. I am going to sleep 3s
INFO [thread: WorkExecutor_0] Master is going to sleep 5s
INFO [thread: Thread-1] Exit thread executor because of some exception
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s

... ...

显然,由上面的结果,在线程池 worker 执行到 INFO [thread: WorkExecutor_0] Master starts thread worker 的时候,是会有异常产生的,但是整个代码并没有抛弃任何异常。

解决方法

发现上面的 bug 后,想在线程池 worker 出错的时候,把异常记录到日志。查阅资料,要获取线程池的异常信息,需要调用 concurrent.futures.Future.exception(timeout=None) 方法,为了记录日志,这里加了线程池执行结束的回调函数。同时,日志中记录异常信息,用了 logging.exception() 方法。

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
  # 没有想到这里会有异常
  thread_obj.start() # 这一行会报错,同一线程不能重复启动

  logger.info("Master is going to sleep 5s")
  sleep(5)


def thread_pool_callback(worker):
 logger.info("called thread pool executor callback function")
 worker_exception = worker.exception()
 if worker_exception:
  logger.exception("Worker return exception: {}".format(worker_exception))


def thread_pool_main():
 thread_obj = ThreadPoolExecutor(max_workers=1, thread_name_prefix="WorkExecutor")
 logger.info("Master ThreadPool Executor starts thread worker")
 thread_pool_exc = thread_obj.submit(main)
 thread_pool_exc.add_done_callback(thread_pool_callback)
 # logger.info("thread pool exception: {}".format(thread_pool_exc.exception()))

 while True:
  logger.info("Master ThreadPool Executor is going to sleep 5s")
  sleep(5)


if __name__ == '__main__':
 thread_pool_main()

代码运行结果:

INFO [thread: MainThread] Master ThreadPool Executor starts thread worker
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: Thread-1] I am slave. I am working. I am going to sleep 3s
INFO [thread: WorkExecutor_0] Master is going to sleep 5s
INFO [thread: Thread-1] Exit thread executor because of some exception
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: WorkExecutor_0] called thread pool executor callback function
ERROR [thread: WorkExecutor_0] Worker return exception: threads can only be started once
Traceback (most recent call last):
File "E:\anaconda\lib\concurrent\futures\thread.py", line 57, in run
 result = self.fn(*self.args, **self.kwargs)
File "xxxx.py", line 46, in main
 thread_obj.start() # 这一行会报错,同一线程不能重复启动
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
... ...

最终的写法

其实,上面写法中,想重复利用一个线程去实现生产者线程的实现方法是有问题的,在此处,一般情况下,线程执行结束后,线程资源会被会被操作系统,所以线程不能被重复调用 start() 。

解决python ThreadPoolExecutor 线程池中的异常捕获问题

解决python ThreadPoolExecutor 线程池中的异常捕获问题

一种可行的实现方式就是,用线程池替代。当然,这样做得注意上面提到的线程池执行体的异常捕获问题。

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break

def executor_callback(worker):
 logger.info("called worker callback function")
 worker_exception = worker.exception()
 if worker_exception:
  logger.exception("Worker return exception: {}".format(worker_exception))
  # raise worker_exception


def main():
 slave_thread_pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="SlaveExecutor")
 restart_flag = False
 while True:
  logger.info("Master starts thread worker")

  if not restart_flag:
   restart_flag = not restart_flag
   logger.info("Restart Slave work")
  slave_thread_pool.submit(thread_executor).add_done_callback(executor_callback)

  logger.info("Master is going to sleep 5s")
  sleep(5)

总结

这个问题主要还是因为对 Python 的 concurrent.futuresthread.ThreadPoolExecutor 不够了解导致的,接触这个包是在书本上,但是书本没完全介绍包的全部 API 及用法,所以代码产生异常情况后,DEBUG 了许久在真正找到问题所在。查阅 python docs 后才对其完整用法有所认识,所以,以后学习新的 python 包的时候还是可以查一查官方文档的。

参考资料

英文版: docs of python concurrent.futures

中文版: python docs concurrent.futures — 启动并行任务

exception(timeout=None)

返回由调用引发的异常。如果调用还没完成那么这个方法将等待 timeout 秒。如果在 timeout 秒内没有执行完成,concurrent.futures.TimeoutError 将会被触发。timeout 可以是整数或浮点数。如果 timeout 没有指定或为 None,那么等待时间就没有限制。

如果 futrue 在完成前被取消则 CancelledError 将被触发。

如果调用正常完成那么返回 None。

add_done_callback(fn)

附加可调用 fn 到期程。当期程被取消或完成运行时,将会调用 fn,而这个期程将作为它唯一的参数。

加入的可调用对象总被属于添加它们的进程中的线程按加入的顺序调用。如果可调用对象引发一个 Exception 子类,它会被记录下来并被忽略掉。如果可调用对象引发一个 BaseException 子类,这个行为没有定义。

如果期程已经完成或已取消,fn 会被立即调用。

以上这篇解决python ThreadPoolExecutor 线程池中的异常捕获问题就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持三水点靠木。

Python 相关文章推荐
python计算程序开始到程序结束的运行时间和程序运行的CPU时间
Nov 28 Python
python中使用百度音乐搜索的api下载指定歌曲的lrc歌词
Jul 18 Python
浅析Python中的多重继承
Apr 28 Python
Python中if __name__ == '__main__'作用解析
Jun 29 Python
Python解析最简单的验证码
Jan 07 Python
python 计算数组中每个数字出现多少次--“Bucket”桶的思想
Dec 19 Python
CentOS6.9 Python环境配置(python2.7、pip、virtualenv)
May 06 Python
python实现静态web服务器
Sep 03 Python
Python Switch Case三种实现方法代码实例
Jun 18 Python
python向企业微信发送文字和图片消息的示例
Sep 28 Python
Python字符串常规操作小结
Apr 03 Python
Python加密与解密模块hashlib与hmac
Jun 05 Python
使用Python将Exception异常错误堆栈信息写入日志文件
Apr 08 #Python
TensorFlow2.X结合OpenCV 实现手势识别功能
Apr 08 #Python
python 安装库几种方法之cmd,anaconda,pycharm详解
Apr 08 #Python
TensorFlow2.1.0最新版本安装详细教程
Apr 08 #Python
解决python多线程报错:AttributeError: Can't pickle local object问题
Apr 08 #Python
解决Python 异常TypeError: cannot concatenate 'str' and 'int' objects
Apr 08 #Python
TensorFlow2.1.0安装过程中setuptools、wrapt等相关错误指南
Apr 08 #Python
You might like
phpMyAdmin 安装及问题总结
2009/05/28 PHP
安装apache2.2.22配置php5.4(具体操作步骤)
2013/06/26 PHP
风吟的小型JavaScirpt库 (FY.JS).
2010/03/09 Javascript
javascript检测浏览器flash版本的实现代码
2011/12/06 Javascript
关于js new Date() 出现NaN 的分析
2012/10/23 Javascript
JS的千分位算法实现思路
2013/07/31 Javascript
一个检测表单数据的JavaScript实例
2014/10/31 Javascript
Js获取当前日期时间及格式化代码
2016/09/17 Javascript
JavaScript实现垂直向上无缝滚动特效代码
2016/11/23 Javascript
通过AngularJS实现图片上传及缩略图展示示例
2017/01/03 Javascript
js编写选项卡效果
2017/05/23 Javascript
安装vue-cli的简易过程
2018/05/22 Javascript
ES6 Promise对象的应用实例分析
2019/06/27 Javascript
可拖拽组件slider.js使用方法详解
2020/12/04 Javascript
[09:37]2018DOTA2国际邀请赛寻真——不懈追梦的Team Serenity
2018/08/13 DOTA
Python中for循环详解
2014/01/17 Python
对DJango视图(views)和模版(templates)的使用详解
2019/07/17 Python
Python提取PDF内容的方法(文本、图像、线条等)
2019/09/25 Python
python读文件的步骤
2019/10/08 Python
详解PyQt5信号与槽的几种高级玩法
2020/03/24 Python
Python第三方库的几种安装方式(小结)
2020/04/03 Python
python判断一个变量是否已经设置的方法
2020/08/13 Python
python使用numpy中的size()函数实例用法详解
2021/01/29 Python
使用iframe+postMessage实现页面跨域通信的示例代码
2020/01/14 HTML / CSS
linux面试题参考答案(7)
2012/10/29 面试题
大学生毕业鉴定
2014/01/31 职场文书
《十六年前的回忆》教学反思
2014/02/14 职场文书
入党自我鉴定
2014/03/25 职场文书
社区班子个人对照检查材料思想汇报
2014/10/07 职场文书
党员三严三实对照检查材料
2014/10/13 职场文书
人事主管岗位职责
2015/02/04 职场文书
活动总结书怎么写
2015/05/11 职场文书
安全生产感想
2015/08/07 职场文书
小学生六年级作文之关于感恩
2019/08/16 职场文书
pandas取dataframe特定行列的实现方法
2021/05/24 Python
详解Golang如何实现支持随机删除元素的堆
2022/09/23 Python