How to Stop a TensorFlow Program from Taking Unlimited GPU Memory


Posted in Python on June 30, 2020

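Today I ran into something odd: while using tensorflow-gpu, I got an out-of-memory error. That would be understandable if I were training on a large dataset, but all I had written was y = W*x. The output is shown below.

The program: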

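import tensorflow as tf  # TensorFlow 1.x API

# w has shape (1, 2) and b has shape (2, 1)
w = tf.Variable([[1.0, 2.0]])
b = tf.Variable([[2.0], [3.0]])

# Element-wise multiply; broadcasting yields a (2, 2) result
y = tf.multiply(w, b)

init_op = tf.global_variables_initializer()

# Default session: no limit on GPU memory is configured
with tf.Session() as sess:
    sess.run(init_op)
    print(sess.run(y))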

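Error output:

Memory usage kept climbing, and when the program crashed it took the whole machine down with it, because the GPU's memory had been completely eaten up.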

2018-06-10 18:28:00.263424: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:28:00.598075: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:28:00.598453: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:28:01.265600: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:28:01.265826: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]  0 
2018-06-10 18:28:01.265971: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 
2018-06-10 18:28:01.266220: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-10 18:28:01.331056: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.63G (4970853120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.399111: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.17G (4473767936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.468293: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.75G (4026391040 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.533138: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.37G (3623751936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.602452: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.04G (3261376768 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.670225: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.73G (2935238912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.733120: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.46G (2641714944 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.800101: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.21G (2377543424 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.862064: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.99G (2139789056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.925434: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.79G (1925810176 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.986180: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.61G (1733229056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.043456: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.45G (1559906048 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.103531: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.31G (1403915520 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.168973: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.18G (1263524096 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.229387: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.06G (1137171712 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.292997: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 976.04M (1023454720 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.356714: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 878.44M (921109248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.418167: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 790.59M (828998400 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.482394: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 711.54M (746098688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
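
The byte counts in the error lines above follow a pattern: each failed request appears to be about 90% of the previous one, which suggests the allocator keeps retrying with a smaller size. The sketch below checks this using the first five values copied from the log. The variable names are my own, and the interpretation is an inference from the numbers, not something the log states.

```python
# First five "failed to allocate" sizes (bytes), copied from the log above
failed_bytes = [
    4970853120,
    4473767936,
    4026391040,
    3623751936,
    3261376768,
]

# Ratio of each retry to the request that came before it
ratios = [b / a for a, b in zip(failed_bytes, failed_bytes[1:])]

for size, ratio in zip(failed_bytes[1:], ratios):
    print(f"{size} bytes = {ratio:.4f} x previous request")
```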

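Possible causes:

The graphics driver is out of date. Update it with a driver-management tool, or download the latest driver yourself.

Too many TensorFlow programs are running. Close all of them and start again.

Because of how TensorFlow is written, it tries by default to claim all of the GPU's memory for its own training job; in other words, a "me first" policy.

So there are two things to decide:

  • How should memory be claimed: all up front, or on demand?
  • What is the maximum share of GPU memory the program may take? Claiming too much can crash other programs, so it is best to stay below 0.7.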
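
To make the 0.7 figure concrete, here is a rough calculation for the 6.00 GiB card shown in the log. It assumes the fraction is applied to the card's total memory. `memory_cap_gib` is a hypothetical helper written for this illustration, not a TensorFlow function.

```python
def memory_cap_gib(total_gib: float, fraction: float) -> float:
    """Return the memory cap in GiB for a given fraction of total GPU memory."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return total_gib * fraction


total_gib = 6.00  # totalMemory reported in the log
free_gib = 4.97   # freeMemory reported in the log

cap_gib = memory_cap_gib(total_gib, 0.7)
headroom_gib = free_gib - cap_gib

print(f"cap at 0.7: {cap_gib:.2f} GiB")
print(f"free memory left over: {headroom_gib:.2f} GiB")
```

Under that assumption, a 0.7 cap (about 4.2 GiB) fits inside the 4.97 GiB that was free, whereas the failing run asked for 4.63G in a single request.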

First, update the driver.

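Then configure the TensorFlow program.

Note: setting only one of the two options is not enough! I tried what an expert's blog post online suggested, and the result was still poor (a lot of GPU memory was still taken).

TensorFlow settings:

  • Claim memory on demand
  • Use at most 70% of GPU memory

The modified code: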

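import tensorflow as tf  # TensorFlow 1.x API

w = tf.Variable([[1.0, 2.0]])
b = tf.Variable([[2.0], [3.0]])

y = tf.multiply(w, b)

init_op = tf.global_variables_initializer()

# Cap this process at 70% of GPU memory
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
# Pass gpu_options into the config; otherwise the cap is never applied
config = tf.ConfigProto(allow_soft_placement=True, gpu_options=gpu_options)
# Allocate memory on demand instead of claiming it all up front
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
    sess.run(init_op)
    print(sess.run(y))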
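
As an aside, newer TensorFlow releases also read an environment variable, `TF_FORCE_GPU_ALLOW_GROWTH`, that turns on on-demand allocation without touching the session code. Whether it applies depends on your version: the logs in this post come from a 2018 build, which I assume predates it. A minimal sketch:

```python
import os

# Set this before TensorFlow initializes the GPU; doing it before the
# import is the safe choice.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import only after the variable is set
print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```

This covers only the on-demand setting; the 70% cap still has to be set in the program's own configuration.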

Problem solved:

2018-06-10 18:21:17.532630: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:21:17.852442: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:21:17.852817: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:21:18.511176: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:21:18.511397: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]  0 
2018-06-10 18:21:18.511544: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 
2018-06-10 18:21:18.511815: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
[[2. 4.]
 [3. 6.]]
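
The 2×2 result may look surprising for a one-row input and a one-column input. `tf.multiply` is element-wise and broadcasts the two shapes, (1, 2) and (2, 1), to (2, 2). Here is a plain-Python sketch of the same arithmetic, with no TensorFlow involved:

```python
w = [[1.0, 2.0]]    # shape (1, 2)
b = [[2.0], [3.0]]  # shape (2, 1)

# Broadcasting pairs every row of b with every column of w
y = [[b_row[0] * w_val for w_val in w[0]] for b_row in b]

print(y)
```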

