解决TensorFlow程序无限制占用GPU的方法


Posted in Python onJune 30, 2020

今天遇到一个奇怪的现象,使用tensorflow-gpu的时候,出现内存超额~~如果我训练什么大型数据也就算了,关键我就写了一个y=W*x…显示如下图所示:

程序如下:

import tensorflow as tf

w = tf.Variable([[1.0,2.0]])
b = tf.Variable([[2.],[3.]])

y = tf.multiply(w,b)

init_op = tf.global_variables_initializer()

with tf.Session() as sess:
 sess.run(init_op)
 print(sess.run(y))

出错提示:

占用的内存越来越多,程序崩溃之后,整个电脑都奔溃了,因为整个显卡全被吃了

2018-06-10 18:28:00.263424: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:28:00.598075: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:28:00.598453: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:28:01.265600: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:28:01.265826: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]  0 
2018-06-10 18:28:01.265971: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 
2018-06-10 18:28:01.266220: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-10 18:28:01.331056: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.63G (4970853120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.399111: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.17G (4473767936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.468293: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.75G (4026391040 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.533138: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.37G (3623751936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.602452: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.04G (3261376768 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.670225: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.73G (2935238912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.733120: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.46G (2641714944 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.800101: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.21G (2377543424 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.862064: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.99G (2139789056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.925434: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.79G (1925810176 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.986180: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.61G (1733229056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.043456: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.45G (1559906048 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.103531: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.31G (1403915520 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.168973: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.18G (1263524096 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.229387: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.06G (1137171712 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.292997: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 976.04M (1023454720 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.356714: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 878.44M (921109248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.418167: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 790.59M (828998400 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.482394: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 711.54M (746098688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

分析原因:

显卡驱动不是最新版本,用__驱动软件__更新一下驱动,或者自己去下载更新。

TF运行太多,注销全部程序冲洗打开。

由于TF内核编写的原因,默认占用全部的GPU去训练自己的东西,也就是像meiguo一样优先政策吧

这个时候我们得设置两个方面:

  • 选择什么样的占用方式?优先占用__还是__按需占用
  • 选择最大占用多少GPU,因为占用过大GPU会导致其它程序奔溃。最好在0.7以下

先更新驱动:

解决TensorFlow程序无限制占用GPU的方法

再设置TF程序:

注意:单独设置一个不行!按照网上大神博客试了,结果效果还是很差(占用很多GPU)

设置TF:

  • 按需占用
  • 最大占用70%GPU

修改代码如下:

import tensorflow as tf

w = tf.Variable([[1.0,2.0]])
b = tf.Variable([[2.],[3.]])

y = tf.multiply(w,b)

init_op = tf.global_variables_initializer()

config = tf.ConfigProto(allow_soft_placement=True)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
 sess.run(init_op)
 print(sess.run(y))

成功解决:

2018-06-10 18:21:17.532630: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:21:17.852442: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:21:17.852817: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:21:18.511176: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:21:18.511397: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]  0 
2018-06-10 18:21:18.511544: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 
2018-06-10 18:21:18.511815: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
[[2. 4.]
 [3. 6.]]

参考资料:

主要参考博客

错误实例

到此这篇关于解决TensorFlow程序无限制占用GPU的方法 的文章就介绍到这了,更多相关TensorFlow 占用GPU内容请搜索三水点靠木以前的文章或继续浏览下面的相关文章希望大家以后多多支持三水点靠木!

Python 相关文章推荐
pandas系列之DataFrame 行列数据筛选实例
Apr 12 Python
python字符串string的内置方法实例详解
May 14 Python
pytorch: tensor类型的构建与相互转换实例
Jul 26 Python
对Python subprocess.Popen子进程管道阻塞详解
Oct 29 Python
详解Python做一个名片管理系统
Mar 14 Python
python打开windows应用程序的实例
Jun 28 Python
python实现一个点绕另一个点旋转后的坐标
Dec 04 Python
Python多线程Threading、子线程与守护线程实例详解
Mar 24 Python
jupyter notebook 多环境conda kernel配置方式
Apr 10 Python
Python常用base64 md5 aes des crc32加密解密方法汇总
Nov 06 Python
BeautifulSoup获取指定class样式的div的实现
Dec 07 Python
在Ubuntu中安装并配置Pycharm教程的实现方法
Jan 06 Python
tensorflow 大于某个值为1,小于为0的实例
Jun 30 #Python
基于tf.shape(tensor)和tensor.shape()的区别说明
Jun 30 #Python
Tensorflow全局设置可见GPU编号操作
Jun 30 #Python
Python logging模块异步线程写日志实现过程解析
Jun 30 #Python
浅谈多卡服务器下隐藏部分 GPU 和 TensorFlow 的显存使用设置
Jun 30 #Python
Tensorflow中批量读取数据的案列分析及TFRecord文件的打包与读取
Jun 30 #Python
使用Tensorflow-GPU禁用GPU设置(CPU与GPU速度对比)
Jun 30 #Python
You might like
如何将一个表单同时提交到两个地方处理
2006/10/09 PHP
屏蔽浏览器缓存另类方法
2006/10/09 PHP
php下通过POST还是GET来传值
2008/06/05 PHP
PHP 得到根目录的 __FILE__ 常量
2008/07/23 PHP
ThinkPHP字符串函数及常用函数汇总
2014/07/18 PHP
php 三大特点:封装,继承,多态
2017/02/19 PHP
php检测mysql表是否存在的方法小结
2017/07/20 PHP
Thinkphp5.0框架视图view的循环标签用法示例
2019/10/12 PHP
php 中self,this的区别和操作方法实例分析
2019/11/04 PHP
javascript中对对层的控制
2006/12/29 Javascript
JQuery 初体验(建议学习jquery)
2009/04/25 Javascript
JavaScript定时器详解及实例
2013/08/01 Javascript
JQuery操作单选按钮以及复选按钮示例
2013/09/23 Javascript
JS的get和set使用示例
2014/02/20 Javascript
一些老手都不一定知道的JavaScript技巧
2014/05/06 Javascript
KnockoutJs快速入门教程
2016/05/16 Javascript
javascript 数组去重复(在线去重工具)
2016/12/17 Javascript
jQuery extend()详解及简单实例
2017/05/06 jQuery
js判断数组是否包含某个字符串变量的实例
2017/11/24 Javascript
python使用Image处理图片常用技巧分析
2015/06/01 Python
编写Python脚本抓取网络小说来制作自己的阅读器
2015/08/20 Python
Python对多属性的重复数据去重实例
2018/04/18 Python
Python面向对象之反射/自省机制实例分析
2018/08/24 Python
图文详解python安装Scrapy框架步骤
2019/05/20 Python
浅析Python3中的对象垃圾收集机制
2019/06/06 Python
Python实现京东抢秒杀功能
2021/01/25 Python
CSS实现定位元素居中的方法
2015/06/23 HTML / CSS
英国时尚高尔夫服装购物网站:Trendy Golf
2020/01/10 全球购物
override和overload的区别
2016/03/09 面试题
为什么说Ruby是一种真正的面向对象程序设计语言
2012/10/30 面试题
购房协议书范本
2014/04/11 职场文书
供货协议书范本
2014/04/22 职场文书
办公室务虚会发言材料
2014/10/20 职场文书
北京故宫导游词
2015/01/31 职场文书
地道战观后感500字
2015/06/04 职场文书
react合成事件与原生事件的相关理解
2021/05/13 Javascript