解决TensorFlow程序无限制占用GPU的方法


Posted in Python onJune 30, 2020

今天遇到一个奇怪的现象,使用tensorflow-gpu的时候,出现内存超额~~如果我训练什么大型数据也就算了,关键我就写了一个y=W*x…显示如下图所示:

程序如下:

import tensorflow as tf

w = tf.Variable([[1.0,2.0]])
b = tf.Variable([[2.],[3.]])

y = tf.multiply(w,b)

init_op = tf.global_variables_initializer()

with tf.Session() as sess:
 sess.run(init_op)
 print(sess.run(y))

出错提示:

占用的内存越来越多,程序崩溃之后,整个电脑都奔溃了,因为整个显卡全被吃了

2018-06-10 18:28:00.263424: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:28:00.598075: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:28:00.598453: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:28:01.265600: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:28:01.265826: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]  0 
2018-06-10 18:28:01.265971: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 
2018-06-10 18:28:01.266220: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-06-10 18:28:01.331056: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.63G (4970853120 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.399111: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 4.17G (4473767936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.468293: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.75G (4026391040 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.533138: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.37G (3623751936 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.602452: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 3.04G (3261376768 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.670225: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.73G (2935238912 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.733120: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.46G (2641714944 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.800101: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 2.21G (2377543424 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.862064: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.99G (2139789056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.925434: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.79G (1925810176 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:01.986180: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.61G (1733229056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.043456: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.45G (1559906048 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.103531: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.31G (1403915520 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.168973: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.18G (1263524096 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.229387: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 1.06G (1137171712 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.292997: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 976.04M (1023454720 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.356714: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 878.44M (921109248 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.418167: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 790.59M (828998400 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
2018-06-10 18:28:02.482394: E T:\src\github\tensorflow\tensorflow\stream_executor\cuda\cuda_driver.cc:936] failed to allocate 711.54M (746098688 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

分析原因:

显卡驱动不是最新版本,用__驱动软件__更新一下驱动,或者自己去下载更新。

TF运行太多,注销全部程序冲洗打开。

由于TF内核编写的原因,默认占用全部的GPU去训练自己的东西,也就是像meiguo一样优先政策吧

这个时候我们得设置两个方面:

  • 选择什么样的占用方式?优先占用__还是__按需占用
  • 选择最大占用多少GPU,因为占用过大GPU会导致其它程序奔溃。最好在0.7以下

先更新驱动:

解决TensorFlow程序无限制占用GPU的方法

再设置TF程序:

注意:单独设置一个不行!按照网上大神博客试了,结果效果还是很差(占用很多GPU)

设置TF:

  • 按需占用
  • 最大占用70%GPU

修改代码如下:

import tensorflow as tf

w = tf.Variable([[1.0,2.0]])
b = tf.Variable([[2.],[3.]])

y = tf.multiply(w,b)

init_op = tf.global_variables_initializer()

config = tf.ConfigProto(allow_soft_placement=True)
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
config.gpu_options.allow_growth = True
with tf.Session(config=config) as sess:
 sess.run(init_op)
 print(sess.run(y))

成功解决:

2018-06-10 18:21:17.532630: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-10 18:21:17.852442: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties: 
name: GeForce GTX 1060 major: 6 minor: 1 memoryClockRate(GHz): 1.6705
pciBusID: 0000:01:00.0
totalMemory: 6.00GiB freeMemory: 4.97GiB
2018-06-10 18:21:17.852817: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-10 18:21:18.511176: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-10 18:21:18.511397: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929]  0 
2018-06-10 18:21:18.511544: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N 
2018-06-10 18:21:18.511815: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4740 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060, pci bus id: 0000:01:00.0, compute capability: 6.1)
[[2. 4.]
 [3. 6.]]

参考资料:

主要参考博客

错误实例

到此这篇关于解决TensorFlow程序无限制占用GPU的方法 的文章就介绍到这了,更多相关TensorFlow 占用GPU内容请搜索三水点靠木以前的文章或继续浏览下面的相关文章希望大家以后多多支持三水点靠木!

Python 相关文章推荐
Python实现批量下载文件
May 17 Python
python中异常报错处理方法汇总
Nov 20 Python
python实现12306火车票查询器
Apr 20 Python
python实现pdf转换成word/txt纯文本文件
Jun 07 Python
Python实现多进程的四种方式
Feb 22 Python
Python安装whl文件过程图解
Feb 18 Python
利用pandas向一个csv文件追加写入数据的实现示例
Apr 23 Python
python golang中grpc 使用示例代码详解
Jun 03 Python
Scrapy模拟登录赶集网的实现代码
Jul 07 Python
Python sqlalchemy时间戳及密码管理实现代码详解
Aug 01 Python
Pandas中两个dataframe的交集和差集的示例代码
Dec 13 Python
详解python第三方库的安装、PyInstaller库、random库
Mar 03 Python
tensorflow 大于某个值为1,小于为0的实例
Jun 30 #Python
基于tf.shape(tensor)和tensor.shape()的区别说明
Jun 30 #Python
Tensorflow全局设置可见GPU编号操作
Jun 30 #Python
Python logging模块异步线程写日志实现过程解析
Jun 30 #Python
浅谈多卡服务器下隐藏部分 GPU 和 TensorFlow 的显存使用设置
Jun 30 #Python
Tensorflow中批量读取数据的案列分析及TFRecord文件的打包与读取
Jun 30 #Python
使用Tensorflow-GPU禁用GPU设置(CPU与GPU速度对比)
Jun 30 #Python
You might like
php设计模式 Builder(建造者模式)
2011/06/26 PHP
php接口与接口引用的深入解析
2013/08/09 PHP
php session的应用详细介绍
2017/03/22 PHP
php抽象类和接口知识点整理总结
2019/08/02 PHP
php中的依赖注入实例详解
2019/08/14 PHP
一实用的实现table排序的Javascript类库
2007/09/12 Javascript
jQuery实现在最后一个元素之前插入新元素的方法
2015/07/18 Javascript
jquery使用ul模拟select实现表单美化的方法
2015/08/18 Javascript
JS实现超简单的仿QQ折叠菜单效果
2015/09/21 Javascript
js倒计时简单实现代码
2016/08/11 Javascript
JavaScript 数据类型详解
2017/03/13 Javascript
vuejs 单文件组件.vue 文件的使用
2017/07/28 Javascript
vue自定义移动端touch事件之点击、滑动、长按事件
2018/07/10 Javascript
vue中的mescroll搜索运用及各种填坑处理
2019/10/30 Javascript
Node如何后台数据库使用增删改查功能
2019/11/21 Javascript
使用python实现接口的方法
2017/07/07 Python
带你认识Django
2019/01/15 Python
Flask框架请求钩子与request请求对象用法实例分析
2019/11/07 Python
python网络编程socket实现服务端、客户端操作详解
2020/03/24 Python
浅谈python 中的 type(), dtype(), astype()的区别
2020/04/09 Python
tensorflow pb to tflite 精度下降详解
2020/05/25 Python
python实现excel公式格式化的示例代码
2020/12/23 Python
浅析数据存储的三种方式 cookie sessionstorage localstorage 的异同
2020/06/04 HTML / CSS
新百伦折扣店:Joe’s New Balance Outlet
2016/08/20 全球购物
英国著名国际平价时尚男装品牌:Topman
2016/08/27 全球购物
国外平面设计第一市场:99designs
2016/10/25 全球购物
新西兰廉价汽车租赁:Snap Rentals
2018/09/14 全球购物
文明青少年标兵事迹材料
2014/01/28 职场文书
电子银行营销方案
2014/02/22 职场文书
《学会待客》教学反思
2014/02/22 职场文书
人力资源经理的岗位职责
2014/03/02 职场文书
售后服务承诺书
2014/03/26 职场文书
分居协议书范本
2014/11/03 职场文书
刘胡兰观后感
2015/06/16 职场文书
九年级历史教学反思
2016/02/19 职场文书
OpenCV-Python模板匹配人眼的实例
2021/06/08 Python