tensorflow实现残差网络方式(mnist数据集)


Posted in Python onMay 26, 2020

介绍

残差网络是何凯明大神的神作,效果非常好,深度可以达到1000层。但是,其实现起来并没有那末难,在这里以tensorflow作为框架,实现基于mnist数据集上的残差网络,当然只是比较浅层的。

如下图所示:

tensorflow实现残差网络方式(mnist数据集)

实线的Connection部分,表示通道相同,如上图的第一个粉色矩形和第三个粉色矩形,都是3x3x64的特征图,由于通道相同,所以采用计算方式为H(x)=F(x)+x

虚线的的Connection部分,表示通道不同,如上图的第一个绿色矩形和第三个绿色矩形,分别是3x3x64和3x3x128的特征图,通道不同,采用的计算方式为H(x)=F(x)+Wx,其中W是卷积操作,用来调整x维度的。

根据输入和输出尺寸是否相同,又分为identity_block和conv_block,每种block有上图两种模式,三卷积和二卷积,三卷积速度更快些,因此在这里选择该种方式。

具体实现见如下代码:

#tensorflow基于mnist数据集上的VGG11网络,可以直接运行
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
#tensorflow基于mnist实现VGG11
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

#x=mnist.train.images
#y=mnist.train.labels
#X=mnist.test.images
#Y=mnist.test.labels
x = tf.placeholder(tf.float32, [None,784])
y = tf.placeholder(tf.float32, [None, 10])
sess = tf.InteractiveSession()

def weight_variable(shape):
#这里是构建初始变量
 initial = tf.truncated_normal(shape, mean=0,stddev=0.1)
#创建变量
 return tf.Variable(initial)

def bias_variable(shape):
 initial = tf.constant(0.1, shape=shape)
 return tf.Variable(initial)

#在这里定义残差网络的id_block块,此时输入和输出维度相同
def identity_block(X_input, kernel_size, in_filter, out_filters, stage, block):
 """
 Implementation of the identity block as defined in Figure 3

 Arguments:
 X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
 kernel_size -- integer, specifying the shape of the middle CONV's window for the main path
 filters -- python list of integers, defining the number of filters in the CONV layers of the main path
 stage -- integer, used to name the layers, depending on their position in the network
 block -- string/character, used to name the layers, depending on their position in the network
 training -- train or test

 Returns:
 X -- output of the identity block, tensor of shape (n_H, n_W, n_C)
 """

 # defining name basis
 block_name = 'res' + str(stage) + block
 f1, f2, f3 = out_filters
 with tf.variable_scope(block_name):
  X_shortcut = X_input

  #first
  W_conv1 = weight_variable([1, 1, in_filter, f1])
  X = tf.nn.conv2d(X_input, W_conv1, strides=[1, 1, 1, 1], padding='SAME')
  b_conv1 = bias_variable([f1])
  X = tf.nn.relu(X+ b_conv1)

  #second
  W_conv2 = weight_variable([kernel_size, kernel_size, f1, f2])
  X = tf.nn.conv2d(X, W_conv2, strides=[1, 1, 1, 1], padding='SAME')
  b_conv2 = bias_variable([f2])
  X = tf.nn.relu(X+ b_conv2)

  #third

  W_conv3 = weight_variable([1, 1, f2, f3])
  X = tf.nn.conv2d(X, W_conv3, strides=[1, 1, 1, 1], padding='SAME')
  b_conv3 = bias_variable([f3])
  X = tf.nn.relu(X+ b_conv3)
  #final step
  add = tf.add(X, X_shortcut)
  b_conv_fin = bias_variable([f3])
  add_result = tf.nn.relu(add+b_conv_fin)

 return add_result


#这里定义conv_block模块,由于该模块定义时输入和输出尺度不同,故需要进行卷积操作来改变尺度,从而得以相加
def convolutional_block( X_input, kernel_size, in_filter,
    out_filters, stage, block, stride=2):
 """
 Implementation of the convolutional block as defined in Figure 4

 Arguments:
 X -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
 kernel_size -- integer, specifying the shape of the middle CONV's window for the main path
 filters -- python list of integers, defining the number of filters in the CONV layers of the main path
 stage -- integer, used to name the layers, depending on their position in the network
 block -- string/character, used to name the layers, depending on their position in the network
 training -- train or test
 stride -- Integer, specifying the stride to be used

 Returns:
 X -- output of the convolutional block, tensor of shape (n_H, n_W, n_C)
 """

 # defining name basis
 block_name = 'res' + str(stage) + block
 with tf.variable_scope(block_name):
  f1, f2, f3 = out_filters

  x_shortcut = X_input
  #first
  W_conv1 = weight_variable([1, 1, in_filter, f1])
  X = tf.nn.conv2d(X_input, W_conv1,strides=[1, stride, stride, 1],padding='SAME')
  b_conv1 = bias_variable([f1])
  X = tf.nn.relu(X + b_conv1)

  #second
  W_conv2 =weight_variable([kernel_size, kernel_size, f1, f2])
  X = tf.nn.conv2d(X, W_conv2, strides=[1,1,1,1], padding='SAME')
  b_conv2 = bias_variable([f2])
  X = tf.nn.relu(X+b_conv2)

  #third
  W_conv3 = weight_variable([1,1, f2,f3])
  X = tf.nn.conv2d(X, W_conv3, strides=[1, 1, 1,1], padding='SAME')
  b_conv3 = bias_variable([f3])
  X = tf.nn.relu(X+b_conv3)
  #shortcut path
  W_shortcut =weight_variable([1, 1, in_filter, f3])
  x_shortcut = tf.nn.conv2d(x_shortcut, W_shortcut, strides=[1, stride, stride, 1], padding='VALID')

  #final
  add = tf.add(x_shortcut, X)
  #建立最后融合的权重
  b_conv_fin = bias_variable([f3])
  add_result = tf.nn.relu(add+ b_conv_fin)


 return add_result



x = tf.reshape(x, [-1,28,28,1])
w_conv1 = weight_variable([2, 2, 1, 64])
x = tf.nn.conv2d(x, w_conv1, strides=[1, 2, 2, 1], padding='SAME')
b_conv1 = bias_variable([64])
x = tf.nn.relu(x+b_conv1)
#这里操作后变成14x14x64
x = tf.nn.max_pool(x, ksize=[1, 3, 3, 1],
    strides=[1, 1, 1, 1], padding='SAME')


#stage 2
x = convolutional_block(X_input=x, kernel_size=3, in_filter=64, out_filters=[64, 64, 256], stage=2, block='a', stride=1)
#上述conv_block操作后,尺寸变为14x14x256
x = identity_block(x, 3, 256, [64, 64, 256], stage=2, block='b' )
x = identity_block(x, 3, 256, [64, 64, 256], stage=2, block='c')
#上述操作后张量尺寸变成14x14x256
x = tf.nn.max_pool(x, [1, 2, 2, 1], strides=[1,2,2,1], padding='SAME')
#变成7x7x256
flat = tf.reshape(x, [-1,7*7*256])

w_fc1 = weight_variable([7 * 7 *256, 1024])
b_fc1 = bias_variable([1024])

h_fc1 = tf.nn.relu(tf.matmul(flat, w_fc1) + b_fc1)
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
w_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, w_fc2) + b_fc2


#建立损失函数,在这里采用交叉熵函数
cross_entropy = tf.reduce_mean(
 tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=y_conv))

train_step = tf.train.AdamOptimizer(1e-3).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
#初始化变量

sess.run(tf.global_variables_initializer())

print("cuiwei")
for i in range(2000):
 batch = mnist.train.next_batch(10)
 if i%100 == 0:
 train_accuracy = accuracy.eval(feed_dict={
 x:batch[0], y: batch[1], keep_prob: 1.0})
 print("step %d, training accuracy %g"%(i, train_accuracy))
 train_step.run(feed_dict={x: batch[0], y: batch[1], keep_prob: 0.5})

以上这篇tensorflow实现残差网络方式(mnist数据集)就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持三水点靠木。

Python 相关文章推荐
跟老齐学Python之数据类型总结
Sep 24 Python
Python 编码处理-str与Unicode的区别
Sep 06 Python
Python3调用微信企业号API发送文本消息代码示例
Nov 10 Python
python 自动重连wifi windows的方法
Dec 18 Python
python3 小数位的四舍五入(用两种方法解决round 遇5不进)
Apr 11 Python
CentOS6.9 Python环境配置(python2.7、pip、virtualenv)
May 06 Python
完美解决pycharm导入自己写的py文件爆红问题
Feb 12 Python
Pycharm连接远程服务器过程图解
Apr 30 Python
给ubuntu18安装python3.7的详细教程
Jun 08 Python
使用tkinter实现三子棋游戏
Feb 25 Python
linux中nohup和后台运行进程查看及终止
Jun 24 Python
Python通过loop.run_in_executor执行同步代码 同步变为异步
Apr 11 Python
Python中格式化字符串的四种实现
May 26 #Python
使用tensorflow实现VGG网络,训练mnist数据集方式
May 26 #Python
浅谈Tensorflow加载Vgg预训练模型的几个注意事项
May 26 #Python
Tensorflow加载Vgg预训练模型操作
May 26 #Python
PyQt5如何将.ui文件转换为.py文件的实例代码
May 26 #Python
TensorFlow实现模型断点训练,checkpoint模型载入方式
May 26 #Python
python 日志模块 日志等级设置失效的解决方案
May 26 #Python
You might like
菜鸟修复电子管记
2021/03/02 无线电
针对初学PHP者的疑难问答(1)
2006/10/09 PHP
php+javascript的日历控件
2009/11/19 PHP
Windows7下PHP开发环境安装配置图文方法
2010/05/20 PHP
php批量删除操作代码分享
2017/02/26 PHP
Laravel 5.4因特殊字段太长导致migrations报错的解决
2017/10/22 PHP
javascript写的日历类(基于pj)
2010/12/28 Javascript
Extjs中通过Tree加载右侧TabPanel具体实现
2013/05/05 Javascript
js实现页面跳转重定向的几种方式
2014/05/29 Javascript
Node.js 的异步 IO 性能探讨
2014/10/08 Javascript
IE下支持文本框和密码框placeholder效果的JQuery插件分享
2015/01/31 Javascript
jQuery操作表单常用控件方法小结
2015/03/23 Javascript
JS实现控制表格行文本对齐的方法
2015/03/30 Javascript
JS 日期与时间戮相互转化的简单实例
2016/06/22 Javascript
JS实现Ajax的方法分析
2016/12/20 Javascript
实现一个简单的vue无限加载指令方法
2017/01/10 Javascript
JavaScript使用FileReader实现图片上传预览效果
2020/03/27 Javascript
Vue中对iframe实现keep alive无刷新的方法
2019/07/23 Javascript
微信小程序如何引用外部js,外部样式,公共页面模板
2019/07/23 Javascript
[35:34]Liquid vs Winstrike 2018国际邀请赛小组赛BO2 第一场 8.18
2018/08/19 DOTA
python的描述符(descriptor)、装饰器(property)造成的一个无限递归问题分享
2014/07/09 Python
Python实现字符串反转的常用方法分析【4种方法】
2017/09/30 Python
Python装饰器原理与用法分析
2018/04/30 Python
python 自定义异常和异常捕捉的方法
2018/10/18 Python
python使用装饰器作日志处理的方法
2019/07/11 Python
Python 用matplotlib画以时间日期为x轴的图像
2019/08/06 Python
keras获得model中某一层的某一个Tensor的输出维度教程
2020/01/24 Python
Python利用Faiss库实现ANN近邻搜索的方法详解
2020/08/03 Python
python基本算法之实现归并排序(Merge sort)
2020/09/01 Python
COACH德国官方网站:纽约现代奢侈品牌,1941年
2018/06/09 全球购物
家长会学生家长演讲稿
2013/12/29 职场文书
采购部主管岗位职责
2014/01/01 职场文书
车间统计员岗位职责
2014/01/05 职场文书
2014年党员创先争优承诺书
2014/05/29 职场文书
个人四风问题对照检查材料
2014/09/26 职场文书
车辆委托书范本
2014/10/05 职场文书