Implementing a Residual Network in TensorFlow (MNIST Dataset)

Posted in Python on May 26, 2020

Introduction

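The residual network (ResNet) is the landmark work of Kaiming He. It performs very well and can reach a depth of 1,000 layers. Implementing one is not that hard, though. This article uses TensorFlow as the framework to build a residual network on the MNIST dataset, although only a fairly shallow one.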

The structure is shown in the figure below:

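The solid-line connections indicate that the channel counts match. For example, the first and third pink rectangles in the figure above are both 3x3x64 feature maps. Because the channels match, the block computes H(x) = F(x) + x.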

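The dashed-line connections indicate that the channel counts differ. For example, the first and third green rectangles in the figure above are 3x3x64 and 3x3x128 feature maps respectively. Because the channels differ, the block computes H(x) = F(x) + Wx, where W is a convolution that adjusts the dimensions of x.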
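To make the two shortcut types concrete, here is a minimal framework-free sketch that works on the channel vector of a single pixel. A 1x1 convolution acts on every pixel independently as a matrix product over channels, so `project` below plays the role of W. The function names and the toy values are illustrative assumptions and do not appear in the original code.

```python
def project(x, w):
    """Apply a 1x1 convolution to one pixel: x has C_in channels, w is C_in x C_out."""
    c_out = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(c_out)]


def identity_shortcut(f_x, x):
    """H(x) = F(x) + x, valid only when both have the same number of channels."""
    if len(f_x) != len(x):
        raise ValueError("channel mismatch: use a projection shortcut instead")
    return [a + b for a, b in zip(f_x, x)]


def projection_shortcut(f_x, x, w):
    """H(x) = F(x) + Wx, where W maps x to the channel count of F(x)."""
    return identity_shortcut(f_x, project(x, w))


# Toy example (hypothetical values): x has 2 channels.
x = [1.0, 2.0]
f_same = [0.5, 0.5]        # F(x) with the same channel count as x
f_wide = [0.1, 0.2, 0.3]   # F(x) with more channels than x
w = [[1.0, 0.0, 1.0],      # hypothetical 2x3 projection weights
     [0.0, 1.0, 1.0]]

h_same = identity_shortcut(f_same, x)       # solid-line connection
h_wide = projection_shortcut(f_wide, x, w)  # dashed-line connection
```

Adding `f_wide` and `x` directly fails with a channel mismatch, which is exactly the case the dashed-line connection handles.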

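Depending on whether the input and output sizes match, blocks are further divided into identity_block and conv_block. Each kind of block comes in the two variants shown in the figure above, three-convolution and two-convolution. The three-convolution variant is somewhat faster, so it is the one used here.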
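One plausible reason the three-convolution variant is cheaper is its weight count. The sketch below compares a 1x1, 3x3, 1x1 block using the filter sizes from this article's code ([64, 64, 256] on a 256-channel input) with a two-convolution block of two 3x3 convolutions at 256 channels. The two-convolution configuration and the helper names are assumptions made for the comparison, and biases are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def bottleneck_params(c_in, filters, kernel=3):
    """Three-convolution block: 1x1 -> kernel x kernel -> 1x1."""
    f1, f2, f3 = filters
    return (conv_params(1, c_in, f1)
            + conv_params(kernel, f1, f2)
            + conv_params(1, f2, f3))


def basic_params(c_in, c_out, kernel=3):
    """Two-convolution block: two kernel x kernel convolutions."""
    return conv_params(kernel, c_in, c_out) + conv_params(kernel, c_out, c_out)


three_conv = bottleneck_params(256, [64, 64, 256])
two_conv = basic_params(256, 256)
print(three_conv, two_conv, round(two_conv / three_conv, 1))
# prints: 69632 1179648 16.9
```

Under these assumptions the three-convolution block needs roughly 17 times fewer weights, because the expensive 3x3 convolution runs on 64 channels instead of 256.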

The implementation is given in the following code:

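# A shallow residual network on the MNIST dataset with TensorFlow.
# Requires TensorFlow 1.x (tf.placeholder and the tutorials.mnist loader
# were removed in TensorFlow 2.x).
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# Load MNIST with one-hot encoded labels
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

# The raw arrays are also available if needed:
# mnist.train.images, mnist.train.labels
# mnist.test.images, mnist.test.labels

# Placeholders for flattened 28x28 images and one-hot labels
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
sess = tf.InteractiveSession()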

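def weight_variable(shape):
    # Initial values drawn from a truncated normal distribution
    initial = tf.truncated_normal(shape, mean=0, stddev=0.1)
    # Create the trainable variable
    return tf.Variable(initial)


def bias_variable(shape):
    # Biases start at a small positive constant
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)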

# The identity block of the residual network: input and output dimensions match
def identity_block(X_input, kernel_size, in_filter, out_filters, stage, block):
    """
    Implementation of the identity block.

    Arguments:
    X_input -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    kernel_size -- integer, specifying the shape of the middle CONV's window for the main path
    in_filter -- integer, number of channels of the input tensor
    out_filters -- python list of three integers, the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network

    Returns:
    add_result -- output of the identity block, tensor of shape (m, n_H, n_W, f3)
    """

    # defining name basis
    block_name = 'res' + str(stage) + block
    f1, f2, f3 = out_filters
    with tf.variable_scope(block_name):
        # The shortcut is the unchanged input, so f3 must equal in_filter
        X_shortcut = X_input

        # first: 1x1 convolution
        W_conv1 = weight_variable([1, 1, in_filter, f1])
        X = tf.nn.conv2d(X_input, W_conv1, strides=[1, 1, 1, 1], padding='SAME')
        b_conv1 = bias_variable([f1])
        X = tf.nn.relu(X + b_conv1)

        # second: kernel_size x kernel_size convolution
        W_conv2 = weight_variable([kernel_size, kernel_size, f1, f2])
        X = tf.nn.conv2d(X, W_conv2, strides=[1, 1, 1, 1], padding='SAME')
        b_conv2 = bias_variable([f2])
        X = tf.nn.relu(X + b_conv2)

        # third: 1x1 convolution
        # Note: ReLU is applied here before the addition; the reference ResNet
        # design applies the activation only after the addition.
        W_conv3 = weight_variable([1, 1, f2, f3])
        X = tf.nn.conv2d(X, W_conv3, strides=[1, 1, 1, 1], padding='SAME')
        b_conv3 = bias_variable([f3])
        X = tf.nn.relu(X + b_conv3)

        # final step: H(x) = F(x) + x, followed by a bias and ReLU
        add = tf.add(X, X_shortcut)
        b_conv_fin = bias_variable([f3])
        add_result = tf.nn.relu(add + b_conv_fin)

    return add_result


# The conv_block: input and output dimensions differ, so the shortcut needs
# its own convolution to change the dimensions before the two can be added
def convolutional_block(X_input, kernel_size, in_filter,
                        out_filters, stage, block, stride=2):
    """
    Implementation of the convolutional block.

    Arguments:
    X_input -- input tensor of shape (m, n_H_prev, n_W_prev, n_C_prev)
    kernel_size -- integer, specifying the shape of the middle CONV's window for the main path
    in_filter -- integer, number of channels of the input tensor
    out_filters -- python list of three integers, the number of filters in the CONV layers of the main path
    stage -- integer, used to name the layers, depending on their position in the network
    block -- string/character, used to name the layers, depending on their position in the network
    stride -- integer, specifying the stride to be used

    Returns:
    add_result -- output of the convolutional block, tensor of shape (m, n_H, n_W, f3)
    """

    # defining name basis
    block_name = 'res' + str(stage) + block
    with tf.variable_scope(block_name):
        f1, f2, f3 = out_filters

        x_shortcut = X_input

        # first: 1x1 convolution, applies the stride
        W_conv1 = weight_variable([1, 1, in_filter, f1])
        X = tf.nn.conv2d(X_input, W_conv1, strides=[1, stride, stride, 1], padding='SAME')
        b_conv1 = bias_variable([f1])
        X = tf.nn.relu(X + b_conv1)

        # second: kernel_size x kernel_size convolution
        W_conv2 = weight_variable([kernel_size, kernel_size, f1, f2])
        X = tf.nn.conv2d(X, W_conv2, strides=[1, 1, 1, 1], padding='SAME')
        b_conv2 = bias_variable([f2])
        X = tf.nn.relu(X + b_conv2)

        # third: 1x1 convolution
        W_conv3 = weight_variable([1, 1, f2, f3])
        X = tf.nn.conv2d(X, W_conv3, strides=[1, 1, 1, 1], padding='SAME')
        b_conv3 = bias_variable([f3])
        X = tf.nn.relu(X + b_conv3)

        # shortcut path: 1x1 convolution W with the same stride as the main path
        W_shortcut = weight_variable([1, 1, in_filter, f3])
        x_shortcut = tf.nn.conv2d(x_shortcut, W_shortcut, strides=[1, stride, stride, 1], padding='VALID')

        # final: H(x) = F(x) + Wx
        add = tf.add(x_shortcut, X)
        # bias applied to the merged result
        b_conv_fin = bias_variable([f3])
        add_result = tf.nn.relu(add + b_conv_fin)

    return add_result



# Build the network on a separate name (net) so that x keeps pointing at the
# input placeholder, which the feed_dict in the training loop relies on
net = tf.reshape(x, [-1, 28, 28, 1])
w_conv1 = weight_variable([2, 2, 1, 64])
net = tf.nn.conv2d(net, w_conv1, strides=[1, 2, 2, 1], padding='SAME')
b_conv1 = bias_variable([64])
net = tf.nn.relu(net + b_conv1)
# The tensor is now 14x14x64; the stride-1 pooling below keeps that size
net = tf.nn.max_pool(net, ksize=[1, 3, 3, 1],
                     strides=[1, 1, 1, 1], padding='SAME')


# stage 2
net = convolutional_block(X_input=net, kernel_size=3, in_filter=64, out_filters=[64, 64, 256], stage=2, block='a', stride=1)
# After the conv_block the tensor is 14x14x256
net = identity_block(net, 3, 256, [64, 64, 256], stage=2, block='b')
net = identity_block(net, 3, 256, [64, 64, 256], stage=2, block='c')
# The identity blocks leave the size at 14x14x256
net = tf.nn.max_pool(net, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
# Now 7x7x256
flat = tf.reshape(net, [-1, 7 * 7 * 256])

# Fully connected layer with 1024 units
w_fc1 = weight_variable([7 * 7 * 256, 1024])
b_fc1 = bias_variable([1024])

h_fc1 = tf.nn.relu(tf.matmul(flat, w_fc1) + b_fc1)
# Dropout; keep_prob is fed at run time (0.5 for training, 1.0 for evaluation)
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
# Output layer: one logit per digit class
w_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
y_conv = tf.matmul(h_fc1_drop, w_fc2) + b_fc2


# Loss function: softmax cross-entropy
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=y_conv))

train_step = tf.train.AdamOptimizer(1e-3).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Initialize the variables
sess.run(tf.global_variables_initializer())

# Train for 2000 steps with a batch size of 10
for i in range(2000):
    batch = mnist.train.next_batch(10)
    if i % 100 == 0:
        # Report accuracy on the current batch with dropout disabled
        train_accuracy = accuracy.eval(feed_dict={
            x: batch[0], y: batch[1], keep_prob: 1.0})
        print("step %d, training accuracy %g" % (i, train_accuracy))
    train_step.run(feed_dict={x: batch[0], y: batch[1], keep_prob: 0.5})
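
The tensor sizes noted in the comments of the listing above can be checked without running TensorFlow. The sketch below uses TensorFlow's documented output-size rules for 'SAME' and 'VALID' padding; the helper names are illustrative assumptions. It also shows why mixing 'SAME' on the main path with 'VALID' on the 1x1 shortcut convolution is harmless: for a 1x1 kernel the two rules give the same size.

```python
import math


def same_out(size, stride):
    """Spatial size after a 'SAME' convolution or pooling: ceil(size / stride)."""
    return math.ceil(size / stride)


def valid_out(size, kernel, stride):
    """Spatial size after a 'VALID' op: ceil((size - kernel + 1) / stride)."""
    return math.ceil((size - kernel + 1) / stride)


size = 28                  # MNIST images are 28x28
size = same_out(size, 2)   # first 2x2 convolution, stride 2
size = same_out(size, 1)   # 3x3 max pooling, stride 1
size = same_out(size, 1)   # stage 2 blocks, all with stride 1
size = same_out(size, 2)   # final 2x2 max pooling, stride 2
flat_features = size * size * 256
print(size, flat_features)
# prints: 7 12544

# For a 1x1 kernel, 'VALID' and 'SAME' produce the same spatial size
print(valid_out(14, 1, 2) == same_out(14, 2))
# prints: True
```

The result matches the `7 * 7 * 256` used when flattening the features before the fully connected layer.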


- Author -

Tom Hardy
