keras 实现轻量级网络ShuffleNet教程


Posted in Python onJune 19, 2020

ShuffleNet是由旷世发表的一个计算效率极高的CNN架构,它是专门为计算能力非常有限的移动设备(例如,10-150 MFLOPs)而设计的。该结构利用组卷积和信道混洗两种新的运算方法,在保证计算精度的同时,大大降低了计算成本。ImageNet分类和MS COCO对象检测实验表明,在40 MFLOPs的计算预算下,ShuffleNet的性能优于其他结构,例如,在ImageNet分类任务上,ShuffleNet的top-1 error 7.8%比最近的MobileNet低。在基于arm的移动设备上,ShuffleNet比AlexNet实际加速了13倍,同时保持了相当的准确性。

Paper:ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile

Github:https://github.com/zjn-ai/ShuffleNet-keras

网络架构

组卷积

组卷积其实早在AlexNet中就用过了,当时因为GPU的显存不足因而利用组卷积分配到两个GPU上训练。简单来讲,组卷积就是将输入特征图按照通道方向均分成多个大小一致的特征图,如下图所示左面是输入特征图右面是均分后的特征图,然后对得到的每一个特征图进行正常的卷积操作,最后将输出特征图按照通道方向拼接起来就可以了。

keras 实现轻量级网络ShuffleNet教程

目前很多框架都支持组卷积,但是tensorflow真的不知道在想什么,到现在还是不支持组卷积,只能自己写,因此效率肯定不及其他框架原生支持的方法。组卷积层的代码编写思路就与上面所说的原理完全一致,代码如下。

def _group_conv(x, filters, kernel, stride, groups):
 """
 Group convolution
 # Arguments
  x: Tensor, input tensor of with `channels_last` or 'channels_first' data format
  filters: Integer, number of output channels
  kernel: An integer or tuple/list of 2 integers, specifying the
   width and height of the 2D convolution window.
  strides: An integer or tuple/list of 2 integers,
   specifying the strides of the convolution along the width and height.
   Can be a single integer to specify the same value for
   all spatial dimensions.
  groups: Integer, number of groups per channel
  
 # Returns
  Output tensor
 """
 channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
 in_channels = K.int_shape(x)[channel_axis]
 
 # number of input channels per group
 nb_ig = in_channels // groups
 # number of output channels per group
 nb_og = filters // groups
 
 gc_list = []
 # Determine whether the number of filters is divisible by the number of groups
 assert filters % groups == 0
 
 for i in range(groups):
  if channel_axis == -1:
   x_group = Lambda(lambda z: z[:, :, :, i * nb_ig: (i + 1) * nb_ig])(x)
  else:
   x_group = Lambda(lambda z: z[:, i * nb_ig: (i + 1) * nb_ig, :, :])(x)
  gc_list.append(Conv2D(filters=nb_og, kernel_size=kernel, strides=stride, 
        padding='same', use_bias=False)(x_group))
  
 return Concatenate(axis=channel_axis)(gc_list)

通道混洗

通道混洗是这篇paper的重点,尽管组卷积大量减少了计算量和参数,但是通道之间的信息交流也受到了限制因而模型精度肯定会受到影响,因此作者提出通道混洗,在不增加参数量和计算量的基础上加强通道之间的信息交流,如下图所示。

keras 实现轻量级网络ShuffleNet教程

通道混洗层的代码实现很巧妙参考了别人的实现方法。通过下面的代码说明,d代表特征图的通道序号,x是经过通道混洗后的通道顺序。

>>> d = np.array([0,1,2,3,4,5,6,7,8]) 
>>> x = np.reshape(d, (3,3)) 
>>> x = np.transpose(x, [1,0]) # 转置
>>> x = np.reshape(x, (9,)) # 平铺
'[0 1 2 3 4 5 6 7 8] --> [0 3 6 1 4 7 2 5 8]'

利用keras后端实现代码:

def _channel_shuffle(x, groups):
 """
 Channel shuffle layer
 
 # Arguments
  x: Tensor, input tensor of with `channels_last` or 'channels_first' data format
  groups: Integer, number of groups per channel
  
 # Returns
  Shuffled tensor
 """
 
 if K.image_data_format() == 'channels_last':
  height, width, in_channels = K.int_shape(x)[1:]
  channels_per_group = in_channels // groups
  pre_shape = [-1, height, width, groups, channels_per_group]
  dim = (0, 1, 2, 4, 3)
  later_shape = [-1, height, width, in_channels]
 else:
  in_channels, height, width = K.int_shape(x)[1:]
  channels_per_group = in_channels // groups
  pre_shape = [-1, groups, channels_per_group, height, width]
  dim = (0, 2, 1, 3, 4)
  later_shape = [-1, in_channels, height, width]
 
 x = Lambda(lambda z: K.reshape(z, pre_shape))(x)
 x = Lambda(lambda z: K.permute_dimensions(z, dim))(x) 
 x = Lambda(lambda z: K.reshape(z, later_shape))(x)
 
 return x

ShuffleNet Unit

ShuffleNet的主要构成单元。下图中,a图为深度可分离卷积的基本架构,b图为1步长时用的单元,c图为2步长时用的单元。

keras 实现轻量级网络ShuffleNet教程

ShuffleNet架构

注意,对于第二阶段(Stage2),作者没有在第一个1×1卷积上应用组卷积,因为输入通道的数量相对较少。

keras 实现轻量级网络ShuffleNet教程

环境

Python 3.6

Tensorlow 1.13.1

Keras 2.2.4

实现

支持channel first或channel last

# -*- coding: utf-8 -*-
"""
Created on Thu Apr 25 18:26:41 2019
@author: zjn
"""
import numpy as np
from keras.callbacks import LearningRateScheduler
from keras.models import Model
from keras.layers import Input, Conv2D, Dropout, Dense, GlobalAveragePooling2D, Concatenate, AveragePooling2D
from keras.layers import Activation, BatchNormalization, add, Reshape, ReLU, DepthwiseConv2D, MaxPooling2D, Lambda
from keras.utils.vis_utils import plot_model
from keras import backend as K
from keras.optimizers import SGD
 
def _group_conv(x, filters, kernel, stride, groups):
 """
 Group convolution
 
 # Arguments
  x: Tensor, input tensor of with `channels_last` or 'channels_first' data format
  filters: Integer, number of output channels
  kernel: An integer or tuple/list of 2 integers, specifying the
   width and height of the 2D convolution window.
  strides: An integer or tuple/list of 2 integers,
   specifying the strides of the convolution along the width and height.
   Can be a single integer to specify the same value for
   all spatial dimensions.
  groups: Integer, number of groups per channel
  
 # Returns
  Output tensor
 """
 
 channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
 in_channels = K.int_shape(x)[channel_axis]
 
 # number of input channels per group
 nb_ig = in_channels // groups
 # number of output channels per group
 nb_og = filters // groups
 
 gc_list = []
 # Determine whether the number of filters is divisible by the number of groups
 assert filters % groups == 0
 
 for i in range(groups):
  if channel_axis == -1:
   x_group = Lambda(lambda z: z[:, :, :, i * nb_ig: (i + 1) * nb_ig])(x)
  else:
   x_group = Lambda(lambda z: z[:, i * nb_ig: (i + 1) * nb_ig, :, :])(x)
  gc_list.append(Conv2D(filters=nb_og, kernel_size=kernel, strides=stride, 
        padding='same', use_bias=False)(x_group))
  
 return Concatenate(axis=channel_axis)(gc_list)
def _channel_shuffle(x, groups):
 """
 Channel shuffle layer
 
 # Arguments
  x: Tensor, input tensor of with `channels_last` or 'channels_first' data format
  groups: Integer, number of groups per channel
  
 # Returns
  Shuffled tensor
 """
 if K.image_data_format() == 'channels_last':
  height, width, in_channels = K.int_shape(x)[1:]
  channels_per_group = in_channels // groups
  pre_shape = [-1, height, width, groups, channels_per_group]
  dim = (0, 1, 2, 4, 3)
  later_shape = [-1, height, width, in_channels]
 else:
  in_channels, height, width = K.int_shape(x)[1:]
  channels_per_group = in_channels // groups
  pre_shape = [-1, groups, channels_per_group, height, width]
  dim = (0, 2, 1, 3, 4)
  later_shape = [-1, in_channels, height, width]
 
 x = Lambda(lambda z: K.reshape(z, pre_shape))(x)
 x = Lambda(lambda z: K.permute_dimensions(z, dim))(x) 
 x = Lambda(lambda z: K.reshape(z, later_shape))(x)
 
 return x
 
def _shufflenet_unit(inputs, filters, kernel, stride, groups, stage, bottleneck_ratio=0.25):
 """
 ShuffleNet unit
 
 # Arguments
  inputs: Tensor, input tensor of with `channels_last` or 'channels_first' data format
  filters: Integer, number of output channels
  kernel: An integer or tuple/list of 2 integers, specifying the
   width and height of the 2D convolution window.
  strides: An integer or tuple/list of 2 integers,
   specifying the strides of the convolution along the width and height.
   Can be a single integer to specify the same value for
   all spatial dimensions.
  groups: Integer, number of groups per channel
  stage: Integer, stage number of ShuffleNet
  bottleneck_channels: Float, bottleneck ratio implies the ratio of bottleneck channels to output channels
   
 # Returns
  Output tensor
  
 # Note
  For Stage 2, we(authors of shufflenet) do not apply group convolution on the first pointwise layer 
  because the number of input channels is relatively small.
 """
 channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
 in_channels = K.int_shape(inputs)[channel_axis]
 bottleneck_channels = int(filters * bottleneck_ratio)
 
 if stage == 2:
  x = Conv2D(filters=bottleneck_channels, kernel_size=kernel, strides=1,
     padding='same', use_bias=False)(inputs)
 else:
  x = _group_conv(inputs, bottleneck_channels, (1, 1), 1, groups)
 x = BatchNormalization(axis=channel_axis)(x)
 x = ReLU()(x)
 
 x = _channel_shuffle(x, groups)
 x = DepthwiseConv2D(kernel_size=kernel, strides=stride, depth_multiplier=1, 
      padding='same', use_bias=False)(x)
 x = BatchNormalization(axis=channel_axis)(x)
  
 if stride == 2:
  x = _group_conv(x, filters - in_channels, (1, 1), 1, groups)
  x = BatchNormalization(axis=channel_axis)(x)
  avg = AveragePooling2D(pool_size=(3, 3), strides=2, padding='same')(inputs)
  x = Concatenate(axis=channel_axis)([x, avg])
 else:
  x = _group_conv(x, filters, (1, 1), 1, groups)
  x = BatchNormalization(axis=channel_axis)(x)
  x = add([x, inputs])
 return x
 
def _stage(x, filters, kernel, groups, repeat, stage):
 """
 Stage of ShuffleNet
 
 # Arguments
  x: Tensor, input tensor of with `channels_last` or 'channels_first' data format
  filters: Integer, number of output channels
  kernel: An integer or tuple/list of 2 integers, specifying the
   width and height of the 2D convolution window.
  strides: An integer or tuple/list of 2 integers,
   specifying the strides of the convolution along the width and height.
   Can be a single integer to specify the same value for
   all spatial dimensions.
  groups: Integer, number of groups per channel
  repeat: Integer, total number of repetitions for a shuffle unit in every stage
  stage: Integer, stage number of ShuffleNet
  
 # Returns
  Output tensor
 """
 x = _shufflenet_unit(x, filters, kernel, 2, groups, stage)
 
 for i in range(1, repeat):
  x = _shufflenet_unit(x, filters, kernel, 1, groups, stage)
 return x
 
def ShuffleNet(input_shape, classes):
 """
 ShuffleNet architectures
 
 # Arguments
  input_shape: An integer or tuple/list of 3 integers, shape
   of input tensor
  k: Integer, number of classes to predict
  
 # Returns
  A keras model
 """
 inputs = Input(shape=input_shape)
 
 x = Conv2D(24, (3, 3), strides=2, padding='same', use_bias=True, activation='relu')(inputs)
 x = MaxPooling2D(pool_size=(3, 3), strides=2, padding='same')(x)
 
 x = _stage(x, filters=384, kernel=(3, 3), groups=8, repeat=4, stage=2)
 x = _stage(x, filters=768, kernel=(3, 3), groups=8, repeat=8, stage=3)
 x = _stage(x, filters=1536, kernel=(3, 3), groups=8, repeat=4, stage=4)
 
 x = GlobalAveragePooling2D()(x)
 
 x = Dense(classes)(x)
 predicts = Activation('softmax')(x)
 model = Model(inputs, predicts)
 return model
 
if __name__ == '__main__':
 model = ShuffleNet((224, 224, 3), 1000)
 #plot_model(model, to_file='ShuffleNet.png', show_shapes=True)

以上这篇keras 实现轻量级网络ShuffleNet教程就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持三水点靠木。

Python 相关文章推荐
用map函数来完成Python并行任务的简单示例
Apr 02 Python
python单元测试unittest实例详解
May 11 Python
python中字符串类型json操作的注意事项
May 02 Python
Python模拟随机游走图形效果示例
Feb 06 Python
Python实现读取Properties配置文件的方法
Mar 29 Python
详解django中使用定时任务的方法
Sep 27 Python
python re正则匹配网页中图片url地址的方法
Dec 20 Python
详解python函数的闭包问题(内部函数与外部函数详述)
May 17 Python
Windows下pycharm创建Django 项目(虚拟环境)过程解析
Sep 16 Python
如何用Matplotlib 画三维图的示例代码
Jul 28 Python
python实现每天自动签到领积分的示例代码
Aug 18 Python
Python数据分析入门之数据读取与存储
May 13 Python
Python爬虫实现HTTP网络请求多种实现方式
Jun 19 #Python
Keras设置以及获取权重的实现
Jun 19 #Python
Python包和模块的分发详细介绍
Jun 19 #Python
浅谈Keras中shuffle和validation_split的顺序
Jun 19 #Python
Python爬虫headers处理及网络超时问题解决方案
Jun 19 #Python
sklearn和keras的数据切分与交叉验证的实例详解
Jun 19 #Python
Python虚拟环境的创建和包下载过程分析
Jun 19 #Python
You might like
PHP手机号码归属地查询代码(API接口/mysql)
2012/09/04 PHP
php中strstr、strrchr、substr、stristr四个函数的区别总结
2014/09/22 PHP
php通过asort()给关联数组按照值排序的方法
2015/03/18 PHP
PHP实现通过URL提取根域名
2016/03/31 PHP
用PHP的socket实现客户端到服务端的通信实例详解
2017/02/04 PHP
mailto的使用技巧分享
2012/12/21 Javascript
巧用局部变量提升javascript性能
2014/02/24 Javascript
js四舍五入数学函数round使用实例
2014/05/09 Javascript
JavaScript限定图片显示大小的方法
2015/03/11 Javascript
jQuery.position()方法获取不到值的安全替换方法
2015/03/13 Javascript
js文本框输入内容智能提示效果
2015/12/02 Javascript
深入理解React中es6创建组件this的方法
2016/08/29 Javascript
a标签跳转到指定div,jquery添加和移除class属性的实现方法
2016/10/10 Javascript
让浏览器崩溃的12行JS代码(DoS攻击分析及防御)
2016/10/10 Javascript
js 轮播效果实例分享
2016/12/28 Javascript
jQuery实现对象转为url参数的方法
2017/01/11 Javascript
JavaScript中的toString()和toLocaleString()方法的区别
2017/02/15 Javascript
详解Vue中的基本语法和常用指令
2019/07/23 Javascript
JS实现点击发送验证码 xx秒后重新发送功能
2019/07/30 Javascript
24个ES6方法解决JS实际开发问题(小结)
2020/05/31 Javascript
vue中封装axios并实现api接口的统一管理
2020/12/25 Vue.js
python操作gmail实例
2015/01/14 Python
Python基于正则表达式实现检查文件内容的方法【文件检索】
2017/08/30 Python
全面了解Nginx, WSGI, Flask之间的关系
2018/01/09 Python
python实现word 2007文档转换为pdf文件
2018/03/15 Python
Django实现学员管理系统
2019/02/26 Python
Python3转换html到pdf的不同解决方案
2019/03/11 Python
Python 合并多个TXT文件并统计词频的实现
2019/08/23 Python
解决Atom安装Hydrogen无法运行python3的问题
2019/08/28 Python
python 协程中的迭代器,生成器原理及应用实例详解
2019/10/28 Python
Python爬取爱奇艺电影信息代码实例
2019/11/26 Python
日语求职信范文
2013/12/17 职场文书
2014年小学植树节活动方案
2014/03/02 职场文书
保护环境的标语
2014/06/09 职场文书
详解MySQL的Seconds_Behind_Master
2021/05/18 MySQL
ajax请求前端跨域问题原因及解决方案
2021/10/16 Javascript