Implementing the Lightweight ShuffleNet Network in Keras: A Tutorial


Posted in Python on June 19, 2020

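ShuffleNet is a highly computation-efficient CNN architecture published by Megvii, designed specifically for mobile devices with very limited computing power (for example, 10-150 MFLOPs). The architecture uses two new operations, group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection show that ShuffleNet outperforms other architectures; for example, under a computation budget of 40 MFLOPs, its top-1 error on ImageNet classification is 7.8% (absolute) lower than that of the recent MobileNet. On an ARM-based mobile device, ShuffleNet achieves about a 13x actual speedup over AlexNet while maintaining comparable accuracy.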

Paper: ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

GitHub: https://github.com/zjn-ai/ShuffleNet-keras

Network architecture

Group convolution

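Group convolution was already used in AlexNet, where limited GPU memory led the authors to split the network across two GPUs for training. Put simply, group convolution splits the input feature map evenly along the channel axis into several feature maps of equal size (in the figure below, the input feature map is on the left and the split feature maps are on the right), applies an ordinary convolution to each of them, and finally concatenates the output feature maps along the channel axis.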

[Figure: group convolution, with the input feature map on the left and the evenly split feature maps on the right]
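The split, convolve, and concatenate steps can be sketched in plain NumPy for the 1x1 case, where a convolution reduces to a matrix product over the channel axis. This is only an illustrative sketch: the function name `grouped_pointwise_conv` and the random weights are assumptions made for the example, not part of the original code.

```python
import numpy as np


def grouped_pointwise_conv(x, weights):
    """Apply a grouped 1x1 convolution to a channels_last array.

    x: array of shape (batch, height, width, in_channels)
    weights: list with one (in_per_group, out_per_group) matrix per group
    """
    groups = len(weights)
    in_channels = x.shape[-1]
    assert in_channels % groups == 0, "in_channels must be divisible by groups"
    in_per_group = in_channels // groups

    outputs = []
    for g, w in enumerate(weights):
        # Slice out the channels that belong to group g.
        x_g = x[..., g * in_per_group:(g + 1) * in_per_group]
        # A 1x1 convolution is a matrix product over the channel axis.
        outputs.append(x_g @ w)
    # Stitch the per-group outputs back together along the channel axis.
    return np.concatenate(outputs, axis=-1)


rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 4, 6))
# 3 groups, each mapping 2 input channels to 3 output channels (6 -> 9 overall).
weights = [rng.standard_normal((2, 3)) for _ in range(3)]
y = grouped_pointwise_conv(x, weights)
print(y.shape)  # (2, 4, 4, 9)
```

Each output channel only sees the input channels of its own group, which is equivalent to an ordinary 1x1 convolution whose weight matrix is block-diagonal.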

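Many frameworks support group convolution natively, but TensorFlow, at least in the version used here, still does not, so it has to be written by hand, and the result is likely to be less efficient than a natively supported implementation. The group convolution layer below follows exactly the principle described above.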

def _group_conv(x, filters, kernel, stride, groups):
    """
    Group convolution

    # Arguments
        x: Tensor, input tensor with `channels_last` or `channels_first` data format
        filters: Integer, number of output channels
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        stride: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the width and height.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        groups: Integer, number of groups the channels are split into

    # Returns
        Output tensor
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    in_channels = K.int_shape(x)[channel_axis]

    # Both the input and output channel counts must be divisible by the number of groups
    assert in_channels % groups == 0
    assert filters % groups == 0

    # number of input channels per group
    nb_ig = in_channels // groups
    # number of output channels per group
    nb_og = filters // groups

    gc_list = []
    for i in range(groups):
        # Bind i as a default argument so each Lambda keeps its own group index
        if channel_axis == -1:
            x_group = Lambda(lambda z, i=i: z[:, :, :, i * nb_ig: (i + 1) * nb_ig])(x)
        else:
            x_group = Lambda(lambda z, i=i: z[:, i * nb_ig: (i + 1) * nb_ig, :, :])(x)
        gc_list.append(Conv2D(filters=nb_og, kernel_size=kernel, strides=stride,
                              padding='same', use_bias=False)(x_group))

    return Concatenate(axis=channel_axis)(gc_list)
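To get a feel for the savings, the weight count of a bias-free convolution can be computed with and without grouping. The sketch below is a back-of-the-envelope helper; the name `conv_params` is hypothetical, and the 384 to 96 channel sizes are chosen only to resemble a bottleneck layer with 8 groups.

```python
def conv_params(in_channels, out_channels, kernel_size, groups=1):
    """Number of weights in a bias-free 2D convolution with a square kernel."""
    assert in_channels % groups == 0 and out_channels % groups == 0
    # Each group connects in_channels/groups inputs to out_channels/groups outputs.
    per_group = kernel_size * kernel_size * (in_channels // groups) * (out_channels // groups)
    return per_group * groups


standard = conv_params(384, 96, kernel_size=1)           # 36864 weights
grouped = conv_params(384, 96, kernel_size=1, groups=8)  # 4608 weights
print(standard, grouped, standard // grouped)            # 36864 4608 8
```

With g groups, the weight count (and the multiply-add count at a fixed spatial size) drops by a factor of g.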

Channel shuffle

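Channel shuffle is the key contribution of the paper. Although group convolution greatly reduces computation and parameters, it also restricts the flow of information between channels, which hurts model accuracy. The authors therefore propose channel shuffle, which strengthens information exchange between channels without adding parameters or computation, as shown in the figure below.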

[Figure: channel shuffle]

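The code for the channel shuffle layer is quite clever and is adapted from other people's implementations. The snippet below illustrates the idea: d holds the channel indices of the feature map, and x is the channel order after the shuffle.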

>>> import numpy as np
>>> d = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8])
>>> x = np.reshape(d, (3, 3))
>>> x = np.transpose(x, [1, 0])  # transpose
>>> x = np.reshape(x, (9,))  # flatten
>>> x
array([0, 3, 6, 1, 4, 7, 2, 5, 8])

Implementation using the Keras backend:

def _channel_shuffle(x, groups):
    """
    Channel shuffle layer

    # Arguments
        x: Tensor, input tensor with `channels_last` or `channels_first` data format
        groups: Integer, number of groups the channels are split into

    # Returns
        Shuffled tensor
    """

    if K.image_data_format() == 'channels_last':
        height, width, in_channels = K.int_shape(x)[1:]
        channels_per_group = in_channels // groups
        # split the channel axis into (groups, channels_per_group), then swap the two
        pre_shape = [-1, height, width, groups, channels_per_group]
        dim = (0, 1, 2, 4, 3)
        later_shape = [-1, height, width, in_channels]
    else:
        in_channels, height, width = K.int_shape(x)[1:]
        channels_per_group = in_channels // groups
        pre_shape = [-1, groups, channels_per_group, height, width]
        dim = (0, 2, 1, 3, 4)
        later_shape = [-1, in_channels, height, width]

    x = Lambda(lambda z: K.reshape(z, pre_shape))(x)
    x = Lambda(lambda z: K.permute_dimensions(z, dim))(x)
    x = Lambda(lambda z: K.reshape(z, later_shape))(x)

    return x
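The same reshape, transpose, reshape sequence can be checked on a 4D array without Keras. The following NumPy sketch assumes the `channels_last` layout; the function name `channel_shuffle` is chosen for the example and is not part of the original code.

```python
import numpy as np


def channel_shuffle(x, groups):
    """Shuffle the channels of a channels_last array (batch, height, width, channels)."""
    batch, height, width, channels = x.shape
    assert channels % groups == 0, "channels must be divisible by groups"
    per_group = channels // groups
    # Split the channel axis into (groups, per_group) ...
    x = x.reshape(batch, height, width, groups, per_group)
    # ... swap the two new axes ...
    x = x.transpose(0, 1, 2, 4, 3)
    # ... and flatten them back into a single channel axis.
    return x.reshape(batch, height, width, channels)


# Every pixel stores its own channel index, so the shuffle is easy to read off.
x = np.broadcast_to(np.arange(9), (1, 2, 2, 9)).copy()
y = channel_shuffle(x, groups=3)
print(y[0, 0, 0])  # [0 3 6 1 4 7 2 5 8]
```

After the shuffle, each run of three consecutive channels contains one channel from every original group, so the next group convolution sees information from all groups.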

ShuffleNet Unit

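The ShuffleNet unit is the main building block of ShuffleNet. In the figure below, (a) is the basic unit built on depthwise separable convolution, (b) is the unit used with stride 1, and (c) is the unit used with stride 2.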

[Figure: ShuffleNet units. (a) basic unit with depthwise separable convolution, (b) stride-1 unit, (c) stride-2 unit]
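The cost advantage of the unit can be made concrete with the complexity formulas given in the ShuffleNet paper for a stride-1 unit with input size c x h x w and m bottleneck channels: hw(2cm + 9m^2) for a ResNet bottleneck, hw(2cm + 9m^2/g) for ResNeXt, and hw(2cm/g + 9m) for ShuffleNet. The sketch below simply evaluates them; the function names and the example sizes (28 x 28, c = 384, m = 96, g = 8) are assumptions picked for illustration.

```python
def resnet_unit_flops(h, w, c, m):
    """Multiply-adds of a ResNet bottleneck unit: two 1x1 convs plus a dense 3x3 conv."""
    return h * w * (2 * c * m + 9 * m * m)


def resnext_unit_flops(h, w, c, m, g):
    """ResNeXt: only the 3x3 convolution is grouped."""
    return h * w * (2 * c * m + 9 * m * m // g)


def shufflenet_unit_flops(h, w, c, m, g):
    """ShuffleNet: grouped 1x1 convs and a depthwise 3x3 conv."""
    return h * w * (2 * c * m // g + 9 * m)


h, w, c, m, g = 28, 28, 384, 96, 8
print("ResNet    :", resnet_unit_flops(h, w, c, m))
print("ResNeXt   :", resnext_unit_flops(h, w, c, m, g))
print("ShuffleNet:", shufflenet_unit_flops(h, w, c, m, g))
```

For a fixed budget, the much cheaper unit lets ShuffleNet use wider feature maps, which matters most for very small networks.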

ShuffleNet architecture

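Note that for Stage 2 the authors do not apply group convolution to the first 1×1 convolution, because the number of input channels is relatively small.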

[Figure: ShuffleNet architecture]
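As a rough check of the overall layout, the feature map sizes can be traced by hand. The sketch below assumes the configuration used in the implementation that follows (224 x 224 input, 'same' padding everywhere, a stride-2 first convolution with 24 filters, stride-2 max pooling, and stage widths 384/768/1536 for 8 groups); the helper names are hypothetical.

```python
import math


def same_padding_out(size, stride):
    """Spatial size after a 'same'-padded layer with the given stride."""
    return math.ceil(size / stride)


def trace_shapes(input_size=224, stage_filters=(384, 768, 1536)):
    """Return (name, spatial_size, channels) after each downsampling step."""
    shapes = []
    size = same_padding_out(input_size, 2)  # first 3x3 convolution, stride 2
    shapes.append(("conv1", size, 24))
    size = same_padding_out(size, 2)        # 3x3 max pooling, stride 2
    shapes.append(("maxpool", size, 24))
    for stage, filters in enumerate(stage_filters, start=2):
        # Only the first unit of each stage has stride 2; the rest keep the size.
        size = same_padding_out(size, 2)
        shapes.append(("stage%d" % stage, size, filters))
    return shapes


for name, size, channels in trace_shapes():
    print(name, (size, size, channels))
```

Each stage halves the spatial resolution and doubles the channel count, ending at 7 x 7 x 1536 before global average pooling.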

Environment

Python 3.6

TensorFlow 1.13.1

Keras 2.2.4

Implementation

Supports both the channels_first and channels_last data formats.

# -*- coding: utf-8 -*-
"""
Created on Thu Apr 25 18:26:41 2019
@author: zjn
"""
from keras.models import Model
from keras.layers import Input, Conv2D, Dense, GlobalAveragePooling2D, Concatenate, AveragePooling2D
from keras.layers import Activation, BatchNormalization, add, ReLU, DepthwiseConv2D, MaxPooling2D, Lambda
from keras.utils.vis_utils import plot_model
from keras import backend as K
 
def _group_conv(x, filters, kernel, stride, groups):
    """
    Group convolution

    # Arguments
        x: Tensor, input tensor with `channels_last` or `channels_first` data format
        filters: Integer, number of output channels
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the 2D convolution window.
        stride: An integer or tuple/list of 2 integers,
            specifying the strides of the convolution along the width and height.
            Can be a single integer to specify the same value for
            all spatial dimensions.
        groups: Integer, number of groups the channels are split into

    # Returns
        Output tensor
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    in_channels = K.int_shape(x)[channel_axis]

    # Both the input and output channel counts must be divisible by the number of groups
    assert in_channels % groups == 0
    assert filters % groups == 0

    # number of input channels per group
    nb_ig = in_channels // groups
    # number of output channels per group
    nb_og = filters // groups

    gc_list = []
    for i in range(groups):
        # Bind i as a default argument so each Lambda keeps its own group index
        if channel_axis == -1:
            x_group = Lambda(lambda z, i=i: z[:, :, :, i * nb_ig: (i + 1) * nb_ig])(x)
        else:
            x_group = Lambda(lambda z, i=i: z[:, i * nb_ig: (i + 1) * nb_ig, :, :])(x)
        gc_list.append(Conv2D(filters=nb_og, kernel_size=kernel, strides=stride,
                              padding='same', use_bias=False)(x_group))

    return Concatenate(axis=channel_axis)(gc_list)


def _channel_shuffle(x, groups):
    """
    Channel shuffle layer

    # Arguments
        x: Tensor, input tensor with `channels_last` or `channels_first` data format
        groups: Integer, number of groups the channels are split into

    # Returns
        Shuffled tensor
    """
    if K.image_data_format() == 'channels_last':
        height, width, in_channels = K.int_shape(x)[1:]
        channels_per_group = in_channels // groups
        # split the channel axis into (groups, channels_per_group), then swap the two
        pre_shape = [-1, height, width, groups, channels_per_group]
        dim = (0, 1, 2, 4, 3)
        later_shape = [-1, height, width, in_channels]
    else:
        in_channels, height, width = K.int_shape(x)[1:]
        channels_per_group = in_channels // groups
        pre_shape = [-1, groups, channels_per_group, height, width]
        dim = (0, 2, 1, 3, 4)
        later_shape = [-1, in_channels, height, width]

    x = Lambda(lambda z: K.reshape(z, pre_shape))(x)
    x = Lambda(lambda z: K.permute_dimensions(z, dim))(x)
    x = Lambda(lambda z: K.reshape(z, later_shape))(x)

    return x
 
def _shufflenet_unit(inputs, filters, kernel, stride, groups, stage, bottleneck_ratio=0.25):
    """
    ShuffleNet unit

    # Arguments
        inputs: Tensor, input tensor with `channels_last` or `channels_first` data format
        filters: Integer, number of output channels
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the depthwise convolution window.
        stride: Integer, stride of the depthwise convolution (1 or 2).
        groups: Integer, number of groups the channels are split into
        stage: Integer, stage number of ShuffleNet
        bottleneck_ratio: Float, ratio of bottleneck channels to output channels

    # Returns
        Output tensor

    # Note
        For Stage 2, the ShuffleNet authors do not apply group convolution on the
        first pointwise layer because the number of input channels is relatively small.
    """
    channel_axis = 1 if K.image_data_format() == 'channels_first' else -1
    in_channels = K.int_shape(inputs)[channel_axis]
    bottleneck_channels = int(filters * bottleneck_ratio)

    if stage == 2:
        # Stage 2: plain (ungrouped) 1x1 convolution for the first pointwise layer
        x = Conv2D(filters=bottleneck_channels, kernel_size=(1, 1), strides=1,
                   padding='same', use_bias=False)(inputs)
    else:
        x = _group_conv(inputs, bottleneck_channels, (1, 1), 1, groups)
    x = BatchNormalization(axis=channel_axis)(x)
    x = ReLU()(x)

    x = _channel_shuffle(x, groups)
    x = DepthwiseConv2D(kernel_size=kernel, strides=stride, depth_multiplier=1,
                        padding='same', use_bias=False)(x)
    x = BatchNormalization(axis=channel_axis)(x)

    if stride == 2:
        # The pooled shortcut contributes in_channels channels, so the residual
        # branch only needs to produce the remaining filters - in_channels
        x = _group_conv(x, filters - in_channels, (1, 1), 1, groups)
        x = BatchNormalization(axis=channel_axis)(x)
        avg = AveragePooling2D(pool_size=(3, 3), strides=2, padding='same')(inputs)
        x = Concatenate(axis=channel_axis)([x, avg])
    else:
        x = _group_conv(x, filters, (1, 1), 1, groups)
        x = BatchNormalization(axis=channel_axis)(x)
        x = add([x, inputs])
    # Final ReLU after the merge, as in the unit diagram of the ShuffleNet paper
    x = ReLU()(x)
    return x
 
def _stage(x, filters, kernel, groups, repeat, stage):
    """
    Stage of ShuffleNet

    # Arguments
        x: Tensor, input tensor with `channels_last` or `channels_first` data format
        filters: Integer, number of output channels
        kernel: An integer or tuple/list of 2 integers, specifying the
            width and height of the depthwise convolution window.
        groups: Integer, number of groups the channels are split into
        repeat: Integer, total number of shuffle units in the stage,
            including the first stride-2 unit
        stage: Integer, stage number of ShuffleNet

    # Returns
        Output tensor
    """
    # The first unit of every stage downsamples with stride 2
    x = _shufflenet_unit(x, filters, kernel, 2, groups, stage)

    # The remaining units keep the spatial size (stride 1)
    for _ in range(1, repeat):
        x = _shufflenet_unit(x, filters, kernel, 1, groups, stage)
    return x
 
def ShuffleNet(input_shape, classes):
    """
    ShuffleNet architecture

    # Arguments
        input_shape: Tuple/list of 3 integers, shape of the input tensor
        classes: Integer, number of classes to predict

    # Returns
        A Keras model
    """
    inputs = Input(shape=input_shape)

    x = Conv2D(24, (3, 3), strides=2, padding='same', use_bias=True, activation='relu')(inputs)
    x = MaxPooling2D(pool_size=(3, 3), strides=2, padding='same')(x)

    x = _stage(x, filters=384, kernel=(3, 3), groups=8, repeat=4, stage=2)
    x = _stage(x, filters=768, kernel=(3, 3), groups=8, repeat=8, stage=3)
    x = _stage(x, filters=1536, kernel=(3, 3), groups=8, repeat=4, stage=4)

    x = GlobalAveragePooling2D()(x)

    x = Dense(classes)(x)
    predicts = Activation('softmax')(x)
    model = Model(inputs, predicts)
    return model
 
if __name__ == '__main__':
    model = ShuffleNet((224, 224, 3), 1000)
    # plot_model(model, to_file='ShuffleNet.png', show_shapes=True)
