
Understanding how to use the TensorFlow softmax function

Posted in Python on June 30, 2020

The softmax function is defined as follows:

def softmax(logits, axis=None, name=None, dim=None):
    """Computes softmax activations.

    This function performs the equivalent of

        softmax = tf.exp(logits) / tf.reduce_sum(tf.exp(logits), axis)

    Args:
      logits: A non-empty `Tensor`. Must be one of the following types: `half`,
        `float32`, `float64`.
      axis: The dimension softmax would be performed on. The default is -1 which
        indicates the last dimension.
      name: A name for the operation (optional).
      dim: Deprecated alias for `axis`.

    Returns:
      A `Tensor`. Has the same type and shape as `logits`.

    Raises:
      InvalidArgumentError: if `logits` is empty or `axis` is beyond the last
        dimension of `logits`.
    """
    axis = deprecation.deprecated_argument_lookup("axis", axis, "dim", dim)
    if axis is None:
        axis = -1
    return _softmax(logits, gen_nn_ops.softmax, axis, name)

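softmax returns a tensor with the same shape as its input. Since the shape is unchanged, what exactly does softmax do to the tensor?

The answer: softmax picks one axis and, for every fixed combination of indices on the other axes, exponentiates the values along that axis (the activation) and divides them by their sum (the normalization).

In most cases this axis is the dimension that represents the classes (tf.nn.softmax defaults to axis=-1, i.e. the last dimension).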
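The formula from the docstring can be sketched directly in NumPy. This only illustrates the math and is not TensorFlow's actual kernel; the helper name `naive_softmax` and the sample array `x` are made up for this example.

```python
import numpy as np

def naive_softmax(logits, axis=-1):
    """Illustrative helper (not a TensorFlow API): exp(x) / sum(exp(x)) along `axis`."""
    e = np.exp(logits)
    # keepdims=True keeps the reduced axis so the division broadcasts correctly
    return e / np.sum(e, axis=axis, keepdims=True)

# Made-up sample input: 2 rows, 3 columns
x = np.array([[1.0, 2.0, 3.0],
              [1.0, 1.0, 1.0]])

out_last = naive_softmax(x)           # default axis=-1: normalize each row
out_first = naive_softmax(x, axis=0)  # axis=0: normalize each column

print(out_last.shape == x.shape)      # the shape is unchanged
print(out_last.sum(axis=-1))          # each row sums to 1 (up to rounding)
print(out_first.sum(axis=0))          # each column sums to 1 (up to rounding)
```

Only the choice of axis differs between the two calls: it decides which groups of values are normalized together.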

Example:

import numpy as np


def softmax(X, theta=1.0, axis=None):
    """
    Compute the softmax of each element along an axis of X.

    Parameters
    ----------
    X: ND-Array. Probably should be floats.
    theta (optional): float parameter, used as a multiplier
        prior to exponentiation. Default = 1.0
    axis (optional): axis to compute values along. Default is the
        first non-singleton axis.

    Returns an array the same size as X. The result will sum to 1
    along the specified axis.
    """
    # make X at least 2d
    y = np.atleast_2d(X)

    # find the first axis whose length is greater than 1
    if axis is None:
        axis = next(j[0] for j in enumerate(y.shape) if j[1] > 1)

    # multiply y by the theta parameter
    y = y * float(theta)

    # subtract the max for numerical stability
    y = y - np.expand_dims(np.max(y, axis=axis), axis)

    # exponentiate y
    y = np.exp(y)

    # take the sum along the specified axis
    ax_sum = np.expand_dims(np.sum(y, axis=axis), axis)

    # finally: divide elementwise
    p = y / ax_sum

    # flatten if X was 1D
    if np.ndim(X) == 1:
        p = p.flatten()

    return p

c = np.random.randn(2, 3)
print(c)
# Assume axis 0 is the class dimension: there are 2 classes
cc = softmax(c, axis=0)
# Assume the last axis is the class dimension: there are 3 classes
ccc = softmax(c, axis=-1)
print(cc)
print(ccc)

Output:

c:
[[-1.30022268 0.59127472 1.21384177]
 [ 0.1981082 -0.83686108 -1.54785864]]
cc:
[[0.1826746 0.80661068 0.94057075]
 [0.8173254 0.19338932 0.05942925]]
ccc:
[[0.0500392 0.33172426 0.61823654]
 [0.65371718 0.23222472 0.1140581 ]]

As shown, applying softmax along axis=0 makes the output sum to 1 along axis=0 (e.g. 0.1826746 + 0.8173254); likewise, applying it along axis=1 makes the output sum to 1 along axis=1 (e.g. 0.0500392 + 0.33172426 + 0.61823654).
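These sums can be checked for every column and row at once. The sketch below simply re-types the printed `cc` and `ccc` values, so the sums are exact only up to the rounding of the printout.

```python
import numpy as np

# Values copied from the printed output above
cc = np.array([[0.1826746, 0.80661068, 0.94057075],
               [0.8173254, 0.19338932, 0.05942925]])
ccc = np.array([[0.0500392, 0.33172426, 0.61823654],
                [0.65371718, 0.23222472, 0.1140581]])

print(cc.sum(axis=0))   # one sum per column, each close to 1
print(ccc.sum(axis=1))  # one sum per row, each close to 1

# Summing along the other axis does not give 1 in general
print(cc.sum(axis=1))
```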

How are these values obtained?

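Take cc (softmax along axis=0) as an example. Each entry is normalized over its column:

cc[i, j] = exp(c[i, j]) / (exp(c[0, j]) + exp(c[1, j]))

For instance, cc[0, 0] = exp(-1.30022268) / (exp(-1.30022268) + exp(0.1981082)) = 0.1826746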

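Take ccc (softmax along axis=1) as an example. Each entry is normalized over its row:

ccc[i, j] = exp(c[i, j]) / (exp(c[i, 0]) + exp(c[i, 1]) + exp(c[i, 2]))

For instance, ccc[0, 0] = exp(-1.30022268) / (exp(-1.30022268) + exp(0.59127472) + exp(1.21384177)) = 0.0500392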
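The calculation can be reproduced by hand from the printed `c`. The sketch below recomputes `cc[0, 0]` and the first row of `ccc`. It also checks that subtracting the maximum first, as the helper function does, leaves the result unchanged. The variable names are chosen for this illustration only.

```python
import numpy as np

# Input matrix copied from the printed output above
c = np.array([[-1.30022268, 0.59127472, 1.21384177],
              [0.1981082, -0.83686108, -1.54785864]])

# cc[0, 0]: normalize over column 0 (axis=0)
col = c[:, 0]
cc_00 = np.exp(col[0]) / np.exp(col).sum()

# ccc[0, :]: normalize over row 0 (axis=1)
row = c[0]
ccc_0 = np.exp(row) / np.exp(row).sum()

# Subtracting the maximum first gives the same result, because the
# common factor exp(-max) cancels between numerator and denominator
shifted = row - row.max()
ccc_0_shifted = np.exp(shifted) / np.exp(shifted).sum()

print(cc_00)   # approximately 0.1826746
print(ccc_0)   # approximately [0.0500392, 0.33172426, 0.61823654]
print(np.allclose(ccc_0, ccc_0_shifted))
```

The shift only matters numerically: with very large inputs, `np.exp` can overflow unless the maximum is subtracted first.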

Now that we know how the values are computed, let's look at what they actually mean:

cc[0,0] is the probability P( label = 0 | value = [-1.30022268 0.1981082] = c[:,0] ) = 0.1826746

cc[1,0] is the probability P( label = 1 | value = [-1.30022268 0.1981082] = c[:,0] ) = 0.8173254

ccc[0,0] is the probability P( label = 0 | value = [-1.30022268 0.59127472 1.21384177] = c[0] ) = 0.0500392

ccc[0,1] is the probability P( label = 1 | value = [-1.30022268 0.59127472 1.21384177] = c[0] ) = 0.33172426

ccc[0,2] is the probability P( label = 2 | value = [-1.30022268 0.59127472 1.21384177] = c[0] ) = 0.61823654
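One common way to use these probabilities is to take the index with the largest probability as the predicted label. This is an assumption about typical usage rather than something the article's code does.

```python
import numpy as np

# ccc copied from the printed output above (classes on the last axis)
ccc = np.array([[0.0500392, 0.33172426, 0.61823654],
                [0.65371718, 0.23222472, 0.1140581]])

# For each row, pick the class index with the highest probability
predicted = np.argmax(ccc, axis=-1)
print(predicted)  # [2 0]
```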

Extending this to more dimensions: suppose c is a 3-D tensor of shape [batch_size, timesteps, categories]

output = tf.nn.softmax(c,axis=-1)

Then output[1, 2, 3] represents P( label = 3 | value = c[1,2] ).
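The 3-D case can be checked with a small NumPy stand-in for `tf.nn.softmax(c, axis=-1)`. The sizes (batch_size=2, timesteps=3, categories=4) and the random input are hypothetical values picked for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical sizes: batch_size=2, timesteps=3, categories=4
c = rng.standard_normal((2, 3, 4))

# NumPy stand-in for tf.nn.softmax(c, axis=-1)
e = np.exp(c - c.max(axis=-1, keepdims=True))
output = e / e.sum(axis=-1, keepdims=True)

# output[1, 2, 3] depends only on the vector c[1, 2]
v = c[1, 2]
p = np.exp(v[3]) / np.exp(v).sum()

print(output.shape)                    # (2, 3, 4)
print(np.isclose(output[1, 2, 3], p))
print(output.sum(axis=-1))             # every entry is 1 up to rounding
```

Each `c[b, t]` vector is normalized independently, so the batch and timestep dimensions never mix.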


- Author -

ASR_THU
