编程 Python

使用sklearn之LabelEncoder将Label标准化的方法

Posted in Python onJuly 11, 2018

LabelEncoder可以将标签分配一个0—n_classes-1之间的编码

将各种标签分配一个可数的连续编号：

>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6]) # Transform Categories Into Integers
array([0, 0, 1, 2], dtype=int64)
>>> le.inverse_transform([0, 0, 1, 2]) # Transform Integers Into Categories
array([1, 1, 2, 6])

>>> le = preprocessing.LabelEncoder()
>>> le.fit(["paris", "paris", "tokyo", "amsterdam"])
LabelEncoder()
>>> list(le.classes_)
['amsterdam', 'paris', 'tokyo']
>>> le.transform(["tokyo", "tokyo", "paris"]) # Transform Categories Into Integers
array([2, 2, 1], dtype=int64)
>>> list(le.inverse_transform([2, 2, 1])) #Transform Integers Into Categories
['tokyo', 'tokyo', 'paris']

将DataFrame中的所有ID标签转换成连续编号：

from sklearn.preprocessing import LabelEncoder
import numpy as np
import pandas as pd
df=pd.read_csv('testdata.csv',sep='|',header=None)

0 1 2 3 4 5
0 37 52 55 50 38 54
1 17 32 20 9 6 48
2 28 10 56 51 45 16
3 27 49 41 30 53 19
4 44 29 8 1 46 13
5 11 26 21 14 7 33
6 0 39 22 33 35 43
7 18 15 47 5 25 34
8 23 2 4 9 3 31
9 12 57 36 40 42 24

le = LabelEncoder()
le.fit(np.unique(df.values))
df.apply(le.transform)

0 1 2 3 4 5
0 37 52 55 50 38 54
1 17 32 20 9 6 48
2 28 10 56 51 45 16
3 27 49 41 30 53 19
4 44 29 8 1 46 13
5 11 26 21 14 7 33
6 0 39 22 33 35 43
7 18 15 47 5 25 34
8 23 2 4 9 3 31
9 12 57 36 40 42 24

将DataFrame中的每一行ID标签分别转换成连续编号：

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
class MultiColumnLabelEncoder:
 def __init__(self,columns = None):
 self.columns = columns # array of column names to encode
 def fit(self,X,y=None):
 return self # not relevant here
 def transform(self,X):
 '''
 Transforms columns of X specified in self.columns using
 LabelEncoder(). If no columns specified, transforms all
 columns in X.
 '''
 output = X.copy()
 if self.columns is not None:
  for col in self.columns:
  output[col] = LabelEncoder().fit_transform(output[col])
 else:
  for colname,col in output.iteritems():
  output[colname] = LabelEncoder().fit_transform(col)
 return output
 def fit_transform(self,X,y=None):
 return self.fit(X,y).transform(X)

MultiColumnLabelEncoder(columns = [0, 1, 2, 3, 4, 5]).fit_transform(df)

或者

df.apply(LabelEncoder().fit_transform)

0 1 2 3 4 5
0 8 8 8 7 5 9
1 3 5 2 2 1 8
2 7 1 9 8 7 1
3 6 7 6 4 9 2
4 9 4 1 0 8 0
5 1 3 3 3 2 5
6 0 6 4 5 4 7
7 4 2 7 1 3 6
8 5 0 0 2 0 4
9 2 9 5 6 6 3

# Create some toy data in a Pandas dataframe
fruit_data = pd.DataFrame({
 'fruit': ['apple','orange','pear','orange'],
 'color': ['red','orange','green','green'],
 'weight': [5,6,3,4]
})

color fruit weight
0 red apple 5
1 orange orange 6
2 green pear 3
3 green orange 4

MultiColumnLabelEncoder(columns = ['fruit','color']).fit_transform(fruit_data)

或者

fruit_data[['fruit','color']]=fruit_data[['fruit','color']].apply(LabelEncoder().fit_transform)

color fruit weight
0 2 0 5
1 1 1 6
2 0 2 3
3 0 1 4

以上这篇使用sklearn之LabelEncoder将Label标准化的方法就是小编分享给大家的全部内容了，希望能给大家一个参考，也希望大家多多支持三水点靠木。

使用sklearn之LabelEncoder将Label标准化的方法

- Author -

?大宝

声明：登载此文出于传递更多信息之目的，并不意味着赞同其观点或证实其描述。

Python 相关文章推荐

python实现哈希表

Feb 07 Python

python简单实现基于SSL的IRC bot实例

Jun 15 Python

在django中使用自定义标签实现分页功能

Jul 04 Python

Python装饰器用法示例小结

Feb 11 Python

Python实现查找字符串数组最长公共前缀示例

Mar 27 Python

Python实现字符串匹配的KMP算法

Apr 04 Python

Python3.8中使用f-strings调试

May 22 Python

python频繁写入文件时提速的方法

Jun 26 Python

详解python实现数据归一化处理的方式：（0,1）标准化

Jul 17 Python

python代码能做成软件吗

Jul 24 Python

Python偏函数实现原理及应用

Nov 20 Python

python爬虫爬取某网站视频的示例代码

Feb 20 Python

Python实现识别图片内容的方法分析

Jul 11 #Python

对python 数据处理中的LabelEncoder 和 OneHotEncoder详解

Jul 11 #Python

python对离散变量的one-hot编码方法

Jul 11 #Python

Python基于多线程操作数据库相关问题分析

Jul 11 #Python

pandas 按照特定顺序输出的实现代码

Jul 10 #Python

Python OpenCV处理图像之图像直方图和反向投影

Jul 10 #Python

Python中 map()函数的用法详解

Jul 10 #Python

You might like

ASP知识讲座四

2006/10/09 PHP

PHP中mb_convert_encoding与iconv函数的深入解析

2013/06/21 PHP

两个php日期控制类实例

2014/12/09 PHP

php $_SESSION会员登录实例分享

2021/01/19 PHP

thinkPHP+PHPExcel实现读取文件日期的方法(含时分秒)

2016/07/07 PHP

深入理解PHP JSON数组与对象

2016/07/19 PHP

php从数据库中读取特定的行(实例)

2017/06/02 PHP

JavaScript中Math对象使用说明

2008/01/16 Javascript

js中判断Object、Array、Function等引用类型对象是否相等

2012/08/29 Javascript

JQuery操作单选按钮以及复选按钮示例

2013/09/23 Javascript

在ASP.NET中使用JavaScript脚本的方法

2013/11/12 Javascript

浅谈jQuery的offset()方法及示例分享

2015/07/17 Javascript

JavaScript快速切换繁体中文和简体中文的方法及网站支持简繁体切换的绝招

2016/03/07 Javascript

ajax的分页查询示例（不刷新页面）

2017/01/11 Javascript

Node.js查找当前目录下文件夹实例代码

2017/03/07 Javascript

webpack4 升级迁移的实现

2018/09/12 Javascript

JS中数据结构与算法---排序算法(Sort Algorithm)实例详解

2019/06/17 Javascript

JS几个常用的函数和对象定义与用法示例

2020/01/15 Javascript

JS实现的定时器展示简单秒表、页面弹框及跳转操作完整示例

2020/01/26 Javascript

js禁止查看源文件屏蔽Ctrl+u/s、F12、右键等兼容IE火狐chrome

2020/10/01 Javascript

[02:39]DOTA2英雄基础教程天怒法师

2013/11/29 DOTA

基于python的汉字转GBK码实现代码

2012/02/19 Python

python实现bitmap数据结构详解

2014/02/17 Python

解决python3 urllib中urlopen报错的问题

2017/03/25 Python

Python 用三行代码提取PDF表格数据

2019/10/13 Python

CSS3中Animation属性的使用详解

2015/08/06 HTML / CSS

SmartBuyGlasses英国：购买太阳镜和眼镜

2018/01/29 全球购物

Pretty Little Thing美国：时尚女性服饰

2018/08/27 全球购物

美国农场商店：Blain’s Farm & Fleet

2020/01/17 全球购物

Herschel Supply Co.美国：背包、手提袋及配件

2020/11/24 全球购物

公立医院改革实施方案

2014/03/14 职场文书

2014党员民主评议个人总结

2014/09/10 职场文书

批评与自我批评总结

2014/10/17 职场文书

2016年教师政治思想表现评语

2015/12/02 职场文书

工作一年自我鉴定

2019/06/20 职场文书

Mysql如何查看是否使用到索引

2022/12/24 MySQL