How to Process Every Word in a File with Python 3

Posted in Python on May 22, 2015

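This article walks through an example of processing every word in a file with Python 3, shared here for reference. The implementation is as follows: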

'''
Created on Dec 21, 2012
Process every word in a file
@author: liury_lab
'''
import re

# Simplest case: a word is any run of non-whitespace characters.
# open() in text mode decodes UTF-8 and normalizes line endings.
with open('d:/text.txt', encoding='utf-8') as the_file:
    for line in the_file:
        for word in line.split():
            print(word, end='|')

# If the definition of a word changes, use a regular expression.
# Here a word is a sequence of alphanumerics, hyphens, or apostrophes.
print()
print('************************************************************************')
re_word = re.compile(r"[\w'-]+")
with open('d:/text.txt', encoding='utf-8') as the_file:
    for line in the_file:
        for word in re_word.finditer(line):
            print(word.group(0), end='|')

# Wrap the logic in a generator
def words_of_file(file_path, line_to_words=str.split):
    # Open the path that was passed in, not a hard-coded one
    with open(file_path, encoding='utf-8') as the_file:
        for line in the_file:
            for word in line_to_words(line):
                yield word

print()
print('************************************************************************')
for word in words_of_file('d:/text.txt'):
    print(word, end='|')

def words_by_re(file_path, repattern=r"[\w'-]+"):
    # Compile the caller's pattern; words_of_file does the file handling
    re_word = re.compile(repattern)

    def line_to_words(line):
        for mo in re_word.finditer(line):
            yield mo.group(0)  # the book used return here, which gave wrong results; changed to yield

    return words_of_file(file_path, line_to_words)

print()
print('************************************************************************')
for word in words_by_re('d:/text.txt'):
    print(word, end='|')
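
To see how the two word definitions differ, the sketch below runs both over the same text. It does not depend on d:/text.txt: it writes a small temporary sample file instead. The sample text and the names regex_words, WORD_RE, and sample_path are assumptions made for this illustration and are not part of the original listing.

```python
import os
import re
import tempfile


def words_of_file(file_path, line_to_words=str.split):
    """Yield every word in the file, reading one line at a time."""
    with open(file_path, encoding='utf-8') as the_file:
        for line in the_file:
            yield from line_to_words(line)


# Same word definition as the article: word characters, apostrophes, hyphens
WORD_RE = re.compile(r"[\w'-]+")


def regex_words(line):
    """Return every substring of line that matches WORD_RE."""
    return WORD_RE.findall(line)


# Hypothetical sample text, written to a temporary file for the demo
sample = "Don't panic, it's a well-known trick.\nSecond line: done!\n"
with tempfile.NamedTemporaryFile('w', encoding='utf-8', suffix='.txt',
                                 delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

try:
    by_split = list(words_of_file(sample_path))
    by_regex = list(words_of_file(sample_path, regex_words))
finally:
    os.remove(sample_path)  # clean up the temporary file

print(by_split)  # punctuation stays attached: 'panic,', 'trick.', 'line:'
print(by_regex)  # punctuation is dropped: 'panic', 'trick', 'line'
```

Splitting on whitespace leaves punctuation glued to the neighbouring word, while the regular expression keeps only the characters that belong to the chosen word definition.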

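The comment about replacing return with yield deserves a closer look. The book's version is not shown here, so the following is only a plausible reconstruction of what went wrong: with return, the helper exits on the first match and hands back a plain string, and the caller then iterates over that string character by character. The function names first_match_only and all_matches are hypothetical.

```python
import re

re_word = re.compile(r"[\w'-]+")


def first_match_only(line):
    # With return, the loop stops at the first match and returns a str
    for mo in re_word.finditer(line):
        return mo.group(0)


def all_matches(line):
    # With yield, the function is a generator that produces every match
    for mo in re_word.finditer(line):
        yield mo.group(0)


line = "hello big world"
with_return = list(first_match_only(line))  # iterates over the string 'hello'
with_yield = list(all_matches(line))

print(with_return)  # ['h', 'e', 'l', 'l', 'o']
print(with_yield)   # ['hello', 'big', 'world']
```

The return variant has a second problem: on a line with no matches it returns None, and iterating over None raises a TypeError.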
I hope this article helps with your Python programming.

- Author -

皮蛋
