
Captcha recognition with Python, OpenCV, and pytesseract

Posted in Python on August 28, 2020

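1. Environment setup

The examples need the pillow and pytesseract libraries; both can be installed with pip. They also import cv2, which is provided by the opencv-python package.

pip install pillow -i http://pypi.douban.com/simple --trusted-host pypi.douban.com
pip install pytesseract -i http://pypi.douban.com/simple --trusted-host pypi.douban.com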

Install Tesseract-OCR itself by running the Tesseract-OCR.exe installer.

Configuring pytesseract: locate pytesseract.py in the installed package, open it, find the tesseract_cmd variable, and change its value to the path of the tesseract.exe you just installed.
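The same variable can also be set at runtime instead of editing the installed file. The following is a minimal sketch of that approach, not the author's method: the helper names resolve_tesseract_cmd and configure_pytesseract are hypothetical, and any candidate path passed in is an assumption about where Tesseract was installed.

```python
import os
import shutil


def resolve_tesseract_cmd(candidates=()):
    """Return a path to the tesseract executable, or None if none is found.

    Looks on PATH first, then tries each candidate path in order.
    """
    found = shutil.which("tesseract")
    if found:
        return found
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=()):
    """Point pytesseract at the resolved executable and return the path used."""
    # Imported here so resolve_tesseract_cmd can be used without pytesseract.
    import pytesseract

    cmd = resolve_tesseract_cmd(candidates)
    if cmd is None:
        raise FileNotFoundError("tesseract executable not found")
    # Same variable the article edits by hand inside pytesseract.py.
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

With pytesseract installed, a call such as configure_pytesseract([r'C:\Program Files\Tesseract-OCR\tesseract.exe']) would do the configuration once at program start; that path is only an assumed example location and should be replaced with the real install directory.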


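2. Captcha recognition

Before a captcha can be recognized, the image has to be preprocessed to remove the lines and noise dots that interfere with OCR; this improves recognition accuracy.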

Example 1

import cv2 as cv
import pytesseract
from PIL import Image


def recognize_text(image):
    # Edge-preserving (mean shift) filter to remove noise
    dst = cv.pyrMeanShiftFiltering(image, sp=10, sr=150)
    # Convert to grayscale
    gray = cv.cvtColor(dst, cv.COLOR_BGR2GRAY)
    # Binarize with Otsu's threshold, inverted so that dark text becomes white
    ret, binary = cv.threshold(gray, 0, 255, cv.THRESH_BINARY_INV | cv.THRESH_OTSU)
    # Morphological operations: erode, then dilate (None selects the default 3x3 kernel)
    erode = cv.erode(binary, None, iterations=2)
    dilate = cv.dilate(erode, None, iterations=1)
    cv.imshow('dilate', dilate)
    # Invert in place: white background with black text is easier to recognize
    cv.bitwise_not(dilate, dilate)
    cv.imshow('binary-image', dilate)
    # Run OCR
    test_message = Image.fromarray(dilate)
    text = pytesseract.image_to_string(test_message)
    print(f'Recognition result: {text}')


src = cv.imread(r'./test/044.png')
cv.imshow('input image', src)
recognize_text(src)
cv.waitKey(0)
cv.destroyAllWindows()

Output:

Recognition result: 3n3D

Process finished with exit code 0
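The string returned by pytesseract.image_to_string can carry trailing whitespace or stray symbols, and a captcha is normally a single line of letters and digits. The sketch below shows one way to constrain and clean the result. The helper names and the CAPTCHA_CHARS alphabet are assumptions made for illustration; --psm 7 (treat the image as a single text line) and tessedit_char_whitelist are Tesseract options, though whether the whitelist is honored depends on the Tesseract version and engine in use.

```python
import string

# Assumed alphabet: the captchas contain only ASCII letters and digits.
CAPTCHA_CHARS = string.ascii_letters + string.digits


def build_tesseract_config(allowed=CAPTCHA_CHARS, psm=7):
    """Build a config string for pytesseract.image_to_string(..., config=...)."""
    return f"--psm {psm} -c tessedit_char_whitelist={allowed}"


def clean_captcha_text(raw, allowed=CAPTCHA_CHARS):
    """Drop whitespace and any character outside the allowed alphabet."""
    return "".join(ch for ch in raw if ch in allowed)


# Intended use inside recognize_text (requires pytesseract and Tesseract):
#   raw = pytesseract.image_to_string(test_message, config=build_tesseract_config())
#   text = clean_captcha_text(raw)
```

Filtering after OCR is the safer half of this sketch, since it works regardless of how Tesseract treats the whitelist.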


Example 2

import cv2 as cv
import pytesseract
from PIL import Image


def recognize_text(image):
    # Edge-preserving (mean shift) filter to remove noise
    blur = cv.pyrMeanShiftFiltering(image, sp=8, sr=60)
    cv.imshow('dst', blur)
    # Convert to grayscale
    gray = cv.cvtColor(blur, cv.COLOR_BGR2GRAY)
    # Binarize with Otsu's threshold, inverted so that dark text becomes white
    ret, binary = cv.threshold(gray, 0, 255, cv.THRESH_BINARY_INV | cv.THRESH_OTSU)
    print(f'Binarization threshold (Otsu): {ret}')
    cv.imshow('binary', binary)
    # Morphological opening with a 3x2 rectangular structuring element
    kernel = cv.getStructuringElement(cv.MORPH_RECT, (3, 2))
    bin1 = cv.morphologyEx(binary, cv.MORPH_OPEN, kernel)
    cv.imshow('bin1', bin1)
    # Second opening with a 2x3 rectangular structuring element
    kernel = cv.getStructuringElement(cv.MORPH_RECT, (2, 3))
    bin2 = cv.morphologyEx(bin1, cv.MORPH_OPEN, kernel)
    cv.imshow('bin2', bin2)
    # Invert in place: white background with black text is easier to recognize
    cv.bitwise_not(bin2, bin2)
    cv.imshow('binary-image', bin2)
    # Run OCR
    test_message = Image.fromarray(bin2)
    text = pytesseract.image_to_string(test_message)
    print(f'Recognition result: {text}')


src = cv.imread(r'./test/045.png')
cv.imshow('input image', src)
recognize_text(src)
cv.waitKey(0)
cv.destroyAllWindows()

Output:

Binarization threshold (Otsu): 181.0
Recognition result: 8A62N1

Process finished with exit code 0
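To see why an opening removes thin interference lines while keeping thicker strokes, it helps to look at the operation on a tiny binary grid. The following is a toy pure-Python sketch, not OpenCV's implementation: it uses the set definition of opening (keep every foreground pixel covered by some kernel-sized rectangle that fits entirely inside the foreground). The function name and the sample grid are made up for illustration, and OpenCV's border handling can differ for shapes that touch the image edge.

```python
def opening(img, kw, kh):
    """Morphological opening of a 0/1 grid with a kw-wide, kh-high rectangle."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h - kh + 1):
        for x in range(w - kw + 1):
            window = [(yy, xx)
                      for yy in range(y, y + kh)
                      for xx in range(x, x + kw)]
            # The rectangle survives only if it lies fully on foreground pixels.
            if all(img[yy][xx] for yy, xx in window):
                for yy, xx in window:
                    out[yy][xx] = 1
    return out


# A 3x3 "stroke" plus a one-pixel-high interference line in row 5.
img = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
]

step1 = opening(img, 3, 2)    # same size order as the (3, 2) kernel: width, height
step2 = opening(step1, 2, 3)  # same size order as the (2, 3) kernel

for row in step2:
    print("".join("#" if v else "." for v in row))
```

In this grid the line in row 5 is only one pixel high, so no 3x2 rectangle fits inside it and it disappears, while the 3x3 block is large enough to survive both passes.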


Example 3

import cv2 as cv
import pytesseract
from PIL import Image


def recognize_text(image):
    # Edge-preserving (mean shift) filter to remove noise
    blur = cv.pyrMeanShiftFiltering(image, sp=8, sr=60)
    cv.imshow('dst', blur)
    # Convert to grayscale
    gray = cv.cvtColor(blur, cv.COLOR_BGR2GRAY)
    # Binarize with a fixed threshold; with an automatic (Otsu) threshold
    # the yellow 4 is not extracted
    ret, binary = cv.threshold(gray, 185, 255, cv.THRESH_BINARY_INV)
    print(f'Binarization threshold (fixed): {ret}')
    cv.imshow('binary', binary)
    # Invert in place: white background with black text is easier to recognize
    cv.bitwise_not(binary, binary)
    cv.imshow('bg_image', binary)
    # Run OCR
    test_message = Image.fromarray(binary)
    text = pytesseract.image_to_string(test_message)
    print(f'Recognition result: {text}')


src = cv.imread(r'./test/045.jpg')
cv.imshow('input image', src)
recognize_text(src)
cv.waitKey(0)
cv.destroyAllWindows()

Output:

Binarization threshold (fixed): 185.0
Recognition result: 7364

Process finished with exit code 0
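The note in Example 3 about the yellow 4 can be illustrated with a toy calculation. The sketch below re-implements Otsu's criterion (pick the threshold that maximizes between-class variance) in plain Python and applies it to made-up pixel counts: dark characters at gray level 60, one lighter character at 170, and a near-white background at 250. These numbers are assumptions chosen for illustration and were not measured from the author's image, and the functions are not OpenCV's code.

```python
def otsu_threshold(pixels):
    """Return the gray level t that maximizes between-class variance.

    Class 0 holds pixels <= t, class 1 holds pixels > t.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in class 0
    sum0 = 0  # sum of gray levels in class 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binary_inv(pixel, thresh):
    """Same rule as THRESH_BINARY_INV: 0 if pixel > thresh, else 255."""
    return 0 if pixel > thresh else 255


# Hypothetical image: 300 dark-text pixels, 100 light-text pixels, 600 background.
pixels = [60] * 300 + [170] * 100 + [250] * 600

t = otsu_threshold(pixels)
print("Otsu threshold:", t)
print("light character with Otsu:", binary_inv(170, t))
print("light character with fixed 185:", binary_inv(170, 185))
```

With these counts the automatic threshold lands between the dark text and everything else, so the lighter character is grouped with the background, while a fixed threshold of 185 keeps it as foreground. This is one plausible reading of why the article switches to a hand-picked threshold.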




- Author -

叶庭云
