Text binarization

Question

I'd like to binarize this image:

I want to use it with tesseract-ocr. So far, I have managed to get this:

But I need a clean image containing only the text, without the black background regions, like this one:

My current code:

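import cv2

path = "input.png"  # placeholder; the post does not show the actual input path

# Load the image as grayscale (flag 0)
img = cv2.imread(path, 0)
blur = cv2.GaussianBlur(img, (3, 3), 0)  # note: computed but never used below
# Adaptive threshold over a 405x405 neighbourhood, applied to img rather than blur
filtered = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 405, 1)
bitnot = cv2.bitwise_not(filtered)  # invert black and white
cv2.imshow('image', bitnot)
cv2.imwrite("h2kcw2/out1.png", bitnot)
cv2.waitKey(0)
cv2.destroyAllWindows()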

Recommended answer

A regular (global) threshold can give a good result here:

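import cv2

path = "input.png"  # placeholder; the post does not show the actual input path
img = cv2.imread(path, 0)  # load as grayscale
# Global inverse threshold: pixels <= 70 become 255, all others become 0
ret, thresh = cv2.threshold(img, 70, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('image', thresh)
cv2.imwrite("h2kcw2/out1.png", thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()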
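
To make the per-pixel rule behind `cv2.THRESH_BINARY_INV` concrete, here is a minimal pure-Python sketch of it. This is only an illustration, not OpenCV's implementation: the helper `threshold_binary_inv` and the small `patch` of pixel values are hypothetical, and in practice the image would be a NumPy array passed to `cv2.threshold` as above.

```python
def threshold_binary_inv(gray, thresh, maxval=255):
    """Inverse binary rule: maxval where a pixel is <= thresh, otherwise 0."""
    return [[maxval if px <= thresh else 0 for px in row] for row in gray]


# Hypothetical 2x4 grayscale patch mixing dark and light pixel values.
patch = [
    [200, 30, 45, 180],
    [90, 60, 70, 71],
]

# Same threshold value (70) as in the answer's cv2.threshold call.
mask = threshold_binary_inv(patch, thresh=70)
for row in mask:
    print(row)
```

Because the cutoff is a single fixed value for the whole image, 70 is specific to this picture; for other images it would likely need adjusting.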
