Text binarization
Question
I'd like to binarize this image:
to use it with tesseract-ocr. Currently, I have managed to get this:
But I need a clear image with only the text, without the black background parts, like this one:
My current code:
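import cv2

# Load the image as grayscale (flag 0); `path` is the input image path
img = cv2.imread(path, 0)
# Note: `blur` is computed but never used; the threshold below runs on `img`
blur = cv2.GaussianBlur(img, (3, 3), 0)
# Adaptive Gaussian threshold with block size 405 and constant C = 1
filtered = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 405, 1)
# Invert black and white
bitnot = cv2.bitwise_not(filtered)
cv2.imshow('image', bitnot)
cv2.imwrite("h2kcw2/out1.png", bitnot)
cv2.waitKey(0)
cv2.destroyAllWindows()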
Recommended answer
A regular (global) threshold can give a good result:
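import cv2

# Load the image as grayscale (flag 0); `path` is the input image path
img = cv2.imread(path, 0)
# THRESH_BINARY_INV: pixels above 70 become 0, all others become 255
ret, thresh = cv2.threshold(img, 70, 255, cv2.THRESH_BINARY_INV)
cv2.imshow('image', thresh)
cv2.imwrite("h2kcw2/out1.png", thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()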
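To see what the fixed threshold does to each pixel, here is a minimal plain-Python sketch of the inverse binary rule that cv2.THRESH_BINARY_INV applies: a pixel becomes 0 if it is above the threshold and maxval otherwise. The function name threshold_binary_inv and the small sample patch are hypothetical, made up for illustration; this is not OpenCV's implementation and it uses no OpenCV calls.

```python
def threshold_binary_inv(pixels, thresh, maxval=255):
    """Apply the inverse binary rule to a 2D list of grayscale values.

    Hypothetical helper for illustration only: values strictly above
    `thresh` map to 0, everything else maps to `maxval`.
    """
    return [[0 if p > thresh else maxval for p in row] for row in pixels]


# Made-up grayscale patch: low values are dark, high values are light.
patch = [
    [200, 30, 210],
    [60, 70, 71],
]

result = threshold_binary_inv(patch, 70)
print(result)
```

With a threshold of 70, every pixel at or below 70 ends up white (255) and everything brighter ends up black (0). The value 70 suits the image in the question; another image may need a different value.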