How to train data using tensorflow ocr?


Question

I am new to TensorFlow and a little confused: there are multiple models available for performing OCR, such as:

  1. attention_ocr
  2. street

I have the document below, on which I have to perform OCR. I tried using pytesseract to read the image, but it does not give proper results.

I need the following results from the image above:

  • D MANIKANDAN
  • DURAISAMY
  • 16/07/1986
  • BNZPM2501F

Please suggest which TensorFlow model would be useful for performing the OCR described above. I am using the code below to get data from pytesseract:

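import os

import cv2
import pytesseract

# EXPORT_PATH is assumed to be defined elsewhere as a writable directory path.

def getData(coordinate, image):
    # coordinate holds the bounding box and its class label
    (y1, y2, x1, x2, classification) = coordinate
    height = y2-y1
    width = x2-x1
    # Crop the region of interest from the full image
    crop = image[y1:y1+height, x1:x1+width]
    CROP_IMAGE_URL = EXPORT_PATH +"data.jpg"
    # Write the crop to disk, read it back, and run OCR on it
    cv2.imwrite(CROP_IMAGE_URL, crop)
    img = cv2.imread(CROP_IMAGE_URL)
    text = pytesseract.image_to_string(img)
    os.remove(CROP_IMAGE_URL)
    return text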

Recommended answer

Steps:

  • Detect contours.

  • After extracting the ROIs based on the contours, extract the text from each ROI using tesseract.

import cv2
import pytesseract
import matplotlib.pyplot as plt

img = cv2.imread('pan2.jpg')
image = img.copy()
# cv2.imread returns BGR, so convert from BGR (not RGB) to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = 255 - cv2.threshold(blur, 0,255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1]

# Dilate to combine adjacent text contours
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (4,2))
dilate = cv2.dilate(thresh, kernel, iterations=2)

# Find contours, highlight text areas, and extract ROIs
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

ROI_number = 0
ROI_images = []
for c in cnts:
    area = cv2.contourArea(c)
    x,y,w,h = cv2.boundingRect(c)
    if area > 1000 and 12<h<18:
        cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 3)
        ROI = img[y:y+h, x:x+w]
        # cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
        ROI_number += 1
        ROI_images.append(ROI)
plt.subplot(131)
plt.imshow(thresh)
plt.subplot(132)
plt.imshow(dilate)
plt.subplot(133)
# matplotlib expects RGB, while OpenCV images are BGR
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.show()
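
The fixed bounds in the loop above (area > 1000 and 12 < h < 18) are tied to the resolution of pan2.jpg, and cv2.findContours does not return contours in reading order. Below is a minimal sketch of how the filter could be parameterised and the boxes sorted top to bottom, then left to right. It works on plain (x, y, w, h) tuples such as those returned by cv2.boundingRect. The function name filter_and_sort_boxes, its parameters, and the row tolerance are assumptions for illustration and are not part of the original answer; it also uses the bounding-box area w*h rather than cv2.contourArea.

```python
def filter_and_sort_boxes(boxes, min_area=1000, min_h=12, max_h=18, row_tol=10):
    """Keep boxes that look like text lines and order them for reading.

    boxes: iterable of (x, y, w, h) tuples, e.g. from cv2.boundingRect.
    The default thresholds mirror the answer's values and are assumptions
    that will need tuning for images of a different resolution.
    """
    kept = [
        (x, y, w, h)
        for (x, y, w, h) in boxes
        if w * h > min_area and min_h < h < max_h
    ]
    # Crude row grouping: bucket y into bands of row_tol pixels so that
    # boxes on roughly the same line are ordered left to right.
    return sorted(kept, key=lambda b: (b[1] // row_tol, b[0]))


if __name__ == "__main__":
    sample = [(200, 52, 100, 15), (10, 50, 100, 15), (10, 10, 100, 15)]
    for box in filter_and_sort_boxes(sample):
        print(box)
```

The sorted boxes can then be used to slice the ROIs from img in the same way as in the loop above, which keeps the OCR output in document order.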


for i in ROI_images:
    text = pytesseract.image_to_string(i,config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
    print("text:",text)
    plt.imshow(i)
    plt.show()
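
The config string above combines --psm 10, which tells tesseract to treat each image as a single character, with a digits-only whitelist, so it cannot return names or an alphanumeric ID such as BNZPM2501F. Below is a hedged sketch of choosing the config per field type instead: --psm 7 treats the image as a single text line, and the whitelist is optional. The helper build_tesseract_config and the FIELD_WHITELISTS table are hypothetical names introduced here; only the --psm, --oem and -c tessedit_char_whitelist options come from tesseract itself.

```python
import string

# Hypothetical per-field whitelists (assumption: fields are upper-case Latin text).
FIELD_WHITELISTS = {
    "name": None,  # no whitelist: names contain spaces, which are awkward to pass in a config string
    "date": string.digits + "/",
    "id": string.ascii_uppercase + string.digits,
}


def build_tesseract_config(field, psm=7, oem=3):
    """Build a tesseract config string for one kind of field."""
    if field not in FIELD_WHITELISTS:
        raise ValueError(f"unknown field type: {field!r}")
    parts = [f"--psm {psm}", f"--oem {oem}"]
    whitelist = FIELD_WHITELISTS[field]
    if whitelist:
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


if __name__ == "__main__":
    for field in FIELD_WHITELISTS:
        print(field, "->", build_tesseract_config(field))
```

The resulting string can be passed as the config argument, for example pytesseract.image_to_string(roi, config=build_tesseract_config("id")). Which page segmentation mode works best depends on how tightly the ROIs are cropped, so the default of 7 is only a starting point.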

