Format OCR text annotation from Cloud Vision API in Python

Question

I am using the Google Cloud Vision API for Python in a small program. The function works and I get the OCR results, but I need to format them before I can work with them.

Here is the function:

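from google.cloud import vision
from google.cloud.vision import types  # pre-2.0 google-cloud-vision client

# Call to OCR API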
def detect_text_uri(uri):
    """Detects text in the file located in Google Cloud Storage or on the Web.
    """
    client = vision.ImageAnnotatorClient()
    image = types.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations

    for text in texts:
        textdescription = ("    "+ text.description )
        return textdescription

Specifically, I need to split the text line by line, adding four spaces at the beginning of each line and a line break at the end. At the moment this only works for the first line; the rest is returned as a single blob.

I have been checking the official documentation, but I could not find out how the API response is formatted.

Recommended answer

You are almost there. Since you want to split the text line by line, don't loop over the text annotations; instead, take the 'description' of the first annotation in the Google Vision response, as shown below.

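from google.cloud import vision
from google.cloud.vision import types  # pre-2.0 client; in 2.0+ use vision.Image instead


def parse_image(image_path=None):
    """
    Run text detection (OCR) on a local image with the Google Cloud Vision API.
    :param image_path: path of the image
    :return: text content, one detected line per line of the string
    :rtype: str
    """
    client = vision.ImageAnnotatorClient()
    # Read the image bytes and close the file before calling the API
    with open(image_path, 'rb') as image_file:
        image = types.Image(content=image_file.read())
    response = client.text_detection(image=image)
    annotations = response.text_annotations

    # The first annotation holds the full text; the rest are individual words
    return annotations[0].description if annotations else ""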

The function above returns a string with the text content of the image, with the lines separated by "\n".
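This also explains what the question's loop was doing. The sketch below uses hypothetical stand-in objects and made-up sample text instead of a real API response, under the assumption that the first annotation carries the full text and the following ones carry single words. Because `return` sits inside the loop, it exits on the first pass, so the four spaces are added once to the whole block rather than to each line.

```python
from types import SimpleNamespace

# Hypothetical stand-in for response.text_annotations (made-up sample text):
# entry 0 is the full text, the remaining entries are the individual words.
texts = [
    SimpleNamespace(description="first line\nsecond line"),
    SimpleNamespace(description="first"),
    SimpleNamespace(description="line"),
    SimpleNamespace(description="second"),
    SimpleNamespace(description="line"),
]


def first_description_indented(annotations):
    # Same shape as the loop in the question: `return` exits on the first
    # pass, so the prefix is applied once to the whole block of text.
    for annotation in annotations:
        return "    " + annotation.description


result = first_description_indented(texts)
print(repr(result))
```

Only the first line of `result` starts with the four spaces; the second line is left unprefixed.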

Now you can add whatever prefix and suffix you need to each line.

image_content = parse_image(image_path=r"path\to\image")  # raw string, so "\t" is not read as a tab

my_formatted_text = ""
for line in image_content.split("\n"):
    my_formatted_text += "    " + line + "\n"

my_formatted_text is the text you need.
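If the same formatting is needed in several places, it could be wrapped in a small helper. This is only a sketch: `indent_lines` and the sample string are hypothetical names and data, not part of the Vision API. It uses `str.splitlines()` rather than `split("\n")`, so a trailing newline in the OCR text does not produce an extra line containing only the prefix.

```python
def indent_lines(text, prefix="    "):
    """Prefix every line of `text` and end each line with a line break."""
    # splitlines() drops the line breaks, so a trailing "\n" in the input
    # does not produce an extra, prefix-only line at the end.
    return "".join(prefix + line + "\n" for line in text.splitlines())


sample = "first line\nsecond line\n"  # made-up stand-in for parse_image() output
formatted = indent_lines(sample)
print(formatted, end="")
```

Note that `splitlines()` also breaks on other line boundaries such as "\r\n", which may or may not be what you want for your input.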
