Does Google Cloud Vision API detect formatting in OCRed text like bold, italics, font name (Helvetica or Times New Roman), etc.?


Problem description

The quick brown fox jumps over the lazy dog

In a case like this, assuming there are different font families too, can the Cloud Vision API detect the formatting? Or can any other OCR API detect it cleanly? Tesseract has this capability, but it is too inaccurate.

Solution

Does Google Cloud Vision API detect formatting in OCRed text like bold, italics, font name (Helvetica or Times New Roman), etc.?

Unfortunately, no.

In my project, I use ABBYY Cloud OCR SDK for this purpose. If you want to try it, you can start a free trial, which includes 500 free requests (pages). After you create your trial account, you will receive an email from ABBYY containing your Application ID and Application password. Use these two values to create your authentication header as described in the Authentication section of the ABBYY documentation.
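As a rough illustration of building that header, here is a minimal Python sketch. It assumes the service accepts HTTP Basic authentication with the Application ID as the user name and the Application password as the password; check the Authentication documentation before relying on this. The function name `build_auth_header` is hypothetical.

```python
import base64


def build_auth_header(application_id: str, application_password: str) -> dict:
    """Return an Authorization header dict for HTTP Basic authentication.

    Assumption: the service expects "Basic base64(id:password)".
    """
    credentials = f"{application_id}:{application_password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Placeholder credentials; use the values from the ABBYY email instead.
    print(build_auth_header("my-app-id", "my-app-password"))
```

The resulting value is what goes in place of `<your token>` in the requests shown in the example.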

See the following example:

  1. Perform a processImage request. Pass your image in the request body.

Request:

POST https://cloud.ocrsdk.com/v2/processImage?exportFormat=xml&profile=documentConversion&xml:writeFormatting=true
Authorization: <your token>

Response:

{
    "taskId": "a226a0b6-6705-4d6f-9f4c-517fa9b4e28e",
    "registrationTime": "2020-07-26T09:42:39Z",
    "statusChangeTime": "2020-07-26T09:42:39Z",
    "status": "Queued",
    "filesCount": 1,
    "requestStatusDelay": 10000
}

  2. Perform a getTaskStatus request to check whether your task is completed. Use the taskId from the response to the previous step.

Request:

GET https://cloud.ocrsdk.com/v2/getTaskStatus?taskId=a226a0b6-6705-4d6f-9f4c-517fa9b4e28e
Authorization: <your token>

Response:

{
    "taskId": "a226a0b6-6705-4d6f-9f4c-517fa9b4e28e",
    "registrationTime": "2020-07-26T09:42:39Z",
    "statusChangeTime": "2020-07-26T09:42:40Z",
    "status": "Completed",
    "filesCount": 1,
    "requestStatusDelay": 0,
    "resultUrls": [
        "https://ocrsdk.blob.core.windows.net/files/a226a0b6-6705-4d6f-9f4c-517fa9b4e28e.result?sv=2012-02-12&se=2020-07-26T19%3A00%3A00Z&sr=b&si=downloadResults&sig=4k9FcRoBfhodq%2BMj%2Ffj%2BGLBfwK2BsO7sj15JQOLcArk%3D"
    ]
}

  3. Download the result (see resultUrls in the response to the previous step).
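The three steps above could be scripted roughly as follows. This is a sketch using only the Python standard library, not an official client. The endpoint, query parameters, and JSON field names come from the sample requests and responses above. The function names, the `max_polls` limit, and the reading of `requestStatusDelay` as milliseconds are assumptions. `headers` is expected to be a dict holding your Authorization header.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://cloud.ocrsdk.com/v2"  # taken from the sample requests


def http_json(method, url, headers, body=None):
    """Send one HTTP request and decode the JSON response body."""
    request = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def submit_image(image_bytes, headers, send=http_json):
    """Step 1: upload the image; returns the task description (status "Queued")."""
    query = urllib.parse.urlencode(
        {
            "exportFormat": "xml",
            "profile": "documentConversion",
            "xml:writeFormatting": "true",  # ask for formatting info in the XML
        },
        safe=":",
    )
    return send("POST", f"{BASE_URL}/processImage?{query}", headers, image_bytes)


def wait_for_task(task_id, headers, send=http_json, sleep=time.sleep, max_polls=30):
    """Step 2: poll getTaskStatus until the status is "Completed".

    Any other status is treated as "not finished yet"; failure statuses are
    not handled specially in this sketch.
    """
    url = f"{BASE_URL}/getTaskStatus?taskId={urllib.parse.quote(task_id)}"
    for _ in range(max_polls):
        task = send("GET", url, headers)
        if task["status"] == "Completed":
            return task
        # Assumption: requestStatusDelay is given in milliseconds.
        sleep(task.get("requestStatusDelay", 2000) / 1000)
    raise TimeoutError(f"task {task_id} did not complete after {max_polls} polls")


def download_result(task):
    """Step 3: fetch the first result file (XML) as bytes."""
    with urllib.request.urlopen(task["resultUrls"][0]) as response:
        return response.read()
```

A typical call sequence would be `task = submit_image(data, headers)`, then `done = wait_for_task(task["taskId"], headers)`, then `xml_bytes = download_result(done)`. The `send` and `sleep` parameters exist only so the polling logic can be exercised without network access.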

I used the following picture and received the following result
