Extracting data from specific image locations using the Google Vision OCR API


Question

I am using Google's Vision OCR API to extract two types of data from an image: 1) handwritten text from text-boxes (marked with red circles below), and 2) ticks or 'x' marks from check-boxes (marked with green circles below). I will be entering this data into a database, so I need a string returned for each type of data.

Currently, when I pass this image into the API, I get a single string containing all of the data:

Secondary School Study Student Perception of Computers LO 13 . Are any of your family members working >in computing / IT ? If so , what family member ( s ) is it ( eg , parent , guardian , brother , sister >etc . ) brother 14 . Have you any previous computing experience ( even attended a single day ) ? Select >one or many areas : U CODER DOJO IN SCHOOL CAMP VSELF TAUGHT JOTHER If you selected any from Q14 , was >the general experience : GOOD NEITHER GOOD OR BAD BAD BAD And why ( short answer , under 4 words ) >learned new skills To be completed after the camp . NewsLRY 1 . I would now consider a career in >computing / IT . Strongly Agree Agree No Opinion Disagree Strongly Disagree 2 . The camp showed me what >a career in computing / IT really was . ? Strongly Agree Agree No Opinion Disagree Strongly Disagree 3 >. The camp showed / highlighted that I was no good at programming or computing . Strongly Agree Agree >No Opinion Disagree Strongly Disagree 4 . Give two things that you did not know about computing / >programming until after the camp ? java Language Eclipse IDE va 5 . I was better than I first thought ( >before the camp ) at programming / computing . ? Agree No Opinion Disagree Strongly Disagree ? O >Strongly Agree 6 . Any feedback / comments about the camp ( good or bad ) ? good camp , Learned a lot . >Thank you for taking this survey . Page 2 of 2

My code:

    using System;
    using System.Linq;
    using Google.Cloud.Vision.V1;

    public static class Program
    {
        public static void Main(string[] args)
        {
            // Point the client library at the service-account key file
            string credential_path = @"C:\Users\35385\nodal.json";
            System.Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", credential_path);

            // Instantiates a client
            var client = ImageAnnotatorClient.Create();
            // Load the image file into memory
            var image = Image.FromFile("stack.jpg");
            // Performs document text detection on the image file
            var response = client.DetectDocumentText(image);

            string words = "";

            foreach (var page in response.Pages)
            {
                foreach (var block in page.Blocks)
                {
                    // Bounding box of the block; computed here but not used yet
                    string box = string.Join(" - ", block.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
                    foreach (var paragraph in block.Paragraphs)
                    {
                        // Bounding box of the paragraph; also not used yet
                        box = string.Join(" - ", paragraph.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
                        foreach (var word in paragraph.Words)
                        {
                            // Rebuild each word from its symbols and append it to the output
                            words += $" {string.Join("", word.Symbols.Select(s => s.Text))}";
                        }
                    }
                }
            }

            // Prints every detected word as one long string
            Console.WriteLine(words);
        }
    }

So my questions are:

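  1. How can I extract the data from each of the red boxes (i.e. the first text-box should return "brother" and the second should return "learned new skills")?
  2. How can I extract which check-box is marked in each of the green questions (i.e. question 13 should return "yes", question 14 should return "self taught", etc.)?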

Recommended answer

I have only used the API from a few PHP scripts, but I think your problem does not depend on the programming language. You need to use the coordinates (boxes with four vertices, to be precise) of the detected words. Then you can locate the elements of your questionnaire relative to the participant's writing. A good entry point for me was this script:

https://www.leanx.eu/tutorials/use-google-cloud-vision-api-to-process-invoices-and-receipts

You can use it "as is" on any PHP-enabled webspace, and it gives you a well-structured overview of how to retrieve the boxes that the API returns.
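The coordinate-based idea can be sketched in C# as well. The example below is only an illustration under stated assumptions: `OcrWord`, `Point`, `Region` and `RegionExtractor` are hypothetical plain types (not part of the Vision client library), and the rectangle for each text-box is assumed to have been measured once from the blank form. A word counts as belonging to a text-box when the centre of its four-vertex box falls inside that rectangle.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain types standing in for the OCR result.
// In the question's loop, Text and Vertices would be filled from each word.
public record Point(int X, int Y);
public record OcrWord(string Text, IReadOnlyList<Point> Vertices);

// Axis-aligned rectangle of one text-box, measured from the blank form (assumption).
public record Region(int Left, int Top, int Right, int Bottom);

public static class RegionExtractor
{
    // True when the centre of the word's bounding box lies inside the region.
    public static bool IsInside(OcrWord word, Region region)
    {
        double cx = word.Vertices.Average(v => v.X);
        double cy = word.Vertices.Average(v => v.Y);
        return cx >= region.Left && cx <= region.Right
            && cy >= region.Top && cy <= region.Bottom;
    }

    // Joins the words inside the region, preserving their input order.
    public static string TextInRegion(IEnumerable<OcrWord> words, Region region) =>
        string.Join(" ", words.Where(w => IsInside(w, region)).Select(w => w.Text));
}
```

Calling `RegionExtractor.TextInRegion(allWords, firstTextBox)` would then give one string per red box. Testing the centre rather than all four vertices tolerates handwriting that slightly overshoots the printed frame, at the cost of occasionally picking up a neighbouring printed word.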

With those boxes, and knowing the text of your questionnaire, it should be quite easy to locate the checkmarks your participants made, provided Google detects them. Checkmark detection might not always work with Google Vision, since its OCR does not always find a single "character".
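As a rough sketch of that checkmark lookup, the C# below reports an option as marked when any detected symbol has its centre inside that option's check-box. Everything here is an assumption for illustration: `DetectedSymbol`, `CheckBox` and `CheckBoxReader` are hypothetical types, and the check-box rectangles are taken to be known from the blank questionnaire. It can only find marks that the OCR actually returned as a symbol, so the caveat above still applies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical plain types; centres would be derived from the OCR bounding boxes.
public record DetectedSymbol(string Text, double CentreX, double CentreY);

// One printed check-box and the answer option it belongs to (rectangle from the blank form).
public record CheckBox(string Option, int Left, int Top, int Right, int Bottom);

public static class CheckBoxReader
{
    // True when the point lies inside the check-box rectangle (edges inclusive).
    private static bool Contains(CheckBox box, double x, double y) =>
        x >= box.Left && x <= box.Right && y >= box.Top && y <= box.Bottom;

    // Returns the options whose check-box contains at least one detected symbol.
    public static List<string> MarkedOptions(
        IEnumerable<DetectedSymbol> symbols, IEnumerable<CheckBox> boxes)
    {
        var detected = symbols.ToList();
        return boxes
            .Where(b => detected.Any(s => Contains(b, s.CentreX, s.CentreY)))
            .Select(b => b.Option)
            .ToList();
    }
}
```

For question 14, passing the rectangles of the five options would ideally return a list containing only "SELF TAUGHT". An empty list means no mark was detected, which is not the same as no box being ticked.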
