Extracting data from specific image locations using the Google Vision OCR API


Question

I am using Google's Vision OCR API to try to extract two types of data from an image: 1) handwritten text from text boxes, marked with red circles below, and 2) ticks or "x" marks from check-boxes, marked with green circles below. I will be entering this data into a database, so I need a string returned for both types of data.

Currently, when I pass this image into the API, I get a single string containing all of the data:

Secondary School Study Student Perception of Computers LO 13 . Are any of your family members working in computing / IT ? If so , what family member ( s ) is it ( eg , parent , guardian , brother , sister etc . ) brother 14 . Have you any previous computing experience ( even attended a single day ) ? Select one or many areas : U CODER DOJO IN SCHOOL CAMP VSELF TAUGHT JOTHER If you selected any from Q14 , was the general experience : GOOD NEITHER GOOD OR BAD BAD BAD And why ( short answer , under 4 words ) learned new skills To be completed after the camp . NewsLRY 1 . I would now consider a career in computing / IT . Strongly Agree Agree No Opinion Disagree Strongly Disagree 2 . The camp showed me what a career in computing / IT really was . ? Strongly Agree Agree No Opinion Disagree Strongly Disagree 3 . The camp showed / highlighted that I was no good at programming or computing . Strongly Agree Agree No Opinion Disagree Strongly Disagree 4 . Give two things that you did not know about computing / programming until after the camp ? java Language Eclipse IDE va 5 . I was better than I first thought ( before the camp ) at programming / computing . ? Agree No Opinion Disagree Strongly Disagree ? O Strongly Agree 6 . Any feedback / comments about the camp ( good or bad ) ? good camp , Learned a lot . Thank you for taking this survey . Page 2 of 2

My code as it stands:

    using System;
    using System.Linq;
    using System.Text;
    using Google.Cloud.Vision.V1;

    public static class Program
    {
        public static void Main(string[] args)
        {
            // Point the client library at the service-account key file.
            string credentialPath = @"C:\Users\35385\nodal.json";
            Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", credentialPath);

            // Instantiate a client.
            var client = ImageAnnotatorClient.Create();
            // Load the image file into memory.
            var image = Image.FromFile("stack.jpg");
            // Run document text detection on the image file.
            var response = client.DetectDocumentText(image);

            var words = new StringBuilder();

            foreach (var page in response.Pages)
            {
                foreach (var block in page.Blocks)
                {
                    // Bounding box of the block; computed here but not used yet.
                    string box = string.Join(" - ", block.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
                    foreach (var paragraph in block.Paragraphs)
                    {
                        // Overwritten with the paragraph's bounding box; also unused so far.
                        box = string.Join(" - ", paragraph.BoundingBox.Vertices.Select(v => $"({v.X}, {v.Y})"));
                        foreach (var word in paragraph.Words)
                        {
                            // A word is a list of symbols; join them and append with a leading space.
                            words.Append(' ').Append(string.Concat(word.Symbols.Select(s => s.Text)));
                        }
                    }
                }
            }

            Console.WriteLine(words.ToString());
        }
    }

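So my questions are:

  1. How do I extract the data from each of the red boxes (i.e. the first text box should return "brother" and the second should return "learned new skills")?
  2. How do I extract which check-box is ticked for each of the green-circled questions (i.e. question 13 should return "yes", question 14 should return "self taught", and so on)?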

Recommended answer

I have only used the API from PHP scripts, but I don't think your problem depends on the programming language. You need to use the coordinates of the detected words (bounding boxes with four vertices, to be precise). With those, you can locate the participant's writing relative to the printed elements of your questionnaire. A good entry point for me was this script:

https://www.leanx.eu/tutorials/use-google-cloud-vision-api-to-process-invoices-and-receipts

You can use it as is on any PHP-enabled web space, and it gives a well-structured overview of how to retrieve the boxes that the API returns.
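For the red text boxes, the coordinate idea can be sketched in C# as follows. This is only a minimal illustration, not the linked script's method: `OcrWord`, `Point`, `Region` and `TextInRegion` are hypothetical names, the coordinates are made up, and the sketch assumes you have already copied each API word's text and four bounding-box vertices into these plain types (in the same way the question's code reads `BoundingBox.Vertices` and `Symbols`). A word counts as inside a region when the centre of its box falls inside it, and the API's own word order is kept.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, library-independent stand-ins for an OCR word and its four-vertex box.
public record Point(int X, int Y);
public record OcrWord(string Text, IReadOnlyList<Point> Vertices);
public record Region(int Left, int Top, int Right, int Bottom);

public static class RegionExtractor
{
    // Joins the text of all words whose box centre lies inside the region,
    // keeping the order in which the words were supplied.
    public static string TextInRegion(IEnumerable<OcrWord> words, Region region)
    {
        var hits = words
            .Where(w =>
            {
                double centreX = w.Vertices.Average(v => v.X);
                double centreY = w.Vertices.Average(v => v.Y);
                return centreX >= region.Left && centreX <= region.Right
                    && centreY >= region.Top && centreY <= region.Bottom;
            })
            .Select(w => w.Text);
        return string.Join(" ", hits);
    }
}

public static class Program
{
    // Builds a word with an axis-aligned box (an assumption; real boxes may be rotated).
    private static OcrWord Box(string text, int left, int top, int right, int bottom) =>
        new OcrWord(text, new[]
        {
            new Point(left, top), new Point(right, top),
            new Point(right, bottom), new Point(left, bottom),
        });

    public static void Main()
    {
        // Made-up coordinates for illustration only.
        var words = new List<OcrWord>
        {
            Box("etc.", 40, 100, 90, 120),
            Box("brother", 300, 95, 420, 130),
            Box("14", 40, 160, 60, 180),
        };

        // Assumed pixel position of the answer box for question 13.
        var answerBox13 = new Region(250, 80, 600, 140);
        Console.WriteLine(RegionExtractor.TextInRegion(words, answerBox13)); // prints "brother"
    }
}
```

The region coordinates would have to come from your own form layout, for example measured once on a blank scan or derived from the position of the printed question text that the OCR also returns.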

With those boxes and the known text of your questionnaire, it should be fairly easy to locate the check marks your participants made, provided Google detects them. Check mark detection may not always work with Google Vision, because its OCR does not always find a single isolated "character".
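One way to apply this to the check-boxes is to look for a token that could be a tick or a cross and report the printed option label nearest to it. The C# sketch below illustrates that idea under several assumptions: `Token`, `CheckboxMatcher`, `TickedOption` and the set of mark characters are all hypothetical, each token is reduced to its text and box centre, option labels are single words, and the coordinates are invented. It returns null when no mark is found, which covers the case where the OCR misses the tick altogether.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in: an OCR token reduced to its text and the centre of its box.
public record Token(string Text, double CentreX, double CentreY);

public static class CheckboxMatcher
{
    // Characters the OCR might emit for a tick or a cross; an assumption to tune per form.
    private static readonly HashSet<string> MarkTexts =
        new(StringComparer.OrdinalIgnoreCase) { "x", "v", "✓", "✔" };

    // Returns the option label closest to any mark token within maxDistance pixels,
    // or null when no mark is found near any of the labels.
    public static string? TickedOption(
        IEnumerable<Token> tokens, IReadOnlyCollection<string> optionLabels, double maxDistance)
    {
        var all = tokens.ToList();
        var marks = all.Where(t => MarkTexts.Contains(t.Text)).ToList();
        var labels = all.Where(t => optionLabels.Contains(t.Text)).ToList();

        var candidates =
            from mark in marks
            from label in labels
            let dx = mark.CentreX - label.CentreX
            let dy = mark.CentreY - label.CentreY
            let distance = Math.Sqrt(dx * dx + dy * dy)
            where distance <= maxDistance
            orderby distance
            select label.Text;

        return candidates.FirstOrDefault();
    }
}

public static class Program
{
    public static void Main()
    {
        var options = new[] { "YES", "NO" };

        // Made-up coordinates: the mark sits next to "YES".
        var withMark = new List<Token>
        {
            new Token("YES", 100, 50),
            new Token("x", 130, 52),
            new Token("NO", 200, 50),
        };
        Console.WriteLine(CheckboxMatcher.TickedOption(withMark, options, 60) ?? "(no mark detected)"); // prints "YES"

        // Same row, but the OCR did not return the mark.
        var withoutMark = withMark.Where(t => t.Text != "x").ToList();
        Console.WriteLine(CheckboxMatcher.TickedOption(withoutMark, options, 60) ?? "(no mark detected)"); // prints "(no mark detected)"
    }
}
```

Because a missed tick is indistinguishable from an unanswered question here, rows that come back as null are probably worth flagging for manual review rather than storing as empty answers.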
