Incorrect coordinates retrieved from image using ABBYY OCR SDK


Problem description



I'm trying to process an image with the ABBYY OCR SDK using the sample code from this question, but I can't get the coordinates right for a specific word, say "OCR" in the screenshot below.

I want to draw an overlay (a yellow rectangle over the word "OCR"), but sometimes the rectangle is placed very far from the actual word.

Solution

The XML you get is synthesised according to this schema.

For each recognized character, it will contain a charParams element, as shown in the answer you linked to. The element contains the coordinates in page pixels. The same XML also contains a page element:

<page width="..." height="..." resolution="..." originalCoords="...">

where the image width and height are stored. So l and r of each charParams element are in the range 0..width-1 of the corresponding page, and t and b are in the range 0..height-1 of the corresponding page.
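As a rough illustration of reading these values, here is a minimal Python sketch. It assumes a heavily trimmed, hypothetical XML sample (real output has more nesting, more attributes, and usually an XML namespace), and the helper names `local_name`, `read_pages`, and `word_box` are made up for this example. It collects the page size and the character boxes, then takes the union of the boxes for the word.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed sample; not actual SDK output.
SAMPLE_XML = """<document>
  <page width="1000" height="800" resolution="300" originalCoords="true">
    <charParams l="100" t="50" r="120" b="80">O</charParams>
    <charParams l="122" t="50" r="140" b="80">C</charParams>
    <charParams l="142" t="50" r="160" b="80">R</charParams>
  </page>
</document>"""

def local_name(tag):
    """Strip an XML namespace such as '{uri}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]

def read_pages(xml_text):
    """Return, per page, its pixel size and a list of (char, l, t, r, b)."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter():
        if local_name(page.tag) != "page":
            continue
        chars = []
        for el in page.iter():
            if local_name(el.tag) == "charParams":
                box = tuple(int(el.get(k)) for k in ("l", "t", "r", "b"))
                chars.append((el.text or "",) + box)
        pages.append({
            "width": int(page.get("width")),
            "height": int(page.get("height")),
            "chars": chars,
        })
    return pages

def word_box(chars):
    """Smallest rectangle (l, t, r, b) covering all character boxes."""
    return (min(c[1] for c in chars), min(c[2] for c in chars),
            max(c[3] for c in chars), max(c[4] for c in chars))

page = read_pages(SAMPLE_XML)[0]
print(page["width"], page["height"])
print(word_box(page["chars"]))
```

The resulting box is still in page pixels, so it is only directly usable if the image is drawn at its original size.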

It's also worth mentioning explicitly that all coordinates are in pixels; they are completely resolution-agnostic. This is why, whenever you highlight anything on an image, you have to take zoom into account: your device software will likely not always display the image as is, but will downscale it, so you have to map page coordinates onto the zoomed-out image coordinates and highlight accordingly.
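The mapping itself is a plain proportional scale. The Python sketch below is illustrative only: `page_to_view` and its parameters are hypothetical names, and it assumes the displayed image is a scaled copy of the whole page, drawn at an optional offset inside the view, with no cropping or rotation.

```python
def page_to_view(box, page_size, view_size, offset=(0.0, 0.0)):
    """Map an (l, t, r, b) box from page pixels to displayed-image pixels.

    page_size: (width, height) from the page element of the XML.
    view_size: (width, height) of the image as actually drawn on screen.
    offset:    top-left corner of the drawn image inside its container.
    """
    l, t, r, b = box
    page_w, page_h = page_size
    view_w, view_h = view_size
    sx = view_w / page_w   # horizontal zoom factor
    sy = view_h / page_h   # vertical zoom factor
    ox, oy = offset
    return (l * sx + ox, t * sy + oy, r * sx + ox, b * sy + oy)

# A 1000x800 page shown at half size: every coordinate is halved.
rect = page_to_view((100, 50, 160, 80), (1000, 800), (500, 400))
print(rect)
```

If the overlay lands far from the word, a common cause is drawing with the page-pixel values directly while the image is displayed at a different size, which this scaling step corrects.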
