iOS: How to get the coordinates of all words in a PDF page

Question

I have looked through many tutorials, and Stack Overflow users usually point to PDFKitten, but when I tested it I was not satisfied with the results. For example, its search does not work with multi-word queries.

What I need is to get all the words on a PDF page and highlight each word whose bounds intersect a given rectangle.

Recommended answer

I used PDFKitten for the same thing.

  • While scanning the PDF, identify words by splitting on spaces.
  • When the first character of a word is encountered, save the word in a model together with the current RenderingState (a model class in the PDFKitten code); this is the initial state. When the word is complete (the next space is reached), save the current RenderingState again as the final state.
  • PDFKitten already contains the code that converts a RenderingState into a frame in the actual view; use it with the initial and final states above. You can refer to that code.
  • Apply the current media box transform to the frame.
  • Finally, don't forget to convert the resulting frame into the user's coordinate system. Otherwise the result will appear reversed.
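
The steps above can be sketched roughly as follows. This is only an illustration, not PDFKitten's API: `Glyph`, `Word`, `groupIntoWords`, `viewRect(for:mediaBox:viewSize:)` and `wordsToHighlight` are hypothetical names, and the glyph frames are assumed to come from whatever scanner you use (in PDFKitten, derived from RenderingState). It also assumes an unrotated page stretched to fill the view, and that the "reversed" result mentioned in the last step is the vertical flip between PDF space (origin bottom-left) and view space (origin top-left).

```swift
import Foundation

/// Hypothetical glyph record: one character and its box in PDF page space
/// (origin bottom-left, y pointing up). Producing these is the scanner's job.
struct Glyph {
    let character: Character
    let frame: CGRect
}

/// A word and the union of its glyph boxes, still in PDF page space.
struct Word {
    let text: String
    let frame: CGRect
}

/// Splits a glyph stream into words at whitespace. The first glyph of a word
/// plays the role of the "initial state", the last one the "final state".
func groupIntoWords(_ glyphs: [Glyph]) -> [Word] {
    var words: [Word] = []
    var text = ""
    var frame: CGRect? = nil

    for glyph in glyphs {
        if glyph.character.isWhitespace {
            // A space closes the current word, if there is one.
            if let f = frame, !text.isEmpty {
                words.append(Word(text: text, frame: f))
            }
            text = ""
            frame = nil
        } else {
            text.append(glyph.character)
            frame = frame?.union(glyph.frame) ?? glyph.frame
        }
    }
    // Close the last word when the stream does not end with a space.
    if let f = frame, !text.isEmpty {
        words.append(Word(text: text, frame: f))
    }
    return words
}

/// Maps a rect from PDF page space to view space: offsets by the media box
/// origin, scales to the view size, and flips the y-axis.
/// Assumption: the page is unrotated and stretched to fill the view.
func viewRect(for pdfRect: CGRect, mediaBox: CGRect, viewSize: CGSize) -> CGRect {
    let scaleX = viewSize.width / mediaBox.width
    let scaleY = viewSize.height / mediaBox.height
    return CGRect(
        x: (pdfRect.minX - mediaBox.minX) * scaleX,
        y: (mediaBox.maxY - pdfRect.maxY) * scaleY,
        width: pdfRect.width * scaleX,
        height: pdfRect.height * scaleY
    )
}

/// Returns the words whose view-space frames intersect the selection rect.
func wordsToHighlight(_ words: [Word],
                      in selection: CGRect,
                      mediaBox: CGRect,
                      viewSize: CGSize) -> [(text: String, frame: CGRect)] {
    return words
        .map { (text: $0.text,
                frame: viewRect(for: $0.frame, mediaBox: mediaBox, viewSize: viewSize)) }
        .filter { $0.frame.intersects(selection) }
}
```

In a real app you would probably take the page-to-view transform from Core Graphics rather than computing it by hand, especially if the page is rotated or letterboxed; the manual version is shown here only to make the y-axis flip explicit.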
