iOS: How to get the coordinates of all words in a PDF page

Views: 201 Published: 2020/5/25 5:24:19 Tags: ios pdf nsscanner cgpdfdocument cgpdfscanner


Question

I have looked through many tutorials, and Stack Overflow users usually point to PDFKitten, but after testing it I am not satisfied with the results. For example, its search does not work with multiple words.

What I need is to get all the words from a PDF page and highlight each word that crosses a given rectangle.

Recommended answer

I used PDFKitten for the same purpose.

What I did was identify the words, separated by spaces, while scanning the PDF.
When a word is first encountered, save it in a model together with the current RenderingState (a model class in the PDFKitten code); this is the initial state. When the complete word has been found (that is, a space is reached), save the current RenderingState again as the final state.
PDFKitten already contains the code that converts a RenderingState into a frame in the actual view, using the initial and final states above. You can refer to that code.
Apply the current media box transform to the frame.
Finally, don't forget to convert the resulting frame into the user's coordinate system. Otherwise the result will appear reversed.
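
The steps above can be sketched roughly as follows. This is not PDFKitten's code: `Glyph`, `Word`, `words(from:)` and `viewRect(for:mediaBox:)` are hypothetical names introduced for illustration, and `Glyph` only stands in for the position information that a RenderingState would provide. The sketch assumes horizontal text on a single line, an unrotated page shown at 1:1 scale, and glyph boxes measured from the baseline.

```swift
import Foundation

// Hypothetical stand-in for one glyph reported while scanning the page.
// `origin` is the lower-left corner in PDF page space, where y grows upward.
struct Glyph {
    let character: Character
    let origin: CGPoint
    let size: CGSize
}

// A word plus its bounding box, still in PDF page space.
struct Word {
    let text: String
    let pageRect: CGRect
}

// Groups glyphs into space-separated words, remembering the first glyph
// ("initial state") and the last glyph ("final state") of each word.
func words(from glyphs: [Glyph]) -> [Word] {
    var result: [Word] = []
    var text = ""
    var first: Glyph?
    var last: Glyph?

    func finishWord() {
        if let first = first, let last = last {
            // Assumption: the word sits on one horizontal line.
            let rect = CGRect(x: first.origin.x,
                              y: first.origin.y,
                              width: last.origin.x + last.size.width - first.origin.x,
                              height: first.size.height)
            result.append(Word(text: text, pageRect: rect))
        }
        text = ""
        first = nil
        last = nil
    }

    for glyph in glyphs {
        if glyph.character.isWhitespace {
            finishWord()               // a space closes the current word
        } else {
            if first == nil { first = glyph }
            last = glyph
            text.append(glyph.character)
        }
    }
    finishWord()                       // close the last word on the page
    return result
}

// Converts a page-space rect (origin bottom-left) into view coordinates
// (origin top-left). Assumes no rotation and a 1:1 scale.
func viewRect(for pageRect: CGRect, mediaBox: CGRect) -> CGRect {
    return CGRect(x: pageRect.minX - mediaBox.minX,
                  y: mediaBox.maxY - pageRect.maxY,
                  width: pageRect.width,
                  height: pageRect.height)
}

// Made-up sample data: the text "Hi PDF" on a 200 x 100 page.
let mediaBox = CGRect(x: 0, y: 0, width: 200, height: 100)
let glyphs = Array("Hi PDF").enumerated().map { index, character in
    Glyph(character: character,
          origin: CGPoint(x: 10 + 10 * CGFloat(index), y: 80),
          size: CGSize(width: 10, height: 12))
}

// Rectangle to test against, given in view coordinates.
let selection = CGRect(x: 35, y: 0, width: 20, height: 20)
let highlighted = words(from: glyphs)
    .filter { viewRect(for: $0.pageRect, mediaBox: mediaBox).intersects(selection) }
    .map { $0.text }
print(highlighted)   // prints ["PDF"]
```

Skipping the `viewRect(for:mediaBox:)` step would leave the boxes in PDF page space, where y grows upward, so the highlights would land mirrored vertically in a view whose origin is at the top left.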
