ITextSharp: Find coordinates of specific text in PDF


Question

I have found many sites and posts asking the same question as mine, but what they all seem to have in common is that people answer them with examples of how to insert new text at specific locations. I have a PDF document generated by another program that I have no control over. It has a line for a client to sign on, but that line is not at an absolute position, so a service we use called AssureSign will not work properly, because you have to know where the signature line is. So I need to create a new program that finds the position of the signature line and sends that information to the AssureSign system.

This really should be simple, but for some reason I am not getting it.

Recommended answer

You can use the parser package of iText(Sharp) to find the position of a given text. You do have to implement your own RenderListener (IRenderListener in iTextSharp), though, as the main use case of that package is text extraction, not text position finding.

It is not as easy as you might think, though: for example, the individual characters of a word might arrive separately and in any order.
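To illustrate the ordering problem, here is a minimal, library-independent sketch in Java (iText's original language). `Chunk` and `ChunkSearch` are hypothetical names, not iText classes. A `Chunk` stands in for the text and baseline start point that a listener would read from each TextRenderInfo (`getText()` and `getBaseline().getStartPoint()` in iText 5, `GetText()` and `GetBaseline().GetStartPoint()` in iTextSharp). The sketch assumes an unrotated page, chunks on the same line sharing exactly the same baseline y, and no separators between lines.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for what a listener would record per text chunk. */
final class Chunk {
    final String text;
    final float x; // baseline start x, in PDF user-space units
    final float y; // baseline start y (origin at the bottom left of the page)

    Chunk(String text, float x, float y) {
        this.text = text;
        this.x = x;
        this.y = y;
    }
}

final class ChunkSearch {
    /**
     * Puts the chunks into reading order (top to bottom, then left to right),
     * concatenates their text, and returns the chunk in which the first
     * occurrence of needle starts, or null if needle does not occur.
     */
    static Chunk find(List<Chunk> chunks, String needle) {
        List<Chunk> ordered = new ArrayList<>(chunks);
        // Larger y is higher on the page, so sort by descending y, then ascending x.
        ordered.sort(Comparator.<Chunk>comparingDouble(c -> -c.y)
                .thenComparingDouble(c -> c.x));

        // Remember at which offset of the concatenated text each chunk begins.
        StringBuilder all = new StringBuilder();
        int[] offsets = new int[ordered.size()];
        for (int i = 0; i < ordered.size(); i++) {
            offsets[i] = all.length();
            all.append(ordered.get(i).text);
        }

        int hit = all.indexOf(needle);
        if (hit < 0) {
            return null;
        }

        // The match starts in the last chunk whose offset is not beyond the hit.
        Chunk start = null;
        for (int i = 0; i < ordered.size() && offsets[i] <= hit; i++) {
            start = ordered.get(i);
        }
        return start;
    }
}
```

For example, if the chunks "ture" (x = 130) and "Signa" (x = 100) arrive in that order on the same baseline, `ChunkSearch.find(chunks, "Signature")` returns the "Signa" chunk, whose x and y give the start of the word. Real documents usually need a tolerance when comparing baselines, and a separator between lines so that matches cannot span two lines.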

Notes:

First, though, you will have to find out whether the signature line consists of characters (as your question seems to imply) or whether it is a drawn path. You will also have to find out whether that line is unique in the document.

In the former case, the RenderListener implementation you need has to inspect the TextRenderInfo objects forwarded to its RenderText method for processing. If the text content of a TextRenderInfo contains the unique characters that make up the signature line, you have to store the position data of that TextRenderInfo. If the line characters are not unique, you will have to find some additional criterion that makes them unique, e.g. some preceding string, or possibly the fact that it is the last occurrence of those characters in the document.
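The two disambiguation criteria mentioned above (the last occurrence, or an occurrence after some preceding string) can be sketched as follows. This is a library-independent Java illustration, not iText code: `PlacedText` and `SignatureLinePicker` are hypothetical names, the list is assumed to hold the entries in the order in which the listener received them, and a run of underscores is only an assumed example of what the line characters might be.

```java
import java.util.List;

/** Hypothetical record of one text chunk and where its baseline starts. */
final class PlacedText {
    final int page;
    final String text;
    final float x;
    final float y;

    PlacedText(int page, String text, float x, float y) {
        this.page = page;
        this.text = text;
        this.x = x;
        this.y = y;
    }
}

final class SignatureLinePicker {
    /** Returns the last entry whose text contains marker, or null if there is none. */
    static PlacedText lastContaining(List<PlacedText> items, String marker) {
        PlacedText found = null;
        for (PlacedText item : items) {
            if (item.text.contains(marker)) {
                found = item; // keep overwriting so that the last match wins
            }
        }
        return found;
    }

    /**
     * Returns the first entry containing marker that comes after an entry
     * containing anchor, or null if there is no such entry.
     */
    static PlacedText firstContainingAfter(List<PlacedText> items, String anchor, String marker) {
        boolean anchorSeen = false;
        for (PlacedText item : items) {
            if (anchorSeen && item.text.contains(marker)) {
                return item;
            }
            if (item.text.contains(anchor)) {
                anchorSeen = true;
            }
        }
        return null;
    }
}
```

Which criterion fits depends on the document: `lastContaining(items, "____")` suits a signature line that always comes last, while `firstContainingAfter(items, "Client signature", "____")` suits a line that follows a fixed label. Because chunks need not arrive in reading order, it may be necessary to sort the list by position first.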

In the latter case, the functionality of the parser package has to be extended somewhat, as it currently does not report paths. According to the iText mailing list, such an extension is on the to-do list.
