Highlight words in a PDF using iTextSharp, highlighted words not displayed in browser


Words highlighted using iTextSharp are not displayed in the browser.

Adobe

Browser

CODE

List<iTextSharp.text.Rectangle> MatchesFound = strategy.GetTextLocations(splitText[i].Trim(), StringComparison.CurrentCultureIgnoreCase);
foreach (Rectangle rect in MatchesFound)
{
    float[] quad = { rect.Left - 3.0f, rect.Bottom, rect.Right, rect.Bottom, rect.Left - 3.0f, rect.Top + 1.0f, rect.Right, rect.Top + 1.0f };
    //Create our highlight
    PdfAnnotation highlight = PdfAnnotation.CreateMarkup(stamper.Writer, rect, null, PdfAnnotation.MARKUP_HIGHLIGHT, quad);
    //Set the color
    highlight.Color = BaseColor.YELLOW;

    //Add the annotation
    stamper.AddAnnotation(highlight, pageno);
}

Kindly help me to solve this issue.

Updated Code

private void highlightPDF()
{
    //Create a simple test file
    string outputFile = Server.MapPath("~/pdf/16193037V_Dhana-FI_NK-QA_Completed.pdf");
    string filename = "HL" + Convert.ToString(Session["Filename"]) + ".pdf";
    Session["Filename"] = "HL" + Convert.ToString(Session["Filename"]);
    //Create a new file from our test file with highlighting
    string highLightFile = Server.MapPath("~/pdf/" + filename);

    //Bind a reader and stamper to our test PDF

    PdfReader reader = new PdfReader(outputFile);
    iTextSharp.text.pdf.PdfContentByte canvas;
    int pageno = Convert.ToInt16(txtPageno.Text);
    using (FileStream fs = new FileStream(highLightFile, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        using (PdfStamper stamper = new PdfStamper(reader, fs))
        {
            canvas = stamper.GetUnderContent(pageno);
            myLocationTextExtractionStrategy strategy = new myLocationTextExtractionStrategy();
            strategy.UndercontentCharacterSpacing = canvas.CharacterSpacing;
            strategy.UndercontentHorizontalScaling = canvas.HorizontalScaling;

            string currentText = PdfTextExtractor.GetTextFromPage(reader, pageno, strategy);
            string text = txtHighlight.Text.Replace("\r\n", "").Replace("\\n", "\n").Replace("  ", " ");
            string[] splitText = text.Split(new string[] { "\n" }, StringSplitOptions.RemoveEmptyEntries);
            for (int i = 0; i < splitText.Length; i++)
            {
                List<iTextSharp.text.Rectangle> MatchesFound = strategy.GetTextLocations(splitText[i].Trim(), StringComparison.CurrentCultureIgnoreCase);
                foreach (Rectangle rect in MatchesFound)
                {
                    canvas.SaveState();
                    canvas.SetColorFill(BaseColor.YELLOW);
                    canvas.Rectangle(rect);
                    canvas.Fill();
                    canvas.RestoreState();                      
                }
            }

        }
    }
    reader.Close();
}

It's not highlighting the text. I passed the text and the page number to highlight the text.

Solution

First of all...

Why does the OP's (updated) code not work

There actually are two factors.

First of all, there is an issue in the OP's code: to add a rectangle to the path he uses

canvas.Rectangle(rect);

Unfortunately this does not do what he expects: The Rectangle class has multiple properties beyond the mere coordinates of a rectangle, foremost information about selected borders, border colors, and an interior color, and PdfContentByte.Rectangle(Rectangle) draws a rectangle according to those properties.

In the case at hand, though, rect is used only to transport the coordinates of a rectangle, so those additional properties all are false or null. Thus, canvas.Rectangle(rect) does nothing!

Instead the OP should use

canvas.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height);

here.
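
As a minimal sketch of the corrected drawing loop, assuming iTextSharp 5.x and the same kind of match rectangles as in the OP's code (the class and method names `HighlightSketch` and `FillRectangles` are hypothetical, chosen only for illustration):

```csharp
using System.Collections.Generic;
using iTextSharp.text;
using iTextSharp.text.pdf;

static class HighlightSketch
{
    // Hypothetical helper: fills every match rectangle on the given canvas in yellow.
    public static void FillRectangles(PdfContentByte canvas, IEnumerable<Rectangle> matches)
    {
        foreach (Rectangle rect in matches)
        {
            canvas.SaveState();
            canvas.SetColorFill(BaseColor.YELLOW);
            // The coordinate overload always appends a rectangle to the current path,
            // independent of the border and background properties of rect.
            canvas.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height);
            canvas.Fill();
            canvas.RestoreState();
        }
    }
}
```

Alternatively, setting rect.BackgroundColor = BaseColor.YELLOW before calling canvas.Rectangle(rect) should, as far as I can tell, make the Rectangle overload fill the area itself, because the interior color then is no longer null. Either way this only repairs the drawing call; whether the result is visible depends on what else is painted on the page.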

Furthermore, @Bruno mentioned in his answer:

Note that you won't see the yellow rectangle if you add it under an opaque shape (e.g. under an image).

Unfortunately exactly this is the case here: The document actually is a scanned document, each page being a page-filling image under which the equivalent text is drawn (probably after OCR'ing) to allow textual copy&paste.

Thus, whatever the OP's code may draw on the UnderContent, it will be hidden by that very image.

Thus, let's try something different...

How to make it work

@Bruno in his answer also indicated a solution for such a case:

In that case, you could add a transparent rectangle on top of the existing content.

Following this advice we replace

canvas = stamper.GetUnderContent(pageno);

by

canvas = stamper.GetOverContent(pageno);

PdfGState state = new PdfGState();
state.FillOpacity = .3f;
canvas.SetGState(state);

Selecting the word "support" on the third document page we get:

The yellow is quite pale here.

Using an Opacity value of .6 instead we get

Now the yellow is more intense but the text starts to pale out.

For tasks like this I actually prefer using the blend mode Darken. This can be done by using

state.BlendMode = new PdfName("Darken");

instead of state.FillOpacity = .3f. This results in

This IMO looks better.
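
Put together, the over-content variant with the Darken blend mode could look like the following sketch. It assumes iTextSharp 5.x; the class and method names `OverlaySketch` and `HighlightOnTop` are hypothetical, and the rectangle is expected to come from the same text location strategy as before.

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;

static class OverlaySketch
{
    // Hypothetical helper: paints a yellow rectangle on top of the existing page
    // content, blended so that dark text underneath stays dark.
    public static void HighlightOnTop(PdfStamper stamper, int pageno, Rectangle rect)
    {
        PdfContentByte canvas = stamper.GetOverContent(pageno);

        PdfGState state = new PdfGState();
        state.BlendMode = new PdfName("Darken");

        canvas.SaveState();               // keep the graphics state change local
        canvas.SetGState(state);
        canvas.SetColorFill(BaseColor.YELLOW);
        canvas.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height);
        canvas.Fill();
        canvas.RestoreState();
    }
}
```

Wrapping the drawing in SaveState/RestoreState keeps the blend mode from leaking into anything drawn on the over content afterwards.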

How the client did it

The OP commented

Client have given a pdf. In that, they highlighted text, the highlighted text is displayed in browser

The client's PDF actually uses annotations, just like the OP in his original code, but in contrast each of the client's annotations contains an appearance stream, which the highlight annotations generated by iText lack.

Supplying an appearance is optional and PDF viewers indeed should generate an appearance if none is given. Obviously, though, there are numerous PDF viewers which rely on appearances the PDF brings along.

By the way, the appearances in the client's PDF actually use the blend mode Multiply. For underlying white and black colors, Darken and Multiply have the same result.
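
The claim about white and black backdrops can be checked with a little arithmetic. The sketch below uses the per-channel blend functions as defined in the PDF specification (Multiply: backdrop times source; Darken: the minimum of the two), with color components in the range 0 to 1. The class name `BlendSketch` is hypothetical; no PDF library is involved.

```csharp
using System;

static class BlendSketch
{
    // Per-channel blend functions, components in [0, 1].
    static double Multiply(double backdrop, double source) => backdrop * source;
    static double Darken(double backdrop, double source) => Math.Min(backdrop, source);

    static void Main()
    {
        double[] yellow = { 1.0, 1.0, 0.0 };   // RGB components of the highlight color

        // Black (0) and white (1) backdrops: both modes agree on every channel.
        foreach (double backdrop in new[] { 0.0, 1.0 })
        {
            foreach (double source in yellow)
            {
                Console.WriteLine("backdrop={0} source={1} multiply={2} darken={3}",
                    backdrop, source, Multiply(backdrop, source), Darken(backdrop, source));
            }
        }

        // The modes only differ when both values lie strictly between 0 and 1:
        double m = Multiply(0.5, 0.5);   // 0.25
        double d = Darken(0.5, 0.5);     // 0.5
        Console.WriteLine("mid-gray on mid-gray: multiply={0} darken={1}", m, d);
    }
}
```

On a white backdrop both functions return the source color, on a black backdrop both return black, so black text on white paper looks the same under a highlight with either mode.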

Making it work with annotations

In a comment the OP wondered

Please one more doubt, if the user wrongly highlighted then how to remove yellow color(or change yellow to white)? i changed yellow to white but it's not working. canvas.SetColorFill(BaseColor.WHITE);

Undoing a change to the page content generally is more difficult than undoing the addition of an annotation. Thus, let's make the OP's original code also work, i.e. adding an appearance stream to the highlight annotations.

As the OP reported in another comment, his first attempt to add an appearance stream failed:

PdfAppearance appearance = PdfAppearance.CreateAppearance(stamper.Writer, rect.Width, rect.Height);
appearance.Rectangle(rect.Left, rect.Bottom, rect.Width, rect.Height);
appearance.SetColorFill(BaseColor.WHITE);
appearance.Fill();
highlight.SetAppearance( PdfAnnotation.APPEARANCE_NORMAL, appearance );
stamper.AddAnnotation(highlight, pageno);

but it's not working.

The problems in his attempt are:

  • The origin of the appearance template is in the lower left corner of the annotation area, not of the page. To color the area in question, therefore, the rectangle must have its lower left at (0, 0).
  • Strictly speaking the color must be set before starting the path building.
  • A different color than white should be used for highlighting.
  • Transparency or an appropriate blend mode should be used to allow the original, marked text to shine through.

Thus, the following code shows how to do it.

private void highlightPDFAnnotation(string outputFile, string highLightFile, int pageno, string[] splitText)
{
    PdfReader reader = new PdfReader(outputFile);
    using (FileStream fs = new FileStream(highLightFile, FileMode.Create, FileAccess.Write, FileShare.None))
    {
        using (PdfStamper stamper = new PdfStamper(reader, fs))
        {
            myLocationTextExtractionStrategy strategy = new myLocationTextExtractionStrategy();
            strategy.UndercontentHorizontalScaling = 100;

            string currentText = PdfTextExtractor.GetTextFromPage(reader, pageno, strategy);
            for (int i = 0; i < splitText.Length; i++)
            {
                List<iTextSharp.text.Rectangle> MatchesFound = strategy.GetTextLocations(splitText[i].Trim(), StringComparison.CurrentCultureIgnoreCase);
                foreach (Rectangle rect in MatchesFound)
                {
                    float[] quad = { rect.Left - 3.0f, rect.Bottom, rect.Right, rect.Bottom, rect.Left - 3.0f, rect.Top + 1.0f, rect.Right, rect.Top + 1.0f };
                    //Create our highlight
                    PdfAnnotation highlight = PdfAnnotation.CreateMarkup(stamper.Writer, rect, null, PdfAnnotation.MARKUP_HIGHLIGHT, quad);
                    //Set the color
                    highlight.Color = BaseColor.YELLOW;

                    PdfAppearance appearance = PdfAppearance.CreateAppearance(stamper.Writer, rect.Width, rect.Height);
                    PdfGState state = new PdfGState();
                    state.BlendMode = new PdfName("Multiply");
                    appearance.SetGState(state);
                    //Set the fill color before starting to build the path
                    appearance.SetColorFill(BaseColor.YELLOW);
                    appearance.Rectangle(0, 0, rect.Width, rect.Height);
                    appearance.Fill();

                    highlight.SetAppearance(PdfAnnotation.APPEARANCE_NORMAL, appearance);

                    //Add the annotation
                    stamper.AddAnnotation(highlight, pageno);
                }
            }
        }
    }
    reader.Close();
}

These annotations are displayed by Chrome, too, and as annotations they can easily be removed.
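
To illustrate the removal, here is a sketch that drops the highlight annotations from one page again. It assumes iTextSharp 5.x; the class and method names `RemoveHighlightsSketch` and `RemoveHighlights` are hypothetical. Note that it removes every annotation with the subtype Highlight on that page, not only those added by the code above; telling them apart would require some extra marker of your own on the annotations.

```csharp
using System.IO;
using iTextSharp.text.pdf;

static class RemoveHighlightsSketch
{
    // Hypothetical helper: copies inputFile to outputFile without the
    // highlight annotations on the given page.
    public static void RemoveHighlights(string inputFile, string outputFile, int pageno)
    {
        PdfReader reader = new PdfReader(inputFile);
        using (FileStream fs = new FileStream(outputFile, FileMode.Create, FileAccess.Write, FileShare.None))
        using (PdfStamper stamper = new PdfStamper(reader, fs))
        {
            PdfDictionary page = reader.GetPageN(pageno);
            PdfArray annots = page.GetAsArray(PdfName.ANNOTS);
            if (annots != null)
            {
                PdfName highlightSubtype = new PdfName("Highlight");
                // Iterate backwards so that removing an entry does not shift
                // the indices of the entries still to be checked.
                for (int i = annots.Size - 1; i >= 0; i--)
                {
                    PdfDictionary annot = annots.GetAsDict(i);
                    if (annot != null && highlightSubtype.Equals(annot.GetAsName(PdfName.SUBTYPE)))
                    {
                        annots.Remove(i);
                    }
                }
            }
        }
        reader.Close();
    }
}
```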
