Find out what percentage of a PDF page is content with iText

Question

How can I tell, using iText, how much of a page is text? For example, does the text take up 50% of the whole page, or 25%? Is there a way to get the Y coordinate of where the text finishes, so that I know where to write the next piece of text?

Thanks.

Recommended answer

This is definitely not trivial. The simplest implementation, which makes a lot of assumptions, looks roughly like this:

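import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.canvas.parser.EventType;
import com.itextpdf.kernel.pdf.canvas.parser.PdfDocumentContentParser;
import com.itextpdf.kernel.pdf.canvas.parser.data.IEventData;
import com.itextpdf.kernel.pdf.canvas.parser.data.TextRenderInfo;
import com.itextpdf.kernel.pdf.canvas.parser.listener.CharacterRenderInfo;
import com.itextpdf.kernel.pdf.canvas.parser.listener.IEventListener;

import java.util.Set;

class TextMeasurementListener implements IEventListener {

    // Running sum of character bounding-box areas, in square points.
    private float space = 0.0f;

    // Parses the given page immediately, feeding every event to this listener.
    public TextMeasurementListener(PdfDocument pdfDocument, int pageNr)
    {
        new PdfDocumentContentParser(pdfDocument).processContent(pageNr, this);
    }

    @Override
    public void eventOccurred(IEventData data, EventType type) {
        // Only text-rendering events contribute to the measurement.
        if(type != EventType.RENDER_TEXT)
            return;

        TextRenderInfo textRenderInfo = (TextRenderInfo) data;
        // Split the text chunk into single characters and add up their areas.
        for(TextRenderInfo charInfo : textRenderInfo.getCharacterRenderInfos())
        {
            CharacterRenderInfo characterRenderInfo = new CharacterRenderInfo(charInfo);
            space += characterRenderInfo.getBoundingBox().getWidth() * characterRenderInfo.getBoundingBox().getHeight();
        }
    }

    // Despite the name, the result is an area (square points), not a length.
    public float getReservedSpaceInPoints()
    {
        return space;
    }

    @Override
    public Set<EventType> getSupportedEvents() {
        // null means all event types are delivered; eventOccurred filters them.
        return null;
    }
}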

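This approach processes a single page and sums the area of the bounding box of every character on it. Dividing that total by the area of the page gives a rough coverage figure. Overlapping bounding boxes are counted more than once, so treat the result as an estimate.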
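The arithmetic that turns those bounding boxes into an answer to the original question can be sketched in plain Java. This is only a minimal illustration, not part of the answer above: `PageCoverage`, `Box`, `coveragePercent` and `lowestY` are hypothetical names, and `Box` is a stand-in for the rectangle returned by `getBoundingBox()`. It assumes an unrotated page in the default PDF coordinate system, where the origin is at the bottom-left and Y grows upward, so the text "finishes" at the smallest Y value.

```java
import java.util.List;

public class PageCoverage {

    /** Hypothetical stand-in for a character bounding box, in points. */
    public static final class Box {
        final float x, y, width, height; // (x, y) is the lower-left corner

        public Box(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** Summed box area as a percentage of the page area (overlaps count twice). */
    public static double coveragePercent(List<Box> boxes, float pageWidth, float pageHeight) {
        double area = 0.0;
        for (Box b : boxes) {
            area += (double) b.width * b.height;
        }
        return 100.0 * area / ((double) pageWidth * pageHeight);
    }

    /** Lowest Y reached by any box; returns pageHeight when the page has no text. */
    public static float lowestY(List<Box> boxes, float pageHeight) {
        float minY = pageHeight;
        for (Box b : boxes) {
            minY = Math.min(minY, b.y);
        }
        return minY;
    }

    public static void main(String[] args) {
        // Two made-up lines of text on an A4-sized page (595 x 842 points).
        List<Box> boxes = List.of(
                new Box(36f, 700f, 200f, 12f),
                new Box(36f, 680f, 150f, 12f));
        System.out.println("coverage %: " + coveragePercent(boxes, 595f, 842f));
        System.out.println("lowest Y:   " + lowestY(boxes, 842f));
    }
}
```

In a real listener the boxes would presumably be collected inside `eventOccurred`, and the page dimensions read from the page itself (for example via `pdfDocument.getPage(pageNr).getPageSize()`). New text would then start somewhere below the value returned by `lowestY`, minus whatever spacing is wanted.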
