HTML parsing exceptions in iText


Problem description

I have a p:editor in PrimeFaces where users paste Word documents containing email templates, which are then saved in the database.

Now I need to convert this content to PDF. However, what I get back from the database is an HTML conversion of the Word document.

While parsing this HTML content with iText, I run into a lot of errors because of invalid XHTML, such as the following:

<span style="font-family: Arial, Verdana; font-size: 13.3333px;"><img src="9#credit_cards_logos#9"></span>

With the above snippet, I get an error saying that the span tag is invalid and that a closing img tag was expected. When I remove the span tag around the img, it works fine.
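The message is consistent with the parser treating the input as XML: in XHTML an img element has to be closed (for example `<img ... />`), so the parser reaches `</span>` while img is still open. A minimal sketch of a check that flags such fragments, assuming the JDK's own XML parser as a stand-in for iText's (the two may disagree on details such as HTML entities); the class name, method name, and sample file name are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Returns true if the fragment is well-formed XML once wrapped in a single root element. */
    static boolean isWellFormed(String fragment) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader("<root>" + fragment + "</root>")));
            return true;
        } catch (SAXException e) {
            // Malformed markup, e.g. </span> encountered while <img> is still open.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not a verdict on the markup.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String unclosed = "<span><img src=\"logo.png\"></span>";
        String closed = "<span><img src=\"logo.png\" /></span>";
        System.out.println(isWellFormed(unclosed)); // false
        System.out.println(isWellFormed(closed));   // true
    }
}
```

Running a check like this over every stored template would show which ones need repair before they are handed to iText.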

Errors like this are all over the place, and fixing them all by hand is not feasible: the templates are huge and there are hundreds of them.

Here is the function I am using to parse it:

public StreamedContent getFile() throws IOException, DocumentException {
    // Resolve the servlet response behind the portlet response.
    final PortletResponse portletResponse = (PortletResponse) FacesContext.getCurrentInstance()
            .getExternalContext().getResponse();
    final HttpServletResponse res = PortalUtil.getHttpServletResponse(portletResponse);

    // Send the result as a PDF download.
    res.setContentType("application/pdf");
    res.setHeader("Cache-Control", "no-store, no-cache, must-revalidate");
    res.setHeader("Content-Disposition", "attachment; filename=" + subject + ".pdf");
    res.setHeader("Refresh", "1");
    res.flushBuffer();

    // Render the PDF into memory first, then copy it to the response.
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    OutputStream out = res.getOutputStream();
    Document document = new Document(PageSize.LETTER);
    PdfWriter pdfWriter = PdfWriter.getInstance(document, baos);
    document.open();
    document.addCreationDate();

    // Parse the stored HTML; XML Worker expects well-formed XHTML here.
    XMLWorkerHelper worker = XMLWorkerHelper.getInstance();
    //htmlWorker.parse(new StringReader(getMessage()));
    worker.parseXHtml(pdfWriter, document, new StringReader(getMessage()));
    document.close();

    baos.writeTo(out);
    out.flush();
    out.close();
    return null;
}

Is there a workaround?

EDIT____________

Is there something in PrimeFaces like p:dataExporter (which only works for data tables) that will convert the content to PDF without having to parse the HTML?

Recommended answer

Is there something in PrimeFaces like p:dataExporter (which only works for data tables) that will convert the content to PDF without having to parse the HTML?

No, there is not.
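As for the parse errors themselves, one possible workaround is to repair the markup before it reaches parseXHtml. The sketch below is only an illustration, not part of the original answer: it assumes the sole problem is void elements (img, br, and so on) written without a closing slash, that attribute values contain no `>` character, and that no explicit `</img>`-style closing tags are present. The class and method names are hypothetical. For anything messier, a dedicated HTML cleaner such as jsoup or JTidy is likely to be more robust than a regular expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VoidTagCloser {

    // Void elements that HTML editors commonly emit without a closing slash.
    // The lookahead ensures the tag name ends here (so <br> matches but <bring> does not).
    private static final Pattern VOID_TAG = Pattern.compile(
            "<(img|br|hr|input|meta|link)(?=[\\s/>])([^>]*?)\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    /** Rewrites void elements such as <img ...> into the self-closed form <img ... />. */
    static String closeVoidTags(String html) {
        Matcher m = VOID_TAG.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // group(1) is the tag name, group(2) the attributes without any trailing slash.
            String selfClosed = "<" + m.group(1) + m.group(2) + " />";
            m.appendReplacement(out, Matcher.quoteReplacement(selfClosed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<span style=\"font-family: Arial, Verdana; font-size: 13.3333px;\">"
                + "<img src=\"9#credit_cards_logos#9\"></span>";
        System.out.println(closeVoidTags(html));
    }
}
```

In the function from the question, the cleaned string would be passed on in place of the raw one, for example `new StringReader(closeVoidTags(getMessage()))`. Tags that are already self-closed are left equivalent, so running the cleanup twice is harmless.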
