C# converting PDF to HTML

Question

Is there a .dll that takes a .pdf file as input and produces an .html file as output? I want to convert .pdf to .html. My colleague says that doing it step by step (extracting the text, fonts, images, margins, links, etc. from the PDF and then building a new HTML file with the same content) is very difficult, nearly impossible. So I was wondering whether there is a DLL I can reference that does this.

Recommended answer

Writing a program to do it is definitely not trivial. If you can't find a .NET library that does this (I couldn't, at least not a free one), I would just download PDFToHtml and invoke it programmatically to get the HTML.
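To illustrate what invoking an external converter from C# might look like, here is a minimal sketch. It assumes the tool is the `pdftohtml` command-line converter (the flags `-noframes` and `-q` are the ones documented for the Poppler build; other builds may differ). The class name `PdfToHtmlRunner`, its methods, and the file names are hypothetical, chosen only for this example.

```csharp
using System;
using System.Diagnostics;

public static class PdfToHtmlRunner
{
    // Builds the command line: -noframes asks for a single HTML file,
    // -q suppresses informational messages (Poppler's pdftohtml flags; an assumption).
    // Paths are wrapped in quotes so that spaces survive; paths that themselves
    // contain quote characters are not handled by this sketch.
    public static string BuildArguments(string pdfPath, string outputBase)
    {
        return $"-noframes -q \"{pdfPath}\" \"{outputBase}\"";
    }

    // Runs the converter and returns its exit code (0 normally means success).
    // exePath is the path to the pdftohtml executable, or just "pdftohtml" if it is on PATH.
    public static int Run(string exePath, string pdfPath, string outputBase)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = BuildArguments(pdfPath, outputBase),
            UseShellExecute = false,       // required in order to redirect streams
            RedirectStandardError = true,  // capture error text from the tool
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read stderr fully before waiting, so a full pipe cannot block the tool.
            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();

            if (process.ExitCode != 0)
            {
                Console.Error.WriteLine(errors);
            }
            return process.ExitCode;
        }
    }
}
```

A call such as `PdfToHtmlRunner.Run("pdftohtml", "input.pdf", "output")` would then be expected to leave the generated HTML next to the given output base name; check the tool's own documentation for the exact file names it writes.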

If you have time to spare, or PDFToHtml does not produce acceptable output for you, you could use iText to write the program yourself. It's a very mature, free PDF library. I've used it in the past to manipulate PDFs (merging, creating, etc.).
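As a rough idea of where a do-it-yourself converter could start, the sketch below pulls the plain text of each page and wraps it in very simple HTML. It assumes the iText 7 for .NET package (namespaces `iText.Kernel.Pdf` and `iText.Kernel.Pdf.Canvas.Parser`); the class `PdfTextToHtml` and its methods are hypothetical names for this example. It handles text only, so fonts, images, margins, and links, which are the hard part the question describes, are not covered.

```csharp
using System.IO;
using System.Net;
using System.Text;
using iText.Kernel.Pdf;
using iText.Kernel.Pdf.Canvas.Parser;

public static class PdfTextToHtml
{
    // Wraps one page of extracted text in a <section>, HTML-encoding the text
    // so that characters such as < and & cannot break the markup.
    public static string WrapPage(int pageNumber, string text)
    {
        return $"<section id=\"page-{pageNumber}\"><pre>{WebUtility.HtmlEncode(text)}</pre></section>\n";
    }

    // Reads every page of the PDF and writes a single HTML file with the text.
    public static void Convert(string pdfPath, string htmlPath)
    {
        var html = new StringBuilder("<!DOCTYPE html>\n<html><body>\n");

        using (var pdf = new PdfDocument(new PdfReader(pdfPath)))
        {
            // iText page numbers start at 1.
            for (int page = 1; page <= pdf.GetNumberOfPages(); page++)
            {
                string text = PdfTextExtractor.GetTextFromPage(pdf.GetPage(page));
                html.Append(WrapPage(page, text));
            }
        }

        html.Append("</body></html>\n");
        File.WriteAllText(htmlPath, html.ToString(), Encoding.UTF8);
    }
}
```

Usage would be along the lines of `PdfTextToHtml.Convert("input.pdf", "output.html")`. Reproducing the original layout would require working with the positions and styles that the library reports for each piece of content, which is where most of the effort goes.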

Update

As noted in the comment by Quandary, the PDFSharp library has a more permissive license (MIT) than iText, which is offered under a commercial or AGPL license. Keep this in mind when choosing your library. I have not used PDFSharp myself, so I don't know how the two compare in functionality.
