In ASP.NET what is the best way to convert a PDF file to HTML?

Question

My users will select a PDF document on their machine and upload it to my website, where I will convert it into an HTML document for display on the site. The document will be stored in a database after conversion.

What's the best way to convert a PDF to HTML?

I have been handed a requirement where a user would create a "news" story as a PDF and then upload it to the server, where it will be converted to HTML and displayed on the website.

Solution

Any document creation software that can save documents as PDF can save them as HTML. I'm assuming the issue is that your users will be creating rich documents (lots of embedded images), which results in multiple files, and your requirements stem from a desire to make uploading these documents as simple as possible to the user.

There are numerous conversion packages that can probably do this for you. However, when you're talking about rich content, you are talking about text plus images. Those images have to be stored somewhere and served somehow, and whatever conversion method you use will require you to examine all image sources to make sure they point to valid locations on your server.
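To make the image-source check concrete, here is a minimal sketch of what "examine all image sources" could look like after a conversion step. It is written in Python for brevity, and the same logic would port to C# in an ASP.NET application. The names `ImgSrcCollector`, `find_broken_images`, and `media_root` are hypothetical, and the sketch assumes that images are served from local files under one directory and referenced by site-relative paths.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_broken_images(html, media_root):
    """Return the img src values that do not map to a file under media_root.

    Assumption (not from the original answer): images live as local files
    under media_root and are referenced by paths such as "/img/a.png".
    Anything with a scheme or host (http:, data:, //cdn...) is reported
    as broken because it cannot be checked against the local disk.
    """
    collector = ImgSrcCollector()
    collector.feed(html)
    root = Path(media_root).resolve()
    broken = []
    for src in collector.sources:
        parsed = urlparse(src)
        if parsed.scheme or parsed.netloc:
            broken.append(src)
            continue
        candidate = (root / parsed.path.lstrip("/")).resolve()
        # Reject paths that escape media_root or that do not exist as files.
        if root not in candidate.parents or not candidate.is_file():
            broken.append(src)
    return broken
```

Running a check like this on the converter's output before storing the HTML in the database would surface images that the converter wrote to a temporary folder or referenced by a path the website cannot serve.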

I would like to suggest an alternate way of doing this that you can take to your team: Implement one of the many blog APIs for publishing content. There are free and commercial software packages that use these APIs to publish content directly to a website, such as Windows Live Writer and Microsoft Word. Your users can simply create their content and upload it directly to your website without having to publish it as a PDF first and then upload it. So the process becomes much smoother for your users, and you get the posts in a form that doesn't require you to spend thousands of dollars on developing or buying conversion code.

The two most common APIs are the MetaWeblog API and the Movable Type API. Both are very simple and easy to implement. I think this way would be a MUCH better alternative than what you're thinking about doing.
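As a rough illustration of how small the MetaWeblog surface is, the sketch below builds and decodes a `metaWeblog.newPost` call, which is an XML-RPC request carrying a blog ID, username, password, a post struct (`title`, `description`), and a publish flag. It uses Python's standard `xmlrpc.client` only to show the wire format. The helper names `build_new_post_request` and `parse_new_post_request` are hypothetical, and a real ASP.NET endpoint would do the equivalent decoding with an XML-RPC library for .NET.

```python
import xmlrpc.client


def build_new_post_request(blog_id, username, password, title, html_body, publish=True):
    """Serialize a metaWeblog.newPost call as an XML-RPC request body.

    This mimics what a desktop blogging client would send to the site.
    """
    post = {"title": title, "description": html_body}
    params = (blog_id, username, password, post, publish)
    return xmlrpc.client.dumps(params, methodname="metaWeblog.newPost")


def parse_new_post_request(xml_body):
    """Decode a request body the way a server endpoint would need to."""
    params, method = xmlrpc.client.loads(xml_body)
    if method != "metaWeblog.newPost":
        raise ValueError(f"unsupported method: {method}")
    blog_id, username, password, post, publish = params
    # Authentication is omitted here; a real endpoint must verify the
    # username and password before storing anything.
    return {
        "blog_id": blog_id,
        "username": username,
        "title": post["title"],
        "html": post["description"],
        "publish": publish,
    }
```

The post body arrives as HTML in the `description` field, so no PDF conversion step is involved. Image handling still needs attention, since any images referenced by the post have to be uploaded and served separately.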
