MS Word documents to RTF documents


Question

I have a problem: my application must convert MS Word documents (imported from another system) into RTF documents, so that they can be manipulated with the OOo APIs without errors caused by encoding incompatibilities.

My question: how can I manipulate MS Word documents directly from my Java application? Are there APIs (such as POI or OOo) that let me do this without any encoding incompatibility?

My system runs on Linux server machines (as all public production systems do), and I have only OOo installed.

Using the OOo Java APIs I can open, manipulate, and save the documents, but lately I have been seeing many problems caused by incompatibility between the closed MS Word format and the OOo OpenDocument format (I am referring to swriter). In many cases, such as lists with particular bullets (e.g., '-', or nested lists), page numbering (e.g., the "1 of x" format), and many other formatting options, the output document produced by the manipulation shows many errors that are due, I think, to incompatibility between the two formats.

Now I am studying the capabilities of Apache POI to find out whether I can open MS Word documents with it and save them as RTF, an interchange format that should reduce the incompatibility to a minimum.

Have you had the same problem? Can you point me to an open source Java library more powerful than POI? Or can you suggest a combined approach, such as POI + iText, for the MS Word to RTF conversion step?

Recommended answer

When I was asked to provide a way to reliably convert a .doc to a TIFF, I did some research. There are a number of libraries out there, both free and commercial, that claim to be able to render MS Word documents. None of them provides 100% accurate rendering.

The way I had to do it was to run MS Word in a wrapper and drive it through OLE Automation to do what I needed. Running Word in the background has quite a few gotchas of its own, but with thoughtful design you can make it work.

Your case is even easier than mine, because all you need to do is open the document and then save it as RTF.
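As a rough illustration of that open-and-save step, here is a minimal Java sketch that hands a small VBScript to `cscript`, which in turn drives Word through OLE Automation. It is not the wrapper described in this answer. It assumes a Windows host with MS Word and Windows Script Host installed, so it would have to run on a separate conversion machine rather than on the Linux servers from the question. The class and method names (`WordRtfAutomation`, `buildCommand`, `convert`) are made up for the example; `6` is the value of Word's `wdFormatRTF` constant.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: converts one document by automating MS Word via VBScript.
class WordRtfAutomation {

    // Value of the WdSaveFormat constant wdFormatRTF in the Word object model.
    static final int WD_FORMAT_RTF = 6;

    // The script takes the input path and output path as arguments.
    // It has no error handling: if a call fails, Word may be left running.
    static final String SCRIPT = String.join("\r\n",
            "Set word = CreateObject(\"Word.Application\")",
            "word.Visible = False",
            "word.DisplayAlerts = 0",
            // Open(FileName, ConfirmConversions, ReadOnly)
            "Set doc = word.Documents.Open(WScript.Arguments(0), False, True)",
            "doc.SaveAs WScript.Arguments(1), " + WD_FORMAT_RTF,
            "doc.Close False",
            "word.Quit",
            "");

    // Word does not share the JVM's working directory, so pass absolute paths.
    static List<String> buildCommand(Path script, Path input, Path output) {
        return Arrays.asList(
                "cscript", "//NoLogo",
                script.toAbsolutePath().toString(),
                input.toAbsolutePath().toString(),
                output.toAbsolutePath().toString());
    }

    static void convert(Path input, Path output) throws IOException, InterruptedException {
        Path script = Files.createTempFile("doc2rtf", ".vbs");
        try {
            Files.write(script, SCRIPT.getBytes(StandardCharsets.US_ASCII));
            Process process = new ProcessBuilder(buildCommand(script, input, output))
                    .inheritIO()
                    .start();
            int exitCode = process.waitFor();
            // Check the output file as well, rather than trusting the exit code alone.
            if (exitCode != 0 || !Files.exists(output)) {
                throw new IOException("Word conversion failed for " + input);
            }
        } finally {
            Files.deleteIfExists(script);
        }
    }
}
```

A call such as `WordRtfAutomation.convert(Paths.get("in.doc"), Paths.get("out.rtf"))` would run one conversion. Timeouts, cleanup of stuck Word processes, and serializing concurrent requests are left out here, and they are likely among the gotchas that need the thoughtful design mentioned above.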

Edit

@Paolo - There you go. I've been through the same thing: evaluating various packages, OO included, and finding that they are, mmmm... less than precise. Of course, it all depends on how strict your customers are about document formatting. Mine were extremely picky, right down to margin sizes and picture positioning.

Another option would be to give the customer a list of known imprecisions and get it approved. Unfortunately, with every new document you run the chance of hitting a new one.
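If the OpenOffice route is kept despite those imprecisions, the conversion step itself can be run headless on the Linux server. The sketch below shells out to the office binary from Java instead of using the UNO API. The `--headless`, `--convert-to`, and `--outdir` switches are LibreOffice command-line options; whether the installed OOo build accepts them is an assumption to verify. The class name `HeadlessRtfConverter` and its methods are invented for the example. The document still passes through the same import filters, so this is not expected to fix the bullet and page-numbering problems from the question.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: converts one document to RTF with a headless office process.
class HeadlessRtfConverter {

    // Assumption: the binary (for example "soffice") understands these switches.
    static List<String> buildCommand(String sofficeBinary, Path input, Path outputDir) {
        return Arrays.asList(
                sofficeBinary,
                "--headless",
                "--convert-to", "rtf",
                "--outdir", outputDir.toString(),
                input.toString());
    }

    // The converter keeps the base name and swaps the extension for ".rtf".
    static String rtfName(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return (dot > 0 ? name.substring(0, dot) : name) + ".rtf";
    }

    static Path convert(String sofficeBinary, Path input, Path outputDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(sofficeBinary, input, outputDir))
                .inheritIO()
                .start();
        // Guard against a hung office process; 120 seconds is an arbitrary limit.
        if (!process.waitFor(120, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("Conversion timed out for " + input);
        }
        if (process.exitValue() != 0) {
            throw new IOException("Office process exited with code " + process.exitValue());
        }
        return outputDir.resolve(rtfName(input));
    }
}
```

Keeping `buildCommand` separate from `convert` makes it easy to log or adjust the exact command line when checking which switches the installed version supports.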
