How to load an old Microsoft Office XML file (Excel) using Java


Problem description

I'm not able to load an Excel file in the older Office XML format (think Office 2002 or 2003 version) into Java. I tried JXL and Apache's POI (version 3.7). POI doesn't work since it appears to want the newer Office .xlsx format.

Here's an example of the older Office XML format.

One can generate a similar XML file from MS Excel 2010 by saving the workbook in the "XML Spreadsheet 2003" format.

Are there any open-source Java libraries that will load the XMLSS format? Otherwise I have no choice but to write a custom parser: read the XML file, then interpret the cell tags to build out the cell matrix. In this XML format, empty cells are skipped, and the next cell that holds data is positioned with an Index attribute that gives its column, I assume to save space in the XML file.

Solution

Copying Mark Beardsley's answer from the POI mailing list (http://apache-poi.1045710.n5.nabble.com/How-to-convert-xml-to-xls-td2306602.html):

You have got an Office 2003 xml file there, not an OpenXML file; it is an early attempt by Microsoft to create an xml based file format for Excel and it is in that sense a 'valid' Office file format.

Sadly, POI cannot interpret this file at all, and that is why you saw the exception when you wrapped it in an InputStream and passed it to WorkbookFactory. You do, however, have a number of options:

  • You could use Excel itself and manually open and save each file you wish to convert, as you already have done.
  • If you have access to Visual Studio and can write Visual Basic or C# code, then you could use a control that will allow you to control Excel programmatically. This way you could automate a file conversion process using Excel itself. Then, once the file has been converted either to the binary or OpenXML format, POI can be used to process it.
  • If you are running on a standalone Windows PC on which a copy of Excel is installed, then you could use OLE to do something very similar from Java code. As above, POI can be used to process the file following the conversion.
  • If you have access to OpenOffice, it has a rather good API that is accessible from Java code. You could use it to convert between the file types for you; it is simply a matter of discovering the correct filter to use in this case. OpenOffice is good for all except the most complex files, and you should be able to use POI to process the file following conversion. However, if you choose this route, it may be best to do all of the work using OpenOffice's UNO API.
  • Depending upon what you want to do with the file's contents, you could create your own parser using core Java code and either the SAX or Xerces parsers (consider using XMLBeans (http://xmlbeans.apache.org/)). If you simply open the original XML file in a simple text editor, you can see that the structure is not complex and, if all you wish to get at is the raw data it contains, this could be your best option.
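
As a rough illustration of the last option, here is a minimal sketch of a SAX-based reader that uses only the JDK. The class name `XmlssReader` and its `parse` method are hypothetical, not part of POI or any other library. The sketch assumes a single worksheet, treats `ss:Index` on `Row` and `Cell` as a 1-based position, and returns every value as a string. It ignores data types, styles, merged cells and cell comments.

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: reads an XML Spreadsheet 2003 file into rows of strings. */
final class XmlssReader {
    // Namespace used by XML Spreadsheet 2003 elements and ss:* attributes.
    private static final String SS_NS = "urn:schemas-microsoft-com:office:spreadsheet";

    static List<List<String>> parse(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        final List<List<String>> rows = new ArrayList<List<String>>();

        factory.newSAXParser().parse(in, new DefaultHandler() {
            private List<String> row;    // row currently being filled
            private int nextCol;         // 0-based column of the current cell
            private StringBuilder text;  // non-null only while inside <Data>

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (!SS_NS.equals(uri)) return;
                if ("Row".equals(local)) {
                    // ss:Index on a row means earlier rows were omitted; pad with empty rows.
                    String idx = atts.getValue(SS_NS, "Index");
                    if (idx != null) {
                        int target = Integer.parseInt(idx) - 1;
                        while (rows.size() < target) rows.add(new ArrayList<String>());
                    }
                    row = new ArrayList<String>();
                    nextCol = 0;
                } else if ("Cell".equals(local)) {
                    // ss:Index on a cell jumps to that 1-based column.
                    String idx = atts.getValue(SS_NS, "Index");
                    if (idx != null) nextCol = Integer.parseInt(idx) - 1;
                } else if ("Data".equals(local)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (!SS_NS.equals(uri)) return;
                if ("Data".equals(local) && row != null && text != null) {
                    while (row.size() < nextCol) row.add(null); // fill skipped cells
                    row.add(text.toString());
                    text = null;
                } else if ("Cell".equals(local)) {
                    nextCol++;
                } else if ("Row".equals(local) && row != null) {
                    rows.add(row);
                    row = null;
                }
            }
        });
        return rows;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java XmlssReader workbook.xml
        try (InputStream in = new FileInputStream(args[0])) {
            for (List<String> r : parse(in)) System.out.println(r);
        }
    }
}
```

Skipped cells come back as `null` and skipped rows as empty lists, so the result keeps the same row and column positions as the sheet. If the data then needs to go into POI, the rows could be written into a new workbook cell by cell.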
