无法使用 Apache POI 读取 Excel 2010 文件.第一行号是 -1 [英] Can't read Excel 2010 file with Apache POI. First Row number is -1

查看:41
本文介绍了无法使用 Apache POI 读取 Excel 2010 文件.第一行号是 -1的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用 Apache POI API(当前版本 3-10-FINAL).以下测试代码

I am trying the this testfile with the Apache POI API (current version 3-10-FINAL). The following test code

import java.io.FileInputStream;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelTest {

    public static void main(String[] args) throws Exception {
        String filename = "testfile.xlsx";
        XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(filename));
        XSSFSheet sheet = wb.getSheetAt(0);
        System.out.println(sheet.getFirstRowNum());
    }
}

导致第一个行号为 -1(并且现有行返回为空).测试文件是由 Excel 2010 创建的(我无法控制那部分),可以使用 Excel 读取而不会出现警告或问题.如果我用我的 Excel (2013) 版本打开并保存文件,它可以按预期完美读取.

results in the first row number to be -1 (and existing rows come back as null). The test file was created by Excel 2010 (I have no control over that part) and can be read with Excel without warnings or problems. If I open and save the file with my version of Excel (2013) it can be read perfectly as expected.

任何有关为什么我无法读取原始文件或如何读取的提示,我们都非常感谢.

Any hints into why I can't read the original file or how I can is highly appreciated.

推荐答案

testfile.xlsx 是使用SpreadsheetGear 7.1.1.120"创建的.使用可以处理 ZIP 档案的软件打开 XLSX 文件,然后查看 /xl/workbook.xml 以查看.在 worksheets/sheet?.xml 文件中要注意所有行元素都没有行号.如果我在第一个行标记中放置一个行号,例如 <row r="1"> 然后 apache POI 可以读取该行.

The testfile.xlsx is created with "SpreadsheetGear 7.1.1.120". Open the XLSX file with a software which can deal with ZIP archives and look into /xl/workbook.xml to see that. In the worksheets/sheet?.xml files is to notice that all row elements are without row numbers. If I put a row number in the first row-tag like <row r="1"> then apache POI can read this row.

如果说到谁该为此负责的问题,那么答案肯定是 Apache Poi 和 SpreadsheetGear ;-).Apache POI 因为 row 元素中的属性 r 是可选的.但 SpreadsheetGear 也是因为如果 Excel 本身曾经使用过 r 属性,则没有理由不使用它.

If it comes to the question, who is to blame for this, then the answer is definitely both Apache Poi and SpreadsheetGear ;-). Apache POI because the attribute r in the row element is optional. But SpreadsheetGear also because there is no reason not to use the r attribute if Excel itself does it ever.

如果您无法以 Apache POI 可以直接读取的格式获取 testfile.xlsx,那么您必须使用底层对象.以下适用于您的 testfile.xlsx:

If you cannot get the testfile.xlsx in a format which can Apache POI read directly, then you must work with the underlying objects. The following works with your testfile.xlsx:

import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.FileInputStream;
import java.io.InputStream;

import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow;

import java.util.List;

class Testfile {

 public static void main(String[] args) {
  try {

   InputStream inp = new FileInputStream("testfile.xlsx");
   Workbook wb = WorkbookFactory.create(inp);

   Sheet sheet = wb.getSheetAt(0);

   System.out.println(sheet.getFirstRowNum());

   CTWorksheet ctWorksheet = ((XSSFSheet)sheet).getCTWorksheet();

   CTSheetData ctSheetData = ctWorksheet.getSheetData();

   List<CTRow> ctRowList = ctSheetData.getRowList();

   Row row = null;
   Cell[] cell = new Cell[2];

   for (CTRow ctRow : ctRowList) {
    row = new MyRow(ctRow, (XSSFSheet)sheet);
    cell[0] = row.getCell(0);
    cell[1] = row.getCell(1);
    if (cell[0] != null && cell[1] != null && cell[0].toString() != "" && cell[1].toString() != "") 
       System.out.println(cell[0].toString()+"\t"+cell[1].toString());
   }

  } catch (InvalidFormatException ifex) {
  } catch (FileNotFoundException fnfex) {
  } catch (IOException ioex) {
  }
 }
}

class MyRow extends XSSFRow {
 MyRow(org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow row, XSSFSheet sheet) {
  super(row, sheet);
 }
}

我用过:

  • org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet
  • org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData
  • org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow

哪些是 Apache POI Binary Distribution poi-bin-3.10.1-20140818 的一部分,并且在 poi-ooxml-schemas-3.10.1-20140818.jar 中代码>

Which are part of the Apache POI Binary Distribution poi-bin-3.10.1-20140818 and there are within poi-ooxml-schemas-3.10.1-20140818.jar

有关文档,请参阅 http://grepcode.com/snapshot/repo1.maven.org/maven2/org.apache.poi/ooxml-schemas/1.1/

并且我扩展了 XSSFRow,因为我们不能直接使用 XSSFRow 构造函数,因为它具有受保护的访问权限.

And I have extend XSSFRow, because we can't use the XSSFRow constructor directly since it has protected access.

这篇关于无法使用 Apache POI 读取 Excel 2010 文件.第一行号是 -1的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆