Can't read Excel 2010 file with Apache POI. First row number is -1

Question

I am trying to read this test file with the Apache POI API (current version 3.10-FINAL). The following test code

import java.io.FileInputStream;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelTest {

    public static void main(String[] args) throws Exception {
        String filename = "testfile.xlsx";
        XSSFWorkbook wb = new XSSFWorkbook(new FileInputStream(filename));
        XSSFSheet sheet = wb.getSheetAt(0);
        System.out.println(sheet.getFirstRowNum());
    }
}

results in a first row number of -1 (and existing rows come back as null). The test file was created by Excel 2010 (I have no control over that part) and can be opened in Excel without warnings or problems. If I open and save the file with my version of Excel (2013), it can be read perfectly, as expected.

Any hints as to why I can't read the original file, or how I can, are highly appreciated.

Recommended answer

The testfile.xlsx was created with "SpreadsheetGear 7.1.1.120". To see that, open the XLSX file with any tool that can handle ZIP archives and look into /xl/workbook.xml. In the worksheets/sheet?.xml files, note that none of the row elements carries a row number. If I put a row number into the first row tag, like <row r="1">, then Apache POI can read this row.
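
The inspection described above can also be scripted. The following is a minimal sketch using only the JDK. The class name RowAttributeCheck, the method countRowsWithoutR, and the entry path xl/worksheets/sheet1.xml are assumptions for illustration, so adjust them to the actual workbook.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RowAttributeCheck {

    // Counts <row> elements in a worksheet XML stream that lack the "r" attribute.
    static int countRowsWithoutR(InputStream sheetXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(sheetXml);

        // "*" matches any namespace, so the SpreadsheetML namespace need not be spelled out.
        NodeList rows = doc.getElementsByTagNameNS("*", "row");
        int missing = 0;
        for (int i = 0; i < rows.getLength(); i++) {
            Element row = (Element) rows.item(i);
            if (!row.hasAttribute("r")) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // Assumed file and entry names; a workbook may name its sheet parts differently.
        try (ZipFile zip = new ZipFile("testfile.xlsx")) {
            ZipEntry entry = zip.getEntry("xl/worksheets/sheet1.xml");
            if (entry == null) {
                System.out.println("Sheet entry not found");
                return;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                System.out.println("Rows without r attribute: " + countRowsWithoutR(in));
            }
        }
    }
}
```

A non-zero count indicates that the file has the problem described here.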

As to the question of who is to blame for this, the answer is definitely both Apache POI and SpreadsheetGear ;-). Apache POI, because the r attribute of the row element is optional, so a reader should cope with its absence. But SpreadsheetGear too, because there is no reason to omit the r attribute when Excel itself always writes it.
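
Because r is optional, one possible workaround is to write the missing row numbers into the worksheet XML before handing the file to Apache POI. The sketch below is not part of the original answer. It uses only the JDK, the names RowNumberer and addRowNumbers are hypothetical, and it assumes that a row without r directly follows the preceding row (starting at 1), which should be checked against the file format specification. It only touches row elements; whether cells also need their own r references is not covered here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RowNumberer {

    // Returns the worksheet XML with an explicit r attribute on every <row>.
    // Assumption: a row without r directly follows the previous row.
    static String addRowNumbers(String sheetXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(sheetXml)));

        NodeList rows = doc.getElementsByTagNameNS("*", "row");
        long next = 1; // row indexes in the file format are 1-based
        for (int i = 0; i < rows.getLength(); i++) {
            Element row = (Element) rows.item(i);
            if (row.hasAttribute("r")) {
                // Keep an existing number and continue counting after it.
                next = Long.parseLong(row.getAttribute("r")) + 1;
            } else {
                row.setAttribute("r", Long.toString(next));
                next++;
            }
        }

        // Serialize the modified DOM back to a string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
                + "<sheetData><row><c/></row><row r=\"5\"/><row/></sheetData></worksheet>";
        System.out.println(addRowNumbers(xml));
    }
}
```

The rewritten XML would then have to be put back into the ZIP archive in place of the original sheet entry.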

If you cannot get the testfile.xlsx in a format that Apache POI can read directly, then you must work with the underlying objects. The following works with your testfile.xlsx:

import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;

import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.FileInputStream;
import java.io.InputStream;

import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData;
import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow;

import java.util.List;

class Testfile {

 public static void main(String[] args) {
  try {

   InputStream inp = new FileInputStream("testfile.xlsx");
   Workbook wb = WorkbookFactory.create(inp);

   Sheet sheet = wb.getSheetAt(0);

   System.out.println(sheet.getFirstRowNum());

   CTWorksheet ctWorksheet = ((XSSFSheet)sheet).getCTWorksheet();

   CTSheetData ctSheetData = ctWorksheet.getSheetData();

   List<CTRow> ctRowList = ctSheetData.getRowList();

   Row row = null;
   Cell[] cell = new Cell[2];

   for (CTRow ctRow : ctRowList) {
    row = new MyRow(ctRow, (XSSFSheet)sheet);
    cell[0] = row.getCell(0);
    cell[1] = row.getCell(1);
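    // Compare string contents with isEmpty(); != would only compare references
    if (cell[0] != null && cell[1] != null && !cell[0].toString().isEmpty() && !cell[1].toString().isEmpty())
       System.out.println(cell[0].toString()+"\t"+cell[1].toString());
   }

  } catch (InvalidFormatException ifex) {
   ifex.printStackTrace();
  } catch (FileNotFoundException fnfex) {
   fnfex.printStackTrace();
  } catch (IOException ioex) {
   ioex.printStackTrace();
  }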
 }
}

class MyRow extends XSSFRow {
 MyRow(org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow row, XSSFSheet sheet) {
  super(row, sheet);
 }
}

I have used org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet, org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData and org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow. These are part of the Apache POI binary distribution poi-bin-3.10.1-20140818 and are located in poi-ooxml-schemas-3.10.1-20140818.jar.

For documentation see http://grepcode.com/snapshot/repo1.maven.org/maven2/org.apache.poi/ooxml-schemas/1.1/

I have extended XSSFRow because the XSSFRow constructor cannot be called directly: it has protected access.