Apache POI not returning the proper value for large numbers coming from Excel


Problem description

I have an Excel file containing the value 6228480018362050000. The exported CSV looks like this...

Int,Bigint,String
1,6228480018362050000,Very big

When I try running the following code...

InputStream inp = new FileInputStream("/.../test.xlsx");
DataFormatter df = new DataFormatter(true);
df.formatCellValue(WorkbookFactory.create(inp).getSheetAt(0).getRow(1).getCell(1));

I get 6228480018362049500 which is the wrong number because precision is hosed. Is there a way to get the actual value?

Solution

If we put long numbers into Excel cells, they are truncated to 15 significant digits. Excel has no notion of big integers; it stores every numeric value as a floating-point number following the IEEE 754 specification, and not every number can be represented exactly in that format. In your example, 6228480018362050000 (that is, 6.22848001836205E+018) cannot be stored as such. The nearest double is the one Java prints as 6.2284800183620495E+018, or 6228480018362049500 in plain notation.
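
To make the rounding visible, here is a minimal sketch using only the Java standard library. The class and method names (DoublePrecisionDemo, exactStoredValue) are made up for illustration and are not part of POI. It prints the exact decimal value held by the double nearest to the wanted number, and the gap between neighbouring doubles at that magnitude.

```java
import java.math.BigDecimal;

class DoublePrecisionDemo {

    // Hypothetical helper: exact decimal expansion of the double
    // nearest to the given decimal text.
    static String exactStoredValue(String decimalText) {
        double d = Double.parseDouble(decimalText);
        // new BigDecimal(double) converts the binary value exactly, without rounding
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        String wanted = "6228480018362050000";
        double d = Double.parseDouble(wanted);

        System.out.println(exactStoredValue(wanted)); // 6228480018362049536
        System.out.println(Math.ulp(d));              // 1024.0, spacing between adjacent doubles here
    }
}
```

Since neighbouring doubles near 6.2E+18 are 1024 apart, the wanted value falls between two representable numbers and is stored as the closer one.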

Microsoft's knowledge base mentions: "Excel follows the IEEE 754 specification on how to store and calculate floating-point numbers. Excel therefore stores only 15 significant digits in a number, and changes digits after the fifteenth place to zeroes."

This is not the whole truth. In reality, at least with Office Open XML (*.xlsx), Excel stores values as full IEEE 754 doubles and not only 15 significant digits. For your example it stores <v>6.2284800183620495E+18</v>. But that is secondary: even if it stored 6.22848001836205E+018, that text would have to be converted back to floating point somewhere, and the result would be 6.2284800183620495E+18 again. Excel does the same when opening the workbook: it converts <v>6.2284800183620495E+18</v> to floating point and then displays only 15 significant digits.
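
The point that both spellings end up as the same floating-point value can be checked with a short sketch. The class and method names (SameDoubleDemo, sameDouble) are illustrative assumptions; only standard library calls are used.

```java
class SameDoubleDemo {

    // Hypothetical helper: true if both decimal strings parse to the
    // identical 64-bit double.
    static boolean sameDouble(String a, String b) {
        return Double.doubleToLongBits(Double.parseDouble(a))
                == Double.doubleToLongBits(Double.parseDouble(b));
    }

    public static void main(String[] args) {
        // The 15-digit form and the 17-digit form stored in the .xlsx file
        System.out.println(sameDouble("6.22848001836205E+018", "6.2284800183620495E+18")); // true

        // A change in the 15th significant digit does give a different double
        System.out.println(sameDouble("6.22848001836205E+018", "6.22848001836206E+018")); // false
    }
}
```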

So if you really need to store 6228480018362050000 as a number in Excel, the only way to get the same result as Excel is to do what Excel does: round to 15 significant digits. For this we can use BigDecimal and its round method, which accepts a MathContext with a given precision.

Example:

import org.apache.poi.ss.usermodel.*;

import java.io.*;

import java.math.BigDecimal;
import java.math.MathContext;

class ReadExcelBigNumbers {

 // Excel displays at most 15 significant digits
 private static final MathContext EXCEL_PRECISION = new MathContext(15);

 public static void main(String[] args) throws Exception {

  // Part 1: wanted value, value as parsed into a double, value rounded to 15 digits
  for (int i = 0; i < 10; i++) {
   String v = "6.2284800183620" + i + "E+018";
   double d = Double.parseDouble(v);
   System.out.print(v + "\t");
   System.out.print(d + "\t");
   BigDecimal bd = new BigDecimal(d); // exact value of the double
   v = bd.round(EXCEL_PRECISION).toPlainString();
   System.out.println(v);
  }

  // Part 2: the same rounding applied to numeric cells read from the workbook
  try (InputStream inp = new FileInputStream("test.xlsx");
       Workbook wb = WorkbookFactory.create(inp)) {
   for (int i = 1; i < 9; i++) {
    double d = wb.getSheetAt(0).getRow(i).getCell(1).getNumericCellValue();
    BigDecimal bd = new BigDecimal(d);
    String v = bd.round(EXCEL_PRECISION).toPlainString();
    System.out.println(v);
   }
  }
 }
}

The first part prints:

6.22848001836200E+018   6.2284800183620004E18   6228480018362000000
6.22848001836201E+018   6.2284800183620096E18   6228480018362010000
6.22848001836202E+018   6.2284800183620198E18   6228480018362020000
6.22848001836203E+018   6.2284800183620301E18   6228480018362030000
6.22848001836204E+018   6.2284800183620403E18   6228480018362040000
6.22848001836205E+018   6.2284800183620495E18   6228480018362050000
6.22848001836206E+018   6.2284800183620598E18   6228480018362060000
6.22848001836207E+018   6.22848001836207E18     6228480018362070000
6.22848001836208E+018   6.2284800183620803E18   6228480018362080000
6.22848001836209E+018   6.2284800183620905E18   6228480018362090000

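There you can see the difference between the wanted floating-point value, the double as Java prints it after parsing, and the BigDecimal rounded back to 15 significant digits. In this output, only 6.22848001836207E+018 prints back from the double with the same 15 digits. Strictly speaking, none of the ten values is exactly representable, because doubles of this magnitude are spaced 1024 apart.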

The second part does the same for the numeric values read from the second column of rows 2 through 9 of the first sheet in test.xlsx.

Another possible workaround is mentioned in the knowledge base article: "To work around this behavior, format the cell as text, then type the numbers. The cell can then display up to 1,024 characters." This works well if the numbers are not really numbers but identifiers, for example, or other strings in which the digits are only meant as characters. Calculations with such "text numbers" are of course not possible without converting them back to floating point, which brings the problem back.
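
As a rough illustration of the text workaround, the sketch below contrasts parsing the digits as text with sending them through a double. It assumes the cell content has already been obtained as a String (with POI, for a text cell, that would typically be Cell.getStringCellValue()); the literal value and the names TextNumberDemo, parseExact and viaDouble are made up for the example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class TextNumberDemo {

    // Hypothetical helper: digits kept as text survive unchanged.
    static BigInteger parseExact(String cellText) {
        return new BigInteger(cellText.trim());
    }

    // Hypothetical helper: the same digits after a round trip through a double,
    // which is what happens when the cell is numeric.
    static BigInteger viaDouble(String cellText) {
        double d = Double.parseDouble(cellText.trim());
        return new BigDecimal(d).toBigIntegerExact();
    }

    public static void main(String[] args) {
        String cellText = "6228480018362050001"; // assumed content of a text-formatted cell

        System.out.println(parseExact(cellText)); // 6228480018362050001
        System.out.println(viaDouble(cellText));  // 6228480018362049536
    }
}
```

If arithmetic on such values is needed, staying in BigInteger or BigDecimal on the Java side avoids the conversion back to floating point.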
