Apache POI not returning the proper value for large numbers coming from Excel


Problem description

I have an Excel file containing the value 6228480018362050000. The exported CSV looks like this:

Int,Bigint,String
1,6228480018362050000,Very big

When I run the following code:

InputStream inp = new FileInputStream("/.../test.xlsx");
DataFormatter df = new DataFormatter(true);
df.formatCellValue(WorkbookFactory.create(inp).getSheetAt(0).getRow(1).getCell(1));

I get 6228480018362049500, which is the wrong number because precision has been lost. Is there a way to get the actual value?

Recommended answer

If we put long numbers into Excel cells, those numbers are truncated to 15 significant digits. This is because Excel has no notion of big integers. It has only floating point to store numeric values, and for those it follows the IEEE 754 specification. Some numbers cannot be stored exactly as floating-point numbers under IEEE 754. In your example, 6228480018362050000, which is 6.22848001836205E+018, cannot be stored as such. According to the IEEE 754 specification it becomes 6.2284800183620495E+018, or 6228480018362049500.
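As a quick check of this claim, here is a minimal sketch in plain Java (no POI involved; the class and method names are made up for illustration). It prints the exact value held by the double nearest to 6228480018362050000, together with the gap between adjacent doubles at this magnitude.

```java
import java.math.BigDecimal;

class ExactDoubleValue {

    // Returns the exact decimal expansion of the binary value held by d.
    static String exactValue(double d) {
        return new BigDecimal(d).toPlainString();
    }

    public static void main(String[] args) {
        double d = Double.parseDouble("6228480018362050000");
        System.out.println(exactValue(d)); // 6228480018362049536
        System.out.println(Math.ulp(d));   // 1024.0
    }
}
```

Doubles of this size are 1024 apart, so the requested value falls between two representable ones and the nearest is used. The 17-digit form 6.2284800183620495E+18 is a decimal rendering of that nearest double.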

Microsoft's knowledge base mentions: "Excel follows the IEEE 754 specification on how to store and calculate floating-point numbers. Excel therefore stores only 15 significant digits in a number, and changes digits after the fifteenth place to zeroes."

This is not the whole truth. In reality, at least with Office Open XML (*.xlsx), Excel stores the values according to the IEEE 754 specification and not only 15 significant digits. For your example it stores <v>6.2284800183620495E+18</v>. But that is secondary: even if it stored 6.22848001836205E+018, that text would have to be converted back to floating point somewhere, and it would then be 6.2284800183620495E+18 again. Excel does the same when opening the workbook. It converts <v>6.2284800183620495E+18</v> to floating point and then displays only 15 significant digits.
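The point that both spellings end up as the same floating-point value can be checked with a small sketch in plain Java. The class and method names here are hypothetical, and the two strings are the ones quoted above.

```java
class SameDouble {

    // True when both decimal strings parse to the identical double bit pattern.
    static boolean sameDouble(String a, String b) {
        return Double.doubleToLongBits(Double.parseDouble(a))
                == Double.doubleToLongBits(Double.parseDouble(b));
    }

    public static void main(String[] args) {
        // 15-digit spelling versus the 17-digit spelling found in the xlsx file.
        System.out.println(sameDouble("6.22848001836205E+018", "6.2284800183620495E+18")); // true
    }
}
```

Whichever spelling is written to the file, any reader that parses it into a double gets the same bits, so the extra digits carry no additional information.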

So if you really need to store 6228480018362050000 as a number in Excel, the only way to get the same result that Excel shows is to do what Excel does. For this we can use BigDecimal and its round method, which accepts a MathContext with a given precision.

Example:

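import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

import java.io.FileInputStream;
import java.io.InputStream;
import java.math.BigDecimal;
import java.math.MathContext;

class ReadExcelBigNumbers {

 // Excel displays at most 15 significant digits.
 private static final MathContext EXCEL_PRECISION = new MathContext(15);

 public static void main(String[] args) throws Exception {

  // Part 1: print the wanted value, the parsed double, and the value rounded to 15 digits.
  for (int i = 0; i < 10; i++) {
   String v = "6.2284800183620" + i + "E+018";
   double d = Double.parseDouble(v);
   // new BigDecimal(double) holds the exact binary value; rounding restores the 15-digit form.
   String rounded = new BigDecimal(d).round(EXCEL_PRECISION).toPlainString();
   System.out.println(v + "\t" + d + "\t" + rounded);
  }

  // Part 2: apply the same rounding to the numeric cells in column B, rows 2 to 9.
  try (InputStream inp = new FileInputStream("test.xlsx");
       Workbook wb = WorkbookFactory.create(inp)) {
   for (int i = 1; i < 9; i++) {
    double d = wb.getSheetAt(0).getRow(i).getCell(1).getNumericCellValue();
    System.out.println(new BigDecimal(d).round(EXCEL_PRECISION).toPlainString());
   }
  }
 }
}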

The first part prints:

6.22848001836200E+018   6.2284800183620004E18   6228480018362000000
6.22848001836201E+018   6.2284800183620096E18   6228480018362010000
6.22848001836202E+018   6.2284800183620198E18   6228480018362020000
6.22848001836203E+018   6.2284800183620301E18   6228480018362030000
6.22848001836204E+018   6.2284800183620403E18   6228480018362040000
6.22848001836205E+018   6.2284800183620495E18   6228480018362050000
6.22848001836206E+018   6.2284800183620598E18   6228480018362060000
6.22848001836207E+018   6.22848001836207E18     6228480018362070000
6.22848001836208E+018   6.2284800183620803E18   6228480018362080000
6.22848001836209E+018   6.2284800183620905E18   6228480018362090000

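There you can see the difference between the wanted floating-point value, the floating-point value actually stored according to the IEEE 754 specification, and the reformatted BigDecimal. In this output only 6.22848001836207E+018 prints back unchanged. Strictly speaking none of the ten values is exactly representable as a double, since doubles of this magnitude are multiples of 1024, but rounding to 15 significant digits recovers the wanted value in every case. The middle column comes from Double.toString and may look different on other JDK versions, which can print a shorter representation of the same double.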

The second part does the same using values read from an Excel sheet (column B, rows 2 to 9 of the first sheet in test.xlsx).

Another possible workaround is mentioned in the knowledge base article: "To work around this behavior, format the cell as text, then type the numbers. The cell can then display up to 1,024 characters." This works well if the numbers are not really numbers but identifiers, for example, or other strings in which the digits are only meant as characters. Calculations with such "text numbers" are of course not possible without converting them back to floating point, which brings the problem back.
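If the digits are kept as text, they can still be handled exactly on the Java side. The following is a minimal sketch under the assumption that the string has already been read from a text-formatted cell (with POI this would typically be Cell.getStringCellValue()); the class and method names are hypothetical and POI itself is left out so that the snippet runs on its own.

```java
import java.math.BigInteger;

class TextNumbers {

    // Parses digits read from a text-formatted cell without going through double.
    static BigInteger parseExact(String cellText) {
        return new BigInteger(cellText.trim());
    }

    public static void main(String[] args) {
        // Assumption: this string came from a text cell, not a numeric one.
        String cellText = "6228480018362050000";
        BigInteger exact = parseExact(cellText);
        System.out.println(exact.add(BigInteger.ONE));            // 6228480018362050001
        // Going through double instead loses the low digits.
        System.out.println((long) Double.parseDouble(cellText));  // 6228480018362049536
    }
}
```

This only helps on the Java side. Any calculation done inside Excel still converts the text to floating point.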
