Read Excel columns as strings, even when they contain numeric values, without reformatting, using apache-poi

Question

I need help with an issue: I am trying to read Excel values exactly as they appear. Let me explain with an example.

In Excel, the first column of my rows contains the following (mixed with string values):

10,4
10,2
11.5 //some cells use . notation
someString

When I read the values with apache-poi, the output is as follows (it replaces "," with ".", which I do not want):
10.4
10.2
11.5
someString

It should give exactly the same output: if the cell uses ".", the output should be "10.4"; if it uses ",", it should be "10,4".

Even if I format the column's cells as text in Excel, I still get the same output. I don't want to use string replacement, because the input might be 10.4 with a real dot.

Here is the example code I am using:

    Sheet firstSheet = excel.getSheetAt(0);
    for (Row s : firstSheet) {
        Iterator<Cell> cellIterator = s.iterator();
        while (cellIterator.hasNext()) {
            Cell currentCell = cellIterator.next();
            if (currentCell.getCellType() == CellType.STRING) {
                System.out.print(currentCell.getStringCellValue() + "--");
            } else if (currentCell.getCellType() == CellType.NUMERIC) {
                System.out.print(currentCell.getNumericCellValue() + "--");
                String textValue = NumberToTextConverter.toText(currentCell.getNumericCellValue());
            }
        }
    }

Notes:
Apache-poi version: 4.0
Java: 8.0

Thanks in advance.

Recommended answer

If you need all cell values as strings, you should not get the cell values by CellType. Instead, use apache poi's DataFormatter. A DataFormatter can be given a Locale, and this Locale then determines which decimal separator and thousands separator are used when numeric values are shown.

Example:

import org.apache.poi.ss.usermodel.*;
import java.io.FileInputStream;
import java.util.Locale;

class GetDataFromExcelUsingDataFormatter {

 public static void main(String[] args) throws Exception {

  Workbook workbook = WorkbookFactory.create(new FileInputStream("ExcelExample.xlsx"));

  DataFormatter dataFormatter = new DataFormatter(Locale.GERMANY);
  FormulaEvaluator formulaEvaluator = workbook.getCreationHelper().createFormulaEvaluator();

  Sheet sheet = workbook.getSheetAt(0);

  for (Row row : sheet) {
   for (Cell cell : row) {
    String cellValue = dataFormatter.formatCellValue(cell, formulaEvaluator);
    System.out.println(cellValue);
   }
  }

  workbook.close();
 }
}

Because the German locale is set, this shows numeric values like 10.4 as 10,4.

An Excel file always stores numeric content in en_US locale form. So the content 10.4 is stored as 10.4 if it is numeric. It is the Excel application that then transforms the content into other locale forms, according to the locale of the operating system. That is why apache poi's DataFormatter also needs a locale setting.
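To illustrate the point about storage versus display, here is a minimal sketch that uses only the JDK's java.text.NumberFormat rather than apache poi. The class name LocaleDecimalDemo and the helper display are made up for this illustration; the sketch only shows how one stored value can be rendered with different decimal separators depending on the Locale, which is the same kind of decision DataFormatter makes.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class, not part of apache poi.
public class LocaleDecimalDemo {

    // Renders a stored numeric value using the decimal separator of the given locale.
    static String display(double storedValue, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);        // no thousands separators in this sketch
        format.setMaximumFractionDigits(10);  // keep the fractional digits of small decimals
        return format.format(storedValue);
    }

    public static void main(String[] args) {
        double stored = 10.4; // the value as the file stores it

        System.out.println(display(stored, Locale.US));      // prints 10.4
        System.out.println(display(stored, Locale.GERMANY)); // prints 10,4
    }
}
```

The stored number never changes; only the Locale passed to the formatter decides whether a dot or a comma appears.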

An Excel workbook has either the dot or the comma as its decimal separator. It cannot have both at the same time. So if, in the same Excel sheet, one cell contains 10.4 and another cell contains 10,4, then one of the two must be text; 10.4 and 10,4 cannot both be numeric in the same sheet. The DataFormatter hands over text content completely untouched. So if 10,4 is the numeric value, the DataFormatter formats it using the Locale. The 10.4 must then be text, and the DataFormatter hands it over untouched as 10.4.
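The text-versus-numeric distinction can be modelled with a small sketch. It does not use apache poi: a plain Object stands in for the cell content (a String for a text cell, a Double for a numeric cell), and the names CellTextSketch and asDisplayed are hypothetical. It assumes the question's column holds 10,4 and 10,2 as numbers and 11.5 and someString as text.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical model of the behaviour described above, not apache poi code.
public class CellTextSketch {

    // String content stands for a text cell, Double content for a numeric cell.
    static String asDisplayed(Object content, Locale locale) {
        if (content instanceof String) {
            return (String) content; // text is handed over untouched
        }
        if (content instanceof Double) {
            NumberFormat format = NumberFormat.getNumberInstance(locale);
            format.setGroupingUsed(false);
            format.setMaximumFractionDigits(10);
            return format.format(((Double) content).doubleValue()); // numbers follow the locale
        }
        throw new IllegalArgumentException("Unsupported content: " + content);
    }

    public static void main(String[] args) {
        // Numeric cells are stored with a dot; "11.5" is assumed to be a text cell.
        Object[] column = {10.4, 10.2, "11.5", "someString"};
        for (Object content : column) {
            System.out.println(asDisplayed(content, Locale.GERMANY));
        }
    }
}
```

Under these assumptions the sketch prints 10,4, 10,2, 11.5 and someString, which is the output the question asks for.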
