How to parse UTF-8 characters in Excel files using POI

Problem description

I have been using POI to parse XLS and XLSX files successfully. However, I am unable to correctly extract special characters, such as UTF-8 encoded characters like Chinese or Japanese, from an Excel spreadsheet. I have figured out how to extract data from a UTF-8 encoded csv or tab delimited file, but no luck with the Excel file. Can anyone help?

(Code snippet from the comments)

HSSFSheet sheet = workbook.getSheet(worksheet);
HSSFEvaluationWorkbook ewb = HSSFEvaluationWorkbook.create(workbook);
while (rowCtr <= lastRow && !rowBreakOut)
{
    Row row = sheet.getRow(rowCtr); // null when the row does not exist in the sheet
    for (int col = firstCell; row != null && col < lastCell && !breakOut; col++) {
      // With this policy, missing and blank cells are returned as null
      Cell cell = row.getCell(col, Row.RETURN_BLANK_AS_NULL);
      if (cell == null) continue;
      // ctype and sValue are assumed to be declared in code not shown here
      ctype = cell.getCellType();
      if (ctype == Cell.CELL_TYPE_STRING) {
         sValue = cell.getStringCellValue();
         log.warn("String value = " + sValue);
         String encoded = URLEncoder.encode(sValue, "UTF-8");
         log.warn("URL-encoded with UTF-8: " + encoded);
         ....
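
One way to tell whether the value returned by getStringCellValue() is already damaged, or only looks damaged in the log or console, is to print its Unicode code points, which are plain ASCII and survive any output encoding. The following is a minimal sketch and not part of the original post: the class CodePointDump and the helper toCodePoints are hypothetical names, and the string literal stands in for a cell value.

```java
import java.util.stream.Collectors;

public class CodePointDump {
    // Hypothetical helper: renders each Unicode code point as U+XXXX.
    // The result is pure ASCII, so it reads the same in any console or log.
    public static String toCodePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // Stand-in for cell.getStringCellValue(); the escapes keep this source ASCII-only.
        String sValue = "\u4E2D\u6587"; // two CJK characters
        System.out.println(toCodePoints(sValue)); // prints "U+4E2D U+6587"
    }
}
```

If the code points match what the spreadsheet contains, the string came out of POI intact and the garbling happens later, when the string is written to the console or a log file.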

Recommended answer

I had the same problem while extracting Persian text from an Excel file. I was using Eclipse, and simply going to Project -> Properties and changing the "text file encoding" to UTF-8 solved the problem.
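
A plausible reading of this answer is that the strings were intact in memory and were garbled only when written out with a non-UTF-8 default encoding. Java strings are Unicode internally, so the loss can only occur where they are converted to bytes. The sketch below uses only the standard library to illustrate that step; ExplicitCharsetOutput and writeAndReadBack are hypothetical names, and the literal stands in for a value read from a cell.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetOutput {
    // Hypothetical helper: writes text with the given charset, then decodes
    // the produced bytes as UTF-8, the way a UTF-8 console or viewer would.
    public static String writeAndReadBack(String text, Charset charset) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintWriter out = new PrintWriter(new OutputStreamWriter(buffer, charset));
        out.print(text);
        out.flush();
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sValue = "\u0633\u0644\u0627\u0645"; // four Persian letters, stand-in for a cell value

        // UTF-8 output round-trips unchanged.
        System.out.println(sValue.equals(writeAndReadBack(sValue, StandardCharsets.UTF_8)));
        // Latin-1 cannot represent these letters, so each one is replaced by '?'.
        System.out.println(writeAndReadBack(sValue, StandardCharsets.ISO_8859_1));

        // Writing to the real console with an explicit charset instead of the default:
        PrintWriter console =
                new PrintWriter(new OutputStreamWriter(System.out, StandardCharsets.UTF_8), true);
        console.println(sValue);
    }
}
```

Naming the charset at the point of output makes the program independent of the IDE or platform default, although the console itself still has to be set to display UTF-8.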
