CharConversionException in parsing CSV file using Jackson's CSV data format module

Problem description

I am trying to parse a CSV file using Jackson's CSV data format module.

I tried the sample code given on the project homepage (https://github.com/FasterXML/jackson-dataformat-csv):

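import java.io.File;
import java.util.Arrays;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvParser;

CsvMapper mapper = new CsvMapper();
// "Array wrapping": expose the stream of CSV records as one array of rows
mapper.enable(CsvParser.Feature.WRAP_AS_ARRAY);
File csvFile = new File("input.csv");
MappingIterator<String[]> it = mapper.reader(String[].class).readValues(csvFile);
while (it.hasNext()) {
    String[] row = it.next();
    System.out.println(Arrays.toString(row)); // print column values, not the array reference
}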

This small piece of code gives me the following error:

Exception in thread "main" java.io.CharConversionException: Invalid UTF-8 start byte 0x92 (at char #269, byte #-1)
at com.fasterxml.jackson.dataformat.csv.impl.UTF8Reader.reportInvalidInitial(UTF8Reader.java:393)
at com.fasterxml.jackson.dataformat.csv.impl.UTF8Reader.read(UTF8Reader.java:245)
at com.fasterxml.jackson.dataformat.csv.impl.CsvReader.loadMore(CsvReader.java:438)
at com.fasterxml.jackson.dataformat.csv.impl.CsvReader.hasMoreInput(CsvReader.java:475)
at com.fasterxml.jackson.dataformat.csv.CsvParser._handleStartDoc(CsvParser.java:461)
at com.fasterxml.jackson.dataformat.csv.CsvParser.nextToken(CsvParser.java:414)
at com.fasterxml.jackson.databind.ObjectReader._bindAndReadValues(ObjectReader.java:1492)
at com.fasterxml.jackson.databind.ObjectReader.readValues(ObjectReader.java:1335)
at com.til.etwealth.etmoney.util.alok.main(alok.java:18)  

I am able to read the same file using openCSV. I searched the internet for this error but could not find anything useful. Can someone please tell me what I am missing?

Recommended answer

Most likely you are reading content that is not UTF-8 encoded but uses some other encoding, such as Latin-1 (ISO-8859-1). I think the error message you get is not very good, so maybe it could be improved to suggest the likely cause, as this is a relatively common problem.

To read non-Unicode encodings, you need to construct the Reader yourself, since it is not possible to reliably auto-detect the difference (although there may be Java libraries that use heuristics to try to determine the encoding automatically):

MappingIterator<String[]> it = mapper.reader(String[].class)
        .readValues(new InputStreamReader(new FileInputStream(csvFile), "ISO-8859-1"));
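
The answer mentions that heuristics can be used to guess the encoding. Here is a minimal sketch of one such heuristic, using only the JDK: decode the bytes strictly as UTF-8 and, if that fails, fall back to a legacy charset supplied by the caller. The class name `EncodingFallback` and the method `decode` are hypothetical and not part of Jackson, and the fallback charset is an assumption you have to make for your own data.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Jackson.
final class EncodingFallback {

    // Decodes the bytes as UTF-8 when they are valid UTF-8,
    // otherwise decodes them with the caller-supplied fallback charset.
    static String decode(byte[] bytes, Charset fallback) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return new String(bytes, fallback);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};                  // "caf\u00e9" in ISO-8859-1
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);    // the same text in UTF-8
        System.out.println(decode(latin1, StandardCharsets.ISO_8859_1));
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

The decoded text could then be wrapped in a `java.io.StringReader` and passed to `readValues` in place of the file. This sketch holds the whole input in memory, so it is only suitable for files of modest size, and a file in a legacy encoding can still happen to be valid UTF-8 and be misread.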

Alternatively, whatever produces the file could be configured to write it using UTF-8 encoding.
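
If you control the program that produces the file, one way to do this in Java is to name the charset explicitly when opening the writer rather than relying on the platform default. A minimal sketch using only the JDK; the class name `Utf8CsvWriter`, the temporary file and the row contents are made up for illustration:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper for illustration.
final class Utf8CsvWriter {

    // Writes the given lines to the target file, encoding the text as UTF-8.
    static void write(Path target, Iterable<String> lines) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("example", ".csv");
        // \u2019 is a right single quotation mark, a typical non-ASCII character in exported text
        write(target, Arrays.asList("id,comment", "1,it\u2019s fine"));
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```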

There are other possible causes (such as file truncation), but a character encoding mismatch is a common one. The main oddity here is actually the particular byte value, 0x92, which is not a printable character in (most?) ISO-8859-x encodings.
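
One possible explanation for that oddity, offered as an assumption rather than a diagnosis of the original file: in ISO-8859-1 the byte 0x92 is a C1 control character, but in windows-1252 it is the right single quotation mark (U+2019), which often appears in text exported from Windows applications. The sketch below decodes that single byte under both charsets. It assumes the JDK in use provides the `windows-1252` charset, which is not among the charsets every Java platform is required to support; the class name `Byte92Demo` is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration.
final class Byte92Demo {

    // Returns the Unicode code point that a single byte maps to in the given charset.
    static int codePointOf(byte b, Charset charset) {
        return new String(new byte[] {b}, charset).codePointAt(0);
    }

    public static void main(String[] args) {
        byte suspect = (byte) 0x92; // the byte reported in the exception message
        System.out.printf("ISO-8859-1:   U+%04X%n",
                codePointOf(suspect, StandardCharsets.ISO_8859_1));
        System.out.printf("windows-1252: U+%04X%n",
                codePointOf(suspect, Charset.forName("windows-1252")));
    }
}
```

This prints U+0092 for ISO-8859-1 and U+2019 for windows-1252. If the file turns out to come from a Windows tool, passing "windows-1252" instead of "ISO-8859-1" to the `InputStreamReader` may give better results.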
