Good and effective CSV/TSV Reader for Java
Question
I am trying to read large CSV and TSV (tab-separated) files with about 1,000,000 rows or more. I tried to read a TSV file containing ~2,500,000 lines with opencsv, but it throws a java.lang.NullPointerException. It works with smaller TSV files of ~250,000 lines. So I was wondering whether there are any other libraries that support reading huge CSV and TSV files. Do you have any ideas?
For anyone interested in my code (I shortened it, so the try-catch is obviously invalid):
InputStreamReader in = null;
CSVReader reader = null;
try {
    in = this.replaceBackSlashes();
    reader = new CSVReader(in, this.seperator, '\"', this.offset);
    ret = reader.readAll();
} finally {
    try {
        reader.close();
    }
}
This is the method that constructs the InputStreamReader:
private InputStreamReader replaceBackSlashes() throws Exception {
    FileInputStream fis = null;
    Scanner in = null;
    try {
        fis = new FileInputStream(this.csvFile);
        in = new Scanner(fis, this.encoding);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (in.hasNext()) {
            String nextLine = in.nextLine().replace("\\", "/");
            // nextLine = nextLine.replaceAll(" ", "");
            nextLine = nextLine.replaceAll("'", "");
            out.write(nextLine.getBytes());
            out.write("\n".getBytes());
        }
        return new InputStreamReader(new ByteArrayInputStream(out.toByteArray()));
    } catch (Exception e) {
        in.close();
        fis.close();
        this.logger.error("Problem at replaceBackSlashes", e);
    }
    throw new Exception();
}
Recommended answer
I have not tried it, but I looked into Super CSV earlier.
http://sourceforge.net/projects/supercsv/
http://supercsv.sourceforge.net/
Check whether it works for your 2.5 million lines.
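Whichever library you end up with, it may also help to stream the file row by row instead of copying it into a ByteArrayOutputStream and then calling readAll(), since both keep the whole file in memory at once. Below is a minimal sketch of that idea using only the JDK. It is not the Super CSV or opencsv API. The class name TsvStreamReader and the method readRows are hypothetical names chosen for illustration, and the sketch assumes plain delimited lines with no quoted fields that contain the separator or embedded line breaks (a real CSV library is needed for those). It applies the same backslash and apostrophe replacements as the question's code, and try-with-resources closes the reader without the null check that a manual finally block would need.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
public class TsvStreamReader {

    /**
     * Reads a delimited file one line at a time and hands each row to rowHandler.
     * Only the current line is held in memory. Returns the number of rows read.
     * Assumption: fields are not quoted and never contain the separator.
     */
    public static long readRows(Path file, Charset encoding, char separator,
                                Consumer<String[]> rowHandler) throws IOException {
        // Quote the separator so characters such as '|' are not treated as regex syntax.
        Pattern splitter = Pattern.compile(Pattern.quote(String.valueOf(separator)));
        long count = 0;
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = Files.newBufferedReader(file, encoding)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Same clean-up as in the question: backslashes become slashes, apostrophes are dropped.
                String cleaned = line.replace("\\", "/").replace("'", "");
                // Limit -1 keeps trailing empty fields.
                rowHandler.accept(splitter.split(cleaned, -1));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Small temporary sample file so the sketch can be run as-is.
        Path sample = Files.createTempFile("sample", ".tsv");
        Files.write(sample, "id\tpath\n1\tC:\\data\\it's.txt\n".getBytes(StandardCharsets.UTF_8));
        long rows = readRows(sample, StandardCharsets.UTF_8, '\t',
                row -> System.out.println(String.join(" | ", row)));
        System.out.println(rows + " rows read");
        Files.delete(sample);
    }
}
```

The handler receives one row at a time, so a caller can write each row to a database or aggregate it immediately rather than collecting millions of rows in a list.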