Data structure for holding the content of a parsed CSV file
Question
I'm trying to figure out the best approach for parsing a CSV file in Java. Each line holds a variable amount of information. For example, the first line can have up to 5 comma-separated strings, while the next few lines might have 3 or 6, or any other number.
To be clear, my problem isn't reading the strings from the file. My question is: which data structure would be best for holding each line, and each word within that line?
At first I thought about using a 2D array, but the problem is that array sizes must be fixed (the size of the second dimension would be the number of words in a line, which can differ from line to line).
Here are the first few lines of the CSV file:
0,MONEY
1,SELLING
2,DESIGNING
3,MAKING
DIRECTOR,3DENT95VGY,EBAD,SAGHAR,MALE,05/31/2011,null,0,10000,07/24/2011
3KEET95TGY,05/31/2011,04/17/2012,120050
3LERT9RVGY,04/17/2012,03/05/2013,132500
3MEFT95VGY,03/05/2013,null,145205
DIRECTOR,XKQ84P6CDW,AGHA,ZAIN,FEMALE,06/06/2011,null,1,1000,01/25/2012
XK4P6CDW,06/06/2011,09/28/2012,105000
XKQ8P6CW,09/28/2012,null,130900
DIRECTOR,YGUSBQK377,AYOUB,GRAMPS,FEMALE,10/02/2001,12/17/2007,2,12000,01/15/2002
Recommended answer
You could use a Map<Integer, List<String>>, where the keys are the line numbers in the CSV file and each List holds the words on that line.
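A minimal sketch of that idea, assuming the lines have already been read into memory and that a plain split on commas is good enough (no quoted fields). The class name CsvRowsByLineNumber and the helper toMap are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CsvRowsByLineNumber {
    // Hypothetical helper: stores the fields of each line under its
    // zero-based line number. Assumes fields never contain quoted commas.
    static Map<Integer, List<String>> toMap(List<String> lines) {
        Map<Integer, List<String>> rows = new LinkedHashMap<>();
        for (int i = 0; i < lines.size(); i++) {
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            String[] fields = lines.get(i).split(",", -1);
            rows.put(i, new ArrayList<>(Arrays.asList(fields)));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "0,MONEY",
            "DIRECTOR,3DENT95VGY,EBAD,SAGHAR,MALE,05/31/2011,null,0,10000,07/24/2011",
            "3KEET95TGY,05/31/2011,04/17/2012,120050");
        Map<Integer, List<String>> rows = toMap(lines);
        // Rows may have different lengths; each List grows to fit its line.
        System.out.println(rows.get(1).get(2)); // prints "EBAD"
        System.out.println(rows.get(2).size()); // prints 4
    }
}
```

Because every row is its own List, lines with 2, 4, or 10 fields can sit side by side without any padding.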
An additional point: you will probably end up calling List#get(int) quite often. If so, do not use a linked list, because get(int) on a LinkedList is O(n). I think an ArrayList is your best option here.
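One way to see the difference in code, offered as a small illustration and not as part of the original answer: the JDK marks lists with fast indexed access using the java.util.RandomAccess marker interface, which ArrayList implements and LinkedList does not. The class and method names below are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class RandomAccessCheck {
    // Hypothetical helper: reports whether the list advertises fast
    // (generally constant-time) get(int) via the RandomAccess marker.
    static boolean supportsFastRandomAccess(List<?> list) {
        return list instanceof RandomAccess;
    }

    public static void main(String[] args) {
        System.out.println(supportsFastRandomAccess(new ArrayList<String>()));  // prints true
        System.out.println(supportsFastRandomAccess(new LinkedList<String>())); // prints false
    }
}
```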
Edit (based on AlexWien's observation):
In this particular case the keys are line numbers, which form a contiguous range of integers, so an even better data structure could be ArrayList<ArrayList<String>>. This should make looking up a line by its number faster, since it becomes a direct index access instead of a map lookup.
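The nested-list variant might look like the sketch below, under the same assumptions as before (lines already read, plain comma splitting with no quoted fields). CsvRowsAsNestedList and toRows are hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvRowsAsNestedList {
    // Hypothetical helper: the outer list index doubles as the zero-based
    // line number, so no separate key is needed.
    static ArrayList<ArrayList<String>> toRows(List<String> lines) {
        ArrayList<ArrayList<String>> rows = new ArrayList<>(lines.size());
        for (String line : lines) {
            // A limit of -1 keeps trailing empty fields instead of dropping them.
            rows.add(new ArrayList<>(Arrays.asList(line.split(",", -1))));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "0,MONEY",
            "1,SELLING",
            "XK4P6CDW,06/06/2011,09/28/2012,105000");
        ArrayList<ArrayList<String>> rows = toRows(lines);
        // rows.get(line).get(column) replaces array[line][column].
        System.out.println(rows.get(1).get(1)); // prints "SELLING"
        System.out.println(rows.get(2).size()); // prints 4
    }
}
```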