Data structure for holding the content of a parsed CSV file


Question

I'm trying to figure out the best approach to parsing a CSV file in Java. Each line holds a variable number of fields. For example, the first line might have up to 5 comma-separated strings, while the next few lines might have 3, 6, or any other number.

To be clear, my problem isn't reading the strings from the file. My question is: which data structure would be best for holding each line, and each word within that line?

At first I thought about using a 2D array, but the problem is that array sizes must be static: the second dimension would hold the words in each line, and that count can differ from line to line.

Here are the first few lines of the CSV file:

0,MONEY
1,SELLING
2,DESIGNING
3,MAKING
DIRECTOR,3DENT95VGY,EBAD,SAGHAR,MALE,05/31/2011,null,0,10000,07/24/2011
3KEET95TGY,05/31/2011,04/17/2012,120050
3LERT9RVGY,04/17/2012,03/05/2013,132500
3MEFT95VGY,03/05/2013,null,145205
DIRECTOR,XKQ84P6CDW,AGHA,ZAIN,FEMALE,06/06/2011,null,1,1000,01/25/2012
XK4P6CDW,06/06/2011,09/28/2012,105000
XKQ8P6CW,09/28/2012,null,130900
DIRECTOR,YGUSBQK377,AYOUB,GRAMPS,FEMALE,10/02/2001,12/17/2007,2,12000,01/15/2002

Recommended answer

You could use a Map<Integer, List<String>>. The keys are the line numbers in the CSV file, and each List holds the words on that line.
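As a rough illustration of that suggestion, here is a minimal sketch. The class name CsvByLineNumber and the helper toMap are hypothetical, line numbers are assumed to be zero-based, and the split on "," assumes no field contains a quoted comma.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CsvByLineNumber {
    // Hypothetical helper: maps each zero-based line number to that line's fields.
    static Map<Integer, List<String>> toMap(List<String> lines) {
        // LinkedHashMap keeps the lines in file order when iterating.
        Map<Integer, List<String>> rows = new LinkedHashMap<>();
        for (int i = 0; i < lines.size(); i++) {
            // Naive split: assumes no quoted fields containing commas.
            // The -1 limit keeps trailing empty fields.
            String[] fields = lines.get(i).split(",", -1);
            rows.put(i, new ArrayList<>(Arrays.asList(fields)));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "0,MONEY",
            "3KEET95TGY,05/31/2011,04/17/2012,120050");
        Map<Integer, List<String>> rows = toMap(lines);
        System.out.println(rows.get(0));        // prints [0, MONEY]
        System.out.println(rows.get(1).get(3)); // prints 120050
    }
}
```

Each line keeps its own field count, so rows of 2 and 4 fields sit side by side without padding.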

An additional point: you will probably end up calling List#get(int) quite often. If so, do not use a LinkedList, because its get(int) is O(n). An ArrayList, whose get(int) is O(1), is probably your best option here.

Edit (based on AlexWien's observation):

In this particular case, the keys are line numbers, which form a contiguous range of integers, so an even better data structure could be ArrayList<ArrayList<String>>. Looking up a line then becomes a direct index access, with no need to hash or box the key.
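The list-of-lists variant might look like the sketch below. The class name CsvAsListOfLists and the helper toRows are hypothetical, the variables are declared as List<List<String>> (the interface type) while the instances are ArrayLists, and the same naive comma split is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CsvAsListOfLists {
    // Hypothetical helper: the outer index is the zero-based line number,
    // the inner index is the field position within that line.
    static List<List<String>> toRows(List<String> lines) {
        List<List<String>> rows = new ArrayList<>(lines.size());
        for (String line : lines) {
            // Naive split: assumes no quoted fields containing commas.
            rows.add(new ArrayList<>(Arrays.asList(line.split(",", -1))));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> rows = toRows(Arrays.asList(
            "3,MAKING",
            "XKQ8P6CW,09/28/2012,null,130900"));
        System.out.println(rows.get(0).size());  // prints 2
        System.out.println(rows.get(1).get(0));  // prints XKQ8P6CW
    }
}
```

Here rows.get(i).get(j) reads field j of line i, and both calls are constant-time on ArrayList.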
