Choosing the right data structure to parse a file
Question
I have a CSV file with contents in the following format:
CSE110, Mon, 1:00 PM, Fri, 1:00 PM
CSE114, Mon, 8:00 AM, Wed, 8:00 AM, Fri, 8:00 AM
What's the best data structure to parse and store this data?
I tried using named tuples as follows:
CourseTimes = namedtuple('CourseTimes', 'course_name, day, start_time ')
But a single course can be scheduled on multiple days and times, as shown for CSE114 above, and the number of slots can only be determined at run time. How should I handle this?
Or else, can I make use of a dictionary or a list?
I am trying to solve a scheduling problem that assigns TAs to courses. I might have to compare times later on to check for collisions.
To complicate things further, the input file contains other data that I also need to parse. Basically, the format is as follows.
//Course times
CSE110, Mon, 1:00 PM, Fri, 1:00 PM
CSE114, Mon, 8:00 AM, Wed, 8:00 AM, Fri, 8:00 AM
....
//Course recitation times
CSE306, Mon, 2:30 PM
CSE307, Fri, 4:00 PM
...
//class strength
CSE101, 44, yes
CSE101, 115, yes
...
I suppose I need to store all of this in separate data structures. What would be the right regex patterns for each category?
Recommended answer
Start by noting a few things about your data:
- You have a set of unique strings (the courses)
- Each course is followed by a variable number of day and time values
With that, you have a series of unique keys that each have a number of values.
That sounds like a dictionary.
To get that data into a dictionary, start by reading the file. Next, you can either use regular expressions to select each [day], [hour]:[minutes] [AM/PM] section, or use plain old string.split() to break the line into sections at the commas. The course string is the key into the dictionary, with the rest of the line stored as a tuple or list of values. Then move on to the next line.
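As a rough illustration of the split-based approach, here is a minimal sketch. The function name parse_course_times and the choice to convert each time with datetime.strptime (so that times can be compared later) are assumptions made for this example, not part of the original answer. It also assumes every data line is a course name followed by day and time pairs.

```python
from collections import defaultdict
from datetime import datetime


def parse_course_times(lines):
    """Map each course name to a list of (day, start_time) tuples.

    Hypothetical helper: assumes each data line looks like
    'COURSE, Day, H:MM AM/PM, Day, H:MM AM/PM, ...'.
    """
    schedule = defaultdict(list)
    for line in lines:
        line = line.strip()
        # Skip blank lines, '//' section headers, and '...' filler lines.
        if not line or line.startswith("//") or line.startswith("."):
            continue
        fields = [field.strip() for field in line.split(",")]
        course, rest = fields[0], fields[1:]
        # The remaining fields come in (day, time) pairs.
        for day, time_text in zip(rest[0::2], rest[1::2]):
            # Convert '1:00 PM' into a datetime.time so times are comparable.
            start = datetime.strptime(time_text, "%I:%M %p").time()
            schedule[course].append((day, start))
    return dict(schedule)


sample = [
    "CSE110, Mon, 1:00 PM, Fri, 1:00 PM",
    "CSE114, Mon, 8:00 AM, Wed, 8:00 AM, Fri, 8:00 AM",
]
course_times = parse_course_times(sample)
```

Because each value is a list, a course can hold any number of slots without the count being known in advance. This sketch only covers lines made of day and time pairs; the class strength lines have a different shape and would need their own parsing.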