Ignoring similar entries in a column of many rows
Question
Hi everyone;
I have a file with one column, the elements of which are separated by commas (,). The column has many rows, as shown below:
45,56,890
67,92,4502
76,367,89
67,92,4502
67,92,4502
92,14,05
02,56,125
25,02,61
02,56,125
.
.
.
I want to go through all the rows in the column, ignoring the duplicate rows, and put the result in an output file. For example, I want to count 67,92,4502 once and ignore the other identical ones (rows 4 and 5 are the same as row 2).
I hope I have explained the problem clearly.
Thanks in advance, Atrisa
Recommended answer
This can easily be accomplished with the in operator:
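A minimal sketch of an `in`-based approach follows. It is not the original poster's code; the function names and the file paths passed to `dedupe_file` are hypothetical. Each row is kept as a string, so values with leading zeros such as `05` are written back unchanged.

```python
def unique_rows(lines):
    """Return rows in first-seen order, skipping rows already collected."""
    seen = []
    for line in lines:
        row = line.strip()
        # The `in` operator does a linear search of the list, which is
        # fine for small files but slows down as the list grows.
        if row and row not in seen:
            seen.append(row)
    return seen


def dedupe_file(in_path, out_path):
    """Read rows from in_path and write each distinct row once to out_path."""
    with open(in_path) as src:
        rows = unique_rows(src)
    with open(out_path, "w") as dst:
        for row in rows:
            dst.write(row + "\n")


sample = [
    "45,56,890", "67,92,4502", "76,367,89", "67,92,4502", "67,92,4502",
    "92,14,05", "02,56,125", "25,02,61", "02,56,125",
]
result = unique_rows(sample)
```

With the sample rows from the question, `result` holds six rows, and `67,92,4502` and `02,56,125` each appear once.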
If you have a large amount of data, use sets, i.e., each group of three numbers would become a tuple added to a set. Sets don't store duplicates, but they don't preserve the original order either. If you want to retain the original file order, then use an OrderedDict for a large data set. If bvdet's solution runs in a reasonable amount of time then there is no reason to change. Post back if you want more info on either alternate solution.
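Both alternatives can be sketched roughly as below. This is an illustration under assumptions, not code from the thread: the helper names are hypothetical, and the fields are kept as strings rather than converted to integers so that entries such as `02` and `05` are not altered.

```python
from collections import OrderedDict


def parse_row(line):
    """Turn '67,92,4502' into the hashable tuple ('67', '92', '4502')."""
    return tuple(line.strip().split(","))


def unique_unordered(lines):
    """Collect distinct rows in a set; membership checks are fast,
    but the original file order is not preserved."""
    return {parse_row(line) for line in lines if line.strip()}


def unique_ordered(lines):
    """Collect distinct rows as OrderedDict keys, keeping first-seen order."""
    ordered = OrderedDict()
    for line in lines:
        if line.strip():
            ordered[parse_row(line)] = None  # the value is unused
    return list(ordered)


sample = [
    "45,56,890", "67,92,4502", "76,367,89", "67,92,4502", "67,92,4502",
    "92,14,05", "02,56,125", "25,02,61", "02,56,125",
]
as_set = unique_unordered(sample)
in_order = unique_ordered(sample)
```

To write the ordered result back out, each tuple can be rejoined with `",".join(row)`.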
Thanks a lot, both of you. I need to use bvdet's code like in the following: