Ignoring similar entries in a column of many rows


Problem description

Hi everyone;

I have a file with one column, the elements of which are separated by commas (,). The column has many rows, as shown below:

45,56,890
67,92,4502
76,367,89
67,92,4502
67,92,4502
92,14,05
02,56,125
25,02,61
02,56,125
.
.
.

I want to go through all the rows in the column, ignoring the similar rows, and put the result in an output file. For example, I want to count 67,92,4502 once and ignore the other similar ones (rows 4 and 5 are the same).

I hope I have explained the problem clearly.
Thanks in advance, Atrisa

Recommended answer

This can easily be accomplished with the in operator.
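bvdet's original code is not shown here, so the following is only a minimal sketch of what an in-operator approach might look like, not the code from the answer. The function names unique_rows and dedupe_file and both file paths are hypothetical. Rows are compared as plain strings, so 02,56,125 and 2,56,125 would count as different rows.

```python
def unique_rows(lines):
    """Return rows in first-seen order, skipping rows already kept."""
    seen = []  # rows kept so far, in file order
    for line in lines:
        row = line.strip()
        if not row:
            continue  # skip blank lines
        if row not in seen:  # the `in` operator does a linear search of the list
            seen.append(row)
    return seen


def dedupe_file(in_path, out_path):
    """Hypothetical helper: read in_path, write its unique rows to out_path."""
    with open(in_path) as src:
        rows = unique_rows(src)
    with open(out_path, "w") as dst:
        dst.write("\n".join(rows) + "\n")


sample = [
    "45,56,890", "67,92,4502", "76,367,89", "67,92,4502", "67,92,4502",
    "92,14,05", "02,56,125", "25,02,61", "02,56,125",
]
result = unique_rows(sample)
```

Because each membership test scans the whole list, the running time grows roughly with the square of the number of rows, which is fine for small files.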


If you have a large amount of data, use sets, i.e., each group of three numbers would become a tuple added to a set. Sets discard duplicates but do not preserve the original order. If you want to retain the original file order, then use an OrderedDict for a large data set. If bvdet's solution runs in a reasonable amount of time then there is no reason to change. Post back if you want more info on either alternate solution.
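The two alternatives mentioned above could be sketched roughly as follows. This is an illustration under assumptions rather than code from the thread: the function names are hypothetical, and the fields are kept as strings instead of being converted to numbers so that leading zeros such as 05 survive.

```python
from collections import OrderedDict


def unique_unordered(lines):
    """Collect each row as a tuple in a set; duplicates vanish, order is lost."""
    rows = set()
    for line in lines:
        line = line.strip()
        if line:
            rows.add(tuple(line.split(",")))
    return rows


def unique_ordered(lines):
    """Use OrderedDict keys to drop duplicates while keeping file order."""
    rows = OrderedDict()
    for line in lines:
        line = line.strip()
        if line:
            rows[tuple(line.split(","))] = None  # the value is unused
    return list(rows)  # keys, in first-seen order


sample = [
    "45,56,890", "67,92,4502", "76,367,89", "67,92,4502", "67,92,4502",
    "92,14,05", "02,56,125", "25,02,61", "02,56,125",
]
unordered = unique_unordered(sample)
ordered = unique_ordered(sample)
```

Both versions rely on hashing, so each lookup takes roughly constant time regardless of how many rows have been seen. On Python 3.7 and later a plain dict also keeps insertion order, so it would work in place of OrderedDict.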


Thanks a lot, both of you. I need to use bvdet's code like in the following:


