Sort a text file by the first column and count repeats in Python
Problem description
I have a text file that needs to be sorted by its first column, with all repeated lines merged and their count placed to the left of the data. The sorted and counted data should then be written to an already created CSV file.
Example text file:

    , 00.000.00.000, word, 00
    , 00.000.00.001, word, 00
    , 00.000.00.002, word, 00
    , 00.000.00.000, word, 00
    , 00.000.00.002, word, 00
    , 00.000.00.000, word, 00

Example result:

    , 3, 00.000.00.000, word, 00
    , 1, 00.000.00.001, word, 00
    , 2, 00.000.00.002, word, 00
My code:

    for ip in open("list.txt"):
        with open(ip.strip()+".txt", "a") as ip_file:
            for line in open("data.txt"):
                new_line = line.split(" ")
                if "blocked" in new_line:
                    if "src="+ip.strip() in new_line:
                        ip_file.write(", " + new_line[11])
                        ip_file.write(", " + new_line[12])
                        ip_file.write(", " + new_line[13])

    for ip_file in os.listdir(sub_dir):
        with open(os.path.join(sub_dir, ip_file), "a") as f:
            data = f.readlines()
            data.sort(key = lambda l: float(l.split()[0]), reverse = True)

Whenever I test the code, I get the error
TypeError: 'str' object is not callable
or something similar. I can't use .split(), .read(), .strip(), etc. without getting the error.

Question: How can I sort the files' contents and count repeating lines (without defining a function)?
I'm basically trying to:
sort -k1 | uniq -c | sed 's/^/,/' >> test.csv
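
A note on the errors mentioned above. The code as posted does not show where the TypeError comes from, so the following is only a sketch of two things that are known to fail in this kind of script. First, a file opened in append mode ("a") is write-only, so calling readlines() on it raises io.UnsupportedOperation. Second, "'str' object is not callable" is what Python reports when a name that is being called has been bound to a string, for example if a name such as open was reassigned earlier in the script (an assumption, since no such assignment appears in the question). The file name demo.txt and the variable shadowed are hypothetical.

```python
import io
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(demo_path, "w") as f:
        f.write(", b\n, a\n")

    # Append mode is write-only: reading from it raises io.UnsupportedOperation.
    append_mode_error = None
    try:
        with open(demo_path, "a") as f:
            f.readlines()
    except io.UnsupportedOperation as exc:
        append_mode_error = exc

    # Opening in read mode ("r") works, and the lines can then be sorted.
    with open(demo_path, "r") as f:
        sorted_lines = sorted(line.rstrip("\n") for line in f)

# Calling a name that is bound to a string raises the TypeError from the question.
# `shadowed` stands in for a rebound name such as `open = "list.txt"` (assumed cause).
shadowed = "list.txt"
shadow_error = None
try:
    shadowed("data.txt")
except TypeError as exc:
    shadow_error = exc

print(type(append_mode_error).__name__)
print(sorted_lines)
print(shadow_error)
```

If the real script fails on .split() or .strip(), it may be worth checking whether any of those names, or open and str, were assigned a string value somewhere before the failing line.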
Solution

    D = {}
    with open('data.txt') as src:
        for k in src:  # use a dictionary to count and filter duplicate lines
            k = k.rstrip()  # strip the newline so identical rows compare equal
            if k in D:
                D[k] += 1  # increase the count by one if already seen
            else:
                D[k] = 1  # initialize the key with one if seen for the first time

    with open('test.csv', 'a') as out:  # open the output once, in append mode
        for sk in sorted(D):  # sort keys
            # print a comma, followed by the number of occurrences plus the line
            print(', ', D[sk], sk, sep='', file=out)

    # Output
    , 3, 00.000.00.000, word, 00
    , 1, 00.000.00.001, word, 00
    , 2, 00.000.00.002, word, 00
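
As a possible variant of the dictionary approach, the counting step can also be done with collections.Counter from the standard library. The sketch below uses the sample rows from the question in place of data.txt and assumes every row starts with a comma, so the first data column is the field at index 1 after splitting on commas. The names sample_lines, rows, ordered and result are made up for illustration, and no function is defined apart from the sort key lambda.

```python
from collections import Counter

# Sample rows from the question, standing in for the lines of data.txt.
sample_lines = [
    ", 00.000.00.000, word, 00\n",
    ", 00.000.00.001, word, 00\n",
    ", 00.000.00.002, word, 00\n",
    ", 00.000.00.000, word, 00\n",
    ", 00.000.00.002, word, 00\n",
    ", 00.000.00.000, word, 00\n",
]

# Strip whitespace so identical rows compare equal, and skip blank lines.
rows = [line.strip() for line in sample_lines if line.strip()]

# Count how many times each distinct row occurs.
counts = Counter(rows)

# Sort the distinct rows by their first data column.
# Assumption: each row begins with a comma, so that column is at index 1.
ordered = sorted(counts, key=lambda row: row.split(",")[1].strip())

# Put the count to the left of the data, in the format shown in the question.
result = [f", {counts[row]}{row}" for row in ordered]

for out_row in result:
    print(out_row)
```

To work on the real files, sample_lines could be replaced by the open file object for data.txt, and each entry of result written to test.csv opened in append mode, with a newline added after each row.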