Parsing a CSV file and aggregating values in Python
Problem description
I'm looking to parse a csv file and aggregate 2 columns.
Data in csv file:
'IP Address', Severity
10.0.0.1, High
10.0.0.1, High
10.0.0.1, Low
10.0.0.1, Medium
10.0.0.2, Medium
10.0.0.2, High
10.0.0.2, Low
10.0.0.3, Medium
10.0.0.3, High
10.0.0.3, Medium
I'm looking to obtain output along the lines of:
'IP Address', Severity
10.0.0.1, High:2, Medium:1, Low:1
10.0.0.2, High:1, Medium:1, Low:1
10.0.0.3, High:1, Medium:2, Low:0
or (ideally):
'IP Address', High, Medium, Low
10.0.0.1, 2, 1, 1
10.0.0.2, 1, 1, 1
10.0.0.3, 1, 2, 0
The closest I have come is here: Parse CSV file and aggregate the values
I can't seem to aggregate on the string (Severity) variable.
How can I output this data?
Thanks for your help.
Recommended answer
Here is my solution, ag.py:
import collections
import csv
import sys

# Map each IP address to a Counter of its severity levels
output = collections.defaultdict(collections.Counter)

with open(sys.argv[1], newline='') as infile:
    reader = csv.reader(infile)
    next(reader)  # Skip header line
    for ip, level in reader:
        level = level.strip()  # Remove surrounding spaces
        output[ip][level] += 1

print("'IP Address',High,Medium,Low")
for ip, count in output.items():
    print('{0},{1[High]},{1[Medium]},{1[Low]}'.format(ip, count))
To run the solution, issue the following command:
python ag.py data.csv
Discussion
- output is a dictionary whose keys are the IP addresses and whose values are collections.Counter objects.
- Each Counter object counts 'High', 'Medium', and 'Low' for a particular IP. A level that never occurs for an IP is reported as 0, because a Counter returns 0 for a missing key.
- My solution prints to stdout; you can modify it to print to a file.
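The last point, writing to a file instead of stdout, could be sketched roughly as follows. This is only an illustration, not part of the original answer: the helper names aggregate and write_report are hypothetical, the header is written as plain IP Address without the quotes, and an in-memory io.StringIO buffer stands in for a real output file so the example is self-contained.

```python
import collections
import csv
import io


def aggregate(rows):
    """Count severities per IP; rows is an iterable of (ip, level) pairs."""
    output = collections.defaultdict(collections.Counter)
    for ip, level in rows:
        output[ip.strip()][level.strip()] += 1
    return output


def write_report(output, outfile, levels=("High", "Medium", "Low")):
    """Write one CSV row per IP; a level that never occurred is written as 0."""
    writer = csv.writer(outfile)
    writer.writerow(["IP Address", *levels])
    for ip in sorted(output):  # plain string sort, not numeric IP order
        writer.writerow([ip] + [output[ip][level] for level in levels])


# Sample rows taken from the question's data for 10.0.0.3
sample = [("10.0.0.3", " Medium"), ("10.0.0.3", " High"), ("10.0.0.3", " Medium")]

# Stand-in for a real file, e.g. open("report.csv", "w", newline="")
buffer = io.StringIO()
write_report(aggregate(sample), buffer)
report = buffer.getvalue()
print(report)
```

To write to an actual file, pass a file object opened with newline="" in place of the buffer; the csv module documentation recommends that setting so that line endings are not translated twice.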