Mapping occurrence count based on criterion from CSV in Python
Question
I have a CSV with numerous columns, but I'm only concerned with two of them: 'Text field (Environment/s Affected)' and 'Text field (Rating)'.
The environment column has entries like dev, test, prod. The rating column has entries like P1, P2, P3, P4, P5.
I need to count how many occurrences each of the environments has had. What would be the best way to do this in Python?
The end goal would be something like this:
P1/P2 in Test: 15
Total in Test: 30
P1/P2 in Staging: 24
Total in Staging: 30
P1/P2 would be an aggregate of those two ratings; Total would be an aggregate of the others, i.e. P3, P4, P5.
Recommended answer
You have tagged your question with pandas, so I assume your data is already in the form of a DataFrame. If so, the following command should do:
df.groupby(['env', (df['rating'].isin(['P1', 'P2']))]).size().rename(index={True: 'P1/P2', False: 'Total'}, level=1)
(This assumes that your DataFrame is named df and that your "Environment/s Affected" and "Rating" columns are named env and rating respectively.)
This groups the rows first by the unique values of the env column, and then by whether the value in the rating column is either 'P1' or 'P2', or not. It then counts the number of rows within each subgroup, and the rename step relabels the resulting True/False keys as 'P1/P2' and 'Total'. Note that 'Total' here counts the rows rated anything other than P1 or P2, matching the definition in the question.
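To make the grouping concrete, here is a minimal sketch on a small, made-up DataFrame. The sample rows are purely illustrative; the column names env and rating follow the assumption stated above, and the helper variable is_high is a hypothetical name introduced only for readability.

```python
import pandas as pd

# Hypothetical sample data; column names follow the answer's assumption.
df = pd.DataFrame({
    "env": ["test", "test", "test", "prod", "prod"],
    "rating": ["P1", "P3", "P2", "P4", "P1"],
})

# Boolean Series: True where the rating is P1 or P2.
is_high = df["rating"].isin(["P1", "P2"])

# Group by environment and by the boolean flag, count rows per subgroup,
# then relabel the True/False keys on the second index level.
counts = (
    df.groupby(["env", is_high])
      .size()
      .rename(index={True: "P1/P2", False: "Total"}, level=1)
)
print(counts)
```

The result is a Series with a two-level index (environment, label), so a single figure can be looked up with a tuple such as counts[("test", "P1/P2")].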
If your data is not yet in the form of a DataFrame, you will need to load it as one from the CSV, which can be done with the following command:
import pandas as pd
df = pd.read_csv(file_path)  # file_path: path to your CSV file
You may need to tweak the arguments a little, depending on the format of your file; see the pandas documentation for read_csv.
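As one possible way to tweak the arguments, the sketch below loads only the two columns of interest and renames them to the short names env and rating assumed earlier. The header names are the ones quoted in the question; the in-memory CSV text and its Id column are invented stand-ins so the example runs on its own.

```python
import io
import pandas as pd

# Stand-in for a real file; the Id column and the rows are hypothetical.
csv_text = """Id,Text field (Environment/s Affected),Text field (Rating)
1,test,P1
2,prod,P3
3,test,P4
"""

df = pd.read_csv(
    io.StringIO(csv_text),  # replace with the path to your CSV file
    usecols=["Text field (Environment/s Affected)", "Text field (Rating)"],
).rename(columns={
    "Text field (Environment/s Affected)": "env",
    "Text field (Rating)": "rating",
})
print(df)
```

Restricting the load with usecols keeps the DataFrame small when the CSV has many unrelated columns, and the rename lets the grouping command be used unchanged.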