pandas groupby and aggregate into new column

Problem description

I did some searching, but nothing yields the desired result, which is grouping the data by date and counting the frequency. I am able to do this with aggregate, but I'm not sure how to create a new column with the results.

Data in the file:

Domain  Dates
twitter.com 2016-08-08
google.com  2016-08-09
apple.com   2016-08-09
linkedin.com    2016-08-09
microsoft.com   2016-08-09
slack.com   2016-08-12
instagram.com   2016-08-12
ibm.com 2016-08-12

Code

import pandas as pd
import matplotlib.pyplot as plt
import datetime
import numpy as np

df = pd.read_csv('domains.tsv', sep='\t')
df = df.groupby([pd.to_datetime(df.Dates).dt.date]).agg({'Dates':'size'})
print(df)

This yields:

            Dates
Dates
2016-08-08      1
2016-08-09      4
2016-08-12      3

Ideally, I would like the count column to be named 'count', and then I will save the result as a new CSV.

Recommended answer

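import pandas as pd

# Read the tab-separated file with the Domain and Dates columns
df = pd.read_csv('domains.tsv', sep='\t')
# Count the non-null Domain values per date, then rename that column to 'count'
counter = df.groupby('Dates').count().rename(columns={'Domain': 'count'})
# Dates is the index here, so it is written as the first CSV column
counter.to_csv('count.csv')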

You will get count.csv in your current directory, containing the following result:

Dates,count
2016-08-08,1
2016-08-09,4
2016-08-12,3
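
As a possible alternative to the answer above, here is a minimal sketch that names the count column directly with size() and reset_index(name='count'), and also shows one way to attach the per-date count to the original rows as a new column with transform. The inline sample data (read through io.StringIO) stands in for domains.tsv and is an assumption made only so the example runs on its own; the variable names raw, counts and csv_text are illustrative.

```python
import io

import pandas as pd

# Inline copy of the sample data from the question (assumption: the real
# file domains.tsv is tab-separated with the same two columns).
raw = (
    "Domain\tDates\n"
    "twitter.com\t2016-08-08\n"
    "google.com\t2016-08-09\n"
    "apple.com\t2016-08-09\n"
    "linkedin.com\t2016-08-09\n"
    "microsoft.com\t2016-08-09\n"
    "slack.com\t2016-08-12\n"
    "instagram.com\t2016-08-12\n"
    "ibm.com\t2016-08-12\n"
)
df = pd.read_csv(io.StringIO(raw), sep="\t")

# size() counts the rows in each group; reset_index(name=...) turns the
# resulting Series into a DataFrame whose count column is called 'count'.
counts = df.groupby("Dates").size().reset_index(name="count")

# transform('size') broadcasts each group's size back onto the original
# rows, which adds the count as a new column without collapsing the frame.
df["count"] = df.groupby("Dates")["Domain"].transform("size")

# Dates is an ordinary column now, so the index is left out of the CSV text.
# Pass a path such as 'count.csv' to to_csv to write a file instead.
csv_text = counts.to_csv(index=False)
print(csv_text)
```

Note that size() counts every row in a group, while count() in the answer above counts only non-null values; for this data the two agree.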
