How do I iterate through a pandas group and check to see if a string is in each group?
Question
I have a pandas DataFrame with a bunch of records on certain dates. I need to group them by date and check whether each day's records also appear on the prior day. Specifically, I need to output which records were deleted.
Here is an example dataset:
Date Item
20160101 apple
20160101 pear
20160101 banana
20160102 apple
20160102 pear
20160102 beans
I need to figure out the differences that occur on each date. In this example, on 01/02/2016 the string 'beans' was added and 'banana' was removed from the group.
My code so far:
groups = frame['Item'].groupby(frame['Date'])
for date, item in groups:
    for i in item:
        if i not in item[:-1]:
            print date, item, 'Deleted'
This doesn't seem to work. I expect the output:
20160102 , banana, Deleted
Thanks for your help!
Recommended answer
diffs = frame.groupby(frame.columns.tolist()).size().unstack(fill_value=0).diff()
diffs
diffs.mask(diffs.eq(0)).stack().map({-1: 'deleted', 1: 'added'})
Date      Item
20160102  banana    deleted
          beans       added
dtype: object
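The one-liner above packs several steps together, so here is a minimal, self-contained sketch that spells them out. It assumes the sample data from the question and that each item appears at most once per date (with duplicates, the differences could be values other than -1 or 1 and would not be labeled). The variable names `counts` and `changes`, and the explicit `dropna()` call, are additions for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
frame = pd.DataFrame({
    "Date": [20160101, 20160101, 20160101, 20160102, 20160102, 20160102],
    "Item": ["apple", "pear", "banana", "apple", "pear", "beans"],
})

# Step 1: count rows per (Date, Item) pair, then pivot Item into columns.
# Pairs that never occur become 0, so each row describes one date.
counts = frame.groupby(["Date", "Item"]).size().unstack(fill_value=0)

# Step 2: subtract the previous date's row from each row.
# +1 means the item appeared, -1 means it disappeared, 0 means no change.
# The first date has no predecessor, so its row is all NaN.
diffs = counts.diff()

# Step 3: blank out the zeros, stack back into a (Date, Item) Series,
# drop any remaining NaN entries, and label the values that are left.
changes = (
    diffs.mask(diffs.eq(0))
    .stack()
    .dropna()
    .map({-1: "deleted", 1: "added"})
)

print(changes)
```

To list only the deletions the question asks for, one option is to filter the result with `changes[changes == "deleted"]`.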