Pandas - find rows with matching values in two columns and multiply value in another column
Question
Suppose we have the dataframe below:
import pandas as pd
data = pd.DataFrame({'id': ['1', '2', '3', '4', '5', '6', '7', '8'],
                     'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                     'C': ['10', '10', '10', '30', '50', '60', '50', '8'],
                     'D': ['9', '8', '7', '6', '5', '4', '3', '2']})
print(data)
A C D id
0 foo 10 9 1
1 bar 10 8 2
2 foo 10 7 3
3 bar 30 6 4
4 foo 50 5 5
5 bar 60 4 6
6 foo 50 3 7
7 foo 8 2 8
What I would like to do is find matching rows and then do a calculation on them:
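from itertools import combinations
# compare every pair of rows; D is stored as a string, so convert before multiplying
for (_, x), (_, y) in combinations(data.iterrows(), 2):
    if x.A == y.A and x.C == y.C:
        result = int(x.D) * int(y.D)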
and then generate a new dataframe with three columns: ['id'], ['A'] and ['result'].
A few rows of the expected result are:
id A result
0 1 foo 63
1 3 foo 63
2 5 foo 15
3 7 foo 15
I have tried, but my attempts either use the wrong logic or fail because of the code or data format. Can someone give me a hand please?
Recommended answer
One way is to group by A and C, take the product and count of D for each group, filter out the groups that contain only a single row, then inner merge the result back onto your original frame on A and C, e.g.:
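# df is the question's frame (called data there); D must be numeric for 'prod'
df = data.assign(D=data['D'].astype(int))
df.merge(
    df.groupby(['A', 'C']).D.agg(['prod', 'count'])
      [lambda r: r['count'] > 1],  # keep only groups with more than one row
    left_on=['A', 'C'],
    right_index=True
)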
Gives you:
A C D id prod count
0 foo 10 9 1 63 2
2 foo 10 7 3 63 2
4 foo 50 5 5 15 2
6 foo 50 3 7 15 2
Then drop/rename columns as appropriate.
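For reference, the whole approach including the final drop/rename step might look like the sketch below. It assumes the question's data, converts D to integers first (the question stores it as strings, which cannot be multiplied), and uses the names `stats` and `result`, which are illustrative and not from the original answer.

```python
import pandas as pd

# Sample data from the question (all values are strings there).
data = pd.DataFrame({
    'id': ['1', '2', '3', '4', '5', '6', '7', '8'],
    'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
    'C': ['10', '10', '10', '30', '50', '60', '50', '8'],
    'D': ['9', '8', '7', '6', '5', '4', '3', '2'],
})

# Assumption: D holds whole numbers, so it can be converted to int.
data['D'] = data['D'].astype(int)

# Product and row count of D for each (A, C) group; keep groups with 2+ rows.
stats = data.groupby(['A', 'C'])['D'].agg(['prod', 'count'])
stats = stats[stats['count'] > 1]

# Inner merge back onto the original rows, then keep and rename the wanted columns.
result = (
    data.merge(stats, left_on=['A', 'C'], right_index=True)
        .rename(columns={'prod': 'result'})
        [['id', 'A', 'result']]
        .reset_index(drop=True)
)
print(result)
```

Note that `prod` multiplies every D in a group, so for a group with three or more matching rows the result is the product of all of them rather than of each pair.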