Row-wise lookup from other data frame

Problem description

I have two data frames that I would like to combine based on certain conditions. This is the first data frame; each row represents one observation (so IDs occur multiple times):

df1

  ID  Count  Publication
0  A     10         1990
1  B     15         1990
2  A     17         1990
3  B     19         1991
4  A     13         1991

This is the second data frame. Here, each ID appears only once, with one column per year (here 1990 to 1993).

df2

  ID  1990  1991  1992  1993
0  A   1.1   1.2   1.3   1.4
1  B   2.3   2.4   2.4   2.6
2  C   3.4   3.5   3.6   3.7
3  D   4.5   4.6   4.7   4.8

My goal is to add a Results column to df1 in which I multiply the value in df1["Count"] by the corresponding value (ID-year pair) from df2. For example, in the first row, the value for ID A in 1990 is 1.1, which multiplied by a Count of 10 gives 11.

results

  ID  Count  Publication  Results
0  A     10         1990     11.0
1  B     15         1990     34.5
2  A     17         1990     18.7
3  B     19         1991     45.6
4  A     13         1991     15.6
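
For reproducibility, the two frames above can be built as in the sketch below. It assumes the year columns of df2 are labelled with strings ("1990", not 1990) and that Publication holds integers; the printed tables do not show the dtypes, so both are assumptions. The last lines work through the first-row example by hand.

```python
import pandas as pd

# df1: one row per observation, so IDs repeat.
df1 = pd.DataFrame({
    "ID": ["A", "B", "A", "B", "A"],
    "Count": [10, 15, 17, 19, 13],
    "Publication": [1990, 1990, 1990, 1991, 1991],  # assumed to be integers
})

# df2: one row per ID, one column per year (column labels assumed to be strings).
df2 = pd.DataFrame({
    "ID": ["A", "B", "C", "D"],
    "1990": [1.1, 2.3, 3.4, 4.5],
    "1991": [1.2, 2.4, 3.5, 4.6],
    "1992": [1.3, 2.4, 3.6, 4.7],
    "1993": [1.4, 2.6, 3.7, 4.8],
})

# Worked example for the first row: ID "A", year 1990.
factor = df2.set_index("ID").at["A", "1990"]
first_result = factor * df1.at[0, "Count"]
print(first_result)
```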

So far I have tried several options using the pandas .apply() function, but none of them worked. I have also tried to .merge() the columns from df2 into df1 based on ID, but I still could not do the calculation afterwards (I was hoping this would simplify the problem).

Question: Is there an easy and efficient way to go through df1 row by row and "pick" the corresponding values from df2 for the calculation?

Recommended answer

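Use lookup. Note that DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the code below requires an older pandas version: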

df2.set_index('ID').lookup(df1.ID,df1.Publication.astype(str))
Out[189]: array([1.1, 2.3, 1.1, 2.4, 1.2])

df1['Results']=df2.set_index('ID').lookup(df1.ID,df1.Publication.astype(str))*(df1.Count)
df1
Out[194]: 
  ID  Count  Publication  Results
0  A     10         1990     11.0
1  B     15         1990     34.5
2  A     17         1990     18.7
3  B     19         1991     45.6
4  A     13         1991     15.6
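
On pandas 2.0 and later, where DataFrame.lookup is no longer available, one possible alternative is to reshape df2 to long format and merge on the ID-year pair. The following is a minimal sketch and not part of the original answer. It assumes the year columns of df2 are strings and Publication is an integer column (as the astype(str) above suggests); the names long, merged, and Factor are introduced here for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": ["A", "B", "A", "B", "A"],
    "Count": [10, 15, 17, 19, 13],
    "Publication": [1990, 1990, 1990, 1991, 1991],
})
df2 = pd.DataFrame({
    "ID": ["A", "B", "C", "D"],
    "1990": [1.1, 2.3, 3.4, 4.5],
    "1991": [1.2, 2.4, 3.5, 4.6],
    "1992": [1.3, 2.4, 3.6, 4.7],
    "1993": [1.4, 2.6, 3.7, 4.8],
})

# Reshape df2 from wide to long: one row per (ID, year) pair.
long = df2.melt(id_vars="ID", var_name="Publication", value_name="Factor")

# Year labels come from column names (strings); match df1's integer years.
long["Publication"] = long["Publication"].astype(int)

# A left merge keeps every df1 row, in df1's order. Each (ID, year) pair is
# unique in `long`, so the row count does not change.
merged = df1.merge(long, on=["ID", "Publication"], how="left")

# Assign by position so the result does not depend on df1's index labels.
df1["Results"] = (merged["Count"] * merged["Factor"]).to_numpy()
print(df1)
```

Rows of df1 whose ID-year pair has no match in df2 would get NaN in Results, whereas lookup would have raised a KeyError for them.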
