Multiplying just one column from each of the 2 input DataFrames together

Question

I have two DataFrames with exactly the same dimensions, and I would like to multiply just one specific column from each of them together.

My first DataFrame is:

In [834]: patched_benchmark_df_sim
Out[834]: 
     build_number      name  cycles
0             390     adpcm   21598
1             390       aes    5441
2             390  blowfish     NaN
3             390     dfadd     463
....
284           413      jpeg  766742
285           413      mips    4263
286           413     mpeg2    2021
287           413       sha  348417

[288 rows x 3 columns]

My second DataFrame is:

In [835]: patched_benchmark_df_syn
Out[835]: 
     build_number      name    fmax
0             390     adpcm  143.45
1             390       aes  309.60
2             390  blowfish     NaN
3             390     dfadd  241.02
....
284           413      jpeg  197.75
285           413      mips  202.39
286           413     mpeg2  291.29
287           413       sha  243.19

[288 rows x 3 columns]

I would like to take each element of the cycles column of patched_benchmark_df_sim and multiply it by the corresponding element of the fmax column of patched_benchmark_df_syn, then store the result in a new DataFrame with the same structure. It should contain the build_number and name columns, but the last column, which holds the numerical data, should be called latency and contain the product of fmax and cycles.

So the output DataFrame has to look something like this:

    build_number      name    latency
0            390     adpcm    ## each value here has to be product of cycles and fmax and they must correspond to one another ##
......

I tried a straightforward patched_benchmark_df_sim * patched_benchmark_df_syn, but that did not work because both DataFrames have a name column of string type. Is there a builtin pandas method that can do this for me? How should I do the multiplication to get the result I need?

Thank you very much.

Recommended answer

The simplest thing to do is to add a new column to the df, then select the columns you want and, if you like, assign that selection to a new df. Here df stands for patched_benchmark_df_sim (the frame with cycles) and df1 for patched_benchmark_df_syn (the frame with fmax):

In [356]:

df['latency'] = df['cycles'] * df1['fmax']
df
Out[356]:
     build_number      name  cycles       latency
0             390     adpcm   21598  3.098233e+06
1             390       aes    5441  1.684534e+06
2             390  blowfish     NaN           NaN
3             390     dfadd     463  1.115923e+05
284           413      jpeg  766742  1.516232e+08
285           413      mips    4263  8.627886e+05
286           413     mpeg2    2021  5.886971e+05
287           413       sha  348417  8.473153e+07
In [357]:

new_df = df[['build_number', 'name', 'latency']]
new_df
Out[357]:
     build_number      name       latency
0             390     adpcm  3.098233e+06
1             390       aes  1.684534e+06
2             390  blowfish           NaN
3             390     dfadd  1.115923e+05
284           413      jpeg  1.516232e+08
285           413      mips  8.627886e+05
286           413     mpeg2  5.886971e+05
287           413       sha  8.473153e+07

As you've found, you can't multiply DataFrames that contain non-numeric columns together the way you tried. The above assumes that the build_number and name columns are the same in both dfs, row for row.
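If that row-for-row assumption is in doubt, one alternative is to join the two frames on build_number and name before multiplying. The following is only a sketch under stated assumptions: the two small frames are hypothetical stand-ins built from the first four rows shown in the question, and latency_df and latency_by_key are names made up for illustration. pandas aligns two Series on their index labels when multiplying them, so the first variant is only correct when both frames share the same index in the same row order.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the frames in the question (first four rows only).
patched_benchmark_df_sim = pd.DataFrame({
    "build_number": [390, 390, 390, 390],
    "name": ["adpcm", "aes", "blowfish", "dfadd"],
    "cycles": [21598, 5441, np.nan, 463],
})
patched_benchmark_df_syn = pd.DataFrame({
    "build_number": [390, 390, 390, 390],
    "name": ["adpcm", "aes", "blowfish", "dfadd"],
    "fmax": [143.45, 309.60, np.nan, 241.02],
})

# Variant 1: rows already line up by index, so multiply the two columns directly.
# Copy the key columns first so the original frame is left untouched.
latency_df = patched_benchmark_df_sim[["build_number", "name"]].copy()
latency_df["latency"] = (
    patched_benchmark_df_sim["cycles"] * patched_benchmark_df_syn["fmax"]
)

# Variant 2: do not rely on row order; match rows on the key columns instead.
merged = patched_benchmark_df_sim.merge(
    patched_benchmark_df_syn, on=["build_number", "name"], how="inner"
)
merged["latency"] = merged["cycles"] * merged["fmax"]
latency_by_key = merged[["build_number", "name", "latency"]]

print(latency_df)
print(latency_by_key)
```

With an inner join, a benchmark that appears in only one of the frames is dropped from the result. Rows where cycles or fmax is NaN still come out with a NaN latency, as in the answer's output.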
