Multiplying just one column from each of the 2 input DataFrames together
Question
I have two DataFrames with exactly the same dimensions, and I would like to multiply just one specific column from each of them together.
My first DataFrame is:
In [834]: patched_benchmark_df_sim
Out[834]:
build_number name cycles
0 390 adpcm 21598
1 390 aes 5441
2 390 blowfish NaN
3 390 dfadd 463
....
284 413 jpeg 766742
285 413 mips 4263
286 413 mpeg2 2021
287 413 sha 348417
[288 rows x 3 columns]
My second DataFrame is:
In [835]: patched_benchmark_df_syn
Out[835]:
build_number name fmax
0 390 adpcm 143.45
1 390 aes 309.60
2 390 blowfish NaN
3 390 dfadd 241.02
....
284 413 jpeg 197.75
285 413 mips 202.39
286 413 mpeg2 291.29
287 413 sha 243.19
[288 rows x 3 columns]
I would like to take each element of the cycles column of patched_benchmark_df_sim and multiply it by the corresponding element of the fmax column of patched_benchmark_df_syn, then store the result in a new DataFrame with exactly the same structure. It should contain the build_number and name columns, but the last column, which holds all the numerical data, will now be called latency and will be the product of fmax and cycles.
So the output DataFrame has to look something like this:
build_number name latency
0 390 adpcm ## each value here has to be product of cycles and fmax and they must correspond to one another ##
......
I tried doing a straightforward patched_benchmark_df_sim * patched_benchmark_df_syn, but that did not work because my DataFrames have a name column of string type. Is there no built-in pandas method that can do this for me? How could I proceed with the multiplication to get the result I need?
Thank you very much.
Recommended answer
The simplest thing to do is to add a new column to the df, then select the columns you want and, if you like, assign that selection to a new df. Here df stands for patched_benchmark_df_sim and df1 for patched_benchmark_df_syn:
In [356]:
df['latency'] = df['cycles'] * df1['fmax']
df
Out[356]:
build_number name cycles latency
0 390 adpcm 21598 3.098233e+06
1 390 aes 5441 1.684534e+06
2 390 blowfish NaN NaN
3 390 dfadd 463 1.115923e+05
284 413 jpeg 766742 1.516232e+08
285 413 mips 4263 8.627886e+05
286 413 mpeg2 2021 5.886971e+05
287 413 sha 348417 8.473153e+07
In [357]:
new_df = df[['build_number', 'name', 'latency']]
new_df
Out[357]:
build_number name latency
0 390 adpcm 3.098233e+06
1 390 aes 1.684534e+06
2 390 blowfish NaN
3 390 dfadd 1.115923e+05
284 413 jpeg 1.516232e+08
285 413 mips 8.627886e+05
286 413 mpeg2 5.886971e+05
287 413 sha 8.473153e+07
As you've found, you can't multiply DataFrames that contain non-numeric columns together the way you tried. The above assumes that the build_number and name columns are the same in both dfs and that the rows share the same index, because pandas aligns the two columns on their index labels when multiplying.
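If the rows of the two frames are not guaranteed to be in the same order, one possible alternative (not part of the original answer) is to merge on the key columns first and multiply afterwards. The sketch below assumes that each (build_number, name) pair occurs once per frame. The frames sim and syn and their values are made-up stand-ins for patched_benchmark_df_sim and patched_benchmark_df_syn.

```python
import numpy as np
import pandas as pd

# Small made-up frames with the same columns as in the question
# (assumption: each (build_number, name) pair is unique per frame).
sim = pd.DataFrame({
    "build_number": [390, 390, 390],
    "name": ["adpcm", "aes", "blowfish"],
    "cycles": [21598, 5441, np.nan],
})
# Same rows, deliberately listed in a different order.
syn = pd.DataFrame({
    "build_number": [390, 390, 390],
    "name": ["blowfish", "aes", "adpcm"],
    "fmax": [np.nan, 309.60, 143.45],
})

# Pair rows by (build_number, name) instead of by index position.
merged = sim.merge(syn, on=["build_number", "name"], how="inner")

# Element-wise product; a NaN in either input gives NaN in the result.
merged["latency"] = merged["cycles"] * merged["fmax"]

# Keep only the columns asked for in the question.
latency_df = merged[["build_number", "name", "latency"]]
print(latency_df)
```

Merging costs a little more than direct column multiplication, but it does not depend on the two frames sharing an index or a row order.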