I want to multiply two columns in a pandas DataFrame and add the result into a new column
Question
I'm trying to multiply two existing columns in a pandas DataFrame (orders_df), Prices (stock close price) and Amount (stock quantities), and add the result to a new column called 'Value'. For some reason, when I run this code, all the rows under the 'Value' column are positive numbers, while some of the rows should be negative. Under the Action column in the DataFrame there are seven rows with the 'Sell' string and seven with the 'Buy' string.
for i in orders_df.Action:
    if i == 'Sell':
        orders_df['Value'] = orders_df.Prices*orders_df.Amount
    elif i == 'Buy':
        orders_df['Value'] = -orders_df.Prices*orders_df.Amount
Please let me know what I'm doing wrong!
Recommended answer
If we're willing to sacrifice the succinctness of Hayden's solution, we could also do something like this:
In [22]: orders_df['C'] = orders_df.Action.apply(
             lambda x: (1 if x == 'Sell' else -1))
In [23]: orders_df # New column C represents the sign of the transaction
Out[23]:
   Prices  Amount  Action   C
0       3      57    Sell   1
1      89      42    Sell   1
2      45      70     Buy  -1
3       6      43    Sell   1
4      60      47    Sell   1
5      19      16     Buy  -1
6      56      89    Sell   1
7       3      28     Buy  -1
8      56      69    Sell   1
9      90      49     Buy  -1
Now we have eliminated the need for the if statement. Using Series.apply() on the Action column, we also do away with the for loop. As Hayden noted, vectorized operations are generally faster than looping in Python. Note that apply() with a lambda still calls the function once per row, so the truly vectorized step is the column multiplication that follows.
In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C
In [25]: orders_df # The resulting dataframe
Out[25]:
   Prices  Amount  Action   C  Value
0       3      57    Sell   1    171
1      89      42    Sell   1   3738
2      45      70     Buy  -1  -3150
3       6      43    Sell   1    258
4      60      47    Sell   1   2820
5      19      16     Buy  -1   -304
6      56      89    Sell   1   4984
7       3      28     Buy  -1    -84
8      56      69    Sell   1   3864
9      90      49     Buy  -1  -4410
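For anyone who wants to run the two steps outside an IPython session, here is a minimal, self-contained sketch. It rebuilds the ten sample rows shown above by hand and, as an assumption of this sketch rather than part of the session above, builds the sign column with Series.map and a dictionary instead of apply() with a lambda.

```python
import pandas as pd

# Sample data copied from the output shown above.
orders_df = pd.DataFrame({
    "Prices": [3, 89, 45, 6, 60, 19, 56, 3, 56, 90],
    "Amount": [57, 42, 70, 43, 47, 16, 89, 28, 69, 49],
    "Action": ["Sell", "Sell", "Buy", "Sell", "Sell",
               "Buy", "Sell", "Buy", "Sell", "Buy"],
})

# Step 1: sign of each transaction (+1 for Sell, -1 for Buy).
# Assumption: map() with a dict replaces apply() with a lambda.
# Any Action other than 'Sell' or 'Buy' would become NaN here.
orders_df["C"] = orders_df["Action"].map({"Sell": 1, "Buy": -1})

# Step 2: element-wise product of the three columns.
orders_df["Value"] = orders_df["Prices"] * orders_df["Amount"] * orders_df["C"]

print(orders_df)
```

One difference worth knowing: the lambda treats every non-'Sell' value as a buy, whereas the dictionary leaves unknown actions as NaN, which may make bad input easier to spot.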
This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.
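Rather than rely on that guess, the cost can be measured. The sketch below is only an illustration: the random test data, the function names two_step and one_step, and the use of numpy.where as the single-expression variant are all assumptions made here (numpy.where is not necessarily what Hayden's solution uses, since that solution is not reproduced in this post).

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: random prices, amounts, and actions.
rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame({
    "Prices": rng.integers(1, 100, size=n),
    "Amount": rng.integers(1, 100, size=n),
    "Action": rng.choice(["Sell", "Buy"], size=n),
})


def two_step(frame):
    # Sign column via apply() with a lambda, then multiply.
    sign = frame["Action"].apply(lambda x: 1 if x == "Sell" else -1)
    return frame["Prices"] * frame["Amount"] * sign


def one_step(frame):
    # Assumed alternative: compute the sign with numpy.where.
    sign = np.where(frame["Action"] == "Sell", 1, -1)
    return frame["Prices"] * frame["Amount"] * sign


# Both variants should produce the same values.
same = bool((two_step(df) == one_step(df)).all())
print("identical results:", same)

# Timings depend on the machine and library versions.
print("two_step:", timeit.timeit(lambda: two_step(df), number=20))
print("one_step:", timeit.timeit(lambda: one_step(df), number=20))
```

The absolute numbers will vary, so it is the ratio between the two timings on your own data that matters.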