How to perform a single operation on multiple columns of a DataFrame
Problem description
I have the following DataFrame:
df
>>> TSLA MSFT
2017-05-15 00:00:00+00:00 320 68
2017-05-16 00:00:00+00:00 319 69
2017-05-17 00:00:00+00:00 314 61
2017-05-18 00:00:00+00:00 313 66
2017-05-19 00:00:00+00:00 316 62
2017-05-22 00:00:00+00:00 314 65
2017-05-23 00:00:00+00:00 310 63
max_idx = df.idxmax() # returns index of max value
>>> TSLA 2017-05-15 00:00:00+00:00
>>> MSFT 2017-05-16 00:00:00+00:00
max_value = df.max() # returns max value
>>> TSLA = 320
>>> MSFT = 69
def pct_change(first, second): # pct chg formula
    return (second - first) / first * 100.00
I want to get the percent change between max_value and each consecutive value starting from max_idx (df.loc[max_idx:]) for both columns, just to ensure that the percent change is not below 5%.
Example (drop from each column's maximum, rounded):
for TSLA: 320 with 319 ≈ 0.3%    for MSFT: 69 with 61 ≈ 11.6%
          320 with 314 ≈ 1.9%              69 with 66 ≈ 4.3%
          320 with 313 ≈ 2.2%              69 with 62 ≈ 10.1%
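The arithmetic behind these comparisons can be checked with the question's own pct_change helper. This is only a quick sketch: the pairs dictionary is a hypothetical container introduced here for illustration, holding (column maximum, later value) pairs copied from the sample data. Because each later value is lower than the maximum, the formula yields negative numbers.

```python
def pct_change(first, second):
    """Percent change from first to second (the question's formula)."""
    return (second - first) / first * 100.0

# Hypothetical helper data: (column maximum, later value) pairs from the sample.
pairs = {
    "TSLA": [(320, 319), (320, 314), (320, 313)],
    "MSFT": [(69, 61), (69, 66), (69, 62)],
}

for name, values in pairs.items():
    for peak, later in values:
        # A negative result means the later value is below the maximum.
        print(f"{name}: {peak} -> {later}: {pct_change(peak, later):.2f}%")
```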
Edit: If you find it difficult to answer, I would be happy with just a pointer to the type of function or method I should use for such operations.
Note: I just want to ensure that the percent change isn't below 5%.
Recommended answer
I am not sure about your true/false conditions, but I think you need something like this (thanks to @JohnGalt):
df.apply(lambda x: ((1 - x/x.max()) > 0.05).all())
Or, using your logic:
df.apply(lambda x: ((x[x.idxmax()]-x)/x[x.idxmax()]*100>5).all())
Output:
TSLA False
MSFT False
dtype: bool
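To reproduce that output, the sample frame can be rebuilt by hand. The following is a minimal sketch, assuming the index is a UTC DatetimeIndex as the printed timestamps suggest; the names idx, by_ratio and by_pct are introduced here for illustration only.

```python
import pandas as pd

# Rebuild the sample data from the question (UTC timestamps as the index).
idx = pd.to_datetime(
    ["2017-05-15", "2017-05-16", "2017-05-17", "2017-05-18",
     "2017-05-19", "2017-05-22", "2017-05-23"],
    utc=True,
)
df = pd.DataFrame(
    {"TSLA": [320, 319, 314, 313, 316, 314, 310],
     "MSFT": [68, 69, 61, 66, 62, 65, 63]},
    index=idx,
)

# Fractional drop from each column's maximum, compared against 5%.
by_ratio = df.apply(lambda x: ((1 - x / x.max()) > 0.05).all())

# Same check written as a percentage relative to the value at idxmax.
by_pct = df.apply(lambda x: ((x[x.idxmax()] - x) / x[x.idxmax()] * 100 > 5).all())

print(by_ratio)
print(by_pct)
```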
Let's look at one column, using John's formula:
1 - df.TSLA/df.TSLA.max()
returns:
2017-05-15 00:00:00+00:00 0.000000
2017-05-16 00:00:00+00:00 0.003125
2017-05-17 00:00:00+00:00 0.018750
2017-05-18 00:00:00+00:00 0.021875
2017-05-19 00:00:00+00:00 0.012500
2017-05-22 00:00:00+00:00 0.018750
2017-05-23 00:00:00+00:00 0.031250
Name: TSLA, dtype: float64
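If all of those values are greater than 0.05 (that is, 5%), return True; otherwise return False. Note that the row holding the maximum always evaluates to 0, so when every row is included the check is always False.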
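The question asks about values starting from max_idx, while the formula above compares every row, including the maximum itself. One possible way to restrict the check to the rows after each column's maximum is sketched below. It is an assumption about what the asker intended, not part of the original answer: drop_after_max is a hypothetical helper, and ">=" is used because the requirement is phrased as "not below 5%".

```python
import pandas as pd

idx = pd.to_datetime(
    ["2017-05-15", "2017-05-16", "2017-05-17", "2017-05-18",
     "2017-05-19", "2017-05-22", "2017-05-23"],
    utc=True,
)
df = pd.DataFrame(
    {"TSLA": [320, 319, 314, 313, 316, 314, 310],
     "MSFT": [68, 69, 61, 66, 62, 65, 63]},
    index=idx,
)

def drop_after_max(col: pd.Series) -> pd.Series:
    """Fractional drop from the column's max, for rows strictly after the max."""
    # Slice from the max row onward, then skip the max row itself.
    after = col.loc[col.idxmax():].iloc[1:]
    return 1 - after / col.max()

# Smallest drop seen after the maximum, per column.
min_drop = df.apply(lambda c: drop_after_max(c).min())

# True only if every later value is at least 5% below the maximum.
holds = df.apply(lambda c: bool((drop_after_max(c) >= 0.05).all()))

print(min_drop)
print(holds)
```

If a column's maximum sits in the last row, the slice is empty, so `.min()` gives NaN and `.all()` gives True; that case may need separate handling.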
My original formula also works; it just does a bit more calculation to get the same result as John's formula. Lastly, a lambda function applies the formula to each column independently.
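Since the formula only involves column-wise arithmetic, the same per-column result can probably be obtained without apply, relying on pandas aligning the Series returned by df.max() with the frame's column labels. This is a sketch of an alternative, not the answer's own method; the names drop and result are introduced here.

```python
import pandas as pd

idx = pd.to_datetime(
    ["2017-05-15", "2017-05-16", "2017-05-17", "2017-05-18",
     "2017-05-19", "2017-05-22", "2017-05-23"],
    utc=True,
)
df = pd.DataFrame(
    {"TSLA": [320, 319, 314, 313, 316, 314, 310],
     "MSFT": [68, 69, 61, 66, 62, 65, 63]},
    index=idx,
)

# df.max() is a Series indexed by column name; dividing the frame by it
# aligns on columns, so each column is divided by its own maximum.
drop = 1 - df / df.max()

# Reduce down the rows: one boolean per column.
result = (drop > 0.05).all()

print(drop)
print(result)
```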