Calculating difference between two rows in Python / Pandas
Problem description
In Python, how can I reference the previous row and calculate something against it? Specifically, I am working with dataframes in pandas. I have a data frame full of stock price information that looks like this:
           Date   Close  Adj Close
251  2011-01-03  147.48     143.25
250  2011-01-04  147.64     143.41
249  2011-01-05  147.05     142.83
248  2011-01-06  148.66     144.40
247  2011-01-07  147.93     143.69
Here is how I created this dataframe:
import pandas
url = 'http://ichart.finance.yahoo.com/table.csv?s=IBM&a=00&b=1&c=2011&d=11&e=31&f=2011&g=d&ignore=.csv'
data = pandas.read_csv(url)
# now sort the data frame ascending by date
data = data.sort_values(by='Date')
Starting with row number 2, or in this case, I guess it's 250 (PS - is that the index?), I want to calculate the difference between each row and the one before it (for example, between 2011-01-03 and 2011-01-04), for every entry in this dataframe. I believe the appropriate way is to write a function that takes the current row, figures out the previous row, and calculates the difference between them, and then use the pandas apply function to update the dataframe with the value.
Is that the right approach? If so, should I be using the index to determine the difference? (Note: I'm still in Python beginner mode, so "index" may not be the right term, nor even the correct way to implement this.)
Recommended answer
I think you want to do something like this:
In [26]: data
Out[26]:
           Date   Close  Adj Close
251  2011-01-03  147.48     143.25
250  2011-01-04  147.64     143.41
249  2011-01-05  147.05     142.83
248  2011-01-06  148.66     144.40
247  2011-01-07  147.93     143.69
In [27]: data.set_index('Date').diff()
Out[27]:
             Close  Adj Close
Date
2011-01-03     NaN        NaN
2011-01-04    0.16       0.16
2011-01-05   -0.59      -0.58
2011-01-06    1.61       1.57
2011-01-07   -0.73      -0.71
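On the side question: yes, the numbers 251, 250, ... on the left are the index labels. diff() works on row order, not on those labels, so no custom function or apply call is needed. As a minimal sketch, the snippet below rebuilds the five sample rows by hand (so it does not depend on the Yahoo URL), stores the day-over-day change in a new column, and shows the same calculation written out with shift(). The column name "Close Change" and the variable via_shift are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question, including the original index labels.
data = pd.DataFrame(
    {
        "Date": ["2011-01-03", "2011-01-04", "2011-01-05",
                 "2011-01-06", "2011-01-07"],
        "Close": [147.48, 147.64, 147.05, 148.66, 147.93],
        "Adj Close": [143.25, 143.41, 142.83, 144.40, 143.69],
    },
    index=[251, 250, 249, 248, 247],
)

# diff() subtracts the previous row from the current row, by position.
# The first row has no predecessor, so its result is NaN.
# "Close Change" is a hypothetical column name chosen for this example.
data["Close Change"] = data["Close"].diff()

# The same calculation spelled out: current value minus the value one row up.
via_shift = data["Close"] - data["Close"].shift(1)

print(data)
```

The shift() form is handy when the calculation is something other than a plain subtraction, for example a ratio between the current and previous row.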