Calculating difference between two rows in Python / Pandas


Question

In Python, how can I reference the previous row and calculate something against it? Specifically, I am working with DataFrames in pandas. I have a DataFrame full of stock price information that looks like this:

           Date   Close  Adj Close
251  2011-01-03  147.48     143.25
250  2011-01-04  147.64     143.41
249  2011-01-05  147.05     142.83
248  2011-01-06  148.66     144.40
247  2011-01-07  147.93     143.69

Here is how I created this DataFrame:

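import pandas

# Note: this Yahoo Finance CSV endpoint is the one from the original question and may no longer respond.
url = 'http://ichart.finance.yahoo.com/table.csv?s=IBM&a=00&b=1&c=2011&d=11&e=31&f=2011&g=d&ignore=.csv'
data = pandas.read_csv(url)

# Sort the data frame ascending by date.
# DataFrame.sort(columns=...) was removed in later pandas releases; sort_values is the current equivalent.
data = data.sort_values(by='Date')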

Starting with row number 2, or in this case, I guess it's 250 (PS: is that the index?), I want to calculate the difference between each row and the one before it, for example between 2011-01-03 and 2011-01-04, for every entry in this DataFrame. I believe the appropriate way is to write a function that takes the current row, figures out the previous row, and calculates the difference between them, and then use the pandas apply function to update the DataFrame with the value.

Is that the right approach? If so, should I be using the index to determine the difference? (Note: I'm still in Python beginner mode, so "index" may not be the right term, nor even the correct way to implement this.)

Recommended answer

I think you want to do something like this:

In [26]: data
Out[26]: 
           Date   Close  Adj Close
251  2011-01-03  147.48     143.25
250  2011-01-04  147.64     143.41
249  2011-01-05  147.05     142.83
248  2011-01-06  148.66     144.40
247  2011-01-07  147.93     143.69

In [27]: data.set_index('Date').diff()
Out[27]: 
            Close  Adj Close
Date                        
2011-01-03    NaN        NaN
2011-01-04   0.16       0.16
2011-01-05  -0.59      -0.58
2011-01-06   1.61       1.57
2011-01-07  -0.73      -0.71
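
For readers who cannot reach the original data source, here is a minimal self-contained sketch of the same idea. It rebuilds the five sample rows by hand (the numbers are copied from the question; the column name "Close Change" and the variable `manual` are made up for illustration) and shows that `diff()` gives the same result as subtracting a `shift(1)` copy, so no row-by-row `apply` function should be needed for this case.

```python
import pandas as pd

# Sample rows copied from the question; the index labels 251..247 are kept as shown there.
data = pd.DataFrame(
    {
        "Date": ["2011-01-03", "2011-01-04", "2011-01-05", "2011-01-06", "2011-01-07"],
        "Close": [147.48, 147.64, 147.05, 148.66, 147.93],
        "Adj Close": [143.25, 143.41, 142.83, 144.40, 143.69],
    },
    index=[251, 250, 249, 248, 247],
)

# diff() subtracts the previous row by position, so the descending
# index labels do not matter as long as the rows are sorted by date.
# "Close Change" is a hypothetical column name chosen for this sketch.
data["Close Change"] = data["Close"].diff()

# The same calculation written out by hand: shift(1) moves every value
# down one row, so each row lines up with its predecessor.
manual = data["Close"] - data["Close"].shift(1)

print(data)
```

The first row has no predecessor, so its change is NaN. The numbers on the left (251, 250, ...) are indeed the index labels; neither `diff()` nor `shift()` uses them to decide which row is "previous", only the row order.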
