Pandas: convert upper triangular dataframe by shifting rows to the left

Question

I have a dataframe that looks "upper-triangular":

            31-May-11  30-Jun-11  31-Jul-11  31-Aug-11  30-Sep-11  31-Oct-11
OpenDate
2011-05-31  68.432797  81.696071  75.083249  66.659008  68.898034  72.622304
2011-06-30        NaN   1.711097   1.501082   1.625213   1.774645   1.661183
2011-07-31        NaN        NaN   0.422364   0.263561   0.203572   0.234376
2011-08-31        NaN        NaN        NaN   1.077009   1.226946   1.520701
2011-09-30        NaN        NaN        NaN        NaN   0.667091   0.495993

I would like to convert it by shifting the i-th row to the left by i-1 positions:

            31-May-11  30-Jun-11  31-Jul-11  31-Aug-11  30-Sep-11  31-Oct-11
OpenDate
2011-05-31  68.432797  81.696071  75.083249  66.659008  68.898034  72.622304
2011-06-30   1.711097   1.501082   1.625213   1.774645   1.661183        NaN
2011-07-31   0.422364   0.263561   0.203572   0.234376        NaN        NaN
2011-08-31   1.077009   1.226946   1.520701        NaN        NaN        NaN
2011-09-30   0.667091   0.495993        NaN        NaN        NaN        NaN

I can't rule out that NaNs may also be present in the upper part of the matrix, so we might see something like this:

            31-May-11  30-Jun-11  31-Jul-11  31-Aug-11  30-Sep-11  31-Oct-11
OpenDate
2011-05-31  68.432797  81.696071  75.083249  66.659008  68.898034  72.622304
2011-06-30        NaN        NaN   1.501082   1.625213   1.774645   1.661183
2011-07-31        NaN        NaN   0.422364   0.263561   0.203572   0.234376
2011-08-31        NaN        NaN        NaN   1.077009   1.226946   1.520701
2011-09-30        NaN        NaN        NaN        NaN   0.667091   0.495993

which should become:

            31-May-11  30-Jun-11  31-Jul-11  31-Aug-11  30-Sep-11  31-Oct-11
OpenDate
2011-05-31  68.432797  81.696071  75.083249  66.659008  68.898034  72.622304
2011-06-30        NaN   1.501082   1.625213   1.774645   1.661183        NaN
2011-07-31   0.422364   0.263561   0.203572   0.234376        NaN        NaN
2011-08-31   1.077009   1.226946   1.520701        NaN        NaN        NaN
2011-09-30   0.667091   0.495993        NaN        NaN        NaN        NaN

Any ideas how to achieve this?

Thanks, Anne

Recommended answer

Here's a way to do this using numpy.

Input:

In [96]: df
Out[96]:
                 1       2       3       4       5       6
0
2011-05-31  68.433  81.696  75.083  66.659  68.898  72.622
2011-06-30     NaN   1.711   1.501   1.625   1.775   1.661
2011-07-31     NaN     NaN   0.422   0.264   0.204   0.234
2011-08-31     NaN     NaN     NaN   1.077   1.227   1.521
2011-09-30     NaN     NaN     NaN     NaN   0.667   0.496

Code:

import numpy as np
from pandas import DataFrame

# Roll row i left by i positions; np.roll wraps the leading NaNs around to the end.
rolled = [np.roll(row, -i) for i, row in enumerate(df.values)]
result = DataFrame(np.vstack(rolled), index=df.index, columns=df.columns)

Output:

                 1       2       3       4       5       6
0
2011-05-31  68.433  81.696  75.083  66.659  68.898  72.622
2011-06-30   1.711   1.501   1.625   1.775   1.661     NaN
2011-07-31   0.422   0.264   0.204   0.234     NaN     NaN
2011-08-31   1.077   1.227   1.521     NaN     NaN     NaN
2011-09-30   0.667   0.496     NaN     NaN     NaN     NaN
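To check the case from the question where NaNs also appear in the upper part, here is a minimal self-contained sketch. The helper name `shift_rows_left` is made up for illustration, and the data combines the answer's rounded values with the NaN placement of the question's second example. Because each row is rolled by its position rather than by its NaN count, the extra NaN in the second row stays in the first column, as the question asks.

```python
import numpy as np
import pandas as pd


def shift_rows_left(df):
    """Roll row i left by i positions (hypothetical helper name)."""
    rolled = [np.roll(row, -i) for i, row in enumerate(df.to_numpy())]
    return pd.DataFrame(np.vstack(rolled), index=df.index, columns=df.columns)


nan = np.nan
df = pd.DataFrame(
    [
        [68.433, 81.696, 75.083, 66.659, 68.898, 72.622],
        [nan, nan, 1.501, 1.625, 1.775, 1.661],  # extra NaN in the upper part
        [nan, nan, 0.422, 0.264, 0.204, 0.234],
        [nan, nan, nan, 1.077, 1.227, 1.521],
        [nan, nan, nan, nan, 0.667, 0.496],
    ],
    index=pd.to_datetime(
        ["2011-05-31", "2011-06-30", "2011-07-31", "2011-08-31", "2011-09-30"]
    ),
    columns=[1, 2, 3, 4, 5, 6],
)

result = shift_rows_left(df)
print(result)
```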

Let's time it (the session below was recorded under Python 2, hence the tuple-unpacking lambda):

In [95]: %%timeit
   ....: roller = lambda (i, x): np.roll(x, -i)
   ....: row_terator = enumerate(df.values)
   ....: rolled = map(roller, row_terator)
   ....: result = DataFrame(np.vstack(rolled), index=df.index, columns=df.columns)
   ....:
10000 loops, best of 3: 101 us per loop
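For larger frames, the per-row Python loop can be replaced with index arithmetic. The following is a sketch of an alternative, not part of the original answer: it relies on the fact that rolling row i left by i means output column j reads from input column (j + i) mod n_cols. The function name `shift_rows_left_vectorized` is hypothetical, and no timing is claimed for it here.

```python
import numpy as np
import pandas as pd


def shift_rows_left_vectorized(df):
    """Roll row i left by i positions without a Python-level loop (hypothetical helper)."""
    values = df.to_numpy()
    n_rows, n_cols = values.shape
    # src_cols[i, j] is the input column that lands at output position (i, j).
    src_cols = (np.arange(n_cols) + np.arange(n_rows)[:, None]) % n_cols
    rolled = np.take_along_axis(values, src_cols, axis=1)
    return pd.DataFrame(rolled, index=df.index, columns=df.columns)


# Small upper-triangular frame for demonstration.
demo = pd.DataFrame(
    [[1.0, 2.0, 3.0], [np.nan, 4.0, 5.0], [np.nan, np.nan, 6.0]],
    columns=["a", "b", "c"],
)
shifted = shift_rows_left_vectorized(demo)
print(shifted)
```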

Note that np.roll is the important piece here. It takes an array, an integer number of places to shift, and an axis argument, so you can shift an ndarray along any of its axes. Elements pushed off one end wrap around to the other, which is why the leading NaNs of each row end up on the right.
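A short illustration of how np.roll behaves, using one row from the example plus a small made-up integer grid to show the axis argument:

```python
import numpy as np

# 1-D case: a negative shift moves elements left; the leading NaNs wrap to the end.
row = np.array([np.nan, np.nan, 0.422, 0.264, 0.204, 0.234])
shifted = np.roll(row, -2)
print(shifted)  # [0.422 0.264 0.204 0.234   nan   nan]

# 2-D case: the axis argument selects the direction of the shift.
grid = np.arange(6).reshape(2, 3)    # [[0 1 2], [3 4 5]]
by_cols = np.roll(grid, -1, axis=1)  # every row shifted left by one
by_rows = np.roll(grid, 1, axis=0)   # rows shifted down by one
print(by_cols)  # [[1 2 0], [4 5 3]]
print(by_rows)  # [[3 4 5], [0 1 2]]
```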
