Pandas: convert upper triangular dataframe by shifting rows to the left
Question
I have a dataframe that looks "upper-triangular":
31-May-11 30-Jun-11 31-Jul-11 31-Aug-11 30-Sep-11 31-Oct-11
OpenDate
2011-05-31 68.432797 81.696071 75.083249 66.659008 68.898034 72.622304
2011-06-30 NaN 1.711097 1.501082 1.625213 1.774645 1.661183
2011-07-31 NaN NaN 0.422364 0.263561 0.203572 0.234376
2011-08-31 NaN NaN NaN 1.077009 1.226946 1.520701
2011-09-30 NaN NaN NaN NaN 0.667091 0.495993
I would like to convert it by shifting the i-th row to the left by i-1 positions:
31-May-11 30-Jun-11 31-Jul-11 31-Aug-11 30-Sep-11 31-Oct-11
OpenDate
2011-05-31 68.432797 81.696071 75.083249 66.659008 68.898034 72.622304
2011-06-30 1.711097 1.501082 1.625213 1.774645 1.661183 NaN
2011-07-31 0.422364 0.263561 0.203572 0.234376 NaN NaN
2011-08-31 1.077009 1.226946 1.520701 NaN NaN NaN
2011-09-30 0.667091 0.495993 NaN NaN NaN NaN
I can't rule out NaNs in the upper part of the matrix, so we might see something like this:
31-May-11 30-Jun-11 31-Jul-11 31-Aug-11 30-Sep-11 31-Oct-11
OpenDate
2011-05-31 68.432797 81.696071 75.083249 66.659008 68.898034 72.622304
2011-06-30 NaN NaN 1.501082 1.625213 1.774645 1.661183
2011-07-31 NaN NaN 0.422364 0.263561 0.203572 0.234376
2011-08-31 NaN NaN NaN 1.077009 1.226946 1.520701
2011-09-30 NaN NaN NaN NaN 0.667091 0.495993
which should become:
31-May-11 30-Jun-11 31-Jul-11 31-Aug-11 30-Sep-11 31-Oct-11
OpenDate
2011-05-31 68.432797 81.696071 75.083249 66.659008 68.898034 72.622304
2011-06-30 NaN 1.501082 1.625213 1.774645 1.661183 NaN
2011-07-31 0.422364 0.263561 0.203572 0.234376 NaN NaN
2011-08-31 1.077009 1.226946 1.520701 NaN NaN NaN
2011-09-30 0.667091 0.495993 NaN NaN NaN NaN
Any ideas how to achieve this?
Thanks, Anne
Recommended answer
Here's a way to do this using numpy.
Input:
In [96]: df
Out[96]:
1 2 3 4 5 6
0
2011-05-31 68.433 81.696 75.083 66.659 68.898 72.622
2011-06-30 NaN 1.711 1.501 1.625 1.775 1.661
2011-07-31 NaN NaN 0.422 0.264 0.204 0.234
2011-08-31 NaN NaN NaN 1.077 1.227 1.521
2011-09-30 NaN NaN NaN NaN 0.667 0.496
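Code:
import numpy as np
from pandas import DataFrame

# Roll row i (0-based) left by i places; np.roll is circular, so the
# leading NaNs wrap around to the right-hand end of the row.
rolled = [np.roll(row, -i) for i, row in enumerate(df.values)]
result = DataFrame(np.vstack(rolled), index=df.index, columns=df.columns)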
Output:
1 2 3 4 5 6
0
2011-05-31 68.433 81.696 75.083 66.659 68.898 72.622
2011-06-30 1.711 1.501 1.625 1.775 1.661 NaN
2011-07-31 0.422 0.264 0.204 0.234 NaN NaN
2011-08-31 1.077 1.227 1.521 NaN NaN NaN
2011-09-30 0.667 0.496 NaN NaN NaN NaN
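As a self-contained illustration, here is a minimal sketch that applies the same row-rolling idea to the question's second example, where a NaN also sits inside the upper part. The helper name `shift_rows_left` is hypothetical (it is not part of pandas or numpy), and the frame is rebuilt by hand from the values shown in the question.

```python
import numpy as np
import pandas as pd

nan = np.nan
cols = ["31-May-11", "30-Jun-11", "31-Jul-11", "31-Aug-11", "30-Sep-11", "31-Oct-11"]
idx = pd.DatetimeIndex(
    ["2011-05-31", "2011-06-30", "2011-07-31", "2011-08-31", "2011-09-30"],
    name="OpenDate",
)
df = pd.DataFrame(
    [
        [68.432797, 81.696071, 75.083249, 66.659008, 68.898034, 72.622304],
        [nan, nan, 1.501082, 1.625213, 1.774645, 1.661183],
        [nan, nan, 0.422364, 0.263561, 0.203572, 0.234376],
        [nan, nan, nan, 1.077009, 1.226946, 1.520701],
        [nan, nan, nan, nan, 0.667091, 0.495993],
    ],
    index=idx,
    columns=cols,
)


def shift_rows_left(frame):
    """Hypothetical helper: roll row i (0-based) left by i positions."""
    rolled = [np.roll(row, -i) for i, row in enumerate(frame.to_numpy())]
    return pd.DataFrame(np.vstack(rolled), index=frame.index, columns=frame.columns)


result = shift_rows_left(df)
print(result)
```

Because the shift amount comes from the row position rather than from counting NaNs, the extra NaN in the second row stays in place as data and ends up in the first column, as the question requires.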
Let's time it (the session below uses Python 2 syntax for the lambda):
In [95]: %%timeit
....: roller = lambda (i, x): np.roll(x, -i)
....: row_terator = enumerate(df.values)
....: rolled = map(roller, row_terator)
....: result = DataFrame(np.vstack(rolled), index=df.index, columns=df.columns)
....:
10000 loops, best of 3: 101 us per loop
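For larger frames, the per-row Python loop can in principle be replaced by a single indexing operation. The sketch below is an alternative that is not part of the original answer: it computes, for every cell, which source column it should read from, and gathers all values at once with numpy integer-array indexing. The function name `shift_rows_left_vectorized` is hypothetical, and whether this is actually faster for a given frame would need to be measured.

```python
import numpy as np
import pandas as pd


def shift_rows_left_vectorized(frame):
    """Hypothetical helper: circularly shift row i left by i, without a Python loop."""
    values = frame.to_numpy()
    n_rows, n_cols = values.shape
    row_idx = np.arange(n_rows)[:, None]  # shape (n_rows, 1)
    # Column j of row i reads from column (j + i) mod n_cols, which is
    # the same circular shift that np.roll(row, -i) performs.
    src_cols = (np.arange(n_cols) + row_idx) % n_cols
    shifted = values[row_idx, src_cols]
    return pd.DataFrame(shifted, index=frame.index, columns=frame.columns)


demo = pd.DataFrame(
    [[1.0, 2.0, 3.0],
     [np.nan, 4.0, 5.0],
     [np.nan, np.nan, 6.0]]
)
shifted_demo = shift_rows_left_vectorized(demo)
print(shifted_demo)
```

On the small `demo` frame, each row's leading NaNs move to the right-hand end, matching what the loop over `np.roll` produces.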
Note that np.roll is the important thing here. It takes an array, an integer number of places to shift, and an axis argument, so you can shift an ndarray along any of its axes. The shift is circular: elements pushed off one end reappear at the other, which is why each row's leading NaNs end up on the right.
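To make the role of the axis argument concrete, here is a small sketch on a toy 2x3 array (the array and variable names are illustrative only). A negative shift moves elements toward lower indices, and omitting the axis rolls the flattened array before restoring its shape.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
# a is [[0, 1, 2],
#       [3, 4, 5]]

left = np.roll(a, -1, axis=1)  # each row shifted one place to the left
down = np.roll(a, 1, axis=0)   # rows shifted down by one
flat = np.roll(a, 1)           # no axis: roll the flattened array, then reshape back

print(left)
print(down)
print(flat)
```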