3 dimensional numpy array to multiindex pandas dataframe
Problem description
I have a 3 dimensional numpy array, (z, x, y). z is a time dimension and x and y are coordinates.
I want to convert this to a multiindexed pandas.DataFrame. I want the row index to be the z dimension and each column to hold the values from a unique x, y coordinate (so the columns would be multi-indexed).
The simplest case (not multi-indexed):
>>> array.shape
(500L, 120L, 100L)
>>> df = pd.DataFrame(array[:,0,0])
>>> df.shape
(500, 1)
I've been trying to pass the whole array into a multiindex dataframe using pd.MultiIndex.from_arrays, but I'm getting an error: NotImplementedError: > 1 ndim Categorical are not supported at this time
It looks like it should be fairly simple, but I can't figure it out.
Recommended answer
I think you can use Panel, and then add to_frame to get a MultiIndex DataFrame:
import numpy as np
import pandas as pd

np.random.seed(10)
arr = np.random.randint(10, size=(5,3,2))
print (arr)
[[[9 4]
  [0 1]
  [9 0]]

 [[1 8]
  [9 0]
  [8 6]]

 [[4 3]
  [0 4]
  [6 8]]

 [[1 8]
  [4 1]
  [3 6]]

 [[5 3]
  [9 6]
  [9 1]]]
df = pd.Panel(arr).to_frame()
print (df)
             0  1  2  3  4
major minor
0     0      9  1  4  1  5
      1      4  8  3  8  3
1     0      0  9  0  4  9
      1      1  0  4  1  6
2     0      9  8  6  3  9
      1      0  6  8  6  1
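On pandas releases where Panel is no longer available (it was removed in 0.25), the same layout can probably be rebuilt with a transpose, a reshape, and pd.MultiIndex.from_product. The following is a minimal sketch of that substitute, not the method from the answer itself; the level names major and minor are chosen only to mirror the Panel output above.

```python
import numpy as np
import pandas as pd

np.random.seed(10)
arr = np.random.randint(10, size=(5, 3, 2))

# Panel(arr).to_frame() puts axis 0 in the columns and axes (1, 2) in the rows.
n_items, n_major, n_minor = arr.shape

# Move axis 0 to the end, then flatten (major, minor) into a single row axis.
values = arr.transpose(1, 2, 0).reshape(n_major * n_minor, n_items)

# Row labels: every (major, minor) pair, in the same C order as the reshape.
index = pd.MultiIndex.from_product(
    [range(n_major), range(n_minor)], names=["major", "minor"]
)
df = pd.DataFrame(values, index=index, columns=range(n_items))
print(df)
```

Changing the axis order passed to arr.transpose (and the matching sizes) gives the other layouts in the same way.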
transpose can also be useful:
df = pd.Panel(arr).transpose(1,2,0).to_frame()
print (df)
             0  1  2
major minor
0     0      9  0  9
      1      1  9  8
      2      4  0  6
      3      1  4  3
      4      5  9  9
1     0      4  1  0
      1      8  0  6
      2      3  4  8
      3      8  1  6
      4      3  6  1
Another possible solution with concat:
arr = arr.transpose(1,2,0)
# one key per sub-frame: after the transpose there are arr.shape[0] of them
df = pd.concat([pd.DataFrame(x) for x in arr], keys=np.arange(arr.shape[0]))
print (df)
     0  1  2  3  4
0 0  9  1  4  1  5
  1  4  8  3  8  3
1 0  0  9  0  4  9
  1  1  0  4  1  6
2 0  9  8  6  3  9
  1  0  6  8  6  1
np.random.seed(10)
arr = np.random.randint(10, size=(500,120,100))
df = pd.Panel(arr).transpose(2,0,1).to_frame()
print (df.shape)
(60000, 100)
print (df.index.max())
(499, 119)
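The frame above keeps x in the row index next to z. For the layout the question describes (one row per z, one column per (x, y) pair), a plain reshape may be enough. This is a minimal sketch, assuming the array is ordered (z, x, y); the level names x and y and the index name z are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

np.random.seed(10)
arr = np.random.randint(10, size=(500, 120, 100))  # assumed order: (z, x, y)
nz, nx, ny = arr.shape

# Column labels: every (x, y) pair, with x varying slowest so the labels
# line up with the C-order reshape below.
columns = pd.MultiIndex.from_product([range(nx), range(ny)], names=["x", "y"])

# Flatten the two coordinate axes into one column axis; rows stay indexed by z.
df = pd.DataFrame(arr.reshape(nz, nx * ny), columns=columns)
df.index.name = "z"

print(df.shape)  # (500, 12000)
```

With this layout, df[(3, 7)] selects the time series at x=3, y=7, and df[3] selects all y columns for x=3.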