3 dimensional numpy array to multiindex pandas dataframe

Question

I have a 3 dimensional numpy array, (z, x, y). z is a time dimension and x and y are coordinates.

I want to convert this to a multiindexed pandas.DataFrame. I want the row index to be the z dimension and each column to have values from a unique x, y coordinate (and so, each column would be multi-indexed).

The simplest case (not multi-indexed):

>>> array.shape
(500L, 120L, 100L)

>>> df = pd.DataFrame(array[:,0,0])

>>> df.shape
(500, 1)

I've been trying to pass the whole array into a multiindex dataframe using pd.MultiIndex.from_arrays but I'm getting an error: NotImplementedError: > 1 ndim Categorical are not supported at this time

Looks like it should be fairly simple, but I can't figure it out.

Answer

I think you can use Panel and then call to_frame to get a MultiIndex DataFrame (note that pd.Panel was deprecated in pandas 0.20 and removed in later releases, so this needs an older pandas):

import numpy as np
import pandas as pd

np.random.seed(10)
arr = np.random.randint(10, size=(5,3,2))
print (arr)
[[[9 4]
  [0 1]
  [9 0]]

 [[1 8]
  [9 0]
  [8 6]]

 [[4 3]
  [0 4]
  [6 8]]

 [[1 8]
  [4 1]
  [3 6]]

 [[5 3]
  [9 6]
  [9 1]]]

df = pd.Panel(arr).to_frame()
print (df)
             0  1  2  3  4
major minor               
0     0      9  1  4  1  5
      1      4  8  3  8  3
1     0      0  9  0  4  9
      1      1  0  4  1  6
2     0      9  8  6  3  9
      1      0  6  8  6  1
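
Since pd.Panel is not available in current pandas, the same layout can be rebuilt with plain NumPy reshaping. The following is only a sketch, not part of the original answer: it assumes the Panel axes map to (items, major, minor) as in the output above, and the variable names `values` and `index` are chosen here for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(10)
arr = np.random.randint(10, size=(5, 3, 2))  # axes: (items, major, minor)

# Move the first axis to the end, then flatten (major, minor) into rows,
# so that row (i, j) and column k hold arr[k, i, j].
values = arr.transpose(1, 2, 0).reshape(-1, arr.shape[0])

# Row MultiIndex enumerating every (major, minor) pair in C order,
# which matches the row order produced by the reshape above.
index = pd.MultiIndex.from_product(
    [range(arr.shape[1]), range(arr.shape[2])],
    names=["major", "minor"],
)

df = pd.DataFrame(values, index=index)
print(df)
```

Each row of `df` is then the series `arr[:, i, j]` for one (major, minor) pair, which is the same arrangement as the to_frame output shown above.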

transpose can also be useful:

df = pd.Panel(arr).transpose(1,2,0).to_frame()
print (df)
             0  1  2
major minor         
0     0      9  0  9
      1      1  9  8
      2      4  0  6
      3      1  4  3
      4      5  9  9
1     0      4  1  0
      1      8  0  6
      2      3  4  8
      3      8  1  6
      4      3  6  1

Another possible solution with concat:

arr = arr.transpose(1,2,0)
# arr now has shape (3, 2, 5): one key per sub-frame, i.e. arr.shape[0] keys
df = pd.concat([pd.DataFrame(x) for x in arr], keys=np.arange(arr.shape[0]))
print (df)
     0  1  2  3  4
0 0  9  1  4  1  5
  1  4  8  3  8  3
1 0  0  9  0  4  9
  1  1  0  4  1  6
2 0  9  8  6  3  9
  1  0  6  8  6  1


The Panel approach applied to an array with the shape from the question:

np.random.seed(10)
arr = np.random.randint(10, size=(500,120,100))
df = pd.Panel(arr).transpose(2,0,1).to_frame()
print (df.shape)
(60000, 100)

print (df.index.max())
(499, 119)
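
The question asks for z as the row index and one column per (x, y) pair, which differs from the (60000, 100) frame above. One way to get that layout without Panel is to flatten the last two axes and label the columns with MultiIndex.from_product. This is a hedged sketch rather than the answerer's method; the level names "z", "x" and "y" are assumptions taken from the question's wording.

```python
import numpy as np
import pandas as pd

np.random.seed(10)
arr = np.random.randint(10, size=(500, 120, 100))  # axes: (z, x, y)
nz, nx, ny = arr.shape

# Column MultiIndex listing every (x, y) pair, x outermost, y innermost.
# This is the same order in which a C-order reshape flattens the last two axes.
columns = pd.MultiIndex.from_product([range(nx), range(ny)], names=["x", "y"])

# One row per z step, one column per (x, y) coordinate.
df = pd.DataFrame(
    arr.reshape(nz, nx * ny),
    index=pd.RangeIndex(nz, name="z"),
    columns=columns,
)
print(df.shape)  # (500, 12000)
```

With this layout, `df[(x, y)]` returns the time series `arr[:, x, y]` for a single coordinate, and `df[x]` selects all y columns for one x.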
