Indexing xarray data with variable length DataArray

Views: 105 Published: 2020/7/28 5:16:59 Tags: numpy, python-xarray


Question

I am trying to extract data from an xarray dataset using DataArray indexing. My goal is to obtain the data along different line segments that overlap the array. To do this, I have obtained the indices of each line; they differ in size depending on the line's length.

For example, line 1 is x = [1,2,3], y = [7,8,9], line 2 is x = [1,4,5,6,8], y = [0,2,7,9,6], and so on; some of the lines are 100 x 2. This is what I tried:

df=xarray_dataset
indx=xr.DataArray([[1,2,3],[1,4,5,6,8],[2,3]])
indy=xr.DataArray([[7,9,8],[0,2,7,9,6],[4,5]])
dx_sel=df.isel(x=indx,y=indy)

However, as I understand it, every index array must have the same length. Is there a way to handle this? These indices are the x and y coordinates of different segments within the data, and I want the mean of each segment. With only a few segments I could loop over them, but I have hundreds, and looping over each segment is not computationally efficient.

The same issue arises with plain numpy arrays. Is there a way to pass NaN or something similar in the index, so that the index arrays can be padded to equal shape without any data being extracted for the padded entries?

Recommended answer

You can use the set_index -> unstack mechanism, which is based on pd.MultiIndex.

In [1]: import numpy as np
In [2]: import xarray as xr

In [4]: df = xr.DataArray(np.arange(110).reshape(10, 11),
   ...:                   dims=['x', 'y'])
In [5]: # Flatten all lines into one 1-D index; i = line number, j = position within the line
   ...: indx=xr.DataArray([1,2,3, 1,4,5,6,8, 2,3],
   ...:                   dims=['index'],
   ...:                   coords={'i': ('index', [0,0,0, 1,1,1,1,1, 2,2]),
   ...:                           'j': ('index', [0,1,2, 0,1,2,3,4, 0,1])})
   ...:
   ...: indy=xr.DataArray([7,9,8, 0,2,7,9,6, 4,5], dims=['index'],
   ...:                   coords={'i': ('index', [0,0,0, 1,1,1,1,1, 2,2]),
   ...:                           'j': ('index', [0,1,2, 0,1,2,3,4, 0,1])})

In [8]: df.isel(x=indx, y=indy).set_index(index=['i', 'j']).unstack('index')
Out[8]: 
<xarray.DataArray (i: 3, j: 5)>
array([[18., 31., 41., nan, nan],
       [11., 46., 62., 75., 94.],
       [26., 38., nan, nan, nan]])
Coordinates:
  * i        (i) int64 0 1 2
  * j        (j) int64 0 1 2 3 4
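
Since the question ultimately asks for the mean of each segment, one possible way to get it without unstacking is to group the pointwise selection by the segment label. The following is only a sketch under the same toy data as above; the names `seg`, `points`, and `segment_means` are illustrative and do not come from the original answer.

```python
import numpy as np
import xarray as xr

# Same toy data as in the answer: the value at (x, y) is 11 * x + y.
df = xr.DataArray(np.arange(110).reshape(10, 11), dims=["x", "y"])

# Flattened indices; `i` labels the segment each point belongs to.
seg = [0, 0, 0, 1, 1, 1, 1, 1, 2, 2]
indx = xr.DataArray([1, 2, 3, 1, 4, 5, 6, 8, 2, 3], dims=["index"],
                    coords={"i": ("index", seg)})
indy = xr.DataArray([7, 9, 8, 0, 2, 7, 9, 6, 4, 5], dims=["index"],
                    coords={"i": ("index", seg)})

# Pointwise (vectorized) selection: one value per (x, y) pair.
points = df.isel(x=indx, y=indy)

# Average the selected values within each segment label.
segment_means = points.groupby("i").mean()
print(segment_means.values)
```

Taking `.mean('j')` of the unstacked array should give the same numbers, because xarray's mean skips NaN by default for float data; the groupby variant simply avoids building the padded 2-D array first.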

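In the example above, indx and indy carry the non-dimension coordinates i and j, which record the original position of each index in the 2-dimensional (ragged) layout: i is the line number and j is the position within that line. Unstacking on these coordinates restores the 2-D shape, with NaN filling the positions that shorter lines lack.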
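
With hundreds of segments, writing the flattened values and the i and j coordinates by hand is impractical. A minimal sketch of how they might be generated from ragged Python lists follows; `ragged_to_indexer` is a hypothetical helper written for this illustration and is not part of xarray.

```python
import numpy as np
import xarray as xr


def ragged_to_indexer(segments, dim="index"):
    """Flatten unequal-length index lists into one 1-D DataArray.

    `i` records which segment each entry came from and `j` its position
    within that segment. Hypothetical helper, not an xarray function.
    """
    lengths = [len(s) for s in segments]
    flat = np.concatenate([np.asarray(s, dtype=int) for s in segments])
    i = np.repeat(np.arange(len(segments)), lengths)     # segment number
    j = np.concatenate([np.arange(n) for n in lengths])  # position in segment
    return xr.DataArray(flat, dims=[dim],
                        coords={"i": (dim, i), "j": (dim, j)})


# The ragged indices from the question.
xs = [[1, 2, 3], [1, 4, 5, 6, 8], [2, 3]]
ys = [[7, 9, 8], [0, 2, 7, 9, 6], [4, 5]]
indx = ragged_to_indexer(xs)
indy = ragged_to_indexer(ys)
```

The resulting indx and indy can then be passed to `df.isel(x=indx, y=indy)` exactly as in the answer.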
