Remove a dimension from some variables in an xarray Dataset

Question

I have an xarray Dataset where some variables have more dimensions than necessary (e.g., a 3D dataset where the "latitude" and "longitude" variables also vary along time). How do I remove the extra dimensions?

For example, in the dataset below, 'bar' is a 2D variable along the x and y axes, with constant values along the x axis. How do I remove the x dimension from 'bar' but not from 'foo'?

>>> import numpy as np
>>> import xarray as xr
>>> ds = xr.Dataset({'foo': (('x', 'y'), np.random.randn(2, 3))},
                    {'x': [1, 2], 'y': [1, 2, 3],
                     'bar': (('x', 'y'), [[4, 5, 6], [4, 5, 6]])})
>>> ds
<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2 3
    bar      (x, y) int64 4 5 6 4 5 6
Data variables:
    foo      (x, y) float64 -0.9595 0.6704 -1.047 0.9948 0.8241 1.643

Recommended answer

The most direct way to remove the extra dimension (using indexing) results in a slightly confusing error message:

>>> ds['bar'] = ds['bar'].sel(x=1)
ValueError: dimension 'x' already exists as a scalar variable

The problem is that when you index in xarray, it keeps the indexed coordinates around as scalar coordinates:

>>> ds['bar'].sel(x=1)
<xarray.DataArray 'bar' (y: 3)>
array([4, 5, 6])
Coordinates:
    x        int64 1
  * y        (y) int64 1 2 3
    bar      (y) int64 4 5 6

This is often useful, but in this case the scalar coordinate 'x' on the indexed array conflicts with the non-scalar coordinate (and dimension) 'x' when you try to set it on the original dataset. As a result, xarray raises an error instead of overwriting the variable.
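To see the leftover scalar coordinate directly, the short sketch below rebuilds the example dataset and inspects the result of indexing. The variable name `indexed` is introduced here for illustration only, and the dataset is constructed with the `coords=` keyword rather than positionally.

```python
import numpy as np
import xarray as xr

# Rebuild the example dataset from the question.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    coords={
        "x": [1, 2],
        "y": [1, 2, 3],
        "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]]),
    },
)

# Select a single label along 'x' ("indexed" is an illustrative name).
indexed = ds["bar"].sel(x=1)

# 'x' is no longer a dimension of the indexed array...
print(indexed.dims)
# ...but it is still attached as a zero-dimensional (scalar) coordinate.
print("x" in indexed.coords)
print(indexed.coords["x"].ndim)
```

It is this zero-dimensional 'x' that collides with the one-dimensional 'x' of the dataset in the failing assignment above.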

To get around this, you need to drop the scalar 'x' after indexing. In the current version of xarray, you can do this with drop:

>>> ds['bar'] = ds['bar'].sel(x=1).drop('x')
>>> ds
<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2 3
    bar      (y) int64 4 5 6
Data variables:
    foo      (x, y) float64 -0.9595 0.6704 -1.047 0.9948 0.8241 1.643

In future versions of xarray (v0.9 and later), you will be able to drop coordinates when indexing by writing drop=True, e.g., ds['bar'].sel(x=1, drop=True).
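As a rough end-to-end sketch of the `drop=True` route, assuming an xarray release that supports it: the code below first checks that 'bar' really is constant along 'x' (a precaution added here, not part of the original answer), then collapses the dimension and writes the result back. The names `bar`, `reduced`, and `same` are illustrative. Recent xarray releases also provide `drop_vars`, which is used here in place of the older `drop` spelling.

```python
import numpy as np
import xarray as xr

# Rebuild the example dataset from the question.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    coords={
        "x": [1, 2],
        "y": [1, 2, 3],
        "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]]),
    },
)

bar = ds["bar"]

# Precaution (an addition for this sketch): only collapse 'x' if the
# values do not actually vary along it.
if not bool((bar.max("x") == bar.min("x")).all()):
    raise ValueError("'bar' varies along 'x'; refusing to drop the dimension")

# drop=True discards the scalar 'x' coordinate as part of indexing.
reduced = bar.sel(x=1, drop=True)

# Two-step alternative: index first, then drop the scalar coordinate.
same = bar.sel(x=1).drop_vars("x")

# Write the one-dimensional version back; 'foo' keeps both dimensions.
ds["bar"] = reduced
print(ds)
```

The constancy check matters because selecting `x=1` silently discards every other slice; if the values differed along 'x', that information would be lost without warning.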
