Remove a dimension from some variables in an xarray Dataset
Question
I have an xarray Dataset where some variables have more dimensions than necessary (e.g., a 3D dataset where the "latitude" and "longitude" variables also vary along time). How do I remove the extra dimensions?
For example, in the dataset below, 'bar' is a 2D variable along the x and y axes, with constant values along the x axis. How do I remove the x dimension from 'bar' but not from 'foo'?
>>> import numpy as np
>>> import xarray as xr
>>> ds = xr.Dataset({'foo': (('x', 'y'), np.random.randn(2, 3))},
...                 {'x': [1, 2], 'y': [1, 2, 3],
...                  'bar': (('x', 'y'), [[4, 5, 6], [4, 5, 6]])})
>>> ds
<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2 3
    bar      (x, y) int64 4 5 6 4 5 6
Data variables:
    foo      (x, y) float64 -0.9595 0.6704 -1.047 0.9948 0.8241 1.643
Recommended answer
The most direct way to remove the extra dimension (using indexing) results in a slightly confusing error message:
>>> ds['bar'] = ds['bar'].sel(x=1)
ValueError: dimension 'x' already exists as a scalar variable
The problem is that when you index in xarray, it keeps the indexed coordinates around as scalar coordinates:
>>> ds['bar'].sel(x=1)
<xarray.DataArray 'bar' (y: 3)>
array([4, 5, 6])
Coordinates:
    x        int64 1
  * y        (y) int64 1 2 3
    bar      (y) int64 4 5 6
This is often useful, but in this case the scalar coordinate 'x' on the indexed array conflicts with the non-scalar coordinate (and dimension) 'x' when you try to set it on the original dataset. So xarray raises an error instead of overwriting the variable.
To get around this, you need to drop the scalar 'x' after indexing. In the current version of xarray, you can do this with drop:
>>> ds['bar'] = ds['bar'].sel(x=1).drop('x')
>>> ds
<xarray.Dataset>
Dimensions:  (x: 2, y: 3)
Coordinates:
  * x        (x) int64 1 2
  * y        (y) int64 1 2 3
    bar      (y) int64 4 5 6
Data variables:
    foo      (x, y) float64 -0.9595 0.6704 -1.047 0.9948 0.8241 1.643
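As a side note not taken from the original answer: in more recent xarray releases, drop is, as far as I know, discouraged in favor of drop_vars for removing coordinates and variables. The following is a minimal sketch of the same step written with drop_vars, assuming a version of xarray that provides it; the names indexed and reduced are only illustrative.

```python
import numpy as np
import xarray as xr

# Same example dataset as in the question.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    {"x": [1, 2], "y": [1, 2, 3],
     "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]])},
)

# Indexing removes the 'x' dimension but keeps 'x' as a scalar coordinate.
indexed = ds["bar"].sel(x=1)

# Assumption: drop_vars is available (newer xarray); it removes the
# leftover scalar coordinate, like drop('x') in the answer above.
reduced = indexed.drop_vars("x")
```

Here reduced has only the y dimension and no x coordinate, so it can be assigned back to ds['bar'] in the same way as the drop('x') result.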
In future versions of xarray (v0.9 and later), you will be able to drop coordinates when indexing by writing drop=True, e.g., ds['bar'].sel(x=1, drop=True).
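A minimal sketch of the drop=True variant, assuming xarray v0.9 or later, is shown here using the dataset from the question. The variable name reduced is illustrative, and the comments describe the behavior I would expect rather than output copied from the original answer.

```python
import numpy as np
import xarray as xr

# Same example dataset as in the question.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.random.randn(2, 3))},
    {"x": [1, 2], "y": [1, 2, 3],
     "bar": (("x", "y"), [[4, 5, 6], [4, 5, 6]])},
)

# drop=True discards the scalar 'x' coordinate during indexing,
# so no separate drop step is needed.
reduced = ds["bar"].sel(x=1, drop=True)

# Write the reduced variable back; 'foo' keeps both dimensions.
ds["bar"] = reduced
```

This does the indexing and the coordinate removal in one call, which avoids the conflict with the dataset's own 'x' dimension described above.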