Loop through netCDF files and run calculations - Python or R

Question

This is my first time using netCDF and I'm trying to wrap my head around working with it.

I have multiple version 3 netCDF files (NOAA NARR air.2m daily averages, one file per year). Each file covers one year between 1979 and 2012. They are 349 x 277 grids with approximately 32 km resolution. Data was downloaded from here.

The dimension is time (hours since 1/1/1800) and my variable of interest is air. I need to calculate the number of accumulated (consecutive) days with a temperature < 0. For example:

    Day 1 = +4 degrees, accumulated days = 0
    Day 2 = -1 degrees, accumulated days = 1
    Day 3 = -2 degrees, accumulated days = 2
    Day 4 = -4 degrees, accumulated days = 3
    Day 5 = +2 degrees, accumulated days = 0
    Day 6 = -3 degrees, accumulated days = 1

I need to store this data in a new netCDF file. I am familiar with Python and somewhat with R. What is the best way to loop through each day, check the previous day's value, and based on that, output a value to a new netCDF file with the exact same dimensions and variable? Or should I perhaps just add another variable, holding the output I'm looking for, to the original netCDF file?

Is it best to leave all the files separate or combine them? I combined them with ncrcat and it worked fine, but the resulting file is 2.3 GB.

Thanks for the input.

My current progress in Python:

import numpy
import netCDF4
#Change my working DIR
f = netCDF4.Dataset('air7912.nc', 'r')
for a in f.variables:
  print(a)

#output =
     lat
     long
     x
     y
     Lambert_Conformal
     time
     time_bnds
     air

f.variables['air'][1, 1, 1]
#Output
     298.37473

To help me understand this better, what type of data structure am I working with? Is ['air'] the key in the above example, and are [1,1,1] also keys, used to get the value 298.37473? How can I then loop through [1,1,1]?

Recommended answer

You can use the very nice MFDataset feature in netCDF4 to treat a bunch of files as one aggregated file, without the need to use ncrcat. So your code would look like this:

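import numpy as np
import matplotlib.pyplot as plt
import netCDF4

# open all matching yearly files as one aggregated dataset
f = netCDF4.MFDataset('/usgs/data2/rsignell/models/ncep/narr/air.2m.19??.nc')
# print variable names
print(list(f.variables.keys()))

atemp = f.variables['air']
print(atemp)

ntimes, ny, nx = atemp.shape
cold_days = np.zeros((ny, nx), dtype=int)

# for each grid cell, count the days whose mean temperature (in K) is below 0 degC
for i in range(ntimes):
    # np.ma.getdata returns the plain array whether or not the slice is masked
    cold_days += np.ma.getdata(atemp[i, :, :]) - 273.15 < 0

plt.pcolormesh(cold_days)
plt.colorbar()
plt.show()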
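
Note that the loop above counts the total number of sub-zero days per grid cell. The example in the question instead resets the count whenever a day is not below zero. A minimal sketch of that running count follows, assuming the temperatures are in kelvin and have the shape (ntimes, ny, nx); the function name consecutive_cold_days and its threshold parameter are hypothetical, chosen only for illustration:

```python
import numpy as np

def consecutive_cold_days(temps_kelvin, threshold_kelvin=273.15):
    """Running count of consecutive days below the threshold, per grid cell.

    temps_kelvin: array of shape (ntimes, ny, nx) holding daily means in kelvin.
    Returns an int32 array of the same shape.
    """
    temps = np.asarray(temps_kelvin)
    streak = np.zeros(temps.shape[1:], dtype=np.int32)
    out = np.empty(temps.shape, dtype=np.int32)
    for i in range(temps.shape[0]):
        cold = temps[i] < threshold_kelvin
        # extend the streak where the day is cold, reset it to zero elsewhere
        streak = np.where(cold, streak + 1, 0)
        out[i] = streak
    return out

# The six-day example from the question (degrees C) on a 1 x 1 grid
example_c = np.array([4, -1, -2, -4, 2, -3], dtype=float).reshape(6, 1, 1)
result = consecutive_cold_days(example_c + 273.15)
print(result[:, 0, 0])  # [0 1 2 3 0 1]
```

For the full 1979 to 2012 record, holding the whole output array in memory may be too much. In that case it is probably better to write each day's streak to the output variable inside the loop rather than collecting everything in out. The sketch also ignores any missing-value mask on the input.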

And here's one way to write the file (there might be easier ways):

# create NetCDF file
nco = netCDF4.Dataset('/usgs/data2/notebook/cold_days.nc','w',clobber=True)
nco.createDimension('x',nx)
nco.createDimension('y',ny)

cold_days_v = nco.createVariable('cold_days', 'i4',  ( 'y', 'x'))
cold_days_v.units='days'
cold_days_v.long_name='total number of days below 0 degC'
cold_days_v.grid_mapping = 'Lambert_Conformal'

lono = nco.createVariable('lon','f4',('y','x'))
lato = nco.createVariable('lat','f4',('y','x'))
# dimensions are given as tuples; note the trailing comma for a single dimension
xo = nco.createVariable('x','f4',('x',))
yo = nco.createVariable('y','f4',('y',))
lco = nco.createVariable('Lambert_Conformal','i4')

# copy all the variable attributes from original file
for var in ['lon','lat','x','y','Lambert_Conformal']:
    for att in f.variables[var].ncattrs():
        setattr(nco.variables[var],att,getattr(f.variables[var],att))

# copy variable data for lon,lat,x and y
lono[:]=f.variables['lon'][:]
lato[:]=f.variables['lat'][:]
xo[:]=f.variables['x'][:]
yo[:]=f.variables['y'][:]

#  write the cold_days data
cold_days_v[:,:]=cold_days

# copy Global attributes from original file
for att in f.ncattrs():
    setattr(nco,att,getattr(f,att))

nco.Conventions='CF-1.6'
nco.close()

If I look at the resulting file in the Unidata NetCDF-Java Tools-UI GUI, it seems to be okay. Also note that here I just downloaded two of the datasets for testing, so I used

f = netCDF4.MFDataset('/usgs/data2/rsignell/models/ncep/narr/air.2m.19??.nc')

as an example. For all the data, you could use

f = netCDF4.MFDataset('/usgs/data2/rsignell/models/ncep/narr/air.2m.????.nc')

or

f = netCDF4.MFDataset('/usgs/data2/rsignell/models/ncep/narr/air.2m.*.nc')
