Convert csv to netcdf
Problem description
I am trying to convert a .csv file to a netCDF4 via Python but I am having trouble figuring out how I can store information from a .csv table format into a netCDF. My main concern is how do we declare the variables from the columns into a workable netCDF4 format? Everything I have found is normally extracting information from a netCDF4 to a .csv or ASCII. I have provided the sample data, sample code, and my errors for declaring the appropriate arrays. Any help would be much appreciated.
The sample table is below:
Station Name  Country  Code   Lat    Lon     mn.yr   temp1  temp2  temp3  hpa
Somewhere     US       12340  35.52  23.358  1.19    -8.3   -13.1  -5     69.5
Somewhere     US       12340                 2.1971  -10.7  -13.9  -7.9   27.9
Somewhere     US       12340                 3.1971  -8.4   -13    -4.3   90.8
My sample code is:
#!/usr/bin/env python
import scipy
import numpy
import netCDF4
import csv
from numpy import arange, dtype
#Declare empty arrays
v1 = []
v2 = []
v3 = []
v4 = []
# Open csv file and declare variable for arrays for each heading
f = open('station_data.csv', 'r').readlines()
for line in f[1:]:
    fields = line.split(',')
    v1.append(fields[0]) #station
    v2.append(fields[1])#country
    v3.append(int(fields[2]))#code
    v4.append(float(fields[3]))#lat
    v5.append(float(fields[3]))#lon
    #more variables included but this is just an abridged list
print v1
print v2
print v3
print v4
#convert to netcdf4 framework that works as a netcdf
ncout = netCDF4.Dataset('station_data.nc','w')
# latitudes and longitudes. Include NaN for missing numbers
lats_out = -25.0 + 5.0*arange(v4,dtype='float32')
lons_out = -125.0 + 5.0*arange(v5,dtype='float32')
# output data.
press_out = 900. + arange(v4*v5,dtype='float32') # 1d array
press_out.shape = (v4,v5) # reshape to 2d array
temp_out = 9. + 0.25*arange(v4*v5,dtype='float32') # 1d array
temp_out.shape = (v4,v5) # reshape to 2d array
# create the lat and lon dimensions.
ncout.createDimension('latitude',v4)
ncout.createDimension('longitude',v5)
# Define the coordinate variables. They will hold the coordinate information
lats = ncout.createVariable('latitude',dtype('float32').char,('latitude',))
lons = ncout.createVariable('longitude',dtype('float32').char,('longitude',))
# Assign units attributes to coordinate var data. This attaches a text attribute to each of the coordinate variables, containing the units.
lats.units = 'degrees_north'
lons.units = 'degrees_east'
# write data to coordinate vars.
lats[:] = lats_out
lons[:] = lons_out
# create the pressure and temperature variables
press = ncout.createVariable('pressure',dtype('float32').char,('latitude','longitude'))
temp = ncout.createVariable('temperature',dtype('float32').char,'latitude','longitude'))
# set the units attribute.
press.units = 'hPa'
temp.units = 'celsius'
# write data to variables.
press[:] = press_out
temp[:] = temp_out
ncout.close()
f.close()
error:
Traceback (most recent call last):
  File "station_data.py", line 33, in <module>
    v4.append(float(fields[3]))#lat
ValueError: could not convert string to float:
Solution
If you look at your input file, there is no value in the Lat column of the second row. When you read the csv file, this value, i.e. fields[3], is stored as an empty string "". That is why you are getting a ValueError.
Instead of calling the built-in float function directly, you can define a new function that handles this error:
def str_to_float(str):
    try:
        number = float(str)
    except ValueError:
        number = 0.0
        # you can assign an appropriate value instead of 0.0 which suits your requirement
    return number
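The cause described above can be reproduced in isolation. The following is a minimal Python 3 sketch, not the asker's script: the `SAMPLE` text is an assumption that writes the question's table as comma-separated values with the Lat and Lon cells of rows 2 and 3 left blank, and `blank_cells` and `error_message` are illustrative names.

```python
import csv
import io

# Assumed CSV layout: the question's table, comma-separated,
# with blank Lat/Lon cells in data rows 2 and 3.
SAMPLE = """Station Name,Country,Code,Lat,Lon,mn.yr,temp1,temp2,temp3,hpa
Somewhere,US,12340,35.52,23.358,1.19,-8.3,-13.1,-5,69.5
Somewhere,US,12340,,,2.1971,-10.7,-13.9,-7.9,27.9
Somewhere,US,12340,,,3.1971,-8.4,-13,-4.3,90.8
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
header, data = rows[0], rows[1:]

# Collect (data row number, column name) for every blank cell.
blank_cells = [
    (row_number, header[col])
    for row_number, fields in enumerate(data, start=1)
    for col, value in enumerate(fields)
    if value.strip() == ""
]
print(blank_cells)

# Converting one of the blank cells reproduces the ValueError.
try:
    float(data[1][3])
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Any cell reported by this check will make a direct `float()` call fail in the same way, so it is worth running before deciding which fallback value to use.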
Now you can use str_to_float in place of the built-in float function like this:
v4.append(str_to_float(fields[3]))