How to read unstructured ASCII data in numpy?

Question

I need to read unstructured ASCII data into numpy arrays. As an example, a file could look like this:

August 2005  OMI/MLS Tropo O3 Column (Dobson Units) X 10
Longitudes:  288 bins centered on 179.375W to 179.375E (1.25 degree steps)
Latitudes:  120 bins centered on -59.5S to 59.5N (1.00 degree steps)
 328322313320278255239243234240225243250276274188185228257307324334334266313
 302258249235303178184163133153233228193216245221235281235224200210217230239
 191168179199198202222218245272269260258253218217210250231221213216240220230
 216279262220205244255248266272235220215247221247253256261267284338317329327
 275288270253286272233215227999999999999999999999999999999999999999999999239
 242999999999999999999999999999999999999999999999999999999999999999999999999
 999999999999999999999999999636424663381417483472317200246338302140324258325
 317230243347274290259261255330322375318317342306373366375352345250278335368
 375393564999999999999999999999999999438341418448231272245265308299313365342
 345314325296273307328359375259284351376369317330358317321366329340334339373
 407376272226292357341348382369355358374361347367379368403379381398398391323
 347378367379364312306309280258236214206  lat =  -59.5
 316310310293280262250206199190174179239247204207187196190270280309302278294
 308261231270273168191184156219199179222218215193232261268223237236261272214
 236220178158177207189221200198234246226229180204217215226241245235215222225
 209205234227275264281264261234208289284263250249258265225251284273276301269
 239243250255236228260229255329236284274231245262999999999999999999999999999
 999999999999999459638999999999999999999999999999999999999999999999999999999
 366999999999999999999582427465386389430321336350319362400413409449373362351
 271274248359373294236244235229267275324307397319313380360399346279304265237
 247239249134219323393348313334215295273333329373309298298304349363369356338
 319343300279282287322317319324342311372379331353318288305319319373341352331
 354353342325316319356388409388344360388383361374397365361341379362384403407
 350343334328301279293228252243246231241  lat =  -58.5

After skipping the first three lines, each following 12 lines contain one row of the total 2D array. It's concatenated ints of three digits each.

Is there a way of doing this (somewhat) nicely in numpy? loadtxt needs a delimiter keyword, but I don't have a delimiter here, so I'm lost.

Of course it's possible to do this all by hand, i.e. manually reading the file, counting lines, splitting strings, and converting them individually. But that's quite cumbersome. So I'm looking for something more elegant.

Edit: the lat = XXXXX can be ignored. I can easily reconstruct the latitudes from the header information.
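
As a minimal sketch of that reconstruction, the bin centers quoted in the sample header can be regenerated with np.linspace. The names lats and lons are illustrative, and treating south and west as negative is an assumption about the sign convention, not something the file states.

```python
import numpy as np

# Numbers copied from the sample header:
#   Latitudes:  120 bins centered on 59.5S .. 59.5N   (1.00 degree steps)
#   Longitudes: 288 bins centered on 179.375W .. 179.375E (1.25 degree steps)
# Assumption: south and west are represented as negative values.
lats = np.linspace(-59.5, 59.5, 120)
lons = np.linspace(-179.375, 179.375, 288)

# The spacing follows from the endpoints and the bin count.
print(lats[1] - lats[0], lons[1] - lons[0])
```

Each row of the parsed 2D array would then correspond to one entry of lats, and each column to one entry of lons.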

Recommended answer

A bit of a hack, but it doesn't read it "by hand".

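import numpy as np

nrows = 2        # array rows in the sample shown in the question
ncols = 25       # values on each full file line

nlines = 12      # file lines per array row
lastline = 13    # values on the last (short) line of each row

a = np.genfromtxt('tmp.txt',
                  skip_header=3,
                  delimiter=[4] + [3] * (ncols - 1),  # fixed field widths
                  comments='l',                       # drop the trailing "lat = ..."
                  dtype=int)
# Short lines are padded, so flatten each row's block of lines and trim the padding.
a = a.reshape(nrows, -1)[:, :ncols*(nlines-1) + lastline]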

You can pass a list of field widths as delimiter, which for you is [4, 3, 3, 3, ...] because the first value on each line is preceded by a space, which makes its width 4.

You can use comments = 'l' so that each line is cut at the first 'l', which discards the trailing lat = XXXXX text.
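
To see what these two options do in isolation, here is a small sketch on an in-memory string. The sample text is a made-up miniature of the file (4 values per full line instead of 25), and the names sample and widths are illustrative only.

```python
import io
import numpy as np

# Hypothetical miniature of the file: one leading space per line,
# three-digit values, and a "lat = ..." label after the short last line.
sample = (
    " 328322313320\n"
    " 278255  lat =  -59.5\n"
)

widths = [4] + [3] * 3   # the first field absorbs the leading space
a = np.genfromtxt(io.StringIO(sample),
                  delimiter=widths,   # list of ints means fixed-width fields
                  comments="l",       # everything from the first 'l' on is dropped
                  dtype=int)
print(a)
```

The second line only holds two values, so its remaining two fields come back as the integer fill value, which is why a trimming step is still needed afterwards.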

The biggest problem is that you have to reshape and then cut off the excess: because the last file line of each row is shorter, the array is not 'rectangular', so the missing fields are filled with -1s. This requires you to know something about the shape/size of your file.
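
The reshape-and-trim step can be checked end to end on synthetic data. The sketch below is not the answer's exact code: it uses a made-up small layout (10 values per array row, 4 per file line), and it lets reshape infer the number of rows with -1 so that only the row length and the per-line count have to be known. All names are illustrative.

```python
import io
import numpy as np

# Hypothetical layout: 10 values per array row, written 4 per file line,
# so each row takes 3 file lines and the last one holds only 2 values.
per_line, row_len, n_rows = 4, 10, 3
expected = np.arange(100, 100 + n_rows * row_len).reshape(n_rows, row_len)

# Write the values in the same style as the sample file.
lines = []
for r, row in enumerate(expected):
    for start in range(0, row_len, per_line):
        chunk = "".join(f"{v:3d}" for v in row[start:start + per_line])
        is_last = start + per_line >= row_len
        lines.append(" " + chunk + (f"  lat =  {r}.5" if is_last else ""))
text = "\n".join(lines) + "\n"

a = np.genfromtxt(io.StringIO(text),
                  delimiter=[4] + [3] * (per_line - 1),
                  comments="l",
                  dtype=int)

# Each array row occupies a block of file lines (ceiling division),
# so flatten every block into one long row and cut off the padding.
lines_per_row = -(-row_len // per_line)
a = a.reshape(-1, lines_per_row * per_line)[:, :row_len]
print((a == expected).all())
```

For the file in the question the same idea would use 25 values per line and 288 values per row, both of which can be read off the header and the first data line.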
