Fortran: How do I allocate arrays when reading a file of unknown size?

Question

My typical use of Fortran begins with reading in a file of unknown size (usually 5-100MB). My current approach to array allocation involves reading the file twice. First to determine the size of the problem (to allocate arrays) and a second time to read the data into those arrays.

Are there better approaches to size determination/array allocation? I just read about automatic array allocation (example below) in another post that seemed much easier.

array = [array,new_data]

What are all the options and their pros and cons?

Recommended answer

I'll bite, though the question is teetering close to off-topicality. Your options are:


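  1. Read the file once to get the array size, allocate, then read it again.

  2. Read piece by piece, (re-)allocating as you go. Choose the size of the piece to read as you wish (or, perhaps, as you think is likely to be fastest for your case).

  3. Always, always work with files that contain metadata telling an interested program how much data there is; for example, a block header line telling you how many data elements are in the next block.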

Option 3 is the best by far. A little extra thought, and about one whole line of code, at the beginning of a project and so much wasted time and effort saved down the line. You don't have to jump on HDF5 or a similar heavyweight file design method, just adopt enough discipline to last the useful life of the contents of the file. For iteration-by-iteration dumps from your simulation of the universe, a home-brewed approach will do (be honest, you're the only person who's ever going to look at them). For data gathered at an approximate cost of $1M per TB (satellite observations, offshore seismic traces, etc) then HDF5 or something similar.
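As a rough illustration of option 3, here is a minimal sketch of a home-brewed header. It assumes a hypothetical text file named `data.txt` whose first line holds the element count and whose remaining lines hold that many real values; the file name, layout, and variable names are assumptions for illustration, not something the original answer specifies.

```fortran
program read_with_header
  implicit none
  integer :: unit_no, n, ios
  real, allocatable :: values(:)

  ! Hypothetical layout: line 1 = number of values, then the values themselves.
  open(newunit=unit_no, file='data.txt', status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open data.txt'

  ! The "one whole line of code": read the count from the header.
  read(unit_no, *, iostat=ios) n
  if (ios /= 0 .or. n < 0) stop 'bad header line'

  ! Allocate exactly once, then read everything in a single pass.
  allocate(values(n))
  read(unit_no, *, iostat=ios) values
  if (ios /= 0) stop 'fewer values than the header promised'
  close(unit_no)

  print *, 'read', n, 'values'
end program read_with_header
```

The list-directed read fills `values` across as many lines as it needs, so the values may be laid out one per line or several per line.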

Option 1 is fine too. It's not like you have to wait for the tapes to rewind between reads any more. (Well, some do, but they're in a niche these days, and a de-archiving system will often move files from tape to disk if they're to be used.)
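For comparison, option 1 might look something like the sketch below. It assumes a hypothetical file `data.txt` with exactly one real value per line; the names are illustrative only.

```fortran
program two_pass_read
  implicit none
  integer :: unit_no, n, ios, i
  real :: dummy
  real, allocatable :: values(:)

  open(newunit=unit_no, file='data.txt', status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open data.txt'

  ! Pass 1: count the records. Any nonzero iostat (end of file or a
  ! read error) ends the count.
  n = 0
  do
    read(unit_no, *, iostat=ios) dummy
    if (ios /= 0) exit
    n = n + 1
  end do

  ! Allocate once, go back to the start, and read for real.
  allocate(values(n))
  rewind(unit_no)
  do i = 1, n
    read(unit_no, *) values(i)
  end do
  close(unit_no)

  print *, 'read', n, 'values'
end program two_pass_read
```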

Option 2 is a faff. It may also be the worst performing but on all but the largest files the worst performance may be within a nano-century of the best. If that's important to you then check it out.
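A sketch of option 2, again assuming a hypothetical `data.txt` with one real value per line. The piece size `chunk` is an arbitrary choice to tune for your own files. The buffer grows through a temporary array and the intrinsic `move_alloc`, and the final trim relies on Fortran 2003 reallocation on assignment, which some compilers only enable with a flag.

```fortran
program grow_as_you_read
  implicit none
  integer, parameter :: chunk = 1024   ! assumed piece size, tune as needed
  integer :: unit_no, n, ios
  real :: x
  real, allocatable :: values(:), tmp(:)

  open(newunit=unit_no, file='data.txt', status='old', action='read', iostat=ios)
  if (ios /= 0) stop 'cannot open data.txt'

  allocate(values(chunk))
  n = 0
  do
    read(unit_no, *, iostat=ios) x
    if (ios /= 0) exit                 ! end of file or read error
    if (n == size(values)) then
      ! Buffer is full: allocate a larger one, copy, and hand it over.
      allocate(tmp(size(values) + chunk))
      tmp(1:n) = values
      call move_alloc(tmp, values)     ! tmp is deallocated by this call
    end if
    n = n + 1
    values(n) = x
  end do
  close(unit_no)

  ! Trim to the number of values actually read (reallocation on assignment).
  values = values(1:n)

  print *, 'read', n, 'values'
end program grow_as_you_read
```

The `array = [array, new_data]` idiom from the question is the same idea with a piece size of one append: each assignment reallocates `array` and copies its contents, which is convenient but can be slow inside a long loop.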

If you want quantification of my opinions run your own experiments on your files on your hardware.

PS I haven't really got a clue how much it costs to get 1TB of satellite or seismic data, it's a factoid invented to support an argument.
