Reading a character string of unknown length


Problem description

I have been tasked with writing a Fortran 95 program that will read character input from a file, and then (to start with) simply spit it back out again. The tricky part is that these lines of input are of varying length (no maximum length given) and there can be any number of lines within the file.

I've used

    do
      read( 1, *, iostat = IO ) DNA    ! reads to EOF -- GOOD!!
      if ( IO < 0 ) exit               ! if EOF is reached, exit do
      I = I + 1
      NumRec = I                       ! used later for total no. of records
      allocate( Seq(I) )
      Seq(I) = DNA
      print*, I, Seq(I)
      X = Len_Trim( Seq(I) )           ! length of individual sequence
      print*, 'Sequence size: ', X
      print*
    end do

However, my initial statements list

    character(100), dimension(:), allocatable :: Seq
    character(100)  DNA

and the appropriate integers etc.

I guess what I'm asking is if there is any way to NOT list the size of the character strings in the first instance. Say I've got a string of DNA that is 200+ characters, and then another that is only 25, is there a way that the program can just read what there is and not need to include all the additional blanks? Can this be done without needing to use len_trim, since it can't be referenced in the declaration statements?

Solution

To progressively read a record in Fortran 95, use non-advancing input. For example:

    CHARACTER(10) :: buffer
    INTEGER :: size
    READ (unit, "(A)", ADVANCE='NO', SIZE=size, EOR=10, END=20) buffer

will read up to 10 characters worth (the length of buffer) each time it is called. The file position will only advance to the next record (the next line) once the entire record has been read by a series of one or more non-advancing reads.

Barring an end of file condition, the size variable will be defined with the actual number of characters read into buffer each time the read statement is executed.

The EOR and END specifiers are used to control execution flow (execution will jump to the appropriately labelled statement) when end of record or end of file conditions occur respectively. You can also use an IOSTAT specifier to detect these conditions, but the particular negative values to use for the two conditions are processor dependent.

You can sum size within a particular record to work out the length of that particular record.

Wrap such a non-advancing read in a loop that detects the end of file and end of record conditions and you have the incremental reading part.
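
As a rough sketch of that loop in Fortran 95 style: the program below sums size over the chunks of each record and jumps out through the EOR and END labels. The unit number, the scratch file and the sample lines are assumptions made only so the example runs on its own; a real program would open the actual input file instead.

```fortran
program record_length_demo
  implicit none
  integer, parameter :: unit = 10     ! arbitrary unit number (assumption)
  character(10) :: buffer
  integer :: size, reclen

  ! Build a small scratch file so the sketch is self-contained.
  open (unit, status='scratch', action='readwrite')
  write (unit, "(A)") 'ACGT'
  write (unit, "(A)") 'ACGTACGTACGTACGTACGTACGTA'
  write (unit, "(A)") ''
  rewind (unit)

  do                                  ! one pass per record
    reclen = 0
    do                                ! one pass per 10-character chunk
      read (unit, "(A)", advance='no', size=size, eor=10, end=20) buffer
      reclen = reclen + size          ! buffer was filled; record may continue
    end do
10  reclen = reclen + size            ! last chunk, cut short by end of record
    print *, 'Record length:', reclen
  end do

20 close (unit)
end program record_length_demo
```

With the three sample lines this should report lengths of 4, 25 and 0. Note that a record whose length is an exact multiple of the buffer length needs one extra read, which returns a size of zero together with the end of record condition.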

In Fortran 95, the length specification for a local character variable must be a specification expression - essentially an expression that can be safely evaluated prior to the first executable statement of the scope that contains the variable's declaration. Constants represent the simplest case, but a specification expression in a procedure can involve dummy arguments of that procedure, amongst other things.

Reading the entire record of arbitrary length in is then a multi stage process:

  • Determine the length of the current record by using a series of incremental reads. These incremental reads for a particular record finish when the end of record condition occurs, at which time the file position will have moved to the next record.
  • Backspace the file back to the record of interest.
  • Call a procedure, passing the length of the current record as an argument. Inside that procedure, declare a character variable whose length is given by the corresponding dummy argument.
  • Inside that called procedure, read the current record into that character variable using normal advancing input.
  • Carry out further processing on that character variable!

Note that each record ends up being read twice: once to determine its length, and a second time to read the data into a character variable of the correct length.
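
The steps above could be sketched as follows. This is only an illustration under stated assumptions: the module and procedure names (record_reader, next_record_length, process_record), the unit number and the scratch file contents are all made up for the example.

```fortran
module record_reader
  implicit none
contains

  ! Pass 1: count the characters in the next record, then BACKSPACE so
  ! the file is positioned at the start of that record again.
  ! eof is set to .true. when no record remains.
  subroutine next_record_length(unit, reclen, eof)
    integer, intent(in)  :: unit
    integer, intent(out) :: reclen
    logical, intent(out) :: eof
    character(10) :: buffer
    integer :: size

    reclen = 0
    eof = .false.
    do
      read (unit, "(A)", advance='no', size=size, eor=10, end=20) buffer
      reclen = reclen + size
    end do
10  reclen = reclen + size       ! final chunk of the record
    backspace (unit)             ! return to the record just measured
    return
20  eof = .true.
  end subroutine next_record_length

  ! Pass 2: the length of line is a specification expression that
  ! depends on the dummy argument reclen (an automatic variable).
  subroutine process_record(unit, reclen)
    integer, intent(in) :: unit, reclen
    character(reclen) :: line
    read (unit, "(A)") line      ! ordinary advancing read
    print *, reclen, '[', line, ']'
  end subroutine process_record

end module record_reader

program two_pass_demo
  use record_reader
  implicit none
  integer, parameter :: unit = 10   ! arbitrary unit number (assumption)
  integer :: reclen
  logical :: eof

  ! Scratch file with sample data so the sketch runs on its own.
  open (unit, status='scratch', action='readwrite')
  write (unit, "(A)") 'ACGT'
  write (unit, "(A)") 'ACGTACGTACGTACGTACGTACGTA'
  rewind (unit)

  do
    call next_record_length(unit, reclen, eof)
    if (eof) exit
    call process_record(unit, reclen)
  end do
  close (unit)
end program two_pass_demo
```

Any further processing has to happen inside process_record (or procedures it calls), because line only exists for the duration of that call.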

Alternative approaches exist that use allocatable (or automatic) character arrays of length one. The overall strategy is the same. Look at the code of the Get procedures in the common ISO_VARYING_STRING implementation for an example.

Fortran 2003 introduces deferred length character variables, which can have their length specified by an arbitrary expression in an allocate statement or, for allocatable variables, by the length of the right hand side in an assignment statement. This (in conjunction with other "allocatable" enhancements) allows the progressive read that determines the record length to also build the character variable that holds the contents of the record. Your supervisor needs to bring his Fortran environment up to date.
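
For comparison, a single-pass version using Fortran 2003 features might look like the sketch below. It assumes a compiler that supports deferred-length allocatable characters with reallocation on assignment, and it uses the Fortran 2003 intrinsics is_iostat_end and is_iostat_eor in place of labels. The name read_line, the unit number and the scratch file are illustrative assumptions.

```fortran
program deferred_length_demo
  implicit none
  integer, parameter :: unit = 10        ! arbitrary unit number (assumption)
  character(:), allocatable :: line
  logical :: eof

  ! Scratch file with sample data so the sketch runs on its own.
  open (unit, status='scratch', action='readwrite')
  write (unit, "(A)") 'ACGT'
  write (unit, "(A)") 'ACGTACGTACGTACGTACGTACGTA'
  rewind (unit)

  do
    call read_line(unit, line, eof)
    if (eof) exit
    print *, len(line), '[', line, ']'   ! no trailing blanks to trim
  end do
  close (unit)

contains

  ! Read one whole record into line, growing it chunk by chunk.
  subroutine read_line(unit, line, eof)
    integer, intent(in) :: unit
    character(:), allocatable, intent(out) :: line
    logical, intent(out) :: eof
    character(10) :: buffer
    integer :: size, ios

    line = ''                            ! allocates line with length zero
    eof = .false.
    do
      read (unit, "(A)", advance='no', size=size, iostat=ios) buffer
      if (ios > 0) stop 'read error'
      if (is_iostat_end(ios)) then
        eof = .true.
        return
      end if
      line = line // buffer(:size)       ! reallocates line to the new length
      if (is_iostat_eor(ios)) return     ! record complete
    end do
  end subroutine read_line

end program deferred_length_demo
```

Here each record is read only once, and len(line) gives its exact length without needing len_trim.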
