How to split a text file into two arrays?

Question

I am parsing a text file that looks like this:

ABCD
EFGH
IJKL

MNOP
QRST
UVWX

Is it possible to parse this in Perl in a way which results in two 4x3 arrays? So for example, array1[2][2] = K and array2[0][1] = N. Code:

#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;

open(FH, '<', 'gwas.txt') or die "Couldn't open file $!";

while(<FH>) {

    #parse file into 2 arrays
}
close(FH);

Recommended answer

The procedure explained in a comment, condensed

my @matrix = map { [ split '', $_ ]  } <$fh>;

In list context the readline operator <$fh> returns all lines of the file (see I/O Operators in perlop). Each line is processed by the block in map, and the resulting list is assigned to @matrix.

In the block, split breaks each line ($_) into characters (''), and an anonymous array is made of that list ([...]). Since split operates on $_ by default, this can be written as map { [ split '' ] }. Note that without a chomp, each row also ends with the line's newline character.
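A minimal sketch of the same one-liner on in-memory lines, with chomp added so that the newline does not end up as the last element of each row. The @lines array is only a stand-in for <$fh> and is not part of the original answer.

```perl
use strict;
use warnings;

# Stand-in for the lines read from the file (assumed sample data).
my @lines = ("ABCD\n", "EFGH\n", "IJKL\n");

# Work on a copy of each line: chomp removes the trailing newline,
# then split '' breaks the line into single characters.
my @matrix = map { my $line = $_; chomp $line; [ split '', $line ] } @lines;

print "$matrix[2][2]\n";              # K
print scalar @{ $matrix[0] }, "\n";   # 4
```

Copying $_ into $line keeps the original @lines untouched, since map aliases $_ to each element.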

Always use lexical filehandles; it is simply better practice.

my $file = 'gwas.txt';
open my $fh, '<', $file or die "Couldn't open $file: $!";


As pointed out in comments, this processes the whole file into one array. To process two blocks of text, each into its own array, we can write it as a loop (and use empty lines to distinguish blocks).

my @matrix;
my $index = 0;    
while (<$fh>) {
    $matrix[$index++] = [ split '', $_ ];
}

This makes an anonymous array [ ... ] from the characters of the line and assigns it to position $index in the array @matrix (and then increments the index). Another way of doing this is

my @row = split '', $_;
$matrix[$index++] = \@row;

where a new array is constructed on every iteration and a reference to it assigned.
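The reason the `my @row` declaration has to sit inside the loop can be sketched as follows. The sample lines and the names @good, @bad, and @shared are assumptions for illustration. The sketch contrasts a fresh lexical per iteration with a single array declared outside the loop, where every stored reference points at the same array.

```perl
use strict;
use warnings;

my @lines = ('ABCD', 'EFGH');   # assumed sample data, already chomped

# Fresh lexical per iteration: each reference points to its own array.
my @good;
for my $line (@lines) {
    my @row = split '', $line;
    push @good, \@row;
}

# One shared array: both references point to the same array,
# which holds whatever was assigned last.
my (@bad, @shared);
for my $line (@lines) {
    @shared = split '', $line;
    push @bad, \@shared;
}

print "good: $good[0][0] $good[1][0]\n";   # good: A E
print "bad:  $bad[0][0] $bad[1][0]\n";     # bad:  E E
```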

Then we need to use empty lines to tell blocks apart. We also need to manage the two arrays, which is nicely done by keeping references to the arrays (matrices) in another data structure, say an array.

use warnings;
use strict;
use Data::Dump qw(dd);

my $matrices;  # will be an arrayref, for references to matrices

my $file = 'matrices.txt';
open my $fh, '<', $file or die "Can't open $file: $!";

my @matrix;
my $index = 0;   
while (<$fh>) {
    chomp;

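    if (/^\s*$/) {                     # blank line, done with one matrix
        next if not @matrix;           # skip leading or repeated blank lines
        push @$matrices, [ @matrix ];  # store a copy of @matrix
        @matrix = ();                  # start the next matrix empty
        $index = 0;                    # reset index
    }
    else {
        $matrix[$index] = [ split '', $_ ];  # scalar element, not a slice
        ++$index;
    }
}
push @$matrices, [ @matrix ] if @matrix;  # the last one in the file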

close $fh;

print "Spot check: \$matrices->[0][2][2]: $matrices->[0][2][2]\n";
dd($matrices);

This makes assumptions about the data, namely that it has exactly the expected format.

Please see the tutorial on references, perlreftut, and the cookbook on data structures, perldsc.

Also see the answer by xxfelixxx, which does all this in a slightly different way.

There are a number of other ways to do this.
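One such alternative is sketched here, under the assumption that blocks are separated by truly empty lines (no stray whitespace). Setting $/ to the empty string puts readline into paragraph mode, so each read returns one whole block. The in-memory filehandle and sample data are for illustration only and are not from the original answer.

```perl
use strict;
use warnings;

# Assumed sample data, read through an in-memory filehandle.
my $data = "ABCD\nEFGH\nIJKL\n\nMNOP\nQRST\nUVWX\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!";

my @matrices;
{
    local $/ = '';                  # paragraph mode: one block per read
    while (my $block = <$fh>) {
        # split /\n/ drops trailing empty fields, so no empty rows appear
        my @rows = map { [ split '', $_ ] } split /\n/, $block;
        push @matrices, \@rows;
    }
}
close $fh;

print "$matrices[0][2][2] $matrices[1][0][1]\n";   # K N
```

Localizing $/ inside a bare block restores the default line-by-line behavior for any later reads.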
