解析固定宽度的文件 [英] Parse fixed-width files

查看:130
本文介绍了解析固定宽度的文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有很多带有固定宽度字段的文本文件:

I have a lot of text files with fixed-width fields:

<c>     <c>       <c>
Dave    Thomas    123 Main
Dan     Anderson  456 Center
Wilma   Rainbow   789 Street

其余文件采用类似的格式,其中<c>将标记列的开头,但是它们具有各种(未知)列&空间宽度.解析这些文件的最佳方法是什么?

The rest of the files are in a similar format, where the <c> will mark the beginning of a column, but they have various (unknown) column & space widths. What's the best way to parse these files?

我尝试使用Text::CSV,但是由于没有定界符,因此很难获得一致的结果(除非我使用的模块错误):

I tried using Text::CSV, but since there's no delimiter it's hard to get a consistent result (unless I'm using the module wrong):

my $csv = Text::CSV->new();
$csv->sep_char (' ');

while (<FILE>){
    if ($csv->parse($_)) {
        my @columns=$csv->fields();
        print $columns[1] . "\n";
    }
}

推荐答案

如user604939所述,unpack是用于固定宽度字段的工具.但是,需要将unpack传递给模板以使用.由于您说字段可以更改宽度,因此解决方案是从文件的第一行构建此模板:

As user604939 mentions, unpack is the tool to use for fixed width fields. However, unpack needs to be passed a template to work with. Since you say your fields can change width, the solution is to build this template from the first line of your file:

my @template = map {'A'.length}        # convert each to 'A##'
               <DATA> =~ /(\S+\s*)/g;  # split first line into segments
$template[-1] = 'A*';                  # set the last segment to be slurpy

my $template = "@template";
print "template: $template\n";

my @data;
while (<DATA>) {
    push @data, [unpack $template, $_]
}

use Data::Dumper;

print Dumper \@data;

__DATA__
<c>     <c>       <c>
Dave    Thomas    123 Main
Dan     Anderson  456 Center
Wilma   Rainbow   789 Street

打印:


template: A8 A10 A*
$VAR1 = [
          [
            'Dave',
            'Thomas',
            '123 Main'
          ],
          [
            'Dan',
            'Anderson',
            '456 Center'
          ],
          [
            'Wilma',
            'Rainbow',
            '789 Street'
          ]
        ];

这篇关于解析固定宽度的文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆