Perl - Regexp to manipulate .csv

Question

I've got a function in Perl that reads the last modified .csv in a folder and parses its values into variables.

I'm running into some problems with the regular expressions. My .csv looks like this:

Title is: "NAME_NAME_NAME"
"Period end","Duration","Sample","Corner","Line","PDP OUT TOTAL","PDP OUT OK","PDP OUT NOK","PDP OUT OK Rate"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","ARG - NAME 1","536","536","0","100%"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","USA - NAME 2","1850","1438","412","77.72%"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","AUS - NAME 3","8","6","2","75%"


.(ignore this dot, you will understand later)

So far, I've had some help parsing the values into variables:

open my $file, "<", $newest_file
        or die qq(Cannot open file "$newest_file" for reading.);
while ( my $line = <$file> ) {

    my ($date_time, $duration, $sample, $corner, $country_name, $pdp_in_total, $pdp_in_ok, $pdp_in_not_ok, $pdp_in_ok_rate) 
            = parse_line ',', 0, $line;

    my ($date, $time) = split /\s+/, $date_time;
    my ($country, $name) = $country_name =~ m/(.+) - (.*)/;

    print "$date, $time, $country, $name, $pdp_in_total, $pdp_in_ok_rate";
}

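The problems are:

  1. I don't know how to ignore the first AND second lines (that is, the column names from the .csv);

  2. The file sometimes has 2-5 blank lines at the end, as shown in my sample (ignore the dot at the end; it does not exist in the file).

How can I do this?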

Recommended answer

When you have a CSV file with column headers and want to parse the data into variables, the simplest choice is to use Text::CSV. This code shows how to read your data into the hash reference $row (i.e. my %data = %$row).

use strict;
use warnings;
use Text::CSV;
use feature 'say';

my $csv = Text::CSV->new({
        binary  => 1,
        eol => $/,
    });
# A real script would open the file; the DATA internal file handle is used here.
# Read the title line first so that it is not parsed as CSV.
my $title = <DATA>;

# Set the headers using the header line
$csv->column_names( $csv->getline(*DATA) );

while (my $row = $csv->getline_hr(*DATA)) {
    # you can now access the variables via their header names, e.g.:
    if (defined $row->{Duration}) {  # this will skip the blank lines
        say $row->{Duration};
    }
}

__DATA__
Title is: "NAME_NAME_NAME"    
"Period end","Duration","Sample","Corner","Line","PDP IN TOTAL","PDP IN OK","PDP IN NOT OK","PDP IN OK Rate"
"04/12/2014 10:00:00","3600","1","GRPS_INB","CHN - Name 1","1198","1195","3","99.74%"
"04/12/2014 10:00:00","3600","1","GRPS_INB","ARG - Name 2","1198","1069","129","89.23%"
"04/12/2014 10:00:00","3600","1","GRPS_INB","NLD - Name 3","813","798","15","98.15%"

If we print one of the $row variables with Data::Dumper, it shows the structure we get back from Text::CSV:

$VAR1 = {
          'PDP IN TOTAL' => '1198',
          'PDP IN NOT OK' => '3',
          'PDP IN OK' => '1195',
          'Period end' => '04/12/2014 10:00:00',
          'Line' => 'CHN - Name 1',
          'Duration' => '3600',
          'Sample' => '1',
          'PDP IN OK Rate' => '99.74%',
          'Corner' => 'GRPS_INB'
        };
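
For comparison, the two problems from the question can also be handled inside the original loop, without Text::CSV. The following is only a minimal sketch, assuming the title and header always occupy the first two lines of the file. The helper name read_rows, the hash keys, and the in-memory sample are hypothetical and not part of the original post; parse_line comes from the core module Text::ParseWords, which the question's code appears to rely on.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Hypothetical helper (not from the original post): reads CSV rows from a
# file handle, skipping the title line, the header line, and blank lines.
sub read_rows {
    my ($fh) = @_;
    my @rows;
    while ( my $line = <$fh> ) {
        next if $. <= 2;              # skip the title and column-name lines
        next unless $line =~ /\S/;    # skip blank or whitespace-only lines
        chomp $line;

        # Split on commas and strip the surrounding quotes (keep = 0).
        my @fields = parse_line( ',', 0, $line );

        my ( $date, $time )    = split /\s+/, $fields[0];
        my ( $country, $name ) = $fields[4] =~ m/(.+) - (.*)/;

        push @rows, {
            date    => $date,
            time    => $time,
            country => $country,
            name    => $name,
            total   => $fields[5],
            ok_rate => $fields[8],
        };
    }
    return @rows;
}

# Sample data from the question, including trailing blank lines,
# read through an in-memory file handle instead of a real file.
my $sample = <<'CSV';
Title is: "NAME_NAME_NAME"
"Period end","Duration","Sample","Corner","Line","PDP OUT TOTAL","PDP OUT OK","PDP OUT NOK","PDP OUT OK Rate"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","ARG - NAME 1","536","536","0","100%"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","USA - NAME 2","1850","1438","412","77.72%"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","AUS - NAME 3","8","6","2","75%"


CSV

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @rows = read_rows($fh);
close $fh;

print join( ', ', @{$_}{qw(date time country name total ok_rate)} ), "\n"
    for @rows;
```

The line counter $. tracks the most recently read file handle, so the check $. <= 2 only works if nothing else is read between iterations. Text::CSV remains the more robust option when fields may contain embedded newlines or escaped quotes.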
