Perl - Regexp to manipulate .csv
Question
I've got a function in Perl that reads the last modified .csv in a folder and parses its values into variables.
I'm running into some problems with the regular expressions. My .csv looks like:
Title is: "NAME_NAME_NAME"
"Period end","Duration","Sample","Corner","Line","PDP OUT TOTAL","PDP OUT OK","PDP OUT NOK","PDP OUT OK Rate"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","ARG - NAME 1","536","536","0","100%"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","USA - NAME 2","1850","1438","412","77.72%"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","AUS - NAME 3","8","6","2","75%"
.(ignore this dot, you will understand later)
So far, I've had some help parsing the values into variables, with:
use Text::ParseWords;    # provides parse_line

open my $file, "<", $newest_file
    or die qq(Cannot open file "$newest_file" for reading.);
while ( my $line = <$file> ) {
    my ($date_time, $duration, $sample, $corner, $country_name, $pdp_in_total, $pdp_in_ok, $pdp_in_not_ok, $pdp_in_ok_rate)
        = parse_line ',', 0, $line;
    my ($date, $time) = split /\s+/, $date_time;
    my ($country, $name) = $country_name =~ m/(.+) - (.*)/;
    print "$date, $time, $country, $name, $pdp_in_total, $pdp_in_ok_rate";
}
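The problems are:
- I don't know how to ignore the first AND second lines (i.e. the column names from the .csv);
- The file sometimes has 2-5 blank lines at the end, as shown in my example (ignore the dot at the end of it; it is not in the file).
How can I do this?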
Recommended answer
When you have a csv file with column headers and want to parse the data into variables, the simplest choice would be to use Text::CSV. This code shows how you get your data into the hash reference $row (i.e. my %data = %$row).
use strict;
use warnings;
use Text::CSV;
use feature 'say';

my $csv = Text::CSV->new({
    binary => 1,
    eol    => $/,
});

# Open your file here; this example reads from the internal DATA file handle.
# Read the title line first so it is not parsed as CSV.
my $title = <DATA>;

# Set the headers using the header line
$csv->column_names( $csv->getline(*DATA) );

while (my $row = $csv->getline_hr(*DATA)) {
    # you can now access the variables via their header names, e.g.:
    if (defined $row->{Duration}) {   # this will skip the blank lines
        say $row->{Duration};
    }
}
__DATA__
Title is: "NAME_NAME_NAME"
"Period end","Duration","Sample","Corner","Line","PDP IN TOTAL","PDP IN OK","PDP IN NOT OK","PDP IN OK Rate"
"04/12/2014 10:00:00","3600","1","GRPS_INB","CHN - Name 1","1198","1195","3","99.74%"
"04/12/2014 10:00:00","3600","1","GRPS_INB","ARG - Name 2","1198","1069","129","89.23%"
"04/12/2014 10:00:00","3600","1","GRPS_INB","NLD - Name 3","813","798","15","98.15%"
If we print one of the $row variables with Data::Dumper, it shows the structure we are getting back from Text::CSV:
$VAR1 = {
'PDP IN TOTAL' => '1198',
'PDP IN NOT OK' => '3',
'PDP IN OK' => '1195',
'Period end' => '04/12/2014 10:00:00',
'Line' => 'CHN - Name 1',
'Duration' => '3600',
'Sample' => '1',
'PDP IN OK Rate' => '99.74%',
'Corner' => 'GRPS_INB'
};
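If adding Text::CSV is not an option, the loop from the question can probably be kept and adjusted to handle both problems directly. The following is only a minimal sketch under some assumptions: the title and the column names always occupy exactly the first two lines, no field contains an embedded newline, and the sample text is held in a string (the hypothetical `$csv_text`) so the example is self-contained. The `@records` array and its hash keys are likewise names invented for illustration.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

# Sample input kept in a string so the sketch runs on its own;
# in real use, open $newest_file as in the question instead.
my $csv_text = <<'END_CSV';
Title is: "NAME_NAME_NAME"
"Period end","Duration","Sample","Corner","Line","PDP OUT TOTAL","PDP OUT OK","PDP OUT NOK","PDP OUT OK Rate"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","ARG - NAME 1","536","536","0","100%"
"04/12/2014 11:00:00","3600","1","GPRS_OUT","USA - NAME 2","1850","1438","412","77.72%"


END_CSV

open my $fh, '<', \$csv_text
    or die "Cannot open in-memory file: $!";

my @records;    # hypothetical container for the parsed rows

while ( my $line = <$fh> ) {
    next if $. <= 2;              # skip the title line and the column-name line
    next unless $line =~ /\S/;    # skip blank or whitespace-only lines
    chomp $line;

    my ($date_time, $duration, $sample, $corner, $country_name,
        $total, $ok, $not_ok, $ok_rate) = parse_line( ',', 0, $line );

    my ($date, $time)    = split /\s+/, $date_time;
    my ($country, $name) = $country_name =~ m/(.+) - (.*)/;

    push @records, {
        date    => $date,
        time    => $time,
        country => $country,
        name    => $name,
        total   => $total,
        ok_rate => $ok_rate,
    };

    print "$date, $time, $country, $name, $total, $ok_rate\n";
}

close $fh;
```

Here `$.` holds the current line number of the file handle that was read last, so `next if $. <= 2` skips the two leading lines, and the `/\S/` test drops the trailing blank lines. Text::ParseWords ships with Perl, which avoids an extra dependency, but Text::CSV is likely the more robust choice if the data may contain escaped quotes or multi-line fields.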