Parsing a CSV file using array of hash of hashes in Perl


Problem description


I have CSV data in this form:

Sl.No, Label, Type1, Type2...
1, "label1", Y, N, N...
2, "label2", N, Y, Y...
...

Where "Y" and "N" denote whether the corresponding label is to be printed to a file or not.

while ( <$fh> ) {    #Reading the CSV file

    $filter = $_;
    chomp $filter;
    $filter =~ tr/\r//d;

    if ( $. == 1 ) {
        @fieldNames = split ",", $filter;
    }
    else {
        @fields = split ",", $filter;
        $numCustomers = scalar(@fields) - 2;
        push @labels, $fields[2];

        for ( $i = 0; $i < $numCustomers; $i++ ) {

            for ( $j = 0; $j < scalar(@labels); $j++ ) {
                $customer[$i][$j] = $fields[ 2 + $i ];
            }

            $custFile = "customer" . $i . "_external.h";

            open( $fh1, ">", $custFile ) or die "Unable to create external header file for customer $i";
        }
    }
}

for ( $i = 0; $i < scalar(@labels); $i++ ) {

    for ( $j = 0; $j < $numCustomers; $j++ ) {

        $Hash{ $fieldNames[ 2 + $i ] }->{ $labels[$i] } = $customer[$j][$i];
        push @aoh, %Hash;    #Array of hashes
    }
}

my @headerLines = read_file($intFile);  # read the internal file, and copy only
                                        # those lines that are not marked with
                                        # "N" in the CSV file to the external file.

# iterate over elements of each hash and print the labels only if value is 'Y'

foreach my $headerLine (@headerLines) {

    chomp $headerLine;

    for $i ( 0 .. $#aoh ) {

        for my $cust1 ( sort keys %{ $aoh[$i] } ) {    #HERE

            for my $reqLabel1 ( keys %{ $aoh[$i]{$cust1} } ) {

                print "$cust1, $reqLabel1 : $aoh[$i]{$cust1}{$reqLabel1}\n";

                if ( $aoh[$i]{$cust1}{$reqLabel1} eq "Y" ) {

                    for ( $j = 0; $j < $numCustomers; $j++ ) {
                        $req[$j][$i] = $reqLabel1;
                    }
                }
                else {
                    for ( $j = 0; $j < $numCustomers; $j++ ) {
                        $nreq[$j][$i] = $reqLabel1;
                    }
                }
            }

        }

        if ( grep { $headerLine =~ /$_/ } @nreq ) {
            next;    #Don't print this line in the external file
        }
        else {
            print $fh1 $headerLine . "\n";    #print this line in the external file
        }
    }
}

This complains "Cannot use string Type1 as a hash REF", referring to the line marked as #HERE.

I've tried dumping data structures everywhere, but I'm not sure where this cropped up from.

Any insights would be appreciated.

I have received feedback that using Text::CSV would be a better solution. How would it reduce the need to use nested data structures?

Solution

Ok, your problem gets a lot easier with Text::CSV. I would suggest looking at a rewrite, or re-asking your question framed that way.
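As a rough idea of what such a rewrite could look like, here is a minimal sketch, assuming Text::CSV is installed from CPAN. The sample data, the in-memory filehandle and the %wanted hash are made up for illustration; they are not from the question. It reads each row as a hash keyed by column name and collects, per Type column, the labels marked Y.

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical sample data standing in for the real CSV file
my $data = <<'CSV';
Sl.No,Label,Type1,Type2
1,"label1",Y,N
2,"label2",N,Y
CSV

open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my $csv = Text::CSV->new( { binary => 1, auto_diag => 1 } );

# The first row supplies the column names that getline_hr uses as hash keys
my @names = @{ $csv->getline($fh) };
$csv->column_names(@names);

# For each customer column (Type1, Type2, ...), collect the labels marked Y
my %wanted;
while ( my $row = $csv->getline_hr($fh) ) {
    for my $type ( @names[ 2 .. $#names ] ) {
        push @{ $wanted{$type} }, $row->{Label} if $row->{$type} eq 'Y';
    }
}
close $fh;

for my $type ( sort keys %wanted ) {
    print "$type: @{ $wanted{$type} }\n";
}
```

With one hash per row there is no need for the parallel @labels, @customer and @fieldNames arrays, and a single hash of arrays is enough to decide which labels each customer's file should keep.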

But your problem is actually this:

push @aoh, %Hash;                #Array of hashes

That doesn't create an array of hashes at all. That extracts all the elements from %Hash (in no particular order, aside from keys and values being paired) and inserts them into @aoh.
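A small sketch of the difference, using a one-key hash shaped like the one in the question (the data is made up for illustration). After the flattening push, the first element of the array is the plain string Type1, which is what the dereference on the #HERE line trips over.

```perl
use strict;
use warnings;

# Hypothetical data shaped like the question's %Hash: column name => { label => flag }
my %Hash = ( Type1 => { label1 => 'Y' } );

my @flat;
push @flat, %Hash;    # list context: the key and the value become two separate elements

my @aoh;
push @aoh, \%Hash;    # a single element: a reference to the hash

printf "flat has %d elements, the first is '%s'\n", scalar(@flat), $flat[0];
printf "aoh has %d element, a %s reference\n", scalar(@aoh), ref $aoh[0];
```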

You probably want:

push @aoh, \%Hash;

Or perhaps:

push @aoh, { %Hash }; 
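The two forms behave differently once %Hash is modified afterwards. A minimal sketch with made-up data (the array names are only for illustration):

```perl
use strict;
use warnings;

my %Hash = ( Type1 => 'Y' );

my ( @by_ref, @by_copy );
push @by_ref,  \%Hash;       # reference to the one and only %Hash
push @by_copy, { %Hash };    # new anonymous hash holding a shallow copy

$Hash{Type1} = 'N';          # later reuse of %Hash

print "by reference: $by_ref[0]{Type1}\n";     # sees the later change
print "by copy:      $by_copy[0]{Type1}\n";    # keeps the value from push time
```

Note that { %Hash } copies only the top level. If the values are themselves hash references, as they are in the question, the copies still share those inner hashes.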

I'm not entirely clear which one you want: because you're reusing %Hash, pushing \%Hash repeatedly stores the same reference every time, so you may get duplication. This is best dealt with by adding use strict; use warnings; and lexically scoping your hashes correctly.
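One way to scope the hash, sketched with hypothetical rows that have already been split into fields (the names @names, @rows and %row are made up for illustration): declaring the hash with my inside the loop gives a fresh hash on every iteration, so each pushed reference points at its own data.

```perl
use strict;
use warnings;

# Hypothetical rows, already split into fields: Sl.No, Label, Type1, Type2
my @names = qw(Sl.No Label Type1 Type2);
my @rows  = ( [ 1, 'label1', 'Y', 'N' ], [ 2, 'label2', 'N', 'Y' ] );

my @aoh;
for my $fields (@rows) {
    my %row;                    # a fresh hash on every iteration
    @row{@names} = @$fields;    # hash slice: column name => field value
    push @aoh, \%row;           # each reference points at a distinct hash
}

print "$_->{Label}: Type1=$_->{Type1} Type2=$_->{Type2}\n" for @aoh;
```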
