Bug with parsing by Text::CSV_XS?

Question

I tried to use Text::CSV_XS to parse some logs. However, the following code doesn't do what I expected, which is to split the line into pieces on the separator " ".

The funny thing is that if I remove the double quotes from the string $a, it does split.

I wonder if it's a bug or if I missed something. Thanks!

use Text::CSV_XS;

$a = 'id=firewall time="2010-05-09 16:07:21 UTC"';

$userDefinedSeparator = Text::CSV_XS->new({sep_char => " "});
print "$userDefinedSeparator\n";
$userDefinedSeparator->parse($a);
my $e;
foreach $e ($userDefinedSeparator->fields) {
    print $e, "\n";
}

In the above code snippet, if I change the = (after time) to a space, it works fine. I'm starting to wonder whether this is a bug after all.

$a = 'id=firewall time "2010-05-09 16:07:21 UTC"';

Recommended answer

You have confused the module by leaving both the quote character and the escape character set to the double quote ", while double quotes are embedded in the fields you want to split.
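To see the failure instead of silently getting no fields, one option is to check the return value of parse and ask the object what went wrong. The following is a minimal sketch, assuming Text::CSV_XS is installed; the variable names ($csv, $ok, $code, $message) are illustrative and do not come from the original post.

```perl
use strict;
use warnings;

use Text::CSV_XS;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

# quote_char and escape_char are left at their default, the double quote
my $csv = Text::CSV_XS->new({ sep_char => ' ' });

# parse returns false when the line cannot be parsed
my $ok = $csv->parse($string);

if ($ok) {
    print "$_\n" for $csv->fields;
}
else {
    # In list context, error_diag returns the error code and message first
    my ($code, $message) = $csv->error_diag;
    print "parse failed: $code $message\n";
}
```

With the string from the question, the else branch should be the one that runs, because a quote character appears in the middle of an unquoted field.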

Just disable quote_char and escape_char, like this

use strict;
use warnings;

use Text::CSV_XS;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

my $space_sep = Text::CSV_XS->new({
   sep_char    => ' ',
   quote_char  => undef,
   escape_char => undef,
});

$space_sep->parse($string);

for my $field ($space_sep->fields) {
    print "$field\n";
}

Output

id=firewall
time="2010-05-09
16:07:21
UTC"

But note that you have achieved exactly the same thing as print "$_\n" for split ' ', $string, which is to be preferred as it is both more efficient and more concise.
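Written out as a complete script, the split version might look like the sketch below. The array name @pieces is only for illustration.

```perl
use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

# split ' ' (a single-space string, not a regex) splits on runs of
# whitespace and ignores any leading whitespace
my @pieces = split ' ', $string;

print "$_\n" for @pieces;
```

For this string it yields the same four pieces as the Text::CSV_XS version. The two can differ on other input: consecutive spaces are treated as one separator by split ' ', whereas a CSV parser would normally report an empty field between them.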

In addition, you must always use strict and use warnings, and never use $a or $b as variable names, both because they are used by sort and because they are meaningless and undescriptive.
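As a small illustration of why $a and $b are special, here is a sketch of how sort uses them. The data in @times is made up for the example.

```perl
use strict;
use warnings;

my @times = (16, 7, 21);

# Inside the block, sort sets the package variables $a and $b to the
# two elements being compared; they need no declaration, even under strict
my @sorted = sort { $a <=> $b } @times;

print "@sorted\n";   # prints "7 16 21"
```

Declaring your own lexical with my $a in the same scope would hide the package variable that the sort block relies on, which is one reason to pick a descriptive name such as $string instead.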

Update

As @ThisSuitIsBlackNot points out, your intention is probably not to split on spaces but to extract a series of key=value pairs. If so, then this method puts the values straight into a hash.

use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

my %data = $string =~ / ([^=\s]+) \s* = \s* ( "[^"]*" | [^"\s]+ ) /xg;

use Data::Dump;
dd \%data;

Output

{ id => "firewall", time => "\"2010-05-09 16:07:21 UTC\"" }
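Note that the value stored for time above still includes its surrounding double quotes. If the quotes are not wanted, one possible variation (not part of the original answer) is to use a branch-reset group, available from Perl 5.10, so that both alternatives capture into the same group:

```perl
use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

# (?| ... ) is a branch-reset group: both alternatives capture into the
# same group number, so each match yields exactly one key and one value,
# and the quotes sit outside the capture
my %data = $string =~ / ([^=\s]+) \s* = \s* (?| "([^"]*)" | ([^"\s]+) ) /xg;

print "$_ => $data{$_}\n" for sort keys %data;
```

Here time maps to 2010-05-09 16:07:21 UTC without the quotes, and id still maps to firewall.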


Update

This program will extract the two name=value strings and print them on separate lines.

use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

my @fields = $string =~ / (?: "[^"]*" | \S )+ /xg;

print "$_\n" for @fields;

Output

id=firewall
time="2010-05-09 16:07:21 UTC"
