Bug with parsing by Text::CSV_XS?
Problem description
I tried to use Text::CSV_XS to parse some logs. However, the following code doesn't do what I expected, which is to split the line into pieces on the separator " " (a single space).
The funny thing is that if I remove the double quotes from the string $a, the line is split.
I wonder whether this is a bug or whether I missed something. Thanks!
use Text::CSV_XS;
$a = 'id=firewall time="2010-05-09 16:07:21 UTC"';
$userDefinedSeparator = Text::CSV_XS->new({sep_char => " "});
print "$userDefinedSeparator\n";
$userDefinedSeparator->parse($a);
my $e;
foreach $e ($userDefinedSeparator->fields) {
print $e, "\n";
}
In the above code snippet, if I change the = (after time) to a space, then it works fine. I've started to wonder whether this is a bug after all.
$a = 'id=firewall time "2010-05-09 16:07:21 UTC"';
Recommended answer
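You have confused the module by leaving both the quote character and the escape character set to their default, the double quote ", and then leaving double quotes embedded in the fields you want to split. By default a quote character in the middle of an unquoted field, as in time="2010-05-09, is a parse error, which is why the line is split once the quote follows a separator instead.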
Disable quote_char and escape_char, like this:
use strict;
use warnings;
use Text::CSV_XS;
my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';
my $space_sep = Text::CSV_XS->new({
    sep_char    => ' ',
    quote_char  => undef,
    escape_char => undef,
});
$space_sep->parse($string);
for my $field ($space_sep->fields) {
    print "$field\n";
}
Output
id=firewall
time="2010-05-09
16:07:21
UTC"
But note that you have achieved exactly the same thing as print "$_\n" for split ' ', $string, which is to be preferred as it is both more efficient and more concise.
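As a minimal sketch of the split alternative mentioned above: split ' ' is Perl's special whitespace mode, which skips leading whitespace and treats any run of whitespace as one separator, while split / / splits on every single space. The $padded string is a made-up input, not from the question, used only to show where the two differ.

```perl
use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

# Special whitespace mode: leading whitespace is skipped and any run of
# whitespace counts as a single separator.
my @fields = split ' ', $string;
print "$_\n" for @fields;

# Hypothetical input with extra spaces, to show the difference.
my $padded = '  id=firewall   time=now';
my @awk    = split ' ', $padded;   # two fields, no empty strings
my @single = split / /, $padded;   # every space separates, so empty fields appear

printf "split ' ' gives %d fields, split / / gives %d fields\n",
    scalar @awk, scalar @single;
```

For the single-spaced log line in the question both forms give the same four fields. If the logs can contain doubled spaces and the empty fields matter, split / / is probably the closer match to a separator-based parser.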
In addition, you must always use strict and use warnings, and never use $a or $b as variable names, both because they are used by sort and because they are meaningless and undescriptive.
Update
As @ThisSuitIsBlackNot points out, your intention is probably not to split on spaces but to extract a series of key=value pairs. If so, then this method puts the values straight into a hash.
use strict;
use warnings;
my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';
my %data = $string =~ / ([^=\s]+) \s* = \s* ( "[^"]*" | [^"\s]+ ) /xg;
use Data::Dump;
dd \%data;
Output
{ id => "firewall", time => "\"2010-05-09 16:07:21 UTC\"" }
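The output above keeps the surrounding double quotes as part of the time value. If the bare value is wanted instead, one possible variation (a sketch, not part of the original answer) is to capture the quoted and unquoted forms in separate groups and keep whichever one matched:

```perl
use strict;
use warnings;

my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';

my %data;
while ( $string =~ / ([^=\s]+) \s* = \s* (?: "([^"]*)" | ([^"\s]+) ) /xg ) {
    # $2 is set for a quoted value (quotes excluded), $3 for a bare value.
    $data{$1} = defined $2 ? $2 : $3;
}

print "$_ => $data{$_}\n" for sort keys %data;
# id => firewall
# time => 2010-05-09 16:07:21 UTC
```

The while loop is used here rather than a list assignment because each match now has three capture groups, one of which is always undefined.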
Update
This program will extract the two name=value strings and print them on separate lines.
use strict;
use warnings;
my $string = 'id=firewall time="2010-05-09 16:07:21 UTC"';
my @fields = $string =~ / (?: "[^"]*" | \S )+ /xg;
print "$_\n" for @fields;
Output
id=firewall
time="2010-05-09 16:07:21 UTC"