How can I parse quoted CSV in Perl with a regex?


Question

I'm having some issues parsing CSV data that contains quotes. My main problem is with quotes inside a field. In the following example, lines 1 to 4 are parsed correctly, but lines 5, 6 and 7 are not.

COLLOQ_TYPE,COLLOQ_NAME,COLLOQ_CODE,XDATA
S,"BELT,FAN",003541547,
S,"BELT V,FAN",000324244,
S,SHROUD SPRING SCREW,000868265,
S,"D" REL VALVE ASSY,000771881,
S,"YBELT,"V"",000323030,
S,"YBELT,'V'",000322933,

I'd like to avoid Text::CSV, as it isn't installed on the target server. Realising that CSV is more complicated than it looks, I'm using a recipe from the Perl Cookbook.

sub parse_csv {
  my $text = shift; # record containing comma-separated values
  my @columns = ();
  push(@columns, $+) while $text =~ m{
    # The first part groups the phrase inside quotes
    "([^\"\\]*(?:\\.[^\"\\]*)*)",?
      | ([^,]+),?
      | ,
    }gx;
  # A trailing comma means the last field is empty
  push(@columns, undef) if substr($text, -1, 1) eq ',';
  return @columns; # list of values that were comma-separated
}

Does anyone have a suggestion for improving the regex to handle the above cases?
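To see where the recipe goes wrong, a small harness along these lines can be run against the sample rows. This is only a sketch: the sub is the Cookbook recipe as quoted in the question, the loop and the output format are made up for illustration, and the field count is just a rough signal of whether a row was split as intended.

```perl
use strict;
use warnings;

# The Cookbook recipe from the question, unchanged in behaviour.
sub parse_csv {
    my $text = shift;
    my @columns;
    push(@columns, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?
          | ([^,]+),?
          | ,
    }gx;
    push(@columns, undef) if substr($text, -1, 1) eq ',';
    return @columns;
}

# Sample rows from the question; the header has four columns.
my @rows = (
    'COLLOQ_TYPE,COLLOQ_NAME,COLLOQ_CODE,XDATA',
    'S,"BELT,FAN",003541547,',
    'S,"BELT V,FAN",000324244,',
    'S,SHROUD SPRING SCREW,000868265,',
    'S,"D" REL VALVE ASSY,000771881,',
    'S,"YBELT,"V"",000323030,',
    q{S,"YBELT,'V'",000322933,},
);

# Print the number of fields and the fields themselves for each row,
# so that rows split into the wrong number of pieces stand out.
for my $row (@rows) {
    my @fields = parse_csv($row);
    printf "%d fields: %s\n", scalar @fields,
        join(' ', map { defined $_ ? "[$_]" : '[undef]' } @fields);
}
```

Rows that do not come back with four fields are the ones where an embedded quote ended the quoted field early.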

Recommended answer

Please, Try Using CPAN

There's no reason you couldn't download a copy of Text::CSV, or any other non-XS implementation of a CSV parser, and install it in your local directory or in a lib/ subdirectory of your project, so that it's installed along with your project's rollout.

If you can't store text files in your project, then I'm wondering how you are coding your project at all.

http://novosial.org/perl/life-with-cpan/non-root/

This should be a good guide on how to get these modules into a working state locally.
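As a rough illustration of the lib/ approach, the sketch below assumes a pure-Perl copy of Text::CSV has been unpacked into a lib/ directory next to the script. The directory layout and the helper names make_parser and parse_line are assumptions for this example; only new, parse, fields and error_diag are Text::CSV calls.

```perl
use strict;
use warnings;
use FindBin qw($Bin);
# Assumption: Text::CSV was unpacked into ./lib next to this script.
use lib "$Bin/lib";

# Hypothetical helper: returns a Text::CSV object,
# or nothing if the module cannot be loaded.
sub make_parser {
    return unless eval { require Text::CSV; 1 };
    return Text::CSV->new({ binary => 1 });
}

# Hypothetical helper: returns an array ref of fields,
# or nothing if the parser rejects the line.
sub parse_line {
    my ($csv, $line) = @_;
    return $csv->parse($line) ? [ $csv->fields ] : ();
}

my $csv = make_parser();
if ($csv) {
    for my $line ('S,"BELT,FAN",003541547,', 'S,"D" REL VALVE ASSY,000771881,') {
        my $fields = parse_line($csv, $line);
        if ($fields) {
            print join(' ', map { "[$_]" } @$fields), "\n";
        }
        else {
            # error_diag in string context gives the diagnostic message
            print "rejected: $line (" . $csv->error_diag . ")\n";
        }
    }
}
else {
    print "Text::CSV not found in $Bin/lib or elsewhere in \@INC\n";
}
```

With default settings the parser may reject rows with stray quotes, such as the "D" REL VALVE ASSY line, rather than guess at them; the module documents options for looser quoting, which are worth reading before relaxing anything.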

Please consider this before trying to write your own CSV implementation.

Text::CSV is over a hundred lines of code, including fixes for bugs and edge cases, and rewriting it from scratch will just make you learn the hard way how awful CSV can be.

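Note: I learnt this the hard way. It took me a full day to get a working CSV parser in PHP, and then I discovered that a built-in one had been added in a later version. It really is an awful thing.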
