Neatest way to remove linebreaks in Perl

Question

I'm maintaining a script that can get its input from various sources, and works on it per line. Depending on the actual source used, linebreaks might be Unix-style, Windows-style or even, for some aggregated input, mixed(!).

When reading from a file it goes something like this:

my @lines = <IN>;
process(\@lines);

...

sub process {
    my $lines = shift;    # reference to the array of lines
    foreach my $line (@{$lines}) {
        chomp $line;
        # Handle line by line
    }
}

So, what I need to do is replace the chomp with something that removes either Unix-style or Windows-style linebreaks. I'm coming up with way too many ways of solving this, one of the usual drawbacks of Perl :)

What's your opinion on the neatest way to chomp off generic linebreaks? What would be the most efficient?

A small clarification - the method 'process' gets a list of lines from somewhere, not necessarily read from a file. Each line might have:

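  • No trailing linebreak
  • A Unix-style linebreak
  • A Windows-style linebreak
  • Just a carriage return (when the original data has Windows-style linebreaks and is read with $/ = "\n")
  • An aggregated set in which lines have different styles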
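
To make the cases above concrete, here is a small sketch showing what plain chomp leaves behind when $/ is the default "\n". The sample strings and the helper name chomp_copy are made up for illustration and are not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample lines, one per single-line case in the list above.
my %samples = (
    none    => "text",
    unix    => "text\n",
    windows => "text\r\n",
    cr_only => "text\r",
);

# Hypothetical helper: chomp a copy and return it, leaving the input untouched.
sub chomp_copy {
    my ($line) = @_;
    chomp $line;    # removes only a trailing $/ (default "\n")
    return $line;
}

for my $name (sort keys %samples) {
    my $result = chomp_copy($samples{$name});
    # Report whether a stray carriage return survived the chomp.
    printf "%-8s %s\n", $name, $result =~ /\r\z/ ? "still ends in CR" : "clean";
}
```

With the default record separator, the Windows-style and CR-only samples keep their carriage return, which is the gap a chomp replacement has to close.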

Recommended answer

After digging through the perlre docs a bit, I'll present my best suggestion so far, which seems to work pretty well. Perl 5.10 added \R as a generalized linebreak:

$line =~ s/\R//g;

\R is the same as:

(?>\x0D\x0A?|[\x0A-\x0C\x85\x{2028}\x{2029}])
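
As a rough sketch of how this could slot into the loop from the question: the s/\R//g form removes every linebreak in the string, including embedded ones, so anchoring with \z is one option when only the trailing linebreak should go. The helper names strip_eol and strip_eol_legacy are hypothetical, and the second pattern is an assumed fallback for Perl older than 5.10 that covers only CRLF, LF, and a lone CR.

```perl
use strict;
use warnings;
use 5.010;    # \R requires Perl 5.10 or later

# Hypothetical helper: remove one trailing generic linebreak, if present.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\R\z//;
    return $line;
}

# Assumed fallback for older Perls: handles only CRLF, LF, and a lone CR.
sub strip_eol_legacy {
    my ($line) = @_;
    $line =~ s/(?:\r\n|\n|\r)\z//;
    return $line;
}

sub process {
    my ($lines) = @_;    # array reference, as in the question
    return [ map { strip_eol($_) } @{$lines} ];
}

# Mixed input: Unix, Windows, CR-only, and no linebreak at all.
my $cleaned = process([ "a\n", "b\r\n", "c\r", "d" ]);
print join("|", @{$cleaned}), "\n";
```

Returning a cleaned copy rather than modifying the caller's array is a design choice made for this sketch. Modifying $line in place inside the foreach loop, as the question does, would work the same way.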

I'll keep this question open a while yet, just to see if there are more nifty ways waiting to be suggested.
