In Perl, how to match two consecutive Carriage Returns?

Question

Hi StackOverflow buddies,

I'm on a Windows platform. I have a data file, but something went wrong and (I don't know why) every "Carriage Return + New Line" combination became "Carriage Return + Carriage Return + New Line", (190128 edit:) for example:

When viewing the file as plain text, it is:

When viewing the same file in hex mode, it is:

For practical purposes I need to remove the extra "0D" from each double "0D", as in ".... 30 30 0D 0D 0A 30 30 ....", and change it to ".... 30 30 0D 0A 30 30 ....".

190129 edit: Besides, to ensure that my problem can be reproduced, I uploaded my data file to GitHub (download and unzip it before use; in a binary / hex editor you can see 0D 0D 0A in the first line): https://github.com/katyusza/hello_world/blob/master/ram_init.zip

I used the following Perl script to remove the extra Carriage Return, but to my astonishment my regex just does NOT work!! My entire code is (190129 edit: the entire Perl script is pasted here):

use warnings            ;
use strict              ;
use File::Basename      ;

#-----------------------------------------------------------
# command line handling, file open \ create
#-----------------------------------------------------------

# Capture input filename from command line:
my $input_fn = $ARGV[0] or
die "Should provide input file name at command line!\n";

# Parse input file name, and generate output file name:
my ($iname, $ipath, $isuffix) = fileparse($input_fn, qr/\.[^.]*/);
my $output_fn = $iname."_pruneNonPrintable".$isuffix;

# Open input file:
open (my $FIN, "<", $input_fn) or die "Open file error $!\n";

# Create output file:
open (my $FO, ">", $output_fn) or die "Create file error $!\n";


#-----------------------------------------------------------
# Read input file, search & replace, write to output
#-----------------------------------------------------------

# Read all lines in one go:
$/ = undef;

# Read entire file into variable:
my $prune_txt = <$FIN> ;

# Do match & replace:
 $prune_txt =~ s/\x0D\x0D/\x0D/g;          # do NOT work.
# $prune_txt =~ s/\x0d\x0d/\x30/g;          # do NOT work.
# $prune_txt =~ s/\x30\x0d/\x0d/g;          # can work.
# $prune_txt =~ s/\x0d\x0d\x0a/\x0d\x0a/gs; # do NOT work.

# Write the processed text to the output file:
print $FO $prune_txt  ;

# Close files:
close($FIN)     ;
close($FO)      ;

I did everything I could to match two consecutive Carriage Returns, but failed. Can anyone please point out my mistake, or tell me the right way to go? Thanks in advance!

Recommended answer

On Windows, file handles are given a :crlf layer by default.

  • This layer converts CR LF to LF when reading.
  • This layer converts LF to CR LF when writing.
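
The effect of the layer on the bytes from the question can be checked with a small experiment. The following is only a sketch: the helper names slurp_with_layer and hex_dump are made up for illustration, and the :crlf layer is requested explicitly (an assumption, so that the script behaves the same on systems where it is not the default).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file through the given PerlIO layer and return its content.
# (Hypothetical helper, not part of the original script.)
sub slurp_with_layer {
    my ($path, $layer) = @_;
    open(my $fh, "<$layer", $path) or die "Cannot open $path: $!\n";
    local $/ = undef;                  # slurp mode, restored on return
    my $content = <$fh>;
    close($fh);
    return $content;
}

# Render a string as space-separated hex bytes.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

# Write the damaged sequence "00 CR CR LF 00" with no translation at all.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode($tmp_fh, ':raw');
print {$tmp_fh} "00\x0D\x0D\x0A" . "00";
close($tmp_fh);

print 'raw:  ', hex_dump(slurp_with_layer($tmp_path, ':raw')),  "\n";
print 'crlf: ', hex_dump(slurp_with_layer($tmp_path, ':crlf')), "\n";
```

Read through :raw, both 0D bytes are still present. Read through :crlf, the trailing 0D 0A pair has already been collapsed to 0A, so only one 0D is left in the string and a pattern such as \x0D\x0D has nothing to match.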

Solution 1: Compensate for the :crlf layer.

You'd use this solution if you want to end up with system-appropriate line endings.

# ... read ...      # CR CR LF ⇒ CR LF
s/\r+\n/\n/g;       # CR LF    ⇒ LF
# ... write ...     # LF       ⇒ CR LF
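
Filled in, Solution 1 might look roughly like the sketch below. The sub name fix_line_endings_crlf_layer is hypothetical, and the :crlf layer is spelled out in the open modes, which is an assumption made so the sketch does not depend on the platform default; on Windows the plain < and > modes from the question should behave the same way.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse "CR+ LF" while reading and writing
# through the :crlf layer.
sub fix_line_endings_crlf_layer {
    my ($input_fn, $output_fn) = @_;

    # Assumption: :crlf is requested explicitly instead of relying on
    # the Windows default.
    open(my $fin,  '<:crlf', $input_fn)  or die "Open file error: $!\n";
    open(my $fout, '>:crlf', $output_fn) or die "Create file error: $!\n";

    local $/ = undef;            # slurp mode, restored when the sub returns
    my $text = <$fin>;           # CR CR LF => CR LF (done by the layer)
    $text =~ s/\r+\n/\n/g;       # CR LF    => LF
    print {$fout} $text;         # LF       => CR LF (done by the layer)

    close($fin);
    close($fout) or die "Close file error: $!\n";
    return;
}

# Run only when called as: perl script.pl INPUT OUTPUT
fix_line_endings_crlf_layer(@ARGV) if @ARGV == 2;
```

Using local on $/ keeps slurp mode confined to the sub, unlike the global assignment in the question's script.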

Solution 2: Remove the :crlf layer.

You'd use this solution if you want to end up with CR LF unconditionally.

Use <:raw and >:raw instead of < and > as the open mode.

# ... read ...      # CR CR LF ⇒ CR CR LF
s/\r*\n/\r\n/g;     # CR CR LF ⇒ CR LF
# ... write ...     # CR LF    ⇒ CR LF
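
As a complete script, Solution 2 could be sketched as follows. The sub name fix_line_endings_raw is hypothetical; the open modes and the substitution are the ones given above.

```perl
use strict;
use warnings;

# Hypothetical helper: normalise every line ending to exactly CR LF,
# with no layer translating bytes on the way in or out.
sub fix_line_endings_raw {
    my ($input_fn, $output_fn) = @_;

    open(my $fin,  '<:raw', $input_fn)  or die "Open file error: $!\n";
    open(my $fout, '>:raw', $output_fn) or die "Create file error: $!\n";

    local $/ = undef;            # slurp mode, restored when the sub returns
    my $text = <$fin>;           # CR CR LF => CR CR LF (bytes untouched)
    $text =~ s/\r*\n/\r\n/g;     # CR CR LF => CR LF
    print {$fout} $text;         # CR LF    => CR LF (bytes untouched)

    close($fin);
    close($fout) or die "Close file error: $!\n";
    return;
}

# Run only when called as: perl script.pl INPUT OUTPUT
fix_line_endings_raw(@ARGV) if @ARGV == 2;
```

Because the pattern is \r*\n rather than \r+\n, a bare LF in the input is also rewritten as CR LF, which is what "unconditionally" implies.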
