Remove single line breaks, keep "empty" lines


Problem description

Say I have text like the following text selected with the cursor:

This is a test. 
This 
is a test.

This is a test. 
This is a 
test.

I would like to transform it into:

This is a test. This is a test.

This is a test. This is a test.

In other words, I would like to replace single line breaks by spaces, leaving empty lines alone.

I thought something like the following would work:

RemoveSingleLineBreaks()
{
  ClipSaved := ClipboardAll
  Clipboard =
  send ^c
  Clipboard := RegExReplace(Clipboard, "([^(\R)])(\R)([^(\R)])", "$1$3")    
  send ^v
  Clipboard := ClipSaved
  ClipSaved = 
}

But it doesn't. If I apply it to the text above, it yields:

This is a test. This is a test.
This is a test. This is a test.

which also removed the "empty line" in the middle. This is not what I want.

To clarify: By an empty line I mean any line with "white" characters (e.g. tabs or white spaces)

Any thoughts how to do this?

Solution

RegExReplace(Clipboard, "([^\r\n])\R(?=[^\r\n])", "$1")

This strips single line breaks, assuming the newline token ends in either a CR or an LF (e.g. CR, LF, CR+LF, LF+CR). It does not treat whitespace-only lines as empty. The pattern has only one capture group, so the replacement needs only $1.

Your main problem was the use of \R:

\R inside a character class is merely the letter "R" [source]

The solution is to use the CR and LF characters directly.
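To make the point about \R concrete, here is a minimal sketch, assuming AutoHotkey v1.1 syntax and using a made-up test string. It also suggests why the original pattern removed the blank line: a class such as [^(\R)] excludes only "(", "R" and ")", so it can still match a newline character on either side of the line break.

```autohotkey
; Sketch for AutoHotkey v1.1 (assumed). Inside [...] the escape \R is not a
; newline matcher, so the class [^(\R)] only excludes "(", "R" and ")".
MsgBox % RegExReplace("aRb", "[^(\R)]", "_")   ; "a" and "b" are replaced, "R" is kept

; Listing CR and LF explicitly gives the intended "any non-newline character".
MsgBox % RegExReplace("aRb", "[^\r\n]", "_")   ; every character is replaced
```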


Regarding your clarification ("By an empty line I mean any line with 'white' characters (e.g. tabs or white spaces)"):

RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1")

This is the same as the one above, but it treats whitespace-only lines as empty. It works because it requires a non-whitespace character (\S) on the line before the break and another on the line after it. The non-greedy .*? on each side spans the remaining characters between that \S and the line break, and cannot cross into another line because . does not match line breaks by default.

A lookahead is used to avoid 'eating' (matching) the next character, which can break on single-character lines. Note that since it is not matched, it is not replaced and we can leave it out of the replacement string. A lookbehind cannot be used because PCRE does not support variable-length lookbehinds, so a normal capture group and backreference are used there instead.


Regarding "I would like to replace single line breaks by spaces, leaving empty lines alone":

If you want to replace the line break with a space, this is more appropriate:

RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1 ")

This will replace single line breaks with a space.
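Putting this together with the function from the question, a minimal sketch could look like the following. It assumes AutoHotkey v1.1. The ClipWait timeout and the Sleep delay are additions for illustration, not part of the original answer, and their values are guesses that may need tuning.

```autohotkey
; Sketch only (AutoHotkey v1.1 assumed): joins wrapped lines in the current
; selection while leaving empty or whitespace-only lines in place.
RemoveSingleLineBreaks()
{
    ClipSaved := ClipboardAll    ; back up the entire clipboard
    Clipboard := ""              ; empty it so ClipWait can detect the copy
    Send ^c
    ClipWait, 1                  ; assumed timeout: wait up to 1 second for text
    if ErrorLevel                ; nothing was copied, so restore and give up
    {
        Clipboard := ClipSaved
        return
    }
    ; Replace a line break with a space only when both neighbouring lines
    ; contain at least one non-whitespace character.
    Clipboard := RegExReplace(Clipboard, "(\S.*?)\R(?=.*?\S)", "$1 ")
    Send ^v
    Sleep 100                    ; assumed delay so the paste finishes first
    Clipboard := ClipSaved       ; restore the original clipboard
    ClipSaved := ""              ; free the saved copy
}
```

Because the capture group keeps everything up to the line break, a line that already ends in a space will end up with two spaces at the join. Clipboard text on Windows normally uses CR+LF line endings; behaviour on LF-only input has not been checked here.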


And if you wanted to use lookbehinds and lookaheads:


Strip single line breaks:

RegExReplace(Clipboard, "(?<=[^\r\n\t ][^\r\n])\R(?=[^\r\n][^\r\n\t ])", "")


Replace single line breaks with spaces:

RegExReplace(Clipboard, "(?<=[^\r\n\t ][^\r\n])\R(?=[^\r\n][^\r\n\t ])", " ")

For some reason, \S doesn't seem to work in lookbehinds and lookaheads. At least, not with my testing.
