Remove spaces between words of a certain length

Question

I have strings of the following variety:

A B C Company
XYZ Inc
S & K Co

I would like to remove the spaces in these strings, but only those between one-letter words. For example, in the first string I want to remove the spaces between A, B, and C, but not the one between C and Company. The result should be:

ABC Company
XYZ Inc
S&K Co

What is the proper regular expression to use in gsub for this?

Recommended answer

Here is one way to do this. Because & is mixed in and is not a word character, the pattern tests for non-whitespace (\S) rather than word characters (\w):

x <- c('A B C Company', 'XYZ Inc', 'S & K Co', 'A B C D E F G Company')
gsub('(?<!\\S\\S)\\s+(?=\\S(?!\\S))', '', x, perl=TRUE)
# [1] "ABC Company"     "XYZ Inc"         "S&K Co"          "ABCDEFG Company"
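
To see how the pattern behaves beyond the sample strings, the following sketch wraps it in a small helper and runs it on a few extra inputs. The function name squeeze_single_letters and the extra strings are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (the name is illustrative): wraps the gsub call above.
squeeze_single_letters <- function(x) {
  gsub('(?<!\\S\\S)\\s+(?=\\S(?!\\S))', '', x, perl = TRUE)
}

# Extra inputs, invented for this sketch.
y <- c('A  B\tC Ltd',   # runs of spaces and a tab between single letters
       'Vitamin C',     # single letter after a longer word
       'Plan B C',      # longer word followed by two single letters
       'AT & T Corp')   # two-letter word before single characters

squeeze_single_letters(y)
# [1] "ABC Ltd"    "Vitamin C"  "Plan BC"    "AT &T Corp"
```

The last case shows a limit of the approach: whitespace that follows a token of two or more characters is always kept, so 'AT & T' becomes 'AT &T' rather than 'AT&T'.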

Explanation:

First, a negative lookbehind asserts that the match is not immediately preceded by two consecutive non-whitespace characters. Then the pattern matches one or more whitespace characters. Finally, a lookahead asserts that a non-whitespace character follows and that the character after it is not a non-whitespace character (that is, it is either whitespace or the end of the string).

(?<!        # look behind to see if there is not:
  \S        #   non-whitespace (all but \n, \r, \t, \f, and " ")
  \S        #   non-whitespace (all but \n, \r, \t, \f, and " ")
)           # end of look-behind
\s+         # whitespace (\n, \r, \t, \f, and " ") (1 or more times)
(?=         # look ahead to see if there is:
  \S        #   non-whitespace (all but \n, \r, \t, \f, and " ")
  (?!       #   look ahead to see if there is not:
    \S      #     non-whitespace (all but \n, \r, \t, \f, and " ")
  )         #   end of look-ahead
)           # end of look-ahead
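
One way to check what each assertion contributes is to drop it and compare the results. The sketch below does that on two sample strings. The variable names and the first input string are invented for illustration, and the two shortened patterns are deliberately broken variants, not alternatives anyone recommended.

```r
x <- c('XYZ Inc A B', 'A B C Company')  # sample inputs for this sketch

full      <- '(?<!\\S\\S)\\s+(?=\\S(?!\\S))'  # pattern from the answer
no_behind <- '\\s+(?=\\S(?!\\S))'             # look-behind removed
no_inner  <- '(?<!\\S\\S)\\s+(?=\\S)'         # inner negative look-ahead removed

gsub(full, '', x, perl = TRUE)
# [1] "XYZ Inc AB"  "ABC Company"

# Without the look-behind, a single letter gets glued onto the longer
# word before it.
gsub(no_behind, '', x, perl = TRUE)
# [1] "XYZ IncAB"   "ABC Company"

# Without the inner look-ahead, a single letter gets glued onto the longer
# word after it.
gsub(no_inner, '', x, perl = TRUE)
# [1] "XYZ Inc AB" "ABCCompany"
```

So the look-behind guards the left side of the whitespace and the nested look-ahead guards the right side. Both are needed for the space to be removed only when it sits between one-character tokens.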
