Regex to split on whitespace but not escaped whitespace


Problem description

I want to split on standard whitespace " " but not escaped whitespace "\ "

For example, with the string 'my name\ is\ max' (single quotes so \ is literal)

I want to get ["my", "name\ is\ max"]

I've tried this regex: /[^\\]\s/

but the result is this:

=> ["m", "name\\ is\\ max"]

This is close, but I don't know how to keep the y in "my".


Edit

As another example consider this string:

"./db/users/WGDl-HATof-uhdtT7sPfog: email=maxpleaner@gmail.com name=max\\ p"

I want to split it into three:

[
  "./db/users/WGDl-HATof-uhdtT7sPfog:",
  "email=maxpleaner@gmail.com",
  "name=max\\ p"
]

Solution

Regarding

I'm trying to split on whitespace that is not preceded by a backslash.

If you only care about a backslash directly before the whitespace and there are no other special cases to consider, put a negative lookbehind (?<!\\) before \s+:

s.split(/(?<!\\)\s+/)

Here, \s+ matches one or more whitespace characters, but only where the run is not immediately preceded by a backslash. (?<!\\) is a negative lookbehind: it checks the text immediately to the left of the current position, and if that text is a backslash, the match fails at that position.
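As a quick check, the lookbehind split can be run against both strings from the question. This is only a sketch: the helper name split_unescaped is made up for illustration and is not part of the answer.

```ruby
# Hypothetical helper (the name is illustrative only): split on runs of
# whitespace that are not immediately preceded by a backslash.
def split_unescaped(str)
  str.split(/(?<!\\)\s+/)
end

# Single quotes: each \ before a space is a literal backslash.
p split_unescaped('my name\ is\ max')

# Double quotes: "\\" is one literal backslash.
p split_unescaped("./db/users/WGDl-HATof-uhdtT7sPfog: email=maxpleaner@gmail.com name=max\\ p")
```

The first call should give two elements and the second three. Note that the backslashes stay in the result; the split only decides where to cut.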

If there can be several whitespace characters in a row, and escape sequences need to be handled as such, match the tokens instead of splitting:

s.scan(/(?:[^\s\\]|\\.)+/) 


Here, (?:[^\s\\]|\\.)+ matches one or more occurrences of either a character other than a backslash or whitespace ([^\s\\]) or an escape sequence, that is, a backslash followed by any character (\\.). Add the /m modifier to make . match line break characters too.
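The difference between the two approaches shows up when a backslash is itself escaped. The sketch below assumes a hypothetical helper named tokenize (not from the answer) and compares it with the lookbehind split on a string that contains two backslashes followed by a space.

```ruby
# Hypothetical helper (the name is illustrative only): collect tokens made of
# non-whitespace, non-backslash characters or backslash-escaped characters.
def tokenize(str)
  str.scan(/(?:[^\s\\]|\\.)+/)
end

# Escaped spaces stay inside one token, and runs of spaces act as one separator.
p tokenize('my   name\ is\ max')

# 'a\\\\ b' in single quotes is: a, backslash, backslash, space, b.
# scan reads the two backslashes as one escape sequence, so the space separates.
p tokenize('a\\\\ b')

# The lookbehind version only sees a backslash before the space and does not split.
p 'a\\\\ b'.split(/(?<!\\)\s+/)
```

Another practical difference: scan never returns an empty leading element when the string starts with whitespace, whereas split with a regex does.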
