Parsing strings: extracting words and phrases [JavaScript]

Problem description

I need to support exact phrases (enclosed in quotes) in an otherwise space-separated list of terms. Thus splitting the respective string by the space-character is not sufficient anymore.

Example:

input : 'foo bar "lorem ipsum" baz'
output: ['foo', 'bar', 'lorem ipsum', 'baz']

I wonder whether this could be achieved with a single RegEx, rather than performing complex parsing or split-and-rejoin operations.

Any help is greatly appreciated!

Recommended answer

var str = 'foo bar "lorem ipsum" baz';  
var results = str.match(/("[^"]+"|[^"\s]+)/g);

... returns the array you're looking for.
Note, however:


  • Bounding quotes are included, so they can be removed with replace(/^"([^"]+)"$/,"$1") on each result.
  • Spaces between the quotes will stay intact. So, if there are three spaces between lorem and ipsum, they'll be in the result. You can fix this by running replace(/\s+/g," ") on each result (the g flag collapses every run of whitespace, not just the first).
  • If there's no closing " after ipsum (i.e. an incorrectly-quoted phrase), the opening quote is skipped and you'll end up with: ['foo', 'bar', 'lorem', 'ipsum', 'baz']
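
The notes above can be combined into a single helper. This is only a minimal sketch: the function name parseTerms is hypothetical (it does not appear in the original answer), and the g flag on the whitespace pattern is an addition so that every run of whitespace inside a phrase is collapsed rather than only the first.

```javascript
// Hypothetical helper (not part of the original answer):
// split a search string into plain words and quoted phrases.
function parseTerms(str) {
  // match() returns null when nothing matches, so fall back to an empty array.
  var tokens = str.match(/("[^"]+"|[^"\s]+)/g) || [];
  return tokens.map(function (token) {
    return token
      .replace(/^"([^"]+)"$/, "$1") // strip the bounding quotes, if present
      .replace(/\s+/g, " ");        // collapse each whitespace run to one space
  });
}

console.log(parseTerms('foo bar "lorem   ipsum" baz'));
// → [ 'foo', 'bar', 'lorem ipsum', 'baz' ]
```

An unbalanced quote is not reported as an error here. As described above, the stray quote is simply dropped and the remaining words come back as separate terms, which may or may not be the behavior you want.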
