Find multiple quoted words in a string with regex

Question

My app supports 5 languages. I have a string which has some double quotes in it. This string is translated into 5 languages in the localizable.strings files.

Example:

title_identifier = "Hi \"how\", are \"you\"";

I would like to make "how" and "you" bold in this string by finding the ranges of these words. So I am trying to extract the quoted words from the string; the result would be an array containing "how" and "you", or their ranges.

func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex)
        let results = regex.matches(in: text,
                                    range: NSRange(text.startIndex..., in: text))
        return results.map {
            String(text[Range($0.range, in: text)!])
        }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

matches(for: "(?<=\")[^\"]*(?=\")", in: str)

The result is ["how", ", are ", "you"] rather than ["how", "you"]. I think the regex needs something extra so that, once a pair of quotes has been matched, the search resumes after the closing quote and skips the text between the quoted words.

Recommended answer

Your problem is the use of lookarounds, which do not consume text: they only check whether their patterns match at the current position. With your regex, the text ", are " matches because the closing " of the previous match was not consumed. The regex index stayed right after the w in how, so that same " could then serve as the opening quote of the next match. You need a consuming pattern here: "([^"]*)".
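
The difference between the two patterns can be seen side by side in a minimal sketch. The helper name fullMatches and the variable names are hypothetical, introduced only for this illustration; the helper returns the full match of every hit, as the code in the question does.

```swift
import Foundation

// Hypothetical helper for illustration: returns the full match of every hit.
func fullMatches(of pattern: String, in text: String) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let whole = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: whole).compactMap { match in
        Range(match.range, in: text).map { String(text[$0]) }
    }
}

let sample = "Hi \"how\", are \"you\""

// Lookarounds leave the quotes unconsumed, so a closing quote
// can also act as the opening quote of the next match.
let viaLookarounds = fullMatches(of: "(?<=\")[^\"]*(?=\")", in: sample)

// The consuming pattern includes both quotes in each match,
// so the quotes are paired up correctly.
let viaConsuming = fullMatches(of: "\"[^\"]*\"", in: sample)

print(viaLookarounds) // ["how", ", are ", "you"]
print(viaConsuming)   // ["\"how\"", "\"you\""]
```

With the consuming pattern the quotes are part of each full match, which is why they still have to be trimmed or skipped via a capture group.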

However, your code will only return full matches. You can just trim the first and last "s here with .map {$0.trimmingCharacters(in: ["\""])}, as the regex only matches one quote at the start and end:

matches(for: "\"[^\"]*\"", in: str).map {$0.trimmingCharacters(in: ["\""])}

Alternatively, access Group 1 value by appending (at: 1) after $0.range:

func matches(for regex: String, in text: String) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex)
        let results = regex.matches(in: text,
                                    range: NSRange(text.startIndex..., in: text))
        return results.map {
            String(text[Range($0.range(at: 1), in: text)!])
        }
    } catch let error {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}

let str = "Hi \"how\", are \"you\""
print(matches(for: "\"([^\"]*)\"", in: str))
// => ["how", "you"]
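
Since the question also asks for the ranges of the quoted words, the same Group 1 approach can return NSRange values instead of strings. This is only a sketch under assumptions: the function name quotedRanges is hypothetical and not part of the original answer, and the ranges refer to the string with its quotes still in place.

```swift
import Foundation

// Hypothetical helper: returns the NSRange of capture group 1 for every match.
func quotedRanges(in text: String) -> [NSRange] {
    // The pattern is a constant known to be valid, so try! is acceptable in a sketch.
    let regex = try! NSRegularExpression(pattern: "\"([^\"]*)\"")
    let whole = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: whole).map { $0.range(at: 1) }
}

let title = "Hi \"how\", are \"you\""
let ranges = quotedRanges(in: title)

// Each range covers the word between the quotes, not the quotes themselves.
for range in ranges {
    print(range.location, range.length, (title as NSString).substring(with: range))
}
```

NSRange values are expressed in UTF-16 units, which is the form NSMutableAttributedString expects when attributes such as a bold font are added to a range.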
