In Objective C, what's the best way to extract multiple substrings of text around multiple patterns?


Question

For one NSString, I have N pattern strings. I'd like to extract substrings "around" the pattern matches.

So, if I have "the quick brown fox jumped over the lazy dog" and my patterns are "brown" and "lazy", I would like to get "quick brown fox" and "the lazy dog". However, the substrings don't necessarily need to be delimited by whitespace.

Another example: suppose you have multiple paragraphs of text and want to find all instances of "red" and "blue", showing each instance in context. By "context" I mean that it doesn't matter whether the context starts or ends on a word boundary. If one of the sentences in the text is "there are a whole lot of red ducks in the trees", the result could be "whole lot of red ducks in" or "ole lot of red ducks in th" and either would be fine. I'm not looking for a whitespace-based solution. It could simply be: find "red" and take the substring consisting of "red" plus the 10 characters before and the 10 characters after.

In other words, there are some "range" based string matching functions. I was hoping there was an easy way to match multiple strings at once and return each string's matching point plus surrounding characters.
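The fixed-width context described above (the match plus 10 characters on either side) can be sketched with Foundation's range-based search alone. This is only an illustration of the idea, not a recommended solution: the function name `ContextAroundOccurrences` and its parameters are hypothetical, and it handles several patterns simply by searching for each one in turn.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: for every occurrence of every pattern, return the
// occurrence plus up to `radius` characters before and after it.
static NSArray<NSString *> *ContextAroundOccurrences(NSString *text,
                                                     NSArray<NSString *> *patterns,
                                                     NSUInteger radius) {
    NSMutableArray<NSString *> *snippets = [NSMutableArray array];
    for (NSString *pattern in patterns) {
        if (pattern.length == 0) {
            continue; // nothing to search for
        }
        NSRange searchRange = NSMakeRange(0, text.length);
        while (searchRange.length > 0) {
            NSRange hit = [text rangeOfString:pattern options:0 range:searchRange];
            if (hit.location == NSNotFound) {
                break;
            }
            // Clamp the context window to the bounds of the text.
            NSUInteger start = hit.location > radius ? hit.location - radius : 0;
            NSUInteger end = MIN(NSMaxRange(hit) + radius, text.length);
            [snippets addObject:[text substringWithRange:NSMakeRange(start, end - start)]];

            // Resume the search just past this occurrence.
            searchRange.location = NSMaxRange(hit);
            searchRange.length = text.length - searchRange.location;
        }
    }
    return snippets;
}
```

Because the window is measured in UTF-16 code units, it can cut through a composed character sequence at either edge; that matches the "context boundaries don't matter" requirement but may matter for display.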

Recommended answer

You could use regular expressions provided by a third party framework (e.g. RegexKit or RegexKitLite). To create the RE, join the patterns with "|", wrap that alternation in parentheses, and prepend and append patterns that capture the context. Then match the string against the resulting regexp.

Some example prefix & suffix patterns:


  • ".{0,15}(", ").{0,15}" to match up to 15 characters of context on each side
  • "(\w+\W+){0,4}(", ")(\W+\w+){0,4}" to match up to 4 words of context on each side
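As a rough sketch of how the pieces fit together, the following builds the combined expression from the character-based prefix and suffix and collects each match with its context. It assumes Foundation's built-in `NSRegularExpression` (iOS 4.0+ / OS X 10.7+) in place of the third-party frameworks named in the answer, and the function name `SnippetsAroundPatterns` and its parameters are made up for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every match of any pattern together with
// up to `radius` characters of context on each side.
static NSArray<NSString *> *SnippetsAroundPatterns(NSString *text,
                                                   NSArray<NSString *> *patterns,
                                                   NSUInteger radius) {
    // Escape each pattern so it is matched literally
    // (assumption: the patterns are plain strings, not regexes).
    NSMutableArray<NSString *> *escaped = [NSMutableArray array];
    for (NSString *pattern in patterns) {
        [escaped addObject:[NSRegularExpression escapedPatternForString:pattern]];
    }

    // Build ".{0,N}(p1|p2|...).{0,N}".
    NSString *source = [NSString stringWithFormat:@".{0,%lu}(%@).{0,%lu}",
                        (unsigned long)radius,
                        [escaped componentsJoinedByString:@"|"],
                        (unsigned long)radius];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:source
                                                  options:NSRegularExpressionDotMatchesLineSeparators
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Could not compile %@: %@", source, error);
        return @[];
    }

    NSMutableArray<NSString *> *snippets = [NSMutableArray array];
    [regex enumerateMatchesInString:text
                            options:0
                              range:NSMakeRange(0, text.length)
                         usingBlock:^(NSTextCheckingResult *result,
                                      NSMatchingFlags flags,
                                      BOOL *stop) {
        // result.range covers the context plus the pattern;
        // [result rangeAtIndex:1] would be the matched pattern alone.
        [snippets addObject:[text substringWithRange:result.range]];
    }];
    return snippets;
}
```

One consequence of this design is that matches do not overlap: when two pattern occurrences lie within `radius` characters of each other, the first match's trailing context swallows the second occurrence, so they come back as a single snippet.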
