Extract the locations of parts of a string and associate information with those parts


Problem description

I have a string that contains some kind of metadata "describing" parts of the string.
Example:

This is an {TypeAStart}arbitrary long{TypeAEnd} text which has {TypeBStart}various{TypeBEnd} usages  

What I want is to get the indexes of the strings that are contained inside these tags.
My first thought was something like:

String[] tags = { "{TypeA", "{TypeB" /* , etc. */ };
for (String tag : tags) {
    int start = mainString.indexOf(tag + "Start");
    if (start != -1) {
        int end = mainString.indexOf(tag + "End}", start);
        // store start and end somewhere
    }
}

But I think this approach is hacky and error-prone. How can I do this efficiently with regular expressions?

UPDATE:
As mentioned, the metadata in this string marks where styling is to be applied.
So for the example string above, I need a separate data structure that "describes" that TypeA applies to the substring "arbitrary long", which spans positions 23-36.

My main goal is to have the original string stripped of the tags, with the indexes and styles kept in a separate data structure.

Solution

Something like this should work. However, it assumes that the tags are correctly paired; otherwise a match will return the full content, including any incorrectly embedded tag.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern pattern = Pattern.compile("\\{((\\w+)Start)\\}(.*?)\\{(\\2End)\\}");
Matcher matcher = pattern.matcher(input);  // matcher(), not matches()

while (matcher.find()) {
    // group 1: opening tag text, e.g. "TypeAStart"; start(1)/end(1) give its indexes
    // group 4: closing tag text, e.g. "TypeAEnd"; start(4)/end(4) give its indexes
    // group 3: the tagged content; start(3)/end(3) give its indexes
    // group 2: just the tag name, e.g. "TypeA"
    String result = matcher.group(3);
    // do something with result
}
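
The question's stated goal (the string stripped of its tags, plus a separate structure holding indexes and styles) can be built on the same regex idea. The following is only a minimal sketch under the same assumption as above, namely that tags are correctly paired and not nested. The names TagStripper, Span and Result are hypothetical and introduced purely for illustration, and the recorded indexes refer to the stripped text with an exclusive end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
final class TagStripper {

    /** One styled range in the stripped text; end is exclusive. */
    static final class Span {
        final String type;
        final int start;
        final int end;

        Span(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** The stripped text together with the spans found in it. */
    static final class Result {
        final String text;
        final List<Span> spans;

        Result(String text, List<Span> spans) {
            this.text = text;
            this.spans = spans;
        }
    }

    // group 1: tag name (e.g. "TypeA"), group 2: tagged content
    private static final Pattern TAGGED =
            Pattern.compile("\\{(\\w+)Start\\}(.*?)\\{\\1End\\}", Pattern.DOTALL);

    static Result strip(String input) {
        StringBuilder out = new StringBuilder();
        List<Span> spans = new ArrayList<>();
        Matcher m = TAGGED.matcher(input);
        int last = 0;
        while (m.find()) {
            // copy the untagged text that precedes this match
            out.append(input, last, m.start());
            int start = out.length();
            // copy the content without its surrounding tags
            out.append(m.group(2));
            spans.add(new Span(m.group(1), start, out.length()));
            last = m.end();
        }
        // copy whatever follows the last match
        out.append(input, last, input.length());
        return new Result(out.toString(), spans);
    }

    public static void main(String[] args) {
        Result r = strip("This is an {TypeAStart}arbitrary long{TypeAEnd} text "
                + "which has {TypeBStart}various{TypeBEnd} usages");
        System.out.println(r.text);
        for (Span s : r.spans) {
            System.out.println(s.type + " " + s.start + ".." + s.end);
        }
    }
}
```

For the example string, this yields the text "This is an arbitrary long text which has various usages", with TypeA covering 11..25 and TypeB covering 41..48. These offsets differ from the 23-36 in the question because they are measured after the tags have been removed.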
