How can I keep track of character positions after I remove elements from a string?

Views: 100 | Posted: 2015/11/30 21:46:20 | Tags: algorithm, language-agnostic, string

Problem description

Let us say I have the following string:

 "my ., .,dog. .jumps. , .and..he. .,is., .a. very .,good, .dog"  
  1234567890123456789012345678901234567890123456789012345678901 <-- char pos

Now, I have written a regular expression to remove certain elements from the string above, in this example, all whitespace, all periods, and all commas.

I am left with the following transformed string:

 "mydogjumpsandheisaverygooddog"
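
The question does not show the regular expression itself. As a minimal sketch, assuming a character class such as `[\s.,]` (whitespace, period, comma) and a hypothetical helper named `strip`, the transformation above could be reproduced like this:

```java
final class StripExample {
    // Hypothetical pattern: the question does not show its actual regular expression.
    // Inside a character class, '.' is a literal period, so no escaping is needed.
    static String strip(String source) {
        return source.replaceAll("[\\s.,]", "");
    }

    public static void main(String[] args) {
        String source = "my ., .,dog. .jumps. , .and..he. .,is., .a. very .,good, .dog";
        System.out.println(strip(source)); // prints mydogjumpsandheisaverygooddog
    }
}
```

Note that `replaceAll` returns only the stripped text; it discards where each surviving character came from, which is the root of the problem described below.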

Now, I want to construct k-grams of this string. If I were to take 5-grams of the above string, they would look like:

  mydog ydogj dogju ogjum gjump jumps umpsa ...
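
For reference, the k-grams listed above are what a sliding window of width k produces. A small sketch, with the helper name `kgrams` chosen only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

final class KgramWindow {
    // Slide a window of width k over the text, one character at a time.
    static List<String> kgrams(String text, int k) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + k <= text.length(); i++) {
            out.add(text.substring(i, i + k));
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints the 5-grams starting with: mydog, ydogj, dogju, ogjum, gjump, jumps, umpsa
        System.out.println(kgrams("mydogjumpsandheisaverygooddog", 5));
    }
}
```

A string of length n yields n - k + 1 k-grams, so the 29-character string above gives 25 5-grams.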

The problem I have is that for each k-gram, I want to keep track of its original character position in the first source text I listed.

So "mydog" would have a start position of 0 and an end position of 11. However, I have no mapping between the source text and the modified text, so I have no idea where a particular k-gram starts and ends in relation to the original, unmodified text. It is important for my program to keep track of this.

I am creating a list of k-grams like this:

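public class Kgram
{
    public int start;    // start position in the source text
    public int end;      // end position in the source text
    public String text;  // k-gram text after the modifications
}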

where start and end are positions in the source text (the first string above) and text is the text of the k-gram after the modifications.
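
One possible approach, offered as a sketch and not as an accepted answer: instead of stripping characters with a regular expression, filter the source one character at a time and record the original index of every character that is kept. Each k-gram can then look up its start and end in that index array. The names `KgramPositions`, `origIndex`, and `kgrams` are invented for this example, and the sketch assumes 0-based positions with an exclusive end, which matches the 0 and 11 given for "mydog".

```java
import java.util.ArrayList;
import java.util.List;

final class KgramPositions {
    // Same shape as the Kgram class in the question, with text held as a String.
    static final class Kgram {
        final int start;   // 0-based index in the source text, inclusive
        final int end;     // 0-based index in the source text, exclusive
        final String text; // k-gram text after the modifications

        Kgram(int start, int end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }
    }

    static List<Kgram> kgrams(String source, int k) {
        StringBuilder kept = new StringBuilder();
        // origIndex[j] is the position in source of the j-th kept character.
        int[] origIndex = new int[source.length()];
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            // Assumed filter: drop whitespace, periods, and commas.
            if (Character.isWhitespace(c) || c == '.' || c == ',') {
                continue;
            }
            origIndex[kept.length()] = i;
            kept.append(c);
        }

        List<Kgram> result = new ArrayList<>();
        for (int j = 0; j + k <= kept.length(); j++) {
            int start = origIndex[j];
            int end = origIndex[j + k - 1] + 1; // one past the last kept character
            result.add(new Kgram(start, end, kept.substring(j, j + k)));
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "my ., .,dog. .jumps. , .and..he. .,is., .a. very .,good, .dog";
        Kgram first = kgrams(source, 5).get(0);
        // prints mydog [0, 11)
        System.out.println(first.text + " [" + first.start + ", " + first.end + ")");
    }
}
```

The index array costs one int per kept character and makes each lookup constant time. If the regular expression has to stay, the same array could in principle be built from the match positions of the removed spans, but that variant is not shown here.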

Can anyone point me in the right direction for the best way to solve this problem?
