A better way to replace many strings - obfuscation in C#


Question

I'm trying to obfuscate a large amount of data. I've created a list of words (tokens) which I want to replace and I am replacing the words one by one using the StringBuilder class, like so:

 var sb = new StringBuilder(one_MB_string);
 foreach(var token in tokens)
 {
   sb.Replace(token, "new string");
 }

It's pretty slow! Are there any simple things that I can do to speed it up?

tokens is a list of about one thousand strings, each 5 to 15 characters in length.

Recommended answer

Instead of doing replacements in a huge string (which means that you move around a lot of data), work through the string and replace a token at a time.

Make a list containing the next index for each token, locate the token that comes first, then copy the text up to that token to the result, followed by the token's replacement. Then find the next occurrence of that token in the string to keep the list up to date. Repeat until no more tokens are found, then copy the remaining text to the result.

I made a simple test, and this method did 125,000 replacements on a 1,000,000-character string in 208 milliseconds.
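
The test itself is not shown in the answer. As a rough sketch of how a measurement like this could be set up, the helper below builds a repeatable input string and times a single call of any replacement function with Stopwatch. The names ReplaceTiming, MakeText, TimeMs and ReplaceOneByOne, the fixed random seed and the lowercase-only alphabet are all assumptions made for illustration; ReplaceOneByOne simply wraps the loop from the question so there is a baseline to compare against.

```csharp
using System;
using System.Diagnostics;
using System.Text;

// Hypothetical timing helper; not part of the original answer.
public static class ReplaceTiming
{
    // Builds a pseudo-random lowercase string; the fixed seed keeps runs repeatable.
    public static string MakeText(int length, int seed)
    {
        var rng = new Random(seed);
        var sb = new StringBuilder(length);
        for (int i = 0; i < length; i++)
        {
            sb.Append((char)('a' + rng.Next(26)));
        }
        return sb.ToString();
    }

    // Runs the replacement function once and returns the elapsed milliseconds.
    public static long TimeMs(Func<string, string> replace, string input, out string output)
    {
        var sw = Stopwatch.StartNew();
        output = replace(input);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    // Baseline from the question: one StringBuilder.Replace pass per token.
    public static string ReplaceOneByOne(string input, string[] tokens, string replacement)
    {
        var sb = new StringBuilder(input);
        foreach (var token in tokens)
        {
            sb.Replace(token, replacement);
        }
        return sb.ToString();
    }
}
```

Calling `ReplaceTiming.TimeMs(s => ReplaceTiming.ReplaceOneByOne(s, tokens, "new string"), text, out string result)` times the question's approach. Passing a lambda that calls the answer's TokenList.Replace instead times the single-pass method on the same input. Timings depend heavily on the machine, the runtime and how often the tokens occur, so the 208 ms figure should not be expected to reproduce exactly.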

Token and TokenList classes:

using System.Collections.Generic;
using System.Text;

public class Token {

    public string Text { get; private set; }
    public string Replacement { get; private set; }
    public int Index { get; set; }

    public Token(string text, string replacement) {
        Text = text;
        Replacement = replacement;
    }

}

public class TokenList : List<Token>{

    public void Add(string text, string replacement) {
        Add(new Token(text, replacement));
    }

    private Token GetFirstToken() {
        Token result = null;
        int index = int.MaxValue;
        foreach (Token token in this) {
            if (token.Index != -1 && token.Index < index) {
                index = token.Index;
                result = token;
            }
        }
        return result;
    }

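    public string Replace(string text) {
        StringBuilder result = new StringBuilder();
        // Find the first occurrence of every token (-1 means not found).
        foreach (Token token in this) {
            token.Index = text.IndexOf(token.Text);
        }
        int index = 0; // position in text up to which the result has been built
        Token next;
        while ((next = GetFirstToken()) != null) {
            if (index < next.Index) {
                // Copy the unchanged text between the previous match and this one.
                result.Append(text, index, next.Index - index);
                index = next.Index;
            }
            result.Append(next.Replacement);
            index += next.Text.Length;
            // Look for this token's next occurrence after the replaced span.
            next.Index = text.IndexOf(next.Text, index);
        }
        // Copy whatever follows the last match.
        if (index < text.Length) {
            result.Append(text, index, text.Length - index);
        }
        return result.ToString();
    }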

}

Example of usage:

string text =
    "This is a text with some words that will be replaced by tokens.";

var tokens = new TokenList();
tokens.Add("text", "TXT");
tokens.Add("words", "WRD");
tokens.Add("replaced", "RPL");

string result = tokens.Replace(text);
Console.WriteLine(result);

Output:

This is a TXT with some WRD that will be RPL by tokens.

Note: This code does not handle overlapping tokens. For example, if you have the tokens "pineapple" and "apple", the code doesn't work properly.


To make the code work with overlapping tokens, replace this line:

next.Index = text.IndexOf(next.Text, index);

with this code:

foreach (Token token in this) {
    if (token.Index != -1 && token.Index < index) {
        token.Index = text.IndexOf(token.Text, index);
    }
}
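
To see what the changed loop does with the "pineapple" and "apple" case, here is a compact, self-contained sketch of the same scan with the overlap fix applied. It is an illustration rather than the answer's own code: the class OverlapDemo, the method ReplaceAll and the use of key/value pairs in place of Token objects are hypothetical, and it assumes ordinal (culture-insensitive) matching and non-empty tokens, whereas the original calls IndexOf with its default comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical demo class; mirrors TokenList.Replace with the overlap fix.
public static class OverlapDemo
{
    public static string ReplaceAll(string text, IList<KeyValuePair<string, string>> pairs)
    {
        // next[i] is the next match position of token i, or -1 if there is none.
        var next = new int[pairs.Count];
        for (int i = 0; i < pairs.Count; i++)
        {
            // Assumption: an empty token would never advance the scan, so reject it.
            if (string.IsNullOrEmpty(pairs[i].Key))
                throw new ArgumentException("Tokens must be non-empty.");
            next[i] = text.IndexOf(pairs[i].Key, StringComparison.Ordinal);
        }

        var result = new StringBuilder();
        int index = 0; // everything before this position is already in result
        while (true)
        {
            // Pick the token whose next match starts earliest (list order breaks ties).
            int first = -1;
            for (int i = 0; i < next.Length; i++)
            {
                if (next[i] != -1 && (first == -1 || next[i] < next[first]))
                    first = i;
            }
            if (first == -1) break;

            // Copy the untouched text, then the replacement, and skip the token.
            result.Append(text, index, next[first] - index);
            result.Append(pairs[first].Value);
            index = next[first] + pairs[first].Key.Length;

            // Overlap fix: any match that starts inside the replaced span is
            // stale, so search again from the current position.
            for (int i = 0; i < next.Length; i++)
            {
                if (next[i] != -1 && next[i] < index)
                    next[i] = text.IndexOf(pairs[i].Key, index, StringComparison.Ordinal);
            }
        }
        result.Append(text, index, text.Length - index);
        return result.ToString();
    }
}
```

With the pairs ("pineapple", "PA") and ("apple", "AP"), calling ReplaceAll on "pineapple and apple" gives "PA and AP". Without the refresh loop, the stale match for "apple" inside the already replaced "pineapple" would be used and the text between the two words would be lost.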
