C# isolate elements of an array in a string with escape character \


Question

I have several arrays, for example:

string[] sArTrigFunctions = {"sin", "cos", "tan", "sinh", "cosh", "tanh", "cot", "sec", "csc", "arcsin", "arccos", "arctan", "coth", "sech", "csch"};
string[] sArGreek = { "alpha", "beta", "chi", "delta", "Delta", "epsi", "varepsilon", "eta", "gamma", "Gamma", "iota", "kappa", "lambda", "Lambda", "lamda", "Lamda", "mu", "nu", "omega", "Omega", "phi", "varphi", "Phi", "pi", "Pi", "psi", "Psi", "rho", "sigma", "Sigma", "tau", "theta", "vartheta", "Theta", "upsilon", "xi", "Xi", "zeta" };
string[] sArBinOp = {"lt","gt","eq","neq",.....}; // etc.

These array elements appear in a text file, mixed with each other or with other content of the file, for example: sintheta, altc. I want to escape these array elements in the file with \ so that sintheta becomes \sin\theta and altc becomes a\ltc.

A simple string.Replace(...) does not work. For example, if I run the following foreach loop on the sArTrigFunctions array and then on the sArGreek array, it changes sintheta in the file to \sinth\eta. If I sort the sArGreek elements in descending order of length so that theta comes before eta, the code first changes sintheta to \sin\theta and then to \sin\th\eta. Likewise, running the code on the sArBinOp array changes sindelta to sinde\lta, or, if we first run it on sArTrigFunctions and sArGreek and then on sArBinOp, sindelta becomes \sin\de\lta:

foreach (string s in sArGreek)
{
    strfileContent = strfileContent.Replace(s, "\\" + s);
}

Question: How can we programmatically ensure that, during the replace, an array element that sits inside another array element (from any of the arrays) is not escaped with \? For example, don't escape eta in sintheta, but do escape it in sineta. Likewise, don't escape lt in sindelta, but do escape it in altc.

Note: The array elements in the file are not necessarily separated by spaces, i.e. sintheta is not written as sin theta. Otherwise we could use a C# Regex word boundary, with code like the following:

foreach (string s in sArGreek)
{
    strfileContent = Regex.Replace(strfileContent, "\\b" + s + "\\b", "\\" + s + " ");
}


Recommended answer

Use a regular expression replace.

First you need to construct your Regex from the input arrays. The structure of the expression is:

term1|term2|term3|t4|t5

Meaning: all the terms in a single string, separated by "|" (regex OR) and sorted by descending term length. The order matters because we want to capture the longer terms when possible and fall back to shorter terms only when needed.
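To see why the ordering matters, here is a minimal sketch. The .NET regex engine tries alternatives from left to right and takes the first one that succeeds, not the longest one. The class and method names (AlternationOrderDemo, FirstMatch) are made up for this illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlternationOrderDemo
{
    // Returns the text of the first match of `pattern` in `input`
    // (an empty string if there is no match).
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }
}
```

With the shorter term first, FirstMatch("sin|sinh", "sinh") returns sin and the trailing h is left over. With the longer term first, FirstMatch("sinh|sin", "sinh") returns sinh.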

To do that, a little LINQ query comes in handy:

Regex re = new Regex(String.Join("|", (
    from s in sArTrigFunctions.Union(sArGreek).Union(sArBinOp)
    orderby s.Length descending
    select s).ToArray()));

We're creating a single enumerable from all our arrays, then sorting by length, and joining to a single string. This is used to create a Regex object.

Then it's a simple replace:

string result = re.Replace("sintheta altc", "\\$&"); // result: \sin\theta a\ltc

"\\$&" means replace the entire match (single term at a time) with itself prefixed with a backslash.
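Putting the pieces together, the approach might look like the sketch below. It makes a few assumptions: the class name TermEscaper and the method Escape are hypothetical, sArBinOp contains only the four operators visible in the question (the original list is elided), and the Regex.Escape call is an added precaution that the answer does not use. It changes nothing for purely alphabetic terms.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TermEscaper
{
    static readonly string[] sArTrigFunctions = { "sin", "cos", "tan", "sinh", "cosh", "tanh", "cot", "sec", "csc", "arcsin", "arccos", "arctan", "coth", "sech", "csch" };
    static readonly string[] sArGreek = { "alpha", "beta", "chi", "delta", "Delta", "epsi", "varepsilon", "eta", "gamma", "Gamma", "iota", "kappa", "lambda", "Lambda", "lamda", "Lamda", "mu", "nu", "omega", "Omega", "phi", "varphi", "Phi", "pi", "Pi", "psi", "Psi", "rho", "sigma", "Sigma", "tau", "theta", "vartheta", "Theta", "upsilon", "xi", "Xi", "zeta" };
    // Assumption: only the operators shown in the question; the full list is elided there.
    static readonly string[] sArBinOp = { "lt", "gt", "eq", "neq" };

    // One alternation of all terms, longest first, so that "theta" is tried before "eta".
    static readonly Regex re = new Regex(string.Join("|",
        sArTrigFunctions.Union(sArGreek).Union(sArBinOp)
            .OrderByDescending(s => s.Length)
            .Select(Regex.Escape))); // precaution in case a term contains regex metacharacters

    // Prefixes every matched term with a backslash; "$&" stands for the whole match.
    public static string Escape(string text)
    {
        return re.Replace(text, "\\$&");
    }
}
```

With these arrays, TermEscaper.Escape("sintheta altc") returns \sin\theta a\ltc, "sineta" becomes \sin\eta, and "sindelta" becomes \sin\delta. Matching proceeds from left to right, so at each position the longest term that fits is taken, and the characters it consumes are not examined again.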
