C#: isolate elements of an array in a string with the escape character \
Question
I have several arrays, for example:
string[] sArTrigFunctions = {"sin", "cos", "tan", "sinh", "cosh", "tanh", "cot", "sec", "csc", "arcsin", "arccos", "arctan", "coth", "sech", "csch"};
string[] sArGreek = { "alpha", "beta", "chi", "delta", "Delta", "epsi", "varepsilon", "eta", "gamma", "Gamma", "iota", "kappa", "lambda", "Lambda", "lamda", "Lamda", "mu", "nu", "omega", "Omega", "phi", "varphi", "Phi", "pi", "Pi", "psi", "Psi", "rho", "sigma", "Sigma", "tau", "theta", "vartheta", "Theta", "upsilon", "xi", "Xi", "zeta" };
string[] sArBinOp = {"lt","gt","eq","neq",.....}; // etc.
These array elements appear in a text file, mixed with each other or with other content of the file. For example: sintheta, altc.
I want to escape these array elements in the file with \ so that sintheta becomes \sin\theta and altc becomes a\ltc. A simple string.Replace(...) does not work. For example, if I run the following foreach loop on the sArTrigFunctions array and then on the sArGreek array, it changes sintheta in the file to \sinth\eta. If I sort the sArGreek elements by descending length so that theta comes before eta, the following code first changes sintheta to \sin\theta and then to \sin\th\eta. Likewise, running the following code on the sArBinOp array changes sindelta to sinde\lta, or, if we first run it on sArTrigFunctions and sArGreek and then on sArBinOp, sindelta becomes \sin\de\lta:
foreach (string s in sArGreek)
{
    strfileContent = strfileContent.Replace(s, "\\" + s);
}
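To make the failure concrete, here is a minimal sketch of the sequential approach. The class name NaiveEscapeDemo and the helper EscapeSequentially are hypothetical names introduced only for this illustration, and the arrays are shortened to the terms that matter for sintheta.

```csharp
using System;

public static class NaiveEscapeDemo
{
    // Hypothetical helper: applies String.Replace once per term, array by array,
    // in the order given, exactly like the foreach loop in the question.
    public static string EscapeSequentially(string text, params string[][] termArrays)
    {
        foreach (string[] terms in termArrays)
        {
            foreach (string s in terms)
            {
                text = text.Replace(s, "\\" + s);
            }
        }
        return text;
    }

    public static void Main()
    {
        string[] trig = { "sin", "cos", "tan" };   // shortened for the demo

        // "eta" listed before "theta": theta is never found as a whole term.
        Console.WriteLine(EscapeSequentially("sintheta", trig, new[] { "eta", "theta" }));
        // prints \sinth\eta

        // "theta" first: it is escaped, then "eta" is escaped again inside it.
        Console.WriteLine(EscapeSequentially("sintheta", trig, new[] { "theta", "eta" }));
        // prints \sin\th\eta
    }
}
```

Each Replace call rescans the whole string, including text that an earlier call has already escaped, which is why no ordering of the terms fixes the problem.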
Question: How can we programmatically make sure that, during the replace, an array element that lies inside another array element (of any array) is not escaped with \? For example, don't escape eta in sintheta, but do escape it in sineta. Likewise, don't escape lt in sindelta, but do escape it in altc.
Note: The array elements in the file are not necessarily separated by a space, i.e. sintheta is not written as sin theta. Otherwise we could use a C# Regex word boundary to achieve this, with code like the following:
foreach (string s in sArGreek)
{
    strfileContent = Regex.Replace(strfileContent, "\\b" + s + "\\b", "\\" + s + " ");
}
Recommended answer
Use a regex replace.
First you need to construct your Regex from the input arrays. The structure of the expression is:
term1|term2|term3|t4|t5
Meaning, all the terms go in a single string, separated by "|" (regex OR) and sorted by descending term length. This is important because we want to capture longer terms when possible and fall back to shorter terms only when needed.
To do that, a little LINQ query comes in handy:
// Requires: using System.Linq; using System.Text.RegularExpressions;
Regex re = new Regex(String.Join("|", (
    from s in sArTrigFunctions.Union(sArGreek).Union(sArBinOp)
    orderby s.Length descending
    select s).ToArray()));
We're creating a single enumerable from all our arrays, then sorting it by descending length and joining it into a single string. This is used to create a Regex object.
Then it's a simple replace:
re.Replace("sintheta altc", "\\$&");
"\\$&" means: replace the entire match (a single term at a time) with itself, prefixed with a backslash.
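Putting the pieces together, the following is a minimal end-to-end sketch of the approach described above, not a definitive implementation. The class name TermEscaper and the method Escape are hypothetical. The question elides the rest of sArBinOp, so only the four entries it shows are used here. Passing each term through Regex.Escape is an extra precaution that the answer does not include; it changes nothing for purely alphabetic terms.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class TermEscaper
{
    static readonly string[] sArTrigFunctions = { "sin", "cos", "tan", "sinh", "cosh", "tanh", "cot", "sec", "csc", "arcsin", "arccos", "arctan", "coth", "sech", "csch" };
    static readonly string[] sArGreek = { "alpha", "beta", "chi", "delta", "Delta", "epsi", "varepsilon", "eta", "gamma", "Gamma", "iota", "kappa", "lambda", "Lambda", "lamda", "Lamda", "mu", "nu", "omega", "Omega", "phi", "varphi", "Phi", "pi", "Pi", "psi", "Psi", "rho", "sigma", "Sigma", "tau", "theta", "vartheta", "Theta", "upsilon", "xi", "Xi", "zeta" };
    // Assumption: the original list is truncated in the question; only the shown entries are used.
    static readonly string[] sArBinOp = { "lt", "gt", "eq", "neq" };

    // One alternation over all terms, longest first, so "theta" is tried before "eta".
    static readonly Regex Terms = new Regex(string.Join("|",
        sArTrigFunctions.Union(sArGreek).Union(sArBinOp)
            .OrderByDescending(s => s.Length)
            .Select(s => Regex.Escape(s))));

    // Hypothetical helper: prefixes every matched term with a backslash in a single pass.
    public static string Escape(string text)
    {
        return Terms.Replace(text, @"\$&");
    }

    public static void Main()
    {
        foreach (string input in new[] { "sintheta altc", "sineta", "sindelta" })
        {
            Console.WriteLine(input + " -> " + Escape(input));
        }
        // prints:
        // sintheta altc -> \sin\theta a\ltc
        // sineta -> \sin\eta
        // sindelta -> \sin\delta
    }
}
```

Because the whole text is scanned once and each match consumes its characters, eta inside theta and lt inside delta are never seen as separate terms. Running Escape a second time on its own output would escape the terms again, so it should be applied once to unescaped input.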