How does the String.Split method determine separator precedence when passed multiple multi-character separators?

Question

If you have this code:

"......".Split(new String[]{"...", ".."}, StringSplitOptions.None);

the resulting array elements are:

 1. ""
 2. ""
 3. ""

Now, if you reverse the order of the separators,

"......".Split(new String[]{"..", "..."}, StringSplitOptions.None);

the resulting array elements are:

 1. ""
 2. ""
 3. ""
 4. ""

From these two examples, I am inclined to conclude that the Split method tokenizes recursively, working through each element of the separator array from left to right.

However, once separators containing alphanumeric characters are thrown into the mix, it becomes clear that this theory is wrong.

  "5.x.7".Split(new String[]{".x", "x."}, StringSplitOptions.None)

Result: 1. "5" 2. ".7"

   "5.x.7".Split(new String[]{"x.", ".x"}, StringSplitOptions.None)

Result: 1. "5" 2. ".7"

This time we obtain the same output for both orderings, which means that the rule theorized from the first set of examples no longer applies. (That is, if separator precedence were always determined by the position of the separator within the array, then the last example would have produced "5." and "7" instead of "5" and ".7".)

As to why I am spending time trying to guess how the .NET standard APIs work: I want to implement similar functionality for my Java apps, but neither StringTokenizer nor org.apache.commons.lang.StringUtils provides the ability to split a String using multiple multi-character separators. (And even if I were to find an API that does provide this ability, it would be hard to know whether it always tokenizes using the same algorithm as the String.Split method.)

Recommended answer

From MSDN:

To avoid ambiguous results when strings in separator have characters in common, the Split operation proceeds from the beginning to the end of the value of the instance, and matches the first element in separator that is equal to a delimiter in the instance. The order in which substrings are encountered in the instance takes precedence over the order of elements in separator.

So, in the first case, ".." and "..." are both found at the same position in the string, and their order in the separator array determines which one is used. In the second case, ".x" is found before "x." in the string, so the order of the elements in the separator array does not come into play.
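For the Java side of the question, the rule quoted above can be sketched as a single left-to-right scan: at each position, try the separators in array order and take the first one that matches there. The following is only an illustration of that documented rule, not the actual .NET implementation. The class name MultiSplit and both method names are hypothetical, and the sketch assumes the equivalent of StringSplitOptions.None (empty entries are kept). The second method relies on the assumption that Java's regex alternation, which tries alternatives in order at the leftmost matching position, yields the same precedence.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the JDK or of Apache Commons.
final class MultiSplit {

    // Scans input from left to right. At each index the separators are
    // tried in array order, and the first one that matches at that index
    // is consumed. Empty entries are kept (like StringSplitOptions.None).
    static List<String> split(String input, String[] separators) {
        List<String> parts = new ArrayList<>();
        int tokenStart = 0;
        int i = 0;
        while (i < input.length()) {
            String matched = null;
            for (String sep : separators) {
                // Skip null or empty separators; they can never delimit anything.
                if (sep != null && !sep.isEmpty() && input.startsWith(sep, i)) {
                    matched = sep;
                    break; // array order only breaks ties at the same position
                }
            }
            if (matched != null) {
                parts.add(input.substring(tokenStart, i));
                i += matched.length();
                tokenStart = i;
            } else {
                i++;
            }
        }
        parts.add(input.substring(tokenStart)); // trailing token, possibly empty
        return parts;
    }

    // Alternative sketch: build an ordered regex alternation from the
    // quoted separators. Assumes every separator is non-null and non-empty.
    static List<String> splitWithRegex(String input, String[] separators) {
        StringBuilder alternation = new StringBuilder();
        for (String sep : separators) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(sep));
        }
        // A limit of -1 keeps trailing empty strings.
        return Arrays.asList(Pattern.compile(alternation.toString()).split(input, -1));
    }

    public static void main(String[] args) {
        System.out.println(split("......", new String[] {"...", ".."}).size()); // 3
        System.out.println(split("......", new String[] {"..", "..."}).size()); // 4
        System.out.println(split("5.x.7", new String[] {"x.", ".x"}));          // [5, .7]
    }
}
```

On the four inputs from the question, both methods give the same results as those reported for String.Split. Behavior on other inputs, and with other StringSplitOptions values, has not been compared against .NET here.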
