Rewrite IsHexString method with RegEx

Views: 357 Published: 2015/11/26 19:53:41 Tags: c# .net regex hex


Question

I have a method that checks whether a string is a valid hex string:

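// Requires: using System.Linq; (for All)
public bool IsHex(string value)
{
  // Reject null, empty, and odd-length input up front.
  if (string.IsNullOrEmpty(value) || value.Length % 2 != 0)
    return false;

  // The string must start with "0x"; every remaining character must be a hex digit.
  return
    value.Substring(0, 2) == "0x" &&
    value.Substring(2)
      .All(c => (c >= '0' && c <= '9') ||
                (c >= 'a' && c <= 'f') ||
                (c >= 'A' && c <= 'F'));
}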

The rules are:
The expression must be composed of an even number of hexadecimal digits (0-9, A-F, a-f).
The characters 0x must be the first two characters in the expression.

I'm sure it can be rewritten as a regex in a much cleaner and more efficient way.
Could you help me out with that?

Recommended answer

After you updated your question, the new regex that should work for you is:

^0x(?:[0-9A-Fa-f]{2})+$

I use (?: ... ) for non-capturing grouping, for efficiency. The {2} means that you want exactly two of the preceding expression (i.e., two hex characters), and the + means you want one or more of those pairs. Note that this disallows a bare 0x as a valid value.
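As a minimal sketch of how this pattern could replace the original method, assuming a static helper class (the names HexValidator and HexPattern are illustrative and do not come from the question):

```csharp
using System.Text.RegularExpressions;

public static class HexValidator
{
    // "0x" followed by one or more pairs of hex digits.
    // RegexOptions.Compiled is an optional choice for a pattern that is reused often.
    private static readonly Regex HexPattern =
        new Regex("^0x(?:[0-9A-Fa-f]{2})+$", RegexOptions.Compiled);

    public static bool IsHex(string value)
    {
        // Regex.IsMatch throws on null input, so reject null up front.
        return value != null && HexPattern.IsMatch(value);
    }
}
```

One caveat with this sketch: in .NET, $ also matches just before a trailing newline, so a value such as "0x1A\n" would pass. Using \z instead of $ is the stricter anchor if that matters for your input.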

"Oded" mentioned something about efficiency. I don't know your requirements, so I consider this more an exercise for the mind than anything else. A regex advances through the input in steps as large as its smallest repeated group. For instance, running my own regex on 10,000 input strings of varying size (50-5000 characters), all valid, takes 1.1 seconds.

When I try the following regex:

^0x(?:[0-9A-Fa-f]{32})+(?:[0-9A-Fa-f]{2})+$

it runs about 40% faster, in 0.67 seconds. (Note that this variant only accepts inputs with at least 34 hex digits, which holds for the test data here.) But be careful. Knowing your input is the key to writing efficient regexes. For instance, if the regex fails, it will do a lot of backtracking. If half of my input strings have an incorrect length, the running time explodes to approx. 34 seconds, or 3000% (!), for the same input.
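To try this kind of comparison on your own data, a small timing harness along the following lines may help. This is only a sketch: the class RegexTiming and its methods MakeInput and CountMatches are hypothetical helpers, not the code used for the measurements quoted above, and the timings you get will depend on your machine and .NET version.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class RegexTiming
{
    // Builds "0x" followed by `digits` random hex digits (hypothetical helper).
    // Pass an odd `digits` value to produce an input of invalid length.
    public static string MakeInput(Random rng, int digits)
    {
        const string hex = "0123456789abcdefABCDEF";
        var sb = new StringBuilder("0x", digits + 2);
        for (int i = 0; i < digits; i++)
            sb.Append(hex[rng.Next(hex.Length)]);
        return sb.ToString();
    }

    // Runs one pattern over all inputs, returning how many matched
    // and reporting the elapsed wall-clock time through `elapsed`.
    public static int CountMatches(string pattern, string[] inputs, out TimeSpan elapsed)
    {
        var regex = new Regex(pattern);
        var watch = Stopwatch.StartNew();
        int matches = inputs.Count(s => regex.IsMatch(s));
        watch.Stop();
        elapsed = watch.Elapsed;
        return matches;
    }
}
```

Calling CountMatches once with the simple pattern and once with the {32} variant on the same array of inputs, first with all-valid strings and then with a share of odd-length ones, should make the backtracking cost visible in the elapsed times. The match counts of both runs should agree as long as every valid input has at least 34 hex digits.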

It becomes even trickier if most input strings are large. If 99% of your input is of valid length, all are > 4130 chars and only a few are not, writing

^0x(?:[0-9A-Fa-f]{4096})+(?:[0-9A-Fa-f]{32})+(?:[0-9A-Fa-f]{2})+$

is efficient and improves the time even more. However, if many inputs fail the length % 2 = 0 check, this is counter-efficient because of backtracking.

Finally, if most of your strings satisfy the even-number rule, and only some or many strings contain a wrong character, the speed goes up: the more inputs that contain a wrong character, the better the performance. That is because the regex can break out immediately when it finds an invalid character.

Conclusion: if your input is a mix of small, large, wrong-character, and wrong-count strings, your fastest approach would be to combine a check of the string's length (instantaneous in .NET) with an efficient regex.
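One way to read this conclusion is sketched below, assuming the simple pair-based pattern and a static helper class (FastHexValidator is an illustrative name, not from the answer). The length test runs first, so strings of the wrong length never reach the regex engine and cannot trigger backtracking.

```csharp
using System.Text.RegularExpressions;

public static class FastHexValidator
{
    // "0x" followed by one or more pairs of hex digits.
    private static readonly Regex HexPattern =
        new Regex("^0x(?:[0-9A-Fa-f]{2})+$", RegexOptions.Compiled);

    public static bool IsHex(string value)
    {
        // Cheap checks first: "0x" plus at least one pair means a length of 4 or more,
        // and an even number of digits means an even total length.
        if (value == null || value.Length < 4 || value.Length % 2 != 0)
            return false;

        // Only strings of plausible length are handed to the regex.
        return HexPattern.IsMatch(value);
    }
}
```

Whether this actually beats the chunked patterns depends on your input mix, so it is worth measuring on representative data.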
