Java regex for UUID

Problem description

I want to parse a String that contains a UUID in the format below:

"&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"

I have tried parsing it the way shown below. It works, but I think it will be slow:

private static final String re1 = ".*?";
private static final String re2 = "([A-Z0-9]{8}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{12})";
private static final Pattern splitter = Pattern.compile(re1 + re2, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

I am looking for a faster way and tried the pattern below, but it fails to match:

private static final Pattern URN_UUID_PATTERN = Pattern.compile("^< urn:uuid:([^&])+&gt");

I am new to regex. Any help is appreciated.

\ Aqura

Recommended answer

Your example of a faster regex uses a < where the input has &lt;, so that's confusing.

Regarding speed: first, your UUID is hexadecimal, so don't match A-Z but rather a-f. Second, you give no indication that the case is mixed, so don't use case-insensitive matching; write the correct case in the range instead.

You don't say whether you need the part preceding the UUID. If not, don't include .*?, and you may as well write the literals for re1 and re2 together in your final Pattern. There's no indication you need DOTALL either.

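// 8 hex digits, then four "-xxxx" groups, then 8 more hex digits.
// The fourth "-xxxx" group plus the trailing 8 cover the final 12-digit block.
private static final Pattern splitter =
  Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");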
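As a rough illustration of how that pattern might be used, here is a minimal sketch. The class name UuidRegexSketch and the helper firstUuid are made up for this example, and it assumes the UUID is always lowercase, as in the sample input.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
final class UuidRegexSketch {
    // Same pattern as above: lowercase hex only, no flags, no leading ".*?".
    private static final Pattern SPLITTER =
        Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");

    // Returns the first UUID-shaped substring, or an empty Optional if none is found.
    static Optional<String> firstUuid(String input) {
        Matcher m = SPLITTER.matcher(input);
        // find() scans forward on its own, so no ".*?" prefix is needed.
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(firstUuid(input).orElse("no match"));
    }
}
```

Because the character class is lowercase-only, an upper-case UUID would not match; if mixed case can occur, the range would need to be widened.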

Alternatively, if you measure your regular expression's performance and find it too slow, you might try another approach. For example: is each UUID preceded by "uuid:" as in your example? If so, you can

  1. find the first index of "uuid:" as i, then
  2. substring 0 to i+5 [assuming you need that part at all], and
  3. substring i+5 to i+41, if I counted that right (36 characters in length).
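The steps above could be sketched roughly as follows. The class and method names are hypothetical, and the sketch only slices the string: it does not check that the 36 characters are actually a well-formed UUID.

```java
// Hypothetical helper class illustrating the indexOf/substring steps.
final class UuidIndexSketch {
    private static final String MARKER = "uuid:";
    private static final int UUID_LENGTH = 36;

    // Returns the 36 characters following the first "uuid:",
    // or null if the marker is missing or the input is too short.
    static String afterMarker(String input) {
        int i = input.indexOf(MARKER);          // step 1
        if (i < 0) {
            return null;
        }
        int start = i + MARKER.length();        // i + 5
        int end = start + UUID_LENGTH;          // i + 41
        if (end > input.length()) {
            return null;
        }
        return input.substring(start, end);     // step 3
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(afterMarker(input));
    }
}
```

Step 2 (keeping the prefix) is left out here, since it is only needed if the text before the UUID matters.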

Along similar lines, your faster regex could be:

private static final Pattern URN_UUID_PATTERN =
    Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");

On the other hand, if all your input strings are going to start with those exact characters, there is no need for step 1 in the previous suggestion; just use input.substring(13, 49);
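A minimal sketch of that fixed-offset variant is shown below. It assumes every input literally begins with the 13 characters "&lt;urn:uuid:" (4 for the escaped bracket plus 9 for "urn:uuid:"). The startsWith guard and the names used are additions for illustration, not something the original answer specifies.

```java
// Hypothetical helper class illustrating the fixed-offset substring.
final class UuidFixedOffsetSketch {
    private static final String PREFIX = "&lt;urn:uuid:"; // 13 characters
    private static final int UUID_LENGTH = 36;

    // Slices out characters 13..48, i.e. input.substring(13, 49).
    static String fixedOffset(String input) {
        int end = PREFIX.length() + UUID_LENGTH; // 13 + 36 = 49
        // Guard added for this sketch: fail loudly on unexpected input.
        if (!input.startsWith(PREFIX) || input.length() < end) {
            throw new IllegalArgumentException("unexpected input: " + input);
        }
        return input.substring(PREFIX.length(), end);
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(fixedOffset(input));
    }
}
```

Dropping the guard gives the bare substring call from the answer, at the cost of a less helpful exception (or a silently wrong slice) when the prefix differs.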
