Java regex for UUID
Question
I want to parse a String that contains a UUID in the format below:
"&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"
I have tried parsing it as shown below. It works, but I think it will be slow:
private static final String re1 = ".*?";
private static final String re2 = "([A-Z0-9]{8}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{12})";
private static final Pattern splitter = Pattern.compile(re1 + re2, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
I am looking for a faster way and tried the pattern below, but it fails to match:
private static final Pattern URN_UUID_PATTERN = Pattern.compile("^< urn:uuid:([^&])+>");
I am new to regex. Any help is appreciated.
Aqura
Recommended answer
Your example of a faster regex uses < where the input has &lt;, so that's confusing.
Regarding speed: first, your UUID is hexadecimal, so don't match A-Z but rather a-f. Second, you give no indication that case is mixed, so don't use case-insensitive matching; write the correct case in the range instead.
You don't explain whether you need the part preceding the UUID. If not, don't include .*?, and you may as well write the literals for re1 and re2 together in your final Pattern. There's no indication you need DOTALL either.
private static final Pattern splitter =
Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");
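As a minimal sketch of how the pattern above could be used (the class name UuidRegexDemo, the method extractUuid, and the sample input are illustrative assumptions, not part of the question), Matcher.find() searches the whole input for the UUID, so no leading .*? is needed:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UuidRegexDemo {
    // Lowercase hex only: 8 digits, four "-xxxx" groups, then 8 more digits,
    // which together form the usual 8-4-4-4-12 layout.
    private static final Pattern UUID_PATTERN =
            Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");

    // Returns the first UUID found in the input, or null if there is none.
    static String extractUuid(String input) {
        Matcher m = UUID_PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, modelled on the question.
        System.out.println(extractUuid("<urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce>"));
    }
}
```

Because the character class is lowercase only, an uppercase UUID would not match; that follows the advice above and would need revisiting if mixed case does occur.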
Alternatively, if you measure your regular expression and find it too slow, you might try another approach. For example, is each UUID preceded by "uuid:" as in your example? If so, you can
- find the first index of "uuid:" as i, then
- substring 0 to i+5 [assuming you need it at all], and
- substring i+5 to i+41, if I counted that right (36 characters in length).
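The index-based steps can be sketched roughly as follows. The class and method names are hypothetical, and the length check is an added assumption so that short inputs return null instead of throwing:

```java
class UuidIndexOfDemo {
    private static final String MARKER = "uuid:";
    private static final int UUID_LENGTH = 36;

    // Returns the 36 characters following the first "uuid:", or null when the
    // marker is missing or fewer than 36 characters follow it.
    static String extractAfterMarker(String input) {
        int i = input.indexOf(MARKER);
        if (i < 0) {
            return null;
        }
        int start = i + MARKER.length();  // i + 5
        int end = start + UUID_LENGTH;    // i + 41
        return end <= input.length() ? input.substring(start, end) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, modelled on the question.
        System.out.println(extractAfterMarker("<urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce>"));
    }
}
```

This only slices the string and does not check that the 36 characters form a valid UUID; if validation matters, passing the result to java.util.UUID.fromString is one option.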
Along similar lines, your faster regex could be:
private static final Pattern URN_UUID_PATTERN =
Pattern.compile("^<urn:uuid:(.{36})>");
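A small usage sketch for this anchored pattern, assuming the input starts with a literal < (if the input is entity-escaped, the literals in the pattern would have to be &lt; and &gt; instead). The class name, method name, and sample input are illustrative:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrnUuidPatternDemo {
    private static final Pattern URN_UUID_PATTERN =
            Pattern.compile("^<urn:uuid:(.{36})>");

    // Returns the 36 characters captured by group 1, or null when the input
    // does not start with "<urn:uuid:" followed by 36 characters and ">".
    static String extract(String input) {
        Matcher m = URN_UUID_PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample input with literal angle brackets.
        System.out.println(extract("<urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce>"));
    }
}
```

Note that .{36} accepts any 36 characters, so this trades validation for simplicity.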
OTOH, if all your input strings start with those exact characters, there is no need for step 1 of the previous suggestion; just use input.substring(13, 49);
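The offsets 13 and 49 fit an input that begins with the 13 characters &lt;urn:uuid: (the entity-escaped form); for an input beginning with a literal <urn:uuid: (10 characters), the equivalent call would be substring(10, 46). A minimal sketch under the first assumption, with hypothetical names and an added length guard:

```java
class FixedOffsetDemo {
    private static final int START = 13;        // length of "&lt;urn:uuid:"
    private static final int END = START + 36;  // 49

    // Slices the UUID out by position alone; returns null if the input is
    // too short to contain a full 36-character UUID at that offset.
    static String extractFixed(String input) {
        return input.length() >= END ? input.substring(START, END) : null;
    }

    public static void main(String[] args) {
        // Hypothetical sample input in entity-escaped form.
        System.out.println(extractFixed("&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"));
    }
}
```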