Java regex help: capturing key-value pairs

Question

I'm trying to capture key-value pairs from strings that have the following form:

a0=d235 a1=2314 com1="abcd" com2="a b c d"

Using help from this post, I was able to write the following regex that captures the key-value pairs:

Pattern.compile("(\\w*)=(\"[^\"]*\"|[^\\s]*)");

The problem is that the second group in this pattern also captures the quotation marks, as follows:

a0=d235
a1=2314
com1="abcd"
com2="a b c d"

How do I exclude the quotation marks? I want something like this:

a0=d235
a1=2314
com1=abcd
com2=a b c d

This can be achieved by capturing the value in different groups depending on whether it is quoted. However, I'm writing this code for a parser, so for performance reasons I'm trying to come up with a regex that returns the value in the same group number in both cases.

Recommended answer

How about this? The idea is to split the last group into 2 groups.

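import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The non-capturing group (?:...) scopes the alternation to the value only.
// Without it, the second branch would match a whole unquoted pair such as
// "a0=d235" and leave group 1 (the key) null.
Pattern p = Pattern.compile("(\\w+)=(?:\"([^\"]+)\"|([^\\s]+))");

String test = "a0=d235 a1=2314 com1=\"abcd\" com2=\"a b c d\"";
Matcher m = p.matcher(test);

while (m.find()) {
    // Group 2 holds a quoted value (quotes excluded), group 3 an unquoted one.
    String value = m.group(2) != null ? m.group(2) : m.group(3);
    System.out.println(m.group(1) + "=" + value);
}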
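
As a rough, self-contained sketch of the same two-group idea, the class below collects the pairs into a map. The class name TwoGroupKeyValueParser and the parse helper are hypothetical names chosen for illustration, and the pattern is a slight variant rather than the answer's exact regex: the alternation is wrapped in a non-capturing group so the key always lands in group 1, and [^\"]* is used so that an empty quoted value is accepted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class TwoGroupKeyValueParser {
    // Group 1: key; group 2: quoted value (quotes excluded); group 3: unquoted value.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=(?:\"([^\"]*)\"|(\\S+))");

    // Returns the pairs in the order they appear in the input.
    static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            // Exactly one of groups 2 and 3 participates in each match.
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            pairs.put(m.group(1), value);
        }
        return pairs;
    }

    public static void main(String[] args) {
        String test = "a0=d235 a1=2314 com1=\"abcd\" com2=\"a b c d\"";
        parse(test).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Compiling the pattern once into a static field avoids recompiling it on every call, which matters if the parser handles many lines.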

Update

Here is a new solution in response to the updated question. This regex uses positive look-ahead and look-behind to check that a quote is present without including it in the captured value. This way, groups 2 and 3 above can be merged into a single group (group 2 below). There is no way to exclude the quotes when returning group 0.

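import java.util.regex.Matcher;
import java.util.regex.Pattern;

Pattern p = Pattern.compile("(\\w+)=\"*((?<=\")[^\"]+(?=\")|([^\\s]+))\"*");

String test = "a0=d235 a1=2314 com1=\"abcd\" com2=\"a b c d\"";
Matcher m = p.matcher(test);

while (m.find()) {
    // Group 2 holds the value in both cases, without the surrounding quotes.
    System.out.println(m.group(1) + "=" + m.group(2));
}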

Output

a0=d235
a1=2314
com1=abcd
com2=a b c d
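
To illustrate the remark about group 0, here is a small sketch that reuses the look-around pattern from the update and records the whole match next to groups 1 and 2. The class name LookaroundGroupDemo and the scan helper are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
class LookaroundGroupDemo {
    // Same pattern as in the update: group 1 is the key, group 2 the value.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=\"*((?<=\")[^\"]+(?=\")|([^\\s]+))\"*");

    // Returns {whole match, key, value} for every pair found in the input.
    static List<String[]> scan(String input) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            rows.add(new String[] {m.group(), m.group(1), m.group(2)});
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : scan("a0=d235 com2=\"a b c d\"")) {
            System.out.println("group 0: " + row[0]
                    + " | key: " + row[1] + " | value: " + row[2]);
        }
    }
}
```

For the quoted pair, the whole match (group 0) still contains the quotes because the \"* parts outside group 2 consume them, while group 2 holds only the text between them.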
