Regex to split nested coordinate strings

Question

I have a String of the format "[(1, 2), (2, 3), (3, 4)]", with an arbitrary number of elements. I'm trying to split it on the commas that separate the coordinates, that is, to retrieve (1, 2), (2, 3), and (3, 4).

Can I do this with a Java regex? I'm a complete noob, but I'm hoping Java regex is powerful enough for it. If it isn't, could you suggest an alternative?

Recommended answer

You can use String#split() for this.

String string = "[(1, 2), (2, 3), (3, 4)]";
string = string.substring(1, string.length() - 1); // Get rid of the enclosing square brackets.
String[] parts = string.split("(?<=\\))(,\\s*)(?=\\()");
for (String part : parts) {
    part = part.substring(1, part.length() - 1); // Get rid of parentheses.
    String[] coords = part.split(",\\s*");
    int x = Integer.parseInt(coords[0]);
    int y = Integer.parseInt(coords[1]);
    System.out.printf("x=%d, y=%d\n", x, y);
}

The (?<=\\)) positive lookbehind means that the match must be preceded by ). The (?=\\() positive lookahead means that it must be succeeded by (. The (,\\s*) means that the string is split on the , and any whitespace after it. The \\ is a backslash as written inside a Java string literal: it escapes the regex metacharacters ( and ), and in \\s it forms the whitespace class.
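To see the lookarounds in isolation, here is a minimal sketch of the first splitting step only. The class name LookaroundSplitDemo and the method splitPairs are made up for illustration and do not come from the answer; the sketch also assumes the input always starts with [ and ends with ].

```java
final class LookaroundSplitDemo {

    // Hypothetical helper: turns "[(1, 2), (2, 3)]" into the parts "(1, 2)" and "(2, 3)".
    static String[] splitPairs(String input) {
        // Assumption: the input is wrapped in exactly one pair of square brackets.
        String inner = input.substring(1, input.length() - 1);
        // Split on a comma plus optional whitespace, but only when the comma
        // is preceded by ")" and followed by "(". The commas inside a pair
        // are preceded by a digit, so they are left alone.
        return inner.split("(?<=\\)),\\s*(?=\\()");
    }

    public static void main(String[] args) {
        for (String pair : splitPairs("[(1, 2), (2, 3), (3, 4)]")) {
            System.out.println(pair);
        }
    }
}
```

Because the lookbehind and lookahead are zero-width, the parentheses themselves are not consumed by the split and remain part of each piece.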

That said, this particular String is recognizable as the output of List#toString(). Are you sure you're doing things the right way? ;)

Update: as per the comments, you can indeed also go the other way round and get rid of the non-digits:

String string = "[(1, 2), (2, 3), (3, 4)]";
String[] parts = string.split("\\D.");
for (int i = 1; i < parts.length; i += 3) {
    int x = Integer.parseInt(parts[i]);
    int y = Integer.parseInt(parts[i + 1]);
    System.out.printf("x=%d, y=%d\n", x, y);
}

Here \\D means that the string is split on any non-digit (\\d stands for a digit). The . after it consumes one more character, which eliminates the blank matches after the digits. That is why the loop starts at index 1 and advances by 3: for this input, parts[0], parts[3], and parts[6] are empty strings. I must admit, however, that I'm not sure how to eliminate the blank matches before the digits. I'm not a trained regex guru yet. Hey, Bart K, can you do it better?
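One possible way to sidestep the blank entries altogether is to match the numbers instead of splitting on what surrounds them. The following is only a sketch under stated assumptions: the class name DigitExtractor and the method extractIntegers are hypothetical, the pattern -?\\d+ additionally allows a leading minus sign (the original input has none), and the values are assumed to come in (x, y) pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DigitExtractor {

    // An optional minus sign followed by one or more digits.
    private static final Pattern INTEGER = Pattern.compile("-?\\d+");

    // Hypothetical helper: returns every integer in the input, in order of appearance.
    static List<Integer> extractIntegers(String input) {
        List<Integer> numbers = new ArrayList<>();
        Matcher matcher = INTEGER.matcher(input);
        while (matcher.find()) {
            numbers.add(Integer.parseInt(matcher.group()));
        }
        return numbers;
    }

    public static void main(String[] args) {
        List<Integer> numbers = extractIntegers("[(1, 2), (2, 3), (3, 4)]");
        // Assumption: consecutive values form one (x, y) pair.
        for (int i = 0; i + 1 < numbers.size(); i += 2) {
            System.out.printf("x=%d, y=%d%n", numbers.get(i), numbers.get(i + 1));
        }
    }
}
```

Since Matcher#find() only ever returns actual matches, there are no empty strings to skip and no index arithmetic that depends on the exact spacing of the input.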

After all, it's ultimately better to use a parser for this. See Hubert's answer on this topic.
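For illustration, a tiny hand-written parser for this format could look like the sketch below. It is not taken from the answer referred to above; the class name CoordinateListParser and all of its methods are hypothetical, and it assumes the grammar is a bracketed, comma-separated list of (int, int) pairs with optional whitespace.

```java
import java.util.ArrayList;
import java.util.List;

final class CoordinateListParser {
    private final String text;
    private int pos;

    private CoordinateListParser(String text) {
        this.text = text;
    }

    // Parses "[(x, y), (x, y), ...]" into a list of two-element int arrays.
    static List<int[]> parse(String text) {
        CoordinateListParser parser = new CoordinateListParser(text);
        List<int[]> pairs = parser.parseList();
        parser.skipWhitespace();
        if (parser.pos != text.length()) {
            throw parser.error("end of input");
        }
        return pairs;
    }

    private List<int[]> parseList() {
        List<int[]> pairs = new ArrayList<>();
        expect('[');
        if (tryConsume(']')) {
            return pairs; // empty list: "[]"
        }
        do {
            pairs.add(parsePair());
        } while (tryConsume(','));
        expect(']');
        return pairs;
    }

    private int[] parsePair() {
        expect('(');
        int x = parseInt();
        expect(',');
        int y = parseInt();
        expect(')');
        return new int[] {x, y};
    }

    private int parseInt() {
        skipWhitespace();
        int start = pos;
        if (peek() == '-') {
            pos++;
        }
        int digitsStart = pos;
        while (Character.isDigit(peek())) {
            pos++;
        }
        if (pos == digitsStart) {
            throw error("an integer");
        }
        return Integer.parseInt(text.substring(start, pos));
    }

    // Returns the current character, or '\0' once the input is exhausted.
    private char peek() {
        return pos < text.length() ? text.charAt(pos) : '\0';
    }

    private void skipWhitespace() {
        while (Character.isWhitespace(peek())) {
            pos++;
        }
    }

    private boolean tryConsume(char expected) {
        skipWhitespace();
        if (peek() == expected) {
            pos++;
            return true;
        }
        return false;
    }

    private void expect(char expected) {
        if (!tryConsume(expected)) {
            throw error("'" + expected + "'");
        }
    }

    private IllegalArgumentException error(String expected) {
        return new IllegalArgumentException("Expected " + expected + " at position " + pos);
    }
}
```

Calling CoordinateListParser.parse("[(1, 2), (2, 3), (3, 4)]") yields three int[] pairs. Unlike the split-based approaches, malformed input such as a missing parenthesis is rejected with an IllegalArgumentException that names the offending position instead of silently producing wrong numbers.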
