Regex to split nested coordinate strings
Question
I have a String of the format `"[(1, 2), (2, 3), (3, 4)]"`, with an arbitrary number of elements. I'm trying to split it on the commas separating the coordinates, that is, to retrieve `(1, 2)`, `(2, 3)`, and `(3, 4)`.
Can I do it in Java regex? I'm a complete noob but hoping Java regex is powerful enough for it. If it isn't, could you suggest an alternative?
Recommended answer
You can use `String#split()` for this.
    String string = "[(1, 2), (2, 3), (3, 4)]";
    string = string.substring(1, string.length() - 1); // Get rid of the square brackets.
    // Split on each comma that sits between ")" and "(".
    String[] parts = string.split("(?<=\\))(,\\s*)(?=\\()");
    for (String part : parts) {
        part = part.substring(1, part.length() - 1); // Get rid of the parentheses.
        String[] coords = part.split(",\\s*");
        int x = Integer.parseInt(coords[0]);
        int y = Integer.parseInt(coords[1]);
        System.out.printf("x=%d, y=%d\n", x, y);
    }
The `(?<=\\))` positive lookbehind means that the match must be preceded by `)`. The `(?=\\()` positive lookahead means that it must be followed by `(`. The `(,\\s*)` means that the string is split on the `,` and any whitespace after it. The `\\` are there just to escape regex-specific characters: inside a Java string literal, `\\` produces the single backslash that the regex engine sees.
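To see what the lookarounds buy you, here is a small self-contained sketch that compares the lookaround split with a plain comma split on the same input. The class and method names (`LookaroundSplitDemo`, `splitPairs`, `splitNaively`) are made up for this illustration, and the code assumes the input always has exactly the `[(a, b), (c, d)]` layout from the question.

```java
class LookaroundSplitDemo {

    // Hypothetical helper: splits only on commas that sit between ")" and "(".
    static String[] splitPairs(String input) {
        String inner = input.substring(1, input.length() - 1); // drop "[" and "]"
        return inner.split("(?<=\\)),\\s*(?=\\()");
    }

    // For contrast: a plain comma split also cuts inside each pair.
    static String[] splitNaively(String input) {
        String inner = input.substring(1, input.length() - 1);
        return inner.split(",\\s*");
    }

    public static void main(String[] args) {
        String input = "[(1, 2), (2, 3), (3, 4)]";
        for (String pair : splitPairs(input)) {
            System.out.println(pair); // one "(x, y)" chunk per line
        }
        // The naive split yields twice as many pieces, e.g. "(1" and "2)".
        System.out.println(splitNaively(input).length + " pieces without lookarounds");
    }
}
```

Because the lookarounds are zero-width, the parentheses themselves are not consumed by the split and stay attached to each pair.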
That said, this particular String is recognizable as the output of `List#toString()`. Are you sure you're doing things the right way? ;)
Update: as per the comments, you can indeed also go the other way round and get rid of the non-digits:
    String string = "[(1, 2), (2, 3), (3, 4)]";
    String[] parts = string.split("\\D.");
    for (int i = 1; i < parts.length; i += 3) {
        int x = Integer.parseInt(parts[i]);
        int y = Integer.parseInt(parts[i + 1]);
        System.out.printf("x=%d, y=%d\n", x, y);
    }
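Here the `\\D` means that the string is split on any non-digit (`\\d` stands for a digit). The `.` after it consumes one more character, so a two-character separator such as `, ` or `[(` is matched as a whole instead of leaving a blank entry between its two characters. Blank entries still show up in front of some digits: the four-character separator `), (` is matched as two pairs with an empty string between them, and the leading `[(` produces an empty first element. That is why the loop starts at index 1 and steps by 3. I must admit that I'm not sure how to eliminate those blank matches before the digits. I'm not a trained regex guru yet. Hey, Bart K, can you do it better?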
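One way to sidestep the blank entries altogether, offered here as a sketch rather than as part of the original answer, is to stop splitting and instead collect every run of digits with `Matcher#find()`. The class and method names (`DigitRuns`, `numbers`) are invented for the example, and it assumes non-negative integers that always come in pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitRuns {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns every run of digits in the input, in order.
    static List<Integer> numbers(String input) {
        List<Integer> result = new ArrayList<>();
        Matcher matcher = DIGITS.matcher(input);
        while (matcher.find()) {
            result.add(Integer.parseInt(matcher.group()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> values = numbers("[(1, 2), (2, 3), (3, 4)]");
        // Assumes an even count: consecutive values form one (x, y) pair.
        for (int i = 0; i + 1 < values.size(); i += 2) {
            System.out.printf("x=%d, y=%d%n", values.get(i), values.get(i + 1));
        }
    }
}
```

Since nothing is split, there are no empty strings to skip and the loop no longer depends on the exact width of each separator. Like the `\\D` version, it does no validation: any stray digits in the input would silently shift the pairing.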
After all, it's ultimately better to use a parser for this. See Hubert's answer on this topic.
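For a rough idea of what a parser could look like here, the following is a minimal hand-written recursive-descent sketch. It is not taken from the answer referred to above; the class name `CoordinateParser` and the grammar in its comments are assumptions made for illustration (integer coordinates, optional whitespace, an optional leading minus sign).

```java
import java.util.ArrayList;
import java.util.List;

class CoordinateParser {
    private final String s;
    private int pos = 0;

    private CoordinateParser(String s) { this.s = s; }

    // Parses the whole input and rejects anything left over after the closing "]".
    static List<int[]> parse(String input) {
        CoordinateParser parser = new CoordinateParser(input);
        List<int[]> pairs = parser.list();
        parser.skipSpaces();
        if (parser.pos != input.length()) throw parser.error("end of input");
        return pairs;
    }

    // list := '[' ( pair ( ',' pair )* )? ']'
    private List<int[]> list() {
        List<int[]> pairs = new ArrayList<>();
        expect('[');
        skipSpaces();
        if (peek() != ']') {
            pairs.add(pair());
            while (tryConsume(',')) pairs.add(pair());
        }
        expect(']');
        return pairs;
    }

    // pair := '(' integer ',' integer ')'
    private int[] pair() {
        expect('(');
        int x = integer();
        expect(',');
        int y = integer();
        expect(')');
        return new int[] {x, y};
    }

    // integer := '-'? digit+
    private int integer() {
        skipSpaces();
        int start = pos;
        if (peek() == '-') pos++;
        int firstDigit = pos;
        while (Character.isDigit(peek())) pos++;
        if (pos == firstDigit) throw error("an integer");
        return Integer.parseInt(s.substring(start, pos));
    }

    // Returns the current character, or '\0' at the end of the input.
    private char peek() { return pos < s.length() ? s.charAt(pos) : '\0'; }

    private void skipSpaces() { while (Character.isWhitespace(peek())) pos++; }

    private boolean tryConsume(char c) {
        skipSpaces();
        if (peek() == c) { pos++; return true; }
        return false;
    }

    private void expect(char c) {
        if (!tryConsume(c)) throw error("'" + c + "'");
    }

    private IllegalArgumentException error(String expected) {
        return new IllegalArgumentException("Expected " + expected + " at index " + pos);
    }

    public static void main(String[] args) {
        for (int[] p : parse("[(1, 2), (2, 3), (3, 4)]")) {
            System.out.printf("x=%d, y=%d%n", p[0], p[1]);
        }
    }
}
```

Compared with the regex versions, this reports malformed input (for example a missing `)`) with a position instead of silently producing wrong pairs, at the cost of more code.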