How to split a String on a separator without removing that separator in Java?
Question
I am having trouble splitting a String. I want to split a String on some separator, but without losing that separator.
When we use the somestring.split(String separator) method in Java, it splits the String but removes the separator from the resulting pieces. I don't want this to happen.
For example, this code:
String string1="Ram-sita-laxman";
String seperator="-";
string1.split(seperator);
Output:
[Ram, sita, laxman]
but I want the result to look like this instead:
[Ram, -sita, -laxman]
Is there a way to get output like this?
Recommended answer
string1.split("(?=-)");
This works because split actually takes a regular expression (http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html). What you're seeing here is a "zero-width positive lookahead".
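For a separator that is not known in advance, the same idea can be wrapped in a small helper. The sketch below is an illustration rather than part of the original answer: the names SplitKeepSeparator and splitBefore are made up, and it assumes the separator should be treated as literal text, which is why it goes through Pattern.quote before being placed inside the lookahead.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original answer.
public class SplitKeepSeparator {

    // Splits input in front of every occurrence of separator, so each
    // separator stays attached to the token that follows it.
    // Pattern.quote makes regex metacharacters such as "." or "|" literal.
    public static String[] splitBefore(String input, String separator) {
        return input.split("(?=" + Pattern.quote(separator) + ")");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitBefore("Ram-sita-laxman", "-")));
        // prints [Ram, -sita, -laxman]
        System.out.println(Arrays.toString(splitBefore("a.b.c", ".")));
        // prints [a, .b, .c]
    }
}
```

Without the quoting step, a separator such as "." would be read as "any character" and the split would happen almost everywhere.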
I would love to explain more, but my daughter wants to play tea party. :)
Edit: Back!
To explain this, I will first show you a different split operation:
"Ram-sita-laxman".split("");
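This splits your string on every zero-length string. There is a zero-length string between every pair of adjacent characters, so the string is cut into single characters:
["R", "a", "m", "-", "s", "i", "t", "a", "-", "l", "a", "x", "m", "a", "n"]
(On Java 7 and earlier the array also begins with an empty string, "". Since Java 8, a zero-width match at the very start of the input no longer produces a leading empty substring. The same applies to the negative lookaround results further down.)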
Now, I modify my regular expression ("") so that it only matches a zero-length string if it is followed by a dash:
"Ram-sita-laxman".split("(?=-)");
["Ram", "-sita", "-laxman"]
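To make the "zero-length string followed by a dash" idea concrete, the sketch below lists the positions at which the lookahead matches. It is only an illustration: the class name ZeroWidthMatchPositions and the method matchPositions are made up for this example, and it uses java.util.regex.Matcher directly instead of split.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original answer.
public class ZeroWidthMatchPositions {

    // Returns the start index of every match of regex in input.
    public static List<Integer> matchPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // The lookahead matches the empty string right before each dash.
        System.out.println(matchPositions("(?=-)", "Ram-sita-laxman"));
        // prints [3, 8]
    }
}
```

Cutting the string at indices 3 and 8 gives Ram, -sita and -laxman. Nothing is removed because each match is empty.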
In the (?=-) pattern, ?= means "lookahead". More specifically, it means "positive lookahead". Why "positive"? Because there is also negative lookahead (?!), which splits on every zero-length string that is not followed by a dash:
"Ram-sita-laxman".split("(?!-)");
["R", "a", "m-", "s", "i", "t", "a-", "l", "a", "x", "m", "a", "n"]
You can also have positive lookbehind (?<=), which splits on every zero-length string that is preceded by a dash:
"Ram-sita-laxman".split("(?<=-)");
["Ram-", "sita-", "laxman"]
Finally, you can also have negative lookbehind (?<!), which splits on every zero-length string that is not preceded by a dash:
"Ram-sita-laxman".split("(?<!-)");
["R", "a", "m", "-s", "i", "t", "a", "-l", "a", "x", "m", "a", "n"]
These four expressions are collectively known as the lookaround expressions.
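The four variants can be compared side by side with a short program. This is a sketch for experimentation, not part of the original answer: the class name LookaroundSplitDemo and the helper show are made up, and the outputs in the comments assume Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;

// Hypothetical demo class, not from the original answer.
public class LookaroundSplitDemo {

    // Splits the sample string with the given regex and formats the result.
    public static String show(String regex) {
        return Arrays.toString("Ram-sita-laxman".split(regex));
    }

    public static void main(String[] args) {
        // Positive lookahead: cut before each dash.
        System.out.println(show("(?=-)"));   // [Ram, -sita, -laxman]
        // Positive lookbehind: cut after each dash.
        System.out.println(show("(?<=-)"));  // [Ram-, sita-, laxman]
        // Negative lookahead: cut everywhere except before a dash.
        System.out.println(show("(?!-)"));   // [R, a, m-, s, i, t, a-, l, a, x, m, a, n]
        // Negative lookbehind: cut everywhere except after a dash.
        System.out.println(show("(?<!-)"));  // [R, a, m, -s, i, t, a, -l, a, x, m, a, n]
    }
}
```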
I also want to show an example I encountered recently that combines two of the lookaround expressions. Suppose you wish to split a CapitalCase identifier into its tokens:
"MyAwesomeClass" => ["My", "Awesome", "Class"]
You can accomplish this with the following regular expression:
"MyAwesomeClass".split("(?<=[a-z])(?=[A-Z])");
This splits on every zero-length string that is preceded by a lowercase letter ((?<=[a-z])) and followed by an uppercase letter ((?=[A-Z])).
This technique also works with camelCase identifiers.
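As a runnable sketch of the identifier case, the hypothetical helper below (IdentifierSplitter and splitIdentifier are names invented for this example) applies the combined lookbehind and lookahead to a few inputs. It assumes plain ASCII letters, since the character classes are [a-z] and [A-Z].

```java
import java.util.Arrays;

// Hypothetical helper, not from the original answer.
public class IdentifierSplitter {

    // Splits wherever a lowercase ASCII letter is directly followed
    // by an uppercase ASCII letter. No characters are consumed.
    public static String[] splitIdentifier(String identifier) {
        return identifier.split("(?<=[a-z])(?=[A-Z])");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitIdentifier("MyAwesomeClass")));
        // prints [My, Awesome, Class]
        System.out.println(Arrays.toString(splitIdentifier("myAwesomeClass")));
        // prints [my, Awesome, Class]
        System.out.println(Arrays.toString(splitIdentifier("parseXMLFile")));
        // prints [parse, XMLFile]
    }
}
```

The last call shows a limitation: a run of capitals such as an acronym is not separated from the word that follows it, because no lowercase letter precedes that boundary.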