Regular Expressions on Punctuation

Question

So I'm completely new to regular expressions, and I'm trying to use Java's java.util.regex to find punctuation in input strings. I won't know ahead of time what kind of punctuation I might get, except that (1) !, ?, ., and ... are all valid punctuation, and (2) "<" and ">" mean something special and don't count as punctuation. The program itself builds phrases pseudo-randomly, and I want to strip off the punctuation at the end of a sentence before it goes through the random process.

I can match entire words with any punctuation, but the matcher just gives me indexes for that word. In other words:

Pattern p = Pattern.compile("(.*\\!)*?");
Matcher m = p.matcher([some input string]);

will match any word with a "!" at the end. For example:

String inputString = "It is a warm Summer day!";
Pattern p = Pattern.compile("(.*\\!)*?");
Matcher m = p.matcher(inputString);
String match = inputString.substring(m.start(), m.end());

results in --> String match ~ "day!"

But I want the Matcher to index just the "!", so I can split it off.

I could probably write a separate case for each kind of punctuation I might get and use String.substring(...) for each one, but I'm hoping there's just some mistake in how I'm using regular expressions here.

Recommended answer

I would try a character class regex similar to

"[.!?\\-]"

Add whatever characters you wish to match inside the []s. Be careful to escape any characters that have a special meaning to the regex parser; inside a character class that mainly means "-", "]", "^" and "\".
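If the set of punctuation really isn't known in advance, listing characters by hand may not scale. One possible variation (not part of the original answer) is to start from Java's predefined \p{Punct} class and subtract "<" and ">" with a character class intersection. The class name PunctuationClass and the helper isPunctuation are made up for this sketch, and it assumes that "punctuation" means ASCII punctuation only.

```java
import java.util.regex.Pattern;

class PunctuationClass {
    // Assumption: "punctuation" is ASCII \p{Punct} minus '<' and '>'.
    // "&&" intersects \p{Punct} with "everything except < and >".
    static final Pattern PUNCTUATION = Pattern.compile("[\\p{Punct}&&[^<>]]");

    /** Returns true if the single character c counts as punctuation here. */
    static boolean isPunctuation(char c) {
        return PUNCTUATION.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // Print the classification of a few sample characters.
        for (char c : "!?.-<>a".toCharArray()) {
            System.out.println(c + " -> " + isPunctuation(c));
        }
    }
}
```

Note that \p{Punct} also covers characters such as quotes and brackets, so this is broader than the "[.!?\\-]" class above.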

You then have to iterate through the matches by calling Matcher.find() until it returns false.
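As a rough sketch of that loop, the following collects the index of each punctuation character using the character class from the answer. The class name PunctuationFinder and the method findPunctuationIndexes are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PunctuationFinder {
    // Character class from the answer: '.', '!', '?' and an escaped '-'.
    private static final Pattern PUNCTUATION = Pattern.compile("[.!?\\-]");

    /** Returns the start index of every punctuation character in input. */
    static List<Integer> findPunctuationIndexes(String input) {
        List<Integer> indexes = new ArrayList<>();
        Matcher m = PUNCTUATION.matcher(input);
        // find() advances to the next match and returns false when none remain.
        while (m.find()) {
            indexes.add(m.start());
        }
        return indexes;
    }

    public static void main(String[] args) {
        String inputString = "It is a warm Summer day!";
        for (int index : findPunctuationIndexes(inputString)) {
            System.out.println(index + " -> " + inputString.charAt(index));
        }
    }
}
```

Because each match is a single character, m.start() is the index of the punctuation itself, which is what is needed to split it off with String.substring(...). Note that start() and end() are only valid after a successful find() (or matches()/lookingAt()); calling them before that throws an IllegalStateException.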
