How to build a regular expression (regex) for slang and emoticons


Question

I need to build a regex to match slang terms such as lol, lmao, and imo, as well as emoticons such as :), :P, and ;).

I followed the example at http://www.coderanch.com/t/497238/java/java/Regular-Expression-Detecting-Emoticons. However, that approach is failing for me.

For example, let's say I need to match the slang term "od". I create a Pattern as follows:

Pattern pattern = Pattern.compile(Pattern.quote("od"));

Now suppose I search for "od" in the test sentence "some methods are bad." Empirically, there is one match, inside the word "methods", which is not what I want.

I did read some of the Javadoc and some tutorials on regex in Java, but I still can't figure this out.

By the way, I am using Java 6 (though I have been looking at and referencing the Java 5 API docs).

If regex is not the best way to go, I am open to other solutions as well. Thanks in advance for any help or pointers. The following code, which is based on the link above, gets me 3 matches.

String regex = "od";
Pattern pattern = Pattern.compile(Pattern.quote(regex));
String str = "some methods are bad od od more text";
Matcher matcher = pattern.matcher(str);
while(matcher.find()) {
    System.out.println(matcher.group());
}

The following code returns no matches; it is based on the responses so far.

String regex = "\bod\b";
Pattern pattern = Pattern.compile(regex);
//Pattern pattern = Pattern.compile(Pattern.quote(regex)); //this fails
String str = "some methods are bad od od more text";
Matcher matcher = pattern.matcher(str);
while(matcher.find()) {
    System.out.println(matcher.group());
}
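A likely reason the snippet above finds nothing: inside a Java string literal, "\b" is the backspace character (U+0008), so the regex engine is asked to find backspace + "od" + backspace rather than a word boundary. Doubling the backslash ("\\b") passes \b through to the regex engine. The commented-out Pattern.quote variant cannot work either, because Pattern.quote treats its entire argument as literal text. The following is a minimal sketch of the difference; the class name WordBoundaryEscapes and the helper countMatches are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordBoundaryEscapes {
    // Hypothetical helper: counts how many times the regex matches in the text.
    static int countMatches(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String str = "some methods are bad od od more text";
        // "\b" in a Java string literal is a backspace character,
        // so this regex looks for backspace + "od" + backspace.
        System.out.println(countMatches("\bod\b", str));   // prints 0
        // "\\b" reaches the regex engine as \b, the word-boundary assertion.
        System.out.println(countMatches("\\bod\\b", str)); // prints 2
    }
}
```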

After the two helpful responses below, I am posting the correct/desired code snippet here.

String regex = "(\\bod\\b)|(\\blmao\\b)";
Pattern pattern = Pattern.compile(regex);
String str = "some methods are bad od od more text lmao more text";
Matcher matcher = pattern.matcher(str);
while(matcher.find()) {
    System.out.println(matcher.group());
}

This code works as desired: empirically, it gives me 3 matches (2 for od and 1 for lmao). Sorry, I wish I were stronger with regex in Java (and with regex in general). Thanks for your help.
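With a longer slang list, writing the alternation by hand gets tedious. One possible way to generalize the snippet above is to build the pattern from a list: quote each term with Pattern.quote and wrap the whole alternation in a single pair of \b assertions. This is only a sketch under a few assumptions: the class name SlangMatcher and its helpers are made up, the list is non-empty, case-insensitive matching is wanted, and every term starts and ends with a letter or digit (otherwise \b will not behave as expected). It avoids Java 7+ syntax so it should also compile on Java 6.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SlangMatcher {
    // Builds \b(?:term1|term2|...)\b, with each term quoted as literal text.
    // Assumes a non-empty list of terms made of word characters.
    static Pattern buildPattern(List<String> terms) {
        StringBuilder alternation = new StringBuilder();
        for (String term : terms) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(term));
        }
        return Pattern.compile("\\b(?:" + alternation + ")\\b",
                Pattern.CASE_INSENSITIVE);
    }

    // Collects every match of the pattern in the text, in order of appearance.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> found = new ArrayList<String>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        Pattern pattern = buildPattern(Arrays.asList("od", "lmao"));
        String str = "some methods are bad od od more text lmao more text";
        System.out.println(findAll(pattern, str)); // prints [od, od, lmao]
    }
}
```

Quoting the individual terms, rather than the whole expression, keeps the \b assertions active while still protecting any regex metacharacters that a term might contain.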

Recommended answer

[:;]-?[DP()]

This handles combinations of ":" or ";", followed by an optional "-", followed by "D", "P", ")" or "(".
E.g. :P :-( ;D etc.

Just add more combinations...
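As a rough illustration, the pattern above can be used from Java as follows. Because emoticons consist of punctuation, \b is not a reliable delimiter for them. The second pattern below is an assumption on my part rather than part of the answer: it uses lookarounds so that an emoticon only counts when it is surrounded by whitespace or the ends of the string. The names EmoticonMatcher, LOOSE, STRICT and findAll are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmoticonMatcher {
    // The pattern from the answer: eyes, optional nose, mouth.
    static final Pattern LOOSE = Pattern.compile("[:;]-?[DP()]");
    // Assumed stricter variant: no non-whitespace character may touch the emoticon.
    static final Pattern STRICT = Pattern.compile("(?<!\\S)[:;]-?[DP()](?!\\S)");

    // Collects every match of the pattern in the text, in order of appearance.
    static List<String> findAll(Pattern pattern, String text) {
        List<String> found = new ArrayList<String>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String str = "great :) see you ;-P note:Done";
        System.out.println(findAll(LOOSE, str));  // prints [:), ;-P, :D]
        System.out.println(findAll(STRICT, str)); // prints [:), ;-P]
    }
}
```

The loose pattern also picks up ":D" inside "note:Done"; whether that counts as a false positive depends on the text being processed.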

Have fun.
