Search for a word in a String


Question

If I am looking for a particular word inside a string, for example "are" in the string "how are you", would a regular indexOf() work faster and better, or a regex matches()?

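String testStr = "how are you";
String lookUp = "are";

// METHOD 1: plain substring search
if (testStr.indexOf(lookUp) != -1)
{
    System.out.println("Found!");
}

// OR
// METHOD 2: regex; String.matches() must match the entire string, hence the .* on both sides
if (testStr.matches(".*" + lookUp + ".*"))
{
    System.out.println("Found!");
}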

Which of the two methods above is a better way of looking for a string inside another string? Or is there a much better alternative?

  • Ivard

Solution

If you don't care whether it's actually the entire word you're matching, then indexOf() will be a lot faster.
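To make the substring case concrete, here is a minimal sketch. The class and method names (SubstringSearch, containsSubstring) are made up for illustration and are not from the question. Note that the second call also reports a hit, because "are" occurs inside "harebrained".

```java
class SubstringSearch {
    /** Returns true if needle occurs anywhere in haystack, even inside a longer word. */
    static boolean containsSubstring(String haystack, String needle) {
        // indexOf returns -1 when the substring is absent;
        // haystack.contains(needle) would give the same answer
        return haystack.indexOf(needle) != -1;
    }

    public static void main(String[] args) {
        System.out.println(containsSubstring("how are you", "are"));        // true
        System.out.println(containsSubstring("a harebrained idea", "are")); // true (substring hit)
        System.out.println(containsSubstring("how is it going", "are"));    // false
    }
}
```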

If, on the other hand, you need to be able to differentiate between are, harebrained, aren't etc., then you need a regex: \bare\b will only match are as an entire word (\\bare\\b in Java).
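A whole-word check along these lines might look like the sketch below. WholeWordSearch and containsWord are hypothetical names, and wrapping the search term in Pattern.quote() is an addition of mine so that a term containing regex metacharacters is treated literally. It uses Matcher.find() rather than String.matches(), since find() looks for the pattern anywhere in the input.

```java
import java.util.regex.Pattern;

class WholeWordSearch {
    /** Returns true if word occurs in text as a whole word (delimited by \b). */
    static boolean containsWord(String text, String word) {
        // Pattern.quote escapes any regex metacharacters in the search term
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(containsWord("how are you", "are"));        // true
        System.out.println(containsWord("a harebrained idea", "are")); // false
        System.out.println(containsWord("you aren't late", "are"));    // false
    }
}
```

If the same word is looked up many times, compiling the Pattern once and reusing it avoids recompiling on every call.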

\b is a word boundary anchor: it matches the empty position between an alphanumeric character (letter, digit, or underscore) and a non-alphanumeric character, or between an alphanumeric character and the start or end of the string.

Caveat: This also means that if your search term isn't actually a word (let's say you're looking for ###), then these word boundary anchors will only match in a string like aaa###zzz, but not in +++###+++.
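The caveat can be reproduced with a short sketch. The second method is not from the answer: it is one possible workaround, assuming that "standalone" means delimited by whitespace or by the ends of the string, and it uses lookarounds instead of \b. All names here are hypothetical.

```java
import java.util.regex.Pattern;

class NonWordTermSearch {
    /** The \b approach: only matches where a word/non-word transition exists. */
    static boolean withWordBoundaries(String text, String term) {
        return Pattern.compile("\\b" + Pattern.quote(term) + "\\b").matcher(text).find();
    }

    /**
     * Assumed alternative: the term must not be preceded or followed by a
     * non-whitespace character, so it has to stand alone between whitespace
     * or at either end of the string.
     */
    static boolean whitespaceDelimited(String text, String term) {
        return Pattern.compile("(?<!\\S)" + Pattern.quote(term) + "(?!\\S)").matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(withWordBoundaries("aaa###zzz", "###"));     // true
        System.out.println(withWordBoundaries("+++###+++", "###"));     // false
        System.out.println(withWordBoundaries("see ### here", "###"));  // false
        System.out.println(whitespaceDelimited("see ### here", "###")); // true
        System.out.println(whitespaceDelimited("aaa###zzz", "###"));    // false
    }
}
```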

Further caveat: Java has by default a limited worldview on what constitutes an alphanumeric character. Only ASCII letters/digits (plus the underscore) count here, so word boundary anchors will fail on words like élève, relevé or ärgern. Read more about this (and how to solve this problem) here.
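One way to get Unicode-aware word boundaries is the Pattern.UNICODE_CHARACTER_CLASS flag (available since Java 7, also writable inline as (?U)). The sketch below is my own illustration, not necessarily the fix the linked material describes, and the names are hypothetical. It only demonstrates behavior with the flag set, since the default behavior of \b without it may differ between Java versions. The accented letters are written as Unicode escapes so the example does not depend on source file encoding.

```java
import java.util.regex.Pattern;

class UnicodeWordSearch {
    /** Whole-word search where \b follows Unicode's definition of a word character. */
    static boolean containsWordUnicode(String text, String word) {
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.UNICODE_CHARACTER_CLASS);
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        String eleve = "\u00e9l\u00e8ve"; // "élève"
        System.out.println(containsWordUnicode("un " + eleve + " studieux", eleve)); // true
        System.out.println(containsWordUnicode("les " + eleve + "s", eleve));        // false
    }
}
```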
