How to detect telephone numbers in a text (and replace them)?
Problem description
I know it can be done for bad words (by checking against an array of preset words), but how can I detect telephone numbers in a long text? I'm building a website in PHP for a client who needs to stop people from putting their mobile phone numbers in the description field (see craigslist, etc.).
Besides that, he's going to need some moderation anyway, but I was wondering if there is a way to block at least the obvious formats like nnn-nnn-nnnn. I'm not asking to block other weird ways of writing numbers, like HeiGHT*/four*/nine, etc.
Recommended answer
Welcome to the world of regular expressions. You're basically going to want to use preg_replace to look for (some pattern) and replace it with a string.
Here's something to start you off:
$text = preg_replace('/\+?[0-9][0-9()\-\s+]{4,20}[0-9]/', '[blocked]', $text);
This looks for an optional plus sign, followed by a digit, followed by between 4 and 20 characters that are digits, brackets, dashes, whitespace or plus signs, followed by a final digit, and replaces each match with the string [blocked].
This catches all the obvious combinations I can think of:
012345 123123
+44 1234 123123
+44(0)123 123123
0123456789
Placename 123456 (although this one will leave 'Placename')
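To check the list above for yourself, something like the following sketch should do. It wraps the same pattern in a small helper and runs it over the sample strings. The function name blockPhoneNumbers and the surrounding "Call ... today" text are made up for illustration.

```php
<?php
// Hypothetical helper name; the pattern is the one from the preg_replace call above.
function blockPhoneNumbers(string $text): string
{
    // preg_replace returns null on a regex error, so fall back to the input.
    return preg_replace('/\+?[0-9][0-9()\-\s+]{4,20}[0-9]/', '[blocked]', $text) ?? $text;
}

// The sample numbers listed above.
$samples = [
    '012345 123123',
    '+44 1234 123123',
    '+44(0)123 123123',
    '0123456789',
    'Placename 123456',
];

foreach ($samples as $sample) {
    // Embed each sample in a sentence to show that the surrounding text survives.
    echo blockPhoneNumbers('Call ' . $sample . ' today'), PHP_EOL;
}
```

Each line should come out as "Call [blocked] today", except the last one, which keeps the word: "Call Placename [blocked] today".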
Note that the pattern also replaces any run of six or more consecutive digits, which might not be desirable!
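If that turns out to be too aggressive, one possible tweak (not part of the pattern above, just a sketch) is to use preg_replace_callback and only block a match when it contains enough digits to plausibly be a phone number. The helper name blockLikelyPhoneNumbers and the default threshold of 9 digits are assumptions for illustration; pick a threshold that suits the numbers your users actually post.

```php
<?php
// Hypothetical helper: only blocks matches containing at least $minDigits digits.
// The default of 9 is an illustrative assumption, not a value from the answer.
function blockLikelyPhoneNumbers(string $text, int $minDigits = 9): string
{
    $pattern = '/\+?[0-9][0-9()\-\s+]{4,20}[0-9]/';

    $result = preg_replace_callback(
        $pattern,
        function (array $match) use ($minDigits): string {
            // Count only the digits, ignoring spaces, brackets, dashes and plus signs.
            $digitCount = preg_match_all('/[0-9]/', $match[0]);
            // Leave short digit runs untouched; replace the longer ones.
            return $digitCount >= $minDigits ? '[blocked]' : $match[0];
        },
        $text
    );

    // preg_replace_callback returns null on a regex error, so fall back to the input.
    return $result ?? $text;
}

echo blockLikelyPhoneNumbers('Order 123456 shipped, call +44 1234 123123'), PHP_EOL;
```

With this threshold the six-digit order number stays and the longer number is replaced. The trade-off is that short numbers such as "Placename 123456" are no longer caught, so this is a judgment call rather than a strict improvement.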