How to detect email addresses within arbitrary strings

Problem description

I'm using the following code to detect an email address in a string. It works fine except for addresses whose local part is purely numeric, such as "536264846@gmail.com". Is there a way to work around this Apple bug? Any help would be appreciated!

NSString *string = @"536264846@gmail.com";
NSError *error = nil;
// NSTextCheckingTypeLink detects URLs; email addresses come back as mailto: URLs.
NSDataDetector *detector = [NSDataDetector dataDetectorWithTypes:NSTextCheckingTypeLink error:&error];
NSArray *matches = [detector matchesInString:string
                                     options:0
                                       range:NSMakeRange(0, [string length])];
for (NSTextCheckingResult *match in matches) {
    if ([match.URL.scheme isEqualToString:@"mailto"]) {
        // Skip the scheme and the following ':' to get the bare address.
        NSString *email = [match.URL.absoluteString substringFromIndex:match.URL.scheme.length + 1];
        NSLog(@"email :%@", email);
    } else {
        NSLog(@"[match URL] :%@", [match URL]);
    }
}

Edit: log result is: [match URL] :http://gmail.com

Solution

What I did in the past:

  • Tokenize the input, e.g., split it into tokens on spaces (most other common separators can be valid within an email address). This may not be necessary if the regular expression is not anchored, but I'm not sure how well it would work without the "^" and "$" anchors (which I added to what was shown on the web site).

  • Keep in mind that addresses may take the form '"string" <address>' as well as just 'address'.

  • In each token, look for '@', as it's probably the best indicator you have that it's an email address.

  • Run the token through the regular expression shown on this Email Detector comparison site (in my testing, the one marked #1 as of 3/21/2013 worked best).
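The steps above could be sketched roughly as follows. This is only an illustration, not the code from the original project: `CandidateEmails` and `SimplifiedValidator` are hypothetical names, and the short pattern is a deliberately simplified stand-in for the full regular expression, used here only to keep the example self-contained.

```objc
#import <Foundation/Foundation.h>

// Simplified stand-in pattern (an assumption for this sketch, NOT the full
// expression from the comparison site). Anchored, like the one in the answer.
static NSRegularExpression *SimplifiedValidator(void) {
    return [NSRegularExpression
        regularExpressionWithPattern:@"^[A-Z0-9._%+-]+@[A-Z0-9-]+(\\.[A-Z0-9-]+)+$"
                             options:NSRegularExpressionCaseInsensitive
                               error:NULL];
}

// Hypothetical helper: splits `text` on whitespace, keeps tokens containing '@',
// strips wrapping punctuation such as the angle brackets in "Name" <address>,
// and returns the tokens that the validator accepts.
static NSArray<NSString *> *CandidateEmails(NSString *text, NSRegularExpression *validator) {
    NSMutableArray<NSString *> *found = [NSMutableArray array];
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSCharacterSet *wrappers = [NSCharacterSet characterSetWithCharactersInString:@"<>,;()"];

    for (NSString *rawToken in [text componentsSeparatedByCharactersInSet:separators]) {
        // Cheap pre-filter: a token without '@' cannot be an address.
        if ([rawToken rangeOfString:@"@"].location == NSNotFound) {
            continue;
        }
        NSString *token = [rawToken stringByTrimmingCharactersInSet:wrappers];
        NSRange whole = NSMakeRange(0, token.length);
        if ([validator firstMatchInString:token options:0 range:whole] != nil) {
            [found addObject:token];
        }
    }
    return found;
}
```

Splitting on whitespace means a quoted local part that contains a space would be broken apart, so this sketch trades some completeness for simplicity.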

What I did was put the regular expression in a text file, so I didn't need to escape it:

^(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){255,})(?!(?:(?:\x22?\x5C[\x00-\x7E]\x22?)|(?:\x22?[^\x5C\x22]\x22?)){65,}@)(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22))(?:\.(?:(?:[\x21\x23-\x27\x2A\x2B\x2D\x2F-\x39\x3D\x3F\x5E-\x7E]+)|(?:\x22(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x21\x23-\x5B\x5D-\x7F]|(?:\x5C[\x00-\x7F]))*\x22)))*@(?:(?:(?!.*[^.]{64,})(?:(?:(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*\.){1,126}){1,}(?:(?:[a-z][a-z0-9]*)|(?:(?:xn--)[a-z0-9]+))(?:-[a-z0-9]+)*)|(?:\[(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){7})|(?:(?!(?:.*[a-f0-9][:\]]){7,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,5})?)))|(?:(?:IPv6:(?:(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){5}:)|(?:(?!(?:.*[a-f0-9]:){5,})(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3})?::(?:[a-f0-9]{1,4}(?::[a-f0-9]{1,4}){0,3}:)?)))?(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))(?:\.(?:(?:25[0-5])|(?:2[0-4][0-9])|(?:1[0-9]{2})|(?:[1-9]?[0-9]))){3}))\]))$

Defined an ivar:

NSRegularExpression *reg;

Created the regular expression:

NSString *fullPath = [[NSBundle mainBundle] pathForResource:@"EMailRegExp" ofType:@"txt"];
NSString *pattern = [NSString stringWithContentsOfFile:fullPath encoding:NSUTF8StringEncoding error:NULL];
NSError *error = nil;
reg = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];
assert(reg && !error);
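One thing that may be worth guarding against when the pattern lives in a text file: many editors end the file with a newline, and without the comments-and-whitespace option that trailing newline becomes a literal part of the pattern after the "$" anchor. A minimal sketch of trimming the file contents before compiling, where `CompileEmailPattern` is a hypothetical helper name and not part of the original answer:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: trims surrounding whitespace and newlines from the text
// read from the pattern file, then compiles it case-insensitively.
static NSRegularExpression *CompileEmailPattern(NSString *fileContents, NSError **error) {
    NSString *pattern = [fileContents stringByTrimmingCharactersInSet:
                         [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    return [NSRegularExpression regularExpressionWithPattern:pattern
                                                     options:NSRegularExpressionCaseInsensitive
                                                       error:error];
}
```

The result can be assigned to `reg` in place of the direct `regularExpressionWithPattern:` call, passing the string returned by `stringWithContentsOfFile:`.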

Then wrote a method to do the comparison:

- (BOOL)isValidEmail:(NSString *)string
{
    NSTextCheckingResult *match = [reg firstMatchInString:string options:0 range:NSMakeRange(0, [string length])];
    return match ? YES : NO;
}
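In contrast to the anchored check above, which validates one token at a time, an unanchored pattern can scan a whole string directly. The sketch below assumes a much simpler pattern than the full expression (it is a stand-in chosen for brevity, not a rigorous validator), and `EmailsInString` is a hypothetical helper name:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every substring of `text` that matches a
// simplified, unanchored email pattern (an assumption for this sketch).
static NSArray<NSString *> *EmailsInString(NSString *text) {
    NSRegularExpression *scanner = [NSRegularExpression
        regularExpressionWithPattern:@"[A-Z0-9._%+-]+@[A-Z0-9-]+(?:\\.[A-Z0-9-]+)+"
                             options:NSRegularExpressionCaseInsensitive
                               error:NULL];
    NSMutableArray<NSString *> *emails = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, text.length);
    // Each result's range identifies one matched address inside `text`.
    for (NSTextCheckingResult *match in [scanner matchesInString:text options:0 range:whole]) {
        [emails addObject:[text substringWithRange:match.range]];
    }
    return emails;
}
```

Because the pattern is not anchored, surrounding text and trailing punctuation are simply left out of each match, so no tokenizing step is needed; the trade-off is that a loose pattern accepts some strings the full expression would reject.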

EDIT: I've turned the above into a project on GitHub.

EDIT2: For an alternative that is less rigorous but faster, see the comment section of this question.
