Extracting email addresses in an HTML block in Ruby/Rails

Views: 120 · Posted: 2020/10/29 2:35:03 · Tags: ruby-on-rails ruby regex html-parsing email-integration

Question

I am creating a parser that guards against spamming and email harvesting in a block of text that comes from tinyMCE (so it may or may not contain HTML tags).

I've tried regexes, and so far this one has worked:

/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

The problem is that I need to ignore all email addresses inside mailto hrefs. For example:

<a href="mailto:test@mail.com">test@mail.com</a>

Only the second email address (the link text) should be returned.
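
To make the problem concrete, here is a minimal sketch that runs the regex above against the sample link with `String#scan`. The constant name `EMAIL_PATTERN` and the variable names are only illustrative; they are not part of my helper.

```ruby
# Illustrative names; the pattern is the one quoted above.
EMAIL_PATTERN = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

html = '<a href="mailto:test@mail.com">test@mail.com</a>'

# scan returns every match, including the one inside the href attribute.
matches = html.scan(EMAIL_PATTERN)
p matches
```

Both occurrences come back, which is why a plain gsub with this pattern also rewrites the href.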

For background on what I'm doing: I'm reversing the email addresses in the block, so the example above would look like this:

<a href="mailto:test@mail.com">moc.liam@tset</a>

The problem with my current regex is that it also replaces the address in the href. Is there a way to do this with a single regex, or do I have to check for one and then the other? Can I do this using only gsub, or do I have to use some Nokogiri/Hpricot magic to parse the mailto links? Thanks in advance!

Here are my references:

so.com/questions/504860/extract-email-addresses-from-a-block-of-text

so.com/questions/1376149/regexp-for-extracting-a-mailto-address

I'm also testing with Rubular:

http://rubular.com/

Edit:

Here's my current helper code:

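def email_obfuscator(text)
  # Wraps every matched address in a span and reverses it.
  # Note: this also matches addresses inside href attributes.
  text.gsub(/\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i) do |m|
    "<span class='anti-spam'>#{m.reverse}</span>"
  end
end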

This results in:

<a target="_self" href="mailto:<span class='anti-spam'>moc.liamg@tset</span>"><span class="anti-spam">moc.liamg@tset</span></a>
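
One possible way to stay with a single gsub, sketched here as an assumption rather than a confirmed solution: let the pattern match either a whole HTML tag or an email address, and return tags unchanged from the block. Since the tag alternative consumes the entire `<a href="mailto:...">`, the address inside it is never seen by the email alternative. The names `TAG_PATTERN`, `TAG_OR_EMAIL` and `obfuscate_emails` are hypothetical, and the simple `<[^>]*>` tag pattern assumes attribute values never contain a literal `>` and that the text has no stray `<` characters; an HTML parser such as Nokogiri would likely be more robust for messy input.

```ruby
# Hypothetical names; EMAIL_PATTERN is the regex from the question.
EMAIL_PATTERN = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i
# Simplistic tag matcher: assumes no ">" inside attribute values.
TAG_PATTERN   = /<[^>]*>/
TAG_OR_EMAIL  = Regexp.union(TAG_PATTERN, EMAIL_PATTERN)

def obfuscate_emails(html)
  html.gsub(TAG_OR_EMAIL) do |match|
    if match.start_with?("<")
      # A whole tag (attributes included) is passed through untouched.
      match
    else
      # An address in text content gets reversed and wrapped.
      "<span class='anti-spam'>#{match.reverse}</span>"
    end
  end
end

puts obfuscate_emails('<a href="mailto:test@mail.com">test@mail.com</a>')
```

With the sample link, the href keeps `test@mail.com` while the link text becomes the reversed address wrapped in the span.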
