Validate the presence of a URL in HTML text
Question
I would like to detect whether HTML text sent from a form contains one or more URLs, and filter it if so.
For example, I send this HTML from a form:
RESOURCES<br></u></b><a target="_blank" rel="nofollow" href="http://stackoverflow.com/users/778094/hyperrjas">http://stackoverflow.com/users/778094/hyperrjas</a> <br><a target="_blank" rel="nofollow" href="https://github.com/hyperrjas">https://github.com/hyperrjas</a> <br><a target="_blank" rel="nofollow" href="http://www.linkedin.com/pub/juan-ardila-serrano/11/2a7/62">http://www.linkedin.com/pub/juan-ardila-serrano/11/2a7/62</a> <br>
I do not want to allow any URLs in the HTML text. The validation could look something like this:
validate :no_urls

def no_urls
  if text_contains_url
    errors.add(:url, I18n.t("mongoid.errors.models.profile.attributes.url.urls_are_not_allowed_in_this_text", url: url))
  end
end
How can I check whether the HTML text contains one or more URLs?
Recommended answer
Mattherick's answer only works if the string does not contain a colon (":").
With Ruby 1.9.3, the right fix is to pass a second parameter to URI.extract that restricts the schemes it accepts.
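As a rough illustration of that second parameter: URI.extract accepts an optional list of schemes and returns only URIs that use one of them. The sample string below is made up for this sketch, and newer Ruby versions mark URI.extract as obsolete (it prints a warning in verbose mode), so treat this as a sketch rather than a recommendation for new code.

```ruby
require "uri"

# Made-up sample: one real link plus a word followed by a colon.
text = 'RESOURCES: <a href="http://example.com/page">my page</a>'

# Restricting the schemes keeps only http/https/mailto URIs, so
# "RESOURCES:" cannot be picked up as a URI with the scheme "RESOURCES".
links = URI.extract(text, %w[http https mailto])

puts links.inspect # => ["http://example.com/page"]
```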
Also, if an email address is added as plain text, this code does not catch it. The fix for that is:
html_text = "html text with email address e.g. info@test.com"
email_address = html_text.match(/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i)[0]
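One caveat with the snippet above: match returns nil when the text has no email address, so calling [0] on the result would raise NoMethodError. A small sketch of a nil-safe variant follows, using String#[] with the same pattern. The constant name EMAIL_PATTERN and the second sample string are assumptions made for illustration.

```ruby
# Same pattern as in the answer; the constant name is hypothetical.
EMAIL_PATTERN = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i

with_email    = "html text with email address e.g. info@test.com"
without_email = "html text with no contact details"

# String#[] returns the matched substring, or nil when nothing matches.
found   = with_email[EMAIL_PATTERN]
missing = without_email[EMAIL_PATTERN]

puts found.inspect   # => "info@test.com"
puts missing.inspect # => nil
```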
So this is the code that works correctly for me:
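def no_urls
  # %w takes whitespace-separated words; commas would become part of each name
  whitelist = %w(attr1 attr2 attr3 attr4)
  attributes.select { |key, _value| whitelist.include?(key) }.each do |key, value|
    text = value.to_s
    # Restrict extraction to link schemes so that text such as "note:" is ignored
    links = URI.extract(text, %w(http https mailto))
    # String#[] returns nil instead of raising when there is no email address
    email_address = text[/[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i]
    next if links.empty? && email_address.nil?
    logger.info((links.first || email_address).inspect)
    errors.add(key, I18n.t("mongoid.errors.models.cv.attributes.no_urls"))
  end
end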
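To try the same check outside a model, the logic can be pulled into a plain method. This is only a sketch under stated assumptions: the helper name contains_links? and the constants EMAIL_RE and LINK_SCHEMES are hypothetical, and the sample inputs are shortened from the question.

```ruby
require "uri"

# Hypothetical names; the pattern and schemes mirror the validator above.
EMAIL_RE     = /[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}/i
LINK_SCHEMES = %w[http https mailto].freeze

# Returns true when the text contains a URL with an allowed scheme
# or something that looks like a plain-text email address.
def contains_links?(text)
  text = text.to_s
  !URI.extract(text, LINK_SCHEMES).empty? || !text[EMAIL_RE].nil?
end

puts contains_links?('<a href="https://github.com/hyperrjas">https://github.com/hyperrjas</a>') # => true
puts contains_links?("contact: info@test.com")     # => true
puts contains_links?("<b>RESOURCES:</b> none yet") # => false
```

Keeping the check in one predicate means the model validator only has to call it once per whitelisted attribute and add the error when it returns true.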
Regards!