Problem with Ruby Regular Expression
Question
I have this HTML code, all on a single line:
<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3><h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>
Here is a more readable version split across lines (which I can't use):
<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>
<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>
I'm trying to extract just the URLs with this regex:
/<h3 class="r"><a href="(.*)">(.*)<\/a>/
It returns:
www.google.com">fkdsafjldsajl</a></h3><h3 class='r'><a href="www.google.com"
What can I do to make it stop when it finds a " ?
Recommended answer
Sigh. Regex and HTML are such awkward bedfellows:
require 'nokogiri'
html = %q{<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3><h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>}
doc = Nokogiri::HTML(html)
puts doc.css('a').map{ |a| a['href'] }
# >> www.google.com
# >> www.google.com
This will find them, whether they are deeply nested or all on one line.
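If a regex is still wanted for a quick one-off, the runaway match in the question comes from (.*) being greedy: it extends to the last " on the line that still lets the pattern match. The following is only a minimal sketch of two common ways to stop at the first closing quote, not a replacement for a real HTML parser. The variable names (greedy, urls, lazy_urls) are illustrative. The pattern matches only the <a href="..."> part, because the sample HTML quotes the class attribute with single quotes.

```ruby
html = %q{<h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3><h3 class='r'><a href="www.google.com">fkdsafjldsajl</a></h3>}

# Greedy: (.*) runs on to the last "> on the line, as seen in the question.
greedy = html[/<a href="(.*)">/, 1]

# Negated character class: [^"]* can never cross a double quote.
urls = html.scan(/<a href="([^"]*)">/).flatten

# Lazy quantifier: .*? stops at the first "> that lets the match succeed.
lazy_urls = html.scan(/<a href="(.*?)">/).flatten

puts urls
```

Both variants assume the attribute is double-quoted and written exactly as in the sample. Any variation in quoting, spacing, or attribute order would break them, which is why the Nokogiri approach above is the more robust choice.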