Nokogiri: Select content between element A and B


Problem description




What's the smartest way to have Nokogiri select all content between the start and the stop element (including start-/stop-element)?

Check example code below to understand what I'm looking for:

require 'rubygems'
require 'nokogiri'

value = Nokogiri::HTML.parse(<<-HTML_END)
  <html>
    <body>
      <p id='para-1'>A</p>
      <div class='block' id='X1'>
        <p class="this">Foo</p>
        <p id='para-2'>B</p>
      </div>
      <p id='para-3'>C</p>
      <p class="that">Bar</p>
      <p id='para-4'>D</p>
      <p id='para-5'>E</p>
      <div class='block' id='X2'>
        <p id='para-6'>F</p>
      </div>
      <p id='para-7'>F</p>
      <p id='para-8'>G</p>
    </body>
  </html>
HTML_END

parent = value.css('body').first

# START element
@start_element = parent.at('p#para-3')
# STOP element
@end_element = parent.at('p#para-7')

The result (return value) should look like this:

<p id='para-3'>C</p>
<p class="that">Bar</p>
<p id='para-4'>D</p>
<p id='para-5'>E</p>
<div class='block' id='X2'>
  <p id='para-6'>F</p>
</div>
<p id='para-7'>F</p>

Update: This is my current solution, though I think there must be something smarter:

@my_content = ""
@selected_node = true

def collect_content(_start)

  if _start == @end_element
    @my_content << _start.to_html
    @selected_node = false
  end

  if @selected_node == true
    @my_content << _start.to_html
    collect_content(_start.next)
  end

end

collect_content(@start_element)

puts @my_content

Solution

A way-too-smart one-liner that uses recursion:

def collect_between(first, last)
  first == last ? [first] : [first, *collect_between(first.next, last)]
end

An iterative solution:

def collect_between(first, last)
  result = [first]
  until first == last
    first = first.next
    result << first
  end
  result
end
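
Both versions assume that last really is a later sibling of first. If it is not, first.next eventually returns nil and the following call fails with a NoMethodError. Below is a minimal sketch of a guarded variant. The names collect_between_safe and FakeNode are hypothetical: FakeNode is only a stand-in for any object that responds to next (as Nokogiri nodes do), so the snippet runs without the gem.

```ruby
# Hypothetical stand-in for a Nokogiri node: anything that responds to #next.
FakeNode = Struct.new(:name, :next)

# Hypothetical guarded variant of collect_between: walks the sibling chain
# from first to last (inclusive) and raises a clear error if the chain
# ends before last is reached.
def collect_between_safe(first, last)
  result = []
  node = first
  until node.nil?
    result << node
    return result if node == last
    node = node.next
  end
  raise ArgumentError, "last is not a following sibling of first"
end

# Build a chain a -> b -> c -> d.
d = FakeNode.new("d", nil)
c = FakeNode.new("c", d)
b = FakeNode.new("b", c)
a = FakeNode.new("a", b)

names = collect_between_safe(b, d).map(&:name)
puts names.inspect # => ["b", "c", "d"]
```

With the document from the question, collect_between_safe(@start_element, @end_element).map(&:to_html).join should produce the markup shown as the expected result. Note that next also returns the whitespace text nodes that sit between elements; if only elements are wanted, Nokogiri's next_element could be used in place of next.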

EDIT: (Short) explanation of the asterisk

It's called the splat operator. It "unrolls" an array:

array = [3, 2, 1]
[4, array]  # => [4, [3, 2, 1]]
[4, *array] # => [4, 3, 2, 1]

some_method(array)  # => some_method([3, 2, 1])
some_method(*array) # => some_method(3, 2, 1)

def other_method(*array); array; end
other_method(1, 2, 3) # => [1, 2, 3] 
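
To see why the recursive one-liner needs the splat, here is a small sketch with the same shape as collect_between, using integers instead of nodes. The helper names range_with_splat and range_without_splat are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helpers that mirror the shape of collect_between, using
# integers instead of nodes so the effect of the splat is easy to see.
def range_with_splat(first, last)
  first == last ? [first] : [first, *range_with_splat(first + 1, last)]
end

def range_without_splat(first, last)
  first == last ? [first] : [first, range_without_splat(first + 1, last)]
end

flat   = range_with_splat(1, 4)
nested = range_without_splat(1, 4)

puts flat.inspect   # => [1, 2, 3, 4]
puts nested.inspect # => [1, [2, [3, [4]]]]
```

Without the splat, every recursive call wraps its result in one more array; with it, the result of the inner call is unrolled into the outer array literal, which gives a flat list.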
