How can I use Ruby's Sanitize/Nokogiri to access untagged elements?

Views: 63 · Published: 2021/6/8 18:49:09 · Tags: ruby, nokogiri, sanitize

Problem description

I'm trying to build a Sanitize transformer that accepts potentially malformed HTML input with elements outside of any tags at all, such as in this example:

out of a tag<p>in a tag</p>out again!

I want to have the transformer wrap any non-tagged elements in <p> tags so that the above transforms into:

<p>out of a tag</p><p>in a tag</p><p>out again!</p>

Unfortunately, I can't figure out how to select the untagged element because it's not a node. I'm sure I'm missing something here. Can someone give me a nudge in the right direction?

Solution

require 'nokogiri'

html = 'out of a tag<p>in a tag</p>out again!'

Nokogiri::HTML(html).at_css('body').children.
  map {|x| '<p>' + x.text + '</p>' }.join('')
#=> "<p>out of a tag</p><p>in a tag</p><p>out again!</p>"

Text is stored in text nodes. Because CSS selectors cannot match text nodes, you have to reach them another way, for example through Nokogiri::XML::Node#children.
