Is there a way to escape non-alphanumeric characters in Nokogiri css?


Question


I have an anchor tag:

file.html#stuff-morestuff-CHP-1-SECT-2.1

Trying to pull the referenced content in Nokogiri:

documentFragment.at_css('#stuff-morestuff-CHP-1-SECT-2.1')

fails with the error:

unexpected '.1' after '[#<Nokogiri::CSS::Node:0x007fd1a7df9b40 @type=:CONDITIONAL_SELECTOR, @value=[#<Nokogiri::CSS::Node:0x007fd1a7df9b90 @type=:ELEMENT_NAME, @value=["*"]>, #<Nokogiri::CSS::Node:0x007fd1a7df9cd0 @type=:ID, @value=["#unixnut4-CHP-1-SECT-2"]>]>]' (Nokogiri::CSS::SyntaxError)

Just trying to talk through this: I think Nokogiri is complaining about the .1 in the selector ID, because . is not valid in an HTML id.

I don't own the content, so I really don't want to go through and fix all the bad IDs if it is avoidable. Is there a way to escape non-alphanumeric characters in the selector passed to a Nokogiri .css() call?

Solution

Assuming your HTML looks something like this:

<div id='stuff-morestuff-CHP-1-SECT-2.1'>foo</div>

The string in question, stuff-morestuff-CHP-1-SECT-2.1, is a valid HTML ID, but #stuff-morestuff-CHP-1-SECT-2.1 isn't a valid CSS selector as written: an unescaped . starts a class selector, and a class name cannot begin with a digit.

You should be able to escape the . with a backslash character, i.e. this is a valid CSS selector:

#stuff-morestuff-CHP-1-SECT-2\.1

Unfortunately this doesn't seem to work in Nokogiri; there may be a bug in the CSS to XPath translation that it performs. (It does work in the browser.)

You can get around this by just checking the id attribute directly:

documentFragment.at_css('*[id="stuff-morestuff-CHP-1-SECT-2.1"]')
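
Since the IDs come from hrefs in content you don't control, it may help to build that attribute selector from the link itself. The following is only a sketch: id_attribute_selector is a hypothetical helper name (not part of Nokogiri), and it assumes the fragment is used as the id verbatim, with no percent-decoding. It refuses IDs containing a double quote or backslash rather than guessing how Nokogiri's CSS parser treats escapes inside a quoted string.

```ruby
require "uri"

# Hypothetical helper (not a Nokogiri API): turn an href such as
# "file.html#stuff-morestuff-CHP-1-SECT-2.1" into an attribute selector
# that avoids the '#id' syntax entirely.
def id_attribute_selector(href)
  fragment = URI.parse(href).fragment
  return nil if fragment.nil? || fragment.empty?

  # Assumption: we do not rely on escape handling inside the quoted
  # attribute value, so reject the two characters that would need it.
  if fragment.include?('"') || fragment.include?("\\")
    raise ArgumentError, "id needs escaping: #{fragment.inspect}"
  end

  %(*[id="#{fragment}"])
end

selector = id_attribute_selector("file.html#stuff-morestuff-CHP-1-SECT-2.1")
puts selector
# prints *[id="stuff-morestuff-CHP-1-SECT-2.1"]
```

With Nokogiri loaded, the resulting string would then be passed along as documentFragment.at_css(selector).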

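Even if backslash escaping worked, you would probably still have to check the id attribute like this if its value started with a digit. That is valid in HTML but cannot be written as a plain CSS ID selector: CSS only allows it through a code-point escape (for example \31 followed by a space for a leading 1), which depends on the same escape handling.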

You could also use XPath, which has an id function that you can use here:

documentFragment.xpath("id('stuff-morestuff-CHP-1-SECT-2.1')")
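
If the IDs are fed in from data rather than typed by hand, the XPath expression needs some care with quoting, because XPath 1.0 string literals have no escape sequences. Below is a minimal sketch of one way to handle that; xpath_id_expression is a hypothetical helper name, and whether id() resolves elements in a given fragment is assumed here rather than verified.

```ruby
# Hypothetical helper (not a Nokogiri API): build an XPath id() call
# for an arbitrary ID string.
def xpath_id_expression(id)
  literal =
    if !id.include?("'")
      "'#{id}'"                      # common case: single-quoted literal
    elsif !id.include?('"')
      %("#{id}")                     # contains ' only: double-quoted literal
    else
      # Contains both quote kinds: XPath 1.0 has no escapes, so join
      # single-quoted pieces with a literal "'" using concat().
      parts = id.split("'", -1).map { |part| "'#{part}'" }
      "concat(#{parts.join(%(, "'", ))})"
    end
  "id(#{literal})"
end

puts xpath_id_expression("stuff-morestuff-CHP-1-SECT-2.1")
# prints id('stuff-morestuff-CHP-1-SECT-2.1')
```

The returned string can be passed to documentFragment.xpath in place of the hand-written expression above.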
