Can XPath expressions access shadow-root elements?


Question

I am currently scraping news article sites. While extracting the main content, I ran into a problem: many of the articles have embedded tweets in them like these:

I test XPath expressions with XPath Helper (a Chrome add-on) to check whether I can reach the content, and then add the expression to my Scrapy (Python) code. However, elements inside a #shadow-root element seem to be outside the scope of the DOM. I am looking for a way to get the content inside these types of elements, preferably with XPath.

Recommended answer

Most web scrapers, including Scrapy, don't support the Shadow DOM, so you will not be able to access elements in shadow trees at all.

And even if a web scraper did support the Shadow DOM, XPath has no support for it at all. Only CSS selectors are supported to some extent, as documented in the CSS Scoping spec.
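
One way to see why a static scraper comes up empty: a shadow tree is usually attached by JavaScript at runtime, so it is not part of the markup the scraper downloads. The sketch below is only an illustration under stated assumptions. It uses Python's standard-library `xml.etree.ElementTree` (which implements just a subset of XPath) rather than Scrapy's own selectors, and the page fragment, the `tweet-embed` element, and the `host` id are all made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup as a scraper would download it (not from a real site).
# The <tweet-embed> host element is empty here: in a browser, a script would
# attach a shadow root to it later, but that never happens without JavaScript.
RAW_HTML = """
<html>
  <body>
    <article>
      <p>Article text that is part of the regular DOM.</p>
      <tweet-embed id="host"></tweet-embed>
      <script>/* would attach a shadow root to the host at runtime */</script>
    </article>
  </body>
</html>
"""

root = ET.fromstring(RAW_HTML)

# Regular (light DOM) content is reachable with an XPath-style query.
paragraphs = [p.text for p in root.findall(".//article/p")]

# The shadow host itself is found, because it is an ordinary element...
host = root.find(".//*[@id='host']")

# ...but it has no child elements: the shadow tree is simply not in the markup.
host_children = list(host)

print("paragraphs:", paragraphs)
print("host found:", host is not None)
print("elements under host:", host_children)
```

The query finds the article paragraph and the host element, but nothing beneath the host, which matches what the question describes when testing XPath expressions against embedded tweets.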
