How to load only HTML from web pages in Selenium
I need only the HTML of the requested page, without CSS and JavaScript.
If you need Selenium for web scraping, then strictly speaking you still need the JavaScript and CSS files, since they can play a significant part in how the page loads and renders. For example, parts of a page may be loaded by additional AJAX calls or inserted by custom JavaScript logic, so they will be missing if scripts never run.
Also, if you want only the HTML part of a page, why do you need to involve a real browser at all?
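To illustrate the point about not needing a real browser, here is a minimal sketch that fetches only the HTML document using Python's standard library. The function name fetch_html and the User-Agent value are made up for this example and are not part of the original answer. A plain HTTP request like this never executes JavaScript and never downloads the linked CSS, so content inserted by scripts will not appear in the result.

```python
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10.0):
    """Return the HTML of `url` as text, without running JS or fetching CSS."""
    # Some servers reject urllib's default User-Agent, so send a generic one.
    # The value below is only illustrative.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0 (html-only fetch)"})
    with urlopen(request, timeout=timeout) as response:
        # Use the charset declared by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (performs a real network request):
# html = fetch_html("https://example.com/")
```

If the page builds its content with JavaScript, this approach returns only the initial markup, which is the trade-off described above.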
If you still want to prevent JS and CSS files from loading, you can restrict what Firefox loads by tweaking FirefoxProfile preferences; see:
- Do not want images to load and CSS to render on Firefox in Selenium WebDriver tests with Python
- FirefoxDriver: how to disable javascript,css and make sendKeys type instantly?
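As a rough sketch of the preference-tweaking approach, assuming Selenium 4 for Python with Firefox and geckodriver installed: the example below sets the preferences through Options.set_preference rather than a FirefoxProfile object, which is a substitution on my part. The preference names are the ones commonly cited for this purpose and are an assumption here, not something the answer spells out; recent Firefox releases may ignore some of them, the stylesheet one in particular. The helper names html_only_preferences and make_html_only_driver are hypothetical.

```python
def html_only_preferences(block_images=True):
    """Build Firefox preferences intended to skip JS, CSS and (optionally) images.

    Assumption: these names are the commonly cited ones; newer Firefox
    versions may no longer honor all of them.
    """
    prefs = {
        "javascript.enabled": False,          # do not execute scripts
        "permissions.default.stylesheet": 2,  # 2 = block stylesheets
    }
    if block_images:
        prefs["permissions.default.image"] = 2  # 2 = block images
    return prefs


def make_html_only_driver():
    """Start Firefox with the preferences above (needs selenium and geckodriver)."""
    # Imported lazily so the helper above works without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in html_only_preferences().items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)


# Example usage (launches a real browser):
# driver = make_html_only_driver()
# driver.get("https://example.com/")
# html = driver.page_source
# driver.quit()
```

After loading a page, check driver.page_source to confirm which preferences your Firefox version actually respects.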