Python - which is considered better for scraping: Selenium, or BeautifulSoup with Selenium?
Question
This question is for Python 3.6.3, bs4 and Selenium 3.8 on Win10.
I am trying to scrape pages with dynamic content. What I want to scrape is numbers and text (from http://www.oddsportal.com, for example). From my understanding, requests + beautifulsoup will not do the job, because the dynamic content is not present in the HTML they receive. So I have to use another tool, such as Selenium WebDriver.
Then, given that I will use Selenium WebDriver anyway, do you recommend ignoring beautifulsoup and sticking with Selenium WebDriver functions, e.g.
elem = driver.find_element_by_name("q")
Or is it considered better practice to use selenium+beautifulsoup?
Do you have any opinion as to which of the two routes will give me more convenient functions to work with?
Recommended answer
BeautifulSoup is a powerful tool for web scraping. It does not download pages itself; it parses HTML that has been fetched with a library such as urllib.request or requests. That combination works well for extracting data from static pages.
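For a static page, that workflow might look like the sketch below: urllib.request downloads the HTML and BeautifulSoup parses it. This is only an illustration under assumptions. The helper names `fetch_html` and `parse_rows` and the sample table markup are hypothetical, and the page is assumed to keep its data in ordinary `<tr>` rows.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup


def fetch_html(url, timeout=10):
    """Download the raw HTML of a page; no JavaScript is executed."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def parse_rows(html):
    """Return the cell texts of every non-empty table row in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


# Hypothetical static markup, used here in place of a live request.
SAMPLE_HTML = """
<table>
  <tr><th>Match</th><th>Home</th><th>Away</th></tr>
  <tr><td>A - B</td><td>1.85</td><td>2.10</td></tr>
</table>
"""

if __name__ == "__main__":
    for row in parse_rows(SAMPLE_HTML):
        print(row)
```

On a live static page, `parse_rows(fetch_html(url))` would do the same job. If the rows are filled in by JavaScript after the page loads, `fetch_html` never sees them, which is the limitation the question describes.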
Selenium is currently one of the most widely used tools for web automation. Because it drives a real browser, it can interact with dynamic pages, content and elements.
To create a robust and efficient framework for scraping pages with dynamic content, integrate both Selenium and BeautifulSoup in it. Browse and interact with the dynamic elements through Selenium, then scrape the rendered contents efficiently through BeautifulSoup.
Here is an example using Selenium and BeautifulSoup for scraping.
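The following is only a minimal sketch of that combination, not the answer's own example. Selenium loads the page and waits for the dynamic content, then hands the rendered HTML to BeautifulSoup for extraction. The function names, the `span.odd` selector and the sample markup are assumptions made for illustration; the real selector depends on the target page. `webdriver.Chrome()` assumes Chrome and a matching driver are installed.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url, css_selector, timeout=10):
    """Load the page in a real browser and return its HTML after JavaScript has run."""
    # Imported lazily so the parsing half works without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are available
    try:
        driver.get(url)
        # Block until at least one element matching the selector is in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()


def extract_texts(html, css_selector):
    """Return the stripped text of every element matching the CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [element.get_text(strip=True) for element in soup.select(css_selector)]


# Hypothetical rendered markup standing in for driver.page_source.
RENDERED_SAMPLE = (
    '<div id="odds">'
    '<span class="odd"> 1.85 </span>'
    '<span class="odd">2.10</span>'
    "</div>"
)

if __name__ == "__main__":
    print(extract_texts(RENDERED_SAMPLE, "span.odd"))
```

With a browser available, `extract_texts(fetch_rendered_html(url, "span.odd"), "span.odd")` would chain the two halves. One reason to parse `page_source` with BeautifulSoup rather than calling Selenium's find functions for every value is that each Selenium lookup is a separate command sent to the browser, whereas the parsed snapshot is searched locally. Selenium's own functions remain the right choice whenever you need to click, type or scroll.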