Python - which is considered better for scraping: Selenium, or BeautifulSoup with Selenium?


Question

This question is for Python 3.6.3, bs4 and Selenium 3.8 on Win10.

I am trying to scrape pages with dynamic content. What I want to scrape is numbers and text (from http://www.oddsportal.com, for example). From my understanding, requests + beautifulsoup will not do the job, because the dynamic content will not be in the fetched HTML. So I have to use another tool, such as Selenium WebDriver.

Given that I will use Selenium WebDriver anyway, do you recommend ignoring beautifulsoup and sticking with Selenium WebDriver functions, e.g.

elem = driver.find_element_by_name("q")

or is it considered better practice to use Selenium + beautifulsoup?

Do you have any opinion as to which of the two routes will give me more convenient functions to work with?

Thanks.

Recommended answer

Beautifulsoup

Beautifulsoup is a powerful tool for web scraping. It parses HTML but does not fetch pages itself, so it is typically paired with a fetching library such as urllib.request. That combination works well for extracting data from static pages.
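To illustrate the static-page case, here is a minimal sketch of Beautifulsoup parsing markup that is already available as a string. The sample HTML, the `odds` table id, the `team` and `odd` class names, and the `extract_odds` helper are all made up for illustration; they are not taken from any real site.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup, standing in for a page fetched with urllib.request.
SAMPLE_HTML = """
<table id="odds">
  <tr><td class="team">Team A</td><td class="odd">1.85</td></tr>
  <tr><td class="team">Team B</td><td class="odd">2.10</td></tr>
</table>
"""


def extract_odds(html):
    """Return (team, odd) pairs from the hypothetical odds table."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table#odds tr"):
        team = tr.select_one("td.team")
        odd = tr.select_one("td.odd")
        # Skip header or malformed rows that lack either cell.
        if team is not None and odd is not None:
            rows.append((team.get_text(strip=True), float(odd.get_text(strip=True))))
    return rows
```

The same function would return an empty list for a page whose table is filled in by JavaScript after load, which is the limitation the question describes.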

Selenium

Selenium is currently one of the most widely used tools for web automation. Because it drives a real browser, it supports interacting with dynamic pages, contents and elements.

To create a robust and efficient framework for scraping pages with dynamic content, integrate both Selenium and Beautifulsoup in your framework. Browse and interact with dynamic elements through Selenium, and scrape the contents efficiently through Beautifulsoup.

Here is an example using Selenium and Beautifulsoup for scraping
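The example the answer pointed to is not reproduced on this page. As a rough sketch of the division of labour described above, the following lets Selenium render the page and hands the resulting HTML to Beautifulsoup. The helper names `render_page` and `extract_cells` are hypothetical, and the code assumes Chrome and a matching chromedriver are installed; the CSS selectors are placeholders, not selectors known to match oddsportal.com.

```python
from bs4 import BeautifulSoup


def render_page(url, wait_css, timeout=10):
    """Load url in a real browser and return the rendered HTML (hypothetical helper)."""
    # Imported lazily so the parsing helper below can be used without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and chromedriver are available
    try:
        driver.get(url)
        # Wait until the dynamic content we care about is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_css))
        )
        return driver.page_source
    finally:
        driver.quit()


def extract_cells(html, css):
    """Return the stripped text of every element matching css (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    return [el.get_text(strip=True) for el in soup.select(css)]
```

A call such as `extract_cells(render_page("http://www.oddsportal.com", "table"), "table td")` would then do the browsing in Selenium and the extraction in Beautifulsoup. Keeping the parsing step separate also means it can be tested on saved HTML without launching a browser.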
