Python,Selenium和美丽的汤为URL [英] Python, Selenium, and Beautiful Soup for URL

查看:105
本文介绍了Python,Selenium和美丽的汤为URL的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用Selenium编写脚本来访问pastebin执行搜索并在文本中输出URL结果。

I am trying to write a script using Selenium to access pastebin do a search and print out in text the URL results. I need the visible URL results and nothing else.

<div class="gs-bidi-start-align gs-visibleUrl gs-visibleUrl-long" dir="ltr" style="word-break:break-all;">pastebin.com/VYQTSbzY</div>

当前脚本为:

Current script is:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup

browser = webdriver.Firefox()
browser.get('http://www.pastebin.com')

search = browser.find_element_by_name('q')
search.send_keys("test")
search.send_keys(Keys.RETURN)

soup=BeautifulSoup(browser.page_source)

for link in soup.find_all('a'):
    print link.get('href',None),link.get_text()


推荐答案

您实际上并不需要 BeautifulSoup selenium 本身在定位元素时非常强大:

You don't actually need BeautifulSoup. selenium itself is very powerful at locating element:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys


browser = webdriver.Firefox()
browser.get('http://www.pastebin.com')

search = browser.find_element_by_name('q')
search.send_keys("test")
search.send_keys(Keys.RETURN)

# wait for results to appear
wait = WebDriverWait(browser, 10)
results = wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.gsc-resultsbox-visible")))

# grab results
for link in results.find_elements_by_css_selector("a.gs-title"):
    print link.get_attribute("href")

browser.close()

打印:

Prints:

http://pastebin.com/VYQTSbzY
http://pastebin.com/VYQTSbzY
http://pastebin.com/VAAQCjkj
...
http://pastebin.com/fVUejyRK
http://pastebin.com/fVUejyRK

注意使用显式等待有助于等待搜索结果的显示。

Note the use of an Explicit Wait which helps to wait for the search results to appear.

这篇关于Python,Selenium和美丽的汤为URL的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆