Get URL in Selenium Python


Problem description

I am very new to Python and I would like to scrape the following website: Link

I think Selenium might be the right tool, so I started writing the following code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

path = 'http://planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx'

browser = webdriver.Firefox()
browser.get(path)

# find_element_by_id was removed in recent Selenium 4 releases; find_element(By.ID, ...) replaces it
elem = browser.find_element(By.ID, 'txtPostCode')
elem.clear()
elem.send_keys("E9 7JP")
elem.send_keys(Keys.RETURN)  # submit the search form

print(browser.current_url)

So far so good, it works. However, the value returned by browser.current_url is not what is displayed in the URL bar of my browser. The script returns:

//planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx

However, the URL bar in the browser shows this one:

//planning.hackney.gov.uk/Northgate/PlanningExplorer/Generic/StdResults.aspx?PT=Planning%20Applications%20On-Line&SC=Postcode%20is%20E9%207JP&FT=Planning%20Application%20Search%20Results&XMLSIDE=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/Menus/PL.xml&XSLTemplate=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/xslt/PL/PLResults.xslt&PS=10&XMLLoc=/Northgate/PlanningExplorer/Generic/XMLtemp/j5jzxiwxklgslnam4qffypw5/052dd052-3993-4f10-83aa-dd0c6c326676.xml

How can I get hold of this address?

Thanks a lot!

Solution

Did you make any other request between checking the URL returned by your script and the URL shown by the browser? The request sent after Keys.RETURN adds a session identifier to the URL, which might be why you are getting a different URL.
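To see which parts of the results URL are fixed and which look session-specific, it can help to split the query string into its parameters. The following is a minimal sketch using only the standard library; the helper name query_params is made up for illustration, and the URL is the one quoted in the question. Reading the XMLLoc value as a per-session path is an assumption based on how it looks, not something confirmed here.

```python
from urllib.parse import urlsplit, parse_qs

# Results URL as quoted in the question (it is shown there without a scheme).
results_url = (
    "//planning.hackney.gov.uk/Northgate/PlanningExplorer/Generic/StdResults.aspx"
    "?PT=Planning%20Applications%20On-Line&SC=Postcode%20is%20E9%207JP"
    "&FT=Planning%20Application%20Search%20Results"
    "&XMLSIDE=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/Menus/PL.xml"
    "&XSLTemplate=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/xslt/PL/PLResults.xslt"
    "&PS=10"
    "&XMLLoc=/Northgate/PlanningExplorer/Generic/XMLtemp/j5jzxiwxklgslnam4qffypw5/"
    "052dd052-3993-4f10-83aa-dd0c6c326676.xml"
)


def query_params(url):
    """Return the query string as a dict of decoded values (first value per name)."""
    pairs = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in pairs.items()}


params = query_params(results_url)
for name, value in params.items():
    print(name, "=", value)
```

The SC parameter carries the search text, while XMLLoc points at a temporary file whose path would presumably differ from one session to the next.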

I have this script:

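from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

chromepath = 'chrome_driver_path'  # change this to your chromedriver path
# Recent Selenium 4 releases take the driver path through a Service object
driver = webdriver.Chrome(service=Service(executable_path=chromepath))

driver.get('http://planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx')

print(driver.current_url)  # URL before the search

elem = driver.find_element(By.ID, 'txtPostCode')
elem.clear()
elem.send_keys("E9 7JP")
elem.send_keys(Keys.RETURN)

print(driver.current_url)  # URL after the search

driver.quit()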

The keypress code was copied from your own code. I get an identical URL from both the browser and the script.

Script gives me this URL - Link
Browser gives me this same URL - Copied Manually
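One possible explanation for the difference, not confirmed in this thread, is that current_url is read before the browser has finished navigating to the results page. If so, polling until the URL changes would help. Below is a minimal sketch of that idea. The helper wait_for_change is made up for illustration, and the browser is replaced by a stand-in iterator with example.test URLs so the snippet runs without Selenium.

```python
import time


def wait_for_change(read_value, old_value, timeout=10.0, poll=0.25):
    """Poll read_value() until it differs from old_value and return the new value.

    Raises TimeoutError if nothing changes within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = read_value()
        if current != old_value:
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError("value did not change within %.1f s" % timeout)
        time.sleep(poll)


# Stand-in for a browser: reports the search page twice, then the results page.
fake_urls = iter([
    "http://example.test/generalsearch.aspx",
    "http://example.test/generalsearch.aspx",
    "http://example.test/StdResults.aspx",
])

new_url = wait_for_change(
    lambda: next(fake_urls),
    "http://example.test/generalsearch.aspx",
    timeout=5.0,
    poll=0.01,
)
print(new_url)
```

With a real driver, the same helper could be called as wait_for_change(lambda: driver.current_url, start_url) right after sending Keys.RETURN, where start_url is the value of driver.current_url saved before the key press. Selenium also ships WebDriverWait together with expected_conditions.url_changes, which covers the same case without a hand-written loop.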
