Get URL in Selenium Python
Problem description
I am very new to Python and I wish to scrape the following website: Link
I think that Selenium might be the right tool, so I started to write the following code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
path='http://planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx'
browser = webdriver.Firefox()
browser.get(path)
elem = browser.find_element_by_id('txtPostCode')
elem.clear()
elem.send_keys("E9 7JP")
elem.send_keys(Keys.RETURN)
print (browser.current_url)
So far so good, it works. However, the return value of browser.current_url is not quite what is displayed in the URL bar of my browser. The script returns:
//planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx
However, the URL shown in the browser is this one:
//planning.hackney.gov.uk/Northgate/PlanningExplorer/Generic/StdResults.aspx?PT=Planning%20Applications%20On-Line&SC=Postcode%20is%20E9%207JP&FT=Planning%20Application%20Search%20Results&XMLSIDE=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/Menus/PL.xml&XSLTemplate=/Northgate/PlanningExplorer/SiteFiles/Skins/Hackney/xslt/PL/PLResults.xslt&PS=10&XMLLoc=/Northgate/PlanningExplorer/Generic/XMLtemp/j5jzxiwxklgslnam4qffypw5/052dd052-3993-4f10-83aa-dd0c6c326676.xml
Now I wonder how to get hold of this address.
Thanks a lot!
Did you make any other request between checking the URL returned by your script and the URL shown by the browser? The request sent after Keys.RETURN adds a session identifier to the URL, which might be why you are getting a different URL.
I have this script:
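from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

chromepath = 'chrome_driver_path'  # change this to your chromedriver path
# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(chromepath))
driver.get('http://planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx')
print(driver.current_url)  # URL of the search form
# find_element(By.ID, ...) replaces find_element_by_id, which Selenium 4 removed
elem = driver.find_element(By.ID, 'txtPostCode')
elem.clear()
elem.send_keys("E9 7JP")
elem.send_keys(Keys.RETURN)
print(driver.current_url)  # URL after submitting the postcode
driver.quit()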
The keypress code was copied from your own code. I get an identical URL from both the browser and the script.
The script gives me this URL: Link. The browser gives me the same URL (copied manually).
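One possible explanation for the difference, which the thread does not confirm, is timing: current_url may be read before the browser has finished navigating to the results page. The sketch below is a minimal illustration under that assumption. The helper name wait_for_url_change and the demo function are hypothetical and not part of Selenium; the helper simply polls current_url until it differs from the starting URL. Selenium also ships WebDriverWait together with expected_conditions.url_changes, which should serve the same purpose.

```python
import time


def wait_for_url_change(driver, old_url, timeout=10.0, poll=0.25):
    """Poll driver.current_url until it differs from old_url.

    Returns the new URL, or raises TimeoutError if it never changes.
    Works with any object that exposes a current_url attribute.
    """
    deadline = time.monotonic() + timeout
    while True:
        url = driver.current_url
        if url != old_url:
            return url
        if time.monotonic() >= deadline:
            raise TimeoutError(f"URL still {old_url!r} after {timeout} s")
        time.sleep(poll)


def demo():
    """Hypothetical usage against the search page from the question."""
    # Imported here so the helper above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    driver = webdriver.Firefox()
    try:
        driver.get('http://planning.hackney.gov.uk/Northgate/PlanningExplorer/generalsearch.aspx')
        start_url = driver.current_url
        elem = driver.find_element(By.ID, 'txtPostCode')
        elem.clear()
        elem.send_keys("E9 7JP")
        elem.send_keys(Keys.RETURN)
        # Block until the browser has actually left the search form.
        print(wait_for_url_change(driver, start_url))
    finally:
        driver.quit()
```

Calling demo() requires Firefox and its driver to be installed. If the URL never changes within the timeout, the helper raises TimeoutError, which would suggest that the cause lies elsewhere.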