requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied while trying to find broken links through Selenium and Python
Problem description
I want to find the broken links on my web page using Selenium + Python. I tried the code below, but it fails with the following error:
requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://None?
Code trial:
for link in links:
    r = requests.head(link.get_attribute('href'))
    print(link.get_attribute('href'), r.status_code)
Full code:
def test_lsearch(self):
    driver = self.driver
    driver.get("http://www.google.com")
    driver.set_page_load_timeout(10)
    driver.find_element_by_name("q").send_keys("selenium")
    driver.set_page_load_timeout(10)
    el = driver.find_element_by_name("btnK")
    el.click()
    time.sleep(5)
    links = driver.find_elements_by_css_selector("a")
    for link in links:
        r = requests.head(link.get_attribute('href'))
        print(link.get_attribute('href'), r.status_code)
Recommended answer
This error message...
raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://None?
...implies that the URL handed to requests was None: at least one of the collected <a> elements has no href attribute, so link.get_attribute('href') returned None, and no scheme (such as http:// or https://) could be parsed from it.
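To see why the text 'None' is rejected, here is a minimal sketch of a scheme check. It uses the standard library's urllib.parse.urlsplit as a stand-in for the parser that requests uses internally, and has_scheme is a hypothetical helper name, so treat it as an illustration of the idea rather than the library's own logic:

```python
from urllib.parse import urlsplit


def has_scheme(url):
    """Return True if the value, read as text, starts with a URL scheme."""
    # str() turns a missing href (None) into the text 'None',
    # which is the value quoted in the error message.
    return bool(urlsplit(str(url)).scheme)


for candidate in (None, "stage/auth/token/obtain/", "https://www.seleniumhq.org/"):
    print(repr(candidate), has_scheme(candidate))
```

Only the last value carries a scheme; the first two are the kind of input that triggers MissingSchema.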
This error is raised in models.py of the requests library as follows:
# Support for unicode domain names and paths.
scheme, auth, host, port, path, query, fragment = parse_url(url)
if not scheme:
    raise MissingSchema("Invalid URL {0!r}: No schema supplied. "
                        "Perhaps you meant http://{0}?".format(url))
Solution
You are presumably looking for broken links among the search results that Google returns for the keyword selenium. To achieve that, you can use the following solution:
Code Block:
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument('disable-infobars')
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://google.co.in/')
search = driver.find_element_by_name('q')
search.send_keys("selenium")
search.send_keys(Keys.RETURN)
links = WebDriverWait(driver, 10).until(EC.visibility_of_any_elements_located((By.XPATH, "//div[@class='rc']//h3//ancestor::a[1]")))
print("Number of links : %s" %len(links))
for link in links:
    r = requests.head(link.get_attribute('href'))
    print(link.get_attribute('href'), r.status_code)
Console Output:
Number of links : 9
https://www.seleniumhq.org/ 200
https://www.seleniumhq.org/download/ 200
https://www.seleniumhq.org/docs/01_introducing_selenium.jsp 200
https://www.guru99.com/selenium-tutorial.html 200
https://en.wikipedia.org/wiki/Selenium_(software) 200
https://github.com/SeleniumHQ 200
https://www.edureka.co/blog/what-is-selenium/ 200
https://seleniumhq.github.io/selenium/docs/api/py/ 200
https://seleniumhq.github.io/docs/ 200
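If you would rather keep collecting every <a> element, a defensive loop can skip the entries that have no usable href. The following is only a sketch under stated assumptions: find_broken_links and get_status are hypothetical names, the 400 threshold for "broken" is a choice made here, and the status lookup is passed in as a callable so the example runs without a browser or network access:

```python
from urllib.parse import urlsplit


def find_broken_links(hrefs, get_status):
    """Return (href, status) pairs for links that look broken.

    get_status is any callable that maps a URL to an HTTP status code
    and raises an exception when the request itself fails.
    """
    broken = []
    seen = set()
    for href in hrefs:
        # Skip anchors without an href (None or empty) and non-HTTP targets.
        if not href or urlsplit(href).scheme not in ("http", "https"):
            continue
        # Check each distinct URL only once.
        if href in seen:
            continue
        seen.add(href)
        try:
            status = get_status(href)
        except Exception as exc:
            # A failed request is reported as broken, with the error text.
            broken.append((href, repr(exc)))
            continue
        if status >= 400:
            broken.append((href, status))
    return broken


# Stand-in status lookup so the sketch runs offline.
fake_statuses = {
    "https://example.com/ok": 200,
    "https://example.com/gone": 404,
}
result = find_broken_links(
    [None, "https://example.com/ok", "https://example.com/gone", "javascript:void(0)"],
    fake_statuses.__getitem__,
)
print(result)  # [('https://example.com/gone', 404)]
```

With Selenium and requests, the inputs could be [link.get_attribute('href') for link in links] and lambda url: requests.head(url, timeout=10).status_code.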
As for your follow-up question, it is hard to give a canonical answer, from the Selenium perspective, as to why the XPath locator worked but the tag-name locator did not. You may want to dig deeper into these discussions:
- Bug 1323614 - Unable to authenticate: requests.exceptions.MissingSchema: Invalid URL 'stage/auth/token/obtain/': No schema supplied.
- Invalid URL 'None': No schema supplied. Perhaps you meant http://None?