requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied while trying to find broken links through Selenium and Python
Problem description
I want to find the broken links on my web page using Selenium and Python. I tried the code below, but it raises the following error:
requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://None?
Code trial:
for link in links:
    r = requests.head(link.get_attribute('href'))
    print(link.get_attribute('href'), r.status_code)
Full code:
def test_lsearch(self):
    driver = self.driver
    driver.get("http://www.google.com")
    driver.set_page_load_timeout(10)
    driver.find_element_by_name("q").send_keys("selenium")
    driver.set_page_load_timeout(10)
    el = driver.find_element_by_name("btnK")
    el.click()
    time.sleep(5)
    links = driver.find_elements_by_css_selector("a")
    for link in links:
        r = requests.head(link.get_attribute('href'))
        print(link.get_attribute('href'), r.status_code)
Recommended answer
This error message...
raise MissingSchema(error)
requests.exceptions.MissingSchema: Invalid URL 'None': No schema supplied. Perhaps you meant http://None?
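...implies that at least one of the collected elements had no href attribute, so get_attribute('href') returned None and requests found no scheme (such as http or https) in the URL it was given. The check that fails sits under the "Support for unicode domain names and paths" comment in the requests source.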
This error is raised in models.py as follows:
# Support for unicode domain names and paths.
scheme, auth, host, port, path, query, fragment = parse_url(url)
if not scheme:
    raise MissingSchema("Invalid URL {0!r}: No schema supplied. "
                        "Perhaps you meant http://{0}?".format(url))
Solution
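In general, an anchor without an href attribute yields None from get_attribute('href'), and requests turns that into the string 'None', which has no scheme. A minimal sketch of filtering such values before they reach requests, using only the standard library; the helper name is_checkable_url and the sample list are hypothetical and not part of Selenium or requests:

```python
from urllib.parse import urlparse


def is_checkable_url(href):
    """Return True only for non-empty http(s) URLs (hypothetical helper)."""
    if not href:
        # Covers None (an anchor with no href) and empty strings.
        return False
    # Anything without an http/https scheme cannot be fetched by requests.head().
    return urlparse(href).scheme in ("http", "https")


# Sample values of the kind get_attribute('href') can hand back.
candidates = [None, "", "None", "javascript:void(0)", "https://www.seleniumhq.org/"]
checkable = [href for href in candidates if is_checkable_url(href)]
print(checkable)
```

Only the last sample value survives the filter, so only that one would be passed on to requests.head().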
Presumably you want to check for broken links among the results returned after searching for the keyword selenium from the Google home page search box. To achieve that, you can use the following solution:
Code Block:
import requests
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
# executable_path is the Selenium 3 way to point at chromedriver;
# Selenium 4 uses a Service object instead.
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get('https://google.co.in/')

# Type the keyword and submit the search with the Enter key.
search = driver.find_element(By.NAME, 'q')
search.send_keys("selenium")
search.send_keys(Keys.RETURN)

# Wait for the result links only, instead of collecting every <a> on the page.
links = WebDriverWait(driver, 10).until(EC.visibility_of_any_elements_located((By.XPATH, "//div[@class='rc']//h3//ancestor::a[1]")))
print("Number of links : %s" % len(links))
for link in links:
    href = link.get_attribute('href')  # read the attribute once per link
    r = requests.head(href)
    print(href, r.status_code)
Console Output:
Number of links : 9
https://www.seleniumhq.org/ 200
https://www.seleniumhq.org/download/ 200
https://www.seleniumhq.org/docs/01_introducing_selenium.jsp 200
https://www.guru99.com/selenium-tutorial.html 200
https://en.wikipedia.org/wiki/Selenium_(software) 200
https://github.com/SeleniumHQ 200
https://www.edureka.co/blog/what-is-selenium/ 200
https://seleniumhq.github.io/selenium/docs/api/py/ 200
https://seleniumhq.github.io/docs/ 200
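If you still want to scan every anchor on a page rather than only the search results, the loop can skip anchors without a usable href and report only failing links. The sketch below is an illustration under stated assumptions: collect_hrefs, find_broken, FakeLink, fake_status, the example.com URLs and the threshold of 400 are all hypothetical. FakeLink stands in for a Selenium WebElement and fake_status for an HTTP call, so the sketch runs without a browser or network access.

```python
def collect_hrefs(elements):
    """Read href once per element and drop anchors without an http(s) URL."""
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href and href.startswith(("http://", "https://")):
            hrefs.append(href)
    return hrefs


def find_broken(hrefs, fetch_status, threshold=400):
    """Return (url, reason) pairs for URLs that fail or answer >= threshold."""
    broken = []
    for href in hrefs:
        try:
            status = fetch_status(href)
        except Exception as exc:
            # A connection error or timeout also counts as a broken link here.
            broken.append((href, repr(exc)))
            continue
        if status >= threshold:
            broken.append((href, status))
    return broken


class FakeLink:
    """Hypothetical stand-in for a Selenium WebElement."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        return self._href if name == "href" else None


def fake_status(url):
    """Hypothetical stand-in for an HTTP HEAD request."""
    return {"https://example.com/ok": 200, "https://example.com/missing": 404}[url]


links = [FakeLink("https://example.com/ok"), FakeLink(None), FakeLink("https://example.com/missing")]
broken = find_broken(collect_hrefs(links), fake_status)
print(broken)
```

With a real driver, the elements would come from the find_elements call and fetch_status could be something like lambda url: requests.head(url, allow_redirects=True, timeout=10).status_code. Some servers reject HEAD requests, so a status of 405 may call for a follow-up GET before the link is treated as broken.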
As per your counter question, it is hard to give a canonical answer, from a Selenium perspective, as to why the xpath locator worked but tagName did not. You may want to dig deeper into these discussions:
- Bug 1323614 - Cannot authenticate: requests.exceptions.MissingSchema: Invalid URL 'stage/auth/token/obtain/': No schema supplied.
- Invalid URL 'None': No schema supplied. Perhaps you meant http://None?