Refused to load the script because it violates the following Content Security Policy directive: script-src error with ChromeDriver Chrome and Selenium
Problem description
I am trying to scrape the phone number from these links: "https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421" and "https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154"
If the element is present, the script scrapes the phone number; otherwise the phone number is None.
Spider code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException
options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_argument('window-size=1200x600')
driver = webdriver.Chrome(chrome_options=options)
driver.get('https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421')
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//p[@data-a-target='carousel-broadcaster-displayname']"))
)
try:
    next1 = driver.find_element_by_xpath('//*[@class="c-btn--light c-btn--center"]')
    next1.click()
    next2 = driver.find_element_by_xpath('//*[@class="u-title-font icon-ic_call_filled u-valign--middle"]')
    next2.click()
    phone_number = driver.find_element_by_class_name('c-vn__number').get_attribute('innerHTML')
except NoSuchElementException:
    phone_number = None
print(phone_number)
Output
DevTools listening on ws://127.0.0.1:60482/devtools/browser/9f226a40-2d1a-4108-9fde-f005b49e60b3
[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (0)
[1206/125829.645:INFO:CONSOLE(33)] "[Report Only] Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-eNRfqc27QHPklLLhavu92zuUGDeEoSZL' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". Note that 'unsafe-inline' is ignored if either a hash or nonce value is present in the source list.
", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (33)
[1206/125829.829:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)
Recommended answer
This error message...
[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
.
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)
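...is written to the console by the page itself and echoed by Chrome at INFO level. The [Report Only] prefix indicates that the policy is applied in report-only mode: violations are logged, but the resources are not actually blocked. The Chrome browser session has already started and the page has loaded by the time these messages appear.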
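To make the structure of these console messages easier to see, here is a minimal sketch that pulls out the violated directive, the [Report Only] flag, and whether 'strict-dynamic' appears in the policy. The function name describe_csp_message and the regular expression are illustrative assumptions, not part of Selenium or Chrome, and the sample string is shortened from the log above.

```python
import re

# Hypothetical pattern for the CSP messages Chrome prints to the console, e.g.
# [Report Only] Refused to frame '<url>' because it violates ... directive: "<policy>".
_CSP_MESSAGE = re.compile(
    r'(?P<report_only>\[Report Only\] )?Refused to (?P<action>.+?) because it violates '
    r'the following Content Security Policy directive: "(?P<policy>[^"]*)"',
    re.DOTALL,  # the logged message may contain a line break inside the action part
)


def describe_csp_message(message):
    """Return a small dict describing one CSP console message, or None if it is not one."""
    match = _CSP_MESSAGE.search(message)
    if match is None:
        return None
    policy_tokens = match.group("policy").split()
    return {
        # True when the policy is only reported, not enforced
        "report_only": match.group("report_only") is not None,
        # The first token of the quoted policy is the directive name
        "directive": policy_tokens[0] if policy_tokens else None,
        # With 'strict-dynamic', host-based allowlist entries are ignored
        "strict_dynamic": "'strict-dynamic'" in policy_tokens,
        "action": match.group("action"),
    }


# Sample shortened from the log output quoted above.
sample = (
    "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it "
    "violates the following Content Security Policy directive: "
    "\"frame-src 'self' https://survicate.com *.practo.com\"."
)
info = describe_csp_message(sample)
print(info)
```

Applied to the script-src messages in the log, the strict_dynamic field would be true, which matches Chrome's own note that host-based whitelisting is disabled even though *.googletagmanager.com is listed.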
To mitigate cross-site scripting issues, Chrome's extension system implements the concept of a Content Security Policy (CSP). It introduces strict policies that make extensions more secure by default and lets you create and enforce rules governing the types of content that your extensions and applications can load and execute. CSP works as a block/allowlisting mechanism for the resources loaded or executed by your extensions. Defining a reasonable policy lets you consider which resources your extension requires and ask the browser to ensure that those are the only resources it can access. These policies act as an additional layer of protection on top of the host permissions your extension requests. For web pages, such policies are defined via an HTTP header or a meta element. Within Chrome's extension system, the extension's policy is defined via the extension's manifest.json file as follows:
{
  "content_security_policy": "[POLICY STRING GOES HERE]"
}
Relaxing the Content Security Policy
Until Chrome 45, there was no mechanism for relaxing the restriction against executing inline JavaScript. In particular, setting a script policy that includes 'unsafe-inline' had no effect. From Chrome 46 onwards, inline scripts can be allowed by specifying the base64-encoded hash of the source code in the policy. This hash must be prefixed by the hash algorithm used (sha256, sha384 or sha512). The policy can also be relaxed by adding http://* to style-src and/or script-src as follows:
script-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval';
and/or
style-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval';
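The hash-based source mentioned above can be computed with the Python standard library. This is a minimal sketch: the helper name csp_hash_source and the sample inline script are assumptions made for illustration, and the hash has to be computed over the exact text between the script tags, including whitespace.

```python
import base64
import hashlib


def csp_hash_source(inline_script, algorithm="sha256"):
    """Build a CSP hash source such as 'sha256-<base64 digest>' for an inline script."""
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError("CSP hash sources use sha256, sha384 or sha512")
    # Hash the script text exactly as it appears between the <script> tags.
    digest = hashlib.new(algorithm, inline_script.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    # The source expression is quoted and prefixed with the algorithm name.
    return "'{}-{}'".format(algorithm, encoded)


# Hypothetical inline script used only to demonstrate the helper.
source = csp_hash_source("alert('Hello, world.');")
print("script-src 'self' " + source + ";")
```

The printed directive is the kind of string that would go into the policy in place of a blanket 'unsafe-inline'.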
This use case
Code Block:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
options = webdriver.ChromeOptions()
options.add_argument('--window-size=1200,600')  # Chrome expects width,height separated by a comma
options.add_argument('--headless')
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154")
print(driver.page_source)
driver.quit()
Console Output:
<html><head><title>Dr. Rajeev Puri - ENT/ Otorhinolaryngologist - Book Appointment Online, View Fees, Feedbacks | Practo</title><meta name="description" content="Dr. Rajeev Puri is an ENT/ Otorhinolaryngologist in DLF Phase IV. Book appointments Online, View Fees, User Feedbacks for Dr. Rajeev Puri | Practo"><meta charset="utf-8"><meta http-equiv="x-ua-compatible" content="ie=edge"><script src="https://js-agent.newrelic.com/nr-spa-1026.min.js"></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="https://surveys-static.survicate.com/widget_core-3.0.4.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script type="text/javascript" async="" src="https://www.googleadservices.com/pagead/conversion_async.js" nonce=""></script><script type="text/javascript" async="" src="https://www.google-analytics.com/plugins/ua/ec.js" nonce=""></script><script type="text/javascript" async="" src="https://www.practostatic.com/pel/clevertap/a.js"></script><script async="" src="//sweep.practo.com/sp.js"></script><script type="text/javascript" src="https://www.practostatic.com/pel/pel-1.6.1.js"></script><script async="" 
src="https://connect.facebook.net/en_US/fbevents.js"></script><script async="" src="//www.google-analytics.com/analytics.js"></script><script async="" src="https://www.googletagmanager.com/gtm.js?id=GTM-PSMVGL5"></script><script nonce="" type="text/javascript">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer',"GTM-PSMVGL5");</script>
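If the aim is only to keep the INFO-level console messages out of the terminal, one commonly used approach is to raise Chrome's logging threshold with the --log-level=3 switch and to exclude the enable-logging switch. The sketch below assumes chromedriver is on the PATH; the helper names quiet_chrome_arguments and make_quiet_driver are hypothetical, and Selenium is imported lazily so that the argument list can be inspected without it.

```python
def quiet_chrome_arguments(headless=True):
    """Return Chrome command-line switches that reduce console noise."""
    # --log-level=3 limits Chrome's own logging to fatal messages.
    arguments = ["--window-size=1200,600", "--log-level=3"]
    if headless:
        arguments.append("--headless")
    return arguments


def make_quiet_driver():
    """Create a Chrome driver using the switches above (assumes chromedriver is on the PATH)."""
    # Imported here so the helper above works even where Selenium is not installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in quiet_chrome_arguments():
        options.add_argument(argument)
    # Excluding enable-logging is often used to hide the "DevTools listening" line.
    options.add_experimental_option("excludeSwitches", ["enable-logging"])
    return webdriver.Chrome(options=options)


print(quiet_chrome_arguments())
```

This only hides the messages; it does not change the page's policy or what the page loads.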
Ensure that:
- Selenium is upgraded to the current version, 3.141.59.
- ChromeDriver is updated to the current ChromeDriver v79.0.3945.36 level.
- Chrome is updated to the current Chrome version 79.0 level (as per the ChromeDriver v79.0 release notes).
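The version checklist above can be verified programmatically by comparing major versions. This is a sketch under assumptions: the helper names are hypothetical, the sample version strings are illustrative, and in a live session the two values would typically be read from driver.capabilities['browserVersion'] and driver.capabilities['chrome']['chromedriverVersion'].

```python
def major_version(version_string):
    """Extract the leading major version number, e.g. 79 from '79.0.3945.36 (...)'."""
    return int(version_string.strip().split(".")[0])


def versions_match(browser_version, chromedriver_version):
    """Return True when Chrome and ChromeDriver share the same major version."""
    return major_version(browser_version) == major_version(chromedriver_version)


# Illustrative values only; the ChromeDriver string usually carries a build suffix.
print(versions_match("79.0.3945.79", "79.0.3945.36 (build-suffix)"))
print(versions_match("78.0.3904.108", "79.0.3945.36 (build-suffix)"))
```

A mismatch here is worth resolving before investigating anything else, since ChromeDriver releases are tied to specific Chrome major versions.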
You can find a relevant discussion in "Call to eval() blocked by CSP with Selenium IDE".