Refused to load the script because it violates the following Content Security Policy directive: script-src error with ChromeDriver Chrome and Selenium


Problem description

I am trying to scrape the phone number from these links: "https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421" and "https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154"

If the element is present, the script scrapes the phone number; otherwise the phone number is None.

Spider code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_argument('window-size=1200x600')

driver = webdriver.Chrome(chrome_options=options)

driver.get('https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421')

WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//p[@data-a-target='carousel-broadcaster-displayname']"))
)
try:
    next1 = driver.find_element_by_xpath('//*[@class="c-btn--light c-btn--center"]')
    next1.click()

    next2 = driver.find_element_by_xpath('//*[@class="u-title-font icon-ic_call_filled u-valign--middle"]')
    next2.click()
    phone_number = driver.find_element_by_class_name('c-vn__number').get_attribute('innerHTML')
except NoSuchElementException:
    phone_number = None

print(phone_number)

Output

DevTools listening on ws://127.0.0.1:60482/devtools/browser/9f226a40-2d1a-4108-9fde-f005b49e60b3
[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script 
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.

", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (0)
[1206/125829.645:INFO:CONSOLE(33)] "[Report Only] Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-eNRfqc27QHPklLLhavu92zuUGDeEoSZL' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". Note that 'unsafe-inline' is ignored if either a hash or nonce value is present in the source list.
    ", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (33)
[1206/125829.829:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)

Recommended answer

This error message...

[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script 
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
.
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)

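...does not by itself indicate that ChromeDriver failed to initiate/spawn a new Browsing Context, i.e. a Chrome Browser session. The "DevTools listening on ws://..." line shows that the session started, and each remaining entry is a page console message logged at INFO level. The "[Report Only]" prefix means the page's Content Security Policy is being reported rather than enforced, so these messages are logged without the resources being blocked.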
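As a rough way to inspect such output, the console lines can be classified by log level and by whether they carry the "[Report Only]" prefix. The sketch below is an illustration only: the function name classify_console_line and the regular expression are assumptions modeled on the log lines quoted in this thread, not part of Selenium or ChromeDriver.

```python
import re

# Hypothetical pattern modeled on the lines quoted above, e.g.
# [1206/125829.829:INFO:CONSOLE(0)] "[Report Only] Refused to frame ...
CONSOLE_LINE = re.compile(
    r'^\[(?P<stamp>[\d/.]+):(?P<level>[A-Z]+):CONSOLE\((?P<line>\d+)\)\]\s+"(?P<message>.*)$'
)


def classify_console_line(text):
    """Return a small summary dict for a Chrome console log line, or None."""
    match = CONSOLE_LINE.match(text)
    if match is None:
        # Not a page console message (for example "DevTools listening on ...").
        return None
    message = match.group("message")
    return {
        "level": match.group("level"),
        # Report-only violations are logged but the resource is not blocked.
        "report_only": message.startswith("[Report Only]"),
        "csp_violation": "Content Security Policy directive" in message,
    }


if __name__ == "__main__":
    sample = (
        '[1206/125829.829:INFO:CONSOLE(0)] "[Report Only] Refused to frame '
        "'https://9535906.fls.doubleclick.net/' because it violates the following "
        'Content Security Policy directive: "frame-src \'self\'".'
    )
    print(classify_console_line(sample))
```

A line that is classified as INFO and report-only is diagnostic noise from the page rather than a failure of the driver.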

To mitigate cross-site scripting issues, Chrome's extension system has implemented the concept of Content Security Policy (CSP). It introduces some strict policies that make extensions more secure by default, and it lets you create and enforce rules governing the types of content that can be loaded and executed by your extensions and applications.

CSP works as a block/allowlisting mechanism for resources loaded or executed by your extensions. Defining a reasonable policy for your extension enables you to consider the resources that your extension requires and to ask the browser to ensure that those are the only resources your extension has access to. These policies provide security over and above the host permissions your extension requests, acting as an additional layer of protection.

On the web, such policies are defined via an HTTP header or a meta element. Within Chrome's extension system, the extension's policy is defined via the extension's manifest.json file as follows:

{
  "content_security_policy": "[POLICY STRING GOES HERE]"
}
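To make the shape of a policy string concrete, here is a minimal sketch that splits a policy into directives and checks for 'strict-dynamic', which the console message quoted in the question says disables host-based whitelisting. The helper names parse_csp and host_allowlist_ignored are hypothetical, and the sample policy is a shortened version of the one in the log.

```python
def parse_csp(policy):
    """Split a CSP policy string into {directive_name: [source, ...]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty segments such as a trailing ';'
        name = tokens[0].lower()
        # Keep the first occurrence of a directive and ignore later duplicates.
        directives.setdefault(name, tokens[1:])
    return directives


def host_allowlist_ignored(directives):
    """True when script-src contains 'strict-dynamic'."""
    return "'strict-dynamic'" in directives.get("script-src", [])


if __name__ == "__main__":
    policy = (
        "script-src 'self' 'unsafe-inline' 'strict-dynamic' 'nonce-abc' "
        "*.practo.com *.googletagmanager.com; "
        "frame-src 'self' https://survicate.com *.practo.com"
    )
    parsed = parse_csp(policy)
    print(sorted(parsed))
    print(host_allowlist_ignored(parsed))
```

This explains why the script from www.googletagmanager.com is reported even though *.googletagmanager.com appears in the source list.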


Relaxing the Content Security Policy

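Up until Chrome 45, there was no mechanism for relaxing the restriction against executing inline JavaScript. In particular, setting a script policy that includes 'unsafe-inline' has no effect. As of Chrome 46, however, inline scripts can be allowed by specifying the base64-encoded hash of the source code in the policy. This hash must be prefixed with the hash algorithm used (sha256, sha384 or sha512). Alternatively, the policy can be relaxed by adding http://* to style-src and/or script-src as follows: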

script-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval'; 

and/or

style-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval';
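The hash-based allowance described in this section can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name csp_hash_source is hypothetical, and the hash has to be computed over the exact text between the script tags.

```python
import base64
import hashlib

ALLOWED_ALGORITHMS = ("sha256", "sha384", "sha512")


def csp_hash_source(inline_script, algorithm="sha256"):
    """Build a CSP hash source such as 'sha256-<base64 digest>' for an inline script."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError("algorithm must be one of " + ", ".join(ALLOWED_ALGORITHMS))
    digest = hashlib.new(algorithm, inline_script.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    # The single quotes are part of the source expression in the policy.
    return "'{}-{}'".format(algorithm, encoded)


if __name__ == "__main__":
    source = csp_hash_source("console.log('hello');")
    print("script-src 'self' " + source)
```

Any change to the inline script, including whitespace, changes the digest, so the policy has to be regenerated whenever the script is edited.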


This use case

I was able to access the webpage https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 simply as follows:

  • Code Block:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions() 
options.add_argument('window-size=1200x600')
options.add_argument('--headless')
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154")
print(driver.page_source)
driver.quit()

  • Console Output:

    <html><head><title>Dr. Rajeev Puri - ENT/ Otorhinolaryngologist - Book Appointment Online, View Fees, Feedbacks | Practo</title><meta name="description" content="Dr. Rajeev Puri is an ENT/ Otorhinolaryngologist in DLF Phase IV. Book appointments Online, View Fees, User Feedbacks for Dr. Rajeev Puri | Practo"><meta charset="utf-8"><meta http-equiv="x-ua-compatible" content="ie=edge"><script src="https://js-agent.newrelic.com/nr-spa-1026.min.js"></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="https://surveys-static.survicate.com/widget_core-3.0.4.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script type="text/javascript" async="" src="https://www.googleadservices.com/pagead/conversion_async.js" nonce=""></script><script type="text/javascript" async="" src="https://www.google-analytics.com/plugins/ua/ec.js" nonce=""></script><script type="text/javascript" async="" src="https://www.practostatic.com/pel/clevertap/a.js"></script><script async="" src="//sweep.practo.com/sp.js"></script><script type="text/javascript" src="https://www.practostatic.com/pel/pel-1.6.1.js"></script><script async="" 
src="https://connect.facebook.net/en_US/fbevents.js"></script><script async="" src="//www.google-analytics.com/analytics.js"></script><script async="" src="https://www.googletagmanager.com/gtm.js?id=GTM-PSMVGL5"></script><script nonce="" type="text/javascript">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
                  new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
                  j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
                  'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
                  })(window,document,'script','dataLayer',"GTM-PSMVGL5");</script>
    

  • Ensure that:

    • Selenium is upgraded to the current version, 3.141.59.
    • ChromeDriver is updated to the current ChromeDriver v79.0.3945.36 level.
    • Chrome is updated to the current Chrome version 79.0 level (as per the ChromeDriver v79.0 release notes).

    You can find a relevant discussion in "Call to eval() blocked by CSP with Selenium IDE".
