拒绝加载脚本,因为它违反了以下内容安全策略指令:ChromeDriver Chrome和Selenium的script-src错误 [英] Refused to load the script because it violates the following Content Security Policy directive: script-src error with ChromeDriver Chrome and Selenium

查看:263
本文介绍了拒绝加载脚本,因为它违反了以下内容安全策略指令:ChromeDriver Chrome和Selenium的script-src错误的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试从这些链接中抓取电话号码 https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421 https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)% 20Specialist& practice_id = 912154

I am trying to scrape Phone Number from these links "https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421" and "https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154"

如果元素存在,则会刮擦电话号码,否则电话号码为无

if element present it scrapes the phone number otherwise phone number is None

蜘蛛代码:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException

options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_argument('window-size=1200x600')

driver = webdriver.Chrome(chrome_options=options)

driver.get('https://www.practo.com/delhi/doctor/dr-meeka-gulati-dentist-3?specialization=Dentist&practice_id=722421')

WebDriverWait(driver, 10).until(
                            EC.presence_of_element_located((By.XPATH, "//p[@data-a-target='carousel-broadcaster-displayname']"))
                            )
try:
    next1 = driver.find_element_by_xpath('//*[@class="c-btn--light c-btn--center"]')
    next1.click()

    next2 = driver.find_element_by_xpath('//*[@class="u-title-font icon-ic_call_filled u-valign--middle"]')
    next2.click()
    phone_number = driver.find_element_by_class_name('c-vn__number').get_attribute('innerHTML')
except NoSuchElementException:
    phone_number = None

print(phone_number)

输出

DevTools listening on ws://127.0.0.1:60482/devtools/browser/9f226a40-2d1a-4108-9fde-f005b49e60b3
[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script 
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.

", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (0)
[1206/125829.645:INFO:CONSOLE(33)] "[Report Only] Refused to execute inline script because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-eNRfqc27QHPklLLhavu92zuUGDeEoSZL' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". Note that 'unsafe-inline' is ignored if either a hash or nonce value is present in the source list.
    ", source: https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154 (33)
[1206/125829.829:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)


推荐答案

此错误消息...

[1206/102937.475:INFO:CONSOLE(0)] "[Report Only] Refused to load the script 
'https://www.googletagmanager.com/gtag/js?id=AW-942004674' because it violates the following Content Security Policy directive: "script-src 'self' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' 'nonce-3RJz12sDPuoV27qS7dcBXLRZawmPobLo' *.practo.com *.practostatic.com *.onesignal.com *.mxpnl.com *.mixpanel.com *.facebook.com *.facebook.net *.twitter.com *.gstatic.com *.googleapis.com *.google.com *.googlesyndication.com *.newrelic.com *.google-analytics.com *.googletagmanager.com *.googleadservices.com *.googlesyndication.com *.doubleclick.net *.survicate.com in.wzrkt.com *.nr-data.net *.newrelic.com *.speedcurve.com *.ampproject.org *.netcore.co.in *.netcoresmartech.com *.criteo.net *.criteo.com https://secure.livechatinc.com". 'strict-dynamic' is present, so host-based whitelisting is disabled. Note that 'script-src-elem' was not explicitly set, so 'script-src' is used as a fallback.
.
[1206/125830.508:INFO:CONSOLE(0)] "[Report Only] Refused to frame 'https://9535906.fls.doubleclick.net/' because it violates the following Content Security Policy directive: "frame-src 'self' https://survicate.com *.practo.com *.criteo.net *.criteo.com https://www.facebook.com https://bid.g.doubleclick.net https://secure.livechatinc.com".
", source: https://www.googletagmanager.com/ (0)

...表示 ChromeDriver 无法启动/产生新的浏览上下文,即 Chrome浏览器会话。

...implies that the ChromeDriver was unable to initiate/spawn a new Browsing Context i.e. Chrome Browser session.

跨站点脚本问题Chrome的扩展系统已实现内容安全策略(CSP)引入了一些严格的策略,这些策略将默认使扩展更安全,并为我们提供了能够创建和执行规则来管理扩展和应用程序可以加载和执行的内容类型。 CSP充当扩展程序加载或执行的资源的阻止/允许列表机制。为扩展程序定义合理的策略,使您可以考虑扩展程序所需的资源,并与浏览器协商以确保这些资源是扩展程序可以访问的唯一资源。这些策略甚至在扩展请求作为附加保护层的主机权限之上也提供了安全性。此类策略是通过HTTP标头或meta元素定义的。在Chrome的扩展程序系统中,扩展程序的政策是通过扩展程序的 manifest.json 文件定义的,如下所示:

To mitigate the cross-site scripting issues Chrome's extension system has implemented the concept of Content Security Policy (CSP) which introduces some strict policies that will make extensions more secure by default and provides us the ability to create and enforce rules governing the types of content that can be loaded and executed by your extensions and applications. CSP works as a block/allowlisting mechanism for resources loaded or executed by your extensions. Defining a reasonable policy for your extension enables you to consider the resources that your extension requires and to negotiate with the browser to ensure that those are the only resources your extension has access to. These policies provide security even above the host permissions your extension requests acting as an additional layer of protection. Such policies are defined via an HTTP header or meta element. Within Chrome's extension system the extension's policy is defined via the extension's manifest.json file as follows:

{
  "content_security_policy": "[POLICY STRING GOES HERE]"
}






放松内容安全政策



在Chrome 45之前,没有任何机制可以放宽执行内联JavaScript的限制。特别是,设置包含不安全内联的脚本策略将无效。但是,从Chrome 46开始,可以通过在策略中指定源代码的base64编码的哈希值来允许内联脚本。该哈希必须以使用的哈希算法(sha256,sha384或sha512)作为前缀。可以通过将 http:// * 都添加到 style-src 中来实现。和/或 script-src 如下:


Relaxing the Content Security Policy

Till Chrome 45, there was no mechanism for relaxing the restriction against executing inline JavaScript. In particular, setting a script policy that includes 'unsafe-inline' will have no effect. However, from Chrome 46 onwards, inline scripts can be allowed by specifying the base64-encoded hash of the source code in the policy. This hash must be prefixed by the used hash algorithm (sha256, sha384 or sha512). This can be achived by setting adding http://* to both style-src and/or script-src as follows:

script-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval'; 

和/或

style-src 'self' http://xxxx 'unsafe-inline' 'unsafe-eval';






此用例



但是我可以访问 https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose- throat-ent-specialist?specialization = Ear-Nose-Throat%20(ENT)%20Specialist& practice_id = 912154 容易如下:


  • 代码块:

  • Code Block:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions() 
options.add_argument('window-size=1200x600')
options.add_argument('--headless')
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')
driver.get("https://www.practo.com/delhi/doctor/dr-rajeev-puri-ear-nose-throat-ent-specialist?specialization=Ear-Nose-Throat%20(ENT)%20Specialist&practice_id=912154")
print(driver.page_source)
driver.quit()


  • 控制台输出:

  • Console Output:

    <html><head><title>Dr. Rajeev Puri - ENT/ Otorhinolaryngologist - Book Appointment Online, View Fees, Feedbacks | Practo</title><meta name="description" content="Dr. Rajeev Puri is an ENT/ Otorhinolaryngologist in DLF Phase IV. Book appointments Online, View Fees, User Feedbacks for Dr. Rajeev Puri | Practo"><meta charset="utf-8"><meta http-equiv="x-ua-compatible" content="ie=edge"><script src="https://js-agent.newrelic.com/nr-spa-1026.min.js"></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="https://surveys-static.survicate.com/widget_core-3.0.4.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//survey.survicate.com/workspaces/wfhrNWYKtlLEWMqcaXcweuzHeMRiSljw/web_surveys.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script src="//api.survicate.com/assets/survicate.js" async=""></script><script type="text/javascript" async="" src="https://www.googleadservices.com/pagead/conversion_async.js" nonce=""></script><script type="text/javascript" async="" src="https://www.google-analytics.com/plugins/ua/ec.js" nonce=""></script><script type="text/javascript" async="" src="https://www.practostatic.com/pel/clevertap/a.js"></script><script async="" src="//sweep.practo.com/sp.js"></script><script type="text/javascript" src="https://www.practostatic.com/pel/pel-1.6.1.js"></script><script async="" src="https://connect.facebook.net/en_US/fbevents.js"></script><script async="" src="//www.google-analytics.com/analytics.js"></script><script async="" src="https://www.googletagmanager.com/gtm.js?id=GTM-PSMVGL5"></script><script nonce="" type="text/javascript">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
                  new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
                  j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
                  'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
                  })(window,document,'script','dataLayer',"GTM-PSMVGL5");</script>
    


  • 请确保:

    • Selenium is upgraded to current levels Version 3.141.59.
    • ChromeDriver is updated to current ChromeDriver v79.0.3945.36 level.
    • Chrome is updated to current Chrome Version 79.0 level. (as per ChromeDriver v79.0 release notes)

    您可以找到一个调用eval()的相关讨论带有Selenium IDE的CSP

    这篇关于拒绝加载脚本,因为它违反了以下内容安全策略指令:ChromeDriver Chrome和Selenium的script-src错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    相关文章
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆