Selenium - python. How to capture network traffic's response


Question

I am using Python and Django to create a web app. I use Selenium to launch a headless browser (PhantomJS) and click through until I reach a particular page. I want to capture the network traffic and get the response of a particular network call. The response to this call is an HTML document.

Is there any way to achieve this?

Recommended answer

You can access either the browser log or the chromedriver log; they differ slightly in how they report network responses. The browser log is called performance and the driver log is called driver. Each returns a list of JSON-like entries, which you can parse to extract the events whose method belongs to the Network domain:

{'level': 'INFO',
  'message': '{"message":{"method":"Page.frameStoppedLoading","params":{"frameId":"FB10764A3ABF7FFC83110C39C5F7BF77"}},"webview":"C2D13BD13CF743B6D0695B35E9CC935C"}',
  'timestamp': 1538607113832},
 {'level': 'INFO',
  'message': '{"message":{"method":"Page.frameDetached","params":{"frameId":"FB10764A3ABF7FFC83110C39C5F7BF77"}},"webview":"C2D13BD13CF743B6D0695B35E9CC935C"}',
  'timestamp': 1538607113838},
 {'level': 'INFO',
  'message': '{"message":{"method":"Network.requestWillBeSent","params":{"documentURL":"https://stackoverflow.com/questions/52633697/selenium-python-how-to-capture-network-traffics-response","frameId":"C2D13BD13CF743B6D0695B35E9CC935C","hasUserGesture":false,"initiator":{"type":"other"},"loaderId":"5331BFDC4F466FCED920CFC9F033D2EC","request":{"headers":{"Upgrade-Insecure-Requests":"1","User-Agent":"Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36"},"initialPriority":"VeryHigh","method":"GET","mixedContentType":"none","referrerPolicy":"no-referrer-when-downgrade","url":"https://stackoverflow.com/questions/52633697/selenium-python-how-to-capture-network-traffics-response"},"requestId":"5331BFDC4F466FCED920CFC9F033D2EC","timestamp":104499.729,"type":"Document","wallTime":1538607113.838206}},"webview":"C2D13BD13CF743B6D0695B35E9CC935C"}',
  'timestamp': 1538607113839},...}

You need to enable logging in DesiredCapabilities and then parse the entries with the json module:

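import json
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Copy the defaults so the shared DesiredCapabilities.CHROME dict is not mutated.
caps = DesiredCapabilities.CHROME.copy()
# Enable the performance log. Note: ChromeDriver 75+ (W3C mode) expects the
# key 'goog:loggingPrefs' instead of 'loggingPrefs'.
caps['loggingPrefs'] = {'performance': 'ALL'}
driver = webdriver.Chrome(desired_capabilities=caps)
driver.get('https://stackoverflow.com/questions/52633697/selenium-python-how-to-capture-network-traffics-response')

def process_browser_log_entry(entry):
    """Decode one log entry and return its inner DevTools message."""
    return json.loads(entry['message'])['message']

browser_log = driver.get_log('performance')
events = [process_browser_log_entry(entry) for entry in browser_log]
# Keep only Network.response* events, e.g. Network.responseReceived.
events = [event for event in events if 'Network.response' in event['method']]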

I don't know whether you can access the response data itself this way, but you can get the URL of each response.
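To illustrate the point about response URLs, here is a minimal sketch that pulls the request id, URL, and status out of Network.responseReceived events. The helper name response_urls and the sample_log entry are hypothetical and only mimic the shape of what driver.get_log('performance') returns; the field names (requestId, response.url, response.status) are taken from the DevTools Protocol event.

```python
import json


def response_urls(log_entries):
    """Return (requestId, url, status) for each Network.responseReceived event."""
    results = []
    for entry in log_entries:
        # Each log entry wraps the DevTools event as a JSON string.
        message = json.loads(entry["message"])["message"]
        if message.get("method") != "Network.responseReceived":
            continue
        params = message["params"]
        response = params["response"]
        results.append((params["requestId"], response["url"], response["status"]))
    return results


# Hypothetical entries shaped like the performance log shown earlier.
sample_log = [
    {
        "level": "INFO",
        "timestamp": 0,
        "message": json.dumps({
            "message": {
                "method": "Network.responseReceived",
                "params": {
                    "requestId": "1000.1",
                    "response": {"url": "https://example.com/", "status": 200},
                },
            },
            "webview": "X",
        }),
    },
    {
        "level": "INFO",
        "timestamp": 1,
        "message": json.dumps({
            "message": {"method": "Page.frameStoppedLoading", "params": {"frameId": "F"}},
            "webview": "X",
        }),
    },
]

print(response_urls(sample_log))
```

With a real session you would pass driver.get_log('performance') instead of sample_log. Keep in mind that reading the log typically drains it, so call get_log once and reuse the list.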

UPDATE 2020-10-07⬇

As @Roey B and @Inactivist explain in the comments, you can access the response body using the Network.getResponseBody command:

driver.execute_cdp_cmd('Network.getResponseBody', {'requestId': events[0]["params"]["requestId"]})
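The one-liner above takes whichever event happens to be first. A slightly more defensive sketch, assuming a Chromium-based driver that supports execute_cdp_cmd, is to look for the Network.responseReceived event whose URL matches the call you care about and then ask for its body. The helper name find_response_body and the url_fragment parameter are hypothetical; the broad except is an assumption to keep the sketch free of Selenium imports (in practice Selenium raises WebDriverException when the body is no longer available).

```python
import base64
import json


def find_response_body(driver, log_entries, url_fragment):
    """Return the body of the first response whose URL contains url_fragment.

    Returns a str for text bodies, bytes for base64-encoded bodies,
    or None if no matching response body could be fetched.
    """
    for entry in log_entries:
        message = json.loads(entry["message"])["message"]
        if message.get("method") != "Network.responseReceived":
            continue
        params = message["params"]
        if url_fragment not in params["response"]["url"]:
            continue
        try:
            result = driver.execute_cdp_cmd(
                "Network.getResponseBody", {"requestId": params["requestId"]}
            )
        except Exception:
            # Assumption: the browser may have discarded this body already;
            # skip it and try the next matching event.
            continue
        if result.get("base64Encoded"):
            return base64.b64decode(result["body"])
        return result["body"]
    return None
```

Usage would look like find_response_body(driver, driver.get_log('performance'), '/questions/52633697'), called soon after the page loads so the browser still holds the body.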
