How to use puppeteer to dump WebSocket data


Question


I want to get the WebSocket data from this page: https://upbit.com/exchange?code=CRIX.UPBIT.KRW-BTC. Its WebSocket URL is dynamic and only valid for the first connection; the second time you connect to it, it no longer sends any data.

So I wonder whether headless Chrome could help me monitor the WebSocket data.

Any ideas? Thanks!

Solution

You don't actually need to do anything complex here. Although the URL looks dynamic, it works fine from code as well. If it doesn't work for you, that is because you first need to understand what is happening in the background.

First, let's look at the Network tab.

The cookies and the Origin header may matter for connecting, so we note them down.

Now let's look at the data exchanged on the socket.

If you look at the frames, the first frame received carries o as its data, which indicates that the connection has been opened. The website then sends some data to the socket, which is probably related to what we want to query. When the connection stays idle for some time, the socket receives h as its data (as shown in the second image). The /sockjs/ path in the URL suggests the SockJS protocol, in which o is the open frame and h is a heartbeat frame.
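To make the frame types above concrete, here is a minimal sketch that classifies a raw frame by its first character. It assumes the server speaks SockJS framing (o = open, h = heartbeat, a = array of messages, c = close), which the URL suggests but the page itself does not confirm; the function name parseSockJsFrame is hypothetical.

```javascript
// Classify a raw frame by its first character, assuming SockJS framing.
// parseSockJsFrame is an illustrative helper, not part of any library.
function parseSockJsFrame(raw) {
    const text = raw.toString()   // accepts a string or a Buffer
    const type = text.charAt(0)
    const body = text.slice(1)
    switch (type) {
        case "o":
            return { type: "open" }
        case "h":
            return { type: "heartbeat" }
        case "a":
            // "a" is followed by a JSON array of message strings
            return { type: "messages", messages: JSON.parse(body) }
        case "c":
            // "c" is followed by a JSON array: [code, reason]
            return { type: "close", reason: JSON.parse(body) }
        default:
            return { type: "unknown", raw: text }
    }
}

console.log(parseSockJsFrame("o"))
console.log(parseSockJsFrame('a["first","second"]'))
```

A helper like this could replace the direct comparison against "o" and "h" in a message handler if you also want to unpack the a[...] data frames.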

To get the exact data, we put a breakpoint in the code

and then print the value in the console.

Now we have enough information to move on to the coding part. I found the library below to be a good WebSocket library for this:

https://github.com/websockets/ws

So we install it with either of these:

yarn add ws
npm install ws --save

Now we write our code

const WebSocket = require("ws")
// No subprotocols are needed, so the options object is passed as the second argument.
const ws = new WebSocket("wss://example.com/sockjs/299/enavklnl/websocket", {
    headers: {
        "Cookie": "<cookie data noted earlier>",
        "User-Agent": "<Your browser agent>"
    },
    origin: "https://example.com",
})
const opening_message = '["[{\\"ticket\\":\\"ram macbook\\"},{\\"type\\":\\"recentCrix\\",\\"codes\\":[\\"CRIX.UPBIT.KRW-BTC\\",\\"CRIX.BITFINEX.USD-BTC\\",\\"CRIX.BITFLYER.JPY-BTC\\",\\"CRIX.OKCOIN.CNY-BTC\\",\\"CRIX.KRAKEN.EUR-BTC\\",\\"CRIX.UPBIT.KRW-DASH\\",\\"CRIX.UPBIT.KRW-ETH\\",\\"CRIX.UPBIT.KRW-NEO\\",\\"CRIX.UPBIT.KRW-BCC\\",\\"CRIX.UPBIT.KRW-MTL\\",\\"CRIX.UPBIT.KRW-LTC\\",\\"CRIX.UPBIT.KRW-STRAT\\",\\"CRIX.UPBIT.KRW-XRP\\",\\"CRIX.UPBIT.KRW-ETC\\",\\"CRIX.UPBIT.KRW-OMG\\",\\"CRIX.UPBIT.KRW-SNT\\",\\"CRIX.UPBIT.KRW-WAVES\\",\\"CRIX.UPBIT.KRW-PIVX\\",\\"CRIX.UPBIT.KRW-XEM\\",\\"CRIX.UPBIT.KRW-ZEC\\",\\"CRIX.UPBIT.KRW-XMR\\",\\"CRIX.UPBIT.KRW-QTUM\\",\\"CRIX.UPBIT.KRW-LSK\\",\\"CRIX.UPBIT.KRW-STEEM\\",\\"CRIX.UPBIT.KRW-XLM\\",\\"CRIX.UPBIT.KRW-ARDR\\",\\"CRIX.UPBIT.KRW-KMD\\",\\"CRIX.UPBIT.KRW-ARK\\",\\"CRIX.UPBIT.KRW-STORJ\\",\\"CRIX.UPBIT.KRW-GRS\\",\\"CRIX.UPBIT.KRW-VTC\\",\\"CRIX.UPBIT.KRW-REP\\",\\"CRIX.UPBIT.KRW-EMC2\\",\\"CRIX.UPBIT.KRW-ADA\\",\\"CRIX.UPBIT.KRW-SBD\\",\\"CRIX.UPBIT.KRW-TIX\\",\\"CRIX.UPBIT.KRW-POWR\\",\\"CRIX.UPBIT.KRW-MER\\",\\"CRIX.UPBIT.KRW-BTG\\",\\"CRIX.COINMARKETCAP.KRW-USDT\\"]},{\\"type\\":\\"crixTrade\\",\\"codes\\":[\\"CRIX.UPBIT.KRW-BTC\\"]},{\\"type\\":\\"crixOrderbook\\",\\"codes\\":[\\"CRIX.UPBIT.KRW-BTC\\"]}]"]'
ws.on('open', function open() {
    console.log("opened");
});

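ws.on('message', function incoming(data) {
    // Recent ws versions deliver a Buffer, so compare against the decoded text.
    const text = data.toString()
    if (text === "o" || text === "h") {
        console.log("sending opening message")
        ws.send(opening_message)
    }
    else {
        console.log("Received", text)
    }
});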

Running the code, we start receiving data.
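The hand-escaped opening message is easier to maintain if it is generated. The sketch below assumes, based on the shape of that string, that the server expects a JSON array containing one JSON-encoded string; the helper name buildOpeningMessage and the shortened list of codes are illustrative only.

```javascript
// Build the double-encoded message: a JSON array that holds one JSON string.
// buildOpeningMessage is a hypothetical helper name.
function buildOpeningMessage(payload) {
    return JSON.stringify([JSON.stringify(payload)])
}

// A shortened payload with the same structure as the captured message.
const payload = [
    { ticket: "ram macbook" },
    { type: "crixTrade", codes: ["CRIX.UPBIT.KRW-BTC"] },
    { type: "crixOrderbook", codes: ["CRIX.UPBIT.KRW-BTC"] },
]

const openingMessage = buildOpeningMessage(payload)
console.log(openingMessage)
```

The result can be passed to ws.send() in place of the literal string, and the list of codes can then be edited as ordinary JavaScript data.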

Now if I replace

const ws = new WebSocket("wss://example.com/sockjs/299/enavklnl/websocket", {
    headers: {
        "Cookie": "<cookie data noted earlier>",
        "User-Agent": "<Your browser agent>"
    },
    origin: "https://example.com",
})

with

const ws = new WebSocket("wss://example.com/sockjs/299/enavklnl/websocket")

it still works, which means the cookies and origin were never strictly needed. I would still recommend sending them.
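If you would rather stay with headless Chrome, as the question asks, one option is to listen to the WebSocket events of the Chrome DevTools Protocol through puppeteer. This is only a sketch under some assumptions: puppeteer is installed, the release is recent enough to provide page.createCDPSession() (older releases expose page.target().createCDPSession() instead), and the names formatFrame and dumpWebSocketFrames are hypothetical.

```javascript
// Format one DevTools WebSocket frame event as a single log line.
function formatFrame(direction, event) {
    return `${direction} ${event.requestId} ${event.response.payloadData}`
}

// Open the page in headless Chrome and log every WebSocket frame for a while.
async function dumpWebSocketFrames(pageUrl, durationMs = 30000) {
    const puppeteer = require("puppeteer")   // loaded lazily; must be installed
    const browser = await puppeteer.launch()
    try {
        const page = await browser.newPage()
        const client = await page.createCDPSession()
        await client.send("Network.enable")

        client.on("Network.webSocketCreated", (e) =>
            console.log("created", e.requestId, e.url))
        client.on("Network.webSocketFrameSent", (e) =>
            console.log(formatFrame("sent", e)))
        client.on("Network.webSocketFrameReceived", (e) =>
            console.log(formatFrame("recv", e)))

        await page.goto(pageUrl)
        // Keep the page open so that frames keep arriving.
        await new Promise((resolve) => setTimeout(resolve, durationMs))
    } finally {
        await browser.close()
    }
}
```

Calling dumpWebSocketFrames("https://example.com") would print the frames the page itself exchanges, so the dynamic URL and the opening message never have to be reproduced by hand.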
