Web scraping live-changing data


Question

I am pretty new to web scraping. Static content is easy enough, but I would like to know whether there is a way to scrape a site like this one: https://threatmap.checkpoint.com/

I need to scrape all the live attacks from that site, but I don't even know how to start.

Recommended answer

Sometimes you don't need to scrape at all. Instead, look closely at how the page loads its data.

This site loads its data with the browser's built-in Fetch API.

You just need to decode the data from this source:
https://threatmap-api.checkpoint.com/ThreatMap/api/feed

Below is a sample fetch call:

fetch("https://threatmap-api.checkpoint.com/ThreatMap/api/feed", {
  "headers": {
    "accept": "text/event-stream",
    "accept-language": "en-US,pt;q=0.9,en-US;q=0.8,en;q=0.7",
    "cache-control": "no-cache",
    "sec-ch-ua": "\"Google Chrome\";v=\"89\", \"Chromium\";v=\"89\", \";Not A Brand\";v=\"99\"",
    "sec-ch-ua-mobile": "?0",
    "sec-fetch-dest": "empty",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-site"
  },
  "referrer": "https://threatmap.checkpoint.com/",
  "referrerPolicy": "strict-origin-when-cross-origin",
  "body": null,
  "method": "GET",
  "mode": "cors",
  "credentials": "omit"
});
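
The call above only opens the request; the response body still has to be read and split into events. The following is a minimal sketch of one way to do that, assuming the feed uses standard server-sent-events framing ("data:" lines, events separated by a blank line), which the "text/event-stream" accept header suggests but which is not confirmed here. The names createSseParser and streamFeed are hypothetical helpers, not part of the site's API.

```javascript
// Sketch only: assumes standard SSE framing; the feed's real format is unverified.

// Returns a push(chunk) function that buffers text and calls onEvent
// once per complete event (events end at a blank line).
function createSseParser(onEvent) {
  let buffer = "";
  return function push(chunk) {
    // Append the new text and normalize CRLF line endings to LF.
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    let sep;
    while ((sep = buffer.indexOf("\n\n")) !== -1) {
      const rawEvent = buffer.slice(0, sep);
      buffer = buffer.slice(sep + 2);
      let eventName = "message"; // default event type in SSE
      const dataLines = [];
      for (const line of rawEvent.split("\n")) {
        if (line.startsWith("data:")) {
          // Drop the field name and one optional leading space.
          dataLines.push(line.slice(5).replace(/^ /, ""));
        } else if (line.startsWith("event:")) {
          eventName = line.slice(6).trim();
        }
      }
      if (dataLines.length > 0) {
        onEvent({ event: eventName, data: dataLines.join("\n") });
      }
    }
  };
}

// Hypothetical helper: streams the response body into the parser.
// Needs a runtime with fetch and ReadableStream (modern browsers, Node 18+).
async function streamFeed(url, onEvent) {
  const response = await fetch(url, {
    headers: { accept: "text/event-stream" },
  });
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const push = createSseParser(onEvent);
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    push(decoder.decode(value, { stream: true }));
  }
}
```

A call such as streamFeed("https://threatmap-api.checkpoint.com/ThreatMap/api/feed", (e) => console.log(e.data)) would then log each event payload as it arrives. In a browser, the built-in EventSource object handles this framing for you, so the manual parser is mainly useful outside the browser or when you need control over the request.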

Here is a sample event:
{"a_c":1,"a_n":"DNS Enforcement Violation","a_t":"exploit","d_co":"SE","d_la":63.8284,"d_lo":20.2597,"d_s":"AC","s_co":"US","s_lo":-73.9712,"s_la":40.7428,"s_s":"NY","t":null}

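In readable form, this means:

  • Description: DNS Enforcement Violation
  • Type: exploit
  • Destination country/region: SE/AC - latitude/longitude: 63.8284, 20.2597
  • Source country/region: US/NY - latitude/longitude: 40.7428, -73.9712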
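
The mapping from the short keys to readable fields can be sketched as below. The field meanings are inferred only from the sample event and its description above, not from any official documentation, and the "a_c" and "t" keys are left uninterpreted. The names SAMPLE_EVENT, describeAttack, and formatAttack are hypothetical, and the "NY" value for "s_s" is taken from the description rather than from the raw event.

```javascript
// Sketch only: key meanings are inferred from one sample event.
const SAMPLE_EVENT =
  '{"a_c":1,"a_n":"DNS Enforcement Violation","a_t":"exploit",' +
  '"d_co":"SE","d_la":63.8284,"d_lo":20.2597,"d_s":"AC",' +
  '"s_co":"US","s_lo":-73.9712,"s_la":40.7428,"s_s":"NY","t":null}';

// Parse one event payload and rename its keys to readable fields.
function describeAttack(json) {
  const e = JSON.parse(json);
  return {
    description: e.a_n, // attack name
    type: e.a_t,        // attack type
    destination: {
      country: e.d_co,
      region: e.d_s,
      latitude: e.d_la,
      longitude: e.d_lo,
    },
    source: {
      country: e.s_co,
      region: e.s_s,
      latitude: e.s_la,
      longitude: e.s_lo,
    },
  };
}

// Render a decoded attack as a single log line.
function formatAttack(attack) {
  const place = (p) =>
    `${p.country}/${p.region} (${p.latitude}, ${p.longitude})`;
  return `${attack.type}: ${attack.description} | ` +
    `${place(attack.source)} -> ${place(attack.destination)}`;
}
```

For example, formatAttack(describeAttack(SAMPLE_EVENT)) gives a one-line summary with the source first and the destination second. If the feed adds or renames keys, only describeAttack should need to change.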
