How can I extract HTML code with Scapy?


Question

I recently began using the scapy library with Python 2.x and found the documentation on the sniff() function to be minimal. I began to play around with it and found that I can view TCP packets at a very low level. So far I have only found informational data. For example:

Here is what I entered in the scapy terminal:

A = sniff(filter="tcp and host 216.58.193.78", count=2)

This is a request to google.com asking for the homepage:

<Ether  dst=e8:de:27:55:17:f3 src=00:24:1d:20:a6:1b type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=60 id=46627 flags=DF frag=0L ttl=64 proto=tcp chksum=0x2a65 src=192.168.0.2 dst=216.58.193.78 options=[] |<TCP  sport=54036 dport=www seq=2948286264 ack=0 dataofs=10L reserved=0L flags=S window=29200 chksum=0x5a62 urgptr=0 options=[('MSS', 1460), ('SAckOK', ''), ('Timestamp', (389403, 0)), ('NOP', None), ('WScale', 7)] |>>>

Here is the response:

<Ether  dst=00:24:1d:20:a6:1b src=e8:de:27:55:17:f3 type=0x800 |<IP  version=4L ihl=5L tos=0x0 len=60 id=42380 flags= frag=0L ttl=55 proto=tcp chksum=0x83fc src=216.58.193.78 dst=192.168.0.2 options=[] |<TCP  sport=www dport=54036 seq=3087468609 ack=2948286265 dataofs=10L reserved=0L flags=SA window=42540 chksum=0xecaf urgptr=0 options=[('MSS', 1430), ('SAckOK', ''), ('Timestamp', (2823173876, 389403)), ('NOP', None), ('WScale', 7)] |>>>

Using this function, is there a way that I can extract HTML code from the response?

Also, what do those packets look like?

And finally, why are both of these packets nearly identical?

Recommended answer

The segments in your example are "nearly identical" because they are the TCP SYN and SYN-ACK segments, which are part of the TCP three-way handshake. Because the capture used count=2, it stopped after these two segments. The HTTP request and response come later in the connection (usually in the ESTABLISHED state, unless the TCP Fast Open option is used), so you need to look at the segments after the handshake to get the data you are interested in.

         SYN
C ---------------> S
       SYN-ACK
C <--------------- S
         ACK
C ---------------> S
    HTTP request
C ---------------> S
         ACK
C <--------------- S
    HTTP response
C <--------------- S  <= Here is the server's answer
         ACK
C ---------------> S
...
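
To tell the handshake segments in the exchange above apart from the ones that carry HTTP data, one option is to look at the TCP flag bits and the payload length. The following is a minimal sketch, not part of Scapy: `describe_segment` is a hypothetical helper name, and it works on plain integers so it can be tried without a live capture.

```python
# TCP flag bits as defined in the TCP specification (RFC 793).
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def describe_segment(flags, payload_len):
    """Label a TCP segment from its flag bits and payload length.

    Hypothetical helper for illustration; not a Scapy function.
    """
    if flags & SYN and flags & ACK:
        return "SYN-ACK (handshake)"
    if flags & SYN:
        return "SYN (handshake)"
    if payload_len > 0:
        return "data"
    if flags & ACK:
        return "ACK only"
    return "other"


# The two packets from the question: flags=S and flags=SA, no payload.
print(describe_segment(SYN, 0))
print(describe_segment(SYN | ACK, 0))
# A later segment carrying 120 bytes of HTTP data (length made up).
print(describe_segment(PSH | ACK, 120))
```

With a Scapy capture, the arguments could presumably be taken from each packet as `int(pkt[TCP].flags)` and `len(pkt[Raw].load) if Raw in pkt else 0`. Only segments labelled "data" can contain the HTTP request or response.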

In your case, you can use Scapy's Raw layer to extract the data carried above TCP (e.g. pkt[Raw], whose load field holds the payload bytes).
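
As a rough sketch of how the Raw payloads could be turned into HTML, the code below joins the server's payload bytes and splits the HTTP headers from the body. It rests on several assumptions: the traffic is plain HTTP (not HTTPS, where the payload would be encrypted), the segments arrived in order without retransmissions, and the body is neither compressed nor chunked. The function names `response_payloads` and `split_http_response` are hypothetical, not part of Scapy.

```python
def response_payloads(packets, server_ip):
    """Collect Raw payload bytes sent by server_ip, in capture order.

    Hypothetical helper; Scapy is imported here so the rest of the
    sketch can run without it.
    """
    from scapy.all import IP, Raw
    return [bytes(p[Raw].load) for p in packets
            if IP in p and Raw in p and p[IP].src == server_ip]


def split_http_response(payloads):
    """Join payload chunks and split HTTP headers from the body.

    Assumes in-order segments and an uncompressed, unchunked body.
    """
    data = b"".join(payloads)
    head, _sep, body = data.partition(b"\r\n\r\n")
    return head, body


# Stand-in payloads (made up for illustration) in place of a capture.
sample = [
    b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html>",
    b"<body>hi</body></html>",
]
head, body = split_http_response(sample)
print(body.decode("utf-8"))
```

With a real capture, the count passed to sniff() needs to be large enough to include the segments after the handshake, for example `pkts = sniff(filter="tcp port 80 and host 216.58.193.78", count=20)` followed by `split_http_response(response_payloads(pkts, "216.58.193.78"))`. The count of 20 is an arbitrary choice, and a site that only serves HTTPS will not yield readable HTML this way.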
