Distinguishing between the header and the content of an HTTP server response (sockets)


Question

I want to know whether it is possible to find out where in the response stream the header ends.

The background of the question is as follows: I am using sockets in C to get content from a website, and the content is gzip-encoded. I would like to read the content directly from the stream and decode the gzip content with zlib. But how do I know where the HTTP header ends and the gzip content starts?

I roughly tried two ways, which give me what are, in my opinion, strange results. First, I read in the whole stream and printed it to the terminal; my HTTP header ends with "\r\n\r\n" as I expected. The second time, I called recv() just once to get the header and then read the content in a while loop; here the header ends without "\r\n\r\n".

Why? And which way is the right way to read in the content?

Here is the code, so you can see how I'm getting the response from the server.

//first way (gives \r\n\r\n)
char *output, *output_header, *output_content, **output_result;
size_t size;
FILE *stream;
stream = open_memstream (&output, &size);
char BUF[BUFSIZ];
while(recv(socket_desc, BUF, (BUFSIZ - 1), 0) > 0)
{
    fprintf (stream, "%s", BUF);
}
fflush(stream);
fclose(stream);

output_result = str_split(output, "\r\n\r\n");
output_header = output_result[0];
output_content = output_result[1];

printf("Header:\n%s\n", output_header);
printf("Content:\n%s\n", output_content);


//second way (doesn't give \r\n\r\n)
char *content, *output_header;
size_t size;
FILE *stream;
stream = open_memstream (&content, &size);
char BUF[BUFSIZ];

if(recv(socket_desc, BUF, (BUFSIZ - 1), 0) > 0)
{
    output_header = BUF;
}

while(recv(socket_desc, BUF, (BUFSIZ - 1), 0) > 0)
{
    fprintf (stream, "%s", BUF); //i would just use this as input stream to zlib
}
fflush(stream);
fclose(stream);

printf("Header:\n%s\n", output_header);
printf("Content:\n%s\n", content);

Both give the same result when printed to the terminal, but I expected the second one to print some more line breaks, because in the first one they get lost when the string is split.

I am new to C, so I might just be overlooking something simple.

Recommended answer

You are calling recv() in a loop until the socket disconnects or fails, storing all of the raw data in your char* buffer. You are also writing the received data to your stream the wrong way: recv() does not null-terminate BUF, and "%s" stops at the first zero byte, which binary gzip data may contain. That is not the correct way to read an HTTP response, especially if HTTP keep-alive is used (in which case no disconnect occurs at the end of the response). You must follow the rules outlined in RFC 2616. Namely:

  1. Read until the "\r\n\r\n" sequence is encountered. This terminates the response headers. Do not read any more bytes past that yet.

  2. Analyze the received headers, per the rules in RFC 2616 Section 4.4. They tell you the actual format of the remaining response data.

  3. Read the remaining data, if any, per the format discovered in #2.

  4. Check the received headers for the presence of a Connection: close header if the response is using HTTP 1.1, or the lack of a Connection: keep-alive header if the response is using HTTP 0.9 or 1.0. If detected, close your end of the socket connection, because the server is closing its end. Otherwise, keep the connection open and re-use it for subsequent requests (unless you are done using the connection, in which case do close it).

  5. Process the received data as needed.
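
As a rough illustration of the first step, here is a minimal C sketch that accumulates received bytes and looks for the header terminator. The names find_header_end and recv_headers are hypothetical helpers, not part of any library. One assumption worth stating: because recv() returns whatever has arrived, the buffer may already contain the first bytes of the body by the time the terminator is seen, so the sketch returns the body offset and leaves those surplus bytes in the buffer for the body reader instead of discarding them. It also uses the byte count returned by recv() rather than treating the data as a C string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns the offset of the first body byte (just past "\r\n\r\n"),
 * or -1 if the terminator is not present in the first len bytes. */
ssize_t find_header_end(const char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return (ssize_t)(i + 4);
    }
    return -1;
}

/* Receives into buf until the header terminator has arrived.
 * On success returns the body offset and stores the total number of
 * bytes received in *len; bytes from the returned offset up to *len
 * already belong to the body. Returns -1 on error, on disconnect, or
 * if the headers do not fit into cap bytes. */
ssize_t recv_headers(int fd, char *buf, size_t cap, size_t *len)
{
    assert(buf != NULL && len != NULL);
    *len = 0;
    while (*len < cap) {
        ssize_t n = recv(fd, buf + *len, cap - *len, 0);
        if (n <= 0)
            return -1;
        *len += (size_t)n;

        /* Search the whole accumulated buffer, since the terminator
         * can be split across two recv() calls. */
        ssize_t end = find_header_end(buf, *len);
        if (end >= 0)
            return end;
    }
    return -1;
}
```

Rescanning the whole buffer after each recv() is simple but quadratic in the worst case; for typical header sizes that is unlikely to matter.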

In short, you need to do something more like this instead (pseudocode):

string headers[];
byte data[];

string statusLine = read a CRLF-delimited line;
int statusCode = extract from status line;
string responseVersion = extract from status line;

do
{
    string header = read a CRLF-delimited line;
    if (header == "") break;
    add header to headers list;
}
while (true);

if ( !((statusCode in [1xx, 204, 304]) || (request was "HEAD")) )
{
    if (headers["Transfer-Encoding"] ends with "chunked")
    {
        do
        {
            string chunk = read a CRLF delimited line;
            int chunkSize = extract from chunk line;
            if (chunkSize == 0) break;

            read exactly chunkSize number of bytes into data storage;

            read and discard until a CRLF has been read;
        }
        while (true);

        do
        {
            string header = read a CRLF-delimited line;
            if (header == "") break;
            add header to headers list;
        }
        while (true);
    }
    else if (headers["Content-Length"] is present)
    {
        read exactly Content-Length number of bytes into data storage;
    }
    else if (headers["Content-Type"] begins with "multipart/")
    {
        string boundary = extract from Content-Type header;
        read into data storage until terminating boundary has been read;
    }
    else
    {
        read bytes into data storage until disconnected;
    }
}

if (!disconnected)
{
    if (responseVersion == "HTTP/1.1")
    {
        if (headers["Connection"] == "close")
            close connection;
    }
    else
    {
        if (headers["Connection"] != "keep-alive")
            close connection;
    }
}

check statusCode for errors;
process data contents, per info in headers list;
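
Two of the primitives the pseudocode relies on, "read exactly N bytes" and "extract the size from a chunk line", could be sketched in C roughly as follows. Both function names (recv_exact and parse_chunk_size) are hypothetical. The sketch assumes a blocking socket, and it assumes any body bytes that were already buffered while reading the headers have been consumed before recv_exact is called.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Reads exactly n bytes from fd into dst. Returns 0 on success, or -1
 * on error or if the peer disconnects first. (A sketch: it does not
 * retry when recv() is interrupted by a signal.) */
int recv_exact(int fd, void *dst, size_t n)
{
    char *p = dst;
    while (n > 0) {
        ssize_t got = recv(fd, p, n, 0);
        if (got <= 0)
            return -1;
        p += got;
        n -= (size_t)got;
    }
    return 0;
}

/* Parses the hexadecimal size at the start of a chunk line such as
 * "1a3f" or "1A;name=value". Returns 0 and stores the size in *out,
 * or -1 if there are no hex digits, the value overflows size_t, or
 * something other than a line end or chunk extension follows. */
int parse_chunk_size(const char *line, size_t *out)
{
    assert(line != NULL && out != NULL);
    size_t value = 0;
    const char *p = line;

    for (; isxdigit((unsigned char)*p); p++) {
        size_t digit = isdigit((unsigned char)*p)
            ? (size_t)(*p - '0')
            : (size_t)(tolower((unsigned char)*p) - 'a' + 10);
        if (value > (SIZE_MAX - digit) / 16)
            return -1;  /* would overflow size_t */
        value = value * 16 + digit;
    }
    if (p == line)
        return -1;  /* no hex digits at all */

    /* Only the end of the line or a chunk extension may follow. */
    if (*p != '\0' && *p != '\r' && *p != ';' && *p != ' ' && *p != '\t')
        return -1;

    *out = value;
    return 0;
}
```

The size is parsed by hand rather than with strtoul(), because strtoul() also accepts a leading sign, leading whitespace, and a "0x" prefix, none of which belong in a chunk-size line.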
