How to store an Http Response that may contain binary data?

Question

As I described in a previous question, I have an assignment to write a proxy server. It partially works now, but I still have a problem with handling of gzipped information. I store the HttpResponse in a String, and it appears I can't do that with gzipped content. However, the headers are text which I need to parse, and they all come from the same InputStream. My question is, what do I have to do in order to correctly handle binary responses, while still parsing the headers as strings?

>> Please see the edit below before you look at the code.

Here's the Response class implementation:

public class Response {
    private String fullResponse = "";
    private BufferedReader reader;
    private boolean busy = true;
    private int responseCode;
    private CacheControl cacheControl;

    public Response(String input) {
        this(new ByteArrayInputStream(input.getBytes()));
    }

    public Response(InputStream input) {
        reader = new BufferedReader(new InputStreamReader(input));
        try {
            while (!reader.ready());//wait for initialization.

            String line;
            while ((line = reader.readLine()) != null) {
                fullResponse += "\r\n" + line;

                if (HttpPatterns.RESPONSE_CODE.matches(line)) {
                    responseCode = (Integer) HttpPatterns.RESPONSE_CODE.process(line);
                } else if (HttpPatterns.CACHE_CONTROL.matches(line)) {
                    cacheControl = (CacheControl) HttpPatterns.CACHE_CONTROL.process(line);
                }
            }
            reader.close();
            fullResponse = "\r\n" + fullResponse.trim() + "\r\n\r\n";
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } 
        busy = false;
    }

    public CacheControl getCacheControl() {
        return cacheControl;
    }

    public String getFullResponse() {
        return fullResponse;
    }

    public boolean isBusy() {
        return busy;
    }

    public int getResponseCode() {
        return responseCode;
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result
                + ((fullResponse == null) ? 0 : fullResponse.hashCode());
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (!(obj instanceof Response))
            return false;
        Response other = (Response) obj;
        if (fullResponse == null) {
            if (other.fullResponse != null)
                return false;
        } else if (!fullResponse.equals(other.fullResponse))
            return false;
        return true;
    }

    @Override
    public String toString() {
        return "Response\n==============================\n" + fullResponse;
    }
}

And here's HttpPatterns:

public enum HttpPatterns {
    RESPONSE_CODE("^HTTP/1\\.1 (\\d+) .*$"),
    CACHE_CONTROL("^Cache-Control: (\\w+)$"),
    HOST("^Host: (\\w+)$"),
    REQUEST_HEADER("(GET|POST) ([^\\s]+) ([^\\s]+)$"),
    ACCEPT_ENCODING("^Accept-Encoding: .*$");

    private final Pattern pattern;

    HttpPatterns(String regex) {
        pattern = Pattern.compile(regex);
    }

    public boolean matches(String expression) {
        return pattern.matcher(expression).matches();
    }

    public Object process(String expression) {
        Matcher matcher = pattern.matcher(expression);
        if (!matcher.matches()) {
            throw new RuntimeException("Called `process`, but the expression doesn't match. Call `matches` first.");
        }

        if (this == RESPONSE_CODE) {
            return Integer.parseInt(matcher.group(1));
        } else if (this == CACHE_CONTROL) {
            return CacheControl.parseString(matcher.group(1));
        } else if (this == HOST) {
            return matcher.group(1);
        } else if (this == REQUEST_HEADER) {
            return new RequestHeader(RequestType.parseString(matcher.group(1)), matcher.group(2), matcher.group(3));
        } else { //never happens
            return null;
        }
    }


}




Edit

I tried implementing this according to the suggestions, but it's not working and I'm becoming desperate. When I try to view an image, I get the following message from the browser:


The image "http://www.google.com/images/logos/ps_logo2.png" cannot be displayed because it contains errors.

Here's the log:

Request
==============================

GET http://www.google.com/images/logos/ps_logo2.png HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Cookie: PREF=ID=31f95dd7f42dfc7d:TM=1303507626:LM=1303507626:S=D4kIZ6rGFrlOUWlm


Not Reading from the Cache!!!!
I am going to try to connect to: www.google.com at port 80
Connected.
Writing to the server's buffer...
flushed.
Getting a response...
Got a binary response!


contentLength = 26209; headers.length() = 312; responseLength = 12136; fullResponse length = 12136


Got a response!

Writing to the Cache!!!!
I am going to write the following response:

HTTP/1.1 200 OK
Content-Type: image/png
Last-Modified: Thu, 05 Aug 2010 22:54:44 GMT
Date: Wed, 04 May 2011 15:05:30 GMT
Expires: Wed, 04 May 2011 15:05:30 GMT
Cache-Control: private, max-age=31536000
X-Content-Type-Options: nosniff
Server: sffe
Content-Length: 26209
X-XSS-Protection: 1; mode=block

 Response body is binary and was truncated.
Finished with request!

Here's the new Response class:

public class Response {
    private String headers = "";
    private BufferedReader reader;
    private boolean busy = true;
    private int responseCode;
    private CacheControl cacheControl;
    private InputStream fullResponse;
    private ContentEncoding encoding = ContentEncoding.TEXT;
    private ContentType contentType = ContentType.TEXT;
    private int contentLength;

    public Response(String input) {
        this(new ByteArrayInputStream(input.getBytes()));
    }

    public Response(InputStream input) {

        ByteArrayOutputStream tempStream = new ByteArrayOutputStream();
        InputStreamReader inputReader = new InputStreamReader(input);
        try {
            while (!inputReader.ready());
            int responseLength = 0;
            while (inputReader.ready()) {
                tempStream.write(inputReader.read());
                responseLength++;
            }
            /*
             * Read the headers
             */
            reader = new BufferedReader(new InputStreamReader(new ByteArrayInputStream(tempStream.toByteArray())));
            while (!reader.ready());//wait for initialization.

            String line;
            while ((line = reader.readLine()) != null) {
                headers += "\r\n" + line;

                if (HttpPatterns.RESPONSE_CODE.matches(line)) {
                    responseCode = (Integer) HttpPatterns.RESPONSE_CODE.process(line);
                } else if (HttpPatterns.CACHE_CONTROL.matches(line)) {
                    cacheControl = (CacheControl) HttpPatterns.CACHE_CONTROL.process(line);
                } else if (HttpPatterns.CONTENT_ENCODING.matches(line)) {
                    encoding = (ContentEncoding) HttpPatterns.CONTENT_ENCODING.process(line);
                } else if (HttpPatterns.CONTENT_TYPE.matches(line)) {
                    contentType = (ContentType) HttpPatterns.CONTENT_TYPE.process(line);
                } else if (HttpPatterns.CONTENT_LENGTH.matches(line)) {
                    contentLength = (Integer) HttpPatterns.CONTENT_LENGTH.process(line);
                } else if (line.isEmpty()) {
                    break;
                }
            }

            InputStreamReader streamReader = new InputStreamReader(new ByteArrayInputStream(tempStream.toByteArray()));
            while (!reader.ready());//wait for initialization.
            //Now let's get the rest
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int counter = 0;
            while (streamReader.ready() && counter < (responseLength - contentLength)) {
                out.write((char) streamReader.read());
                counter++;
            }
            if (encoding == ContentEncoding.BINARY || contentType == ContentType.BINARY) {
                System.out.println("Got a binary response!");
                while (streamReader.ready()) {
                    out.write(streamReader.read());
                }
            } else {
                System.out.println("Got a text response!");
                while (streamReader.ready()) {
                    out.write((char) streamReader.read());
                }
            }
            fullResponse = new ByteArrayInputStream(out.toByteArray());

            System.out.println("\n\ncontentLength = " + contentLength + 
                    "; headers.length() = " + headers.length() + 
                    "; responseLength = " + responseLength + 
                    "; fullResponse length = " + out.toByteArray().length + "\n\n");

        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } 
        busy = false;
    }

}

And here's the ProxyServer class:

class ProxyServer {
    public void start() {
        while (true) {
            Socket serverSocket;
            Socket clientSocket;
            OutputStreamWriter toClient;
            BufferedWriter toServer;
            try {
                //The client is meant to put data on the port, read the socket.
                clientSocket = listeningSocket.accept();
                Request request = new Request(clientSocket.getInputStream());
                //System.out.println("Accepted a request!\n" + request);
                while(request.busy);
                //Make a connection to a real proxy.
                //Host & Port - should be read from the request
                URL url = null;
                try {
                    url = new URL(request.getRequestURL());
                } catch (MalformedURLException e){
                    url = new URL("http:\\"+request.getRequestHost()+request.getRequestURL());
                }

                System.out.println(request);

                //remove entry from cache if needed
                if (!request.getCacheControl().equals(CacheControl.CACHE) && cache.containsRequest(request)) {
                    cache.remove(request);
                }

                Response response = null;

                if (request.getRequestType() == RequestType.GET && request.getCacheControl().equals(CacheControl.CACHE) && cache.containsRequest(request)) {
                    System.out.println("Reading from the Cache!!!!");
                    response = cache.get(request);
                } else {
                    System.out.println("Not Reading from the Cache!!!!");
                    //Get the response from the destination
                    int remotePort = (url.getPort() == -1) ? 80 : url.getPort();
                    System.out.println("I am going to try to connect to: " + url.getHost() + " at port " + remotePort);
                    serverSocket = new Socket(url.getHost(), remotePort);
                    System.out.println("Connected.");
                    serverSocket.setSoTimeout(50000);

                    //write to the server - keep it open.
                    System.out.println("Writing to the server's buffer...");
                    toServer = new BufferedWriter(new OutputStreamWriter(serverSocket.getOutputStream()));
                    toServer.write(request.getFullRequest());
                    toServer.flush();
                    System.out.println("flushed.");

                    System.out.println("Getting a response...");
                    response = new Response(serverSocket.getInputStream());
                    //System.out.println("Got a response!\n" + response);
                    System.out.println("Got a response!\n");
                    //wait for the response
                    while(response.isBusy());
                }

                if (request.getRequestType() == RequestType.GET && request.getCacheControl().equals(CacheControl.CACHE) && response.getResponseCode() == 200) {
                    System.out.println("Writing to the Cache!!!!");
                    cache.put(request, response);
                }
                else System.out.println("Not Writing to the Cache!!!!");
                response = filter.filter(response);

                // Return the response to the client
                toClient = new OutputStreamWriter(clientSocket.getOutputStream());
                System.out.println("I am going to write the following response:\n" + response);
                BufferedReader responseReader = new BufferedReader(new InputStreamReader(response.getFullResponse()));
                while (responseReader.ready()) {
                    toClient.write(responseReader.read());
                }
                toClient.flush();
                toClient.close();
                clientSocket.close();
                System.out.println("Finished with request!");

            } catch (IOException e) {
                e.printStackTrace();
                continue;
            }
        }
   }
}

I would appreciate any and all feedback/insight/suggestion regarding how to solve this, and would of course prefer some actual code.

Recommended answer

Store it in a byte array:

byte[] buffer = new byte[???];
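To see why a String is a poor container for a gzipped or image body, here is a minimal sketch that pushes the first three bytes of a gzip stream through a text decode and encode. It assumes UTF-8 as the charset (the question's code uses the platform default, which may differ), and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class StringRoundTripDemo {
    /** Decodes bytes as UTF-8 text and encodes them again, as a String-based store would. */
    static byte[] roundTrip(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First bytes of a gzip stream. 0x8b on its own is not a valid UTF-8 sequence,
        // so the decoder substitutes a replacement character for it.
        byte[] gzipMagic = {(byte) 0x1f, (byte) 0x8b, (byte) 0x08};
        byte[] after = roundTrip(gzipMagic);

        System.out.println("unchanged: " + Arrays.equals(gzipMagic, after));
        System.out.println("length before = " + gzipMagic.length + ", after = " + after.length);
    }
}
```

Plain ASCII header text survives this round trip, which is why the headers can still be turned into a String while the body has to stay in bytes.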

A more detailed procedure:

  • Create a buffer large enough for the response header (and throw an exception if the header is bigger).
  • Read bytes into the buffer until you find \r\n\r\n in the buffer. You can write a helper function for this, for example static int arrayIndexOf(byte[] haystack, int offset, int length, byte[] needle)
  • When you encounter the end of the header, create a string from the first n bytes of the buffer. You can then use RegEx on this string (also note that RegEx is not the best method to parse HTTP headers).
  • Be prepared for the buffer to contain additional data after the header: these are the first bytes of the response body. You have to copy these bytes to the output stream, output file, or output buffer.
  • Read the rest of the response body (until Content-Length bytes have been read or the stream is closed).
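On the remark that RegEx is not the best way to parse HTTP headers: one alternative is to split the header string into lines and cut each line at its first colon. The sketch below is only an illustration under simple assumptions (no folded header lines, and the last value wins if a header name repeats); HeaderParser and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class HeaderParser {
    /** Parses the "Name: value" lines after the status line into a map keyed by lower-cased name. */
    static Map<String, String> parseHeaders(String headerBlock) {
        Map<String, String> headers = new LinkedHashMap<>();
        String[] lines = headerBlock.split("\r\n");
        for (int i = 1; i < lines.length; i++) { // line 0 is the status line
            int colon = lines[i].indexOf(':');
            if (colon <= 0) {
                continue; // skip blank or malformed lines
            }
            String name = lines[i].substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = lines[i].substring(colon + 1).trim();
            headers.put(name, value); // a repeated header name overwrites the earlier value
        }
        return headers;
    }

    /** Extracts the numeric status code from a status line such as "HTTP/1.1 200 OK". */
    static int parseStatusCode(String headerBlock) {
        String statusLine = headerBlock.split("\r\n", 2)[0];
        return Integer.parseInt(statusLine.split(" ")[1]);
    }

    public static void main(String[] args) {
        String head = "HTTP/1.1 200 OK\r\nContent-Type: image/png\r\nContent-Length: 26209\r\n\r\n";
        System.out.println(parseStatusCode(head));
        System.out.println(parseHeaders(head));
    }
}
```

Header names are case-insensitive in HTTP, which is why the map keys are lower-cased before lookup.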

Edit:

You are not following the steps I suggested. inputReader.ready() is the wrong way to detect the phases of the response: it only reports whether data can be read right now without blocking, and there is no guarantee that the header, or the whole response, will arrive in a single burst.
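Instead of polling ready(), the usual pattern is to keep calling read until the expected number of bytes has arrived or read returns -1. The following is a small sketch of that loop; readExactly and the trickle test stream are hypothetical helpers, and the 3-byte limit only simulates data arriving in several bursts.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class ReadExactly {
    /** Reads exactly count bytes, blocking as needed; throws EOFException if the stream ends early. */
    static byte[] readExactly(InputStream in, int count) throws IOException {
        byte[] data = new byte[count];
        int filled = 0;
        while (filled < count) {
            // read may deliver fewer bytes than requested, so keep asking for the remainder
            int n = in.read(data, filled, count - filled);
            if (n == -1) {
                throw new EOFException("Stream ended after " + filled + " of " + count + " bytes");
            }
            filled += n;
        }
        return data;
    }

    /** Test stream that hands out at most 3 bytes per read call, like a slow connection. */
    static InputStream trickle(byte[] source) {
        return new ByteArrayInputStream(source) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        byte[] received = readExactly(trickle(source), source.length);
        System.out.println("complete: " + Arrays.equals(source, received));
    }
}
```

In the log above, responseLength (12136) is smaller than contentLength (26209), which is what a loop that stops as soon as ready() returns false would produce.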

I tried to sketch the approach in code (except for the arrayIndexOf function).

InputStream is; // the stream the response arrives on, e.g. the server socket's input stream

// Create a buffer large enough for the response header (and throw an exception if it is bigger).
byte[] headEnd = {13, 10, 13, 10}; // \r \n \r \n
byte[] buffer = new byte[10 * 1024];
int length = 0;

// Read bytes to the buffer until you find `\r\n\r\n` in the buffer.
int bytes = 0;
int pos;
while ((pos = arrayIndexOf(buffer, 0, length, headEnd)) == -1 && (bytes = is.read(buffer, length, buffer.length - length)) > -1) {
    length += bytes;

    // buffer is full but the end signature has not been found
    if (length == buffer.length && arrayIndexOf(buffer, 0, length, headEnd) == -1)
        throw new RuntimeException("Response header too long");
}

// the stream was closed before a complete header arrived
if (pos == -1)
    throw new RuntimeException("Stream closed before the end of the header");

// pos contains the starting index of the end signature (\r\n\r\n) so we add 4 bytes
pos += 4;

// When you encounter the end of header, create a string from the first *n* bytes
String header = new String(buffer, 0, pos);

System.out.println(header);

// Be prepared that the buffer will contain additional data after the header
// ... so we process it
System.out.write(buffer, pos, length - pos);

// process the rest until connection is closed
while ((bytes = is.read(buffer, 0, buffer.length)) > -1) {
    System.out.write(buffer, 0, bytes);
}

The arrayIndexOf method could look something like this: (there are probably faster versions)

public static int arrayIndexOf(byte[] haystack, int offset, int length, byte[] needle) {
    // "<=" so that a needle sitting at the very end of the searched range is still found
    for (int i = offset; i <= offset + length - needle.length; i++) {
        boolean match = false;
        for (int j = 0; j < needle.length; j++) {
            match = haystack[i + j] == needle[j];
            if (!match)
                break;
        }
        if (match)
            return i;
    }
    return -1;
}
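Putting the steps together, a response could be held as a header String plus a byte[] body. The class below is a rough sketch rather than a drop-in replacement for the Response class in the question: RawResponse and its members are hypothetical names, the header is decoded as ISO-8859-1 by assumption, and chunked transfer encoding is not handled (without a Content-Length the body is read until the stream closes).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class RawResponse {
    private static final byte[] HEAD_END = {13, 10, 13, 10}; // \r\n\r\n
    private static final int MAX_HEAD = 10 * 1024;

    final String head; // status line and headers, including the terminating blank line
    final byte[] body; // raw body bytes, never decoded as text

    private RawResponse(String head, byte[] body) {
        this.head = head;
        this.body = body;
    }

    static RawResponse read(InputStream in) throws IOException {
        byte[] buffer = new byte[MAX_HEAD];
        int length = 0;
        int pos;
        // Keep reading until the blank line that ends the header shows up.
        while ((pos = indexOf(buffer, length, HEAD_END)) == -1) {
            if (length == buffer.length) throw new IOException("Response header too long");
            int n = in.read(buffer, length, buffer.length - length);
            if (n == -1) throw new IOException("Stream closed before end of header");
            length += n;
        }
        int bodyStart = pos + HEAD_END.length;
        String head = new String(buffer, 0, bodyStart, StandardCharsets.ISO_8859_1);

        ByteArrayOutputStream body = new ByteArrayOutputStream();
        body.write(buffer, bodyStart, length - bodyStart); // body bytes that arrived with the header
        int expected = contentLength(head); // -1 means: read until the stream closes
        byte[] chunk = new byte[8192];
        while (expected == -1 || body.size() < expected) {
            int want = expected == -1 ? chunk.length : Math.min(chunk.length, expected - body.size());
            int n = in.read(chunk, 0, want);
            if (n == -1) break;
            body.write(chunk, 0, n);
        }
        return new RawResponse(head, body.toByteArray());
    }

    /** Returns the Content-Length value, or -1 if the header is absent. */
    static int contentLength(String head) {
        for (String line : head.split("\r\n")) {
            if (line.toLowerCase(Locale.ROOT).startsWith("content-length:")) {
                return Integer.parseInt(line.substring("content-length:".length()).trim());
            }
        }
        return -1;
    }

    /** Index of needle within the first length bytes of haystack, or -1. */
    static int indexOf(byte[] haystack, int length, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello".getBytes(StandardCharsets.ISO_8859_1);
        RawResponse response = read(new ByteArrayInputStream(wire));
        System.out.print(response.head);
        System.out.println("body bytes: " + response.body.length);
    }
}
```

When the response is forwarded to the client, the body should be written to the socket's OutputStream directly. Wrapping it in an OutputStreamWriter, as ProxyServer does, would re-encode the bytes as text again.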
