How to use the HTTP/2 multiplexing feature when uploading


Problem Description

There should be a significant performance improvement from using the HTTP/2 multiplexing feature when uploading multiple files.

And Java has an HttpClient that natively supports the HTTP/2 protocol, so I tried to write the code below based on my own understanding.

This task does not seem to be as easy as I initially thought, or, on the other hand, it seems that I wasn't able to find a server able to use multiplexing for uploads (if one exists).

This is the code I wrote; does anyone have thoughts about it?

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Stream;
import static java.util.stream.Collectors.toList;

// One client, negotiating HTTP/2 so all requests can share a multiplexed connection.
HttpClient httpClient = HttpClient.newBuilder().version(HttpClient.Version.HTTP_2).build();
String url = "https://your-own-http2-server.com/incoming-files/%s";
Path basePath = Path.of("/path/to/directory/where/is/a/bunch/of/jpgs");

// Builds an asynchronous PUT request for a single file and returns its future response.
Function<Path, CompletableFuture<HttpResponse<String>>> handleFile = file -> {
    String currentUrl = String.format(url, file.getFileName().toString());
    try {
        HttpRequest request = HttpRequest.newBuilder()
                                         .uri(URI.create(currentUrl))
                                         .header("Content-Type", "image/jpeg")
                                         .PUT(HttpRequest.BodyPublishers.ofFile(file))
                                         .build();
        return httpClient.sendAsync(request, HttpResponse.BodyHandlers.ofString());
    } catch (IOException e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
};

// Files.list(...) opens a stream that should be closed and throws IOException,
// so call this from a method that declares or handles IOException.
List<Path> files;
try (Stream<Path> listing = Files.list(basePath)) {
    files = listing.collect(toList());
}

// Start all uploads, then wait for each response and print its status code.
files.parallelStream().map(handleFile).forEach(c -> {
    try {
        final HttpResponse<String> response = c.get();
        System.out.println(response.statusCode());
    } catch (Exception e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
});

Solution

"There should be a significant performance improvement from using the HTTP/2 multiplexing feature when uploading multiple files."

That is an assumption that is generally wrong.

Let's discard the case where you have multiple HTTP/1.1 connections so you can upload in parallel.

We then have one TCP connection, and we want to compare uploads over HTTP/1.1 and HTTP/2.

In HTTP/1.1, the requests will be serialized one after the other, so the end time of the multiple uploads depends on the bandwidth of the connection (ignoring TCP slow start).

In HTTP/2, the requests will be interleaved by multiplexing. However, the data that needs to be sent is the same, so again the end time of the multiple uploads depends on the bandwidth of the connection.

In HTTP/1.1 you will have upload1.start...upload1.end|upload2.start...upload2.end|upload3.start...upload3.end etc.

In HTTP/2 you will have upload1.start|upload2.start|upload3.start.....upload3.end..upload1.end..upload2.end

The end time would be the same.
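A quick back-of-the-envelope sketch of that point, using made-up numbers (three 2 MiB files over a 10 Mbit/s connection) rather than anything from the answer:

// Illustrative numbers only: three 2 MiB uploads over a 10 Mbit/s connection.
int fileCount = 3;
long fileBytes = 2L * 1024 * 1024;
double bandwidthBytesPerSec = 10_000_000 / 8.0;

// HTTP/1.1: the uploads run one after the other on the single connection.
double http11EndSeconds = fileCount * fileBytes / bandwidthBytesPerSec;

// HTTP/2: the same total bytes are interleaved on the same connection,
// so the last upload still finishes once all bytes have been sent.
double http2EndSeconds = fileCount * fileBytes / bandwidthBytesPerSec;

System.out.printf("HTTP/1.1: %.1f s, HTTP/2: %.1f s%n", http11EndSeconds, http2EndSeconds);
// Both print ~5.0 s: the bottleneck is the connection bandwidth, not the protocol.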

The problem with HTTP/2 is that you are typically not limited by the bandwidth of the connection, but by the HTTP/2 flow control window, which is typically much, much smaller.

The HTTP/2 specification defaults the HTTP/2 flow control window to 65535 bytes, which means that every 65535 bytes the client must stop sending data until the server acknowledges those bytes. This may take a roundtrip, so even if the roundtrip is small (e.g. 50 ms), for large file uploads you may be paying this roundtrip multiple times, adding seconds to your uploads (e.g. for a 6 MiB upload you may be paying this cost 100 times, which is 5 seconds).
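To see where that figure comes from, here is the arithmetic as a small snippet; the 6 MiB size and 50 ms roundtrip are the example values from the answer, and the worst case assumes one full stall per window:

// SETTINGS_INITIAL_WINDOW_SIZE default from the HTTP/2 specification.
int flowControlWindowBytes = 65_535;
long uploadBytes = 6L * 1024 * 1024;   // 6 MiB upload, as in the answer's example
double roundTripSeconds = 0.050;       // 50 ms roundtrip, as in the answer's example

// Worst case: the client fills the window and then stalls for one roundtrip
// until the server's acknowledgement (WINDOW_UPDATE) arrives.
long stalls = (long) Math.ceil((double) uploadBytes / flowControlWindowBytes); // ~96
double addedSeconds = stalls * roundTripSeconds;                               // ~4.8 s

System.out.printf("~%d stalls, ~%.1f s added to the upload%n", stalls, addedSeconds);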

It is then very important that you configure the server with a large HTTP/2 flow control window, especially if your server is used for file uploads. A large HTTP/2 flow control window on the server means that the server must be prepared to buffer a large number of bytes, which means that an HTTP/2 server that handles primarily file uploads will need more memory than an HTTP/1.1 server.
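How that configuration looks depends entirely on the server; purely as an illustration (the answer does not name one), here is a sketch with embedded Jetty, assuming its HTTP/2 connection factory and the initial stream/session receive-window setters available in recent versions - verify the exact API against your Jetty release:

import org.eclipse.jetty.http2.server.HTTP2CServerConnectionFactory;
import org.eclipse.jetty.server.HttpConfiguration;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.ServerConnector;

// Sketch only: enlarge the HTTP/2 flow control windows on an embedded Jetty server.
// Larger windows mean the server must be ready to buffer more bytes per upload.
Server server = new Server();
HTTP2CServerConnectionFactory h2 = new HTTP2CServerConnectionFactory(new HttpConfiguration());
h2.setInitialStreamRecvWindow(8 * 1024 * 1024);    // 8 MiB per stream instead of 64 KiB
h2.setInitialSessionRecvWindow(16 * 1024 * 1024);  // 16 MiB for the whole connection
server.addConnector(new ServerConnector(server, h2));
server.start();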

With larger HTTP/2 flow control windows, the server may be smart and send acknowledgements to the client while the client is still uploading.

When a client uploads, it reduces its "send" window. By receiving acknowledgements from the server, the client enlarges the "send" window.

A typical bad interaction, showing the client "send" window value starting at 1 MiB, would be:

[client send window]

1048576 
        client sends 262144 bytes
786432  
        client sends 262144 bytes
524288  
        client sends 262144 bytes
262144  
        client sends 262144 bytes
0       
        client cannot send
.
. (stalled)
.
        client receives acknowledgment from server (524288 bytes)
524288  
        client sends 262144 bytes
262144  
        client sends 262144 bytes
0       
        client cannot send
.
. (stalled)
.

A good interaction would be:

[client send window]

1048576 
        client sends 262144 bytes
786432  
        client sends 262144 bytes
524288  
        client sends 262144 bytes
262144  
        client receives acknowledgment from server (524288 bytes)
786432  
        client sends 262144 bytes
524288  
        client sends 262144 bytes
262144  
        client receives acknowledgment from server (524288 bytes)
786432  

As you can see in the good interaction, the server is acknowledging the client before the client exhausts the "send" window, so the client can keep sending at full speed.
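A toy model of that bookkeeping, using the same illustrative numbers as the traces above (1 MiB initial window, 256 KiB data frames, a 512 KiB acknowledgement); it is not a real HTTP/2 implementation, only the window accounting:

// Client "send" window accounting: sending DATA shrinks the window,
// a WINDOW_UPDATE from the server grows it again.
int sendWindow = 1_048_576;    // 1 MiB initial window
final int frameSize = 262_144; // the client sends 256 KiB at a time

while (sendWindow >= frameSize) {
    sendWindow -= frameSize;
    System.out.println("sent " + frameSize + " bytes, window now " + sendWindow);
}
// Bad interaction: the acknowledgement only arrives here, after the client stalled at 0.
// Good interaction: the server sends it earlier, so the loop above never reaches 0.
sendWindow += 524_288;         // WINDOW_UPDATE acknowledging 512 KiB
System.out.println("acknowledged, window now " + sendWindow);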

Multiplexing is really effective for many small requests, which is the browser use case: many small GET requests (with no request content) can be multiplexed in HTTP/2, arriving at the server well before the corresponding HTTP/1.1 requests, and as such they will be served earlier and arrive back at the browser earlier.
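For contrast, a minimal sketch of that browser-like pattern with the same JDK HttpClient used in the question: many small GETs issued with sendAsync() against a placeholder URL share one multiplexed HTTP/2 connection (assumes the imports from the question's snippet plus java.util.stream.IntStream):

HttpClient client = HttpClient.newBuilder().version(HttpClient.Version.HTTP_2).build();

// Fire many small GET requests; over HTTP/2 they are multiplexed on a single connection.
List<CompletableFuture<HttpResponse<String>>> responses = IntStream.range(0, 50)
    .mapToObj(i -> HttpRequest.newBuilder()
                              .uri(URI.create("https://your-own-http2-server.com/resource/" + i))
                              .GET()
                              .build())
    .map(request -> client.sendAsync(request, HttpResponse.BodyHandlers.ofString()))
    .collect(toList());

// Wait for all responses and print the status codes.
CompletableFuture.allOf(responses.toArray(new CompletableFuture[0])).join();
responses.forEach(r -> System.out.println(r.join().statusCode()));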

For large requests, as is the case for file uploads, HTTP/2 can be as efficient as HTTP/1.1, but I would not be surprised if the default configuration of the server makes it much less performant than HTTP/1.1 - HTTP/2 will require some tuning of the server configuration.

The HTTP/2 flow control window could get in the way also for downloads, so downloading large contents from a server over HTTP/2 may be really slow (for the same reasons explained above).

Browsers avoid this issue by telling the server to use a really large "send" window - Firefox 72 sets it at 12 MiB per connection - and they are very smart about acknowledging the server so that downloads do not stall.
