Unable to execute HTTP request: Timeout waiting for connection from pool in Flink

Problem description

I'm working on an app which uploads some files to an s3 bucket and at a later point, it reads files from s3 bucket and pushes it to my database.

I'm using Flink 1.4.2 and fs.s3a API for reading and write files from the s3 bucket.

Uploading files to the s3 bucket works fine without any issues, but when the second phase of my app starts, reading those uploaded files from s3, it throws the following error:

Caused by: java.io.InterruptedIOException: Reopen at position 0 on s3a://myfilepath/a/b/d/4: org.apache.flink.fs.s3hadoop.shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:125)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:155)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:281)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:364)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.flink.fs.s3hadoop.shaded.org.apache.flink.runtime.fs.hdfs.HadoopDataInputStream.read(HadoopDataInputStream.java:94)
at org.apache.flink.api.common.io.DelimitedInputFormat.fillBuffer(DelimitedInputFormat.java:702)
at org.apache.flink.api.common.io.DelimitedInputFormat.open(DelimitedInputFormat.java:490)
at org.apache.flink.api.common.io.GenericCsvInputFormat.open(GenericCsvInputFormat.java:301)
at org.apache.flink.api.java.io.CsvInputFormat.open(CsvInputFormat.java:53)
at org.apache.flink.api.java.io.PojoCsvInputFormat.open(PojoCsvInputFormat.java:160)
at org.apache.flink.api.java.io.PojoCsvInputFormat.open(PojoCsvInputFormat.java:37)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:145)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)

I'm able to control this error by increasing the maximum number of connections for the s3a API.
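The stack trace above goes through the Hadoop s3a client shaded into flink-s3-fs-hadoop, whose pool ceiling is a Hadoop configuration key. A sketch of what raising it could look like in core-site.xml; the value 4096 is an arbitrary generous choice, and how Hadoop configuration reaches your Flink deployment depends on your setup:

```xml
<!-- core-site.xml: raise the s3a HTTP connection pool ceiling.
     fs.s3a.connection.maximum is the standard Hadoop s3a key;
     4096 is an illustrative value, not a recommendation. -->
<property>
  <name>fs.s3a.connection.maximum</name>
  <value>4096</value>
</property>
```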

As of now, I have around 1000 files in the s3 bucket that my app pushes to and pulls from, and my max connection count is 3000. I'm using Flink's parallelism to upload/download these files from the s3 bucket. My task manager count is 14. The failure is intermittent; I also have successful runs under this scenario.
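The intermittent behavior described here is easy to reproduce in the abstract: if the number of simultaneously open S3 streams fluctuates from run to run, a pool that usually suffices will occasionally be exhausted. A toy simulation of that effect (all numbers and the demand model are made up for illustration; this is not Flink's actual scheduling model):

```python
import random

def runs_exceeding_pool(pool_size, task_managers, slots_per_tm, trials, seed=42):
    """Count simulated runs where concurrent connection demand exceeds the pool.

    Demand per run is modeled as a random number of open S3 streams per slot,
    summed over all task slots -- purely illustrative, not Flink internals.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        # each slot may hold several open streams at once (1..8 here, arbitrary)
        demand = sum(rng.randint(1, 8) for _ in range(task_managers * slots_per_tm))
        if demand > pool_size:
            failures += 1
    return failures
```

With a pool of 3000 and this toy demand model, no run ever fails; shrink the pool toward the typical demand and only *some* runs fail, which matches seeing both timeouts and successes for the same job.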

My questions are:

  1. Why am I failing only intermittently? If the max connection count I set were too low, my app should throw this error on every run.
  2. Is there any way to calculate the maximum number of connections I need for my app to work without running into the connection pool timeout error? Or is this error related to something else that I'm not aware of?

Thanks in advance.

Answer

Some comments, based on my experience with processing lots of files from S3 via Flink (batch) workflows:

  1. When you are reading the files, Flink will calculate "splits" based on the number of files, and each file's size. Each split is read separately, so the theoretical max # of simultaneous connections isn't based on the # of files, but a combination of files and file sizes.
  2. The connection pool used by the HTTP client releases connections after some amount of time, as being able to reuse an existing connection is a win (server/client handshake doesn't have to happen). So that introduces a degree of randomness into how many available connections are in the pool.
  3. The size of the connection pool doesn't impact memory much, so I typically set it pretty high (e.g. 4096 for a recent workflow).
  4. When using AWS connection code, the setting to bump is fs.s3.maxConnections, which isn't the same as a pure Hadoop configuration.
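Point 1 suggests sizing the pool from concurrent splits rather than from the file count, and point 2 suggests padding for lazily released connections. A rough back-of-envelope helper along those lines (the streams-per-slot and headroom constants are assumptions for illustration, not a Flink formula):

```python
def suggested_pool_size(total_parallelism, streams_per_slot=2, headroom=4):
    """Rough heuristic: peak demand ~ total_parallelism * streams_per_slot,
    padded by `headroom` because pooled connections are released lazily.
    All constants here are illustrative assumptions, not Flink internals."""
    return total_parallelism * streams_per_slot * headroom

# e.g. 14 task managers x 4 slots each (slot count assumed) = 56 parallel readers
print(suggested_pool_size(56))
```

The point is not the exact numbers but the shape: demand scales with total parallelism, not with the 1000 files, so a pool comfortably above parallelism times expected streams per slot leaves room for the pool's release-timing randomness.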
