Better approach to call an external API in Apache Beam
Question
I have two approaches to initialize the HttpClient in order to make an API call from a ParDo in Apache Beam.
Approach 1:
Initialise the HttpClient object in StartBundle and close it in FinishBundle. The code is as follows:
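import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

public class ProcessNewIncomingRequest extends DoFn<String, KV<String, String>> {
    // Fields rather than locals, so that processElement and finishBundle can see them.
    // HttpClient is not Serializable, hence transient.
    private transient HttpClient client;
    private transient HttpRequest request;

    @StartBundle
    public void startBundle() {
        client = HttpClient.newHttpClient();
        request = HttpRequest.newBuilder()
            .uri(URI.create(<Custom_URL>))
            .build();
    }

    @ProcessElement
    public void processElement() {
        // Use the client and do an external API call
    }

    @FinishBundle
    public void finishBundle() {
        client.close(); // HttpClient.close() needs Java 21 or later
    }
}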
Approach 2:
Have a separate class where all the connections are managed using a connection pool.
public class ExternalConnection {
    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create(<Custom_URL>))
        .build();

    public Response getResponse() {
        // use the client, send request and get response
    }
}

public class ProcessNewIncomingRequest extends DoFn<String, KV<String, String>> {
    @ProcessElement
    public void processElement() {
        Response response = new ExternalConnection().getResponse();
    }
}
Which of the above two approaches is better in terms of performance and coding design standards?
Recommended answer
Either approach would work fine; the StartBundle/FinishBundle one is more contained IMHO, but it has the disadvantage of not working well if your bundles are very small, because the client is then created and closed again for every few elements. An even better approach might be to use DoFn's Setup/Teardown, which can span an arbitrary number of bundles but is tied to the lifetime of the DoFn instance (leveraging the pooling of DoFn instances that the Beam SDKs already do).
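To make the trade-off concrete, here is a minimal plain-Java sketch (no Beam dependency) that counts how many clients get created under each lifetime. Everything in it is hypothetical and for illustration only: `FakeClient` stands in for a real `HttpClient`, the nested lists stand in for bundles, and the three methods mimic per-element creation, StartBundle/FinishBundle, and Setup/Teardown respectively.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ClientLifetimeSketch {

    /** Hypothetical stand-in for an expensive client; counts constructions. */
    static final class FakeClient {
        static final AtomicInteger CREATED = new AtomicInteger();

        FakeClient() {
            CREATED.incrementAndGet();
        }

        String call(String element) {
            return "response-for-" + element;
        }
    }

    /** New client for every element, as in Approach 2 of the question. */
    static int perElement(List<List<String>> bundles) {
        FakeClient.CREATED.set(0);
        for (List<String> bundle : bundles) {
            for (String element : bundle) {
                new FakeClient().call(element);
            }
        }
        return FakeClient.CREATED.get();
    }

    /** One client per bundle, as with StartBundle/FinishBundle. */
    static int perBundle(List<List<String>> bundles) {
        FakeClient.CREATED.set(0);
        for (List<String> bundle : bundles) {
            FakeClient client = new FakeClient();
            for (String element : bundle) {
                client.call(element);
            }
        }
        return FakeClient.CREATED.get();
    }

    /** One client for the instance's whole life, as with Setup/Teardown. */
    static int perInstance(List<List<String>> bundles) {
        FakeClient.CREATED.set(0);
        FakeClient client = new FakeClient();
        for (List<String> bundle : bundles) {
            for (String element : bundle) {
                client.call(element);
            }
        }
        return FakeClient.CREATED.get();
    }

    public static void main(String[] args) {
        // Five small bundles holding six elements in total.
        List<List<String>> bundles = List.of(
            List.of("a"), List.of("b", "c"), List.of("d"), List.of("e"), List.of("f"));
        System.out.println("per element:  " + perElement(bundles));  // 6
        System.out.println("per bundle:   " + perBundle(bundles));   // 5
        System.out.println("per instance: " + perInstance(bundles)); // 1
    }
}
```

With small bundles the per-bundle count is barely better than the per-element count, which is the weakness mentioned above. In an actual Beam Java DoFn, the per-instance variant would correspond to creating the client in a method annotated `@Setup` and releasing it in one annotated `@Teardown`; a runner may create several DoFn instances, so expect one client per instance rather than one per pipeline.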