Limiting the number of concurrent jobs on Azure Functions queue

Question

I have a Function app in Azure that is triggered when an item is put on a queue. It looks something like this (greatly simplified):

public static async Task Run(string myQueueItem, TraceWriter log)
{
    using (var client = new HttpClient())
    {
        client.BaseAddress = new Uri(Config.APIUri);
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));

        StringContent httpContent = new StringContent(myQueueItem, Encoding.UTF8, "application/json");
        HttpResponseMessage response = await client.PostAsync("/api/devices/data", httpContent);
        response.EnsureSuccessStatusCode();

        string json = await response.Content.ReadAsStringAsync();
        ApiResponse apiResponse = JsonConvert.DeserializeObject<ApiResponse>(json);

        log.Info($"Activity data successfully sent to platform in {apiResponse.elapsed}ms.  Tracking number: {apiResponse.tracking}");
    }
}

This all works great and runs pretty well. Every time an item is put on the queue, we send the data to some API on our side and log the response. Cool.

The problem happens when there's a big spike in "the thing that generates queue messages" and a lot of items are put on the queue at once. This tends to happen around 1,000 - 1,500 items in a minute. The error log will have something like this:

2017-02-14T01:45:31.692 mscorlib: Exception while executing function: Functions.SendToLimeade. f-SendToLimeade__-1078179529: An error occurred while sending the request. System: Unable to connect to the remote server. System: Only one usage of each socket address (protocol/network address/port) is normally permitted 123.123.123.123:443.

At first, I thought this was an issue with the Azure Function app running out of local sockets, as illustrated here. However, then I noticed the IP address. The IP address 123.123.123.123 (of course changed for this example) is our IP address, the one that the HttpClient is posting to. So, now I'm wondering if it is our servers running out of sockets to handle these requests.

Either way, we have a scaling issue going on here. I'm trying to figure out the best way to solve it.

Some thoughts:

  1. If it's a local socket limitation, the article above has an example of increasing the local port range using Req.ServicePoint.BindIPEndPointDelegate. This seems promising, but what do you do when you truly need to scale? I don't want this problem coming back in 2 years.
  2. If it's a remote limitation, it looks like I can control how many messages the Functions runtime will process at once. There's an interesting article here that says you can set serviceBus.maxConcurrentCalls to 1 and only a single message will be processed at once. Maybe I could set this to a relatively low number. Now, at some point our queue will be filling up faster than we can process them, but at that point the answer is adding more servers on our end.
  3. Multiple Azure Functions apps? What happens if I have more than one Azure Functions app and they all trigger on the same queue? Is Azure smart enough to divvy up the work among the Function apps and I could have an army of machines processing my queue, which could be scaled up or down as needed?
  4. I've also come across keep-alives. It seems to me if I could somehow keep my socket open as queue messages were flooding in, it could perhaps help greatly. Is this possible, and any tips on how I'd go about doing this?
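
For thought 2, one way to cap how many calls run at once inside a single function host is a semaphore around the outbound request. The following is only a minimal sketch: `ConcurrencyGate` and the limit passed to it are hypothetical names and values, not part of the Azure Functions SDK, and the cap applies per process, so it does not limit the total across scaled-out instances.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not an Azure Functions API): lets at most
// maxConcurrent operations run at the same time within one process.
public sealed class ConcurrencyGate
{
    private readonly SemaphoreSlim _gate;

    public ConcurrencyGate(int maxConcurrent)
    {
        // Initial count and maximum count are both the chosen limit.
        _gate = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    public async Task<T> RunAsync<T>(Func<Task<T>> work)
    {
        // Waits asynchronously until a slot is free.
        await _gate.WaitAsync();
        try
        {
            return await work();
        }
        finally
        {
            // Always hand the slot back, even if work() throws.
            _gate.Release();
        }
    }
}
```

A function could keep one static `ConcurrencyGate` and wrap its `PostAsync` call in `RunAsync`. For Storage queue triggers, the runtime's own per-instance knobs are `batchSize` and `newBatchThreshold` under `queues` in host.json, which may be the simpler route if the queue in question is a Storage queue.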

Any insight on a recommended (scalable!) design for this sort of system would be greatly appreciated!

Recommended answer

I think I've figured out a solution for this. I've been running these changes for the past 6 hours, and I've had zero socket errors. Before, I would get these errors in large batches every 30 minutes or so.

First, I added a new class to manage the HttpClient.

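using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Holds a single HttpClient that lives as long as the host process.
public static class Connection
{
    public static HttpClient Client { get; private set; }

    // Static constructor: runs once, the first time Connection is used.
    static Connection()
    {
        Client = new HttpClient();

        Client.BaseAddress = new Uri(Config.APIUri);
        // Ask for persistent connections so sockets are reused across requests.
        Client.DefaultRequestHeaders.Add("Connection", "Keep-Alive");
        Client.DefaultRequestHeaders.Add("Keep-Alive", "timeout=600");
        Client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
    }
}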

Now we have a static instance of HttpClient that we use for every call to the function. From my research, keeping HttpClient instances around for as long as possible is highly recommended: HttpClient is safe to share across threads, and it will queue up requests and optimize requests to the same host. Notice I also set the Keep-Alive headers (I think this is the default, but I figured I'd be explicit).

In my function, I just grab the static HttpClient instance like:

var client = Connection.Client;
StringContent httpContent = new StringContent(myQueueItem, Encoding.UTF8, "application/json");
HttpResponseMessage response = await client.PostAsync("/api/devices/data", httpContent);
response.EnsureSuccessStatusCode();

I haven't really done any in-depth analysis of what's happening at the socket level (I'll have to ask our IT guys if they're able to see this traffic on the load balancer), but I'm hoping it just keeps a single socket open to our server and makes a bunch of HTTP calls as the queue items are processed. Anyway, whatever it's doing seems to be working. Maybe someone has some thoughts on how to improve.
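
One possible refinement, offered as a sketch rather than something I have measured: cap how many connections the shared client may open to the API host, so that extra requests wait for a free connection instead of opening new sockets. `BoundedConnection` and the limit of 4 are hypothetical, and this assumes a runtime where `HttpClientHandler.MaxConnectionsPerServer` is available (.NET Core, or .NET Framework 4.7.1 and later).

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical variant of the Connection class above; the class name and
// the connection limit are assumptions chosen for illustration.
public static class BoundedConnection
{
    // Assumed cap on simultaneous connections to a single host.
    public const int MaxConnections = 4;

    // Declared before Client so it is initialized first.
    public static readonly HttpClientHandler Handler = new HttpClientHandler
    {
        MaxConnectionsPerServer = MaxConnections
    };

    public static readonly HttpClient Client = CreateClient();

    private static HttpClient CreateClient()
    {
        var client = new HttpClient(Handler);
        // BaseAddress would be set here as in Connection; omitted to keep the sketch self-contained.
        client.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        return client;
    }
}
```

The function body would then use `BoundedConnection.Client` exactly as it uses `Connection.Client`. The right limit depends on what the servers behind the load balancer can absorb, so it would need tuning against real traffic.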
