Dotnet Core Docker Container Leaks RAM on Linux and Causes OOM


Problem description

I am running Dotnet Core 2.2 in a Linux container in Docker.

I've tried many different configuration/environment options - but I keep coming back to the same problem of running out of memory ('docker events' reports an OOM).
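A minimal sketch of how the OOM and the memory growth could be observed from the host, assuming the Docker CLI is installed. `mem_snapshot` and `watch_oom` are hypothetical helper names, and the container name passed to them is a placeholder (`docker ps` shows the real one).

```shell
# Print one memory sample for the given container.
mem_snapshot() {
  docker stats --no-stream \
    --format '{{.Name}} {{.MemUsage}} {{.MemPerc}}' "$1"
}

# Stream only the OOM events of the given container.
# This blocks until interrupted with Ctrl+C.
watch_oom() {
  docker events --filter "container=$1" --filter 'event=oom'
}
```

Running `watch_oom <container>` in one terminal and calling `mem_snapshot <container>` periodically in another should show whether usage climbs steadily toward the configured limit before the kill.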

In production I'm hosting on Ubuntu. For development, I'm using a Linux container (MobyLinux) on Docker in Windows.

I've gone back to running the Web API template project, rather than my actual app. I am literally returning a string and doing nothing else. If I call it about 1,000 times from curl, the container will die. The garbage collector does not appear to be working at all.

I tried setting the following environment variables in docker-compose:

DOTNET_RUNNING_IN_CONTAINER=true
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
ASPNETCORE_preventHostingStartup=true

I also tried the following in docker-compose:

mem_reservation: 128m
mem_limit: 256m
memswap_limit: 256m

(these only make it die faster)

I tried setting the following to both true and false, with no difference:

ServerGarbageCollection

I have also tried running it as a Windows container instead. That doesn't OOM, but it does not seem to respect the memory limits either.

I have already ruled out HttpClient and EF Core, as I'm not even using them in my example. I have read a bit about listening on port 443 being a problem: I can leave the container running idle all day long, and when I check at the end of the day it has used up some more memory (not a massive amount, but it grows).

Example of what's in my API:

// GET api/values/5
[HttpGet("{id}")]
public ActionResult<string> Get(int id)
{
    return "You said: " + id;
}

Example call with curl:

curl -X GET "https://localhost:44329/api/values/7" -H  "accept: text/plain" --insecure

(repeated 1,000 or so times)
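The repetition could be scripted roughly as follows. This is only a sketch: `build_url` and `run_load` are hypothetical helper names, and the default base URL is taken from the curl call above.

```shell
# Base URL of the API; override via the environment if the port differs.
BASE_URL="${BASE_URL:-https://localhost:44329}"

# Build the request URL for a given id.
build_url() {
  printf '%s/api/values/%s\n' "$BASE_URL" "$1"
}

# Send N sequential GET requests (default 1000), discarding the bodies.
run_load() {
  count="${1:-1000}"
  i=1
  while [ "$i" -le "$count" ]; do
    curl --silent --insecure --output /dev/null \
      -H "accept: text/plain" "$(build_url "$i")" \
      || echo "request $i failed" >&2
    i=$((i + 1))
  done
}
```

Calling `run_load 1000` after sourcing the snippet reproduces the roughly 1,000 requests mentioned here; a failed request is reported on stderr, which makes the point at which the container dies easy to spot.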

Expected: RAM usage to remain low for a very primitive request

Actual: RAM usage continues to grow until failure

Full Dockerfile:

FROM microsoft/dotnet:2.2-aspnetcore-runtime AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443

FROM microsoft/dotnet:2.2-sdk AS build
WORKDIR /src
COPY ["WebApplication1/WebApplication1.csproj", "WebApplication1/"]
RUN dotnet restore "WebApplication1/WebApplication1.csproj"
COPY . .
WORKDIR "/src/WebApplication1"
RUN dotnet build "WebApplication1.csproj" -c Release -o /app

FROM build AS publish
RUN dotnet publish "WebApplication1.csproj" -c Release -o /app

FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "WebApplication1.dll"]

docker-compose.yml

version: '2.3'

services:
  webapplication1:
    image: ${DOCKER_REGISTRY-}webapplication1
    mem_reservation: 128m
    mem_limit: 256m
    memswap_limit: 256m
    cpu_percent: 25
    build:
      context: .
      dockerfile: WebApplication1/Dockerfile

docker-compose.override.yml

version: '2.3'

services:
  webapplication1:
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - ASPNETCORE_URLS=https://+:443;http://+:80
      - ASPNETCORE_HTTPS_PORT=44329
      - DOTNET_RUNNING_IN_CONTAINER=true
      - DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
      - ASPNETCORE_preventHostingStartup=true
    ports:
      - "50996:80"
      - "44329:443"
    volumes:
      - ${APPDATA}/ASP.NET/Https:/root/.aspnet/https:ro
      - ${APPDATA}/Microsoft/UserSecrets:/root/.microsoft/usersecrets:ro

I'm running Docker CE Engine 18.0.9.1 on Windows and 18.06.1 on Ubuntu. To confirm, I have also tried Dotnet Core 2.1.

I've also given it a try in IIS Express: the process gets to around 55MB, and that's while literally spamming it with multiple threads, etc.

When they're all done, it goes down to around 29-35MB.

Recommended answer

This could be because garbage collection (GC) is not executed.

This open issue looks very similar:

https://github.com/dotnet/runtime/issues/851

One solution that made it work on Ubuntu 18.04.4 on a virtualized machine was to use Workstation garbage collection (GC):

<PropertyGroup>
    <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>

https://github.com/dotnet/runtime/issues/851#issuecomment-644648315

https://github.com/dotnet/runtime/issues/851#issuecomment-438474207

https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/workstation-server-gc
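As a hedged alternative to rebuilding with the project property, the runtime should also accept the GC mode through the `COMPlus_gcServer` environment variable (`0` for Workstation, `1` for Server; .NET 5 and later also read `DOTNET_gcServer`). The sketch below assumes the Docker CLI; `run_with_workstation_gc` is a hypothetical helper, and whether a given runtime version honors the variable is worth verifying.

```shell
# Start a container with Workstation GC requested via the environment.
# All arguments are passed through to `docker run` (image name, ports, ...).
run_with_workstation_gc() {
  docker run --rm -e COMPlus_gcServer=0 "$@"
}
```

For example, `run_with_workstation_gc --memory 256m webapplication1` would start the image built above with the same 256 MB limit, without touching the csproj.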

Here is another finding:

After further investigation I noticed a big difference between my servers in the number of available logical CPUs (80 vs 16). After some googling I came across the topic dotnet/runtime#622, which led me to experiment with CPU/GC/thread settings.

I was using the --cpus constraint in the stack file; explicitly set System.GC.Concurrent=true, System.GC.HeapCount=8, System.GC.NoAffinitize=true, System.Threading.ThreadPool.MaxThreads=16 in the runtimeconfig.template.json file; updated the images to the 3.1.301-bionic SDK and the 3.1.5-bionic ASP.NET runtime. I tried all of these in various combinations, and none of it had any effect. The application just hangs until it gets OOMKilled.

The only thing that makes it work with Server GC is the --cpuset-cpus constraint. Of course, explicitly setting the available processors is not an option in docker swarm mode. But I was experimenting with the number of available CPUs to find any regularity, and here I found a few interesting facts.

Interestingly, I had previously migrated 3 other backend services to the new server cluster, and they all run well with the default settings. Their memory limit is set to 600 MB, but in fact they need about 400 MB to run. Things go wrong only with the memory-consuming applications (I have two of those); such an application requires 3 GB to build its in-memory structures and runs with a 6 GB constraint.

It keeps working with any number of available CPUs in the range [1, 35] and hangs when the CPU count is 36.
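The CPU-count experiment described above could be reproduced along these lines. This is a sketch under assumptions: `cpuset_range` and `run_pinned` are hypothetical helpers, the image name is a placeholder, and pinning always starts at core 0, which the quoted comment does not specify.

```shell
# Turn a CPU count N into a --cpuset-cpus value covering cores 0..N-1.
cpuset_range() {
  if [ "$1" -le 1 ]; then
    echo "0"
  else
    echo "0-$(($1 - 1))"
  fi
}

# Run an image pinned to the first N cores.
# $1 = CPU count, remaining arguments go to `docker run`.
run_pinned() {
  n="$1"
  shift
  docker run --rm --cpuset-cpus "$(cpuset_range "$n")" "$@"
}
```

With this, `run_pinned 35 --memory 6g my-image` and `run_pinned 36 --memory 6g my-image` would correspond to the last working and the first hanging configuration reported here. Unlike `--cpus`, which only limits CPU time, `--cpuset-cpus` restricts which cores the container may run on at all.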

https://github.com/dotnet/runtime/issues/851#issuecomment-645237830
