Dotnet Core Docker Container Leaks RAM on Linux and causes OOM
Question
I am running Dotnet Core 2.2 in a Linux container in Docker.
I've tried many different configuration and environment options, but I keep coming back to the same problem: the container runs out of memory ('docker events' reports an OOM).
In production I'm hosting on Ubuntu. For development, I'm using a Linux container (MobyLinux) on Docker in Windows.
I've gone back to running the Web API template project, rather than my actual app. I am literally returning a string and doing nothing else. If I call it about 1,000 times from curl, the container will die. The garbage collector does not appear to be working at all.
Tried setting the following environment variables in the docker-compose:
DOTNET_RUNNING_IN_CONTAINER=true
DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
ASPNETCORE_preventHostingStartup=true
Also tried the following in the docker-compose:
mem_reservation: 128m
mem_limit: 256m
memswap_limit: 256m
(these only make it die faster)
Tried setting the following to true or false, no difference:
ServerGarbageCollection
I have also tried running it as a Windows container instead. This doesn't OOM, but it does not seem to respect the memory limits either.
I have already ruled out HttpClient and EF Core, as I'm not even using them in my example. I have read a bit about listening on port 443 being a problem: I can leave the container running idle all day long, and when I check at the end of the day it has used up some more memory (not a massive amount, but it grows).
Example of what's in my API:
// GET api/values/5
[HttpGet("{id}")]
public ActionResult<string> Get(int id)
{
    return "You said: " + id;
}
Calling with Curl example:
curl -X GET "https://localhost:44329/api/values/7" -H "accept: text/plain" --insecure
(repeated 1,000 or so times)
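The repetition can be scripted rather than done by hand. The following is a minimal sketch, not the asker's actual script: `repeat_cmd` is a hypothetical helper name, and it simply runs whatever command it is given a fixed number of times and reports how many runs exited with an error.

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
repeat_cmd() {
  count=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$count" ]; do
    # Discard output; only the exit status matters here.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
}

# Example invocation (the same request as above, 1,000 times):
# repeat_cmd 1000 curl -s --insecure -H "accept: text/plain" \
#   "https://localhost:44329/api/values/7"
```

Watching 'docker stats' in a second terminal while the loop runs should show whether the container's memory usage keeps climbing.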
Expected: RAM usage to remain low for a very primitive request
Actual: RAM usage continues to grow until failure
Full Dockerfile:
FROM microsoft/dotnet:2.2-aspnetcore-runtime AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443
FROM microsoft/dotnet:2.2-sdk AS build
WORKDIR /src
COPY ["WebApplication1/WebApplication1.csproj", "WebApplication1/"]
RUN dotnet restore "WebApplication1/WebApplication1.csproj"
COPY . .
WORKDIR "/src/WebApplication1"
RUN dotnet build "WebApplication1.csproj" -c Release -o /app
FROM build AS publish
RUN dotnet publish "WebApplication1.csproj" -c Release -o /app
FROM base AS final
WORKDIR /app
COPY --from=publish /app .
ENTRYPOINT ["dotnet", "WebApplication1.dll"]
docker-compose.yml
version: '2.3'
services:
  webapplication1:
    image: ${DOCKER_REGISTRY-}webapplication1
    mem_reservation: 128m
    mem_limit: 256m
    memswap_limit: 256m
    cpu_percent: 25
    build:
      context: .
      dockerfile: WebApplication1/Dockerfile
docker-compose.override.yml
version: '2.3'
services:
  webapplication1:
    environment:
      - ASPNETCORE_ENVIRONMENT=Development
      - ASPNETCORE_URLS=https://+:443;http://+:80
      - ASPNETCORE_HTTPS_PORT=44329
      - DOTNET_RUNNING_IN_CONTAINER=true
      - DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true
      - ASPNETCORE_preventHostingStartup=true
    ports:
      - "50996:80"
      - "44329:443"
    volumes:
      - ${APPDATA}/ASP.NET/Https:/root/.aspnet/https:ro
      - ${APPDATA}/Microsoft/UserSecrets:/root/.microsoft/usersecrets:ro
I'm running Docker CE Engine 18.0.9.1 on Windows and 18.06.1 on Ubuntu. To confirm - I have also tried in Dotnet Core 2.1.
I've also tried it in IIS Express: the process gets to around 55MB, and that's while literally spamming it from multiple threads.
When the requests are all done, it goes down to around 29-35MB.
Recommended answer
This could be because garbage collection (GC) is not executed.
Looking at this open issue it looks very similar:
https://github.com/dotnet/runtime/issues/851
One solution that worked on Ubuntu 18.04.4 on a virtualized machine was to use Workstation garbage collection (GC):
<PropertyGroup>
  <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>
https://github.com/dotnet/runtime/issues/851#issuecomment-644648315
https://github.com/dotnet/runtime/issues/851#issuecomment-438474207
https://docs.microsoft.com/en-us/dotnet/standard/garbage-collection/workstation-server-gc
Here is another finding:
After further investigation I noticed that there is a big difference between my servers in the number of available logical CPUs (80 vs 16). After some googling I came across the topic dotnet/runtime#622, which led me to experiment with CPU/GC/thread settings.
I was using the --cpus constraint in the stack file; explicitly set System.GC.Concurrent=true, System.GC.HeapCount=8, System.GC.NoAffinitize=true, and System.Threading.ThreadPool.MaxThreads=16 in the runtimeconfig.template.json file; and updated the image to the 3.1.301-bionic SDK and the 3.1.5-bionic ASP.NET runtime. I tried all of these in various combinations and none of it had any effect. The application just hangs until it gets OOMKilled.
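For reference, the settings listed in that comment would look roughly like this in a runtimeconfig.template.json file. This is a sketch only: `write_runtimeconfig_template` is a hypothetical helper name, the file is assumed to sit next to the project's .csproj, and the commenter reports that these values did not fix the hang.

```shell
# Hypothetical helper: write the settings quoted above into
# runtimeconfig.template.json inside the given project directory.
write_runtimeconfig_template() {
  project_dir=$1
  cat > "$project_dir/runtimeconfig.template.json" <<'EOF'
{
  "configProperties": {
    "System.GC.Concurrent": true,
    "System.GC.HeapCount": 8,
    "System.GC.NoAffinitize": true,
    "System.Threading.ThreadPool.MaxThreads": 16
  }
}
EOF
}

# Example invocation, using the project directory from the question:
# write_runtimeconfig_template WebApplication1
```

The build merges these properties into the generated runtimeconfig.json, so the published output is the place to check that they were picked up.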
The only thing that makes it work with Server GC is the --cpuset-cpus constraint. Of course, explicitly setting the available processors is not an option in docker swarm mode. But I was experimenting with the available CPUs to find any regularity, and here I got a few interesting facts.
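An experiment of that kind, pinning the container to a given number of CPUs, might be scripted as follows. This is a sketch under assumptions: `cpuset_range` is a hypothetical helper, CPUs are assumed to be numbered contiguously from 0, and `my-service:latest` is a placeholder image name. Only the --cpuset-cpus and --memory flags of docker run are taken as given.

```shell
# Hypothetical helper: turn a CPU count into a --cpuset-cpus value,
# assuming CPUs are numbered contiguously starting at 0.
cpuset_range() {
  n=$1
  if [ "$n" -lt 1 ]; then
    echo "cpu count must be at least 1" >&2
    return 1
  fi
  if [ "$n" -eq 1 ]; then
    echo "0"
  else
    echo "0-$((n - 1))"
  fi
}

# Example invocation (placeholder image name, 16 CPUs, 6 GB limit):
# docker run --cpuset-cpus="$(cpuset_range 16)" --memory=6g my-service:latest
```

Re-running the container with different counts is one way to look for the threshold at which Server GC stops working.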
Interestingly, I had previously migrated 3 other backend services to the new server cluster and they all run well with the default settings. Their memory limit is set to 600 MB, but in fact they need about 400 MB to run. Things go wrong only with memory-consuming applications (I have two of those), which require 3 GB to build in-memory structures and run with a 6 GB constraint.
It keeps working with any number of available CPUs in the range [1, 35] and hangs when the CPU count is 36.
https://github.com/dotnet/runtime/issues/851#issuecomment-645237830