Scheduling and scaling pods in Kubernetes


Question

I am running a k8s cluster on GKE.

It has 4 node pools with different configurations:

  • Node pool 1 (single node, cordoned): runs Redis & RabbitMQ
  • Node pool 2 (single node, cordoned): runs monitoring & Prometheus
  • Node pool 3 (single large node): application pods
  • Node pool 4 (single node, autoscaling enabled): application pods

Currently I am running a single replica of each service on GKE, except for the main service, which manages mostly everything and runs with 3 replicas.

When scaling this main service with the HPA, I have sometimes seen the node crash, or the kubelet restart frequently and pods go into an Unknown state.
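
For context, the HPA in question looks roughly like this (the deployment name and thresholds below are simplified placeholders, not the exact config):

```yaml
# Minimal sketch of the HPA on the main service
# (deployment name and thresholds are placeholders).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: main-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: main-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```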

How do I handle this scenario? If the node crashes, GKE takes time to auto-repair it, which causes service downtime.

Question 2:

Node pool 3 runs 3-4 application pods. Among them are 3-4 memory-intensive microservices, and I am thinking of likewise using a node selector to pin them to one node.

Meanwhile, only a small node pool would run the main service, with the HPA and node autoscaling working automatically for that node pool.

However, I feel a node selector is not the best way to do this.
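
For reference, the node-selector approach I am considering would look something like this (the pool name and image are placeholders; GKE exposes each node pool through the cloud.google.com/gke-nodepool label):

```yaml
# Sketch of pinning a memory-intensive pod to one node pool
# via a node label (pool name and image are placeholders).
apiVersion: v1
kind: Pod
metadata:
  name: memory-intensive-service
spec:
  nodeSelector:
    cloud.google.com/gke-nodepool: pool-3   # GKE's built-in node pool label
  containers:
  - name: app
    image: gcr.io/my-project/memory-intensive-service:latest  # placeholder image
```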

It's always best to run more than one replica of each service, but currently we are running only a single replica of each, so please suggest with that constraint in mind.

Answer

As Patrick W rightly suggested in his comment:

if you have a single node, you leave yourself with a single point of failure. Also keep in mind that autoscaling takes time to kick in and is based on resource requests. If your node suffers OOM because of memory intensive workloads, you need to readjust your memory requests and limits. – Patrick W, Oct 10

You may need to redesign your infrastructure a bit so that you have more than a single node in every node pool, and readjust your memory requests and limits.
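
As a minimal sketch of what readjusting requests and limits means in practice (the names and values below are illustrative and should be sized from observed usage, not copied):

```yaml
# Illustrative per-container memory/CPU requests and limits;
# size these from your actual usage, not from these numbers.
apiVersion: v1
kind: Pod
metadata:
  name: main-service
spec:
  containers:
  - name: main
    image: gcr.io/my-project/main-service:latest  # placeholder image
    resources:
      requests:
        memory: "512Mi"   # what the scheduler reserves on a node
        cpu: "250m"
      limits:
        memory: "1Gi"     # the container is OOM-killed above this
        cpu: "500m"
```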

You may want to take a look at the following sections in the official Kubernetes docs and the Google Cloud blog:

  • Managing Resources for Containers
  • Assign CPU Resources to Containers and Pods
  • Configure Default Memory Requests and Limits for a Namespace
  • Resource Quotas
  • Kubernetes best practices: Resource requests and limits

How do I handle this scenario? If the node crashes, GKE takes time to auto-repair it, which causes service downtime.

That's why having more than just one node in a single node pool can be a much better option: it greatly reduces the likelihood that you'll end up in the situation described above. The GKE auto-repair feature needs time to kick in (usually a few minutes), and if this is your only node, there is not much you can do about it other than accept the possible downtime.
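
As an illustration of the multi-node, multi-replica setup (all names below are placeholders, and this assumes the node pool actually has more than one node), a Deployment can spread its replicas across distinct nodes so that losing one node does not take the whole service down:

```yaml
# Sketch: 3 replicas spread across distinct nodes
# (deployment/app names are placeholders).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: main-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: main-service
  template:
    metadata:
      labels:
        app: main-service
    spec:
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname   # spread over individual nodes
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app: main-service
      containers:
      - name: main
        image: gcr.io/my-project/main-service:latest  # placeholder image
```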

Node pool 3 runs 3-4 application pods. Among them are 3-4 memory-intensive microservices, and I am thinking of using a node selector to pin them to one node, while only a small node pool runs the main service with the HPA and node autoscaling for that pool.

However, I feel a node selector is not the best way to do this.

You may also want to take a look at Taints and Tolerations, which, unlike a plain node selector, also keep other workloads off the dedicated node.
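
A rough sketch of that approach (the taint key and value are hypothetical): taint the dedicated node, then give only the memory-intensive pods a matching toleration, so nothing else schedules there. Note that a toleration alone does not pin the pod to that node; it is usually combined with a node selector or node affinity.

```yaml
# First taint the node, e.g.:
#   kubectl taint nodes <node-name> workload=memory-intensive:NoSchedule
# Then only pods carrying a matching toleration can land on it
# (key/value are hypothetical).
apiVersion: v1
kind: Pod
metadata:
  name: memory-intensive-service
spec:
  tolerations:
  - key: "workload"
    operator: "Equal"
    value: "memory-intensive"
    effect: "NoSchedule"
  containers:
  - name: app
    image: gcr.io/my-project/memory-intensive-service:latest  # placeholder image
```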
