Speed up highly available deployments on Kubernetes
2022-10-23, 18:20–18:25, Room 1

In this talk we will show you how we speed up Cortex deployments at scale, using zone-aware Kubernetes controllers.

Kubernetes allows pods to be spread across different zones through topology spread constraints, but these are not taken into consideration during rolling updates or by pod disruption budgets. For instance, it's recommended to replicate Cortex's ingesters across different zones for high availability, so that the system continues to work in the event of a zone outage. However, the lack of support for zone-aware deployments forces Cortex operators to allow just a single container to be updated at a time, which makes deployments slow and limits the speed at which nodes can be upgraded.

To work around these limitations, the Amazon Managed Service for Prometheus team released a couple of k8s controllers for zone-aware rollouts and disruptions. They can be used by any highly available, quorum-based distributed application, such as Cortex, to speed up deployments safely.
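
To make the idea concrete, here is a minimal sketch of the kind of decision a zone-aware controller has to make. It is not the code of the released controllers; the names `Pod`, `can_disrupt`, and `next_batch` are hypothetical, and it assumes data is replicated across zones so that losing every pod in one zone still leaves a quorum in the others.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pod:
    """Hypothetical, simplified view of a pod: name, zone, and readiness."""
    name: str
    zone: str
    ready: bool


def can_disrupt(pods, candidate):
    """Allow disrupting `candidate` only if every not-ready pod is in its zone.

    Under the assumption above, unavailability confined to one zone
    cannot break quorum, so any number of pods in that zone may be down.
    """
    return all(p.ready or p.zone == candidate.zone for p in pods)


def next_batch(pods, updated):
    """Return the names of the pods to update next.

    All outdated pods of a single zone are updated together; a new batch
    is started only when every pod is ready again.
    """
    if any(not p.ready for p in pods):
        return []
    for zone in sorted({p.zone for p in pods}):
        batch = [p.name for p in pods if p.zone == zone and p.name not in updated]
        if batch:
            return batch
    return []


# Illustrative cluster: six ingesters spread over three zones.
pods = [
    Pod("ingester-a-0", "zone-a", True),
    Pod("ingester-a-1", "zone-a", True),
    Pod("ingester-b-0", "zone-b", True),
    Pod("ingester-b-1", "zone-b", True),
    Pod("ingester-c-0", "zone-c", True),
    Pod("ingester-c-1", "zone-c", True),
]
first = next_batch(pods, updated=set())
```

In this sketch the rollout needs one step per zone instead of one step per pod, which is where the speed-up would come from.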