Zero-downtime upgrades of Kubernetes
05-19, 17:40–17:45 (UTC), Main Hall

The Kubernetes project releases a new version every three months, as well as several bug-fix releases in between. You need, and want, to upgrade your clusters. How do you do that with zero downtime and no impact on your production workloads? In this lightning talk I will show the procedure my team has developed to upgrade a cluster and to monitor the upgrade itself, in particular to avoid impact from nodes becoming "Not Ready".


The team at Meltwater develops and operates Kubernetes as a Service internally, in a multi-tenant setup serving 40+ development teams. The base cluster is deployed with kops and then enhanced with add-ons by the team. During our first "kops rolling-update" runs we experienced nodes going "Not Ready", mainly due to bugs #48638 and #41916 and the way nodes reach the masters through DNS round-robin. Though our upgrade process was "triggered" by kops, we think its steps, in particular the way we monitor the upgrade itself and the API's functionality during the upgrade, could be of interest and applied to any Kubernetes cluster.
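As a rough illustration of the kind of monitoring the talk describes (a minimal sketch, not the team's actual tooling; every name and interval here is an assumption), a small loop built on client-go can watch both signals at once: a failed List call indicates the API server is unreachable, and a node whose Ready condition is not True is exactly the "Not Ready" case above.

    package main

    import (
        "context"
        "fmt"
        "time"

        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/tools/clientcmd"
    )

    // Hypothetical monitor to run alongside "kops rolling-update":
    // polls the API server and reports degraded API access and
    // NotReady nodes while the upgrade is in flight.
    func main() {
        cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
        if err != nil {
            panic(err)
        }
        client, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            panic(err)
        }
        for {
            nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
            if err != nil {
                // The List itself doubles as an API health check.
                fmt.Println("API check failed:", err)
            } else {
                for _, n := range nodes.Items {
                    for _, c := range n.Status.Conditions {
                        if c.Type == corev1.NodeReady && c.Status != corev1.ConditionTrue {
                            fmt.Printf("node %s is NotReady: %s\n", n.Name, c.Reason)
                        }
                    }
                }
            }
            time.Sleep(10 * time.Second) // polling interval is arbitrary
        }
    }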

I have worked with large-scale distributed systems for the last 10+ years, from online gaming to data-intensive applications. For the last couple of years I have been focusing on building a Kubernetes platform to accelerate the development teams at Meltwater. In my spare time, when not riding my Ducati on a race track, I practice the fine art of tsundoku.