To achieve zero-downtime migrations, we modified etcd snapshots to artificially inflate the revision number. Discover the surprising challenges of operating etcd at massive scale.
#1 about 3 minutes
The journey to managed Kubernetes at IONOS
From its first release in 2019 to managing over 20,000 clusters, IONOS scaled its Kubernetes service by building on a massive etcd foundation.
#2 about 4 minutes
Evolving etcd deployment strategies over time
The team progressed from the CoreOS operator and Bitnami Helm charts to a simplified custom Helm chart for better control and stability.
#3 about 3 minutes
Understanding multi-tenancy and its performance impact
Using a shared etcd with client-side prefixes reduces cost but creates noisy neighbor problems, requiring careful tuning like compaction and defragmentation.
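To make the client-side prefix idea concrete, here is a minimal sketch of how a per-tenant prefix maps to a key range in a shared keyspace. The tenant prefixes (`/tenant-a/`, `/tenant-ab/`) and the helper names are hypothetical and not taken from the talk. The range-end rule (increment the last byte that is not `0xff`) follows the usual etcd prefix-query convention.

```python
def prefix_range_end(prefix: bytes) -> bytes:
    """Return the exclusive end key of the range covering all keys under prefix."""
    end = bytearray(prefix)
    # Trailing 0xff bytes cannot be incremented, so drop them first.
    while end and end[-1] == 0xFF:
        end.pop()
    if not end:
        # Empty or all-0xff prefix: b"\x00" is used as the
        # "until the end of the keyspace" sentinel.
        return b"\x00"
    end[-1] += 1
    return bytes(end)


def in_tenant(tenant_prefix: bytes, full_key: bytes) -> bool:
    """Check whether full_key falls inside the tenant's prefix range."""
    end = prefix_range_end(tenant_prefix)
    if end == b"\x00":
        return full_key >= tenant_prefix
    return tenant_prefix <= full_key < end


# Hypothetical tenants sharing one etcd cluster.
TENANT_A = b"/tenant-a/"
key_a = TENANT_A + b"registry/pods/default/web-0"
key_ab = b"/tenant-ab/registry/pods/default/web-0"
```

The trailing slash in the prefix matters in this sketch: without it, `/tenant-a` would also match keys that belong to `/tenant-ab`.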
#4 about 3 minutes
Iterating on etcd cluster layouts for reliability
Initial cross-location clusters suffered from latency and revision drift, leading to a more stable single data center layout using availability zones.
#5 about 3 minutes
A zero-downtime control plane migration strategy
A live migration process built on `etcdctl make-mirror` moves a Kubernetes control plane to a new etcd cluster without global downtime or data loss.
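As an illustration of the mirroring step, the sketch below only assembles an `etcdctl make-mirror` command line (the etcdctl subcommand for mirroring) and does not run it. The endpoint URLs and the tenant prefix are hypothetical placeholders. TLS and authentication options are left out, and the team's actual invocation is not described in the summary.

```python
from typing import List, Optional


def make_mirror_command(
    source_endpoints: List[str],
    dest_endpoint: str,
    prefix: Optional[str] = None,
    dest_prefix: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `etcdctl make-mirror` (does not execute it)."""
    # --endpoints is a global etcdctl flag and selects the source cluster.
    cmd = ["etcdctl", "--endpoints=" + ",".join(source_endpoints), "make-mirror"]
    if prefix:
        # Mirror only the keys under this prefix (for example one tenant).
        cmd.append("--prefix=" + prefix)
    if dest_prefix:
        # Optionally rewrite the prefix on the destination cluster.
        cmd.append("--dest-prefix=" + dest_prefix)
    # The destination endpoint is the positional argument.
    cmd.append(dest_endpoint)
    return cmd


# Hypothetical endpoints and prefix, for illustration only.
example_cmd = make_mirror_command(
    ["https://old-etcd:2379"],
    "https://new-etcd:2379",
    prefix="/tenant-a/",
)
```

Building the argument list separately from executing it keeps the command easy to review before anything touches a production cluster.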
#6 about 3 minutes
Manipulating etcd revisions for seamless migration
By modifying an etcd snapshot to insert a high revision number, clients like kubelet continue watching for changes without needing a restart after migration.
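The toy model below sketches why the starting revision of the new cluster matters. It is not etcd code: `ToyStore`, `restore`, and the revision numbers are invented for illustration, and real etcd and Kubernetes watch behavior (compaction, the API server watch cache) is more involved. The only assumption is that a client resumes from the last revision it saw and receives only events with a higher revision.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyStore:
    """Toy stand-in for an etcd cluster with one global, increasing revision."""
    revision: int = 0
    log: List[Tuple[int, str]] = field(default_factory=list)

    def put(self, key: str) -> int:
        # Every write bumps the global revision by one.
        self.revision += 1
        self.log.append((self.revision, key))
        return self.revision

    def events_after(self, last_seen: int) -> List[Tuple[int, str]]:
        # A resuming watcher only gets events newer than what it already saw.
        return [(rev, key) for rev, key in self.log if rev > last_seen]


def restore(start_revision: int) -> ToyStore:
    """Model a new cluster whose revision counter starts at start_revision."""
    return ToyStore(revision=start_revision)


# A client watched the old cluster and last saw revision 1000.
client_last_seen = 1000

# Case 1: the new cluster starts at a low revision, so new writes
# stay below what the client is waiting for.
plain = restore(10)
plain.put("/registry/pods/a")
missed = plain.events_after(client_last_seen)

# Case 2: the snapshot was modified to start well above the old revision,
# so the first new write is already visible to the resuming client.
inflated = restore(2000)
inflated.put("/registry/pods/a")
seen = inflated.events_after(client_last_seen)
```

In this model the client in case 1 receives nothing until the new cluster has caught up with the old revision, while in case 2 it sees the first write immediately.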
#7 about 2 minutes
Future plans for etcd management and automation
The team is working on automating the migration process, offering dedicated etcd clusters, and contributing their migration learnings to the Kaji project.