I am Navin Jammula, a Staff SWE at Intuit, where I have worked for 15 years. I have 20 years of overall experience in the software industry.
Application pod autoscaling is a very challenging problem. The solutions currently available work well individually but not as a whole, and may not cover all use cases (e.g. peak-hour traffic, weekday/weekend patterns, month end, year end). They help with real-time scaling based on short-term analysis, but not with scaling based on both short-term and long-term analysis.
We built a comprehensive solution, Global Updater, made up of multiple components and designed to cover most of these use cases. It addresses both real-time and long-term vertical sizing (pod size) and horizontal sizing (number of pods).
Global Updater analyzes vertical size recommendations from VPA (open source) and pod-size recommender (internal), along with horizontal recommendations from HPA (open source) and replica-recommender (internal). It then generates a recommendation finetune file that contains both the vertical and horizontal sizes, after multiple checks and safeguards that ensure the reliability of the service is not impacted.
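To make the idea of combining several recommendations behind safeguards more concrete, here is a minimal sketch in Python. It is not the Global Updater implementation: the names `VerticalRec`, `Finetune`, `limit_step`, and `merge`, the "take the largest recommendation" policy, and the bounds `min_replicas`, `max_replicas`, and `max_change_ratio` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class VerticalRec:
    """One vertical (pod size) recommendation. Hypothetical shape."""
    cpu_millicores: int
    memory_mib: int


@dataclass(frozen=True)
class Finetune:
    """Combined vertical + horizontal sizing. Hypothetical shape."""
    cpu_millicores: int
    memory_mib: int
    replicas: int


def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def limit_step(current: int, target: int, max_change_ratio: float) -> int:
    """Safeguard (assumed): never move more than max_change_ratio away from current."""
    low = int(current * (1 - max_change_ratio))
    high = int(current * (1 + max_change_ratio))
    return clamp(target, low, high)


def merge(
    current: Finetune,
    vertical: Sequence[VerticalRec],
    horizontal: Sequence[int],
    min_replicas: int = 2,
    max_replicas: int = 50,
    max_change_ratio: float = 0.5,
) -> Finetune:
    """Combine recommendations conservatively, then apply the safeguards."""
    # Assumed policy: take the largest request from any recommender.
    cpu = max(v.cpu_millicores for v in vertical)
    mem = max(v.memory_mib for v in vertical)
    replicas = max(horizontal)

    # Bound how far a single update may move from the current sizing.
    cpu = limit_step(current.cpu_millicores, cpu, max_change_ratio)
    mem = limit_step(current.memory_mib, mem, max_change_ratio)
    replicas = limit_step(current.replicas, replicas, max_change_ratio)

    # Keep the replica count inside hard limits.
    replicas = clamp(replicas, min_replicas, max_replicas)
    return Finetune(cpu, mem, replicas)


current = Finetune(cpu_millicores=1000, memory_mib=2048, replicas=4)
vertical = [VerticalRec(1200, 2000), VerticalRec(2500, 3000)]  # e.g. two vertical sources
horizontal = [5, 10]                                           # e.g. two horizontal sources
result = merge(current, vertical, horizontal)
print(result)
```

With these sample inputs the CPU and replica targets are capped by the 50% step limit, so the result is `Finetune(cpu_millicores=1500, memory_mib=3000, replicas=6)`. A real system would likely weigh the sources differently and apply more checks than this sketch does.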
Global Updater runs remotely and is designed to work for multiple services across clusters, with different environments per service. It generates recommendations for a service by considering metrics from multiple environments (pre-prod and prod).
This solution has helped us scale both vertically and horizontally without human intervention, saving time and cost while protecting the reliability of the application.
Benefits to the ecosystem:
We are planning to open source the solutions we built, as they will benefit not just us but also the community and other companies with similar problems. We will keep enhancing the product and rolling out the changes. This talk will help the audience understand the innovative approaches we have taken to solve this complex problem.