Shubham Srivastava

Leading Developer Relations at Zenduty - an advanced incident management and response orchestration platform.
Take pride in making mistakes, learning from them and advocating for best practices for orgs setting up their DevOps, SRE and Production Engineering teams.

A zealous and eternally curious professional, fascinated by stories from DevOps, Incident Management and Product Design. An orator, gamer, writer, and hopeful comedian trying his very best to do something worth remembering every day.

How Thanos Almost Snapped $100,000 from our Infra Budget
Deepak Kumar, Shubham Srivastava, Vishwa Krishnakumar, Ankur Rawal

In a galaxy not so far away, where data is as vast as the cosmos, our team was troubled by observability data chaos.
In search of clarity, we sought salvation in Thanos and Fluentbit – fabled titans against our metric storage and logging issues.
Thanos gave us a highly available Prometheus setup with virtually unlimited historical data storage. Prometheus ascended to new heights, scaling horizontally without a hitch, while Thanos Compactor's downsampling promised faster queries over older data.
Fluentbit made collecting, filtering, and outputting logs across multiple sources and destinations effortless.

But little did we know that even the most powerful tools, when not wielded correctly, could be double-edged Infinity Stones.

Join us for a thrilling tale of blunders as we recount our missteps in configuring these tools, the easily missed caveats in data downsampling and log storage, and how the pursuit of seamless data handling almost cost us over $100,000.