Prerit Munjal
Prerit works as a Software Architect, applying his expertise in Cloud Native technologies to design resilient architectures that scale seamlessly as demand grows, while prioritizing technical cost, security, availability, and end-user experience.
CTO at InfraOne
Session
Ever watched your cloud bill grow faster than GPT's parameter count? That was us - burning through $50K on GPU instances while our LLM inference pipeline played hide and seek with production issues.
Our breaking point? Silent failures in production that were harder to catch than Jerry is for Tom.
Join this session as we explore how we built ML telemetry that doesn't need its own data center. By combining Pyroscope's lightweight profiling with OpenTelemetry's distributed tracing (because two tools are better than none when you're hunting GPU ghosts), we built a profiling pipeline that finally gave us clarity without burning cash.
The result? We cut our GPU costs by 40% (enough to make our CFO smile), slashed p99 latency by 65% (making our users actually believe in AI), and found memory leaks that were better hidden than my secret candy stash.