Mayuresh Krishna
Mayuresh Krishna is the CTO and Co-Founder of initializ.ai, where he leads product engineering, building AI models and private AI services. He previously led solution engineering at VMware Tanzu and served as a Senior Platform Architect at Pivotal Software.
Sessions
Deploying large language models (LLMs) is inherently complex, challenging, and expensive. This case study demonstrates how Kubernetes, and specifically KServe with the Modelcar OCI storage backend, simplifies the deployment and management of private LLM services.
First, we explore how KServe enables efficient and scalable model serving within a Kubernetes environment, allowing seamless integration and optimized GPU utilization. Second, we examine how Modelcar OCI artifacts streamline artifact delivery beyond container images, reducing duplicate storage usage, increasing download speeds, and minimizing governance overhead.
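As a sketch of the pattern the session covers, a KServe InferenceService can pull model weights directly from an OCI registry by pointing its storageUri at an oci:// reference. The registry path, service name, and serving runtime below are illustrative placeholders, not details from the case study:

```yaml
# Illustrative only: names, registry path, and runtime are placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: private-llm              # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface        # assumes the Hugging Face serving runtime
      # Model weights packaged as an OCI artifact ("Modelcar") are pulled
      # from the registry rather than from object storage.
      storageUri: oci://registry.example.com/models/private-llm:v1
      resources:
        limits:
          nvidia.com/gpu: "1"    # one GPU per replica
```

Modelcar support typically has to be enabled in KServe's inferenceservice-config ConfigMap (enableModelcar: true); the model image layers are then cached by the node's container runtime, which is where the duplicate-storage and download-speed benefits come from.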
The session will cover implementation details, benefits, best practices, and lessons learned.
You will walk away knowing how to leverage Kubernetes, KServe, and OCI artifacts to advance your MLOps journey, achieving significant efficiency gains and overcoming common challenges in deploying and scaling private LLM services.