Saiyam Pathak

Saiyam is Head of DevRel at Loft Labs and the founder of Kubesimplify, which focuses on simplifying cloud-native and Kubernetes technologies. Previously at Civo, Walmart Labs, Oracle, and HP, Saiyam has worked on many facets of Kubernetes, including machine learning platforms, scaling, multi-cloud, and managed Kubernetes services. He has implemented Kubernetes solutions in various organizations. When not coding, Saiyam contributes to the community by writing blogs and organizing local meetups for Kubernetes and CNCF. He is a Kubestronaut and CNCF TAG Operational Resilience co-chair, runs a YouTube channel, and can be reached on Twitter at @saiyampathak.


Session

11-08
16:00
30min
Beyond the Default Scheduler: Navigating GPU Multitenancy in the AI Era
Shivay Lamba, Hrittik Roy, Saiyam Pathak

GPU multitenancy in Kubernetes faces significant security challenges when deploying AI workloads on shared infrastructure. Time slicing enables GPU sharing but lacks hardware isolation, risking exposure of sensitive data. NVIDIA Multi-Instance GPU (MIG) provides true hardware isolation with dedicated compute cores, memory slices, and L2 cache partitions, ensuring consistent performance and strict QoS guarantees.

Since the default Kubernetes scheduler cannot partition GPU resources among workloads the way it can CPUs, advanced schedulers (KAI, Volcano, and Kueue) can take over scheduling for your workloads. They improve GPU sharing through hierarchical queues for secure multi-tenant environments. This talk demonstrates how combining isolation in multi-tenant setups with intelligent scheduling results in optimal utilization, fair resource distribution, and robust security boundaries, guiding the transition from default to GPU-aware scheduling solutions for scalable AI infrastructure.

Crystal Dining Room