Saiyam Pathak Cloud Native Rejekts NA (Atlanta) 2025

Saiyam Pathak

Saiyam is Head of DevRel at Loft Labs and the founder of Kubesimplify, which focuses on simplifying cloud-native and Kubernetes technologies. Previously at Civo, Walmart Labs, Oracle, and HP, Saiyam has worked on many facets of Kubernetes, including machine learning platforms, scaling, multi-cloud, and managed Kubernetes services. He has implemented Kubernetes solutions in various organizations. When not coding, Saiyam contributes to the community by writing blogs and organizing local meetups for Kubernetes and CNCF. He is a Kubestronaut and CNCF TAG Operational Resilience co-chair, runs a YouTube channel, and can be reached on Twitter at @saiyampathak.

Session

11-08

16:00

30min

Beyond the Default Scheduler: Navigating GPU Multitenancy in the AI Era

Shivay Lamba, Hrittik Roy, Saiyam Pathak

GPU multitenancy in Kubernetes faces significant security challenges when deploying AI workloads on shared infrastructure. Time slicing enables GPU sharing but lacks hardware isolation, risking exposure of sensitive data. NVIDIA Multi-Instance GPU (MIG) provides true hardware isolation with dedicated compute cores, memory slices, and L2 cache partitions, ensuring consistent performance and strict QoS guarantees.

Since the default Kubernetes scheduler cannot partition GPU resources the way it does CPUs, advanced schedulers (KAI, Volcano, and Kueue) can serve as the scheduler for your workloads. They improve GPU sharing through hierarchical queues for secure multi-tenant environments. This talk demonstrates how combining isolation in multi-tenant setups with intelligent scheduling results in optimal utilization, fair resource distribution, and robust security boundaries, guiding the transition from default to GPU-aware scheduling solutions for scalable AI infrastructure.
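
To make the idea of hierarchical queues concrete, here is a minimal Python sketch of how a GPU quota might be split across a queue tree in proportion to per-queue weights. It is an illustration only, not the algorithm used by KAI, Volcano, or Kueue; the `Queue` class, the `fair_share` function, the queue names, and the weights are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """A node in a hypothetical queue hierarchy (not a real scheduler API)."""
    name: str
    weight: int = 1
    children: list["Queue"] = field(default_factory=list)


def fair_share(queue, gpus, out=None):
    """Split `gpus` among leaf queues, proportionally to weights at each level."""
    if out is None:
        out = {}
    if not queue.children:
        # Leaf queue: it receives whatever share reached this point.
        out[queue.name] = gpus
        return out
    total = sum(child.weight for child in queue.children)
    for child in queue.children:
        fair_share(child, gpus * child.weight / total, out)
    return out


# Assumed example: an 8-GPU cluster shared by two teams with a 3:1 weighting,
# where team-a further splits its share 2:1 between training and inference.
root = Queue("cluster", children=[
    Queue("team-a", weight=3, children=[
        Queue("team-a/train", weight=2),
        Queue("team-a/infer", weight=1),
    ]),
    Queue("team-b", weight=1),
])

shares = fair_share(root, 8)
for name, share in shares.items():
    print(f"{name}: {share} GPUs")
```

Real schedulers layer borrowing, preemption, and gang scheduling on top of a weighting like this, which the sketch leaves out.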

Crystal Dining Room
