Building an SLM Platform with Karpenter, Ray Serve and Ollama
Pedro Henrique Oliveira, Matheus Oliveira
In today's enterprise landscape, organizations struggle to deploy AI infrastructure at scale, facing challenges in resource optimization and cost management. This presentation introduces a Small Language Model (SLM) platform that combines Karpenter, Ray Serve and Ollama on Kubernetes to address these challenges. We'll show how to achieve up to 20% cost reduction in GPU utilization through dynamic resource allocation and efficient workload distribution. A unified management layer simplifies model versioning and monitoring while handling concurrent model deployments and demand spikes, ensuring consistent performance with built-in audit capabilities for compliance.
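The dynamic GPU allocation described above can be sketched with a Karpenter NodePool that provisions GPU nodes only when SLM pods are pending and consolidates them when idle. This is an illustrative sketch, not the speakers' actual configuration; the pool name, GPU limit, and instance-category values are assumptions for an AWS-based cluster.

```yaml
# Hypothetical NodePool for the SLM platform (Karpenter v1 API).
# Names and limits are illustrative, not from the presentation.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: slm-gpu
spec:
  template:
    spec:
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["g"]                  # GPU instance families (e.g. g5, g6)
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]  # prefer Spot capacity to cut GPU cost
      taints:
        - key: nvidia.com/gpu
          effect: NoSchedule             # only GPU workloads tolerate this taint
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  limits:
    nvidia.com/gpu: "8"                  # cap total GPUs the pool may provision
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m                 # scale GPU nodes back down when idle
```

Under this kind of policy, GPU nodes exist only while Ray Serve or Ollama pods demand them, which is one plausible source of the cost savings the abstract cites.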