Pedro Henrique Oliveira
Pedro Henrique is an ISV Solutions Architect at AWS, supporting clients on their containers and open source journey. Pedro has experience working with distributed systems, modernization, cloud-native applications, GitOps, and platform engineering. He also contributes to open source projects such as twelve factors and has spoken about CNCF projects at CNCF KCD and third-party community events.
ISV Solutions Architect at AWS
Sessions
In today's enterprise landscape, organizations struggle to deploy AI infrastructure at scale, facing challenges in resource optimization and cost management. This presentation introduces a Small Language Model (SLM) platform that combines Karpenter, Ray Serve, and Ollama on Kubernetes to address these challenges. We'll showcase how it achieves up to 20% cost reduction in GPU utilization through dynamic resource allocation and efficient workload distribution. The unified management layer simplifies model versioning and monitoring while handling concurrent model deployments and demand spikes, and ensures consistent performance with built-in audit capabilities for compliance.
In the ever-evolving world of Kubernetes-based applications, the ability to scale resources effectively is paramount. However, when an application's performance relies on the health and performance of multiple external services or dependencies, traditional scaling approaches can fall short. This session explores the powerful combination of KEDA (Kubernetes-based Event-Driven Autoscaler) and multiple metric sources to achieve intelligent, adaptive scaling for your applications. In this session, we'll dive deeper into the Scaling Modifiers feature, which allows fine-grained control over scaling behavior based on diverse metrics and conditions, and walk through detailed case studies that illustrate the practical benefits of using Scaling Modifiers in real-world applications.