Opening comments
Within the cloud native ecosystem there is a wide variety of tools tackling authorization. This presentation covers what those tools are and how they relate to each other so that folks can find the right tool for the job.
Deep Learning (DL) models are being successfully applied in a variety of fields. Managing DL inferencing for diverse models presents cost and operational complexity challenges. The resource requirements for serving a DL model depend on its architecture, and its prediction load can vary over time, leading to the need for flexible resource allocation to avoid provisioning for the maximum amount of resources needed at peak load. Using the cloud to allocate resources flexibly adds operational complexity to obtain minimum-cost resources matching model needs from the large and ever-evolving sets of instance types. Selecting minimum-cost cloud resources is particularly important given the high cost of x86+GPU compute instances, which are often used to serve DL models.
We describe an approach to efficient DL inferencing on cloud Kubernetes (K8s) cluster resources. The approach combines two kinds of right-sizing. The first is right-sizing the inference resources, using the Elotl Luna smart node provisioner to add right-sized compute to cloud K8s clusters when needed and remove it when not. The second is right-sizing the inference compute type, using cloud Ampere A1 Arm compute with the Ampere Optimized AI library, which can provide a price-performance advantage on DL inferencing relative to GPUs and to other CPUs.
We show the benefits of the approach using inference workloads running on auto-scaled TorchServe deployments. For cloud K8s clusters from two vendors, we compare the cost and operational complexity of right-sizing against two common non-right-sized approaches.
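The idea of choosing minimum-cost resources that match a model's needs can be sketched as follows. This is only an illustration, not how Luna selects nodes: the `InstanceType` class, the `cheapest_fit` helper, and every instance name and price in `CATALOG` are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: float
    gpus: int
    hourly_cost: float  # hypothetical price, not a real cloud quote


def cheapest_fit(
    catalog: Sequence[InstanceType],
    vcpus: int,
    memory_gib: float,
    gpus: int = 0,
) -> Optional[InstanceType]:
    """Return the lowest-cost instance type that satisfies the request."""
    candidates = [
        inst
        for inst in catalog
        if inst.vcpus >= vcpus
        and inst.memory_gib >= memory_gib
        and inst.gpus >= gpus
    ]
    # default=None covers the case where nothing in the catalog is big enough.
    return min(candidates, key=lambda inst: inst.hourly_cost, default=None)


# Made-up catalog used only to exercise the function.
CATALOG = [
    InstanceType("arm-small", vcpus=4, memory_gib=16, gpus=0, hourly_cost=0.10),
    InstanceType("arm-large", vcpus=16, memory_gib=64, gpus=0, hourly_cost=0.40),
    InstanceType("x86-gpu", vcpus=8, memory_gib=32, gpus=1, hourly_cost=1.20),
]

if __name__ == "__main__":
    print(cheapest_fit(CATALOG, vcpus=8, memory_gib=32))
```

A real provisioner would also have to account for availability, startup time, and pricing models, which this sketch ignores.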
eBPF is now a well-known technology used for networking, observability and security purposes in the cloud native landscape. There are a lot of different projects like BCC, Cilium, Falco, Pixie and Inspektor Gadget (to mention a few) that use eBPF as their core technology. One question often asked is how much CPU and memory those programs use. This is a hard question to answer, as eBPF programs run in the kernel context and traditional tools to measure CPU and memory consumption aren’t aware of them.
The 5.1 release of Linux introduced a new feature to collect statistics on eBPF programs and bpftool implemented support to show them. However, bpftool is not Kubernetes aware and it doesn’t provide an easy way to sort the output. That’s where the new ebpf top gadget comes in. It uses the same bpftool mechanism to collect information about the eBPF programs and maps from the kernel and provides an interface to show the list of programs and their resource consumption with additional information like the processes that created those programs. The ebpf top gadget also provides a mechanism to sort the output based on different parameters like number of runs, memory used, etc.
In this talk, Mauricio will introduce the Inspektor Gadget project and then show how the ebpf top gadget can be used to measure the resource consumption of eBPF programs from different projects like Falco, Cilium and Inspektor Gadget.
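To make the sorting idea concrete, here is a minimal sketch that ranks eBPF programs by their kernel-collected statistics. It is not the ebpf top gadget's implementation. It assumes input shaped like the JSON printed by `bpftool prog show --json` with statistics enabled (`sysctl kernel.bpf_stats_enabled=1`), where each program carries `run_time_ns`, `run_cnt` and `bytes_memlock` fields. The `top_programs` helper and the `SAMPLE` data are made up for illustration.

```python
import json
from typing import Any, Dict, List


def top_programs(
    bpftool_json: str, sort_by: str = "run_time_ns", limit: int = 10
) -> List[Dict[str, Any]]:
    """Sort eBPF programs by one numeric field, largest first."""
    programs = json.loads(bpftool_json)
    # Programs that never ran may lack the stats fields, so default to 0.
    ranked = sorted(programs, key=lambda prog: prog.get(sort_by, 0), reverse=True)
    return ranked[:limit]


# Made-up sample in the assumed shape of bpftool's JSON output.
SAMPLE = """
[
  {"id": 11, "type": "kprobe", "name": "trace_open",
   "run_time_ns": 5200, "run_cnt": 40, "bytes_memlock": 4096},
  {"id": 12, "type": "tracepoint", "name": "trace_exec",
   "run_time_ns": 91000, "run_cnt": 7, "bytes_memlock": 8192},
  {"id": 13, "type": "cgroup_skb", "name": "count_egress",
   "bytes_memlock": 4096}
]
"""

if __name__ == "__main__":
    for prog in top_programs(SAMPLE, sort_by="run_cnt"):
        print(prog["id"], prog["name"], prog.get("run_cnt", 0))
```

Unlike the gadget, this sketch has no notion of Kubernetes pods or of the processes that loaded each program.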
Engineering distributed applications has never been harder. The development process is filled with work that distracts from business logic, such as state persistence, event-handling, and knowledge about orchestrators, schedulers, and cloud providers. What if we create a new POSIX for the cloud?
The SpiderLightning Project experiments with capabilities as interfaces that extend WASI to create a new POSIX for the cloud. For example, developers can use a key-value interface to manage application state without requiring provider-specific knowledge (e.g., Redis) because the host implements this interface and will be configured with the proper implementation. This creates common distributed application APIs and decouples application development from operational knowledge.
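The decoupling described above can be illustrated with a small sketch. It is not SpiderLightning's actual interface (which is defined against WASI): the `KeyValue` protocol, the `InMemoryKeyValue` host implementation, and the `record_visit` application function are hypothetical names chosen for this example.

```python
from typing import Dict, Optional, Protocol


class KeyValue(Protocol):
    """Hypothetical capability: the only thing application code depends on."""

    def get(self, key: str) -> Optional[bytes]: ...

    def set(self, key: str, value: bytes) -> None: ...


class InMemoryKeyValue:
    """One possible host-side implementation; Redis or a cloud store could
    be swapped in by configuration without touching the application code."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


def record_visit(store: KeyValue, user: str) -> int:
    """Application logic written against the interface, not a provider."""
    key = f"visits:{user}"
    raw = store.get(key)
    count = 1 if raw is None else int(raw.decode()) + 1
    store.set(key, str(count).encode())
    return count


if __name__ == "__main__":
    store = InMemoryKeyValue()
    record_visit(store, "ada")
    print(record_visit(store, "ada"))
```

The application function never imports a provider SDK, which is the property the capability model is after.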
What do you do when you have one nice PC sitting around but you really need to hack on a multi-node Kubernetes cluster? Build one by installing FreeBSD and using its native bhyve virtualization platform.
Working in a large, multi-team, multi-tenant organization can be hard. There can be sub-teams, sibling teams, different BUs, parallel efforts, clients, tenants and more that all need to both collaborate and be kept separate. In this complex type of environment, RBAC, access rules, network policies, and API server load can be difficult to manage. Someone might have already suggested looking into virtual clusters. After looking into how virtual clusters provide isolation that namespaces do not, you may have even decided they are a good fit for your environment.
Now that you have a virtual cluster running on a Kubernetes cluster that runs on a virtual machine that runs in a virtual data center, where does it all end?
In this talk Mike will be using vcluster to layer virtual cluster on top of virtual cluster, diving deeper & deeper into the depths of inception. While API servers explode around us, we'll find out how many API servers are dancing on the head of that pin.
Virtual Networks, Container Networks and Software Defined Networking have all added layers of abstraction and complication on what used to be straightforward and very tactile: plug in a cable, then watch the packets flow. But the basic protocols and how our systems exchange information largely remain the same. This talk is a back-to-basics look at how we can remember some basic principles to troubleshoot modern problems.
We've all seen it: Conferences fail to provide a diverse line-up, get called out publicly and speakers bail in fear of backlash. But this is just the tip of the iceberg. More often than not, such incidents reveal a failure of leaders to create a diverse and inclusive community in the first place.
It’s not enough to have the right boxes checked. Marginalised folks need to also feel safe to share their experiences.
A clear set of values, Codes of Conduct, and programs aimed at underrepresented folks are all tools that can help. Ultimately, however, a community is made up of people, and it is on us to reflect on our behaviour, resist the urge to go for the option that makes us comfortable and do better.
In this talk, I want to discuss how we can take action beyond calling people out on Twitter to build something that will truly benefit everyone.
A lot of interest in virtual Kubernetes clusters and the open source tool vcluster has developed over the last year. vcluster allows platform teams to provide virtual Kubernetes clusters to their users. A virtual cluster appears to be a full-blown Kubernetes cluster to the users, but it runs within a namespace of the host cluster. This allows users to have admin access to the cluster, use multiple namespaces in it, and manage global objects like CRDs.
During the last year, many new features have been added to vcluster, and we’ve seen it used for use cases that we hadn’t even imagined. This talk will provide tips and tricks to help teams get more from their virtual clusters and show off some fun things you can do with them.
We’ll cover: How to share resources like ingresses from the host cluster, using vcluster’s isolated mode to automatically add network policies and Pod Security Standards to your virtual clusters, pausing and resuming virtual clusters, monitoring and backing up virtual clusters, and writing plugins with the vcluster SDK. We’ll also cover some weirder examples like using vcluster for shadow IT (users don’t need to have elevated privileges in the host cluster to start a virtual cluster) and running a virtual cluster inside a virtual cluster.
Where do you find internal documentation about a legacy microservice? How do you make an API call to the new service deployed by another team? What is the status of your service in the production Kubernetes cluster? If the frontend team finds that the backend service is down on a Friday evening, how can they trigger a PagerDuty alert?
All these questions can be answered with a single tool: Backstage.
Backstage can be integrated into any platform or company to increase productivity and start the developer-experience journey. Whether you already have some documentation in place or are starting from scratch, it is easy to install Backstage and integrate the minimum capabilities needed to make life easier for everyone in the company, starting with developers.
Role-based Access Control (AKA RBAC) is a continuous challenge with the growing complexity of cloud native operations, the sheer number of services involved, as well as the privileges required to manage and maintain complex systems with today's ironclad SLAs. Many modern microservices systems are built upon Kubernetes, which has its own unique set of RBAC challenges.
In this talk I'll walk through some of the challenges of managing RBAC at scale in Kubernetes operations, from common mistakes (cluster-admin anyone?) and misconfigurations to overly privileged roles, including unnecessary access to secrets. As a Kubernetes RBAC expert, Amir will cover all the questions you always wanted to ask and never dared, including how to assign access to secrets (both from a technical and organizational perspective), who should be allowed to delete pods, and the age-old question of who really should be allowed to have cluster-admin access. We'll wrap up with some hard-earned tips for how to architect RBAC best practices into your systems, and some good open source tools to manage privileges and access in the long term.
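As a rough illustration of the kind of check such tooling performs, the sketch below scans the `rules` list of a Kubernetes Role or ClusterRole (already parsed into Python dictionaries) for two of the problems mentioned above: wildcard access and read access to secrets. The `risky_rules` helper and the sample rules are hypothetical, and the check deliberately ignores `apiGroups` and `resourceNames` to stay short.

```python
from typing import Any, Dict, List, Tuple

READ_VERBS = {"get", "list", "watch"}


def risky_rules(rules: List[Dict[str, Any]]) -> List[Tuple[int, str]]:
    """Return (rule index, reason) pairs for rules that look over-privileged."""
    findings = []
    for index, rule in enumerate(rules):
        resources = set(rule.get("resources", []))
        verbs = set(rule.get("verbs", []))
        if "*" in resources and "*" in verbs:
            # Every verb on every resource: effectively admin for this scope.
            findings.append((index, "wildcard access to all resources"))
        elif resources & {"secrets", "*"} and verbs & (READ_VERBS | {"*"}):
            findings.append((index, "can read secrets"))
    return findings


# Made-up rules in the shape used by Role and ClusterRole manifests.
SAMPLE_RULES = [
    {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
    {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get"]},
    {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
    {"apiGroups": [""], "resources": ["secrets"], "verbs": ["create"]},
]

if __name__ == "__main__":
    for index, reason in risky_rules(SAMPLE_RULES):
        print(f"rule {index}: {reason}")
```

A production audit would also follow RoleBindings to see which subjects actually receive these permissions.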
Come explore building micro-service APIs using Kubernetes Custom Resources (CRs)! We'll demo a real-life example of such an API, analyze its advantages and disadvantages relative to typical REST APIs, and provide some guidelines for deciding whether using a CR-based API is right for your application.
This talk will cover why there is a need to give developers access to Kubernetes-based development environments and Crossplane during development: so they can code and test their changes in an environment as close to production as possible.
The talk will highlight the challenges developers face due to the lack of a simple infrastructure provisioning workflow and how Kubernetes and Crossplane come together to solve that. We will then go over how a simple yet powerful dev workflow can be set up using Crossplane and Kubernetes-based development environments.
The talk will cover:
- What Kubernetes-based development environments are, and how Crossplane provisions infrastructure
- Why developers need the combination of the two to be effective when writing cloud-native applications
- Demo of setting up a dev workflow using them
eBPF allows for introspection of events across entire nodes and is a powerful foundation for collecting data from different workloads on a Kubernetes cluster. This talk will walk step by step through a cryptocurrency mining attack, showing how it behaves and evolves, and how different stages of the attack can be detected using open source eBPF-based tools.
As a demonstration, a live miner that is barely detectable using traditional userspace tools will be shown running in a pod. Using tools like Cilium’s Tetragon project and leveraging eBPF’s kernel-based network and process-level visibility, malicious behaviors such as suspicious processes and unexpected outbound connections are easily identified. As a result, the detected miner will be blocked, and the cluster defended.
Attendees will leave with ideas for protecting Kubernetes clusters, as well as an understanding of how eBPF-based tools can operate across an entire Kubernetes cluster without any modification to applications or their configuration.
That’s right! The Open Policy Agent has other skills than just securing your clusters. The general-purpose design of the Open Policy Agent has enabled many tools, such as Gatekeeper, to adopt it for their own policy decision needs. This is powerful because it provides end-users with a consistent approach to policy enforcement throughout the cloud native ecosystem.
This talk will look at several different tools and techniques that leverage OPA's policy engine and how they can benefit the development, deployment, and security of your applications.
We'll explore:
- How Regula can evaluate your infrastructure for compliance violations before it ever reaches the cloud.
- How Conftest can enforce cluster policies in local environments and CI without the need for a cluster.
- How Gatekeeper can provide cluster audits and prevent insecure workloads from being deployed.
- How Konstraint can automatically generate documentation, constraints, and templates for your policies.
- ... and more!
By the end of this talk, the audience will have more tools available to them in their toolkit and gain a different perspective on how the Open Policy Agent is used today to make better decisions for tomorrow.
The collection and storage of observability data is critical for day-to-day operations and the long-term health of clusters and applications. The increasing volume of this observability data can be leveraged by AI algorithms and data analytics to automate triaging, response, and remediation for common issues, reducing mean time to detection and resolution. To achieve this observability-based AIOps system, one must implement AI algorithms, set up a combination of logging, monitoring, and tracing backends to store data, and deploy agents for each type of observability data in downstream clusters to ship data to the backend. This complex setup can be challenging for users, and this talk will demonstrate how Opni can be leveraged to simplify the setup and management of a fully open source AIOps & observability system.
You’re deploying a project with a Kubernetes service that can be accessed using port-forward or an external IP, by using the load balancer service type. But when it’s time to deploy the project into production, the documentation doesn’t explain how to set up TLS. Now what?
Cert-manager to the rescue! Cert-manager makes it easy to generate a TLS certificate, which can be used to enable HTTPS (secure HTTP) access to an application. During this presentation and live demo, Onkar will show attendees how to:
- Install cert-manager
- Deploy a certificate issuer using “Let's Encrypt” and a DNS-01 resolver
- Provision a TLS certificate using cert-manager and the certificate issuer
- Create DNS records to map a domain name to the application's external IP addresses
- Deploy an application with the TLS certificate and demo how to access the application using HTTPS in a browser
The audience will walk away with a concrete set of steps for deploying their application with TLS, so it can be accessed using HTTPS.
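The issuer and certificate steps listed above can be sketched as the two cert-manager resources they typically involve. This is a minimal sketch under assumptions, not the configuration from the demo: it uses a `ClusterIssuer` pointed at the Let's Encrypt staging endpoint, assumes Cloudflare as the DNS-01 provider, and every resource name, email address, domain and secret name is a placeholder. The manifests are built as Python dictionaries and printed as JSON, which `kubectl apply -f` accepts.

```python
import json
from typing import Any, Dict

# Let's Encrypt staging endpoint, suitable for trial runs.
LETS_ENCRYPT_STAGING = "https://acme-staging-v02.api.letsencrypt.org/directory"


def cluster_issuer(name: str, email: str) -> Dict[str, Any]:
    """ACME issuer that proves domain ownership via a DNS-01 challenge."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "ClusterIssuer",
        "metadata": {"name": name},
        "spec": {
            "acme": {
                "server": LETS_ENCRYPT_STAGING,
                "email": email,
                "privateKeySecretRef": {"name": f"{name}-account-key"},
                "solvers": [
                    {
                        "dns01": {
                            # Assumption: Cloudflare hosts the DNS zone. The
                            # secret name and key below are placeholders.
                            "cloudflare": {
                                "apiTokenSecretRef": {
                                    "name": "cloudflare-api-token",
                                    "key": "api-token",
                                }
                            }
                        }
                    }
                ],
            }
        },
    }


def certificate(name: str, namespace: str, domain: str, issuer: str) -> Dict[str, Any]:
    """Certificate request; cert-manager stores the result in `secretName`."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "secretName": f"{name}-tls",
            "dnsNames": [domain],
            "issuerRef": {"name": issuer, "kind": "ClusterIssuer"},
        },
    }


if __name__ == "__main__":
    issuer = cluster_issuer("letsencrypt-staging", "admin@example.com")
    cert = certificate("demo-app", "default", "app.example.com", "letsencrypt-staging")
    for manifest in (issuer, cert):
        print(json.dumps(manifest, indent=2))
```

The application would then reference the generated TLS secret (here `demo-app-tls`) from its Ingress or server configuration.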
We all love this community and the privilege of working in open source. In this talk I will cover the key tenets of a positive community and specific things we can do to support developers and the community.
What do you do when faced with the ever-growing and always confusing cloud native landscape? Use visual, interactive analogies from the equally confusing twisty-puzzle landscape!
The Lego experience is more than just a collection of premium-priced bricks in a box. If one looks closer, it's full of guidance for the cloud native developer, including interoperability, backward compatibility, design, and documentation. This brief rant will highlight ways your project can meet developer expectations, and pitfalls to avoid so your project won't be cast aside like a disappointing toy.
End-to-end testing of Kubernetes apps is usually done with many lines of bash scripts, because this may seem like a natural progression from testing Kubernetes apps manually with kubectl. Bash is not well-equipped for such tests because users have to create a lot of boilerplate and wrapper functions to make tests reliable. At the same time, Kubernetes provides excellent client libraries in many programming languages. In this presentation, Paweł will show that taking advantage of the client libraries can improve test speed and reliability and, as a side effect, shorten the feedback loop for developers.
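One example of the boilerplate in question is waiting for a resource to become ready. The sketch below shows a small polling helper that a test written in a general-purpose language could reuse. It is not taken from the talk: `wait_until` is a hypothetical name, and the condition is passed in as a plain callable so the example runs without a cluster.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# In a real test the condition would query the cluster through a client
# library. With the official Python client that might look like this
# (assumption: the `kubernetes` package is installed and configured):
#
#   apps = kubernetes.client.AppsV1Api()
#   wait_until(
#       lambda: (apps.read_namespaced_deployment("web", "default")
#                .status.ready_replicas or 0) >= 3,
#       timeout=120,
#   )

if __name__ == "__main__":
    started = time.monotonic()
    # Stand-in condition: becomes true after roughly 0.2 seconds.
    print(wait_until(lambda: time.monotonic() - started > 0.2, timeout=2, interval=0.05))
```

The helper returns a boolean rather than raising, so the test framework's own assertion produces the failure message.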
Log aggregation is one of the cornerstones of observability, but setting up a logging stack can be overly complicated. As the number of clusters an operations team is expected to manage explodes, a simpler solution is needed. This talk will demonstrate how we can simplify this process and set up a log aggregation platform in 5 minutes.
Security, like all technology disciplines, has its buzzwords. You'll often hear acronyms like SAST, SCA, DAST, and many more… but what does it all really mean?
In this talk we will review the many kinds of vulnerability scanning, with a focus on Kubernetes security scanning. We'll help you understand what kinds of vulnerabilities you can and cannot identify with these tools. We'll review some of the popular open source security scanning tools in the ecosystem, and help you understand where you can use each and what to scan: registries, clusters, CI/CD. This will be demoed through real code examples and scanning scenarios.
In this talk we will show you how we speed up Cortex deployments at scale, using zone-aware Kubernetes controllers.
Kubernetes allows pods to be spread across different zones through topology constraints, but these are not taken into consideration during rolling updates or by pod disruption budgets. For instance, it's recommended to replicate Cortex's ingesters across different zones for high availability, allowing the system to continue to work in the event of a zone outage. However, the lack of support for zone-aware deployments forces Cortex operators to allow just a single container to be updated at once, causing long deployments and limiting the speed at which nodes can be upgraded.
To bypass these limitations, the Amazon Managed Service for Prometheus team released a couple of k8s controllers for zone-aware rollouts and disruptions that can be used by any highly available, quorum-based distributed application, such as Cortex, to improve the velocity of deployments in a safe way.
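The intuition behind a zone-aware rollout can be sketched in a few lines: if replicas are spread across zones, all pods in one zone can be updated together while the other zones keep serving. The code below is only an illustration of that grouping step and is not the logic of the controllers mentioned above. The `zone_batches` helper and the pod and zone names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def zone_batches(pod_zones: Dict[str, str]) -> List[List[str]]:
    """Group pods into update batches, one batch per zone.

    Updating one batch at a time means at most one zone is disrupted at any
    moment, instead of updating a single pod at a time.
    """
    by_zone: Dict[str, List[str]] = defaultdict(list)
    for pod, zone in pod_zones.items():
        by_zone[zone].append(pod)
    # Sort for a deterministic rollout order.
    return [sorted(by_zone[zone]) for zone in sorted(by_zone)]


# Made-up placement of four ingester pods across three zones.
PODS = {
    "ingester-0": "zone-a",
    "ingester-1": "zone-b",
    "ingester-2": "zone-a",
    "ingester-3": "zone-c",
}

if __name__ == "__main__":
    for step, batch in enumerate(zone_batches(PODS), start=1):
        print(f"step {step}: update {', '.join(batch)}")
```

A real controller would additionally wait for each batch to become healthy before moving on to the next zone.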
Cloud Custodian is an open source cloud security, governance, and management tool with powerful integrations with AWS cloud services that allows for quick response times to address a wide array of compliance, governance, and security issues. As public cloud adoption increases across industries, the need to properly secure and govern cloud resources is more important than ever. This session will show how to react quickly to changing security and compliance standards, in response to security bulletins published by public cloud providers, using a serverless and event-based process.