Spotlight
Tobby Kuo
This tutorial teaches how to build an end-to-end real-time baggage tracking system on Kubernetes using Kafka for event streaming, Flink for stateful processing, ClickHouse for analytics, and Grafana for visualization.
Marcin Cuber
This article explains why NATS headless services return NXDOMAIN errors on EKS and how Kubernetes headless service DNS resolution works with StatefulSets.
Rob Sherling
This case study shows how EMC Healthcare built an on-premises CI/CD pipeline using K3s, ArgoCD, and Argo Workflows to automate testing and deployments with preview environments.
Shanaka Jayasundera
This tutorial shows how to expose the Kubernetes Gateway API from AKS through Azure Application Gateway, fixing health probe failures with a dedicated HTTPRoute and fixing connection timeouts by setting externalTrafficPolicy to Local for Azure DSR.
Tools and utilities
OpenKruise Agents manage AI agent workloads in Kubernetes, providing rapid resource provisioning via pooling, sandbox hibernation with checkpoint support, and user session management with efficient traffic routing.
Gonzo lets you use a terminal UI to stream and analyse logs in real time, with support for OpenTelemetry (OTLP), AI-powered insights, heatmaps and advanced filtering.
helm-chartsnap is a snapshot testing tool for Helm charts that needs only minimal configuration, kept within values.yaml files.
onechart is a generic Helm chart for your application deployments.
k9sight is a Go TUI for debugging Kubernetes workloads with vim-style navigation, supporting log search, exec, port-forward, scale, restart, and built-in debug helpers for common pod failure states like CrashLoopBackOff and ImagePullBackOff.
Events starting soon
March 25, 2026
Location: Seattle, WA, USA
This event requires an entrance fee.
March 26, 2026
Location: Rust, DE
This event requires an entrance fee.
March 26, 2026
Location: Amsterdam, NL
This event requires an entrance fee.
March 26, 2026
Location: Amsterdam, NL
This is a free event.
March 26, 2026
Location: Austin, TX, USA
This is a free event.
March 27, 2026
This is a virtual event.
This event requires an entrance fee.
Running GPU workloads on Kubernetes sounds straightforward until you need to isolate multiple tenants on the same server. The moment you virtualize GPUs for security, you lose access to NVIDIA kernel drivers — and almost every tool in the ecosystem assumes those drivers exist.
Landon Clipp built a GPU-based Containers as a Service platform from scratch, solving each isolation layer in turn: kernel separation with Kata Containers and QEMU, NVLink fabric partitioning, and network policies with Cilium/eBPF. He shares exactly what broke along the way.
In this interview:
Where Containers as a Service fits best: inference workloads where AI teams want to ship an OCI image without managing infrastructure or signing multi-million dollar cluster contracts.
Learn from production
Sphoorthi Charan Nayakudugari
This case study explains how the authors used dynamic MIG partitioning to split large GPUs such as the NVIDIA A100 and H100 into multiple isolated slices, letting many small jobs share a GPU efficiently.
Farid Guluzade
This case study shows how reducing JVM MaxRAMPercentage, cutting the Hikari connection pool from 50 to 20, and implementing aggressive HPA scale-up (0s stabilization, 4 pods/min) doubled traffic capacity while cutting baseline pods from 26 to 10.
Mateen Ali Anjum
This case study describes rebuilding a fragile Kubernetes infrastructure into a production-grade platform for GPU-based ML workloads, improving deployment frequency from weekly to 10+ times daily.
Ron Matsliah
This article describes how the team at Next Insurance built an AI-powered microservice that watches for build failures in Jenkins, analyzes the logs automatically, and posts clear, helpful feedback to Slack.
Matching jobs
Data Engineer with Capco
Salary: PLN 5.7K to PLN 532.4K a year
Location: remote from
Tech stack: Kubernetes, AWS, Azure, GCP, Docker, Python, Scala, SQL, Airflow, Spark
Data Engineer with Cavnue
Salary: $54K to $286K a year
Location: remote from
Tech stack: Kubernetes, GCP, Docker, C++, Python, Redis, PostgreSQL, Terraform, Gitlab
Data Engineer with ShyftLabs
Salary: $1.4L to $2.37L a year
Location: based in the office in Noida, IN
Tech stack: Kubernetes, GCP, Docker, SQL, Python, Snowflake, Airflow, Terraform, Gitlab
DevOps Engineer with 100MS
Salary: $1.08L to $2.75L a year
Location: based in the office in Bengaluru, IN
Tech stack: Kubernetes, GCP, Helm, Shell, Terraform, Grafana, Prometheus, Loki
DevOps Engineer with ALTEN
Salary: $99.9K to $275K a year
Location: based in the office in Boulogne-Billancourt, FR
Tech stack: Kubernetes, AWS, GCP, Docker, Terraform
Build something
Jean Baptiste Lapeyre
This article explains how to integrate an Arista datacenter fabric with the Cilium CNI by building a spine-leaf architecture with an IS-IS underlay and a BGP EVPN overlay.
piotr.minkowski
This tutorial teaches how to use the In-Place Pod Resize feature in Kubernetes 1.35, combined with the Kube Startup CPU Boost controller, to speed up Java application startup by temporarily increasing CPU resources during the startup phase.
Juan Carlos Gonzalez Cabrero
This tutorial teaches how to build intelligent load balancing on GCP for Python microservices using Google Kubernetes Engine with Network Endpoint Groups, readiness probes, autoscaling, and observability to reduce failures.
Naman Raj
This tutorial teaches how to build a production-like service mesh lab using Istio with a 3-tier application (Next.js, Go, Flask) on a local Kind cluster.
Call for Papers closing soon
1
day
Location: Amsterdam, NL
In-person conference organized by CNCF.
The conference starts on 26 March 2026.
2
days
Location: New York, NY, USA
In-person conference organized by DeveloperWeek New York.
The conference starts on 10 June 2026.
4
days
Location: Amsterdam, NL
In-person conference organized by Devopsdays.
The conference starts on 19 June 2026.
4
days
KubeCon + CloudNativeCon Japan 2026
Location: Yokohama, JP
In-person conference organized by Linux Foundation.
The conference starts on 30 July 2026.
6
days
WeAreDevelopers World Congress 2026 North America
Location: San Jose, CA, USA
In-person conference organized by WeAreDevelopers.
The conference starts on 25 September 2026.
6
days
Location: Malaga, ES and virtual
Online & in-person conference organized by Yay Yay Events.
The conference starts on 29 October 2026.
6
days
Location: Malmö, SE
In-person conference organized by Øredev.
The conference starts on 4 November 2026.
More articles
Ar Hakboian
This article describes an experiment using three autonomous AI agents to conduct multi-agent SRE incident investigations in a sandboxed Kubernetes environment with real tooling access.
Shivee Gupta
This article explains how Dream11 built an in-house observability platform using SigNoz, ClickHouse, and OpenTelemetry to handle millions of metrics and traces across thousands of EC2 instances, saving millions in commercial tooling costs.
Jitin Kayyala
This article explains service mesh patterns for managing microservice communication, covering how sidecars like Envoy handle retries, circuit breakers, timeouts, and load balancing transparently.
Vasily Pilitsyn
This article shows how KRO manages ephemeral test environments as single Kubernetes API objects by orchestrating resource deployment in dependency order, with readiness conditions and a unified status across namespace, frontend, backend, and database.