Spotlight
Sphoorthi Charan Nayakudugari
This case study explains how the authors used dynamic MIG partitioning to split large GPUs like the NVIDIA A100 and H100 into multiple isolated slices, letting many small jobs share a GPU efficiently.
Jean Baptiste Lapeyre
This article explains how to integrate an Arista datacenter fabric with the Cilium CNI by building a spine-leaf architecture with an IS-IS underlay and a BGP EVPN overlay.
Ar Hakboian
This article describes an experiment using three autonomous AI agents to conduct multi-agent SRE incident investigations in a sandboxed Kubernetes environment with real tooling access.
piotr.minkowski
This tutorial teaches how to use the In-Place Pod Resize feature in Kubernetes 1.35 together with the Kube Startup CPU Boost controller to speed up Java application startup by temporarily increasing CPU resources during the startup phase.
Tools and utilities
KubeDiagrams is a tool that automatically generates visual architecture diagrams from Kubernetes manifests, Helm charts, and live clusters.
lazydocker is a terminal UI for managing Docker containers and services, with log and metric graph viewing, container attachment, and execution of common Docker commands.
Radius bridges developer and operator workflows by enabling cloud-neutral application deployment with infrastructure recipes and Dapr integration that simplify building, configuring, and managing modern cloud applications across platforms.
Sloth generates Prometheus Service Level Objectives with reliable SLI recording rules and multi-window, multi-burn-rate alerts from simple YAML specs.
NVIDIA Dynamo is a datacenter-scale distributed LLM inference framework supporting disaggregated prefill/decode, KV-aware routing, and dynamic GPU scheduling across vLLM, SGLang, and TensorRT-LLM.
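The multi-window, multi-burn-rate alerts mentioned in the Sloth entry rest on simple burn-rate arithmetic. The sketch below illustrates that arithmetic only and is not Sloth's code. The function names, the 30-day SLO period, and the budget fractions (2% in 1 hour, 5% in 6 hours) are assumptions borrowed from commonly cited SRE guidance, not from the entry above.

```python
def burn_rate(budget_fraction: float, window_hours: float,
              period_hours: float = 30 * 24) -> float:
    """Burn rate that spends `budget_fraction` of the error budget in `window_hours`.

    A burn rate of 1 spends exactly the whole budget over the full SLO period.
    """
    return budget_fraction * period_hours / window_hours


def error_ratio_threshold(slo_target: float, rate: float) -> float:
    """Error ratio at which an alert with the given burn rate would fire."""
    error_budget = 1.0 - slo_target
    return rate * error_budget


if __name__ == "__main__":
    slo_target = 0.999  # hypothetical 99.9% availability objective
    for fraction, window in [(0.02, 1), (0.05, 6)]:
        rate = burn_rate(fraction, window)
        threshold = error_ratio_threshold(slo_target, rate)
        print(f"window={window}h burn_rate={rate:.1f} threshold={threshold:.4f}")
```

Under these assumptions, a short window with a high burn rate catches fast outages, while a longer window with a lower burn rate catches slow budget drain.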
Events starting soon
March 21, 2026
Location: Thane, IN
This is a free event.
March 21, 2026
Location: Noida, IN
This is a free event.
March 21, 2026
This is a virtual event.
This is a free event.
March 23, 2026
Location: Amsterdam, NL
This event requires an entrance fee.
March 23, 2026
Location: Amsterdam, NL
This event requires an entrance fee.
March 23, 2026
Location: Amsterdam, NL
This event requires an entrance fee.
Build failures in Kubernetes CI/CD pipelines are a silent productivity killer. Developers spend 45+ minutes scrolling through cryptic logs, often just hitting rerun and hoping for the best.
Ron Matsliah, DevOps engineer at Next Insurance, built an AI-powered assistant that cut build debugging time by 75%. It is not a dashboard: it delivers its findings directly in Slack, where developers already work.
The takeaway: simple rules plus rich context consistently outperform complex AI queries on their own.
Learn from production
Farid Guluzade
This case study shows how reducing JVM MaxRAMPercentage, cutting the Hikari connection pool from 50 to 20, and implementing aggressive HPA scale-up (0s stabilization, 4 pods/min) doubled traffic capacity while cutting baseline pods from 26 to 10.
Mateen Ali Anjum
This case study describes rebuilding a fragile Kubernetes infrastructure into a production-grade platform for GPU-based ML workloads, improving deployment frequency from weekly to 10+ times daily.
Ron Matsliah
This article describes how the team at Next Insurance built an AI-powered microservice that watches build failures via Jenkins, analyzes logs automatically and posts clear, helpful feedback to Slack.
Scout24
This case study shows how Scout24 turned an Amazon Linux 2 end-of-life deadline into a 30% reduction in nodes across their EKS clusters by combining OS migration with Karpenter adoption.
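The scale-up settings quoted in the first case study of this section (0s stabilization, 4 pods/min) can be read as the `behavior.scaleUp` section of an `autoscaling/v2` HorizontalPodAutoscaler. The sketch below is an illustration under assumptions: mapping "4 pods/min" to a `Pods` policy with `value: 4` and `periodSeconds: 60` is a guess at the author's configuration, and `scale_up_periods` is a hypothetical helper that models only the policy's rate limit, ignoring metrics, pod startup time, and node capacity.

```python
import math

# Assumed mapping of the case study's numbers onto autoscaling/v2 fields.
SCALE_UP_BEHAVIOR = {
    "scaleUp": {
        "stabilizationWindowSeconds": 0,  # act on a scale-up recommendation immediately
        "policies": [
            {"type": "Pods", "value": 4, "periodSeconds": 60},  # at most 4 new pods per minute
        ],
    }
}


def scale_up_periods(current: int, desired: int, pods_per_period: int = 4) -> int:
    """Number of policy periods in which pods must be added to reach `desired`.

    Hypothetical helper: it only applies the per-period cap of the policy above.
    """
    if desired <= current:
        return 0
    return math.ceil((desired - current) / pods_per_period)


if __name__ == "__main__":
    # Growing from the new baseline of 10 pods to the former baseline of 26.
    print(scale_up_periods(10, 26))
```

With these assumed values, growing from 10 pods to 26 requires adding pods in at least four consecutive policy periods.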
Matching jobs
Data Engineer with Kasada
Salary: USD 0 to USD 412.61K a year
Location: based in the office in Sydney, AU
Tech stack: Kubernetes, AWS, Docker, Java, Python, Scala, SQL, Kafka, Airflow, Pulumi
Data Engineer with SoFi
Salary: $54K to $286K a year
Location: based in the office in San Francisco, CA, USA
Tech stack: Kubernetes, Docker, SQL, Python, Snowflake, Terraform, Datadog
DevOps Engineer with Accesa & RaRo
Salary: $115.96K to $255.42K a year
Location: remote from
Tech stack: Kubernetes, AWS, Azure, GCP, OpenShift, Docker, Terraform
DevOps Engineer with Egen
Salary: $49.5K to $539K a year
Location: remote from
Tech stack: Kubernetes, GCP, Helm, Docker, Shell, PostgreSQL, MySQL, Terraform, Azure DevOps, Jenkins
DevOps Engineer with HavocAI
Salary: $49.5K to $539K a year
Location: fully remote
Tech stack: Kubernetes, AWS, Docker, Go, Python, Terraform
Build something
Juan Carlos Gonzalez Cabrero
This tutorial teaches how to build intelligent load balancing on GCP for Python microservices using Google Kubernetes Engine with Network Endpoint Groups, readiness probes, autoscaling, and observability to reduce failures.
Naman Raj
This tutorial teaches how to build a production-like service mesh lab using Istio with a 3-tier application (Next.js, Go, Flask) on a local Kind cluster.
Anton Pechenin
This tutorial shows you how to extend Argo Workflows using Executor Plugins by building HTTP servers that handle lightweight tasks in reusable agent pods instead of spinning up separate pods for each step.
Suraj Bhattarai
This tutorial teaches how to implement multi-tenancy on Google Kubernetes Engine using namespaces for isolation, RBAC for access control, and resource quotas for capacity management.
Call for Papers closing soon
Closes in 3 days
Location: Amsterdam, NL
In-person conference organized by CNCF.
The conference starts on 23 March 2026.
Closes in 3 days
Cloud Native AI + Kubeflow Day Europe
Location: Amsterdam, NL
In-person conference organized by CNCF.
The conference starts on 23 March 2026.
Closes in 3 days
This is a virtual event.
Online conference organized by Conf42.
The conference starts on 23 April 2026.
Closes in 3 days
Location: Iași, RO
In-person conference organized by DevDays Conf.
The conference starts on 23 September 2026.
Closes in 5 days
Kubernetes Community Days New York 2026
Location: New York, NY, USA
In-person conference organized by KCD New York.
The conference starts on 10 June 2026.
Closes in 6 days
Location: Amsterdam, NL
In-person conference organized by CNCF.
The conference starts on 26 March 2026.
Closes in 7 days
Location: New York, NY, USA
In-person conference organized by DeveloperWeek New York.
The conference starts on 10 June 2026.
More articles
Jitin Kayyala
This article explains service mesh patterns for managing microservice communication, covering how sidecars like Envoy handle retries, circuit breakers, timeouts, and load balancing transparently.
Vasily Pilitsyn
This article shows how KRO manages ephemeral test environments as single Kubernetes API objects by orchestrating resource deployment in dependency order, with readiness conditions and a unified status across namespace, frontend, backend, and database.
David L. Armstrong
This case study debugs 30+ second GitHub ARC workflow pod startup delays caused by mixing nodeName with OpenEBS allowedTopologies constraints, resolving it by enabling ACTIONS_RUNNER_USE_KUBE_SCHEDULER to use nodeAffinity instead.
Ajay Edupuganti
This article explains why Linecraft AI migrated from manual Windows IIS deployments to Kubernetes after facing multi-week release cycles, inconsistent environments from local server changes, and extended downtimes.