Spotlight
Michael Preston
This article explains the challenges of running Java at scale on Kubernetes, covering JVM memory management with container limits, heap sizing with MaxRAMPercentage, CPU throttling, and garbage collector selection for containers.
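As a rough illustration of the heap-sizing arithmetic behind MaxRAMPercentage: when the JVM's container support is active, the maximum heap is derived as a percentage of the container memory limit. The sketch below only performs that calculation; the function name and the 2 GiB / 75% figures are illustrative assumptions, not values from the article.

```python
def max_heap_mib(container_limit_mib: float, max_ram_percentage: float) -> float:
    """Approximate max heap (MiB) the JVM derives from -XX:MaxRAMPercentage."""
    if not 0 < max_ram_percentage <= 100:
        raise ValueError("max_ram_percentage must be in (0, 100]")
    return container_limit_mib * max_ram_percentage / 100.0


# Hypothetical pod: 2 GiB memory limit, -XX:MaxRAMPercentage=75.0
heap = max_heap_mib(2048, 75.0)
# Whatever is not heap must still cover metaspace, thread stacks, direct buffers, etc.
non_heap = 2048 - heap
print(f"heap={heap:.0f} MiB, non-heap headroom={non_heap:.0f} MiB")
```

The point of the calculation is the second number: a percentage set too high leaves little room for non-heap memory inside the same container limit.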
Fabián Sellés Rosa
This case study shows how upgrading to Kubernetes 1.34 caused KIAM pods to fail due to service account token expiration changes, revealing that the long-lived tokens used by legacy clients now expire after 24 hours instead of 90 days.
Yaron Yadid
This article describes building an Image Preload Operator that reduces Kubernetes pod startup times from minutes to seconds by intelligently preloading container images using a single DaemonSet with CRI-agnostic image pulling.
Beza
This article explains how Linkerd's destination service works as the central routing and policy authority, using event-driven architecture with Kubernetes Informers to provide service discovery, policy distribution, and Layer 7 configuration to proxies.
Tools and utilities
InfraLens is a zero-instrumentation observability tool that uses eBPF to automatically discover and visualize service-to-service communication in Kubernetes clusters without requiring code changes or sidecars.
KubeAttention is a machine learning-powered Kubernetes scheduler plugin that uses eBPF telemetry to detect noisy neighbor interference and place latency-sensitive workloads on optimal nodes.
K8up is a Kubernetes Operator that helps you:
OpenEBS is a modern Block-Mode storage platform, a Hyper-Converged Software Storage System, and a virtual NVMe-oF SAN (vSAN) Fabric that is natively integrated into Kubernetes' core.
AgentDiscover Scanner detects autonomous AI agents and Shadow AI in codebases using static analysis for Python and JavaScript, network monitoring for active LLM traffic, and Kubernetes runtime detection via Cilium Tetragon eBPF.
Events starting soon
March 28, 2026
Location: Hyderabad, IN
This is a free event.
March 28, 2026
Location: Pune, IN
This is a free event.
March 28, 2026
This is a virtual event.
This event requires an entrance fee.
March 28, 2026
Location: Cali, CO and virtual
This is a free event.
March 31, 2026
This is a virtual event.
This is a free event.
March 31, 2026
Location: Umeå, SE
This is a free event.
Running GPU workloads on Kubernetes sounds straightforward until you need to isolate multiple tenants on the same server. The moment you virtualize GPUs for security, you lose access to NVIDIA kernel drivers — and almost every tool in the ecosystem assumes those drivers exist.
Landon Clipp built a GPU-based Containers as a Service platform from scratch, solving each isolation layer — from kernel separation with Kata Containers + QEMU to NVLink fabric partitioning to network policies with Cilium/eBPF — and shares exactly what broke along the way.
In this interview:
Where Containers as a Service fits best: inference workloads where AI teams want to ship an OCI image without managing infrastructure or signing multi-million-dollar cluster contracts.
Learn from production
Rob Sherling
This case study shows how EMC Healthcare built an on-premise CI/CD pipeline using K3s, ArgoCD, and Argo Workflows to automate testing and deployments with preview environments.
Sphoorthi Charan Nayakudugari
This case study explains how the authors used dynamic MIG partitioning to split large GPUs like the NVIDIA A100 and H100 into multiple isolated slices, letting many small jobs share a GPU efficiently.
Farid Guluzade
This case study shows how reducing JVM MaxRAMPercentage, cutting the Hikari connection pool from 50 to 20, and implementing aggressive HPA scale-up (0s stabilization, 4 pods/min) doubled traffic capacity while cutting baseline pods from 26 to 10.
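The aggressive scale-up described above maps onto the `behavior` section of an `autoscaling/v2` HorizontalPodAutoscaler. The sketch below builds that fragment as a Python dict and estimates the minimum time the rate limit allows for a scale-up. The helper `minutes_to_reach` and the 10-to-26 example are illustrative assumptions; the case study's actual manifest is not reproduced here.

```python
import json

# Hypothetical fragment of an autoscaling/v2 HPA spec:
# no stabilization delay on scale-up, at most 4 pods added per 60 seconds.
hpa_behavior = {
    "behavior": {
        "scaleUp": {
            "stabilizationWindowSeconds": 0,
            "policies": [
                {"type": "Pods", "value": 4, "periodSeconds": 60},
            ],
        }
    }
}


def minutes_to_reach(current: int, target: int, pods_per_minute: int = 4) -> int:
    """Fewest one-minute periods needed to grow from current to target pods."""
    if target <= current:
        return 0
    # Ceiling division without importing math.
    return -(-(target - current) // pods_per_minute)


print(json.dumps(hpa_behavior, indent=2))
print(minutes_to_reach(10, 26))  # growing from 10 to 26 pods under this policy
```

This is a lower bound only: it ignores pod startup time, node provisioning, and the autoscaler's own sync interval.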
Mateen Ali Anjum
This case study describes rebuilding a fragile Kubernetes infrastructure into a production-grade platform for GPU-based ML workloads, improving deployment frequency from weekly to 10+ times daily.
Matching jobs
Data Engineer with ILLUIN Technology
Salary: $90K to $412.61K a year
Location: based in the office in Paris La Défense, FR
Tech stack: Kubernetes, On-premise, Docker, SQL, Java, Kotlin, Python, Scala, Snowflake, Terraform
Data Engineer with XBOW
Salary: $72K to $188.18K a year
Location: remote from
Tech stack: Kubernetes, C#, Go, Java, JavaScript, Python, Shell, SQL, TypeScript, PostgreSQL
DevOps Engineer with Cosine
Salary: US$49.5K to US$275K a year
Location: based in the office in London, GB
Tech stack: Kubernetes, AWS, On-premise, Helm, Docker, Go, JavaScript, Python, TypeScript, Redis
DevOps Engineer with Meritis
Salary: $99.9K to $275K a year
Location: based in the office in Sophia Antipolis, FR
Tech stack: Kubernetes, AWS, Azure, GCP, Docker, Python, Terraform, GitHub Actions, Jenkins, Ansible
DevOps Engineer with Septeo
Salary: $99.9K to $275K a year
Location: based in the office in Montpellier, FR
Tech stack: Kubernetes, AWS, Docker, Terraform, Ansible
Build something
Tobby Kuo
This tutorial teaches how to build an end-to-end real-time baggage tracking system using Kafka for event streaming, Flink for state processing, ClickHouse for analytics, and Grafana for visualization on Kubernetes.
Shanaka Jayasundera
This tutorial shows how to expose Kubernetes Gateway API from AKS through Azure Application Gateway by fixing health probe failures with a dedicated HTTPRoute and connection timeouts using externalTrafficPolicy Local for Azure DSR.
Ahmad Asmar
This tutorial shows how to use Kyverno policy engine to generate Pod Disruption Budgets for Kubernetes deployments with multiple replicas, preventing downtime during Karpenter node consolidation through intelligent API lookups and label matching.
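Kyverno expresses this kind of rule declaratively, so the following is not a Kyverno policy. It is a minimal Python sketch of the decision the tutorial describes: emit a `policy/v1` PodDisruptionBudget only for Deployments with more than one replica, reusing the Deployment's selector labels. The function name, the `-pdb` name suffix, and the `maxUnavailable: 1` default are assumptions made for illustration.

```python
from typing import Optional


def pdb_for(deployment: dict, max_unavailable: int = 1) -> Optional[dict]:
    """Return a PodDisruptionBudget manifest, or None for single-replica Deployments."""
    spec = deployment.get("spec", {})
    # Kubernetes defaults spec.replicas to 1 when it is omitted.
    if spec.get("replicas", 1) < 2:
        return None
    meta = deployment["metadata"]
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {
            "name": f"{meta['name']}-pdb",  # hypothetical naming convention
            "namespace": meta.get("namespace", "default"),
        },
        "spec": {
            "maxUnavailable": max_unavailable,
            # Match the same pods the Deployment selects.
            "selector": {"matchLabels": dict(spec["selector"]["matchLabels"])},
        },
    }


# Hypothetical Deployment used only to exercise the function.
web = {
    "metadata": {"name": "web", "namespace": "shop"},
    "spec": {"replicas": 3, "selector": {"matchLabels": {"app": "web"}}},
}
print(pdb_for(web))
```

Skipping single-replica Deployments matters because a budget that allows zero disruptions on one pod would block node drains entirely.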
Jean Baptiste Lapeyre
This article explains how to integrate Arista datacenter fabric with Cilium CNI by building a spine-leaf architecture with an ISIS underlay and a BGP EVPN overlay.
Call for Papers closing soon
Closes in 2 days
Location: Amsterdam, NL
In-person conference organized by Devopsdays.
The conference starts on 19 June 2026.
Closes in 2 days
KubeCon + CloudNativeCon Japan 2026
Location: Yokohama, JP
In-person conference organized by Linux Foundation.
The conference starts on 30 July 2026.
Closes in 4 days
WeAreDevelopers World Congress 2026 North America
Location: San Jose, CA, USA
In-person conference organized by WeAreDevelopers.
The conference starts on 25 September 2026.
Closes in 4 days
Location: Malaga, ES and virtual
Online and in-person conference organized by Yay Yay Events.
The conference starts on 29 October 2026.
Closes in 4 days
Location: Malmö, SE
In-person conference organized by Øredev.
The conference starts on 4 November 2026.
Closes in 4 days
Cloud Native Summit Munich 2026
Location: Munich, DE
In-person conference organized by Cloud Native Summit Munich.
The conference starts on 30 June 2026.
Closes in 4 days
Kubernetes Community Days Czech & Slovak - Prague 2026
Location: Bratislava, SK
In-person conference organized by KCD Czech & Slovak.
The conference starts on 21 May 2026.
More articles
Marcin Cuber
This article explains why NATS headless services return NXDOMAIN errors on EKS and how Kubernetes headless service DNS resolution works with StatefulSets.
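For readers unfamiliar with the naming scheme involved: each pod of a StatefulSet gets a stable DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`, where the service is the headless Service named in the StatefulSet's `serviceName`. The sketch below just generates those names. The `nats`, `nats-headless`, and `messaging` values are hypothetical, and `cluster.local` is assumed as the cluster domain.

```python
from typing import List


def statefulset_pod_dns(
    statefulset: str,
    service: str,
    namespace: str,
    replicas: int,
    cluster_domain: str = "cluster.local",
) -> List[str]:
    """Per-pod DNS names for a StatefulSet governed by a headless Service."""
    return [
        # StatefulSet pods are named <statefulset>-<ordinal>, starting at 0.
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


# Hypothetical names, used only for illustration.
for name in statefulset_pod_dns("nats", "nats-headless", "messaging", 3):
    print(name)
```

A lookup for any of these names can only succeed if the Service name in the query matches the StatefulSet's governing Service, which makes a mismatch there a natural first thing to check when diagnosing NXDOMAIN.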
Ar Hakboian
This article describes an experiment using three autonomous AI agents to conduct multi-agent SRE incident investigations in a sandboxed Kubernetes environment with real tooling access.
Shivee Gupta
This article explains how Dream11 built an in-house observability platform using SigNoz, ClickHouse, and OpenTelemetry to handle millions of metrics and traces across thousands of EC2 instances, saving millions in commercial tooling costs.
Jitin Kayyala
This article explains service mesh patterns for managing microservice communication, covering how sidecars like Envoy handle retries, circuit breakers, timeouts, and load balancing transparently.
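To make one of these patterns concrete, here is a minimal in-process circuit breaker in Python. It is a generic sketch of the idea, not Envoy's implementation or configuration: after a number of consecutive failures the breaker opens and rejects calls, then lets a single trial call through once a timeout has elapsed. The class name, thresholds, and injectable clock are all assumptions made for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, so this call is allowed as a trial.
        try:
            result = fn()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def always_fails() -> str:
    raise ConnectionError("upstream unavailable")


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=5.0)
for attempt in range(3):
    try:
        breaker.call(always_fails)
    except CircuitOpenError as exc:
        print(f"attempt {attempt}: rejected ({exc})")
    except ConnectionError as exc:
        print(f"attempt {attempt}: failed ({exc})")
```

A sidecar proxy applies the same logic outside the application process, which is why the application code does not need to change.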