Learn Kubernetes Weekly issue 183 · 13 May 2026

Autoscaling Hid Our LLM Cost Regression, Mount Mayhem at Netflix, DocumentDB Automatic Failover, Skew Protection, Kubernetes VM Security Model

This newsletter is brought to you by LearnKube — master Kubernetes with hands-on training designed for engineers who want to learn the smart way.

“What Kubernetes-specific behavior will affect my app when I deploy, update, scale, restart, route, evict, or secure it?”

Gulcan and I prepared a Kubernetes production-readiness checklist to help teams answer that question before going live.

It includes:

  1. An interactive checklist with a detailed breakdown
  2. A downloadable PDF worksheet
  3. A GitHub repository that you can fork and make your own

Check out the Kubernetes production readiness checklist!

— Dan

Articles

  1. Autoscaling Hid Our LLM Cost Regression (85% → 4% Cache Hit Rate)

    medium.com

    This case study shows how a single RAG chunk size change collapsed vLLM prefix-cache hit rate from 85% to 4%, triggering an 80% GPU replica increase while latency stayed flat.

    It also includes the fix: adding a two-phase cache replay gate in CI.

  2. Mount Mayhem at Netflix: Scaling Containers on Modern CPUs

    netflixtechblog.com

    This article explains how Netflix traced severe container launch slowdowns to Linux mount lock contention, image layer mount storms, and CPU architecture differences while scaling containers on modern Kubernetes infrastructure.

  3. DocumentDB on Kubernetes: Resilient, Highly Available Databases with Automatic Failover

    abhishek1987.medium.com

    This article explains how the DocumentDB Kubernetes Operator delivers high availability with automatic failover, replica promotion, and optional zone, region, and multi-cloud resilience.

  4. We brought Skew Protection to your Kubernetes

    blog.platformatic.dev

    This article explains how skew protection on Kubernetes uses version-aware routing through the Gateway API to route traffic by app version and prevent frontend and backend mismatches during deployments.

  5. Keeping Your Security Model Intact When Running VMs in Kubernetes

    medium.com

    This article shows how to maintain VM-level network security during KubeVirt live migration by using Calico labels and policy enforcement rather than node or pod IPs.

  6. Vibe Coding a Kubernetes Media Server: What I Learned About AI-First Engineering

    medium.com

    This article explains how building a k3s media server with Claude Code exposed both the speed and the limits of AI-first engineering across GitOps, observability, storage tuning, and Kubernetes debugging.

Is your app actually ready for Kubernetes?

We prepared a Kubernetes production-readiness checklist to help teams answer that question before going live.

It includes an interactive checklist with a detailed breakdown of each check, plus a downloadable PDF worksheet you can use with your team.

Download the checklist

Tutorials

  1. CloudNativePG: Postgres Database the Modern Way

    medium.com

    This tutorial shows how to run highly available PostgreSQL on Kubernetes with CloudNativePG and Terraform by replacing the traditional Patroni, etcd, and HAProxy stack with a simpler operator-driven setup.

  2. I Added Prometheus, Grafana, and Custom Alerting to My EKS Cluster, Here's How Observability Actually Works

    dev.to

    This tutorial shows how to add Prometheus, Grafana, Alertmanager, custom metrics, ServiceMonitors, dashboards, and alert rules to an EKS cluster through GitOps.

  3. CRaC in Production: 88% Faster Spring Boot Startups on Kubernetes

    medium.com

    This tutorial shows how CRaC can cut Spring Boot startup time on Kubernetes from 23 seconds to 2.8 seconds and explains the real production issues around AWS SDK checkpointing and OpenTelemetry.
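
The headline figure in the CRaC tutorial above can be checked with quick arithmetic. The sketch below uses only the two startup times quoted in the summary (23 seconds and 2.8 seconds); the function name is hypothetical and chosen for illustration.

```python
def startup_reduction(before_s: float, after_s: float) -> float:
    """Return the percentage reduction in startup time."""
    return (before_s - after_s) / before_s * 100


# Startup times quoted in the tutorial summary, in seconds.
before_s, after_s = 23.0, 2.8

reduction = startup_reduction(before_s, after_s)
speedup = before_s / after_s

print(f"{reduction:.1f}% less startup time, {speedup:.1f}x faster")
# prints "87.8% less startup time, 8.2x faster"
```

So "88% faster" here means startup time fell by roughly 88%, which is about an 8x speedup.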

The Namespaces Scaling Trap

Most teams scale Kubernetes by thinking about pods and nodes. At Render, Brian Stack ran into a different dimension: hundreds of thousands of namespaces per cluster, multiplied across DaemonSets that list-watch every namespace.

Brian explains how Render traced the issue through Calico and Vector, worked with upstream maintainers, and turned memory profiling into operational wins: lower node costs, lighter API-server load, and faster rollouts.

In this interview:

  • Why namespaces can become a hidden scaling bottleneck
  • How DaemonSets multiply memory and control-plane pressure
  • How profiling, staging clusters, and upstream collaboration freed 7 TiB
  • Why pushing from an 80% fix to a complete fix can make teams faster

Kubernetes jobs

    • Software Architect with IntelliDyne Jobs for Veterans

    • Salary: $150K to $180K a year

    • Location: based in the office in Washington, DC, USA

    • Tech stack: Kubernetes, On-Prem, Azure Government, AWS GovCloud, AWS, Azure, Docker, OpenShift, Audit Logging

    • System Administrator with Mattel Inc

    • Salary: $58.5K to $4.4L a year

    • Location: based in the office in Hyderabad, IN

    • Tech stack: Kubernetes, AWS, Azure, OpenShift, alerting, monitoring, logging, Red Hat Insights, BigFix

    • Support Engineer with Mirantis

    • Salary: $45K to $176K a year

    • Location: remote from

    • Tech stack: Kubernetes, Docker, OpenStack, On-premise, Kibana, alert management, Grafana, Nagios, Prometheus

    • Support Engineer with Mirantis

    • Salary: $45K to $176K a year

    • Location: remote from

    • Tech stack: Kubernetes, k0s, Mirantis Kubernetes Engine, Docker, OpenStack, On-premise, Mirantis OpenStack/k0s, MOSK, alert management

    • Platform Engineer with Mattel Inc

    • Salary: $1.25L to $3.74L a year

    • Location: based in the office in Hyderabad, IN

    • Tech stack: Kubernetes, Google Cloud, AWS, Azure, OpenShift, Dynatrace, alerting, monitoring, logging

Discover more Kubernetes jobs on Kube Careers →

Code & tools

  1. ayaFlow

    github.com/DavidHavoc

    ayaFlow is an eBPF-based Rust tool that runs as a sidecarless DaemonSet to capture node-wide network traffic, expose metrics, and provide lightweight kernel-level visibility for troubleshooting and observability.

  2. Teleskopio

    github.com/teleskopio

    Teleskopio is a small, open-source Kubernetes web client that provides a clean browser interface for viewing and managing cluster resources without the weight of a full platform dashboard.

  3. Valkey cluster operator

    github.com/valkey-io

    Valkey Operator is a Kubernetes operator that automates deployment and lifecycle management of Valkey clusters and instances with features like automated installation and configuration management.

  4. Crossview: Crossplane UI

    github.com/corpobit

    Crossview is a React-based dashboard for managing and monitoring Crossplane resources in Kubernetes with features like:

    • resource visualization,
    • search capabilities,
    • SSO support,
    • and deployment via Helm or Kubernetes manifests.
  5. Kubeinvaders

    github.com/lucky-sideburn

    With k-inv, you can stress a Kubernetes cluster in a fun way and check its resilience by playing Space Invaders.

Subscribe to Learn Kubernetes Weekly

Trusted by 77K engineers. Delivered 183 issues and counting.

Upcoming Kubernetes events

  1. 18 May

    Cloud Native Days Italy 2026

    In-person conference organized by CND Italy.

    • Location: Bologna, IT

    • This event requires an entrance fee

  2. 18 May

    Advanced Kubernetes course (London)

    In-person workshop organized by LearnKube.

    • Location: London, UK

    • This event requires an entrance fee

  3. 18 May

    Advanced Kubernetes course (Boston)

    In-person workshop organized by LearnKube.

    • Location: Boston, MA, USA

    • This event requires an entrance fee

  4. 13 May

    Kubernetes Community Days Toronto Canada 2026

    In-person conference organized by KCD Toronto.

    • Location: Toronto, CA

    • This event requires an entrance fee

      • Use KCDTO-2026-KUBEEVENTS to get 20% off

  5. 15 May

    Kubernetes Community Days Texas 2026

    In-person conference organized by KCD Texas.

    • Location: Austin, TX, USA

    • This event requires an entrance fee

Discover more Kubernetes events on Kube Events →

Thanks to our sponsors who make Kube Today possible

  • LearnKube
  • Akamai
  • Fairwinds
  • Densify
Find out more about being a sponsor →

Kubernetes call for papers

  1. 6 days left

    Kubernetes Community Days Lima 2026

    The call for papers is open until 19 May 2026 (UTC). More info →

    • Location: Lima, PE

    • In-person conference organized by KCD Lima, Perú.

    • The conference starts on 18 July 2026.

    • Apply here
  2. 19 days left

    Cloud Native Days Norway

    The call for papers is open until 1 June 2026 (UTC). More info →

    • Location: Bergen, NO

    • In-person conference organized by CND Norway.

    • The conference starts on 27 October 2026.

    • Apply here
  3. 19 days left

    KubeCon + CloudNativeCon North America 2026

    The call for papers is open until 1 June 2026 (UTC). More info →

    • Location: Los Angeles, CA, USA

    • In-person conference organized by Linux Foundation.

    • The conference starts on 26 October 2026.

    • Apply here
  4. 40 days left

    Dutch Cloud Native Day

    The call for papers is open until 22 June 2026 (UTC). More info →

    • Location: Utrecht, NL

    • In-person conference organized by Dutch CND.

    • The conference starts on 29 October 2026.

    • Apply here
  5. 22 days left

    Devopsdays Feira de Santana

    The call for papers is open until 4 June 2026 (UTC). More info →

    • Location: Feira de Santana, BR

    • In-person conference organized by Devopsdays.

    • The conference starts on 6 June 2026.

    • Apply here
  6. 22 days left

    Devopsdays Curitiba

    The call for papers is open until 4 June 2026 (UTC). More info →

    • Location: Curitiba, BR

    • In-person conference organized by Devopsdays.

    • The conference starts on 22 August 2026.

    • Apply here
  7. 19 days left

    Heapcon 2026

    The call for papers is open until 1 June 2026 (UTC). More info →

    • Location: Belgrade, RS

    • In-person conference organized by heapspace.

    • The conference starts on 6 November 2026.

    • Apply here
  8. 4 days left

    TechEx North America

    The call for papers is open until 17 May 2026 (UTC). More info →

    • Location: San Jose, CA, USA

    • In-person conference organized by TechEx Events.

    • The conference starts on 19 May 2026.

    • Apply here
  9. 18 days left

    DevOpsDays Istanbul 2026

    The call for papers is open until 31 May 2026 (UTC). More info →

    • Location: Istanbul, TR

    • In-person conference organized by DevOps Turkey.

    • The conference starts on 29 September 2026.

    • Apply here

Thanks for reading.

See you next week!

— Gulcan
