<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title>Learn Kubernetes Weekly</title><link href="https://kube.today"/><link rel="self" href="https://kube.today/learn-kubernetes-weekly.xml"/><updated>2026-05-20T09:00:00Z</updated><rights>Learn Kubernetes Weekly</rights><id>https://kube.today</id><entry><title>⎈ Hunting a 4GB Native Memory Leak, Ingress NGINX Surprises Before You Migrate, ctx_ for DevOps, Migrating to Istio, PostgreSQL on Kubernetes</title><link href="https://kube.today/issues/184"/><id>https://kube.today/issues/184</id><published>2026-05-20T09:00:00Z</published><content type="html"><![CDATA[<p>This newsletter is brought to you by <a href="https://ku.bz/CvpvW-SG2" title="">WeAreDevelopers World Congress — The World’s Largest Event for Developers, AI Builders & Tech Leaders</a></p><hr /><p>Hi!</p><p><strong>⭐️ We've got a discount for WeAreDevelopers World Congress.</strong></p><p>This is the world's largest developer conference taking place <strong>8-10 July 2026 · Berlin, Germany.</strong></p><p>Covering AI, Cloud Native, DevOps, Frontend, and Security, with workshops and a hackathon.</p><p><a href="https://ku.bz/CvpvW-SG2" title="">Use code <strong>LEARNKUBE10</strong> for 10% off your ticket.</a></p><hr /><h2>📚 Articles</h2><p><strong><a href="https://ku.bz/0KH7ncbBR" title="">🔥 Three Weeks in the Trenches: Hunting a 4GB Native Memory Leak That .NET Couldn’t See</a></strong></p><p>This case study shows how a team traced <strong>repeated</strong> pod <strong>OOM kills</strong> in <strong>ASP.NET Core</strong> to native memory growth from zombie <strong>SignalR</strong> connections, glibc fragmentation, and kernel socket buffers.</p><p><strong><a href="https://ku.bz/KXFB4qzH6" title="">Before You Migrate: Five Surprising Ingress-NGINX Behaviors You Need to Know</a></strong></p><p>This article explains five <strong>Ingress-NGINX 
behaviors</strong> that can <strong>break migrations</strong>, including path-matching differences, regex quirks, rewrite behavior, and annotation <strong>mismatches</strong> when migrating to another ingress solution.</p><p><strong><a href="https://ku.bz/-BG8_C5W2" title="">Why I built ctx_: the context switcher that actually gets DevOps work</a></strong></p><p>This article introduces <strong>ctx_</strong>, a CLI tool that <strong>switches</strong> an entire DevOps working context at once, including Kubernetes context, cloud <strong>credentials</strong>, environment variables, VPN, SSH tunnels, secrets, and browser profile.</p><p><strong><a href="https://ku.bz/NB-YWHBJS" title="">Migrating Ingress NGINX Controller to Istio in Kubernetes environment</a></strong></p><p>This article covers an <strong>ingress-nginx to Istio migration</strong>, architectural decisions, the RE2 vs PCRE <strong>regex incompatibility</strong> gotcha, URL rewrite differences, resource <strong>overhead comparison</strong> between sidecar and ambient mode, and a <strong>phased</strong> migration strategy.</p><p><strong><a href="https://ku.bz/zFbF6fRQg" title="">Running PostgreSQL on Kubernetes: Operators, Storage and Production Guide</a></strong></p><p>This article covers running <strong>PostgreSQL</strong> on Kubernetes in production — comparing <strong>Zalando</strong>, Crunchy and CloudNativePG operators, <strong>storage class decisions</strong>, backup strategies, <strong>connection pooling</strong>, and a take on <strong>when</strong> Kubernetes is overkill for databases.</p><p><strong><a href="https://ku.bz/1qJT8SG1s" title="">Building Secure GitOps Pipelines: Integrating External Secrets Operator with ArgoCD on EKS</a></strong></p><p>This tutorial shows how to secure an <strong>ArgoCD</strong> based EKS <strong>GitOps</strong> workflow with External Secrets Operator, 
<strong>IRSA</strong>, and AWS SSM Parameter Store so secrets stay out of Git and <strong>sync</strong> safely into Kubernetes.</p><hr /><p><strong>🌟 <a href="https://ku.bz/CvpvW-SG2" title="">The World’s Largest Event for Developers, AI Builders & Tech Leaders</a></strong></p><p><strong>15,000 developers. 500+ speakers. One place.</strong></p><p>Werner Vogels, Thomas Dohmke, Garry Kasparov and more on stage. Three days of talks, workshops, and live coding in Berlin.</p><p>→ <a href="https://ku.bz/CvpvW-SG2" title=""><strong>Get your ticket</strong></a></p><p><img src="https://assets.learnk8s.io/wearedevelopers-ad-v3.png" alt="The World’s Largest Event for Developers, AI Builders & Tech Leaders" title="" /></p><hr /><h2>📖 Tutorials</h2><p><strong><a href="https://ku.bz/hwQDK693G" title="">Mastering Crossview Deployment: Securing Your Crossplane Dashboard in an Enterprise Kubernetes Environment</a></strong></p><p>This tutorial teaches how to <strong>deploy</strong> <strong>Crossview</strong> on Kubernetes with Helm and <strong>secure</strong> it for enterprise use with session <strong>auth</strong>, SSO, proxy header auth, RBAC, TLS, and <strong>high-availability</strong> settings.</p><p><strong><a href="https://ku.bz/PZjTtq9v8" title="">Handling Leaked Secrets and Credentials in Version Control Repositories</a></strong></p><p>This tutorial explains how to <strong>prevent</strong>, detect, and clean up <strong>leaked secrets</strong> in <strong>Git</strong> repositories using .env files, Kubernetes Secrets, Gitleaks, GitGuardian, and git-filter-repo.</p><p><strong><a href="https://ku.bz/_XTl5KG56" title="">Running Production Minded Kubernetes on a Raspberry Pi</a></strong></p><p>This tutorial shows how to run a small, <strong>security-focused</strong> k3s cluster on a <strong>Raspberry Pi</strong> inside a normal home network with tight hardware and networking limits.</p><p><strong><a href="https://ku.bz/z-30r6w-V" title="">The 
Complete OpenSSL & TLS Debugging Guide: From Root CA to Kubernetes</a></strong></p><p>This tutorial explains <strong>TLS</strong> and certificate debugging from <strong>root CA</strong> basics to Kubernetes secrets, with <strong>OpenSSL</strong> and curl commands for <strong>inspecting</strong> certs, validating <strong>handshakes</strong>, and fixing common production errors.</p><hr /><h2>📺 This week on the KubeFM podcast</h2><p><strong><a href="https://ku.bz/DdmVC2_7v" title="">The Hidden Cost of Slow Autoscaling</a></strong></p><hr /><h2>💼 Kubernetes jobs</h2><p><strong><a href="https://ku.bz/hJCDWh1rc" title="">System Administrator</a></strong> 💰 US$153K to US$246.4K a year · 🏢 based in the office in Villars-sur-Glâne, FR, CH</p><p><strong><a href="https://ku.bz/M9jj9mjrf" title="">Software Engineer</a></strong> 💰 $117K to $275K a year · 🏢 based in the office in Paris, FR</p><p><strong><a href="https://ku.bz/WlZzTQRX_" title="">Software Engineer</a></strong> 💰 $47.97K to $264K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/m7flm3x0Q" title="">Product Owner</a></strong> 💰 $100K a year · 🏢 based in the office (and remote from home) in Toronto, CA</p><p><strong><a href="https://ku.bz/vwT19HMsK" title="">Platform Engineer</a></strong> 💰 $47.97K to $266.2K a year · 🏢 based in the office in Barcelona, ES</p><p>👉 Discover more opportunities on <a href="https://kube.careers" title="">Kube Careers.</a></p><hr /><h2>🛠 Tools and libraries</h2><p><strong><a href="https://ku.bz/R4x18D0Fb" title="">🔥 Crust-Gather – kubectl Cluster Snapshot Plugin</a></strong></p><p>Crust-Gather is a <strong>kubectl plugin for collecting Kubernetes cluster state</strong> and exposing it through an API server.</p><p><strong><a href="https://ku.bz/4tVrVM3pd" title="">🔥 KubeDiagrams</a></strong></p><p><strong>KubeDiagrams is a tool that automatically generates 
visual architecture diagrams from Kubernetes manifests, Helm charts, and live clusters.</strong></p><p>It supports 47+ resource types, customizable clustering by namespace and labels, and can handle custom resources.</p><p><strong><a href="https://ku.bz/CpbrdbBG0" title="">k10s</a></strong></p><p>k10s is a <strong>terminal dashboard</strong> for <strong>watching</strong> multiple Kubernetes clusters at once, with side-by-side views, health signals, warnings, and recent <strong>logs</strong> in one screen.</p><p><strong><a href="https://ku.bz/2B_b2k4F4" title="">eks-up</a></strong></p><p>eksup analyzes your <strong>EKS cluster</strong> and generates a step-by-step <strong>upgrade playbook</strong>, <strong>flagging</strong> deprecated APIs, add-on version mismatches, and node group issues before you <strong>upgrade</strong>.</p><p><strong><a href="https://ku.bz/CRfmCj5PC" title="">H8s (Homernetes)</a></strong></p><p>H8s is a home <strong>infrastructure project</strong> combining Kubernetes with <strong>Talos OS security</strong>, running on <strong>2 N100</strong> mini PCs with <strong>GitOps</strong> deployment via ArgoCD.</p><h3>More projects</h3><ul><li><p><a href="https://ku.bz/3YdGlDTkZ" title="">Clabernetes: Containerlab in Kubernetes</a></p></li><li><p><a href="https://ku.bz/8Y52rJ74q" title="">Node Healthcheck Operator</a></p></li></ul><hr /><h2>📅 Upcoming Kubernetes events</h2><p><strong><a href="https://ku.bz/wWjxl5Nh7" title="">🌟 Kubernetes Community Days Czech & Slovak - Prague 2026</a></strong> 📅 May 21</p><p><strong><a href="https://ku.bz/Jq_M2V-rx" title="">Devopsdays Geneva</a></strong> 📅 May 21</p><p><strong><a href="https://ku.bz/zpRr562mj" title="">🌟 Codemotion Madrid</a></strong> 📅 May 21</p><p><strong><a href="https://ku.bz/JKnCMVdhj" title="">🔥 Cloud Native Days Amsterdam</a></strong> 📅 May 22</p><p><strong><a href="https://ku.bz/L_l-qH6Jw" title="">Observability 
Summit North America</a></strong> 📅 May 22</p><p><strong><a href="https://ku.bz/y_BhFMftb" title="">🔥 Advanced Kubernetes course</a></strong> 📅 Jun 11</p><p>👉 You can find more events on <a href="https://kube.events" title="">Kube Events.</a></p><hr /><h2>📢 Call for papers closing soon</h2><p><strong><a href="https://ku.bz/pxrqd9zHV" title="">🔥 Cloud Native Days Norway</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/zJnQvbW4F" title="">🔥 KubeCon + CloudNativeCon North America 2026</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/JTbTchKw4" title="">🔥 Dutch Cloud Native Day</a></strong> ⏳ <em>closes Jun 22</em></p><p><strong><a href="https://ku.bz/DyqM3gbzC" title="">🔥 Kubernetes Community Days San Francisco Bay Area 2026</a></strong> ⏳ <em>closes Jun 15</em></p><p><strong><a href="https://ku.bz/PlXC3yxS1" title="">🔥 Kubernetes Community Days São Paulo 2026</a></strong> ⏳ <em>closes Jul 6</em></p><p><strong><a href="https://ku.bz/zzmJzrSpb" title="">🔥 ContainerDays & AI Context Singapore</a></strong> ⏳ <em>closes Jul 31</em></p><p><strong><a href="https://ku.bz/HLKtWcqg2" title="">🔥 Open Source Summit Europe 2026</a></strong> ⏳ <em>closes Jun 25</em></p><p><strong><a href="https://ku.bz/6wqgzsV2Y" title="">🔥 Experts Live Emirates 2026</a></strong> ⏳ <em>closes Jun 10</em></p><p><strong><a href="https://ku.bz/K6Sw1VXvb" title="">🔥 CloudBrew 2026</a></strong> ⏳ <em>closes Jun 7</em></p><p>👉 You can find more Call for Papers on <a href="https://kube.events/call-for-papers" title="">Kube Events.</a></p><hr /><p>That's all for this week!</p><p>See you next week.</p><p><em>— Gulcan</em></p>]]></content><author><name>Kube Today</name></author></entry><entry><title>⎈ Autoscaling Hid Our LLM Cost Regression, Mount Mayhem at Netflix, DocumentDB Automatic Failover, Skew Protection, Kubernetes VM Security Model</title><link 
href="https://kube.today/issues/183"/><id>https://kube.today/issues/183</id><published>2026-05-13T09:00:00Z</published><content type="html"><![CDATA[<p>This newsletter is brought to you by <a href="https://ku.bz/hypSbyc-V" title="">LearnKube — master Kubernetes with hands-on training designed for engineers who want to learn the smart way.</a></p><hr /><p><strong>“What Kubernetes-specific behavior will affect my app when I deploy, update, scale, restart, route, evict, or secure it?”</strong></p><p>Gulcan and I prepared a Kubernetes <a href="https://ku.bz/7py0zX-ct" title="">production-readiness checklist</a> to help teams answer that question before going live.</p><p>It includes:</p><ol><li><p>An interactive checklist with a detailed breakdown</p></li><li><p>A downloadable PDF worksheet</p></li><li><p>A GitHub repository that you can fork and make yours.</p></li></ol><p><a href="https://ku.bz/7py0zX-ct" title=""><strong>Check out the Kubernetes production readiness checklist!</strong></a></p><p><em>— Dan</em></p><hr /><h2>📚 Articles</h2><p><strong><a href="https://ku.bz/7hB5K3Wkn" title="">🔥 Autoscaling Hid Our LLM Cost Regression (85% → 4% Cache Hit Rate)</a></strong></p><p>This case study shows how a single <strong>RAG chunk size change</strong> collapsed vLLM prefix-cache <strong>hit rate</strong> from <strong>85% to 4%</strong>, triggering an 80% GPU replica increase while latency stayed flat.</p><p>It also includes the fix: adding a two-phase <strong>cache replay gate</strong> in <strong>CI</strong>.</p><p><strong><a href="https://ku.bz/v1kX9xWXz" title="">🔥 Mount mayhem at netflix: scaling containers on modern cpus</a></strong></p><p>This article explains how Netflix <strong>traced</strong> severe <strong>container launch slowdowns</strong> to Linux mount lock contention, image layer mount storms, and CPU <strong>architecture differences</strong> while scaling containers on modern Kubernetes 
infrastructure.</p><p><strong><a href="https://ku.bz/vczYVnhZ4" title="">DocumentDB on Kubernetes: Resilient, Highly Available Databases with Automatic Failover</a></strong></p><p>This article explains how the DocumentDB Kubernetes Operator delivers <strong>high availability</strong> with automatic <strong>failover</strong>, replica promotion, and optional zone, region, and multi-cloud <strong>resilience</strong>.</p><p><strong><a href="https://ku.bz/LMpclL3PW" title="">We brought Skew Protection to your Kubernetes</a></strong></p><p>This article explains how Kubernetes <strong>skew protection</strong> routes traffic based on <strong>app version</strong> to prevent frontend and backend mismatches during deployments, using <strong>version-aware routing</strong> built on the Gateway API.</p><p><strong><a href="https://ku.bz/mggD2nXf6" title="">Keeping Your Security Model Intact When Running VMs in Kubernetes</a></strong></p><p>This article shows how to maintain <strong>VM-level network security</strong> during <strong>KubeVirt</strong> live <strong>migration</strong> by using <strong>Calico</strong> labels and policy enforcement rather than node or pod IPs.</p><p><strong><a href="https://ku.bz/94Y_G5wtb" title="">Vibe Coding a Kubernetes Media Server: What I Learned About AI-First Engineering</a></strong></p><p>This article explains how <strong>building</strong> a <strong>k3s media server</strong> with Claude Code exposed both the speed and the limits of AI-first engineering across GitOps, observability, storage tuning, and Kubernetes debugging.</p><hr /><p><strong>🌟 <a href="https://ku.bz/7py0zX-ct" title="">Is your app actually ready for Kubernetes?</a></strong></p><p>Kubernetes <strong>production-readiness</strong> checklist to help teams answer that question <strong>before going live</strong>.</p><p>It includes an <strong>interactive checklist</strong> with a 
detailed breakdown of each check, plus a downloadable <strong>PDF</strong> worksheet you can use with your team.</p><p>→ <a href="https://ku.bz/7py0zX-ct" title=""><strong>Download the checklist</strong></a></p><p><img src="https://assets.learnk8s.io/kubernetes-prod-best-practices-checklist-v3.png" alt="Is your app actually ready for Kubernetes?" title="" /></p><hr /><h2>📖 Tutorials</h2><p><strong><a href="https://ku.bz/_k3x_Z2-t" title="">🔥 CloudnativePG: postgres database the modern way</a></strong></p><p>This tutorial shows how to <strong>run</strong> highly available <strong>PostgreSQL</strong> on Kubernetes with CloudNativePG and Terraform by <strong>replacing</strong> the traditional <strong>Patroni</strong>, etcd, and HAProxy stack with a simpler operator-driven setup.</p><p><strong><a href="https://ku.bz/3WfLvwcv0" title="">I Added Prometheus, Grafana, and Custom Alerting to My EKS Cluster, Here's How Observability Actually Works</a></strong></p><p>This tutorial shows how to add Prometheus, Grafana, <strong>Alertmanager</strong>, custom metrics, <strong>ServiceMonitors</strong>, dashboards, and alert rules to an <strong>EKS</strong> cluster through <strong>GitOps</strong>.</p><p><strong><a href="https://ku.bz/4l_WDB_6R" title="">CRaC in Production: 88% Faster Spring Boot Startups on Kubernetes</a></strong></p><p>This tutorial shows how <strong>CRaC</strong> can cut Spring Boot <strong>startup time</strong> on Kubernetes from <strong>23 seconds to 2.8 seconds</strong> and explains the real production issues around AWS SDK checkpointing and OpenTelemetry.</p><hr /><h2>📺 This week on the KubeFM podcast</h2><p><strong><a href="https://ku.bz/0mrvCsXrV" title="">The Namespaces Scaling Trap</a></strong></p><hr /><h2>💼 Kubernetes jobs</h2><p><strong><a href="https://ku.bz/9xNmf1y3N" title="">Software Architect</a></strong> 💰 $150K to $180K a year · 🏢 based in the office in Washington, DC, 
USA</p><p><strong><a href="https://ku.bz/z6HTmszvW" title="">System Administrator</a></strong> 💰 $58.5K to $4.4L a year · 🏢 based in the office in Hyderabad, IN</p><p><strong><a href="https://ku.bz/ZW858R0GM" title="">Support Engineer</a></strong> 💰 $45K to $176K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/gJ-VFDhr0" title="">Support Engineer</a></strong> 💰 $45K to $176K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/XXMBj-7BF" title="">Platform Engineer</a></strong> 💰 $1.25L to $3.74L a year · 🏢 based in the office in Hyderabad, IN</p><p>👉 Discover more opportunities on <a href="https://kube.careers" title="">Kube Careers.</a></p><hr /><h2>🛠 Tools and libraries</h2><p><strong><a href="https://ku.bz/m08ygstP6" title="">🔥 ayaFlow</a></strong></p><p>ayaFlow is an <strong>eBPF-based</strong> Rust tool that runs as a <strong>sidecarless DaemonSet</strong> to capture node-wide network traffic, <strong>expose</strong> metrics, and provide lightweight kernel-level <strong>visibility</strong> for troubleshooting and observability.</p><p><strong><a href="https://ku.bz/vKvsg-kwn" title="">Teleskopio</a></strong></p><p>Teleskopio is a small, open-source Kubernetes web client that provides a clean browser interface for viewing and managing cluster resources without the weight of a full platform dashboard.</p><p><strong><a href="https://ku.bz/M2q9_T15T" title="">Valkey cluster operator</a></strong></p><p>Valkey Operator is a Kubernetes operator that <strong>automates deployment and lifecycle management</strong> of Valkey clusters and instances with features like automated installation and configuration management.</p><p><strong><a href="https://ku.bz/0PvW1jHdj" title="">Crossview: Crossplane UI</a></strong></p><p>Crossview is a <strong>React-based dashboard</strong> for managing and monitoring Crossplane resources in Kubernetes with features 
like:</p><ul><li><p>resource visualization,</p></li><li><p>search capabilities,</p></li><li><p><strong>SSO</strong> support,</p></li><li><p>and deployment via Helm or Kubernetes manifests.</p></li></ul><p><strong><a href="https://ku.bz/chMMB0vF_" title="">🔥 Kubeinvaders</a></strong></p><p>With k-inv, you can <strong>stress</strong> a Kubernetes cluster in a fun way and check its <strong>resilience</strong> by playing <strong>space invaders.</strong></p><h3>More projects</h3><ul><li><p><a href="https://ku.bz/0KwJPmTj3" title="">Ingress NGINX Migration</a></p></li><li><p><a href="https://ku.bz/MzcHBsY_d" title="">kubevirt-benchmark</a></p></li><li><p><a href="https://ku.bz/9rDdrr363" title="">OpenChoreo</a></p></li></ul><hr /><h2>📅 Upcoming Kubernetes events</h2><p><strong><a href="https://ku.bz/Nd8J2WTtV" title="">🔥 Cloud Native Days Italy 2026</a></strong> 📅 May 18</p><p><strong><a href="https://ku.bz/dgWg1PxSn" title="">🔥 Advanced Kubernetes course (London)</a></strong> 📅 May 18</p><p><strong><a href="https://ku.bz/cDhS27dhz" title="">🔥 Advanced Kubernetes course (Boston)</a></strong> 📅 May 18</p><p><strong><a href="https://ku.bz/QYm0G6RXN" title="">🌟 Kubernetes Community Days Toronto Canada 2026</a></strong> 📅 May 13</p><p><strong><a href="https://ku.bz/D9vm6YY0F" title="">🌟 Kubernetes Community Days Texas 2026</a></strong> 📅 May 15</p><p>👉 You can find more events on <a href="https://kube.events" title="">Kube Events.</a></p><hr /><h2>📢 Call for papers closing soon</h2><p><strong><a href="https://ku.bz/LNsV_WGtk" title="">🔥 Kubernetes Community Days Lima 2026</a></strong> ⏳ <em>closes May 19</em></p><p><strong><a href="https://ku.bz/pxrqd9zHV" title="">🔥 Cloud Native Days Norway</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/zJnQvbW4F" title="">🔥 KubeCon + CloudNativeCon North America 2026</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/JTbTchKw4" title="">🔥 Dutch Cloud Native 
Day</a></strong> ⏳ <em>closes Jun 22</em></p><p><strong><a href="https://ku.bz/2qtpBDcyJ" title="">Devopsdays Feira de Santana</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/k84xzzhxj" title="">Devopsdays Curitiba</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/LwdX0YkNf" title="">🌟 Heapcon 2026</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/vR6Z290h2" title="">TechEx North America</a></strong> ⏳ <em>closes May 17</em></p><p><strong><a href="https://ku.bz/0QcvjfWc-" title="">DevOpsDays Istanbul 2026</a></strong> ⏳ <em>closes May 31</em></p><p>👉 You can find more Call for Papers on <a href="https://kube.events/call-for-papers" title="">Kube Events.</a></p><hr /><p>Thanks for reading.</p><p>See you next week!</p><p><em>— Gulcan</em></p>]]></content><author><name>Kube Today</name></author></entry><entry><title>⎈ Self-Healing Registry Mirror, Migrating to Fly.io, Kubeshark Packet Visibility, Temporal.io in Production, Tracking Kubernetes Costs</title><link href="https://kube.today/issues/182"/><id>https://kube.today/issues/182</id><published>2026-05-06T09:00:00Z</published><content type="html"><![CDATA[<p>This newsletter is brought to you by <a href="https://ku.bz/hypSbyc-V" title="">LearnKube — master Kubernetes with hands-on training designed for engineers who want to learn the smart way.</a></p><hr /><p>Hi,</p><p>We published a new page for companies <a href="https://learnkube.com/for-marketers" title=""><strong>interested in working with us at LearnKube</strong></a>!</p><p>We want to keep creating <strong>ambitious technical education for Kubernetes and platform engineering teams,</strong> and already have ideas we’d like to develop around AI infrastructure, Kubernetes resource optimization, platform engineering, and general Kubernetes education.</p><p>If your company wants to support these efforts and reach Kubernetes practitioners with useful technical content, get in 
touch!</p><p><em>— Dan</em></p><hr /><h2>📚 Articles</h2><p><strong><a href="https://ku.bz/R-8sWZ7NS" title="">We built a self-healing registry mirror (because Docker hub rate limits are no fun)</a></strong></p><p>This article shows how to build a self-healing <strong>registry mirror</strong> on <strong>GKE</strong> with zot and automation that copies remote images locally and <strong>rewrites</strong> deployments to avoid Docker Hub rate limits and ImagePullBackOff failures.</p><p><strong><a href="https://ku.bz/YVgVVrTqQ" title="">Our Kubernetes Cluster Was Costing $14,850/Month. We Moved to Fly.io for $680.</a></strong></p><p>This is a war story about a <strong>3-person startup</strong> that replaced a <strong>$14,850/month</strong> over-engineered Kubernetes setup on <strong>AWS</strong> with <strong>Fly.io</strong> for $680, cutting <strong>P99</strong> latency from 320ms to 180ms and <strong>deploy time</strong> from 8 minutes to 45 seconds.</p><p><strong><a href="https://ku.bz/Sg1y678cP" title="">Kubeshark: Making Packet Level Visibility in Kubernetes</a></strong></p><p>This article explains how Kubeshark provides <strong>packet-level visibility</strong> in Kubernetes by capturing live pod traffic, decoding <strong>protocols</strong> such as HTTP and gRPC, and <strong>mapping</strong> requests back to workloads for debugging.</p><p><strong><a href="https://ku.bz/9fV6WBMLP" title="">Running Temporal.io on Kubernetes in Production — What Nobody Tells You</a></strong></p><p>This article explains how to run <strong>Temporal</strong> on Kubernetes in production, covering <strong>GKE</strong> deployment, Cassandra <strong>repair and backups</strong>, Istio mTLS, resource sizing, <strong>PodDisruptionBudgets</strong>, and Prometheus-based monitoring.</p><p><strong><a 
href="https://ku.bz/_JW351wS0" title="">What 6 Months of Tracking a Production OpenShift Cluster Revealed About Kubernetes Costs</a></strong></p><p>This article explains what six months of production OpenShift <strong>cost tracking</strong> revealed, including a <strong>24 to 30 percent</strong> non-allocatable <strong>CPU tax</strong> and how infrastructure overhead can consume most cluster capacity before app workloads even start.</p><p><strong><a href="https://ku.bz/lyr0QGf1f" title="">Orchestrating Secure AI Agents on Amazon EKS</a></strong></p><p>This case study shows how Unitary built Osmia, an open-source <strong>orchestration layer</strong> on <strong>EKS</strong> to run autonomous AI <strong>coding agents</strong> safely at scale using pod isolation, Karpenter, <strong>IRSA</strong>-based secrets, and real-time trajectory scoring.</p><hr /><p><strong>🌟 <a href="https://learnkube.com/for-marketers" title="">Some LearnKube projects are too large to make alone</a></strong></p><p><strong>We want to keep creating ambitious technical education for Kubernetes and platform engineering teams.</strong></p><p>If your company wants to partner on creating useful content and reach Kubernetes engineers who value technical depth, get in touch.</p><p>→ <a href="https://learnkube.com/for-marketers" title=""><strong>Learn more</strong></a></p><p><img src="https://res.cloudinary.com/learnk8s/image/upload/v1777884834/for-marketers_jcjnef.png" alt="Some LearnKube projects are too large to make alone" title="" /></p><hr /><h2>📖 Tutorials</h2><p><strong><a href="https://ku.bz/rMGbd9tnz" title="">LLMs on Kubernetes: The Easy Way</a></strong></p><p>This tutorial shows how to run an <strong>open source LLM</strong> on OpenShift with <strong>Red Hat AI Inference Server</strong> based on vLLM, using a PVC, GPU-backed deployment, OpenAI-compatible endpoint, model switching, and an optional <strong>AnythingLLM</strong> UI.</p><p><strong><a 
href="https://ku.bz/2drG48dk5" title="">Kubernetes Gateway API on EKS Exposed via ALB</a></strong></p><p>This tutorial shows how to set up Kubernetes <strong>Gateway API</strong> on <strong>EKS</strong> using <strong>Istio</strong> Ambient Mesh exposed through AWS ALB, with Terraform, ArgoCD, and a layered architecture separating infra from app deployment.</p><p><strong><a href="https://ku.bz/1tGfK1hSF" title="">Designing an Elastic Kubernetes Platform on VMware vSphere with Cluster API and Cluster Autoscaler</a></strong></p><p>This tutorial teaches how to <strong>build</strong> an elastic Kubernetes platform on VMware <strong>vSphere</strong> using Cluster API, Talos, and <strong>Cluster Autoscaler</strong> for declarative provisioning and automatic node scaling.</p><hr /><h2>📺 This week on the KubeFM podcast</h2><p><strong><a href="https://ku.bz/y70mLvWNs" title="">AI Agents Running Kubernetes</a></strong></p><hr /><h2>💼 Kubernetes jobs</h2><p><strong><a href="https://ku.bz/1VsKMSzYl" title="">Support Engineer</a></strong> 💰 $72K to $224.4K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/_MRlLh05q" title="">Software Engineer</a></strong> 💰 $130K to $280K a year · 🏢 based in the office (and remote from home) in San Mateo, CA, USA</p><p><strong><a href="https://ku.bz/m3RW2s4P7" title="">Platform Engineer</a></strong> 💰 $160K to $200K a year · 🏢 based in the office (and remote from home) in New York, NY, USA</p><p><strong><a href="https://ku.bz/3dHMPCzfT" title="">Test Automation Engineer</a></strong> 💰 $93.1K to $167.7K a year · 🏢 based in the office in Aurora, CO, USA</p><p><strong><a href="https://ku.bz/HBSzmSQ1D" title="">DevOps Engineer</a></strong> 💰 $47.97K to $242K a year · 🌎 remote from</p><p>👉 Discover more opportunities on <a href="https://kube.careers" title="">Kube Careers.</a></p><hr /><h2>🛠 Tools and libraries</h2><p><strong><a href="https://ku.bz/V2B6Gqksv" title="">PII-Shield</a></strong></p><p>PII-Shield 
is a <strong>sidecar</strong> that <strong>sanitizes logs</strong> before they leave the pod by <strong>detecting</strong> secrets and <strong>personal data</strong>, preserving JSON structure, and supporting Helm-based deployment.</p><p><strong><a href="https://ku.bz/_j-Y09TWS" title="">🔥 Kubebuilder</a></strong></p><p>Kubebuilder is a Kubernetes SIGs <strong>framework</strong> for <strong>building CRDs</strong>, controllers, and <strong>admission webhooks</strong> in <strong>Go</strong> with scaffolding, plugins, and controller-runtime based libraries that reduce boilerplate for operator development.</p><p><strong><a href="https://ku.bz/9xF8w3hc9" title="">Kube-Argus</a></strong></p><p>Kube-Argus is a single-binary Kubernetes <strong>dashboard</strong> that combines live cluster state, <strong>log streaming</strong>, YAML editing, drain workflows, <strong>cost analysis</strong>, and AI-assisted <strong>diagnosis</strong> in one web interface.</p><p><strong><a href="https://ku.bz/szMqWK2f3" title="">Kubetest4j</a></strong></p><p>Kubetest4j is a <strong>Java library</strong> for <strong>testing</strong> Kubernetes deployments and operators with Fabric8, JUnit support, resource cleanup, multi-cluster testing, and built-in log and <strong>metrics collection</strong>.</p><p><strong><a href="https://ku.bz/rZnN8ZyWr" title="">Chartpack</a></strong></p><p>Chartpack is an opinionated <strong>Helm chart</strong> that lets you deploy many Kubernetes workload types from <strong>one values file</strong>, with built-in <strong>networking</strong>, autoscaling, observability, secrets, and <strong>GitOps</strong> support.</p><h3>More projects</h3><ul><li><p><a href="https://ku.bz/vDzfSRkST" title="">Tilt</a></p></li><li><p><a href="https://ku.bz/cSX5czD5y" title="">Siclaw</a></p></li><li><p><a href="https://ku.bz/fy2bXhv9X" title="">SOPS Operator: secrets 
management</a></p></li><li><p><a href="https://ku.bz/wjgnKV07S" title="">vRouter Operator: Kubernetes operator for managing VyOS virtual routers</a></p></li><li><p><a href="https://ku.bz/RzQz-MqdK" title="">VictoriaMetrics/log-collectors-benchmark</a></p></li></ul><hr /><h2>📅 Upcoming Kubernetes events</h2><p><strong><a href="https://ku.bz/Pq7VPTk8l" title="">🔥 SREday Austin 2026</a></strong> 📅 May 6</p><p><strong><a href="https://ku.bz/Lzy-bG4D1" title="">Owning the Stack: Why Building a Private Automation Engine was Easier (and Harder) Than I Thought</a></strong> 📅 May 7</p><p><strong><a href="https://ku.bz/FFwWf4GPV" title="">🔥 Confidential Computing with CoCo and Kata</a></strong> 📅 May 7</p><p><strong><a href="https://ku.bz/dqDpJ9vD0" title="">🔥 DevOpsCon London</a></strong> 📅 May 11</p><p><strong><a href="https://ku.bz/QYm0G6RXN" title="">🌟 Kubernetes Community Days Toronto Canada 2026</a></strong> 📅 May 13</p><p><strong><a href="https://ku.bz/y_BhFMftb" title="">🔥 Advanced Kubernetes course</a></strong> 📅 Jun 11</p><p>👉 You can find more events on <a href="https://kube.events" title="">Kube Events.</a></p><hr /><h2>📢 Call for papers closing soon</h2><p><strong><a href="https://ku.bz/LNsV_WGtk" title="">🔥 Kubernetes Community Days Lima 2026</a></strong> ⏳ <em>closes May 19</em></p><p><strong><a href="https://ku.bz/pxrqd9zHV" title="">🔥 Cloud Native Days Norway</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/zJnQvbW4F" title="">🔥 KubeCon + CloudNativeCon North America 2026</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/JTbTchKw4" title="">🔥 Dutch Cloud Native Day</a></strong> ⏳ <em>closes Jun 22</em></p><p><strong><a href="https://ku.bz/2qtpBDcyJ" title="">Devopsdays Feira de Santana</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/k84xzzhxj" title="">Devopsdays Curitiba</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/LwdX0YkNf" title="">🌟 Heapcon 
2026</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/vR6Z290h2" title="">TechEx North America</a></strong> ⏳ <em>closes May 17</em></p><p><strong><a href="https://ku.bz/0QcvjfWc-" title="">DevOpsDays Istanbul 2026</a></strong> ⏳ <em>closes May 31</em></p><p>👉 You can find more Call for Papers on <a href="https://kube.events/call-for-papers" title="">Kube Events.</a></p><hr /><p>Thank you for reading. See you next week!</p><p><em>— Gulcan</em></p>]]></content><author><name>Kube Today</name></author></entry><entry><title>⎈ Benchmarking Log Collectors, ListenerSet in Gateway API v1.5, eBPF GPU Monitoring, Modernizing Image Promoter, Zero-Downtime Disk Migration</title><link href="https://kube.today/issues/181"/><id>https://kube.today/issues/181</id><published>2026-04-29T09:00:00Z</published><content type="html"><![CDATA[<p>This issue is brought to you by <a href="https://ku.bz/_n4B_yTWF" title="">Dash0 — OpenTelemetry-native observability that takes minutes, not months. 
Full visibility into your logs, metrics, and traces with no lock-in and transparent pricing.</a></p><hr /><h2>📚 Articles</h2><p><strong><a href="https://ku.bz/4Lf8MjBYz" title="">🔥 Benchmarking Kubernetes log collectors: vlagent, Vector, Fluent Bit, OpenTelemetry collector, and more</a></strong></p><p>This article <strong>compares</strong> major Kubernetes <strong>log collectors</strong> with a reproducible <strong>benchmark</strong> focused on:</p><ul><li><p>throughput,</p></li><li><p>CPU,</p></li><li><p>memory,</p></li><li><p>and <strong>log loss</strong> under production-like load.</p></li></ul><p><strong><a href="https://ku.bz/ZRwVYYp5Y" title="">🌟 Understanding OpenTelemetry Support in kgateway</a></strong></p><p>This article analyzes how <strong>kgateway</strong> handles <strong>OpenTelemetry observability</strong> across traces, logs, and metrics, covering <strong>signal quality</strong>, semantic conventions, and what works well versus where it falls short for platform teams.</p><p><strong><a href="https://ku.bz/s-5QsVS_T" title="">🔥 Exploring ListenerSet in Gateway API v1.5</a></strong></p><p>This article explains how <strong>ListenerSet</strong> in <strong>Gateway API</strong> v1.5 separates listeners from Gateways so teams can restore self-service <strong>TLS</strong> management across namespaces and scale beyond the old <strong>listener limit</strong>.</p><p><strong><a href="https://ku.bz/dH51_VM47" title="">🔥 X-Ray Vision for GPUs: eBPF Monitoring on Kubernetes</a></strong></p><p>This article explains how to <strong>monitor GPU inference</strong> nodes on Kubernetes with <strong>eBPF</strong> and <strong>bpftrace</strong> by tracing NVIDIA <strong>driver calls</strong>, kernel behavior, and DaemonSet-based deployment patterns.</p><p><strong><a href="https://ku.bz/b87XYdmQY"
title="">The Invisible Rewrite: Modernizing the Kubernetes Image Promoter</a></strong></p><p>This article explains how the Kubernetes <strong>Image Promoter</strong> was rewritten to improve <strong>rate limiting</strong>, observability, and resilience in the pipeline that publishes and signs images for <strong>registry.k8s.io</strong>.</p><p><strong><a href="https://ku.bz/wfZ7x_6ZG" title="">In-place PVC re-binding: zero-downtime disk migration on Kubernetes</a></strong></p><p>This case study explains how to <strong>migrate</strong> bound Kubernetes <strong>volumes</strong> from deprecated in-tree Azure Disk provisioning <strong>to CSI</strong> with <strong>in-place PVC re-binding</strong>, minimal restarts, and no data loss across production disks.</p><p><strong><a href="https://ku.bz/MVjlJVQ99" title="">Kubernetes Optimization Beyond Requests and Limits — Node Scaling Blockers</a></strong></p><p>This article explains why <strong>reducing</strong> requests and limits <strong>does not always</strong> lower Kubernetes cost, and shows how <strong>node scale-down</strong> blockers can keep <strong>autoscalers</strong> from actually removing idle infrastructure.</p><h3>More articles</h3><ul><li><p><a href="https://ku.bz/Ld5N3YcfS" title="">Two Production Incidents That Taught Me More Than Any Course</a></p></li></ul><hr /><p><strong>🌟 <a href="https://ku.bz/wWy4bc2LT" title="">"Supports OpenTelemetry" means nothing anymore.</a></strong></p><p><strong>75% of organizations</strong> run or evaluate <strong>OTel</strong>, yet two projects can both claim support while delivering completely different results.</p><p>A proposed 7-dimension <strong>maturity model</strong> finally gives platform teams a shared language to tell them apart.</p><p><a href="https://ku.bz/wWy4bc2LT" title="">→ <strong>Read the proposal</strong></a></p><p><img src="https://assets.learnk8s.io/dash0-ad.v1.png" alt="&quot;Supports OpenTelemetry&quot; means nothing anymore." 
title="" /></p><hr /><h2>📖 Tutorials</h2><p><strong><a href="https://ku.bz/2WPVqWw1z" title="">🌟 Teach Your AI Coding Agent OpenTelemetry Best Practices with Dash0 Agent Skills</a></strong></p><p>This guide shows how to use <strong>Dash0 Agent Skills</strong> to give AI coding agents like Claude Code, Cursor, and Windsurf proper <strong>OpenTelemetry knowledge</strong>, covering <strong>instrumentation</strong> across 10 languages, <strong>Collector configuration</strong>, semantic conventions, and Kubernetes deployment patterns.</p><p><strong><a href="https://ku.bz/rvSsYfPFd" title="">🔥 From Docker Compose to Kubernetes on AWS: A Hands-On Migration Story</a></strong></p><p>This tutorial walks through <strong>moving</strong> a <strong>five-service Java app</strong> from Docker Compose to <strong>Kubernetes</strong> on <strong>AWS</strong> by rebuilding networking, secrets, ingress, persistence, and <strong>service discovery</strong> step by step.</p><p><strong><a href="https://ku.bz/_jjQb111R" title="">🔥 Connecting Multi-Cloud Applications with Cilium</a></strong></p><p>This tutorial explains how to <strong>connect applications</strong> across <strong>AWS and GCP</strong> Kubernetes clusters with <strong>Cilium Cluster Mesh</strong>, VPN networking, and VXLAN to enable east-west multi-cluster communication.</p><p><strong><a href="https://ku.bz/x-bFkZhmM" title="">Automated GitOps: from ECR push to EKS deploy</a></strong></p><p>This tutorial shows how to <strong>automate</strong> EKS deployments with <strong>Argo</strong> CD, Argo CD Image Updater, GitHub, and Amazon <strong>ECR</strong> so new container images flow to the cluster through GitOps without manual deployment steps.</p><p><strong><a href="https://ku.bz/-_SRhRNQk" title="">🔥 Build a Kubernetes Cluster at Home with Raspberry Pis</a></strong></p><p>This tutorial teaches how to 
<strong>build</strong> a home Kubernetes <strong>cluster</strong> with four <strong>Raspberry Pis</strong>, including <strong>network design</strong>, NAT, <strong>DHCP</strong>, Ubuntu setup, and worker node connectivity on a <strong>private subnet</strong>.</p><hr /><h2>📺 This week on the KubeFM podcast</h2><p><strong><a href="https://ku.bz/TGy4Qn7Qs" title="">SaaS with Kubernetes Operators and Garbage Collection</a></strong></p><hr /><h2>💼 Kubernetes jobs</h2><p><strong><a href="https://ku.bz/HbB8Tp-LL" title="">Support Engineer</a></strong> 💰 $45K to $176K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/QjGblLJ_k" title="">DevOps Engineer</a></strong> 💰 PLN 17.82K to PLN 521.84K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/qmzvjQ4mH" title="">Software Engineer</a></strong> 💰 $9 to $533.5K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/6vlM63kQW" title="">DevOps Engineer</a></strong> 💰 $120K to $150K a year · 🏢 based in the office (and remote from home) in Toronto, CA</p><p><strong><a href="https://ku.bz/t4hpfLN7y" title="">Platform Engineer</a></strong> 💰 $139K to $220K a year · 🏢 based in the office (and remote from home) in Livingston, NJ, USA</p><p>👉 Discover more opportunities on <a href="https://kube.careers" title="">Kube Careers.</a></p><hr /><h2>🛠 Tools and libraries</h2><p><strong><a href="https://ku.bz/h55vGmVjM" title="">🔥 KubeAttention</a></strong></p><p>KubeAttention is a Kubernetes <strong>scheduler plugin</strong> that uses <strong>eBPF telemetry</strong> and <strong>machine learning</strong> to place <strong>latency-sensitive</strong> pods on nodes with lower contention from <strong>noisy neighbors</strong>.</p><p><strong><a href="https://ku.bz/0m2jdQzWx" title="">K8s cleaner</a></strong></p><p>K8s cleaner is a <strong>controller</strong> that identifies, removes, or updates <strong>stale/orphaned</strong> or unhealthy resources in a Kubernetes 
cluster.</p><p><strong><a href="https://ku.bz/gmKPq8tVs" title="">YAML Schema Router</a></strong></p><p>YAML Schema Router is a <strong>proxy</strong> for <strong>yaml-language-server</strong> that <strong>detects</strong> YAML file types from content and path, then <strong>injects</strong> the right <strong>JSON schema</strong> for better <strong>validation</strong> in editors like Neovim, Helix, and Emacs.</p><p><strong><a href="https://ku.bz/Myt3WxhGT" title="">Sympozium</a></strong></p><p>Sympozium <strong>runs AI agents</strong> as <strong>isolated</strong> pods with CRDs, Jobs, RBAC, and network policies, so teams can <strong>orchestrate</strong> agent workflows and let agents diagnose or <strong>remediate</strong> cluster issues safely.</p><p><strong><a href="https://ku.bz/CC0Gj_hC2" title="">Kelos</a></strong></p><p>Kelos runs <strong>autonomous coding agents</strong> as Kubernetes resources, with tasks, workspaces, reusable agent configs, and trigger-based <strong>task spawners</strong> for continuous software workflows.</p><h3>More projects</h3><ul><li><p><a href="https://ku.bz/T-j7BM-H3" title="">Skiperator: Kubernetes operator for simpler application platform setup</a></p></li><li><p><a href="https://ku.bz/JC2kbCg1X" title="">Audicia</a></p></li><li><p><a href="https://ku.bz/qnbH0j751" title="">KubeUser</a></p></li><li><p><a href="https://ku.bz/vXR4B9QVz" title="">Omni Infrastructure Provider for Proxmox</a></p></li><li><p><a href="https://ku.bz/yr03sXHnv" title="">Telescope</a></p></li></ul><hr /><h2>📅 Upcoming Kubernetes events</h2><p><strong><a href="https://ku.bz/tvwXDWnhs" title="">🌟 Devopsdays Raleigh</a></strong> 📅 Apr 30</p><p><strong><a href="https://ku.bz/3-sLg2MN6" title="">🔥 How We Tamed Inference with Kubernetes and Open Source Muscle, Kubernetes Network Policies Done Right</a></strong> 📅 May 2</p><p><strong><a href="https://ku.bz/XzFlK3BvY" title="">Devopsdays 
Austin</a></strong> 📅 May 5</p><p><strong><a href="https://ku.bz/PPwWMM2nX" title="">Devopsdays Zurich</a></strong> 📅 May 6</p><p><strong><a href="https://ku.bz/Pq7VPTk8l" title="">🔥 SREday Austin 2026</a></strong> 📅 May 6</p><p><strong><a href="https://ku.bz/y_BhFMftb" title="">🔥 Advanced Kubernetes course</a></strong> 📅 Jun 11</p><p>👉 You can find more events on <a href="https://kube.events" title="">Kube Events.</a></p><hr /><h2>📢 Call for papers closing soon</h2><p><strong><a href="https://ku.bz/LNsV_WGtk" title="">🔥 Kubernetes Community Days Lima 2026</a></strong> ⏳ <em>closes May 19</em></p><p><strong><a href="https://ku.bz/kRYhSQhMq" title="">🔥 KubeCon China 2026</a></strong> ⏳ <em>closes May 3</em></p><p><strong><a href="https://ku.bz/pxrqd9zHV" title="">🔥 Cloud Native Days Norway</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/zJnQvbW4F" title="">🔥 KubeCon + CloudNativeCon North America 2026</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/JTbTchKw4" title="">🔥 Dutch Cloud Native Day</a></strong> ⏳ <em>closes Jun 22</em></p><p><strong><a href="https://ku.bz/2qtpBDcyJ" title="">Devopsdays Feira de Santana</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/XLYQNxkqk" title="">SREday NYC 2026</a></strong> ⏳ <em>closes May 1</em></p><p><strong><a href="https://ku.bz/k84xzzhxj" title="">Devopsdays Curitiba</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/8qs6lvgwN" title="">🌟 Devopsdays Berlin</a></strong> ⏳ <em>closes May 3</em></p><p>👉 You can find more Call for Papers on <a href="https://kube.events/call-for-papers" title="">Kube Events.</a></p><hr /><p>Thanks for reading. 
See you next week!</p><p><em>— Gulcan</em></p>]]></content><author><name>Kube Today</name></author></entry><entry><title>⎈ Distributed LLM Inference Challenges, Model Serving with Ray, Lazy Image Pulling, eBPF Based Bandwidth Limiting, Slurm on Kubernetes</title><link href="https://kube.today/issues/180"/><id>https://kube.today/issues/180</id><published>2026-04-22T09:00:00Z</published><content type="html"><![CDATA[<p>This newsletter is brought to you by <a href="https://ku.bz/sjN4qdbrL" title="">Portworx. Automate, protect, and unify data for modern applications across on-premises, public, and hybrid cloud environments.</a></p><hr /><h2>📚 Articles</h2><p><strong><a href="https://ku.bz/tS4gS33XK" title="">🔥 Hidden Infrastructure Challenges in Distributed LLM Inference on Kubernetes</a></strong></p><p>This article explains why distributed <strong>LLM inference</strong> on Kubernetes is <strong>so hard to get right</strong>: your GPUs and network cards need to be <strong>physically</strong> close on the same <strong>PCIe switch</strong>, but Kubernetes pairs them at random and kills your <strong>RDMA performance</strong>.</p><p><strong><a href="https://ku.bz/LmM4Kzv3b" title="">🌟 How Kubernetes Storage Actually Works</a></strong></p><p>This article explains how <strong>storage</strong> works in Kubernetes and covers <strong>PersistentVolumes</strong>, StorageClasses, <strong>CSI drivers</strong>, snapshots, backup and <strong>disaster recovery</strong> for stateful workloads.</p><p><strong><a href="https://ku.bz/ZbC7YhJgl" title="">Simplifying Model Serving with Kubernetes and Ray: Inside DoubleVerify’s ML Platform</a></strong></p><p>This case study shows how DoubleVerify built a Kubernetes and <strong>Ray serving platform</strong> to deploy and scale <strong>ML models</strong> in production.</p><p>It also covers RayService <strong>wrapped</strong> with Helm, <strong>fault tolerance</strong> 
with external Redis, and platform gains like <strong>30% lower GPU cost</strong>.</p><p><strong><a href="https://ku.bz/9HP0mcNrr" title="">🔥 Lazy-pulling container images: a deep dive into OCI seekability</a></strong></p><p>This article covers:</p><ul><li><p>why OCI container layers resist random access due to <strong>DEFLATE dependency chains</strong>,</p></li><li><p>benchmarks of eStargz, SOCI, Nydus, and cloud-managed <strong>lazy-pulling approaches</strong>,</p></li><li><p>and how FUSE-based lazy pulling <strong>shifts cost</strong> from pull to runtime.</p></li></ul><p><strong><a href="https://ku.bz/KlSSnd0gm" title="">🔥 Building eBPF-Based Bandwidth Limiting in AWS Network Policy Agent — Why Vibe Coding Isn’t Enough</a></strong></p><p>This article walks you through <strong>building</strong> EDT-based <strong>eBPF bandwidth limiting</strong> in the AWS Network Policy Agent, showing where AI-generated code silently broke and how <strong>domain knowledge</strong> caught each bug.</p><p><strong><a href="https://ku.bz/TMyTCcWG0" title="">Slurm on Kubernetes (SUNK): Modernizing HPC and AI workload management</a></strong></p><p>This article explains how Slurm on Kubernetes combines <strong>Slurm job scheduling</strong> with Kubernetes orchestration so AI and HPC teams can modernize <strong>GPU-heavy</strong> infrastructure without forcing researchers into raw Kubernetes workflows.</p><hr /><p><strong>🌟 <a href="https://ku.bz/sjN4qdbrL" title="">The Voice of Kubernetes Report 2026</a></strong></p><p>Where is Kubernetes <strong>headed</strong> in 2026?</p><p>519 infrastructure teams share what <strong>workloads</strong> they're running, where <strong>backup</strong> and <strong>DR</strong> are still the biggest gap, and what the <strong>next 5 years</strong> look like.</p><p>→ <a href="https://ku.bz/sjN4qdbrL" title=""><strong>Download the report</strong></a></p><p><img 
src="https://assets.learnk8s.io/k8s-experts-report.v1.png" alt="The Voice of Kubernetes Report 2026" title="" /></p><hr /><h2>📖 Tutorials</h2><p><strong><a href="https://ku.bz/XwMBZc1tL" title="">🌟 [Webinar] Virtualization Reimagined: How to Escape Your Rising VM Costs</a></strong></p><p>This webinar by Portworx covers how Everpure <strong>migrated</strong> <strong>5,000+ VMs</strong> onto Kubernetes using KubeVirt and Portworx to <strong>cut</strong> legacy <strong>virtualization costs</strong> and unify VM and container workloads.</p><p>→ <a href="https://ku.bz/XwMBZc1tL" title=""><strong>Sign up here</strong></a></p><p><strong><a href="https://ku.bz/b9GlYRS88" title="">🔥 Hardware-backed TLS certificates with cert-manager and yubihsm 2</a></strong></p><p>This tutorial teaches how to build a <strong>cert-manager external issuer</strong> that uses a <strong>YubiHSM 2</strong> to sign TLS certificates via Go's crypto.Signer interface.</p><p><strong><a href="https://ku.bz/1SCR2mSFR" title="">Mastering KEDA on GKE: A Deep Dive into Event-Driven Autoscaling</a></strong></p><p>This tutorial explains how to use <strong>KEDA</strong> on <strong>GKE</strong> to autoscale workloads based on <strong>event-driven signals</strong> rather than just CPU or memory.</p><p><strong><a href="https://ku.bz/b-Pr2FHkt" title="">Freezing Spark Drivers to Zero Resources and Waking Them in 300 Milliseconds</a></strong></p><p>This article explains how <strong>Spark Connect</strong>, CRIU, and ZeroPod can <strong>freeze</strong> idle Spark drivers to near-zero resources and <strong>restore</strong> full session state in about <strong>300 milliseconds</strong> on Kubernetes.</p><p><strong><a href="https://ku.bz/qFPc-WPPg" title="">🔥 ing-switch: Migrate from Ingress NGINX to Traefik or Gateway API in Minutes, Not Days</a></strong></p><p>This article introduces 
<strong>ing-switch</strong>, a tool that <strong>scans</strong> Kubernetes ingress resources and helps teams <strong>migrate</strong> from Ingress NGINX to Traefik or Gateway API by <strong>mapping</strong> annotations and showing <strong>compatibility gaps</strong>.</p><hr /><h2>📺 This week on the KubeFM podcast</h2><p><strong><a href="https://ku.bz/czrCCXSLt" title="">What Hip-Hop Can Teach Us About Kubernetes</a></strong></p><hr /><h2>💼 Kubernetes jobs</h2><p><strong><a href="https://ku.bz/cLQGCWCBT" title="">Machine Learning Engineer</a></strong> 💰 $135K to $393.25K a year · 🏢 based in the office in Palo Alto, CA, USA</p><p><strong><a href="https://ku.bz/Q1B4-RhpP" title="">Software Engineer</a></strong> 💰 $23.76K to $125.4K a year · 🏢 based in the office in Lima, PE</p><p><strong><a href="https://ku.bz/0QZgj6RBq" title="">Software Engineer</a></strong> 💰 $126K to $275K a year · 🏢 based in the office in Nantes, FR</p><p><strong><a href="https://ku.bz/FGQxvkrqL" title="">Solution Architect</a></strong> 💰 $84.6K to $346.5K a year · 🌎 remote from</p><p><strong><a href="https://ku.bz/7t6Rd-g85" title="">Network & Container Platform Engineer (M/W)</a></strong> 💰 US$96.3K to US$286K a year · 🏢 based in the office in Zürich, CH</p><p>👉 Discover more opportunities on <a href="https://kube.careers" title="">Kube Careers.</a></p><hr /><h2>🛠 Tools and libraries</h2><p><strong><a href="https://ku.bz/rpQdbmF2g" title="">🔥 RootCause</a></strong></p><p>RootCause is a <strong>local-first MCP server</strong> for Kubernetes that turns <strong>natural language</strong> into <strong>evidence</strong>-backed incident analysis, safe operation <strong>checks</strong>, and ecosystem <strong>diagnostics</strong> for tools like Argo CD, Flux, Cilium, and Helm.</p><p><strong><a href="https://ku.bz/KTFVJj-Tv" title="">🔥 Warden for Identity-Based Access Control for AI Agents and Kubernetes Workloads</a></strong></p><p>Warden is an open 
source runtime <strong>access gateway</strong> that lets <strong>AI agents</strong>, pods, pipelines, and services use <strong>identity-based policies</strong> to reach cloud APIs, databases, and storage without storing long-lived credentials.</p><p><strong><a href="https://ku.bz/4ncKbDJY-" title="">🔥 GreenKube: carbon and cost visibility for Kubernetes</a></strong></p><p>GreenKube is an open source platform that <strong>measures</strong> Kubernetes <strong>workload energy use</strong>, estimates <strong>CO2e</strong> emissions, and gives <strong>optimization</strong> recommendations so teams can <strong>reduce</strong> cloud <strong>cost</strong> and <strong>carbon impact</strong>.</p><p><strong><a href="https://ku.bz/vJhPgwv7P" title="">AIBrix: GenAI inference</a></strong></p><p>AIBrix is a Kubernetes-native <strong>GenAI inference infrastructure toolkit</strong> from the <strong>vLLM project</strong>, with LLM-aware routing, distributed KV cache, LoRA management, and an app-tailored <strong>autoscaler</strong> for vLLM workloads.</p><p><strong><a href="https://ku.bz/93tpTgGF2" title="">Pluto</a></strong></p><p>Pluto <strong>scans</strong> Kubernetes manifests, Helm charts, and live <strong>Helm</strong> releases to find <strong>deprecated</strong> or removed <strong>API versions</strong> before upgrades break workloads.</p><h3>More projects</h3><ul><li><p><a href="https://ku.bz/YqK6mqnD2" title="">Helm unittest</a></p></li><li><p><a href="https://ku.bz/FVN45fLyW" title="">Kloudlite: RemoteLocal Environments</a></p></li><li><p><a href="https://ku.bz/9txYqymYd" title="">make-argocd-fly: Kubernetes manifest generator</a></p></li><li><p><a href="https://ku.bz/T9Y1Vh5Mn" title="">Helm exporter</a></p></li><li><p><a href="https://ku.bz/hZYF4XgL_" title="">Cilium Policy Generator</a></p></li><li><p><a href="https://ku.bz/BPXM_D-v2" title="">X.509 Certificate Exporter</a></p></li><li><p><a 
href="https://ku.bz/ym6b2gcTj" title="">IaC– GitOps-Driven Infrastructure for Homelab</a></p></li><li><p><a href="https://ku.bz/c7FR_grvr" title="">DR-Syncer – CLI & Controller for Kubernetes Disaster Recovery</a></p></li></ul><hr /><h2>📅 Upcoming Kubernetes events</h2><p><strong><a href="https://ku.bz/M7LYxmdxn" title="">🔥 Advanced Kubernetes course</a></strong> 📅 Apr 23</p><p><strong><a href="https://ku.bz/Th7j94SCf" title="">🌟 Cloud Native 2026</a></strong> 📅 Apr 23</p><p><strong><a href="https://ku.bz/mdHcG9D8J" title="">NDC Sydney 2026</a></strong> 📅 Apr 23</p><p><strong><a href="https://ku.bz/RNNvxLVlj" title="">Google Cloud Next</a></strong> 📅 Apr 24</p><p><strong><a href="https://ku.bz/L4nxV_N8N" title="">🌟 Devopsdays Copenhagen</a></strong> 📅 Apr 28</p><p>👉 You can find more events on <a href="https://kube.events" title="">Kube Events.</a></p><hr /><h2>📢 Call for papers closing soon</h2><p><strong><a href="https://ku.bz/LNsV_WGtk" title="">🔥 Kubernetes Community Days Lima 2026</a></strong> ⏳ <em>closes May 19</em></p><p><strong><a href="https://ku.bz/kRYhSQhMq" title="">🔥 KubeCon China 2026</a></strong> ⏳ <em>closes May 3</em></p><p><strong><a href="https://ku.bz/pxrqd9zHV" title="">🔥 Cloud Native Days Norway</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/2qtpBDcyJ" title="">Devopsdays Feira de Santana</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/XLYQNxkqk" title="">SREday NYC 2026</a></strong> ⏳ <em>closes May 1</em></p><p><strong><a href="https://ku.bz/k84xzzhxj" title="">Devopsdays Curitiba</a></strong> ⏳ <em>closes Jun 4</em></p><p><strong><a href="https://ku.bz/8qs6lvgwN" title="">🌟 Devopsdays Berlin</a></strong> ⏳ <em>closes May 3</em></p><p><strong><a href="https://ku.bz/LwdX0YkNf" title="">🌟 Heapcon 2026</a></strong> ⏳ <em>closes Jun 1</em></p><p><strong><a href="https://ku.bz/vR6Z290h2" title="">TechEx North America</a></strong> ⏳ <em>closes May 17</em></p><p>👉 You can find more Call for 
Papers on <a href="https://kube.events/call-for-papers" title="">Kube Events.</a></p><hr /><p>Until next time!</p><p><em>— Gulcan</em></p>]]></content><author><name>Kube Today</name></author></entry></feed>