Spotlight
Pavel Buchnev
This article teaches how to build self-evolving AI systems using Kubernetes, Temporal workflows, and automated deployment pipelines, enabling AI agents to detect errors, fix code, and redeploy services without manual intervention.
Vlad Levinas
This article tests Taracode, a Go-based CLI AI agent, against real K3s homelab tasks like Kubernetes troubleshooting, manifest generation, and GitOps workflows using a local Ollama LLM.
Groww Engineering Team
This case study explains how Groww built an internal chaos engineering platform on Kubernetes to run controlled failure drills like network faults, dependency outages, and traffic replay before real incidents hit production.
Sheng Chen is a Sr. Specialist Solutions Architect at AWS Australia, bringing over 20 years of experience in IT infrastructure, cloud architecture, and multi-cloud networking. In his current role, Sheng helps customers accelerate cloud migrations and infrastructure modernization by leveraging cloud-native technologies. He specializes in Amazon EKS, AWS hybrid cloud services, platform engineering and AI infrastructure.
This tutorial shows how to run production generative AI at the edge by attaching on-prem NVIDIA DGX systems to an Amazon EKS control plane with hybrid nodes, GPU Operator, and NVIDIA NIM.
Tools and utilities
Luxury Yacht is a cross-platform desktop app for managing Kubernetes clusters, available for Linux, macOS, and Windows, built with Go and Wails.
Kroc is an educational Kubernetes Operator built with Go and kubebuilder that watches arbitrary resources and reactively creates derived objects using Go templating.
Kubedock lets you run Docker API based test workloads on Kubernetes without Docker-in-Docker, which makes it useful for Testcontainers, CI pipelines, and ephemeral test environments.
k8s-ingress-gen is a visual diagram builder for Kubernetes resources with a bidirectional YAML workflow.
Hortator lets AI agents spawn sub-agents at runtime, with each agent running in its own pod with budget caps, network policies, PII redaction, and capability inheritance so children can never escalate beyond their parent's permissions.
Events starting soon
July 1, 2026
Location: Amsterdam, NL
This is a free event.
July 1, 2026
Location: Springfield, MI, USA
This is a free event.
July 2, 2026
Location: Mannheim, DE
This event requires an entrance fee.
July 2, 2026
This is a virtual event.
This is a free event.
July 2, 2026
Location: Bunnik, NL
This is a free event.
July 2, 2026
Location: San Francisco, CA, USA
This event requires an entrance fee.
Learn from production
Alexey Demyanov
This case study shows how Palark migrated high-traffic Drupal 8 monoliths to Kubernetes to improve resilience, autoscaling, deployment automation, and DDoS handling while reducing infrastructure waste.
This blog post tells how the Render team:
Jack Lindamood
This case study shows how the OOM killer terminated a critical network daemon on Kubernetes nodes, causing a network outage. It covers debugging via the serial console and adding memory reservations so that system-critical processes are not terminated.
Kalyan Josyula
This case study shows how a team traced repeated pod OOM kills in ASP.NET Core to native memory growth from zombie SignalR connections, glibc fragmentation, and kernel socket buffers.
Matching jobs
DevOps Engineer with Miratech
Salary: $81K to $297K a year
Location: remote from
Tech stack: Kubernetes, AWS, ArgoCD, Flux, Docker, Python, CloudFormation, Terraform, GitHub Actions, Jenkins
Engineering Manager with FIRY
Salary: $259K a year
Location: based in the office (and remote from home) in San Francisco, CA, USA
Tech stack: Kubernetes, AWS, Docker, Go, Java, JavaScript, Python, Ruby
Head of Site Reliability Engineering with FIRY
Salary: $58.5K to $3.29L a year
Location: based in the office (and remote from home) in Bengaluru, IN
Tech stack: Kubernetes, AWS, ArgoCD, Go, Java, Python, GitHub Actions, Datadog, Prometheus, Jaeger
Head of Site Reliability Engineering with Kontakt.io
Salary: $196.2K to $357.5K a year
Location: based in the office in New York, NY, USA
Tech stack: Kubernetes, AWS, Docker, Terraform, Datadog, Grafana, Prometheus
Platform Engineer with Inversion
Salary: $139K to $201K a year
Location: based in the office in Playa Vista, CA, USA
Tech stack: Kubernetes, AWS, GCP, Docker, Python, Shell, Terraform, GitHub Actions, Jenkins, Grafana
Build something
Jeff Vincent
This tutorial shows how to take a multi-service app from local source to a Kubernetes environment with OAuth, TLS, Stripe webhooks, in-cluster CI, and automated deployment using Kindling.
Dylan Da Costa
This tutorial explains how to design CloudNativePG for production failure by using plugin-based backups, WAL archiving, point-in-time recovery, snapshots, and PgBouncer so recovery is treated as the real operational priority.
augusthottie
This tutorial shows how to add Prometheus, Grafana, Alertmanager, custom metrics, ServiceMonitors, dashboards, and alert rules to an EKS cluster through GitOps.
Andrew Pitt
This tutorial shows how to run an open source LLM on OpenShift with Red Hat AI Inference Server based on vLLM, using a PVC, GPU-backed deployment, OpenAI-compatible endpoint, model switching, and an optional AnythingLLM UI.
More articles
Debdut Chakraborty
This article explains:
Ægir Máni Hauksson
This article explains that Kubernetes operators become hard to maintain without explicit component and resource-primitive layers between the controller and raw objects.
Happy Bhati
This article describes how Red Hat's Konflux team built an AI-powered "finally task" for Tekton pipelines that automatically distills 170,000-line failure logs into a 10-line diagnosis.
Daniel Hnyk
This article shows how to run Claude Code as a CronJob using a custom Dockerfile, non-interactive mode flags, jq log filtering, and a timeout-based fallback that spawns a second Claude instance to recover partial results.