Spotlight

LLMs on Kubernetes: The Easy Way

Andrew Pitt

This tutorial shows how to run an open source LLM on OpenShift with Red Hat AI Inference Server (based on vLLM), using a PVC, a GPU-backed deployment, an OpenAI-compatible endpoint, model switching, and an optional AnythingLLM UI.

More articles →

Tools and utilities

  • Radar

    Radar provides Kubernetes cluster visibility through:

  • Siclaw

    Siclaw is an open source AI SRE platform for read-only infrastructure diagnostics, root cause analysis, team workflows, Kubernetes access, and MCP-based investigation without changing live systems directly.

  • GoKubeDownscaler: workload autoscaler

    GoKubeDownscaler is a horizontal autoscaler, written in Go, that automatically scales down Deployments, StatefulSets, and other Kubernetes workloads on time schedules to save costs.

  • VictoriaMetrics/log-collectors-benchmark

    This tool benchmarks Kubernetes log collectors by measuring throughput, CPU, memory, and log loss with a built-in verifier across agents like Vector, Fluent Bit, OpenTelemetry Collector, and Grafana Alloy.

  • Kube-Argus

    Kube-Argus is a single-binary Kubernetes dashboard that combines live cluster state, log streaming, YAML editing, drain workflows, cost analysis, and AI-assisted diagnosis in one web interface.

More projects →

Events starting soon

Discover more events on Kube Events →

SaaS with Kubernetes Operators and Garbage Collection

A single Kubernetes CRD for every service request turns small changes into full-platform reconciliations.

Alexander Held, former platform engineer at Mercedes-Benz Tech Innovation, describes a production refactor from a 2,000-line CRD to purpose-built resources and controllers. He shows how teams can model business workflows as Kubernetes APIs and then use owner references, finalizers, and events to keep platform operations predictable.

You will learn:

  • Why monolithic CRDs create performance and troubleshooting problems
  • How controllers turn database provisioning and backups into reconciliation loops
  • How finalizers clean up external resources such as S3 backups
  • Why Kubernetes events make platform workflows easier to debug
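
As a rough illustration of the finalizer idea in the list above (not taken from the talk), the sketch below models one reconciliation pass in plain Python. Everything in it is hypothetical: `BackupResource` is a stand-in for a custom resource rather than a real Kubernetes client object, the finalizer name `example.com/s3-backup-cleanup` is made up, and a Python set plays the role of the external S3 bucket.

```python
from dataclasses import dataclass, field

# Hypothetical finalizer name, chosen only for this illustration.
FINALIZER = "example.com/s3-backup-cleanup"


@dataclass
class BackupResource:
    """Stand-in for a custom resource; not a real Kubernetes API object."""
    name: str
    finalizers: list = field(default_factory=list)
    # Models metadata.deletionTimestamp being set by the API server.
    deletion_requested: bool = False


def reconcile(resource: BackupResource, external_backups: set) -> str:
    """Run one reconciliation pass and describe what it did."""
    if not resource.deletion_requested:
        # Register the finalizer first, so a later delete cannot
        # complete before the external cleanup has run.
        if FINALIZER not in resource.finalizers:
            resource.finalizers.append(FINALIZER)
            return "finalizer added"
        # Idempotent "ensure the external backup exists" step.
        external_backups.add(resource.name)
        return "backup ensured"

    if FINALIZER in resource.finalizers:
        # Deletion requested: clean up the external resource, then
        # remove the finalizer so the object can actually go away.
        external_backups.discard(resource.name)
        resource.finalizers.remove(FINALIZER)
        return "external backup deleted, finalizer removed"

    return "nothing to do"


if __name__ == "__main__":
    backups = set()
    res = BackupResource("orders-db")
    print(reconcile(res, backups))
    print(reconcile(res, backups))
    res.deletion_requested = True
    print(reconcile(res, backups))
```

In a real controller the same ordering applies: the object stays in a terminating state until the controller removes its finalizer, which is what gives the cleanup step a chance to run.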

Learn from production

More case studies →

Matching jobs

    • Data Engineer with Agile Defense

    • Salary: $54K to $297.88K a year

    • Location: based in the office in Chantilly, VA, USA

    • Tech stack: Kubernetes, Docker, Python, Kafka, Spark, GitHub Actions

    • Data Engineer with Teads

    • Salary: US$45K to US$275K a year

    • Location: based in the office (and remote from home) in Netanya, IL

    • Tech stack: Kubernetes, AWS, GCP, Docker, SQL, Java, Python, Scala, Typescript, Flink

    • DevOps Engineer with zooplus SE

    • Salary: $116.1K to $302.5K a year

    • Location: based in the office (and remote from home) in Krakow, PL; Wroclaw, PL

    • Tech stack: Kubernetes, AWS, Helm, Docker, Python, Rust, Terraform, Jenkins, Ansible, Sensu

    • Engineering Manager with SupplyHouse.com

    • Salary: $65K to $92K a year

    • Location: remote from

    • Tech stack: Kubernetes, Docker, Java, Javascript, Typescript, Redis, MySQL

    • Machine Learning Engineer with Gen Digital Inc.

    • Salary: $45K to $462K a year

    • Location: based in the office in New York, NY, USA

    • Tech stack: Kubernetes, AWS, Docker, Terraform

Discover more Kubernetes jobs on Kube Careers →

Subscribe to Learn Kubernetes Weekly

Trusted by 77K engineers. Delivered 181 issues and counting.


Build something

More tutorials →

Call for Papers closing soon

  1. KubeCon China 2026 (0 days left)

    The Call for Papers is open until 3 May 2026 (GMT-4). More info →
    • Location: Shanghai, CN

    • In-person conference organized by CNCF.

    • The conference starts on 9 September 2026.

    • Apply here
  2. Devopsdays Berlin (0 days left)

    The Call for Papers is open until 3 May 2026 (GMT-4). More info →
    • Location: Berlin, DE

    • In-person conference organized by Devopsdays.

    • The conference starts on 29 September 2026.

    • Apply here
  3. techcamp 2026 (5 days left)

    The Call for Papers is open until 8 May 2026 (GMT-4). More info →
    • Location: Hamburg, DE

    • In-person conference organized by techcamp.

    • The conference starts on 26 August 2026.

    • Apply here
  4. Devopsdays Kraków (7 days left)

    The Call for Papers is open until 10 May 2026 (GMT-4). More info →
    • Location: Kraków, PL

    • In-person conference organized by Devopsdays.

    • The conference starts on 4 July 2026.

    • Apply here
  5. code.talks (12 days left)

    The Call for Papers is open until 15 May 2026 (GMT-4). More info →
    • Location: Hamburg, DE

    • In-person conference organized by code.talks.

    • The conference starts on 5 November 2026.

    • Apply here
  6. Devopsdays Denver (13 days left)

    The Call for Papers is open until 16 May 2026 (GMT-4). More info →
    • Location: Denver, CO, USA

    • In-person conference organized by Devopsdays.

    • The conference starts on 22 September 2026.

    • Apply here
  7. Michigan Technology Conference 2026 (13 days left)

    The Call for Papers is open until 16 May 2026 (GMT-4). More info →
    • Location: Rochester, MI, USA

    • In-person conference organized by The Michigan Technology Conference Association.

    • The conference starts on 30 October 2026.

    • Apply here

Thanks to our sponsors who make Kube Today possible

Find out more about being a sponsor →

More articles

Even more articles →