Spotlight

Why nats-headless service returns NXDOMAIN on EKS: A Practical Guide to Kubernetes Headless Service DNS

Marcin Cuber

This article explains why NATS headless services return NXDOMAIN errors on EKS and how Kubernetes headless service DNS resolution works with StatefulSets.

More articles →

Tools and utilities

  • onechart: a generic Helm chart for your apps

    onechart is a generic Helm chart for your application deployments.

  • k9sight: Kubernetes TUI debugger

    k9sight is a Go TUI for debugging Kubernetes workloads with vim-style navigation, supporting log search, exec, port-forward, scale, restart, and built-in debug helpers for common pod failure states like CrashLoopBackOff and ImagePullBackOff.

  • HolmesGPT

    HolmesGPT is a tool that investigates incidents and provides root cause analysis for alerts and issues from Kubernetes, Prometheus, Jira, GitHub, OpsGenie, and PagerDuty.

  • Heartbeat Operator: declarative probes

    Heartbeat Operator is a Kubernetes Operator for declarative HTTP and TCP health probes defined via CRDs, with native Prometheus metrics and a ready-to-use Grafana dashboard.

  • KubeDiagrams

    KubeDiagrams is a tool that automatically generates visual architecture diagrams from Kubernetes manifests, Helm charts, and live clusters.

More projects →

Events starting soon

Discover more events on Kube Events →

GPU Containers as a Service

Running GPU workloads on Kubernetes sounds straightforward until you need to isolate multiple tenants on the same server. The moment you virtualize GPUs for security, you lose access to NVIDIA kernel drivers — and almost every tool in the ecosystem assumes those drivers exist.

Landon Clipp built a GPU-based Containers as a Service platform from scratch, solving each isolation layer — from kernel separation with Kata Containers + QEMU to NVLink fabric partitioning to network policies with Cilium/eBPF — and shares exactly what broke along the way.

In this interview:

  • Why standard NVIDIA tooling (GPU Operator) fails in multi-tenant setups, and how to use CDI with PCI topology scanning to make GPUs visible to Kubernetes without kernel drivers
  • How to partition the NVLink fabric between tenants using a trusted service VM running Fabric Manager, and why the physical PCIe wiring differs between Supermicro HGX and NVIDIA DGX systems
  • Why gVisor doesn't work for GPU workloads — NVIDIA's unstable ioctl ABI means Google has to update gVisor for every driver release, and they only support a handful of GPUs
  • What caused 8-GPU VMs to take 30+ minutes to boot, and the specific fixes (IOMMUFD, cold plugging, kernel upgrades) that brought it down to minutes
  • How Cilium network policies enforce tenant isolation at the Kubernetes identity level instead of fragile IP-based rules

Where Containers as a Service fits best: inference workloads where AI teams want to ship an OCI image without managing infrastructure or signing multimillion-dollar cluster contracts.

Learn from production

More case studies →

Matching jobs

    • Data Engineer with CoreWeave Europe

    • Salary: US$72K to US$286K a year

    • Location: based in the office (and remote from home) in London, GB

    • Tech stack: Kubernetes, AWS, Docker, Python, SQL, PostgreSQL, Kafka, Airflow, Spark

    • Data Engineer with Veeva Systems

    • Salary: $115K to $175K a year

    • Location: based in the office in Pleasanton, CA, USA

    • Tech stack: Kubernetes, AWS, SQL, Kafka, Spark

    • Data Engineer with Visa

    • Salary: $152.2K to $243.7K a year

    • Location: based in the office (and remote from home) in Austin, TX, USA

    • Tech stack: Kubernetes, AWS, Docker, Java, JavaScript, Python, Scala, Shell, SQL, Redis

    • Data Engineer with Visa

    • Salary: $131.6K to $210.3K a year

    • Location: based in the office (and remote from home) in Austin, TX, USA

    • Tech stack: Kubernetes, AWS, Docker, Java, JavaScript, Python, Scala, Shell, SQL, Redis

Discover more Kubernetes jobs on Kube Careers →

Subscribe to Learn Kubernetes Weekly

Trusted by 77K engineers. Delivered 175 issues and counting.


Build something

More tutorials →

Call for Papers closing soon

  1. 1 day

    Kubernetes Community Days New York 2026

    The Call for Papers is open until 25 March 2026 (GMT-4). More info →
    • Location: New York, NY, USA

    • In-person conference organized by KCD New York.

    • The conference starts on 10 June 2026.

    • Apply here
  2. 2 days

    Data on Kubernetes Day

    The Call for Papers is open until 26 March 2026 (GMT-4). More info →
    • Location: Amsterdam, NL

    • In-person conference organized by CNCF.

    • The conference starts on 26 March 2026.

    • Apply here
  3. 3 days

    DeveloperWeek New York 2026

    The Call for Papers is open until 27 March 2026 (GMT-4). More info →
    • Location: New York, NY, USA

    • In-person conference organized by DeveloperWeek New York.

    • The conference starts on 10 June 2026.

    • Apply here
  4. 5 days

    Devopsdays Amsterdam

    The Call for Papers is open until 29 March 2026 (GMT-4). More info →
    • Location: Amsterdam, NL

    • In-person conference organized by Devopsdays.

    • The conference starts on 19 June 2026.

    • Apply here
  5. 5 days

    KubeCon + CloudNativeCon Japan 2026

    The Call for Papers is open until 29 March 2026 (GMT-4). More info →
    • Location: Yokohama, JP

    • In-person conference organized by Linux Foundation.

    • The conference starts on 30 July 2026.

    • Apply here
  6. 7 days

    WeAreDevelopers World Congress 2026 North America

    The Call for Papers is open until 31 March 2026 (GMT-4). More info →
    • Location: San Jose, CA, USA

    • In-person conference organized by WeAreDevelopers.

    • The conference starts on 25 September 2026.

    • Apply here
  7. 7 days

    J On the Beach

    The Call for Papers is open until 31 March 2026 (GMT-4). More info →
    • Location: Malaga, ES and virtual

    • Online & in-person conference organized by Yay Yay Events.

    • The conference starts on 29 October 2026.

    • Apply here

Thanks to our sponsors who make Kube Today possible

Find out more about being a sponsor →

More articles

Even more articles →