Kubernetes Autoscaling: HPA, VPA, and KEDA Deep Dive

Kubernetes autoscaling: HPA, VPA, KEDA, Cluster Autoscaler, Karpenter. Metrics-driven scaling, cost optimization.

Tags: Kubernetes, autoscaling, HPA, VPA, KEDA, Karpenter, devops

By MinhVo

Introduction

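Kubernetes can scale a workload in three directions: horizontally, by adding pods with the Horizontal Pod Autoscaler (HPA); vertically, by giving each pod more CPU and memory with the Vertical Pod Autoscaler (VPA); and at the cluster level, by adding nodes with Cluster Autoscaler or Karpenter. This article walks through each mechanism, the event-driven scaling that KEDA adds on top of HPA, and the metrics and cost levers that tie them together.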

Three Dimensions

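Horizontal scaling adds pods through HPA and suits stateless workloads that can run as many identical replicas. Vertical scaling adjusts the CPU and memory requests of each pod through VPA and suits workloads that cannot be scaled horizontally. Cluster scaling adds or removes nodes through Cluster Autoscaler or Karpenter, so that the pods created by the other two have somewhere to run. Effective autoscaling combines all three, with one caveat: HPA and VPA should not both act on the same CPU or memory metric for the same workload, because they will work against each other.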

HPA Configuration

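HPA adjusts the number of pod replicas based on CPU, memory, or custom metrics. At each evaluation it computes desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue). Stabilization windows prevent flapping by making the controller consider recent recommendations before it acts (scale-down uses a 300-second window by default), and scaling policies cap how many pods, or what percentage of pods, can be added or removed per period. KEDA extends HPA with event-driven scaling from more than 60 event sources and can scale a workload to zero when there are no events to process.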
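
The replica calculation and the effect of a scale-down stabilization window can be sketched in a few lines of Python. This is a simplified illustration, not the controller's actual code: the function and class names are hypothetical, the 10% tolerance and 300-second window are the documented defaults, and details such as pod readiness, missing metrics, and scaling policies are left out.

```python
import math
from collections import deque


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """HPA formula: ceil(currentReplicas * currentMetric / targetMetric)."""
    ratio = current_metric / target_metric
    # Close enough to the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


class ScaleDownStabilizer:
    """Remembers recent recommendations and applies the highest one.

    This damps scale-down: a brief dip in load does not remove pods
    until the lower recommendation has persisted for the whole window.
    """

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds
        self._history = deque()  # (timestamp, recommendation) pairs

    def stabilize(self, now, recommendation):
        self._history.append((now, recommendation))
        # Drop recommendations that have fallen out of the window.
        while self._history[0][0] < now - self.window_seconds:
            self._history.popleft()
        return max(rec for _, rec in self._history)
```

With 4 replicas averaging 90% CPU against a 60% target, `desired_replicas(4, 90, 60)` returns 6. A reading of 63% stays inside the tolerance band, so the count stays at 4.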

VPA Right-Sizing

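VPA has three components: the Recommender, which computes resource recommendations from observed usage; the Updater, which evicts pods whose requests are out of date; and the Admission Controller, which applies the recommended requests when pods are created. Update modes include Off (recommendations only), Initial (applied only at pod creation), and Auto (applied to running pods as well, typically by evicting and recreating them). Recommendations are based on the 90th percentile of observed usage. The Goldilocks project provides a dashboard for VPA recommendations. VPA is particularly useful for workloads with unpredictable resource needs and for batch jobs.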
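
To make the percentile idea concrete, here is a minimal sketch that derives a CPU request from a list of usage samples. It is an assumption-laden simplification: the real Recommender works on decaying histograms rather than a flat list, and the function names and the 15% safety margin used here are illustrative choices.

```python
def percentile(samples, pct):
    """Nearest-rank percentile (pct is an integer 1..100) of non-empty samples."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n avoids floating-point rounding surprises.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def recommend_request(samples, pct=90, safety_margin=0.15):
    """Recommended request: the chosen percentile plus a safety margin (assumed)."""
    return percentile(samples, pct) * (1 + safety_margin)


# Hypothetical CPU usage samples in millicores.
cpu_millicores = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
p90 = percentile(cpu_millicores, 90)  # 900
```

Sizing to the 90th percentile rather than the peak means occasional spikes are tolerated in exchange for a smaller steady-state request.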

KEDA Event-Driven

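A ScaledObject names the workload to scale and defines the triggers that drive it. Scalers exist for Kafka, RabbitMQ, AWS SQS, Azure Service Bus, Prometheus, PostgreSQL, Redis, and more than 60 other sources. For each ScaledObject, KEDA creates and manages an HPA configured with the event metrics. Scaling to zero is KEDA's distinguishing feature among the tools covered here: a standard HPA keeps at least one replica by default. KEDA also supports scaling Jobs (through ScaledJob) for batch processing.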
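
The scaling decision for a queue-based trigger can be sketched as follows. This is a hedged illustration of the idea, not KEDA's implementation: in a real cluster KEDA handles the step between zero and one replica itself and hands scaling between one and N to the HPA it creates. The function name and parameters are hypothetical.

```python
import math


def queue_based_replicas(queue_length, target_per_replica, min_replicas=0, max_replicas=10):
    """Replica count for a queue trigger, with scale-to-zero when the queue is empty."""
    if queue_length <= 0:
        # No pending events: fall back to the minimum, which may be zero.
        return min_replicas
    # One replica per `target_per_replica` messages, rounded up.
    wanted = math.ceil(queue_length / target_per_replica)
    # At least one replica while work exists, never more than the maximum.
    return max(max(min_replicas, 1), min(wanted, max_replicas))
```

With a target of 5 messages per replica, an empty queue yields 0 replicas, 23 messages yield 5, and 1000 messages are capped at the maximum of 10.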

Cluster Autoscaler and Karpenter

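Cluster Autoscaler adds nodes when pods are pending because no existing node can fit them, and removes nodes that have been underutilized for 10 minutes or more. Karpenter offers faster provisioning (roughly 30-60 seconds versus 3-7 minutes), instance selection matched to the requirements of the pending pods, and consolidation, which replaces underutilized nodes with cheaper ones. It also supports spot instances with interruption handling.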
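
The scale-down rule described above (a node must stay underutilized for a continuous period before it is removed) can be sketched like this. The class is hypothetical and heavily simplified: the 50% utilization threshold and 10-minute timer mirror Cluster Autoscaler's defaults, but the real autoscaler also checks that the node's pods can be rescheduled elsewhere.

```python
from dataclasses import dataclass, field

UTILIZATION_THRESHOLD = 0.5  # below this, a node counts as underutilized
UNNEEDED_SECONDS = 600       # how long it must stay that way (10 minutes)


@dataclass
class NodeTracker:
    """Tracks since when each node has been continuously underutilized."""

    unneeded_since: dict = field(default_factory=dict)

    def observe(self, node, utilization, now):
        """Record one observation; return True if the node may be removed."""
        if utilization >= UTILIZATION_THRESHOLD:
            # The node is busy again, so its timer resets.
            self.unneeded_since.pop(node, None)
            return False
        start = self.unneeded_since.setdefault(node, now)
        return now - start >= UNNEEDED_SECONDS
```

The timer reset is the important part: a node that briefly becomes busy has to be underutilized for a full 10 minutes again before it becomes a removal candidate.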

Metrics and Cost Optimization

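CPU and memory are often poor proxies for load. Better scaling signals include requests per second (RPS), queue depth, latency, and open connections. A Prometheus pipeline, for example Prometheus plus a metrics adapter, exposes these as custom metrics for HPA. On the cost side, the main levers are VPA right-sizing, spot instances via Karpenter (savings of up to around 70%), bin-packing pods onto fewer nodes, and scale-to-zero with KEDA. Monitor scaling events, set alerts for scaling failures, and review scaling policies quarterly.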
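
A quick back-of-the-envelope calculation shows how the spot-instance lever plays out. The function below is a sketch with made-up inputs: the node count and hourly price are hypothetical, and the 70% discount is the upper figure quoted above, not a guaranteed rate.

```python
def monthly_cost(node_count, hourly_on_demand, spot_fraction=0.0, spot_discount=0.7, hours=730):
    """Estimated monthly node cost when a fraction of nodes runs on spot capacity."""
    spot_nodes = node_count * spot_fraction
    on_demand_nodes = node_count - spot_nodes
    spot_price = hourly_on_demand * (1 - spot_discount)
    return hours * (on_demand_nodes * hourly_on_demand + spot_nodes * spot_price)


# Hypothetical fleet: 10 nodes at $0.10 per hour on demand.
baseline = monthly_cost(10, 0.10)
half_spot = monthly_cost(10, 0.10, spot_fraction=0.5)
savings_pct = 100 * (baseline - half_spot) / baseline
```

Under these assumptions, moving half the fleet to spot cuts the bill by about 35%, since only the spot half receives the discount.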

Conclusion

Autoscaling works best when the layers cooperate: HPA or KEDA sets the replica count from metrics that reflect real load, VPA keeps per-pod requests right-sized, and Cluster Autoscaler or Karpenter supplies the nodes. Start with the signal that best tracks demand for each workload, watch how the scaling behaves in production, and revisit the policies as traffic patterns change.