Kubernetes has become the industry standard for container orchestration, but for many developers, it remains an intimidating and opaque technology. The official documentation is comprehensive but dense, and most tutorials assume infrastructure knowledge that application developers may not have. The result is that many developers interact with Kubernetes only through CI/CD pipelines without understanding what happens under the hood—until something breaks and they need to debug it.
This guide bridges that gap. Written specifically for application developers, it covers Kubernetes from the ground up with a focus on practical, hands-on knowledge. You will learn how to containerize an application, deploy it to a Kubernetes cluster, expose it to the outside world, scale it to handle traffic, and debug it when things go wrong. By the end, you will have a working understanding of Kubernetes concepts and the confidence to deploy and manage your own applications.
The key insight about Kubernetes is that it is not about containers—it is about declarative infrastructure. You describe the desired state of your application in YAML files, and Kubernetes continuously works to make reality match that description. If a pod crashes, Kubernetes restarts it. If traffic increases, you tell Kubernetes to run more replicas, and it provisions them. This declarative model is what makes Kubernetes powerful and worth learning.
Core Concepts: Pods, Deployments, and Services
Pods: The Smallest Deployable Unit
A Pod is the fundamental building block of Kubernetes. It wraps one or more containers that share the same network namespace, storage volumes, and lifecycle. In practice, most pods run a single container, but multi-container pods are useful for sidecar patterns like log collection or service mesh proxies.
# A minimal pod definition
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
    version: v1
spec:
  containers:
    - name: my-app
      image: my-app:1.0
      ports:
        - containerPort: 8080
      resources:
        requests:
          cpu: "100m"
          memory: "128Mi"
        limits:
          cpu: "500m"
          memory: "256Mi"
      readinessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 20

Every pod gets its own IP address, but this IP is ephemeral: it changes whenever the pod is rescheduled. This is why you never address pods directly by IP. Instead, you use Services or other discovery mechanisms.
Deployments: Managing ReplicaSets
A Deployment is a higher-level abstraction that manages ReplicaSets, which in turn manage pods. The Deployment ensures that a specified number of pod replicas are running at all times. If a pod crashes, the ReplicaSet creates a replacement. If you update the container image, the Deployment performs a rolling update by default.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:2.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20

The strategy field controls how updates roll out. With maxSurge: 1 and maxUnavailable: 0, the rollout never drops below the desired replica count: Kubernetes creates one new pod, waits for it to pass its readiness probe, and only then terminates an old one.
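To see how maxSurge and maxUnavailable bound a rollout, it can help to simulate one step by step. The sketch below is a simplified model, not the Deployment controller's actual algorithm: it assumes every new pod becomes ready immediately, and the function name `simulateRollingUpdate` is made up for illustration.

```javascript
// Simplified model of a RollingUpdate rollout (illustrative only).
// Assumption: a new pod counts as ready as soon as it is created.
function simulateRollingUpdate({ replicas, maxSurge, maxUnavailable }) {
  if (maxSurge === 0 && maxUnavailable === 0) {
    throw new Error("maxSurge and maxUnavailable cannot both be 0");
  }
  let oldPods = replicas;
  let newPods = 0;
  const steps = [];

  while (oldPods > 0 || newPods < replicas) {
    // Surge: create new pods while the total stays within replicas + maxSurge.
    const canCreate = Math.min(
      replicas - newPods,
      replicas + maxSurge - (oldPods + newPods)
    );
    newPods += Math.max(canCreate, 0);

    // Scale down: remove old pods while the pod count stays at or above
    // replicas - maxUnavailable.
    const minAvailable = replicas - maxUnavailable;
    const canRemove = Math.min(oldPods, oldPods + newPods - minAvailable);
    oldPods -= Math.max(canRemove, 0);

    steps.push({ oldPods, newPods, available: oldPods + newPods });
  }
  return steps;
}

const steps = simulateRollingUpdate({ replicas: 3, maxSurge: 1, maxUnavailable: 0 });
console.log(steps);
```

With the settings from the Deployment above, each step replaces exactly one pod and the number of available pods never falls below 3.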
Services: Stable Network Endpoints
Services provide a stable DNS name and IP address that routes traffic to a set of pods. The Service uses label selectors to find target pods and load-balances traffic across them.
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

There are four Service types:
| Type | Use Case | External Access |
|---|---|---|
| ClusterIP | Internal communication between services | No |
| NodePort | Development/testing, exposes on node IP | Yes (node IP:port) |
| LoadBalancer | Production external access (cloud) | Yes (external IP) |
| ExternalName | Alias for external DNS | N/A |
For production, you typically use ClusterIP Services with an Ingress controller handling external traffic routing.
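From inside the cluster, other applications reach a Service through its DNS name rather than a pod IP. As a rough sketch, a Node.js client could build that address as follows. The helper name `serviceUrl` is hypothetical, and the code assumes the cluster uses the default DNS domain `cluster.local`.

```javascript
// Build the in-cluster URL for a Service.
// Assumption: the cluster DNS domain is the default "cluster.local".
function serviceUrl(
  name,
  { namespace = "default", port = 80, clusterDomain = "cluster.local" } = {}
) {
  // Services resolve as <service>.<namespace>.svc.<cluster-domain>
  return `http://${name}.${namespace}.svc.${clusterDomain}:${port}`;
}

console.log(serviceUrl("my-app-service"));
```

Within the same namespace the short name (`http://my-app-service`) also resolves, but the fully qualified form works from any namespace.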
The Declarative Model
In traditional infrastructure, you tell the system what to do: "start a server," "deploy version 2," "add a load balancer." In Kubernetes, you tell the system what you want: "I want 3 replicas of my web app running version 2 behind a load balancer." Kubernetes then figures out how to make that happen and continuously ensures it stays that way.
This declarative approach has profound implications. If a node fails, Kubernetes reschedules the affected pods on healthy nodes. If you accidentally delete a pod that belongs to a Deployment, the ReplicaSet controller creates a replacement. If a container keeps crashing, the kubelet restarts it with an increasing back-off delay, according to the pod's restartPolicy. You describe the desired state; Kubernetes maintains it.
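The idea of continuously comparing desired and actual state can be sketched as a single pass of a reconciliation loop. This is a toy model, not how any Kubernetes controller is implemented; the `reconcile` function and the shape of the pod objects are assumptions made for illustration.

```javascript
// One pass of a toy reconciliation loop: compare desired and actual state
// and return the actions needed to converge.
function reconcile(desired, actualPods) {
  // Pods that are running the desired image count toward the replica total.
  const healthy = actualPods.filter(
    (p) => p.status === "Running" && p.image === desired.image
  );
  const stale = actualPods.filter((p) => !healthy.includes(p));

  // Anything failed or running the wrong image gets removed.
  const actions = stale.map((p) => ({ type: "delete", pod: p.name }));

  const missing = desired.replicas - healthy.length;
  if (missing > 0) {
    for (let i = 0; i < missing; i++) {
      actions.push({ type: "create", image: desired.image });
    }
  } else {
    // Too many healthy pods: remove the surplus.
    healthy
      .slice(desired.replicas)
      .forEach((p) => actions.push({ type: "delete", pod: p.name }));
  }
  return actions;
}

const desired = { replicas: 3, image: "my-app:2.0" };
const actual = [
  { name: "my-app-a", status: "Running", image: "my-app:2.0" },
  { name: "my-app-b", status: "Failed", image: "my-app:2.0" },
];
console.log(reconcile(desired, actual));
```

A real controller runs this kind of comparison repeatedly, so the same logic handles crashes, manual deletions, and spec changes without special cases.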
Cluster Architecture
A Kubernetes cluster consists of a control plane and worker nodes:
Control Plane:
- API Server — The front door to the cluster. All operations go through the API server.
- etcd — A distributed key-value store that holds the cluster state.
- Scheduler — Assigns pods to nodes based on resource requirements and constraints.
- Controller Manager — Runs controllers that reconcile desired state with actual state.
Worker Nodes:
- kubelet — An agent that runs on each node, managing pods and reporting status.
- kube-proxy — Handles networking rules for Service routing.
- Container Runtime — Runs the containers (containerd, CRI-O).
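To make the scheduler's role more concrete, here is a deliberately tiny model of its two phases: filter out nodes that cannot fit the pod, then score the rest. The real scheduler considers many more constraints (affinity, taints, topology); the `schedule` function, the node objects, and the "most free CPU" scoring rule are assumptions for illustration only.

```javascript
// Toy scheduler: filter nodes that fit the pod's requests, then pick the
// node with the most free CPU. Units: CPU in millicores, memory in MiB.
function schedule(pod, nodes) {
  const feasible = nodes.filter(
    (n) => n.freeCpu >= pod.cpu && n.freeMemory >= pod.memory
  );
  if (feasible.length === 0) return null; // the pod would stay Pending
  return feasible.reduce((best, n) => (n.freeCpu > best.freeCpu ? n : best)).name;
}

const nodes = [
  { name: "node-a", freeCpu: 50, freeMemory: 1024 },
  { name: "node-b", freeCpu: 800, freeMemory: 512 },
  { name: "node-c", freeCpu: 400, freeMemory: 2048 },
];
console.log(schedule({ cpu: 100, memory: 128 }, nodes));
```

This is also why resource requests matter: without them the filter step has nothing to compare against.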
Advanced Pod Patterns
Multi-Container Pods: Sidecar, Ambassador, and Adapter
While most pods run a single container, multi-container patterns solve specific architectural challenges:
Sidecar Pattern — A helper container runs alongside the main container. Common examples include log collectors (Fluentd), service proxies (Envoy), and configuration watchers.
apiVersion: v1
kind: Pod
metadata:
  name: web-with-logging
spec:
  containers:
    - name: web
      image: myapp:1.0
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-collector
      image: fluentd:latest
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: logs
      emptyDir: {}

Ambassador Pattern — A proxy container handles network communication for the main container. This simplifies service discovery and connection management, especially in service mesh architectures.
Adapter Pattern — A container that transforms the output of the main container into a standard format consumed by monitoring or logging systems.
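As a small illustration of the adapter idea, the function below converts a plain-text log line into structured JSON that a log pipeline could consume. The input format (`<timestamp> <LEVEL> <message>`) and the name `adaptLogLine` are hypothetical; a real adapter would match whatever the main container actually emits.

```javascript
// Adapter sketch: convert a plain-text log line into a structured object.
// Assumed input format (hypothetical): "<ISO timestamp> <LEVEL> <message>"
function adaptLogLine(line) {
  const match = /^(\S+)\s+(\S+)\s+(.*)$/.exec(line);
  // Lines that do not fit the expected format are passed through as-is.
  if (!match) return { level: "unknown", message: line };
  const [, timestamp, level, message] = match;
  return { timestamp, level: level.toLowerCase(), message };
}

console.log(JSON.stringify(adaptLogLine("2024-01-01T00:00:00Z ERROR db timeout")));
```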
Init Containers for Setup Tasks
Init containers run to completion, one at a time and in the order listed, before the main containers start. They are useful for setup tasks like database migrations, configuration generation, or waiting for dependencies:
spec:
  initContainers:
    # Init containers run in order, so wait for the database before migrating
    - name: wait-for-db
      image: busybox:1.36
      command: ['sh', '-c', 'until nc -z postgres-service 5432; do echo waiting for db; sleep 2; done']
    - name: db-migrate
      image: myapp:2.0
      command: ["node", "scripts/migrate.js"]
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: url
  containers:
    - name: app
      image: myapp:2.0

Step-by-Step Implementation
Setting Up a Local Development Cluster
Before deploying to a production cluster, you need a local development environment. There are three main options:
Minikube — A local cluster (single-node by default) that runs inside a VM or a container, depending on the driver. Best for learning and testing:
# Install minikube (macOS)
brew install minikube
# Start a local cluster with recommended resources
minikube start --cpus=4 --memory=8192 --driver=docker
# Verify cluster is running
kubectl cluster-info
kubectl get nodes

kind (Kubernetes IN Docker) — Runs Kubernetes nodes as Docker containers. Faster startup, better for CI:
# Install kind
brew install kind
# Create a cluster with a custom config
kind create cluster --config kind-config.yaml

# kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 30080
        hostPort: 30080
  - role: worker
  - role: worker

Docker Desktop — Built-in Kubernetes on Mac/Windows. Enable it in Docker Desktop settings under "Kubernetes."
Containerizing Your Application
Create a multi-stage Dockerfile for a Node.js application:
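# Dockerfile
# Stage 1: install production dependencies only
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# Stage 2: runtime image running as a non-root user
FROM node:20-alpine
WORKDIR /app
RUN addgroup -g 1001 appgroup && adduser -u 1001 -G appgroup -s /bin/sh -D appuser
COPY --from=builder /app/node_modules ./node_modules
# Assumes a .dockerignore that excludes node_modules, so this copy
# does not overwrite the dependencies installed above
COPY . .
USER appuser
EXPOSE 8080
# Docker-level health check; Kubernetes ignores it and uses its own probes
HEALTHCHECK --interval=30s --timeout=3s CMD wget -qO- http://localhost:8080/health || exit 1
CMD ["node", "server.js"]

Build and make the image available to your cluster: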
# For minikube, build inside minikube's Docker daemon
eval $(minikube docker-env)
docker build -t my-app:1.0 .

# For kind, build locally, then load the image into the cluster
docker build -t my-app:1.0 .
kind load docker-image my-app:1.0

# For cloud clusters, push to a registry
docker tag my-app:1.0 gcr.io/my-project/my-app:1.0
docker push gcr.io/my-project/my-app:1.0

Creating and Applying Manifests
Organize your Kubernetes manifests in a directory structure:
k8s/
├── base/
│   ├── deployment.yaml
│   ├── service.yaml
│   ├── configmap.yaml
│   └── kustomization.yaml
└── overlays/
    ├── dev/
    │   └── kustomization.yaml
    ├── staging/
    │   └── kustomization.yaml
    └── prod/
        └── kustomization.yaml
Apply manifests:
# Apply a single file
kubectl apply -f deployment.yaml
# Apply all files in a directory
kubectl apply -f k8s/
# Apply with kustomize for environment-specific configs
kubectl apply -k k8s/overlays/dev/
# Dry-run to validate without applying
kubectl apply -f deployment.yaml --dry-run=client
# Diff to see what would change
kubectl diff -f deployment.yaml

ConfigMaps and Secrets: Externalizing Configuration
Never hardcode environment-specific values in container images. Use ConfigMaps for non-sensitive configuration and Secrets for sensitive data.
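From the application's point of view, externalized configuration simply means reading values at startup instead of baking them into the image. A minimal sketch for a Node.js app might look like this; the variable names mirror the ConfigMap keys used in this section, while the `loadConfig` helper and its default values are assumptions for illustration.

```javascript
// Read configuration from environment variables, failing fast when a
// required value is missing. Defaults here are illustrative assumptions.
function loadConfig(env = process.env) {
  const config = {
    nodeEnv: env.NODE_ENV || "development",
    logLevel: env.LOG_LEVEL || "info",
    apiUrl: env.API_URL,
  };
  if (!config.apiUrl) {
    throw new Error("API_URL is required but not set");
  }
  return config;
}

console.log(loadConfig({ API_URL: "https://api.example.com" }));
```

Failing at startup on a missing value surfaces misconfiguration as a crashing pod with a clear log message, rather than as a subtle runtime bug.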
ConfigMaps
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
data:
  NODE_ENV: "production"
  LOG_LEVEL: "info"
  API_URL: "https://api.example.com"
  # You can also store entire files
  nginx.conf: |
    server {
      listen 80;
      server_name example.com;
      location / {
        proxy_pass http://localhost:8080;
      }
    }

Secrets
# Create a secret from literal values
kubectl create secret generic db-secret \
  --from-literal=url=postgres://user:pass@host/db
# Create a secret from a file
kubectl create secret generic tls-secret \
  --from-file=tls.crt=server.crt \
  --from-file=tls.key=server.key

apiVersion: v1
kind: Secret
metadata:
  name: my-app-secret
type: Opaque
data:
  DATABASE_URL: cG9zdGdyZXM6Ly91c2VyOnBhc3NAaG9zdC9kYg== # base64 encoded
  API_KEY: c2VjcmV0LWtleQ== # base64 encoded

Reference them in your Deployment:
spec:
  containers:
    - name: my-app
      image: my-app:1.0
      envFrom:
        - configMapRef:
            name: my-app-config
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: my-app-secret
              key: DATABASE_URL
      volumeMounts:
        - name: config-volume
          mountPath: /etc/nginx/conf.d
  volumes:
    - name: config-volume
      configMap:
        name: my-app-config

Security Best Practices for Secrets
Avoid exposing secrets as environment variables where you can. Environment variables are inherited by child processes, can be read by anything with access to the process, and tend to end up in logs and crash dumps. Instead, mount secrets as volumes:
spec:
  containers:
    - name: my-app
      volumeMounts:
        - name: secret-volume
          mountPath: /etc/secrets
          readOnly: true
  volumes:
    - name: secret-volume
      secret:
        secretName: my-app-secret
        defaultMode: 0400

For production, consider external secret managers like HashiCorp Vault, AWS Secrets Manager, or the External Secrets Operator, which syncs secrets from external sources into Kubernetes Secrets.
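When a Secret is mounted as a volume, each key becomes a file in the mount directory, so the application reads it like any other file. A minimal sketch, assuming the `/etc/secrets` mount path from the volume example above; the `readSecret` helper is hypothetical.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Read one key of a Secret mounted as a volume. Each key is a separate file
// whose content is the decoded secret value.
function readSecret(key, dir = "/etc/secrets") {
  // trim() drops a trailing newline if the value was created from a file;
  // remove it if leading or trailing whitespace is significant.
  return fs.readFileSync(path.join(dir, key), "utf8").trim();
}

// Usage inside the pod (not executed here):
// const databaseUrl = readSecret("DATABASE_URL");
```

Reading the file on demand rather than caching it at startup also lets the application pick up rotated values, since Kubernetes updates mounted Secret files when the Secret changes.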
Exposing Applications: Ingress and Gateway API
Ingress
An Ingress resource routes external HTTP/HTTPS traffic to Services based on hostnames and paths:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-service
                port:
                  number: 80

For local development with minikube:
# Enable the ingress addon
minikube addons enable ingress
# Or use port-forwarding for quick access
kubectl port-forward service/my-app-service 8080:80

Gateway API (Modern Alternative)
The Gateway API is the successor to Ingress, offering more expressive routing, traffic splitting, and header-based matching. The Ingress API is frozen, meaning it receives no new features, but it remains supported:
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: nginx
  listeners:
    - name: http
      port: 80
      protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: my-app-route
spec:
  parentRefs:
    - name: my-gateway
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: my-app-service
          port: 80

Health Checks and Readiness
Health probes are critical for production reliability. Kubernetes uses three types:
Readiness Probe — Determines when a pod is ready to receive traffic. If it fails, the pod is removed from Service endpoints.
Liveness Probe — Detects when a container is stuck or deadlocked. If it fails, Kubernetes restarts the container.
Startup Probe — For slow-starting applications. Disables liveness and readiness probes until the startup probe succeeds.
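On the application side, the endpoints these probes call could be sketched as follows for a Node.js service. The `/health` and `/ready` paths and port 8080 match the manifests in this guide; the `state` object and the helper names are assumptions for illustration.

```javascript
const http = require("node:http");

// Application state the probes report on. `ready` would be set to true once
// startup work (for example, connecting to the database) has finished.
const state = { ready: false };

// Pure routing function, kept separate from the server so it is easy to test.
function probeResponse(url, appState) {
  // Liveness: the process is up and able to answer requests.
  if (url === "/health") return { status: 200, body: "ok" };
  // Readiness: only report success once dependencies are available.
  if (url === "/ready") {
    return appState.ready
      ? { status: 200, body: "ready" }
      : { status: 503, body: "not ready" };
  }
  return { status: 404, body: "not found" };
}

function createServer(appState = state) {
  return http.createServer((req, res) => {
    const { status, body } = probeResponse(req.url, appState);
    res.writeHead(status, { "Content-Type": "text/plain" });
    res.end(body);
  });
}

// Usage (not executed here): createServer().listen(8080);
```

Keeping the liveness endpoint free of dependency checks is a deliberate choice in this sketch: if the database is down, restarting the application container rarely helps, whereas failing readiness takes the pod out of rotation until the dependency recovers.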
spec:
  containers:
    - name: my-app
      startupProbe:
        httpGet:
          path: /health
          port: 8080
        failureThreshold: 30
        periodSeconds: 10
      readinessProbe:
        httpGet:
          path: /ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:
        httpGet:
          path: /health
          port: 8080
        initialDelaySeconds: 15
        periodSeconds: 20

Resource Management
Without resource requests and limits, a container can use as much CPU and memory as its node has free, which leads to noisy-neighbor problems and out-of-memory kills (OOMKills):
resources:
  requests:
    cpu: "100m"      # 0.1 CPU cores guaranteed
    memory: "128Mi"  # 128 MiB guaranteed
  limits:
    cpu: "500m"      # Max 0.5 CPU cores
    memory: "256Mi"  # Max 256 MiB (exceeding = OOMKill)

QoS Classes:
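- Guaranteed — every container has CPU and memory requests equal to its limits (highest priority, last to be evicted)
- Burstable — at least one container sets a request or limit, but the pod does not meet the Guaranteed criteria (medium priority)
- BestEffort — no container sets any requests or limits (first to be evicted)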
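The classification rules above can be written down as a short function. This is a sketch of the documented rules, not Kubernetes source code; `qosClass` is a hypothetical name, and for simplicity it compares quantities as strings, so it assumes equal values are written identically (for example "500m" in both places rather than "500m" and "0.5").

```javascript
// Classify a pod's QoS class from the resources of its containers.
// Assumption: equal quantities are written identically (string comparison).
function qosClass(containers) {
  const resources = ["cpu", "memory"];
  let anySet = false;
  let guaranteed = true;

  for (const c of containers) {
    const requests = c.requests || {};
    const limits = c.limits || {};
    for (const r of resources) {
      if (requests[r] !== undefined || limits[r] !== undefined) anySet = true;
      // A missing request defaults to the limit, so only the limit is mandatory.
      const request = requests[r] !== undefined ? requests[r] : limits[r];
      if (limits[r] === undefined || request !== limits[r]) guaranteed = false;
    }
  }
  if (!anySet) return "BestEffort";
  return guaranteed ? "Guaranteed" : "Burstable";
}

console.log(
  qosClass([
    {
      requests: { cpu: "100m", memory: "128Mi" },
      limits: { cpu: "500m", memory: "256Mi" },
    },
  ])
);
```

The resource block shown earlier in this section therefore yields a Burstable pod; setting the requests equal to the limits would make it Guaranteed.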
Debugging Kubernetes Applications
When things go wrong, these commands are your lifeline:
# Check pod status
kubectl get pods -o wide
# Describe a specific pod for events and conditions
kubectl describe pod my-app-xyz123
# View pod logs
kubectl logs my-app-xyz123
# Follow logs in real-time
kubectl logs -f my-app-xyz123
# View logs from a previous instance (after crash)
kubectl logs my-app-xyz123 --previous
# Execute a shell in a running pod
kubectl exec -it my-app-xyz123 -- /bin/sh
# Check resource usage
kubectl top pods
kubectl top nodes
# View events sorted by time
kubectl get events --sort-by='.lastTimestamp'
# Check service endpoints
kubectl get endpoints my-app-service

Helm: Package Management for Kubernetes
Helm is the package manager for Kubernetes. It bundles manifests into reusable charts with templating:
# Install a chart from a repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-postgres bitnami/postgresql --set auth.password=secret
# Create your own chart
helm create my-app

my-app/
├── Chart.yaml
├── values.yaml
└── templates/
    ├── deployment.yaml
    ├── service.yaml
    ├── ingress.yaml
    └── tests/
# values.yaml - default configuration
replicaCount: 3
image:
  repository: my-app
  tag: "1.0"
  pullPolicy: IfNotPresent
resources:
  requests:
    cpu: "100m"
    memory: "128Mi"
  limits:
    cpu: "500m"
    memory: "256Mi"

CI/CD Integration
Integrate Kubernetes deployments into your CI/CD pipeline:
# GitHub Actions example
# Assumes authentication to Google Cloud is configured before the build step
# (for example with google-github-actions/auth), so that docker push and the
# GKE steps can succeed.
name: Deploy to Kubernetes
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build and push image
        run: |
          docker build -t gcr.io/${{ secrets.GCP_PROJECT }}/my-app:${{ github.sha }} .
          docker push gcr.io/${{ secrets.GCP_PROJECT }}/my-app:${{ github.sha }}
      - name: Get GKE credentials
        uses: google-github-actions/get-gke-credentials@v2
        with:
          cluster_name: production
          location: us-central1
      - name: Update image tag
        run: |
          kubectl set image deployment/my-app my-app=gcr.io/${{ secrets.GCP_PROJECT }}/my-app:${{ github.sha }}
          kubectl rollout status deployment/my-app --timeout=300s

Production Best Practices
- Always set resource requests and limits — Without requests, the scheduler cannot make informed placement decisions. Without limits, a single pod can starve others on the same node.
- Use readiness and liveness probes — Readiness probes control when a pod receives traffic. Liveness probes detect unhealthy containers and restart them.
- Use namespaces for environment isolation — Separate dev, staging, and production into different namespaces. Use RBAC to restrict access.
- Pin image tags, never use latest — Always use specific version tags or SHA digests for reproducible deployments.
- Implement graceful shutdown — Handle the SIGTERM signal to complete in-flight requests before exiting. Set terminationGracePeriodSeconds appropriately.
- Use Pod Disruption Budgets — Ensure minimum availability during voluntary disruptions:

  apiVersion: policy/v1
  kind: PodDisruptionBudget
  metadata:
    name: my-app-pdb
  spec:
    minAvailable: 2
    selector:
      matchLabels:
        app: my-app

- Store manifests in version control — Treat YAML files as code. Review changes in pull requests and apply through CI/CD.
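The graceful shutdown practice from the list above could be sketched like this for a Node.js HTTP server. The helper name `installGracefulShutdown` and the 25-second timeout are assumptions; the timeout is chosen to sit below the default terminationGracePeriodSeconds of 30 seconds, and `exit` is injectable so the behavior can be tested without ending the process.

```javascript
// Install a SIGTERM handler that stops accepting new connections, waits for
// in-flight requests to finish, and then exits.
function installGracefulShutdown(
  server,
  { timeoutMs = 25000, exit = (code) => process.exit(code) } = {}
) {
  process.once("SIGTERM", () => {
    // Force a non-zero exit if draining takes longer than the timeout.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();

    // close() stops new connections; the callback fires once existing
    // connections have ended.
    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });
  });
}

// Usage (not executed here):
// const server = http.createServer(handler).listen(8080);
// installGracefulShutdown(server);
```

If the application needs longer than the default grace period to drain, raise terminationGracePeriodSeconds in the pod spec and the timeout here together.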
Real-World Use Cases
Deploying a REST API with Database
A typical REST API deployment includes the application, a database, and configuration management. The Deployment for the API itself, which reads its database URL from a Secret, looks like this:
# Complete REST API deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: myapi:2.0
          ports:
            - containerPort: 8080
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: url
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1"
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /api/health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /api/health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 15

Running Background Workers
Background workers process tasks from queues and don't need Services or Ingress:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: email-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: email-worker
  template:
    metadata:
      labels:
        app: email-worker
    spec:
      containers:
        - name: worker
          image: email-worker:1.0
          env:
            - name: QUEUE_URL
              valueFrom:
                secretKeyRef:
                  name: queue-secret
                  key: url
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"

Canary Deployments with Traffic Splitting
Use an Ingress controller that supports traffic splitting to gradually route traffic to a new version:
# canary-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-canary-service
                port:
                  number: 80

CronJobs for Scheduled Tasks
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cleanup-old-data
spec:
  schedule: "0 2 * * *" # 2 AM daily
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: cleanup
              image: my-app:2.0
              command: ["node", "scripts/cleanup.js"]
          restartPolicy: OnFailure

Scaling: Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler (HPA) automatically scales the number of pod replicas based on CPU utilization or custom metrics:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 2
          periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60

The behavior field controls scaling speed. Scale-up is fast (add up to 2 pods per minute), while scale-down is conservative (remove up to 10% per minute with a 5-minute stabilization window) to prevent flapping.
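The replica count the HPA aims for follows a simple formula from the Kubernetes documentation: desired = ceil(current × currentMetric / targetMetric), clamped to the configured minimum and maximum. The sketch below applies that formula to a single metric; the function name `desiredReplicas` is made up, and the 10% tolerance is the documented default, which a cluster may override.

```javascript
// Desired replica count for one metric, following the documented HPA formula:
//   desired = ceil(current * currentMetric / targetMetric)
// The result is clamped to [min, max]. Deviations within the tolerance
// (10% by default) do not trigger scaling.
function desiredReplicas({
  current,
  currentUtilization,
  targetUtilization,
  min,
  max,
  tolerance = 0.1,
}) {
  const ratio = currentUtilization / targetUtilization;
  if (Math.abs(ratio - 1) <= tolerance) return current;
  const desired = Math.ceil(current * ratio);
  return Math.min(max, Math.max(min, desired));
}

// 4 replicas at 90% CPU against a 70% target
console.log(
  desiredReplicas({ current: 4, currentUtilization: 90, targetUtilization: 70, min: 2, max: 10 })
);
```

When several metrics are configured, as in the manifest above, the HPA computes a value per metric and uses the largest. The behavior policies then limit how quickly the actual replica count moves toward that number.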
Common Pitfalls and Solutions
| Pitfall | Impact | Solution |
|---|---|---|
| No resource requests/limits | OOMKill, noisy neighbor issues | Always set requests and limits based on profiling |
| Missing readiness probe | Traffic sent to starting pods | Add readiness probe with appropriate initialDelaySeconds |
| Using latest tag | Non-reproducible deployments | Pin to specific version tags |
| No graceful shutdown handling | Dropped requests during deployment | Handle SIGTERM, set terminationGracePeriodSeconds |
| Secrets in environment variables | Can leak through logs, crash dumps, and child processes | Use mounted secret volumes or external secret managers |
| No Pod Disruption Budget | All pods evicted during node drain | Define PDB with minAvailable |
| Running as root | Security vulnerability | Set securityContext.runAsNonRoot: true |
| No horizontal scaling | Poor performance under load | Configure HPA based on CPU or custom metrics |
Testing Kubernetes Manifests
Test your manifests before applying them:
# Dry-run: validate manifests without applying
kubectl apply -f deployment.yaml --dry-run=client
# Diff: see what would change
kubectl diff -f deployment.yaml
# Use kubeconform (the maintained successor to kubeval) for schema validation
kubeconform -strict deployment.yaml
# Use kustomize for environment-specific testing
kubectl apply -k k8s/overlays/staging/ --dry-run=server

Test application behavior in a local cluster:
# Start minikube
minikube start --cpus=4 --memory=8192
# Deploy your application
kubectl apply -f k8s/
# Run integration tests against the cluster
npm run test:integration
# Check pod logs for errors
kubectl logs -l app=my-app --tail=100

Comparison with Alternatives
| Feature | Kubernetes | Docker Swarm | AWS ECS | Nomad | Railway/Render |
|---|---|---|---|---|---|
| Complexity | High | Low | Medium | Medium | Very Low |
| Scalability | Very High | Medium | High | High | Medium |
| Self-healing | Yes | Yes | Yes | Yes | Yes |
| Auto-scaling | HPA, VPA, KEDA | No | Built-in | External | Built-in |
| Service mesh | Istio, Linkerd | No | App Mesh | Consul | No |
| Community | Massive | Declining | AWS-focused | Growing | Niche |
| Learning curve | Steep | Gentle | Moderate | Moderate | Minimal |
| Best for | Production at scale | Simple deployments | AWS-native | Multi-cloud | Prototypes |
Future Outlook
Kubernetes continues to evolve rapidly. The Gateway API is succeeding the Ingress API, which is now frozen, with a more expressive networking model. Projects like Telepresence and Skaffold are making the inner development loop faster by enabling live code reloading against remote clusters. The Kubernetes API is being used as a platform for managing databases (Operator pattern), machine learning workloads (Kubeflow), and even virtual machines (KubeVirt).
Conclusion
Kubernetes is a powerful platform for running containerized applications at scale. While it has a steep learning curve, the core concepts—Pods, Deployments, Services, and declarative configuration—are straightforward once you understand them.
Key takeaways:
- A Pod is the smallest deployable unit; a Deployment manages Pod replicas; a Service provides stable networking
- Kubernetes uses a declarative model: describe the desired state, and Kubernetes maintains it
- Always set resource requests and limits on containers to enable proper scheduling and prevent resource contention
- Use readiness and liveness probes to ensure traffic only reaches healthy pods
- Store all Kubernetes manifests in version control and apply them through CI/CD pipelines
- Namespaces provide logical isolation; RBAC controls who can do what in each namespace
- Start with minikube or kind for local development before deploying to cloud clusters
- Handle graceful shutdown (SIGTERM) to prevent dropped requests during deployments
Start by deploying a simple application to a local cluster, then progressively add Services, Ingress, ConfigMaps, Secrets, and autoscaling. Each step builds on the previous one, and the hands-on experience will solidify your understanding far more than reading documentation alone.
For deeper exploration, see the Kubernetes documentation, the Kubernetes the Hard Way tutorial, and the 12-Factor App methodology for building cloud-native applications.