Introduction
Kubernetes networking is one of the most complex yet fundamental aspects of container orchestration. Unlike traditional networking where IP addresses are relatively static, pods in Kubernetes are ephemeral—they can be created, destroyed, and rescheduled across nodes at any moment. This dynamic nature requires a robust networking model that abstracts away the complexity and provides reliable service discovery and traffic routing.
Understanding how Services, Ingress, and DNS work together is essential for anyone deploying applications on Kubernetes. Without this knowledge, you'll struggle with basic connectivity issues, fail to expose applications properly, and miss out on advanced traffic management capabilities. This guide breaks down each networking primitive, explains how they interconnect, and provides practical implementation patterns you can apply immediately.
Whether you're running a simple web application or a complex microservices architecture, mastering Kubernetes networking is the foundation for building reliable, scalable systems. The concepts covered here apply to all major cloud providers and on-premises installations alike.
The Kubernetes Networking Model: Three Layers of Communication
Kubernetes implements a flat networking model where every pod gets its own IP address, and any pod can communicate with any other pod without NAT. This model is achieved through the Container Network Interface (CNI), which handles pod-to-pod networking across nodes. The networking stack operates on three distinct layers, each solving a specific problem.
Layer 1: Container-to-Container Communication
Within a pod, containers share the same network namespace, meaning they share the same IP address and port space. Containers in the same pod can communicate via localhost, which is how sidecar patterns work. For example, an Envoy proxy sidecar can intercept traffic on port 8080 while the application container listens on port 8081, with the sidecar forwarding requests after applying policies.
# Pod with sidecar proxy pattern
apiVersion: v1
kind: Pod
metadata:
  name: app-with-proxy
spec:
  containers:
    - name: app
      image: my-app:1.0
      ports:
        - containerPort: 8081
    - name: envoy-sidecar
      image: envoyproxy/envoy:v1.28-latest
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: envoy-config
          mountPath: /etc/envoy
  volumes:
    - name: envoy-config
      configMap:
        name: envoy-config
Layer 2: Pod-to-Pod Communication Across Nodes
The Container Network Interface (CNI) plugin is responsible for pod-to-pod networking across nodes. Popular CNI implementations include Calico, Flannel, Cilium, and Weave, each offering different features like network policies, encryption, and observability.
Calico uses BGP (Border Gateway Protocol) for routing and supports rich network policies. It is available as a networking or policy option on several managed Kubernetes services and offers an optional eBPF dataplane for higher performance.
Flannel is the simplest of these, providing basic overlay networking using VXLAN or host-gw backends. It doesn't enforce network policies natively, so it is best suited to development clusters or to environments where another component handles policy.
Cilium leverages eBPF for high-performance networking, observability, and security. It can replace kube-proxy entirely and provides Hubble for network visibility. Cilium also supports features such as a bandwidth manager, host-level firewalling, and multi-cluster networking.
# Check your cluster's CNI plugin
kubectl get daemonset -n kube-system | grep -E "calico|flannel|cilium|weave"
# View CNI configuration on a node
ls /etc/cni/net.d/
# Check pod CIDR ranges
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
Layer 3: External-to-Internal Communication
Services and Ingress handle external traffic entering the cluster. Services provide stable virtual IPs at Layer 4 (TCP/UDP), while Ingress provides HTTP/HTTPS routing at Layer 7. This layered approach allows you to expose applications securely without exposing pod IPs directly.
The kube-proxy component runs on every node and maintains network rules that allow communication to Services. On Linux it commonly runs in iptables mode (the default) or IPVS mode; the legacy userspace mode was removed in Kubernetes 1.26, and recent releases add an nftables mode. Iptables mode uses kernel-level packet filtering for efficient routing, while IPVS mode provides better performance for large-scale deployments with thousands of Services through hash-table-based lookups instead of linear rule scanning.
Service Types: The Foundation of Kubernetes Networking
Kubernetes offers four Service types, each designed for specific networking scenarios. Understanding when to use each type is crucial for proper application architecture.
ClusterIP: Internal Service Discovery
ClusterIP is the default Service type and provides internal-only access within the cluster. When you create a ClusterIP Service, Kubernetes assigns a virtual IP from the Service CIDR range, and kube-proxy programs rules to route traffic from that IP to the backing pods.
# ClusterIP Service with session affinity
apiVersion: v1
kind: Service
metadata:
  name: api-server
  labels:
    app: api-server
spec:
  type: ClusterIP
  selector:
    app: api-server
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 1800
  ports:
    - name: http
      port: 80
      targetPort: 8080
      protocol: TCP
ClusterIP Services are ideal for internal microservices that don't need external access. The virtual IP remains stable even as pods scale up/down or get rescheduled, providing reliable service discovery.
NodePort: Direct Node Access
NodePort builds on ClusterIP by opening a specific port (30000-32767 by default) on every node in the cluster. External traffic can reach the Service by hitting any node's IP on that port. NodePort is useful for development and testing but is rarely exposed directly in production because it requires managing port assignments and node addresses yourself and provides no TLS termination.
# NodePort Service
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080
      protocol: TCP
LoadBalancer: Cloud Provider Integration
LoadBalancer extends NodePort by provisioning an external load balancer through your cloud provider. When you create a LoadBalancer Service, the cloud controller manager creates a load balancer that routes traffic to the NodePort, which then routes to your pods.
# LoadBalancer Service
apiVersion: v1
kind: Service
metadata:
  name: web-lb
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  selector:
    app: web
  externalTrafficPolicy: Local
  ports:
    - port: 443
      targetPort: 8443
      protocol: TCP
Important: Each LoadBalancer Service provisions a separate external load balancer, which can be expensive. Use a single Ingress controller with multiple Ingress resources to consolidate external access points.
ExternalName: DNS CNAME Mapping
ExternalName maps a Service to a DNS name using a CNAME record. It doesn't use selectors or define ports—it simply returns a CNAME record pointing to an external service.
# ExternalName Service for external database
apiVersion: v1
kind: Service
metadata:
  name: external-database
spec:
  type: ExternalName
  externalName: database.prod.example.com
This is useful for integrating external services into your Kubernetes service discovery without proxying traffic through the cluster.
CoreDNS: The Service Discovery Backbone
DNS resolution in Kubernetes is handled by CoreDNS, which runs as a Deployment in the kube-system namespace. CoreDNS watches the Kubernetes API for new Services and Pods, automatically creating DNS records that enable service discovery by name rather than IP address.
DNS Naming Conventions
Kubernetes DNS follows a predictable naming convention:
- Service: <service-name>.<namespace>.svc.cluster.local
- Pod: <pod-ip-dashed>.<namespace>.pod.cluster.local
- SRV records: _http._tcp.<service-name>.<namespace>.svc.cluster.local
Within the same namespace, you can use just <service-name>. Cross-namespace references require <service-name>.<namespace>.
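The short-name and cross-namespace forms above can be sketched as container environment variables. This is an illustrative fragment only: the `web` and `data` namespaces, the `cache` and `postgres` Services, and the variable names are assumptions, not part of any real deployment.

```yaml
# Hypothetical Pod in the "web" namespace that talks to Services by DNS name.
apiVersion: v1
kind: Pod
metadata:
  name: web-client
  namespace: web
spec:
  containers:
    - name: app
      image: my-app:1.0
      env:
        # Same-namespace Service: the short name resolves through the Pod's DNS search path.
        - name: CACHE_HOST
          value: cache
        # Cross-namespace Service: append the namespace to the Service name.
        - name: DATABASE_HOST
          value: postgres.data
        # Fully qualified form, which does not depend on the search path.
        - name: DATABASE_HOST_FQDN
          value: postgres.data.svc.cluster.local
```

The fully qualified form is the most explicit and is a reasonable choice in shared configuration that may be consumed from several namespaces.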
Headless Services for StatefulSets
Headless Services (those with clusterIP: None) create individual DNS A records for each pod backing the Service. This is essential for StatefulSets where clients need to connect to specific pod instances, such as database primaries versus replicas.
# Headless Service for StatefulSet
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432
When a StatefulSet references this Service through its serviceName field, each pod gets a DNS entry: postgres-0.postgres.default.svc.cluster.local, postgres-1.postgres.default.svc.cluster.local, etc. This allows application code to connect to specific instances for primary/replica routing.
DNS Debugging Techniques
DNS issues are among the most common networking problems in Kubernetes. Here's how to debug them:
# Deploy a debug pod
kubectl run dns-debug --image=busybox:1.36 --rm -it --restart=Never -- sh
# Test DNS resolution
nslookup api-server.default.svc.cluster.local
nslookup kubernetes.default.svc.cluster.local
# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=100
# Verify CoreDNS ConfigMap
kubectl get configmap coredns -n kube-system -o yaml
# Check if DNS is working for external domains
nslookup google.com
# Test SRV records
nslookup -type=SRV _http._tcp.api-server.default.svc.cluster.local
Common DNS issues include:
- DNS timeout: Usually indicates CoreDNS pods are unhealthy or network policies are blocking DNS traffic
- NXDOMAIN: Service doesn't exist or wrong namespace
- SERVFAIL: CoreDNS configuration error or upstream DNS issues
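When DNS timeouts trace back to NetworkPolicies, the usual remedy is an explicit egress rule for DNS. The following is a minimal sketch, assuming CoreDNS runs in kube-system with the k8s-app: kube-dns label (the same label used in the log command above) and a hypothetical production namespace; the policy name is illustrative.

```yaml
# Hypothetical policy: let every pod in "production" reach cluster DNS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        # namespaceSelector and podSelector in one entry are ANDed together.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Because this policy selects every pod and lists Egress in policyTypes, it also isolates those pods for egress: any outbound traffic other than DNS then needs its own allow rule.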
Ingress: Layer 7 HTTP Routing
Ingress is a higher-level abstraction that provides HTTP and HTTPS routing to Services based on hostnames and URL paths. While Services operate at Layer 4 (TCP/UDP), Ingress operates at Layer 7 (HTTP/HTTPS), enabling advanced routing rules, SSL termination, and virtual hosting.
Ingress Controllers Compared
An Ingress resource defines routing rules, but it requires an Ingress Controller to actually implement those rules. Here's a comparison of popular controllers:
| Controller | Performance | Features | Best For |
|---|---|---|---|
| NGINX Ingress | High | Mature, extensive annotations | General purpose, complex routing |
| Traefik | Very High | Auto-discovery, Let's Encrypt | Dynamic environments, microservices |
| HAProxy | Highest | TCP/UDP support, connection draining | High-performance, low-latency |
| Istio Gateway | High | Service mesh integration | Advanced traffic management |
| AWS ALB | High | Native AWS integration | AWS workloads |
| Contour | High | Envoy-based, Gateway API | Modern Kubernetes |
Ingress with Path-Based Routing
A single domain can route to multiple backend services based on URL path:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: unified-api
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: platform.example.com
      http:
        paths:
          - path: /auth
            pathType: Prefix
            backend:
              service:
                name: auth-service
                port:
                  number: 80
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80
          - path: /notifications
            pathType: Prefix
            backend:
              service:
                name: notification-service
                port:
                  number: 80
  tls:
    - hosts:
        - platform.example.com
      secretName: platform-tls
Configuring HTTPS with cert-manager
For production deployments, you need SSL termination at the Ingress level. cert-manager automates certificate provisioning from Let's Encrypt:
# ClusterIssuer for Let's Encrypt
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
---
# Ingress with automatic TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: production-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # Rate limit: at most 100 requests per second per client IP
    nginx.ingress.kubernetes.io/limit-rps: "100"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls-cert
NetworkPolicies: Controlling Traffic Flow
NetworkPolicies are Kubernetes' native mechanism for controlling pod-to-pod communication. By default, all pods can communicate with all other pods. NetworkPolicies allow you to restrict this based on labels, namespaces, and IP blocks.
Default Deny Policy
Start with a default deny policy that blocks all ingress traffic to pods in the namespace, then explicitly allow only what's needed (add Egress to policyTypes to restrict outbound traffic as well):
# Default deny all ingress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
# Allow frontend to backend communication
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8080
Namespace-Based Isolation
For multi-tenant environments, isolate traffic between namespaces:
# Allow traffic only from same namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}
---
# Allow ingress from ingress controller namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller
  namespace: tenant-a
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              # Label that Kubernetes sets automatically on every namespace
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8080
Service Mesh Integration
For advanced traffic management, mutual TLS, and observability, consider integrating a service mesh like Istio or Linkerd. Linkerd is lightweight and provides automatic mTLS, traffic splitting, and observability. Istio offers comprehensive traffic management with VirtualServices and DestinationRules. Both can be enabled per namespace: Linkerd with the linkerd.io/inject annotation and Istio with the istio-injection=enabled namespace label.
# Enable Linkerd on a namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production
  annotations:
    linkerd.io/inject: enabled
Service meshes add operational complexity, so start with basic mTLS and add traffic management features as needed.
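For comparison, an Istio setup might look like the sketch below. Istio's sidecar injection is normally switched on with a namespace label, and traffic splitting is expressed with a DestinationRule and VirtualService pair. The staging namespace, the api Service, and the version: v1 and version: v2 pod labels are all hypothetical, and the weights are illustrative.

```yaml
# Hypothetical namespace with Istio sidecar injection enabled via label.
apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    istio-injection: enabled
---
# Subsets group the pods behind the "api" Service by their version label.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: api
  namespace: staging
spec:
  host: api
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
---
# Send 90% of requests to subset v1 and 10% to subset v2.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: api
  namespace: staging
spec:
  hosts:
    - api
  http:
    - route:
        - destination:
            host: api
            subset: v1
          weight: 90
        - destination:
            host: api
            subset: v2
          weight: 10
```

Check the API version against the Istio release you run; newer releases also serve these resources under networking.istio.io/v1.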
Performance Optimization
Kubernetes networking introduces overhead at every layer. For high-throughput Services, IPVS mode outperforms iptables significantly. Switch kube-proxy to IPVS mode when you have more than 1,000 Services:
# kube-proxy ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-proxy
  namespace: kube-system
data:
  config.conf: |
    mode: "ipvs"
    ipvs:
      scheduler: "rr"
      minSyncPeriod: "0s"
      syncPeriod: "30s"
Use connection pooling in your application code to reduce the overhead of establishing new connections through the Service proxy. For latency-sensitive applications, use Topology Aware Routing to keep traffic within the same availability zone:
apiVersion: v1
kind: Service
metadata:
  name: latency-sensitive-api
  annotations:
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8080
Real-World Use Cases
Multi-Tenant SaaS Platform
A SaaS platform serving multiple customers needs traffic isolation between tenants. Using namespace-based segmentation with NetworkPolicies and Ingress routing by subdomain provides clean isolation. Each tenant gets its own namespace, and Ingress routes <tenant>.app.com to the corresponding namespace's services. NetworkPolicies prevent cross-tenant traffic at the pod level, while ResourceQuotas ensure fair resource allocation. This pattern scales to hundreds of tenants with minimal operational overhead.
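The per-tenant pieces described above can be sketched as follows. This is a minimal example for one hypothetical tenant named acme: the tenant-acme namespace, the quota values, and the web Service are assumptions chosen for illustration.

```yaml
# Hypothetical quota for one tenant namespace.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-quota
  namespace: tenant-acme
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "50"
---
# Route the tenant's subdomain to the Service in the tenant's own namespace.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tenant-ingress
  namespace: tenant-acme
spec:
  ingressClassName: nginx
  rules:
    - host: acme.app.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 80
```

Keeping the Ingress in the tenant namespace means each tenant's routing, quota, and NetworkPolicies can be created and removed together.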
Canary Deployments with Ingress
Ingress controllers like NGINX support traffic splitting through annotations, enabling canary deployments:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-v2
                port:
                  number: 80
This allows you to gradually roll out new versions while monitoring error rates and performance metrics before full deployment.
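Before shifting a percentage of live traffic, it can help to let testers opt in explicitly. The sketch below is a variant of the canary Ingress that uses the NGINX Ingress canary-by-header annotation; the Ingress name and the X-Canary header name are assumptions.

```yaml
# Hypothetical header-based canary: only requests that opt in reach api-v2.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-canary-header
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    # Requests sent with "X-Canary: always" are routed to the canary backend.
    nginx.ingress.kubernetes.io/canary-by-header: "X-Canary"
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-v2
                port:
                  number: 80
```

Once the new version looks healthy under opt-in traffic, a weight-based canary can take over for the gradual rollout.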
gRPC Service Integration
gRPC services require HTTP/2 support. NGINX Ingress supports gRPC by setting the backend protocol annotation:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grpc-ingress
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  rules:
    - host: grpc.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grpc-service
                port:
                  number: 50051
  tls:
    - hosts:
        - grpc.example.com
      secretName: grpc-tls
Gateway API: The Future of Kubernetes Networking
The Gateway API is the successor to Ingress, providing a more expressive, role-oriented model. It introduces three resource types: GatewayClass (infrastructure provider), Gateway (cluster operator), and HTTPRoute (application developer). Gateway API offers several advantages: role separation, native protocol support (gRPC, TCP, UDP), header-based routing, and built-in traffic splitting. See the Gateway API specification for implementation details.
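The role split and built-in traffic splitting can be sketched with a Gateway owned by the cluster operator and an HTTPRoute owned by an application team. This is an illustrative example, not a reference configuration: example-class stands in for whatever GatewayClass your controller installs, and the infra and production namespaces, resource names, and weights are assumptions.

```yaml
# Cluster operator: a Gateway that accepts routes from any namespace.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: platform-gateway
  namespace: infra
spec:
  gatewayClassName: example-class  # placeholder; provided by your controller
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All
---
# Application developer: attach a route and split traffic 90/10 by weight.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-route
  namespace: production
spec:
  parentRefs:
    - name: platform-gateway
      namespace: infra
  hostnames:
    - api.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: api-v1
          port: 80
          weight: 90
        - name: api-v2
          port: 80
          weight: 10
```

Unlike the annotation-based canary on Ingress, the weights here are part of the portable API rather than a controller-specific extension.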
Testing and Validation
Test your networking configuration before deploying to production:
# Deploy a debug pod for connectivity testing
kubectl run debug --image=busybox --rm -it --restart=Never -- sh
# Test DNS resolution
nslookup api-server.default.svc.cluster.local
# Test HTTP connectivity
wget -qO- http://api-server/health
# Check all endpoints for a Service
kubectl get endpoints api-server
# Verify NetworkPolicies
kubectl get networkpolicies -n production
Common Pitfalls and Solutions
| Pitfall | Impact | Solution |
|---|---|---|
| Using hostNetwork: true | Port conflicts and security exposure | Use Services and Ingress instead |
| Not setting externalTrafficPolicy: Local | Loss of client source IP | Set on LoadBalancer Services |
| Ingress without Controller | Resources have no effect | Install Ingress Controller first |
| Missing NetworkPolicies | Unrestricted communication | Implement default deny policies |
| Too many LoadBalancer Services | High cloud costs | Use Ingress to consolidate |
Best Practices Checklist
- Use NetworkPolicies by default: Define restrictive policies that only allow explicitly permitted traffic
- Implement health checks: Configure readiness and liveness probes on all pods
- Use cert-manager for TLS: Automates issuance, renewal, and secret management
- Monitor kube-proxy and CoreDNS: Set up alerts for restarts and latency spikes
- Prefer Ingress over LoadBalancer: Consolidate external access points
- Enable topology-aware routing: Reduce cross-zone latency and costs
- Use Gateway API for new deployments: More expressive than Ingress
Conclusion
Kubernetes networking through Services, Ingress, and DNS forms the backbone of any production cluster. Services provide stable endpoints for ephemeral pods, Ingress enables HTTP routing with SSL termination, and DNS automates service discovery across namespaces.
Key takeaways: use ClusterIP for internal services, Ingress for external HTTP access, and cert-manager for automated TLS. Implement NetworkPolicies from day one, monitor CoreDNS and kube-proxy health, and consider the Gateway API for new deployments. Start with the simplest networking configuration that meets your needs, and add complexity only when the trade-offs justify it.
The networking landscape continues to evolve with eBPF-based CNIs like Cilium replacing traditional iptables, Gateway API providing more expressive routing, and service meshes offering unified security and observability. Understanding these fundamentals prepares you for adopting these advanced technologies as they mature.
For further reading, consult the official Kubernetes networking documentation, the Gateway API specification, and your CNI plugin's documentation for advanced features like network policies and encryption.
Architecture and Design Patterns
Service Types and Their Use Cases
Kubernetes offers four Service types, each designed for specific networking scenarios. Understanding when to use each type is crucial for proper application architecture.
ClusterIP is the default Service type and provides internal-only access within the cluster. When you create a ClusterIP Service, Kubernetes assigns a virtual IP from the Service CIDR range, and kube-proxy programs rules to route traffic from that IP to the backing pods. ClusterIP Services are ideal for internal microservices that don't need external access.
NodePort builds on ClusterIP by opening a specific port (30000-32767 by default) on every node in the cluster. External traffic can reach the Service by hitting any node's IP on that port. NodePort is useful for development and testing but is rarely exposed directly in production because it requires managing port assignments and node addresses yourself and provides no TLS termination.
LoadBalancer extends NodePort by provisioning an external load balancer through your cloud provider. When you create a LoadBalancer Service, the cloud controller manager creates a load balancer (like an AWS ELB or GCP Load Balancer) that routes traffic to the NodePort, which then routes to your pods. This is the standard approach for exposing services in cloud environments.
ExternalName maps a Service to a DNS name using a CNAME record. It doesn't use selectors or define ports—it simply returns a CNAME record pointing to an external service. This is useful for integrating external services into your Kubernetes service discovery without proxying traffic through the cluster.
Ingress Architecture
Ingress is a higher-level abstraction that provides HTTP and HTTPS routing to Services based on hostnames and URL paths. While Services operate at Layer 4 (TCP/UDP), Ingress operates at Layer 7 (HTTP/HTTPS), enabling advanced routing rules, SSL termination, and virtual hosting.
An Ingress resource defines routing rules, but it requires an Ingress Controller to actually implement those rules. The Ingress Controller is a reverse proxy (typically NGINX, Traefik, HAProxy, or Envoy) that watches for Ingress resources and configures itself accordingly. Without an Ingress Controller, Ingress resources have no effect.
# Example: Ingress resource with path-based routing
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /v1
            pathType: Prefix
            backend:
              service:
                name: api-v1
                port:
                  number: 80
          - path: /v2
            pathType: Prefix
            backend:
              service:
                name: api-v2
                port:
                  number: 80
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
DNS Resolution Patterns
Kubernetes DNS follows a predictable naming convention. Services are accessible at <service-name>.<namespace>.svc.cluster.local. Within the same namespace, you can use just <service-name>. Pods have DNS entries in the format <pod-ip-dashed>.<namespace>.pod.cluster.local, though direct pod DNS is rarely used since pod IPs change.
Headless Services (those with clusterIP: None) create individual DNS A records for each pod backing the Service. This is essential for StatefulSets where clients need to connect to specific pod instances, such as database primaries versus replicas.
Step-by-Step Implementation
Setting Up a Basic ClusterIP Service
Let's start by deploying an application and exposing it internally with a ClusterIP Service. This is the most common pattern for backend microservices.
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: my-api:1.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: api-server
spec:
  selector:
    app: api-server
  ports:
    - port: 80
      targetPort: 8080
      protocol: TCP
  type: ClusterIP
After applying this configuration, other pods in the cluster can reach the API server at http://api-server or http://api-server.default.svc.cluster.local. Kubernetes automatically load balances requests across the three replicas.
Configuring HTTPS with Ingress and cert-manager
For production deployments, you need SSL termination at the Ingress level. cert-manager automates certificate provisioning from Let's Encrypt.
# Install cert-manager
# kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
# ClusterIssuer for Let's Encrypt
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
---
# Ingress with automatic TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: production-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    # Rate limit: at most 100 requests per second per client IP
    nginx.ingress.kubernetes.io/limit-rps: "100"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-app
                port:
                  number: 80
  tls:
    - hosts:
        - app.example.com
      secretName: app-tls-cert
Implementing a Headless Service for StatefulSets
Database workloads require stable network identities. Headless Services with StatefulSets provide this.
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
    - port: 5432
      targetPort: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-secret
                  key: password
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
Each pod gets a DNS entry: postgres-0.postgres.default.svc.cluster.local, postgres-1.postgres.default.svc.cluster.local, etc. This allows application code to connect to specific instances for primary/replica routing.
Real-World Use Cases and Case Studies
Use Case 1: Multi-Tenant SaaS Platform
A SaaS platform serving multiple customers needs traffic isolation between tenants. Using namespace-based segmentation with NetworkPolicies and Ingress routing by subdomain provides clean isolation. Each tenant gets its own namespace, and Ingress routes <tenant>.app.com to the corresponding namespace's services. NetworkPolicies prevent cross-tenant traffic at the pod level, while ResourceQuotas ensure fair resource allocation.
Use Case 2: Canary Deployments with Ingress
Ingress controllers like NGINX support traffic splitting through annotations, enabling canary deployments. By creating a secondary Ingress resource with nginx.ingress.kubernetes.io/canary: "true" and nginx.ingress.kubernetes.io/canary-weight: "10", you can route 10% of traffic to a new version while monitoring error rates before full rollout.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"
spec:
  ingressClassName: nginx
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-v2
                port:
                  number: 80
Use Case 3: gRPC Service Mesh Integration
gRPC services require HTTP/2 support, which standard HTTP/1.1 Ingress configurations don't handle well. NGINX Ingress supports gRPC by setting nginx.ingress.kubernetes.io/backend-protocol: "GRPC" on the Ingress resource. For more advanced use cases like mutual TLS and traffic mirroring, a service mesh like Istio or Linkerd provides transparent gRPC load balancing with circuit breaking and retry capabilities.
Best Practices for Production
- Use NetworkPolicies by default: Define restrictive NetworkPolicies that only allow explicitly permitted traffic. Start with a default deny policy and add allow rules as needed.
- Implement health checks on Services: Configure readiness and liveness probes on all pods. Services only route to pods with passing readiness checks, preventing traffic from reaching unhealthy instances.
- Set resource requests and limits: Ingress controllers consume CPU and memory. Set appropriate resource requests to ensure the scheduler places them correctly and limits to prevent resource exhaustion.
- Use cert-manager for TLS automation: Manual certificate management is error-prone. cert-manager handles issuance, renewal, and secret management automatically, reducing the risk of expired certificates.
- Monitor kube-proxy and CoreDNS: These components are critical for networking. Set up alerts for kube-proxy restarts, CoreDNS latency spikes, and Service endpoint changes.
- Prefer Ingress over LoadBalancer Services: Each LoadBalancer Service provisions a separate external load balancer, which is expensive. Use a single Ingress controller with multiple Ingress resources to consolidate external access points.
- Enable topology-aware routing: For multi-zone clusters, set the service.kubernetes.io/topology-mode: Auto annotation on Services to prefer routing traffic within the same zone, reducing cross-zone latency and costs.
- Document DNS naming conventions: Maintain documentation of your service naming conventions so teams can discover and reference services consistently.
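The health-check and resource items in this list can be sketched in a single Deployment. The example below is a minimal illustration: the image, the /health path, and every number are placeholder assumptions to tune for your workload.

```yaml
# Hypothetical Deployment combining probes with resource requests and limits.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: my-api:1.0
          ports:
            - containerPort: 8080
          # Readiness gates Service traffic: failing pods are removed from endpoints.
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            periodSeconds: 10
          # Liveness restarts the container if it stops responding.
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              memory: 256Mi
```

Giving the liveness probe a longer initial delay than the readiness probe is one way to avoid restarting a container that is merely slow to start.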
Common Pitfalls and Solutions
| Pitfall | Impact | Solution |
|---|---|---|
| Using hostNetwork: true on pods | Port conflicts and security exposure | Use Services and Ingress for external access instead of host networking |
| Not setting externalTrafficPolicy: Local | Loss of client source IP on LoadBalancer Services | Set externalTrafficPolicy: Local to preserve source IPs (accept uneven load) |
| Ingress without Ingress Controller | Ingress resources have no effect | Always install an Ingress Controller before creating Ingress resources |
| DNS caching in applications | Stale endpoint resolution after scaling | Set appropriate TTL and use headless Services for stateful workloads |
| Missing NetworkPolicies | Unrestricted pod-to-pod communication | Implement default deny policies and allow only required traffic paths |
| Overriding cluster DNS with custom resolvers | Service discovery failures | Use dnsPolicy: ClusterFirst (default) unless you have specific external DNS needs |
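For the last pitfall, resolver behavior can often be tuned without abandoning cluster DNS. The sketch below keeps dnsPolicy: ClusterFirst and adjusts only a resolver option through dnsConfig; the Pod name, image, and the ndots value are assumptions for illustration.

```yaml
# Hypothetical Pod that keeps cluster DNS but lowers the ndots threshold.
apiVersion: v1
kind: Pod
metadata:
  name: dns-tuned
spec:
  dnsPolicy: ClusterFirst
  dnsConfig:
    options:
      # Names with at least this many dots are tried as absolute names first,
      # which can cut search-path lookups for external domains.
      - name: ndots
        value: "2"
  containers:
    - name: app
      image: my-app:1.0
```

Short Service names such as api-server still resolve through the search path, so in-cluster service discovery keeps working.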
Performance Optimization
Kubernetes networking introduces overhead at every layer. Understanding where latency accumulates helps you optimize effectively.
For high-throughput Services, IPVS mode outperforms iptables significantly. Switch kube-proxy to IPVS mode when you have more than 1,000 Services:
# kube-proxy ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-proxy
  namespace: kube-system
data:
  config.conf: |
    mode: "ipvs"
    ipvs:
      scheduler: "rr"
      minSyncPeriod: "0s"
      syncPeriod: "30s"
Use connection pooling in your application code to reduce the overhead of establishing new connections through the Service proxy. HTTP/2 multiplexing further reduces connection overhead by sharing a single connection across multiple requests.
// Connection pooling example with Kubernetes service discovery
// fetch is imported from undici so that the dispatcher option is typed and supported
import { Agent, fetch } from 'undici';
// Shared agent: keeps up to 10 connections per origin open and reuses them
const pool = new Agent({
  connections: 10,
  pipelining: 1,
  keepAliveTimeout: 60000,
  keepAliveMaxTimeout: 600000,
});
async function callService(serviceName: string, path: string) {
  // Resolve the Service through cluster DNS (assumes the default namespace)
  const url = `http://${serviceName}.default.svc.cluster.local${path}`;
  const response = await fetch(url, {
    dispatcher: pool,
  });
  return response.json();
}
For latency-sensitive applications, consider using Topology Aware Routing to keep traffic within the same availability zone, which avoids cross-zone hops and the latency and data-transfer costs they add.
apiVersion: v1
kind: Service
metadata:
  name: latency-sensitive-api
  annotations:
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 8080
Comparison with Alternatives
| Feature | Kubernetes Services | AWS ALB | Nginx Direct | Service Mesh |
|---|---|---|---|---|
| Layer 4 Load Balancing | Yes | No | Yes | Yes |
| Layer 7 Routing | Via Ingress | Yes | Yes | Yes |
| mTLS | No | No | Manual | Automatic |
| Traffic Splitting | Limited | Weighted | Manual | Advanced |
| Observability | Basic | CloudWatch | Logs | Full metrics |
| Complexity | Low | Low | Medium | High |
| Cost | Included | Per-LB cost | Free (OSS) | Overhead |
Kubernetes Services are the right choice for basic service discovery and load balancing. Add Ingress for HTTP routing. Consider a service mesh only when you need advanced traffic management, security policies, or observability that Services and Ingress cannot provide.
Advanced Patterns and Techniques
Ingress Path-Based Microservice Routing
A single domain can route to multiple backend services based on URL path, enabling microservice architectures behind a unified entry point:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: unified-api
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
spec:
  ingressClassName: nginx
  rules:
    - host: platform.example.com
      http:
        paths:
          - path: /auth
            pathType: Prefix
            backend:
              service:
                name: auth-service
                port:
                  number: 80
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80
          - path: /notifications
            pathType: Prefix
            backend:
              service:
                name: notification-service
                port:
                  number: 80
Service Mesh with Linkerd
For advanced traffic management without the overhead of Istio, Linkerd provides automatic mTLS, traffic splitting, and observability with minimal configuration:
# Enable Linkerd on a namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production
  annotations:
    linkerd.io/inject: enabled
---
# Traffic split for canary deployments
# Requires the SMI TrafficSplit CRD in the cluster; check which
# split.smi-spec.io API versions your Linkerd installation supports.
apiVersion: split.smi-spec.io/v1alpha4
kind: TrafficSplit
metadata:
  name: api-canary
  namespace: production
spec:
  service: api-service
  backends:
    - service: api-service-stable
      weight: 900   # 90% of traffic
    - service: api-service-canary
      weight: 100   # 10% of traffic
Testing Strategies
Test your networking configuration before deploying to production. Use a short-lived debug pod to verify DNS resolution and Service connectivity:
# Deploy a throwaway debug pod (busybox 1.28 is pinned because nslookup in some newer busybox images is unreliable)
kubectl run debug --image=busybox:1.28 --rm -it --restart=Never -- sh
# Inside the debug pod, test DNS resolution
nslookup api-server.default.svc.cluster.local
nslookup api-server
# Test HTTP connectivity
wget -qO- http://api-server/health
# Resolve a Service in another namespace
nslookup api-server.production.svc.cluster.local
# Back on your workstation, check all endpoints for a Service
kubectl get endpoints api-server
For automated testing, write integration tests that verify Service discovery and Ingress routing:
import { describe, it, expect } from 'vitest';
// Run these from a pod inside the cluster so that cluster DNS names resolve.
// INGRESS_IP is a placeholder: substitute your Ingress controller's external address.
describe('Kubernetes Networking', () => {
  it('should resolve service DNS within the same namespace', async () => {
    const response = await fetch('http://api-server/health');
    expect(response.status).toBe(200);
  });
  it('should resolve cross-namespace service DNS', async () => {
    const response = await fetch('http://api-server.production.svc.cluster.local/health');
    expect(response.status).toBe(200);
  });
  it('should route Ingress traffic based on host header', async () => {
    // Some fetch implementations ignore a custom Host header; if yours does,
    // use an HTTP client that allows overriding it.
    const response = await fetch('http://INGRESS_IP/api/v1/status', {
      headers: { Host: 'api.example.com' },
    });
    expect(response.status).toBe(200);
  });
  it('should return 404 for unknown Ingress hosts', async () => {
    // The status for unmatched hosts depends on the controller's default backend.
    const response = await fetch('http://INGRESS_IP/', {
      headers: { Host: 'unknown.example.com' },
    });
    expect(response.status).toBe(404);
  });
});
Use kubectl exec to run connectivity tests from within pods in different namespaces, and validate NetworkPolicies by attempting cross-namespace connections that should be blocked.
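When asserting Ingress routing in tests like those above, it can help to compute which backend a given path should reach. The following is a minimal sketch of pathType: Prefix matching as the Kubernetes Ingress documentation describes it (element-wise on /-separated segments, longest match wins). The names IngressPathRule, matchesPrefix, and expectedBackend are hypothetical, the rule table mirrors the unified-api Ingress from earlier, and real controllers may differ in edge cases.

```typescript
// Sketch of Ingress "Prefix" path matching for use in test expectations.
// All names here are illustrative; this is not a controller implementation.
interface IngressPathRule {
  path: string;
  service: string;
}

// Split a path into its non-empty segments, so "/auth/" and "/auth" are equal.
function pathSegments(path: string): string[] {
  return path.split('/').filter((segment) => segment.length > 0);
}

// A rule matches when each of its segments equals the request's segment
// at the same position (so "/users" does not match "/usersx").
function matchesPrefix(rulePath: string, requestPath: string): boolean {
  const rule = pathSegments(rulePath);
  const request = pathSegments(requestPath);
  if (rule.length > request.length) return false;
  return rule.every((segment, i) => segment === request[i]);
}

// Among all matching rules, the one with the most segments wins.
function expectedBackend(
  rules: IngressPathRule[],
  requestPath: string,
): string | undefined {
  let best: IngressPathRule | undefined;
  for (const rule of rules) {
    if (!matchesPrefix(rule.path, requestPath)) continue;
    if (
      best === undefined ||
      pathSegments(rule.path).length > pathSegments(best.path).length
    ) {
      best = rule;
    }
  }
  return best?.service;
}

// Mirrors the paths of the unified-api Ingress example.
const unifiedApiRules: IngressPathRule[] = [
  { path: '/auth', service: 'auth-service' },
  { path: '/users', service: 'user-service' },
  { path: '/orders', service: 'order-service' },
  { path: '/notifications', service: 'notification-service' },
];

console.log(expectedBackend(unifiedApiRules, '/users/42')); // prints "user-service"
```

The request path passed in is assumed to exclude any query string. A table of paths and expected backends built this way can drive the integration tests, so routing expectations live in one place.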
Future Outlook
Kubernetes networking continues to evolve. The Gateway API is positioned as the successor to Ingress, with a more expressive, role-oriented model. It is built around three core resource types: GatewayClass (managed by the infrastructure provider), Gateway (managed by the cluster operator), and HTTPRoute (managed by the application developer), which enables a cleaner separation of concerns. Cilium's eBPF-based networking is gaining adoption because it can bypass iptables, which can improve performance at scale. Service mesh consolidation through the Gateway API's GAMMA initiative promises to unify ingress and mesh routing under a single API model.
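To make the HTTPRoute role more concrete, the sketch below expresses path-based routing as a Gateway API HTTPRoute manifest, built as a plain object and printed as JSON. It is an assumption-laden illustration: httpRoute and GatewayPathRule are hypothetical names, shared-gateway stands in for whatever Gateway your cluster operator provides, and the Services are borrowed from the earlier Ingress example.

```typescript
// Sketch: path-based routing expressed as a Gateway API HTTPRoute.
// "httpRoute", "GatewayPathRule" and "shared-gateway" are illustrative names.
interface GatewayPathRule {
  prefix: string;
  service: string;
  port: number;
}

function httpRoute(
  name: string,
  gateway: string,
  hostname: string,
  rules: GatewayPathRule[],
) {
  return {
    apiVersion: 'gateway.networking.k8s.io/v1',
    kind: 'HTTPRoute',
    metadata: { name },
    spec: {
      // The route attaches to a Gateway owned by the cluster operator.
      parentRefs: [{ name: gateway }],
      hostnames: [hostname],
      // One rule per path prefix, each forwarding to a backend Service.
      rules: rules.map((rule) => ({
        matches: [{ path: { type: 'PathPrefix', value: rule.prefix } }],
        backendRefs: [{ name: rule.service, port: rule.port }],
      })),
    },
  };
}

const exampleRoute = httpRoute('unified-api', 'shared-gateway', 'platform.example.com', [
  { prefix: '/auth', service: 'auth-service', port: 80 },
  { prefix: '/users', service: 'user-service', port: 80 },
]);

// Pipe the output into: kubectl apply -f -
console.log(JSON.stringify(exampleRoute, null, 2));
```

Applying the result requires the Gateway API CRDs and a Gateway controller in the cluster. The application team owns only this route, while the Gateway it references stays with the operator.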
Conclusion
Kubernetes networking through Services, Ingress, and DNS forms the backbone of any production cluster. Services provide stable endpoints for ephemeral pods, Ingress enables HTTP routing with SSL termination, and DNS automates service discovery across namespaces. Understanding these primitives and their interactions is essential for building reliable, scalable applications on Kubernetes.
Key takeaways: use ClusterIP for internal services, Ingress for external HTTP access, and cert-manager for automated TLS. Implement NetworkPolicies from day one, monitor CoreDNS and kube-proxy health, and consider the Gateway API for new deployments. Start with the simplest networking configuration that meets your needs, and add complexity only when the trade-offs justify it.
For further reading, consult the official Kubernetes networking documentation, the Gateway API specification, and your CNI plugin's documentation for advanced features like network policies and encryption.