AWS ECS vs EKS: Container Orchestration Compared

Compare AWS container services: ECS, EKS, Fargate, cost, and operational complexity.

Tags: AWS, ECS, EKS, Containers

By MinhVo

Introduction

AWS offers two primary container orchestration services: ECS (Elastic Container Service) and EKS (Elastic Kubernetes Service). Both run containerized workloads at scale, but they represent fundamentally different philosophies. ECS is AWS's proprietary, opinionated container platform — simple, tightly integrated, and optimized for the AWS ecosystem. EKS is AWS's managed Kubernetes — flexible, standards-based, and backed by the massive CNCF ecosystem. Choosing between them is not just a technical decision; it's a strategic one that affects your team's workflow, your architecture's flexibility, and your infrastructure costs.

The container orchestration landscape has matured significantly. Kubernetes has become the de facto standard for container orchestration in the industry, with a vast ecosystem of tools, patterns, and best practices. But "industry standard" doesn't mean "right for every team." ECS's simplicity makes it the better choice for many teams, especially those already invested in AWS and new to containers. EKS's power and flexibility make it the better choice for teams with Kubernetes expertise, multi-cloud requirements, or complex workload scheduling needs.

This guide provides a comprehensive comparison covering architecture, networking, security, cost, operational complexity, ecosystem, and real-world decision frameworks to help you make the right choice for your organization.

Understanding the Services: Core Concepts

ECS: AWS-Native Container Orchestration

ECS is a fully managed container orchestration service that supports Docker containers. Its core concepts include:

  • Task Definition — The blueprint for your application. Specifies container images, CPU/memory requirements, port mappings, volumes, environment variables, and logging configuration.
  • Task — A running instance of a task definition. Similar to a Kubernetes Pod.
  • Service — Maintains a desired number of tasks, integrates with load balancers, and handles rolling deployments and health checks.
  • Cluster — A logical grouping of tasks or services. Can use EC2 instances or Fargate as compute.

ECS's scheduler is proprietary and optimized for AWS. It considers CPU, memory, port availability, placement constraints, and availability zone distribution when placing tasks.
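
To make those placement factors concrete, here is a toy model of the filter-then-spread idea. It is only a sketch: the real ECS scheduler is not public, and the `ContainerInstance` and `TaskRequirements` shapes and the `placeTask` function are hypothetical names invented for illustration.

```typescript
// Toy model of task placement. NOT the real ECS scheduler.
interface ContainerInstance {
  id: string;
  az: string;
  freeCpuUnits: number; // 1024 units = 1 vCPU
  freeMemoryMiB: number;
  usedHostPorts: number[];
  runningTasks: number;
}

interface TaskRequirements {
  cpuUnits: number;
  memoryMiB: number;
  hostPort?: number; // only matters when a fixed host port is requested
}

function placeTask(
  instances: ContainerInstance[],
  task: TaskRequirements,
): ContainerInstance | undefined {
  // 1. Filter: keep instances with enough CPU, memory, and a free host port.
  const candidates = instances.filter(
    (i) =>
      i.freeCpuUnits >= task.cpuUnits &&
      i.freeMemoryMiB >= task.memoryMiB &&
      (task.hostPort === undefined || i.usedHostPorts.indexOf(task.hostPort) === -1),
  );
  if (candidates.length === 0) return undefined;

  // 2. Count running tasks per availability zone across the whole cluster.
  const tasksPerAz: Record<string, number> = {};
  for (const i of instances) {
    tasksPerAz[i.az] = (tasksPerAz[i.az] ?? 0) + i.runningTasks;
  }

  // 3. Spread: prefer the least loaded AZ, then the least loaded instance.
  const sorted = [...candidates].sort(
    (a, b) =>
      (tasksPerAz[a.az] ?? 0) - (tasksPerAz[b.az] ?? 0) ||
      a.runningTasks - b.runningTasks,
  );
  return sorted[0];
}
```

In real clusters you express the same intent declaratively through placement strategies and constraints rather than writing this logic yourself.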

EKS: Managed Kubernetes

EKS runs the Kubernetes control plane and provides the standard Kubernetes API. Its core concepts mirror upstream Kubernetes:

  • Pod — The smallest deployable unit. One or more containers sharing network and storage.
  • Deployment — Manages replica sets and rolling updates.
  • Service — Stable network endpoint for a set of pods.
  • Ingress — HTTP routing and load balancing.
  • Namespace — Logical cluster partitioning for multi-tenancy.

EKS gives you the full Kubernetes API, including all upstream features, CRDs, operators, and the CNCF ecosystem.

Fargate: Serverless Compute

Fargate eliminates node management for both ECS and EKS. You specify CPU and memory requirements, and Fargate provisions the right compute. You pay only for what you use, billed per vCPU-second and GB-second.

ECS + Fargate is the most mature integration. EKS + Fargate has limitations: no DaemonSets, no privileged containers, no GPU support (Fargate has none on ECS either), and Linux/x86 only.
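
A quick way to apply that list is a pre-flight check over your workloads. The sketch below encodes only the limitations named above; `PodTraits` and `eksFargateBlockers` are hypothetical names, and the check is not an exhaustive compatibility test.

```typescript
// Workload traits relevant to the EKS-on-Fargate limitations listed above.
interface PodTraits {
  kind: "Deployment" | "StatefulSet" | "DaemonSet" | "Job";
  privileged: boolean;
  gpuCount: number;
  os: "linux" | "windows";
  arch: "amd64" | "arm64";
}

// Returns the reasons a workload cannot run on EKS Fargate.
// An empty array means none of the listed blockers apply.
function eksFargateBlockers(pod: PodTraits): string[] {
  const reasons: string[] = [];
  if (pod.kind === "DaemonSet") reasons.push("DaemonSets are not supported");
  if (pod.privileged) reasons.push("privileged containers are not supported");
  if (pod.gpuCount > 0) reasons.push("GPUs are not available");
  if (pod.os !== "linux" || pod.arch !== "amd64") {
    reasons.push("only Linux/x86 is supported");
  }
  return reasons;
}
```

Workloads that return a non-empty list need EC2-backed nodes, so a mixed cluster (Fargate plus a managed node group) is a common outcome.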

Architecture and Design Patterns

Service-Oriented Architecture

Both ECS and EKS support service-oriented architectures. Each service runs independently with its own scaling, deployment, and resource allocation. ECS services integrate with ALB for HTTP routing. EKS services use Ingress controllers (AWS Load Balancer Controller) for the same purpose.

Event-Driven Architecture

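Run event-driven workloads (SQS consumers, EventBridge processors, and similar trigger-driven jobs) as ECS tasks or Kubernetes Deployments. ECS integrates natively with EventBridge, which can launch tasks on a schedule or in response to events, and ECS services can scale on SQS queue metrics through Service Auto Scaling. EKS needs an add-on for the same behavior, typically KEDA for queue-driven scaling, optionally with the ACK EventBridge controller to manage rules from inside the cluster.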
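
Whichever platform runs the consumers, queue-driven scaling usually reduces to a backlog-per-worker calculation. The sketch below shows that arithmetic under assumed inputs; `desiredConsumers` and its parameters are hypothetical names, not an AWS or KEDA API.

```typescript
// Backlog-per-task scaling: how many consumers are needed so that each one
// holds at most `targetBacklogPerTask` visible messages.
function desiredConsumers(
  visibleMessages: number,
  targetBacklogPerTask: number,
  minTasks: number,
  maxTasks: number,
): number {
  if (targetBacklogPerTask <= 0) {
    throw new Error("targetBacklogPerTask must be positive");
  }
  const raw = Math.ceil(visibleMessages / targetBacklogPerTask);
  // Clamp to the configured bounds so an empty queue keeps a floor of workers
  // and a flood of messages cannot scale past the ceiling.
  return Math.min(maxTasks, Math.max(minTasks, raw));
}
```

With 950 visible messages and a target of 100 per task, this asks for 10 consumers. A sensible target is roughly how many messages one task can process within your acceptable latency.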

Batch and ML Workloads

Kubernetes Jobs and CronJobs (EKS) are more flexible for batch processing than the ECS equivalents, which are standalone tasks, scheduled tasks, or AWS Batch layered on top of ECS. For ML workloads, EKS supports GPU scheduling, custom resource definitions for training jobs, and integrations with Kubeflow.

Multi-Cluster and Multi-Region

EKS has better tooling for multi-cluster management (ArgoCD, Fleet, Admiralty). ECS multi-cluster setups require custom tooling. For global deployments, both services support multi-region architectures with Route 53 routing.

Step-by-Step Implementation

ECS with AWS CDK

import * as cdk from 'aws-cdk-lib';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as elbv2 from 'aws-cdk-lib/aws-elasticloadbalancingv2';
import * as logs from 'aws-cdk-lib/aws-logs';
 
export class WebAppStack extends cdk.Stack {
  constructor(scope: cdk.App, id: string) {
    super(scope, id);
 
    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 3 });
 
    const cluster = new ecs.Cluster(this, 'Cluster', {
      vpc,
      containerInsights: true,
    });
 
    const taskDef = new ecs.FargateTaskDefinition(this, 'TaskDef', {
      memoryLimitMiB: 1024,
      cpu: 512,
    });
 
    const container = taskDef.addContainer('web', {
      image: ecs.ContainerImage.fromAsset('./app'),
      logging: ecs.LogDrivers.awsLogs({ streamPrefix: 'web' }),
      environment: {
        NODE_ENV: 'production',
      },
      secrets: {
        DATABASE_URL: ecs.Secret.fromSsmParameter(
          cdk.aws_ssm.StringParameter.fromStringParameterName(this, 'DbUrl', '/prod/db-url')
        ),
      },
    });
 
    container.addPortMappings({ containerPort: 3000 });
 
    const service = new ecs.FargateService(this, 'Service', {
      cluster,
      taskDefinition: taskDef,
      desiredCount: 3,
      circuitBreaker: { enable: true, rollback: true },
    });
 
    const lb = new elbv2.ApplicationLoadBalancer(this, 'ALB', {
      vpc,
      internetFacing: true,
    });
 
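    // An HTTPS listener needs a certificate. The ARN is a placeholder; substitute your ACM certificate.
    const listener = lb.addListener('Listener', {
      port: 443,
      certificates: [
        elbv2.ListenerCertificate.fromArn('arn:aws:acm:us-east-1:123456789012:certificate/REPLACE_ME'),
      ],
    });
 
    // Attach the service to the listener once and keep the target group for request-based scaling.
    const targetGroup = listener.addTargets('Target', {
      port: 3000,
      protocol: elbv2.ApplicationProtocol.HTTP,
      targets: [service.loadBalancerTarget({ containerName: 'web', containerPort: 3000 })],
      healthCheck: { path: '/health', interval: cdk.Duration.seconds(30) },
    });
 
    const scaling = service.autoScaleTaskCount({ minCapacity: 2, maxCapacity: 20 });
    scaling.scaleOnCpuUtilization('Cpu', { targetUtilizationPercent: 70 });
    scaling.scaleOnRequestCount('Requests', { requestsPerTarget: 1000, targetGroup });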
  }
}

EKS with Terraform

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  version         = "~> 19.0"
 
  cluster_name    = "production"
  cluster_version = "1.28"
 
  vpc_id          = module.vpc.vpc_id
  subnet_ids      = module.vpc.private_subnets
 
  eks_managed_node_groups = {
    general = {
      desired_size = 3
      min_size     = 2
      max_size     = 10
 
      instance_types = ["t3.large"]
      capacity_type  = "ON_DEMAND"
    }
 
    spot = {
      desired_size = 2
      min_size     = 0
      max_size     = 20
 
      instance_types = ["t3.large", "t3a.large"]
      capacity_type  = "SPOT"
    }
  }
 
  manage_aws_auth_configmap = true
  aws_auth_roles = [
    {
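      rolearn  = "arn:aws:iam::123456789012:role/developer" # placeholder 12-digit account ID
      username = "developer"
      groups   = ["system:masters"] # full cluster admin; map developers to a narrower RBAC group in production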
    },
  ]
}

Kubernetes Manifests for EKS

# Namespace
apiVersion: v1
kind: Namespace
metadata:
  name: production
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: web-app
      containers:
        - name: web
          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/web-app:v1.2.3
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
---
# HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
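
To see what the HPA does with that 70% target, here is a simplified sketch of the replica calculation described in the Kubernetes documentation: scale the current replica count by the ratio of observed to target utilization, skip changes inside a tolerance band, then clamp to the min/max bounds. The function name and the fixed 10% default tolerance are assumptions for illustration; the real controller also accounts for unready pods, missing metrics, and stabilization windows.

```typescript
// Simplified Horizontal Pod Autoscaler calculation for a utilization target.
function hpaDesiredReplicas(
  currentReplicas: number,
  currentCpuUtilization: number, // average % of requested CPU across pods
  targetCpuUtilization: number, // averageUtilization in the manifest (70)
  minReplicas: number,
  maxReplicas: number,
  tolerance: number = 0.1,
): number {
  const ratio = currentCpuUtilization / targetCpuUtilization;
  // Inside the tolerance band the replica count is left alone to avoid flapping.
  const desired =
    Math.abs(ratio - 1) <= tolerance
      ? currentReplicas
      : Math.ceil(currentReplicas * ratio);
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}
```

Utilization is measured against each container's CPU request (250m in the Deployment above), not its limit, so 70% here means an average of about 175m per pod. With 3 replicas running at 140%, the function returns 6.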

Real-World Use Cases

Startup (10-person team, simple microservices)

Recommendation: ECS on Fargate. Small teams benefit from ECS's simplicity. Fargate eliminates node management. AWS CDK makes infrastructure as code straightforward. The team can focus on building product features instead of managing Kubernetes clusters.

Mid-size company (50-person engineering team, 20+ microservices)

Recommendation: Depends on expertise. If the team has Kubernetes experience, EKS provides more flexibility and a richer ecosystem. If the team is AWS-native with no Kubernetes experience, ECS is the pragmatic choice. Consider the team's growth trajectory — if you're hiring Kubernetes-experienced engineers, EKS may be the better long-term investment.

Enterprise (500+ engineers, multi-cloud strategy)

Recommendation: EKS. Enterprises benefit from Kubernetes's portability, multi-cloud support, and standardized tooling. EKS with GitOps (ArgoCD), policy engines (OPA/Gatekeeper), and service mesh (Istio) provides the governance and operational maturity enterprises require.

ML/AI Platform

Recommendation: EKS. ML workloads benefit from Kubernetes's GPU scheduling, custom operators (Kubeflow, KubeRay), and batch processing capabilities. ECS lacks the sophisticated scheduling and operator ecosystem that ML platforms require.
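
The four recommendations above can be condensed into a small decision function. Treat it as a conversation starter rather than a rule: the `TeamProfile` fields, the `recommend` function, and the 10-engineer cut-off (borrowed from the startup example) are assumptions made for this sketch.

```typescript
interface TeamProfile {
  engineers: number;
  hasKubernetesExperience: boolean;
  needsMultiCloud: boolean;
  runsMlWorkloads: boolean;
}

interface Recommendation {
  service: "ECS on Fargate" | "ECS" | "EKS";
  reason: string;
}

// Hypothetical cut-off for a "small team"; the startup example has 10 people.
const SMALL_TEAM_MAX = 10;

function recommend(team: TeamProfile): Recommendation {
  // Hard requirements come first: they override team size and experience.
  if (team.needsMultiCloud) {
    return { service: "EKS", reason: "portability across clouds" };
  }
  if (team.runsMlWorkloads) {
    return { service: "EKS", reason: "GPU scheduling and ML operators" };
  }
  if (team.engineers <= SMALL_TEAM_MAX) {
    return { service: "ECS on Fargate", reason: "least operational overhead" };
  }
  if (team.hasKubernetesExperience) {
    return { service: "EKS", reason: "existing expertise and richer ecosystem" };
  }
  return { service: "ECS", reason: "pragmatic choice for AWS-native teams" };
}
```

The ordering is the design choice worth debating: requirements that only Kubernetes satisfies are checked before anything about the team itself.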

Best Practices for Production

  1. Use infrastructure as code from day one — CDK for ECS, Terraform for EKS. Never make manual changes to production infrastructure.

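  2. Implement circuit breakers — ECS supports deployment circuit breakers natively. On EKS, the closest equivalent is a Deployment's progressDeadlineSeconds combined with readiness probes, which marks a stalled rollout as failed; automatic rollback needs extra tooling such as Argo Rollouts. Pod disruption budgets protect availability during node drains rather than during deployments.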

  3. Use Spot/Graviton for cost savings — ECS and EKS both support Spot instances. EKS supports mixed instance policies with Karpenter. Graviton (ARM) instances are typically up to about 20% cheaper than comparable x86 instances, provided your images are built for arm64.

  4. Implement proper observability — CloudWatch Container Insights for ECS. Prometheus + Grafana + CloudWatch for EKS. Distributed tracing with X-Ray or Jaeger.

  5. Secure the network — Use private subnets for tasks/pods. Security groups for task-level firewall rules. Network policies (EKS) for pod-level control.

  6. Manage secrets properly — AWS Secrets Manager or SSM Parameter Store for ECS. External Secrets Operator or Secrets Store CSI Driver for EKS.

  7. Implement zero-downtime deployments — ECS: rolling updates with circuit breakers. EKS: rolling updates with maxUnavailable=0 and readiness probes.

  8. Right-size your resources — Use CloudWatch metrics to identify over-provisioned tasks/pods. Implement resource requests and limits. Use VPA recommendations.
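
For the right-sizing practice in item 8, one simple approach is to derive a CPU request from observed usage: take a high percentile and add headroom. The sketch below assumes you have already exported usage samples in millicores (from CloudWatch or Prometheus); the function names, the 95th percentile, and the 20% default headroom are illustrative choices, not values from any AWS tool.

```typescript
// Nearest-rank percentile: the smallest sample with at least p% of samples
// at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Suggest a CPU request (millicores) as p95 usage plus a safety margin.
function recommendCpuRequest(
  usageMillicores: number[],
  headroom: number = 0.2,
): number {
  return Math.ceil(percentile(usageMillicores, 95) * (1 + headroom));
}
```

If the suggested value is far below the current request, the task or pod is a candidate for downsizing. Check memory separately, since memory overcommit ends in OOM kills rather than throttling.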

Common Pitfalls and Solutions

Pitfall | Impact | Solution
Choosing EKS for simplicity | Operational overhead, slower delivery | Use ECS unless Kubernetes features are needed
Not setting resource limits | OOM kills, noisy neighbors | Set requests and limits for all containers
Using public subnets for tasks | Security exposure | Use private subnets with NAT gateway
No health checks | Traffic to unhealthy containers | Configure readiness and liveness probes
Manual infrastructure changes | Configuration drift, no rollback | Use IaC exclusively
Ignoring Fargate pricing | Cost surprise | Model costs before committing to Fargate
Single-AZ deployment | No resilience | Spread tasks/pods across multiple AZs
No deployment circuit breaker | Bad deployments in production | Enable circuit breaker with automatic rollback

Cost Deep Dive

ECS Cost Model

EC2 launch type: Pay for EC2 instances. Use Reserved Instances (1-year: ~40% savings, 3-year: ~60%) or Savings Plans for steady-state workloads.

Fargate launch type: Pay per vCPU-hour ($0.04048) and GB-hour ($0.004445). No upfront commitment. Fargate Spot offers up to 70% discount for fault-tolerant workloads.
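
Modeling Fargate spend is simple arithmetic on those two rates. The sketch below uses the per-vCPU-hour and per-GB-hour figures quoted above and assumes a 730-hour month; actual prices vary by region, operating system, and CPU architecture, and the function name is hypothetical.

```typescript
// Rates quoted in the text (USD). Check the current price list for your region.
const FARGATE_VCPU_HOUR = 0.04048;
const FARGATE_GB_HOUR = 0.004445;
const HOURS_PER_MONTH = 730; // assumed average month length

// Monthly cost for `taskCount` always-on tasks of the given size.
// `spotDiscount` is a fraction, e.g. 0.7 for the "up to 70%" Fargate Spot case.
function fargateMonthlyCost(
  vcpu: number,
  memoryGb: number,
  taskCount: number,
  spotDiscount: number = 0,
): number {
  const hourlyPerTask = vcpu * FARGATE_VCPU_HOUR + memoryGb * FARGATE_GB_HOUR;
  return hourlyPerTask * HOURS_PER_MONTH * taskCount * (1 - spotDiscount);
}
```

A single task sized like the CDK example earlier (0.5 vCPU, 1 GB) comes to roughly $18 per month at these rates. The result is very sensitive to task size, so plug in your own numbers rather than relying on rounded estimates.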

EKS Cost Model

Control plane: $0.10/hour (~$73/month). This is a fixed cost regardless of cluster size.

Worker nodes: Same as ECS EC2 — pay for EC2 instances. Use Reserved Instances or Savings Plans.

Fargate: Same pricing as ECS Fargate, plus the $0.10/hour control plane cost.

Cost Comparison (10 services, 3 replicas each, t3.medium equivalent)

Option | Monthly Compute | Control Plane | Total
ECS EC2 (on-demand) | ~$600 | $0 | ~$600
ECS EC2 (1yr RI) | ~$360 | $0 | ~$360
ECS Fargate | ~$900 | $0 | ~$900
EKS EC2 (on-demand) | ~$600 | $73 | ~$673
EKS EC2 (1yr RI) | ~$360 | $73 | ~$433
EKS Fargate | ~$900 | $73 | ~$973
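
The totals in this comparison follow from two adjustments to the compute estimate: a ~40% Reserved Instance discount and a flat EKS control plane fee. The sketch below reproduces that arithmetic so you can substitute your own compute figure; the function names are hypothetical and the 730-hour month is an assumption.

```typescript
const CONTROL_PLANE_HOURLY = 0.1; // USD per hour per EKS cluster, as quoted above
const HOURS_IN_MONTH = 730; // assumed average month length

// Flat monthly fee for the EKS control plane, rounded to whole dollars.
function eksControlPlaneMonthly(clusters: number = 1): number {
  return Math.round(CONTROL_PLANE_HOURLY * HOURS_IN_MONTH * clusters);
}

// Monthly total = discounted compute + control plane (EKS only).
function monthlyTotal(
  compute: number,
  orchestrator: "ECS" | "EKS",
  riDiscount: number = 0,
): number {
  const discounted = compute * (1 - riDiscount);
  const controlPlane = orchestrator === "EKS" ? eksControlPlaneMonthly() : 0;
  return Math.round(discounted + controlPlane);
}
```

The control plane fee is charged per cluster, so running separate dev, staging, and production clusters triples it. The gap between ECS and EKS stays constant as compute grows, which is why it matters mostly for small deployments.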

Comparison Table

Feature | ECS | EKS
Control plane | Free, AWS-managed | $0.10/hr, AWS-managed
API | AWS proprietary | Kubernetes standard
Learning curve | Low | High
Ecosystem | AWS-native | CNCF, Helm, operators
Multi-cloud | AWS only | Any K8s cluster
Networking | awsvpc, Cloud Map | VPC CNI, network policies, service mesh
Deployment | Rolling, circuit breaker | Rolling, blue/green, canary (ArgoCD)
Scaling | Service Auto Scaling | HPA, VPA, Karpenter, Cluster Autoscaler
Batch/ML | Basic | Jobs, CronJobs, operators (Kubeflow)
Multi-tenancy | Basic (account-level) | Namespaces, RBAC, network policies
Observability | CloudWatch | Prometheus, Grafana, CloudWatch
Fargate support | Full | Limited
On-premises | ECS Anywhere | EKS Anywhere, Outposts

Advanced Patterns

Karpenter for EKS

Karpenter replaces the Cluster Autoscaler with a more intelligent node provisioner. It watches for unschedulable pods and launches the optimal EC2 instance type based on pod requirements. Karpenter supports spot interruption handling, consolidation (removing underutilized nodes), and multi-architecture scheduling.
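
The core idea, picking an instance type from what pending pods actually request, can be illustrated with a toy selector. This is not Karpenter's algorithm: it ignores system overhead, Spot availability, and splitting pods across several nodes, and every name and catalog entry you pass in is hypothetical.

```typescript
interface PodRequest {
  cpuMillicores: number;
  memoryMiB: number;
}

interface InstanceType {
  name: string;
  cpuMillicores: number;
  memoryMiB: number;
  hourlyCost: number; // illustrative cost units, not real prices
}

// Pick the cheapest single instance type that fits all pending pods.
function cheapestFit(
  pending: PodRequest[],
  catalog: InstanceType[],
): InstanceType | undefined {
  const cpu = pending.reduce((sum, p) => sum + p.cpuMillicores, 0);
  const memory = pending.reduce((sum, p) => sum + p.memoryMiB, 0);
  const fitting = catalog.filter(
    (t) => t.cpuMillicores >= cpu && t.memoryMiB >= memory,
  );
  if (fitting.length === 0) return undefined;
  return fitting.reduce((best, t) => (t.hourlyCost < best.hourlyCost ? t : best));
}
```

The contrast with the Cluster Autoscaler is that node groups fix the instance type up front, while this style of provisioning chooses the type per scheduling decision.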

ECS Service Connect

ECS Service Connect provides service discovery and traffic routing without a service mesh. Services discover each other via DNS names, and Service Connect handles load balancing and retries. This is simpler than App Mesh but less feature-rich.

GitOps with EKS

Use ArgoCD or Flux for GitOps-driven deployments. Push Kubernetes manifests to Git, and ArgoCD automatically syncs the cluster state. This provides audit trails, rollbacks, and declarative infrastructure management.
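
Under the hood, a GitOps controller runs a reconcile loop: compare what Git declares with what the cluster runs, then act on the difference. The sketch below shows only that diff step over simplified string-keyed manifests; `Manifests`, `SyncPlan`, and `planSync` are hypothetical names, and real controllers compare structured objects and handle ordering, health, and hooks.

```typescript
// Resource key (e.g. "Deployment/web-app") mapped to a serialized spec.
type Manifests = Record<string, string>;

interface SyncPlan {
  create: string[];
  update: string[];
  remove: string[];
}

// Compute the actions needed to make the live state match the desired state.
function planSync(desired: Manifests, live: Manifests): SyncPlan {
  const plan: SyncPlan = { create: [], update: [], remove: [] };
  for (const key of Object.keys(desired)) {
    if (!(key in live)) {
      plan.create.push(key); // declared in Git, missing from the cluster
    } else if (live[key] !== desired[key]) {
      plan.update.push(key); // drifted from what Git declares
    }
  }
  for (const key of Object.keys(live)) {
    if (!(key in desired)) {
      plan.remove.push(key); // running in the cluster, no longer in Git
    }
  }
  return plan;
}
```

Deleting resources that disappeared from Git (the `remove` list) is typically an opt-in setting, so review pruning behavior before enabling automated sync on production.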

Future Outlook

AWS is investing in both services. ECS is becoming more feature-rich (Service Connect, capacity providers, ECS Anywhere) while maintaining its simplicity advantage. EKS is becoming easier to operate (EKS Auto Mode, managed add-ons, Karpenter) while maintaining its flexibility advantage.

The most significant trend is the convergence of ECS and EKS on Fargate. If Fargate grows more capable (GPU support, which it lacks today, better performance, lower pricing), the compute layer becomes commoditized, and the choice between ECS and EKS increasingly comes down to the control plane API: AWS-native vs Kubernetes-native.

Community Resources and Further Learning

The technology landscape evolves rapidly, making continuous learning essential for maintaining expertise. Building a systematic approach to staying current with developments in your technology stack ensures you can leverage new features and avoid deprecated patterns.

Curated Learning Pathways

Rather than consuming content randomly, create structured learning pathways aligned with your current projects and career goals. Start with official documentation and specification documents, which provide the most accurate and comprehensive information. Follow this with hands-on tutorials and workshops that reinforce concepts through practical application.

Technical blogs from framework maintainers and core team members often provide deeper insights into design decisions and upcoming features. Subscribe to the official blogs of your primary frameworks and libraries to stay ahead of breaking changes and deprecation timelines.

Contributing to Open Source

Contributing to open-source projects in your technology stack provides unparalleled learning opportunities. Start with documentation improvements and bug reports, then progress to fixing small issues tagged as "good first issue" in your favorite projects. This direct engagement with maintainers and the codebase accelerates your understanding far beyond what passive learning can achieve.

# Setting up for contribution (fork the project first, then clone your fork)
git clone https://github.com/project/repository.git
cd repository
git checkout -b fix/issue-description
 
# Run the project's contribution setup.
# The npm script names below are placeholders; check the project's CONTRIBUTING guide.
npm run setup:dev
npm run test  # Ensure tests pass before making changes
 
# Make your changes, then run the full test suite
npm run test:full
npm run lint
npm run build
 
# Submit your contribution (the quoted message spans lines to add a commit body)
git add -A
git commit -m "fix: description of the fix
 
Closes #1234"
git push origin fix/issue-description

Building a Technical Knowledge Base

Maintain a personal knowledge base that captures insights, solutions, and patterns you discover during your work. Tools like Obsidian, Notion, or even a simple Markdown repository can serve as an external memory that grows more valuable over time.

Organize your notes by topic rather than chronologically, and include code examples, links to relevant documentation, and explanations of why certain approaches work better than others. When you encounter a particularly insightful article or conference talk, write a summary that captures the key takeaways and how they apply to your current projects.

Follow key conferences and their published talks to stay informed about emerging patterns and best practices. Many conferences publish recorded talks on YouTube within weeks of the event, making world-class technical content freely accessible.

Join relevant Discord servers, Slack communities, and forums where practitioners discuss real-world challenges and solutions. These communities provide early warning about emerging issues and access to collective wisdom that isn't available through formal documentation.

Mentorship and Knowledge Sharing

Teaching others is one of the most effective ways to deepen your own understanding. Consider writing technical blog posts, giving talks at local meetups, or mentoring junior developers. The process of explaining concepts to others forces you to organize your knowledge and identify gaps in your understanding.

Pair programming sessions with colleagues of different experience levels create mutual learning opportunities. Senior developers gain fresh perspectives on problems they've solved the same way for years, while junior developers benefit from exposure to production-grade thinking and decision-making processes.

Conclusion

Both ECS and EKS are production-ready container orchestration services. The choice depends on your team's expertise, your workload's complexity, and your organization's infrastructure strategy.

Key takeaways:

  1. ECS is simpler — lower learning curve, tighter AWS integration, no control plane cost
  2. EKS is more powerful — Kubernetes ecosystem, multi-cloud support, advanced scheduling
  3. Fargate eliminates node management but costs more than EC2 for steady-state workloads
  4. Use Reserved Instances or Savings Plans for predictable workloads to reduce costs by 40-60%
  5. Implement health checks, auto scaling, and zero-downtime deployments regardless of choice
  6. Use infrastructure as code (CDK for ECS, Terraform for EKS) from day one
  7. Choose based on your team's expertise and your workload's requirements, not hype

Start by assessing your team's container expertise and your workload requirements. If you need Kubernetes features (operators, CRDs, multi-cloud, advanced scheduling), choose EKS. If you want the simplest path to production containers on AWS, choose ECS. Build a proof of concept and validate your choice before committing to production infrastructure.