Cloud Run

Fully managed serverless containers — scales to zero

AWS equivalent: AWS Fargate / Lambda (container)

Tags: Serverless · Containers · PaaS

Architecture Diagram

Cloud Run — Auto-scaling Model

[Diagram: Internet traffic → Global Load Balancer (anycast IP) → Cloud Run, with traffic split between Revision 1 (90%) and a Revision 2 (10%) canary, scaling 0 → ∞; Cloud Run connects out to Cloud SQL, Cloud Storage, and Secret Manager. Caption: auto-scales to zero when idle, unlimited under load.]

AWS → GCP: Key Differences

  • Closer to Fargate than Lambda — runs any containerized workload, not just functions.

  • Scales to zero: no traffic = no cost. Fargate doesn't scale to zero by default.

  • Simpler operational model than ECS/Fargate: no clusters, no task definitions, just deploy a container image (see the sketch after this list).

  • Cloud Run also supports jobs (batch/scheduled) alongside services (HTTP).
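To make "just deploy a container image" concrete, here is a minimal deploy sketch. The service name, image path, and region are placeholders, and it assumes the image is already pushed to Artifact Registry:

    # Deploy a container image as a publicly reachable Cloud Run service.
    # One command: no cluster, no task definition, no capacity planning.
    $ gcloud run deploy my-api \
        --image asia-southeast1-docker.pkg.dev/my-project/apps/my-api:v1 \
        --region asia-southeast1 \
        --allow-unauthenticated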

📌 Key Concepts to Know

  1. Any language, any runtime — as long as it runs in a container and listens on HTTP.

  2. Concurrency model: one instance handles multiple requests simultaneously (default 80, configurable up to 1,000). Lambda handles one request per execution environment at a time. See the settings sketch after this list.

  3. Traffic splitting: deploy a new revision and split traffic 90/10 for canary deployments (sketched below).

  4. Integrates with Pub/Sub for event-driven workloads — trigger a Cloud Run service from a message via a push subscription (sketched below).

  5. VPC connector: allows Cloud Run to reach resources in your private VPC.

  6. Minimum instances: set > 0 to avoid cold starts for latency-sensitive workloads.
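Concepts 2, 5, and 6 all surface as plain service settings. A hedged sketch, with placeholder service and connector names:

    # Raise per-instance concurrency (default 80, max 1000), keep one warm
    # instance to avoid cold starts, and attach a VPC connector so the
    # service can reach private resources such as a Cloud SQL private IP.
    $ gcloud run services update my-api \
        --region asia-southeast1 \
        --concurrency 250 \
        --min-instances 1 \
        --vpc-connector my-connector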
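For concept 3, a canary rollout can be two commands. The revision names below are placeholders, and --no-traffic keeps the new revision dark until traffic is shifted explicitly:

    # Deploy revision 2 without routing any traffic to it yet.
    $ gcloud run deploy my-api \
        --image asia-southeast1-docker.pkg.dev/my-project/apps/my-api:v2 \
        --region asia-southeast1 \
        --no-traffic

    # Canary: 10% of requests to the new revision, 90% stay on the old one.
    $ gcloud run services update-traffic my-api \
        --region asia-southeast1 \
        --to-revisions my-api-00002-abc=10,my-api-00001-xyz=90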
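For concept 4, Pub/Sub pushes messages to the service's HTTPS endpoint. A minimal sketch assuming the topic and invoker service account already exist (all names and the URL are placeholders):

    # Each message published to my-topic becomes an HTTP POST to the
    # service, authenticated as a service account with roles/run.invoker.
    $ gcloud pubsub subscriptions create my-sub \
        --topic my-topic \
        --push-endpoint https://my-api-abc123-as.a.run.app/ \
        --push-auth-service-account pubsub-invoker@my-project.iam.gserviceaccount.com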

💡 DCE Interview Tips

  • Default recommendation for new APIs and microservices on GCP — serverless, no infra management.

  • For Thai enterprise customers migrating from on-prem: 'Containerize your app and deploy to Cloud Run — you go from managing servers to just deploying containers.'

  • Explain the cost model: you pay for CPU and memory in 100ms increments while requests are being processed. Idle = free (with min-instances=0). A worked example follows this list.
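A back-of-the-envelope illustration of that cost model. The per-second rates are illustrative placeholders rather than quoted prices, and the free tier and per-request fee are ignored:

    Service: 1 vCPU, 512 MiB, request-based billing, min-instances=0
    Load:    1,000,000 requests/month × 200 ms each = 200,000 busy seconds
             (assuming no overlapping requests; concurrency would lower this)

    CPU:     200,000 s × $0.000024 per vCPU-s           ≈ $4.80
    Memory:  200,000 s × 0.5 GiB × $0.0000025 per GiB-s ≈ $0.25
    Idle:    $0 (the service scales to zero between requests)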

⚠️ Common Gotchas

  • Not suitable for workloads needing persistent local disk or stateful connections — use GKE for those.

  • Cold start latency: the first request after scale-to-zero can be slow. Use minimum instances for latency-sensitive apps.

  • Max request timeout is 60 minutes, so a single request cannot run longer than an hour. For longer-running work, use Cloud Run jobs instead (see the sketch below).
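A sketch of moving long-running work to a Cloud Run job; the job name, image, and the 2-hour --task-timeout are placeholder assumptions:

    # Create a batch job whose tasks may run far longer than one request.
    $ gcloud run jobs create nightly-etl \
        --image asia-southeast1-docker.pkg.dev/my-project/apps/etl:v1 \
        --region asia-southeast1 \
        --task-timeout 2h \
        --max-retries 1

    # Run it on demand (or on a schedule via Cloud Scheduler).
    $ gcloud run jobs execute nightly-etl --region asia-southeast1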