Cloud Run

Fully managed serverless containers — scales to zero

AWS equivalent: AWS Fargate / Lambda (container)

Tags: Serverless · Containers · PaaS

Architecture Diagram

Cloud Run — Auto-scaling Model

[Diagram: Internet traffic → Global Load Balancer (anycast IP) → Cloud Run, with traffic split between Revision 1 (90%) and a Revision 2 (10%) canary, scaling 0 → ∞; Cloud Run connects out to Cloud SQL, Cloud Storage, and Secret Manager. Caption: auto-scales to zero when idle, unlimited under load.]

AWS → GCP: Key Differences

  • Closer to Fargate than Lambda — runs any containerized workload, not just functions.

  • Scales to zero: no traffic = no cost. Fargate doesn't scale to zero by default.

  • Simpler operational model than ECS/Fargate: no clusters, no task definitions, just deploy a container image (see the sketch after this list).

  • Cloud Run also supports jobs (batch/scheduled) alongside services (HTTP).
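To make "just deploy a container image" concrete, here is a minimal deploy sketch. The service name, image path, and region are placeholders, and it assumes the image is already pushed to Artifact Registry:

    # Deploy a container image as a publicly reachable Cloud Run service.
    # One command: no cluster, no task definition, no capacity planning.
    $ gcloud run deploy my-api \
        --image asia-southeast1-docker.pkg.dev/my-project/apps/my-api:v1 \
        --region asia-southeast1 \
        --allow-unauthenticated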

📌 Key Concepts to Know

  1. Any language, any runtime — as long as it runs in a container and listens on HTTP.

  2. Concurrency model: one instance handles multiple requests simultaneously (default 80, configurable up to 1,000). Lambda handles one request per execution environment at a time. See the settings sketch after this list.

  3. Traffic splitting: deploy a new revision and split traffic 90/10 for canary deployments (sketched below).

  4. Integrates with Pub/Sub for event-driven workloads — trigger a Cloud Run service from a message via a push subscription (sketched below).

  5. VPC connector: allows Cloud Run to reach resources in your private VPC.

  6. Minimum instances: set > 0 to avoid cold starts for latency-sensitive workloads.
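Concepts 2, 5, and 6 all surface as plain service settings. A hedged sketch, with placeholder service and connector names:

    # Raise per-instance concurrency (default 80, max 1000), keep one warm
    # instance to avoid cold starts, and attach a VPC connector so the
    # service can reach private resources such as a Cloud SQL private IP.
    $ gcloud run services update my-api \
        --region asia-southeast1 \
        --concurrency 250 \
        --min-instances 1 \
        --vpc-connector my-connector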
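For concept 3, a canary rollout can be two commands. The revision names below are placeholders, and --no-traffic keeps the new revision dark until traffic is shifted explicitly:

    # Deploy revision 2 without routing any traffic to it yet.
    $ gcloud run deploy my-api \
        --image asia-southeast1-docker.pkg.dev/my-project/apps/my-api:v2 \
        --region asia-southeast1 \
        --no-traffic

    # Canary: 10% of requests to the new revision, 90% stay on the old one.
    $ gcloud run services update-traffic my-api \
        --region asia-southeast1 \
        --to-revisions my-api-00002-abc=10,my-api-00001-xyz=90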
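For concept 4, Pub/Sub pushes messages to the service's HTTPS endpoint. A minimal sketch assuming the topic and invoker service account already exist (all names and the URL are placeholders):

    # Each message published to my-topic becomes an HTTP POST to the
    # service, authenticated as a service account with roles/run.invoker.
    $ gcloud pubsub subscriptions create my-sub \
        --topic my-topic \
        --push-endpoint https://my-api-abc123-as.a.run.app/ \
        --push-auth-service-account pubsub-invoker@my-project.iam.gserviceaccount.com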

💡 DCE Interview Tips

  • Default recommendation for new APIs and microservices on GCP — serverless, no infra management.

  • For Thai enterprise customers migrating from on-prem: 'Containerize your app and deploy to Cloud Run — you go from managing servers to just deploying containers.'

  • Explain the cost model: you pay for CPU and memory in 100ms increments while requests are being processed. Idle = free (with min-instances=0). A worked example follows this list.
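A back-of-the-envelope illustration of that cost model. The per-second rates are illustrative placeholders rather than quoted prices, and the free tier and per-request fee are ignored:

    Service: 1 vCPU, 512 MiB, request-based billing, min-instances=0
    Load:    1,000,000 requests/month × 200 ms each = 200,000 busy seconds
             (assuming no overlapping requests; concurrency would lower this)

    CPU:     200,000 s × $0.000024 per vCPU-s           ≈ $4.80
    Memory:  200,000 s × 0.5 GiB × $0.0000025 per GiB-s ≈ $0.25
    Idle:    $0 (the service scales to zero between requests)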

⚠️ Common Gotchas

  • Not suitable for workloads needing persistent local disk or stateful connections — use GKE for those.

  • Cold start latency: the first request after scale-to-zero can be slow. Use minimum instances for latency-sensitive apps.

  • Max request timeout is 60 minutes, so a single request cannot run longer than an hour. For longer-running work, use Cloud Run jobs instead (see the sketch below).
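A sketch of moving long-running work to a Cloud Run job; the job name, image, and the 2-hour --task-timeout are placeholder assumptions:

    # Create a batch job whose tasks may run far longer than one request.
    $ gcloud run jobs create nightly-etl \
        --image asia-southeast1-docker.pkg.dev/my-project/apps/etl:v1 \
        --region asia-southeast1 \
        --task-timeout 2h \
        --max-retries 1

    # Run it on demand (or on a schedule via Cloud Scheduler).
    $ gcloud run jobs execute nightly-etl --region asia-southeast1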