examlab .net The most efficient path to the most valuable certifications.

CodeDeploy — Blue/Green, Canary, and Rolling Deployments


DOP-C02 deep dive on CodeDeploy for EC2 and on-prem: blue/green vs in-place deployments, CodeDeployDefault configurations (AllAtOnce, OneAtATime, HalfAtATime, Canary, Linear), appspec.yml lifecycle hooks, automatic rollback triggers, and ASG integration.


CodeDeploy on EC2 is the deployment orchestrator AWS expects every DOP-C02 candidate to know cold. Unlike ECS or Lambda — where CodeDeploy mostly shifts traffic at the routing layer — EC2 deployments involve agent installation, file copy, lifecycle script execution on each host, and tight coupling with Auto Scaling groups and load balancers. The exam tests appspec.yml hook order, deployment configuration trade-offs, automatic rollback triggers, and the difference between in-place and blue/green flavours, often in scenarios where multiple choices technically work but only one minimises downtime, blast radius, or rollback time.

This guide focuses on the Pro-level mechanics: which lifecycle hooks fire in which order during in-place vs blue/green, how CodeDeploy interacts with the Auto Scaling group during a green-fleet creation, when the alarm-based rollback wins over health-check-based rollback, why deployment configurations like OneAtATime and HalfAtATime carry different operational risks, and how to debug agent failures. By the end you should be able to look at any CodeDeploy EC2 question and recognise the missing piece — usually a hook order, a service role permission, or an alarm configuration — without re-reading the stem.

Why CodeDeploy on EC2 Punches Above Its Weight on DOP-C02

The exam treats CodeDeploy as a stand-in for "deployment safety thinking". Questions are rarely about pipeline plumbing; they probe whether you understand what happens on the host during a deployment, how rollback works under partial failure, and which deployment configuration matches a stated SLA. The full power of CodeDeploy on EC2 only shows when you combine it with Auto Scaling lifecycle hooks, ELB target group health checks, and CloudWatch alarms — exactly the multi-service integration DOP-C02 prizes.

Three constraints shape CodeDeploy EC2 design choices. First, agent installation: every target instance must run the CodeDeploy agent and belong to the deployment group via tags or Auto Scaling group membership. Second, deployment configuration: the named policy (AllAtOnce, OneAtATime, HalfAtATime, or a custom configuration for other counts and percentages) governs how many instances deploy in parallel and the minimum healthy count. Third, rollback policy: a rollback can follow a deployment failure, an alarm trigger, or a manual stop, but an automatic rollback only works if a previous successful revision exists.

  • Deployment group: a logical set of target instances or Auto Scaling groups for a CodeDeploy application; targets are matched by tag or ASG name.
  • Deployment configuration: a named policy (CodeDeployDefault.AllAtOnce, OneAtATime, HalfAtATime, custom percentage) controlling parallelism and minimum healthy count.
  • In-place deployment: deploys the new revision to the existing instances, stopping and starting the application on each host.
  • Blue/green deployment: provisions a new (green) fleet, deploys the new revision there, shifts traffic, then optionally terminates the old (blue) fleet.
  • appspec.yml: a YAML file in the source bundle declaring file mappings, permissions, and lifecycle hook scripts.
  • Lifecycle hook: a named point during the deployment (e.g., BeforeInstall, ApplicationStart, ValidateService) where a script runs on the target instance.
  • CodeDeploy agent: a daemon installed on each target instance that polls for deployments and executes hooks.
  • Auto rollback: a deployment-group setting that re-deploys the last known-good revision on deployment failure or alarm trigger.
  • Original Environment: the blue (current) fleet in a blue/green deployment; Replacement Environment is the green (new) fleet.
  • Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/welcome.html

Plain-Language Explanation: CodeDeploy on EC2

EC2 deployments mix infrastructure, configuration, and orchestration in one operation. Three analogies from different domains make the mechanics stick.

Analogy 1: Restaurant Chain Menu Rollout

Picture a restaurant chain with 20 locations rolling out a new menu. In-place deployment is sending a corporate trainer to each restaurant in turn; the trainer stops the kitchen, swaps the menu, retrains the staff, and reopens. While the kitchen is shut, no orders are taken. The deployment configuration OneAtATime is the careful version — one restaurant at a time, slow but preserves capacity. HalfAtATime closes ten restaurants simultaneously to shorten the rollout. AllAtOnce closes all 20 — fastest but maximum risk.

Blue/green deployment is renting a new building next to each existing restaurant, fitting it out with the new menu and equipment, opening it for business, and only after it is humming with orders does the chain close the old building. If the new menu bombs in the first hour of the green location, customers are redirected back to the still-warm blue location instantly — that is rollback.

The appspec.yml is the rollout playbook the trainer follows: arrive, hand out new uniforms (BeforeInstall), set up the new menu boards (Install), brief the kitchen (AfterInstall), open for business (ApplicationStart), and serve the first 100 customers as a smoke test (ValidateService). Each hook is a numbered checklist item; getting the order wrong (briefing before delivering uniforms) is a common failure.

Analogy 2: Hospital Equipment Upgrade

A hospital upgrading its imaging machines deploys software in stages. In-place is patching the existing machines on weekends — slow per machine, no extra equipment cost, but radiology is offline during patching. Blue/green is bringing in new machines, certifying them in parallel, then swinging patient flow to the new machines once validated; the old machines stay online for emergency rollback for 24 hours before being decommissioned.

Lifecycle hooks are the certification protocol steps the biomedical engineer follows. BeforeInstall is unboxing and verifying serial numbers. AfterInstall is power-on diagnostics. ApplicationStart is calibration with phantoms. ValidateService is the radiologist signing the acceptance report. Each step is mandatory and ordered. Auto rollback on alarm is the hospital's safety officer pulling the new machines off the floor automatically the moment patient incidents spike — no human needed for emergency reversion.

Analogy 3: Aircraft Carrier Aircraft Swap

When an aircraft carrier replaces its fighter squadron, the air wing officer plans the transition meticulously. In-place is grounding the existing squadron for retrofitting — the carrier has no air cover during the swap. Blue/green is bringing the new squadron aboard a tender ship, qualifying them on the carrier, then transferring command authority — the old squadron stays armed and ready until the new squadron has flown three combat air patrols.

The deployment configuration is the squadron rotation policy. AllAtOnce is the wartime emergency: every aircraft swapped tonight. OneAtATime is the peacetime process: one aircraft retrofitted while 11 stand alert. HalfAtATime is the compromise — six on, six off — used for major exercises. The CloudWatch alarm-triggered rollback is the safety-board chairperson with authority to abort the rotation and restore the previous squadron the moment crash rates exceed a threshold.

The restaurant analogy maps cleanest to deployment configurations and customer experience. The hospital analogy is the right model when the exam emphasises validation hooks and acceptance gates. The carrier analogy is best for high-stakes rollback and alarm-driven safety. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-configurations.html

In-Place vs Blue/Green on EC2

The choice between in-place and blue/green is the most-asked CodeDeploy EC2 design question.

In-place stops the application on the existing instance, copies new files, runs hooks, and starts the application. Pros: cheapest (no duplicate fleet), simplest rollback (re-deploy previous revision), fastest setup. Cons: per-instance downtime during the swap, slower aggregate rollout, capacity dip during deployment.

Blue/green provisions a new ASG (or set of instances) matching the existing one, deploys the new revision there, optionally runs validation traffic, swings the load balancer to the green fleet, then waits for a configurable termination period before deleting the blue fleet. Pros: zero-downtime cut-over, instant rollback (re-route traffic to still-warm blue), better validation window. Cons: 2x compute cost during the deployment, slower setup (provisioning the green fleet takes minutes), requires load balancer integration.

The exam's heuristic: any stem mentioning "zero downtime", "fast rollback", or "validate before traffic shift" implies blue/green. Stems mentioning "cost-sensitive dev environment" or "small fleet, brief maintenance window OK" imply in-place.

Deployment Configurations in Detail

CodeDeploy ships several built-in configurations and supports custom ones.

CodeDeployDefault.AllAtOnce deploys to all instances simultaneously. Minimum healthy count is 0. Fastest rollout, highest blast radius. Suitable for dev/staging.

CodeDeployDefault.OneAtATime deploys to one instance at a time. Minimum healthy count is total - 1. Safest, slowest. Suitable for small fleets where rollout time is acceptable.

CodeDeployDefault.HalfAtATime deploys to half the fleet, waits for success, then deploys to the remaining half. Minimum healthy count is half the fleet. Balanced choice.
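As a rough model of the three built-in configurations described above, the sketch below splits a fleet into deployment batches. The function name and the batching rule for HalfAtATime (at most half the fleet per batch, rounded down) are simplifying assumptions for illustration; the real service schedules instances dynamically against the minimum healthy count rather than in fixed batches.

```python
from __future__ import annotations


def deployment_batches(fleet_size: int, config: str) -> list[int]:
    """Return batch sizes for a simplified model of a built-in configuration."""
    if fleet_size < 1:
        raise ValueError("fleet_size must be at least 1")
    if config == "AllAtOnce":
        per_batch = fleet_size            # every instance in one batch
    elif config == "OneAtATime":
        per_batch = 1                     # a single instance per batch
    elif config == "HalfAtATime":
        per_batch = max(1, fleet_size // 2)  # assumption: half, rounded down
    else:
        raise ValueError(f"unknown configuration: {config}")

    batches = []
    remaining = fleet_size
    while remaining > 0:
        size = min(per_batch, remaining)
        batches.append(size)
        remaining -= size
    return batches


if __name__ == "__main__":
    for name in ("AllAtOnce", "OneAtATime", "HalfAtATime"):
        print(name, deployment_batches(20, name))
```

For the 20-location restaurant chain from the first analogy, this model gives one batch of 20, twenty batches of 1, and two batches of 10 respectively.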

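Custom configurations are created with the CreateDeploymentConfig API using either minimumHealthyHosts (type HOST_COUNT or FLEET_PERCENT, plus a value) or, for ECS/Lambda, trafficRoutingConfig. Example: a FLEET_PERCENT value of 75, referred to in this guide as MinimumHealthyHostsPercentage 75, keeps at least 75% of instances in service and so allows up to 25% offline at once.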

For blue/green deployments, the configuration plus the Original Environment vs Replacement Environment behaviour matters: AllAtOnce shifts all traffic at once after the green fleet is ready; HalfAtATime shifts in chunks (less common but supported).

Match the configuration to the deployment SLA. AllAtOnce minimises rollout time at the cost of capacity. OneAtATime minimises capacity loss at the cost of rollout time. The exam frequently asks "the team wants 75% capacity throughout the deployment, which configuration?" — the answer is a custom config with MinimumHealthyHostsPercentage 75, not any default. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-configurations.html
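The arithmetic behind a percentage-based minimum is easy to get wrong on small fleets. The helper below is a minimal sketch, assuming fractional instances are rounded up when the percentage is converted to a host count; the function names are hypothetical and not part of any AWS SDK.

```python
def min_healthy_hosts(fleet_size: int, percent: int) -> int:
    """Hosts that must stay healthy, rounding fractional instances up."""
    if fleet_size < 0 or not 0 <= percent <= 100:
        raise ValueError("fleet_size must be >= 0 and percent within 0..100")
    # Integer ceiling division avoids floating-point surprises.
    return (fleet_size * percent + 99) // 100


def max_offline(fleet_size: int, percent: int) -> int:
    """Instances that may be out of service at the same time."""
    return fleet_size - min_healthy_hosts(fleet_size, percent)


if __name__ == "__main__":
    for size in (4, 6, 20):
        print(size, min_healthy_hosts(size, 75), max_offline(size, 75))
```

Under that assumption, a 75% minimum lets 5 of 20 instances go offline but only 1 of 6, because 4.5 healthy hosts rounds up to 5.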

appspec.yml Lifecycle Hook Order

The appspec.yml file declares hook scripts that run at named lifecycle events. Hook order is fixed and tested heavily.

For in-place deployments, the hooks execute in this sequence on each instance:

  1. ApplicationStop — stops the running application (skipped on first deploy).
  2. DownloadBundle (managed by CodeDeploy agent, not user-scriptable).
  3. BeforeInstall — runs before files are copied; install dependencies, set up directories.
  4. Install (managed by CodeDeploy agent, copies files per the files section).
  5. AfterInstall — runs after files are copied; configure files, set permissions.
  6. ApplicationStart — starts the application.
  7. ValidateService — runs smoke tests; non-zero exit fails the deployment for this instance.
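The sequence above can be encoded as data to check a parsed hooks section before shipping a bundle. This is a sketch only: the function names are hypothetical, the input is assumed to be the hooks mapping already loaded from appspec.yml, and only the seven in-place events listed above are modelled.

```python
from __future__ import annotations

# Fixed in-place order, as listed above.
IN_PLACE_EVENTS = [
    "ApplicationStop",
    "DownloadBundle",
    "BeforeInstall",
    "Install",
    "AfterInstall",
    "ApplicationStart",
    "ValidateService",
]

# Events run by the agent itself; scripts cannot be attached to them.
AGENT_MANAGED = {"DownloadBundle", "Install"}


def scriptable_events() -> list[str]:
    """Events from the list above that accept user scripts."""
    return [e for e in IN_PLACE_EVENTS if e not in AGENT_MANAGED]


def check_hooks(hooks: dict) -> list[str]:
    """Return human-readable problems found in a parsed hooks mapping."""
    problems = []
    for event in hooks:
        if event in AGENT_MANAGED:
            problems.append(f"{event} is agent-managed and cannot run scripts")
        elif event not in IN_PLACE_EVENTS:
            problems.append(f"{event} is not in the in-place list modelled here")
    return problems


def execution_order(hooks: dict) -> list[str]:
    """Order in which declared hooks run, regardless of order in the file."""
    return [e for e in IN_PLACE_EVENTS if e in hooks and e not in AGENT_MANAGED]
```

Note that execution_order ignores the order in which events appear in the file: the lifecycle sequence is fixed, so listing ValidateService before BeforeInstall in appspec.yml does not change when each one runs.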

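When the deployment group uses a load balancer, traffic hooks wrap this sequence:

  • BeforeBlockTraffic, BlockTraffic, AfterBlockTraffic: run while an instance is deregistered from the load balancer. In-place deployments run them on each instance before ApplicationStop; blue/green deployments run them on the blue (original) instances.
  • BeforeAllowTraffic, AllowTraffic, AfterAllowTraffic: run while an instance is registered with the load balancer. In-place deployments run them on each instance after ValidateService; blue/green deployments run them on the green (replacement) instances.

BlockTraffic and AllowTraffic are managed by the CodeDeploy agent and are not user-scriptable.

The exam tests verbatim hook names and order. Common traps: confusing ValidateService (runs right after the application starts) with BeforeAllowTraffic (runs later, just before the instance is registered with the load balancer); placing application configuration in ApplicationStart instead of AfterInstall.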

On the first deployment to an instance, ApplicationStop does not run because there is no previously deployed revision, and the ApplicationStop scripts always come from the last successfully deployed revision rather than the incoming one. Cleanup that lives only in ApplicationStop is therefore silently skipped on first deploy. The fix is to make BeforeInstall idempotent: clean state explicitly there and do not rely on ApplicationStop. The exam loves this gotcha. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/reference-appspec-file-structure-hooks.html
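One way to make BeforeInstall idempotent is to have the hook script reset its working directory itself, whether or not an earlier revision left anything behind. The sketch below assumes the hook is written in Python; the function name is hypothetical, and the demo uses a temporary directory rather than a real install path.

```python
import shutil
import tempfile
from pathlib import Path


def reset_directory(target: Path) -> None:
    """Remove the directory if it exists, then recreate it empty.

    Safe to call on a first deploy (nothing to remove) and on every
    later deploy (old files are cleared), so it does not depend on
    ApplicationStop having run.
    """
    shutil.rmtree(target, ignore_errors=True)
    target.mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    # Demo against a throwaway location; a real hook would pass the
    # application's own directory instead.
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = Path(tmp) / "app"
        reset_directory(app_dir)                  # first deploy: nothing there yet
        (app_dir / "stale.txt").write_text("old")
        reset_directory(app_dir)                  # later deploy: stale file removed
        print(sorted(p.name for p in app_dir.iterdir()))
```

Running the same call twice leaves the directory in the same empty state, which is the property the hook needs.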

Auto Rollback Policy

Auto rollback re-deploys the last known-good revision when triggered by:

  1. Deployment failure: the deployment as a whole fails, typically because failed instances push the fleet below the configuration's minimum healthy count.
  2. Alarm trigger: a CloudWatch alarm associated with the deployment group enters ALARM state during the deployment.
  3. Manual stop: an operator stops the deployment in the console or via the API.

Configure auto rollback in the deployment group: autoRollbackConfiguration.enabled: true, events: [DEPLOYMENT_FAILURE, DEPLOYMENT_STOP_ON_ALARM, DEPLOYMENT_STOP_ON_REQUEST], plus alarmConfiguration listing CloudWatch alarms.
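The settings named above can be assembled as a plain dictionary before being handed to an SDK call. The helper below is a sketch: the function name and the alarm names in the demo are hypothetical, and the boto3 call is shown only in a comment because it needs credentials and an existing deployment group.

```python
from __future__ import annotations


def rollback_settings(alarm_names: list[str]) -> dict:
    """Build the auto rollback and alarm settings for a deployment group."""
    return {
        "autoRollbackConfiguration": {
            "enabled": True,
            "events": [
                "DEPLOYMENT_FAILURE",
                "DEPLOYMENT_STOP_ON_ALARM",
                "DEPLOYMENT_STOP_ON_REQUEST",
            ],
        },
        "alarmConfiguration": {
            # Alarm monitoring is only switched on when alarms are listed.
            "enabled": bool(alarm_names),
            "ignorePollAlarmFailure": False,
            "alarms": [{"name": name} for name in alarm_names],
        },
    }


if __name__ == "__main__":
    settings = rollback_settings(["example-tg-5xx-rate"])  # hypothetical alarm
    print(settings)
    # With boto3 installed and credentials configured, the dictionary could be
    # passed along these lines (application and group names are placeholders):
    #   boto3.client("codedeploy").update_deployment_group(
    #       applicationName="example-app",
    #       currentDeploymentGroupName="example-group",
    #       **settings,
    #   )
```

Keeping the settings in one function makes it easy to apply the same rollback policy to every deployment group in a pipeline.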

The alarm-based rollback is the powerful Pro-tier feature: tie an alarm on 5xx error rate from ALB target group to the deployment group, and a deployment that breaks production health auto-rolls back without operator intervention. This is the canonical "deployment safety" answer on the exam.

A subtle constraint: rollback is implemented as a new deployment of the previous revision, not a transactional reverse of the failed one. It runs through all hooks again, takes time, and can itself fail. For blue/green, rollback is faster — it just keeps the blue fleet alive and reverts the load balancer.

Auto Scaling Group Integration

When the deployment target is an ASG, CodeDeploy hooks into ASG lifecycle to handle scale-out events.

For in-place deployments to an ASG, CodeDeploy installs a lifecycle hook on the ASG: when ASG scales out, the new instance enters Pending:Wait, CodeDeploy detects it, deploys the current revision, then the instance enters InService. This guarantees newly launched instances run the same revision as the rest of the fleet.

For blue/green deployments, CodeDeploy provisions a new ASG matching the original (same launch template, same desired capacity, same subnets). The new ASG is registered with the load balancer's target group; the original is deregistered. After a configurable wait period, the original ASG is terminated.

A common DOP-C02 pattern: combining ASG scaling policies with CodeDeploy means new instances during a scale event automatically receive the current application revision — no human intervention. Make sure the ASG's launch template references an AMI that has the CodeDeploy agent pre-installed, or include agent install as user-data.

CloudWatch Alarm Integration for Deployment Safety

Tie deployment safety to alarms via three knobs in the deployment group: alarmConfiguration.enabled, alarmConfiguration.alarms[], and alarmConfiguration.ignorePollAlarmFailure.

When alarms are armed, CodeDeploy checks them at the start of each deployment and during execution. If any listed alarm enters ALARM, the deployment stops and (if auto rollback is enabled) re-deploys the previous revision.
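The interaction between alarm state and the rollback setting can be summarised as a small decision function. This is a deliberately simplified model for study purposes, not the service's actual logic; the function name and return labels are hypothetical.

```python
from __future__ import annotations


def next_action(
    hooks_succeeded: bool,
    alarm_states: dict[str, str],
    rollback_events: set[str],
) -> str:
    """Decide what a deployment does next in this simplified model.

    alarm_states maps alarm name to its CloudWatch state ("OK", "ALARM",
    "INSUFFICIENT_DATA"); rollback_events is the set of events for which
    auto rollback is enabled on the deployment group.
    """
    if any(state == "ALARM" for state in alarm_states.values()):
        trigger = "DEPLOYMENT_STOP_ON_ALARM"
    elif not hooks_succeeded:
        trigger = "DEPLOYMENT_FAILURE"
    else:
        return "CONTINUE"
    # The deployment stops either way; rollback follows only if enabled
    # for the event that stopped it.
    return "STOP_AND_ROLLBACK" if trigger in rollback_events else "STOP"


if __name__ == "__main__":
    events = {"DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"}
    print(next_action(True, {"5xx": "OK"}, events))
    print(next_action(True, {"5xx": "ALARM"}, events))
    print(next_action(True, {"5xx": "ALARM"}, set()))
```

The second demo call is the case that matters most on the exam: every hook succeeded, yet the deployment is stopped and rolled back because an alarm fired.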

Best-practice alarms: 5xx rate on the ALB target group, application-emitted custom metrics (request latency, error count), CPU utilisation crossing capacity thresholds, queue depth on SQS for async workloads. Combine with CloudWatch Synthetics canaries that run hourly against staging: a failed canary triggers an alarm, and the alarm aborts the production deploy.

The alarms listed in alarmConfiguration.alarms[] must exist in the same region as the deployment group. Cross-region alarms are not supported. For multi-region deployments, configure region-local alarms in each deployment group. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployments-rollback-and-redeploy.html

Common Trap Patterns

Trap one: confusing in-place rollback (re-deploy old revision, slow) with blue/green rollback (re-route traffic, fast). Stems mentioning "rollback in seconds" imply blue/green.

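Trap two: forgetting that the CodeDeploy agent must be running on every target. Instances launched from an AMI without the agent never pick up the deployment; the only symptom is a deployment that stalls and eventually fails, with no hook output to debug.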

Trap three: placing configuration logic in ApplicationStart instead of AfterInstall. By the time ApplicationStart runs, the application is starting; if config is wrong, the app crashes before ValidateService.

Trap four: assuming OneAtATime means one second per instance; it means one instance at a time, but each instance still runs the full hook lifecycle (potentially minutes).

Trap five: enabling auto rollback without ever deploying a successful revision; the very first deployment cannot roll back because there is no prior known-good revision.

Auto rollback re-deploys the previous successful revision. Before the first successful deployment there is no previous revision, so a failed first deployment cannot roll back and leaves the application in a broken state. Always use a known-good baseline as the very first deployment to a new deployment group. The exam tests this with stems describing "the first canary failed and instances are now broken". Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/deployments-rollback-and-redeploy.html
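The rule reduces to a lookup over the deployment history. The sketch below is a minimal model, assuming history is kept as (revision, status) pairs ordered oldest first; the function name and status strings are illustrative rather than taken from the CodeDeploy API.

```python
from __future__ import annotations


def rollback_target(history: list[tuple[str, str]]) -> str | None:
    """Return the most recent successful revision, or None if there is none."""
    for revision, status in reversed(history):
        if status == "Succeeded":
            return revision
    return None


if __name__ == "__main__":
    # First deployment failed: nothing to roll back to.
    print(rollback_target([("rev-1", "Failed")]))
    # A baseline exists: the failed rev-2 rolls back to rev-1.
    print(rollback_target([("rev-1", "Succeeded"), ("rev-2", "Failed")]))
```

A return value of None is the "first deployment failed" scenario from the exam stems: auto rollback is enabled but has nothing to deploy.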

End-to-End Deployment Pattern

A canonical DOP-C02 EC2 deployment pipeline assembles like this. Source pulls from CodeCommit. Build in CodeBuild produces a deployment bundle (zip with app files plus appspec.yml). Deploy to staging using CodeDeployDefault.OneAtATime against an ASG with 4 instances; alarm config monitors target-group 5xx rate. Approval action pauses for release-manager signoff. Production deploy uses blue/green with CodeDeployDefault.AllAtOnce, shifting all traffic at once after green-fleet validation; auto rollback is armed against four production alarms (5xx, p99 latency, CPU, queue depth).

Memorise this shape — it is the reference architecture the exam keeps asking permutations of.

For any CodeDeploy EC2 question, anchor on these four pieces:

  1. Deployment type: in-place (cheaper, slower rollback) or blue/green (zero-downtime, fast rollback).
  2. Deployment configuration: AllAtOnce / OneAtATime / HalfAtATime / custom MinimumHealthyHostsPercentage.
  3. appspec.yml hook order: ApplicationStop → BeforeInstall → AfterInstall → ApplicationStart → ValidateService (plus the BlockTraffic and AllowTraffic hooks when a load balancer is attached).
  4. Auto rollback: triggered by deployment failure or CloudWatch alarm, requires a previous successful revision.

Any CodeDeploy EC2 question maps to one of these four. Reference: https://docs.aws.amazon.com/codedeploy/latest/userguide/welcome.html

Common Exam Traps

  1. First deployment cannot auto-rollback: auto rollback re-deploys a previous successful revision; without one, a failed first deployment leaves instances broken.
  2. Confusing ValidateService with BeforeAllowTraffic: ValidateService runs after ApplicationStart (per-instance smoke test); BeforeAllowTraffic runs later, just before the instance is registered with the load balancer, and only applies when a load balancer is configured.
  3. CodeDeploy agent not installed in custom AMI: without the agent the instance never receives deployments; pre-install in the AMI or use user-data.
  4. MinimumHealthyHosts percentage rounding: CodeDeploy rounds fractional instances up. On a 4-instance fleet, MinimumHealthyHostsPercentage 75 requires exactly 3 healthy hosts and allows 1 offline; on a 6-instance fleet, 75% is 4.5, which rounds up to 5 healthy, so still only 1 instance (about 17%) can be offline, not 25%.
  5. Alarm cross-region mismatch: alarms listed in alarmConfiguration.alarms[] must be in the same region as the deployment group; cross-region alarms are not supported.

FAQ

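Q1: When does CodeDeploy choose blue/green over in-place if both are configured? It never chooses. The deploymentStyle setting (IN_PLACE or BLUE_GREEN) belongs to the deployment group, and every deployment to that group uses it. There is no automatic per-deployment toggle; to run both styles against the same application, update the group's deploymentStyle between deployments or, more commonly, keep a separate deployment group for each style.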

Q2: Can a single deployment group target both EC2 instances and an ASG? Yes. Tag-based targeting matches standalone EC2 instances and ASG-launched instances alike, as long as they carry the tag. A deployment group can also list autoScalingGroups and ec2TagFilters together, in which case CodeDeploy targets the union of the two.

Q3: How do I keep the blue fleet running for 24 hours after a successful blue/green cut-over? Set the deployment group's terminateBlueInstancesOnDeploymentSuccess.action to TERMINATE and terminationWaitTimeInMinutes to 1440 (24 hours). CodeDeploy deregisters the blue fleet from the load balancer but leaves it running for the wait period, then terminates it; if rollback is triggered within that window, the blue fleet is re-registered. Setting action to KEEP_ALIVE instead leaves the deregistered blue instances running indefinitely, and cleaning them up becomes your job.

Q4: What is the difference between deployment failure rollback and alarm-based rollback? Deployment-failure rollback fires when CodeDeploy itself reports the deployment as failed (hook script exited non-zero, instance unreachable, etc.). Alarm-based rollback fires when a tied CloudWatch alarm enters ALARM during the deployment, even if all hooks succeeded — capturing application-layer failures CodeDeploy cannot detect directly.

Q5: How does CodeDeploy on EC2 differ from CodeDeploy on ECS? On EC2, CodeDeploy executes hook scripts directly on the host via the agent and copies files to the instance filesystem. On ECS, CodeDeploy creates a replacement task set from the new task definition and shifts load balancer traffic to it; no agent runs in the task. CodeDeploy on ECS supports only blue/green; EC2 supports both in-place and blue/green.

Q6: Can I use CodeDeploy without an Auto Scaling Group? Yes. Tag-based deployment groups target standalone EC2 instances. The trade-off is no automatic deployment of the current revision to newly-launched instances — if you spin up a fresh instance with the tag, you must explicitly trigger a deployment for it.

Q7: Why does my deployment hang in Waiting state? Most common causes: (1) the CodeDeploy agent is not running on the target instances, (2) the instance profile lacks read access to the S3 bucket holding the revision, (3) outbound network access from the instance to the CodeDeploy and S3 endpoints is blocked, (4) an ASG lifecycle hook left the instance in Pending:Wait indefinitely.

Q8: How do I orchestrate a multi-region rollout safely with CodeDeploy? Per-region deployment groups, one CodePipeline stage per region with parallel deploy actions, alarm-based auto rollback in each region, plus a top-level EventBridge rule that detects any-region failure and triggers Step Functions to roll back already-completed regions. This is the Pro-tier multi-region pattern.
