examlab.net · The most efficient path to the most valuable certifications.

VPC Flow Logs, Reachability Analyzer, and Traffic Mirroring

5,400 words · ≈ 27 min read

ANS-C01 Domain 3.2 deep dive into VPC Flow Logs (v2-v7 fields, ACCEPT/REJECT, Athena partitioning), Reachability Analyzer path simulation, Network Access Analyzer trusted-access scopes, VPC Traffic Mirroring with Nitro source ENIs, mirror filters and sessions, and the trap-rich choices between metadata, packet capture, and reachability simulation.


When the AWS Certified Advanced Networking — Specialty exam (ANS-C01) reaches Domain 3 task 3.2, "Monitor and analyze network traffic to troubleshoot and optimize connectivity patterns", it stops asking abstract design questions and starts handing you a broken network with three or four possible diagnoses. Three tools dominate that diagnostic surface: VPC Flow Logs for connection metadata at every ENI, VPC Reachability Analyzer for path simulation through the VPC graph without sending real packets, and VPC Traffic Mirroring for full-packet copies routed to an out-of-band analyzer. The exam expects you to recognise which one fits which symptom, what each cannot see, and the precise trap-rich limits — Nitro-only mirror sources, simulation-not-live for Reachability Analyzer, and the long list of traffic VPC Flow Logs silently exclude.

This topic covers task 3.2 in its entirety with the depth a Specialty exam demands. We walk through Flow Log architecture (capture types, destinations, v2 through v7 record formats, extended fields like pkt-srcaddr, flow-direction, and traffic-path); how to query Flow Logs with Athena and CloudWatch Logs Insights; what Reachability Analyzer actually simulates (and the components it cannot evaluate); the differences between Reachability Analyzer, Network Access Analyzer, and Transit Gateway Route Analyzer (a frequent ANS-C01 distractor cluster); Traffic Mirroring's source/target/filter/session model; the Nitro requirement; and the exam's symptom-to-tool mappings. Throughout, the focus is the kind of multi-line scenario the exam writes — "engineers see intermittent SSH timeouts on instance X but only from subnet Y at peak load" — and the question of which tool answers it fastest.

Why Network Monitoring Dominates ANS-C01 Domain 3

Domain 3 (Network Management and Operation) is 20 percent of the ANS-C01 exam — roughly 13 of the 65 questions, and roughly half of those land in task statement 3.2. That single task names CloudWatch, VPC Flow Logs, VPC Traffic Mirroring, Reachability Analyzer, and Transit Gateway Network Manager in the knowledge bullets, and demands you analyse tool output, map topology, identify packet shaping issues, troubleshoot misconfigurations, verify network design, and automate verification. Each one of those skill bullets is a question the exam can write.

Where the Security Specialty (SCS-C02) treats VPC Flow Logs as a SIEM input feeding GuardDuty, the Advanced Networking Specialty (ANS-C01) treats it as a network engineer's primary diagnostic — the equivalent of tcpdump on a real network plus the configuration audit trail. The exam version of "is this packet allowed by the SG and the NACL?" is a Reachability Analyzer question. The exam version of "what HTTP request body triggered the 500 error?" is a Traffic Mirroring question. The exam version of "what does the byte distribution between this NAT Gateway and these spoke VPCs look like over the last 24 hours?" is a Flow Logs + Athena question. Knowing the right answer is half the battle; knowing why the other three answers do not work is the other half.

Plain-Language Explanation: VPC Flow Logs, Reachability Analyzer, and Traffic Mirroring

The three tools live at different layers of network diagnostic abstraction, and three analogies help.

Analogy 1: Building Security CCTV, Architectural Drawing, and Forensic Wiretap

Think of a corporate building's security infrastructure. VPC Flow Logs are the lobby CCTV log — they record who entered and left, when, and through which door, with timestamps and door numbers, but no audio and no face-recognition payload. The footage is cheap to keep for months and indispensable for "who was here yesterday between 2 and 3 AM?" but useless for "what did they say in the meeting?". VPC Reachability Analyzer is the building's architectural drawing reviewer — given the blueprint of doors, locks, badge readers, security gates, and tenant lists, the reviewer can tell you "if a person from Office 4B tried to walk to Office 7C carrying badge type X, would they get stopped?" without sending an actual person down the hallway. The reviewer reads the configuration, follows the rules, and reports the verdict. VPC Traffic Mirroring is the forensic wiretap — when you suspect a specific door is being used for something illegal, you tap a duplicate audio/video feed on that door to a separate forensic office for full payload analysis, while the door continues to operate normally. The wiretap is expensive, targeted, and time-bounded; the CCTV is broad, cheap, and continuous; the architectural review is offline and exhaustive.

Analogy 2: Hospital Logs, Floor Plan, and Patient Audio Recording

A hospital's diagnostic stack maps cleanly. Flow Logs are the patient admission/discharge ledger — names, times, ward IDs, accept/reject from the triage desk. Reachability Analyzer is the hospital floor plan with the access badge rules overlaid — given any pair of rooms, can a specific staff badge get from A to B without being denied at any door? The floor plan does not test the badge against the live system; it reads the rules and reports the path. Traffic Mirroring is the bedside audio recorder that duplicates patient consultations to an external transcription service for full-content review while the consultation continues uninterrupted.

Analogy 3: Highway Traffic — Toll Booth Records, Map Routing Software, Vehicle Dashcam

A highway authority uses three tools. Flow Logs are the toll booth records — every vehicle that crossed, when, plate number, lane, paid or rejected; great for traffic volume analysis and reconciling tolls, but no information about who was inside the cars. Reachability Analyzer is the map routing software with road-closure data — given any two cities, can a specific vehicle type (truck, bus, motorcycle) actually drive between them given current closures, height limits, and weight limits? The map answers without dispatching a vehicle. Traffic Mirroring is the dashcam recording from a specific vehicle, copied off to a forensic lab — full visual and audio of the journey, expensive to enable on every vehicle, used selectively when something needs investigating.

For ANS-C01, the CCTV / architectural review / forensic wiretap triad is the highest-yield mental model — it captures both the cost asymmetry (cheap, free, expensive) and the abstraction layer (metadata, configuration, payload). When a question asks about cost or always-on telemetry, think CCTV (Flow Logs). When it asks about path verification or pre-deployment design audit, think architectural review (Reachability Analyzer). When it asks about decoding a specific payload or matching a Suricata signature, think forensic wiretap (Traffic Mirroring). Reference: https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html

VPC Flow Logs Architecture

VPC Flow Logs capture metadata about IP traffic flowing through ENIs. They are the foundational telemetry of every AWS network engineer's toolbox, the data feed for Amazon GuardDuty, Amazon Detective, AWS Network Access Analyzer findings, and most third-party SIEMs.

Scope: ENI, subnet, or VPC

A Flow Log is created at one of three levels. ENI-level captures one specific elastic network interface — used when investigating a single instance. Subnet-level captures every ENI in a subnet — the practical default for security audit. VPC-level captures every ENI in every subnet of the VPC — exhaustive and the AWS Security Reference Architecture default for centralised collection.

Flow Logs are not retroactive: enabling them now captures from now forward. There is no historical replay. For incident response, a "Flow Logs always on at VPC level" baseline is mandatory. Flow Logs cost per ingested gigabyte plus standard storage; turning them off in lower environments to save money is a frequent operational mistake that breaks forensic capability when needed.

Capture types: ACCEPT, REJECT, ALL

A Flow Log can capture ACCEPT records (allowed traffic), REJECT records (blocked by SG or NACL), or ALL. ACCEPT is the volume — most flows in a healthy VPC. REJECT is the signal — port scans, misconfigured clients, IAM-locked principals attempting access, lateral-movement attempts. The exam-default expectation is ALL: you need both for security and for performance baselining.

Destinations: CloudWatch Logs, S3, Kinesis Firehose

Three destinations. CloudWatch Logs for near-real-time interactive querying with Logs Insights and CloudWatch alarms on metric filters. S3 for cheap long-term storage and Athena queries — the right destination for organisation-wide aggregation and ad-hoc analytics. Kinesis Data Firehose for streaming into a SIEM, data lake, or third-party log analytics pipeline.

S3 destinations support Hive-compatible partitioning (year=YYYY/month=MM/day=DD/hour=HH/) and Parquet format as alternatives to plain text. Together they can cut Athena query cost by an order of magnitude for selective queries: partition pruning means Athena scans only the relevant partitions, and the columnar Parquet layout reduces the bytes scanned within them.
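The date portion of the Hive-compatible key layout is straightforward to generate or parse. A minimal sketch — note the full S3 key also carries account, service, and region components ahead of the date partitions:

```python
from datetime import datetime, timezone

def hive_partition_prefix(dt: datetime) -> str:
    """Build the Hive-compatible date-partition suffix used in
    Flow Logs S3 keys when hive-compatible prefixes are enabled."""
    return dt.strftime("year=%Y/month=%m/day=%d/hour=%H/")

# Hypothetical delivery window:
prefix = hive_partition_prefix(datetime(2026, 5, 2, 10, tzinfo=timezone.utc))
# prefix == "year=2026/month=05/day=02/hour=10/"
```

Because each partition column appears as `name=value` in the key, Athena (or any Hive-compatible engine) can map S3 prefixes directly to partition values.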

Record format versions: v2 through v7

Each version adds fields without removing the prior ones.

  • v2 — original 14 fields: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status.
  • v3 — adds vpc-id, subnet-id, instance-id, tcp-flags, type (IPv4 vs IPv6), pkt-srcaddr, pkt-dstaddr (the actual packet src/dst before NAT, contrasting with srcaddr/dstaddr which can be the NAT-translated addresses).
  • v4 — adds region, az-id, sublocation-type, sublocation-id.
  • v5 — adds pkt-src-aws-service, pkt-dst-aws-service, flow-direction, traffic-path. The traffic-path field encodes the egress path (1=through another resource in the same VPC, 2=through an internet gateway or a gateway VPC endpoint, 3=through a virtual private gateway, 4=through an intra-region VPC peering connection, 5=through an inter-region VPC peering connection, 6=through a local gateway, 7=through a gateway VPC endpoint, 8=through an internet gateway).
  • v7 — adds ecs-cluster-arn, ecs-cluster-name, ecs-container-instance-arn, ecs-container-instance-id, ecs-container-id, ecs-second-container-id, ecs-service-name, ecs-task-definition-arn, ecs-task-arn, ecs-task-id — container-level annotation for ECS workloads.
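The traffic-path codes are easiest to keep straight as a lookup table. A minimal sketch, with descriptions following the VPC Flow Logs field documentation:

```python
# traffic-path codes per the VPC Flow Logs field reference.
TRAFFIC_PATH = {
    1: "through another resource in the same VPC",
    2: "through an internet gateway or a gateway VPC endpoint",
    3: "through a virtual private gateway",
    4: "through an intra-region VPC peering connection",
    5: "through an inter-region VPC peering connection",
    6: "through a local gateway",
    7: "through a gateway VPC endpoint (Nitro-based instances)",
    8: "through an internet gateway (Nitro-based instances)",
}

def describe_traffic_path(code: int) -> str:
    """Human-readable description of a traffic-path field value."""
    return TRAFFIC_PATH.get(code, f"unknown traffic-path code {code}")
```

Useful when enriching exported records before loading them into a dashboard or SIEM.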

Pick the newest format (v7) when you need full feature coverage — the cost per ingested byte is identical across versions. Older v2 logs are sufficient only when you are constrained by an older log analytics tool that cannot parse the newer fields.

tcp-flags interpretation

The tcp-flags field is a bitmap of the TCP control bits seen across the flow's aggregation window (values are OR-ed together). Common values: 2 = SYN only, 18 = SYN-ACK, 16 = ACK, 4 = RST, 1 = FIN, 19 = SYN-ACK-FIN, 24 = ACK-PSH. A flow with tcp-flags=2 only and no responding 18 is a half-open connection, often diagnostic of a destination that did not respond (firewall drop on return path, or destination instance stopped). A flow whose tcp-flags include 4 indicates a reset — often a stuck connection torn down or a port closed by the destination kernel.
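Decoding the bitmap is a simple bit test. A minimal sketch using the standard TCP control-bit values:

```python
# Standard TCP control-bit values; Flow Logs OR these together across the
# aggregation window, so 18 = SYN (2) + ACK (16) means a SYN-ACK was seen.
TCP_BITS = [("FIN", 1), ("SYN", 2), ("RST", 4), ("PSH", 8), ("ACK", 16), ("URG", 32)]

def decode_tcp_flags(bitmap: int) -> list[str]:
    """Return the names of the control bits set in a tcp-flags value."""
    return [name for name, bit in TCP_BITS if bitmap & bit]

decode_tcp_flags(2)   # ['SYN']              half-open if never followed by 18
decode_tcp_flags(18)  # ['SYN', 'ACK']       handshake reply observed
decode_tcp_flags(19)  # ['FIN', 'SYN', 'ACK']
```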

The srcaddr and dstaddr fields show the addresses as the ENI saw them. The pkt-srcaddr and pkt-dstaddr fields (v3+) show the original packet addresses before any NAT translation. For a NAT Gateway, the difference reveals which underlying instance generated the flow even though the NAT GW ENI shows itself as srcaddr. For VPC peering and Transit Gateway flows, pkt-srcaddr lets you trace the actual origin instance even when intermediate hops rewrote the visible source. ANS-C01 frequently tests this: "given this Flow Log line at the NAT Gateway ENI, which private instance initiated the connection?" — the answer is in pkt-srcaddr, not srcaddr. Reference: https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs-records-examples.html
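The attribution logic is a one-liner once records are parsed. A minimal sketch assuming a hypothetical five-field custom log format; all addresses below are sample values:

```python
# Hypothetical custom format captured at a NAT Gateway ENI (v3+ required
# for the pkt-* fields): "srcaddr dstaddr pkt-srcaddr pkt-dstaddr action"
FIELDS = ["srcaddr", "dstaddr", "pkt-srcaddr", "pkt-dstaddr", "action"]

def parse_record(line: str) -> dict:
    """Split a space-delimited flow log line into named fields."""
    return dict(zip(FIELDS, line.split()))

def originating_host(record: dict) -> str:
    # At the NAT GW ENI, srcaddr is the NAT GW itself; the pre-NAT source
    # is preserved only in pkt-srcaddr.
    return record["pkt-srcaddr"]

rec = parse_record("10.0.0.5 203.0.113.9 10.0.2.77 203.0.113.9 ACCEPT")
originating_host(rec)  # '10.0.2.77' — the private instance, not the NAT GW ENI
```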

ACCEPT vs REJECT and What Flow Logs Do Not See

A REJECT record is generated when an inbound flow is dropped by a security group or NACL. A drop by the security group looks identical in the Flow Log to a drop by the NACL — both produce REJECT. To distinguish, you reason: if the source is allowed by the SG inbound rule and the SG is stateful, the drop was the NACL (typically a missing ephemeral-port outbound allow). If the source is not in any SG inbound rule, the drop was the SG.

Flow Logs do not capture certain traffic categories. The exhaustive exclusion list:

  • Traffic to and from the Amazon DNS server (the .2 IP of the VPC CIDR or 169.254.169.253).
  • Traffic to and from the IMDS endpoint 169.254.169.254 (including IMDSv2).
  • Amazon Time Sync at 169.254.169.123.
  • Windows license activation traffic.
  • DHCP traffic.
  • Traffic to the VPC router reserved address (the .1 of the subnet CIDR).
  • Mirror traffic generated by VPC Traffic Mirroring sessions.
  • Some flows associated with Gateway Load Balancer transparent inline insertion patterns.

AWS Network Firewall drops do not appear as REJECT in VPC Flow Logs; Network Firewall emits its own alert and flow logs as separate log sources. A scenario describing "Network Firewall is dropping packets but VPC Flow Logs show no REJECT" is operating exactly as designed; the answer is to consult Network Firewall's own logs.

Candidates assume Flow Logs are a complete network record. They are not. The exclusion list above is the most common ANS-C01 trap territory. A scenario describing "we cannot see DNS queries to the internal resolver in Flow Logs" is reflecting the documented exclusion of .2 resolver traffic — the answer is Route 53 Resolver Query Logging, not Flow Logs. A scenario describing "an instance is reaching IMDS but Flow Logs show nothing" is correct behavior — IMDS at 169.254.169.254 is excluded. For complete capture of any of these, use Traffic Mirroring (which sees layer 2 onward including IMDS link-local traffic) or host-based capture. Reference: https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs.html

Querying Flow Logs with Athena and Logs Insights

Athena on S3-destination Flow Logs

Athena reads Flow Logs from S3 with a CREATE EXTERNAL TABLE DDL matching the chosen format and partitioning. For Hive-partitioned S3 layout, the DDL declares PARTITIONED BY (region string, year int, month int, day int, hour int) and uses MSCK REPAIR TABLE (or partition projection) to register partitions. Common queries:

  • Top byte consumers: SELECT srcaddr, dstaddr, SUM(bytes) AS total FROM flow_logs WHERE year = 2026 AND month = 5 AND day = 2 GROUP BY srcaddr, dstaddr ORDER BY total DESC LIMIT 10.
  • REJECT pattern across a /16: SELECT srcaddr, COUNT(*) AS rejects FROM flow_logs WHERE action = 'REJECT' AND dstaddr LIKE '10.10.%' GROUP BY srcaddr ORDER BY rejects DESC.
  • Cross-AZ traffic identification using az-id (v4+): filter for flow pairs whose interfaces' az-ids differ (join on the flow 5-tuple); multiply by per-GB cross-AZ pricing for cost attribution.

Partition projection (an Athena feature configured through table properties) eliminates the need for MSCK REPAIR by computing partition values from S3 keys; recommended for long-running query workloads.
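A sketch of a date-projected property set, assuming a single day partition column and a placeholder S3 location; property names follow Athena's partition-projection feature:

```python
def projection_properties(s3_location: str) -> dict:
    """Build the TBLPROPERTIES map for an Athena table whose 'day'
    partition is computed by partition projection (no MSCK REPAIR)."""
    return {
        "projection.enabled": "true",
        "projection.day.type": "date",
        "projection.day.range": "2024/01/01,NOW",   # illustrative range
        "projection.day.format": "yyyy/MM/dd",
        # Template mapping partition values back onto the S3 key layout:
        "storage.location.template": s3_location + "/${day}",
    }

props = projection_properties("s3://central-flow-logs/AWSLogs")  # bucket is hypothetical
```

These key/value pairs go verbatim into the TBLPROPERTIES clause of the CREATE EXTERNAL TABLE DDL.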

CloudWatch Logs Insights

For the CloudWatch Logs destination, Logs Insights provides interactive query syntax with parse, filter, stats, sort, limit. A typical query:

parse @message "* * * * * * * * * * * * * *"
  as version, account, eni, srcaddr, dstaddr, srcport, dstport,
     protocol, packets, bytes, start, end, action, status
| filter action = "REJECT"
| stats count(*) as rejects by srcaddr
| sort rejects desc
| limit 20
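The same aggregation can be reproduced offline on parsed records, which is handy for unit-testing detection logic before committing to a query. A minimal sketch with hypothetical sample records:

```python
from collections import Counter

def top_rejecters(records: list[dict], limit: int = 20) -> list[tuple[str, int]]:
    """Count REJECT records per source address, like the Logs Insights
    stats/sort/limit pipeline, but over locally parsed records."""
    counts = Counter(r["srcaddr"] for r in records if r["action"] == "REJECT")
    return counts.most_common(limit)

sample = [  # hypothetical parsed flow records
    {"srcaddr": "198.51.100.7", "action": "REJECT"},
    {"srcaddr": "198.51.100.7", "action": "REJECT"},
    {"srcaddr": "10.0.1.4", "action": "ACCEPT"},
]
top_rejecters(sample)  # [('198.51.100.7', 2)]
```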

Logs Insights queries are best for ad-hoc investigation; persistent dashboards are better served by S3 + Athena + QuickSight, which scale to organisational volume without per-query Logs Insights cost.

Cross-account aggregation pattern

The AWS Security Reference Architecture pattern: every member-account VPC writes Flow Logs to a centralised S3 bucket in the Log Archive account, partitioned by account/region/day. The bucket is immutable (S3 Object Lock plus a deny-delete bucket policy). The Security Tooling account queries via Athena across all partitions. For ANS-C01 the same pattern applies; the exam frames it as "centralise network telemetry across the organisation".

  • ENI-level / subnet-level / VPC-level: scope of capture, in increasing breadth.
  • ACCEPT / REJECT / ALL: filter for allowed flows, denied flows, or both.
  • v2 / v3 / v4 / v5 / v7: record format versions; v7 is the richest.
  • pkt-srcaddr / pkt-dstaddr: original packet addresses before NAT translation.
  • traffic-path (v5+): encoded egress path (1=same VPC, 2=IGW or gateway endpoint, 3=VGW, 4=intra-region peering, 5=inter-region peering, 6=local gateway, 7=gateway VPC endpoint, 8=IGW).
  • flow-direction: ingress or egress relative to the ENI.
  • NODATA / SKIPDATA / OK: log-status values; NODATA means no traffic in the window, SKIPDATA means capture buffer overflow.
  • Reference: https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs-records-examples.html

VPC Reachability Analyzer

VPC Reachability Analyzer simulates path reachability between two endpoints in your VPC graph using static configuration analysis. It reads the security groups, NACLs, route tables, peering connections, Transit Gateway attachments, internet gateways, and VPN connections, models the packet's hypothetical path, and reports whether the path is reachable, the hop-by-hop sequence, and which component (if any) blocks the path.

What Reachability Analyzer simulates

  • Source and destination resource types: VPC, subnet, instance, ENI, internet gateway, virtual private gateway, transit gateway, transit gateway attachment, peering connection, network insights endpoint.
  • Path components evaluated: route tables, security groups, NACLs, Transit Gateway route tables, VPC peering connections, VPN connections, gateway endpoints, internet gateway routing.
  • Path output: ordered hop list with component IDs, action at each hop, and explanation strings (e.g., "ENI_INGRESS — security group sg-abc allows traffic on port 443").

What Reachability Analyzer does not test

  • Live traffic — Reachability Analyzer is a simulation, not a probe. It does not send packets; it does not detect intermittent failures, congestion, MTU mismatch, or BGP session state.
  • Application-layer reachability — it stops at layer 4 (port-level) reachability; whether the application on the destination port responds is out of scope.
  • AWS Network Firewall, third-party appliances behind GWLB, or any Suricata/IDS rule — these are not modelled; Reachability Analyzer assumes any path that passes SG/NACL/route reaches its destination unless it is explicitly an unsupported component.
  • DNS resolution — Reachability Analyzer takes IP/ENI/instance inputs, not DNS names; if DNS resolution is the failure point, this tool does not see it.

Automation pattern

The high-yield ANS-C01 pattern: Reachability Analyzer + EventBridge + Lambda to verify that intended connectivity remains intact after every infrastructure change. EventBridge fires on configuration changes (a new SG rule, route table update, or VPC peering modification); Lambda invokes the Reachability Analyzer API for a saved set of "this must always be reachable" path tests; failures alert via SNS or open a Security Hub finding. This is the answer to "automate verification of connectivity intent as configuration changes" — exactly the ANS-C01 skill bullet wording.
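The Lambda's decision step reduces to checking the analysis result. A sketch of that check, shaped after the EC2 DescribeNetworkInsightsAnalyses response (the dicts below are illustrative samples, not live API output):

```python
def path_intact(analysis: dict) -> bool:
    """Given one analysis entry from a DescribeNetworkInsightsAnalyses-style
    response, report whether the simulated path is still reachable.
    NetworkPathFound is True when Reachability Analyzer finds a valid path."""
    return analysis.get("Status") == "succeeded" and analysis.get("NetworkPathFound", False)

ok = path_intact({"Status": "succeeded", "NetworkPathFound": True})       # True
broken = path_intact({"Status": "succeeded", "NetworkPathFound": False})  # False
```

In the EventBridge/Lambda pattern, a False result for any "must always be reachable" path is what triggers the SNS alert or Security Hub finding.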

Cost and limits

Each path analysis is billed per analysis. There is a quota on concurrent analyses per account per region (default low-tens, raisable). For organisational-scale automation, batch the analyses and rate-limit the EventBridge fan-out.

The single most-tested ANS-C01 trap on this tool. A scenario describes intermittent SSH failures and offers Reachability Analyzer as one of four possible diagnostic answers. Reachability Analyzer cannot diagnose intermittent failures because it does not test live traffic — it only reads configuration. If the configuration is correct but traffic still fails sometimes, the symptoms point at MTU black hole, congested NAT GW, BGP flapping, route propagation delay, or destination instance health — none of which Reachability Analyzer sees. The right answer for intermittent issues is Flow Logs (look for asymmetric drops or NODATA periods), CloudWatch metrics on the NAT Gateway or TGW, or Traffic Mirroring for payload investigation. Reference: https://docs.aws.amazon.com/vpc/latest/reachability/what-is-reachability-analyzer.html

Network Access Analyzer vs Reachability Analyzer vs Route Analyzer

ANS-C01 frequently presents these three tools as distractor choices. Distinguishing them is required.

VPC Reachability Analyzer

Path simulation between two specified endpoints. Answers "can A reach B?". Returns a path verdict and a hop trace. Used for connectivity intent verification.

Network Access Analyzer

Scope-based trusted-access policy analysis across accounts and VPCs. Defines a "scope" describing what reachability is allowed (e.g., "only resources tagged Production may reach the internet via NAT Gateway", or "no resource in Account A may reach S3 buckets in Account B"). Network Access Analyzer evaluates the entire AWS Organisation against the scope and produces findings for any access path that violates the scope — paths to the internet from sensitive subnets, paths between segregated VPCs, paths via PrivateLink endpoints to unintended services. Used for compliance and access-baseline auditing, integrated with Security Hub.

Transit Gateway Route Analyzer

Tied to Transit Gateway Network Manager, Route Analyzer traces a packet's path through TGW route tables for a given source and destination. It answers "given this source attachment and this destination CIDR, which TGW route table entries are matched and where does the packet egress?" — TGW-specific; it does not extend into VPC security groups or NACLs.

VPC Traffic Mirroring

VPC Traffic Mirroring copies packets from a source ENI to a target for out-of-band analysis. It is the tool of choice when packet-content inspection is required — IDS/IPS signature matching, DLP scanning, application-layer debugging, or compliance packet retention.

Components: mirror source, target, filter, session

  • Mirror source — an ENI on a Nitro-based EC2 instance. Older non-Nitro instance types (m4, c4, r4, t2, etc.) are not supported as sources.
  • Mirror target — a Network Load Balancer, Gateway Load Balancer endpoint, or another ENI. Typically the target points to a security analytics appliance, a Suricata IDS cluster, a Zeek sensor, or a Wireshark/tshark collector.
  • Mirror filter — defines which traffic to mirror, with rule-number ordering and 5-tuple match (src/dst CIDR, protocol, port range). Filters apply per direction (ingress vs egress relative to source ENI).
  • Mirror session — ties source + target + filter together. Mirror sessions have a session number that determines precedence when an ENI matches multiple sessions (lower number wins).
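Session-number precedence can be sketched as a simple minimum; field names follow the EC2 Traffic Mirroring API response shape, and the session list is hypothetical:

```python
def winning_session(sessions: list[dict]) -> dict:
    """Of the mirror sessions that share a source ENI, the one with the
    lowest session number takes precedence for a given packet."""
    return min(sessions, key=lambda s: s["SessionNumber"])

sessions = [  # hypothetical sessions on the same source ENI
    {"TrafficMirrorSessionId": "tms-aaa", "SessionNumber": 2},
    {"TrafficMirrorSessionId": "tms-bbb", "SessionNumber": 1},
]
winning_session(sessions)["TrafficMirrorSessionId"]  # 'tms-bbb'
```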

What Traffic Mirroring captures

Full packet contents from layer 2 onward, encapsulated in a VXLAN-style header (UDP destination port 4789 by default) so the target can demultiplex multiple mirror sessions. The original source/destination MAC and IP are preserved inside the encapsulation; the analytics tool must decode VXLAN.

Use cases

  • Forensic capture during an incident — mirror an instance suspected of compromise to a forensic VPC for offline analysis.
  • Threat hunting — feed mirrored traffic to a Suricata IDS for signature and behavioural detection beyond what GuardDuty or AWS Network Firewall provides.
  • Compliance packet retention — some regulated industries require N days of full-packet capture; Traffic Mirroring + a custom collector + S3 (via Kinesis Firehose or an EC2 capturing daemon) is the canonical pattern.
  • Deep performance debugging — capture and replay TCP flows to investigate retransmits, MTU issues, or protocol-level latency.

Limits and considerations

  • Nitro-only sources — non-Nitro instances cannot be mirror sources. This is the most-tested limit.
  • Per-instance bandwidth budget — mirrored traffic shares the source instance's network bandwidth budget; mirroring a saturated instance can cause production traffic loss.
  • Session limits — a small number of mirror sessions per ENI; per-region quotas apply (raisable).
  • Mirror traffic excluded from Flow Logs — mirrored copies do not appear in VPC Flow Logs (avoiding double-counting).
  • No payload modification — mirror sessions are read-only copies; the production flow continues unmodified.
  • No mirror across regions — source and target must be in the same region (cross-AZ within region is fine).

Mirror filter design

The most common mistake in filter authoring: not configuring both ingress and egress filter rules. By default a mirror session captures nothing; you must add rules for the directions and 5-tuples you want. A typical "mirror everything from this ENI" filter has two rules: ingress 0.0.0.0/0 → 0.0.0.0/0 all-protocols accept; egress 0.0.0.0/0 → 0.0.0.0/0 all-protocols accept. Selective filters (e.g., only TCP/443 to a specific destination) reduce target NLB load substantially.
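The first-match-by-rule-number semantics can be sketched as follows. This is a simplified model (real filters also match source CIDR and are evaluated per direction); all rules and packets here are hypothetical:

```python
import ipaddress

def rule_matches(rule: dict, pkt: dict) -> bool:
    """5-tuple match, simplified to protocol + destination CIDR + port range."""
    return (pkt["proto"] == rule["proto"]
            and ipaddress.ip_address(pkt["dst"]) in ipaddress.ip_network(rule["dst_cidr"])
            and rule["port_from"] <= pkt["dst_port"] <= rule["port_to"])

def is_mirrored(rules: list[dict], pkt: dict) -> bool:
    """Evaluate rules in rule-number order; first match decides.
    With no matching rule, nothing is mirrored — the empty-filter default."""
    for rule in sorted(rules, key=lambda r: r["number"]):
        if rule_matches(rule, pkt):
            return rule["action"] == "accept"
    return False

rules = [{"number": 10, "proto": 6, "dst_cidr": "10.0.0.0/16",
          "port_from": 443, "port_to": 443, "action": "accept"}]
is_mirrored(rules, {"proto": 6, "dst": "10.0.5.9", "dst_port": 443})  # True
is_mirrored(rules, {"proto": 6, "dst": "10.0.5.9", "dst_port": 22})   # False
```

The `return False` fall-through is the code-level expression of the trap: an empty or one-sided filter silently mirrors nothing.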

A canonical ANS-C01 distractor pairs all three as if they were alternatives. They are complements at three layers of abstraction. Traffic Mirroring = packet content. Flow Logs = connection metadata. Reachability Analyzer = static path verification. The right answer to "we need to inspect the HTTP request body for a known XSS payload" is Traffic Mirroring. The right answer to "detect an unusual volume of outbound TCP/22 from a database subnet" is Flow Logs. The right answer to "verify that the path from web-tier to data-tier is allowed by SG and route table after this change" is Reachability Analyzer. Reference: https://docs.aws.amazon.com/vpc/latest/mirroring/what-is-traffic-mirroring.html

CloudWatch Metrics, Alarms, and Network Manager Integration

CloudWatch metrics for networking

VPC and adjacent services emit numerous CloudWatch metrics relevant to operational diagnostics:

  • NAT Gateway: BytesInFromDestination, BytesInFromSource, BytesOutToDestination, BytesOutToSource, ConnectionAttemptCount, ConnectionEstablishedCount, ErrorPortAllocation, IdleTimeoutCount, PacketsDropCount. Watch ErrorPortAllocation for SNAT port exhaustion (the failure mode when a single destination IP receives too many concurrent connections from the NAT GW).
  • VPN connection: TunnelState (1 = up, 0 = down), TunnelDataIn, TunnelDataOut per tunnel.
  • Direct Connect: ConnectionState, ConnectionBpsEgress, ConnectionBpsIngress, ConnectionPpsEgress, ConnectionPpsIngress, ConnectionLightLevelTx, ConnectionLightLevelRx, ConnectionErrorCount.
  • Transit Gateway: BytesIn, BytesOut, PacketsIn, PacketsOut, PacketDropCountBlackhole, PacketDropCountNoRoute.
  • Application Load Balancer: RequestCount, TargetResponseTime, HTTPCode_ELB_5XX_Count, HTTPCode_Target_5XX_Count, HealthyHostCount, UnHealthyHostCount.

Alarm patterns

  • BGP session down on Direct Connect — alarm on ConnectionState < 1.
  • VPN tunnel flap — alarm on TunnelState mean over 5 minutes < 1.
  • NAT GW port exhaustion — alarm on ErrorPortAllocation > 0.
  • TGW blackhole drops — alarm on PacketDropCountBlackhole > 0.
  • NLB target unhealthy — alarm on UnHealthyHostCount > 0 sustained.

Transit Gateway Network Manager

TGW Network Manager registers Transit Gateways into a "global network" abstraction with a topology map, route analysis, event notifications, and CloudWatch metrics aggregation. For multi-region, multi-TGW deployments, Network Manager is the operational console — single pane for topology, configuration, and events. Network Manager events flow to CloudWatch Events / EventBridge for automation (e.g., notify on attachment state changes).

  • Flow Log scope: ENI, subnet, or VPC.
  • Capture types: ACCEPT, REJECT, ALL.
  • Destinations: CloudWatch Logs, S3, Kinesis Firehose.
  • Record formats: v2 (default 14 fields), v3 (+vpc/subnet/instance/tcp-flags/pkt-src/pkt-dst), v4 (+region/az-id/sublocation), v5 (+aws-service/flow-direction/traffic-path), v7 (+ECS fields).
  • Excluded from Flow Logs: DNS to .2 resolver, IMDS link-local 169.254.169.254, Time Sync 169.254.169.123, Windows activation, DHCP, mirror traffic.
  • Network Firewall drops: not in Flow Logs; in NFW alert/flow logs separately.
  • Reachability Analyzer: simulation only, layer-3/4 path verification, does not test live traffic.
  • Reachability Analyzer: does not model AWS Network Firewall or third-party GWLB appliances.
  • Network Access Analyzer: scope-based trusted-access audit across the org.
  • TGW Route Analyzer: TGW-specific path trace via Network Manager.
  • Traffic Mirroring: source ENI must be on Nitro instance; non-Nitro is unsupported.
  • Mirror target: NLB, GWLB endpoint, or ENI; same region as source.
  • Mirror traffic encapsulation: VXLAN-style, UDP/4789 default.
  • Reference: https://docs.aws.amazon.com/vpc/latest/mirroring/traffic-mirroring-considerations.html

Symptom-to-Tool Decision Matrix

The fastest way to answer ANS-C01 task 3.2 questions is a symptom→tool lookup. Memorise this table.

  • Top-N byte consumers across the VPC over 24h → Flow Logs (S3 + Athena): aggregation over a large data volume; partitioned columnar storage is fast and cheap.
  • Detect a port scan from an external IP → Flow Logs (REJECT records): REJECT pattern across ports from the same source.
  • Verify the path from web to DB subnet after an SG change → Reachability Analyzer: configuration-time verification, not a live test.
  • Audit "no resource in the dev VPC reaches the prod VPC" → Network Access Analyzer: scope-based audit across the org, not point-to-point.
  • Trace the TGW path for a given src/dst → TGW Route Analyzer (Network Manager): TGW-specific routing.
  • Decode an HTTP request body for forensics → Traffic Mirroring → Suricata/Zeek: payload required, layer 7.
  • Diagnose an intermittent SSH timeout → Flow Logs + CloudWatch metrics: look for asymmetric REJECT, NODATA periods, or NAT GW port exhaustion; NOT Reachability Analyzer (live issue, not config).
  • Identify which private instance generates outbound traffic via NAT → Flow Logs at the NAT GW with pkt-srcaddr: the NAT GW srcaddr shows the NAT IP; pkt-srcaddr shows the real instance.
  • Detect SNAT port exhaustion on a NAT GW → CloudWatch ErrorPortAllocation metric: the specific SNAT failure metric.
  • BGP session flap on Direct Connect → CloudWatch ConnectionState alarm: BGP/connection state changes are metrics, not Flow Logs.
  • Capture DNS query payload for malware C2 detection → Route 53 Resolver Query Logging or Traffic Mirroring: Flow Logs exclude .2 resolver traffic.
  • Compliance: 90 days of full-packet capture → Traffic Mirroring → S3 via a custom collector: Flow Logs are metadata only.
  • Verify connectivity intent across all changes → Reachability Analyzer + EventBridge + Lambda: automated post-change verification.
  • Investigate why a Network Firewall rule is blocking traffic → Network Firewall alert/flow logs: NFW drops do not appear in VPC Flow Logs.
  • Topology map for a multi-region TGW → TGW Network Manager: single global-network view.

Common Traps Recap — Domain 3.2

The traps the exam writes most frequently in this territory.

Trap 1: Reachability Analyzer can detect intermittent failures

Wrong. It is a static configuration simulation. Intermittent failures belong to Flow Logs and CloudWatch metrics.

Trap 2: Flow Logs capture all traffic on the ENI

Wrong. DNS to the .2 resolver, IMDS link-local, Time Sync, Windows activation, DHCP, and mirror traffic are excluded. Use Traffic Mirroring or Resolver Query Logging for those.

Trap 3: Network Firewall drops appear in VPC Flow Logs

Wrong. NFW has its own alert and flow logs.

Trap 4: Traffic Mirroring works on all instance types

Wrong. Source ENIs must be on Nitro instances. m4/c4/r4/t2 cannot be mirror sources.

Trap 5: srcaddr in Flow Logs is always the original packet source

Wrong. For NAT'd flows, srcaddr is the NAT'd address; pkt-srcaddr (v3+) is the original.

Trap 6: Reachability Analyzer evaluates Network Firewall rules

Wrong. NFW, GWLB-routed third-party appliances, and Suricata rules are not modelled.

Trap 7: Network Access Analyzer is a synonym for Reachability Analyzer

Wrong. Reachability is point-to-point simulation; Network Access Analyzer is scope-based org-wide audit.

Trap 8: Choosing v2 Flow Logs to save cost

Wrong. v2 and v5 cost the same per ingested GB. Always pick v5 unless you have a downstream tool that cannot parse newer fields.

Trap 9: Mirror traffic shows up in the source ENI's Flow Logs

Wrong. Mirror traffic is excluded from Flow Logs to avoid double-counting.

Trap 10: Logs Insights is cheaper than Athena for organisational-scale queries

Wrong. Logs Insights pricing scales with bytes scanned per query; Athena over partitioned Parquet S3 is dramatically cheaper for large historical aggregations. Use Logs Insights for ad-hoc, Athena for scale.

Trap 11: NODATA in flow log status means traffic was dropped

Wrong. NODATA means no traffic was seen during the aggregation window — it is information about absence, not denial. SKIPDATA means some flow records were skipped during the aggregation interval because of a capacity constraint or internal error, so some flows were lost.

Trap 12: Traffic Mirroring captures across region

Wrong. Mirror sources and targets must be in the same Region. They can live in different VPCs connected by VPC peering or Transit Gateway, but cross-region mirroring is not supported.

FAQ — VPC Flow Logs, Reachability Analyzer, and Traffic Mirroring

Q1: Which Flow Logs version should I use, and is there ever a reason to pick v2?

Use v5 by default. Cost is identical to v2 per ingested GB, and a v5 format includes every field of v2/v3/v4 plus flow-direction, traffic-path, and the service-attribution fields pkt-src-aws-service and pkt-dst-aws-service (ECS container fields arrive later still, in v7). The only reason to pick a lower version is a downstream parser (a legacy SIEM) that cannot read newer fields without modification — and even then, the right move is usually to upgrade the parser. v3 is the minimum version that includes pkt-srcaddr and pkt-dstaddr, which are critical for NAT, peering, and TGW flow attribution. v4 adds region and az-id — invaluable for cross-AZ cost analysis. traffic-path (v5) encodes the egress path: 1=another resource in the same VPC, 2=IGW or gateway VPC endpoint, 3=VGW, 4=intra-region peering, 5=inter-region peering, 6=local gateway, 7=gateway VPC endpoint (Nitro only), 8=IGW (Nitro only). ANS-C01 default expectation: v5 + ALL action + S3 destination + Hive partitioning + Athena.
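As a concrete sketch of that default, the boto3 call below enables v5 Flow Logs on a VPC with the S3/Parquet/Hive options described. The VPC ID and bucket ARN are placeholders and the helper names are ours; the API parameters follow the EC2 CreateFlowLogs API.

```python
# Sketch: enabling v5 Flow Logs to S3 with Hive-compatible Parquet partitions.
# The field list mirrors the published v2-v5 fields; IDs/ARNs are placeholders.

V5_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes", "start", "end",
    "action", "log-status",                       # v2 defaults
    "vpc-id", "subnet-id", "instance-id", "tcp-flags",
    "type", "pkt-srcaddr", "pkt-dstaddr",         # v3
    "region", "az-id",                            # v4
    "pkt-src-aws-service", "pkt-dst-aws-service",
    "flow-direction", "traffic-path",             # v5
]

def v5_log_format() -> str:
    """Render the custom log-format string expected by CreateFlowLogs."""
    return " ".join("${%s}" % f for f in V5_FIELDS)

def enable_flow_logs(vpc_id: str, bucket_arn: str):
    """Create v5 Flow Logs (ALL traffic, Parquet, Hive partitions). Needs AWS creds."""
    import boto3  # imported lazily so the helper above stays testable offline
    ec2 = boto3.client("ec2")
    return ec2.create_flow_logs(
        ResourceType="VPC",
        ResourceIds=[vpc_id],
        TrafficType="ALL",
        LogDestinationType="s3",
        LogDestination=bucket_arn,
        LogFormat=v5_log_format(),
        DestinationOptions={
            "FileFormat": "parquet",
            "HiveCompatiblePartitions": True,
            "PerHourPartition": True,
        },
    )
```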

Q2: Why doesn't Reachability Analyzer detect my intermittent SSH timeouts?

Because Reachability Analyzer is a simulation, not a live network probe. It reads the current configuration of route tables, security groups, NACLs, peering, and TGW attachments, and reports whether the configured path would theoretically allow a packet from A to B. Intermittent failures are caused by runtime conditions Reachability Analyzer cannot see: NAT Gateway SNAT port exhaustion, MTU black holes (PMTUD failure), BGP session flapping, route propagation delay after a route table change, NLB target health flapping, congested cross-AZ links, or destination-instance kernel-level drops. The right tools for intermittent diagnosis are: VPC Flow Logs (look for periods of NODATA on the affected ENI, or asymmetric REJECT patterns), CloudWatch metrics on NAT Gateway ErrorPortAllocation, ALB target health, or TGW PacketDropCountNoRoute, and Traffic Mirroring for protocol-level investigation when the metadata is insufficient.
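The "asymmetric REJECT / NODATA" hunt can be expressed as a small filter over parsed records. This is an illustrative sketch: the record-dict shape and the flag_intermittent helper are our assumptions, not an AWS API — the field names match the Flow Log defaults, but how you parse records into dicts is up to you.

```python
# Sketch: flag flows worth investigating for intermittent failure.
# Records are dicts keyed by flow-log field names (hypothetical parsing).

def flag_intermittent(records):
    """Return (asymmetric_reject_keys, nodata_windows) worth a closer look."""
    rejects, accepts, nodata = set(), set(), []
    for r in records:
        if r.get("log-status") == "NODATA":
            nodata.append((r["interface-id"], r["start"]))
            continue
        key = (r["srcaddr"], r["dstaddr"], r["dstport"])
        (rejects if r["action"] == "REJECT" else accepts).add(key)
    # A key seen both ACCEPTed and REJECTed over time usually points at a
    # NACL ephemeral-port problem or security-group drift, not a hard block.
    return sorted(rejects & accepts), nodata
```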

Q3: How do Flow Logs and Network Firewall logs relate, and why don't NFW drops appear as REJECT?

VPC Flow Logs and AWS Network Firewall logs are separate sources at separate layers. VPC Flow Logs record the action taken at the ENI level — ACCEPT (allowed by SG and NACL) or REJECT (denied by SG or NACL). Network Firewall sits as a separate inspection device in the path, with its own decision logic (stateless rules first, then stateful Suricata-compatible rules) and its own log streams: alert logs (matched stateful rules) and flow logs (every flow seen by the firewall). When NFW drops a packet, the packet is consumed at the firewall — the source ENI's VPC Flow Log will show ACCEPT (because SG and NACL allowed it onto the wire), and NFW's own logs will show the drop. To investigate "is NFW blocking this?" you must consult NFW's logs in S3, CloudWatch Logs, or Kinesis Firehose — never assume Flow Log REJECT covers it. The same applies to third-party firewalls deployed via Gateway Load Balancer: their drop decisions are inside the appliance, not visible in VPC Flow Logs.

Q4: When should I choose Traffic Mirroring over Flow Logs, given the cost asymmetry?

Choose Traffic Mirroring whenever packet content is required and Flow Logs (metadata only) are insufficient. Concrete triggers: (a) payload-pattern detection — Suricata signatures, malware C2 patterns, DLP scanning of file uploads; (b) protocol-layer debugging — TCP retransmit investigation, MTU black hole reproduction, application-layer error reproduction; (c) compliance packet retention — when regulators demand N days of full PCAP rather than connection metadata; (d) incident forensic capture — when an instance is suspected compromised and you need to see what is actually being said in the traffic. Choose Flow Logs for everything else: top-talker analysis, REJECT pattern detection, baseline performance, GuardDuty input, Athena historical queries. Traffic Mirroring at scale can double a source instance's bandwidth consumption (the mirror copy travels alongside the original and counts against the instance's network allowance), saturate the target NLB, and add per-ENI-hour mirroring charges — restrict it to specific ENIs and specific filter rules during the targeted investigation window, then disable.
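A narrowly-scoped capture along those lines might look like the following boto3 sketch. The ENI and target IDs and the suspect CIDR are placeholders and the helper names are ours; the API calls follow the EC2 Traffic Mirroring API.

```python
# Sketch: a filter scoped to one protocol and one source range, attached to
# a single ENI, so the mirror copies only the traffic under investigation.

def scoped_mirror_requests(source_eni: str, target_id: str,
                           suspect_cidr: str = "203.0.113.0/24"):
    """Build the filter-rule and session request bodies for a narrow capture."""
    rule = {
        "TrafficDirection": "ingress",
        "RuleNumber": 100,
        "RuleAction": "accept",
        "Protocol": 6,                        # TCP only
        "SourceCidrBlock": suspect_cidr,      # suspect client range (placeholder)
        "DestinationCidrBlock": "0.0.0.0/0",
    }
    session = {
        "NetworkInterfaceId": source_eni,     # must sit on a Nitro instance
        "TrafficMirrorTargetId": target_id,
        "SessionNumber": 1,
    }
    return rule, session

def start_capture(source_eni: str, target_id: str):
    import boto3  # lazy import: the builder above is testable offline
    ec2 = boto3.client("ec2")
    fid = ec2.create_traffic_mirror_filter(
        Description="incident-scope")["TrafficMirrorFilter"]["TrafficMirrorFilterId"]
    rule, session = scoped_mirror_requests(source_eni, target_id)
    ec2.create_traffic_mirror_filter_rule(TrafficMirrorFilterId=fid, **rule)
    return ec2.create_traffic_mirror_session(TrafficMirrorFilterId=fid, **session)
```

Disabling afterwards is a matter of deleting the session; leaving the filter in place costs nothing until a session references it.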

Q5: How do I architect Reachability Analyzer for continuous post-change verification?

The pattern is Reachability Analyzer + EventBridge + Lambda + Security Hub or SNS. (1) Define a list of "always-must-be-reachable" path tests — for example, "web-tier ALB ENI must reach app-tier ENI on TCP 8080", "app-tier must reach DB on TCP 5432", "no public subnet must reach internal-only S3 endpoint". Encode the test inputs (source ARN, destination ARN, port) as a JSON file in S3. (2) Create an EventBridge rule that fires on configuration changes — aws.ec2 events for AuthorizeSecurityGroupIngress, RevokeSecurityGroupIngress, route-table modifications, peering changes, TGW attachment events, and Network Firewall policy updates. (3) Lambda triggered by EventBridge reads the test inputs, calls CreateNetworkInsightsPath and then StartNetworkInsightsAnalysis for each test, polls for completion, and inspects the result. (4) On a failed analysis, Lambda publishes to SNS for paging or creates a Security Hub finding for the security team. Rate-limit the Lambda invocation since Reachability Analyzer has per-region concurrency quotas. This pattern delivers exactly what the ANS-C01 skill bullet "automating verification of connectivity intent as a network configuration changes" demands.
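Steps (3) and (4) of that Lambda can be sketched as follows. The test-input shape and helper names are assumptions; the EC2 calls follow the Reachability Analyzer API (CreateNetworkInsightsPath, StartNetworkInsightsAnalysis, DescribeNetworkInsightsAnalyses).

```python
# Sketch of the Lambda body: run one reachability test, then collect failures.
import time

def run_reachability_test(ec2, test: dict) -> dict:
    """Create a path, start an analysis, poll until it finishes."""
    path_id = ec2.create_network_insights_path(
        Source=test["source_arn"],
        Destination=test["dest_arn"],
        Protocol="tcp",
        DestinationPort=test["port"],
    )["NetworkInsightsPath"]["NetworkInsightsPathId"]
    analysis_id = ec2.start_network_insights_analysis(
        NetworkInsightsPathId=path_id
    )["NetworkInsightsAnalysis"]["NetworkInsightsAnalysisId"]
    while True:
        analysis = ec2.describe_network_insights_analyses(
            NetworkInsightsAnalysisIds=[analysis_id]
        )["NetworkInsightsAnalyses"][0]
        if analysis["Status"] != "running":
            return analysis  # Status: succeeded/failed; NetworkPathFound: bool
        time.sleep(5)

def failed_tests(results: list) -> list:
    """Analyses that completed but found no path -> publish to SNS / Security Hub."""
    return [r for r in results
            if r["Status"] == "succeeded" and not r.get("NetworkPathFound")]
```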

Q6: What is the Athena pattern for querying Flow Logs at organisational scale?

Use S3 destination with Hive-compatible partitioning by region/year/month/day/hour, and Parquet format for the Flow Log records. Create the Athena external table with PARTITIONED BY (region string, year int, month int, day int, hour int) and either run MSCK REPAIR TABLE periodically or — better — use partition projection (an Athena feature configured through table properties) to have Athena compute partitions from S3 keys without registration overhead. Write queries with explicit partition filters (e.g., WHERE year = 2026 AND month = 5 AND day = 2) so Athena scans only the relevant subset; without partition filters, Athena reads the entire bucket and cost balloons. For repeatable dashboards, materialise common queries into separate Glue tables or QuickSight datasets. For organisational scale, the bucket should live in the Log Archive account with cross-account read from the Security Tooling account; this is the AWS Security Reference Architecture canonical layout.
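A partition-pruned query of that shape, submitted through boto3, might look like the sketch below. The table name, column names, and output location are placeholders matching the layout described above; the helper names are ours.

```python
# Sketch: build a partition-pruned Athena query and submit it.

def top_talkers_sql(year: int, month: int, day: int, limit: int = 20) -> str:
    """Aggregate REJECTed bytes per source, scanning only one day's partitions."""
    return (
        "SELECT srcaddr, sum(bytes) AS total_bytes "
        "FROM vpc_flow_logs "                       # placeholder table name
        f"WHERE year = {year} AND month = {month} AND day = {day} "
        "AND action = 'REJECT' "
        "GROUP BY srcaddr ORDER BY total_bytes DESC "
        f"LIMIT {limit}"
    )

def run_athena(sql: str, output_s3: str):
    """Submit the query; caller polls GetQueryExecution for completion."""
    import boto3  # lazy import keeps the SQL builder testable offline
    athena = boto3.client("athena")
    return athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_s3},
    )
```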

Q7: Reachability Analyzer says my path is reachable but my application still cannot connect — what gives?

Several possibilities Reachability Analyzer cannot see. (1) AWS Network Firewall rule — NFW and GWLB-routed appliances are not modelled by Reachability Analyzer; consult NFW's own logs to see if a stateful rule is dropping. (2) Application listener not bound — the destination instance might pass network reachability but the application process is not listening on the expected port. (3) Host-level firewall — iptables, Windows Firewall, or container-level network policies (Calico, Cilium) are invisible to Reachability Analyzer. (4) MTU black hole — Reachability Analyzer does not model PMTUD; an asymmetric MTU between VPC peering or VPN can drop large packets silently. (5) DNS resolution failure — Reachability Analyzer takes IP/ENI inputs; if your application connects by name and DNS resolution fails, the network is fine but the connection never starts. The diagnostic ladder is: confirm Reachability Analyzer says reachable → check NFW logs → SSH to source instance and curl the destination IP/port directly → if curl works, problem is application-layer; if not, run tcpdump on source and use Traffic Mirroring on destination ENI to see if the SYN arrives.
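The "curl the destination IP/port directly" rung of that ladder can be scripted with the standard library; the useful signal is refused versus timeout. This is a hypothetical helper, not part of any AWS tooling.

```python
# Sketch: crude live-path probe from the source instance.
import socket

def tcp_probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Distinguish 'connected' (all good), 'refused' (path fine, nothing
    listening or host firewall resets), and 'timeout' (SYN lost in the path:
    SG/NACL drop, NFW drop, or an MTU black hole further along)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect((host, port))
        return "connected"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    finally:
        s.close()
```

If the probe connects but the application still fails, the problem is above layer 4 and Traffic Mirroring or application logs take over.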

Q8: Why does Traffic Mirroring need Nitro, and what do I do if I have older instances?

The Nitro hypervisor offloads network processing to the Nitro card, providing the line-rate packet duplication path that Mirror sessions require. On older Xen-based instance generations (m4, c4, r4, t2, m3, c3, r3), the hypervisor cannot duplicate packets at line rate without performance impact, so AWS does not support Mirror sources on these. The fallback for old generations: migrate to current generation (m6i, c6i, r6i, m5/c5/r5, t3/t4g — all Nitro). For workloads that genuinely cannot migrate, the alternatives are host-based capture (tcpdump on Linux or pktmon on Windows, exporting captures to S3 or a sidecar), or inline capture appliances placed in the routing path (a Gateway Load Balancer with a third-party appliance can record traffic). For ANS-C01 the exam-canonical answer remains "Traffic Mirroring on Nitro" and "host-based capture for non-Nitro" — the exam does not ask you to design exotic capture for legacy instance types.
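Whether a fleet's instance types can be mirror sources can be checked from the Hypervisor attribute that DescribeInstanceTypes returns ("nitro" or "xen"). A sketch, with the caveat that bare-metal types omit the attribute even though they run on Nitro, so treat this as a first pass.

```python
# Sketch: classify instance types as mirror-source eligible by hypervisor.

def mirror_eligible(instance_type_info: dict) -> bool:
    """An ENI can be a mirror source only on Nitro-hypervisor instances.
    (Bare-metal entries lack the Hypervisor key and would need a manual check.)"""
    return instance_type_info.get("Hypervisor") == "nitro"

def check_fleet(instance_types):
    """Query EC2 for the given type names and report eligibility. Needs AWS creds."""
    import boto3  # lazy import: mirror_eligible stays testable offline
    ec2 = boto3.client("ec2")
    resp = ec2.describe_instance_types(InstanceTypes=list(instance_types))
    return {t["InstanceType"]: mirror_eligible(t) for t in resp["InstanceTypes"]}
```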

Q9: How does CloudWatch Logs Insights compare with Athena for Flow Log queries?

CloudWatch Logs Insights is best for interactive ad-hoc queries on recent data (last hours to days), where the developer-experience matters: parsed fields, sort/limit, click-to-filter, integrated with the console where you opened the log group. Pricing is per-byte-scanned per query, which is fine for kilobytes-to-megabytes ad-hoc but becomes expensive at gigabyte-per-query scale. Athena over S3-stored Flow Logs is best for organisational-scale, historical, dashboarded queries — Parquet columnar storage, partition pruning, and SQL semantics make it dramatically cheaper for large-volume analytics. The default ANS-C01 architecture: send Flow Logs to both CloudWatch Logs (short retention, ad-hoc) and S3 (long retention, Athena, partitioned). Use Logs Insights for "what happened in the last 30 minutes on this ENI" and Athena for "show me cross-AZ traffic for the last quarter, grouped by source service". Both can be scripted via start-query / start-query-execution APIs for automation.
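Scripting the Logs Insights side via start-query looks like the sketch below. The log group name is a placeholder, and the query string uses the fields Logs Insights auto-discovers for VPC Flow Logs delivered to CloudWatch Logs.

```python
# Sketch: "what happened in the last 30 minutes on this log group" as code.
import time

REJECTS_LAST_30M = """
fields @timestamp, srcAddr, dstAddr, dstPort, action
| filter action = "REJECT"
| sort @timestamp desc
| limit 50
"""

def insights_query(log_group: str, query: str = REJECTS_LAST_30M,
                   minutes: int = 30) -> dict:
    """Start a Logs Insights query over the last N minutes and poll it."""
    import boto3  # lazy import keeps the module importable offline
    logs = boto3.client("logs")
    now = int(time.time())
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=now - minutes * 60,
        endTime=now,
        queryString=query,
    )["queryId"]
    while True:
        out = logs.get_query_results(queryId=query_id)
        if out["status"] in ("Complete", "Failed", "Cancelled"):
            return out
        time.sleep(1)
```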

Q10: My organisation requires 7-year retention of network traffic for regulatory compliance — what is the architecture?

Flow Logs to S3 with lifecycle policies and Object Lock, plus Traffic Mirroring for incidents requiring full PCAP. Specifically: (1) VPC Flow Logs (v5, action=ALL, Parquet, Hive-partitioned) to a centralised Log Archive bucket with Object Lock in compliance mode (cannot be deleted by anyone, including root, until retention expires); transition to S3 Glacier Deep Archive after 90 days for cost; lifecycle keeps for 7 years then expires. (2) For incidents requiring deeper packet content, enable Traffic Mirroring on suspect ENIs to a forensic VPC running a Suricata/Zeek collector that dumps PCAP to a separate write-once S3 bucket; keep PCAP only for the investigation window plus regulatory grace period. (3) AWS Config and CloudTrail logs join the same Log Archive bucket for the configuration audit trail. (4) Athena Workgroups in the Security Tooling account allow controlled cross-account query without granting raw bucket read. This pattern is exactly the AWS Security Reference Architecture and is the highest-credit answer for any ANS-C01 question framed as "regulatory" or "compliance" plus "long retention".
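The retention mechanics in (1) reduce to two S3 API payloads. A sketch, with a placeholder bucket name and day counts chosen to match the 90-day transition and 7-year retention above — adjust both to your actual policy.

```python
# Sketch: lifecycle + Object Lock settings for the Log Archive bucket.

RETENTION_DAYS = 7 * 365  # illustrative 7-year figure

def lifecycle_rules() -> dict:
    """Deep Archive after 90 days; expire once the retention period has passed."""
    return {
        "Rules": [{
            "ID": "flowlogs-7y",
            "Status": "Enabled",
            "Filter": {"Prefix": "AWSLogs/"},
            "Transitions": [{"Days": 90, "StorageClass": "DEEP_ARCHIVE"}],
            "Expiration": {"Days": RETENTION_DAYS},
        }]
    }

def object_lock_config() -> dict:
    """COMPLIANCE mode: no principal (root included) can delete before expiry."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": RETENTION_DAYS}},
    }

def apply(bucket: str):
    """Apply both configurations. The bucket must be created with Object Lock enabled."""
    import boto3  # lazy import: the payload builders stay testable offline
    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=lifecycle_rules())
    s3.put_object_lock_configuration(
        Bucket=bucket, ObjectLockConfiguration=object_lock_config())
```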

Once monitoring is in place, the natural ANS-C01 operational layers are: network performance optimisation with ENA, EFA, jumbo frames, and placement groups; network cost optimisation including NAT GW per-AZ deployment, VPC peering vs TGW economics, and CloudFront for egress reduction; hybrid connectivity maintenance for BGP route limits, summarisation, and Direct Connect failover testing; and AWS Network Firewall deep inspection for stateful traffic filtering at the inspection VPC.

Official sources

More ANS-C01 topics