CDN Caching & Performance Optimization

A CDN turns a single origin into hundreds of globally distributed cache nodes, and the difference between a 40% and a 97% cache hit ratio is almost entirely configuration. This guide covers the edge cache lifecycle, header precedence, cache keys, purge models, compression, and resilience patterns for engineers running production traffic on Cloudflare, AWS CloudFront, and Fastly.

Key concepts this page covers:

  • The edge cache state machine: HIT, MISS, EXPIRED, REVALIDATED, and stale serving.
  • Header precedence between Cache-Control, CDN-Cache-Control, and Surrogate-Control, plus browser TTL versus edge TTL.
  • Cache keys and the Vary header — why they make or break your hit ratio.
  • Purge and invalidation models, tiered caching, and origin shield.
  • Compression (Brotli, gzip, Zstandard) and resilience via stale-while-revalidate and stale-if-error.

[Figure: CDN edge cache decision flow. A client request reaches the edge cache, which computes the cache key and checks freshness. A fresh HIT is served from the edge, a MISS is fetched from origin through the shield, and an EXPIRED object is revalidated: a 304 from origin yields REVALIDATED, and an origin error falls back to stale serving (stale-while-revalidate / stale-if-error).]

The edge cache lifecycle

Every cacheable response moves through a small state machine at each edge node. When a request arrives, the CDN computes a cache key (by default the host plus path plus query string), looks for a stored object, and evaluates its freshness against the object’s stored TTL. The outcome is reported in a status header — cf-cache-status on Cloudflare, x-cache on CloudFront, and x-cache / x-served-by on Fastly — and that header is your single most important diagnostic signal.

The canonical states are:

  • HIT — a fresh copy exists and is served entirely from the edge. The origin sees nothing.
  • MISS — no usable copy exists, so the edge fetches from origin (or an upstream tier), stores the result if cacheable, and serves it.
  • EXPIRED — a copy exists but its TTL has elapsed. The edge revalidates with the origin using a conditional request.
  • REVALIDATED — the origin answered a conditional request with 304 Not Modified, so the edge refreshes the stored object’s freshness and serves it without re-downloading the body.
  • STALE / UPDATING — under stale-while-revalidate or stale-if-error, the edge serves an expired copy immediately while refreshing in the background or while the origin is unreachable.

Conditional revalidation depends on validators the origin must emit. An ETag enables If-None-Match; a Last-Modified date enables If-Modified-Since. Without a validator, an EXPIRED object cannot become REVALIDATED — it degrades to a full MISS, re-downloading the body and burning origin bandwidth. This is why a missing ETag on large, rarely-changing assets quietly inflates origin egress. The full lifecycle of TTL math and freshness lifetimes is covered in Cache-Control & CDN TTL, and the granular freshness extensions in Stale-While-Revalidate & Resilient Caching.
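The dependence on validators can be sketched in a few lines. This is a minimal model, not any CDN's implementation; StoredObject, conditional_headers, and outcome_after_expiry are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredObject:
    body: bytes
    etag: Optional[str] = None           # validator used for If-None-Match
    last_modified: Optional[str] = None  # validator used for If-Modified-Since


def conditional_headers(obj: StoredObject) -> dict:
    """Build the conditional request headers that the stored validators allow."""
    headers = {}
    if obj.etag:
        headers["If-None-Match"] = obj.etag
    if obj.last_modified:
        headers["If-Modified-Since"] = obj.last_modified
    return headers


def outcome_after_expiry(obj: StoredObject, origin_status: int) -> str:
    """Classify the refresh of an EXPIRED object (simplified model)."""
    if not conditional_headers(obj):
        return "MISS"         # no validator: unconditional fetch, full body again
    if origin_status == 304:
        return "REVALIDATED"  # body reused, freshness reset
    return "MISS"             # origin sent a new representation
```

An object stored without an ETag or Last-Modified always lands in the first branch, which is the silent origin-egress cost described above.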

How cache decisions are made

A CDN will cache a response only if three conditions hold: the method is cacheable (typically GET / HEAD), the status is cacheable (200, 203, 301, 404, and 410 are among the statuses HTTP treats as cacheable by default), and the response does not carry anything that forbids storage (no-store, private, and on many CDNs a Set-Cookie header). Most CDNs additionally refuse to cache responses with no freshness signal at all unless you opt in with a default edge TTL. The practical consequence: a 200 response with neither Cache-Control nor Expires is treated as uncacheable by Fastly and CloudFront unless a default TTL is configured, while Cloudflare caches a fixed list of static file extensions regardless.
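As a rough sketch of that three-part check, assuming header names are already lowercased and using only the status codes named above: is_storable and its default_edge_ttl parameter are illustrative names, and real providers apply additional rules.

```python
from typing import Optional

CACHEABLE_METHODS = {"GET", "HEAD"}
CACHEABLE_STATUSES = {200, 203, 301, 404, 410}  # only the statuses named above
FORBIDDING_DIRECTIVES = {"no-store", "private"}


def directives(cache_control: str) -> dict:
    """Parse a Cache-Control value into {lowercased-name: value-or-empty-string}."""
    parsed = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            parsed[name.lower()] = value.strip('"')
    return parsed


def is_storable(method: str, status: int, headers: dict,
                default_edge_ttl: Optional[int] = None) -> bool:
    """Simplified eligibility check; headers must use lowercased field names."""
    if method.upper() not in CACHEABLE_METHODS or status not in CACHEABLE_STATUSES:
        return False
    cc = directives(headers.get("cache-control", ""))
    if FORBIDDING_DIRECTIVES.intersection(cc):
        return False
    if "set-cookie" in headers:  # assumption: this CDN refuses to store cookies
        return False
    has_freshness = "max-age" in cc or "s-maxage" in cc or "expires" in headers
    return has_freshness or default_edge_ttl is not None
```

Passing default_edge_ttl models the opt-in described above: a response with no freshness signal becomes storable only when a default TTL exists.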

Once an object is eligible, the edge derives its freshness lifetime — the number of seconds it may be served as a HIT before becoming EXPIRED. The lifetime is computed from the first present of s-maxage, max-age, or Expires minus Date, after any CDN-Cache-Control or Surrogate-Control override. The Age response header reports how long the object has already lived in shared caches; when Age exceeds the freshness lifetime, the next request transitions the object to EXPIRED. Understanding this arithmetic is what lets you predict, rather than guess, when origin will next be hit — and it is why an unexpectedly high Age on a freshly purged object usually points at a tiered cache layer still holding the old copy.
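The freshness arithmetic can be written out directly. The sketch below covers only the Cache-Control and Expires path (it ignores CDN-Cache-Control and Surrogate-Control overrides) and assumes lowercased header names; freshness_lifetime and is_fresh are illustrative helpers, not a provider API.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def _directives(value: str) -> dict:
    """Parse a Cache-Control value into {lowercased-name: value}."""
    out = {}
    for part in value.split(","):
        name, _, val = part.strip().partition("=")
        if name:
            out[name.lower()] = val.strip('"')
    return out


def freshness_lifetime(headers: dict) -> Optional[int]:
    """Seconds a shared cache may serve the response as fresh, or None."""
    cc = _directives(headers.get("cache-control", ""))
    for name in ("s-maxage", "max-age"):  # s-maxage wins for shared caches
        if cc.get(name, "").isdigit():
            return int(cc[name])
    if "expires" in headers and "date" in headers:
        expires = parsedate_to_datetime(headers["expires"])
        date = parsedate_to_datetime(headers["date"])
        return max(0, int((expires - date).total_seconds()))
    return None


def is_fresh(headers: dict) -> bool:
    """Fresh while Age is still below the freshness lifetime."""
    lifetime = freshness_lifetime(headers)
    age = int(headers.get("age", "0"))
    return lifetime is not None and age < lifetime
```

For example, a response with max-age=300 and Age: 142 is still fresh, and the same response with Age: 301 would be treated as EXPIRED on the next request.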

Soft purge versus hard purge

Two flavors of invalidation interact with this lifecycle. A hard purge evicts the object outright: the next request is a guaranteed MISS to origin. A soft purge (Fastly’s Fastly-Soft-Purge: 1, or any tag purge combined with stale-while-revalidate) marks the object stale rather than deleting it, so the edge keeps serving the stale copy while it revalidates in the background. Soft purge is almost always the right default for high-traffic objects because it avoids the synchronous origin round-trip that a hard purge forces on the first unlucky visitor.
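A toy in-memory model makes the difference concrete. EdgeCache and its methods are hypothetical names for illustration; a real edge would also trigger the background refresh that this sketch only labels.

```python
class EdgeCache:
    """Minimal model of one edge node's object store."""

    def __init__(self):
        self._objects = {}  # key -> {"body": bytes, "stale": bool}

    def store(self, key, body):
        self._objects[key] = {"body": body, "stale": False}

    def hard_purge(self, key):
        # Evict outright: the next lookup must go to origin.
        self._objects.pop(key, None)

    def soft_purge(self, key):
        # Keep the body but mark it stale so it can still be served.
        if key in self._objects:
            self._objects[key]["stale"] = True

    def lookup(self, key):
        """Return (status, body). STALE means: serve now, refresh in background."""
        entry = self._objects.get(key)
        if entry is None:
            return ("MISS", None)
        return ("STALE" if entry["stale"] else "HIT", entry["body"])
```

After soft_purge the visitor still receives a body immediately; after hard_purge the same lookup returns MISS with nothing to serve until origin answers.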

Header precedence: browser TTL vs edge TTL

The same Cache-Control header is read by two different actors with different needs: the browser, which should usually cache for a short time so users see updates, and the edge, which should cache aggressively to maximize offload. The HTTP spec resolves this with targeted cache-control headers, and precedence matters.

For a CDN evaluating freshness, the order of precedence is:

  1. Surrogate-Control (Fastly / legacy surrogate spec) — consumed and stripped by the surrogate, highest priority where supported.
  2. CDN-Cache-Control (RFC 9213) — read by CDNs that support it, ignored by browsers.
  3. Cache-Control — read by everyone as the fallback.

This lets you decouple the two TTLs cleanly. A common pattern for HTML:

Cache-Control: max-age=0, must-revalidate
CDN-Cache-Control: max-age=300, stale-while-revalidate=86400

The browser revalidates on every navigation (so a deploy is visible immediately after purge), while the edge holds the page for five minutes and serves stale for a day during a background refresh. Cloudflare and Fastly honor CDN-Cache-Control; CloudFront does not parse it natively and instead derives edge TTL from Cache-Control plus your cache-policy min/max/default TTL settings. The full matrix of which header wins on which provider is detailed in Cache-Control & CDN TTL.

Directive | Read by browser | Read by edge | Notes
Surrogate-Control: max-age=600 | No | Fastly (stripped) | Highest priority surrogate directive
CDN-Cache-Control: max-age=300 | No | Cloudflare, Fastly | RFC 9213; overrides Cache-Control at edge
Cache-Control: max-age=60 | Yes | Yes (fallback) | Applies when no targeted header present
s-maxage=600 | No | Yes | Shared-cache override inside Cache-Control
Expires: <date> | Yes | Yes | Legacy; ignored if max-age present
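The precedence order can be expressed as a small resolver. This is a sketch under simplifying assumptions: header names are lowercased, the first header in the order that carries a usable max-age wins, and the two honors_* flags are hypothetical switches standing in for provider support.

```python
from typing import Optional


def _max_age(value: str, names=("max-age",)) -> Optional[int]:
    """Return the first directive in `names` that has an integer value."""
    parsed = {}
    for part in value.split(","):
        name, _, val = part.strip().partition("=")
        parsed[name.lower()] = val.strip('"')
    for name in names:
        if parsed.get(name, "").isdigit():
            return int(parsed[name])
    return None


def edge_ttl(headers: dict, honors_surrogate_control: bool = True,
             honors_cdn_cache_control: bool = True) -> Optional[int]:
    """Resolve the edge TTL: Surrogate-Control, then CDN-Cache-Control, then Cache-Control."""
    targeted = []
    if honors_surrogate_control:
        targeted.append("surrogate-control")
    if honors_cdn_cache_control:
        targeted.append("cdn-cache-control")
    for name in targeted:
        if name in headers:
            ttl = _max_age(headers[name])
            if ttl is not None:
                return ttl
    # Fallback: s-maxage overrides max-age for shared caches.
    return _max_age(headers.get("cache-control", ""), ("s-maxage", "max-age"))


def browser_ttl(headers: dict) -> Optional[int]:
    """Browsers ignore the targeted headers and read Cache-Control only."""
    return _max_age(headers.get("cache-control", ""))
```

With the HTML pattern shown earlier (Cache-Control: max-age=0, must-revalidate plus CDN-Cache-Control: max-age=300), edge_ttl resolves to 300 while browser_ttl resolves to 0. Turning off honors_cdn_cache_control models a CDN that does not parse the targeted header, and the edge TTL falls back to 0.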

Cache keys and the Vary header

The cache key determines whether two requests share a stored object. The default key — scheme, host, path, and full query string — is often wrong in both directions. Including a volatile query parameter like ?utm_source= fragments the cache into thousands of near-identical entries and craters the hit ratio. Conversely, ignoring a parameter that genuinely changes the response (?lang=de) serves the wrong content. Tuning the key by stripping tracking parameters, normalizing case, and selectively including cookies or headers is the highest-leverage change for hit ratio, covered in depth in Cache Key & Vary Configuration.

The Vary response header is the origin’s way of telling the cache “this object differs by these request headers.” Vary: Accept-Encoding is correct and necessary — it keeps the Brotli and gzip variants separate. Vary: Cookie or Vary: User-Agent is almost always a hit-ratio disaster, because the cardinality of those headers is effectively unbounded; every distinct cookie or UA string forces a separate cached object. Audit your origin for accidental Vary headers before blaming the CDN for a low hit ratio.

# Inspect what the origin claims the response varies on
curl -sI https://example.com/app.css | grep -i vary
# Vary: Accept-Encoding   <- good
# Vary: Accept-Encoding, Cookie   <- hit-ratio killer

When you genuinely need per-segment variation — say, a different response for logged-in versus anonymous users — do not lean on Vary: Cookie. Instead, normalize the dimension to a low-cardinality value inside the cache key itself: at the edge, read the session cookie, collapse it to a boolean auth=1 / auth=0, and include only that derived flag in the key. Two cache entries instead of millions. The same technique applies to device class (mobile / desktop) and currency, and it is the difference between a working segmented cache and one that never serves a HIT. Cache-key normalization recipes for each provider are collected in Cache Key & Vary Configuration.
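A minimal sketch of that normalization follows, assuming an absolute URL and a session cookie literally named "session" (a hypothetical name). The key format and the cache_key function are illustrative, not any provider's actual key layout.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

TRACKING_PREFIXES = ("utm_",)
TRACKING_PARAMS = {"fbclid"}


def cache_key(url: str, cookies: dict) -> str:
    """Build a normalized cache key from a URL and the request cookies."""
    parts = urlsplit(url)
    # Drop tracking parameters, then sort so parameter order does not fragment the cache.
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES) and name not in TRACKING_PARAMS
    )
    # Collapse the unbounded cookie value to a two-valued flag.
    auth = "1" if cookies.get("session") else "0"
    # urlsplit lowercases the hostname; the path is left untouched.
    return f"{parts.scheme}://{parts.hostname}{parts.path}?{urlencode(kept)}|auth={auth}"
```

Two requests that differ only in utm_source, parameter order, or the exact session token now share one key, while ?lang=de and the logged-in flag still produce distinct entries.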

Purge and invalidation models

Caches must be invalidated when content changes. There are three models, in increasing precision and decreasing blast radius:

  • Purge everything — flushes the entire zone. Instant but causes a thundering herd back to origin as every edge node re-fetches. Reserve for emergencies.
  • Purge by URL — invalidates specific objects. Precise but you must enumerate every URL and every variant; remember query-string and Vary permutations are distinct objects.
  • Purge by tag / surrogate key — attach a tag header to responses at origin (Surrogate-Key: product-42 on Fastly, Cache-Tag: product-42 on Cloudflare Enterprise) and purge the tag. One API call invalidates every object bearing that tag regardless of URL. This is the model to design around for dynamic content.

Tag-based purge is the only model that scales for catalogs and CMS-driven sites, and wiring it into your deploy pipeline is the subject of Cache Purging & Invalidation. CloudFront’s equivalent is path invalidation (free for the first 1,000 paths/month, billed thereafter), with no native tag model — a meaningful architectural constraint when choosing a provider.

The safest deploy-time pattern is to version your static assets with a content hash in the filename (app.4f3a.js) and mark them Cache-Control: max-age=31536000, immutable. Because the filename changes whenever the content changes, you never purge those assets at all — the new HTML simply references the new URL, and the old object ages out harmlessly. Purge is then reserved for the small set of unversioned resources (HTML entry points, API responses), shrinking your invalidation surface from thousands of objects to a handful. This separation of immutable hashed assets from purgeable entry points is the single most reliable cache-invalidation strategy in production.
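One way to produce such filenames at build time is sketched below. The choice of SHA-256, the 8-character digest length, and the hashed_name helper are assumptions made for illustration; build tools generally provide their own equivalent.

```python
import hashlib
from pathlib import PurePosixPath

# Header value for fingerprinted assets, as described above.
IMMUTABLE_CACHE_CONTROL = "max-age=31536000, immutable"


def hashed_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

Because the name is derived from the bytes, identical content always maps to the same URL and any change produces a new one, which is what makes the year-long immutable TTL safe.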

# Cloudflare: purge by URL
curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE/purge_cache" \
  -H "Authorization: Bearer $CF_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"files":["https://example.com/app.css"]}'

# CloudFront: invalidate a path prefix
aws cloudfront create-invalidation --distribution-id E1234567890 \
  --paths "/assets/*"

# Fastly: purge a surrogate key
curl -X POST "https://api.fastly.com/service/$SVC/purge/product-42" \
  -H "Fastly-Key: $FASTLY_TOKEN"

Tiered caching and origin shield

A flat CDN has every edge node fetching independently from origin, so a cold cache on a popular object means hundreds of simultaneous origin requests. Tiered caching (Cloudflare Argo Tiered Cache), an origin shield (CloudFront, Fastly), or a shield POP inserts an intermediate cache layer. Edge nodes fetch from the shield, the shield fetches from origin, and origin sees at most one request per object per shield. This collapses the thundering herd, dramatically improves hit ratio for occasionally-accessed content, and reduces origin egress.

Pick the shield region closest to your origin to minimize the shield-to-origin RTT. A shield also pairs naturally with stale-while-revalidate: the shield serves stale to all downstream edges while a single background fetch refreshes it. Because proxied CDN traffic depends on your DNS records pointing at the CDN’s anycast addresses, caching and resolution are coupled — see DNS Fundamentals & Advanced Record Configuration for proxied record setup and Edge Routing & Serverless Function Architecture for how edge compute interacts with the cache.
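The effect on origin load can be illustrated with a tiny simulation. It is deliberately simplified: requests arrive sequentially, there is a single shield, and the function and key names are hypothetical.

```python
def origin_fetches(edge_nodes: int, use_shield: bool) -> int:
    """Count origin requests when every edge node is cold for one object."""
    origin_hits = 0
    shield_cache = {}

    def fetch_origin(key):
        nonlocal origin_hits
        origin_hits += 1
        return f"body-of-{key}"

    def fetch_via_shield(key):
        # Only the first edge to ask causes an origin fetch.
        if key not in shield_cache:
            shield_cache[key] = fetch_origin(key)
        return shield_cache[key]

    fetch = fetch_via_shield if use_shield else fetch_origin
    for _ in range(edge_nodes):
        fetch("/app.css")
    return origin_hits
```

With 300 cold edge nodes, the flat topology produces 300 origin requests and the shielded one produces a single request. A real shield also has to collapse concurrent in-flight requests, which this sequential model does not cover.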

Compression and asset optimization

Compression is applied at the edge or passed through from origin, and the negotiated algorithm is selected from the client’s Accept-Encoding. Order of preference today is generally Brotli (br) for text, falling back to gzip, with Zstandard (zstd) emerging on Cloudflare for clients that advertise it. Brotli at quality 11 produces the smallest text payloads but is CPU-expensive to compress on the fly, so CDNs cache the compressed variant rather than recompressing per request — which is exactly why Vary: Accept-Encoding must be set correctly.

Critical pitfall: never compress already-compressed binaries (JPEG, PNG, WebP, MP4, fonts in WOFF2). Doing so wastes CPU and can slightly inflate size. Restrict compression to text MIME types (text/*, application/javascript, application/json, image/svg+xml). The trade-offs between algorithms and quality levels are detailed in Edge Compression & Asset Optimization.
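The MIME-type restriction is easy to express, and the size effect is easy to demonstrate. In this sketch gzip from the standard library stands in for Brotli, random bytes stand in for an already-compressed binary, and should_compress is an illustrative helper rather than a CDN setting.

```python
import gzip
import os

COMPRESSIBLE_TYPES = {"application/javascript", "application/json", "image/svg+xml"}


def should_compress(content_type: str) -> bool:
    """Allow compression only for text-like MIME types."""
    mime = content_type.split(";", 1)[0].strip().lower()
    return mime.startswith("text/") or mime in COMPRESSIBLE_TYPES


# Repetitive text shrinks dramatically; high-entropy bytes do not.
text = b"body { margin: 0; padding: 0; }\n" * 200
noise = os.urandom(len(text))  # stand-in for JPEG/MP4/WOFF2 payloads

text_ratio = len(gzip.compress(text)) / len(text)
noise_ratio = len(gzip.compress(noise)) / len(noise)
```

Here text_ratio comes out far below 1 while noise_ratio sits at or slightly above 1, which is the "wasted CPU, slightly larger output" outcome described above.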

# Confirm which encoding the edge actually served
curl -sI -H "Accept-Encoding: br,gzip,zstd" https://example.com/app.js \
  | grep -iE '^(content-encoding|vary):'
# content-encoding: br
# vary: Accept-Encoding

Resilience: stale-while-revalidate and stale-if-error

RFC 5861 defines two extensions that decouple availability from freshness. stale-while-revalidate=N lets the edge serve an expired object immediately while asynchronously revalidating, so users never wait on origin latency at TTL expiry. stale-if-error=N lets the edge serve a stale object when the origin returns 5xx or is unreachable, turning an origin outage into a soft, invisible degradation rather than a hard error page.

Cache-Control: max-age=60, stale-while-revalidate=600, stale-if-error=86400

This single header gives you a 60-second freshness window, ten minutes of zero-latency background refresh, and a full day of outage tolerance. Combined with an origin shield, it is the backbone of a resilient cache. Implementation specifics and provider support gaps live in Stale-While-Revalidate & Resilient Caching.

The mental model that prevents most resilience bugs: max-age controls when users might see stale, while stale-if-error controls how long you tolerate origin being down. Set the latter generously — a day or more — because a stale page is almost always better than a 502, and the worst case is simply that a user sees yesterday’s content during an outage they would otherwise have experienced as a hard failure. Pair this with health-aware edge routing so traffic shifts away from a failing origin before the stale window even expires; the routing side is covered under Edge Routing & Serverless Function Architecture. Note that stale-while-revalidate requires the CDN to support asynchronous background fetch; Cloudflare and Fastly do, while CloudFront’s support is more limited and frequently needs an explicit origin-header or function workaround.
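The three windows in the header above can be modeled as a single decision function. This is a sketch of the RFC 5861 semantics as described here, with both stale windows measured from the moment the object stops being fresh; serve_decision and its return labels are illustrative.

```python
def serve_decision(age: int, max_age: int, swr: int, sie: int, origin_ok: bool) -> str:
    """Decide how the edge answers a request for a stored object."""
    if age < max_age:
        return "HIT"
    staleness = age - max_age  # seconds since the object stopped being fresh
    if origin_ok:
        if staleness < swr:
            return "STALE (background revalidate)"
        return "REVALIDATE (client waits on origin)"
    if staleness < sie:
        return "STALE (origin error)"
    return "ERROR"
```

For max-age=60, stale-while-revalidate=600, stale-if-error=86400: an object aged 120 seconds is served stale while refreshing in the background, one aged 700 seconds makes the client wait for origin, and during an outage that same object is still served until its staleness reaches a full day.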

Platform implementation notes

The three major CDNs share the same caching vocabulary but differ sharply in defaults and capabilities. Knowing where they diverge prevents config that silently no-ops on one provider.

Cloudflare caches by file extension out of the box and treats most HTML as uncacheable (DYNAMIC) until you write a Cache Rule. It honors CDN-Cache-Control, supports Cache-Tag purge on Enterprise plans, and offers Tiered Cache and Argo for the shield layer. Edge TTL is most reliably controlled with Cache Rules rather than origin headers, because dashboard rules take precedence over the response. Cloudflare also exposes cf-cache-status values beyond the basics — STALE, UPDATING, REVALIDATED, BYPASS, and EXPIRED — making header inspection unusually informative.

AWS CloudFront centers everything on cache policies and origin request policies. There is no CDN-Cache-Control parsing and no tag-based purge; invalidation is by path only and the first 1,000 paths per month are free. Compression is a per-behavior boolean (compress = true) that negotiates Brotli and gzip. Origin Shield is a per-origin setting, and getting its region wrong adds a needless cross-continent hop on every MISS. CloudFront’s x-cache header reports Hit from cloudfront, Miss from cloudfront, or RefreshHit from cloudfront for revalidations.

Fastly is the most programmable: VCL gives you the full request lifecycle, instant (sub-second) purge, native surrogate keys, and soft purge. Surrogate-Control and Surrogate-Key are first-class. The trade-off is that more behavior is your responsibility — there is no “just works” default cache list, so an origin that omits Cache-Control will pass through uncached unless your VCL sets a TTL. Fastly’s shielding designates one POP as the origin shield via the service configuration.

Capability | Cloudflare | CloudFront | Fastly
CDN-Cache-Control honored | Yes | No | Yes
Tag / surrogate-key purge | Enterprise (Cache-Tag) | No (path only) | Yes (Surrogate-Key)
Soft purge | Via SWR | No | Yes (Fastly-Soft-Purge)
Origin shield / tiered cache | Argo Tiered Cache | Origin Shield | Shielding POP
Programmable rules | Cache Rules / Workers | Functions / cache policy | VCL
Default HTML caching | Off (DYNAMIC) | Off unless TTL set | Off unless TTL set

Diagnostic commands

Reading cache headers with curl -I is the fastest way to diagnose any caching issue. Annotate each run with the headers that matter:

# Full header dump: read cache status, age, and cache-control together
# (the pattern is anchored so "age" does not also match other header lines)
curl -sI https://example.com/ \
  | grep -iE '^(cf-cache-status|x-cache|age|cache-control|cdn-cache-control|etag|vary):'

# cf-cache-status: HIT        <- served from Cloudflare edge
# age: 142                    <- seconds since stored; should be below the edge TTL for a HIT
# cache-control: max-age=300
# cdn-cache-control: max-age=86400
# etag: "a1b2c3"              <- enables 304 revalidation

# Force a MISS to confirm origin behavior (bust the key with a unique query)
curl -sI "https://example.com/?cachebust=$RANDOM" | grep -i cf-cache-status

# Verify revalidation: send the stored ETag, expect 304
curl -sI -H 'If-None-Match: "a1b2c3"' https://example.com/ | head -1
# HTTP/2 304

# See which POP answered and whether it was a hit (CloudFront); repeat from other networks to compare POPs
curl -sI https://example.com/ | grep -iE 'x-cache|x-amz-cf-pop'
# X-Cache: Hit from cloudfront
# X-Amz-Cf-Pop: FRA56-P3

A persistent age: 0 on what should be a HIT means the object is being stored but immediately treated as stale, almost always because of max-age=0 or a missing s-maxage. A cf-cache-status: DYNAMIC means Cloudflare never attempted to cache the response, usually because the file type is not on the default cacheable list and no Cache Rule applies, or because the response carried Set-Cookie or private.

Specification reference

CDN caching behavior is standardized; knowing the source RFCs resolves most “which directive wins” arguments definitively.

Spec | Title | What it governs
RFC 9111 | HTTP Caching | Freshness, Cache-Control, Age, Expires, cacheable methods/status codes, Vary
RFC 9110 | HTTP Semantics | Conditional requests, ETag, Last-Modified, If-None-Match
RFC 5861 | HTTP Cache-Control Extensions for Stale Content | stale-while-revalidate, stale-if-error
RFC 9213 | Targeted HTTP Cache Control | CDN-Cache-Control and other targeted cache-control fields
RFC 8246 | HTTP Immutable Responses | Cache-Control: immutable for fingerprinted assets

Key field facts worth memorizing: s-maxage overrides max-age for shared caches only; immutable tells clients not to revalidate a response while it is still fresh, even on reload (ideal for hashed filenames like app.4f3a.js); no-cache means “store but always revalidate,” which is distinct from no-store (“never store”). Confusing no-cache with no-store is the single most common cache misconfiguration.

Configuration examples

Cloudflare Cache Rule (API / JSON)

A Cache Rule that overrides the edge and browser TTLs, caches a normally-uncacheable path, and strips tracking parameters from the cache key:

{
  "expression": "(http.request.uri.path matches \"^/api/catalog/\")",
  "description": "Cache catalog API at edge for 5m, ignore utm params",
  "action": "set_cache_settings",
  "action_parameters": {
    "cache": true,
    "edge_ttl": { "mode": "override_origin", "default": 300 },
    "browser_ttl": { "mode": "override_origin", "default": 0 },
    "cache_key": {
      "custom_key": {
        "query_string": { "exclude": ["utm_source", "utm_medium", "utm_campaign"] }
      }
    },
    "serve_stale": { "disable_stale_while_updating": false }
  }
}

CloudFront cache behavior (Terraform HCL)

CloudFront separates the cache policy (what forms the key and the TTLs) from the origin request policy (what is forwarded). Compression and a long max-TTL for fingerprinted assets:

resource "aws_cloudfront_cache_policy" "static_assets" {
  name        = "static-assets-long-ttl"
  default_ttl = 86400
  max_ttl     = 31536000
  min_ttl     = 0

  parameters_in_cache_key_and_forwarded_to_origin {
    enable_accept_encoding_brotli = true
    enable_accept_encoding_gzip   = true
    cookies_config   { cookie_behavior = "none" }
    headers_config   { header_behavior = "none" }
    query_strings_config { query_string_behavior = "none" }
  }
}

resource "aws_cloudfront_distribution" "cdn" {
  origin {
    domain_name = "origin.example.com"
    origin_id   = "app-origin"
    # Origin Shield collapses origin fetches to one region
    origin_shield {
      enabled              = true
      origin_shield_region = "eu-central-1"
    }
  }

  default_cache_behavior {
    target_origin_id       = "app-origin"
    viewer_protocol_policy = "redirect-to-https"
    cache_policy_id        = aws_cloudfront_cache_policy.static_assets.id
    compress               = true
    allowed_methods        = ["GET", "HEAD"]
    cached_methods         = ["GET", "HEAD"]
  }
}

Fastly VCL

Fastly exposes the full request/response lifecycle in VCL, giving precise control over surrogate keys, TTL, and stale serving:

sub vcl_fetch {
  # Tag the object so it can be purged by surrogate key
  set beresp.http.Surrogate-Key = "catalog product-42";

  # Fall back to a 5-minute edge TTL when the origin sent no caching headers
  if (!beresp.http.Surrogate-Control && !beresp.http.Cache-Control && !beresp.http.Expires) {
    set beresp.ttl = 300s;
  }

  # Resilience: serve stale during revalidation and on origin error
  set beresp.stale_while_revalidate = 600s;
  set beresp.stale_if_error = 86400s;
  return(deliver);
}

sub vcl_recv {
  # Normalize the cache key: drop tracking params
  set req.url = querystring.regfilter(req.url, "^utm_");
}

Edge cases and warnings

Scenario | Impact | Mitigation
Vary: Cookie from origin | Hit ratio collapses to near zero | Strip Vary: Cookie at edge; key on a hashed session flag instead
Tracking params in default key | Cache fragments per visitor | Exclude utm_* / fbclid from the cache key
Missing ETag/Last-Modified | EXPIRED becomes full MISS, no 304s | Emit a validator from origin for all cacheable bodies
no-cache vs no-store confusion | Either over-caching private data or zero caching | Use no-store for private, no-cache+ETag for revalidated
Purge-everything on deploy | Thundering herd, origin overload | Use tag/surrogate-key purge scoped to changed objects
Compressing JPEG/MP4 at edge | Wasted CPU, no size gain | Restrict compression to text MIME types
Set-Cookie on a cacheable asset | CDN downgrades to uncacheable (DYNAMIC) | Move session cookies off static asset responses
Apex domain not proxied | Traffic bypasses CDN entirely | Use CNAME flattening / ALIAS so the apex resolves to the CDN

Frequently Asked Questions

Why is my cf-cache-status showing DYNAMIC even though I set max-age?
DYNAMIC means Cloudflare never attempted to cache the response. The usual causes are a Set-Cookie header, private or no-store in Cache-Control, a non-GET method, or a file type not on the default cacheable list without a Cache Rule. Add a Cache Rule that forces caching for the path and remove any Set-Cookie from that response.

What is the difference between no-cache and no-store?
no-store forbids any cache from keeping a copy at all. no-cache permits storage but requires the cache to revalidate with the origin before every reuse. Use no-store for genuinely private responses and no-cache together with an ETag for content that must always be checked but can still benefit from 304 revalidation.

How do I set a different TTL for the browser and the CDN?
Send Cache-Control: max-age=0, must-revalidate for the browser and CDN-Cache-Control: max-age=300 for the edge. Cloudflare and Fastly read the targeted header for edge TTL, while the browser ignores it and follows Cache-Control. On CloudFront, which does not parse CDN-Cache-Control, use s-maxage for the shared-cache TTL or set the edge TTL in the cache policy.

Does compression hurt my cache hit ratio?
Only if Vary: Accept-Encoding is mismanaged. The CDN stores a separate variant per encoding, so the header is required to keep Brotli and gzip copies distinct. Without it, a client could be served the wrong encoding; with too broad a Vary, you fragment unnecessarily. Set exactly Vary: Accept-Encoding and let the CDN cache each variant.

How do I invalidate one product page without purging the whole site?
Tag the response at origin with Cache-Tag (Cloudflare Enterprise) or Surrogate-Key (Fastly), then issue a tag purge. One API call invalidates every URL and variant bearing that tag with no thundering herd. On CloudFront, which lacks tags, scope a path invalidation to the specific object prefix instead of /*.
