WAF & Rate Limiting at the Edge
A Web Application Firewall (WAF) and rate limiting at the edge inspect, score, and throttle inbound HTTP traffic in the same point of presence (PoP) that terminates TLS, so malicious payloads and abusive request floods are dropped thousands of kilometers from your origin and milliseconds before any compute runs.
Key Implementation Points:
- Managed rulesets (OWASP Core Rule Set, vendor-curated signatures) catch known attack classes, while custom rules express precise conditions on path, method, headers, IP, ASN, and country.
- Rate limiting keys requests by IP, header, cookie, or a derived identity and counts them over fixed or sliding windows to bound per-client request volume.
- WAF and rate limiting evaluate before your routing logic, so they sit upstream of the API Gateway at the Edge and any Cloudflare Workers Routing you deploy.
- A safe rollout moves each rule through log-only, then interactive challenge, then block, watching false-positive rates at every step.
This guide is part of Edge Routing & Serverless Function Architecture and pairs closely with the detailed walkthrough on Blocking Common Attacks with Cloudflare WAF Rules.
How edge WAF and rate limiting actually work
An edge WAF is a rule engine embedded in the proxy that terminates the connection. Because the request is already decrypted at the PoP, the engine sees the full HTTP semantics: method, URI, query string, headers, cookies, and (when configured) the request body. Each enabled rule is an expression evaluated against those fields, and rules are grouped into phases that run in a fixed order.
The two complementary controls answer different questions. A WAF asks “is this individual request malicious?” — does the query string contain a SQL injection payload, does the path traverse directories, does the body carry a cross-site scripting vector. Rate limiting asks “is this client sending too many requests?” — regardless of whether any single request is well-formed. A credential-stuffing attack, for example, uses perfectly valid login requests; only the volume and pattern betray it, which is why rate limiting and bot scoring are necessary alongside signature matching.
Evaluation order and phases
Every major provider runs security in distinct phases. On Cloudflare the order is roughly: DDoS mitigation, IP Access Rules, then the http_request_firewall_custom phase (your custom rules), then the http_ratelimit phase (Rate Limiting Rules), then the http_request_firewall_managed phase (managed rulesets), then bot protection actions. Only after every security phase allows the request does it continue to cache and your Worker or origin. This ordering is the reason the WAF lives logically in front of your API Gateway at the Edge: there is no point validating a JWT or applying quota logic for a request that a signature already condemned. Because custom rules run first, a custom rule with a skip action can also exempt trusted traffic from the rate-limiting and managed phases that follow.
The first rule whose expression matches and whose action is terminal (block, challenge) ends evaluation. Non-terminal actions — log, skip, or score adjustments — let evaluation continue. This is what makes a log-only rollout safe: a logging rule records the match without changing the response, so you can measure exactly what would have been blocked before you flip it to enforce.
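The terminal versus non-terminal distinction can be sketched in a few lines of Python. This is an illustrative model only, not any provider's engine: the `Rule` class, the `evaluate` function, and the two sample rules are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

# Actions that end evaluation as soon as a matching rule carries them.
TERMINAL_ACTIONS = {"block", "challenge"}


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate over a request dict
    action: str                      # "log", "block", or "challenge"


def evaluate(rules: list[Rule], request: dict) -> tuple[str, list[str]]:
    """Return (final_action, names_of_rules_that_only_logged)."""
    logged: list[str] = []
    for rule in rules:
        if not rule.matches(request):
            continue
        if rule.action in TERMINAL_ACTIONS:
            return rule.action, logged  # terminal: stop here
        logged.append(rule.name)        # non-terminal: record and continue
    return "allow", logged


rules = [
    # Log-only rule: records the match but never changes the response.
    Rule("sqli-probe", lambda r: "' or '" in r["query"].lower(), "log"),
    Rule("admin-path", lambda r: r["path"].startswith("/admin"), "block"),
]

print(evaluate(rules, {"path": "/search", "query": "q=1' OR '1'='1"}))
# ('allow', ['sqli-probe'])
print(evaluate(rules, {"path": "/admin", "query": ""}))
# ('block', [])
```

The first request is allowed but leaves a log entry, which is exactly the data a log-only rollout collects before the rule's action is changed to block.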
Managed rulesets and the OWASP Core Rule Set
Managed rulesets are vendor-maintained collections of signatures mapped to attack classes: SQL injection (SQLi), cross-site scripting (XSS), local/remote file inclusion, command injection, and known CVE exploit patterns. The OWASP Core Rule Set (CRS) is the open-source canonical example and ships behind most commercial WAFs in some form. CRS operates in one of two modes:
- Anomaly scoring (recommended): each matching rule adds to a per-request score; the request is actioned only when the cumulative score crosses a threshold. This dramatically reduces false positives because a single weak signal cannot block traffic on its own.
- Traditional / self-contained: any single matching rule blocks immediately. Simpler, but far noisier.
The CRS paranoia level (PL1–PL4) trades coverage for false positives. PL1 is safe for almost all sites; PL3–PL4 catch more obscure attacks but routinely flag legitimate input and should only run after extensive tuning.
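A toy model shows how the scoring mode and paranoia level interact. It is a sketch under loud assumptions: real CRS rules use regular expressions and input transformations, whereas the `Signature` entries, their substring needles, and their weights here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    name: str
    needle: str          # substring to look for (hypothetical, not a CRS rule)
    score: int           # weight added to the request's anomaly score
    paranoia_level: int  # lowest paranoia level at which the rule is active


SIGNATURES = [
    Signature("sqli-tautology", "' or '", 5, 1),
    Signature("script-tag", "<script", 5, 1),
    Signature("sql-comment", "--", 3, 2),  # weak signal, only active at PL2+
]


def matched(payload: str, paranoia_level: int) -> list[Signature]:
    """Signatures active at this paranoia level that match the payload."""
    text = payload.lower()
    return [s for s in SIGNATURES
            if s.paranoia_level <= paranoia_level and s.needle in text]


def decide(payload: str, paranoia_level: int = 1, threshold: int = 5,
           anomaly_scoring: bool = True) -> str:
    hits = matched(payload, paranoia_level)
    if anomaly_scoring:
        # Block only when the cumulative score reaches the threshold.
        return "block" if sum(s.score for s in hits) >= threshold else "allow"
    # Self-contained mode: any single match blocks.
    return "block" if hits else "allow"


benign = "release notes -- see the changelog"
attack = "q=1' OR '1'='1 --"

print(decide(benign, paranoia_level=2))                         # allow
print(decide(benign, paranoia_level=2, anomaly_scoring=False))  # block
print(decide(attack, paranoia_level=1))                         # block
```

The benign text trips only the weak comment signature, so anomaly scoring lets it through while self-contained mode produces a false positive.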
Custom rules and expressions
Custom rules are where you encode application-specific policy. Providers expose a filter language (Cloudflare’s Rules language, AWS WAF’s statement JSON, Fastly’s signals) that matches on fields like http.request.uri.path, http.request.method, http.request.headers, ip.src, ip.geoip.country, and ip.geoip.asnum. Country and ASN matching overlaps with Geo-Targeted Traffic Routing — the difference is intent: geo routing steers legitimate users to the nearest region, whereas a WAF country rule blocks or challenges traffic from regions you do not serve.
# Cloudflare Rules language — block admin paths from outside your office ASN
(http.request.uri.path matches "^/admin" and ip.geoip.asnum ne 64500)
# Challenge POSTs to login that lack an expected header
(http.request.uri.path eq "/login" and http.request.method eq "POST"
and not http.request.headers["x-app-client"][0] eq "web")
Rate limiting: windows and keys
Rate limiting counts requests per key over a window and acts when the count exceeds a threshold. The two design choices that matter most:
- Key: what identifies a “client.” The default is source IP, but IPs are shared behind carrier-grade NAT and trivially rotated by attackers using proxy pools. Stronger keys combine IP with a header (Authorization, an API key, a session cookie) or a fingerprint. Keying by Authorization token rate-limits a credential rather than an address, which is the right unit for API abuse.
- Window: fixed windows reset the counter at wall-clock boundaries (e.g. every 60s), which is cheap but allows a burst of up to 2× the limit straddling the boundary. Sliding windows weight the previous window’s count proportionally, smoothing that edge at higher cost. Most edge platforms approximate sliding windows.
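The boundary burst is easiest to see by running both window types against the same traffic. The sketch below is a single-process illustration with hypothetical class names (`FixedWindow`, `SlidingWindow`); it keeps counters in a plain dict and never evicts old windows, which a real limiter would have to do.

```python
class FixedWindow:
    """Counter that resets at every window boundary."""

    def __init__(self, limit: int, window: float):
        self.limit, self.window = limit, window
        self.counts: dict[tuple[str, int], int] = {}  # (key, window index) -> count

    def allow(self, key: str, now: float) -> bool:
        idx = int(now // self.window)
        current = self.counts.get((key, idx), 0)
        if current >= self.limit:
            return False
        self.counts[(key, idx)] = current + 1
        return True


class SlidingWindow(FixedWindow):
    """Approximation: weight the previous window by how much of it still overlaps."""

    def allow(self, key: str, now: float) -> bool:
        idx = int(now // self.window)
        elapsed = (now % self.window) / self.window  # fraction of current window used
        previous = self.counts.get((key, idx - 1), 0)
        current = self.counts.get((key, idx), 0)
        estimate = previous * (1 - elapsed) + current
        if estimate >= self.limit:
            return False
        self.counts[(key, idx)] = current + 1
        return True


def allowed_in_burst(limiter: FixedWindow) -> int:
    # 10 requests just before the 60s boundary, 10 more just after it.
    times = [59.0] * 10 + [61.0] * 10
    return sum(limiter.allow("203.0.113.7", t) for t in times)


print(allowed_in_burst(FixedWindow(limit=10, window=60)))    # 20
print(allowed_in_burst(SlidingWindow(limit=10, window=60)))  # 11
```

With a limit of 10 per 60 seconds, the fixed window admits all 20 requests within two seconds, while the sliding approximation admits 11.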
Provider-specific implementation
Cloudflare
Cloudflare splits the controls into Managed Rules, custom WAF rules, and Rate Limiting Rules, all expressible in Terraform via cloudflare_ruleset. Enable a managed ruleset by deploying the entry-point ruleset for the managed phase:
resource "cloudflare_ruleset" "managed_owasp" {
  zone_id = var.zone_id
  name    = "Managed OWASP"
  kind    = "zone"
  phase   = "http_request_firewall_managed"

  rules {
    action      = "execute"
    description = "Cloudflare OWASP Core Ruleset"
    expression  = "true"
    action_parameters {
      id = "4814384a9e5d4991b9815dcfc25d2f1f" # Cloudflare OWASP CRS
      overrides {
        action = "log" # start in log mode
        categories {
          category = "paranoia-level-2"
          action   = "log"
        }
      }
    }
  }
}
A custom WAF rule lives in the http_request_firewall_custom phase:
resource "cloudflare_ruleset" "custom_waf" {
  zone_id = var.zone_id
  name    = "Custom WAF"
  kind    = "zone"
  phase   = "http_request_firewall_custom"

  rules {
    action      = "managed_challenge"
    # Description matches what the expression actually tests:
    # non-EU source and a low bot score, not datacenter ASNs.
    description = "Challenge likely-bot logins from outside the EU"
    expression  = "(http.request.uri.path eq \"/login\" and ip.geoip.is_in_european_union eq false and cf.bot_management.score lt 30)"
    enabled     = true
  }
}
Rate Limiting Rules use the dedicated http_ratelimit phase with a ratelimit block:
resource "cloudflare_ruleset" "rl_login" {
  zone_id = var.zone_id
  name    = "Login rate limit"
  kind    = "zone"
  phase   = "http_ratelimit"

  rules {
    action     = "block"
    expression = "(http.request.uri.path eq \"/api/login\")"
    ratelimit {
      characteristics     = ["ip.src", "cf.colo.id"]
      period              = 60
      requests_per_period = 10
      mitigation_timeout  = 600
      # Count only failed logins. The counting expression does not inherit the
      # rule expression, so the path is repeated here.
      counting_expression = "(http.request.uri.path eq \"/api/login\" and http.response.code eq 401)"
    }
  }
}
Note counting_expression lets you count responses (failed logins) while the rule matches on the request — the canonical pattern for stopping credential stuffing without throttling legitimate users.
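To make the "count responses, act on requests" idea concrete, here is a small in-memory simulation. It is not how Cloudflare implements the feature: `FailedLoginLimiter` and its methods are hypothetical, and the defaults simply mirror the `period`, `requests_per_period`, and `mitigation_timeout` values from the Terraform above.

```python
from collections import defaultdict


class FailedLoginLimiter:
    """Blocks a key only after too many 401 responses on /api/login."""

    def __init__(self, limit: int = 10, period: float = 60,
                 mitigation_timeout: float = 600):
        self.limit = limit
        self.period = period
        self.mitigation_timeout = mitigation_timeout
        self.failures: dict[str, list[float]] = defaultdict(list)
        self.blocked_until: dict[str, float] = {}

    def is_blocked(self, key: str, now: float) -> bool:
        """Checked on the request path, before the origin is contacted."""
        return now < self.blocked_until.get(key, 0.0)

    def record_response(self, key: str, path: str, status: int, now: float) -> None:
        """Called after the origin responds; only failed logins are counted."""
        if path != "/api/login" or status != 401:
            return
        recent = [t for t in self.failures[key] if now - t < self.period]
        recent.append(now)
        self.failures[key] = recent
        if len(recent) > self.limit:  # threshold exceeded within the period
            self.blocked_until[key] = now + self.mitigation_timeout


limiter = FailedLoginLimiter()

# A busy office behind one NAT address: 50 successful logins, none counted.
for second in range(50):
    limiter.record_response("office-nat", "/api/login", 200, float(second))

# An attacker guessing passwords: 11 failures in 11 seconds.
for second in range(11):
    limiter.record_response("attacker", "/api/login", 401, float(second))

print(limiter.is_blocked("office-nat", 50.0))  # False
print(limiter.is_blocked("attacker", 11.0))    # True
print(limiter.is_blocked("attacker", 610.0))   # False, mitigation has expired
```

Successful logins never move the counter, so volume alone does not penalize a shared address.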
AWS WAF
AWS WAF attaches a Web ACL to a CloudFront distribution, ALB, or API Gateway. The Web ACL contains ordered rules and rule groups, including AWS Managed Rules (e.g. AWSManagedRulesCommonRuleSet, which mirrors much of CRS) and RateBasedStatement rules.
{
  "Name": "edge-web-acl",
  "Scope": "CLOUDFRONT",
  "DefaultAction": { "Allow": {} },
  "Rules": [
    {
      "Name": "AWSCommon",
      "Priority": 1,
      "OverrideAction": { "Count": {} },
      "Statement": {
        "ManagedRuleGroupStatement": {
          "VendorName": "AWS",
          "Name": "AWSManagedRulesCommonRuleSet"
        }
      },
      "VisibilityConfig": {
        "SampledRequestsEnabled": true,
        "CloudWatchMetricsEnabled": true,
        "MetricName": "AWSCommon"
      }
    },
    {
      "Name": "LoginRateLimit",
      "Priority": 2,
      "Action": { "Block": {} },
      "Statement": {
        "RateBasedStatement": {
          "Limit": 100,
          "EvaluationWindowSec": 60,
          "AggregateKeyType": "IP",
          "ScopeDownStatement": {
            "ByteMatchStatement": {
              "FieldToMatch": { "UriPath": {} },
              "PositionalConstraint": "STARTS_WITH",
              "SearchString": "/api/login",
              "TextTransformations": [{ "Priority": 0, "Type": "NONE" }]
            }
          }
        }
      },
      "VisibilityConfig": {
        "SampledRequestsEnabled": true,
        "CloudWatchMetricsEnabled": true,
        "MetricName": "LoginRateLimit"
      }
    }
  ]
}
OverrideAction: Count is AWS WAF’s log-only mode for a managed group: it records matches as CloudWatch metrics and sampled requests without blocking. AWS rate-based rules support AggregateKeyType of IP, FORWARDED_IP, or custom keys built from headers/cookies, and EvaluationWindowSec of 60, 120, 300, or 600 seconds.
Fastly Next-Gen WAF (Signal Sciences)
Fastly’s Next-Gen WAF works on signals rather than static signatures alone. Each request accrues signals (SQLI, XSS, plus custom signals you define), and rules act on signal thresholds over a window — effectively merging WAF and rate limiting into one model. The agent/module pattern inspects requests inline and reports to the console:
# Fastly NGWAF site rule (conceptual): flag and rate-limit on a custom signal
WHEN request.path ~ "^/api/" AND signal == "SQLI"
THEN add-signal "API-ATTACK"
RATE-LIMIT signal "API-ATTACK" > 5 in 60s per client.ip
ACTION block for 600s
Signals make tuning incremental: you can watch a signal’s volume in the console, add it to a templated rule, then promote that rule from “log” to “block” once the signal cleanly separates attackers from users.
Platform comparison
| Provider | WAF mechanism | Rate-limit wire behavior | Failover / Notes |
|---|---|---|---|
| Cloudflare | Managed Rulesets + Rules-language custom rules, anomaly scoring | Distributed counters per PoP; block/challenge/js_challenge; counting_expression on responses | Runs before Workers/cache; challenges integrate with Turnstile; fail-open by design |
| AWS WAF | Web ACL with AWS/marketplace managed rule groups + JSON statements | RateBasedStatement, 60–600s windows, IP/forwarded-IP/custom keys | Attaches to CloudFront/ALB/APIGW; counters are region-replicated with slight lag |
| Fastly NGWAF | Signal-based detection + custom signals, threshold rules | Signal rate thresholds per key/window, inline at the agent | Unified WAF+rate-limit model; tuning via console signal volume |
| OWASP CRS (self-hosted) | Open ruleset, anomaly scoring, PL1–PL4 | No native rate limiting (pair with nginx limit_req / Envoy) | Portable across ModSecurity/Coraza; you own tuning and updates |
Step-by-step rollout procedure
Never deploy a new WAF or rate-limit rule straight to block. Follow a staged promotion and measure at each stage.
- Baseline traffic. Enable WAF logging/sampled requests and let it run for at least 24–72 hours to capture daily cycles, or a full week if your traffic varies by weekday. Export request samples so you know your normal request rate per endpoint and the legitimate countries, ASNs, and user agents.
- Deploy in log-only mode. Set the managed ruleset override to log and custom rules to log (Cloudflare) or Count (AWS). Confirm matches appear in the security event log but no request is blocked.
- Analyze matches for false positives. For every rule firing, classify the matched requests. Legitimate traffic matching a signature is a false positive that must be tuned out before enforcement.
- Promote to challenge. Switch ambiguous rules (especially login, signup, and search) to a managed challenge or JS challenge rather than a hard block. Humans pass transparently; most bots fail. This buys a safety margin over outright blocking.
- Promote clear-cut rules to block. Move unambiguous rules (directory traversal, known CVE exploits, requests to nonexistent admin paths) to block.
- Enable rate limiting last. Start with a generous threshold (e.g. 2× observed p99), key by IP, and run in log/count. Tighten toward the real abuse threshold and, for auth endpoints, switch the key to the credential and count only failed responses.
- Verify and document. Replay an attack from a staging IP and confirm the block. Record each rule’s intent, expected match volume, and rollback command in your runbook.
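The "2× observed p99" starting threshold from the rate-limiting step can be derived from the baseline export with a short script. This is a sketch: the function names are hypothetical, the input is assumed to be one requests-per-minute figure per client, and the nearest-rank percentile is one of several reasonable definitions.

```python
def percentile(values: list[int], pct: int) -> int:
    """Nearest-rank percentile using integer arithmetic (pct in 1..100)."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling division
    return ordered[rank - 1]


def starting_threshold(per_client_rpm: list[int], multiplier: int = 2) -> int:
    """Generous initial limit: multiplier x the p99 per-client request rate."""
    return multiplier * percentile(per_client_rpm, 99)


# Hypothetical baseline: 100 clients whose peak rates were 1..100 requests/minute.
baseline = list(range(1, 101))
print(percentile(baseline, 99))      # 99
print(starting_threshold(baseline))  # 198
```

Run the resulting limit in log/count mode first; the number is a starting point to tighten, not a tuned value.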
Verify a rule end to end with curl, checking for the block status and the WAF’s response signature:
# Should be blocked (403) by the SQLi managed rule
curl -sS -o /dev/null -w "%{http_code}\n" \
  "https://app.example.com/search?q=1%27%20OR%20%271%27%3D%271"
# Hammer the login endpoint with bad credentials to trip the rate limit.
# Expect 401s, then 429s once failed logins exceed the threshold
# (AWS WAF's Block action returns 403 by default instead of 429).
for i in $(seq 1 25); do
  curl -sS -o /dev/null -w "%{http_code} " \
    -X POST "https://app.example.com/api/login" \
    -d '{"u":"x","p":"y"}' -H 'content-type: application/json'
done; echo
Caching, TTL, and propagation implications
WAF and rate-limiting changes are control-plane configuration, not data records, so they do not carry DNS TTLs. On Cloudflare and AWS, ruleset updates propagate to the edge globally within seconds to a couple of minutes; there is no resolver caching to wait out the way there is when you adjust a DNS TTL. Three interactions still matter:
- Security runs before cache, so blocks are never cached. A blocked request never reaches the cache or origin, and a block response is generated fresh each time. Conversely, a cached 200 served from the edge still passes through the WAF and rate-limit phases first, so a cached asset can still be rate limited.
- Challenge responses must not be cached. Ensure challenge interstitials carry Cache-Control: no-store; a cached challenge page would lock out users. Providers handle this automatically, but verify if you front the WAF with a separate cache layer.
- Rate-limit counters are per-PoP or per-region. Each PoP counts independently before any aggregation, so an attacker spread across many PoPs can send far more in total than a single-PoP calculation suggests. Defending against distributed attackers may therefore call for a lower per-PoP threshold.
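The per-PoP point is simple arithmetic, sketched below with hypothetical helper names. It assumes the worst case, in which counters are fully independent and the attacker spreads requests evenly across every PoP they can reach.

```python
def worst_case_global_rate(per_pop_limit: int, pops_reached: int) -> int:
    """Requests per window an attacker can send before any single PoP trips."""
    return per_pop_limit * pops_reached


def per_pop_limit_for_budget(global_budget: int, pops_reached: int) -> int:
    """Per-PoP limit that keeps the worst-case total within a global budget."""
    return max(1, global_budget // pops_reached)


# A limit of 10 requests/minute, with an attacker reaching 30 PoPs:
print(worst_case_global_rate(10, 30))     # 300
# To hold the total near 100 requests/minute across those 30 PoPs:
print(per_pop_limit_for_budget(100, 30))  # 3
```

Legitimate users usually hit a single PoP, so lowering the per-PoP limit tightens their budget as well; weigh that against the distributed-attack exposure.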
Troubleshooting and rollback
| Symptom | Likely cause | Action |
|---|---|---|
| Legitimate users get 403 | Managed rule false positive (often CRS PL too high) | Identify the rule ID in the event log; add a skip/exclusion override scoped to the path, or lower paranoia level |
| 429s during normal traffic | Rate limit threshold too low or key too coarse (shared NAT IP) | Raise threshold to 2× p99; add a more specific key (header/cookie) so one IP ≠ one user |
| Rule “does nothing” | Wrong phase ordering, or an earlier skip/allow short-circuits it | Check phase order; confirm no higher-priority allow rule matches first |
| API clients challenged | JS/CAPTCHA challenge applied to non-browser traffic | Exempt authenticated API paths; use block not challenge for machine endpoints |
| Body-based attack not caught | Body inspection disabled or payload exceeds inspection size limit | Enable body inspection; note size caps (e.g. AWS WAF inspects the first 8 KB for ALB and 16 KB by default for CloudFront and API Gateway) |
Rollback protocol. Keep every rule in version control (Terraform/CloudFormation). To roll back fast: (1) set the offending rule’s action back to log/Count or disable it, (2) apply — propagation is seconds, (3) confirm the security event log shows the rule no longer enforcing, (4) re-run the verification curl to confirm legitimate traffic flows. Because config propagates far faster than DNS, a bad WAF rule is one of the quickest production changes to reverse — favor disabling a single rule over tearing down the whole ruleset.
Edge cases and gotchas
- Trust the right client IP. Behind another proxy or load balancer, the WAF may see the proxy IP. Use FORWARDED_IP (AWS) or configure the true-client-IP header so IP rules and rate-limit keys reflect the real client.
- Inspection size limits are real. Large JSON or multipart bodies may exceed the WAF’s inspectable window; an attacker can pad a payload past the limit. Cap request body size at the edge and validate critical fields explicitly.
- Encoding evasions. Double-URL-encoding, mixed case, and comment injection defeat naive signatures; rely on managed rules’ built-in TextTransformations/normalization rather than hand-rolled regex.
- Bot scoring is probabilistic. A bot_management.score threshold will misclassify some traffic; combine it with corroborating signals (ASN, missing headers) before blocking, and prefer challenge over block where humans might be affected.
- Rate-limit keys leak through shared infrastructure. Keying solely on IP punishes everyone behind a corporate NAT or mobile carrier; always layer in an application identity for authenticated endpoints.
- Don’t WAF your health checks. Exempt monitoring and uptime-check sources by ASN or a shared secret header so your own probes aren’t throttled or blocked.
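The encoding-evasion gotcha is worth seeing once in code. The sketch below shows why a hand-rolled regex misses a double-encoded payload and what normalization has to do first. It illustrates the problem and is not a replacement for a managed ruleset: the `normalize` helper and the sample payload are hypothetical.

```python
import re
from urllib.parse import unquote


def normalize(value: str, max_rounds: int = 3) -> str:
    """Repeatedly URL-decode, replace inline SQL comments with a space, lowercase."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:  # nothing left to decode
            break
        value = decoded
    value = re.sub(r"/\*.*?\*/", " ", value, flags=re.S)
    return value.lower()


# A naive, case-sensitive signature of the kind the text warns against.
naive_signature = re.compile(r"union\s+select")

# "UNION/**/SeLeCt", URL-encoded twice, with mixed case and a comment as whitespace.
payload = "UNION%252F%252A%252A%252FSeLeCt"

print(naive_signature.search(payload) is None)                 # True, evasion succeeds
print(naive_signature.search(normalize(payload)) is not None)  # True, caught after normalizing
```

Managed rules apply transformations like these before matching, which is the reason to lean on them instead of maintaining your own decoding logic.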
Frequently Asked Questions
Should I put the WAF in front of my edge functions or inside them? Put it in front. WAF and rate-limiting phases run before your routing logic and API gateway, so a malicious request is dropped without ever invoking compute, saving cost and reducing attack surface. Reserve in-function checks for business-logic authorization that the WAF cannot express.
How do I avoid blocking real users when I turn on managed rules? Always start in log-only (Cloudflare log, AWS Count), run for at least a full traffic cycle, and classify every match. Tune out false positives with scoped exclusions or a lower OWASP paranoia level before promoting to challenge, and only then to block. The staged log → challenge → block path is the single most important practice.
What’s the right rate-limit key for a login endpoint? Combine source IP with the submitted username or session, and count only failed responses (401/403) using a counting expression. This throttles brute-force attempts against a specific credential without penalizing a busy office that shares one NAT IP, and it ignores successful logins entirely.
Does a WAF rule change propagate as slowly as a DNS change? No. WAF and rate-limit rules are control-plane config that reaches every PoP in seconds to a couple of minutes, with no resolver caching involved. That is why rolling back a bad rule is far faster than waiting out a DNS TTL — disable the rule, re-apply, and verify with curl.