Weighted Load Balancing Across Multi-Region Origins

This guide shows you how to distribute traffic across multiple regional origins using explicit weights, so you can shape capacity, run gradual rollouts, and drain a region cleanly without touching DNS or redeploying code. After reading it you will be able to assign per-origin weights in both Cloudflare Load Balancing and Amazon Route 53, layer weights on top of latency and geo steering, push a canary release at a fixed percentage, and verify the resulting split with a sampling loop. Weighted distribution is the foundation of safe multi-region operation and a natural companion to Configuring Edge Health Checks and Automatic Failover within your broader Load Balancing at the Edge strategy.

Key objectives:

Assign per-origin and per-pool weights to control the exact traffic share each region receives.
Combine static weights with latency or geo steering so proximity and capacity are honored together.
Drain a region for maintenance by setting its weight to zero, and run canary releases at small fixed percentages.
Verify the realized split by sampling an origin-identifying response header in a curl loop.

Prerequisites and environment setup

Before assigning weights you need a load balancer that already fronts two or more origins grouped into pools, each pool backed by a working health check. Weights only divide traffic among origins that the load balancer considers healthy, so an unhealthy pool silently receives zero share regardless of its configured weight. If you have not yet wired up health monitoring, configure it first using Configuring Edge Health Checks and Automatic Failover.

For the Cloudflare examples you need Terraform 1.5+, the cloudflare/cloudflare provider 4.x, a zone on a plan that includes Load Balancing, and an API token scoped to Load Balancing: Edit, DNS: Edit, and Zone: Read. For the Route 53 example you need the AWS CLI v2 authenticated against a hosted zone. Export your credentials so Terraform and the CLI can read them.

export CLOUDFLARE_API_TOKEN="<lb-scoped-token>"
export AWS_PROFILE="dns-ops"
terraform -version    # expect: Terraform v1.5.x or newer
aws sts get-caller-identity --query Account --output text

Each origin must return a header that uniquely identifies which region served the request. Add a static response header at the origin (or inject one in your edge function) such as x-origin: us-east-1. This header is the only reliable way to measure the realized split from the outside, since the load balancer hostname is identical for every region.

Step-by-step procedure

1. Assign origin and pool weights

A weight is a relative number, not a percentage. Cloudflare normalizes the weights of all healthy origins in a pool (and all healthy pools in a load balancer) so they sum to 1.0. Two origins weighted 0.6 and 0.3 therefore receive roughly 67% and 33%, not 60% and 30% — the missing 10% is redistributed. To get a clean 60/30/10 split, make the weights sum to 1.0 explicitly.

resource "cloudflare_load_balancer_pool" "us_east" {
  account_id = var.account_id
  name       = "us-east-pool"
  origins {
    name    = "us-east-1"
    address = "us-east.origin.example.com"
    enabled = true
    weight  = 1.0
  }
  monitor = cloudflare_load_balancer_monitor.https.id
}

resource "cloudflare_load_balancer" "app" {
  zone_id          = var.zone_id
  name             = "app.example.com"
  default_pool_ids = [
    cloudflare_load_balancer_pool.us_east.id,
    cloudflare_load_balancer_pool.eu_west.id,
    cloudflare_load_balancer_pool.ap_south.id,
  ]
  fallback_pool_id = cloudflare_load_balancer_pool.us_east.id
  steering_policy  = "random"          # weighted random across pools
  proxied          = true

  random_steering {
    default_weight = 0.1               # AP-South inherits this
    pool_weights = {
      (cloudflare_load_balancer_pool.us_east.id) = 0.6
      (cloudflare_load_balancer_pool.eu_west.id) = 0.3
    }
  }
}

Apply the change and confirm the plan adds only the steering block. Expected side effect: traffic begins shifting within seconds of terraform apply because Load Balancing config propagates through Cloudflare’s edge without a DNS TTL wait.

terraform apply -target=cloudflare_load_balancer.app
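
The normalization arithmetic behind this step can be checked with a few lines of Python. This is a minimal sketch of the arithmetic described above, not Cloudflare's implementation; the `normalize` function and the pool names are hypothetical and chosen only for illustration.

```python
def normalize(weights, healthy=None):
    """Return the expected traffic share of each healthy, non-zero-weight entry.

    weights: mapping of name -> relative weight
    healthy: optional set of names treated as healthy (default: all of them)
    """
    if healthy is None:
        healthy = set(weights)
    # Unhealthy or zero-weight entries take no share at all.
    live = {name: w for name, w in weights.items() if name in healthy and w > 0}
    total = sum(live.values())
    if total == 0:
        return {}
    return {name: w / total for name, w in live.items()}


# Two pools weighted 0.6 and 0.3: the shares are 2/3 and 1/3, not 60% and 30%.
two = normalize({"us-east": 0.6, "eu-west": 0.3})

# Adding the third pool at 0.1 makes the weights sum to 1.0, giving 60/30/10.
three = normalize({"us-east": 0.6, "eu-west": 0.3, "ap-south": 0.1})

# If eu-west fails its health check, the survivors split its share.
degraded = normalize(
    {"us-east": 0.6, "eu-west": 0.3, "ap-south": 0.1},
    healthy={"us-east", "ap-south"},
)

for label, shares in (("two", two), ("three", three), ("degraded", degraded)):
    print(label, {name: round(share, 3) for name, share in shares.items()})
```

The `degraded` case shows why an unhealthy pool changes everyone else's share: the remaining weights are renormalized over a smaller total.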

2. Combine weights with latency or geo steering

Pure weighted random ignores proximity, so a user in Frankfurt may still be sent to US-East 60% of the time. To respect geography while keeping weights as a tiebreaker, switch steering_policy to "proximity" or "geo". Proximity steering needs latitude and longitude set on each pool, and geo steering needs region or country pool mappings on the load balancer. With proximity steering, Cloudflare picks the closest healthy pool first and only falls back to weights when pools are equidistant or co-located. If you need finer regional control than the load balancer offers, route at the function layer using the patterns in Implementing Geo-Routing with Edge Functions for Latency Reduction.

  steering_policy = "proximity"
  # pool_weights still apply as the tiebreaker among nearby pools

Expected behavior: a request from Germany now lands on EU-West almost always, while the 60/30/10 weights only govern traffic where no pool is clearly closer. This is the correct setup for “serve locally, but cap each region’s share.”

3. Drain a region by setting its weight to zero

To take AP-South out of rotation for maintenance without deleting the pool or breaking health-check history, set its weight to zero. Existing in-flight requests complete; new requests route to the remaining weighted pools. This is gentler than disabling the pool, which trips failover logic and can fire alerts.

  random_steering {
    default_weight = 0.0               # AP-South drains to zero
    pool_weights = {
      (cloudflare_load_balancer_pool.us_east.id) = 0.67
      (cloudflare_load_balancer_pool.eu_west.id) = 0.33
    }
  }

Side effect: the two remaining pools absorb the drained region’s share according to their relative weights. Restore traffic by setting the weight back to a non-zero value; ramping it up gradually (0.05, then 0.2, then full) is the safe way to reintroduce a region after a deploy.
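
To preview what each stage of a drain and re-ramp does to the other regions, the shares can be computed ahead of the apply. The sketch below assumes the 0.67/0.33 weights from this step and the first two ramp values mentioned above; `shares` and `ramp` are hypothetical helper names, and the arithmetic is plain normalization rather than anything Cloudflare-specific.

```python
def shares(weights):
    """Normalize relative weights into fractions of total traffic."""
    total = sum(weights.values())
    return {name: (w / total if total else 0.0) for name, w in weights.items()}


def ramp(base, pool, steps):
    """Return (weight, shares) for each stage of ramping one pool's weight."""
    stages = []
    for weight in steps:
        stage = dict(base)      # copy so the base configuration is untouched
        stage[pool] = weight
        stages.append((weight, shares(stage)))
    return stages


# Drained state from this step, then a gradual reintroduction of ap-south.
base = {"us-east": 0.67, "eu-west": 0.33, "ap-south": 0.0}
plan = ramp(base, "ap-south", [0.0, 0.05, 0.2])

for weight, stage in plan:
    print(weight, {name: round(share, 3) for name, share in stage.items()})
```

Because only one weight changes per stage, the ratio between the two untouched pools stays constant while their absolute shares shrink slightly.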

4. Use weights for canary releases

A canary is just an additional origin in the same pool, weighted small. Point it at the new build and give it weight = 0.05 against 0.95 for the stable origin. Because the weight applies inside the pool, the canary receives 5% of that region’s traffic and is covered by the pool’s health check, so a failing canary is ejected automatically.

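resource "cloudflare_load_balancer_pool" "us_east" {
  account_id = var.account_id
  name       = "us-east-pool"
  # Stable build keeps 95% of this pool's traffic.
  origins {
    name    = "us-east-stable"
    address = "us-east-stable.origin.example.com"
    enabled = true
    weight  = 0.95
  }
  # Canary build receives the remaining 5%.
  origins {
    name    = "us-east-canary"
    address = "us-east-canary.origin.example.com"
    enabled = true
    weight  = 0.05
  }
  # Same monitor as in step 1, so an unhealthy canary is ejected.
  monitor = cloudflare_load_balancer_monitor.https.id
}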

Promote by raising the canary weight and lowering the stable weight in step across successive applies, until the canary carries all of the pool’s traffic. Roll back instantly by setting the canary weight to 0, which is far faster than a DNS-based cutover that would wait on cached records.
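
A promotion plan is easier to review when the weight pairs are generated rather than typed by hand. The following is a sketch under assumptions: the percentage steps are illustrative, and `canary_schedule` is a hypothetical helper that only produces the numbers to paste into the two origins blocks.

```python
def canary_schedule(percent_steps):
    """Return stable/canary weight pairs for each promotion step.

    Steps are whole percentages so each pair sums to exactly 100%.
    """
    plan = []
    for pct in percent_steps:
        if not 0 <= pct <= 100:
            raise ValueError(f"canary percentage out of range: {pct}")
        plan.append({"stable": (100 - pct) / 100, "canary": pct / 100})
    return plan


# Illustrative promotion path: 5% -> 25% -> 50% -> 100%.
schedule = canary_schedule([5, 25, 50, 100])

# Rollback is the 0% step: all traffic returns to the stable origin.
rollback = canary_schedule([0])[0]

for step in schedule:
    print(f'stable weight = {step["stable"]}, canary weight = {step["canary"]}')
```

Keeping each pair summing to 1.0 means the configured weights read directly as traffic shares within the pool at every step.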

5. The Route 53 equivalent

If you run DNS-level weighting instead of an edge load balancer, Route 53 weighted record sets achieve a similar split. Each record carries a Weight, and Route 53 returns each record with probability weight / sum(weights). Note the key difference from edge LB: clients cache the answer for the record TTL, so the realized split is far coarser and slower to change. Keep the TTL low (such as 60s) for any record you intend to reweight.

{
  "Comment": "60/30/10 weighted split across regions",
  "Changes": [
    {"Action": "UPSERT", "ResourceRecordSet": {
      "Name": "app.example.com", "Type": "CNAME", "TTL": 60,
      "SetIdentifier": "us-east-1", "Weight": 60,
      "ResourceRecords": [{"Value": "us-east.origin.example.com"}]}},
    {"Action": "UPSERT", "ResourceRecordSet": {
      "Name": "app.example.com", "Type": "CNAME", "TTL": 60,
      "SetIdentifier": "eu-west-1", "Weight": 30,
      "ResourceRecords": [{"Value": "eu-west.origin.example.com"}]}},
    {"Action": "UPSERT", "ResourceRecordSet": {
      "Name": "app.example.com", "Type": "CNAME", "TTL": 60,
      "SetIdentifier": "ap-south-1", "Weight": 10,
      "ResourceRecords": [{"Value": "ap-south.origin.example.com"}]}}
  ]
}

aws route53 change-resource-record-sets \
  --hosted-zone-id Z123456ABCDEFG \
  --change-batch file://weighted.json

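Set a record’s weight to 0 to drain it; Route 53 stops returning that record as long as at least one other record in the group has a non-zero weight. If every record in the group is weighted 0, Route 53 returns all of them with equal probability, so a zero weight acts as a safe maintenance toggle only while another record still carries weight.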
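
The selection rule, including the all-zero case, can be written out as a small function. This is a sketch of the documented weight / sum(weights) behavior, not AWS code; `route53_probabilities` is a hypothetical name and the record identifiers mirror the SetIdentifier values above.

```python
def route53_probabilities(weights):
    """Return the probability that each weighted record is the one returned."""
    total = sum(weights.values())
    if total == 0:
        # Every record weighted 0: all are returned with equal probability.
        count = len(weights)
        return {name: 1 / count for name in weights} if count else {}
    return {name: w / total for name, w in weights.items()}


normal = route53_probabilities({"us-east-1": 60, "eu-west-1": 30, "ap-south-1": 10})
drained = route53_probabilities({"us-east-1": 60, "eu-west-1": 30, "ap-south-1": 0})
all_zero = route53_probabilities({"us-east-1": 0, "eu-west-1": 0, "ap-south-1": 0})

for label, probs in (("normal", normal), ("drained", drained), ("all_zero", all_zero)):
    print(label, {name: round(p, 3) for name, p in probs.items()})
```

These are probabilities per DNS answer. Because resolvers cache each answer for the TTL, the share of actual requests follows them only loosely.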

Verification

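Measure the realized split from the outside by sampling the endpoint repeatedly and tallying the origin header. Each curl invocation below is a separate process and so opens a new connection, which keeps connection reuse from pinning every sample to one origin. curl’s --no-keepalive flag only turns off TCP keepalive probes and does not force new connections, so it is omitted here.

for i in $(seq 1 200); do
  curl -s -o /dev/null -D - \
    https://app.example.com/healthz \
  | awk 'tolower($1)=="x-origin:"{gsub(/\r/,""); print $2}'   # strip the CR that ends each header line
done | sort | uniq -c | sort -rn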

Expected output for a healthy 60/30/10 configuration over 200 samples (counts will vary by chance):

    119 us-east-1
     64 eu-west-1
     17 ap-south-1

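Confirm the load balancer reports the steering policy and pool weights you intended. ZONE is assumed to hold your zone ID. This call shows configuration only; pool health is reported separately in the dashboard and the pools API.

curl -s -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  "https://api.cloudflare.com/client/v4/zones/$ZONE/load_balancers" \
  | jq '.result[] | {name, steering_policy, random_steering}'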

A single sample should never be trusted; only the aggregate over a few hundred requests converges on the configured ratio.
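
How far a sample may drift "by chance" can be estimated with a binomial model: for n requests and configured share p, the expected count is n·p with standard deviation sqrt(n·p·(1−p)). The sketch below applies a three-sigma band to the sample counts shown above. The function name, the `z=3.0` threshold, and the assumption that requests are independent are all choices made here for illustration.

```python
import math


def within_tolerance(observed, weights, z=3.0):
    """Compare observed per-origin counts with the configured weights.

    Returns name -> (expected_count, sigma, ok), where ok is True when the
    observed count lies within z standard deviations of the expectation.
    """
    n = sum(observed.values())
    total_weight = sum(weights.values())
    report = {}
    for name, weight in weights.items():
        p = weight / total_weight
        expected = n * p
        sigma = math.sqrt(n * p * (1 - p))
        count = observed.get(name, 0)
        report[name] = (expected, sigma, abs(count - expected) <= z * sigma)
    return report


weights = {"us-east-1": 0.6, "eu-west-1": 0.3, "ap-south-1": 0.1}

# Counts from the sample output above: consistent with 60/30/10.
observed = {"us-east-1": 119, "eu-west-1": 64, "ap-south-1": 17}
report = within_tolerance(observed, weights)

# A skewed sample, such as one produced by caching or connection reuse.
skewed = within_tolerance({"us-east-1": 190, "eu-west-1": 8, "ap-south-1": 2}, weights)

for name, (expected, sigma, ok) in report.items():
    print(f"{name}: expected {expected:.0f} +/- {z_sigma:.1f}, ok={ok}"
          if (z_sigma := 3 * sigma) else name)
```

At 200 samples the three-sigma band for the 60% pool is roughly plus or minus 21 requests, which is why small samples cannot distinguish 60/30/10 from nearby ratios.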

Troubleshooting

Symptom	Likely cause	Fix
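Split heavily favors one origin	HTTP keepalive reusing one connection	Sample with one `curl` process per request so each opens a new connection (`--no-keepalive` only disables TCP keepalive probes); in production confirm clients open fresh connections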
Edge cache serves one region	Cacheable response pinned to first-hit pool	Set `Cache-Control: no-store` on the probe path or bypass cache for `/healthz`
Frankfurt always hits US-East	Proximity steering off; pure weighted random	Switch `steering_policy` to `proximity` or `geo`
Weight 0.6/0.3 yields 67/33	Weights normalized only across healthy origins	Make weights sum to 1.0, or expect redistribution when a pool is down
Canary gets 0% traffic	Canary origin failing its health check	Run `curl -I` against the canary address; fix the check before reweighting

Uneven splits from caching or keepalive

The most common false alarm is a “broken” split that is actually a measurement artifact. A reused TCP connection sticks to whichever origin served the first request, and a CDN edge cache returns the same cached body regardless of weight. Always probe an uncacheable path with fresh connections, and confirm by adding a cache-status header to your sample loop.

Session stickiness overriding weights

If you enabled session affinity (cookie or IP-based stickiness), a returning client is pinned to its original origin and ignores subsequent weight changes. This is correct behavior for stateful apps, but it means a drain or canary shift only affects new sessions. Reduce the affinity TTL before a drain, or accept that existing sessions bleed off gradually.

Weight not honored under a steering policy

Weights interact differently with each steering policy. Under proximity and geo, weights are only a tiebreaker among co-located pools, so a global weight change may appear ignored for users near a single region. If you need weights to dominate everywhere, use random steering; if you need geography to dominate, accept that weights apply locally. Check the steering_policy field in the load balancer’s API response to see which policy is live before concluding a weight is broken.

Frequently Asked Questions

Do weights represent exact percentages of traffic? No. Weights are relative values that the load balancer normalizes across healthy origins. A 0.6/0.3 pair with nothing else alongside it yields roughly 67/33; the shares match the configured numbers only when the weights of all healthy origins sum to 1.0 (or, for Route 53’s integer weights, to 100). Even then, the realized share converges on the ratio only over many requests.

Why is my measured split so far off the configured weights? Almost always connection reuse or edge caching during measurement. Probe an uncacheable path with a new connection per request (one curl process per sample does this) and sample at least a few hundred requests before comparing against the configured ratio.

How do I cleanly take a region offline for maintenance? Set that origin or pool weight to zero. In-flight requests finish, new requests route to the remaining weighted pools, and health-check history is preserved — unlike disabling the pool, which can trigger failover alerts.

Can I combine weighted load balancing with latency-based routing? Yes. Use proximity or geo steering so the closest healthy pool is chosen first, with weights acting as a tiebreaker among nearby pools. This gives you “serve locally but cap each region’s share.”

What is the difference between Cloudflare LB weights and Route 53 weighted records? Cloudflare applies weights at the edge per request and changes take effect in seconds. Route 53 weights are applied in DNS answers that clients cache for the record TTL, so the split is coarser and slower to change — keep the TTL low when reweighting.
