
CDN and Edge Analytics: Measuring Traffic Before JavaScript Loads

Client-side analytics are blind to 12-22% of traffic.

Google Analytics 4, Plausible, Fathom, and most other analytics platforms rely on JavaScript that executes in the user's browser. JavaScript loads → fires tracking beacon → data recorded.

The gap: Not all visitors allow JavaScript to execute.

Who's invisible:

Combined: 18-30% of actual HTTP requests never appear in client-side analytics.

The solution: Measure traffic at the edge, server-side, before content reaches the browser. CDN (Content Delivery Network) logs capture every HTTP request, regardless of JavaScript execution. Edge analytics expose true traffic volume.

Use cases:

CDN edge analytics = ground truth. Client-side analytics = sample of engaged users.



CDN Edge Logs vs Client-Side Analytics: Understanding the Gap

Two measurement layers capture different realities.

What CDN Edge Logs Capture

Edge logs = server-side records of HTTP requests.

CDN (Cloudflare, Fastly, AWS CloudFront, Akamai) sits between user and origin server. Every request passes through CDN edge nodes.

Edge logs record:

| Field | Example | Use |
|---|---|---|
| Request timestamp | 2026-02-07 14:32:18 UTC | Traffic timing analysis |
| Client IP | 192.0.2.146 | Geographic analysis, bot detection |
| User agent | Mozilla/5.0... Chrome/120.0 | Browser/device identification |
| Request URL | /blog/article-title | Page-level traffic |
| HTTP status | 200, 404, 500 | Success rate, error tracking |
| Bytes transferred | 124,583 bytes | Bandwidth consumption |
| Referrer | google.com, twitter.com | Traffic source |
| Cache status | HIT, MISS | Performance optimization |
| TLS version | TLS 1.3 | Security analysis |

Crucially: Edge logs capture every HTTP request, even if:

This is raw traffic reality.

What Client-Side Analytics Miss

GA4 and similar tools require JavaScript execution + network beacon success.

Failure points:

1. JavaScript load failure

2. Analytics blocker interference

3. Browser privacy settings

4. Corporate network filtering

5. Bot traffic

Measurement comparison:

Example site:

Gap breakdown:

Validating Client-Side Data with Edge Metrics

Cross-reference GA4 against edge logs to detect undercounting.

Method 1: Total request comparison

Edge logs (Cloudflare Analytics):

GA4:

Discrepancy: 20% (edge captures 25,000 more requests)

Conclusion: GA4 undercounts by ~20%. Either analytics blockers are prevalent in the audience or bot traffic is higher than expected.

Method 2: User-agent matching

Edge logs contain user-agent strings. Extract and compare to GA4 browser distribution.

Edge log browser breakdown:

GA4 browser breakdown:

Discrepancy: The edge logs show 4% of requests from the Edge browser; GA4 shows 2%. This suggests Edge users block analytics at a higher rate (roughly 50% blocked, if total volumes are comparable).

Action: If targeting Edge users (enterprise audience), GA4 severely undercounts this segment.

Method 3: Geographic validation

Edge logs include client IP → GeoIP lookup.

Edge geographic distribution:

GA4 geographic distribution:

Discrepancy: Germany is underrepresented in GA4 (6% of edge requests vs 5% of GA4 sessions). One possible cause is higher privacy-tool adoption in Germany (GDPR-aware users).


Implementing CDN Analytics Across Major Providers

Each CDN offers edge analytics. Setup varies.

Cloudflare Analytics Setup and Interpretation

Cloudflare provides edge analytics natively (available on all plans, including Free).

Access:

  1. Log in to Cloudflare dashboard
  2. Select domain
  3. Navigate to Analytics & Logs

Key metrics:

Requests:

Bandwidth:

Threats:

Performance:

Traffic sources:

Custom filtering:

Cloudflare Logpush (Business+ plans):

Example query (BigQuery):

SELECT
  ClientRequestURI,
  COUNT(*) as requests,
  SUM(EdgeResponseBytes) as total_bytes
FROM edge_logs
WHERE EdgeResponseStatus = 200
  AND ClientRequestURI NOT LIKE '%/admin%'
GROUP BY ClientRequestURI
ORDER BY requests DESC
LIMIT 100

Result: Top 100 pages by edge request volume (true traffic, not GA4-filtered).

Bot detection:

Cloudflare classifies traffic:

Filter:

Validation workflow:

  1. Cloudflare total requests: 127,000
  2. Filter out bot traffic: 112,000 (human requests)
  3. GA4 sessions: 94,000
  4. Analytics blocking rate: 16% (18,000 / 112,000)
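The workflow above is plain arithmetic, so a minimal Python sketch can make it explicit. The function name `blocking_rate` is illustrative, and the calculation follows the article in treating one human edge request as roughly one GA4 session.

```python
def blocking_rate(human_edge_requests: int, ga4_sessions: int) -> float:
    """Percentage of human edge requests that never appear in GA4."""
    if human_edge_requests <= 0:
        raise ValueError("human_edge_requests must be positive")
    missing = human_edge_requests - ga4_sessions
    return missing / human_edge_requests * 100


total_requests = 127_000  # all edge requests (step 1)
human_requests = 112_000  # after filtering out bot traffic (step 2)
ga4_sessions = 94_000     # GA4 sessions (step 3)

# Share of edge traffic that was filtered out as bots
bot_share = (total_requests - human_requests) / total_requests * 100
rate = blocking_rate(human_requests, ga4_sessions)

print(f"Bot share of edge traffic: {bot_share:.1f}%")
print(f"Analytics blocking rate: {rate:.1f}%")
```

Keeping the bot share separate from the blocking rate makes it easier to see which of the two is driving a change in the overall gap.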

Fastly Real-Time Analytics Configuration

Fastly offers real-time edge analytics with granular filtering.

Access:

  1. Log in to Fastly dashboard
  2. Select service
  3. Navigate to Observability → Real-time analytics

Real-time dashboard:

Metrics updated every second:

Custom dimensions:

Filter by:

Historical analysis:

Navigate to Observability → Historical stats:

Advanced logging (Fastly Log Streaming):

Send edge logs to:

Setup:

  1. Navigate to Configure → Logging
  2. Select endpoint (e.g., S3)
  3. Configure log format (JSON, Apache, custom)
  4. Choose fields: %{req.url}V %{resp.status}V %{client.geo.country_code}V
  5. Save, deploy

Logs stream in real-time.

Comparing Fastly to GA4:

Fastly edge requests (HTML only):

Filter: Content-Type: text/html (exclude CSS, JS, images)

GA4 pageviews: 94,000

Gap: 4% (4,000 requests)

Interpretation: Low gap (4%) suggests minimal analytics blocking (audience less privacy-conscious or fewer bot requests).

AWS CloudFront Logs and CloudWatch Metrics

AWS CloudFront separates real-time monitoring (CloudWatch) and detailed logs (access logs).

CloudWatch Metrics (real-time):

Access:

  1. AWS Console → CloudFront → Select distribution
  2. Navigate to Monitoring tab

Metrics:

Granularity: 1-minute intervals

Use case: Real-time traffic spikes, performance monitoring.

Access Logs (detailed historical):

Setup:

  1. CloudFront distribution settings
  2. Enable Standard Logging
  3. Specify S3 bucket for log storage
  4. Logs delivered every ~1 hour

Log format: Tab-delimited text file

Fields include:

Analysis:

Query the logs in place with Amazon Athena (a SQL query engine that reads directly from S3):

Athena table creation:

CREATE EXTERNAL TABLE cloudfront_logs (
  request_date DATE,
  time STRING,
  location STRING,
  bytes BIGINT,
  request_ip STRING,
  method STRING,
  host STRING,
  uri STRING,
  status INT,
  referrer STRING,
  user_agent STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://your-bucket/cloudfront-logs/'
TBLPROPERTIES ('skip.header.line.count'='2');

Query example (top pages):

SELECT
  uri,
  COUNT(*) as requests,
  COUNT(DISTINCT request_ip) as unique_ips
FROM cloudfront_logs
WHERE status = 200
  AND request_date >= DATE '2026-02-01'
GROUP BY uri
ORDER BY requests DESC
LIMIT 50

Result: True request volume per page (edge-level, pre-JavaScript).
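For small log samples, the same count can be sketched locally without Athena. The following Python is a minimal illustration, assuming the column order used in the table definition above (URI in the eighth column, status in the ninth); `top_pages` and the sample rows are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Zero-based column positions, following the table definition above:
# request_date, time, location, bytes, request_ip, method, host, uri, status, ...
URI_COL = 7
STATUS_COL = 8


def top_pages(lines: Iterable[str], limit: int = 50) -> List[Tuple[str, int]]:
    """Count status-200 requests per URI in tab-delimited log lines."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and the '#'-prefixed header lines at the top of each file.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= STATUS_COL:
            continue  # malformed or truncated row
        if fields[STATUS_COL] == "200":
            counts[fields[URI_COL]] += 1
    return counts.most_common(limit)


# Hypothetical sample rows, for illustration only.
sample = [
    "#Version: 1.0",
    "#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status",
    "2026-02-01\t14:32:18\tFRA2\t124583\t192.0.2.146\tGET\texample.com\t/homepage\t200\t-\tMozilla/5.0",
    "2026-02-01\t14:32:19\tFRA2\t512\t192.0.2.147\tGET\texample.com\t/missing\t404\t-\tMozilla/5.0",
    "2026-02-01\t14:32:20\tFRA2\t98000\t192.0.2.148\tGET\texample.com\t/homepage\t200\t-\tMozilla/5.0",
]
print(top_pages(sample))
```

This is only practical for spot checks; at production volume the Athena query remains the better fit.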

Comparing to GA4:

Export GA4 page data, join with Athena results:

| Page | CloudFront Requests | GA4 Pageviews | Gap % |
|---|---|---|---|
| /homepage | 32,400 | 28,200 | 13% |
| /article-1 | 18,600 | 16,100 | 13% |
| /article-2 | 12,800 | 9,400 | 27% |

Insight: Article-2 has 27% gap (high analytics blocking). Possible causes:


Analyzing Edge Data for Bot Traffic and Anomaly Detection

Edge logs expose non-human traffic that client-side analytics miss.

Distinguishing Good Bots from Malicious Scrapers

Bot classification:

Good bots (allow):

Bad bots (block):

Neutral bots (evaluate):

Detection via user-agent:

Edge logs include User-Agent header.

Good bot user-agents:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
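A first-pass filter on these strings might look like the sketch below. It only checks what the user-agent claims, using the two crawler tokens shown above; the function name `claims_good_bot` is illustrative. Because user-agents can be spoofed, a match is a hint rather than proof.

```python
import re

# Tokens taken from the two user-agent strings above; extend the pattern as needed.
GOOD_BOT_PATTERN = re.compile(r"googlebot|bingbot", re.IGNORECASE)


def claims_good_bot(user_agent: str) -> bool:
    """Return True if the user-agent string claims to be a known search crawler."""
    return bool(GOOD_BOT_PATTERN.search(user_agent))


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0"

print(claims_good_bot(googlebot))
print(claims_good_bot(browser))
```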

Bad bot patterns:

Detection via request patterns:

Good bots:

Bad bots:

Cloudflare Bot Management:

Enable:

  1. Security → Bots
  2. Select bot protection level:
    • Off: Allow all
    • Essentially off: Allow verified bots + humans
    • Low: Challenge suspicious bots
    • Medium: Challenge most bots (recommended)
    • High: Challenge all non-verified

Bot Score:

Cloudflare assigns score 1-99:

Filter edge analytics:

View traffic by Bot Score:

If GA4 shows 94,000 sessions and edge shows 104,000 human requests:

Detecting Analytics Blocker Prevalence

Method: User-agent + edge log correlation

Step 1: Export edge logs

Filter: Status 200, Content-Type text/html, Bot Score >56 (humans)

Result: 104,000 requests (verified human traffic)

Step 2: Export GA4 user-agent data

GA4 → Explore → Free form:

Result: 94,000 sessions

Step 3: Match distributions

| Browser | Edge Requests | GA4 Sessions | Blocking Rate |
|---|---|---|---|
| Chrome | 70,720 (68%) | 67,680 (72%) | 4.3% |
| Safari | 18,720 (18%) | 17,860 (19%) | 4.6% |
| Firefox | 8,320 (8%) | 6,580 (7%) | 20.9% |
| Brave | 3,120 (3%) | 940 (1%) | 69.9% |

Insight: Firefox users block analytics at 21% rate. Brave users (privacy browser) block at 70% rate.
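The blocking rates above come from (edge requests - GA4 sessions) / edge requests for each browser. A short Python sketch of that step, using the figures from the comparison (the function name `blocking_rates` is illustrative):

```python
from typing import Dict

edge_requests = {"Chrome": 70_720, "Safari": 18_720, "Firefox": 8_320, "Brave": 3_120}
ga4_sessions = {"Chrome": 67_680, "Safari": 17_860, "Firefox": 6_580, "Brave": 940}


def blocking_rates(edge: Dict[str, int], ga4: Dict[str, int]) -> Dict[str, float]:
    """Per-browser share of edge requests missing from GA4, in percent."""
    rates = {}
    for browser, requests in edge.items():
        sessions = ga4.get(browser, 0)  # a browser absent from GA4 counts as fully blocked
        rates[browser] = round((requests - sessions) / requests * 100, 1)
    return rates


for browser, rate in blocking_rates(edge_requests, ga4_sessions).items():
    print(f"{browser}: {rate}%")
```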

Audience implication:

If targeting privacy-conscious users (developers, security professionals), expect 15-30% analytics undercounting.

Rate Limiting and DDoS Detection

Edge analytics surface traffic spikes invisible to GA4.

Anomaly: Request spike without session spike

Edge logs:

GA4:

Discrepancy: Edge spike not reflected in GA4 → automated traffic.

Investigation:

Filter edge logs for that hour:

Diagnosis: Scraper bot, likely harvesting content.

Response:

Cloudflare Firewall Rule:

  1. Security → WAF → Firewall rules
  2. Create rule: (ip.src in {185.220.0.0/16}) then Block
  3. Deploy

Result: Bot blocked at edge, doesn't reach origin server.

Legitimate spike detection:

Edge logs:

GA4:

Correlation: Both spike → legitimate viral traffic (article shared on Hacker News).

Referrer analysis (edge logs):

Confirmation: Viral traffic, not attack.
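The two spike scenarios reduce to one comparison: did GA4 sessions rise along with edge requests? The sketch below is a rough heuristic under assumed inputs. The function name, the baseline arguments, and the 3x spike factor are all hypothetical choices, not values from any CDN or from GA4.

```python
def classify_spike(
    edge_requests: int,
    edge_baseline: int,
    ga4_sessions: int,
    ga4_baseline: int,
    factor: float = 3.0,  # assumed threshold: 3x the normal hourly volume
) -> str:
    """Label an hour as 'normal', 'likely automated', or 'likely legitimate'."""
    if edge_baseline <= 0 or ga4_baseline <= 0:
        raise ValueError("baselines must be positive")
    edge_spike = edge_requests >= factor * edge_baseline
    ga4_spike = ga4_sessions >= factor * ga4_baseline
    if not edge_spike:
        return "normal"
    # An edge spike that GA4 also sees points to real visitors;
    # an edge-only spike points to traffic that never runs JavaScript.
    return "likely legitimate" if ga4_spike else "likely automated"


# Hypothetical hourly figures, for illustration only.
print(classify_spike(40_000, 5_000, 4_100, 4_000))
print(classify_spike(40_000, 5_000, 30_000, 4_000))
```

A label from this check is a starting point for the IP and referrer investigation described above, not a replacement for it.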


Building Hybrid Analytics: Combining Edge and Client-Side Data

Neither edge nor client-side analytics alone tells the full story. Combine them for accuracy.

Creating Unified Dashboard with Edge + GA4 Metrics

Goal: Single view showing edge reality + user engagement.

Architecture:

  1. Edge data source: Cloudflare, Fastly, or CloudFront logs
  2. Client-side data: GA4
  3. Data warehouse: Google BigQuery, AWS Redshift, or Snowflake
  4. Visualization: Looker Studio, Tableau, or custom dashboard

ETL pipeline:

Step 1: Export edge logs to BigQuery

Cloudflare Logpush:

Schema:

| Field | Type |
|---|---|
| timestamp | TIMESTAMP |
| client_ip | STRING |
| url | STRING |
| status | INTEGER |
| bytes | INTEGER |
| user_agent | STRING |
| referrer | STRING |

Step 2: Export GA4 to BigQuery

GA4 → Admin → BigQuery Links:

Step 3: Join datasets

BigQuery SQL:

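WITH edge_traffic AS (
  -- Successful edge requests per day and path
  SELECT
    DATE(timestamp) AS date,
    url,
    COUNT(*) AS edge_requests
  FROM `project.cdn_logs.edge_requests`
  WHERE status = 200
  GROUP BY date, url
),
ga4_pages AS (
  -- One row per page_view, reduced to the fields needed for the join
  SELECT
    -- event_date is a 'YYYYMMDD' string in the GA4 export; convert it to DATE
    PARSE_DATE('%Y%m%d', event_date) AS date,
    -- page_location is a full URL; keep only the path so it matches the edge url
    REGEXP_EXTRACT(
      (SELECT value.string_value FROM UNNEST(event_params) WHERE key = 'page_location'),
      r'^https?://[^/]+([^?#]*)'
    ) AS url,
    user_pseudo_id,
    (SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') AS session_id
  FROM `project.analytics_XXXXXX.events_*`
  WHERE event_name = 'page_view'
),
ga4_traffic AS (
  -- A session is identified by user_pseudo_id plus ga_session_id
  SELECT
    date,
    url,
    COUNT(DISTINCT CONCAT(user_pseudo_id, '-', CAST(session_id AS STRING))) AS ga4_sessions
  FROM ga4_pages
  GROUP BY date, url
)

SELECT
  e.date,
  e.url,
  e.edge_requests,
  COALESCE(g.ga4_sessions, 0) AS ga4_sessions,
  e.edge_requests - COALESCE(g.ga4_sessions, 0) AS gap,
  SAFE_DIVIDE(e.edge_requests - COALESCE(g.ga4_sessions, 0), e.edge_requests) * 100 AS gap_percentage
FROM edge_traffic e
LEFT JOIN ga4_traffic g
  ON e.date = g.date AND e.url = g.url
ORDER BY e.date DESC, e.edge_requests DESC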

Output:

| Date | URL | Edge Requests | GA4 Sessions | Gap | Gap % |
|---|---|---|---|---|---|
| 2026-02-07 | /homepage | 3,240 | 2,890 | 350 | 10.8% |
| 2026-02-07 | /article-1 | 1,820 | 1,640 | 180 | 9.9% |
| 2026-02-07 | /privacy-guide | 980 | 720 | 260 | 26.5% |

Insight: Privacy-guide article has 26.5% gap → audience blocking analytics (self-selection bias: privacy-interested readers use privacy tools).

Dashboard visualization (Looker Studio):

Chart 1: Total traffic comparison

Chart 2: Analytics blocking rate

Chart 3: Top pages by gap

Alert rules:

Reconciling Discrepancies Between Data Sources

Edge and client-side data will never match perfectly. Understand acceptable variance.

Expected discrepancies:

Analytics blockers: 8-18%

Bot traffic: 3-12%

JavaScript load failures: 2-5%

Total expected gap: 15-30%

Acceptable: Edge requests 15-25% higher than GA4 sessions

Investigate if:

Reconciliation process:

Step 1: Filter edge logs to human traffic only

Remove:

Filtered edge requests: 112,000 (from 127,000 total)

Step 2: Compare to GA4

GA4 sessions: 94,000

Gap: 16% (acceptable range)

Step 3: Segment gap by traffic source

| Source | Edge Requests | GA4 Sessions | Gap % |
|---|---|---|---|
| Organic search | 68,000 | 62,000 | 8.8% |
| Direct | 22,000 | 18,000 | 18.2% |
| Social | 15,000 | 10,500 | 30.0% |
| Referral | 7,000 | 3,500 | 50.0% |

Insight: Social and referral traffic have high gaps → shared in communities with privacy tools (Reddit, Hacker News).
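Step 3 can be automated by flagging any source whose gap exceeds the acceptable range. The sketch below uses the per-source figures from this step and takes 25% (the upper end of the acceptable range stated earlier) as the cutoff; the function name `sources_to_investigate` is illustrative.

```python
from typing import Dict, List, Tuple

ACCEPTABLE_GAP_PCT = 25.0  # upper end of the acceptable 15-25% range

# source -> (edge requests, GA4 sessions)
sources = {
    "Organic search": (68_000, 62_000),
    "Direct": (22_000, 18_000),
    "Social": (15_000, 10_500),
    "Referral": (7_000, 3_500),
}


def sources_to_investigate(
    data: Dict[str, Tuple[int, int]],
    threshold: float = ACCEPTABLE_GAP_PCT,
) -> List[Tuple[str, float]]:
    """Return (source, gap %) pairs whose gap exceeds the threshold."""
    flagged = []
    for name, (edge, ga4) in data.items():
        gap_pct = (edge - ga4) / edge * 100
        if gap_pct > threshold:
            flagged.append((name, round(gap_pct, 1)))
    return flagged


print(sources_to_investigate(sources))
```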


FAQ

Do I need to pay for CDN analytics or are they included?

  • Cloudflare: Basic analytics free on all plans (Free, Pro, Business, Enterprise). Advanced features (Logpush, custom retention) require Business+ ($200+/month).
  • Fastly: Real-time analytics included in all plans. Historical stats and log streaming available.
  • AWS CloudFront: CloudWatch metrics free (basic). Access logs free (stored in S3, pay S3 storage costs ~$0.023/GB/month).
  • Vercel/Netlify: Analytics add-on costs $10-20/month.

Most CDNs include basic analytics; advanced features may cost extra.

Can edge analytics completely replace GA4?

No. Edge analytics measure requests (server-side). GA4 measures engagement (client-side). Edge logs show page was requested but not if user read it, scrolled, clicked, or converted. Edge analytics validate volume. GA4 tracks behavior. Use both: Edge for accurate traffic counts + bot detection, GA4 for user journeys + conversions + attribution. Hybrid approach yields complete picture.

How accurate is bot detection in edge analytics?

  • Cloudflare Bot Management: 95%+ accuracy (proprietary ML model).
  • AWS CloudFront: No built-in bot detection (requires manual user-agent filtering, 70-85% accuracy).
  • Fastly: Good (uses device detection + patterns, 85-92% accuracy).
  • DIY bot filtering via user-agent matching: 60-75% accuracy (bots can spoof user-agents).

Best accuracy: Combine CDN bot scoring + behavioral analysis (request rate, patterns, JavaScript execution). No solution is perfect: sophisticated bots evade detection.

What's the performance impact of enabling detailed edge logging?

Negligible. Edge logs are generated server-side (CDN level) without affecting client page load. Logging adds <1ms latency (imperceptible). Storage costs depend on traffic volume: 100k requests/day ≈ 500MB logs/day ≈ $0.35/month (S3 storage). Network transfer costs (if streaming logs to external analytics): ~$0.01-0.09 per GB. For most sites, edge logging costs <$10/month. High-traffic sites (10M+ requests/day): $50-200/month for log storage + processing.

How do I handle GDPR/privacy compliance with edge analytics?

Edge logs contain IP addresses (personal data under GDPR). Compliance options:

  1. Anonymize IPs before storage (hash or truncate last octet).
  2. Set retention limits (delete logs after 30-90 days).
  3. Exclude EU traffic from detailed logging (Cloudflare allows geo-based rules).
  4. Use aggregated analytics only (don't store raw logs).

GA4 does not store IP addresses; edge logs require manual privacy configuration. If serving EU users, consult legal counsel on log retention policies.
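The first option (hashing or truncating IPs before storage) can be sketched with Python's standard library. The function names are illustrative, and the /48 prefix for IPv6 is an assumption, since the text above only mentions dropping the last IPv4 octet. Note that a salted hash of an IPv4 address is pseudonymization rather than full anonymization, because the address space is small enough to enumerate if the salt leaks.

```python
import hashlib
import ipaddress


def truncate_ip(ip: str) -> str:
    """Zero the last octet of an IPv4 address, or keep only the /48 prefix of IPv6."""
    addr = ipaddress.ip_address(ip)
    prefix = 24 if addr.version == 4 else 48  # /48 for IPv6 is an assumed choice
    network = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(network.network_address)


def hash_ip(ip: str, salt: bytes) -> str:
    """Replace an IP with a salted SHA-256 hex digest."""
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()


print(truncate_ip("192.0.2.146"))
print(truncate_ip("2001:db8:abcd:1234::1"))
print(hash_ip("192.0.2.146", salt=b"rotate-me-regularly"))
```

Truncation keeps coarse geographic analysis possible, while hashing keeps per-visitor counting possible; which one fits depends on what the logs are used for.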
