Your Jekyll site seems to be running fine, but you're flying blind. You don't know whether it's actually available to visitors worldwide, how fast it loads in different regions, or when errors occur. Without that visibility, problems go undetected until users complain, and issues discovered too late can damage your reputation and search rankings. You need a proactive monitoring system that leverages Cloudflare's global network and Ruby's automation capabilities.

Building a Monitoring Architecture for Static Sites

Monitoring a Jekyll site calls for a different approach than monitoring a dynamic application. With no server-side processing to watch, you focus on four things: (1) content delivery performance, (2) uptime and availability, (3) user experience metrics, and (4) third-party service dependencies. Cloudflare provides the foundation with its global vantage points, while Ruby gems add automation and integration capabilities.

The architecture should be multi-layered: infrastructure monitoring (is the site reachable), performance monitoring (how fast it loads), content monitoring (are links and assets intact), and business monitoring (are traffic and conversions trending as expected). Each layer uses different Cloudflare data sources and Ruby tools. The goal is to detect issues before users do and to automate responses to common problems.

Four-Layer Monitoring Architecture

Layer	What It Monitors	Cloudflare Data Source	Ruby Tools
Infrastructure	DNS, SSL, Network	Health Checks, SSL Analytics	net-http, ssl-certificate gems
Performance	Load times, Core Web Vitals	Speed Analytics, Real User Monitoring	benchmark, ruby-prof gems
Content	Broken links, missing assets	Cache Analytics, Error Analytics	nokogiri, link-checker gems
Business	Traffic trends, conversions	Web Analytics, GraphQL Analytics	chartkick, gruff gems

Essential Cloudflare Metrics for Jekyll Sites

Cloudflare provides dozens of metrics. Focus on these key ones for Jekyll:

1. Cache Hit Ratio

Measures how often Cloudflare serves cached content vs fetching from origin. Ideal: >90%.

# Fetch via API
def cache_hit_ratio
  response = cf_api_get("zones/#{zone_id}/analytics/dashboard", {
    since: '-1440', # 24 hours
    until: '0'
  })
  
  totals = response['result']['totals']
  cached = totals['requests']['cached']
  total = totals['requests']['all']
  
  return 0.0 if total.zero? # avoid dividing by zero on a zone with no traffic

  (cached.to_f / total * 100).round(2)
end
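
The snippets in this article call a `cf_api_get` helper without defining it. A minimal sketch of one follows, assuming a Cloudflare API token stored in a `CLOUDFLARE_API_TOKEN` environment variable. The variable name and the `cf_api_uri` helper are illustrative choices, not something the Cloudflare API prescribes.

```ruby
require 'net/http'
require 'json'
require 'uri'

# Base URL of the Cloudflare v4 REST API (trailing slash matters for URI.join).
CF_API_BASE = 'https://api.cloudflare.com/client/v4/'

# Build the full request URI for an API path plus optional query parameters.
def cf_api_uri(path, params = {})
  uri = URI.join(CF_API_BASE, path)
  uri.query = URI.encode_www_form(params) unless params.empty?
  uri
end

# Perform an authenticated GET and return the parsed JSON body.
# Assumes the token lives in CLOUDFLARE_API_TOKEN (hypothetical variable name).
def cf_api_get(path, params = {})
  uri = cf_api_uri(path, params)
  request = Net::HTTP::Get.new(uri)
  request['Authorization'] = "Bearer #{ENV.fetch('CLOUDFLARE_API_TOKEN')}"

  response = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
    http.request(request)
  end
  unless response.is_a?(Net::HTTPSuccess)
    raise "Cloudflare API returned HTTP #{response.code}"
  end

  JSON.parse(response.body)
end
```

Keeping the URI construction in its own method makes the helper easy to check without touching the network.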

2. Origin Response Time

How long GitHub Pages takes to respond. Should be < 200ms.

def origin_response_time
  data = cf_api_get("zones/#{zone_id}/healthchecks/analytics")
  data['result']['origin_response_time']['p95'] # 95th percentile
end
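
If you collect raw response timings yourself rather than reading a precomputed value, the 95th percentile can be derived with the nearest-rank method. This is a sketch under that assumption; the `percentile` helper and the sample timings are hypothetical.

```ruby
# Nearest-rank percentile: the smallest sample such that at least
# pct percent of all samples are less than or equal to it.
def percentile(samples, pct)
  raise ArgumentError, 'samples must not be empty' if samples.empty?

  sorted = samples.sort
  rank = (pct * sorted.length / 100.0).ceil # 1-based rank
  sorted[[rank, 1].max - 1]
end

# Hypothetical origin response times in milliseconds.
timings_ms = [120, 95, 180, 110, 400, 105, 130, 98, 115, 125]
puts "p50: #{percentile(timings_ms, 50)} ms"
puts "p95: #{percentile(timings_ms, 95)} ms"
```

The p95 is more useful than the average here because a single slow outlier (the 400 ms sample) is exactly what visitors notice.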

3. Error Rate (5xx Status Codes)

Monitor for GitHub Pages outages or misconfigurations.

def error_rate
  data = cf_api_get("zones/#{zone_id}/http/analytics", {
    dimensions: ['statusCode'],
    filters: 'statusCode ge 500'
  })
  
  error_requests = data['result'].sum { |r| r['metrics']['requests'] }
  total_requests = get_total_requests
  return 0.0 if total_requests.zero? # no traffic in the window

  (error_requests.to_f / total_requests * 100).round(2)
end
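
The same calculation can be done locally once you have request counts grouped by status code. The sketch below assumes a plain hash of status code to request count, which is an illustrative shape rather than an actual API response.

```ruby
# Compute the share of 5xx responses from a { status_code => count } hash.
def error_rate_from_counts(status_counts)
  total = status_counts.values.sum
  return 0.0 if total.zero? # no traffic means no measurable error rate

  errors = status_counts.sum { |status, count| status.to_i >= 500 ? count : 0 }
  (errors.to_f / total * 100).round(2)
end

# Hypothetical counts for one hour of traffic.
counts = { '200' => 950, '404' => 30, '502' => 15, '503' => 5 }
puts "5xx error rate: #{error_rate_from_counts(counts)}%"
```

Note that 404s are deliberately excluded: on a static site they usually indicate bad inbound links, not an origin outage.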

4. Core Web Vitals via Browser Insights

Real user experience metrics:

def core_web_vitals
  cf_api_get("zones/#{zone_id}/speed/api/insights", {
    metrics: ['lcp', 'fid', 'cls']
  })
end
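
Raw vitals are easier to act on once they are rated. The sketch below applies Google's published "good" and "poor" boundaries for each metric. It also includes INP, which replaced FID as a Core Web Vital in March 2024. The `rate_vital` helper and the metric keys are illustrative names.

```ruby
# Published Core Web Vitals boundaries: values at or below :good are "good",
# values above :poor are "poor", anything in between needs improvement.
CWV_THRESHOLDS = {
  lcp: { good: 2500, poor: 4000 }, # milliseconds
  fid: { good: 100,  poor: 300 },  # milliseconds
  inp: { good: 200,  poor: 500 },  # milliseconds
  cls: { good: 0.1,  poor: 0.25 }  # unitless layout-shift score
}.freeze

def rate_vital(metric, value)
  limits = CWV_THRESHOLDS.fetch(metric)
  return :good if value <= limits[:good]
  return :needs_improvement if value <= limits[:poor]

  :poor
end

# Hypothetical 75th-percentile field values.
{ lcp: 2100, inp: 620, cls: 0.18 }.each do |metric, value|
  puts "#{metric.to_s.upcase}: #{value} (#{rate_vital(metric, value)})"
end
```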

Ruby Gems for Enhanced Monitoring

Extend Cloudflare's capabilities with these gems:

1. cloudflare-rails

Though designed for Rails, adapt it for Jekyll monitoring:

gem 'cloudflare-rails'

# Configure for monitoring
Cloudflare::Rails.configure do |config|
  config.ips = []  # Don't trust Cloudflare IPs for Jekyll
  config.logger = Logger.new('log/cloudflare.log')
end

# Use its middleware to log requests
use Cloudflare::Rails::Middleware

2. health_check

Create health check endpoints:

gem 'health_check'

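# Create a health check route (Sinatra-style; the check_* helpers are defined elsewhere)
require 'sinatra'
require 'json'
require 'time' # Time#iso8601

get '/health' do
  content_type :json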
  {
    status: 'healthy',
    timestamp: Time.now.iso8601,
    checks: {
      cloudflare: check_cloudflare_connection,
      github_pages: check_github_pages,
      dns: check_dns_resolution
    }
  }.to_json
end
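
The route above relies on helpers such as `check_dns_resolution` that are not shown. One possible sketch uses Ruby's standard `resolv` library. Unlike the call in the route, this version takes the hostname as an argument, and the injectable `resolver:` parameter is an assumption added here so the check can be exercised without network access.

```ruby
require 'resolv'

# Resolve a hostname and report whether any address came back.
# `resolver` only needs to respond to getaddresses(hostname).
def check_dns_resolution(hostname, resolver: Resolv)
  addresses = resolver.getaddresses(hostname)
  { ok: !addresses.empty?, addresses: addresses }
rescue StandardError => e
  # Treat resolver failures as an unhealthy result instead of crashing the endpoint.
  { ok: false, addresses: [], error: e.message }
end
```

In the health route you would call it with your own domain, for example `check_dns_resolution('example.com')`, and return the resulting hash as part of `checks`.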

3. whenever + clockwork

Schedule monitoring tasks:

gem 'whenever'

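# config/schedule.rb
# Note: `runner` shells out to the Rails runner; in a plain Ruby project,
# use whenever's `rake` or `command` job types instead.
every 5.minutes do
  runner "CloudflareMonitor.check_metrics"
end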

every 1.hour do
  runner "PerformanceAuditor.run_full_check"
end

4. slack-notifier

Send alerts to Slack:

gem 'slack-notifier'

require 'slack-notifier'

# A constant is visible inside method bodies; a top-level local variable is not.
NOTIFIER = Slack::Notifier.new(
  ENV['SLACK_WEBHOOK_URL'],
  channel: '#site-alerts',
  username: 'Jekyll Monitor'
)

def send_alert(message, level: :warning)
  NOTIFIER.post(
    text: message,
    icon_emoji: level == :critical ? ':fire:' : ':warning:'
  )
end

Setting Up Automated Alerts and Notifications

Create smart alerts that trigger only when necessary:

# lib/monitoring/alert_manager.rb
class AlertManager
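  # :direction says which way is bad. :above alerts when the value rises to a
  # threshold; :below alerts when it falls to one (cache hit ratio, uptime).
  ALERT_THRESHOLDS = {
    cache_hit_ratio: { warn: 80, critical: 60, direction: :below },         # percentage
    origin_response_time: { warn: 500, critical: 1000, direction: :above }, # ms
    error_rate: { warn: 1, critical: 5, direction: :above },                # percentage
    uptime: { warn: 99.5, critical: 99.0, direction: :below }               # percentage
  }.freeze
  
  def self.check_and_alert
    metrics = CloudflareMetrics.fetch
    
    ALERT_THRESHOLDS.each do |metric, thresholds|
      value = metrics[metric]
      next if value.nil? # metric unavailable in this run
      
      breached = lambda do |limit|
        thresholds[:direction] == :below ? value <= limit : value >= limit
      end
      
      if breached.call(thresholds[:critical])
        send_alert("#{metric.to_s.upcase} CRITICAL: #{value}", :critical)
      elsif breached.call(thresholds[:warn])
        send_alert("#{metric.to_s.upcase} Warning: #{value}", :warning)
      end
    end
  end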
  
  def self.send_alert(message, level)
    # Send to multiple channels
    SlackNotifier.send(message, level)
    EmailNotifier.send(message, level) if level == :critical
    
    # Log to file
    File.open('log/alerts.log', 'a') do |f|
      f.puts "[#{Time.now}] #{level.upcase}: #{message}"
    end
  end
end

# Run every 15 minutes
AlertManager.check_and_alert

Add alert deduplication to prevent spam:

require 'time' # Time.parse and Time#iso8601

def should_alert?(metric, value, level)
  last_alert = $redis.get("last_alert:#{metric}:#{level}")
  
  # Don't alert if we alerted in the last hour for same issue
  if last_alert && Time.now - Time.parse(last_alert) < 3600
    return false
  end
  
  $redis.setex("last_alert:#{metric}:#{level}", 3600, Time.now.iso8601)
  true
end
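
If the monitor runs as a single long-lived process, the same idea works without Redis. The class below is a minimal in-memory sketch. `AlertDeduplicator` and its injectable `clock:` parameter are hypothetical names, and the state is lost on restart, so it is not a substitute for Redis when several processes send alerts.

```ruby
# Suppresses repeat alerts for the same metric and level within a time window.
class AlertDeduplicator
  def initialize(window_seconds: 3600, clock: -> { Time.now })
    @window = window_seconds
    @clock = clock      # injectable so the window can be tested without sleeping
    @last_sent = {}     # [metric, level] => Time of the last alert
  end

  def should_alert?(metric, level)
    key = [metric, level]
    now = @clock.call
    last = @last_sent[key]
    return false if last && (now - last) < @window

    @last_sent[key] = now
    true
  end
end

dedup = AlertDeduplicator.new
puts dedup.should_alert?(:error_rate, :critical) # first alert goes out
puts dedup.should_alert?(:error_rate, :critical) # repeat within the hour is suppressed
```

Keying on both metric and level means an escalation from warning to critical still gets through immediately.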

Creating Performance Dashboards

Build internal dashboards using Ruby web frameworks:

Option 1: Sinatra Dashboard

gem 'sinatra'
gem 'chartkick'

# app.rb
require 'sinatra'
require 'chartkick'

get '/dashboard' do
  @metrics = {
    cache_hit_ratio: CloudflareAPI.cache_hit_ratio,
    response_times: CloudflareAPI.response_time_history,
    traffic: CloudflareAPI.traffic_by_country
  }
  
  erb :dashboard
end

# views/dashboard.erb
<%= line_chart @metrics[:response_times] %>
<%= pie_chart({ 'Cached' => @metrics[:cache_hit_ratio], 'Uncached' => 100 - @metrics[:cache_hit_ratio] }) %>
<%= geo_chart @metrics[:traffic] %>

Option 2: Static Dashboard Generated by Jekyll

# _plugins/metrics_generator.rb
module Jekyll
  class MetricsGenerator < Generator
    def generate(site)
      # Fetch metrics
      metrics = fetch_cloudflare_metrics
      
      # Create data file
      site.data['metrics'] = metrics
      
      # Generate dashboard page
      page = PageWithoutAFile.new(site, __dir__, '', 'dashboard.md')
      page.content = generate_dashboard_content(metrics)
      page.data = {
        'layout' => 'dashboard',
        'title' => 'Site Metrics Dashboard',
        'permalink' => '/internal/dashboard/'
      }
      site.pages << page
    end
  end
end
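
The generator calls `generate_dashboard_content` without showing it. One simple possibility, assuming `metrics` is a flat hash of metric name to value, is to render a Markdown table that Jekyll then converts along with the rest of `dashboard.md`:

```ruby
# Render a flat { name => value } hash as a Markdown table.
def generate_dashboard_content(metrics)
  rows = metrics.map do |name, value|
    label = name.to_s.tr('_', ' ').capitalize
    "| #{label} | #{value} |"
  end

  (['| Metric | Value |', '| --- | --- |'] + rows).join("\n") + "\n"
end

# Hypothetical metrics, shaped like the hash stored in site.data['metrics'].
puts generate_dashboard_content({ cache_hit_ratio: 93.4, error_rate: 0.12 })
```

Because the page is regenerated on every build, the numbers are only as fresh as the last deploy; schedule a periodic rebuild if you rely on this option.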

Option 3: Grafana + Ruby Exporter

Use `prometheus-client` gem to export metrics to Grafana:

gem 'prometheus-client'

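require 'logger'
require 'prometheus/client'
require 'prometheus/client/formats/text'

# Configure exporter
Prometheus::Client.config.logger = Logger.new('log/prometheus.log')

# Define and register metrics; an unregistered metric never appears at /metrics
CACHE_HIT_RATIO = Prometheus::Client.registry.gauge(
  :cloudflare_cache_hit_ratio,
  docstring: 'Cache hit ratio percentage'
)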

# Update metrics
Thread.new do
  loop do
    CACHE_HIT_RATIO.set(CloudflareAPI.cache_hit_ratio)
    sleep 60
  end
end

# Expose metrics endpoint
get '/metrics' do
  Prometheus::Client::Formats::Text.marshal(Prometheus::Client.registry)
end

Error Tracking and Diagnostics

Monitor for specific error patterns:

# lib/monitoring/error_tracker.rb
class ErrorTracker
  def self.track_cloudflare_errors
    errors = cf_api_get("zones/#{zone_id}/analytics/events/errors", {
      since: '-60',  # Last hour
      dimensions: ['clientRequestPath', 'originResponseStatus']
    })
    
    errors['result'].each do |error|
      next if whitelisted_error?(error)
      
      log_error(error)
      alert_if_critical(error)
      attempt_auto_recovery(error)
    end
  end
  
  def self.whitelisted_error?(error)
    # Ignore 404s on obviously wrong URLs
    path = error['dimensions'][0]
    status = error['dimensions'][1]
    
    return true if status == '404' && path.include?('wp-')
    return true if status == '403' && path.include?('.env')
    false
  end
  
  def self.attempt_auto_recovery(error)
    case error['dimensions'][1]
    when '502', '503', '504'
      # GitHub Pages might be down, purge cache
      CloudflareAPI.purge_cache_for_path(error['dimensions'][0])
    when '404'
      # Check if page should exist
      if page_should_exist?(error['dimensions'][0])
        trigger_build_to_regenerate_page
      end
    end
  end
end
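
Before alerting on individual errors, it often helps to see which paths fail most. The sketch below aggregates error rows by path. It assumes each row carries a `dimensions` array (path first, status second, as in the tracker above) and a `metrics.requests` count, which is an illustrative shape rather than a documented response format.

```ruby
# Sum request counts per path and return the worst offenders first.
def top_error_paths(errors, limit: 5)
  errors
    .group_by { |row| row['dimensions'][0] }
    .map { |path, rows| [path, rows.sum { |r| r['metrics']['requests'] }] }
    .sort_by { |path, count| [-count, path] } # highest count first, ties by path
    .first(limit)
end

# Hypothetical rows for the last hour.
rows = [
  { 'dimensions' => ['/about/', '502'], 'metrics' => { 'requests' => 3 } },
  { 'dimensions' => ['/old-post/', '404'], 'metrics' => { 'requests' => 10 } },
  { 'dimensions' => ['/about/', '503'], 'metrics' => { 'requests' => 4 } }
]
top_error_paths(rows).each { |path, count| puts "#{count}\t#{path}" }
```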

Automated Maintenance and Recovery

Automate responses to common issues:

# lib/maintenance/auto_recovery.rb
class AutoRecovery
  def self.run
    # Check for GitHub Pages build failures
    if build_failing_for_more_than?(30 * 60) # seconds; 30.minutes needs ActiveSupport
      trigger_manual_build
      send_alert("Build was failing, triggered manual rebuild", :info)
    end
    
    # Check for DNS propagation issues
    if dns_propagation_delayed?
      lower_cloudflare_dns_ttl # a shorter TTL lets resolvers pick up changes sooner
      send_alert("Lowered DNS TTL due to propagation delays", :warning)
    end
    
    # Check for excessive cache misses
    if cache_hit_ratio < 70
      warm_cache_for_top_pages
      send_alert("Cache hit ratio low, warming cache", :warning)
    end
    
    # Weekly maintenance tasks
    if Time.now.wday == 0 && Time.now.hour == 2  # Sunday 2 AM
      run_link_check
      compress_old_logs
      backup_analytics_data
    end
  end
  
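  def self.trigger_manual_build
    # Trigger a GitHub Actions workflow via a repository_dispatch event.
    # `repo` is "owner/name"; the workflow must listen for the 'manual-build' type.
    HTTParty.post(
      "https://api.github.com/repos/#{repo}/dispatches",
      headers: {
        'Authorization' => "token #{ENV['GITHUB_TOKEN']}",
        'Accept' => 'application/vnd.github+json',
        'Content-Type' => 'application/json'
      },
      body: { event_type: 'manual-build' }.to_json
    )
  end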
end

# Run every hour
AutoRecovery.run
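
The `warm_cache_for_top_pages` step is not spelled out above. A minimal sketch is to request each important URL once and record the `CF-Cache-Status` response header that Cloudflare attaches to proxied responses. The `warm_cache` name, the injectable `fetcher:` parameter, and the URL list are assumptions for illustration.

```ruby
require 'net/http'
require 'uri'

# Default fetcher: issue a GET and return Cloudflare's cache status header
# (for example "HIT" or "MISS"), or nil when the header is absent.
HTTP_FETCHER = lambda do |url|
  Net::HTTP.get_response(URI(url))['cf-cache-status']
end

# Request every URL once so Cloudflare can cache it; return url => cache status.
def warm_cache(urls, fetcher: HTTP_FETCHER)
  urls.to_h { |url| [url, fetcher.call(url) || 'UNKNOWN'] }
end
```

Calling `warm_cache(['https://example.com/', 'https://example.com/blog/'])` returns a hash you can log or feed into an alert. Keep in mind that a single request only warms the edge location that served it, so this helps most for traffic near wherever the monitor runs.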

Start building your monitoring system this week with basic uptime checks and cache monitoring, then gradually add performance tracking and automated alerts. Within a month, you'll have complete visibility into your Jekyll site's health and automated responses for common issues, ensuring maximum reliability for your visitors.
