Your Jekyll site seems to be running fine, but you're flying blind. You don't know whether it is actually available to visitors worldwide, how fast it loads in different regions, or when errors occur. Without that visibility, problems go undetected until users complain, and discovering issues that late can damage your reputation and search rankings. You need a proactive monitoring system that leverages Cloudflare's global network and Ruby's automation capabilities.
Monitoring a Jekyll site requires a different approach than monitoring a dynamic application. Since there's no server-side processing to watch, you focus on: (1) content delivery performance, (2) uptime and availability, (3) user experience metrics, and (4) third-party service dependencies. Cloudflare provides the foundation with its global vantage points, while Ruby gems add automation and integration capabilities.
The architecture should be multi-layered: real-time monitoring (checking if the site is up), performance monitoring (how fast it loads), business monitoring (are conversions happening), and predictive monitoring (trend analysis). Each layer uses different Cloudflare data sources and Ruby tools. The goal is to detect issues before users do, and to have automated responses for common problems.
| Layer | What It Monitors | Cloudflare Data Source | Ruby Tools |
|---|---|---|---|
| Infrastructure | DNS, SSL, Network | Health Checks, SSL Analytics | net-http, ssl-certificate gems |
| Performance | Load times, Core Web Vitals | Speed Analytics, Real User Monitoring | benchmark, ruby-prof gems |
| Content | Broken links, missing assets | Cache Analytics, Error Analytics | nokogiri, link-checker gems |
| Business | Traffic trends, conversions | Web Analytics, GraphQL Analytics | chartkick, gruff gems |
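As a starting point for the infrastructure layer, here is a minimal sketch of an uptime and certificate check that uses only Ruby's standard library rather than the gems named in the table. The method names `check_site` and `days_until_expiry` are illustrative, not part of any library. When the site is proxied through Cloudflare, the certificate inspected is the one Cloudflare's edge presents, not the origin's.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Whole days left before a certificate expires (negative once expired).
def days_until_expiry(not_after, now = Time.now)
  ((not_after - now) / 86_400).floor
end

# Performs one HTTPS GET and reports status, latency, and certificate lifetime.
# Any failure (DNS, timeout, TLS) is reported as the site being down.
def check_site(url, timeout: 10)
  uri = URI(url)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  Net::HTTP.start(uri.host, uri.port, use_ssl: true,
                  open_timeout: timeout, read_timeout: timeout) do |http|
    response = http.get(uri.request_uri)
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    {
      up: response.code.to_i < 400,
      status: response.code.to_i,
      elapsed_ms: (elapsed * 1000).round,
      cert_days_left: days_until_expiry(http.peer_cert.not_after)
    }
  end
rescue StandardError => e
  { up: false, error: "#{e.class}: #{e.message}" }
end
```

Calling `check_site('https://example.com/')` from a scheduled job returns a hash you can feed into whatever alerting you use; a falling `cert_days_left` is worth a warning well before it reaches zero.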
Cloudflare provides dozens of metrics. Focus on these key ones for Jekyll:
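Cache hit ratio measures how often Cloudflare serves cached content instead of fetching it from the origin. Aim for above 90%.
# Fetch via the API. This is the legacy zone analytics endpoint; Cloudflare now
# steers analytics queries to its GraphQL Analytics API, so verify it is available.
def cache_hit_ratio
  response = cf_api_get("zones/#{zone_id}/analytics/dashboard", {
    since: '-1440', # minutes, i.e. the last 24 hours
    until: '0'
  })
  requests = response['result']['totals']['requests']
  return 0.0 if requests['all'].zero? # avoid dividing by zero on an idle zone

  (requests['cached'].to_f / requests['all'] * 100).round(2)
end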
Origin response time is how long GitHub Pages takes to respond to Cloudflare. It should stay under 200 ms.
def origin_response_time
  data = cf_api_get("zones/#{zone_id}/healthchecks/analytics")
  data['result']['origin_response_time']['p95'] # 95th percentile
end
Error rate is the share of requests answered with a 5xx status. Monitor it to catch GitHub Pages outages or misconfigurations.
def error_rate
  data = cf_api_get("zones/#{zone_id}/http/analytics", {
    dimensions: ['statusCode'],
    filters: 'statusCode ge 500'
  })
  error_requests = data['result'].sum { |r| r['metrics']['requests'] }
  total_requests = get_total_requests # must cover the same time window
  return 0.0 if total_requests.zero?

  (error_requests.to_f / total_requests * 100).round(2)
end
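To make the arithmetic concrete, here is a small sketch that computes the same percentage from a single breakdown of requests by status code, which avoids comparing two API calls that may cover slightly different time windows. The method name `error_rate_from_counts` and the sample numbers are made up for illustration.

```ruby
# Percentage of requests that ended in a 5xx status.
# `status_counts` maps an HTTP status code to its request count.
def error_rate_from_counts(status_counts)
  total = status_counts.values.sum
  return 0.0 if total.zero?

  errors = status_counts.sum { |code, count| code.to_i >= 500 ? count : 0 }
  (errors.to_f / total * 100).round(2)
end

sample = { 200 => 9_400, 304 => 450, 404 => 100, 502 => 30, 503 => 20 }
puts error_rate_from_counts(sample) # 50 errors out of 10,000 requests, prints 0.5
```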
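Core Web Vitals capture real user experience:
def core_web_vitals
  cf_api_get("zones/#{zone_id}/speed/api/insights", {
    # Google replaced FID with INP as a Core Web Vital in March 2024;
    # request 'inp' as well if the API you query offers it.
    metrics: ['lcp', 'fid', 'cls']
  })
end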
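The snippets above call `cf_api_get` and `zone_id` without defining them. A minimal sketch of both follows, assuming a Cloudflare API token and zone ID stored in the environment variables `CLOUDFLARE_API_TOKEN` and `CLOUDFLARE_ZONE_ID` (these variable names are my choice, not a Cloudflare convention). It uses Cloudflare's v4 REST base URL and bearer-token authentication; the individual endpoint paths in this article should still be checked against the current API reference.

```ruby
require 'net/http'
require 'json'
require 'uri'

CF_API_BASE = 'https://api.cloudflare.com/client/v4/'

# Assumption: the zone ID lives in an environment variable of this name.
def zone_id
  ENV.fetch('CLOUDFLARE_ZONE_ID')
end

# Builds the full request URI; array values become repeated query parameters.
def cf_api_uri(path, params = {})
  uri = URI.join(CF_API_BASE, path)
  uri.query = URI.encode_www_form(params) unless params.empty?
  uri
end

# GETs a Cloudflare API path and returns the parsed JSON body.
# Raises if the HTTP call fails or the API reports success: false.
def cf_api_get(path, params = {})
  request = Net::HTTP::Get.new(cf_api_uri(path, params))
  request['Authorization'] = "Bearer #{ENV.fetch('CLOUDFLARE_API_TOKEN')}"

  response = Net::HTTP.start(request.uri.host, request.uri.port, use_ssl: true) do |http|
    http.request(request)
  end
  body = JSON.parse(response.body)
  unless response.is_a?(Net::HTTPSuccess) && body['success']
    raise "Cloudflare API error (HTTP #{response.code}): #{body['errors']}"
  end

  body
end
```

Raising on failure, rather than returning `nil`, keeps a broken token or a retired endpoint from silently turning into a metric value of zero.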
Extend Cloudflare's capabilities with these gems:
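cloudflare-rails is designed for Rails, where it makes the app see the visitor's IP address rather than a Cloudflare proxy address. A Jekyll site has no request-handling Ruby process, so it is only useful in a Ruby service you run behind Cloudflare, such as a monitoring dashboard. Treat the configuration below as illustrative and check the gem's README for the options your version supports:
gem 'cloudflare-rails'
# Configure for monitoring
Cloudflare::Rails.configure do |config|
  config.ips = [] # Don't trust Cloudflare IPs for Jekyll
  config.logger = Logger.new('log/cloudflare.log')
end
# Use its middleware to log requests
use Cloudflare::Rails::Middleware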
Create health check endpoints:
gem 'health_check'
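# Create a health check route (Sinatra syntax, so the sinatra gem is required too)
require 'sinatra'
require 'json'
require 'time' # Time#iso8601
get '/health' do
  content_type :json
  {
    status: 'healthy',
    timestamp: Time.now.iso8601,
    checks: {
      cloudflare: check_cloudflare_connection,
      github_pages: check_github_pages,
      dns: check_dns_resolution
    }
  }.to_json
end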
Schedule monitoring tasks:
gem 'whenever'
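# config/schedule.rb
# `runner` shells out to Rails, so a plain Ruby project should use `rake` or
# `command` instead. The task names are placeholders for Rake tasks that call
# CloudflareMonitor.check_metrics and PerformanceAuditor.run_full_check.
every 5.minutes do
  rake 'monitor:check_metrics'
end
every 1.hour do
  rake 'monitor:full_check'
end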
Send alerts to Slack:
gem 'slack-notifier'
require 'slack-notifier'
# A constant, because a top-level local variable is not visible inside a method
NOTIFIER = Slack::Notifier.new(
  ENV['SLACK_WEBHOOK_URL'],
  channel: '#site-alerts',
  username: 'Jekyll Monitor'
)
def send_alert(message, level: :warning)
  NOTIFIER.post(
    text: message,
    icon_emoji: level == :critical ? ':fire:' : ':warning:'
  )
end
Create smart alerts that trigger only when necessary:
# lib/monitoring/alert_manager.rb
class AlertManager
  # higher_is_worse records which direction each metric degrades in:
  # response time and error rate alert when they rise, the others when they fall.
  ALERT_THRESHOLDS = {
    cache_hit_ratio:      { warn: 80,   critical: 60,   higher_is_worse: false }, # percentage
    origin_response_time: { warn: 500,  critical: 1000, higher_is_worse: true },  # ms
    error_rate:           { warn: 1,    critical: 5,    higher_is_worse: true },  # percentage
    uptime:               { warn: 99.5, critical: 99.0, higher_is_worse: false }  # percentage
  }.freeze

  def self.check_and_alert
    metrics = CloudflareMetrics.fetch
    ALERT_THRESHOLDS.each do |metric, thresholds|
      value = metrics[metric]
      next if value.nil? # metric unavailable, nothing to compare

      breached = lambda do |limit|
        thresholds[:higher_is_worse] ? value >= limit : value <= limit
      end
      if breached.call(thresholds[:critical])
        send_alert("#{metric.to_s.upcase} CRITICAL: #{value}", :critical)
      elsif breached.call(thresholds[:warn])
        send_alert("#{metric.to_s.upcase} Warning: #{value}", :warning)
      end
    end
  end

  def self.send_alert(message, level)
    # Send to multiple channels
    SlackNotifier.send(message, level)
    EmailNotifier.send(message, level) if level == :critical
    # Log to file
    File.open('log/alerts.log', 'a') do |f|
      f.puts "[#{Time.now}] #{level.upcase}: #{message}"
    end
  end
end
# Run every 15 minutes
AlertManager.check_and_alert
Add alert deduplication to prevent spam:
require 'time' # Time.parse and Time#iso8601
# Assumes $redis is an already-connected Redis client
def should_alert?(metric, value, level)
  key = "last_alert:#{metric}:#{level}"
  last_alert = $redis.get(key)
  # Don't alert if we alerted in the last hour for the same issue
  return false if last_alert && Time.now - Time.parse(last_alert) < 3600

  $redis.setex(key, 3600, Time.now.iso8601)
  true
end
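If the monitor runs as a single long-lived process and you would rather not operate Redis, the same rule can be kept in memory. The sketch below is one way to do that; the class name `AlertDeduplicator` is hypothetical, and the injectable clock exists only to make the behavior easy to test. State is lost on restart and the class is not thread-safe, which is the trade-off against the Redis version.

```ruby
# In-memory alert deduplication for a single long-running process.
class AlertDeduplicator
  def initialize(window_seconds: 3600, clock: -> { Time.now })
    @window = window_seconds
    @clock = clock
    @last_sent = {}
  end

  # True only when no alert for this metric and level was sent within the window.
  def should_alert?(metric, level)
    key = [metric, level]
    now = @clock.call
    last = @last_sent[key]
    return false if last && now - last < @window

    @last_sent[key] = now
    true
  end
end

dedup = AlertDeduplicator.new
puts dedup.should_alert?(:error_rate, :critical) # first alert goes out
puts dedup.should_alert?(:error_rate, :critical) # repeat within the hour is suppressed
```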
Build internal dashboards using Ruby web frameworks:
gem 'sinatra'
gem 'chartkick'
# app.rb
require 'sinatra'
require 'chartkick'
get '/dashboard' do
  ratio = CloudflareAPI.cache_hit_ratio
  @metrics = {
    # pie_chart needs labelled slices, not a single number
    cache_hit_ratio: { 'Cached' => ratio, 'Uncached' => (100 - ratio).round(2) },
    response_times: CloudflareAPI.response_time_history,
    traffic_by_country: CloudflareAPI.traffic_by_country
  }
  erb :dashboard
end
# views/dashboard.erb
<%= line_chart @metrics[:response_times] %>
<%= pie_chart @metrics[:cache_hit_ratio] %>
<%= geo_chart @metrics[:traffic_by_country] %>
Alternatively, generate a static dashboard page at build time with a Jekyll plugin:
# _plugins/metrics_generator.rb
module Jekyll
  class MetricsGenerator < Generator
    def generate(site)
      # Fetch metrics
      metrics = fetch_cloudflare_metrics
      # Create data file
      site.data['metrics'] = metrics
      # Generate dashboard page
      page = PageWithoutAFile.new(site, __dir__, '', 'dashboard.md')
      page.content = generate_dashboard_content(metrics)
      page.data = {
        'layout' => 'dashboard',
        'title' => 'Site Metrics Dashboard',
        'permalink' => '/internal/dashboard/'
      }
      site.pages << page
    end
  end
end
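The generator relies on a `generate_dashboard_content` helper that is not shown. One possible sketch, assuming the metrics arrive as a flat hash of name to value, renders them as a Markdown table; in the plugin it would live as a private method on the generator class. Keep in mind that GitHub Pages' built-in build does not run custom plugins, so a generator like this requires building the site yourself (for example in CI) and deploying the output.

```ruby
# Renders a flat metrics hash as a Markdown table for the dashboard page.
def generate_dashboard_content(metrics)
  rows = metrics.map { |name, value| "| #{name.to_s.tr('_', ' ')} | #{value} |" }
  (['| Metric | Value |', '|---|---|'] + rows).join("\n") + "\n"
end

puts generate_dashboard_content({ cache_hit_ratio: 93.4, error_rate: 0.2 })
```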
Use the `prometheus-client` gem to expose metrics for Prometheus to scrape, which Grafana can then chart:
gem 'prometheus-client'
require 'logger'
require 'prometheus/client'
require 'prometheus/client/formats/text'
# Configure the client
Prometheus::Client.config.logger = Logger.new('log/prometheus.log')
# Define the metric and register it so it appears in the exported output
CACHE_HIT_RATIO = Prometheus::Client.registry.gauge(
  :cloudflare_cache_hit_ratio,
  docstring: 'Cache hit ratio percentage'
)
# Update metrics
Thread.new do
  loop do
    CACHE_HIT_RATIO.set(CloudflareAPI.cache_hit_ratio)
    sleep 60
  end
end
# Expose metrics endpoint
get '/metrics' do
  content_type 'text/plain; version=0.0.4'
  Prometheus::Client::Formats::Text.marshal(Prometheus::Client.registry)
end
Monitor for specific error patterns:
# lib/monitoring/error_tracker.rb
class ErrorTracker
  def self.track_cloudflare_errors
    errors = cf_api_get("zones/#{zone_id}/analytics/events/errors", {
      since: '-60', # Last hour
      dimensions: ['clientRequestPath', 'originResponseStatus']
    })
    errors['result'].each do |error|
      next if whitelisted_error?(error)

      log_error(error)
      alert_if_critical(error)
      attempt_auto_recovery(error)
    end
  end

  def self.whitelisted_error?(error)
    # Ignore 404s on obviously wrong URLs
    path, status = error['dimensions']
    return true if status == '404' && path.include?('wp-')
    return true if status == '403' && path.include?('.env')

    false
  end

  def self.attempt_auto_recovery(error)
    path, status = error['dimensions']
    case status
    when '502', '503', '504'
      # GitHub Pages might be down, purge cache
      CloudflareAPI.purge_cache_for_path(path)
    when '404'
      # Check if page should exist
      trigger_build_to_regenerate_page if page_should_exist?(path)
    end
  end
end
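The tracker leaves `page_should_exist?` undefined. One way to answer the question is to compare the requested path with the pages your last build actually produced. The sketch below assumes the built site is available locally in `_site` and takes the set of known paths as a second argument, which differs from the one-argument call above; both method names and signatures are illustrative.

```ruby
require 'set'

# Collects the URL paths of every HTML file in a built Jekyll output directory.
# "about/index.html" becomes "/about/", "404.html" stays "/404.html".
def known_paths(site_dir = '_site')
  root = site_dir.chomp('/')
  Dir.glob(File.join(root, '**', '*.html')).map do |file|
    file.delete_prefix(root).sub(/index\.html\z/, '')
  end.to_set
end

# True when the requested path matches a page the last build produced,
# ignoring any query string and tolerating a missing trailing slash or ".html".
def page_should_exist?(request_path, paths)
  path = request_path.to_s.sub(/\?.*\z/m, '')
  base = path.chomp('/')
  [path, "#{base}/", "#{base}.html"].any? { |candidate| paths.include?(candidate) }
end

paths = Set['/', '/about/', '/posts/hello.html']
puts page_should_exist?('/about', paths)        # matches "/about/"
puts page_should_exist?('/wp-login.php', paths) # never part of the site
```

A 404 for a path that passes this check suggests a failed or partial deploy, which is the case where triggering a rebuild makes sense.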
Automate responses to common issues:
# lib/maintenance/auto_recovery.rb
require 'httparty'
class AutoRecovery
  def self.run
    # Check for GitHub Pages build failures.
    # 30 * 60 seconds: `30.minutes` exists only when ActiveSupport is loaded.
    if build_failing_for_more_than?(30 * 60)
      trigger_manual_build
      send_alert("Build was failing, triggered manual rebuild", :info)
    end
    # Check for DNS propagation issues. A lower TTL lets resolvers pick up
    # changes sooner; proxied Cloudflare records use an automatic TTL.
    if dns_propagation_delayed?
      lower_cloudflare_dns_ttl
      send_alert("Lowered DNS TTL due to propagation delays", :warning)
    end
    # Check for excessive cache misses
    if cache_hit_ratio < 70
      warm_cache_for_top_pages
      send_alert("Cache hit ratio low, warming cache", :warning)
    end
    # Weekly maintenance tasks
    if Time.now.wday == 0 && Time.now.hour == 2 # Sunday 2 AM
      run_link_check
      compress_old_logs
      backup_analytics_data
    end
  end
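  def self.trigger_manual_build
    # Trigger a GitHub Actions workflow via a repository_dispatch event.
    # The workflow must declare `on: repository_dispatch` to react to it,
    # and `repo` must return the repository as "owner/name".
    HTTParty.post(
      "https://api.github.com/repos/#{repo}/dispatches",
      headers: {
        'Authorization' => "token #{ENV['GITHUB_TOKEN']}",
        'Accept' => 'application/vnd.github+json',
        'User-Agent' => 'jekyll-monitor'
      },
      body: { event_type: 'manual-build' }.to_json
    )
  end
end
# Run every hour
AutoRecovery.run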
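`warm_cache_for_top_pages` is left to the reader above. A rough sketch of the core step follows: request each URL once and record Cloudflare's `CF-Cache-Status` response header. The method name `warm_cache` and the injectable `fetch` lambda are my own choices for illustration. Two caveats apply: a request only warms the cache of the Cloudflare data center that serves the machine running the script, and Cloudflare does not cache HTML by default, so pages are only warmed if a cache rule makes them eligible.

```ruby
require 'net/http'
require 'uri'

# Requests each URL once so Cloudflare's edge can cache the response.
# Returns a hash of URL => CF-Cache-Status header value (or an error string).
def warm_cache(urls, fetch: ->(url) { Net::HTTP.get_response(URI(url)) })
  urls.each_with_object({}) do |url, results|
    response = fetch.call(url)
    results[url] = response['cf-cache-status'] || 'unknown'
  rescue StandardError => e
    results[url] = "error: #{e.class}"
  end
end
```

Running `warm_cache` twice over the same list and seeing statuses move from `MISS` to `HIT` is a quick way to confirm that the pages are cacheable at all.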
Start this week with basic uptime checks and cache monitoring, then gradually add performance tracking and automated alerts. Within a month, you'll have visibility into your Jekyll site's health and automated responses for common issues, so most problems are caught before your visitors notice them.