Cron Job Anomaly Detection

Catch unhealthy cron jobs before they fail outright

A job can still ping successfully while something is clearly wrong. It runs three times slower, starts arriving at an irregular cadence, or begins producing warnings every few runs. Cronping learns each heartbeat's baseline and alerts when behavior drifts from normal.

nightly-sync.sh
#!/bin/bash
PING_URL="https://ping.cronping.com/YOUR_TOKEN"

curl -fsS "$PING_URL/start" > /dev/null

if ./sync-customers.sh; then
  curl -fsS "$PING_URL" > /dev/null
else
  curl -fsS "$PING_URL/fail" > /dev/null
fi

# Cronping tracks duration, cadence, and warn/fail rate
# against this heartbeat's historical baseline.

A few lines of curl in your script. Cronping handles the rest.
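If the job can crash before it ever reaches the success ping (an unhandled error, a killed child process), a bash ERR trap is one way to make sure the /fail ping still goes out. A minimal sketch of that pattern, not something Cronping requires:

#!/bin/bash
set -Eeuo pipefail
PING_URL="https://ping.cronping.com/YOUR_TOKEN"

# If anything below exits non-zero before the success ping, report failure.
trap 'curl -fsS "$PING_URL/fail" > /dev/null' ERR

curl -fsS "$PING_URL/start" > /dev/null
./sync-customers.sh
curl -fsS "$PING_URL" > /dev/null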

The cost of silent failures

Successful but slower

The job exits 0, but it now takes 14 minutes instead of 90 seconds. Standard uptime monitoring says everything is fine while the queue is already backing up.
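The underlying comparison is simple enough to sketch in shell. This illustrates the idea only, not Cronping's implementation; the durations and the 3x factor are made up:

# Hypothetical recent durations in seconds; the latest run is checked below.
history="92 88 95 90"
latest=840

median=$(echo "$history" | tr ' ' '\n' | sort -n |
  awk '{ v[NR] = $1 } END { print v[int((NR + 1) / 2)] }')

# Flag the run if it took more than 3x the recent median (factor illustrative).
if [ "$latest" -gt $((3 * median)) ]; then
  echo "duration spike: ${latest}s vs median ${median}s"
fi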

Cadence drift

A job expected hourly starts arriving every 45 minutes, then every 90. It is still within the grace window, but the scheduler behavior changed.
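"Less predictable" has a concrete statistical reading: the spread of the gaps between pings grows relative to their mean. A toy version with made-up gaps and an illustrative 0.25 cutoff:

# Hypothetical gaps between pings, in minutes, for a nominally hourly job.
gaps="60 61 59 45 90 45 88"

echo "$gaps" | tr ' ' '\n' | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    mean = sum / n
    sd   = sqrt(sumsq / n - mean * mean)
    # Coefficient of variation: near 0 for a perfectly steady cadence.
    if (sd / mean > 0.25)
      printf "cadence drift: cv=%.2f (mean gap %.0f min)\n", sd / mean, mean
  }'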

Warning clusters

A few warnings per month become several warnings in one afternoon. The job is not fully down yet, but the failure rate is moving in the wrong direction.
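The comparison behind this signal is plain arithmetic: the warn/fail share of the last 24 hours against the long-run share. All numbers here are invented for illustration:

# Hypothetical counts: long-run baseline vs. the last 24 hours.
awk 'BEGIN {
  baseline = 4 / 600    # ~0.7% of pings were warn/fail historically
  recent   = 5 / 24     # ~21% of pings were warn/fail today
  # Flag when the recent rate is several times the baseline (factor illustrative).
  if (recent > 5 * baseline)
    printf "error-rate surge: %.1f%% today vs %.1f%% baseline\n",
      recent * 100, baseline * 100
}'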

Manual thresholds go stale

A hard-coded duration threshold works until traffic grows, data volume changes, or a migration shifts the normal runtime. Baselines keep up with reality.

Set up in under 2 minutes

  1. Create a heartbeat

    Set the job schedule and grace period as usual (a matching crontab sketch follows these steps). Enable anomaly detection for the heartbeat on a plan that includes statistical baselines.

  2. Send normal pings

    Use the same ping URL flow. Add /start when the job begins and ping the base URL on success so Cronping can measure run duration.

  3. Let Cronping learn

    After enough pings, Cronping calculates baseline duration, interval stability, and warn/fail rate, then alerts when recent behavior leaves that baseline.
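Putting the steps together: the heartbeat's schedule should mirror how the job is actually scheduled. For the nightly-sync.sh example above, a matching crontab entry might look like this (the paths are hypothetical):

# Run the sync nightly at 02:00; the script pings Cronping itself.
0 2 * * * /opt/jobs/nightly-sync.sh >> /var/log/nightly-sync.log 2>&1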

Everything you need

No SDK, no dashboard agent, no infrastructure to manage.

Duration spike detection

Alert when a successful or warning run takes significantly longer than the heartbeat's recent median duration.

Interval drift detection

Spot jobs whose ping cadence becomes less predictable, even if every ping still arrives before the normal missed-run alert.

Error-rate surge detection

Compare the last 24 hours of warn/fail pings with the historical baseline to catch deterioration before a hard outage.

Rolling baselines

Cronping recalculates per-heartbeat baselines from recent history so each job is judged against its own normal behavior.
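The "rolling" part just means the window moves. If you kept durations in a log yourself, the equivalent would be recomputing the median over only the most recent entries (file name and window size are hypothetical):

# Recompute the baseline from the 50 most recent durations only.
tail -n 50 durations.log | sort -n |
  awk '{ v[NR] = $1 } END { print "rolling median:", v[int((NR + 1) / 2)] "s" }'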

Existing alert channels

Send anomaly alerts to the same email, Slack, Discord, Teams, Telegram, PagerDuty, Incident.io, and webhook channels you already use.

Cooldown controls

Avoid noisy repeat notifications with per-heartbeat anomaly alert cooldowns while keeping unresolved anomalies visible.

Frequently asked questions

Does anomaly detection use machine learning?

No. Cronping uses statistical anomaly detection based on each heartbeat's own history. That makes the behavior explainable: duration median, interval variation, and warn/fail rate are visible and easy to reason about.

How many pings does Cronping need before alerting on anomalies?

By default, Cronping waits for at least 20 useful pings before treating a baseline as reliable. You can adjust the minimum per heartbeat.

Does anomaly detection replace missed-run alerts?

No. Missed-run alerts still catch jobs that fail to check in. Anomaly detection catches jobs that still check in, but are behaving differently from their historical baseline.

How does an anomaly get resolved?

The active anomaly resolves automatically the next time Cronping analyzes the heartbeat and the signal no longer crosses the threshold.

Can I tune anomaly detection per heartbeat?

Yes. Duration spike factor, enabled anomaly types, cooldown, and minimum baseline sample size are configurable per heartbeat.

Stop discovering failures when it's too late.

Free to start. No credit card required. Add your first heartbeat in under 5 minutes.