Monitoring

Monitors

Monitors are scheduled checks that detect problems with availability, latency, response correctness, SSL certificates, keyword content, ports, host reachability, cron jobs, and AI agent response quality.

Audience: Engineers, SREs, platform teams

Monitor types

HTTP/HTTPS: checks endpoint availability, response status, and latency, with optional keyword and SSL expiry validation.
TCP: checks whether a host and port accept connections.
Ping: checks basic host reachability.
Heartbeat: alerts when a cron job or scheduled worker fails to report in.
AI agent: sends a prompt to an agent endpoint and validates the response against an expectation.
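
As an illustration of the simplest of these, a TCP check can be sketched in a few lines of Python. This is a minimal sketch, not the product's probe implementation; the function name `tcp_check` and the 5-second default timeout are assumptions made for the example.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if host:port accepts a TCP connection within `timeout` seconds.

    Hypothetical helper for illustration; the real probe may behave differently.
    """
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

A call such as `tcp_check("db.internal.example", 5432)` would report whether that port is accepting connections from wherever the code runs, which is also a quick way to reproduce a failing TCP monitor by hand.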

Configuration fields

Name: a readable label responders can understand during an incident.
Target: URL, host, host:port, heartbeat token, or agent endpoint depending on monitor type.
Interval: how often the check runs; plan limits may enforce a minimum interval.
Regions: probe locations used to detect regional failures and latency differences.
Expected status code: the HTTP status code that counts as success.
Keyword: optional content validation to catch broken pages returning HTTP 200.
SSL expiry: optional certificate expiry validation for HTTPS targets.
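
To show how these fields might combine for an HTTP/HTTPS monitor, the sketch below models them as a dataclass and evaluates one probe result against them. All field names, the `ssl_min_days_left` threshold, and the `evaluate` function are hypothetical and chosen for illustration; they are not the product's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class HttpMonitor:
    # Illustrative field names, not the product's real configuration schema.
    name: str
    target: str
    interval_seconds: int
    regions: List[str]
    expected_status: int = 200
    keyword: Optional[str] = None            # None disables keyword validation
    ssl_min_days_left: Optional[int] = None  # None disables SSL expiry validation


def evaluate(monitor: HttpMonitor, status: int, body: str,
             cert_expires_at: Optional[datetime], now: datetime) -> List[str]:
    """Return a list of problems found in one probe result (empty means healthy)."""
    problems = []
    if status != monitor.expected_status:
        problems.append(f"expected status {monitor.expected_status}, got {status}")
    if monitor.keyword is not None and monitor.keyword not in body:
        problems.append(f"keyword {monitor.keyword!r} not found in response body")
    if monitor.ssl_min_days_left is not None and cert_expires_at is not None:
        remaining = cert_expires_at - now
        if remaining < timedelta(days=monitor.ssl_min_days_left):
            problems.append(f"certificate expires in {remaining.days} days")
    return problems
```

Under these assumptions, a page that returns HTTP 200 with an error message in the body still fails the check, because the keyword test runs independently of the status test.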

Behavior

Incidents open only after a failure is confirmed, so a single flaky request does not page responders.
Recovery notifications are sent when a monitor returns to a healthy state.
Linked monitors contribute to service health and incident context.
Paused monitors do not generate normal probe-driven incidents.
Monitor creation, updates, pause/resume, and deletion are audit logged for owner/admin users.
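
The confirmation, recovery, and pause behavior described above can be sketched as a small state machine. The text does not say how a failure is confirmed, so this sketch assumes a threshold of consecutive failed checks; the class name, the `confirm_after` parameter, and the event strings are all hypothetical.

```python
from typing import Optional


class MonitorState:
    """Tracks check results for one monitor (illustrative sketch only)."""

    def __init__(self, confirm_after: int = 2) -> None:
        # Assumption: a failure is "confirmed" after this many consecutive failures.
        self.confirm_after = confirm_after
        self.consecutive_failures = 0
        self.incident_open = False
        self.paused = False

    def record(self, ok: bool) -> Optional[str]:
        """Record one check result and return an event name, or None."""
        if self.paused:
            # Paused monitors produce no probe-driven events.
            return None
        if ok:
            self.consecutive_failures = 0
            if self.incident_open:
                self.incident_open = False
                return "recovered"
            return None
        self.consecutive_failures += 1
        if not self.incident_open and self.consecutive_failures >= self.confirm_after:
            self.incident_open = True
            return "incident_opened"
        return None
```

With `confirm_after=2`, one failed check followed by a success produces no events, while two failures in a row open an incident and the next success emits a recovery event.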

Troubleshooting

If a monitor fails immediately, verify the target is publicly reachable from the probe worker.
If an HTTP check returns the wrong status, check whether the target redirects or requires authentication, and confirm the configured expected status code.
If a keyword check fails, confirm the exact text appears in the response body.
If heartbeat checks alert unexpectedly, confirm the scheduled job calls the heartbeat endpoint before the timeout window.
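
When debugging unexpected heartbeat alerts, it can help to reason about the timeout window explicitly. The sketch below assumes the window is the job's expected interval plus a grace period; the function name and both parameters are hypothetical, and the product's actual timeout rule may differ.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_overdue(last_ping: datetime, expected_interval: timedelta,
                      grace: timedelta, now: datetime) -> bool:
    """Return True if no heartbeat arrived within the interval plus grace period.

    Assumed rule for illustration: overdue means strictly later than
    last_ping + expected_interval + grace.
    """
    return now - last_ping > expected_interval + grace
```

For example, a job expected every 5 minutes with a 1-minute grace period would be considered overdue under this rule once more than 6 minutes pass without a ping, so a job that sometimes takes 7 minutes to finish would alert even though it is still running.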