Monitoring
Monitors
Monitors are scheduled checks that detect problems with availability, latency, response correctness, SSL certificates, keyword content, ports, hosts, cron jobs, and AI agent quality.
Monitor types
- HTTP/HTTPS: validates endpoint availability, response status, latency, optional keyword, and optional SSL expiry.
- TCP: checks whether a host and port accept connections.
- Ping: checks basic host reachability.
- Heartbeat: alerts when a cron job or scheduled worker fails to report in.
- AI agent: sends a prompt to an agent endpoint and validates the response against an expectation.
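As a rough illustration of what the HTTP/HTTPS and TCP checks evaluate, the sketch below keeps the pass/fail decision separate from the network call. The names `HttpExpectation`, `evaluate_http`, and `tcp_port_open` are hypothetical and not part of the product, and the latency threshold is an assumed parameter.

```python
import socket
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HttpExpectation:
    """Hypothetical success criteria for one HTTP/HTTPS monitor."""
    expected_status: int = 200
    max_latency_ms: Optional[float] = None  # None disables the latency check
    keyword: Optional[str] = None           # None disables the keyword check


def evaluate_http(status: int, latency_ms: float, body: str,
                  expect: HttpExpectation) -> List[str]:
    """Return the failure reasons for one probe; an empty list means healthy."""
    failures = []
    if status != expect.expected_status:
        failures.append(f"status {status} != {expect.expected_status}")
    if expect.max_latency_ms is not None and latency_ms > expect.max_latency_ms:
        failures.append(
            f"latency {latency_ms:.0f} ms > {expect.max_latency_ms:.0f} ms")
    if expect.keyword is not None and expect.keyword not in body:
        failures.append(f"keyword {expect.keyword!r} not found")
    return failures


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """TCP-style check: True if the host accepts a connection on the port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

In a real probe, `status`, `latency_ms`, and `body` would come from an actual request. A page that returns HTTP 200 with an error message in the body passes the status check here but fails the keyword check, which is the case the optional keyword is meant to catch.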
Important fields
- Name: a readable label responders can understand during an incident.
- Target: URL, host, host:port, heartbeat token, or agent endpoint depending on monitor type.
- Interval: how often the check runs; plan limits may enforce a minimum interval.
- Regions: probe locations used to detect regional failures and latency differences.
- Expected status code: the HTTP status that counts as success.
- Keyword: optional content validation to catch broken pages returning HTTP 200.
- SSL expiry: optional certificate expiry validation for HTTPS targets.
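To make the relationships between these fields concrete, here is a minimal sketch of how a monitor definition might be modeled and validated before it is saved. `MonitorConfig`, `validate`, the field names, and the `plan_min_interval_seconds` parameter are all assumptions made for illustration; they do not describe the product's actual API or storage format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MonitorConfig:
    """Hypothetical in-memory model of the fields listed above."""
    name: str
    monitor_type: str      # "http", "tcp", "ping", "heartbeat", or "agent"
    target: str            # URL, host, host:port, token, or agent endpoint
    interval_seconds: int
    regions: List[str] = field(default_factory=list)
    expected_status: Optional[int] = None  # HTTP monitors only
    keyword: Optional[str] = None          # HTTP monitors only
    check_ssl_expiry: bool = False         # HTTPS targets only


def validate(config: MonitorConfig, plan_min_interval_seconds: int) -> List[str]:
    """Return human-readable validation errors; an empty list means valid."""
    errors = []
    if not config.name.strip():
        errors.append("name must not be empty")
    if config.interval_seconds < plan_min_interval_seconds:
        errors.append(
            f"interval must be at least {plan_min_interval_seconds}s on this plan")
    if config.monitor_type != "http":
        if config.expected_status is not None or config.keyword is not None:
            errors.append("expected status and keyword apply to HTTP monitors only")
    if config.check_ssl_expiry and not config.target.startswith("https://"):
        errors.append("SSL expiry validation requires an https:// target")
    return errors
```

For example, `validate(MonitorConfig("Checkout API", "http", "https://example.com/health", 60), 30)` returns an empty list, while the same definition with a 10-second interval would be rejected under the assumed 30-second plan minimum.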
Operational behavior
- An incident opens only after a failure is confirmed, so a single flaky request does not page anyone.
- Recovery notifications are sent when a monitor returns to a healthy state.
- Linked monitors contribute to service health and incident context.
- Paused monitors do not generate normal probe-driven incidents.
- Monitor creation, updates, pause/resume, and deletion are audit logged for owner/admin users.
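The confirmation, recovery, and pause behavior described above can be pictured as a small state machine. The sketch below assumes that "confirmed" means a fixed number of consecutive failed probes; the documentation does not state the actual confirmation rule, so `confirm_after`, `MonitorState`, `record_result`, and the event names are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MonitorState:
    consecutive_failures: int = 0
    incident_open: bool = False
    paused: bool = False


def record_result(state: MonitorState, healthy: bool,
                  confirm_after: int = 3) -> List[str]:
    """Apply one probe result to the state and return the events it triggers."""
    if state.paused:
        return []  # paused monitors do not open probe-driven incidents
    events = []
    if healthy:
        if state.incident_open:
            # Returning to a healthy state closes the incident and notifies.
            events.append("recovery_notification")
            state.incident_open = False
        state.consecutive_failures = 0
    else:
        state.consecutive_failures += 1
        # Open at most one incident, and only once the failure is confirmed.
        if not state.incident_open and state.consecutive_failures >= confirm_after:
            events.append("incident_opened")
            state.incident_open = True
    return events
```

With the assumed default of three consecutive failures, a sequence of fail, pass, fail, fail, fail produces no event until the final probe, which emits `incident_opened`. The next healthy probe emits `recovery_notification`.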
Troubleshooting
- If a monitor fails immediately, verify the target is publicly reachable from the probe worker.
- If an HTTP check returns the wrong status, check for redirects, authentication requirements, and the configured expected status code.
- If a keyword check fails, confirm the exact text appears in the response body.
- If heartbeat checks alert unexpectedly, confirm the scheduled job calls the heartbeat endpoint before the timeout window.
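For the heartbeat case, a common cause of unexpected alerts is a job that never reaches its reporting step. The sketch below shows one way a scheduled job might report in only after it succeeds. The URL, the use of a plain GET request, and the names `ping_heartbeat` and `run_and_report` are assumptions for illustration; use the heartbeat endpoint and method shown for your own monitor.

```python
import urllib.request
from typing import Callable

# Hypothetical URL; replace with the heartbeat URL for your monitor.
HEARTBEAT_URL = "https://monitoring.example.com/heartbeat/YOUR_TOKEN"


def ping_heartbeat(url: str = HEARTBEAT_URL, timeout: float = 10.0) -> None:
    """Report in with a plain GET (assumed); raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()


def run_and_report(job: Callable[[], None],
                   ping: Callable[[], None] = ping_heartbeat) -> bool:
    """Run the job and ping the heartbeat only if it finished without raising."""
    try:
        job()
    except Exception as exc:
        # Withholding the ping lets the heartbeat monitor alert on the failure.
        print(f"job failed, heartbeat withheld: {exc}")
        return False
    ping()
    return True
```

Pinging after the job rather than before it means a crash or hang is visible as a missed heartbeat. If the job's normal runtime is close to the timeout window, the ping may arrive late even when the job succeeds, which is worth checking when alerts look spurious.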
Related documentation
Services and SLOs
Model owned services, link monitors, define dependencies, and track service-level objectives.
Incidents
Acknowledge, investigate, route, resolve, and review service incidents.
Routing and maintenance
Create escalation policies, on-call schedules, temporary overrides, secondary escalation, and planned maintenance windows.
Status pages
Publish customer-facing status pages with monitored service health and uptime history.