Thresholds

Thresholds trigger events when a measurement exceeds a configured value. Thresholds configured at the domain level apply to all nodes in the domain, except where overridden. The resulting events are of type Metric Threshold Violation.

To view thresholds, a user needs the domains::read permission. To configure them, the user needs the domains::configure:threshold permission.

To open the thresholds page, navigate to your domain and click Thresholds on the left.

Load Thresholds

Load thresholds measure the health of the node itself.

Load Metrics

| Metric Type | Description | Default Value |
| --- | --- | --- |
| CPU Usage | Monitors the percentage of CPU used across all cores | 95% for 10 minutes |
| Memory Usage | Monitors the percentage of total memory (RAM) used | 90% for 30 minutes |
| Disk Usage | Monitors the percentage of disk space used on the root partition | 80% for 1 minute |
| Embryonic Flows | Monitors the number of TCP flows (connections) that are in the embryonic state (waiting for ACK) | none |
| JVM Heap | Monitors the percentage of allocated JVM memory used | none |
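As a rough illustration, the default values in the table can be written down as data and combined with overrides. This is only a sketch: the metric keys (`cpu`, `memory`, `disk`), the `LoadThreshold` type, and the `effective_thresholds` helper are hypothetical names, and it assumes that an override replaces the domain-level threshold for the same metric on a given node, which the text above does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LoadThreshold:
    percent: float         # value that must be exceeded
    duration_minutes: int  # how long it must stay exceeded


# Defaults copied from the Load Metrics table. Embryonic Flows and JVM Heap
# have no default ("none"), so they are simply absent here.
DEFAULTS: Dict[str, LoadThreshold] = {
    "cpu": LoadThreshold(95.0, 10),
    "memory": LoadThreshold(90.0, 30),
    "disk": LoadThreshold(80.0, 1),
}


def effective_thresholds(
    domain: Dict[str, LoadThreshold],
    node_overrides: Optional[Dict[str, LoadThreshold]] = None,
) -> Dict[str, LoadThreshold]:
    """Merge domain-level thresholds with overrides for one node.

    Assumption (not confirmed by the documentation): an override wins
    over the domain-level threshold for the same metric key.
    """
    merged = dict(domain)
    merged.update(node_overrides or {})
    return merged
```

For example, `effective_thresholds(DEFAULTS, {"cpu": LoadThreshold(80.0, 5)})` would keep the memory and disk defaults and apply the stricter CPU threshold to that node only.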

Load Fields

| Field Name | Description |
| --- | --- |
| Name | The name of the threshold. This will be available in generated events. |
| Telemetry | The metric to monitor. Options are CPU usage (%), memory usage (%), disk usage (%), and embryonic flows (absolute count). |
| Threshold | The value that must be exceeded for an event to be generated. |
| Duration | The time period to measure. If the threshold is exceeded for this duration, an event will be generated. |
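The interaction between Threshold and Duration can be sketched as follows. This is a minimal illustration, not the product's actual evaluation logic: it assumes the value must stay strictly above the threshold for the whole duration, that samples arrive as (timestamp in seconds, value) pairs in time order, and that one event fires per uninterrupted streak. The function name `violations` is hypothetical.

```python
from typing import Iterable, List, Tuple


def violations(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
    duration_s: float,
) -> List[float]:
    """Return the timestamps at which a violation event would fire.

    An event fires once per streak, at the first sample where the value
    has been strictly above `threshold` for at least `duration_s` seconds.
    """
    events: List[float] = []
    streak_start = None  # timestamp of the first sample in the current streak
    fired = False        # whether this streak has already produced an event
    for ts, value in samples:
        if value > threshold:
            if streak_start is None:
                streak_start = ts
                fired = False
            if not fired and ts - streak_start >= duration_s:
                events.append(ts)
                fired = True
        else:
            # Dropping to or below the threshold resets the streak.
            streak_start = None
            fired = False
    return events
```

With the default CPU threshold (95% for 10 minutes), a node sampled once a minute that stays at 96% would produce a single event at the 600-second mark under these assumptions; a brief dip below 95% restarts the clock.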

Network Thresholds

Network thresholds measure the health of the network from the node’s perspective.

Network Metrics

| Metric Type | Description |
| --- | --- |
| Latency (ms) | Monitors the round-trip tunnel latency between this node and the target |
| Bandwidth In Usage (Mbps) | Monitors the amount of received bandwidth on the specified interface |
| Bandwidth Out Usage (Mbps) | Monitors the amount of sent bandwidth on the specified interface |

Network Fields

| Field Name | Description |
| --- | --- |
| Name | The name of the threshold. This will be available in generated events. |
| Telemetry | The metric to monitor. Options are latency (ms), bandwidth in usage (Mbps), and bandwidth out usage (Mbps). |
| Threshold | The value that must be exceeded for an event to be generated. |
| Duration | The time period to measure. If the threshold is exceeded for this duration, an event will be generated. |
| Target | For Latency: the target node to measure the latency to. Each node will measure the latency to the target node. For Bandwidth In/Out: the network interface to monitor usage on. |
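Because the meaning of Target depends on the selected metric, it may help to see the fields modeled together. The sketch below is illustrative only: the `NetworkThreshold` class, the telemetry identifiers, and the use of seconds for the duration are assumptions made for the example and are not taken from the product.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the three network metrics.
LATENCY = "latency"
BANDWIDTH_IN = "bandwidth_in"
BANDWIDTH_OUT = "bandwidth_out"


@dataclass(frozen=True)
class NetworkThreshold:
    name: str         # appears in generated events
    telemetry: str    # one of the identifiers above
    threshold: float  # ms for latency, Mbps for bandwidth
    duration_s: int   # assumed unit: seconds
    target: str       # node for latency, interface for bandwidth

    def target_kind(self) -> str:
        """Say how `target` should be interpreted for this metric."""
        if self.telemetry == LATENCY:
            return "node"
        if self.telemetry in (BANDWIDTH_IN, BANDWIDTH_OUT):
            return "interface"
        raise ValueError(f"unknown telemetry: {self.telemetry!r}")

    def unit(self) -> str:
        """Unit of `threshold`, following the Network Metrics table."""
        return "ms" if self.target_kind() == "node" else "Mbps"
```

Under this model, `NetworkThreshold("edge-latency", LATENCY, 150.0, 300, "node-b")` treats `node-b` as a node, while `NetworkThreshold("wan-in", BANDWIDTH_IN, 800.0, 60, "eth0")` treats `eth0` as an interface name.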