**License: Free** - No license required. Available in all tiers.

CPU Probe#

The CPU probe monitors processor performance across all major operating systems, providing comprehensive metrics for usage, load, and system-level CPU statistics.

Quick Start#

Basic Configuration#

probes:
  - name: cpu
    params:
      interval: 30  # Collection interval in seconds (default: 30)

Minimal Configuration#

probes:
  - name: cpu
    params: {}

The CPU probe requires no mandatory parameters and works out-of-the-box with default settings.
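
The defaulting behaviour described above can be sketched as follows. This is an illustration only: the function name `effective_interval` and the dict layout are assumptions, and only the `interval` key and its 30-second default come from this page.

```python
DEFAULT_INTERVAL_SECONDS = 30  # default stated in this document


def effective_interval(probe: dict) -> int:
    """Return the collection interval for a probe entry, falling back to the default."""
    # Tolerate `params: {}` as well as a missing `params` key (assumed behaviour).
    params = probe.get("params") or {}
    return int(params.get("interval", DEFAULT_INTERVAL_SECONDS))


minimal = {"name": "cpu", "params": {}}
tuned = {"name": "cpu", "params": {"interval": 10}}
print(effective_interval(minimal), effective_interval(tuned))
```

Both configurations shown in Quick Start therefore resolve to an explicit interval, with the minimal form picking up the default.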

Supported Platforms#

  • Windows: Windows Server 2012+ / Windows 10+
  • Linux: All modern distributions (Ubuntu, RHEL, CentOS, Debian, etc.)
  • macOS: macOS 10.13+ (with graceful degradation)
  • BSD: FreeBSD, OpenBSD, NetBSD

Platform-specific metrics are automatically detected and collected based on the operating system.

macOS Platform Notes#

On macOS, the gopsutil library has limited support for detailed CPU time metrics (cpu.Times()). The CPU probe implements graceful degradation:

  • If detailed CPU times are unavailable: the probe continues with load average metrics (cpu_load1, cpu_load5, cpu_load15)
  • Always available: CPU usage percentage (cpu_usage_total, cpu_core_usage)
  • Behavior: Logs warnings for unavailable metrics but remains active

This ensures the probe stays functional even when platform limitations exist, providing at minimum load average and usage percentage metrics.
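
The degradation pattern described above might look roughly like the sketch below. It is not the agent's implementation: the `collect` function, the reader callables, the `OSError` failure type, and the logger name are all assumptions chosen for illustration. The idea is simply that each metric source is read independently, and a failing source produces a warning instead of stopping the probe.

```python
import logging
from typing import Callable, Dict

log = logging.getLogger("probe.cpu")  # logger name is an assumption

Reader = Callable[[], Dict[str, float]]


def collect(read_usage: Reader, read_load: Reader, read_times: Reader) -> Dict[str, float]:
    """Merge metrics from each source, skipping any source that fails."""
    metrics: Dict[str, float] = {}
    sources = (("usage", read_usage), ("load", read_load), ("times", read_times))
    for label, reader in sources:
        try:
            metrics.update(reader())
        except OSError as exc:  # assumed failure type for an unavailable source
            log.warning("cpu %s metrics unavailable: %s", label, exc)
    return metrics


def failing_times() -> Dict[str, float]:
    # Stand-in for a platform where detailed CPU times cannot be read.
    raise OSError("cpu times not supported on this platform")


result = collect(
    lambda: {"cpu_usage_total": 12.5},
    lambda: {"cpu_load1": 0.8, "cpu_load5": 0.6, "cpu_load15": 0.5},
    failing_times,
)
print(sorted(result))
```

In this example the result still contains the usage and load average metrics, while the time-based metrics are absent and a warning is logged.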

Key Metrics Summary#

Cross-Platform Metrics#

| Metric | Description | Available On |
| --- | --- | --- |
| cpu_usage_total | Total CPU usage percentage (0-100%) | All platforms |
| cpu_core_usage | Per-core CPU usage percentage | All platforms |
| cpu_user | User-mode CPU time | All platforms |
| cpu_system | System-mode CPU time | All platforms |
| cpu_irq | Hardware interrupt time | All platforms |
| cpu_softirq | Software interrupt time | All platforms |

Unix/Linux/macOS Specific#

| Metric | Description |
| --- | --- |
| cpu_idle | CPU idle time (seconds) |
| cpu_nice | CPU nice priority time (seconds) |
| cpu_iowait | CPU I/O wait time (seconds) |
| cpu_steal | CPU steal time for VMs (seconds) |
| cpu_load1 | Load average (1 minute) |
| cpu_load5 | Load average (5 minutes) |
| cpu_load15 | Load average (15 minutes) |
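
Time metrics reported in seconds are easiest to interpret as shares of elapsed CPU time between two collections. The sketch below assumes the values are cumulative counters, which this page does not state explicitly, and the function name `time_shares` and the sample numbers are made up for illustration.

```python
from typing import Dict


def time_shares(prev: Dict[str, float], curr: Dict[str, float]) -> Dict[str, float]:
    """Return each time metric's share (percent) of CPU time elapsed between two samples."""
    deltas = {name: curr[name] - prev[name] for name in curr}
    total = sum(deltas.values())
    if total <= 0:
        # No time elapsed (or counters reset): report zero shares instead of dividing by zero.
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Two hypothetical samples taken one collection interval apart.
prev = {"cpu_user": 100.0, "cpu_system": 50.0, "cpu_idle": 800.0, "cpu_iowait": 50.0}
curr = {"cpu_user": 106.0, "cpu_system": 52.0, "cpu_idle": 820.0, "cpu_iowait": 52.0}

shares = time_shares(prev, curr)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

A rising cpu_iowait or cpu_steal share computed this way is usually more informative than the raw second counts.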

Windows Specific#

| Metric | Description |
| --- | --- |
| cpu_dpc_rate | Deferred Procedure Calls per second |
| cpu_dpc_queued | DPCs queued per second |
| cpu_interrupts | Hardware interrupts per second |
| cpu_queue_length | Processor queue length |

Configuration Parameters#

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| interval | integer | 30 | Collection interval in seconds |

Example Configurations#

High-frequency monitoring (every 10 seconds):

probes:
  - name: cpu
    params:
      interval: 10

Standard monitoring (every minute):

probes:
  - name: cpu
    params:
      interval: 60

Monitoring Tool Integration#

PRTG Network Monitor#

Access CPU metrics in PRTG JSON format:

# All CPU metrics
curl http://localhost:8080/api/{agentkey}/prtg/metrics

# Configure PRTG HTTP Advanced Sensor:
# - URL: http://agent-host:8080/api/{agentkey}/prtg/metrics
# - Method: POST
# - Request body: {"probe": "cpu"}

PRTG Channels Available:

  • CPU Total Usage (%)
  • CPU Core 0-N Usage (%)
  • CPU User Time (% or seconds)
  • CPU System Time (% or seconds)
  • CPU Load Average (Linux/Unix)
  • DPC Rate & Interrupts (Windows)
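
For scripted checks outside PRTG, the sensor settings listed above (POST method, JSON body selecting the cpu probe) could be reproduced with Python's standard library. This is a sketch under assumptions: the helper name `build_prtg_request` is hypothetical, and the `Content-Type` header is a guess that this page does not confirm.

```python
import json
import urllib.request


def build_prtg_request(host: str, agent_key: str, probe: str = "cpu") -> urllib.request.Request:
    """Build a POST request mirroring the PRTG HTTP Advanced Sensor settings."""
    url = f"http://{host}:8080/api/{agent_key}/prtg/metrics"
    body = json.dumps({"probe": probe}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},  # assumed header
    )


req = build_prtg_request("localhost", "YOUR_KEY")
print(req.get_method(), req.full_url)
# To send it against a running agent:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it keeps the example runnable without an agent listening on port 8080.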

Nagios/Icinga#

Access CPU metrics in Nagios format:

# All CPU metrics with performance data
curl "http://localhost:8080/api/{agentkey}/nagios/metrics?probe=cpu"

# Example output:
# OK - CPU monitoring active | cpu_usage_total=42.5%;80;90 cpu_load1=1.23;;;

Nagios Performance Data:

  • cpu_usage_total - Total CPU usage with 80% warning, 90% critical
  • cpu_load1, cpu_load5, cpu_load15 - Load averages (Unix)
  • cpu_queue_length - Processor queue (Windows)
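
The performance data after the `|` follows the usual Nagios `label=value[unit];warn;crit;min;max` layout, so it can be split mechanically. The parser below is a simplified sketch: the function name is hypothetical, it handles only plain numeric thresholds such as those in the example output, and it does not support Nagios range syntax.

```python
import re
from typing import Dict, List, Optional

_VALUE = re.compile(r"^(-?\d+(?:\.\d+)?)(.*)$")  # numeric value followed by an optional unit


def _number(fields: List[str], index: int) -> Optional[float]:
    """Return the field at index as a float, or None when it is missing or empty."""
    if len(fields) > index and fields[index]:
        return float(fields[index])
    return None


def parse_perfdata(line: str) -> Dict[str, dict]:
    """Parse the performance data section of a Nagios plugin output line."""
    _, _, perf = line.partition("|")
    result: Dict[str, dict] = {}
    for token in perf.split():
        label, _, rest = token.partition("=")
        fields = rest.split(";")
        match = _VALUE.match(fields[0])
        if not match:
            continue  # skip tokens without a numeric value
        result[label] = {
            "value": float(match.group(1)),
            "unit": match.group(2),
            "warn": _number(fields, 1),
            "crit": _number(fields, 2),
        }
    return result


sample = "OK - CPU monitoring active | cpu_usage_total=42.5%;80;90 cpu_load1=1.23;;;"
data = parse_perfdata(sample)
print(data["cpu_usage_total"])
```

For the example output, cpu_usage_total carries the 80/90 thresholds while cpu_load1 has none, which the parser reports as `None`.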

Grafana/Prometheus#

Access metrics in Prometheus-compatible format:

# Prometheus format
curl http://localhost:8080/api/{agentkey}/prometheus/metrics

# Example output:
# cpu_usage_total{hostname="server01"} 42.5
# cpu_core_usage{hostname="server01",core="0"} 38.2
# cpu_load1{hostname="server01"} 1.23
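
When a full Prometheus server is not available, the sample lines can be read with a few lines of Python. This is a deliberately simplified sketch that covers only the `name{labels} value` shape shown in the example output (no timestamps, no escaped quotes in label values); the function name is hypothetical.

```python
import re
from typing import Dict, List, Tuple

_SAMPLE = re.compile(r"^(\w+)(?:\{(.*)\})?\s+(\S+)$")  # name, optional labels, value
_LABEL = re.compile(r'(\w+)="([^"]*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_samples(text: str) -> List[Sample]:
    """Parse simple Prometheus text-format samples into (name, labels, value) tuples."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match:
            name, labels, value = match.groups()
            samples.append((name, dict(_LABEL.findall(labels or "")), float(value)))
    return samples


text = """\
cpu_usage_total{hostname="server01"} 42.5
cpu_core_usage{hostname="server01",core="0"} 38.2
cpu_load1{hostname="server01"} 1.23
"""
samples = parse_samples(text)
for name, labels, value in samples:
    print(name, labels, value)
```

For production use, an established Prometheus client library is a better choice than a hand-written parser.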

Web Interface#

View CPU metrics in the built-in dashboard:

http://localhost:8080/web/{agentkey}/dashboard

Features:

  • Real-time CPU usage visualization
  • Per-core CPU usage breakdown
  • Load average trends (Unix/Linux)
  • System-wide CPU statistics

Use Cases#

Performance Monitoring#

Monitor CPU usage to identify:

  • High CPU consumers
  • CPU bottlenecks
  • Per-core imbalances
  • System vs. user time distribution

Capacity Planning#

Track CPU trends over time:

  • Peak usage patterns
  • Average load levels
  • Core utilization distribution
  • Growth trends

VM Performance Analysis#

Monitor virtualized environments:

  • CPU steal time (hypervisor overhead)
  • Queue length (scheduling delays)
  • Per-core allocation effectiveness

Troubleshooting#

Diagnose system issues:

  • High I/O wait (storage bottleneck)
  • Excessive interrupts (hardware issues)
  • High DPC rate (Windows driver issues)
  • Load average spikes (Unix/Linux)

Troubleshooting#

No Metrics Collected#

Check probe status:

# View agent logs with CPU probe debugging
./agent run --authentication-key YOUR_KEY --verbose --debug-modules probe.cpu

Verify probe is enabled:

# Check configuration
grep -A5 "name: cpu" agent-config.yaml

Windows: PDH Counter Errors#

Symptom: Error messages about Performance Data Helper (PDH) counters

Solution:

  1. Verify Performance Counter service is running:

    Get-Service | Where-Object {$_.Name -eq "PerfHost"}
  2. Rebuild Performance Counters:

    lodctr /R
  3. Check Windows Event Log for PDH errors

Unix/Linux: Permission Denied#

Symptom: Cannot read /proc/stat or system files

Solution: Run the agent with appropriate permissions:

# Option 1: Run as root
sudo ./agent run --authentication-key YOUR_KEY

# Option 2: Grant capabilities (Linux)
sudo setcap cap_sys_ptrace=eip ./agent

High CPU Usage from Agent#

Symptom: Agent itself consuming significant CPU

Solution:

  1. Increase collection interval:

    - name: cpu
      params:
        interval: 60  # Collect every minute instead of 30 seconds
  2. Check for other resource-intensive probes

  3. Review system load and available resources

Per-Core Metrics Missing#

Windows: Ensure all CPU cores are enabled in BIOS/firmware

Unix/Linux: Verify /proc/cpuinfo shows all cores:

grep processor /proc/cpuinfo
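
The same check can be scripted, for example to compare the number of processors listed in /proc/cpuinfo with what the operating system reports to a process. This is a minimal sketch; the helper name `count_processors` is hypothetical, and it counts lines starting with `processor`, which is the layout on common x86 Linux systems.

```python
import os


def count_processors(cpuinfo_text: str) -> int:
    """Count 'processor' entries in the contents of /proc/cpuinfo."""
    return sum(1 for line in cpuinfo_text.splitlines() if line.startswith("processor"))


# Only attempt the comparison where /proc/cpuinfo exists (Linux).
if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as handle:
        listed = count_processors(handle.read())
    print(f"{listed} listed in /proc/cpuinfo, {os.cpu_count()} reported by os.cpu_count()")
```

A mismatch between the two numbers, or a count lower than the hardware specification, points at cores that are offline or hidden from the operating system.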

Performance Considerations#

Collection Overhead#

The CPU probe has minimal overhead:

  • Windows: ~10ms per collection (PDH counters)
  • Unix/Linux: ~50ms per collection (gopsutil library)
  • macOS: ~30ms per collection (system calls)

Memory Usage#

Typical memory footprint per collection:

  • Base probe: ~500 KB
  • Per-core metrics: ~50 KB per core
  • Example: 16-core system = ~1.3 MB total
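
The footprint figures above amount to a simple linear estimate. The sketch below reproduces the arithmetic; the constants are the approximate values from this page, and the function name is hypothetical.

```python
BASE_KB = 500      # approximate base probe footprint from this page
PER_CORE_KB = 50   # approximate per-core footprint from this page


def estimated_footprint_kb(cores: int) -> int:
    """Estimate the probe's memory footprint in KB for a given core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return BASE_KB + PER_CORE_KB * cores


for cores in (4, 16, 64):
    print(f"{cores} cores: ~{estimated_footprint_kb(cores) / 1000:.1f} MB")
```

For 16 cores this gives 500 + 16 × 50 = 1300 KB, matching the ~1.3 MB example.
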
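
Recommended collection intervals by use case:

| Use Case | Interval | Reason |
| --- | --- | --- |
| Real-time monitoring | 10-15s | Catch short-lived spikes |
| Standard monitoring | 30-60s | Balance accuracy and overhead |
| Long-term trending | 120-300s | Reduce storage and overhead |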
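
To make the storage trade-off concrete, the number of collections per day follows directly from the interval. The helper below is an illustrative sketch (the function name is hypothetical) and counts collections only, not the number of individual metrics stored per collection.

```python
SECONDS_PER_DAY = 86_400


def samples_per_day(interval_seconds: int) -> int:
    """Return how many collections happen per day at the given interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return SECONDS_PER_DAY // interval_seconds


for interval in (10, 30, 60, 300):
    print(f"interval {interval:>3}s -> {samples_per_day(interval)} collections/day")
```

Moving from a 10-second to a 300-second interval reduces the daily collection count by a factor of 30.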

Advanced Configuration#

Multi-Instance Monitoring#

Run several CPU probe instances with different collection intervals, for example one for real-time monitoring and one for long-term trending:

probes:
  - name: cpu_realtime
    params:
      interval: 10

  - name: cpu_trending
    params:
      interval: 300

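Note: Each instance collects the same CPU metrics, so this configuration produces duplicate metrics at different intervals. Give each instance a unique probe name so the two collections can be told apart.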

Integration with Other Probes#

Correlate CPU metrics with other system metrics:

probes:
  - name: cpu
    params:
      interval: 30

  - name: memory
    params:
      interval: 30

  - name: logicaldisk
    params:
      interval: 60

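This covers CPU, memory, and disk in one configuration. The cpu and memory probes share a 30-second interval, so their samples can be compared directly; logicaldisk collects every 60 seconds, a multiple of 30, so each disk sample can be matched to every second CPU and memory sample.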

Authentication#

The CPU probe requires no authentication as it collects local system metrics only.

Requirements#

Windows#

  • Windows Server 2012+ or Windows 10+
  • Performance Counter service enabled
  • No special permissions required (runs as service account)

Linux/Unix/macOS#

  • Read access to /proc/stat (Linux)
  • Read access to /proc/loadavg (Linux)
  • System information APIs (macOS, BSD)

Network#

  • No network access required (local metrics only)
  • HTTP strategy required for remote access to metrics
SenHub Agent 0.1.80-beta