MONITORING WITHOUT AGENTS — THE HYBRID APPROACH
Most monitoring tools (Datadog, New Relic, Prometheus node_exporter) require agents installed on every target system. For a small infrastructure, that’s overhead — more software to maintain, more attack surface, more resource consumption.
Starting Agentless
The initial approach: a central observer queries target nodes using native protocols. SSH for system metrics, SQL for database health, Unix sockets for container status. No software installed on targets.
This worked for basic monitoring. But SSH-based polling is slow — each check opens a connection, runs a command, parses output. At 30-second intervals across multiple nodes, the observer spends most of its time waiting on SSH handshakes.
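The per-check cost described above can be sketched as follows. This is a minimal illustration, not the original observer code: the function names `ssh_check` and `parse_loadavg`, the choice of `/proc/loadavg` as the metric, and the timeout values are all assumptions made for the example. It shells out to the system `ssh` client and times the whole round trip, which makes the connection overhead visible.

```python
import subprocess
import time


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1m, 5m, 15m) load averages."""
    fields = text.split()
    return tuple(float(value) for value in fields[:3])


def ssh_check(host, command="cat /proc/loadavg", timeout=10):
    """Run one command over a fresh SSH connection; return (metrics, seconds).

    Hypothetical helper: every call pays for a new connection, the remote
    command, and parsing, which is the cost the text describes.
    """
    start = time.monotonic()
    result = subprocess.run(
        [
            "ssh",
            "-o", "BatchMode=yes",                # fail instead of prompting for a password
            "-o", f"ConnectTimeout={timeout}",    # bound the connection attempt
            host,
            command,
        ],
        capture_output=True,
        text=True,
        timeout=timeout + 5,   # hard limit on the whole check
        check=True,            # raise if ssh or the remote command fails
    )
    elapsed = time.monotonic() - start
    return parse_loadavg(result.stdout), elapsed
```

Calling `ssh_check` against a real node (for example a hypothetical `monitor@node1`) and logging the elapsed time per check is one way to see how much of each polling interval goes to connection setup rather than to the command itself.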
The Hybrid Evolution
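The solution: lightweight HTTP microservices on the nodes that benefit from faster polling. A minimal Flask app (under 50 lines) exposes /health and /stats endpoints. The observer hits these endpoints instead of opening SSH sessions, so each check costs a single HTTP request rather than an SSH handshake, a remote command, and output parsing.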
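An agent of this kind might look roughly like the sketch below. Only Flask and the two endpoint paths come from the text; the metric names, the use of load average and disk usage as the stats, and port 8080 are assumptions chosen for illustration. `os.getloadavg` is Unix-only, so this sketch assumes a Linux or similar node.

```python
import os
import shutil

from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/health")
def health():
    # Liveness only: if this handler answers, the agent process is up.
    return jsonify(status="ok")


@app.route("/stats")
def stats():
    # Assumed metrics for this sketch: system load and root-disk usage.
    # Everything is read fresh on each request, so the agent keeps no state.
    load_1m, load_5m, load_15m = os.getloadavg()
    disk = shutil.disk_usage("/")
    return jsonify(
        load_1m=load_1m,
        load_5m=load_5m,
        load_15m=load_15m,
        disk_used_pct=round(disk.used / disk.total * 100, 1),
    )


if __name__ == "__main__":
    # Hypothetical bind address and port; a real deployment would bind to the
    # interface the observer can reach and restrict access accordingly.
    app.run(host="127.0.0.1", port=8080)
```

Because the handlers compute everything on demand and store nothing between requests, the process can be killed and restarted, or replaced by a new version, without any migration step.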
What stayed agentless:
- Database health checks (native SQL protocol)
- Container status (Docker socket)
- Security feeds (existing HTTP APIs)
What got agents:
- Edge nodes where SSH latency was noticeable
- Nodes needing sub-second response times
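The split above amounts to a small decision rule, which could be written down roughly as follows. The `Target` type, its field names, and the transport labels are hypothetical and exist only for this sketch; the rule itself mirrors the two lists: use the native protocol where one exists, and fall back to an HTTP agent only for nodes where SSH polling is too slow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of one monitored target."""
    name: str
    kind: str                        # "database", "containers", "feed", or "node"
    latency_sensitive: bool = False  # True for edge nodes / sub-second needs


# Targets that already speak a protocol the observer can query directly.
NATIVE_TRANSPORTS = {
    "database": "sql",
    "containers": "docker-socket",
    "feed": "http-api",
}


def choose_transport(target):
    """Pick how the observer should poll a target."""
    if target.kind in NATIVE_TRANSPORTS:
        return NATIVE_TRANSPORTS[target.kind]
    # Plain nodes: stay agentless over SSH unless polling speed matters.
    return "http-agent" if target.latency_sensitive else "ssh"


if __name__ == "__main__":
    for t in (
        Target("orders-db", "database"),
        Target("edge-1", "node", latency_sensitive=True),
        Target("build-box", "node"),
    ):
        print(t.name, "->", choose_transport(t))
```

Keeping the rule in one place makes the default explicit: a node gets an agent only when someone marks it as latency-sensitive.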
The Lesson
“Agentless” is a design preference, not a religion. The right answer is to use native protocols where they’re fast enough and to add lightweight agents where they’re not. The key constraint: agents must be stateless, single-purpose, and trivially replaceable.