Key Takeaways
- Wazuh scales in two independent dimensions: the manager cluster (handling agents and analysis) and the indexer cluster (handling storage and search). Plan them separately.
- A manager cluster uses one master and multiple worker nodes. Workers share the agent connection and analysis load; the master owns configuration and synchronisation.
- Load balancing distributes agents across workers. A load balancer in front of the worker nodes spreads connections evenly and survives a single worker failing.
- The indexer scales by adding data nodes and distributing shards, with dedicated master nodes for cluster stability at larger sizes.
- Capacity planning is driven by agent count, EPS and retention, with headroom for growth so you scale ahead of need rather than during an incident.
Wazuh Scales in Two Dimensions
The most important concept in Wazuh scalability is that the manager and the indexer scale independently. The manager handles agent connections, event collection and the analysis pipeline. The indexer handles storage, indexing and search. These are different workloads with different constraints, and a healthy growth plan sizes each one against its own demand rather than treating Wazuh as a single monolith.
This separation is liberating in practice. An estate with many agents but modest event volume per agent puts pressure on the manager cluster while leaving the indexer comfortable. An estate with fewer agents but extremely chatty sources (verbose application logs, high-traffic firewalls) puts pressure on the indexer while the manager copes easily. Knowing which dimension you are growing in tells you exactly what to add.
A single all-in-one server is the right starting point for small deployments, but it conflates both dimensions on one box. The first architectural decision in any scalability plan is splitting the indexer onto its own node, because that single change removes the most common early bottleneck and creates the clean two-dimensional model that the rest of the plan builds on.
Manager Clustering: Master and Workers
A Wazuh manager cluster consists of one master node and one or more worker nodes. The master is authoritative for configuration: rules, decoders, agent registration and the shared configuration that all nodes use. Worker nodes pull that configuration from the master and do the heavy lifting of accepting agent connections and running the analysis pipeline.
Adding worker nodes is how the manager dimension scales. Each worker can service a share of the agent fleet and run its own analysisd, so total analysis throughput grows roughly linearly with worker count. The master does not need to be large because it does not carry agent load directly; its job is synchronisation and being the single source of truth for configuration.
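The roughly linear scaling described above can be turned into a first-pass worker count. This is a minimal sketch, not a Wazuh tool: the function name, the 30% headroom default and the example figures are illustrative assumptions, and the sustainable EPS per worker has to come from your own measurements.

```python
import math


def workers_needed(peak_eps: float, eps_per_worker: float, headroom: float = 0.3) -> int:
    """Estimate worker count for a peak event rate, assuming linear scaling.

    eps_per_worker is a measured figure for your hardware and ruleset;
    the default headroom is an illustrative assumption.
    """
    if eps_per_worker <= 0:
        raise ValueError("eps_per_worker must be positive")
    required = peak_eps * (1 + headroom) / eps_per_worker
    return max(1, math.ceil(required))


# Hypothetical inputs: 12,000 EPS at peak, 5,000 EPS sustained per worker.
print(workers_needed(12_000, 5_000))  # 4
```

Because real scaling is only roughly linear, treat the result as a lower bound and confirm it under load.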
Cluster synchronisation keeps workers consistent. When you push a new rule to the master, it propagates to the workers so detection is uniform across the fleet. This means an analyst changes detection logic in one place and it applies everywhere, which is essential at scale where manually maintaining many independent managers would be unworkable and error-prone.
Load Balancing and Agent Distribution
With multiple worker nodes, agents must be distributed across them. A load balancer placed in front of the worker nodes spreads incoming agent connections, so no single worker is overwhelmed while others sit idle. Agents are configured to connect to the load balancer address rather than to a specific worker, which also means a worker can be added or removed without reconfiguring every agent.
Load balancing provides resilience as well as distribution. If a worker node fails, the balancer routes its agents to the surviving workers, and the agents reconnect automatically. This is the mechanism that keeps collection running through a node outage, which matters because a SIEM that stops collecting during an incident is worse than useless: it gives false confidence.
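The failover behaviour implies a simple N-1 check: after one worker is lost, the survivors must still be able to hold the whole fleet. The sketch below models an idealised balancer that spreads agents evenly. The function names and the per-worker agent ceiling are hypothetical, and a real balancer's distribution depends on its algorithm and on when agents reconnect.

```python
def agents_per_worker(agent_count: int, workers: int) -> list[int]:
    """Spread agents as evenly as possible, as an idealised balancer would."""
    if workers < 1:
        raise ValueError("need at least one worker")
    base, extra = divmod(agent_count, workers)
    return [base + 1 if i < extra else base for i in range(workers)]


def survives_worker_loss(agent_count: int, workers: int, max_agents_per_worker: int) -> bool:
    """Return True if the fleet still fits on the remaining workers after one fails."""
    if workers < 2:
        return False  # a single worker has nowhere to fail over to
    after_failure = agents_per_worker(agent_count, workers - 1)
    return max(after_failure) <= max_agents_per_worker


# Hypothetical fleet: 9,000 agents and a ceiling of 3,000 agents per worker.
print(agents_per_worker(9_000, 4))           # [2250, 2250, 2250, 2250]
print(survives_worker_loss(9_000, 4, 3_000))  # True: 3 survivors carry 3,000 each
print(survives_worker_loss(9_000, 3, 3_000))  # False: 2 survivors would carry 4,500 each
```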
For geographically distributed estates spanning India, Singapore, the UAE and Malaysia, agent distribution can also be regional. Placing workers or collectors closer to clusters of agents shortens the network hop, reduces batching delay and keeps detection responsive. The load-balancing design and the geographic design should be planned together rather than retrofitted.
Scaling the Indexer Cluster
The indexer scales horizontally by adding data nodes. Each data node stores a share of the shards, so indexing throughput and search capacity both increase as nodes are added and shards rebalance across them. This is the dimension that grows with event volume and retention rather than with agent count.
At larger cluster sizes, dedicating master-eligible nodes that do not hold data improves stability. These dedicated masters manage cluster state and elect a leader, isolated from the heavy indexing load on the data nodes. Mixing cluster-management duties with heavy indexing on the same nodes is a common cause of instability as a cluster grows, because a node under indexing pressure can become unresponsive at exactly the wrong moment.
Replica shards provide both resilience and search capacity. A replica is a copy of a primary shard on a different node, so the loss of a node does not lose data, and searches can be served from replicas in parallel with primaries. The replica count is a deliberate trade-off: more replicas mean more resilience and search throughput but proportionally more storage, so it is sized against the availability requirement.
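The trade-off can be made concrete with a little arithmetic. This is a simplified sketch: the function and its field names are illustrative, it assumes each copy of a shard must sit on a different data node, and it ignores shard-size skew and disk watermarks.

```python
def replica_tradeoff(primary_gb: float, replicas: int, data_nodes: int) -> dict:
    """Summarise storage cost and resilience for a given replica count."""
    if replicas < 0 or data_nodes < 1:
        raise ValueError("replicas must be >= 0 and data_nodes >= 1")
    # Copies of the same shard cannot share a node, so at most
    # (data_nodes - 1) replicas per shard can actually be placed.
    placeable = min(replicas, data_nodes - 1)
    return {
        "total_storage_gb": primary_gb * (1 + placeable),
        "copies_per_shard": 1 + placeable,
        "node_losses_tolerated": placeable,
        "unassigned_replicas": replicas - placeable,
    }


# Hypothetical cluster: 500 GB of primary data on three data nodes.
print(replica_tradeoff(500, replicas=1, data_nodes=3))
print(replica_tradeoff(500, replicas=2, data_nodes=3))
```

Going from one replica to two in this example raises storage from 1,000 GB to 1,500 GB and lets the cluster lose two nodes instead of one without losing data.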
Building a Capacity Model
A scalability plan needs numbers. The three inputs that drive Wazuh capacity are agent count (and the event rate each agent generates), total EPS at average and peak, and retention period. Agent count and per-agent event rate size the manager cluster. EPS and retention size the indexer cluster and its storage. Getting these inputs from measurement rather than estimate is what makes a plan trustworthy.
Headroom is not optional. Sizing to one hundred percent of current need guarantees a scramble the moment the estate grows or an incident spikes volume. A sound model sizes for the projected agent count and EPS over the planning horizon, with additional headroom for bursts, so capacity is added ahead of need during a maintenance window rather than reactively during an outage.
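One way to express "projected demand plus burst headroom" is compound growth over the planning horizon multiplied by a burst factor. The growth rate, horizon and headroom below are placeholder assumptions, not recommendations; substitute figures from your own trend data.

```python
def projected_demand(current: float, growth_per_quarter: float,
                     quarters: int, burst_headroom: float) -> float:
    """Project a metric (agents or EPS) forward and add burst headroom.

    growth_per_quarter and burst_headroom are fractions, e.g. 0.10 for 10%.
    """
    if quarters < 0:
        raise ValueError("quarters must be non-negative")
    grown = current * (1 + growth_per_quarter) ** quarters
    return grown * (1 + burst_headroom)


# Hypothetical: 4,000 EPS today, 10% growth per quarter, a one-year
# horizon, and 50% headroom for incident-driven bursts.
target_eps = projected_demand(4_000, 0.10, quarters=4, burst_headroom=0.50)
print(round(target_eps))  # 8785
```

The same function works for agent count. In this example the target is more than double today's figure, which is the gap that sizing to current need would leave uncovered.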
The model should also be revisited on a cadence. Estates grow, new log sources are onboarded, retention requirements change with compliance obligations. A capacity model reviewed quarterly catches the trend lines early. Codesecure maintains a living capacity model for managed clients, tracking agent growth, EPS trends and storage burn so scaling is a planned project, never an emergency.
A Practical Growth Roadmap
Scaling Wazuh follows a predictable progression. Stage one is a single all-in-one node for a small estate. Stage two splits the indexer onto its own server, removing the most common early bottleneck. Stage three introduces a multi-node indexer cluster as event volume and retention grow. Stage four adds manager worker nodes behind a load balancer as the agent fleet expands.
Each stage should be reached deliberately, triggered by capacity-model thresholds rather than by an outage. The signals are clear: indexer queues and rejections trigger indexer scaling; analysisd queue saturation and agent count trigger manager scaling. Because the two dimensions are independent, an estate might be at stage four on the indexer while still at stage two on the manager, or vice versa.
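The trigger logic can be written down as a small rule set so that a capacity review yields the same answer every time. The metric names and thresholds here are hypothetical placeholders. How you collect indexer rejections and analysisd queue usage depends on your monitoring setup and is not shown.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    indexer_write_rejections: int  # rejected indexing requests in the review window
    analysisd_queue_usage: float   # fraction of the event queue in use, 0.0 to 1.0
    agents_per_worker: int         # agents on the busiest manager node


def scaling_actions(m: Metrics, *, queue_limit: float = 0.7,
                    agent_limit: int = 2_500) -> list[str]:
    """Map observed signals to the dimension that needs capacity.

    The two dimensions are evaluated independently, so either, both or
    neither may be flagged. Threshold defaults are illustrative only.
    """
    actions = []
    if m.indexer_write_rejections > 0:
        actions.append("scale indexer")
    if m.analysisd_queue_usage >= queue_limit or m.agents_per_worker >= agent_limit:
        actions.append("scale manager")
    return actions


print(scaling_actions(Metrics(12, 0.35, 900)))   # ['scale indexer']
print(scaling_actions(Metrics(0, 0.82, 1_800)))  # ['scale manager']
```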
Planning the path in advance means each step is a routine change rather than a re-architecture. The agents already point at a load-balancer address, the indexer is already a cluster that accepts new nodes, and the manager is already configured for clustering. Codesecure designs Wazuh deployments with this growth path built in, so clients scale by adding nodes to an architecture that anticipated growth rather than rebuilding under pressure.
Frequently Asked Questions
How many agents can one Wazuh manager handle?
A single well-sized manager node can comfortably handle agents into the low thousands, depending on each agent's event rate and the weight of the ruleset. Beyond that, a manager cluster with worker nodes distributes the load and scales to tens of thousands of agents. The real limit is event throughput per node, not agent count alone.
What is the difference between a Wazuh master and worker node?
The master node is authoritative for configuration: rules, decoders, agent registration and shared settings. Worker nodes pull that configuration and do the heavy lifting of accepting agent connections and running the analysis pipeline. Adding workers scales analysis throughput; the master stays small because it handles synchronisation, not agent load.
How does load balancing work in a Wazuh cluster?
A load balancer sits in front of the worker nodes, and agents connect to the balancer address rather than a specific worker. It spreads agent connections evenly and reroutes agents from a failed worker to surviving ones automatically. This provides both even distribution and resilience, keeping collection running through a node outage.
Do the Wazuh manager and indexer scale together?
No, and that is the key to good planning. The manager scales with agent count and analysis load; the indexer scales with event volume and retention. They are independent workloads, so you might add indexer data nodes while the manager is still a single server, or add manager workers while the indexer is comfortable. Size each dimension against its own demand.
How do I plan Wazuh storage for growth?
Storage is driven by EPS and retention. Multiply average EPS by the average indexed event size and the retention period to estimate primary index size, then check the result against peak EPS so a sustained burst does not exhaust the margin. Add replica overhead and size data nodes accordingly. Use index lifecycle management to move older data to cheaper tiers. Review the model on a cadence because retention requirements and event volume both drift over time.
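As a rough worked example of that estimate, the sketch below multiplies event rate, event size and retention, then applies replica overhead and headroom. The average indexed event size (`avg_event_bytes`) is an input you would need to measure on your own indices. All figures here are hypothetical, and the calculation ignores compression and index overhead.

```python
SECONDS_PER_DAY = 86_400


def storage_estimate_gb(avg_eps: float, avg_event_bytes: float, retention_days: int,
                        replicas: int = 1, headroom: float = 0.2) -> float:
    """Estimate total indexer storage in GB (decimal) for a retention window."""
    primary_bytes = avg_eps * avg_event_bytes * SECONDS_PER_DAY * retention_days
    total_bytes = primary_bytes * (1 + replicas) * (1 + headroom)
    return total_bytes / 1e9


# Hypothetical: 2,000 EPS average, 1,000 bytes per indexed event,
# 90 days of retention, one replica, 20% headroom.
print(f"{storage_estimate_gb(2_000, 1_000, 90):,.0f} GB")  # 37,325 GB
```

Running the same function with peak EPS gives an upper bound, which helps judge whether the headroom figure is realistic.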
Can Codesecure design a scalable Wazuh architecture?
Yes. Codesecure designs Wazuh with a growth path built in: a clustered indexer that accepts new data nodes, a manager configured for worker clustering, and agents pointed at a load-balancer address. We maintain a living capacity model tracking agent growth, EPS and storage so scaling is a planned project rather than an emergency.
Plan Wazuh Growth Before You Hit The Wall
Codesecure designs Wazuh architectures that scale by adding nodes, not by rebuilding: clustered managers, multi-node indexers and a living capacity model. ISO/IEC 27001:2022 certified delivery, named SOC engineers, fixed monthly retainer.

