Key Takeaways
- Wazuh stores data in two places: searchable indices in the Wazuh indexer (an OpenSearch fork) and optional compressed archives on the manager for long-term cold storage.
- Retention is compliance-driven. PCI DSS requires at least one year of log history with 90 days immediately available; ISO 27001 expects a documented, risk-based retention period.
- Use index lifecycle tiers: hot for recent searchable data, warm for older data on cheaper storage, and cold or archived for compliance-only history.
- Size the indexer realistically. Account for the JSON overhead and replica copies, which typically make stored data larger than the raw log volume.
- Separate alerting from archiving. Keep a short, fast searchable window for active investigation and a long, cheap archive for audit and forensics.
How Wazuh Stores Data
Wazuh keeps data in two distinct places, and understanding the difference is the foundation of any retention strategy. The first is the Wazuh indexer, a fork of OpenSearch, which holds alerts and events as searchable JSON documents. This is what the Wazuh dashboard queries when you investigate. The second is the manager's archives: flat compressed files on disk that, when archive logging is enabled, store every received event regardless of whether it triggered an alert.
By default the indexer stores alerts (events that matched a rule at or above a configured level) in daily indices named with the wazuh-alerts pattern. Optionally, the manager can also forward all events (whether or not they alerted) into wazuh-archives indices, though this multiplies index size and is usually reserved for environments with a strict full-fidelity requirement.
On the manager itself, the archives directory (logs/archives) can retain a compressed copy of raw events organised by date. These archives are not searchable through the dashboard without re-ingestion, but they are cheap to keep and invaluable for after-the-fact investigation or compliance evidence.
The practical model is therefore a fast, expensive searchable layer (the indexer) and a slow, cheap durable layer (the manager archives). Retention strategy is about deciding how much data lives in each, and for how long.
Retention Periods: What Compliance Requires
Retention length should be set by your regulatory and contractual obligations, not picked arbitrarily. PCI DSS requires audit log history to be retained for at least one year, with a minimum of three months (90 days) immediately available for analysis. That maps cleanly onto a tiered design: 90 days searchable, the remainder archived.
ISO/IEC 27001:2022 does not prescribe a fixed number. Annex A control 8.15 on logging expects logs to be produced, protected and retained, and control 5.33 covers protection of records. The standard expects you to define a retention period based on a documented risk and legal assessment, then apply it consistently. Auditors look for the documented decision and evidence that it is enforced.
Other regimes add their own expectations. Sector regulators and national directions may require longer windows for specific data, and incident investigations sometimes need history going back many months. A safe default for many businesses is 90 days hot and searchable plus 12 months or more in cold archive, extended where a specific obligation demands it.
Hot, Warm and Cold Tiers
A tiered storage model balances search speed against cost. Hot data is recent and frequently queried; it lives on fast storage (ideally SSD) so investigations and dashboards respond quickly. Warm data is older, queried occasionally, and can sit on slower or higher-density storage with fewer resources allocated. Cold or archived data is rarely touched and exists mainly to satisfy retention and forensic requirements, so it lives on the cheapest durable storage available.
The Wazuh indexer, being OpenSearch-based, supports Index State Management (ISM) policies that automate this lifecycle. An ISM policy can roll indices from hot to warm after a set age, reduce replica counts, force-merge segments to save space, and finally delete or move indices once they pass the retention threshold. Configuring ISM is what turns a retention policy on paper into one that actually enforces itself.
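As a rough illustration of the lifecycle described above, the sketch below builds an ISM policy body as a Python dictionary. The helper name `build_retention_policy`, the 30-day and 90-day ages, and the choice to drop to zero replicas in the warm state are illustrative assumptions, not recommendations from Wazuh or OpenSearch. Moving indices onto dedicated warm nodes would additionally need an allocation action tied to your own node attributes, which is left out here.

```python
import json


def build_retention_policy(warm_after="30d", delete_after="90d",
                           warm_replicas=0, index_pattern="wazuh-alerts-*"):
    """Return an ISM policy body as a plain dict (hypothetical helper).

    Ages and replica counts are placeholders; set them from your own
    documented retention decision.
    """
    return {
        "policy": {
            "description": "Hot to warm, then delete after the searchable window",
            "default_state": "hot",
            "states": [
                {
                    # Recent indices: no actions, just wait for the age threshold.
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {"state_name": "warm",
                         "conditions": {"min_index_age": warm_after}}
                    ],
                },
                {
                    # Older indices: fewer replicas and merged segments to save space.
                    "name": "warm",
                    "actions": [
                        {"replica_count": {"number_of_replicas": warm_replicas}},
                        {"force_merge": {"max_num_segments": 1}},
                    ],
                    "transitions": [
                        {"state_name": "delete",
                         "conditions": {"min_index_age": delete_after}}
                    ],
                },
                {
                    # Past the searchable window: remove the index.
                    "name": "delete",
                    "actions": [{"delete": {}}],
                    "transitions": [],
                },
            ],
            # Attach the policy automatically to newly created matching indices.
            "ism_template": {"index_patterns": [index_pattern], "priority": 100},
        }
    }


policy = build_retention_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON is the kind of body the indexer's ISM API accepts at `_plugins/_ism/policies/<policy_id>`. Confirm that the cold archive for a given day exists and is verified before the delete state is allowed to remove its index.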
For genuinely cold, compliance-only data, the manager archives on inexpensive object or block storage are usually more economical than keeping old indices online. The trade-off is that archived data must be re-ingested or grep-searched rather than queried instantly, which is acceptable for data you expect to touch only during an audit or investigation.
Index Sizing and Storage Overhead
Estimating storage requires more than counting raw log bytes. When Wazuh indexes an event, it stores the original message plus extracted fields, decoder and rule metadata, and indexing structures, all as JSON. This typically makes an indexed document larger than the raw log line. On top of that, the indexer keeps replica shards for resilience, so one replica doubles the on-disk footprint of searchable data.
A defensible sizing method is to measure rather than guess. Ingest a representative day of real traffic, then read the actual index size from the dashboard or the indexer's cat APIs. Divide by the raw volume to get your environment's real expansion factor, then project across your retention window and replica count.
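The measurement step can be sketched in a few lines of Python. This assumes you have already fetched the output of the indexer's `_cat/indices` API as JSON with sizes in bytes (for example with `format=json&bytes=b`), and that you know the raw log volume for the same day. The function name and the sample figures are made up for illustration.

```python
def expansion_factor(cat_rows, raw_bytes):
    """Primary-shard index bytes divided by raw log bytes for the same period.

    cat_rows: list of dicts shaped like _cat/indices JSON output, where
    "pri.store.size" is the primary-shard size in bytes (as a string).
    """
    if raw_bytes <= 0:
        raise ValueError("raw_bytes must be positive")
    indexed_bytes = sum(int(row["pri.store.size"]) for row in cat_rows)
    return indexed_bytes / raw_bytes


# Made-up sample row for one daily index; real values come from your cluster.
sample_rows = [
    {
        "index": "wazuh-alerts-4.x-2025.01.15",
        "pri.store.size": "27000000000",  # primaries only
        "store.size": "54000000000",      # primaries plus replicas
    },
]
raw_day_bytes = 20_000_000_000  # raw logs received that day (assumed figure)

print(round(expansion_factor(sample_rows, raw_day_bytes), 2))
```

Using the primary-only size keeps the expansion factor independent of the replica count, so replicas can be applied as a separate multiplier when projecting.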
Worked example: if you receive 20 GB of raw logs per day, the searchable index might land around 25 to 30 GB per day after overhead. With one replica that is 50 to 60 GB per day on disk. Holding 90 days hot would need roughly 4.5 to 5.4 TB (90 days at 50 to 60 GB per day) before headroom. The compressed archive for the same period is a fraction of that, because it carries no index structures or replica copies.
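The arithmetic in the worked example can be reproduced with a small helper. The expansion factors 1.25 and 1.5 are simply 25/20 and 30/20 from the figures above; the function name is hypothetical and the result uses decimal units (1 TB = 1000 GB).

```python
def hot_storage_gb(raw_gb_per_day, expansion, replicas, retention_days):
    """Projected on-disk size of the searchable tier, before headroom."""
    copies = 1 + replicas  # one primary plus its replica copies
    return raw_gb_per_day * expansion * copies * retention_days


low = hot_storage_gb(raw_gb_per_day=20, expansion=1.25, replicas=1, retention_days=90)
high = hot_storage_gb(raw_gb_per_day=20, expansion=1.5, replicas=1, retention_days=90)

print(f"{low / 1000:.1f} to {high / 1000:.1f} TB")  # prints "4.5 to 5.4 TB"
```

Swapping in a measured expansion factor and your own retention window gives a projection specific to your environment. Leave headroom on top of the result for growth and for indexer operations such as segment merges.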
Archiving and Cold Storage
Archiving is how you meet long retention obligations without paying to keep everything searchable. When the manager is configured to log archives, it writes a compressed, date-organised copy of received events to disk. These files can then be shipped to cheap durable storage such as an object store, an NFS volume or tape, on a schedule.
Two operational details matter for archives. First, integrity: logs used as audit evidence should be protected against tampering, so apply write-once or restricted permissions and consider hashing or signing archive batches. ISO 27001 control 8.15 specifically expects logs to be protected from unauthorised alteration. Second, retrievability: document how archived data is restored and searched, because an archive you cannot practically query is not real retention.
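One way to approach the integrity point is to record a SHA-256 hash for each archive file when it is shipped and re-check the hashes later. The sketch below uses only the Python standard library; the function names and the `*.gz` pattern are assumptions, and it illustrates the hashing idea only, not a complete tamper-evidence scheme. The manifest itself would need to be stored somewhere the archive writer cannot modify.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(archive_dir, pattern="*.gz"):
    """Map each matching file (path relative to archive_dir) to its SHA-256."""
    root = Path(archive_dir)
    return {
        str(path.relative_to(root)): sha256_file(path)
        for path in sorted(root.rglob(pattern))
    }


def verify_manifest(archive_dir, manifest):
    """Return relative paths that are missing or whose hash no longer matches."""
    root = Path(archive_dir)
    failures = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or sha256_file(target) != expected:
            failures.append(relative_path)
    return failures
```

A scheduled job could call `build_manifest` on each day's batch before shipping it, and an audit check could call `verify_manifest` against the stored copy. An empty list from `verify_manifest` means every recorded file is present and unchanged.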
When a regulator, auditor or investigator asks for history beyond the hot window, the workflow is to pull the relevant archive files, re-ingest them into a temporary index or search them directly, and produce the evidence. Testing that workflow before you need it is the difference between confident compliance and a scramble during an audit.
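For the "search them directly" path, a minimal sketch is shown below. It assumes the archive files are gzip-compressed with one JSON event per line, which depends on how archive logging is configured in your deployment. The function name and the field names used in the usage note are illustrative assumptions.

```python
import gzip
import json


def search_archive(path, predicate):
    """Yield events from a gzip-compressed, one-JSON-object-per-line archive.

    predicate: a function that takes the parsed event (a dict) and returns
    True for events that should be kept.
    """
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip truncated or non-JSON lines rather than abort
            if predicate(event):
                yield event
```

For example, `search_archive(archive_path, lambda e: e.get("agent", {}).get("name") == "web-01")` would stream events for a single agent from one archive file, where `archive_path` and the agent name `web-01` are placeholders. Timing a search like this across a month of archives is a simple way to test whether retrieval is practical before an audit.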
Putting the Strategy Together
A complete Wazuh retention strategy combines four decisions:
- What to index: alerts always, plus full events only where full fidelity is required.
- How long to keep it hot: typically 90 days searchable, to satisfy the PCI DSS immediate-availability requirement and support active investigation.
- How long to retain cold: 12 months or more in compressed archive, extended where a specific obligation demands it.
- How to enforce it: ISM policies for the indexer and a scheduled archive-shipping job for cold storage.
Document the policy explicitly, including the compliance basis for each number, and review it as data volume grows or obligations change. Auditors want to see a deliberate, documented retention decision, not a default left untouched. A clear strategy also controls cost, because storage is usually the single largest recurring expense of running a SIEM at scale.
Retention sits downstream of log collection and feeds into architecture decisions about cluster size and node count. Get collection and sizing right first; retention and storage planning then follow naturally from the per-day volume those produce.
Frequently Asked Questions
How long should I retain logs in Wazuh?
Set retention by your obligations. PCI DSS requires at least one year of audit log history with 90 days immediately available. ISO 27001 expects a documented, risk-based retention period rather than a fixed number. A common baseline is 90 days searchable in the indexer plus 12 months or more in compressed cold archive.
What is the difference between Wazuh alerts and archives?
Alerts are events that matched a rule at or above a configured level; they are indexed and searchable in the dashboard. Archives are compressed copies of all received events stored on the manager regardless of whether they alerted. Archives are cheap long-term storage but are not searchable without re-ingestion.
How do I implement hot, warm and cold tiers in Wazuh?
The Wazuh indexer is OpenSearch-based and supports Index State Management (ISM) policies. An ISM policy can transition indices from hot to warm storage after a set age, reduce replicas, force-merge to save space, and delete or move indices once they pass retention. Cold compliance data is best kept as compressed manager archives on cheap storage.
How much disk does Wazuh need for logs?
More than the raw log volume. Indexed documents include extracted fields and metadata as JSON, and replica shards duplicate searchable data. Expansion of two to three times raw volume after overhead and one replica is common. Measure a representative day, read the real index size from the dashboard, then project across your retention window.
Does Wazuh meet PCI DSS log retention requirements?
Wazuh can be configured to meet them. Keep 90 days of searchable data in the indexer for immediate availability and retain at least one year total using compressed manager archives. Protect archived logs against tampering with restricted permissions and integrity checks, which also supports ISO 27001 control 8.15.
Can Codesecure design our Wazuh retention and storage plan?
Yes. Codesecure designs Wazuh storage architecture including index sizing, ISM lifecycle policies, hot, warm and cold tiers and compliance-aligned archiving for PCI DSS and ISO 27001. ISO/IEC 27001:2022 certified delivery with named OSCP and CISSP consultants across India, Singapore, UAE and Malaysia.
Retain What Compliance Needs, Pay Only For What You Use
Codesecure designs Wazuh retention and storage strategies that satisfy PCI DSS and ISO 27001 while controlling the single largest cost of running a SIEM. Index sizing, ISM lifecycle tiers and tamper-resistant archiving, delivered by named OSCP and CISSP consultants.

