Wazuh Decoders: Log Parsing and Normalization

Before any rule can fire, Wazuh has to understand the log. Decoders are the parsing layer that turns raw, vendor-specific log lines into structured, normalised fields the rules engine can reason about. Get decoding right and detection becomes precise; get it wrong and rules silently fail. Here is how Wazuh decoders work.

Published 26 June 2026 · 11 min read · Codesecure SOC Engineering · SOC

Key Takeaways

  • Decoders parse raw logs first, before rules. They extract structured fields (IPs, users, ports, actions) from vendor-specific log lines so rules can evaluate clean data.
  • Wazuh decoders use regex, either Wazuh's own lightweight OS_Regex syntax (the default) or PCRE2 (selected with type="pcre2"), plus prematch patterns and offsets to locate and capture fields.
  • Normalisation maps many vendors to common fields, so a username from a firewall, a server and an application all populate the same field name for consistent detection and reporting.
  • Parent and child decoders work together: a parent identifies the log source (prematch), and children extract the specific fields from that source.
  • Custom decoders live in local_decoder.xml and are tested with wazuh-logtest, which shows exactly which fields a decoder extracts from a sample log.

What Decoders Do

A decoder is the first thing Wazuh applies to a raw log line. Its job is parsing: taking an unstructured or semi-structured log message and extracting the meaningful pieces (source IP, username, action, port, status) into named fields. Only after a decoder has produced these fields does the rules engine run. Rules can still pattern-match the message text, but any condition on a field such as srcip or user depends entirely on what the decoder extracted, so decoding is the foundation that precise detection sits on.

This separation is deliberate and powerful. It means a single set of rules can work across many vendors, because each vendor's quirky log format is normalised by its own decoder into the same field names. A rule that alerts on a failed login by a privileged user does not need to know whether the log came from Linux sshd, a Windows event, a firewall or a custom application, as long as a decoder mapped each one into the shared username and status fields.

When a decoder fails to extract the fields a rule expects, the rule silently does not fire. This is the most common and most frustrating cause of missing detections in Wazuh, and it is why understanding decoders is as important as understanding rules.
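To make that dependency concrete, here is a minimal sketch of a custom rule. The decoder name acme-auth, the rule ID and the privileged account names are all hypothetical; the point is only that both conditions test decoded fields, so the rule can match only when the decoder has populated user and status.

```xml
<group name="local,authentication_failed,">
  <!-- Hypothetical rule: every condition below reads a decoded field. -->
  <rule id="100100" level="8">
    <!-- Only consider events claimed by the (hypothetical) acme-auth decoder. -->
    <decoded_as>acme-auth</decoded_as>
    <!-- Needs the decoder to have extracted a user field. -->
    <user>^root$|^admin$</user>
    <!-- Needs the decoder to have extracted a status field. -->
    <status>^failed$</status>
    <description>Failed login by a privileged user (acme-auth)</description>
  </rule>
</group>
```

If the decoder never extracts status, a rule like this still loads cleanly but never matches, and nothing in the alert stream tells you why.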

How Decoders Match and Extract

Wazuh decoders work in two conceptual steps: identification and extraction. Identification uses a prematch, a pattern that recognises which kind of log this is. If the prematch succeeds, the decoder applies, and Wazuh stops searching for other parent decoders. This is how Wazuh quickly routes a log to the right parser among thousands of possibilities.

Extraction then pulls the fields out. A decoder uses a regex with capture groups (Wazuh's native OS_Regex syntax by default, or PCRE2 when the element is marked type="pcre2") and an order element that names each captured group in sequence, mapping the first capture to (say) srcip, the second to srcuser, and so on. The result is a set of named fields attached to the event. Offsets (after_parent, after_prematch, after_regex) let a regex start where an earlier match ended instead of rescanning the whole line, which keeps extraction cheap for predictable formats.
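A minimal sketch of the two steps, assuming a hypothetical daemon called acme-authd that logs through syslog. The decoder name and the sample line are illustrative and not taken from the Wazuh ruleset. It uses the default OS_Regex syntax and the after_prematch offset, so the regex starts reading where the prematch stopped.

```xml
<!-- Hypothetical log line:
     Jun 26 10:15:02 host01 acme-authd[4211]: Accepted password for alice from 10.0.0.5 port 50522
     Pre-decoding strips the syslog header, so the decoder sees the text from "Accepted" onward. -->
<decoder name="acme-auth-accepted">
  <!-- Identification: the syslog program name plus a constant prefix. -->
  <program_name>^acme-authd$</program_name>
  <prematch>^Accepted</prematch>

  <!-- Extraction: three capture groups, read from the end of the prematch. -->
  <regex offset="after_prematch">^ \S+ for (\S+) from (\S+) port (\d+)</regex>

  <!-- Naming: captures are assigned to these fields in sequence. -->
  <order>user, srcip, srcport</order>
</decoder>
```

The three captures are assigned, in order, to user, srcip and srcport. Swapping two names in the order element would silently put the right values in the wrong fields, which is why the mapping deserves as much attention as the pattern.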

Performance matters because decoders run on every single log line. Wazuh's design (a prematch to identify quickly, then regex and offsets to extract efficiently) is built so the parsing layer keeps up with high event rates. Well-written decoders are specific (so the prematch is cheap and unambiguous) and extract only the fields detection actually needs.

Parent and Child Decoders

Many real log formats are best handled by a pair of decoders. A parent decoder identifies the log source with a prematch (for example, recognising that a line came from a particular firewall) and may extract a few common fields. One or more child decoders, linked to the parent, then handle the variations: different message types from the same source, each with its own field layout.

This structure keeps decoders maintainable. Rather than one enormous regex trying to cover every message a device emits, you have a parent that says 'this is from device X' and children that each parse one message format cleanly. Wazuh ships exactly this pattern for many products, and custom decoders should follow it for any source that produces more than one message shape.
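As a sketch of that layout, assume a hypothetical appliance that logs under the syslog program name acmefw and emits two message shapes. All names and sample lines here are invented for illustration. The parent only identifies the source; each child claims one message type with its own prematch and then extracts that type's fields with a PCRE2 regex.

```xml
<!-- Hypothetical messages, as seen after the syslog header is stripped:
     CONN action=deny src=10.0.0.5 dst=192.168.1.20 dport=22
     LOGIN user=alice src=10.0.0.5 result=failed -->

<!-- Parent: "this is from acmefw". -->
<decoder name="acmefw">
  <program_name>^acmefw$</program_name>
</decoder>

<!-- Child 1: connection events. -->
<decoder name="acmefw-conn">
  <parent>acmefw</parent>
  <prematch>^CONN</prematch>
  <regex type="pcre2">action=(\w+) src=(\S+) dst=(\S+) dport=(\d+)</regex>
  <order>action, srcip, dstip, dstport</order>
</decoder>

<!-- Child 2: login events, with a different field layout. -->
<decoder name="acmefw-login">
  <parent>acmefw</parent>
  <prematch>^LOGIN</prematch>
  <regex type="pcre2">user=(\S+) src=(\S+) result=(\w+)</regex>
  <order>user, srcip, status</order>
</decoder>
```

Giving each child its own prematch is a deliberately cautious choice in this sketch, so that selecting between children does not depend on which regex happens to be tried first. Run one sample of each message type through wazuh-logtest to confirm the intended child is the one that matches.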

The parent-child relationship mirrors the way rules use parent-child relationships, and the two work together: a decoder family normalises a source's many message types into consistent fields, and a rule family then makes broad and specific detections on those fields. Designing the decoder structure first makes the rules that follow much simpler.

Multi-Vendor Normalization

The real payoff of decoding is normalisation: making heterogeneous logs speak a common language. Different vendors describe the same concept in completely different text: one firewall writes 'src=10.0.0.5', a web server writes the client IP in a positional access-log field, and an application writes it inside a JSON object. Decoders extract each into the same canonical field (such as srcip), so downstream everything is consistent.

Normalisation is what makes cross-source detection, dashboards and compliance reporting possible. When every source's username lands in the same field, you can write one rule, build one dashboard and run one report that spans firewalls, servers, cloud platforms and your own applications. Without normalisation, you would need vendor-specific logic everywhere, which does not scale.
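A small sketch of what that looks like in practice, using two invented sources (a firewall logging as edgefw and an application logging as billing-app). Their message formats have nothing in common, yet both decoders name their captures action, user and srcip, which is all normalisation means at the decoder level.

```xml
<!-- Source 1 (hypothetical firewall):  DENY src=10.0.0.5 user=alice -->
<decoder name="edgefw">
  <program_name>^edgefw$</program_name>
</decoder>

<decoder name="edgefw-fields">
  <parent>edgefw</parent>
  <regex type="pcre2">(\w+) src=(\S+) user=(\S+)</regex>
  <order>action, srcip, user</order>
</decoder>

<!-- Source 2 (hypothetical application):  login denied for alice (client 10.0.0.5) -->
<decoder name="billing-app">
  <program_name>^billing-app$</program_name>
</decoder>

<decoder name="billing-app-fields">
  <parent>billing-app</parent>
  <regex type="pcre2">login (\w+) for (\S+) \(client ([\d.]+)\)</regex>
  <order>action, user, srcip</order>
</decoder>
```

Note that this aligns field names, not field values: the firewall still reports DENY and the application denied. A rule that spans both sources has to accept both spellings, or the sources have to be configured to agree.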

Wazuh's built-in decoders already normalise a wide range of common products. The work of a detection engineer is to extend that coverage to the sources Wazuh does not ship decoders for (your in-house applications, niche appliances and bespoke log formats), mapping their fields onto the same canonical names so they participate in the same detections as everything else.

Writing Custom Decoders

When Wazuh has no decoder for a log source, you write one. Custom decoders go in local_decoder.xml (under /var/ossec/etc/decoders/ on a default install, just as custom rules go in local_rules.xml under /var/ossec/etc/rules/) so they survive product upgrades. The approach is the same as the built-in decoders: define a parent decoder that uniquely identifies the source, by prematch or by syslog program name, then child decoders that extract fields using a regex (PCRE2 if you prefer its syntax) with an order element naming each capture.

The most reliable way to build a custom decoder is sample-driven. Collect several real log lines from the source, including the variants you care about, and identify the stable structure and the fields you need. Write the prematch against a constant part of the message that always appears, then write an extraction regex with one capture group per field and list the field names, in the same sequence, in the order element. Map those captures onto canonical field names so the new source normalises into the same vocabulary as the rest of your data.
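The workflow above might produce something like the following for local_decoder.xml. It is a sketch modelled on the kind of example used in Wazuh's custom decoder documentation; the program name example, the sample lines and the decoder names are placeholders to replace with your own source.

```xml
<!-- Step 1: sample lines collected from the source (hypothetical):
     Dec 25 20:45:02 MyHost example[12345]: User 'admin' logged from '192.168.1.100'
     Dec 25 20:47:11 MyHost example[12345]: User 'alice' logged from '10.0.0.5'

     Step 2: stable structure and fields needed:
     program name "example", constant text "User" and "logged from",
     fields wanted are the username and the source IP. -->

<!-- Step 3: parent identifies the source. -->
<decoder name="example">
  <program_name>^example$</program_name>
</decoder>

<!-- Step 4: child matches the constant prefix, then captures one group per field. -->
<decoder name="example-login">
  <parent>example</parent>
  <prematch>^User</prematch>
  <regex type="pcre2">User '([^']+)' logged from '(\d+\.\d+\.\d+\.\d+)'</regex>
  <!-- Step 5: canonical names, in the same sequence as the capture groups. -->
  <order>user, srcip</order>
</decoder>
```

Keeping the sample lines in a comment next to the decoder is a cheap way to document what the pattern was written against, and gives you ready-made input for wazuh-logtest whenever the decoder changes.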

Custom decoding is iterative and benefits from the same discipline as rule development: keep a library of sample logs, version-control local_decoder.xml, and write specific prematches so your decoder does not accidentally claim logs from other sources.

Testing and Validating Decoders

As with rules, the tool that makes decoder development safe is wazuh-logtest. Run it on the manager, feed it a sample log line, and it reports which decoder matched and, crucially, exactly which fields were extracted and what values they hold. This immediate feedback is how you confirm a decoder is pulling out the srcip, srcuser and action you expect, rather than missing them silently.

A robust workflow ties decoders and rules together through testing. First confirm the decoder extracts the right fields with wazuh-logtest, then confirm the rule fires on those fields. Because a field-based rule depends entirely on the decoder's output, testing both in sequence catches the single most common failure: a perfectly good rule that never fires because the field it needs was never extracted.

Validating decoders against a range of sample logs, including malformed and edge-case lines, hardens your parsing so it does not break when real-world logs deviate from the happy path. Decoders are the quiet foundation of a Wazuh SOC: invisible when they work, and the root cause of missed detections when they do not. Investing in correct, well-tested decoders pays off across every rule, dashboard and report that depends on them.

Frequently Asked Questions

What is the difference between a Wazuh decoder and a rule?

A decoder parses a raw log line and extracts structured fields such as source IP, username and action. A rule then evaluates those extracted fields and generates an alert when its conditions match. Decoders run first and produce the structured data; rules run second and make detection decisions on it. A rule can also pattern-match the message text, but its field-based conditions only ever see what a decoder extracted.

Why is my Wazuh rule not firing?

The most common cause is the decoder, not the rule. If the decoder did not extract the field the rule depends on, the rule silently never matches. Test the log line with wazuh-logtest to see which decoder matched and which fields were extracted; if the expected field is missing, fix the decoder before touching the rule.

What is a prematch in a Wazuh decoder?

A prematch is a pattern that identifies which kind of log a line is, so Wazuh can route it to the correct decoder. If the prematch succeeds, the decoder applies and field extraction proceeds. Prematches make decoding efficient by quickly selecting the right parser among thousands of possibilities.

How does Wazuh normalise logs from different vendors?

Each vendor's decoder extracts the same concept into the same canonical field name. A source IP written as src= by a firewall, positionally by a web server and inside JSON by an application all map to the srcip field. This lets one rule, dashboard or report span every source instead of needing vendor-specific logic everywhere.

How do I write a custom decoder in Wazuh?

Put custom decoders in local_decoder.xml so they survive upgrades. Collect real sample logs, write a parent decoder that uniquely identifies the source, then child decoders that extract fields with a regex (PCRE2 is available via type="pcre2") and an order element naming each capture. Map captures onto canonical field names and validate with wazuh-logtest.

Can Codesecure build custom Wazuh decoders for our applications?

Yes. Codesecure writes and tests custom Wazuh decoders for in-house applications, niche appliances and bespoke log formats, normalising them onto canonical fields so they join your existing detections, dashboards and compliance reports. ISO/IEC 27001:2022 certified delivery with named OSCP and CISSP consultants across India, Singapore, UAE and Malaysia.

Codesecure SOC Engineering

OSCP / CEH / CISSP Certified SOC Engineers

Codesecure Solutions is ISO/IEC 27001:2022 certified and builds open-source SOC and SIEM platforms on Wazuh, TheHive, Cortex, MISP and n8n for businesses across India, Singapore, UAE and Malaysia. Named OSCP, CEH and CISSP consultants. We design log pipelines, write custom decoders and rules, and run 24x7 detection engineering programmes.
