Why YARA Rules are Better Than Hashes in Malware Detection

When conducting threat intelligence or threat hunting, one of the most vital requirements is IOCs or IOAs.
These little fragments serve as cues that guide the analyst toward a better malware investigation. However, when malware is a digital file, or better, broken down by its name, MALicious softWARE, a form of digital file. This is why one of the most effective ways to detect them is to use file-based indicators, such as malicious file hashes and malware signatures.
However, their potency levels differ, and this difference is what makes one more valuable for long-term malware threat detection than another.
What is a file hash:
A file hash is an irreversible, unique string of characters that is used to identify a file on digital systems.
Since a file hash is unique, it ensures that every file, regardless of which digital system it is in, or whatever place it has been, can still be identified by its hash, EXCEPT a slight change occurs in the file.
This means that if as many as a character is appended to the file, the file's hash automatically changes to an entirely unique value.
Unlike encryption, a file hash is one-way and can’t be decrypted using a decryption key.
These cryptographic means of identifying file uniqueness and security can be classified into different algorithms based on functions and generations. Here are some of the popular ones:
Message Digest 5 (MD5)
SHA1: No longer relevant due to insecurity.
SHA2: The popular industry standard. Under it are three hashing methods:
SHA-256: Uses a 256-bit hash value for its cryptography.
SHA-384: Uses a 384-bit hash value for its cryptography.
SHA-512: Uses a 512-bit hash value for its cryptography, and since it uses more bits, there is more security and way less possibility of an integrity collision as often seen in MD5 hashes.
SHA3: A new generation of file hash that, in the future, will likely replace SHA2, with more robust security.
What is a YARA rule
A YARA rule is used to match files or malware using patterns, behaviors and signatures, unlike a file hash, it is more robust as it traps more malware.
When it comes to malware detection, a YARA rule combines many IOCs into a single rule that captures many similar variants.
A side-by-side comparison of how YARA rules and file hashes work
Suppose we have malware like the Mirai malware; its file hash in SHA-256 is a2e9d9436e5a2e4bb41927cfd77fc52dea1230e7dee0b688a03e5948abe783fd, but this isn’t the only variant of the same Mirai malware.
Here is another hash of the Mirai malware:1ba51b481413f35cda4b682af825344686c521fb4da6709a13b85ff2ac6b3637
Checking both file hashes on VirusTotal will indicate that they are of the Mirai family, as seen below:

This second image:

Now, suppose I have the hash of just one variant and integrate it into my EDR/SIEM. What happens then when another variant comes knocking? It gets in!
Threat actors know this, and this is why they are fond of sometimes modifying the binaries of their malware because the slightest change in a malware’s binary creates a new variant with a new hash.
This is where YARA rules come into the picture. They don’t just capture the file hash; they capture the malware's behavioral patterns. This is because, just as a family of people has common traits, a malware family, regardless of the specific variant, tends to exhibit similar behavior to other variants in that family.
The image below is a YARA rule for the malware in the image above:

This rule is from a 1200-line YARA ruleset as seen on GitHub. This means that if I, as an analyst, want broader control over different variants of the Mirai malware, I can integrate the entire 1200-rule YARA ruleset into my EDR, and when a variant is captured, the specific rule referencing it flags it.
Reference here for the complete open source package from Elastic Security on GitHub.
Advantages and Disadvantages of YARA Rules vs. File Hashes
Both file hashes and YARA rules serve as file-based indicators for threat detection, but their utility and longevity vary significantly due to their respective strengths and weaknesses.File


Ending notes
In summary, while file hashes offer a fast and simple method for the definitive identification and integrity checking of a single known malicious file, their extreme fragility and short shelf life make them a poor choice for long-term threat detection. The ease with which threat actors can slightly modify a binary to generate a new, undetected hash.
YARA rules, on the other hand, represent a more robust and enduring solution. By capturing shared patterns, behaviors, and unique strings across an entire malware family, a single YARA ruleset can effectively detect hundreds of related variants. This capability—resilience to minor changes and the provision of deeper, contextual intelligence—solidifies YARA rules as the more potent and valuable tool for modern, proactive threat intelligence and threat hunting. They are, ultimately, an investment in detecting the malware itself, not just a fleeting instance of it.
