YARA Hunting for Code Reuse: DoppelPaymer Ransomware & Dridex Families

Vitali Kremez explaining how to hunt malware families such as DoppelPaymer, BitPaymer & Dridex loader using YARA rules.

Whenever we discuss how to proactively hunt for malware of interest, whether it be crimeware or APT for threat intelligence purposes, YARA is a true Swiss Army knife that makes the work of malware researchers and threat intelligence analysts that much easier.

Malware developers work just like legitimate software developers, aiming to automate their work and reduce the time wasted on repetitive tasks wherever possible. That means they create and reuse code across their malware. This has a pay-off for malware hunters: we can learn how to create search rules to detect this kind of code reuse, reducing our workload, too! In this post, we will learn how to write YARA rules for the following three crimeware variants belonging to the Dridex family:

1 – BitPaymer ransomware (known as “wp_encrypt”) part of the Everis extortion case
2 – DoppelPaymer ransomware leveraged in the PEMEX lockdown
3 – Dridex Loader (known as “ldr”) botnet ID “23005”

In a nutshell, our goal is to hunt for code reused by the malware developers, using YARA rules built on that code rather than rules covering easily changeable strings.

One of the primary original purposes of YARA was to classify and identify malware samples. Its rule syntax is rather simple, resembles the C language, and is used to describe various patterns.

At the time of writing, the latest YARA version is 3.11.0. YARA is a signature-based tool with a command-line interface and bindings for various programming languages. In other words, it is similar to static anti-virus signatures used to detect malicious files.

The major functionality of YARA is to scan files, folders, and buffers for patterns. Many tools rely on YARA, yarashop being one example.

For our purposes, the most common uses of YARA are to scan, categorize, and identify malware samples of interest based on code and string reuse.

The typical YARA syntax example is as follows:

import <module>

<rule type> rule <rule name> : <tags>
{
    meta:
        <name> = "<value>"
        ...
    strings:
        $<string name> = <value> <modifiers>
        ...
    condition:
        <some condition>
}

Let’s practice writing YARA rules for Zero2Hero:

/*

We practice writing YARA rules for Zero2Hero

*/

import "pe"

rule zero2hero_course : best 
{
    meta:
        // Comment
        description = "This is an example rule to demonstrate typical syntax"
        reference = "https://www.sentinelone.com/lp/zero2hero"
        author = "@VK_Intel"
        tlp = "white"

    strings:
        $hero = "helloworld" xor wide
        $unique_function = { ?? ?? 8b fa 8b ?? 8b cf e8 ?? ?? ?? ?? 85 c0 75 ?? 81 ?? }

    condition:
        uint16(0) == 0x5A4D and pe.exports("CryptEncrypt") and all of them
}

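The rule types (additional modifiers) can be as follows:

global: evaluated before all other rules; its condition must be satisfied for any other rule to match
private: not reported when matched; serves as a building block for other rules
none: a regular rule, reported on match as long as every global rule is satisfied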

The strings can be as follows:

strings: regular expression, text or hex
string modifiers: wide, ascii, xor, fullword, nocase

The conditions can be as follows:

Boolean expressions
Built-in, external, module variables, and functions

YARA String & Code Reuse Hunting

There is a difference between writing YARA rules for malware hunting versus detection. In this part of the course, we aim to produce “looser” YARA rules for threat hunting purposes, which have a higher chance of capturing newer variants at the cost of more false positives. In other contexts, you may want stricter YARA rules for a specific detection mechanism and malware strain.

By and large, efficient YARA rules are only as good as the data sources used to vet the YARA rules against. Anti-virus and malware researchers rely on large datasets of known good and known bad (and known random) samples to produce the most high-fidelity rules as it is often hard to predict YARA rule performance given the limited view of an individual researcher.

Some of the known bad and known good data sources for YARA rules performance include VirusTotal, Hybrid-Analysis, VirusBay, Malpedia, Microsoft, and VirusShare. Florian Roth’s tool yarGen includes some of the necessary string and opcode datasets for YARA performance checks as well. Another excellent tool for YARA rule management is the KLara tool developed by Kaspersky.
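As a rough illustration of vetting a rule against labeled data, the Python sketch below tallies hits and misses over a tiny known-bad and known-good corpus. Everything in it is hypothetical: `score_rule`, `toy_rule`, and the sample buffers are stand-ins chosen for illustration, and a real workflow would scan actual files with a compiled YARA rule.

```python
from typing import Callable, Dict, Iterable, Tuple

Sample = Tuple[bytes, bool]  # (file contents, known-bad flag); hypothetical layout


def score_rule(matches: Callable[[bytes], bool], samples: Iterable[Sample]) -> Dict[str, int]:
    """Tally how a rule-like predicate performs on labeled known-good/known-bad data."""
    counts = {"hits_on_bad": 0, "missed_bad": 0, "hits_on_good": 0, "clean_good": 0}
    for data, is_bad in samples:
        hit = matches(data)
        if is_bad:
            counts["hits_on_bad" if hit else "missed_bad"] += 1
        else:
            counts["hits_on_good" if hit else "clean_good"] += 1
    return counts


def toy_rule(data: bytes) -> bool:
    """Toy stand-in for a compiled rule: flags any buffer containing the bytes 8b fa."""
    return b"\x8b\xfa" in data


corpus = [
    (b"\x56\x57\x8b\xfa\x8b\xce", True),  # known bad, pattern present
    (b"\x90\x90\xc3", True),              # known bad, pattern absent (a miss)
    (b"\x8b\xfa\xc3", False),             # known good, pattern present (false positive)
    (b"hello world", False),              # known good, no hit
]
print(score_rule(toy_rule, corpus))
```

A short pattern such as the two bytes used here hits known-good data easily, which is why longer code sequences and large clean datasets matter when judging a rule.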

One of the major strengths of YARA rules that leads to successful and long-term hits is the ability to combine string-based and code-based coverage. We believe that the key to efficient YARA rules is simple and clear rulesets utilizing both. I highly recommend watching Jay Rosenberg’s presentation from Confidence Conference 2019 entitled Utilizing YARA to Find Evolving Malware.

When creating code reuse YARA rules, we need to be aware that compilation flags, different compilers, and slightly altered source can change the generated code and break the YARA rules. Consequently, we should wildcard (??) certain bytes, such as those encoding the registers used, which can change from one sample to another.

For example, instructions such as xor eax, eax produce different opcode bytes depending on the xor’ed register. Skipping a variable number of bytes with a jump such as “[1-2]” (from one to two bytes) is often necessary to survive compiler differences and make the YARA rules cover different build environments.
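To make the register point concrete, the Python sketch below (standard library only) encodes `xor reg, reg` for a few 32-bit registers using the 0x33 opcode form and shows that a wildcarded pattern covers all of them. The helper `hex_pattern_to_regex` is a hypothetical, simplified translation of YARA hex strings (byte literals, `??`, nibble wildcards such as `5?`, and `[n-m]` jumps) into a Python regular expression. It only illustrates the matching idea and is not how YARA itself is implemented.

```python
import re

# Register numbers used in the x86 ModRM byte (32-bit general-purpose registers).
REGISTERS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}


def encode_xor_self(register: str) -> bytes:
    """Encode `xor reg, reg` with opcode 0x33 (XOR r32, r/m32)."""
    number = REGISTERS[register]
    modrm = 0xC0 | (number << 3) | number  # mod=11 (register-direct), reg, rm
    return bytes([0x33, modrm])


def hex_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified YARA hex string into a bytes regex (illustrative)."""
    parts = []
    for token in pattern.strip("{} ").split():
        jump = re.fullmatch(r"\[(\d+)(?:-(\d+))?\]", token)
        if jump:  # [n] or [n-m]: skip between n and m arbitrary bytes
            low, high = jump.group(1), jump.group(2) or jump.group(1)
            parts.append(b".{%s,%s}" % (low.encode(), high.encode()))
        elif token == "??":  # full-byte wildcard
            parts.append(b".")
        elif "?" in token:  # nibble wildcard such as 5? or ?a
            hi, lo = token
            values = [
                v for v in range(256)
                if (hi == "?" or v >> 4 == int(hi, 16))
                and (lo == "?" or v & 0x0F == int(lo, 16))
            ]
            parts.append(b"[" + b"".join(b"\\x%02x" % v for v in values) + b"]")
        else:  # literal byte
            parts.append(b"\\x%02x" % int(token, 16))
    return re.compile(b"".join(parts), re.DOTALL)


xor_any_register = hex_pattern_to_regex("33 ??")
for name in REGISTERS:
    encoded = encode_xor_self(name)
    print(name, encoded.hex(), bool(xor_any_register.fullmatch(encoded)))
```

Each register yields a different second byte, so a literal `33 c0` would only catch the eax case, while `33 ??` catches all four. Note that compilers may also emit the alternative 0x31 opcode form, which is another reason to test wildcarded rules against several builds.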

The cyclical nature of the YARA rule development can be described in the following 7 steps:

7. Repeat

Practical Crimeware Code Reuse: “Dridex” Malware Family

Dridex is by far one of the most complex and sophisticated pieces of malware on the crimeware landscape.

The malware is also referred to as “Bugat” and “Cridex” by various researchers. The original Bugat malware, which at some point rivaled the “Zeus” banking malware, dates back to 2010.

The development group behind it is responsible for the three malware variants, which are the subject of our YARA course:

1 – BitPaymer ransomware (known as “wp_encrypt”) part of the Everis extortion case
2 – DoppelPaymer ransomware leveraged in the PEMEX lockdown
3 – Dridex Loader (known as “ldr”) botnet ID “23005”

The YARA rule for the overarching code reuse across the Dridex developer samples is based on the unique API hashing function used to resolve the Windows API calls. It is one of the most obvious unique features of this family.

Based on the API hashing function (as seen on the screenshot above), the Dridex developer family can be described by the following YARA rule:

rule dridex_family
{
    strings:
        $code = { 5? 5? 8b fa 8b ?? 8b cf e8 ?? ?? ?? ?? 85 c0 75 ?? 81 ?? ?? ?? ?? ?? 7? ?? }

    condition:
        $code
}

Always test the rules, for example, via command-line:

yara -s <rules_file> <malware_location>
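When testing many rules, the same invocation can be scripted. The Python sketch below is a minimal wrapper that assumes the `yara` binary is on the PATH; the function names and the example paths `dridex_family.yar` and `./samples` are hypothetical. It uses `-s` to print matching strings and `-r` to recurse into folders.

```python
import shutil
import subprocess
from typing import List, Optional


def build_yara_command(rules_file: str, target: str, recursive: bool = True) -> List[str]:
    """Build a yara CLI invocation: -s prints matching strings, -r recurses into folders."""
    command = ["yara", "-s"]
    if recursive:
        command.append("-r")
    command += [rules_file, target]
    return command


def run_yara(rules_file: str, target: str) -> Optional[str]:
    """Run yara if it is installed; return its stdout, or None when it is unavailable."""
    if shutil.which("yara") is None:
        return None
    result = subprocess.run(
        build_yara_command(rules_file, target),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


# Hypothetical file and folder names, shown only to illustrate the call.
print(" ".join(build_yara_command("dridex_family.yar", "./samples")))
```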

Testing the YARA rule reveals multiple hits on the Dridex family across the folder.

Uniting Code Reuse & String Detection

I. DoppelPaymer ransomware contains a peculiar string, reused across samples, that we can add to the Dridex family code reuse rule. It copies the Unicode string "Setup runn" to eax via the lstrcpyW API call.

The possible specific DoppelPaymer ransomware rule is as follows:

rule crime_win32_ransomware_doppelpaymer_1
{
    strings:
        $str1 = "Setup runn" wide
        $code = { 5? 5? 8b fa 8b ?? 8b cf e8 ?? ?? ?? ?? 85 c0 75 ?? 81 ?? ?? ?? ?? ?? 7? ?? }

    condition:
        $code and $str1
}

II. BitPaymer ransomware samples share a string used as an anti-emulation check against Windows Defender: the malware checks for the existence of the file "C:\aaa_TouchMeNot_.txt", which is indicative of Windows Defender sandbox activity.

The possible specific BitPaymer ransomware rule is as follows:

rule crime_win32_ransomware_bitpaymer_1
{
    strings:
        // The backslash must be escaped inside a YARA text string
        $str1 = "C:\\aaa_TouchMeNot_.txt" wide
        $code = { 5? 5? 8b fa 8b ?? 8b cf e8 ?? ?? ?? ?? 85 c0 75 ?? 81 ?? ?? ?? ?? ?? 7? ?? }

    condition:
        $code and $str1
}

III. Dridex loader samples carry the same string "installed", which is passed to OutputDebugStringW many times as an anti-emulation technique. It is indicative of the Dridex loader.

The possible specific Dridex loader rule is as follows:

rule crime_win32_loader_dridex_1
{
    strings:
        $str1 = "installed" wide
        $code = { 5? 5? 8b fa 8b ?? 8b cf e8 ?? ?? ?? ?? 85 c0 75 ?? 81 ?? ?? ?? ?? ?? 7? ?? }
    condition:
        $code and $str1
}
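All three rules above rely on the `wide` modifier because the strings are handled by wide-character (W) APIs and are stored with two bytes per character. The Python sketch below illustrates what that means at the byte level; the helper names and the sample buffer are hypothetical, and UTF-16LE encoding is used as an equivalent of `wide` for plain ASCII text.

```python
def wide_bytes(text: str) -> bytes:
    """Bytes a YARA `wide` string searches for: each ASCII character followed by 0x00."""
    return text.encode("utf-16-le")


def contains_wide(data: bytes, text: str) -> bool:
    """Return True if the wide (two bytes per character) form of text occurs in data."""
    return wide_bytes(text) in data


# Hypothetical fragment of an unpacked sample holding the wide string.
fragment = b"\x00" * 8 + wide_bytes("Setup runn") + b"\x00\x00"

print(wide_bytes("Setup runn").hex())
print(b"Setup runn" in fragment)             # plain ASCII search misses the string
print(contains_wide(fragment, "Setup runn"))  # wide search finds it
```

Without `wide`, YARA searches for the ASCII form only, so a rule written that way would silently miss these samples.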

As an example, the final YARA rule covering both code and strings for the unpacked DoppelPaymer ransomware payload is as follows:

rule crime_win32_doppelpaymer_ransomware_1 
{
    meta:
        description = "Detects DoppelPaymer payload Nov 11 Signed"
        author = "@VK_Intel"
        reference = "https://twitter.com/VK_Intel/status/1193937831766429696"
        date = "2019-11-11"
        hash1 = "46254a390027a1708f6951f8af3da13d033dee9a71a4ee75f257087218676dd5"

    strings:
        $s1 = "Setup run" wide
        $hash_function = { 5? 5? 8b fa 8b ?? 8b cf e8 ?? ?? ?? ?? 85 c0 75 ?? 81 ?? ?? ?? ?? ?? 7? ?? }

    condition:
        ( uint16(0) == 0x5a4d and
            filesize < 2500KB and
            ( all of them )
        )
}
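The first two checks in the condition act as a cheap pre-filter before the strings are considered. The Python sketch below mirrors them for an in-memory buffer; `passes_prefilter` and `MAX_SIZE` are hypothetical names, and the code is an approximation of the condition's meaning rather than YARA's own evaluation logic.

```python
import struct

MAX_SIZE = 2500 * 1024  # YARA's KB suffix multiplies the number by 1024


def passes_prefilter(data: bytes) -> bool:
    """Mirror `uint16(0) == 0x5a4d and filesize < 2500KB` for an in-memory buffer."""
    if len(data) < 2 or len(data) >= MAX_SIZE:
        return False
    (magic,) = struct.unpack_from("<H", data, 0)  # uint16 reads little-endian
    return magic == 0x5A4D  # the bytes "MZ" at offset 0 of a PE file


print(passes_prefilter(b"MZ\x90\x00"))  # PE-style header
print(passes_prefilter(b"\x7fELF"))     # not a PE file
```

Because the value is read little-endian, the on-disk bytes 4D 5A ("MZ") compare equal to 0x5A4D, which explains the apparently reversed constant in the rule.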

Malware Samples

DoppelPaymer Ransomware (unpacked) SHA-256: 46254a390027a1708f6951f8af3da13d033dee9a71a4ee75f257087218676dd5

BitPaymer Ransomware (unpacked) SHA-256: 78e180e5765aa7f4b89d6bcd9bcef1dd1e0d0261ad0f9c3ec6ab0635bf494eb3

Dridex Banker (unpacked) SHA-256: ce509469b80b97e857bcd80efffc448a8d6c63f33374a43e4f04f526278a2c41
