What is Gh0st RAT and How do You Find It?

What happened?

I recently came across a sample of Gh0st RAT that our agent failed to catch. This wasn’t alarming news. Usually, the reason is that the sample doesn’t work: either it’s missing some resource that the dropper was supposed to install alongside it, or it fails to connect to its long-dead C2 server. Since we detect malware based on analysis of malicious activity, software that does nothing is by definition not malware. The solution is to find another sample of Gh0st RAT to test against, one that works, just to make sure everything’s still okay. Looking up Gh0st RAT on VirusTotal, I ran into my real issue:

What is Gh0st RAT?

Looking at a few of the top results on VT, I get samples such as
61542994890fa7981ca38cbbd9103a081a0c036c9c512506464f772170943b7b
which Panda AV labels “Bck/Gh0stRat.F”, while 41 other vendors give it other semi-gibberish names. This gibberish naming scheme seems to be a tradition among AV vendors. Only one result says it’s actually Gh0st, far fewer than I usually get when I try to confirm the identity of a sample I’m working on. Probably not Gh0st, then.
047071f8cbf3c4be9406509c1ebf7cb4203a3f0217882200d797c1629199983a
is detected as malware by 22 out of 54 vendors, but none of them claim it’s Gh0st RAT. The most common name for it, used by 7 vendors, is “8muaa0TfM2lb”. The only feature that hints at it being Gh0st is the filename, “Gh0st RAT”, under which it was submitted to VT. So, if no AV in the world classifies this as Gh0st, it’s probably not Gh0st, and whoever named the file made a mistake.

I’ll save you the trouble of going down the list of results. A typical analysis has 0 or 1 vendors classifying the sample as Gh0st RAT, and around 40 calling it something else. However, “zegost” comes up repeatedly among these miscellaneous classifications. Is this a lead worth checking? Sure, why not.
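The tallying I did by eye can be sketched in a few lines of Python. This is only an illustration: the vendor names and most of the labels below are made up, and the function name family_votes is my own, not part of any VT tooling.

```python
from collections import Counter

def family_votes(labels, families):
    """Count how many vendor labels mention each family name (case-insensitive)."""
    votes = Counter()
    for label in labels.values():
        if label is None:  # this vendor did not flag the sample at all
            continue
        lowered = label.lower()
        for family in families:
            if family.lower() in lowered:
                votes[family] += 1
    return votes

# Hypothetical report: vendor names and labels are placeholders for illustration.
report = {
    "VendorA": "Bck/Gh0stRat.F",
    "VendorB": "Backdoor.Win32.Zegost.x",
    "VendorC": "Trojan.Generic.1234",
    "VendorD": "Win32/Zegost.B",
    "VendorE": None,
}

votes = family_votes(report, ["gh0st", "zegost"])
print(votes)
```

A count like this only tells you which name is popular, not which name is right, which is exactly the problem.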

The Zegost Lead

According to an article in E Hacking News and an analysis written by Andrew Rutkiewicz, IT Security Analyst at RSA, zegost is Gh0st RAT. However, a partial analysis of zegost, an analysis of a Gh0st RAT variant called Miansha, and an analysis of a Gh0st RAT variant from the VOHO campaign don’t seem to describe the same malware. So I concluded that some people on the internet were probably wrong and that Gh0st is its own, different, RAT. So how do I find a Gh0st sample?

Finding Gh0st

Going with a zegost sample isn’t a valid option. What I can do, however, is get a sample from someone who has already analyzed Gh0st and published a paper about it. Even between these papers, though, there is a considerable difference in capabilities.

In the VOHO campaign, analyzed by the RSA FirstWatch team, Gh0st:

Performs comprehensive RAT capabilities (keylogging, screenshots, remote shell, downloading files, etc.)
Persists either by registering for autorun in the current user’s registry hive and obfuscating its entry by encoding it in hexadecimal, or by registering in the local machine’s registry hive without obfuscation.
Disables regedit in the variants that obfuscate their registry entry, and leaves it enabled otherwise.
Disables Windows restore.
Uses the keyword “HTTPS” at the beginning of each communication packet.

In “The Monju Incident”, analyzed by Context Information Security, Gh0st:

Exfiltrates data from the infected system.
Propagates throughout the infected network.
Persists by registering as a DLL in a CLSID entry in the registry.
Utilizes DLL load order hijacking to run malicious code via a trusted (signed) executable (GOM Player in this case).
Also uses the keyword “HTTPS” at the beginning of each communication packet. The communication is encrypted by a homebrew xor-and-add function, and then encoded in Base64.
Exhibits a couple more behaviors that are not in their analysis but that I found by reverse engineering their sample: it downloads a whole lot of .ocx files from the C2 server onto the infected client, and it uses encrypted configuration strings placed at the end of its own executable between delimiter strings such as “AAAAAA”, “BBBBBB”, and “PPPPPP”.
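To make “xor-and-add, then Base64” concrete, here is a minimal Python sketch of that kind of scheme. The two constants and the order of operations are my assumptions for illustration only; they are not taken from the Monju sample or from the Gh0st source.

```python
import base64

# Hypothetical constants, chosen only for illustration.
XOR_KEY = 0x19
ADD_KEY = 0x7A

def encode(plain: bytes) -> bytes:
    """XOR each byte, add a constant modulo 256, then Base64-encode the result."""
    mixed = bytes(((b ^ XOR_KEY) + ADD_KEY) & 0xFF for b in plain)
    return base64.b64encode(mixed)

def decode(blob: bytes) -> bytes:
    """Reverse of encode: Base64-decode, subtract the constant, then XOR."""
    mixed = base64.b64decode(blob)
    return bytes(((b - ADD_KEY) & 0xFF) ^ XOR_KEY for b in mixed)

message = b"HTTPS example payload"
wire = encode(message)
print(wire)
print(decode(wire))
```

The point is how little it takes to change: swap either constant and every decoder written for the old variant stops working.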

In the Gh0st RAT samples analyzed by Infosec Institute, Gh0st:

Performs comprehensive RAT capabilities (as in the VOHO campaign).
Persists by registering as a service.
Clears the SSDT of existing hooks via an installed kernel module.
Uses a (usually) five-letter keyword at the beginning of each communication packet. By default this is “Gh0st”. Infosec Institute claims this is the most stable feature of Gh0st, and therefore the one by which you should classify a variant as Gh0st.
Has encryption capabilities.

Each analysis refers to its own Gh0st as a “variant”, but then, what is the original one? The community doesn’t seem to agree that any of them really are Gh0st, if the classification of the AV vendors is any indication. These samples do seem to share similarities, but contain different abilities, sometimes vastly so. Compare installing a kernel module that disables existing rootkit hooks by clearing the SSDT with disabling regedit and Windows restore points, or compare the methods of persistence and obfuscation, or a myriad of minor features not mentioned here.

The method of communication seems to be the common trait. All RATs offer the same features, more or less, so features can’t be used to classify them as variants of the same source. But is the communication alone enough to deem them all Gh0sts? The problem compounds once you know that the source code for Gh0st is available online (https://github.com/sincoder/gh0st). The availability of the source code makes it easier for people to cut out snippets of code or entire modules and use them in their own custom malware. So what do we classify as Gh0st?

Which Gh0st is Gh0st?

Having the source code available is actually a tempting way out of this problem. I could say that it’s enough, the source code is what I define as Gh0st. The others, such as the VOHO campaign Gh0st, or the Monju Incident Gh0st, are close but no cigar. Maybe they re-used some modules from Gh0st RAT, because it’s easier to copy and paste things from the internet than to reinvent the wheel. What really makes them into Gh0st? Just the similar way in which they communicate, or what they actually do? For the record, I sent e-mails asking researchers what made them classify their samples as Gh0st specifically, but received no answers. Classifying each of them as new malware, different from Gh0st, would mean I have to make sure our product detects each of them properly. In fact, this could produce an endless variety of completely new strains, one for every custom-compiled version of Gh0st, each of which shares only a similar method of communication, or not even that. This would be a difficult list to maintain for our product. Open source malware samples are not easy to classify as “variant” or “new strain”.

At the beginning of the post, I remind you, my issue was finding a sample of Gh0st RAT that wasn’t defective so that I could make sure our agent catches it. Surely, a build of the source code itself will be a good enough sample. And don’t worry, we do catch it. But then, do we also detect the so-called “variants”? Luckily, our agent is behavior based. For us, it doesn’t matter how many traits these “variants” share with the original Gh0st. It doesn’t matter whether 1 module or all of them have been ripped off from the source code. It doesn’t matter whether extra functionality has been written on top of them. It doesn’t matter if they’ve been wrapped, re-encrypted, armored, and obfuscated. This is because the granularity of what we detect as malicious is much finer. Other solutions out there, however, are based on static signatures, and for them any of these changes can result in a missed detection. Static signatures are problematic. They remain the same while the malware may not. Look at this ‘detection method’ for Gh0st from the ContextIS advisory:

A quick look at this would make you realize that any Gh0st variant that doesn’t send “HTTPS” needs a new rule for it. Other shenanigans could also happen. A Gh0st “variant” might still send “HTTPS” but add more persistence, registering for autorun via the registry and also via a service. The remediation process would be difficult if the response team assumes that this is the same specific “HTTPS” variant. Automatic remediation would be downright impossible. Static signatures are too limited in what they can achieve. Identification by specific features, even a collection of features, can cause even more damage, for example by misleading a client into thinking his system is now clean when you applied the fix for a malware that’s almost the right one.
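To show how brittle a keyword rule is, here is a toy Python check of my own. It is not the ContextIS rule; the function name and the packet contents are hypothetical, and the only thing taken from the analyses above is that each packet starts with a keyword.

```python
def matches_signature(packet: bytes, keyword: bytes = b"HTTPS") -> bool:
    """Static rule: flag a packet only if it starts with the expected keyword."""
    return packet.startswith(keyword)

# Hypothetical packets: a keyword followed by placeholder payload bytes.
https_variant = b"HTTPS" + b"\x00\x01payload"
default_variant = b"Gh0st" + b"\x00\x01payload"   # the default keyword
custom_variant = b"ABCDE" + b"\x00\x01payload"    # any recompiled keyword

for name, packet in [("HTTPS", https_variant),
                     ("Gh0st", default_variant),
                     ("ABCDE", custom_variant)]:
    print(name, matches_signature(packet))
```

Only the first packet is flagged. Every other keyword needs its own rule, and whoever compiles the next variant gets to pick the keyword.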

The difficulty of classifying malware

Naming implies classification, and should carry meaning. Meaningless naming is an issue in the community. The fact is, knowing a sample’s name does not tell you what that specific sample is composed of. People often can’t tell you what a specific sample is, or what it does, even if they can name it. But then, what purpose does a name have if that name doesn’t help you? What if you labeled malware as Gh0st because it acted in a similar, but not similar enough, manner? I think it doesn’t help you. Your response team will not have a proper response for it. Their pre-made scripts for extracting and decrypting the configuration data may fail. Their remediation process may miss crucial actions.

Worse, meaningless naming causes confusion. A fellow researcher mentioned to me that “Gh0st RAT is just a variant of njRAT”. njRAT is a C# sample with no similar traits to Gh0st. njRAT is known as Bladabindi, but only by Microsoft. If you remember, AV vendors thought Gh0st was more often “zegost”, despite large differences. These multiple labels and mixups cause nothing but chaos in the community. The process of finding out the malware name of a binary is heuristic and clumsy.

The solution, in my opinion, is to stop naming detected malware. The name doesn’t matter. A variant or a misdetected signature will stump your antivirus and make your life difficult. What really matters is what this one specific sample did on this one specific system. This requires a paradigm shift in the world of security; it requires behavioral analysis. An antivirus report shouldn’t tell you the name of the threat, but it should include anything that the virus touched: graphs of process origination and of the OS entities manipulated (files, registry keys, etc.), the processes manipulated, network communication endpoints, and I/O anomalies. This is empirical data that can be used to pinpoint the exact type of malware you’re dealing with. It can tell you exactly what the malware did on your system, and how to undo it. How to fix it is what you really need to know after that piece of ransomware hit your company. If your data is complete enough, you’ll even be able to fix everything automatically, immediately.
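As a rough illustration of what such a report could look like as data, here is a small Python sketch. The class, its field names, and every value in the example are hypothetical; this is not our product’s format, just the idea of deriving undo steps from recorded behavior.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class IncidentReport:
    """What one sample did on one machine (all field names are hypothetical)."""
    spawned_processes: List[str] = field(default_factory=list)
    files_written: List[str] = field(default_factory=list)
    registry_values_set: List[str] = field(default_factory=list)
    services_installed: List[str] = field(default_factory=list)
    network_endpoints: List[str] = field(default_factory=list)

    def remediation_plan(self) -> List[str]:
        """Derive one undo step per recorded change."""
        steps = [f"kill process {p}" for p in self.spawned_processes]
        steps += [f"delete service {s}" for s in self.services_installed]
        steps += [f"remove registry value {r}" for r in self.registry_values_set]
        steps += [f"delete file {f}" for f in self.files_written]
        steps += [f"block endpoint {e}" for e in self.network_endpoints]
        return steps

# Placeholder values for illustration only.
report = IncidentReport(
    spawned_processes=["sample.exe"],
    files_written=[r"C:\Users\victim\AppData\sample.dll"],
    registry_values_set=[r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\updater"],
    services_installed=["FakeSvc"],
    network_endpoints=["203.0.113.10:443"],
)

plan = report.remediation_plan()
for step in plan:
    print(step)
```

Notice that nothing in the plan depends on what the sample is called. A variant that adds a second persistence method simply produces one more line.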

Static signatures do have their place in the world of information security, but they should be a complementary tool. Behavioral analysis should be the crux of the solution, the main focus of a security product. If your static signature detected something malicious, then sure, block it. But new malware will be written specifically to avoid you. Your static signatures will be carefully stepped around, and you will undoubtedly fail to catch new malware. Shift your primary focus to behavioral detection, and it becomes far harder for anything to avoid you.

We catch the bad guys, even if they do something different.

Traits of the original Gh0st

Performs comprehensive RAT capabilities:

Keylogging
Mouselogging
Disabling the mouse and keyboard
Screenshot captures
Webcam snapshots
Webcam video surveillance
Microphone surveillance
Downloading files to and uploading files from the infected computer
Executing files
Disabling the screen
File listing, process listing
Remote shell
Remote shutdown/reboot

Persists by registering as a service.
Installs a kernel module that clears SSDT hooks, which may disable monitoring tools or pre-existing rootkits.
Begins communication with a magic word (“Gh0st” by default).
Encrypts communication with a MyEncode/MyDecode function pair. By default this is a xor-and-add function.
Stores encrypted configuration strings at the end of its binary between delimiter strings such as “AAAAAA”, “BBBBBB”, “PPPPPP”.
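For the last trait, a response team’s extraction script might look roughly like the Python sketch below. The layout is an assumption on my part: I assume each value sits between a matching pair of the same delimiter, which may not hold for every build, and the sample bytes are fabricated placeholders rather than a real binary.

```python
import re

# Delimiter strings named above; the pairing layout is an assumption.
DELIMITERS = [b"AAAAAA", b"BBBBBB", b"PPPPPP"]

def extract_config(binary: bytes) -> dict:
    """Return the raw (still encoded) bytes found between each delimiter pair."""
    found = {}
    for delim in DELIMITERS:
        pattern = re.escape(delim) + b"(.+?)" + re.escape(delim)
        match = re.search(pattern, binary, re.DOTALL)
        if match:
            found[delim.decode()] = match.group(1)
    return found

# Fabricated stand-in for a binary: filler bytes followed by two config blobs.
sample = (
    b"MZ" + b"\x00" * 64
    + b"AAAAAA" + b"ZW5jb2RlZA==" + b"AAAAAA"
    + b"PPPPPP" + b"c2VydmljZQ==" + b"PPPPPP"
)

config = extract_config(sample)
print(config)
```

A script like this is exactly the kind of pre-made tooling that fails quietly when a “variant” changes its delimiters or its layout.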