Return to Sender: A Technical Analysis of a PayPal Phishing Scam

Sooner or later we all get spam or phishing emails. For enterprises, phishing emails are the most common vector by which adversaries gain a foothold in the network. In the last 12 months, Microsoft reported a 250% increase in phishing email detections, and phishing that targeted SaaS and webmail services doubled in the previous quarter.

We’ve discussed tactics for resisting phishing attempts elsewhere, but in this post we’ll take a deeper look at how a phishing email works, revealing just how easily victims can give away their PayPal credentials to this kind of social engineering attack.

Behind the Link: HTML File Properties

For this walk-through, we’ll use a phishing email that I received recently through an alias that was set up to collect malware samples. Let’s start by computing the SHA-1 hash of the HTML file they sent me:

$: shasum PayPal_Document916.html
948fa2be822a9320f6f17599bc2066b2919ff255 PayPal_Document916.html
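
If shasum is not available on your analysis box, the same SHA-1 digest can be computed with a few lines of Python. This is only a minimal sketch; the helper name sha1_of_file is my own and not part of the original workflow:

```python
import hashlib

def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large samples do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha1_of_file("PayPal_Document916.html") should give the same value that shasum prints, since shasum defaults to SHA-1.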

Let’s take a look on VirusTotal and see what comes back.

Image of Phishing email not known to VT

So no one has seen this file before, at least not on VirusTotal. Let’s take a closer look at the file with the Detect-It-Easy (DIE) tool:

Image of File Analysis with DIE

From DIE we can confirm the following file properties:

  • File type: Plain Text HTML
  • Entropy: 6.056
  • Packed: No
  • File Size: ~34k

Great! The file isn’t packed and the entropy indicates that the contents are probably only lightly obfuscated. Let’s take a look inside.
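
For anyone curious what that entropy number means, here is a minimal sketch of byte-level Shannon entropy in Python. I am assuming DIE reports something close to this standard measure (bits per byte, from 0 to 8); its exact method may differ:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Text restricted to the 64-character Base64 alphabet cannot exceed 6 bits per byte, so a reading close to 6 is consistent with a file dominated by a Base64 blob rather than by compressed or encrypted data, which tends toward 8.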

Inside the Phish: Exploring the Content

It’s probably not a good idea to just double-click the file to peek inside: HTML files can run on any OS, and we still do not know which OS this sample is targeting. We do need to view the contents, but we want to do so with a very basic text editor. Use what you like, but make sure that you open it as a regular text file so that nothing is executed.

Image of Obfuscated text in phishing mail

As beautiful as this is, it is hard to read. Fortunately, there are things we can do to make it more readable, but before making any alterations we are going to remove the HTML tags that wrap the script:

<!DOCTYPE html><html><head><script>

and

</script></head><body></body></html>
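
If you would rather not edit the sample by hand, the script body can be pulled out programmatically. The sketch below is regex-based and assumes the JavaScript sits in ordinary inline script elements, as it does here; the function name is hypothetical:

```python
import re

# Matches an opening <script ...> tag, its body, and the closing tag.
SCRIPT_BODY = re.compile(r"<script\b[^>]*>(.*?)</script\s*>",
                         re.IGNORECASE | re.DOTALL)

def extract_inline_scripts(html: str) -> str:
    """Return the bodies of all <script> elements, one per line."""
    return "\n".join(body.strip() for body in SCRIPT_BODY.findall(html))
```

A regex like this would be fooled by a literal "</script>" inside a JavaScript string, so check the result against the original file before trusting it.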

All that is left should be the JavaScript code, and this is where SublimeText has some nice features that can really help us out:

Image of JavaScript beautify

I will use a “beautifier” plugin to clean up the code and make it easier to read:

Image of Cleaned up JavaScript

You can do this via the CLI in macOS/Linux, but the result is not as nice looking:

$: awk -v RS=';' -v ORS=';\n' 'NF' PayPal_Document916.html

Image of cleaned up code in Terminal
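
The awk one-liner boils down to "break the code after every semicolon and skip the empty pieces". A rough Python equivalent, offered as a sketch rather than a real beautifier:

```python
def split_statements(source: str) -> str:
    """Put each ';'-terminated chunk on its own line, dropping blank chunks."""
    chunks = [chunk.strip() for chunk in source.split(";")]
    # Re-attach the semicolon that split() consumed.
    return "\n".join(chunk + ";" for chunk in chunks if chunk)
```

Like the awk version, this knows nothing about JavaScript syntax: it will also split on semicolons inside string literals and for(;;) headers, which is fine for reading but not for code you intend to run.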

With the JavaScript reformatted, I am going to write this new version to a second file called:

decoded_PayPal_Document916.js

Murky Waters: Decoding the Text

Let’s clean up some of the variables and start trying to understand what’s going on here.

First we know that:

var nxjCDAXFwFEX=

holds the raw Base64 code block (naturally assumed to be the embedded payload).

Let’s try to write this to a file and see what we get:

$: grep nxjCDAXFwFEX= decoded_PayPal_Document916.js|awk -F '"' '{print$6}'|base64 --decode > payload

This command isolates the Base64-encoded string, but there are two important things to note about the line that contains it:

1) At the start of the string:

return g=x.join(""),g.replace(/+$/,"")}var nxjCDAXFwFEX="

2) At the end of the string: ";

We need to remove everything before and after the Base64 encoded string in order to try and decode it. To do this we will use:

awk -F '"' '{print$6}'

By default, awk splits each line into fields using whitespace as the delimiter. The -F '"' switch changes the delimiter to a double quote, and print$6 outputs the sixth field, which on this line is the Base64 string. Now that we have an isolated string to work with, we can decode it. Keep in mind that it’s not always as easy as decoding the Base64 string. In this case, the decoded block looks odd when written to a file:

$: file payload
payload: data

A result of "data" means only that the file utility did not recognize the format. It is not necessarily an indicator that the string failed to decode, but it does mean there is a possibility the decode failed. Either way, we should review the code to look for logic that we may have overlooked.
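
To make that check a little more concrete, the same extract-and-decode step can be done in Python and followed by a quick look at how much of the output is printable. This is a sketch under one assumption of mine: that the payload is the longest double-quoted run of Base64 characters in the script. The helper names are hypothetical:

```python
import base64
import re

# Assumption: the payload is a quoted literal of at least 16 Base64 characters.
B64_LITERAL = re.compile(r'"([A-Za-z0-9+/]{16,}={0,2})"')

def longest_base64_literal(source: str) -> str:
    """Return the longest quoted Base64-looking literal ('' if none)."""
    return max(B64_LITERAL.findall(source), key=len, default="")

def decode_and_describe(encoded: str):
    """Decode Base64 and return (raw bytes, fraction of printable ASCII)."""
    raw = base64.b64decode(encoded, validate=True)
    printable = sum(1 for b in raw if 32 <= b < 127 or b in (9, 10, 13))
    ratio = printable / len(raw) if raw else 0.0
    return raw, ratio
```

A ratio near 1.0 suggests the decode produced readable text. A low ratio, like the "data" result above, suggests the script transforms the string further before using it.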

A Clearer View: Isolating Variables

When renaming variables everyone has their own methods. Personally, I start easy with the ones that are obvious. We already know that nxjCDAXFwFEX contains the Base64 code string so I am going to change all occurrences of nxjCDAXFwFEX to raw_base64 and see what else we can find.

This can take a good deal of time. I have a lab that I use to run samples in without fear of infecting everyone in the office! So to speed things up, I simply copy the decoded_PayPal_Document916.js to a VM. For this I am going to use Linux since nearly everything wants to kill Windows dead.

To further de-obfuscate the variables in the script code block, we can do a few things. It’s a bit tedious, but we can put print statements for each variable to see what output they give and then name them accordingly. For now, there are some obvious ones that we can replace:

image of obfuscated variables

So let’s do some renaming:

xoCisgpExGEs –> function_01

  • This is the first function that we can see in the script: function xoCisgpExGEs(rr,oo)

sCmCMuMlIZJy –> function_02

  • This is the second function that we see in the script: function sCmCMuMlIZJy(rr)

nxjCDAXFwFEX –> raw_base64

  • The variable that only contains our base64 encoded string.

lSiYOlcTTfmR –> call_array

  • The primary list of arguments.

TZGYADnjYnzp –> function_02_call

  • This simply calls the second function in the script.

lSiYOlcTTfmR:

    • lSiYOlcTTfmR[0] –> cyQvdxDbHhpBfpCX
      • This is just the first value in the “lSiYOlcTTfmR” (renamed to “call_array”) array.
    • lSiYOlcTTfmR[1] –> write
      • This is just the second value in the “lSiYOlcTTfmR” (renamed to “call_array”) array.

Note: We will actually remove this whole variable since we know where these values are used.
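
Doing these renames by hand in an editor works, but it is easy to miss an occurrence. Here is a small Python sketch that applies the mapping listed above with whole-word matching. The function name is my own:

```python
import re

# Mapping taken from the renames listed above.
RENAMES = {
    "xoCisgpExGEs": "function_01",
    "sCmCMuMlIZJy": "function_02",
    "nxjCDAXFwFEX": "raw_base64",
    "lSiYOlcTTfmR": "call_array",
    "TZGYADnjYnzp": "function_02_call",
}

def rename_identifiers(source: str, renames: dict) -> str:
    """Replace whole-word occurrences of each key with its value."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, renames)) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], source)
```

The \b anchors keep a name from being replaced when it is only part of a longer identifier. The replacement is purely textual, so it would also rename a match that happens to sit inside a string literal.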

images of encoded variables

In this case the very last line of the script is the execution statement, so we will simply comment it out and add new code that dumps the output to two files:

Image of renamed variables

Post execution, it’s plain to see that the “function_02_call” still looks like gibberish, so we will ignore it for now. The “function_01_call”, however, looks like it gave us a lot of great new code to review:

image of output of javascript function

The output file contained all of the code on a single line, so once again I used a plugin to beautify it (get used to cleaning up code!).

Suspicious Domains

Right off the bat, since we already suspect this to be a phishing attack, it would be useful to see all the domains that are hardcoded in the page.

Image of some domains in phishing email

For the most part, the PayPal domains are not that interesting because they are known to be legitimate and would be hard (though not impossible) to spoof. These look more interesting:

Image of suspicious domains in phishing email
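
Sorting the domains by eye is fine for a small sample, but it can be scripted. The sketch below pulls every http(s) URL out of the decoded code and splits the hostnames into "trusted" and "other". The trusted suffix list is an assumption of mine for illustration, so adjust it to whatever you consider legitimate:

```python
import re
from urllib.parse import urlsplit

URL_PATTERN = re.compile(r"""https?://[^\s"'<>)\\]+""", re.IGNORECASE)

def hosts_by_trust(source, trusted_suffixes=("paypal.com", "paypalobjects.com")):
    """Split hostnames found in source into (trusted, other) sorted lists."""
    hosts = {urlsplit(url).hostname for url in URL_PATTERN.findall(source)}
    hosts.discard(None)
    trusted, other = set(), set()
    for host in hosts:
        # Match the suffix itself or a true subdomain of it, nothing looser.
        ok = any(host == s or host.endswith("." + s) for s in trusted_suffixes)
        (trusted if ok else other).add(host)
    return sorted(trusted), sorted(other)
```

Matching on "." plus the suffix matters: a lookalike host such as paypal.com.evil.example ends up in the "other" list rather than being waved through.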

Let’s run some cURLs to check a few things:

Image of using curl to check a domain

Image of using curl to check png resources

image of checking png images with curl

Image of final png checked with curl

While we’re cURLing things, we might as well download the PNG files and check their hash reputations:

$: shasum *.png
f18a83299a9dbf4905e27548c13c9ceb8fb5687d AM_mc_vs_ms_ae_UK.png
53b7e80a8a19959894af795969c2ff2e8589e4f0 bdg_secured_by_pp_2line.png
b311f639f1de20d7c70f321b90c71993aca60a44 pp-logo-200px.png

These files appear to have a low chance of being malicious:

Image of file reputation on Virustotal

Let’s focus on a domain that does not belong to PayPal. I came across a variable in the code that I want to take a closer look at, _0x78eb7f:

Image of phishing domain hardcoded into file

The logic is fairly simple: when the user clicks the submit button after entering their credit card information, the page script changes the destination to which the form data is sent.

In this case, rather than PayPal, the user’s information is sent to the attacker’s web server.
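
When triaging similar samples, it can help to search the decoded JavaScript for places where a form's action is reassigned. The patterns below are my own guess at common ways of doing that (direct assignment and setAttribute) and are not taken from this sample's code, so treat this as a starting point only:

```python
import re

# Hypothetical patterns: x.action = ..., x["action"] = ..., setAttribute("action", ...)
ACTION_ASSIGNMENT = re.compile(
    r"""(?:\.action|\[\s*['"]action['"]\s*\])\s*=(?!=)\s*([^;]+)"""
    r"""|setAttribute\(\s*['"]action['"]\s*,\s*([^)]+)\)"""
)

def find_action_overrides(source: str):
    """Return the expressions that the script assigns to a form's action."""
    results = []
    for direct, via_set_attribute in ACTION_ASSIGNMENT.findall(source):
        results.append((direct or via_set_attribute).strip())
    return results
```

Each hit is either a literal URL or a variable name such as _0x78eb7f, which you can then trace back to the hardcoded domain.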

Conclusion

From this basic analysis we’ve identified exactly how the attack works and the domain of the attacker’s server. If it were still live, we would now be able to ensure it’s blocked across all our endpoints by using, for example, the SentinelOne Firewall Control.

While this phishing email is not as complex as a lot of the attacks that we have read about and experienced to date, it is very easy to overlook, and this type of attack can often be even more effective than more modern ones. As security professionals, we can use this example as a small reminder that we should be educating our friends, family, and end users to question the validity of emails at all times.