MAY 24, 2022

Blame the Messenger: 4 Types of Dropper Malware in Microsoft Office & How to Detect Them

Microsoft Office droppers have been a favorite of threat actors for years: attackers keep finding new entry routes and exploiting them, and cybersecurity vendors take note and block them. It’s a perpetual cat-and-mouse game and, unfortunately, bad actors typically have the upper hand, at least for a short time. And as AI-based solutions have matured and gained market share, these tools have also become targets for evasion.

This blog will review a variety of VBA droppers that employ different bypass techniques, including an analysis of an evasion method used in the recent Emotet wave. We will also introduce a Python script I wrote to increase the likelihood of detecting these malware threats.

You Got Malware — Aggah’s Use of MsgBox Comments

Aggah, a threat actor group that has been active since 2019, has delivered many payloads, mostly RevengeRAT, to numerous victims. This group is particularly adept at working with Microsoft Office documents and employs various methods in their VBA scripts to make them stealthier. One of these methods, which appears to be used to evade AI-based cyber tools, is the use of comments containing the string ‘MsgBox.’

‘MsgBox’ is a VBA function that displays a message box. It appears in many Visual Basic scripts and is usually benign. Having this string in the comments of VBA code increases the likelihood that an AI module will classify the code as benign. If the code is short and the lengthy ‘MsgBox’ comments make up a substantial part of it, the chances of a benign classification increase further.

An Aggah dropper's VBA code
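To make the "substantial part" idea concrete, here is a minimal Python sketch that measures how much of a VBA module consists of comment lines and how many of those comments mention a keyword. The function name and the sample code are hypothetical and are not taken from the Aggah dropper. The sketch only counts whole-line comments.

```python
def comment_share(vba_source, keyword="MsgBox"):
    """Return (share of non-empty lines that are comments,
    share of those comment lines that mention the keyword)."""
    lines = [ln.strip() for ln in vba_source.splitlines() if ln.strip()]
    if not lines:
        return 0.0, 0.0
    # Whole-line comments start with an apostrophe or with the Rem keyword.
    comments = [ln for ln in lines
                if ln.startswith("'") or ln.lower().startswith("rem ")]
    with_keyword = [ln for ln in comments if keyword.lower() in ln.lower()]
    comment_ratio = len(comments) / len(lines)
    keyword_ratio = len(with_keyword) / len(comments) if comments else 0.0
    return comment_ratio, keyword_ratio


# Hypothetical module: two of five lines are MsgBox comments.
sample = "\n".join([
    "Sub Auto_Open()",
    "' MsgBox \"Hello\"",
    "' MsgBox \"Done\"",
    "Shell cmd",
    "End Sub",
])
print(comment_share(sample))
```

A high comment ratio combined with a high keyword ratio in a very short module is the pattern described above. On its own it is only a hint, since heavily commented benign macros exist as well.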

A Command in a Comments Stack — Emotet’s Use of Random Sentences

We have seen recent Emotet VBA droppers containing long comments composed of random words. As we see in the figure below, the executed command and the variable containing it were not obfuscated, just floating in a sea of long random comments.

These excessive comments might fool both analysts and AI solutions: an analyst might miss the malicious MSHTA execution when skimming the code, and an AI solution might give more weight to the benign features (the excessive comments) than to the malicious ones.

Figure 2: An Emotet dropper's VBA code, the actual commands are highlighted in yellow. Note: a few long comments were redacted, since each of them is just a compilation of random words and none of them contribute to the understanding of the code’s functionality.
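One simple countermeasure to comment padding is to strip comments before the code is reviewed or scored. The following Python sketch is one way to do that, assuming comments are introduced by an apostrophe outside a string literal or by a leading Rem keyword. It is an illustration only and does not handle comments continued across lines with an underscore.

```python
def strip_vba_comments(vba_source):
    """Drop VBA comments and empty lines, keeping string literals intact."""
    kept = []
    for line in vba_source.splitlines():
        in_string = False
        cut = len(line)
        for i, ch in enumerate(line):
            if ch == '"':
                # A doubled quote inside a string toggles twice, so state stays correct.
                in_string = not in_string
            elif ch == "'" and not in_string:
                cut = i  # the comment starts here
                break
        code = line[:cut].rstrip()
        lowered = code.strip().lower()
        if lowered == "rem" or lowered.startswith("rem "):
            code = ""
        if code:
            kept.append(code)
    return "\n".join(kept)


# Hypothetical padded macro; only two lines carry actual code.
padded = "\n".join([
    "' random words here",
    "x = \"mshta 'quoted' http\" ' trailing noise",
    "Rem more noise",
    "Shell x",
])
print(strip_vba_comments(padded))
```

After stripping, the few lines that actually execute are no longer diluted by the surrounding comments.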

Homegrown Obfuscation — Dridex’s Usage of Self-Created Functions 

One of the most interesting droppers we have recently observed was crafted by the notorious threat group Dridex. In the following example, Dridex employs several sophisticated methods aimed at increasing its likelihood of success — delivering a payload successfully and without detection.

As we see below, the script retrieves strings stored in Excel cells and runs them through the ‘slow’ function, which returns a de-obfuscated version of its input. The first string is collected from the “B101” cell and is translated into “WScript.Shell,” while the second is assembled by applying VBA’s “Transpose” and “Join” functions to the cell range “K111:K118.”

The Dridex dropper’s VBA output. Note: some parts of the code were redacted, since they are irrelevant to this blog.
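The effect of “Transpose” plus “Join” on a single-column range can be approximated statically once the cell contents are known. The Python sketch below expands a range such as “K111:K118” into its cell addresses and concatenates their values. The cell contents and function names are hypothetical. The delimiter defaults to a space because that is the documented default of VBA's Join; the delimiter used in the actual sample is not stated here.

```python
import re

RANGE_PATTERN = re.compile(r"([A-Z]+)(\d+):([A-Z]+)(\d+)")


def expand_column_range(cell_range):
    """Expand a single-column range such as 'K111:K118' into cell addresses."""
    match = RANGE_PATTERN.fullmatch(cell_range.upper())
    if not match or match.group(1) != match.group(3):
        raise ValueError("only single-column ranges are handled in this sketch")
    column = match.group(1)
    first, last = int(match.group(2)), int(match.group(4))
    return [f"{column}{row}" for row in range(first, last + 1)]


def join_range(cells, cell_range, delimiter=" "):
    """Mimic Join(Transpose(range), delimiter) over a dict of cell values."""
    addresses = expand_column_range(cell_range)
    return delimiter.join(cells.get(address, "") for address in addresses)


# Hypothetical cell contents, not the values from the Dridex sample.
cells = {"K111": "po", "K112": "wer", "K113": "shell"}
print(join_range(cells, "K111:K113", delimiter=""))
```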

After retrieving the data from the cells, the following is received:

To de-obfuscate this part, I replaced every “${PJ}” and “${GAB}” with a comma and a quotation mark, respectively. I also replaced the indexed placeholders with the appropriate strings and removed unnecessary characters, such as backticks.
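These manual substitutions can be sketched in Python. The sketch assumes that the “indexed placeholders” are the {0}, {1} slots of PowerShell's format operator, and it removes backticks naively, so a backtick that forms a real escape sequence would be mangled. The function names and sample strings are hypothetical.

```python
import re


def resolve_placeholders(text, variables):
    """Replace ${NAME} references with known values and drop backticks."""
    def substitute(match):
        # Leave variables we know nothing about untouched.
        return variables.get(match.group(1), match.group(0))
    text = re.sub(r"\$\{(\w+)\}", substitute, text)
    return text.replace("`", "")


def apply_format(template, args):
    """Fill indexed placeholders such as {1}{0} from a list of strings."""
    return re.sub(r"\{(\d+)\}", lambda m: args[int(m.group(1))], template)


# The two variable values come from the text above; the input is made up.
known = {"PJ": ",", "GAB": '"'}
print(resolve_placeholders('${GAB}a${PJ}b${GAB} n`ew', known))
print(apply_format("{1}{0}", ["ject", "New-Ob"]))
```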

This resulted in the following code:

This is obviously obfuscated as well — the main executed string is base64 encoded and deflate compressed. Of note, the attackers went the extra mile and tried to hide their use of the ‘iex’ command (short for ‘Invoke-Expression’) by retrieving the characters ‘i’ and ‘e’ from the value of the environment variable ‘pshome,’ which contains the path to the PowerShell directory, as can be seen in the highlighted section above.
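The character-picking trick is easy to reproduce. The Python sketch below assumes the typical default value of ‘pshome’ on Windows and uses the indexes 4 and 30, which happen to hold ‘i’ and ‘e’ in that path. Both the path and the indexes are assumptions for illustration; the indexes used by the sample are not reproduced here.

```python
# Typical default value of $pshome, assumed for illustration.
PSHOME = r"C:\Windows\System32\WindowsPowerShell\v1.0"


def pick_characters(text, indexes):
    """Build a string from selected character positions of another string."""
    return "".join(text[i] for i in indexes)


# 'i' and 'e' come from the path; 'x' is appended as a literal.
hidden_command = pick_characters(PSHOME, (4, 30)) + "x"
print(hidden_command)
```

Because the command name never appears as a literal in the script, a signature that looks for the string ‘iex’ does not match.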

After base64 decoding and decompressing the base64 encoded string, yet another obfuscated string is received. 
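The decode-and-decompress step can be reproduced offline in Python. This sketch assumes the data is a raw deflate stream without a zlib header, which is what .NET's DeflateStream produces. If a sample used a different container, the wbits argument would need to change. The payload here is a harmless stand-in.

```python
import base64
import zlib


def decode_stage(blob):
    """Base64-decode a string and inflate it as a raw deflate stream."""
    raw = base64.b64decode(blob)
    # Negative wbits tells zlib to expect raw deflate data with no header.
    return zlib.decompress(raw, -zlib.MAX_WBITS)


def encode_stage(data):
    """Inverse helper, used here only to build a harmless test blob."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    raw = compressor.compress(data) + compressor.flush()
    return base64.b64encode(raw).decode("ascii")


blob = encode_stage(b"Write-Host 'next stage'")
print(decode_stage(blob))
```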

After reassembling the strings and removing unnecessary characters, the following is received:

Just as before, base64 decoding and decompression are required in order to retrieve the code of the next stage. However, this time Dridex employs something we have not seen in previous stages — aliases.

In the above snippet, ‘nal’ (‘New-Alias’) and ‘sal’ (‘Set-Alias’) are used to set ‘cf’ and ‘ox’ as aliases for ‘New-Object’ and ‘iex,’ respectively.
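Alias definitions like these can be resolved before further analysis. The Python sketch below assumes the simple positional form (for example, `sal ox iex`) and splits statements on semicolons, which would break on semicolons inside strings. The sample script is hypothetical, including which command defines which alias.

```python
import re

# Matches "sal name value" / "nal name value" and the full cmdlet names.
ALIAS_DEF = re.compile(
    r"\b(?:sal|nal|Set-Alias|New-Alias)\s+(\w+)\s+([\w-]+)", re.IGNORECASE
)


def expand_aliases(script):
    """Return the alias table and the script with aliases expanded."""
    aliases = dict(ALIAS_DEF.findall(script))
    expanded = []
    for statement in (s.strip() for s in script.split(";")):
        if not statement or ALIAS_DEF.match(statement):
            continue  # skip empty statements and the definitions themselves
        for name, target in aliases.items():
            pattern = rf"(?<![\w$-]){re.escape(name)}(?![\w-])"
            statement = re.sub(pattern, target, statement, flags=re.IGNORECASE)
        expanded.append(statement)
    return aliases, "; ".join(expanded)


script = "nal cf New-Object; sal ox iex; $c = cf Net.WebClient; ox $payload"
aliases, readable = expand_aliases(script)
print(aliases)
print(readable)
```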

“.(‘yi’)(${aB})” returns another call to the ‘yi’ function, which in turn provides the following output:

And after some cleanup, we can finally get a semi-clear picture of what the dropper tries to do:

After going over the above code (and adding a few notes for myself along the way, which I left in the snippet), I finally reached a verdict regarding the dropper’s true intention: it retrieves the user’s ID, removes the hyphens it contains, and assembles a URL that looks like this: https://geronaga[.]com/gero?myHyphenLackingUID. It then downloads a file to the user’s temp directory, decodes and decrypts it, executes the file’s content using ‘regsvr32’ and then, finally, deletes this content to avoid leaving any traces.
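The URL assembly step is simple enough to express in a few lines of Python. The identifier below is a made-up value, since the exact form of the user's ID is not shown here, and the URL is kept defanged on purpose.

```python
def build_defanged_url(user_id):
    """Remove hyphens from the ID and append it as the query string."""
    return "https://geronaga[.]com/gero?" + user_id.replace("-", "")


# Hypothetical identifier, used only to show the hyphen removal.
print(build_defanged_url("1A2B-3C4D-5E6F"))
```

Because the query string is derived from the victim's identifier, every infected machine requests a different URL, so matching on the domain and path is more reliable than matching on a full URL.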

Since the domain is inactive and the focus of our blog is to present evasion techniques in Microsoft Office droppers, I did not expand my analysis of the downloaded file. However, since we know that ‘regsvr32’ is used to execute the file’s content and that the payload is a DLL, we can assume that the downloaded file contains a DLL registration command for the payload.

For a more expanded analysis of this dropper, you can read this excellent blog.

Less Complicated, More Files

Sometimes, simple obfuscation techniques can be sufficient to avoid detection, especially if the infection flow involves multiple stages and files written in different scripting languages, as demonstrated below in the analysis of an Emotet dropper from the malware family’s recent resurrection.

The Emotet dropper's VBA output. Note: some parts of the code were redacted since they are irrelevant to this blog; moreover, some of them are never executed.

As you can see, the VBA function “Cells” is used in this script to extract contents of specified Excel cells and use them in the VBA script. Without knowing what these cells contain, it is difficult to determine whether the file is malicious or not, especially since none of the commands seems damning enough.

To get a clearer picture, I replaced all the highlighted “Cells” function calls in the above code snippet with the matching string values, highlighted in yellow in the code snippet below.

This provided greater insight into the script’s functionality; the “Wscript.shell” string suggests Wscript will be used to execute additional commands, while "c:\programdata\ughldskbhn.bat" and "c:\programdata\yhjlswle.vbs" imply that Emotet uses these Batch and VBS files in this infection flow.

The strings highlighted in green in the above snippet are replaced in the lengthy strings extracted from the Excel cells by an empty string using the VBA “Replace” function. Padding parts of the actual commands with these strings decreases the chances of them being flagged during a static analysis. After the VBA “Replace” command is run, the following is received:
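The padding removal is easy to mimic outside of Office. In this Python sketch the padding token and the padded string are hypothetical stand-ins, not the values from the sample. Python's str.replace is case-sensitive, which matches the default binary comparison of VBA's Replace.

```python
def strip_padding(text, padding_tokens):
    """Remove every padding token, as Replace(text, token, "") would."""
    for token in padding_tokens:
        text = text.replace(token, "")
    return text


# Hypothetical padded command and padding token.
padded_command = r"wsQzXcriQzXpt c:\progrQzXamdata\file.vbs"
print(strip_padding(padded_command, ["QzX"]))
```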

With the information from the above decoded strings in hand, I could determine that the next stage in the infection flow is the VBS script, which the VBA dropper executes using “wscript.” Since there were no direct calls to the BAT script in the VBA code, I could assume that, if used, it would be executed from the VBS script.

Basically, the VBA dropper only creates the VBS and BAT files, writes content into each of them, and then the VBS script takes center stage.

c:\programdata\yhjlswle.vbs’s original content

As can be seen above, the VBS script contains several commands, all concatenated using colons. After separating the commands into different lines and activating the “replace” functions, I received the following:
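Separating colon-concatenated statements has one pitfall: colons also appear inside string literals, for example in drive letters. The Python sketch below splits only on colons outside double-quoted strings. The sample line is hypothetical and the function name is mine.

```python
def split_vbs_statements(line):
    """Split a VBS line on ':' separators that are outside string literals."""
    statements, current, in_string = [], [], False
    for ch in line:
        if ch == '"':
            in_string = not in_string
        if ch == ":" and not in_string:
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    statements.append("".join(current).strip())
    return [statement for statement in statements if statement]


# Hypothetical one-liner; note the colon inside the quoted path.
one_liner = r'Set s = CreateObject("WScript.Shell") : s.Run "c:\programdata\a.bat", 0 : x = 1'
for statement in split_vbs_statements(one_liner):
    print(statement)
```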

Basically, the script executes the previously created Batch file and then tries to execute “c:\programdata\bneuihlows.dll,” while providing it with the value “hjyldksfkw3” using rundll32. Since this is the first mention of “bneuihlows.dll” and the VBS file executes the Batch script before running the DLL, it is fair to assume that the BAT script is in charge of dropping the executable in the right location.

Just like the VBS file uses colons to concatenate commands, the BAT script uses ampersands to do the same:

In short, the script sets a few variables, and concatenates their values in the below command.

Which translates into the following:
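The variable concatenation can be resolved statically with a small amount of Python. This sketch is a simplification: it handles only the plain `set name=value` form, splits naively on ampersands, and ignores the question of when cmd.exe actually expands variables. The sample line and its variable names are hypothetical.

```python
import re

SET_COMMAND = re.compile(r"set\s+(\w+)=(.*)", re.IGNORECASE | re.DOTALL)
VARIABLE_REF = re.compile(r"%(\w+)%")


def expand_batch_line(line):
    """Collect 'set' assignments and expand %var% in the remaining commands."""
    variables, commands = {}, []
    for part in line.split("&"):
        part = part.lstrip()
        match = SET_COMMAND.fullmatch(part)
        if match:
            # Batch variable names are case-insensitive.
            variables[match.group(1).lower()] = match.group(2)
        elif part.strip():
            commands.append(part.strip())

    def lookup(match):
        return variables.get(match.group(1).lower(), match.group(0))

    return [VARIABLE_REF.sub(lookup, command) for command in commands]


# Hypothetical line; AAAA stands in for an encoded payload.
print(expand_batch_line("set a=power& set b=shell& %a%%b% -enc AAAA"))
```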

After base64 decoding the PowerShell script, I discovered how Emotet downloads their DLL payload and from where.
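For reference, a base64 blob passed to PowerShell as an encoded command can be decoded in Python as shown below. The sketch assumes the UTF-16LE text encoding that PowerShell's -EncodedCommand parameter expects. A blob that is decoded inside the script by other means might use a different encoding. The command here is a harmless placeholder.

```python
import base64


def decode_encoded_command(blob):
    """Decode a base64 blob that holds UTF-16LE encoded PowerShell text."""
    return base64.b64decode(blob).decode("utf-16-le")


# Build a harmless blob the same way PowerShell expects to receive it.
harmless_blob = base64.b64encode("Write-Host hi".encode("utf-16-le")).decode("ascii")
print(decode_encoded_command(harmless_blob))
```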

As can be seen below, the variable “MJXdfshDrfGZses4” contains a list of URLs which the script goes over using a “for” loop. Each time the “for” loop runs, it tries to download the Emotet DLL into "c:\programdata\bneuihlows.dll" using “Invoke-WebRequest.” Then, it checks if the downloaded file’s length is greater than 47436 bytes. If so, it means that the DLL was downloaded successfully, and the loop breaks.

The PowerShell code used to retrieve the Emotet payload
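The control flow of that loop can be summarized in a few lines of Python. The sketch takes the download function as a parameter so that it never touches the network, and treating a failed request as "try the next URL" is my assumption. Only the size threshold comes from the analysis above.

```python
MIN_PAYLOAD_SIZE = 47436  # size threshold, in bytes, taken from the analysis


def first_successful_download(urls, fetch):
    """Return (url, data) for the first response above the size threshold."""
    for url in urls:
        try:
            data = fetch(url)
        except Exception:
            continue  # assumption: a failed request moves on to the next URL
        if len(data) > MIN_PAYLOAD_SIZE:
            return url, data
    return None


# Stub downloader with made-up response sizes; no real requests are made.
fake_sizes = {"u1": 10, "u2": 50000, "u3": 60000}
result = first_successful_download(
    ["u1", "u2", "u3"], lambda url: b"x" * fake_sizes[url]
)
print(result[0])
```

The size check presumably lets the script tell a real DLL apart from a short error page returned by a distribution server that has been taken down.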

Interesting Cells and Where to Find Them

As the above analysis shows, storing the actual commands in Excel cells instead of in the VBA code itself can be a good way to avoid detection: when a static analysis mechanism goes over the VBA code, it cannot determine whether the executed content is malicious. Since reading Excel cells has benign uses in VBA code as well, a security product may deem such calls benign to avoid a false positive.

Of course, if the cell references are replaced with their content, the likelihood of detection increases. So I tried to find a way to replace the “Cells” function calls with the right strings without running the VBA code during the analysis.

During my research, which focused on OOXML files, I found two files, which Excel creates by default, that could help achieve this goal: “sharedStrings.xml” and “xl/worksheets/sheetName.xml.” 

The first file, “sharedStrings.xml,” contains all the strings in the Excel file. Its root element is the shared string table (sst), and each string item (si) element inside it, represented by the SharedStringItem class in the Open XML SDK, contains a text (t) element. The file contains unique strings, each representing the full content of one or more Excel cells.

A SharedStrings.xml example
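Reading the shared strings table takes only a few lines of Python. This is a minimal sketch, not the script described in this blog. It joins the text of plain strings and of rich-text runs, and the XML shown is a hand-written example rather than content from a sample.

```python
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by sharedStrings.xml and the sheet files.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def parse_shared_strings(xml_text):
    """Return the shared strings in index order."""
    root = ET.fromstring(xml_text)
    strings = []
    for si in root.findall("m:si", NS):
        # Plain strings hold one <t>; rich text is split into <r><t> runs.
        texts = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        strings.append("".join(t.text or "" for t in texts))
    return strings


shared_xml = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    "<si><t>WScript.Shell</t></si>"
    "<si><r><t>ab</t></r><r><t>cd</t></r></si>"
    "</sst>"
)
print(parse_shared_strings(shared_xml))
```

For a real workbook, the XML can be read without opening Excel, for example with `zipfile.ZipFile(path).read("xl/sharedStrings.xml")`.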

To match the strings to the right cells, we need a cell-to-string mapping, and this is where “xl/worksheets/sheetName.xml” comes into the picture. In OOXML Excel files, cells that contain data are mapped in an XML file found at the path “xl/worksheets/sheetName.xml.” For example, the cells of “sheet1” are mapped in “xl/worksheets/sheet1.xml.” Each of these cell-mapping files contains an element called “sheetData,” which contains a “row” element for each row in the sheet that contains data. Each “row” entry contains “c” (cell) entries. Cells that contain strings have their ‘t’ (type) attribute set to ‘s,’ and their ‘v’ (value) element contains an integer: the index, in “sharedStrings.xml,” of the ‘si’ element whose string the cell contains. Cells that contain other types of data, such as integers and floats, store the data directly in their ‘v’ element.

An example of an “xl/worksheets/sheetName.xml” file
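The mapping can be built with the same XML tooling. The sketch below is a minimal illustration rather than the script described in this blog. It relies on the ‘r’ attribute that Excel writes on each “c” element to hold the cell address, and the XML shown is hand-written.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def parse_sheet_cells(sheet_xml, shared_strings):
    """Map cell addresses such as 'B101' to their values."""
    root = ET.fromstring(sheet_xml)
    cells = {}
    for c in root.findall("m:sheetData/m:row/m:c", NS):
        v = c.find("m:v", NS)
        if v is None or v.text is None:
            continue  # the cell has no stored value
        if c.get("t") == "s":
            # String cell: <v> is an index into the shared strings table.
            cells[c.get("r")] = shared_strings[int(v.text)]
        else:
            cells[c.get("r")] = v.text
    return cells


sheet_xml = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheetData><row r="101">'
    '<c r="B101" t="s"><v>0</v></c>'
    '<c r="C101"><v>42</v></c>'
    "</row></sheetData></worksheet>"
)
print(parse_sheet_cells(sheet_xml, ["WScript.Shell"]))
```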

By writing a script that extracts that data, matches cells to their appropriate values, and replaces “Cells” function calls with these values, I could make the script less obfuscated and increase the likelihood of it being flagged by a static analysis mechanism. I also addressed the VBA “Replace” function issue and mimicked its functionality in my code.
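The substitution idea can be illustrated with a short Python sketch. This is not the script described above, only a minimal stand-in. It handles calls with literal row and column numbers, where the first argument is the row and the second is the column, and it leaves everything else untouched. The sample VBA line and cell value are hypothetical.

```python
import re

# Cells(row, column) with literal numbers, optionally followed by .Value.
CELLS_CALL = re.compile(
    r"\bCells\(\s*(\d+)\s*,\s*(\d+)\s*\)(?:\.Value)?", re.IGNORECASE
)


def column_letters(index):
    """Convert a 1-based column number to letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def inline_cells(vba_source, cells):
    """Replace Cells(row, col) calls with the quoted cell content."""
    def substitute(match):
        address = column_letters(int(match.group(2))) + match.group(1)
        if address not in cells:
            return match.group(0)  # unknown cell: keep the original call
        # VBA escapes a double quote inside a string by doubling it.
        return '"' + cells[address].replace('"', '""') + '"'
    return CELLS_CALL.sub(substitute, vba_source)


vba_line = "Set s = CreateObject(Cells(101, 2))"
print(inline_cells(vba_line, {"B101": "WScript.Shell"}))
```

Calls qualified with a worksheet object or built from variables would need extra handling that this sketch does not attempt.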

The script is still in the works and currently handles only the “cells,” “transpose,” and “replace” functions. In addition, it only works on OOXML files and expects to get the VBA code as an input (I used oledump to extract it from examined Office files). There is still much work to do and there are cases to address, such as variables in function calls (e.g., “cells($i, $j)”) and support for OLE files.

Prevention, Detection, and Everything in Between

Obfuscated droppers are more difficult to detect — they contain intentionally broken strings that evade static signatures, store malicious content in Excel cells, and use excessive comments in the hope of hiding their malicious content. But difficult does not mean impossible. Some patterns can still be signed statically, other behaviors can be detected dynamically, and if you want to take the bulldozer approach, you can just forbid all script executions (or at least most of them).

Conclusion

Deep Instinct’s agent uses deep learning to prevent malicious droppers, ensuring they can’t execute in your environment. The Deep Instinct Prevention Platform stops known, unknown, and zero-day threats with the highest accuracy and lowest false-positive rate in the industry. We stop attacks before they happen, identifying malicious files in <20ms, before execution.

If you’d like to see the platform in action for yourself, we’d be honored to show you what true prevention looks like. Please request a demo.

Indicators of Compromise (IoCs)

0042404ac9cbe7c082b9c0ae130e956ab7989cfa72a3f3b0c7f2226e23a6c6cb Emotet (Excel cells method) Office dropper

40a1e0aa0e580e2a15bbfd70ba4b89d3dd549bdc7bc075a223f12db0ddd2195d Emotet (Excel cells method) VBA code

ed7c68c3c103beaa7e5f30a3b70a52bb5428ce1498b7f64feda74342f93e16fe Emotet (excessive comments method) VBA code

028a5447d36c7445e3b24757d5cb37bafa54c5dfa7c3393fa69dd26e278442a4 Emotet (excessive comments method) Office dropper

9caed14e7f7d3e4706db2e74dc870abff571cce715f83ef91c563627822af6ad Dridex Office dropper

4f5ecf2c3073edd549e8ea2b1e65d8c478f3390567cffa3c909d328a3969ddd8 Dridex VBA code

cb9a5f0ad26cbb7b9f510b80df97f0045d7232d31cfde3cbce095d1c88c90e89 Aggah VBA code