This blog is the first in a 3 part series that will provide an in-depth technical analysis on the H1N1 malware. I'll be looking at how H1N1 has evolved, its obfuscation, analyzing its execution including new information stealing and user account control bypass capabilities, and finally exploring how we are both using and influencing security tools with this research.
Through the use of general characteristics exhibited by malware authors we are able to broadly categorize and positively identify malicious samples. These characteristics, discussed inThe General Behavior ofRansomware are indexed in a database, which allows us to identify patterns, outliers and obtain greater visibility and insight into various threats.
These data sets provide insight into the ever-growing attack vectors that affect our customers, which include malware delivery mechanisms. In this blog series we highlight newly added functionality to a malware variant that started out as being a 'loader' (strictly provides capabilities of loading other more complex malware variants) known as H1N1, and has now evolved into an information stealing variant.
Throughout the data mining exercises conducted by my colleagues and I on the AMP Threat Grid Research & Efficacy Team (RET) we have observed a widely distributed campaign using VBA macros to infect machines with a variant of information-stealing malware. Based on the initial characteristics observed by AMP Threat Grid we believed these malicious documents were distributing a Ransomware variant; however, we later found the dropped executables to be a variant of the H1N1 loader. H1N1 is a loader malware variant that has been known to deliver Pony DLLs and Vawtrak executables to infected machines. Upon infection, H1N1 previously only provided loading and system information reporting capabilities.1,2
H1N1 has added a plethora of new functionality in comparison to earlier reports. Throughout this blog series we will be analyzing the capabilities of H1N1 including: obfuscation, a User Account Control (UAC) bypass, information stealing, data exfiltration, loader/dropper, and self-propagation/lateral movement techniques used by this variant.1,2
The use of Visual Basic macros is nothing new, however, in recent months they have become one of the most popular infection vectors for all malware types, especially for Ransomware campaigns. These macros vary in sophistication from performing the download and execution of hosted binaries, to dropping the binaries themselves. In this campaign we see the latter where the document ships an entire encoded binary within the text box of a VBA macro form. All documents throughout this campaign have used a common naming convention in the following formats:
The domains for the first format observed include the financial, energy, communications, military and government sectors. Unsurprisingly, these documents are delivered through spear-phishing e-mail campaigns. A number of subject headings can be observed in VirusTotal:
Figure 1.0: Attached e-mail subject headings in VirusTotal for identified documentsAlthough the specified domain in the filename differentiates between targets, the lure message within the phishing e-mail does not vary drastically, for example:
Figure 2.0: Example phishing message within attached e-mailThe remaining formats appear to simply seem enticing enough to open being related finance, corporate or personal information.
Upon opening the document, the attacker attempts to social engineer the user into executing the malicious macro content by stating it will adjust to their version of Microsoft Word:
Figure 3.0: Social engineering content of document to open macrosThe VBA macro is highly obfuscated, making use of many VBA tricks to hide its true intent. These include the use of string functions: StrReverse, Ucase, Lcase, Right, Mid, and Left. For example, the following gets the %temp% path:
Figure 4.0: String obfuscation mechanisms to get %temp%Mid is used here to produce ".Scripting", Ucase and StrReverse are used to produce "FIleSystemObject", which is used to create a VBA FileSystemObject, that is then used with GetSpecialFolder, and some basic arithmetic resulting in "2" to get %temp%.As mentioned, the binary to be executed is extracted from a VBA form text box:
Figure 5.0: VBA form containing obfuscated PE within text boxThe text box content is set into a variable, which is then passed off to a de-obfuscation function. The core de-obfuscation functionality is a two steps process. The first is an XOR loop with a fixed byte key of 0xE, which produces a base64 encoded portable executable (PE):
Figure 6.0: XOR decoding/de-obfuscation loopThe second is a VBA implementation of base64 that decodes it to produce a final Portable Executable (PE):
Figure 7.0: VBA Base64 implementationThe de-obfuscated executable is then written to %temp% and executed. We can follow the execution flow through the use of process visualization in AMP Threat Grid. What this provides is graphed process interactions (child-parent relationships) for the entirety of the run. In the case of the H1N1 malicious document, it is very apparent that WINWORD.EXE is executing a separate binary:
Figure 8.0: Process graph showing execution of dropped executable from Microsoft WordThe binary has a total of three routines responsible for unpacking and injection. The first routine injects via the following steps:
We can then dump this to disk for analysis of the next unpacking stage. The next routine makes use of the injection method used by Duqu to write its unpacked image3:
In order to continue to trace the execution of this code (to what we discovered was more unpacking code) we wrote 0xEBFE (relative JMP to offset 0) to the entry point of the newly written Explorer.exe. This causes Explorer.exe to spin until we can attach to this process with a debugger.
Breaking on the first VirtualAlloc performed by the injected process enabled us to see a large allocation occur, and setting a breakpoint on writing to this memory location makes it apparent that an entire DLL is written to this memory location by the (current) unpacking code:
Figure 10.0: Upack MZ to be injectedLooking at the PE header the string "UpackByDwing" is apparent which indicates that this packer is being used on the final binary. Opening up this code with a disassembler (in this case IDA Pro) showed the following jump that could not be followed when the functions were graphed:
Figure 11.0: Function graph for final Upack unpacking stageThere is an infamous POPAD prior to this jump, which for those seasoned unpackers, is indicative of leading to the OEP of an unpacked binary due to restoring of the register state prior to the unpacked code being called. If a breakpoint is set on the OEP identified and we continue to trace through the injected code within Explorer.exe, it becomes clear that this address is eventually called from the unpacking code. At this point, once the breakpoint is hit, we can dump the unpacked binary to disk.
One final hurdle is required in order to get an independent executable that can be debugged. When the binary is written and jumped to, a pointer argument is passed on the stack that is later dereferenced within the binary. This is provided when the binary is unpacked from the injected Explorer.exe, however a null pointer is passed when the binary is executed independently. This argument points to a size value of 0x31DB used for a call to VirtualAlloc. We can edit the unpacked code in-line to point to a known address with this value:
Figure 12.0: In-line edits to allow independent binary executionI'm only going to cover the obfuscation techniques used by H1N1 in this blog. The remaining analysis of H1N1 will be posted in my next blog.
Upon opening the binary in a disassembler (in this case IDA Pro) we see that imports are resolved dynamically using hashing of DLLs and exports, and a string obfuscation technique used throughout the binary.
The string obfuscation technique makes use of SUB, XOR, and ADD with fixed DWORD values, and the result of each step using is stored using STOSD. The result of each operation is then used as the input (within EAX) for each subsequent step. For example:
Figure 13.0: String obfuscation technique exampleThe result of these operations produces the path to the WOW64 version of svchost.exe. We've written an IDAPython script to automatically decode these strings from a provided address starting with the XORing of EAX, performing operations on the DWORDs involved up to a certain "depth" (as strings vary in length), and adding the resulting string as a comment next to the next instruction head.4
Hashed imports can be resolved by hashing the library export names ourselves. Import name strings are obfuscated using the technique mentioned above, and export names from each library are hashed by walking the export table and performing a simple XOR and ROL loop over each name:
for(i = 0; i < strlen(export_name); i++) {
r = rol32(r, 7);
r ^= export_name[i];
}
We've replicated the hashing algorithm and all exports can be hashed from a given DLL. These hash values can be mapped within IDA using a C header file generated by our python script.5
In the next blog I'll provide the analysis of H1N1's execution. Stay tuned!
[1] https://www.proofpoint.com/tw/threat-insight/post/hancitor-ruckguv-reappear
[2] https://www.arbornetworks.com/blog/asert/wp-content/uploads/2015/06/blog_h1n1.pdf
[3] http://blog.w4kfu.com/tag/duqu
[4] https://communities.cisco.com/docs/DOC-69444
[5] https://communities.cisco.com/docs/DOC-69443