Windows PE Malware Analysis Part II

12 minute read

Overview

In Part I we conducted static analysis using basic tools and techniques on a malicious Windows portable executable and came up with several findings. In this article, we will continue our analysis using IDA Pro to see if we can validate those findings as well as uncover additional functionality of the binary.

IDA Pro

IDA, the Interactive Disassembler, is a tool created by Hex-Rays that “…is capable of creating maps of their execution to show the binary instructions that are executed by the processor in a symbolic representation (assembly language).” In other words, IDA takes a compiled binary (EXE, ELF, Mach-O, etc.) and breaks it down into assembly-level instructions, which makes it easier for reverse engineers to analyze.

IDA is known for its Fast Library Identification and Recognition Technology (F.L.I.R.T.), which identifies standard library functions produced by multiple compilers. It comes in Pro, Home, and Free versions. For this article I will be using the Pro version, so the screenshots may show some labeling that does not appear in the Free version. The minimum functionality needed for our analysis is the standard x86 disassembler, which comes with all versions.

Loading the PE in IDA

When we load the binary in IDA, we get a pop-up window with several options to check before the analysis starts. By default, IDA will not load the .rsrc section of the binary. To ensure that the section is loaded, we need to check the Load resources box at the bottom right of the window. If you recall from Part I, we discovered a DLL in the .rsrc section, so we want to verify it in IDA as well and see whether it is referenced anywhere.

image

Once IDA has finished its analysis of the PE, we are presented with the default view which has several tabs such as the IDA View, Hex View, Structures, Enums, Imports, Functions, and Exports. I won’t do a tutorial on IDA but I will give some quick explanations on features that are relevant to our analysis.

If we take a look at the Functions box, we see a list of function names, and if you scroll a bit to the right you will see some info about each function such as its segment and start address. Functions labeled sub_XXXXXXXX (XXXXXXXX = function address) are functions for which IDA has neither a symbol nor a F.L.I.R.T. signature, which means we will have to reverse the function and label it ourselves. The other functions, which are highlighted and have names, are ones that IDA matched with a F.L.I.R.T. signature or identified from information it parsed in the binary, like the WinMain(x,x,x,x) function.

image

The next box we should be familiar with is the Imports tab. It lists every function the binary imports from each DLL, along with the address of its import entry.

image

You can press Ctrl+F to search for a specific function, double-click it, and jump straight to its location. From there you can press the x key to list all the cross-references (xrefs) to that function in the code. Below is an example of the xrefs to ReadFile:

image

These are the basics for now. We will cover more as we go. You should take some time learning how to use IDA and discover more about what IDA has to offer.

Analysis

Embedded DLL

The first thing we will do is locate and verify the DLL we found in the .rsrc section. We can do this by searching for the magic bytes MZ. Press Alt+B and search for the string "MZ", making sure the Find all occurrences and Hex boxes are both checked. Click OK and you will be taken to a list of all places where MZ is found. (NOTE: Ensure the string is wrapped in quotes or you might get a pop-up stating a bad digit was found in the input.)

Below we can see an occurrence in the .rsrc section. Double click it to go to the offset.

image

image

Click the hex view to get a different look. IDA automatically syncs the IDA and Hex view for easier switching.

image

We could search for xrefs to the offset where the DLL begins, but this yields no results. Since ASLR is enabled on the binary, it will load at a different base address at runtime, so we could also search for just the lower two bytes of the address (9C30) using the byte search feature from earlier, but this does not turn up anything either.

image

image

More investigation will have to be done during code analysis to see how this DLL is used.

Stack Strings (the hard way)

Throughout the binary there are numerous locations containing a series of hex bytes that are placed on the stack in 4-byte increments, then XOR’d with a 16-byte key. See the example below at the beginning of the WinMain() function:

image

The main focus for the rest of this article will be deobfuscating stack strings. First, I will demonstrate how to manually decode the strings, and then we will use the FLARE Obfuscated String Solver, a.k.a. FLOSS, by FireEye to do it automatically. (NOTE: FLOSS comes preinstalled in FlareVM.)

We will start by observing the first block of the WinMain function at address 0x00404780. After the function prologue, starting at address 0x0040478B we see the first 4 bytes of the string placed on the stack at offset [esp+70h+ModuleName] and the address is placed into the eax register.

image

This is the beginning of the string. If we try to convert it to ASCII as it is, we get unreadable nonsense, as seen in CyberChef. (NOTE: x86 is little-endian, so the bytes of each 4-byte value are stored in memory in reverse order from how they appear in the instruction.)
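To make the byte-order point concrete, here is a small sketch of how a run of 4-byte immediates ends up as a byte sequence in memory. The two values are hypothetical and were chosen so the result is readable; they are not the obfuscated values from the sample.

```python
import struct


def dwords_to_bytes(dwords):
    """Lay out 32-bit immediates the way consecutive dword stores would in memory."""
    return b"".join(struct.pack("<I", d) for d in dwords)


# Hypothetical immediates (not taken from the sample).
example = [0x6E72656B, 0x32336C65]
print(dwords_to_bytes(example))
```

The first immediate reads 6E 72 65 6B in the instruction but lands in memory as 6B 65 72 6E, which is why the bytes must be reversed per 4-byte group before pasting them into CyberChef.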

image

The next three instructions store the rest of the string on the stack. If we include them in CyberChef, we get even more nonsense and still no readable ASCII strings. After the string is stored on the stack, the next instruction loads all 16 bytes of it into the xmm1 register: movaps xmm1, xmmword ptr [esp+70h+ModuleName] at address 0x004047AF.

Next, there is a push instruction for the address of the phModule parameter. If you haven’t noticed by now, IDA recognizes a call to GetModuleHandleExA at address 0x004047E7. GetModuleHandleExA returns a handle to a specified module (DLL). Because IDA has type information for that function, it also knows what parameters it takes and automatically labels them as they are pushed on the stack. phModule is the third argument to the function, therefore it is pushed on the stack first. (NOTE: Under the x86 stdcall convention used by the Windows API, arguments are pushed right to left, so the first argument sits at the top of the stack when the call executes.) Below is how GetModuleHandleExA is declared in C++:

BOOL GetModuleHandleExA(
  [in]           DWORD   dwFlags, 
  [in, optional] LPCSTR  lpModuleName,
  [out]          HMODULE *phModule
);

It is good to have a general understanding of what might be happening with this block of code as we deobfuscate, because it helps us form assumptions about what the string could be. Since we know GetModuleHandleExA is being called, we can assume that the string might be the name of a DLL.
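The push order for this call can be modeled in a few lines. This is only an illustrative simulation of right-to-left argument pushing; the helper name and placeholder values are assumptions, and nothing here calls the real API.

```python
def push_args_right_to_left(args):
    """Simulate pushing arguments right to left; the returned list has the stack top last."""
    stack = []
    for name, value in reversed(args):
        stack.append((name, value))  # each push lands on top of the previous one
    return stack


# Argument order as declared in the prototype; values are placeholders.
declared = [
    ("dwFlags", 0),
    ("lpModuleName", "<pointer to decoded string>"),
    ("phModule", "<pointer to HMODULE>"),
]

stack = push_args_right_to_left(declared)
print("pushed first:", stack[0][0])
print("top of stack at call:", stack[-1][0])
```

This matches what the disassembly shows: phModule is pushed first, and dwFlags is pushed last just before the call.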

The next four instructions starting at address 0x004047B9 place 16 bytes on the stack at stack offset [esp+74h+var_20]. In little-endian this is 75 00 D2 55 02 41 17 97 76 CC A0 8A 31 A0 2F DE. This is the key used to deobfuscate the 16-byte string we saw earlier. How do we know this? Take a look at the next instruction: pxor xmm1, [esp+74h+var_20]

image

The pxor instruction performs a bitwise exclusive OR on its two operands and stores the result in the first one, here the xmm1 register. The xmm1 register currently holds our obfuscated string, but after the pxor operation it will hold something different. Let’s use CyberChef again and see if we can decode the string into something more readable:

image

Just as we suspected, the string now becomes kernel32.dll followed by four zero bytes (displayed as ....). Those bytes are zero because the last four bytes of the obfuscated string and of the key are identical, and XORing identical bytes yields zero.
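The same decode can be reproduced in a few lines of Python. The key is the one quoted in the text. The obfuscated bytes are only visible in the screenshot, so this sketch makes an assumption: it reconstructs them by XORing the known result with the key, then decodes them again the way pxor does. The helper name xor16 is hypothetical.

```python
# Key bytes as quoted in the text, in memory order.
KEY = bytes.fromhex("75 00 D2 55 02 41 17 97 76 CC A0 8A 31 A0 2F DE")


def xor16(block: bytes, key: bytes) -> bytes:
    """Byte-wise XOR of two 16-byte values, the same result pxor produces."""
    if len(block) != 16 or len(key) != 16:
        raise ValueError("expected two 16-byte operands")
    return bytes(b ^ k for b, k in zip(block, key))


# Assumption: rebuild the obfuscated block from the known plaintext,
# because the original bytes are not reproduced in the article text.
plain = b"kernel32.dll".ljust(16, b"\x00")
obfuscated = xor16(plain, KEY)

decoded = xor16(obfuscated, KEY)
print(obfuscated.hex(" "))
print(decoded.rstrip(b"\x00").decode("ascii"))
```

Note that the last four bytes of the reconstructed block equal the last four key bytes, which is exactly the cancellation described above.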

At address 0x004047DF, the register eax, which holds a pointer to the beginning of the string on the stack, is pushed onto the stack as the lpModuleName argument, which we now know is the string kernel32.dll. Next, the dwFlags argument, which is 0, is pushed onto the stack. Lastly, the decoded 16 bytes held in the xmm1 register are written back to the stack at xmmword ptr [esp+7Ch+ModuleName], and finally GetModuleHandleExA is called. We can add our own comments by highlighting a line and pressing Shift+;. I will add a comment to show our string’s true value is kernel32.dll:

image

We could repeat these steps across the entire binary, but that would be a very long and tedious process. Fortunately for us, there is an automated solution.

Stack Strings (the easy way)

The folks over at FireEye Labs have developed a tool called the FLARE Obfuscated String Solver a.k.a FLOSS. FLOSS is a Windows binary included with FlareVM that takes care of all the heavy lifting for us and simply takes a binary as input.

NOTE: Before copying any malware to your FlareVM, ensure you are set to a Host-only network adapter to prevent potential infections on your network.

Copy the specimen to your FlareVM and, in a command prompt, type floss.exe Setup.exe:

FLARE Tue 10/26/2021 12:54:54.25
C:\Users\User\Desktop\Setup.bin>floss Setup.exe

FLOSS will output all normal strings, encoded strings, and stack strings that it finds to the console. We are mostly interested in the stack string section. Take a look at all the stack strings FLOSS discovered automatically!

image

Complete output:

FLOSS extracted 65 stackstrings
InternetCloseHandle
null
WinHttpQueryDataAvailable
VirtualAlloc
CreateThread
.dll
GetProcessHeap
HttpSendRequestA
Kernel32.dll
DeleteFileA
InternetConnectA
https://
wininet.dll
*!Nb
WINHTTP.dll
HOST:
D/.@
WinHttpOpen
InternetQueryOptionA
CharNextA
3nPQ+
WinHttpSendRequest
http://45.133.1.107/server.txt
InternetSetOptionA
.dll
User32.dll
wininet.dll
GetModuleHandleA
GetLastError
HttpOpenRequestA
CloseHandle
SetPriorityClass
WinHttpQueryHeaders
InternetCloseHanO
51.178.186.149
SHGetFolderPathA
_Thttp://wfsdragon.ru/api/setStats.php
InternetOpenA
GetCurrentProcess
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36
Winhttp.dll
.dll
HeapAlloc
HttpQueryInfoA
URL:
ZBAA
WinHttpReceiveResponse
http://
InternetReadFile
WinHttpReadData
kernel32.dll
HEAD
Shell32.dll
LoadLibraryA
abcdefghijklmnopqrstuvwxyz0123456789_ABCDEFGHIJKLMNOPQRSTUVWXYZ
/base/api/statistics.php
Sleep
WinHttpCloseHandle
GetCurrentProcesu
pastebin.com/raw/A7dSG1te
InternetOpenUrlA
WinHttpQueryHead
ineIGenu
WinHttpOpenRequest
WinHttpConnect

Finished execution after 41.094000 seconds

This is a lot of great info that we can use to form better assumptions and a firmer hypothesis about the features of this malware. We can also start to build out a list of potential IOCs (Indicators of Compromise) from the discovered URLs, IP addresses, and files.
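As a starting point for that IOC list, the URLs and IP addresses can be pulled out of the FLOSS output with a couple of regular expressions. This is a rough sketch: the patterns and the extract_iocs helper are assumptions of mine, and the sample list is a handful of strings copied from the output above.

```python
import re

URL_RE = re.compile(r"https?://[^\s\"']+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_iocs(strings):
    """Return (sorted URLs, sorted IPv4 addresses) found in the given strings."""
    urls, ips = set(), set()
    for s in strings:
        urls.update(URL_RE.findall(s))
        for ip in IPV4_RE.findall(s):
            # Reject dotted quads with octets above 255
            if all(int(octet) <= 255 for octet in ip.split(".")):
                ips.add(ip)
    return sorted(urls), sorted(ips)


# A few stack strings copied from the FLOSS output.
sample = [
    "http://45.133.1.107/server.txt",
    "51.178.186.149",
    "_Thttp://wfsdragon.ru/api/setStats.php",
    "pastebin.com/raw/A7dSG1te",
    "Kernel32.dll",
    "https://",
]

urls, ips = extract_iocs(sample)
print(urls)
print(ips)
```

Strings without a scheme, such as pastebin.com/raw/A7dSG1te, are not caught by the URL pattern and still need a manual pass.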

FLOSS can also generate an IDAPython script that will do its best to comment all the discovered strings in an .idb file. Just run the command: floss Setup.exe -i setup.exe.py. Below is an example of the generated Python script:


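# Assumed imports for IDA 7.x IDAPython; this excerpt does not show the
# script's own header, so adjust them to your IDA version if needed.
import ida_kernwin
from idc import (get_cmt, set_cmt, get_func_attr, get_member_cmt,
                 set_member_cmt, FUNCATTR_FRAME, FUNCATTR_FRSIZE)


def AppendComment(ea, string, repeatable=False):
    # Append `string` to the comment at address `ea`, skipping duplicates.
    current_string = get_cmt(ea, repeatable)

    if not current_string:
        cmt = string
    else:
        if string in current_string:  # ignore duplicates
            return
        # keep the existing comment and add the new string on its own line
        cmt = current_string + "\n" + string
    set_cmt(ea, cmt, repeatable)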


def AppendLvarComment(fva, frame_offset, s, repeatable=False):
    stack = get_func_attr(fva, FUNCATTR_FRAME)
    if stack:
        lvar_offset = get_func_attr(fva, FUNCATTR_FRSIZE) - frame_offset
        if lvar_offset and lvar_offset > 0:
            string = get_member_cmt(stack, lvar_offset, repeatable)
            if not string:
                string = s
            else:
                if s in string:  # ignore duplicates
                    return
                string = string + "\n" + s
            if set_member_cmt(stack, lvar_offset, string, repeatable):
                print('FLOSS appended stackstring comment \"%s\" at stack frame offset 0x%X in function 0x%X' % (s, frame_offset, fva))
                return
    print('Failed to append stackstring comment \"%s\" at stack frame offset 0x%X in function 0x%X' % (s, frame_offset, fva))


def main():
    print('Annotating 65 strings from FLOSS for Setup.exe')
    print("Imported decoded strings from FLOSS")
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetCloseHandle", True)
    AppendLvarComment(4203616, 156, "FLOSS stackstring: null", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpQueryDataAvailable", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: VirtualAlloc", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: CreateThread", True)
    AppendLvarComment(4223952, 300, "FLOSS stackstring: .dll", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: GetProcessHeap", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: HttpSendRequestA", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: Kernel32.dll", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: DeleteFileA", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetConnectA", True)
    AppendLvarComment(4210064, 316, "FLOSS stackstring: https://", True)
    AppendLvarComment(4210064, 300, "FLOSS stackstring: wininet.dll", True)
    AppendLvarComment(4247280, 21, "FLOSS stackstring: *!Nb", True)
    AppendLvarComment(4200704, 28, "FLOSS stackstring: WINHTTP.dll", True)
    AppendLvarComment(4204880, 380, "FLOSS stackstring: HOST:", True)
    AppendLvarComment(4227952, 273, "FLOSS stackstring: D/.@", True)
    AppendLvarComment(4216496, 44, "FLOSS stackstring: WinHttpOpen", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetQueryOptionA", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: CharNextA", True)
    AppendLvarComment(4216496, 42, "FLOSS stackstring: 3nPQ+", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpSendRequest", True)
    AppendLvarComment(4204880, 428, "FLOSS stackstring: http://45.133.1.107/server.txt", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetSetOptionA", True)
    AppendLvarComment(4199408, 348, "FLOSS stackstring: .dll", True)
    AppendLvarComment(4229024, 44, "FLOSS stackstring: User32.dll", True)
    AppendLvarComment(4200704, 44, "FLOSS stackstring: wininet.dll", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: GetModuleHandleA", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: GetLastError", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: HttpOpenRequestA", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: CloseHandle", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: SetPriorityClass", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpQueryHeaders", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetCloseHanO", True)
    AppendLvarComment(4204880, 287, "FLOSS stackstring: 51.178.186.149", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: SHGetFolderPathA", True)
    AppendLvarComment(4204880, 478, "FLOSS stackstring: _Thttp://wfsdragon.ru/api/setStats.php", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: InternetOpenA", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: GetCurrentProcess", True)
    AppendLvarComment(4227952, 204, "FLOSS stackstring: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36", True)
    AppendLvarComment(4216496, 28, "FLOSS stackstring: Winhttp.dll", True)
    AppendLvarComment(4210064, 348, "FLOSS stackstring: .dll", True)
    AppendLvarComment(4229024, 44, "FLOSS stackstring: HeapAlloc", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: HttpQueryInfoA", True)
    AppendLvarComment(4208768, 156, "FLOSS stackstring: URL:", True)
    AppendLvarComment(4261746, 24, "FLOSS stackstring: ZBAA", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpReceiveResponse", True)
    AppendLvarComment(4219056, 108, "FLOSS stackstring: http://", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetReadFile", True)
    AppendLvarComment(4216496, 44, "FLOSS stackstring: WinHttpReadData", True)
    AppendLvarComment(4212608, 28, "FLOSS stackstring: kernel32.dll", True)
    AppendLvarComment(4224848, 76, "FLOSS stackstring: HEAD", True)
    AppendLvarComment(4200928, 60, "FLOSS stackstring: Shell32.dll", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: LoadLibraryA", True)
    AppendLvarComment(4215024, 71, "FLOSS stackstring: abcdefghijklmnopqrstuvwxyz0123456789_ABCDEFGHIJKLMNOPQRSTUVWXYZ", True)
    AppendLvarComment(4208768, 92, "FLOSS stackstring: /base/api/statistics.php", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: Sleep", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpCloseHandle", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: GetCurrentProcesu", True)
    AppendLvarComment(4204880, 364, "FLOSS stackstring: pastebin.com/raw/A7dSG1te", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetOpenUrlA", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpQueryHead", True)
    AppendLvarComment(4233986, 8, "FLOSS stackstring: ineIGenu", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpOpenRequest", True)
    AppendLvarComment(4216496, 44, "FLOSS stackstring: WinHttpConnect", True)
    print("Imported stackstrings from FLOSS")
    ida_kernwin.refresh_idaview_anyway()

if __name__ == "__main__":
    main()

Here is what IDA looks like after the script is run:

image
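The function addresses in the generated script are written in decimal, which makes them hard to match against IDA's hexadecimal addresses. The sketch below is a hypothetical helper, not part of FLOSS, that parses AppendLvarComment lines and groups the strings by function address in hex. The sample lines are copied from the script above.

```python
import re
from collections import defaultdict

CALL_RE = re.compile(
    r'AppendLvarComment\((\d+),\s*(\d+),\s*"FLOSS stackstring: (.*)",\s*True\)'
)


def group_by_function(script_text):
    """Map hex function address -> list of (frame offset, decoded string)."""
    grouped = defaultdict(list)
    for fva, frame_off, s in CALL_RE.findall(script_text):
        grouped[hex(int(fva))].append((int(frame_off), s))
    return dict(grouped)


sample = '''
    AppendLvarComment(4212608, 28, "FLOSS stackstring: kernel32.dll", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetCloseHandle", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: GetProcessHeap", True)
'''

for func, entries in group_by_function(sample).items():
    print(func, entries)
```

Decimal 4212608 is 0x404780, the WinMain address we decoded by hand earlier, so the FLOSS result for kernel32.dll lines up with the manual analysis.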

Play around with FLOSS’s features and discover more of its functionality.

Part II Conclusion

In Part II of this series we became familiar with IDA, validated a previous finding from Part I, and learned how to deobfuscate stack strings, which malware authors use to bypass signature-based scanners, both manually and with an automated tool.

We could probably spend days digging through this binary using static analysis alone, but to speed things up we can perform behavioral analysis to capture all modifications to the OS as well as network traffic. We can also use code analysis to automatically discover obfuscated strings and view content before it is encoded or encrypted.

In Part III of this series we will be performing code analysis using x64dbg in FlareVM. If you have any questions please reach out via LinkedIn, Discord, or email.