Windows PE Malware Analysis Part II
Overview
In Part I we conducted static analysis on a malicious Windows Portable Executable (PE) using basic tools and techniques, and came up with several findings. In this article, we will continue our analysis using IDA Pro to see if we can validate those findings and uncover additional functionality of the binary.
IDA Pro
IDA, or the Interactive Disassembler, is a tool created by Hex-Rays that “…is capable of creating maps of their execution to show the binary instructions that are executed by the processor in a symbolic representation (assembly language).” In other words, IDA can take a compiled binary (EXE, ELF, Mach-O, etc.) and break it down into assembly-level instructions, which makes it easier for reverse engineers to analyze.
IDA is known for its Fast Library Identification and Recognition Technology (FLIRT), which identifies standard library functions produced by multiple compilers. It comes in Pro, Home, and Free versions. For this article, I will be using the Pro version, so some of the labeling seen in the screenshots to come may not appear in the Free version. The minimum functionality needed to accomplish our analysis is the standard x86 disassembler, which comes with all versions.
Loading the PE in IDA
When we load the binary in IDA, we get a pop-up window with several options to check before the analysis starts. By default, IDA will not load the .rsrc section of the binary. To ensure that the section is loaded, we need to check the Load resources box at the bottom right of the window. If you recall, in Part I we discovered a DLL in the .rsrc section, so we want to verify that in IDA as well and see whether it is referenced at all.
Once IDA has finished its analysis of the PE, we are presented with the default view which has several tabs such as the IDA View, Hex View, Structures, Enums, Imports, Functions, and Exports. I won’t do a tutorial on IDA but I will give some quick explanations on features that are relevant to our analysis.
If we take a look at the Functions box, we see a list of function names, and if you scroll a bit to the right you will see some info about each function, such as its segment and start address. Functions labeled sub_XXXXXXXX (XXXXXXXX = function address) are functions for which IDA has neither a symbol nor a FLIRT signature, which means we will have to reverse the function and label it ourselves. Other functions that are highlighted and have names are ones that IDA has a FLIRT signature for, or that it identified from information in the binary, like the WinMain(x,x,x,x) function.
The next box we should be familiar with is the Imports tab. It lists the functions imported from each DLL, along with the address of each import.
You can press Ctrl+F to search for a specific function, double-click it, and automatically jump to its location. From there you can press the x key to find all the cross-references (xrefs) to that function in the code. Below is an example of the xrefs to ReadFile:
These are the basics for now. We will cover more as we go. You should take some time learning how to use IDA and discover more about what IDA has to offer.
Analysis
Embedded DLL
The first thing we will do is locate and verify the DLL we found in the .rsrc section. We can do this by searching for the magic bytes MZ. To do this, press Alt+B and search for the string "MZ", making sure the Find all occurrences box is checked along with the Hex box. Click OK and you will be taken to a list of all places where MZ is found. (NOTE: Ensure it is wrapped in quotes or you might get a pop-up stating a bad digit was found in the input.)
Below we can see an occurrence in the .rsrc section. Double-click it to go to the offset.
Click the hex view to get a different look. IDA automatically syncs the IDA and Hex view for easier switching.
We could search for xrefs to the offset where the DLL begins, but this yields no results. Because ASLR is enabled on the binary, it will load at a different base address at runtime, so we could instead search for just the lower two bytes of the offset (9C30) using the byte search feature we used earlier, but this does not turn up anything either.
More investigation will have to be done during code analysis to see how this DLL is used.
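The same MZ search can be approximated outside IDA. The following is a minimal Python sketch, not part of the original workflow, that scans a buffer for MZ and keeps only the hits whose e_lfanew field (the DWORD at offset 0x3C of the DOS header) points at a PE\0\0 signature. This filters out the many accidental MZ byte pairs a raw search returns. The function name find_embedded_pe is an assumption made for illustration.

```python
import struct


def find_embedded_pe(data: bytes) -> list:
    """Return offsets of every 'MZ' that is followed by a valid PE signature."""
    hits = []
    pos = data.find(b"MZ")
    while pos != -1:
        # The DOS header is 0x40 bytes; e_lfanew (offset of the PE header,
        # relative to the 'MZ') is the little-endian DWORD at offset 0x3C.
        if pos + 0x40 <= len(data):
            (e_lfanew,) = struct.unpack_from("<I", data, pos + 0x3C)
            pe_off = pos + e_lfanew
            if data[pe_off:pe_off + 4] == b"PE\x00\x00":
                hits.append(pos)
        pos = data.find(b"MZ", pos + 1)
    return hits
```

Run against the raw bytes of the specimen (for example, the contents of Setup.exe read in binary mode), offset 0 should be the host executable itself, and any later offsets are candidates for embedded images such as the DLL in .rsrc. Note that these are file offsets, not the virtual addresses IDA displays.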
Stack Strings (the hard way)
Throughout the binary there are numerous locations where a series of hex bytes is placed on the stack in 4-byte increments and then XOR'd with a 16-byte key. See the example below at the beginning of the WinMain() function:
The main focus for the rest of this article will be deobfuscating stack strings. First, I will demonstrate how to manually decode the strings, and then we will use the FLARE Obfuscated String Solver, a.k.a. FLOSS, by FireEye to do it automatically. (NOTE: FLOSS comes preinstalled in FlareVM.)
We will start by observing the first block of the WinMain function at address 0x00404780. After the function prologue, starting at address 0x0040478B, we see the first 4 bytes of the string placed on the stack at [esp+70h+ModuleName], and the address of that location is placed into the eax register.
This is the beginning of the string. If we try to convert it to ASCII as it is, we get unreadable nonsense, as seen in CyberChef. (NOTE: x86 is little-endian, so the bytes of each 4-byte value sit in memory in the reverse of the order in which the value is written in the disassembly.)
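To see why the byte order matters, here is a small Python sketch. The 32-bit value is hypothetical, chosen so that the result is readable, and is not taken from the sample. The helper name dword_to_memory_bytes is likewise an assumption.

```python
import struct


def dword_to_memory_bytes(value: int) -> bytes:
    """Return the 4 bytes a little-endian CPU stores for a 32-bit immediate."""
    return struct.pack("<I", value)


# Hypothetical immediate, written the way a disassembler prints it: 6E72656Bh
imm = 0x6E72656B

print(dword_to_memory_bytes(imm))   # bytes in memory order
print(imm.to_bytes(4, "big"))       # bytes in the order the number is written
```

The first line prints b'kern' and the second b'nrek'. This is why each 4-byte group copied from the disassembly has to be reversed before it is pasted into a tool like CyberChef.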
The next three instructions store the rest of the string on the stack. If we include them in CyberChef as well, we get even more nonsense and no readable ASCII strings. After the string is stored on the stack, the next instruction, movaps xmm1, xmmword ptr [esp+70h+ModuleName] at address 0x004047AF, loads those 16 bytes from the stack into the xmm1 register.
Next, there is a push instruction for the address of the phModule parameter. If you haven’t noticed by now, IDA has recognized a call to GetModuleHandleExA at address 0x004047E7. GetModuleHandleExA is used to return a handle to a specified module (DLL). Because IDA has type information for that function, it also knows what parameters it takes and automatically labels them as they are pushed on the stack. phModule is the third argument to the function, therefore it is pushed on the stack first. (NOTE: Win32 API functions on 32-bit x86 use the stdcall calling convention, in which arguments are pushed right to left, so the first argument ends up at the top of the stack when the call executes.) Below is the GetModuleHandleExA prototype in C++:
BOOL GetModuleHandleExA(
  [in]           DWORD   dwFlags,
  [in, optional] LPCSTR  lpModuleName,
  [out]          HMODULE *phModule
);
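The push order can be confusing at first, so here is a minimal Python simulation of it. It only models the ordering of the three parameters named in the prototype above. The function name stdcall_layout is an assumption, and nothing here calls the real API.

```python
def stdcall_layout(params):
    """Model right-to-left argument passing.

    Returns (push order, stack layout at the call with index 0 = top of stack).
    """
    pushes = list(reversed(params))   # the last parameter is pushed first
    stack = []
    for value in pushes:
        stack.insert(0, value)        # each push becomes the new top of the stack
    return pushes, stack


pushes, stack = stdcall_layout(["dwFlags", "lpModuleName", "phModule"])
print("push order:", pushes)
for slot, name in enumerate(stack):
    print(f"[esp+{slot * 4:#x}] {name}")
```

The push order comes out as phModule, lpModuleName, dwFlags, which matches the order of the push instructions in this block of WinMain. Once the call executes, the return address sits above these three slots.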
It is good to have a general understanding of what this block of code might be doing as we deobfuscate, because it helps us form assumptions about what the string could be. Since we know GetModuleHandleExA is being called, we can assume that the string might be the name of a DLL.
The next four instructions, starting at address 0x004047B9, place 16 bytes on the stack at [esp+74h+var_20]. Read in memory (little-endian) order, these are 75 00 D2 55 02 41 17 97 76 CC A0 8A 31 A0 2F DE. This is the key that is used to deobfuscate the 16-byte string we saw earlier. How do we know this? Take a look at the next instruction: pxor xmm1, [esp+74h+var_20]
The pxor instruction performs a bitwise exclusive OR of its two operands and stores the result in the first operand, here the xmm1 register. The xmm1 register currently holds our obfuscated string, but after the pxor operation it will hold something different. Let’s use CyberChef again and see if we can decode the string into something more readable:
Just as we suspected, the string now becomes kernel32.dll followed by four bytes that CyberChef displays as dots. Those last four bytes are zeros: the last four bytes of the original string and of the key are the same, and XORing identical bytes cancels them out.
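The same decoding can be reproduced in a few lines of Python instead of CyberChef. The sketch below uses the 16-byte key quoted above. Because the obfuscated bytes are only visible in the screenshots, the ciphertext here is regenerated by XORing the known plaintext with that key, which is an assumption made purely to keep the example self-contained. The helper name xor16 is also an assumption.

```python
# 16-byte key as it sits in memory, taken from the text above.
KEY = bytes.fromhex("75 00 D2 55 02 41 17 97 76 CC A0 8A 31 A0 2F DE")


def xor16(block: bytes, key: bytes = KEY) -> bytes:
    """Byte-wise XOR of two 16-byte blocks, the same result pxor produces."""
    if len(block) != 16 or len(key) != 16:
        raise ValueError("an xmm register holds exactly 16 bytes")
    return bytes(b ^ k for b, k in zip(block, key))


plaintext = b"kernel32.dll".ljust(16, b"\x00")
obfuscated = xor16(plaintext)   # stand-in for the bytes the sample puts on the stack
decoded = xor16(obfuscated)     # XOR with the same key undoes the obfuscation

print(obfuscated.hex(" "))
print(decoded)
print(obfuscated[-4:] == KEY[-4:])   # zero bytes XOR key bytes give the key bytes back
```

The last line prints True, which is the cancellation effect described above: wherever the plaintext is a zero byte, the obfuscated byte is simply the key byte.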
At address 0x004047DF, the eax register, which holds a pointer to the beginning of the string on the stack, is pushed onto the stack as the lpModuleName argument, which we now know is the string kernel32.dll. Next, the dwFlags argument, which is 0, is pushed onto the stack. Lastly, the decoded 16 bytes held in the xmm1 register are written back to the stack at xmmword ptr [esp+7Ch+ModuleName], and finally GetModuleHandleExA is called. We can add our own comments by highlighting a line and pressing Shift+; (the : key). I will add a comment to show our string’s true value is kernel32.dll:
We can repeat these steps across the entire binary, but that would be a very long and tedious process. Fortunately for us, there is an automated solution.
Stack Strings (the easy way)
The folks over at FireEye Labs have developed a tool called the FLARE Obfuscated String Solver, a.k.a. FLOSS. FLOSS is a Windows binary included with FlareVM that takes a binary as input and does all the heavy lifting for us.
NOTE: Before copying any malware to your FlareVM, ensure you are set to a Host-only network adapter to prevent potential infections on your network.
Copy the specimen to your FlareVM and, in a command prompt, type floss.exe Setup.exe
FLARE Tue 10/26/2021 12:54:54.25
C:\Users\User\Desktop\Setup.bin>floss Setup.exe
FLOSS will output all normal strings, encoded strings, and stack strings that it finds to the console. We are mostly interested in the stack string section. Take a look at all the stack strings FLOSS discovered automatically!
Complete output:
FLOSS extracted 65 stackstrings
InternetCloseHandle
null
WinHttpQueryDataAvailable
VirtualAlloc
CreateThread
.dll
GetProcessHeap
HttpSendRequestA
Kernel32.dll
DeleteFileA
InternetConnectA
https://
wininet.dll
*!Nb
WINHTTP.dll
HOST:
D/.@
WinHttpOpen
InternetQueryOptionA
CharNextA
3nPQ+
WinHttpSendRequest
http://45.133.1.107/server.txt
InternetSetOptionA
.dll
User32.dll
wininet.dll
GetModuleHandleA
GetLastError
HttpOpenRequestA
CloseHandle
SetPriorityClass
WinHttpQueryHeaders
InternetCloseHanO
51.178.186.149
SHGetFolderPathA
_Thttp://wfsdragon.ru/api/setStats.php
InternetOpenA
GetCurrentProcess
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36
Winhttp.dll
.dll
HeapAlloc
HttpQueryInfoA
URL:
ZBAA
WinHttpReceiveResponse
http://
InternetReadFile
WinHttpReadData
kernel32.dll
HEAD
Shell32.dll
LoadLibraryA
abcdefghijklmnopqrstuvwxyz0123456789_ABCDEFGHIJKLMNOPQRSTUVWXYZ
/base/api/statistics.php
Sleep
WinHttpCloseHandle
GetCurrentProcesu
pastebin.com/raw/A7dSG1te
InternetOpenUrlA
WinHttpQueryHead
ineIGenu
WinHttpOpenRequest
WinHttpConnect
Finished execution after 41.094000 seconds
This is a lot of great info that can be used to form better assumptions and a firmer hypothesis about the features of this malware. We can also start to build out a list of potential IOCs (Indicators of Compromise) from the discovered URLs, IP addresses, and file names.
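Pulling the network indicators out by hand gets tedious with longer outputs. A minimal Python sketch of one way to do it is shown below. The regular expressions and the extract_iocs helper are assumptions made for illustration, not a FLOSS feature, and the sample lines are copied from the output above.

```python
import re

URL_RE = re.compile(r"https?://[^\s\"']+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_iocs(lines):
    """Return (sorted URLs, sorted IPv4 addresses) found in an iterable of strings."""
    urls, ips = set(), set()
    for line in lines:
        urls.update(URL_RE.findall(line))
        for ip in IPV4_RE.findall(line):
            # Reject dotted numbers whose octets are out of range.
            if all(int(octet) <= 255 for octet in ip.split(".")):
                ips.add(ip)
    return sorted(urls), sorted(ips)


sample = [
    "http://45.133.1.107/server.txt",
    "51.178.186.149",
    "_Thttp://wfsdragon.ru/api/setStats.php",
    "pastebin.com/raw/A7dSG1te",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36",
    "kernel32.dll",
]

urls, ips = extract_iocs(sample)
print(urls)
print(ips)
```

This sketch only catches URLs that carry a scheme, so entries such as pastebin.com/raw/A7dSG1te still need to be picked out by eye or with an additional pattern.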
FLOSS can also generate an IDAPython script that will do its best to comment all the discovered strings in a .idb file. Just run the command: floss Setup.exe -i setup.exe.py. Below is an example of the generated Python script:
# Append a comment at address ea unless the same text is already there.
def AppendComment(ea, string, repeatable=False):
    current_string = get_cmt(ea, repeatable)
    if not current_string:
        cmt = string
    else:
        if string in current_string:  # ignore duplicates
            return
        # NOTE: shown as generated; this repeats the new string, whereas
        # current_string + "\n" + string looks like the intent.
        cmt = string + "\n" + string
    set_cmt(ea, cmt, repeatable)


# Append a comment to the local variable at frame_offset in the function at fva.
def AppendLvarComment(fva, frame_offset, s, repeatable=False):
    stack = get_func_attr(fva, FUNCATTR_FRAME)
    if stack:
        lvar_offset = get_func_attr(fva, FUNCATTR_FRSIZE) - frame_offset
        if lvar_offset and lvar_offset > 0:
            string = get_member_cmt(stack, lvar_offset, repeatable)
            if not string:
                string = s
            else:
                if s in string:  # ignore duplicates
                    return
                string = string + "\n" + s
            if set_member_cmt(stack, lvar_offset, string, repeatable):
                print('FLOSS appended stackstring comment \"%s\" at stack frame offset 0x%X in function 0x%X' % (s, frame_offset, fva))
                return
    print('Failed to append stackstring comment \"%s\" at stack frame offset 0x%X in function 0x%X' % (s, frame_offset, fva))


def main():
    print('Annotating 65 strings from FLOSS for Setup.exe')
    print("Imported decoded strings from FLOSS")
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetCloseHandle", True)
    AppendLvarComment(4203616, 156, "FLOSS stackstring: null", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpQueryDataAvailable", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: VirtualAlloc", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: CreateThread", True)
    AppendLvarComment(4223952, 300, "FLOSS stackstring: .dll", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: GetProcessHeap", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: HttpSendRequestA", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: Kernel32.dll", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: DeleteFileA", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetConnectA", True)
    AppendLvarComment(4210064, 316, "FLOSS stackstring: https://", True)
    AppendLvarComment(4210064, 300, "FLOSS stackstring: wininet.dll", True)
    AppendLvarComment(4247280, 21, "FLOSS stackstring: *!Nb", True)
    AppendLvarComment(4200704, 28, "FLOSS stackstring: WINHTTP.dll", True)
    AppendLvarComment(4204880, 380, "FLOSS stackstring: HOST:", True)
    AppendLvarComment(4227952, 273, "FLOSS stackstring: D/.@", True)
    AppendLvarComment(4216496, 44, "FLOSS stackstring: WinHttpOpen", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetQueryOptionA", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: CharNextA", True)
    AppendLvarComment(4216496, 42, "FLOSS stackstring: 3nPQ+", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpSendRequest", True)
    AppendLvarComment(4204880, 428, "FLOSS stackstring: http://45.133.1.107/server.txt", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetSetOptionA", True)
    AppendLvarComment(4199408, 348, "FLOSS stackstring: .dll", True)
    AppendLvarComment(4229024, 44, "FLOSS stackstring: User32.dll", True)
    AppendLvarComment(4200704, 44, "FLOSS stackstring: wininet.dll", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: GetModuleHandleA", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: GetLastError", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: HttpOpenRequestA", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: CloseHandle", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: SetPriorityClass", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpQueryHeaders", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetCloseHanO", True)
    AppendLvarComment(4204880, 287, "FLOSS stackstring: 51.178.186.149", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: SHGetFolderPathA", True)
    AppendLvarComment(4204880, 478, "FLOSS stackstring: _Thttp://wfsdragon.ru/api/setStats.php", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: InternetOpenA", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: GetCurrentProcess", True)
    AppendLvarComment(4227952, 204, "FLOSS stackstring: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.63 Safari/537.36", True)
    AppendLvarComment(4216496, 28, "FLOSS stackstring: Winhttp.dll", True)
    AppendLvarComment(4210064, 348, "FLOSS stackstring: .dll", True)
    AppendLvarComment(4229024, 44, "FLOSS stackstring: HeapAlloc", True)
    AppendLvarComment(4229024, 28, "FLOSS stackstring: HttpQueryInfoA", True)
    AppendLvarComment(4208768, 156, "FLOSS stackstring: URL:", True)
    AppendLvarComment(4261746, 24, "FLOSS stackstring: ZBAA", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpReceiveResponse", True)
    AppendLvarComment(4219056, 108, "FLOSS stackstring: http://", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetReadFile", True)
    AppendLvarComment(4216496, 44, "FLOSS stackstring: WinHttpReadData", True)
    AppendLvarComment(4212608, 28, "FLOSS stackstring: kernel32.dll", True)
    AppendLvarComment(4224848, 76, "FLOSS stackstring: HEAD", True)
    AppendLvarComment(4200928, 60, "FLOSS stackstring: Shell32.dll", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: LoadLibraryA", True)
    AppendLvarComment(4215024, 71, "FLOSS stackstring: abcdefghijklmnopqrstuvwxyz0123456789_ABCDEFGHIJKLMNOPQRSTUVWXYZ", True)
    AppendLvarComment(4208768, 92, "FLOSS stackstring: /base/api/statistics.php", True)
    AppendLvarComment(4200928, 28, "FLOSS stackstring: Sleep", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpCloseHandle", True)
    AppendLvarComment(4200928, 108, "FLOSS stackstring: GetCurrentProcesu", True)
    AppendLvarComment(4204880, 364, "FLOSS stackstring: pastebin.com/raw/A7dSG1te", True)
    AppendLvarComment(4229024, 92, "FLOSS stackstring: InternetOpenUrlA", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpQueryHead", True)
    AppendLvarComment(4233986, 8, "FLOSS stackstring: ineIGenu", True)
    AppendLvarComment(4216496, 92, "FLOSS stackstring: WinHttpOpenRequest", True)
    AppendLvarComment(4216496, 44, "FLOSS stackstring: WinHttpConnect", True)
    print("Imported stackstrings from FLOSS")
    ida_kernwin.refresh_idaview_anyway()


if __name__ == "__main__":
    main()
Here is what IDA looks like after the script is run:
Play around with FLOSS’s features and discover more of its functionality.
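One thing worth noting about the generated script is that it records function addresses in decimal, while IDA shows them in hex. The following minimal Python sketch, which runs outside IDA, parses the AppendLvarComment lines and groups the strings by function address in hex. The regular expression and the strings_by_function helper are assumptions, and the three sample lines are copied from the script above.

```python
import re
from collections import defaultdict

CALL_RE = re.compile(
    r'AppendLvarComment\((\d+), (\d+), "FLOSS stackstring: (.*)", True\)'
)


def strings_by_function(script_text):
    """Map each function address (as a hex string) to the stack strings found in it."""
    grouped = defaultdict(list)
    for fva, _frame_offset, string in CALL_RE.findall(script_text):
        grouped[hex(int(fva))].append(string)
    return dict(grouped)


sample = '''
AppendLvarComment(4212608, 28, "FLOSS stackstring: kernel32.dll", True)
AppendLvarComment(4204880, 428, "FLOSS stackstring: http://45.133.1.107/server.txt", True)
AppendLvarComment(4204880, 287, "FLOSS stackstring: 51.178.186.149", True)
'''

for func, strings in strings_by_function(sample).items():
    print(func, strings)
```

The first address, 4212608, is 0x404780 in hex, which is the WinMain address where we decoded kernel32.dll by hand, so the FLOSS result agrees with the manual analysis. Grouping this way also shows which functions hold the network-related strings and are worth reversing first.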
Part II Conclusion
In Part II of this series we became familiar with IDA, validated a previous finding from Part I, and learned how to deobfuscate stack strings, which malware authors use to evade signature-based scanners, both manually and automatically.
We could probably spend days digging through this binary using static analysis alone, but to speed things up we can perform behavioral analysis to capture all modifications to the OS and network traffic. We can also use code analysis to automatically discover obfuscated strings and view content before it is encoded or encrypted.
In Part III of this series we will be performing code analysis using x64dbg in FlareVM. If you have any questions, please reach out via LinkedIn, Discord, or email.