The analysis in this article will focus on a maliciously dropped DLL file discovered by the Portcullis CTADS team during an investigation.
The malware's actions depend on the configuration that the dropper applies to the infected system. Typically, however, it will create a service to ensure that the malware runs on every system startup.
In this article we will analyse the selective on-the-fly decryption of data used by the malware's generic functions and by its dynamic importing of the required Windows APIs. We will also demonstrate the value of using scripts with OllyDbg.
To summarise, we will draw some conclusions on the behaviour of the malware, based on the points identified during the reverse engineering of the decryption process.
The technical and practical value of these techniques to the malware author is that they reduce the likelihood of analysts being able to statically identify the code.
The main concept behind this trick is “if you don’t need it, then why reveal it?”. In other words, if there is a set of data that can turn the detection of specific malware into a trivial process, then why keep it in plain text in the first place? Furthermore, this technique does not just hinder static analysis through a disassembler; it also functions as an effective defence against memory dumpers.
We can safely assume that the malware will hide some encrypted information. We can also assume that someone will attempt to run it and dump its memory for further analysis, hoping to find the data decrypted.
Obviously in cases where the algorithm has been identified and the decryption key recovered, the malware analyst can manually decrypt the data for analysis.
However, that level of automation typically only becomes possible once the malware has been analysed, and in this case we are proceeding on the basis that this has not already been done. This malware makes things even more complicated: individual parts of data are only decrypted as they are needed, and the decryption key is different for each one, making it impossible to decrypt all the encrypted information simply by creating a decryption loop.
Unless, of course, you have a table of all the keys used inside the malware, along with the data chunk that corresponds to each one of them. That would be manageable if it required only 10-20 calls to the decryption routine. However, in this instance the same routine is referenced from almost 700 different code locations, each using a different decryption key!
Additionally, the malware author has further complicated the analysis by implementing a function that removes the decrypted data from memory once used in an attempt to slow down the analysis.
This process involves a number of cycles which are used to protect the original encrypted data. By reading the encrypted data, decrypting it and writing the result to another memory location, the code author can effectively control the time the data is in memory and reliably overwrite the memory location the decrypted data is written to.
The following steps summarise how the malware dynamically retrieves the address of the Windows API that it is going to use next.
Decrypt DLL name → Get a handle to it (Call LoadLibrary) → Clean Decrypted DLL name from memory → Decrypt API name → Get API VA (Call GetProcAddress) → Clean Decrypted API name from memory → Call API.
As you can see, this is an effective method of deleting the plain text memory data once it is no longer needed.
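The order of operations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the real decryption algorithm is not described here, so toy_decrypt uses a simple repeating XOR as a stand-in, and load_library / get_proc_address are hypothetical callbacks standing in for LoadLibrary and GetProcAddress so the sketch runs on any platform. Note also that Python makes temporary copies of strings, so this shows the sequence of steps rather than a genuine memory wipe.

```python
import struct


def toy_decrypt(blob: bytes, key: int) -> bytearray:
    """Stand-in cipher (repeating 4-byte XOR). NOT the malware's real algorithm."""
    key_bytes = struct.pack("<I", key)
    return bytearray(b ^ key_bytes[i % 4] for i, b in enumerate(blob))


def wipe(buf: bytearray) -> None:
    """Overwrite the plaintext buffer in place once it is no longer needed."""
    for i in range(len(buf)):
        buf[i] = 0


def resolve_api(enc_dll, dll_key, enc_api, api_key, load_library, get_proc_address):
    """Decrypt DLL name -> load -> wipe -> decrypt API name -> resolve -> wipe."""
    dll_name = toy_decrypt(enc_dll, dll_key)           # plaintext goes to a separate buffer
    handle = load_library(bytes(dll_name).decode("ascii"))
    wipe(dll_name)                                     # DLL name removed straight after use

    api_name = toy_decrypt(enc_api, api_key)           # a different key for each chunk
    address = get_proc_address(handle, bytes(api_name).decode("ascii"))
    wipe(api_name)                                     # API name removed straight after use
    return address                                     # the caller would now call the API


if __name__ == "__main__":
    # Hypothetical module table used instead of the real Windows loader.
    fake_modules = {"kernel32.dll": {"Sleep": 0x1000}}
    enc_dll = bytes(toy_decrypt(b"kernel32.dll", 0x63225E87))  # XOR is symmetric
    enc_api = bytes(toy_decrypt(b"Sleep", 0x8A82EA28))
    addr = resolve_api(enc_dll, 0x63225E87, enc_api, 0x8A82EA28,
                       lambda name: fake_modules[name],
                       lambda handle, name: handle[name])
    print(hex(addr))
```

The point of the sketch is that the encrypted originals are never modified and each plaintext buffer exists only between its decryption and its wipe, which is why a memory dump taken at an arbitrary moment reveals very little.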
For an effective analysis, we need all of the chunks of data that get decrypted during remote communications between the controller and the infected machine.
As this Remote Access Trojan (RAT) variant will only exercise its full set of malicious actions when remotely instructed to do so, it is likely that some of the important data will never be passed to the decryption routine during a normal run. This is something we need to resolve in order to fully understand the malware's capabilities.
We have identified two slightly different code patterns where a CALL to the decryption routine is performed.
1001CB76  68 D02F0710      PUSH WinMsi.10072FD0             ← decrypted data buffer
1001CB7B  68 875E2263      PUSH 63225E87                    ← key
1001CB80  6A 28            PUSH 28                          ← data length
1001CB82  68 505F0310      PUSH WinMsi.10035F50
1001CB87  8D4D EC          LEA ECX, DWORD PTR SS:[EBP-14]
1001CB8A  E8 F195FEFF      CALL WinMsi.10006180             ← Call Decryption Routine
1001CB8F  8BC8             MOV ECX, EAX
1001CB91  E8 1A67FEFF      CALL WinMsi.100032B0
1001CB96  50               PUSH EAX
10004464  68 5CDE0310      PUSH WinMsi.1003DE5C             ← decrypted data buffer
10004469  68 28EA828A      PUSH 8A82EA28                    ← key
1000446E  6A 06            PUSH 6                           ← data length
10004470  68 78430310      PUSH WinMsi.10034378
10004475  8D8D 78FFFFFF    LEA ECX, DWORD PTR SS:[EBP-88]
1000447B  E8 001D0000      CALL WinMsi.10006180             ← Call Decryption Routine
10004480  8BC8             MOV ECX, EAX
10004482  E8 29EEFFFF      CALL WinMsi.100032B0
10004487  50               PUSH EAX
The LEA ECX instruction that precedes the CALL to the decryption routine is the one that actually makes the difference in terms of pattern size: in the first case it is a 3-byte instruction (8D4D EC), while in the second it is a 6-byte instruction (8D8D 78FFFFFF), because the larger stack offset needs a 32-bit displacement rather than an 8-bit one.
However, interestingly, this works against the author slightly: the byte sequence from the beginning of these patterns up to the start of the LEA instruction (including its first byte, 8Dh) provides enough information to create a signature and search for it safely, with few if any false positives.
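As a rough illustration of such a signature, the sketch below searches a byte buffer for the fixed opcodes of the four PUSH instructions followed by 8Dh, treating every operand byte as a wildcard. The function name find_call_sites and the base parameter are our own choices for this example, and the regular expression is simply one convenient way of expressing the pattern; it is not the script used during the investigation.

```python
import re

# Fixed opcode bytes with wildcard operands:
# 68 ?? ?? ?? ??  68 ?? ?? ?? ??  6A ??  68 ?? ?? ?? ??  8D
CALL_SITE = re.compile(
    rb"\x68.{4}"   # PUSH imm32  (decrypted data buffer)
    rb"\x68.{4}"   # PUSH imm32  (key)
    rb"\x6A."      # PUSH imm8   (data length)
    rb"\x68.{4}"   # PUSH imm32  (fourth argument)
    rb"\x8D",      # first byte of the LEA ECX instruction
    re.DOTALL,     # let '.' match every byte value, including 0x0A
)


def find_call_sites(image: bytes, base: int = 0) -> list:
    """Return the address of every match, given the address of image[0]."""
    return [base + m.start() for m in CALL_SITE.finditer(image)]


if __name__ == "__main__":
    # Bytes of the first listing above, loaded at its original address.
    first = bytes.fromhex(
        "68 D02F0710 68 875E2263 6A 28 68 505F0310 8D4D EC E8 F195FEFF"
    )
    for address in find_call_sites(first, base=0x1001CB76):
        print(hex(address))
```

Because the pattern stops at the first byte of the LEA, it matches both the 3-byte and the 6-byte form of that instruction. Hits should still be verified by hand or by checking that the following CALL targets the decryption routine, since a byte-level search cannot tell code from data.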
As stated earlier in this article, the two listings also show that a different decryption key is used at each call site.
The first parameter pushed onto the stack is the memory address where the decrypted data is written; next comes the key, and finally the length of the original data to be decrypted.
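Given the bytes at a call site, these three values can be read straight out of the PUSH operands. The following is a minimal sketch, assuming the layout shown in the listings (PUSH imm32, PUSH imm32, PUSH imm8); the DecryptCall type and parse_call_site function are hypothetical names introduced for this example.

```python
import struct
from typing import NamedTuple


class DecryptCall(NamedTuple):
    out_buffer: int   # address the decrypted data is written to
    key: int          # 32-bit decryption key
    length: int       # length of the data to be decrypted


def parse_call_site(code: bytes) -> DecryptCall:
    """Decode the first three PUSH operands of a call-site pattern."""
    # Layout: 68 <buffer:4>  68 <key:4>  6A <length:1>, little-endian, unaligned.
    op1, out_buffer, op2, key, op3, length = struct.unpack_from("<BIBIBB", code)
    if (op1, op2, op3) != (0x68, 0x68, 0x6A):
        raise ValueError("bytes do not start with the expected PUSH sequence")
    return DecryptCall(out_buffer, key, length)


if __name__ == "__main__":
    # First twelve bytes of the first listing above.
    call = parse_call_site(bytes.fromhex("68 D02F0710 68 875E2263 6A 28"))
    print(hex(call.out_buffer), hex(call.key), call.length)
```

Collecting these tuples for every match gives the kind of key table discussed earlier, with one entry per call site.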
In part two, we will discuss how our analysis leads to the automated identification of all the locations containing encrypted data and all the calls to the decryption routine, allowing us to manipulate the malware into calling the decryption routine for all the encrypted data, thereby revealing all of the malware’s prized secrets.
Written by: Kyriakos Economou of Portcullis.