Over the past few weeks, Portcullis has shared two parts of a three-part series on “Static Analysis vs Dynamic Imports”. Part 1 discussed in detail why malware developers use dynamic imports, while part 2 covered the dynamic imports methodology used by a malicious driver.

In this final article, part 3, we demonstrate how to modify the code of the ChecksumExportLocator function so that it retrieves the names of the APIs instead of their addresses. We also show how to build a simple tool that automates the whole process for all the dynamically imported APIs.

This particular investigation required Portcullis analysts to write a custom tool that automates the resolution of checksums to API names and so speeds up the analysis process. The tool finds the matches between the checksum constants and the kernel function names, and it can be reused in subsequent malware investigations that involve similar dynamic importing methods. Even though the tool specifically targets the custom checksum algorithm of this driver, it can be maintained by adding more functions corresponding to different malware families. Furthermore, it is quite possible that a future version of the same malware will use the same dynamic importing method with the same algorithm. By keeping the algorithms inside our tool, we then only have to update the table with the new checksums. If the algorithm changes slightly, we can apply minor modifications to the existing code, which takes far less time than building something from scratch every time we come across the same issue.

What makes a tool like this really useful is that we can get all the necessary information about the dynamic imports without having to perform remote kernel debugging of the driver, which saves a lot of time.

Code organization:

The code of the tool is divided into two main parts. Since the first step is mapping the module in memory (ntoskrnl.exe in our case), we have created a class called ModuleMapper (ModuleMapper.h), which maps a module in memory with read access permissions (assuming that we have access to it).

Wrapping steps that we might want to repeat in a class saves extra work the next time we need them. Then we have MalwareClasses.h, where we can add extra classes depending on the family the malware belongs to. In this case we have the GraftorVariant class with two methods, char * Rbadza_sys_GetNtApiName and DWORD Rbadza_sys_GetNtApiRVA, which return the name of the function and its RVA respectively.

The first part of each method name is the name of the module. This way, we can easily remember which sample a specific checksum algorithm belongs to. Of course, in order to know the real full address (VA) of the API we need to know the base address of the module. Since we are interested in the kernel but are working from user mode, the tool returns the VA as mapped inside the tool's own process. This can be useful if someone wants to take a quick look at the actual code of that kernel function without using a kernel debugger. It is not something we are usually interested in, so it should be considered extra information that we may or may not need. It can, however, be used to identify tampered code or hooked functions by combining this tool with a local kernel debugger such as WinDbg. Furthermore, we have a header file called ChecksumTables.h where we store the different custom checksums used by each sample, so that we can go back to them whenever we want. The *Func.cpp files contain the implementation of the methods of their respective classes.
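To illustrate the idea of keeping the algorithms and the checksum tables separate, here is a minimal sketch in portable C++. It is not the code of our tool: FamilyEntry, ResolveTable and Ror7AddChecksum are hypothetical names chosen for this example, and the list of export names is assumed to have been extracted from the mapped module beforehand. The checksum routine is a C++ rendition of the rotate-right-by-7 and add loop used by this driver.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Signature shared by every checksum algorithm we add to the tool.
using ChecksumFn = uint32_t (*)(const char*);

// Rendition of this driver's loop: for every byte of the name, including
// the terminating NUL, rotate the accumulator right by 7 and add the byte.
uint32_t Ror7AddChecksum(const char* s) {
    uint32_t acc = 0;
    unsigned char c;
    do {
        c = static_cast<unsigned char>(*s++);
        acc = ((acc >> 7) | (acc << 25)) + c;
    } while (c != 0);
    return acc;
}

// Hypothetical per-sample entry: a label, the algorithm, and the checksum
// constants recovered from the sample (the role of ChecksumTables.h).
struct FamilyEntry {
    const char* label;
    ChecksumFn algorithm;
    std::vector<uint32_t> checksums;
};

// Hash every export name once, then look up each stored checksum.
// Returns one entry per checksum, in table order.
std::vector<std::string> ResolveTable(const FamilyEntry& fam,
                                      const std::vector<std::string>& exportNames) {
    assert(fam.algorithm != nullptr);
    std::unordered_map<uint32_t, std::string> byChecksum;
    for (const std::string& n : exportNames)
        byChecksum.emplace(fam.algorithm(n.c_str()), n);  // first name wins on collision

    std::vector<std::string> out;
    for (uint32_t c : fam.checksums) {
        auto it = byChecksum.find(c);
        out.push_back(it != byChecksum.end() ? it->second
                                             : std::string("<unresolved>"));
    }
    return out;
}
```

With this layout, supporting a new variant that reuses the algorithm means adding one FamilyEntry with its new checksum constants, while a variant with a slightly different algorithm only needs one more function with the ChecksumFn signature.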

At this stage in the article, we only show the modifications we made to the custom checksumming algorithm in order to obtain the necessary information.

Getting the API name from the checksum:

As mentioned earlier, the function that implements the custom checksum algorithm retrieves the VA of the function and not its name. However, it is very easy to modify it and cut out the code that we don’t need in order to obtain the information that we are looking for.

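__declspec(naked) char * GetNtApiName(DWORD chksum, DWORD buff)
{

__asm{
	push ebp
	mov ebp,esp
	push ebx 
	push ebp
	mov eax,buff          // base address of the mapped module
	mov ebx, chksum       // checksum to look for
	mov ebp,eax           // ebp = module base from now on
	push esi
	xchg eax,edx
	push edi
	mov eax, [ebp + 3ch]  // e_lfanew: offset of the PE header
	mov edx, [eax+ebp+78h] // RVA of the export directory
	add edx,ebp
	push edx
	mov ecx, [edx+18h]    // NumberOfNames
	mov edx, [edx+20h]    // RVA of the AddressOfNames array
	add edx,ebp
	xor eax,eax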

_apiloop:
	dec ecx
	jl _notfound
	mov esi, [edx+ecx*4]
	add esi,ebp
	push esi            // save ptr to api name 
	xor edi,edi

_nameloop:	
	ror edi, 7
	lodsb
	add edi,eax
	test eax,eax
	jnz _nameloop
	cmp edi,ebx
	pop edi           // ptr to api name from stack
	jnz _apiloop
	mov eax,edi  // eax must point to apiname
	pop edx
	pop edi
	pop esi
	pop ebp
	pop ebx
	mov esp,ebp
	pop ebp
	ret

_notfound: 
	pop edx
	pop edi
	pop esi
	pop ebp
	pop ebx
	xor eax,eax
	mov esp,ebp
	pop ebp
	ret
     }

}

The modifications involve:

i) temporarily saving the value of ESI (which always points to the function name involved in the current checksum calculation), so that we can return it in EAX once we have found the match between a checksum and a function name.

ii) cutting out the code that calculates the real VA of the function. If no match is found for the given checksum, we set the pointer (EAX) to NULL, which we always check in the main function before printing out the name.
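For readers who prefer not to follow the inline assembly, the checksum loop and the name lookup can be sketched in portable C++ as follows. This is a rendition written for this article rather than the code of the tool: NameFromChecksum is a hypothetical helper, and the export names are assumed to be available as a simple list instead of being read from the export directory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rotate a 32-bit value right by n bits (0 < n < 32).
static inline uint32_t Ror32(uint32_t v, unsigned n) {
    assert(n > 0 && n < 32);
    return (v >> n) | (v << (32 - n));
}

// Rendition of _nameloop: for every byte of the name, including the
// terminating NUL, rotate the accumulator right by 7 and add the byte
// (zero-extended, as lodsb does with EAX cleared beforehand).
uint32_t Checksum(const char* name) {
    uint32_t acc = 0;
    unsigned char c;
    do {
        c = static_cast<unsigned char>(*name++);
        acc = Ror32(acc, 7) + c;
    } while (c != 0);
    return acc;
}

// Hypothetical helper: walks the names from last to first, like _apiloop,
// and returns the matching name, or nullptr when there is no match
// (the equivalent of returning NULL in EAX).
const char* NameFromChecksum(uint32_t chksum,
                             const std::vector<const char*>& names) {
    for (std::size_t i = names.size(); i-- > 0;) {
        if (Checksum(names[i]) == chksum)
            return names[i];
    }
    return nullptr;
}
```

Note that the terminating NUL takes part in the calculation, so the accumulator is rotated one extra time after the last character. As in the main function of the tool, the caller should check the returned pointer before printing the name.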

Getting the API RVA from the checksum:

This is much easier to obtain, since the algorithm always uses the RVA of the function at the end, adding it to the ntoskrnl base address in order to obtain the VA of that function. So we only need to cut out a single instruction, which in this case has been commented out.

__declspec(naked) DWORD GetNtApiRVA(DWORD chksum, DWORD buff)
{

__asm{
	push ebp
	mov ebp,esp
	push ebx 
	push ebp
	mov eax,buff
	mov ebx, chksum
	mov ebp,eax
	push esi
	xchg eax,edx
	push edi
	mov eax, [ebp + 3ch]
	mov edx, [eax+ebp+78h]
	add edx,ebp
	push edx
	mov ecx, [edx+18h]
	mov edx, [edx+20h]
	add edx,ebp
	xor eax,eax

_apiloop:
	dec ecx
	jl _notfound
	mov esi, [edx+ecx*4]
	add esi,ebp
	xor edi,edi

_nameloop:	
	ror edi, 7
	lodsb
	add edi,eax
	test eax,eax
	jnz _nameloop
	cmp edi,ebx
	jnz _apiloop

	pop edx
	mov ebx, [edx+24h]
	add ebx,ebp
	mov cx, [ebx+ecx*2]   
	mov ebx, [edx+1ch]    
	add ebx,ebp
	mov eax, [ebx+ecx*4]
	mov ecx,eax
	//add eax,ebp calculates VA (RVA + image base)
	pop edi
	pop esi
	pop ebp
	pop ebx
	mov esp,ebp
	pop ebp
	ret

_notfound: 
	pop edx
	pop edi
	pop esi
	pop ebp
	pop ebx
	xor eax,eax
	mov esp,ebp
	pop ebp
	ret

    }

}
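The same walk over the export directory can be sketched in portable C++. This is an illustration under several assumptions and not the code of the tool: the image is a 32-bit PE laid out in memory as the loader maps it (so every RVA is an offset from the base), the host is little-endian, and the image is well formed, since no bounds checks are performed. RvaFromChecksum and MakeFakeImage are hypothetical names, and the fake image exists only so that the example can be exercised without a real ntoskrnl.exe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static uint32_t Rd32(const uint8_t* p) { uint32_t v; std::memcpy(&v, p, 4); return v; }
static uint16_t Rd16(const uint8_t* p) { uint16_t v; std::memcpy(&v, p, 2); return v; }

// Rotate right by 7 and add each byte, terminating NUL included.
static uint32_t Checksum(const char* s) {
    uint32_t acc = 0;
    unsigned char c;
    do {
        c = static_cast<unsigned char>(*s++);
        acc = ((acc >> 7) | (acc << 25)) + c;
    } while (c != 0);
    return acc;
}

// Returns the RVA of the export whose name matches chksum, or 0 if none.
uint32_t RvaFromChecksum(const uint8_t* base, uint32_t chksum) {
    const uint8_t* nt    = base + Rd32(base + 0x3C);  // e_lfanew
    const uint8_t* exp   = base + Rd32(nt + 0x78);    // export directory (PE32)
    uint32_t count       = Rd32(exp + 0x18);          // NumberOfNames
    const uint8_t* funcs = base + Rd32(exp + 0x1C);   // AddressOfFunctions
    const uint8_t* names = base + Rd32(exp + 0x20);   // AddressOfNames
    const uint8_t* ords  = base + Rd32(exp + 0x24);   // AddressOfNameOrdinals
    for (uint32_t i = count; i-- > 0;) {              // last to first, like _apiloop
        const char* name = reinterpret_cast<const char*>(base + Rd32(names + i * 4));
        if (Checksum(name) == chksum) {
            uint16_t ord = Rd16(ords + i * 2);        // index into the function table
            return Rd32(funcs + ord * 4);             // RVA; add the base to get the VA
        }
    }
    return 0;
}

// Hypothetical fixture: a tiny fake image exporting "A" (RVA 0x1000)
// and "AB" (RVA 0x2000).
std::vector<uint8_t> MakeFakeImage() {
    std::vector<uint8_t> img(0x200, 0);
    auto w32 = [&](std::size_t off, uint32_t v) { std::memcpy(&img[off], &v, 4); };
    auto w16 = [&](std::size_t off, uint16_t v) { std::memcpy(&img[off], &v, 2); };
    w32(0x3C, 0x40);            // e_lfanew -> PE header at 0x40
    w32(0x40 + 0x78, 0x100);    // RVA of the export directory
    w32(0x100 + 0x18, 2);       // NumberOfNames
    w32(0x100 + 0x1C, 0x140);   // AddressOfFunctions
    w32(0x100 + 0x20, 0x150);   // AddressOfNames
    w32(0x100 + 0x24, 0x160);   // AddressOfNameOrdinals
    w32(0x140, 0x2000); w32(0x144, 0x1000);  // function RVAs
    w32(0x150, 0x170);  w32(0x154, 0x178);   // name RVAs
    w16(0x160, 1);      w16(0x162, 0);       // name index -> function index
    std::memcpy(&img[0x170], "A", 2);
    std::memcpy(&img[0x178], "AB", 3);
    return img;
}
```

As in the assembly listing, the name index is first translated through the ordinal table before the function table is read, and leaving out the final addition of the base address is what turns the VA into the RVA.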

Dealing with dynamic imports during static analysis can be really time consuming, and such a tool can be vital in some cases, especially if we really want to know the purpose of the module we are analysing. In this particular investigation we had to deal with a driver, and it made sense to code a tool that automates the resolution of checksums to API names. Otherwise, we would have missed a lot of important information about the hidden functionality of this malware.

Finally, we must not forget that the same techniques are also used by malware components that run in user mode. There is a difference, though: in that case it is much easier to perform dynamic analysis and get the necessary information without an additional tool.

Did you miss the first two articles? You can find them here:

Part 1 – Why malware developers use dynamic imports

Part 2 – Dynamic Imports methodology used by a malicious driver

Written by: Kyriakos Economou of Portcullis.

Any questions/feedback?

If you have any further questions or feedback regarding the Static Analysis and Dynamic Imports article (part 3), please do get in touch! We would like to hear your thoughts. You may contact us at: labs@portcullis-security.com
