
TEHTRIS releases new open source shellcode extraction tool

Packing, shellcode execution and in-memory Dynamic-Link Library (DLL) loading are very common in the malware scene. Extracting the real payload with static analysis alone can be quite tedious. A dynamic approach can give the reverser a nearly generic method to deobfuscate the next stage (n+1). This is what TEHTRIS offers with its open source “Shellcode extraction” tool. The tool does not claim to be 100% reliable, but it provides a quick win in most cases.

This article describes the capabilities of this tool, made by TEHTRIS, which helps to extract payloads from a Portable Executable (PE) file. It aims to provide a minimalist sandboxing environment that tracks Self-Modifying Code (SMC) with limited memory consumption.

The Pin [0] Application Programming Interface (API) already offers an SMC detection callback [1]:

typedef VOID(* LEVEL_PINCLIENT::SMC_CALLBACK) (ADDRINT traceStartAddress, ADDRINT traceEndAddress, VOID *v)

Unfortunately, this API does not always detect SMC in memory pages allocated with kernel32!VirtualAlloc. In addition, more features are needed to help the reverser:

  • Automatic dump of the shellcode
  • Dump PE file in case of manual DLL loading (outside kernel32!LoadLibrary)
  • Dump the trace which performs the decryption
  • Log the Original Entry Point (OEP)

When SMC is detected, the reverser has immediate access to the payloads and to the address and dump of the decryption routine. This can save time when decrypting the payload and generating signatures. Manual debugging and static analysis are time-consuming, and an analyst cannot always afford them when a quick reaction is critical.

The code can be found in a GitHub repository [2].

Analysis

A generic example of obfuscation tricks commonly found in the wild has been generated for demonstration purposes.

Code

A test sample has been compiled with mingw32, then packed with UPX [5]:

#include <windows.h>
#include <stdio.h>

int main()
{
    // XOR-encoded shellcode (key 1); sizeof() also counts the trailing NUL byte
    char shellcode[] = "\x91\x91\x91\xc2";

    // Allocate a readable, writable and executable memory region
    LPVOID addressPointer = VirtualAlloc(NULL, sizeof(shellcode), MEM_COMMIT, PAGE_EXECUTE_READWRITE);
    if (!addressPointer) {
        printf("Fail to allocate\n");
        return 1;
    }

    // Decode (XOR with 1) into the allocated region
    for (size_t i = 0; i < sizeof(shellcode); i++) {
        ((BYTE *)addressPointer)[i] = shellcode[i] ^ 1;
    }

    // Create thread pointing to shellcode address
    CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)addressPointer, NULL, 0, NULL);
    // Sleep for a second to wait for the thread
    Sleep(1000);
    return 0;
}

There are two layers of obfuscation: UPX and a XOR with 1. To trigger an SMC detection, the shellcode is executed in a separate thread.

The payload consists only of nop/ret instructions: \x91\x91\x91\xc2 before the XOR, \x90\x90\x90\xc3 after it. Although this is a trivial example, the technique can be enough to hide strings and to evade YARA rules written against malicious code. In conjunction with anti-debug techniques, the analysis can become very tedious.
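The decoding step can be reproduced outside Windows to check the bytes quoted above. The following is a minimal sketch, not code from the tool; the function name xor_decode is hypothetical and only mirrors the XOR loop of the sample.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR every byte of src with key and store the result in dst.
 * Mirrors the decoding loop of the sample, without any Windows API.
 * (xor_decode is an illustrative name, not part of the tool.) */
void xor_decode(const uint8_t *src, uint8_t *dst, size_t len, uint8_t key)
{
    for (size_t i = 0; i < len; i++) {
        dst[i] = (uint8_t)(src[i] ^ key);
    }
}
```

Applied to the bytes 0x91 0x91 0x91 0xc2 with key 1, this yields 0x90 0x90 0x90 0xc3, that is three nop instructions followed by a ret.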

Behaviour

The in-memory manipulations take place as follows:

Manipulations in memory

Demo on the example sample

It is time to show the Shellcode Extraction Tool in action against the previous sample. To analyse, simply run the script:

quickShellcodeDetector /home/user/samples/shellcode.exe /dev/shm

The logs show that the sample has been unpacked from UPX and that the shellcode has been correctly decrypted. The full log of the tool is reproduced below. Every dumped file name is prefixed by the Process IDentifier (PID) of the process (0x1cc0, i.e. 7360, in this run) to avoid collisions with subprocesses when the shellcode address is the same:

[INFO] ShellcodeDetector.cpp:503 Starting program: pid=7360
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: F:\pin-3.18-98332-gaebd7b1e6-msvc-windows\source\tools\QuickDetector\shellcode.exe addr=0x00370000 id=1 size=0x1c000
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: C:\Windows\SysWOW64\KernelBase.dll addr=0x76d80000 id=2 size=0x214000
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: C:\Windows\SysWOW64\kernel32.dll addr=0x77300000 id=3 size=0xf0000
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: C:\Windows\SysWOW64\ntdll.dll addr=0x77560000 id=4 size=0x1a3000
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: C:\Windows\SysWOW64\apphelp.dll addr=0x74c80000 id=5 size=0x9f000
[INFO] ShellcodeDetector.cpp:113 Found obfuscation routine at: 0x37105e (F:\pin-3.18-98332-gaebd7b1e6-msvc-windows\source\tools\QuickDetector\shellcode.exe+0x105e)
[INFO] ShellcodeDetector.cpp:115 Dumping trace into: C:\Users\user\AppData\Local\Temp\ShellcodeDetector\0x1cc0_0x0037105e.trc
[INFO] ShellcodeDetector.cpp:239 Dumping ShellCode: C:\Users\user\AppData\Local\Temp\ShellcodeDetector\0x1cc0_0x0b230000.bin ep=0x00000000 size=0x1000
[INFO] ShellcodeDetector.cpp:254 Dumped: 4096 bytes
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: C:\Windows\SysWOW64\kernel.appcore.dll addr=0x74000000 id=6 size=0xf000
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: C:\Windows\SysWOW64\msvcrt.dll addr=0x77400000 id=7 size=0xbf000
[INFO] ShellcodeDetector.cpp:446 IMG_LOAD: C:\Windows\SysWOW64\rpcrt4.dll addr=0x76150000 id=8 size=0xc0000
[INFO] ShellcodeDetector.cpp:463 Done in 3 seconds
[INFO] ShellcodeDetector.cpp:467 Freing image: 1
[INFO] ShellcodeDetector.cpp:467 Freing image: 2
[INFO] ShellcodeDetector.cpp:467 Freing image: 3
[INFO] ShellcodeDetector.cpp:467 Freing image: 4
[INFO] ShellcodeDetector.cpp:467 Freing image: 5
[INFO] ShellcodeDetector.cpp:467 Freing image: 6
[INFO] ShellcodeDetector.cpp:467 Freing image: 7
[INFO] ShellcodeDetector.cpp:467 Freing image: 8

Both the obfuscation routine and the payload have been found. The log file indicates the address of the trace which deobfuscates the payload:

Found obfuscation routine at: 0x37105e (F:\pin-3.18-98332-gaebd7b1e6-msvc-windows\source\tools\QuickDetector\shellcode.exe+0x105e)

The trace found at shellcode.exe+0x105e is exported to the 0x1cc0_0x0037105e.trc file:

Figure 1: Trace extraction: 0x1cc0_0x0037105e.trc

The IDA [4] listing at the matching Relative Virtual Address (RVA) 0x105e (virtual address 0x37105e in this run) helps to find the xor 1 (this is the same trace as the one extracted by the tool):

Figure 2: Trace extraction
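The relation between the logged address and the RVA can be written down explicitly. The sketch below is illustrative only (va_to_rva is a hypothetical helper, not a function of the tool); it uses the base address and size printed by the IMG_LOAD line of shellcode.exe.

```c
#include <stdint.h>

/* Convert a virtual address inside a loaded image into an RVA.
 * Returns 1 and stores the RVA on success, 0 when the address lies
 * outside [image_base, image_base + image_size).
 * (va_to_rva is an illustrative name, not part of the tool.) */
int va_to_rva(uint32_t va, uint32_t image_base, uint32_t image_size, uint32_t *rva)
{
    if (va < image_base || va - image_base >= image_size) {
        return 0;
    }
    *rva = va - image_base;
    return 1;
}
```

With va = 0x37105e, image_base = 0x00370000 and image_size = 0x1c000, the helper stores 0x105e, which is the offset shown in the log as shellcode.exe+0x105e. The RVA stays valid even if the image is loaded at another base address on the next run.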

The log file indicates the address of the shellcode along with its entry point, relative to the base address of the memory allocation:

[INFO] ShellcodeDetector.cpp:239 Dumping ShellCode: C:\Users\user\AppData\Local\Temp\ShellcodeDetector\0x1cc0_0x0b230000.bin ep=0x00000000 size=0x1000

This is the extracted shellcode, found at address 0x0b230000 and exported to the 0x1cc0_0x0b230000.bin file:

Figure 3: Shellcode extraction: 0x1cc0_0x0b230000.bin
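Judging from the log, the dump file names combine the PID in hexadecimal (7360 is 0x1cc0) with the address of the dumped region. The sketch below reproduces that naming scheme. The format string is an assumption inferred from the two file names in the log, and dump_file_name is a hypothetical helper, not the tool's own code.

```c
#include <stdio.h>
#include <stddef.h>

/* Build "<pid in hex>_<address in hex>.<ext>" into out.
 * The field widths are guessed from the file names seen in the log.
 * Returns the value of snprintf (length the full name needs). */
int dump_file_name(char *out, size_t out_size,
                   unsigned int pid, unsigned int address, const char *ext)
{
    return snprintf(out, out_size, "0x%04x_0x%08x.%s", pid, address, ext);
}
```

Calling it with pid 7360, address 0x0b230000 and extension "bin" gives 0x1cc0_0x0b230000.bin, the name of the shellcode dump above.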

The shellcode size is 4096 bytes due to page alignment forced by memory allocation.
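The sample only requests sizeof(shellcode) bytes (5, including the trailing NUL), but the allocation is rounded up to a whole page. A minimal sketch of that rounding, assuming the 4 KiB page size seen in the log (size=0x1000); the names are illustrative:

```c
#include <stddef.h>

#define PAGE_SIZE_4K 0x1000u  /* assumed page size, matches size=0x1000 in the log */

/* Round a requested size up to the next multiple of the page size. */
size_t round_up_to_page(size_t size)
{
    return (size + PAGE_SIZE_4K - 1) & ~((size_t)PAGE_SIZE_4K - 1);
}
```

A request of 5 bytes therefore occupies 4096 bytes, which is the size of the dumped file. Only the first bytes of the dump are meaningful; the rest of the page is padding.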

Architecture

The tool is based on Pin and does not need any other dependency. The following schema describes the overall architecture. This is a simple Virtual Machine (VM) based architecture driven through the VirtualBox [3] CLI.

The commands are passed through the VirtualBox Guest Additions, which is not ideal for stealth. They should be replaced by an agent in the future.

Tool internals

The tool tracks every memory access while keeping memory consumption minimal. To achieve this, it logs, for each trace, its write/execute (W/X) accesses to the memory blocks reported by kernel32!VirtualQuery.

This shortcut avoids keeping a second copy of the allocated memory and comparing the two copies.

Only the trace and memory region are logged. The following structures are used to log the events:

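typedef struct _TRACEACCESS {
    UINT32 access_type;   // kind of access (write or execute)
    ADDRINT membase;      // base address of the accessed memory block
} TRACEACCESS, *PTRACEACCESS;

typedef struct _TRACE {
    ADDRINT address;      // start address of the trace
    USIZE length;         // size of the trace
    size_t accessnb;      // number of entries in access
    PTRACEACCESS access;  // memory blocks accessed by this trace
} *PTRACE;

typedef struct _MEMACCESS {
    size_t tracenb;       // number of entries in trace
    PTRACE trace;         // logged traces
} MEMACCESS, *PMEMACCESS;

These structures are populated by the W/X callbacks.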

The collision detection (a memory block that has been written and is then executed) is performed in real time, so that the memory is dumped as soon as the page is executed for the first time. In most cases, this ensures that the payload is fully deobfuscated when it is dumped.
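The idea can be illustrated with a much simplified, portable sketch. It is not the tool's implementation: the REGION and TRACKER types, the fixed-size table and the function names are all hypothetical, and the real tool relies on Pin callbacks and on the structures shown above.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 64  /* arbitrary limit for this sketch */

typedef struct {
    uintptr_t base;    /* base address of the memory block */
    uintptr_t writer;  /* address of the last trace that wrote to it */
    int dumped;        /* non-zero once the block has been dumped */
} REGION;

typedef struct {
    REGION regions[MAX_REGIONS];
    size_t count;
} TRACKER;

static REGION *find_region(TRACKER *t, uintptr_t base)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->regions[i].base == base) {
            return &t->regions[i];
        }
    }
    return NULL;
}

/* Write callback: remember which trace wrote to the block. */
void on_write(TRACKER *t, uintptr_t trace_addr, uintptr_t base)
{
    REGION *r = find_region(t, base);
    if (r == NULL) {
        if (t->count == MAX_REGIONS) {
            return;  /* table full: the block is not tracked */
        }
        r = &t->regions[t->count++];
        r->base = base;
        r->dumped = 0;
    }
    r->writer = trace_addr;
}

/* Execute callback: on the first execution of a written block, return the
 * address of the writing trace (the caller dumps the block and the trace).
 * Returns 0 when there is nothing to dump. */
uintptr_t on_execute(TRACKER *t, uintptr_t base)
{
    REGION *r = find_region(t, base);
    if (r == NULL || r->dumped) {
        return 0;
    }
    r->dumped = 1;
    return r->writer;
}
```

With the values of the demo, a write from the trace at 0x37105e to the block at 0x0b230000, followed by the first execution of that block, makes on_execute return 0x37105e. Only addresses are stored, so no copy of the written memory is kept.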

A very simple yet efficient PE parser is included to parse the PE header and then dump the DLL.
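The parser of the tool is not reproduced here. As an illustration of the minimum such a parser has to check, the sketch below validates the MZ and PE signatures of a memory buffer and reads the SizeOfImage field, which gives the number of bytes to dump. The offsets come from the PE format; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian readers, independent of the host byte order. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return SizeOfImage if buf starts with a plausible PE image, else 0. */
uint32_t pe_size_of_image(const uint8_t *buf, size_t len)
{
    /* DOS header: "MZ" magic, e_lfanew at offset 0x3c */
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z') {
        return 0;
    }
    uint32_t e_lfanew = rd32(buf + 0x3c);
    /* signature (4) + file header (20) + optional header up to SizeOfImage (60) */
    if (e_lfanew > len || len - e_lfanew < 4 + 20 + 60) {
        return 0;
    }
    const uint8_t *nt = buf + e_lfanew;
    if (nt[0] != 'P' || nt[1] != 'E' || nt[2] != 0 || nt[3] != 0) {
        return 0;
    }
    /* Optional header magic: 0x10b (PE32) or 0x20b (PE32+) */
    uint16_t magic = rd16(nt + 24);
    if (magic != 0x10b && magic != 0x20b) {
        return 0;
    }
    /* SizeOfImage sits at offset 56 of the optional header in both formats */
    return rd32(nt + 24 + 56);
}
```

A buffer that passes these checks can be treated as a manually loaded PE file and dumped with the returned size, instead of being dumped as raw shellcode.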

Conclusion

This tool is quite handy when an analyst needs to extract an obfuscated payload from a sample with few protections. Many workarounds and detection methods against it are possible; the tool could be improved in the future.

Bibliography

[0] https://www.intel.com/content/www/us/en/developer/articles/tool/pin-a-dynamic-binary-instrumentation-tool.html

[1] https://software.intel.com/sites/landingpage/pintool/docs/98484/Pin/html/group__TRACE.html#gad80d434b4df6285334079c19df32a2e8

[2] https://github.com/tehtris-hub/ShellCodeDetector

[3] https://www.virtualbox.org/

[4] https://www.hex-rays.com/ida-pro/

[5] https://upx.github.io/