Injdrv
injdrv is a proof-of-concept Windows Driver for injecting DLL into user-mode processes using APC.
Motivation
Even though [APCs][apc] are [undocumented to a decent extent][inside-apc], the technique of using them to inject a DLL into a user-mode process is not new and has been discussed many times. Such an APC can be queued from a regular user-mode process (as seen in [Cuckoo][apc-cuckoo]) as well as from a kernel-mode driver (as seen in [Blackbone][apc-blackbone]).
Despite the technique's popularity, small, easy-to-understand and actually working projects that demonstrate it are not easy to find. This project tries to fill that gap.
Features
- Support for Windows 7 up to Windows 10
- Support for x86, x64, ARM32 & ARM64 architectures
- Ability to inject Wow64 processes
  - With DLL of the same architecture as the injected process (e.g. x86 DLL into x86 Wow64 process)
  - With DLL of the same architecture as the OS (e.g. x64 DLL into Wow64 process on Windows x64)
- DLL is injected in very early process initialization stage
  - Injection is performed from the PsSetLoadImageNotifyRoutine callback
  - Native processes (e.g. x86 on Windows x86, x64 on Windows x64, ...) are injected on next load of DLL after ntdll.dll
  - Wow64 processes are injected on next load of DLL after the Wow64-DLLs are loaded
- Because of that, injected DLL must depend only on ntdll.dll
- Demonstrative DLL performs hooking of few ntdll.dll functions
  - Achieved using [DetoursNT][DetoursNT]
  - Detoured functions use ETW to trace hooked function calls
Compilation
Because the [DetoursNT][DetoursNT] project is attached as a git submodule, which itself carries the [Detours][Detours] git submodule, you must not forget to fetch them:
git clone --recurse-submodules git@github.com:wbenny/injdrv.git
After that, compile this project using Visual Studio 2017. Solution file is included. The only required dependency is [WDK][wdk].
Implementation
When the driver is loaded, it'll register two callbacks:
- For process create/exit notification ([PsSetCreateProcessNotifyRoutineEx][MSDN-CreateProcessNotify])
- For image load notification ([PsSetLoadImageNotifyRoutine][MSDN-LoadImageNotify])
When a new process is created, the driver allocates a small structure, which holds information relevant to the process injection, such as:
- Which DLLs are already loaded in the process
- Addresses of important functions (such as LdrLoadDll in ntdll.dll)
The start of a new Windows process is followed by mapping ntdll.dll into its address space and then by loading the DLLs from the process's import table. In the case of Wow64 processes on Windows x64, the following libraries are loaded immediately after the native ntdll.dll: wow64.dll, wow64cpu.dll, wow64win.dll and a second (Wow64) ntdll.dll. The driver is notified about the load of these DLLs and records this information.
Once these DLLs are loaded, it is safe for the driver to queue the user-mode APC to the process, which will load our DLL into the process.
Challenges
Although such a project might seem trivial to implement, there are some obstacles you might face along the way. Here I will try to summarize some of them:
"Thunk"-method
This method injects a DLL of the same architecture as the process. It is available on all architectures.
Injecting a DLL requires a small allocation inside the user-mode address space. This allocation holds the path to the DLL to be injected and a small thunk (shellcode), which basically calls LdrLoadDll with the DLL path as a parameter. This memory obviously requires PAGE_EXECUTE_READ protection, but the driver has to fill it somehow - and PAGE_EXECUTE_READWRITE is an unacceptable security concern.
It might be tempting to use ZwAllocateVirtualMemory and ZwProtectVirtualMemory, but unfortunately the second function is exported only since Windows 8.1.
The solution used in this driver is to create a section ([ZwCreateSection][MSDN-CreateSection]), map it ([ZwMapViewOfSection][MSDN-MapViewOfSection]) with PAGE_READWRITE protection, write the data, unmap it ([ZwUnmapViewOfSection][MSDN-UnmapViewOfSection]) and then map it again with PAGE_EXECUTE_READ protection.
Using sections raises another problem. Since this driver performs injection from the image load notification callback - which is often called from the NtMapViewOfSection function - we'd be calling MapViewOfSection recursively. This wouldn't be a problem if mapping the section didn't lock the EPROCESS->AddressCreationLock. Because it does, we would end up in a deadlock.
The solution used in this driver is to queue a kernel-mode APC first, from which ZwMapViewOfSection is called. This kernel-mode APC is triggered right before the kernel-to-user-mode transition, so the internal NtMapViewOfSection call won't be on the call stack anymore (and therefore, AddressCreationLock will be unlocked).
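The deferral described above can be illustrated with a tiny user-mode model. This is only a sketch of the ordering, not driver code: process_model, try_map_section, image_load_callback, run_kernel_apc and map_image are hypothetical names, and a plain boolean stands in for AddressCreationLock. It assumes, as the text describes, that the lock is held while the image load callback runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state relevant to the deadlock. */
typedef struct {
    bool address_lock_held;  /* stands in for EPROCESS->AddressCreationLock */
    bool kernel_apc_queued;  /* deferred mapping work is waiting */
    bool section_mapped;     /* the injection section has been mapped */
} process_model;

/* Mapping needs the lock. If it is already held, the real kernel
 * would deadlock; the model reports failure instead. */
bool try_map_section(process_model *p)
{
    assert(p != NULL);
    if (p->address_lock_held)
        return false;
    p->section_mapped = true;
    return true;
}

/* Image load callback: the lock is held here, so only queue the work. */
void image_load_callback(process_model *p)
{
    assert(p != NULL);
    p->kernel_apc_queued = true;
}

/* Kernel-mode APC: runs after the outer mapping call has returned. */
bool run_kernel_apc(process_model *p)
{
    assert(p != NULL);
    if (!p->kernel_apc_queued)
        return false;
    p->kernel_apc_queued = false;
    return try_map_section(p);
}

/* Outer mapping call that loads an image and fires the callback. */
void map_image(process_model *p)
{
    assert(p != NULL);
    p->address_lock_held = true;
    image_load_callback(p);
    p->address_lock_held = false;
}
```

Calling map_image followed by run_kernel_apc maps the section, whereas calling try_map_section while address_lock_held is set fails. That failure is the situation the kernel-mode APC is meant to avoid.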
Injection of our DLL is triggered on the first DLL load that happens after all important system DLLs (mentioned above) are already loaded.
In case of native processes, the codeflow is as follows:
- process.exe is created (process create notification)
- process.exe is loaded (image load notification)
- ntdll.dll is loaded (image load notification)
- kernel32.dll is loaded (image load notification + injection happens here)
In case of Wow64 processes, the codeflow is as follows:
- process.exe is created (process create notification)
- process.exe is loaded (image load notification)
- ntdll.dll is loaded (image load notification)
- wow64.dll is loaded (image load notification)
- wow64cpu.dll is loaded (image load notification)
- wow64win.dll is loaded (image load notification)
- ntdll.dll is loaded (image load notification - note, this is 32-bit ntdll.dll)
- kernel32.dll is loaded (image load notification + injection happens here)
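The two codeflows above can be replayed with a small user-mode model of the bookkeeping. It is a sketch under assumptions, not the driver's actual code: the SEEN_* flags, process_info, on_image_load and find_trigger are hypothetical names, DLL names are compared as exact lowercase strings, and the extra Windows 7 Wow64 handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flags, one per system DLL the model waits for. */
enum {
    SEEN_NTDLL_NATIVE = 1u << 0,
    SEEN_WOW64        = 1u << 1,
    SEEN_WOW64CPU     = 1u << 2,
    SEEN_WOW64WIN     = 1u << 3,
    SEEN_NTDLL_WOW64  = 1u << 4
};

/* Hypothetical per-process record. */
typedef struct {
    bool     is_wow64;  /* 32-bit process on 64-bit Windows */
    unsigned seen;      /* bitmask of SEEN_* flags */
    bool     injected;  /* injection was already triggered */
} process_info;

/* DLLs that must be loaded before injection is considered safe. */
static unsigned required_mask(const process_info *p)
{
    if (!p->is_wow64)
        return SEEN_NTDLL_NATIVE;
    return SEEN_NTDLL_NATIVE | SEEN_WOW64 | SEEN_WOW64CPU |
           SEEN_WOW64WIN | SEEN_NTDLL_WOW64;
}

/* Handles one image load notification.
 * Returns true if this load is the one that triggers the injection. */
bool on_image_load(process_info *p, const char *name)
{
    assert(p != NULL && name != NULL);
    unsigned required = required_mask(p);

    if (p->injected)
        return false;

    /* Everything required was seen earlier, so this load triggers. */
    if ((p->seen & required) == required) {
        p->injected = true;
        return true;
    }

    if (strcmp(name, "ntdll.dll") == 0) {
        /* The first ntdll.dll is native; a second one is the Wow64 copy. */
        p->seen |= (p->seen & SEEN_NTDLL_NATIVE) ? SEEN_NTDLL_WOW64
                                                 : SEEN_NTDLL_NATIVE;
    } else if (strcmp(name, "wow64.dll") == 0) {
        p->seen |= SEEN_WOW64;
    } else if (strcmp(name, "wow64cpu.dll") == 0) {
        p->seen |= SEEN_WOW64CPU;
    } else if (strcmp(name, "wow64win.dll") == 0) {
        p->seen |= SEEN_WOW64WIN;
    }
    return false;
}

/* Replays a load sequence; returns the index of the triggering load,
 * or -1 if no load in the sequence triggers the injection. */
int find_trigger(process_info *p, const char *const *names, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (on_image_load(p, names[i]))
            return (int)i;
    return -1;
}
```

Feeding find_trigger the image loads from either list, in order, returns the index of the kernel32.dll entry. The check runs before the current load is recorded, so the load that completes the required set never triggers the injection itself; only the next one does.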
NOTE: Load of the kernel32.dll was used as an example. In fact, load of any DLL will trigger the injection. But in practice, kernel32.dll is loaded into every Windows process, even if:
- it has no import table
- it doesn't depend on kernel32.dll
- it does depend only on ntdll.dll (covered in previous point, I just wanted to make that crystal-clear)
- it is a console application
Also note that the order of loaded DLLs mentioned above might not reflect the exact order in which the OS loads them.
The only processes that won't be injected by this method are:
- native processes (such as csrss.exe)
- pico processes (such as applications running inside [Windows Subsystem for Linux][WSL])
Injection of these processes is not in the scope of this project.
NOTE: On Windows 7, the Wow64 loads kernel32.dll and user32.dll (both native and Wow64) into the process. Unfortunately, this load is performed in the initialization of Wow64 (by wow64!ProcessInit), therefore on Windows 7 we have to wait until these DLLs are loaded as well before injecting a Wow64 process.
The injected user-mode APC is then force-delivered by calling KeTestAlertThread(UserMode). This call internally
checks if any user-mode APCs are queued and if so, sets the Thread->ApcState.UserApcPending variable to TRUE.
Because of this, the kernel immediately delivers this user-mode APC (by KiDeliverApc) on next transition from
kernel-mode to user-mode.
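The effect of the forced delivery can be pictured with a heavily simplified user-mode model. This is a sketch, not kernel code: thread_model and its functions are hypothetical names, a counter stands in for the APC queue, and the model assumes the thread is not in an alertable wait when the APC is queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the thread's user-mode APC state. */
typedef struct {
    int  queued_user_apcs;  /* user-mode APCs waiting in the queue */
    bool user_apc_pending;  /* models Thread->ApcState.UserApcPending */
    int  delivered;         /* APCs delivered so far */
} thread_model;

/* Queuing alone does not mark the APC as pending in this model. */
void queue_user_apc(thread_model *t)
{
    assert(t != NULL);
    t->queued_user_apcs++;
}

/* Models the effect described for KeTestAlertThread(UserMode):
 * if any user-mode APC is queued, mark user APCs as pending. */
void test_alert_user_mode(thread_model *t)
{
    assert(t != NULL);
    if (t->queued_user_apcs > 0)
        t->user_apc_pending = true;
}

/* Models the kernel-to-user transition. When the pending flag is set,
 * every queued user-mode APC is delivered. Returns the number delivered. */
int return_to_user_mode(thread_model *t)
{
    assert(t != NULL);
    int count = 0;
    if (t->user_apc_pending) {
        count = t->queued_user_apcs;
        t->delivered += count;
        t->queued_user_apcs = 0;
        t->user_apc_pending = false;
    }
    return count;
}
```

In this model, queue_user_apc followed directly by return_to_user_mode delivers nothing. Calling test_alert_user_mode in between makes the next transition deliver the queued APC.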
If we didn't force the delivery of the APC, it would be delivered once the thread entered the alertable state. (There are two alertable states per thread, one for kernel-mode and one for user-mode; this paragraph is talking about Thread->Alerted[UserMode] == TRUE.) Luckily, this happens when the Windows loader in ntdll.dll finishes its job and gives control to the application - particularly by calling NtAlertThread in the LdrpInitialize (or _LdrpInitialize) function. So even if we didn't force the APC, our DLL would still be injected before the main execution takes place.
NOTE: This means that if we didn't force delivery of the APC on our own, the APC would be delivered BEFORE main/WinMain is executed, but AFTER all [TLS callbacks][TLS-callbacks] are executed. This is because TLS callbacks are also executed in the early process initialization stage, within the LdrpInitialize function.
This behavior is configurable in this project by the ForceUserApc variable (by default it's TRUE).
NOTE: Some badly written drivers try to inject a DLL into processes by queuing an APC at the wrong time. For example:
- Queuing an APC for injecting DLL that doesn't depend only on ntdll.dll right when ntdll.dll is mapped
- Queuing an APC for injecting DLL that depends on kernel32.dll right when kernel32.dll is mapped (but not loaded!)
Such injection will actually work as long as nobody tries to forcefully deliver user-mode APCs. Because this driver triggers immediate delivery of user-mode APCs (all of them, you can't pick which should be delivered), it might happen that an APC of another driver gets triggered. If such an APC consisted, let's say, of calling LoadLibraryA from kernel32.dll while kernel32.dll was not fully loaded (just mapped), the APC would fail. And because this injection happens in the early process initialization stage, this error would be considered critical and the process start would fail. Also, because basically every process is being injected, the start of every process would fail, which would make the system practically unusable.
The reason why our DLL is not injected immediately from the ntdll.dll image load callback is simple: the image
load callback is called when the DLL is mapped into the process - and at this stage, the DLL is not fully initialized.
The initialization takes place
