injdrv

injdrv is a proof-of-concept Windows driver for injecting a DLL into user-mode processes using APCs.

Motivation

Even though [APCs][apc] are [undocumented to a decent extent][inside-apc], the technique of using them to inject a DLL into a user-mode process is not new and has been discussed many times. Such an APC can be queued from a regular user-mode process (as seen in [Cuckoo][apc-cuckoo]) as well as from a kernel-mode driver (as seen in [Blackbone][apc-blackbone]).

Despite the technique's popularity, small, easy-to-understand and actually working projects that demonstrate it are hard to find. This project tries to fill that gap.

Features

  • Support for Windows 7 up to Windows 10
  • Support for x86, x64, ARM32 & ARM64 architectures
  • Ability to inject into Wow64 processes
    • With a DLL of the same architecture as the injected process (e.g. x86 DLL into x86 Wow64 process)
    • With a DLL of the same architecture as the OS (e.g. x64 DLL into Wow64 process on Windows x64)
  • The DLL is injected at a very early stage of process initialization
    • Injection is performed from the PsSetLoadImageNotifyRoutine callback
    • Native processes (e.g. x86 on Windows x86, x64 on Windows x64, ...) are injected on the next DLL load after ntdll.dll
    • Wow64 processes are injected on the next DLL load after the Wow64 DLLs are loaded
  • Because of that, the injected DLL must depend only on ntdll.dll
  • The demonstration DLL hooks a few ntdll.dll functions
    • Achieved using [DetoursNT][DetoursNT]
  • Detoured functions use ETW to trace hooked function calls

Compilation

Because the [DetoursNT][DetoursNT] project is attached as a git submodule, which itself carries the [Detours][Detours] git submodule, remember to fetch both:

git clone --recurse-submodules git@github.com:wbenny/injdrv.git

After that, compile this project using Visual Studio 2017. A solution file is included. The only required dependency is the [WDK][wdk].

Implementation

When the driver is loaded, it'll register two callbacks:

  • For process create/exit notification ([PsSetCreateProcessNotifyRoutineEx][MSDN-CreateProcessNotify])
  • For image load notification ([PsSetLoadImageNotifyRoutine][MSDN-LoadImageNotify])

When a new process is created, the driver allocates a small structure that holds information relevant to the injection of that process, such as:

  • Which DLLs are already loaded in the process
  • Addresses of important functions (such as LdrLoadDll in ntdll.dll)
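
The bookkeeping described above can be modelled in plain user-mode C. This is only a sketch: the structure name, its fields and the three helper functions are hypothetical and are not taken from the driver, and `calloc`/`free` stand in for kernel pool allocation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-process record; names are illustrative, not the driver's. */
typedef struct inj_info {
    struct inj_info *next;
    uint32_t  process_id;
    uint32_t  loaded_dlls;    /* bitmask of system DLLs seen so far          */
    uintptr_t ldr_load_dll;   /* address of LdrLoadDll, 0 until it is known  */
    int       is_injected;    /* set once the injection has been queued      */
} inj_info;

static inj_info *g_head = NULL;

/* Modelled process-create notification: allocate and link a zeroed record. */
inj_info *inj_create(uint32_t pid)
{
    inj_info *info = calloc(1, sizeof(*info));
    if (info == NULL)
        return NULL;
    info->process_id = pid;
    info->next = g_head;
    g_head = info;
    return info;
}

/* Modelled image-load notification: look the record up by process ID. */
inj_info *inj_find(uint32_t pid)
{
    for (inj_info *i = g_head; i != NULL; i = i->next)
        if (i->process_id == pid)
            return i;
    return NULL;
}

/* Modelled process-exit notification: unlink and free. Returns 1 if found. */
int inj_remove(uint32_t pid)
{
    for (inj_info **pp = &g_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->process_id == pid) {
            inj_info *dead = *pp;
            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```

In a real driver the list would also need a lock, because notification callbacks for different processes can run concurrently.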

When a new Windows process starts, ntdll.dll is mapped into its address space first, followed by the DLLs from the process's import table. For Wow64 processes on Windows x64, the following libraries are loaded immediately after the native ntdll.dll: wow64.dll, wow64cpu.dll, wow64win.dll and a second (Wow64) ntdll.dll. The driver is notified about each of these loads and records them.

Once these DLLs are loaded, it is safe for the driver to queue a user-mode APC to the process; that APC loads our DLL into the process.

Challenges

Although such a project might seem trivial to implement, there are some obstacles you might face along the way. Here I will try to summarize some of them:

"Thunk"-method

This method injects a DLL of the same architecture as the process. It is available on all architectures.

Injecting a DLL requires a small allocation inside the user-mode address space. This allocation holds the path to the DLL to be injected and a small thunk (shellcode), which essentially calls LdrLoadDll with the DLL path as a parameter. This memory obviously needs PAGE_EXECUTE_READ protection, but the driver also has to fill it somehow, and PAGE_EXECUTE_READWRITE is an unacceptable security risk.
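
To make the contents of that allocation concrete, here is a minimal sketch of how such a buffer could be laid out. The layout itself (thunk at offset 0, UTF-16 path right after it on a 2-byte boundary) is an assumption made for illustration, the thunk bytes are placeholders rather than real shellcode, and `payload_layout` and `build_payload` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder bytes standing in for the architecture-specific thunk. */
static const uint8_t k_thunk_stub[] = { 0xCC, 0xCC, 0xCC };

typedef struct {
    size_t thunk_offset;  /* where the thunk starts inside the buffer     */
    size_t path_offset;   /* where the UTF-16 DLL path starts             */
    size_t path_bytes;    /* path length in bytes, without the terminator */
    size_t total_bytes;   /* bytes of the buffer actually used            */
} payload_layout;

/* Fill buf with the thunk followed by the path converted to UTF-16LE.
 * Only ASCII paths are handled. Returns 0 on success, -1 if buf is too small. */
int build_payload(uint8_t *buf, size_t buf_size,
                  const char *ascii_path, payload_layout *out)
{
    size_t path_chars = strlen(ascii_path);
    size_t path_off   = (sizeof(k_thunk_stub) + 1u) & ~(size_t)1u; /* 2-byte align */
    size_t path_bytes = path_chars * 2;
    size_t total      = path_off + path_bytes + 2;  /* + UTF-16 terminator */

    if (total > buf_size)
        return -1;

    memset(buf, 0, total);
    memcpy(buf, k_thunk_stub, sizeof(k_thunk_stub));
    for (size_t i = 0; i < path_chars; i++)
        buf[path_off + 2 * i] = (uint8_t)ascii_path[i];  /* high byte stays 0 */

    out->thunk_offset = 0;
    out->path_offset  = path_off;
    out->path_bytes   = path_bytes;
    out->total_bytes  = total;
    return 0;
}
```

The byte length is tracked separately because LdrLoadDll receives the DLL name as a counted UTF-16 string (a UNICODE_STRING) rather than as a NUL-terminated C string.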

It might be tempting to use ZwAllocateVirtualMemory and ZwProtectVirtualMemory, but unfortunately the latter has only been exported since Windows 8.1.

The solution used in this driver is to create a section ([ZwCreateSection][MSDN-CreateSection]), map it ([ZwMapViewOfSection][MSDN-MapViewOfSection]) with PAGE_READWRITE protection, write the data, unmap it ([ZwUnmapViewOfSection][MSDN-UnmapViewOfSection]) and then map it again with PAGE_EXECUTE_READ protection.

Using sections raises another problem. Since this driver performs injection from the image load notification callback, which is often called from the NtMapViewOfSection function, we'd be calling MapViewOfSection recursively. This wouldn't be a problem if mapping the section didn't acquire EPROCESS->AddressCreationLock. Because it does, we would end up in a deadlock.

The solution used in this driver is to queue a kernel-mode APC first and call ZwMapViewOfSection from it. This kernel-mode APC is delivered right before the kernel-to-user-mode transition, so the internal NtMapViewOfSection call is no longer on the call stack (and therefore AddressCreationLock is released).
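
The deferral pattern can be illustrated with a single-threaded user-mode model. It is a sketch, not driver code: every `model_*` name is hypothetical, a boolean stands in for AddressCreationLock, a function pointer stands in for the queued kernel-mode APC, and a mapping attempt under the lock returns `false` where the real call would deadlock.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*deferred_fn)(void *ctx);

typedef struct {
    bool        address_lock_held;  /* stands in for AddressCreationLock     */
    deferred_fn pending;            /* stands in for the queued kernel APC   */
    void       *pending_ctx;
    int         payload_views;      /* how many injection views were mapped  */
} model_process;

/* Stand-in for mapping our section: fails where the real call would block. */
bool model_map_view(model_process *p)
{
    if (p->address_lock_held)
        return false;
    p->address_lock_held = true;
    p->payload_views++;
    p->address_lock_held = false;
    return true;
}

static void model_deferred_map(void *ctx)
{
    model_map_view((model_process *)ctx);
}

/* Image-load callback: runs with the lock held, so it only queues the work. */
void model_image_load_callback(model_process *p)
{
    p->pending     = model_deferred_map;
    p->pending_ctx = p;
}

/* Stand-in for the system mapping an image and notifying the driver. */
void model_map_image(model_process *p)
{
    p->address_lock_held = true;
    model_image_load_callback(p);
    p->address_lock_held = false;
}

/* Stand-in for the point just before the return to user mode. */
void model_exit_to_user(model_process *p)
{
    if (p->pending != NULL) {
        deferred_fn fn = p->pending;
        p->pending = NULL;
        fn(p->pending_ctx);
    }
}
```

Calling `model_map_image` followed by `model_exit_to_user` maps the payload view exactly once, and only after the lock has been released.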

Injection of our DLL is triggered on the first DLL load that happens after all the important system DLLs (mentioned above) are already loaded.

For native processes, the code flow is as follows:

  • process.exe is created (process create notification)
  • process.exe is loaded (image load notification)
  • ntdll.dll is loaded (image load notification)
  • kernel32.dll is loaded (image load notification + injection happens here)

For Wow64 processes, the code flow is as follows:

  • process.exe is created (process create notification)
  • process.exe is loaded (image load notification)
  • ntdll.dll is loaded (image load notification)
  • wow64.dll is loaded (image load notification)
  • wow64cpu.dll is loaded (image load notification)
  • wow64win.dll is loaded (image load notification)
  • ntdll.dll is loaded (image load notification - note, this is 32-bit ntdll.dll)
  • kernel32.dll is loaded (image load notification + injection happens here)
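
Both flows above boil down to "remember which system DLLs were seen, and inject on the first load after the required set is complete". The following user-mode sketch models that rule. The flag names, `proc_state` and `on_image_load` are hypothetical, and the exact, case-sensitive path-suffix comparison is a simplification chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    DLL_NATIVE_NTDLL = 1u << 0,
    DLL_WOW64        = 1u << 1,
    DLL_WOW64CPU     = 1u << 2,
    DLL_WOW64WIN     = 1u << 3,
    DLL_WOW64_NTDLL  = 1u << 4,

    REQUIRED_NATIVE  = DLL_NATIVE_NTDLL,
    REQUIRED_WOW64   = DLL_NATIVE_NTDLL | DLL_WOW64 | DLL_WOW64CPU |
                       DLL_WOW64WIN | DLL_WOW64_NTDLL
};

typedef struct {
    bool     is_wow64;
    uint32_t loaded;    /* bitmask of system DLLs seen so far */
    bool     injected;
} proc_state;

static const struct { const char *suffix; uint32_t flag; } k_system_dlls[] = {
    { "\\System32\\ntdll.dll",    DLL_NATIVE_NTDLL },
    { "\\System32\\wow64.dll",    DLL_WOW64        },
    { "\\System32\\wow64cpu.dll", DLL_WOW64CPU     },
    { "\\System32\\wow64win.dll", DLL_WOW64WIN     },
    { "\\SysWOW64\\ntdll.dll",    DLL_WOW64_NTDLL  },
};

static bool ends_with(const char *s, const char *suffix)
{
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + (n - m), suffix) == 0;
}

/* Modelled image-load notification.
 * Returns true when this load is the one that triggers the injection. */
bool on_image_load(proc_state *s, const char *image_path)
{
    uint32_t required = s->is_wow64 ? REQUIRED_WOW64 : REQUIRED_NATIVE;

    /* All required DLLs were seen earlier: inject on this load, once. */
    if (!s->injected && (s->loaded & required) == required) {
        s->injected = true;
        return true;
    }

    /* Otherwise just record the load if it is one of the tracked DLLs. */
    for (size_t i = 0; i < sizeof(k_system_dlls) / sizeof(k_system_dlls[0]); i++)
        if (ends_with(image_path, k_system_dlls[i].suffix))
            s->loaded |= k_system_dlls[i].flag;

    return false;
}
```

Because the check uses a bitmask rather than a fixed sequence, the model does not depend on the order in which the tracked DLLs arrive.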

NOTE: The load of kernel32.dll was used as an example. In fact, the load of any DLL will trigger the injection. In practice, though, kernel32.dll is loaded into every Windows process, even if:

  • it has no import table
  • it doesn't depend on kernel32.dll
  • it depends only on ntdll.dll (covered by the previous point, I just wanted to make that crystal clear)
  • it is a console application

Also note that the order of loaded DLLs mentioned above might not match the exact order in which the OS loads them.

The only processes that won't be injected by this method are:

  • native processes (such as csrss.exe)
  • pico processes (such as applications running inside [Windows Subsystem for Linux][WSL])

Injection of these processes is not in the scope of this project.

NOTE: On Windows 7, Wow64 loads kernel32.dll and user32.dll (both native and Wow64) into the process. Unfortunately, this load is performed during the initialization of Wow64 (by wow64!ProcessInit), so on Windows 7 we have to wait until these DLLs are loaded as well before injecting into a Wow64 process.

The queued user-mode APC is then force-delivered by calling KeTestAlertThread(UserMode). This call internally checks whether any user-mode APCs are queued and, if so, sets the Thread->ApcState.UserApcPending variable to TRUE. As a result, the kernel delivers this user-mode APC (via KiDeliverApc) on the next transition from kernel mode to user mode.

If we didn't force the delivery of the APC, it would be delivered once the thread entered the alertable state. (Each thread has two alertable states, one for kernel mode and one for user mode; this paragraph refers to Thread->Alerted[UserMode] == TRUE.) Luckily, this happens when the Windows loader in ntdll.dll finishes its job and gives control to the application, specifically by calling NtAlertThread in the LdrpInitialize (or _LdrpInitialize) function. So even if we didn't force the APC, our DLL would still be injected before the main execution takes place.

NOTE: This means that if we didn't force delivery of the APC ourselves, the APC would be delivered BEFORE main/WinMain is executed, but AFTER all [TLS callbacks][TLS-callbacks] are executed. This is because TLS callbacks are also executed in the early process initialization stage, within the LdrpInitialize function.

This behavior is configurable in this project by the ForceUserApc variable (by default it's TRUE).
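
The difference between forced and non-forced delivery can be shown with a deliberately simplified user-mode model. All `apc_*` names are hypothetical and the model ignores most of what the kernel really does. It only captures the three points made above: testing for an alert marks queued user-mode APCs as pending, pending APCs run at the next transition to user mode, and without forcing they wait until the loader's alert at the end of process initialization.

```c
#include <stdbool.h>

typedef struct {
    int  queued_user_apcs;  /* APCs waiting in the user-mode queue          */
    bool user_apc_pending;  /* stands in for ApcState.UserApcPending        */
    int  delivered;         /* APCs that have already run                   */
} apc_thread;

/* Stand-in for what KeTestAlertThread(UserMode) does to the pending flag. */
void apc_test_alert(apc_thread *t)
{
    if (t->queued_user_apcs > 0)
        t->user_apc_pending = true;
}

/* Stand-in for the kernel-to-user transition: pending APCs run here. */
void apc_return_to_user(apc_thread *t)
{
    if (t->user_apc_pending) {
        t->delivered += t->queued_user_apcs;  /* all of them, no picking */
        t->queued_user_apcs = 0;
        t->user_apc_pending = false;
    }
}

/* Queue the injection APC; force_user_apc mirrors the ForceUserApc switch. */
void apc_inject(apc_thread *t, bool force_user_apc)
{
    t->queued_user_apcs++;
    if (force_user_apc)
        apc_test_alert(t);
}

/* Stand-in for the loader's alert call once process initialization is done. */
void apc_loader_done(apc_thread *t)
{
    apc_test_alert(t);
    apc_return_to_user(t);
}
```

With `force_user_apc` set, the APC runs at the very next `apc_return_to_user`. Without it, the APC stays queued across transitions and only runs in `apc_loader_done`.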

NOTE: Some badly written drivers try to inject a DLL into processes by queuing an APC at the wrong time. For example:

  • Queuing an APC for injecting a DLL that doesn't depend only on ntdll.dll right when ntdll.dll is mapped
  • Queuing an APC for injecting a DLL that depends on kernel32.dll right when kernel32.dll is mapped (but not loaded!)

Such an injection will actually work as long as nobody forcefully delivers user-mode APCs. Because this driver triggers immediate delivery of user-mode APCs (all of them; you can't pick which ones are delivered), an APC queued by another driver might be triggered as well. If such an APC consisted of, say, calling LoadLibraryA from kernel32.dll while kernel32.dll was not fully loaded (just mapped), the APC would fail. And because this injection happens in the early process initialization stage, the error would be considered critical and the process start would fail. Since basically every process is being injected, the start of every process would fail, which would make the system practically unusable.

The reason why our DLL is not injected immediately from the ntdll.dll image load callback is simple: the image load callback is called when the DLL is mapped into the process - and at this stage, the DLL is not fully initialized. The initialization takes place
