diff --git a/thesis/src/04.related-work.tex b/thesis/src/04.related-work.tex
index ac44b4b..e7003db 100644
--- a/thesis/src/04.related-work.tex
+++ b/thesis/src/04.related-work.tex
@@ -1,30 +1,63 @@
 \chapter{Related Work}\label{ch:related-work}
+This chapter gives a rough overview of techniques and methods to intercept or hook system calls and function calls.
 See also Section~\ref{sec:methods-for-intercepting}.
-
-Lorem Ipsum.
+Many methods have already been discussed there.
-\section{GDB Checker}\label{sec:gdb-checker}
+\section{Function Call Interception}\label{sec:function-call-interception}
-Lorem Ipsum.
+All related work regarding function call interception has already been mentioned in the aforementioned section.
+See \texttt{ltrace} (Subsection~\ref{subsec:ltrace}), wrapper functions (Subsection~\ref{subsec:wrapper-functions}), and \texttt{LD\_PRELOAD} (Subsection~\ref{subsec:preloading}).
-\section{Binary-Rewriting-Based}\label{ec:binary-rewriting-based}
+\section{System Call Interception}\label{sec:system-call-interception}
+
+This section discusses further related work regarding system call interception.
+This excludes techniques already discussed in Section~\ref{sec:methods-for-intercepting},
+such as \texttt{ptrace} (Subsection~\ref{subsec:ptrace}) and \texttt{strace} (Subsection~\ref{subsec:strace}).
+Almost all of the following methods (except SUD, Subsection~\ref{subsec:syscall-user-dispatch}) use binary rewriting to replace system calls with other instructions.
+This is one of the reasons why they were not mentioned in Section~\ref{sec:methods-for-intercepting}.
+Another is that this work focuses on function call interception, not system call interception.
+
+
+\subsection{\texttt{int3} Signaling}\label{subsec:int3-signaling}
+
+\texttt{int3} is a one-byte instruction (\texttt{0xcc}) that invokes a software interrupt.
+On Linux, the kernel handles it and raises \texttt{SIGTRAP} to the user-space process that executed \texttt{int3}.
+The \texttt{int3} signaling technique exploits this behavior to hook system calls; it replaces \texttt{syscall}/\texttt{sysenter} with \texttt{int3} and employs the signal handler for \texttt{SIGTRAP} as the hook function.
+Since \texttt{int3} is one byte, it can replace an arbitrary instruction without breaking the neighboring instructions.
+This technique is traditionally used in debuggers to implement breakpoints.
+However, signal handling incurs a large overhead because it involves context manipulation by the kernel.
+\cite{zpoline}
+
+
+\subsection{Syscall User Dispatch (SUD)}\label{subsec:syscall-user-dispatch}
+
+Syscall User Dispatch (SUD)~\cite{sud} was added in Linux 5.11, and it offers a way to redirect system calls to arbitrary user-space code.
+For the SUD feature, the kernel implements a hook point at the entry point of system calls.
+A user-space process can activate SUD via the \texttt{prctl} interface.
+When SUD is activated, the hook point raises \texttt{SIGSYS} to the user-space process.
+This mechanism allows a user-space program to leverage the \texttt{SIGSYS} signal handler as the system call hook.
+However, similarly to the \texttt{int3} signaling technique, SUD imposes a significant performance penalty on the user-space program due to the overhead of signal handling.
+\cite{zpoline}
+
 \subsection{zpoline}\label{subsec:zpoline}
-Lorem Ipsum.
+zpoline is a system call hook mechanism for x86-64 CPUs.
+Binary rewriting is used to replace (two-byte) \texttt{syscall}/\texttt{sysenter} instructions with a (two-byte) \texttt{callq *\%rax} instruction.
+Because this instruction calls the address held in \texttt{rax}, which also holds the system call number, the trampoline code has to be placed starting at virtual address 0.
+zpoline is exhaustive and incurs very low overhead (28--761 times less overhead compared to other exhaustive system call hooking techniques).
 \cite{zpoline}
 \subsection{DataHook}\label{subsec:datahook}
-Lorem Ipsum.
+DataHook is a system call hooking technique for glibc-based 32-bit programs running on x86 or x86-64 machines.
+It relies on glibc's way of performing system calls, namely a \texttt{call *\%gs:0x10} instruction that calls the \texttt{\_\_kernel\_vsyscall} function.
+The content of \texttt{gs:0x10} is backed up and then modified to point to a given hook function.
+DataHook is only exhaustive when used on glibc-based programs.
+It incurs very low overhead (5--1429 times less overhead compared to existing hooking techniques).
 \cite{datahook}
-
-
-\section{Non-Binary-Rewriting-Based}\label{sec:non-binary-rewriting-based}
-
-Lorem Ipsum.
diff --git a/thesis/src/99.intercept.bib b/thesis/src/99.intercept.bib
index 91661e8..0f1fa44 100644
--- a/thesis/src/99.intercept.bib
+++ b/thesis/src/99.intercept.bib
@@ -59,6 +59,10 @@
     title = {Using the GNU Compiler Collection (GCC)},
     url = {https://gcc.gnu.org/onlinedocs/gcc/index.html},
 }
+@manual{sud,
+    title = {Syscall User Dispatch -- The Linux Kernel documentation},
+    url = {https://docs.kernel.org/admin-guide/syscall-user-dispatch.html},
+}
 @inproceedings{zpoline,
     author = {Kenichi Yasukata and Hajime Tazaki and Pierre-Louis Aublin and Kenta Ishiguro},
     title = {zpoline: a system call hook mechanism based on binary rewriting},