\chapter{Intercepting Function Calls}\label{ch:intercepting-function-calls}

Lorem Ipsum.

\section{Identified Methods for Intercepting Function and System Calls}\label{sec:methods-for-intercepting}

Lorem Ipsum.


\subsection{\texttt{ptrace} System Call}\label{subsec:ptrace}

The first mechanism one encounters when researching how to intercept system calls on Linux is the \texttt{ptrace} (``process trace'') system call.
This system call allows one process (the tracer) to observe and control the execution of another process (the tracee), including its memory and registers.
Control is handed from the traced process to the tracing process each time a signal is delivered.
\cite{ptrace.2}

A command that makes use of this system call already exists.
See Section~\ref{subsec:strace}.
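
The following listing is a minimal sketch of how a tracer could be built directly on top of \texttt{ptrace}; it is not taken from an existing tool.
The function name \texttt{count\_syscall\_stops} is hypothetical.
It starts a command as a traced child and counts how often the child is stopped at the entry or exit of a system call.
For brevity, the sketch does not distinguish syscall stops from signal-delivery stops and does not forward signals to the child.

\begin{listing}[htbp]
\begin{minted}[linenos]{c}
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs argv[0] as a traced child and returns the number of stops
 * observed while stepping from system call to system call
 * (hypothetical helper; returns -1 on error). */
long count_syscall_stops(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        /* Child: ask to be traced by the parent, then run the command. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execvp(argv[0], argv);
        _exit(127); /* only reached if execvp failed */
    }

    int status;
    long stops = 0;
    /* The child is stopped with SIGTRAP once execvp has succeeded. */
    if (waitpid(pid, &status, 0) < 0) return -1;
    while (WIFSTOPPED(status)) {
        /* Resume the child until the next syscall entry or exit. */
        if (ptrace(PTRACE_SYSCALL, pid, NULL, NULL) < 0) break;
        if (waitpid(pid, &status, 0) < 0) return -1;
        if (WIFSTOPPED(status)) stops++;
    }
    return stops;
}
\end{minted}
\caption{Sketch of a \texttt{ptrace}-based tracer that counts syscall stops.}
\label{lst:ptrace-sketch}
\end{listing}

Calling this function with the argument vector for \texttt{./main} would trace the program from Listing~\ref{lst:main.c}.
A real tracer would additionally read the registers of the child at each stop to decode the system call number and its arguments.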


\subsection{\texttt{strace} Command}\label{subsec:strace}

The \texttt{strace} (``system call/signal trace'') command runs a specified command and intercepts and records the system calls it makes.
Each system call is recorded as one line, which is written either to the standard error output or to a specified file.
\cite{strace.1}

Listings \ref{lst:main.c} and \ref{lst:strace} give a simple example of what this output looks like.
It is clearly visible that only (``pure'') system calls are recorded, and calls to library functions (like \texttt{malloc} or \texttt{free}) do not appear.
Also note that arguments to the calls are displayed in a ``pretty'' way.
For example, string arguments are passed as plain pointers, but \texttt{strace} displays them as C-like strings.

\begin{listing}[htbp]
\begin{minted}[linenos]{c}
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

int main(const int argc, char *const argv[]) {
    char *str = malloc(10);
    strcpy(str, "Abc123");
    printf("Hello World!\nString: %s\n", str);
    free(str);
}
\end{minted}
\caption{Contents of \texttt{main.c}.}
\label{lst:main.c}
\end{listing}

\begin{listing}[htbp]
\begin{minted}{text}
execve("./main", ["./main"], 0x7ffd63b32bb0 /* 71 vars */) = 0
[-- 32 lines omitted --]
write(1, "Hello World!\n", 13) = 13
write(1, "String: Abc123\n", 15) = 15
exit_group(0) = ?
+++ exited with 0 +++
\end{minted}
\caption{Output of \texttt{strace ./main}.}
\label{lst:strace}
\end{listing}

This approach works well for debugging and similar use cases,
but intercepting only system calls does not satisfy the requirements of this work.


\subsection{\texttt{ltrace} Command}\label{subsec:ltrace}

The \texttt{ltrace} (``library call trace'') command may be used to trace dynamic library calls instead of system calls.
It works similarly to \texttt{strace} (see Section~\ref{subsec:strace}).
\cite{ltrace.1}

Listings \ref{lst:main.c} and \ref{lst:ltrace} illustrate what the output of \texttt{ltrace} looks like.
In contrast to the output of \texttt{strace}, only ``real'' calls to library functions are now included in the output.
Therefore, a lot less ``noise'' is generated (see the omitted lines in Listing~\ref{lst:strace}).
Again, the function arguments are displayed in a ``pretty'' way.
To format function arguments, this command uses function prototypes read from its configuration files~\cite{ltrace.conf.5}.

\begin{listing}[htbp]
\begin{minted}{text}
malloc(10) = 0x55624164b2a0
printf("Hello World!\nString: %s\n", "Abc123") = 28
free(0x55624164b2a0) = <void>
+++ exited (status 0) +++
\end{minted}
\caption{Output of \texttt{ltrace ./main}.}
\label{lst:ltrace}
\end{listing}

This method fits the requirements of this work much better than \texttt{strace} (see Section~\ref{subsec:strace}),
but it is not very flexible and offers no means to modify the intercepted function calls.


\subsection{Wrapper Functions in gcc}\label{subsec:wrapper-functions}

A different approach to intercepting function calls is to tell the compiler directly which functions should be intercepted.
The compiler, or more precisely the linker, then resolves calls to the specified functions to wrapper functions instead.
(See Section~\ref{subsec:preloading} for more details.)

The default linker \texttt{ld} includes such a feature.
See the OPTIONS section in the ld(1) Linux manual page~\cite{ld.1}:

\begin{quote}
\begin{description}
\item[\texttt{-{}-wrap=\textit{symbol}}]
Use a wrapper function for \texttt{\textit{symbol}}.
Any undefined reference to \texttt{\textit{symbol}} will be resolved to \texttt{\_\_wrap\_\textit{symbol}}.
Any undefined reference to \texttt{\_\_real\_\textit{symbol}} will be resolved to \texttt{\textit{symbol}}.

This can be used to provide a wrapper for a system function.
The wrapper function should be called \texttt{\_\_wrap\_\textit{symbol}}.
If it wishes to call the system function, it should call \texttt{\_\_real\_\textit{symbol}}.
\lbrack\dots\rbrack
\end{description}
\end{quote}

The gcc compiler also supports this by allowing options to be passed to the linker.
See the OPTIONS section in the gcc(1) Linux manual page~\cite{gcc.1}:

\begin{quote}
\begin{description}
\item[\texttt{-Wl,\textit{option}}]
Pass \texttt{\textit{option}} as an option to the linker.
If \texttt{\textit{option}} contains commas, it is split into multiple options at the commas.
You can use this syntax to pass an argument to the option.
For example, \texttt{-Wl,-Map,output.map} passes \texttt{-Map output.map} to the linker.
When using the GNU linker, you can also get the same effect with \texttt{-Wl,-Map=output.map}.
\lbrack\dots\rbrack
\end{description}
\end{quote}

This means that by specifying \texttt{-Wl,-{}-wrap=\textit{symbol}} when compiling with gcc,
all calls from the currently compiled program to \texttt{\textit{symbol}} are redirected to \texttt{\_\_wrap\_\textit{symbol}}.
To call the real function inside the wrapper, \texttt{\_\_real\_\textit{symbol}} may be used.
Listings \ref{lst:wrap.c} and \ref{lst:wrap} illustrate this by wrapping the \texttt{malloc} function of the C standard library.

\begin{listing}[htbp]
\begin{minted}[linenos]{c}
#include <stddef.h>

extern void *__real_malloc(size_t size);

void *__wrap_malloc(size_t size) {
    // before call to malloc
    void *ret = __real_malloc(size);
    // after call to malloc
    return ret;
}
\end{minted}
\caption{Contents of \texttt{wrap.c}.}
\label{lst:wrap.c}
\end{listing}

\begin{listing}[htbp]
\begin{minted}{shell}
gcc -o main_wrapped main.c wrap.c -Wl,--wrap=malloc
./main_wrapped
\end{minted}
\caption{Compile \texttt{main.c} and \texttt{wrap.c} and run the resulting program.}
\label{lst:wrap}
\end{listing}

This approach allows wrapping any function in a relatively clean way.
However, it is not possible to override functions in an arbitrary, already linked binary.
The program has to be re-compiled (or at least re-linked) to use this feature of \texttt{ld}.
Therefore, the source code (or the corresponding \texttt{*.o} object files) needs to be available.
Note that only calls from the code being linked are redirected; calls made inside other libraries are not.

Theoretically, it should be possible to re-link a given binary without having access to its source code.
But because more straightforward methods exist (see Section~\ref{subsec:preloading}), this has not been investigated further.


\subsection{Preloading using \texttt{LD\_PRELOAD}}\label{subsec:preloading}

To execute binary files on Linux systems, a dynamic linker is needed at runtime
(unless the binaries were statically linked at compile-time).
Usually, \texttt{ld.so} and \texttt{ld-linux.so} are used as dynamic linkers.
They find and load the shared objects (shared libraries) needed by a program, prepare the program and finally run it.
\cite{ld.so.8}

As the overwhelming majority of programs are dynamically linked,
most calls to library functions (such as those of the C standard library) reference a shared object, which has to be loaded by the dynamic linker at runtime.
It would therefore be possible to ``hijack'' (or intercept) these function calls
if the dynamic linker allowed other functions to be loaded in place of the original ones.

Luckily, \texttt{ld.so} allows this; the technique is called ``preloading''.
See the ENVIRONMENT section in the ld.so(8) Linux manual page~\cite{ld.so.8}:

\begin{quote}
\begin{description}
\item[\texttt{LD\_PRELOAD}]
A list of additional, user-specified, ELF shared objects to be loaded before all others.
This feature can be used to selectively override functions in other shared objects.
\lbrack\dots\rbrack
\end{description}
\end{quote}

This means that by setting the environment variable \texttt{LD\_PRELOAD}, it is possible to override specific functions.
Listings \ref{lst:preload.c} and \ref{lst:preload} illustrate this by overriding the \texttt{malloc} function of the C standard library.

\begin{listing}[htbp]
\begin{minted}[linenos]{c}
// glibc only declares RTLD_NEXT if _GNU_SOURCE is defined
#define _GNU_SOURCE
#include <stdlib.h>
#include <dlfcn.h>
#include <errno.h>

void *malloc(size_t size) {
    // before call to malloc
    void *(*_malloc)(size_t);
    if ((_malloc = dlsym(RTLD_NEXT, "malloc")) == NULL) {
        errno = ENOSYS;
        return NULL;
    }
    void *ret = _malloc(size);
    // after call to malloc
    return ret;
}
\end{minted}
\caption{Contents of \texttt{preload.c}.}
\label{lst:preload.c}
\end{listing}

\begin{listing}[htbp]
\begin{minted}{shell}
# ./main is already compiled and ready
gcc -shared -fPIC -o preload.so preload.c
LD_PRELOAD="$(pwd)/preload.so" ./main
\end{minted}
\caption{Compile \texttt{preload.c} and run a program with \texttt{LD\_PRELOAD}.}
\label{lst:preload}
\end{listing}

The function \texttt{dlsym} is used to retrieve the address of the original \texttt{malloc} function.
\texttt{RTLD\_NEXT} instructs \texttt{dlsym} to find the next occurrence of \texttt{malloc} in the search order after the current object.
\cite{dlsym.3}

Using this method, it is possible to override, and therefore wrap, any function as long as the targeted binary was not statically linked.
However, one has to be aware that not only function calls inside the targeted binary, but also calls inside other libraries (e.g., to \texttt{malloc}) are redirected to the overriding function.
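
Listing~\ref{lst:preload.c} looks up the original function on every call.
The following sketch shows one possible variation that resolves the address only once and caches it.
To keep the example small, it intercepts \texttt{puts} instead of \texttt{malloc}; this choice, the counter \texttt{puts\_calls} and the variable \texttt{real\_puts} are assumptions made purely for illustration.
The sketch is not thread-safe.

\begin{listing}[htbp]
\begin{minted}[linenos]{c}
// glibc only declares RTLD_NEXT if _GNU_SOURCE is defined
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

// number of intercepted calls (hypothetical bookkeeping)
static unsigned long puts_calls = 0;

int puts(const char *s) {
    // resolve the next definition of puts once and cache the address
    static int (*real_puts)(const char *) = NULL;
    if (real_puts == NULL) {
        real_puts = (int (*)(const char *)) dlsym(RTLD_NEXT, "puts");
        if (real_puts == NULL) {
            return EOF; // no further definition found
        }
    }
    puts_calls++;
    // forward the call to the original function
    return real_puts(s);
}
\end{minted}
\caption{Sketch of an overriding function that caches the result of \texttt{dlsym}.}
\label{lst:preload-cached}
\end{listing}

The file can be compiled and preloaded in the same way as shown in Listing~\ref{lst:preload}.
On glibc versions older than 2.34, linking against \texttt{libdl} (\texttt{-ldl}) may additionally be required.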


\subsection{Conclusion}\label{subsec:conclusion}

Lorem Ipsum.

\section{Combining Preloading and Wrapper Functions}\label{sec:combining-preloading-and-wrapper-functions}

Lorem Ipsum.

\section{Retrieving Function Argument Values}\label{sec:Retrieving-function-argument-values}

Lorem Ipsum.

\section{Determining Function Call Location}\label{sec:determining-function-call-location}

Lorem Ipsum.

\section{Example}\label{sec:intercepting-example}

Lorem Ipsum.

\section{Analyzing Intercepted Function Calls}\label{sec:analyzing-intercepted-function-calls}

Lorem Ipsum.

\section{Parsing Intercepted Function Calls in Python}\label{sec:parsing-intercepted-function-calls}

Lorem Ipsum.

\section{Automated Testing on Intercepted Function Calls}\label{sec:automated-testing-on-intercepted-function-calls}

Lorem Ipsum.