thesis: Unify \ref

2025-07-21 17:12:35 +02:00
parent f70369ec98
commit 7055f74d8e

@@ -2,10 +2,10 @@
 \chapter{Intercepting Function Calls}\label{ch:intercepting-function-calls}
 In this chapter all steps on how to intercept function calls in this work are discussed.
-An example of what the resulting interception looks like may be found in section \ref{sec:intercepting-example}.
+An example of what the resulting interception looks like may be found in Section~\ref{sec:intercepting-example}.
-Furthermore, an overview on how to test given programs is presented in section \ref{sec:automated-testing-on-intercepted-function-calls}.
+Furthermore, an overview on how to test given programs is presented in Section~\ref{sec:automated-testing-on-intercepted-function-calls}.
 This chapter does not discuss how these function calls may be manipulated in any way.
-For that see chapter \ref{ch:manipulating-function-calls}.
+For that see Chapter~\ref{ch:manipulating-function-calls}.
 \section{Identified Methods for Intercepting Function and System Calls}\label{sec:methods-for-intercepting}
@@ -24,7 +24,7 @@ The control is handed from the traced process to the tracing process each time a
 \cite{ptrace.2}
 To make use of this system call, a corresponding command already exists.
-See~\ref{subsec:strace}.
+See Subsection~\ref{subsec:strace}.
 \subsection{\texttt{strace} Command}\label{subsec:strace}
@@ -33,7 +33,7 @@ The \texttt{strace} (``system call/signal trace'') command may be used to run a
 Each system call is recorded as a line and either written to the standard error output or a specified file.
 \cite{strace.1}
-Listings \ref{lst:main.c} and \ref{lst:strace} give a simple example of what this output looks like.
+Listings~\ref{lst:main.c} and~\ref{lst:strace} give a simple example of what this output looks like.
 It is clearly visible that only (``pure'') system calls are recorded, and calls to library functions (like \texttt{malloc} or \texttt{free}) do not appear.
 Also note that arguments to the calls are displayed in a ``pretty'' way.
 For example, string arguments would be simple pointers, but \texttt{strace} displays them as C-like strings.
@@ -67,9 +67,9 @@ The \texttt{ltrace} (``library call trace'') command may be used to trace dynami
 It works similarly to \texttt{strace} (see \ref{subsec:strace}).
 \cite{ltrace.1}
-Listings \ref{lst:main.c} and \ref{lst:ltrace} illustrate what the output of \texttt{ltrace} looks like.
+Listings~\ref{lst:main.c} and~\ref{lst:ltrace} illustrate what the output of \texttt{ltrace} looks like.
 In contrast to the output of \texttt{strace} now only ``real'' calls to library functions are included in the output.
-Therefore, a lot less ``noise'' is generated (see omitted lines in listing \ref{lst:strace}).
+Therefore, a lot less ``noise'' is generated (see omitted lines in Listing~\ref{lst:strace}).
 Again, the function arguments are displayed in a ``pretty'' way.
 This command uses so-called prototype functions~\cite{ltrace.conf.5} to format function arguments.
@@ -84,21 +84,21 @@ free(0x55624164b2a0) = <void>
 \label{lst:ltrace}
 \end{listing}
-This method fits the requirements for this work a lot better than \texttt{strace} (see~\ref{subsec:strace}),
+This method fits the requirements for this work a lot better than \texttt{strace} (see Subsection~\ref{subsec:strace}),
 but it is not very flexible and offers no means to modify the intercepted function calls.
 \subsection{Kernel Module}\label{subsec:kernel-module}
 Another possibility to intercept system calls is to intercept them directly in the kernel via a kernel module.
 However, this work did not explore this approach further due to time constraints and other, better-fitting alternatives.
-See \cite[Section~7.2]{netsectools2005} for more details on how to intercept system calls using kernel modules.
+See~\cite[Section~7.2]{netsectools2005} for more details on how to intercept system calls using kernel modules.
 \subsection{Wrapper Functions in gcc}\label{subsec:wrapper-functions}
 A different approach to intercepting function calls is to tell the compiler directly which functions should be intercepted.
 The compiler, and the linker respectively, then directly link calls to the specified functions to wrapper functions.
-(See \ref{subsec:preloading} for more details.)
+(See Subsection~\ref{subsec:preloading} for more details.)
 The default linker \texttt{ld} includes such a feature.
 See the OPTIONS section in the ld(1) Linux manual page~\cite{ld.1}:
@@ -135,7 +135,7 @@ See the OPTIONS section in the gcc(1) Linux manual page~\cite{gcc.1}:
 This means, by specifying \texttt{-Wl,-{}-wrap=\textit{symbol}} when compiling using gcc,
 all calls from the currently compiled program to \texttt{\textit{symbol}} are redirected to \texttt{\_\_wrap\_\textit{symbol}}.
 To call the real function inside the wrapper, \texttt{\_\_real\_\textit{symbol}} may be used.
-Listings \ref{lst:wrap.c} and \ref{lst:wrap} try to illustrate this by overriding the \texttt{malloc} function of the C standard library.
+Listings~\ref{lst:wrap.c} and~\ref{lst:wrap} try to illustrate this by overriding the \texttt{malloc} function of the C standard library.
 \begin{listing}[htbp]
 \inputminted[linenos]{c}{listings/wrap.c}
@@ -159,7 +159,7 @@ Therefore, the source code (or the corresponding \texttt{*.out} files) needs to
 Note, only calls from the targeted source code will be redirected, calls from other libraries won't.
 Theoretically, it should be possible to re-link a given binary without having access to its source code.
-But due to other more straight-forward methods (see \ref{subsec:preloading}), this has not been further investigated.
+But due to other more straight-forward methods (see Subsection~\ref{subsec:preloading}), this has not been further investigated.
 \subsection{Preloading using \texttt{LD\_PRELOAD}}\label{subsec:preloading}
@@ -188,7 +188,7 @@ See the ENVIRONMENT section in the ld.so(8) Linux manual page~\cite{ld.so.8}:
 \end{quote}
 This means, by setting the environment variable \texttt{LD\_PRELOAD}, it is possible to override specific functions.
-Listings \ref{lst:preload.c} and \ref{lst:preload} try to illustrate this by overriding the \texttt{malloc} function of the C standard library.
+Listings~\ref{lst:preload.c} and~\ref{lst:preload} try to illustrate this by overriding the \texttt{malloc} function of the C standard library.
 \begin{listing}[htbp]
 \inputminted[linenos]{c}{listings/preload.c}
@@ -221,7 +221,7 @@ it has been found that the most reliable way to achieve the goals of this work (
 This is because (as long as the programs to test are dynamically linked), intercepting function calls allows one to intercept many more calls and in a more flexible way.
 Therefore, from now on this work only considers function calls and no system calls directly.
-In this work preloading (see \ref{subsec:preloading}) was chosen to be used
+In this work preloading (see Subsection~\ref{subsec:preloading}) was chosen to be used
 because it is simple to use (``clean'' source code, easy to compile and run programs with it) and offers the means to arbitrarily execute code when the intercepted function call is redirected.
 The following sections concern the next steps in what else is needed to create a powerful ``interceptor''.
@@ -231,7 +231,7 @@ The following sections concern the next steps in what else is needed to create a
 After deciding to use the preloading method to intercept function calls, a more detailed plan is needed to continue developing.
 It was decided to have one single \texttt{intercept.so} file as a resulting artifact which then may be loaded via the \texttt{LD\_PRELOAD} environment variable.
 The easiest and most straightforward way to structure the source code was to put all code in one single C file.
-Listing \ref{lst:intercept-preload.c} gives an overview over the grounding code structure.
+Listing~\ref{lst:intercept-preload.c} gives an overview over the grounding code structure.
 For each function that should be intercepted, this function simply has to be declared and defined the same way \texttt{malloc} was.
 \begin{listing}[htbp]
@@ -251,7 +251,7 @@ As already mentioned, \texttt{ltrace} uses prototype functions to format its fun
 This allows \texttt{ltrace} to ``dynamically'' display function arguments for any new or unknown functions without the need for recompilation.
 \cite{ltrace.conf.5}
-However, due to implementation complexity reasons and the need for ``complex''\todo{} return types (see~\ref{sec:retrieving-function-return-values}) a statically compiled approach has been used for this work.
+However, due to implementation complexity reasons and the need for ``complex''\todo{} return types (see Section~\ref{sec:retrieving-function-return-values}) a statically compiled approach has been used for this work.
 This means that each function formats its arguments and return values itself without any configuration option.
 The reason for retrieving as much information as possible from each function call is that at a later point in time it is possible to completely reconstruct the exact function calls an their sequence.
@@ -425,7 +425,7 @@ See the OPTIONS section in the readelf(1) Linux manual page~\cite{readelf.1}:
 \section{\texttt{intercept.so} Library}\label{sec:intercept.so-library}
 The time has come for putting it all together.
-As mentioned in \ref{sec:fundameltal-project-structure}, almost the whole project exists in one source file, \texttt{intercept.c}.
+As mentioned in Section~\ref{sec:fundameltal-project-structure}, almost the whole project exists in one source file, \texttt{intercept.c}.
 This file is compiled to \texttt{intercept.so}, which may be preloaded using \texttt{LD\_PRELOAD} and controlled with other environment variables.
 These other environment variables are described in the following:
@@ -468,7 +468,7 @@ The shared object currently supports intercepting the following functions:
 To make the usage of the aforementioned shared object more easy, a simple python script has been put together.
 This script may be used as a command line tool.
-See listing \ref{lst:intercept}.
+See Listing~\ref{lst:intercept}.
 \begin{listing}[htbp]
 \inputminted[linenos]{python}{../proj/intercept/intercept}
@@ -485,7 +485,7 @@ intercept [-h] [-F FUNCTIONS] [-s] [-o | -L LIBRARIES] \
 \begin{description}
 \item[\texttt{-F}, \texttt{-{}-functions}]
 A list of functions to intercept.
-See \ref{sec:intercept.so-library} for more details.
+See Section~\ref{sec:intercept.so-library} for more details.
 Default value is \texttt{*}.
 \item[\texttt{-s}, \texttt{-{}-sparse}]
 Indicates that strings and structures should be printed empty to save bandwidth.
@@ -494,7 +494,7 @@ intercept [-h] [-F FUNCTIONS] [-s] [-o | -L LIBRARIES] \
 This has the effect, that only function calls from the executed binary itself are recorded.
 \item[\texttt{-L}, \texttt{-{}-libraries}]
 A list of library paths to intercept function calls from.
-See \ref{sec:intercept.so-library} for more details.
+See Section~\ref{sec:intercept.so-library} for more details.
 Default value is \texttt{*} (except when \texttt{-o} is present).
 \item[\texttt{-l}, \texttt{-{}-log}]
 Used to specify in which file the recorded function calls should be logged.
@@ -502,13 +502,13 @@ intercept [-h] [-F FUNCTIONS] [-s] [-o | -L LIBRARIES] \
 \item[\texttt{-i}, \texttt{-{}-intercept}]
 Decides where to output/print/write/send the recorded function calls.
 Values may be \texttt{stdout}, \texttt{stderr}, \texttt{file:\textit{<path>}}, \texttt{unix:\textit{<path>}}.
-See \ref{sec:intercept.so-library} for more details.
+See Section~\ref{sec:intercept.so-library} for more details.
 \end{description}
 \section{Example}\label{sec:intercepting-example}
-To make it easier for the reader listing \ref{lst:intercept-client} provides some recorded function calls.
+To make it easier for the reader, Listing~\ref{lst:intercept-client} provides some recorded function calls.
 Most lines had to be broken up into multiple lines for better readability.
 The recorded calls stem from a program written by myself as a solution for an assignment in the Operating Systems course at university.
 It is a simple HTTP client.