diff --git a/doc/README.md b/doc/README.md
index fc503f4..c00b1af 100644
--- a/doc/README.md
+++ b/doc/README.md
@@ -16,7 +16,7 @@ From the [ENVIRONMENT section in the Linux manual page ld.so(8)](https://www.man
 * No need to re-link
 * Works for *all* functions
 * Works only on dynamically linked executables
-* Intercepts all calls (including stack allocations etc.)
+* Intercepts all calls (including calls inside libraries etc.)
 
 Example (`preload.c`):
 ```c
diff --git a/thesis/.gitignore b/thesis/.gitignore
index fcbbe84..c06a2ea 100644
--- a/thesis/.gitignore
+++ b/thesis/.gitignore
@@ -18,3 +18,4 @@
 *.toc
 *.xmpdata
 *.xmpi
+*.bbl
diff --git a/thesis/Makefile b/thesis/Makefile
index 8d3c2b1..a4f0038 100644
--- a/thesis/Makefile
+++ b/thesis/Makefile
@@ -3,11 +3,10 @@
 all: thesis.pdf clean-out
 
 %.pdf: %.tex $(wildcard src/*)
-	pdflatex $<
-	pdflatex $<
+	latexmk -pdf $<
 
-clean: clean-out
-	rm -rf *.pdf
+clean:
+	latexmk -C
 
 clean-out:
-	rm -rf *.acn *.aux *.glo *.glsdefs *.idx *.ist *.loa *.lof *.log *.lot *.mw *.out *.toc *.xmpdata *.xmpi
+	latexmk -c
diff --git a/thesis/src/01.introduction.tex b/thesis/src/01.introduction.tex
index efce03a..ccf8acf 100644
--- a/thesis/src/01.introduction.tex
+++ b/thesis/src/01.introduction.tex
@@ -3,6 +3,18 @@
 
 Lorem Ipsum.
 
-\section{Something}
+\section{TODO: Why intercept?}
+
+Lorem Ipsum.
+
+\section{TODO: Why are current solutions not enough?}
+
+Lorem Ipsum.
+
+\section{TODO: Linux/C/ELF call structure}
+
+Lorem Ipsum.
+
+\section{TODO: System Calls vs. Function Calls}\label{sec:system-calls-vs-function-calls}
 
 Lorem Ipsum.
diff --git a/thesis/src/02.intercept.tex b/thesis/src/02.intercept.tex
index ce14143..de7b069 100644
--- a/thesis/src/02.intercept.tex
+++ b/thesis/src/02.intercept.tex
@@ -2,3 +2,150 @@
 \chapter{Intercepting Function Calls}\label{ch:intercepting-function-calls}
 
 Lorem Ipsum.
+
+\section{Identified Methods for Intercepting Function and System Calls}\label{sec:methods-for-intercepting}
+
+Lorem Ipsum.
+
+\subsection{Preloading using \texttt{LD\_PRELOAD}}\label{subsec:preloading}
+
+To execute binary files on Linux systems, a dynamic linker is needed at runtime,
+unless the binaries were statically linked at build time.
+Usually, \texttt{ld.so} and \texttt{ld-linux.so} are used as dynamic linkers.
+They find and load the shared objects (shared libraries) needed by a program, prepare the program, and finally run it.
+\cite{ld.so.8}
+
+As the overwhelming majority of programs are dynamically linked,
+most function calls to other libraries (such as the C standard library) reference a shared object, which has to be loaded by the linker at runtime.
+Therefore, it is possible to ``hijack'' (or intercept) these function calls,
+if the linker allows loading other functions in place of the original ones.
+
+Luckily, \texttt{ld.so} allows this so-called ``preloading''.
+See the ENVIRONMENT section in the ld.so(8) Linux manual page~\cite{ld.so.8}:
+
+\begin{quote}
+    \begin{description}
+        \item[\texttt{LD\_PRELOAD}]
+        A list of additional, user-specified, ELF shared objects to be loaded before all others.
+        This feature can be used to selectively override functions in other shared objects.
+        \lbrack\dots\rbrack
+    \end{description}
+\end{quote}
+
+This means that, by setting the environment variable \texttt{LD\_PRELOAD}, it is possible to override specific functions.
+Listings~\ref{lst:preload.c} and~\ref{lst:preload} illustrate this.
+
+\begin{listing}[htbp]
+    \begin{minted}[linenos]{c}
+#include <dlfcn.h>
+#include <errno.h>
+#include <stdlib.h>
+
+void *malloc(size_t size) {
+    // before call to malloc
+    void *(*_malloc)(size_t);
+    if ((_malloc = dlsym(RTLD_NEXT, "malloc")) == NULL) {
+        errno = ENOSYS;
+        return NULL;
+    }
+    void *ret = _malloc(size);
+    // after call to malloc
+    return ret;
+}
+    \end{minted}
+    \caption{Contents of \texttt{preload.c}.}
+    \label{lst:preload.c}
+\end{listing}
+
+\begin{listing}[htbp]
+    \begin{minted}{shell}
+# ./main is already compiled and ready; -D_GNU_SOURCE makes <dlfcn.h> define RTLD_NEXT
+gcc -shared -fPIC -D_GNU_SOURCE -o preload.so preload.c
+LD_PRELOAD="$(pwd)/preload.so" ./main
+    \end{minted}
+    \caption{Compile \texttt{preload.so} and run a program with \texttt{LD\_PRELOAD}.}
+    \label{lst:preload}
+\end{listing}
+
+The function \texttt{dlsym} is used to retrieve the address of the original \texttt{malloc} function.
+\texttt{RTLD\_NEXT} instructs \texttt{dlsym} to find the next occurrence of \texttt{malloc} in the search order after the current object.
+\cite{dlsym.3}
+
+Using this method, it is possible to override, and therefore wrap, any dynamically resolved function as long as the targeted binary was not statically linked.
+However, one has to be aware that not only function calls inside the targeted binary, but also calls made from within other libraries (e.g., to \texttt{malloc}), are redirected to the overriding function.
+
+\subsection{Wrapper Functions in \texttt{gcc}}\label{subsec:wrapper-functions}
+
+From the OPTIONS section in the ld(1) Linux manual page~\cite{ld.1}:
+
+\begin{quote}
+    \begin{description}
+        \item[\texttt{--wrap=\textit{symbol}}]
+        Use a wrapper function for \texttt{\textit{symbol}}.
+        Any undefined reference to \texttt{\textit{symbol}} will be resolved to \texttt{\_\_wrap\_\textit{symbol}}.
+        Any undefined reference to \texttt{\_\_real\_\textit{symbol}} will be resolved to \texttt{\textit{symbol}}.
+
+        This can be used to provide a wrapper for a system function.
+        The wrapper function should be called \texttt{\_\_wrap\_\textit{symbol}}.
+        If it wishes to call the system function, it should call \texttt{\_\_real\_\textit{symbol}}.
+        \lbrack\dots\rbrack
+    \end{description}
+\end{quote}
+
+From the OPTIONS section in the gcc(1) Linux manual page~\cite{gcc.1}:
+
+\begin{quote}
+    \begin{description}
+        \item[\texttt{-Wl,\textit{option}}]
+        Pass \texttt{\textit{option}} as an option to the linker.
+        If \texttt{\textit{option}} contains commas, it is split into multiple options at the commas.
+        You can use this syntax to pass an argument to the option.
+        For example, \texttt{-Wl,-Map,output.map} passes \texttt{-Map output.map} to the linker.
+        When using the GNU linker, you can also get the same effect with \texttt{-Wl,-Map=output.map}.
+        \lbrack\dots\rbrack
+    \end{description}
+\end{quote}
+
+\subsection{Kernel Module}\label{subsec:kernel-module}
+
+Lorem Ipsum.
+
+\subsection{Emulation}\label{subsec:emulation}
+
+Lorem Ipsum.
+
+\subsection{Modifying the Kernel}\label{subsec:modifiying-kernel}
+
+Lorem Ipsum.
+
+\subsection{Conclusion}\label{subsec:conclusion}
+
+Lorem Ipsum.
+
+\section{Combining Preloading and Wrapper Functions}\label{sec:combining-preloading-and-wrapper-functions}
+
+Lorem Ipsum.
+
+\section{Retrieving Function Argument Values}\label{sec:Retrieving-function-argument-values}
+
+Lorem Ipsum.
+
+\section{Determining Function Call Location}\label{sec:determining-function-call-location}
+
+Lorem Ipsum.
+
+\section{Example}\label{sec:intercepting-example}
+
+Lorem Ipsum.
+
+\section{Analyzing Intercepted Function Calls}\label{sec:analyzing-intercepted-function-calls}
+
+Lorem Ipsum.
+
+\section{Parsing Intercepted Function Calls in Python}\label{sec:parsing-intercepted-function-calls}
+
+Lorem Ipsum.
+
+\section{Automated Testing on Intercepted Function Calls}\label{sec:automated-testing-on-intercepted-function-calls}
+
+Lorem Ipsum.
diff --git a/thesis/src/03.manipulate.tex b/thesis/src/03.manipulate.tex
index 125925c..7732877 100644
--- a/thesis/src/03.manipulate.tex
+++ b/thesis/src/03.manipulate.tex
@@ -2,3 +2,21 @@
 \chapter{Manipulating Function Calls}\label{ch:manipulating-function-calls}
 
 Lorem Ipsum.
+
+Unix-Sockets, TCP-Sockets, \dots
+
+\section{Defining a Protocol}\label{sec:defining-a-protocol}
+
+Lorem Ipsum.
+
+\section{Parsing Responses}\label{sec:parsing-responses}
+
+Lorem Ipsum.
+
+\section{Creating a Socket Server in Python}\label{sec:creating-a-socket-server-in-python}
+
+Lorem Ipsum.
+
+\section{Automated Testing using Function Call Manipulation}\label{sec:automated-testing-using-function-call-manipulation}
+
+Lorem Ipsum.
diff --git a/thesis/src/04.related-work.tex b/thesis/src/04.related-work.tex
index cbab321..b2710f3 100644
--- a/thesis/src/04.related-work.tex
+++ b/thesis/src/04.related-work.tex
@@ -2,3 +2,7 @@
 \chapter{Related Work}\label{ch:related-work}
 
 Lorem Ipsum.
+
+What other solutions are available?
+What are the differences?
+What are the characteristics?
diff --git a/thesis/src/05.conclusion.tex b/thesis/src/05.conclusion.tex
index db32d4b..b654ec0 100644
--- a/thesis/src/05.conclusion.tex
+++ b/thesis/src/05.conclusion.tex
@@ -2,3 +2,5 @@
 \chapter{Conclusion}\label{ch:conclusion}
 
 Lorem Ipsum.
+
+Perhaps do some study/``research'' on performance (CPU/memory/\dots).
diff --git a/thesis/src/99.intercept.bib b/thesis/src/99.intercept.bib
index e69de29..1607897 100644
--- a/thesis/src/99.intercept.bib
+++ b/thesis/src/99.intercept.bib
@@ -0,0 +1,12 @@
+@manual{ld.so.8,
+    title = {ld.so(8) -- System Manager's Manual -- Linux manual pages},
+}
+@manual{dlsym.3,
+    title = {dlsym(3) -- Library Functions Manual -- Linux manual pages},
+}
+@manual{ld.1,
+    title = {ld(1) -- GNU Development Tools -- Linux manual pages},
+}
+@manual{gcc.1,
+    title = {GCC(1) -- GNU -- Linux manual pages},
+}
diff --git a/thesis/thesis.tex b/thesis/thesis.tex
index 8077bb9..e4f8e05 100644
--- a/thesis/thesis.tex
+++ b/thesis/thesis.tex
@@ -37,6 +37,9 @@
 \usepackage{morewrites} % Increases the number of external files that can be used.
 \usepackage[a-2b,mathxmp]{pdfx} % Enables PDF/A compliance. Loads the package hyperref and has to be included second to last.
 \usepackage[acronym,toc]{glossaries} % Enables the generation of glossaries and lists of acronyms. This package has to be included last.
+\usepackage{minted}
+\usepackage{chngcntr}
+\counterwithin{listing}{chapter}
 
 % Set PDF document properties
 \hypersetup{
@@ -167,7 +170,7 @@
 \printglossaries
 
 % Add a bibliography.
-\bibliographystyle{alpha}
+\bibliographystyle{plain}
 \bibliography{src/99.intercept}
 
 \end{document}