thesis: Complete 2.3
@@ -232,7 +232,7 @@ After deciding to use the preloading method to intercept function calls, a more
|
|||||||
It was decided to have one single \texttt{intercept.so} file as a resulting artifact which then may be loaded via the \texttt{LD\_PRELOAD} environment variable.
|
It was decided to have one single \texttt{intercept.so} file as a resulting artifact which then may be loaded via the \texttt{LD\_PRELOAD} environment variable.
|
||||||
The easiest and most straightforward way to structure the source code was to put all code in one single C file.
|
The easiest and most straightforward way to structure the source code was to put all code in one single C file.
|
||||||
Listing \ref{lst:intecept-preload.c} gives an overview over the grounding code structure.
|
Listing \ref{lst:intecept-preload.c} gives an overview over the grounding code structure.
|
||||||
For each function that should be intercepted, this function simply has to be declared and defined as \texttt{malloc} was.
|
For each function that should be intercepted, this function simply has to be declared and defined the same way \texttt{malloc} was.
|
||||||
|
|
||||||
\begin{listing}[htbp]
|
\begin{listing}[htbp]
|
||||||
\inputminted[linenos]{c}{listings/intercept-preload.c}
|
\inputminted[linenos]{c}{listings/intercept-preload.c}
|
||||||
@@ -241,26 +241,118 @@ For each function that should be intercepted, this function simply has to be dec
|
|||||||
\end{listing}
|
\end{listing}
|
||||||
|
|
||||||
|
|
||||||
\section{Retrieving Function Argument Values}\label{sec:Retrieving-function-argument-values}
|
\section{Retrieving Function Argument Values}\label{sec:retrieving-function-argument-values}
|
||||||
|
|
||||||
|
Now that the first steps have been done, one needs to think about what exactly to record when intercepting.
|
||||||
|
A simple notification that a given function was called would not be enough.
|
||||||
|
The following subsections describe how as much information as possible is extracted from each function call.
|
||||||
|
|
||||||
|
As already mentioned, \texttt{ltrace} uses prototype functions to format its function arguments.
|
||||||
|
This allows \texttt{ltrace} to ``dynamically'' display function arguments for any new or unknown functions without the need for recompilation.
|
||||||
|
\cite{ltrace.conf.5}
|
||||||
|
|
||||||
|
However, because of the implementation complexity and the need for ``complex''\todo{} return types (see~\ref{sec:retrieving-function-return-values}), a statically compiled approach has been used for this work.
|
||||||
|
This means that each function formats its arguments and return values itself without any configuration option.
|
||||||
|
|
||||||
|
The reason for retrieving as much information as possible from each function call is that the exact function calls and their sequence can be completely reconstructed at a later point in time.
|
||||||
|
This allows analysis on these records to be performed independently of the corresponding execution of the program.
|
||||||
|
It should always be possible for any parser to fully parse the recorded calls without any specific knowledge of specific functions, their argument types, or return value type.
|
||||||
|
|
||||||
|
|
||||||
|
\subsection{Numbers}\label{subsec:retrieving-numbers}
|
||||||
|
|
||||||
|
The simplest argument types are plain numbers, such as integers (\texttt{int}, \texttt{long}, \ldots) or floating point numbers (\texttt{float}, \texttt{double}).
|
||||||
|
(In fact, \textit{all} arguments are represented as numbers or integers.
|
||||||
|
See the following subsections for examples.)
|
||||||
|
Plain numbers may be formatted simply as what they are, in base 10 notation, or with a prefix like \texttt{0x} for hexadecimal or \texttt{0} for octal representation.
|
||||||
|
|
||||||
|
Example: \texttt{malloc(123)} (or \texttt{malloc(0x7B)}).
|
||||||
|
|
||||||
|
\subsection{Unspecific Pointers}\label{subsec:retrieving-unspecific-pointers}
|
||||||
|
|
||||||
|
Pointers about which nothing further is known (such as \texttt{void *}) are essentially integers.
|
||||||
|
Therefore, they may be treated as such.
|
||||||
|
|
||||||
|
Example: \texttt{free(0x55624164b2a0)}.
|
||||||
|
|
||||||
|
\subsection{Strings and Buffers}\label{subsec:retrieving-strings-buffers}
|
||||||
|
|
||||||
|
Strings in C are simply pointers to a null-terminated sequence of bytes in memory.
|
||||||
|
This means that a string ends with the first occurrence of the null byte (\texttt{0x00}).
|
||||||
|
To distinguish unspecific pointers from pointers to strings, a colon (\texttt{:}) is appended to the numerical value of the pointer.
|
||||||
|
The colon is followed by the contents of the string, enclosed in double quotes (\texttt{"}).
|
||||||
|
Special values inside the string are escaped with a backslash.
|
||||||
|
|
||||||
|
Example: \texttt{sem\_unlink(0x1234:"/test-semaphore")}.
|
||||||
|
|
||||||
|
Another type of ``string'' in C is a buffer with a known length.
|
||||||
|
When buffers are used, usually another argument is passed to the function which indicates the length of the buffer.
|
||||||
|
This fact may be used to print out the contents of the buffer in the same way as normal C strings.
|
||||||
|
|
||||||
|
Example: \texttt{write(3, 0x1234:"Test\textbackslash{}x00ABC", 8)}.
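
The representation of strings and buffers described above could be produced along the following lines.
This is only a minimal sketch under stated assumptions: the helper names \texttt{format\_buffer} and \texttt{format\_string} are hypothetical, and the exact set of escaped bytes (here: the double quote, the backslash, and every non-printable byte as \texttt{\textbackslash{}xNN}) is an assumption that merely matches the examples above.

\begin{minted}{c}
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `0xADDR:"escaped bytes"` into out.
 * Returns the number of characters written (without the final NUL)
 * or -1 if out is too small. */
int format_buffer(char *out, size_t out_size, uintptr_t addr,
                  const unsigned char *data, size_t len)
{
    size_t pos;
    int n = snprintf(out, out_size, "0x%" PRIxPTR ":\"", addr);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    pos = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = data[i];
        if (c == '"' || c == '\\')
            n = snprintf(out + pos, out_size - pos, "\\%c", c);   /* escape quote and backslash */
        else if (c >= 0x20 && c < 0x7f)
            n = snprintf(out + pos, out_size - pos, "%c", c);     /* printable ASCII as is */
        else
            n = snprintf(out + pos, out_size - pos, "\\x%02x", c); /* everything else as \xNN */
        if (n < 0 || (size_t)n >= out_size - pos)
            return -1;
        pos += (size_t)n;
    }

    n = snprintf(out + pos, out_size - pos, "\"");
    if (n < 0 || (size_t)n >= out_size - pos)
        return -1;
    return (int)(pos + (size_t)n);
}

/* Null-terminated strings are the special case where the length
 * is determined by the first null byte. */
int format_string(char *out, size_t out_size, const char *s)
{
    return format_buffer(out, out_size, (uintptr_t)s,
                         (const unsigned char *)s, strlen(s));
}
\end{minted}

For a buffer, the length argument of the intercepted function (the third argument of \texttt{write}, for instance) would be passed as \texttt{len}, so that embedded null bytes are preserved in the record.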
|
||||||
|
|
||||||
|
\subsection{Flags}\label{subsec:retrieving-flags}
|
||||||
|
|
||||||
|
Some functions dedicate one of their arguments to flags, which may be combined by bitwise OR.
|
||||||
|
These arguments are also of type integer.
|
||||||
|
To distinguish flag arguments from others, a pipe symbol (\texttt{|}) is placed after the colon and after each flag name.
|
||||||
|
|
||||||
|
Example: \texttt{open(0x1234:"test.txt", 0102:|O\_CREAT|O\_RDWR|, 0644)}.
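
The flag notation can be illustrated with a small formatter for the second argument of \texttt{open}.
The sketch below is an assumption about how this might be done, not the implementation used in this work: the names \texttt{format\_open\_flags}, \texttt{open\_flag\_names}, and \texttt{append} are hypothetical, and only a handful of flags are listed.
The access mode is handled separately because \texttt{O\_RDONLY}, \texttt{O\_WRONLY}, and \texttt{O\_RDWR} are not individual bits but values selected by \texttt{O\_ACCMODE}.

\begin{minted}{c}
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
    int value;
    const char *name;
};

/* Deliberately incomplete table; bits that are not listed still
 * appear in the numeric value, just without a name. */
static const struct flag_name open_flag_names[] = {
    { O_CREAT,  "O_CREAT"  },
    { O_EXCL,   "O_EXCL"   },
    { O_TRUNC,  "O_TRUNC"  },
    { O_APPEND, "O_APPEND" },
};

/* Appends text at *pos; returns -1 if it does not fit. */
static int append(char *out, size_t out_size, size_t *pos, const char *text)
{
    size_t len = strlen(text);
    if (len >= out_size - *pos)
        return -1;
    memcpy(out + *pos, text, len + 1);
    *pos += len;
    return 0;
}

/* Writes e.g. `0102:|O_CREAT|O_RDWR|`; returns the length or -1. */
int format_open_flags(char *out, size_t out_size, int flags)
{
    size_t pos;
    const char *mode = NULL;
    int n = snprintf(out, out_size, "%#o:|", (unsigned int)flags);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    pos = (size_t)n;

    for (size_t i = 0; i < sizeof open_flag_names / sizeof open_flag_names[0]; i++) {
        if ((flags & open_flag_names[i].value) == 0)
            continue;
        if (append(out, out_size, &pos, open_flag_names[i].name) != 0 ||
            append(out, out_size, &pos, "|") != 0)
            return -1;
    }

    switch (flags & O_ACCMODE) {
    case O_RDONLY: mode = "O_RDONLY"; break;
    case O_WRONLY: mode = "O_WRONLY"; break;
    case O_RDWR:   mode = "O_RDWR";   break;
    }
    if (mode != NULL &&
        (append(out, out_size, &pos, mode) != 0 ||
         append(out, out_size, &pos, "|") != 0))
        return -1;

    return (int)pos;
}
\end{minted}

The numeric value is always printed in full, so a parser can recover the exact argument even when a flag name is missing from the table.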
|
||||||
|
|
||||||
|
\subsection{Constants}\label{subsec:retrieving-constants}
|
||||||
|
|
||||||
|
For some functions constants are used.
|
||||||
|
These constants are typically defined as C macros in the source code.
|
||||||
|
This makes the source code more readable (and portable).
|
||||||
|
Constants are again represented as an integer followed by a colon, this time without any special characters to distinguish them from other types.
|
||||||
|
|
||||||
|
Example: \texttt{socket(2:AF\_INET, 1:SOCK\_STREAM, 6)}.
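
Mapping a constant back to its macro name requires a lookup that is fixed at compile time.
The following sketch shows one possible way for the first argument of \texttt{socket}; the function names \texttt{domain\_name} and \texttt{format\_domain} are hypothetical, and only three address families are covered.
Falling back to the bare number for unknown values is an assumption of this sketch.

\begin{minted}{c}
#include <stdio.h>
#include <sys/socket.h>

/* Returns the macro name for a known address family, or NULL. */
static const char *domain_name(int domain)
{
    switch (domain) {
    case AF_UNIX:  return "AF_UNIX";
    case AF_INET:  return "AF_INET";
    case AF_INET6: return "AF_INET6";
    default:       return NULL;
    }
}

/* Writes e.g. `2:AF_INET`, or only the number if the value is unknown.
 * Returns the length or -1 if out is too small. */
int format_domain(char *out, size_t out_size, int domain)
{
    const char *name = domain_name(domain);
    int n = (name != NULL)
        ? snprintf(out, out_size, "%d:%s", domain, name)
        : snprintf(out, out_size, "%d", domain);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
\end{minted}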
|
||||||
|
|
||||||
|
\subsection{Pointers to Arrays}\label{subsec:retrieving-pointers-to-arrays}
|
||||||
|
|
||||||
|
Sometimes arrays are used as arguments.
|
||||||
|
Arrays in C work similarly to strings: they are either terminated by an element of value 0, or their length is given explicitly.
|
||||||
|
To represent them, the elements are enclosed in square brackets (\texttt{[]}) and separated by commas (\texttt{,}).
|
||||||
|
Each element may be represented as an ``argument'' on its own (as illustrated by the example).
|
||||||
|
|
||||||
|
Example: \\
|
||||||
|
\texttt{getopt(2, 0x7f0b8:[0x7feb3:"./main", 0x7fee6:"arg"], 0x123:"v")}.
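
The array notation can be sketched for the common case of a null-terminated array of strings such as \texttt{argv}.
The function name \texttt{format\_string\_array} is hypothetical, and for brevity the sketch assumes that the strings contain no characters that would need escaping.

\begin{minted}{c}
#include <inttypes.h>
#include <stdio.h>

/* Writes `0xADDR:[0xADDR:"elem", 0xADDR:"elem"]` for a null-terminated
 * array of strings. Returns the length or -1 if out is too small.
 * Simplification: the element strings are copied without escaping. */
int format_string_array(char *out, size_t out_size, char *const *array)
{
    size_t pos;
    int n = snprintf(out, out_size, "0x%" PRIxPTR ":[", (uintptr_t)array);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    pos = (size_t)n;

    for (size_t i = 0; array[i] != NULL; i++) {
        /* Each element is formatted like a string argument of its own. */
        n = snprintf(out + pos, out_size - pos, "%s0x%" PRIxPTR ":\"%s\"",
                     i > 0 ? ", " : "", (uintptr_t)array[i], array[i]);
        if (n < 0 || (size_t)n >= out_size - pos)
            return -1;
        pos += (size_t)n;
    }

    n = snprintf(out + pos, out_size - pos, "]");
    if (n < 0 || (size_t)n >= out_size - pos)
        return -1;
    return (int)(pos + (size_t)n);
}
\end{minted}

An array with an explicit length would use the same loop with a counter as its end condition instead of the check for a null element.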
|
||||||
|
|
||||||
|
\subsection{Pointers to Structures}\label{subsec:retrieving-pointers-to-structures}
|
||||||
|
|
||||||
|
In rare cases structures (\texttt{struct}) are used as argument types.
|
||||||
|
Two curly brackets (\texttt{\{\}}) are used to indicate structures.
|
||||||
|
Inside the brackets, each field name is displayed plainly, followed by a colon and the value of that field.
|
||||||
|
Commas separate the fields.
|
||||||
|
|
||||||
|
Example: \texttt{\tiny connect(2, 0x123:\{sa\_family: 2:AF\_INET, sin\_addr: "1.1.1.1", sin\_port: 80\}, 16)}.
|
||||||
|
|
||||||
|
|
||||||
|
\section{Retrieving Function Return Values}\label{sec:retrieving-function-return-values}
|
||||||
|
|
||||||
Lorem Ipsum.
|
Lorem Ipsum.
|
||||||
|
|
||||||
|
|
||||||
\section{Determining Function Call Location}\label{sec:determining-function-call-location}
|
\section{Determining Function Call Location}\label{sec:determining-function-call-location}
|
||||||
|
|
||||||
Lorem Ipsum.
|
Lorem Ipsum.
|
||||||
|
|
||||||
|
|
||||||
\section{Example}\label{sec:intercepting-example}
|
\section{Example}\label{sec:intercepting-example}
|
||||||
|
|
||||||
Lorem Ipsum.
|
Lorem Ipsum.
|
||||||
|
|
||||||
|
|
||||||
\section{Analyzing Intercepted Function Calls}\label{sec:analyzing-intercepted-function-calls}
|
\section{Analyzing Intercepted Function Calls}\label{sec:analyzing-intercepted-function-calls}
|
||||||
|
|
||||||
Lorem Ipsum.
|
Lorem Ipsum.
|
||||||
|
|
||||||
|
|
||||||
\section{Parsing Intercepted Function Calls in Python}\label{sec:parsing-intercepted-function-calls}
|
\section{Parsing Intercepted Function Calls in Python}\label{sec:parsing-intercepted-function-calls}
|
||||||
|
|
||||||
Lorem Ipsum.
|
Lorem Ipsum.
|
||||||
|
|
||||||
|
|
||||||
\section{Automated Testing on Intercepted Function Calls}\label{sec:automated-testing-on-intercepted-function-calls}
|
\section{Automated Testing on Intercepted Function Calls}\label{sec:automated-testing-on-intercepted-function-calls}
|
||||||
|
|
||||||
Lorem Ipsum.
|
Lorem Ipsum.
|
||||||
|
|||||||