\chapter{Introduction}\label{ch:introduction}

Intercepting system or function calls (also known as hooking or tracing) makes it possible to observe what a given program does.
This information is useful for security analysis and for testing or verifying a program.
This chapter outlines the motivation and goal of this work (Section~\ref{sec:motivation-and-goal}) and defines function calls and system calls together with the difference between them (Section~\ref{sec:definitions}).


\section{Motivation and Goal}\label{sec:motivation-and-goal}

C is still widely used for teaching operating systems, their interfaces, and standard libraries, especially on Linux.
Many university courses therefore still require students to write their assignments and exams in C\@.
Verifying whether students have implemented an assignment correctly is difficult, however.
Low-level OS constructs (such as semaphores, pipes, sockets, and memory management) make automated testing hard, because the testing system has to set up these resources, keep track of them, and verify their usage.

The goal of this work was to find a way to easily intercept system or function calls and to verify whether students called the right functions with the right arguments at the right time.
This restriction in scope allows focusing on simple binary programs without having to consider complex or I/O-heavy programs.
Furthermore, in this setting the source code of the students' programs is available, because it is what they have to submit.
The availability of source code is a key concern when intercepting function or system calls, as will become clear in the following chapters.


\section{Definitions}\label{sec:definitions}

This section defines function calls and system calls and points out the differences between them.


\subsection{Function Calls}\label{subsec:function-calls}

Generally, a function in C (and in most other programming languages) is a piece of code that can be called, and thereby executed, from elsewhere.
A function takes zero or more arguments and returns at most one value.
When a function is called, the return address is saved; on x86, for example, the call instruction pushes it onto the stack.
This address indicates where the caller continues executing once the function has finished.

Functions are used to structure programs, reuse functionality, or expose functionality in libraries.
Some languages other than C additionally distinguish between functions, methods, procedures, and so on.
A function written in the source code is almost always compiled to a function in the resulting binary; the main exception is a function that the compiler inlines into its callers.

Intercepting a function call reveals the function name, the arguments, the return value, and the return address.
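
As a minimal sketch of what such an interception can observe, the following C fragment wraps a function by hand and records the four pieces of information named above.
The names \texttt{add}, \texttt{traced\_add}, and \texttt{call\_record} are hypothetical and serve only as an illustration.
\texttt{\_\_builtin\_return\_address} is a GCC/Clang builtin, so the sketch assumes one of these compilers.
It is not meant to represent the interception technique developed in this work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one intercepted call. */
struct call_record {
    const char *name;        /* name of the called function */
    int         args[2];     /* argument values */
    int         result;      /* return value */
    void       *return_addr; /* where execution resumes in the caller */
};

/* Hypothetical function under observation. */
static int add(int a, int b)
{
    return a + b;
}

/* Hand-written wrapper: callers use traced_add() instead of add().
 * noinline keeps this a real call, so a return address exists. */
__attribute__((noinline))
static int traced_add(int a, int b, struct call_record *rec)
{
    assert(rec != NULL);

    rec->name = "add";
    rec->args[0] = a;
    rec->args[1] = b;
    rec->return_addr = __builtin_return_address(0); /* GCC/Clang builtin */
    rec->result = add(a, b);
    return rec->result;
}
```

Calling \texttt{traced\_add(2, 3, \&rec)} returns the same value as \texttt{add(2, 3)} and fills \texttt{rec} as a side effect.
A manual wrapper like this requires changing every call site, which is only possible when the source code is available.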


\subsection{System Calls}\label{subsec:system-calls}

In contrast to function calls, system calls are calls into the kernel itself.
Many operations on a modern operating system require special privileges, which a simple user-space process does not have.
By invoking a system call, the (user-space) process hands control over to the (privileged) kernel and requests that an operation be performed~\cite[Chapter~10]{linuxkernel}.

How exactly system calls work depends on the architecture and the operating system.
Generally, however, the process places the system call number and the arguments in designated registers and then executes a special system call instruction.
The kernel then performs the requested operation, places the return value in a designated register, and finally hands execution back to the process~\cite[Chapter~10]{linuxkernel}.
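
The register-level steps described above are normally hidden from C code.
As a rough sketch, assuming Linux with glibc, the \texttt{syscall} function declared in \texttt{<unistd.h>} takes the system call number and the arguments, executes the architecture-specific instruction, and hands back the return value.
The helper name \texttt{raw\_write} is hypothetical.

```c
#define _GNU_SOURCE      /* makes glibc declare syscall() */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h> /* SYS_write: the system call number */
#include <unistd.h>      /* syscall() */

/* Hypothetical helper: issues the write system call by number,
 * bypassing the write() wrapper function.
 * Returns the number of bytes written, or -1 with errno set. */
static long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    /* Number first, then the arguments in the order the kernel expects. */
    return syscall(SYS_write, fd, buf, len);
}
```

A call such as \texttt{raw\_write(1, "hi", 2)} writes two bytes to standard output, while an invalid file descriptor yields $-1$.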

Intercepting a system call reveals the system call number, the arguments, and the return value.
One has to keep in mind, however, that many system-related functions do not map one-to-one to system calls.
For example, \texttt{malloc}~\cite{malloc.3} has no dedicated system call; the C standard library manages the heap internally and requests memory from the kernel only when needed.
Conversely, many system calls have corresponding wrapper functions in the C standard library (like \texttt{open} and \texttt{close}), whereas functions such as \texttt{sem\_wait} are implemented by the library on top of more general system calls.
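
To illustrate the relationship between a wrapper function and its system call, the following sketch (again assuming Linux with a current glibc) obtains the process ID once through the \texttt{getpid} wrapper and once through the system call number.
The helper name \texttt{pid\_via\_both\_paths} is hypothetical.

```c
#define _GNU_SOURCE      /* makes glibc declare syscall() */
#include <assert.h>
#include <sys/syscall.h> /* SYS_getpid */
#include <sys/types.h>   /* pid_t */
#include <unistd.h>      /* getpid(), syscall() */

/* Hypothetical helper: asks the kernel for the process ID in two ways
 * and checks that both answers agree. */
static pid_t pid_via_both_paths(void)
{
    pid_t via_wrapper = getpid();                  /* libc wrapper function */
    pid_t via_number = (pid_t)syscall(SYS_getpid); /* same request, by number */

    assert(via_wrapper == via_number);
    return via_wrapper;
}
```

A tracer that hooks the \texttt{getpid} function would notice only the first request, whereas a tracer working at the system call level would be expected to see both.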