9 changed files with 197 additions and 40 deletions
--- a/thesis/intercept_v1.1.pdf
+++ b/thesis/intercept_v1.1.pdf
--- a/thesis/intercept_v2.0.pdf
+++ b/thesis/intercept_v2.0.pdf
--- a/thesis/src/02.related-work.tex
+++ b/thesis/src/02.related-work.tex
@@ -6,16 +6,92 @@ See also Section~\ref{sec:methods-for-intercepting}.
 Some relevant methods will be discussed there in more detail.


-\section{Function Call Interception}\label{sec:function-call-interception}
+\section{System and Function Call Hooking in Literature}\label{sec:call-hooking-in-literature}

-All related work regarding function call interception is mentioned in the aforementioned Section.
+The following subsections explore some applications of system and function call hooking.
+Many other use cases likely exist, but the following were deemed the most relevant.
+
+
+\subsection{Classification of Hooking Techniques}\label{subsec:classification}
+
+Lopez et al. \cite{lopez2017} classify subroutine hooking techniques as follows:
+
+\begin{itemize}
+  \setlength\itemsep{0em}
+  \item Subroutine Type: Function / System call.
+  \item Hook Insertion: Static (before execution) / Dynamic (during execution).
+  \item Instrumentation Type: Active (``manipulation'') / Passive (``interception'').
+  \item Hooking Location: On-device / Off-device (mostly used for mobile devices).
+  \item Hooking Scope: Inner Functions / Exported Functions (e.g.,\ libraries).
+  \item OS Modification: Required / Not Required.
+  \item Availability of Source Code: Open-source / Closed-source.
+  \item Pricing Model: Free / Paid.
+\end{itemize}
+
+The technique developed in this work would be classified as follows:
+Function, Static, Active+Passive, On-device, Exported Functions, OS Modification Not Required, target program may be Closed-source, Free.
+
+
+\subsection{Linux Systems}\label{subsec:linux-systems}
+
+Yasukata et al. \cite{zpoline} introduced zpoline, a system call hooking technique using binary rewriting.
+See Subsection~\ref{subsec:zpoline} for more details.
+Hong et al. \cite{datahook} developed DataHook, a system call hooking technique based on glibc.
+See Subsection~\ref{subsec:datahook} for more details.
+
+
+\subsection{Windows Systems}\label{subsec:windows-systems}
+
+Hunt and Brubacher \cite{detours} developed Detours, a library for instrumenting arbitrary Win32 functions on x86 machines.
+Detours intercepts Win32 functions by re-writing target function images.
+Based on this method, Sze and Sekar developed Spif \cite{spif} (see next subsection).
+
+
+\subsection{Security Applications}\label{subsec:security-applications}
+
+Fraser et al. \cite{fraser2000} introduced a general mechanism for securing unmodified commercial software by wrapping system calls at the library interface.
+They hook system calls by replacing the standard library entry points with wrapper functions, similar to \texttt{LD\_PRELOAD}.
+The wrapper functions are able to perform security checks or other security measures.
+
+Garfinkel et al. \cite{ostia} developed Ostia, a sandboxing system, which uses system call hooking to secure applications.
+They implemented their own ELF binary loader to load their emulation library into memory before the sandboxed program starts.
+To communicate between this library and their \textit{agent}, they use Unix domain sockets.
+The \textit{agent} then responds, according to its policies.
+This approach is similar to the one taken in this work (see Chapter~\ref{ch:manipulating-function-calls}).
+
+Sze and Sekar \cite{spif} introduced Spif, an approach that defends against malware by tracking code and data origin on Windows systems.
+They use Detours \cite{detours} to intercept low-level Windows API calls.
+
+Kern \cite{kern2023} discusses the use of \texttt{LD\_PRELOAD} in cloud environments for HTTP deception.
+This allows malware or other adversaries to be analyzed in real environments without their knowledge and without any risk.
+One example is overriding the \texttt{send} and \texttt{recv} functions of libc.
+With some modifications, the technique presented in this work may also be used in this context to intercept and manipulate \texttt{send} and \texttt{recv} calls.
+
+
+\subsection{Software Distribution}\label{subsec:software-distribution}
+
+Guo and Engler \cite{guo2011cde} use system call hooking for creating portable software.
+They developed CDE, which logs all files a program accesses during execution, including shared libraries.
+All accessed files and the environment are bundled together and can then be executed on any other system with a compatible kernel, without installing any dependencies.
+This would also be possible using logged function calls, as in this work (e.g., see Section~\ref{sec:intercepting-example}).
+
+
+\subsection{Rapid Prototyping}\label{subsec:rapid-prototyping}
+
+Spillane et al. \cite{spillane2007} use \texttt{ptrace} to hook system calls of another process to simulate these calls using a user-space program.
+This is useful for rapid prototyping (e.g., file systems) by developing a user-space program first, and then, using the gained insight, porting it into kernel-space.
+
+
+\section{Function Call Hooking}\label{sec:function-call-hooking}
+
+All underlying techniques for function call interception on Linux systems are mentioned in Section~\ref{sec:methods-for-intercepting}.
 See \texttt{ltrace} (Subsection~\ref{subsec:ltrace}), wrapper functions (Subsection~\ref{subsec:wrapper-functions}), and \texttt{LD\_PRELOAD} (Subsection~\ref{subsec:preloading}).


-\section{System Call Interception}\label{sec:system-call-interception}
+\section{System Call Hooking}\label{sec:system-call-hooking}

-This section discusses further related work regarding system call interception.
-This excludes techniques already discussed in Section~\ref{sec:methods-for-intercepting},
+This section discusses further techniques regarding system call interception.
+This excludes techniques discussed in Section~\ref{sec:methods-for-intercepting},
 like \texttt{ptrace} (Subsection~\ref{subsec:ptrace}), and \texttt{strace} (Subsection~\ref{subsec:strace}).
 Almost all following methods use binary rewriting to replace system calls with other instructions (except SUD, Subsection~\ref{subsec:syscall-user-dispatch}).
 This is one of the reasons why they are not mentioned in Section~\ref{sec:methods-for-intercepting}.
--- a/thesis/src/04.manipulate.tex
+++ b/thesis/src/04.manipulate.tex
@@ -4,12 +4,13 @@
 This chapter discusses how to manipulate function calls and how this may be used to test programs.
 How function calls may be intercepted at all has been discussed in Chapter~\ref{ch:intercepting-function-calls}.
 This chapter builds on the basis of the previous one and expands its functions.
-In this context, ``manipulation'' means changing the arguments of a function, before calling it with the modified arguments, or skipping the execution of the real function completely and simply returning a given value (``mocking'').
+In this context, ``manipulation'' means changing the arguments of a function before calling it with the modified arguments, or skipping the execution of the real function completely and simply returning a given value (``mocking'').
 These techniques allow in-depth testing of programs.

 In contrast to simply recording and logging function calls, which may be controlled via environment variables, manipulation of such function calls requires some other process to indicate how to handle each call.
-This work uses simple sockets to communicate between the process of the program to be tested, and a ``server'', which decides what action to perform for each function call.
-Currently, only communication over Unix sockets is implemented, but communication over TCP sockets is also easily possible.
+This work uses simple Unix domain sockets to communicate between the process of the program to be tested, and a ``server'', which decides what action to perform for each function call.
+Currently, only communication over Unix domain sockets is implemented, but communication over TCP sockets is also easily possible.
+This approach is similar to the one used in \cite{ostia} to communicate with the \textit{agent}.

 Figure~\ref{fig:control-flow} illustrates the control flow for manipulating function calls.

--- a/thesis/src/05.evaluation.tex
+++ b/thesis/src/05.evaluation.tex
@@ -71,7 +71,7 @@ For some functions the modification of function arguments has been implemented.
 \section{Performance}\label{sec:performance}

 Although high performance was not a primary goal of this work, the performance degradation caused by interception and manipulation should not be excessive.
-The following two subsections test and discuss the performance degradation of \texttt{intercept.so} compared to running a program without any intercepting or hooking.
+The following two subsections test and discuss the performance degradation of \texttt{intercept.so} compared to running a program without any interception or hooking.


 \subsection{Performance when Intercepting}\label{subsec:performance-intercepting}
@@ -148,13 +148,23 @@ Most of the delay is caused by logging the recorded function calls.

 \subsection{Performance when Manipulating}\label{subsec:performance-manipulating}

-Measuring performance for function call manipulation makes no sense without knowing the exact socket server to be used.
+Meaningful performance evaluation of function call manipulation requires a specific server implementation, since the response speed of the server dominates overall performance.
 As seen in Subsection~\ref{subsec:performance-intercepting}, most delay comes not from intercepting itself, but from the further processing.
 This also applies to function call manipulation.
 The performance degradation heavily depends on the response speed of the used socket.
-Therefore, an explicit performance test on manipulation was deemed unlikely to yield meaningful results and was not carried out.
-\todo{Update text}
+Despite this, a simple performance test has been conducted.

+The setup was the same as in Subsection~\ref{subsec:performance-intercepting}.
+This time, however, \texttt{intercept.so} was configured to connect to a Unix domain socket.
+First, a simple C program using only \texttt{getline} and \texttt{fprintf} was used to respond to the messages on the socket.
+For the first test run, the program always responded with the \texttt{"ok"} command (``Manipulate (simple ok)''), and for the second with the \texttt{"return 0"} command (``Manipulate (simple return)'').
+After that, the default Python implementation developed in this work, which parses the incoming messages automatically, was tested.
+Again, it first always responded with the \texttt{"ok"} command (``Manipulate (Python ok)''), and then with the \texttt{"return 0"} command (``Manipulate (Python return)'').
+Figure~\ref{fig:manipulate-performance} illustrates the results and some previous measurements for context.
+
+The results of the simple C program show that the communication over the socket alone adds only minimal overhead compared with ``Logging to stderr''.
+Because it parses the messages, the Python program incurs slightly more performance degradation.
+The ``return'' test is slightly faster than the ``ok'' test, because the \texttt{pipe} function normally responds with \texttt{return 0; errno 0; fildes=[7,8]}, but when using \texttt{"return 0"} it responds with \texttt{return 0; errno 0}, which is less data to parse.

 \begin{figure}
  \centering
@@ -218,5 +228,5 @@ Therefore, an explicit performance test on manipulation was deemed unlikely to y
    \end{axis}
  \end{tikzpicture}
  \caption{Execution times of a simple program using \texttt{intercept.so} with different manipulation modes.}
-  \label{fig:manupulate-performance}
+  \label{fig:manipulate-performance}
 \end{figure}
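The fixed-reply responder described in the manipulation performance text above can be sketched as follows. This is a minimal illustration, not the C or Python program from the thesis. The function names `handle_connection` and `serve`, the socket path, and the framing (one newline-terminated message per intercepted call, answered by one newline-terminated command) are assumptions made for this sketch. Only the `ok` and `return 0` commands are taken from the text.

```python
import socket


def handle_connection(conn: socket.socket, reply: str = "ok") -> int:
    """Answer every newline-terminated message on conn with a fixed reply.

    Assumed framing (hypothetical): one line per intercepted call.
    Returns the number of messages that were answered.
    """
    answered = 0
    with conn, conn.makefile("r", encoding="utf-8") as reader, \
            conn.makefile("w", encoding="utf-8") as writer:
        for _message in reader:  # message content is ignored in this sketch
            writer.write(reply + "\n")
            writer.flush()  # reply immediately so the caller is not blocked
            answered += 1
    return answered


def serve(path: str, reply: str = "ok") -> None:
    """Accept clients on a Unix domain socket, one at a time, forever."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(path)  # path must not exist yet
        server.listen()
        while True:
            conn, _addr = server.accept()
            handle_connection(conn, reply)
```

A call such as `serve("/tmp/intercept.sock", reply="return 0")` (hypothetical path) would correspond to the "return" variant of the test. Because the responder never inspects the message, its cost is dominated by the socket round trip, which fits the observation that the simple responder adds little overhead over logging.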
--- a/thesis/src/06.conclusion.tex
+++ b/thesis/src/06.conclusion.tex
@@ -1,9 +1,11 @@

 \chapter{Conclusion}\label{ch:conclusion}

-\todo{Start with Goals in OSVU}
+The primary goal of this work was to support the Operating Systems course by providing a practical and reliable way to automatically test students' C programs.
+Exercises in this course often involve low-level concepts such as semaphores, sockets, and shared memory, which are difficult to test automatically with conventional approaches.
+The motivation was therefore to develop a technique that allows intercepting function calls in order to verify whether students invoked the correct functions with appropriate arguments and in the correct order.

-This work presented \texttt{intercept.so}, a shared object file intended to be preloaded using \texttt{LD\_PRELOAD}, which may be used to intercept function calls on Linux systems.
+To address these challenges, this work presented \texttt{intercept.so}, a shared object file intended to be preloaded using \texttt{LD\_PRELOAD}, which may be used to intercept function calls on Linux systems.
 Furthermore, a supporting Python program, \texttt{intercept}, was presented to make the shared object easier to use.
 By using preloading to hook or intercept function calls, the overhead and performance degradation remain negligible for the purpose of testing student submissions.
 To make use of intercepted function calls, some techniques of automatic testing of simple C programs were discussed.
--- a/thesis/src/99.intercept.bib
+++ b/thesis/src/99.intercept.bib
@@ -96,5 +96,79 @@
    month = jun,
    articleno = {ISSTA005},
    numpages = {21},
-    keywords = {DataHook, Hooking technique, Software analysis, Software debugging, System call}
+    keywords = {DataHook, Hooking technique, Software analysis, Software debugging, System call},
+}
+@article{lopez2017,
+    title={A survey on function and system call hooking approaches},
+    author={Lopez, Juan and Babun, Leonardo and Aksu, Hidayet and Uluagac, A. Selcuk},
+    journal={Journal of Hardware and Systems Security},
+    volume={1},
+    number={2},
+    pages={114--136},
+    year={2017},
+    publisher={Springer},
+}
+@masterthesis{kern2023,
+    author = {Patrick Kern},
+    title = {Injecting Shared Libraries with LD\_PRELOAD for Cyber Deception},
+    school = {TU Wien},
+    year = {2023},
+}
+@inproceedings{guo2011cde,
+    title={CDE: Using system call interposition to automatically create portable software packages},
+    author={Guo, Philip J. and Engler, Dawson},
+    booktitle={2011 USENIX Annual Technical Conference (USENIX ATC 11)},
+    year={2011},
+}
+@inproceedings{detours,
+    title={Detours: Binary interception of Win32 functions},
+    author={Galen Hunt and Doug Brubacher},
+    booktitle={Windows NT 3rd symposium},
+    year={1999},
+}
+@inproceedings{spillane2007,
+    author = {Spillane, Richard P. and Wright, Charles P. and Sivathanu, Gopalan and Zadok, Erez},
+    title = {Rapid file system development using ptrace},
+    year = {2007},
+    isbn = {9781595937513},
+    publisher = {Association for Computing Machinery},
+    address = {New York, NY, USA},
+    url = {https://doi.org/10.1145/1281700.1281722},
+    doi = {10.1145/1281700.1281722},
+    booktitle = {Proceedings of the 2007 Workshop on Experimental Computer Science},
+    pages = {22–es},
+    keywords = {rapid prototyping, monitors},
+    location = {San Diego, California},
+    series = {ExpCS '07},
+}
+@inproceedings{spif,
+    author = {Sze, Wai Kit and Sekar, R.},
+    title = {Provenance-based Integrity Protection for Windows},
+    year = {2015},
+    isbn = {9781450336826},
+    publisher = {Association for Computing Machinery},
+    address = {New York, NY, USA},
+    url = {https://doi.org/10.1145/2818000.2818011},
+    doi = {10.1145/2818000.2818011},
+    booktitle = {Proceedings of the 31st Annual Computer Security Applications Conference},
+    pages = {211–220},
+    numpages = {10},
+    location = {Los Angeles, CA, USA},
+    series = {ACSAC '15},
+}
+@inproceedings{ostia,
+    title={Ostia: A Delegating Architecture for Secure System Call Interposition},
+    author={Garfinkel, Tal and Pfaff, Ben and Rosenblum, Mendel},
+    booktitle={NDSS},
+    year={2004},
+}
+@inproceedings{fraser2000,
+    author={Fraser, T. and Badger, L. and Feldman, M.},
+    booktitle={Proceedings DARPA Information Survivability Conference and Exposition. DISCEX'00},
+    title={Hardening COTS software with generic software wrappers},
+    year={2000},
+    volume={2},
+    number={},
+    pages={323-337 vol.2},
+    doi={10.1109/DISCEX.2000.821530},
 }
--- a/thesis/thesis.tex
+++ b/thesis/thesis.tex
@@ -10,7 +10,7 @@
 \begin{filecontents*}[overwrite]{\jobname.xmpdata}
 \Author{\authorname}                                    % The author's name in the document properties.
 \Title{Intercepting and Manipulating System/Function Calls in Linux Systems} % The document's title in the document properties.
-\Language{de-AT}                                        % The document's language in the document properties. Select 'en-US', 'en-GB', or 'de-AT'.
+\Language{en-US}                                        % The document's language in the document properties. Select 'en-US', 'en-GB', or 'de-AT'.
 \Keywords{system call\sep syscall\sep function call\sep intercept\sep hook} % The document's keywords in the document properties (separated by '\sep ').
 \Publisher{TU Wien}                                     % The document's publisher in the document properties.
 \Subject{Thesis}                                        % The document's subject in the document properties.
@@ -99,7 +99,7 @@

 % Required data.
 \setregnumber{12119052}
-\setdate{25}{08}{2025} % Set date with 3 arguments: {day}{month}{year}.
+\setdate{04}{09}{2025} % Set date with 3 arguments: {day}{month}{year}.
 \settitle{\thesistitle}{Abfangen und Manipulieren von\\System-/Funktionsaufrufen in\\Linux-Systemen} % Sets English and German version of the title (both can be English or German). If your title contains commas, enclose it with additional curvy brackets (i.e., {{your title}}) or define it as a macro as done with \thesistitle.
 %\setsubtitle{Optional Subtitle of the Thesis}{Optionaler Untertitel der Arbeit} % Sets English and German version of the subtitle (both can be English or German).

@@ -167,11 +167,10 @@

 % Declare the use of AI tools as mentioned in the statement of originality.
 % Use either the English aitools or the German kitools.
-\begin{aitools}
-  No generative AI tools were used in and for this work whatsoever.
-  The only exception was the use of ChatGPT for proofreading and refining of the abstract.
-  \todo{Remove}
-\end{aitools}
+%\begin{aitools}
+%  No generative AI tools were used in and for this work whatsoever.
+%  The only exception was the use of ChatGPT for proofreading and refining of the abstract.
+%\end{aitools}

 %\begin{kitools}
 %\todo{Ihr Text hier.}
--- a/thesis/vutinfth.cls
+++ b/thesis/vutinfth.cls
@@ -408,23 +408,18 @@
    oder dem Sinn nach entnommen sind, auf jeden Fall unter Angabe
    der Quelle als Entlehnung kenntlich gemacht habe.}]{Statement}%
 \CreatePolylingual[
-  english={I further declare that I have used generative AI tools only as an aid,
-  and that my own intellectual and creative efforts predominate in this
-  work. In the appendix ``Overview of Generative AI Tools Used'' I have
-  listed all generative AI tools that were used in the creation of this
-  work, and indicated where in the work they were used. If whole passages
-  of text were used without substantial changes, I have indicated the
-  input (prompts) I formulated and the IT application used with its
-  product name and version number/date.},
-  naustrian={Ich erkl\"are weiters, dass ich mich generativer KI-Tools lediglich als
-  Hilfsmittel bedient habe und in der vorliegenden Arbeit mein
-  gestalterischer Einfluss \"uberwiegt. Im Anhang \glqq\"Ubersicht verwendeter
-  Hilfsmittel\grqq\ habe ich alle generativen KI-Tools gelistet, die verwendet
-  wurden, und angegeben, wo und wie sie verwendet wurden.  F\"ur
-  Textpassagen, die ohne substantielle \"Anderungen \"ubernommen wurden, haben
-  ich jeweils die von mir formulierten Eingaben (Prompts) und die
-  verwendete IT-Anwendung mit ihrem Produktnamen und Versionsnummer/Datum
-  angegeben.}]{AIStatement}%
+  english={
+    I further declare that I have used generative AI tools only as an aid, and that my own intellectual and creative efforts predominate in this work.
+    %In the appendix ``Overview of Generative AI Tools Used'' I have listed all generative AI tools that were used in the creation of this work, and indicated where in the work they were used.
+    %If whole passages of text were used without substantial changes, I have indicated the input (prompts) I formulated and the IT application used with its product name and version number/date.
+    The only generative AI tool used was ChatGPT, which was used for proofreading, for rephrasing single sentences and paragraphs, and to create the abstract.
+  },
+  naustrian={
+    Ich erkl\"are weiters, dass ich mich generativer KI-Tools lediglich als Hilfsmittel bedient habe und in der vorliegenden Arbeit mein gestalterischer Einfluss \"uberwiegt.
+    %Im Anhang \glqq\"Ubersicht verwendeter Hilfsmittel\grqq\ habe ich alle generativen KI-Tools gelistet, die verwendet wurden, und angegeben, wo und wie sie verwendet wurden.
+    %F\"ur Textpassagen, die ohne substantielle \"Anderungen \"ubernommen wurden, haben ich jeweils die von mir formulierten Eingaben (Prompts) und die verwendete IT-Anwendung mit ihrem Produktnamen und Versionsnummer/Datum angegeben.
+    Das einzige verwendete generative KI-Tool war ChatGPT, welches zum Korrekturlesen, bei der Umformulierung einzelner Sätze und Paragraphen, und zum Erstellen der Kurzfassung verwendet wurde.
+  }]{AIStatement}%
 \CreatePolylingual[
  english=Overview of Gen. AI Tools Used,
  naustrian=Übersicht verwendeter Hilfsmittel]{AIToolsChapter}%
Author	SHA1	Message	Date
Lorenz Stechauner	2d29267454	thesis: Last adjustments	2025-09-04 01:49:17 +02:00
Lorenz Stechauner	7fe22fdc62	thesis: Refine Conclusion	2025-09-04 00:50:41 +02:00
Lorenz Stechauner	ea00123e40	thesis: Add text for manipulation performance test	2025-09-03 23:42:54 +02:00
Lorenz Stechauner	bb5bd436d0	thesis: Update AI statement	2025-09-03 23:12:13 +02:00
Lorenz Stechauner	d732fa8919	thesis: Add v1.1	2025-09-03 16:21:54 +02:00
Lorenz Stechauner	632913638c	thesis: Add more Relaed Work	2025-09-03 16:11:33 +02:00