Linux binary analysis and exploitation

© 2016 Fraunhofer USA, Inc. Center for Experimental Software Engineering

Dharma Ganesan, Mikael Lindvall


Context of the slides

These slides accompany a presentation given at the NASA Coding Summit, held at NASA’s IV&V Center.

NASA systems and their context have been removed from these slides. They are too sensitive for public release, and publishing them would increase the risk of attacks on those systems.

The slides are meant as a teaser on this topic. Many low-level details are left out because of the time restriction (the original talk was only 30 minutes).


Keywords (used in our exploit)

Return-Oriented Programming (ROP)
Address Space Layout Randomization (ASLR)
Non-Executable Stack (NX)
Attacking the Global Offset Table (GOT)
Stealing the remote libc
Stealing the stack canary


Attack Scenarios and Our Scope

Scenario 1: Open-source software (e.g. Linux, the Apache web server)

Scenario 2: Open binary but closed source (e.g. most commercial products)

Scenario 3: Closed binary and closed source (e.g. remote services)

Scope of this talk: Scenario 2 (remote exploit)


Questions

Many modern operating systems (OSes) have built-in security features (more on this later).

Is it possible to circumvent these security features and take over a remote machine?

Do we still have to do secure coding even though the OS has security features?

Let’s investigate these questions for Linux, although they are highly relevant for other OSes too!


Modern OS security features (samples)

Address Space Layout Randomization (ASLR)

Non-Executable Stack (NX)

Stack Canary


ASLR feature for security

Historically, the memory addresses of variables and functions did not change between runs. This allowed hackers to perform remote code execution easily.

Address space layout randomization (ASLR) randomizes many items:

Addresses of variables differ between runs (e.g. buffer addresses are difficult for hackers to predict).

Addresses of shared libraries/DLLs differ between runs (e.g. addresses of library functions are difficult for hackers to predict).
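On Linux, the ASLR mode of a machine is exposed through the kernel setting /proc/sys/kernel/randomize_va_space. The following is a minimal sketch, not part of the original talk, that translates that value into words; the function names are hypothetical and chosen only for illustration.

```python
from pathlib import Path

# Meanings of /proc/sys/kernel/randomize_va_space on Linux.
ASLR_MODES = {
    0: "off: addresses stay the same between runs",
    1: "partial: stack, mmap base (shared libraries) and vDSO are randomized",
    2: "full: as mode 1, plus the heap (brk) is randomized",
}


def describe_aslr(raw_value: str) -> str:
    """Translate the sysctl value (for example "2") into a short description."""
    return ASLR_MODES.get(int(raw_value.strip()), "unknown setting")


def read_local_aslr(path: str = "/proc/sys/kernel/randomize_va_space") -> str:
    """Describe the setting of the machine this code runs on (Linux only)."""
    setting = Path(path)
    if not setting.exists():
        return "not available on this system"
    return describe_aslr(setting.read_text())
```

Calling read_local_aslr() only describes the local machine. For a remote service the setting is not visible, which is why the talk later infers it from leaked addresses.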


Non-Executable stack (NX) for security

Historically, hackers sent their exploit code through a user input buffer and then modified the control flow to redirect execution into that buffer.

A non-executable stack (NX) does not allow code execution on the stack. If a hacker stores exploit code (e.g. a virus) on the stack, the OS will not run that code.


Stack Canary for security

Historically, when hackers overflowed a buffer and modified the control flow, the OS was not aware of this hacking event.

A stack canary (a random key) can detect this issue. The random key, generated by the runtime linker, is inserted into the stack to maintain control-flow integrity.

One cannot overwrite the return addresses stored on the stack without guessing the canary!



High-level procedure for analysis of binary

Assumption: Remote service binary is available to the hacker but the environment is not

Step 1: Data gathering about the target binary
Step 2: Analyze the binary for vulnerable library functions and signatures
Step 3: Reachability analysis of vulnerable library functions
Step 4: Memory layout analysis of the binary and the remote machine
Step 5: Stealing the remote’s libc and the stack canary
Step 6: Construct evil input that will take over the remote machine


Applying the procedure: An example

Context: This service is part of a capture-the-flag online challenge (ringzero.com)

About the remote service (a base64 decoder service):

The remote service listens for input on a particular port.

It outputs the base64 decoding of the given input.

The binary of the remote service is available for download, but the running environment, such as the libc libraries and the OS, is not.

Size: 600 assembly instructions (x86-64)


Applying the procedure: An example

Challenge: Break into this remote service.

Perform remote code execution by exploiting vulnerabilities in the binary.

Steal secrets (i.e. the flag file) from the server by reading the file system of the server.


Step 1: Data gathering of the remote service

Tools: readelf and grep

What is the OS, machine, and processor type of the remote service?

dharma@ubuntu:~$ readelf -hn <binary>
  Data: 2's complement, little endian
  OS/ABI: UNIX - System V
  Machine: Advanced Micro Devices X86-64
  OS: Linux, ABI: 2.6.24

Unfortunately, my OS version is different from that of the remote service, but we will overcome this problem (discussed later).

Is the stack executable?

dharma@ubuntu:~/Downloads$ readelf -lW <binary> | grep GNU_STACK
  Output: GNU_STACK ... RW 0x10

RW means the stack is readable and writable but not executable.
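The GNU_STACK check can also be scripted. The sketch below is an illustration rather than the tooling used in the talk: it reads the program headers of a 64-bit little-endian ELF image and reports the permission flags of the PT_GNU_STACK entry. The function names are hypothetical; the header layout and constants follow the ELF64 format.

```python
import struct
from typing import Optional

PT_GNU_STACK = 0x6474E551  # program header type that records stack permissions
PF_X = 0x1                 # "executable" permission bit in p_flags


def gnu_stack_flags(elf: bytes) -> Optional[int]:
    """Return p_flags of the PT_GNU_STACK header, or None if it is absent."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF image")
    # ELF64 file header: e_ident, e_type, e_machine, e_version, e_entry,
    # e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, ...
    header = struct.unpack_from("<16sHHIQQQIHHHHHH", elf, 0)
    e_phoff, e_phentsize, e_phnum = header[5], header[9], header[10]
    for index in range(e_phnum):
        # In ELF64 program headers, p_type and p_flags are the first two fields.
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + index * e_phentsize)
        if p_type == PT_GNU_STACK:
            return p_flags
    return None


def stack_is_executable(elf: bytes) -> Optional[bool]:
    """True or False when GNU_STACK is present; None when the header is missing."""
    flags = gnu_stack_flags(elf)
    return None if flags is None else bool(flags & PF_X)
```

A typical use would be stack_is_executable(open(path, "rb").read()). For a binary whose GNU_STACK flags are RW, as in the readelf output above, the result is False.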


Step 1: Data gathering of the remote service

Is there a stack canary that will kick me out if I overflow any buffers?

Tools used: objdump, grep

Dump of assembler code for function doprocessing:

0x0000000000400eaa <+318>: mov -0x8(%rbp),%rax

0x0000000000400eae <+322>: xor %fs:0x28,%rax

0x0000000000400eb7 <+331>: je 0x400ebe <doprocessing+338>

0x0000000000400eb9 <+333>: callq 0x400930 <__stack_chk_fail@plt>

The stack canary is generated at runtime and read from thread-local storage through the fs segment register (%fs:0x28).

Unfortunately, there is a built-in stack integrity check: __stack_chk_fail will be called if I corrupt the stack.
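The same check can be expressed as a small text filter over a disassembly listing. This is a rough sketch under the assumption that the listing uses AT&T syntax, as in the dump above; the function name is hypothetical, and a real analysis should confirm the result per function.

```python
def has_stack_protector(disassembly: str) -> bool:
    """Heuristic: the code reads the canary slot and can call the failure handler."""
    reads_canary = "%fs:0x28" in disassembly          # canary slot on x86-64 Linux
    calls_check = "__stack_chk_fail" in disassembly   # handler called on mismatch
    return reads_canary and calls_check
```

Applied to the four instructions shown above, the function returns True.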


Step 2: Analyze the binary for vulnerable library functions

Tools used: objdump and grep

Which external functions are used?

dharma@ubuntu:~$ objdump -R <binary>
  Output: List of library functions used by the binary

The hunt for vulnerable functions pointed me to “fork”. This function is not used properly (more on this later).

There is no strcpy or gets usage (unlucky for the hacker).
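The hunt through the relocation list can be partly automated. The sketch below assumes the usual three-column layout of objdump -R (offset, relocation type, symbol) and uses a short, deliberately incomplete set of functions that are commonly considered unsafe. Both the set and the function names are illustrative assumptions, not part of the original analysis.

```python
from typing import List

# Illustrative, incomplete set of library functions that lack bounds checking.
RISKY_FUNCTIONS = {"gets", "strcpy", "strcat", "sprintf", "vsprintf"}


def imported_functions(objdump_r_output: str) -> List[str]:
    """Extract symbol names from the relocation records printed by objdump -R."""
    names = []
    for line in objdump_r_output.splitlines():
        parts = line.split()
        # Relocation records look like: <offset> R_X86_64_JUMP_SLOT <symbol>
        if len(parts) == 3 and parts[1].startswith("R_"):
            names.append(parts[2].split("@")[0])  # drop a version suffix such as @GLIBC_2.2.5
    return names


def flag_risky(names: List[str]) -> List[str]:
    """Return the imported names that appear in the risky set, sorted."""
    return sorted(set(names) & RISKY_FUNCTIONS)
```

An empty result from flag_risky matches the finding above: no strcpy or gets. It does not mean the binary is safe, since a flaw can also sit in the binary's own functions or in how a function such as fork is used.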


Step 2: Analyze the binary for vulnerable signatures

Is there a function in the given binary that takes two buffers as inputs but does not take the length of each buffer as an argument?

If yes, then the service may have memory safety issues. It may be possible to overflow the buffer and modify the control flow.

Searching for vulnerable signatures often requires disassembling the binary in order to reconstruct the signature of each function. This takes a lot of time and effort.

Found a vulnerable signature: base64_decode(char*, char*). Disassembling the function revealed no bounds checking of the buffer sizes.


Step 3: Reachability analysis

How do we reach the vulnerable signature? Answering this question requires reconstructing the call graph from the binary.

For example, in the remote service the vulnerable function base64_decode is called without bounds checking.

Great news for the hacker: a stack-based buffer overflow.


Step 3: Reachability analysis: Manually reversed C function from binary (sample)

#include <strings.h>   /* bzero */
#include <unistd.h>    /* read, write */

/* Defined elsewhere in the binary; the int return type is an assumption. */
int base64_decode(char *in, char *out);

void doprocessing()
{
    char base64Out[0x200];
    char userInput[0x400];

    bzero(base64Out, 0x200);
    bzero(userInput, 0x400);
    write(1, "Please enter your base 64 string: \n", 0x23);
    read(0, userInput, 0x400);
    write(1, "Your message is:\n", 0x11);
    /* base64_decode is not checking the decoded buffer size */
    write(1, base64Out, base64_decode(userInput, base64Out));
    write(1, "\nThank you for using ringzer0 base64 decoder!\n", 0x2e);
}

• base64_decode can corrupt the return address of doprocessing
• Remote code execution is possible if the base64-decoded string exceeds the buffer size
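The buffer sizes in the reversed function make the overflow concrete: base64 turns every 4 input characters into up to 3 output bytes, so a full 0x400-byte input can decode to 0x300 bytes, which is 0x100 bytes more than base64Out can hold. The sketch below works through that arithmetic and shows the kind of length check the service is missing. The function names are hypothetical.

```python
import base64


def max_decoded_len(encoded_len: int) -> int:
    """Upper bound on decoded size: every 4 base64 characters yield up to 3 bytes."""
    return (encoded_len // 4) * 3


def decode_bounded(encoded: bytes, out_capacity: int) -> bytes:
    """Decode only if the worst-case output is guaranteed to fit the destination."""
    if max_decoded_len(len(encoded)) > out_capacity:
        raise ValueError("decoded data could exceed the output buffer")
    return base64.b64decode(encoded, validate=True)
```

With the sizes from doprocessing, max_decoded_len(0x400) is 0x300, so decode_bounded rejects a full-length input for a 0x200-byte buffer instead of writing past its end. The check is conservative: it uses the worst case rather than the exact decoded length.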


Step 4: Memory layout analysis

Finding the vulnerability is a small part of the puzzle. Exploiting the vulnerability is the tricky part.

We need to understand the memory layout of the remote service from its binary in order to do remote code execution.

Is address space layout randomization (ASLR) turned on in the remote machine?

To answer this question, we need to find a way to leak memory addresses from the remote machine to our machine.


Step 4: Leaking memory addresses of the remote service

Every dynamically linked Linux binary has a table called the Global Offset Table (GOT). The GOT contains pointers to the runtime addresses of library functions.

Goal: Print the GOT entries of the remote service!

We can modify the control flow of the doprocessing function because of the buffer overflow.

We will overwrite the return address of doprocessing with the address of the write function and pass the address of a GOT entry in the appropriate register (the rsi register, which holds the buffer argument of write). This step is performed using return-oriented programming (ROP).

Leaking the addresses from the remote service twice showed different values each time: ASLR is ON, so it is not easy to hack the remote server.
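Once a GOT entry has been printed back, the raw bytes still have to be interpreted. The sketch below is an illustration with made-up addresses, not values from the real service. It unpacks an 8-byte little-endian pointer and compares two leaks. It also relies on the fact that libraries are mapped at page-aligned bases, so the low 12 bits of a function's address stay the same across runs even when ASLR changes the rest. Function names are hypothetical.

```python
import struct


def parse_leaked_pointer(raw: bytes) -> int:
    """Interpret the first 8 leaked bytes as a little-endian 64-bit address."""
    if len(raw) < 8:
        raise ValueError("need at least 8 bytes for an x86-64 pointer")
    return struct.unpack("<Q", raw[:8])[0]


def looks_randomized(leak_run1: int, leak_run2: int) -> bool:
    """Different addresses for the same function on two runs suggest ASLR is on."""
    return leak_run1 != leak_run2


def page_offset(address: int) -> int:
    """Low 12 bits of an address: the offset inside a 4 KiB page."""
    return address & 0xFFF
```

If two leaks of the same function differ but share the same page_offset, that is consistent with one library build being loaded at two different randomized bases.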


Step 5: Stealing the remote’s Libc

Libc is Turing-complete, meaning we can construct any algorithm from the code fragments of libc.

Since the remote service is vulnerable to memory errors, we are able to read arbitrary memory of the remote service!

This vulnerability allowed us to write a program that secretly transfers the remote service’s libc binary. This solved the problem that the remote server has different runtime versions of libc and GCC.
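The transfer itself amounts to reading a memory range in small pieces and joining them. The sketch below shows only that loop, under the assumption that some read primitive already exists; read_at is a hypothetical callback standing in for it, and here it would be backed by a local byte string rather than a network connection.

```python
from typing import Callable


def dump_range(read_at: Callable[[int, int], bytes], start: int, length: int,
               chunk: int = 0x100) -> bytes:
    """Read [start, start + length) in chunks through the given read primitive."""
    out = bytearray()
    address = start
    end = start + length
    while address < end:
        wanted = min(chunk, end - address)
        data = read_at(address, wanted)[:wanted]
        if not data:
            raise IOError("read primitive returned no data")
        out += data
        address += len(data)  # a short read simply continues from where it stopped
    return bytes(out)
```

A local stand-in for the primitive can be as simple as a function that slices a bytes object at (address - base), which makes the loop testable without any remote system.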


Step 5: Stealing the stack canary

The stack canary prevents remote code execution!

Goal: Steal the stack canary by guessing 1 byte at a time.

Approach: A stack canary is 8 bytes long, so guessing it byte by byte requires at most 8 x 256 = 2048 attempts.

The binary has a fork-based vulnerability, which is a design flaw. The parent remote service spawns a child task using the fork syscall, but all child tasks inherit the same stack canary.

Thus, we wrote a program that will correctly guess the stack canary in at most 8 x 256 attempts.
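The reason the guess count drops from 2^64 to at most 8 x 256 is that each byte can be confirmed on its own: a child that survives a partial overwrite reveals that the guessed prefix was right. The sketch below simulates this locally. The oracle is a stand-in that compares against a secret held in the same process, and all names and the example secret are made up for illustration.

```python
from typing import Callable, Tuple

CANARY_LEN = 8  # bytes in an x86-64 stack canary


def make_simulated_oracle(secret: bytes) -> Callable[[bytes], bool]:
    """Local stand-in for 'did the forked child survive this partial overwrite?'."""
    def child_survives(guess: bytes) -> bool:
        return secret.startswith(guess)
    return child_survives


def recover_canary(child_survives: Callable[[bytes], bool]) -> Tuple[bytes, int]:
    """Recover the canary one byte at a time; return it with the attempt count."""
    known = b""
    attempts = 0
    for _ in range(CANARY_LEN):
        for candidate in range(256):
            attempts += 1
            if child_survives(known + bytes([candidate])):
                known += bytes([candidate])
                break
        else:
            # Happens when the canary changes between attempts, e.g. fork plus exec.
            raise RuntimeError("no candidate byte survived")
    return known, attempts
```

The search only works because every forked child carries the parent's canary. If each child received a fresh canary, earlier confirmed bytes would be worthless and the loop would fail.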


Step 6 – Constructing the evil input that spawns a remote shell

In our case, we want to spawn a remote shell using the vulnerable remote service. We do this using return-oriented programming (ROP), a hacking technique.

We wrote a program that constructs ROP gadgets using the stolen libc.

We get a backdoor into the remote system!

Please talk to me for more details (this was only a 30-minute talk).


Conclusion

Memory errors are very dangerous, even if a remote machine is running a custom-built environment! Hackers can steal, reconstruct, and exploit our environment.

Secure OS features are necessary but not sufficient: we were able to defeat ASLR, NX, and stack canaries.

Secure coding is mandatory; the OS cannot always protect us if our code is not secure.

One main security requirement is input validation. Extensive off-nominal testing and verification are required!


Future work

Our binary analysis is semi-manual. More automation and research are needed for binary reverse engineering.

Reachability analysis is effort-intensive.

Generating an evil input that spawns a remote shell is the most challenging part of exploit generation. We have some ideas for how to do this!


Linux Binary Analysis and Exploitation

Dharma Ganesan, Mikael Lindvall

Fraunhofer Center for Experimental Software Engineering
College Park, Maryland, USA

{dganesan, mlindvall}@fc-md.umd.edu