English Shellcode

Joshua Mason, Sam Small (Johns Hopkins University)
Fabian Monrose (University of North Carolina)
Greg MacManus (iSIGHT Partners)

16th ACM CCS



Advanced Defense Lab

Outline

▪ Introduction
▪ On the arms race
▪ Related work
▪ Our approach
▪ Automatic generation
▪ Implementation
▪ Evaluation


Introduction

Code-injection attacks can inject:
▪ Source code for a scripting language
▪ Byte-code
▪ Machine code

The common component: the injected code, or … shellcode


Misconception

Shellcode is delivered in tandem with the exploit.
▪ Counterexample: store the shellcode in memory first, then exploit

Shellcode takes the form of directly executable machine code.
▪ Counterexample: polymorphism


Misconception…?

Even polymorphic shellcode is constrained by an essential component: the decoder.

Shellcode is fundamentally different in structure from non-executable payload data.
▪ Challenged by this paper!

[Figure: polymorphic shellcode made up of a decoder and encoded data]


About This Paper

Automatically producing English Shellcode

The result is not indistinguishable from authentic English prose.
▪ But do you want to analyze it?


On The Arms Race

Shellcode developers often face constraints that limit the range of byte values accepted.
▪ e.g., printable, alphanumeric, MIME

Techniques for working within these constraints:
▪ Encoding
▪ Self-modification
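A byte-value constraint of this kind can be pictured as a filter over the payload. The sketch below is only an illustration: the helper name `violations` and the two alphabets are assumptions, and the MIME case is left out.

```python
import string

# Hypothetical alphabets mirroring two of the examples above.
PRINTABLE = set(range(0x20, 0x7F))  # printable ASCII, space through '~'
ALPHANUMERIC = {ord(c) for c in string.ascii_letters + string.digits}

def violations(payload, allowed):
    """Return the offsets of payload bytes that fall outside the allowed set."""
    return [i for i, b in enumerate(payload) if b not in allowed]

# 0x31 0xC0 is "xor %eax,%eax": '1' (0x31) passes both filters, 0xC0 passes neither.
sample = b"\x31\xc0PQ"
print(violations(sample, PRINTABLE))
print(violations(sample, ALPHANUMERIC))
```

A payload with a non-empty violation list would have to be encoded, or patched at run time by self-modifying code, before it could pass such a filter.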


On The Arms Race

Much of the literature describing code-injection attacks assumes a standard attack template.
▪ A NOP sled, shellcode, and one or more pointers

Emulation and static analysis have been successful in identifying some failings of advanced shellcode.
▪ But … overhead


On The Arms Race

It has been suggested that malicious polymorphic behavior cannot be modeled effectively.
▪ Y. Song et al., “On the Infeasibility of Modeling Polymorphic Shellcode”


Related Work

Limiting the spoils of exploitation and preventing developers from writing vulnerable code

Preventing the execution of injected code

Content-based input validation
▪ Polymorphic shellcode detection
  ▪ Identifies self-decrypting shellcode
  ▪ But … non-self-contained polymorphic shellcode


Our Approach

Shellcode is simply an ordered list of machine instructions.
▪ “Shake Shake Shake!” is also: push %ebx; push “ake ”; push %ebx; push “ake ”; push %ebx; push “ake!”;

But what about add, mov, call?

Goal: develop an automated approach
▪ Arbitrary shellcode → English representation
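The “Shake Shake Shake!” example can be checked by hand: on 32-bit x86, “S” (0x53) is the one-byte push %ebx and “h” (0x68) is push with a four-byte immediate. The sketch below handles only those two opcodes; it is a hypothetical helper for this one sentence, not a general disassembler.

```python
PUSH_EBX = 0x53    # ASCII 'S'
PUSH_IMM32 = 0x68  # ASCII 'h', followed by a 4-byte immediate

def disassemble(text):
    """Decode a string made only of 'push %ebx' and 'push imm32' instructions."""
    code = text.encode("ascii")
    out, i = [], 0
    while i < len(code):
        op = code[i]
        if op == PUSH_EBX:
            out.append("push %ebx")
            i += 1
        elif op == PUSH_IMM32:
            imm = code[i + 1:i + 5]
            if len(imm) < 4:
                raise ValueError("truncated immediate")
            out.append('push "%s"' % imm.decode("ascii"))
            i += 5
        else:
            raise ValueError("opcode 0x%02x is not handled in this sketch" % op)
    return out

for line in disassemble("Shake Shake Shake!"):
    print(line)
```

The 18 characters split into three repetitions of a one-byte push and a five-byte push, which matches the instruction list on the slide.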


High-level Overview

English shellcode is completely self-contained.


The Decoder

The decoder must be English-compatible
▪ Many instructions cannot be used
  ▪ E.g., loop instructions

Our decoder has the form:
▪ Initialization
▪ Decoder
▪ Encoded payload


Decoder Design Principles

Only English-compatible instructions

English-compatible instructions that can produce useful instructions

Favor instructions that have less-constrained ASCII equivalents
▪ push %eax (“P”) > push %ecx (“Q”)
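One way to picture the preference for “P” over “Q” is to rank the one-byte push opcodes (0x50 to 0x57, the letters “P” to “W”) by how common each letter is in English. Letter frequency is only an assumed proxy for “less constrained”, and the percentages below are approximate values chosen for illustration, not figures from the slides.

```python
# Registers pushed by the one-byte opcodes 0x50..0x57, in opcode order.
PUSH_REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

# Approximate letter frequencies in English text, in percent (assumed values).
LETTER_FREQ = {"p": 1.9, "q": 0.1, "r": 6.0, "s": 6.3,
               "t": 9.1, "u": 2.8, "v": 1.0, "w": 2.4}

def rank_push_instructions():
    """Return (frequency, letter, instruction) rows, most common letter first."""
    rows = []
    for i, reg in enumerate(PUSH_REGS):
        ch = chr(0x50 + i)  # 'P' .. 'W'
        rows.append((LETTER_FREQ[ch.lower()], ch, "push %" + reg))
    return sorted(rows, reverse=True)

for freq, ch, instr in rank_push_instructions():
    print(ch, instr, freq)
```

Under this proxy push %eax lands well ahead of push %ecx, because a sentence has far more opportunities to contain a “p” than a “q”.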


Decoder - Initialization

Overwriting registers and patching some instructions

Using the inc instruction and manipulating the alignment of the stack


Decoder - Unpacking

“and r/m8, r8” (opcode 0x20, the ASCII space character)
▪ add
▪ lods (load string from %esi)
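Because the space character is the opcode of and r/m8, r8, the character that follows a space acts as that instruction's ModRM byte. The sketch below splits such a byte into its bit fields; the function names are hypothetical, and decoding of the full memory operand (SIB byte, displacement) is deliberately left out.

```python
# 8-bit register names selected by the ModRM "reg" field.
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

def modrm_fields(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

def describe(pair):
    """Describe a two-character string: a space followed by a ModRM byte."""
    opcode, modrm = pair.encode("ascii")
    if opcode != 0x20:
        raise ValueError("expected the space character (and r/m8, r8)")
    mod, reg, rm = modrm_fields(modrm)
    return {"mod": mod, "source": "%" + REG8[reg], "rm": rm}

print(describe(" A"))
```

Every printable ASCII byte is below 0x80, so the mod field can only be 0 or 1. The destination of an and that appears in English text is therefore always a memory operand, never a register.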


Decoder - Decoding

Two pointers: %esi, %edi

“,” and “ ”

“u” and “decode” “G”


Decoder – Initializing Registers

Using the popa instruction (ASCII character “a”)
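The appeal of popa is that a single “a” (0x61) loads all the general-purpose registers from values already on the stack. The sketch below models that behavior with a Python list standing in for the stack; the list-based model is an assumption made for illustration only.

```python
# Order in which 32-bit popa reads values from the top of the stack.
# The slot that corresponds to %esp is skipped over and discarded.
POPA_ORDER = ["edi", "esi", "ebp", None, "ebx", "edx", "ecx", "eax"]

def popa(stack):
    """Pop eight values (index 0 is the top of the stack) into registers."""
    if len(stack) < 8:
        raise ValueError("popa needs eight stack slots")
    regs = {}
    for name in POPA_ORDER:
        value = stack.pop(0)
        if name is not None:
            regs[name] = value
    return regs

print(popa([10, 20, 30, 40, 50, 60, 70, 80]))
```

Values pushed earlier by English-compatible push instructions can be placed into %esi and %edi this way without any mov.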


Automatic Generation

Taken as-is, the custom decoder consists of common English characters, but it does not have the appearance of English text.

Insert additional instructions between the decoder instructions

Augment a statistical language-generation algorithm


Automatic Generation

The n-gram length is 5

The i-th instruction in the decoder has level i

A sentence has score i when it completes level i

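$$
\begin{aligned}
P(W_1 W_2 \cdots W_n) &= P(W_1)\,P(W_2 \mid W_1)\,P(W_3 \mid W_1 W_2) \cdots P(W_n \mid W_1 W_2 \cdots W_{n-1}) \\
&\approx P(W_1)\,P(W_2 \mid W_1)\,P(W_3 \mid W_2) \cdots P(W_n \mid W_{n-1})
\end{aligned}
$$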
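The bigram form of the factorization can be tried on a toy corpus. The sketch below uses unsmoothed relative counts and a nine-word corpus invented for illustration; it is not the deck's 5-gram model or its training data.

```python
from collections import Counter

def train_bigrams(words):
    """Count single words and adjacent word pairs in a tokenized corpus."""
    return Counter(words), Counter(zip(words, words[1:]))

def sentence_probability(sentence, unigrams, bigrams):
    """P(W1) times the product of P(Wi | Wi-1), from relative counts."""
    if not sentence:
        return 0.0
    total = sum(unigrams.values())
    prob = unigrams[sentence[0]] / total
    for prev, cur in zip(sentence, sentence[1:]):
        if unigrams[prev] == 0:
            return 0.0
        prob *= bigrams[(prev, cur)] / unigrams[prev]
    return prob

corpus = "the cat sat on the mat the cat ran".split()
uni, bi = train_bigrams(corpus)
print(sentence_probability(["the", "cat", "sat"], uni, bi))
```

Here P(the) = 3/9, P(cat | the) = 2/3, and P(sat | cat) = 1/2, so the sentence scores 1/9. A word sequence never seen in the corpus scores zero, which is why practical models add smoothing.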


Use the beam search algorithm
▪ Keep the best m (= 20,000) candidates during the process

For the encoded payload, observe how many target bytes are encoded
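Beam search itself is easy to sketch: expand every surviving candidate, score the results, and keep only the best m. In the sketch below the candidates are plain strings and the score counts matched leading characters of a target; both are toy stand-ins (assumptions) for the deck's generated sentences and its decoder-level scoring.

```python
import heapq

def beam_search(start, expand, score, steps, beam_width):
    """Keep only the beam_width best candidates after every expansion step."""
    beam = [start]
    for _ in range(steps):
        candidates = [nxt for cand in beam for nxt in expand(cand)]
        if not candidates:
            break
        beam = heapq.nlargest(beam_width, candidates, key=score)
    return beam

# Toy stand-ins: a six-character alphabet and a prefix-matching score.
TARGET = "Shake"
ALPHABET = "Shake!"

def expand(cand):
    return [cand + ch for ch in ALPHABET]

def score(cand):
    matched = 0
    for a, b in zip(cand, TARGET):
        if a != b:
            break
        matched += 1
    return matched

best = beam_search("", expand, score, steps=len(TARGET), beam_width=3)
print(best[0])
```

The beam_width parameter plays the role of m on the slide. A wider beam explores more candidates per step at the cost of more scoring work.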


Implementation

The training data
▪ Over 15,000 Wikipedia articles
▪ 27,000 books from Project Gutenberg

The language engine was constructed in Java using the LingPipe API

The scoring engine uses the ptrace API
▪ Executor
▪ Watcher

Running time: 12 hours


An Optimized Design

Emulation
▪ Expands 1 instruction into tens of instructions

Monitored direct execution
▪ Maintain 2 machine states
▪ Use 3 separate stacks
▪ Pause on 2 conditions
  ▪ Encountering a jump
  ▪ Changing memory

Running time: roughly less than 1 hour


Evaluation

Exit(0) 2054 bytes


Compare with Spectrum Analysis

Windows Bind DLL Inject