Paradyn Project: Paradyn / Dyninst Week, Madison, Wisconsin, April 12-14, 2010. Binary Rewriting with Dyninst. Madhavi Krishnan and Dan McNulty


  • Talk Outline

    • Binary Rewriter Review
    • Implementation Challenges
    • New Features
    • Rewriting Statically Linked Binaries
    • Conclusion

    A Brief Discussion of Ways and Means

  • Binary Rewriting

    • Rewrite executables
    • Rewrite libraries
    • Add new libraries to binaries

    [Figure labels: a.out, libc, libprofile, Dyninst Binary Rewriter, a.out.rewritten, libc.rewritten]


  • Binary Rewriter Capabilities

    • Instrument once, run many
    • Support more systems (BlueGene, FreeBSD, …)
    • Operate on unmodified binaries
    • No debug information required
    • No linker relocations required
    • No symbols required
    • Rewritten binary need not be compiled or linked
    • Dynamic instrumentation and binary rewriting use the same abstractions and interfaces


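  • Binary Rewriter Example

    /* Setup */
    /* bpatch is the mutator's BPatch object. openFile and createProcess are
       slide shorthand; the released API names them openBinary and processCreate. */
    BPatch_addressSpace *addr_space;
    if (use_bin_edit)
        addr_space = bpatch.openFile("a.out");
    else
        addr_space = bpatch.createProcess("a.out");

    /* Instrumentation */
    addr_space->loadLibrary("libInstrumentation.so");
    addr_space->getImage()->findFunction("func", funcs);
    addr_space->insertSnippet(callExpr, point);

    /* Finalize */
    /* app_bin and app_proc are addr_space viewed as a BPatch_binaryEdit*
       and a BPatch_process*, respectively. */
    if (use_bin_edit) {
        app_bin->writeFile("a.rewritten.out");
    } else {
        app_proc->continueExecution();
    }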
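The slide's point is that only setup and finalize differ between the two modes. The following is a minimal, self-contained sketch of that design and does not use Dyninst itself: `ToyAddressSpace`, `ToyBinaryEdit`, `ToyProcess`, `open_target`, and `instrument` are hypothetical stand-ins invented for illustration.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins, NOT Dyninst classes: they only mimic the shape of
// the slide, where one base type hides "file on disk" versus "live process".
class ToyAddressSpace {
public:
    virtual ~ToyAddressSpace() = default;
    // Shared instrumentation step: record a library to load.
    void loadLibrary(const std::string &lib) { actions.push_back("load " + lib); }
    // Shared instrumentation step: record a call inserted at a function.
    void insertCall(const std::string &func) { actions.push_back("call at " + func); }
    // The only mode-specific step.
    virtual std::string finalize() = 0;
    std::vector<std::string> actions;
};

class ToyBinaryEdit : public ToyAddressSpace {
public:
    std::string finalize() override { return "write a.rewritten.out"; }
};

class ToyProcess : public ToyAddressSpace {
public:
    std::string finalize() override { return "continue execution"; }
};

// Setup: pick the mode once.
std::unique_ptr<ToyAddressSpace> open_target(bool use_bin_edit) {
    if (use_bin_edit)
        return std::unique_ptr<ToyAddressSpace>(new ToyBinaryEdit);
    return std::unique_ptr<ToyAddressSpace>(new ToyProcess);
}

// Instrumentation: identical for both modes.
void instrument(ToyAddressSpace &as) {
    as.loadLibrary("libInstrumentation.so");
    as.insertCall("func");
}
```

Calling `instrument` on the result of `open_target(true)` and of `open_target(false)` records the same actions; only `finalize()` differs.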


  • Challenges

    • Complex standards
        • Executable and Linkable Format (ELF)
        • System V Standard
        • Linux Standard Base (LSB)
    • Accessing information in the original binary file
        • Redundant information. Inconsistent! E.g., section size stored in headers and dynamic section
    • Writing a new binary file
        • Updating sections with new information
        • Not precisely defined by standards! E.g., adding a new symbol to the hash section
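To make the hash-section example concrete, here is a small sketch of what adding a symbol involves. `elf_hash` follows the hash function given in the System V ABI; `ToyHashTable` is a hypothetical in-memory model (not a Dyninst or libelf structure), and linking the new symbol at the head of its chain is a choice made by this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hash function specified by the System V ABI for the ELF .hash section.
std::uint32_t elf_hash(const std::string &name) {
    std::uint32_t h = 0;
    for (unsigned char c : name) {
        h = (h << 4) + c;
        std::uint32_t g = h & 0xf0000000u;
        if (g != 0) h ^= g >> 24;
        h &= ~g;
    }
    return h;
}

// Toy model of a .hash section (hypothetical type).
// Symbol index 0 is reserved and terminates every chain.
struct ToyHashTable {
    std::vector<std::uint32_t> bucket;  // head symbol index per bucket
    std::vector<std::uint32_t> chain;   // next symbol index, one slot per symbol
    std::vector<std::string> names;     // stand-in for the symbol and string tables

    explicit ToyHashTable(std::size_t nbucket)
        : bucket(nbucket, 0), chain(1, 0), names(1, "") {}

    // Append a symbol and link it at the head of its bucket's chain.
    std::uint32_t add(const std::string &name) {
        std::uint32_t index = static_cast<std::uint32_t>(names.size());
        names.push_back(name);
        std::size_t b = elf_hash(name) % bucket.size();
        chain.push_back(bucket[b]);
        bucket[b] = index;
        return index;
    }

    // Walk the chain for the name's bucket; 0 means "not found".
    std::uint32_t find(const std::string &name) const {
        std::size_t b = elf_hash(name) % bucket.size();
        for (std::uint32_t i = bucket[b]; i != 0; i = chain[i])
            if (names[i] == name) return i;
        return 0;
    }
};
```

In a real binary the bucket and chain arrays live in the file next to the symbol table, so adding one symbol changes the size of several sections at once.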


  • Challenges

    • Implementation of the standards
        • Libraries and tools
        • OS
    • Assigning meaning to undefined behavior
        • Symbols with no name and no type
    • Stringent requirements by libelf
        • Section alignment
    • Unexpected restrictions by the OS
        • Program header must be on first page
        • Loader assumes relocation sections are adjacent
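Two of these restrictions come down to simple arithmetic that a rewriter has to get right when it lays out a new file. The sketch below is illustrative only: the helper names are hypothetical, and the exact checks performed by libelf or a given OS loader are not described in the slides.

```cpp
#include <cstdint>

// Round offset up to the next multiple of align (align must be a power of two).
std::uint64_t align_up(std::uint64_t offset, std::uint64_t align) {
    return (offset + align - 1) & ~(align - 1);
}

// True if the whole program header table lies inside the first page of the file.
// Parameter names mirror the ELF header fields e_phoff, e_phentsize, e_phnum.
bool phdrs_on_first_page(std::uint64_t e_phoff, std::uint64_t e_phentsize,
                         std::uint64_t e_phnum, std::uint64_t page_size) {
    return e_phoff + e_phentsize * e_phnum <= page_size;
}
```

For example, with a 64-byte ELF header followed by nine 56-byte program headers and an assumed 4096-byte page, the table ends at offset 568 and fits. Adding program headers for new segments moves that end point, which is why the rewriter cannot simply relocate the table elsewhere in the file.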


  • What is New in the Binary Rewriter?

    • Linux/PowerPC32 port
    • Handling run time events with the binary rewriter
    • Support for rewriting static binaries


  • Linux/PowerPC32 Port

    • Dealing with Position Independent Code (PIC)
        • What is PIC?
        • Why deal with PIC?
    • PowerPC specific challenges
        • Identifying PIC idiom
        • Determining current PC

    [Figure labels: address space (0x1000, 0x2000, 0x3000); code; data; shared library; PC relative references]
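One likely reason a rewriter has to care about PIC (an assumption here, since the slide only poses the question) is that a PC-relative reference stops pointing at its target once the referencing code is moved. The arithmetic can be sketched as follows; the function names are hypothetical and the addresses in the usage note are chosen for illustration.

```cpp
#include <cstdint>

// Address that a PC-relative reference resolves to.
std::uint32_t pc_relative_target(std::uint32_t pc, std::int32_t displacement) {
    return pc + static_cast<std::uint32_t>(displacement);
}

// Displacement needed for code at new_pc to reach the same absolute target.
std::int32_t rebased_displacement(std::uint32_t target, std::uint32_t new_pc) {
    return static_cast<std::int32_t>(target - new_pc);
}
```

Suppose code at 0x1000 reaches data at 0x2000 with a displacement of 0x1000. If that code is copied to 0x3000, the displacement must become -0x1000 to reach the same data, and computing it requires knowing the PC the original code assumed.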


  • Handling Run Time Events

    Initialize and finalize instrumentation

    [Figure labels: Dyninst mutator; mutatee process; process load events; oneTimeCode; callback]


  • Handling Run Time Events

    Initialize and finalize instrumentation

    [Figure labels: mutatee binary; process load events; "?"]

    • Snippet to handle the event
    • init/fini section
    • A general framework to handle run time events
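With no mutator attached at run time, the binary itself has to trigger initialization and finalization. As a rough analogy (not the rewriter's mechanism), a C++ object at namespace scope gets the same effect: with common ELF toolchains its constructor and destructor are run by the program's startup and shutdown machinery. `ToyRuntimeHooks` and `event_log` are hypothetical names.

```cpp
#include <string>
#include <vector>

// Log of run time events; a function-local static avoids initialization-order issues.
std::vector<std::string> &event_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical hook type: its constructor plays the role of "initialize
// instrumentation", its destructor the role of "finalize instrumentation".
struct ToyRuntimeHooks {
    ToyRuntimeHooks() { event_log().push_back("init"); }
    ~ToyRuntimeHooks() { event_log().push_back("fini"); }
};

// A namespace-scope object: with common toolchains it is constructed before
// main() starts and destroyed after main() returns, with no external tool
// attached to the process.
static ToyRuntimeHooks hooks;
```

By the time `main` runs, `event_log()` already holds the "init" entry; the "fini" entry is added during program exit.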


  • Rewriting Static Binaries

    [Figure labels: dynamic binary (headers, code, data, dynamic linker, shared libraries); static binary (headers, code, data, static library code); "?"]


  • Adding New Libraries to Static Binaries

    • Link code and data from the new libraries into the binary
    • Can we use an existing linker?
    • Dyninst must become a linker

    [Figure labels: static binary; headers; code; data]


  • Rewriting a Static Binary

    Let's start with this simple picture of a binary.

    [Figure labels: Headers, Code, Data]

  • Rewriting a Static Binary

    First, load new libraries.

    [Figure labels: Headers, Code, Data]

  • Rewriting a Static Binary

    Second, generate instrumentation to reference new libraries.

    [Figure labels: Headers, Code, Data, Instrumentation, References]

  • Rewriting a Static Binary

    Third, link code and data from the new libraries into the binary.

    [Figure labels: Headers, Code, Data, Instrumentation, libdyninstRT.a Code, libprofile.a Code, libc.a Code, libdyninstRT.a Data, libprofile.a Data, libc.a Data, References]

  • Rewriting a Static Binary

    Finally, update the headers.

    [Figure labels: Old Headers, Code, Data, Instrumentation, libdyninstRT.a Code, libprofile.a Code, libc.a Code, libdyninstRT.a Data, libprofile.a Data, libc.a Data, New Headers]
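The third step (placing library code and resolving references to it) can be sketched in a few lines. This is a toy model under stated assumptions, not how Dyninst implements its linker: `ToyLibrary` and `link_libraries` are hypothetical, data sections and relocations are omitted, and the 16-byte default alignment is arbitrary.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one library being linked into the binary.
struct ToyLibrary {
    std::string name;
    std::uint64_t code_size;
    std::map<std::string, std::uint64_t> symbols;  // symbol -> offset in code
};

// Place each library's code after the current end of the binary and return
// the absolute address of every symbol, so references can be resolved.
std::map<std::string, std::uint64_t> link_libraries(
        std::uint64_t end_of_binary, const std::vector<ToyLibrary> &libs,
        std::uint64_t align = 16) {
    std::map<std::string, std::uint64_t> resolved;
    std::uint64_t cursor = end_of_binary;
    for (const ToyLibrary &lib : libs) {
        cursor = (cursor + align - 1) / align * align;  // align the start
        for (const auto &sym : lib.symbols)
            resolved.emplace(sym.first, cursor + sym.second);  // first definition wins
        cursor += lib.code_size;
    }
    return resolved;
}
```

The instrumentation generated in the second step would then be patched with the addresses in the returned map, and the final cursor position feeds into the header update of the last step.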


  • Challenges in Rewriting Static Binaries

    Dyninst must become a linker

    [Figure labels: object files and static library (not finalized); linker; static binary (finalized); new library; Dyninst Binary Rewriter as relinker]


  • Challenges in Rewriting Static Binaries

    • Relinking is harder than linking
        • Thread Local Storage (TLS)
        • Constructor and destructor tables
    • Supporting TLS
        • Need to link together multiple TLS sections
        • TLS sections must be adjacent
        • Move existing TLS section to the end and append new TLS sections
        • Update program header
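The TLS steps above amount to laying several sections out back to back and recording the combined size and alignment. A minimal sketch, assuming each section is described only by its size and alignment (`ToyTlsSection`, `ToyTlsLayout`, and `layout_tls` are hypothetical names):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical summary of one TLS section contributed by the binary or a library.
struct ToyTlsSection {
    std::uint64_t size;
    std::uint64_t align;  // power of two
};

struct ToyTlsLayout {
    std::vector<std::uint64_t> offsets;  // offset of each section in the segment
    std::uint64_t total_size = 0;        // size to record in the program header
    std::uint64_t align = 1;             // strictest alignment of any section
};

// Lay the sections out back to back (existing section first), padding only
// as much as each section's alignment requires.
ToyTlsLayout layout_tls(const std::vector<ToyTlsSection> &sections) {
    ToyTlsLayout out;
    for (const ToyTlsSection &s : sections) {
        std::uint64_t start = (out.total_size + s.align - 1) & ~(s.align - 1);
        out.offsets.push_back(start);
        out.total_size = start + s.size;
        if (s.align > out.align) out.align = s.align;
    }
    return out;
}
```

Sections of (size, alignment) (24, 8), (10, 16), and (4, 4) land at offsets 0, 32, and 44, giving a combined size of 48 and an alignment of 16. Those two values are what the "update program header" step would write into the TLS program header.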


  • Challenges in Rewriting Static Binaries

    Unexpected interactions within the tool chain

    [Figure labels: gcc; ld; standard format; unpublished conventions; linked binary; new library; Dyninst Binary Rewriter]


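  • Binary Rewriter Example

    /* Setup */
    /* bpatch is the mutator's BPatch object. openFile and createProcess are
       slide shorthand; the released API names them openBinary and processCreate. */
    BPatch_addressSpace *addr_space;
    if (use_bin_edit)
        addr_space = bpatch.openFile("a.out");
    else
        addr_space = bpatch.createProcess("a.out");

    /* Instrumentation */
    if (addr_space->isStaticExecutable()) {
        addr_space->loadLibrary("libprofile.a");
        addr_space->loadLibrary("libc.a");
    } else {
        addr_space->loadLibrary("libprofile.so");
    }

    /* Finalize */
    /* app_bin and app_proc are addr_space viewed as a BPatch_binaryEdit*
       and a BPatch_process*, respectively. */
    if (use_bin_edit) {
        app_bin->writeFile("a.rewritten.out");
    } else {
        app_proc->continueExecution();
    }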

  • Binary Rewriter Status

    • Rewriting dynamic binaries
        • Linux/x86
        • Linux/x86_64
        • Linux/PowerPC32
    • Rewriting static binaries
        • Linux/x86
        • Linux/x86_64


  • Future Directions

    • Rewriting dynamically linked binaries
        • PowerPC64
    • Rewriting statically linked binaries
        • PowerPC family
    • Ports to new platforms and object formats
        • FreeBSD (ELF)
        • Windows (PE, PDB)
        • AIX (XCOFF)
    • Update debug information (DWARF) in rewritten binaries


  • Demo on Tuesday: Scalasca, TAU, Paraver

    Questions?


    Using the rewriter, you can instrument and rewrite executables and libraries. You can also add new libraries to the binaries in the process; for example, add libprofile to profile your binary.

    Once the file format support is in place, it is easier to port to new systems: because we do not have to deal with process control, we have fewer system issues to handle. This allows us to support more systems, such as BlueGene and FreeBSD.

    The good news is that dynamic instrumentation and binary rewriting share the same interface, for the most part. For users like you, that means there is less to learn in order to use the rewriter: you have to make very few modifications to your existing mutator. To be more precise, the only differences between dynamic instrumentation and binary rewriting in a Dyninst mutator are in the setup and finalize stages; the instrumentation code remains the same.

    Let's look at a simple code example. In the dynamic instrumentation case, you create or attach to a process in memory. In the binary rewriter case, you open a file on disk. So in the mutator code, during setup, you either open a binary file or create a process. After this, the instrumentation code is exactly the same in both cases. Once you are done with instrumentation, you either write a new binary file or continue execution of the loaded process. If you already know how to write a mutator for dynamic instrumentation, congratulations: you have just learned how to do binary rewriting.

    That brings us to the end of the review. I presented the interface of our rewriter. It is nice and simple, and it works really well with our existing instrumentation interface. That makes you wonder: what was hard about the rewriter? What you saw is just the tip of the iceberg. What is hard is the detail that lies underneath, which brings us to the implementation and challenges of the rewriter. In the past we have presented the implementation of our rewriter, so in this talk I will focus on the challenges specific to rewriting binaries that may be of interest to this community.

    As I said, we share the instrumentation infrastructure with dynamic instrumentation, so our code generator remains for the most part unchanged. However, unlike the dynamic instrumentor, which modifies a process that is already loaded in memory, we need to modify a binary that is on disk. That almost sounds simpler. Hmm, not really. Why? In this case, we are not just reading the binary file but actually writing one out to disk. And these new binaries have to conform to various standards more than the dynamic