Dmk shmoo2007


1. Weaponizing Noam Chomsky: Symbols and Grammars Are Fun

  • Dan Kaminsky, Director Of Penetration Testing

2. Introduction

  • Many physicists would agree that, had it not been for congestion control, the evaluation of web browsers might never have occurred. In fact, few hackers worldwide would disagree with the essential unification of voice-over-IP and public private key pair. In order to solve this riddle, we confirm that SMPs can be made stochastic, cacheable, and interposable.
    • Rooter: A Methodology for the Typical Unification of Access Points and Redundancy

3. That was BS.

  • That also got accepted into a con.
    • Automatically generated from a context free grammar
    • I've been working too hard all these years
    • Be quiet, or I will replace you with a very small shell script
  • This talk is a bit of a remix
    • Patterns and symbols have been interesting me of late
      • Automatic determination of both is difficult, interesting, and unsolved
    • Integration into human symbolic systems promises particularly interesting results
    • So we're going to explore a bit.
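
Generating text "from a context free grammar" means starting at one symbol and repeatedly replacing it with a randomly chosen rule until only plain words remain. A minimal sketch of that idea follows. The grammar, its vocabulary, and the names TOY_GRAMMAR, expand, and generate are illustrative assumptions, not the generator that produced the Rooter abstract.

```python
import random

# Hypothetical toy grammar: keys are nonterminals, each maps to a list of
# alternative productions. Any symbol that is not a key is a terminal.
TOY_GRAMMAR = {
    "S": [["In order to", "VERB", "this riddle, we", "CLAIM", "that",
           "THINGS", "can be made", "ADJ_LIST"]],
    "VERB": [["solve"], ["address"], ["overcome"]],
    "CLAIM": [["confirm"], ["argue"], ["show"]],
    "THINGS": [["SMPs"], ["access points"], ["web browsers"]],
    # The recursive rule is what lets the grammar produce lists of any length.
    "ADJ_LIST": [["ADJ"], ["ADJ", "and", "ADJ_LIST"]],
    "ADJ": [["stochastic"], ["cacheable"], ["interposable"]],
}


def expand(symbol, grammar, rng):
    """Recursively expand one symbol into a list of terminal strings."""
    if symbol not in grammar:
        return [symbol]  # terminal: emit as-is
    production = rng.choice(grammar[symbol])
    words = []
    for part in production:
        words.extend(expand(part, grammar, rng))
    return words


def generate(grammar, start="S", seed=None):
    """Produce one random sentence; a fixed seed makes the output repeatable."""
    rng = random.Random(seed)
    return " ".join(expand(start, grammar, rng)) + "."


if __name__ == "__main__":
    for seed in range(3):
        print(generate(TOY_GRAMMAR, seed=seed))
```

Every output is grammatical by construction and meaningless by design, which is the whole joke.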

4. Language Is Cool

  • Language: A protocol for the transmission of concepts and intentions between humans
    • Documentation is not available
    • Documentation does not really work
    • Learned through exposure and use
      • Significant amount of internal structure, redundancy, and consistency
  • Who makes language?
    • Kids.
      • Adults coin words here and there, but when they're forced to invent a common language to get things done, it's called a Pidgin, and it's terrible
      • The kids hear it, and invent a Creole: a merged language of significantly greater accuracy and depth
  • Children make languages
  • Adults make working languages
  • Programmers make barely working languages

5. Programmers Talk Funny

  • Fundamentally two languages that programmers must use
    • Code to Human: User Interface Design
    • Code to Code: File and Network Protocol
  • UI is a protocol.
    • This is obvious in retrospect.
  • There are two things this talk hopes to do
    • Correct some of the Code->Human protocols that are out there
    • Use human strategies to analyze Code to Code communications
      • Learning a protocol is learning a language. Humans do not learn languages quickly, and thus we're resource-bound on fuzzer development
      • It's 2007, and most parsers remain unfuzzed (and thus just waiting to be exploited)

6. Weaponizing Noam?

  • An early inference procedure was described by Chomsky and Miller (1957a), as reported in Solomonoff (1959). Chomsky proposed a method for detecting loops in finite state languages. The approach requires a set of valid sentences, and an oracle that determines whether a sentence is in the language. The algorithm proceeds by deleting part of a valid sentence and asking the oracle whether the sentence is still valid. If it is, the deleted part is reinserted into the sequence and repeated, so that it appears twice. If the sentence is still in the language, a cycle has been detected.
    • Inferring Sequential Structure, Craig Nevill-Manning, 1996
    • This couldn't POSSIBLY be useful for building a structure for a dumb fuzzer to operate against.
      • Instead of seeing if the parser crashes, just see if it considers the input valid
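
The delete-then-duplicate procedure quoted above can be sketched in a few lines. This is a brute-force illustration under stated assumptions, not the original algorithm's exact formulation: find_cycles and toy_oracle are hypothetical names, and the regular expression stands in for the real oracle, which in the fuzzing setting would be "does the target parser accept this input?".

```python
import re


def find_cycles(sentence, oracle):
    """Return (start, end) spans of `sentence` that behave like loop bodies.

    A span qualifies when the sentence stays valid both with the span
    deleted and with the span repeated twice in place.
    """
    cycles = []
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, n + 1):
            part = sentence[start:end]
            deleted = sentence[:start] + sentence[end:]
            if not oracle(deleted):
                continue  # removing the part broke the sentence
            doubled = sentence[:start] + part + part + sentence[end:]
            if oracle(doubled):
                cycles.append((start, end))
    return cycles


def toy_oracle(candidate):
    """Stand-in oracle (assumption): membership in the language a(bc)*d."""
    return re.fullmatch(r"a(bc)*d", candidate) is not None


if __name__ == "__main__":
    print(find_cycles("abcd", toy_oracle))  # [(1, 3)], i.e. the "bc" loop
```

The sketch makes a quadratic number of oracle calls per sentence, so it only suits short inputs or cheap validity checks. The spans it reports are the places where a structure-aware fuzzer can repeat or drop content without the parser rejecting the input outright.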

7. Topics Of Discussion

  • Further Explorations in Cryptomnemonics
    • Using Names and Syllables for password representation
  • Sequitur-XML: Merging automated structure discovery with the standard architecture for structure representation
    • which turned out to be quite nice for controlled structure destruction
  • Exploring Dotplots
    • Building a GUI
    • Exploring other domains

8. Intro To Symbol Sets

  • Machine Symbols
    • Data (AA, BB, CC)
    • Code (a(), b(), c())
    • Formats (All, Bad, Code)
  • Human Symbols
    • Letters (A, B, C)
    • Glyphs ( )
    • Syllables (Ah, Bee, See)
    • Words (Amazing, Bear, Clear)
    • Native Names (Alice, Bob, Charlie)
    • Things (Axe, Bone, Chimpanzee)
    • Actions (Ask, Buy, Compute)
    • Colors (Aquamarine, Blue, Chartreuse)
  • Machines can use formats, but their native format is raw bits
  • Humans have no concept of raw bits; everything must be contextual
    • Long history in mnemonics of mapping arbitrary data to a context

9. Different Domains Have Different Strengths: See Visual Processing

10. Cryptomnemonics

  • Definition: The study of human memory, as it applies to cryptographic systems
  • Developing in response to this:
    • $ ssh dan@blah
      The authenticity of host 'blah (' can't be established.
      RSA key fingerprint is 09:a9:b1:99:84:17:7d:ba:c6:55:46:5a:17:f8:83:01.
      Are you sure you want to continue connecting (yes/no)?
  • The machine is acting like it's integrating with another machine. It's not, and that matters.
  • Humans can handle hexadecimal characters, but not that many.

11. Hex Confusion

  • After somewhere between 2 and 5 characters, most of you will fail to see a difference
    • Positional Bias: Expect to see certain things at the beginning or end
    • Value Confusion: Letter vs. Number is remembered before the actual value of letter or number
      • Glyph confusion
    • Despair Effect
      • Nobody could possibly detect a change, so it's not rational to even try

12. Classes of Memory

  • There are three classes of memory, at least to the degree that is useful in cryptography
    • Rejection: I've never seen that before
    • Recognition: It's that one, not that other one
    • Recollection: Let me describe it to you.
  • SSH just requires rejection
    • Hex is not rejectable
    • Can we try another domain?

13. Exploring The Nymic Domain

  • $ ssh dan@blah
    Key Data:
      julio and epifania dezzutti
      luther and rolande doornbos
      manual and twyla imbesi
      dirk and cuc kolopajlo
      omar and jeana hymel
    The authenticity of host 'blah (' can't be established.
    Are you sure you want to continue connecting (yes/no)?
    • Alternate mapping for 09:a9:b1:99:84:17:7d:ba:c6:55:46:5a:17:f8:83:01.
    • Proposed last year as a potential solution
  • There is nothing more contextual than a story, and there is nothing more stable in a story than the names of its participants
    • Stories retold are stories remembered: we need to be exposed to the above group time and time again to be able to reject any deviation from it
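
One way to turn a fingerprint into couples like the ones above is to slice its bits into fixed-width indices into three name lists. The sketch below is an assumption-laden illustration, not the talk's implementation: the placeholder lists MALE, FEMALE, and LAST stand in for real census names, their 512/512/1024 sizes are borrowed from the "How Many Names?" slide, and zero-padding the final couple on the right is an arbitrary choice.

```python
def bits_for(names):
    """Number of bits one pick from `names` encodes (size must be 2**k)."""
    n = len(names)
    if n < 2 or n & (n - 1):
        raise ValueError("list size must be a power of two")
    return n.bit_length() - 1


def fingerprint_to_couples(fp_hex, male, female, last):
    """Map a colon-separated hex fingerprint to 'X and Y Z' couples."""
    raw = bytes.fromhex(fp_hex.replace(":", ""))
    total_bits = len(raw) * 8
    value = int.from_bytes(raw, "big")
    fields = [(male, bits_for(male)),
              (female, bits_for(female)),
              (last, bits_for(last))]
    per_couple = sum(width for _, width in fields)
    n_couples = -(-total_bits // per_couple)  # ceiling division
    # Assumption: pad with zero bits on the right to fill the last couple.
    value <<= n_couples * per_couple - total_bits
    remaining = n_couples * per_couple
    couples = []
    for _ in range(n_couples):
        picked = []
        for names, width in fields:
            remaining -= width
            picked.append(names[(value >> remaining) & ((1 << width) - 1)])
        couples.append(f"{picked[0]} and {picked[1]} {picked[2]}")
    return couples


# Placeholder name lists (hypothetical); a real system would load curated names.
MALE = [f"m{i:03d}" for i in range(512)]
FEMALE = [f"f{i:03d}" for i in range(512)]
LAST = [f"l{i:04d}" for i in range(1024)]

if __name__ == "__main__":
    fp = "09:a9:b1:99:84:17:7d:ba:c6:55:46:5a:17:f8:83:01"
    for couple in fingerprint_to_couples(fp, MALE, FEMALE, LAST):
        print(couple)
```

With these list sizes each couple carries 9 + 9 + 10 = 28 bits, so a 128-bit fingerprint needs five couples, which matches the five lines of Key Data shown above.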

14. How To Derive Names?

  • Original Model
    • Take US Census Data
    • Remove any names that may be easily confused with one another:
      • Easy: Bob v. Bobby
      • Hard: Bob v. Robert
  • Celebrity Naming
    • Marge Godwin
  • Archaic Naming
    • Use constructs from various ancient languages
  • Mechanistic Constructs
    • Bubble Babble: 64 bits = xegoz-tosys-vusik-masar
    • Koremutake: 64 bits = darujifahe stygrifrejy

15. How Many Names?

  • Unclear where the crossover point lies between the harm from more names and the benefit of more entropy per name
    • Present system is 512 male names, 512 female names, and 1024 last names from the US Census
    • 256/256/256 would provide 24 bits per couple instead of 40,