Software Watermarking - University of Arizona · 2011-05-11 · watermarking ensures that the data...

Software Watermarking

© April 28, 2011 Christian Collberg

Watermarking

Embed a unique identifier into the executable of a program.

A watermark is much like a copyright notice.

Won’t prevent an attacker from reverse engineering or pirating the program.

Allows us to show that the program the attacker claims to be his is actually ours.

Software fingerprinting: every copy you sell has a different, unique mark in it.

Trace the copy back to the original owner and take legal action.

History and Applications (p. 468)

[Figure: kinds of marks carried by a cover image (kitten.jpg), each with its properties and an example payload.]

Authorship mark ((in)visible, robust): © 2006 Collberg
Fingerprint mark (invisible, robust): Customer #27182818, Customer #31415926
Licensing mark (invisible, robust): <license object="kitten.jpg"> <grant to="Alice"> <right> copy-once </right> </grant> </license>
Meta-data mark (visible, fragile): "Cute kitten in window in Venice"
Validation mark (visible, fragile): MD5(kitten.jpg) = 0xc6ba8f25d2dfc44cf518d7f327c8e83f
Filtering mark (visible, robust): PG-13
Secret mark (invisible, fragile): "Attack mice at dawn"

Visible vs. invisible marks

A visible mark acts as a deterrent against misuse.

An invisible mark can only be extracted using a secret not available to the end user.

Robust vs. fragile marks

A robust mark is difficult to modify (accidentally or deliberately).

A fragile mark could (and sometimes should) be easily destroyed by transformations to the cover object.

Robust marks should survive lossy compression schemes, shrinking, cropping, xeroxing, PAL-to-NTSC conversion, ...

Authorship marks

Embed an identification of the copyright owner in the cover object.

Visible marks act as a deterrent, and invisible ones allow a web spider to search for marked images on the web.

Example: Playboy’s use of Digimarc.

Fingerprint marks

Serialize the cover object, i.e. embed a different mark in every distributed copy.

Example: actor Carmine Caridi gave away copies of Academy Award screening tapes.

Example: Beta copies of software.

Licensing marks

A licensing mark encodes, invisibly and robustly, the way the cover object can be used by the end user.

Integral part of any DRM system.

Usage rules could be stored in file headers, but using watermarking ensures that the data remains even after transformations.

Meta-data mark

Meta-data marks are visible and (possibly) fragile marks that embed useful data.

Example: captions.

Validation marks

Used by the end user to verify that the marked object is authentic and hasn’t been altered.

Example: compute an MD5 sum of an object and embed it as a watermark.

Example: validate that a crime scene photograph hasn’t been changed (by moving, say, a gun from one person’s hand to another).

Validation marks need to be fragile.
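The MD5 example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the mark is kept next to the object rather than embedded in it, and MD5 is used only because the slide names it (it is no longer considered collision resistant).

```python
import hashlib


def make_validation_mark(data: bytes) -> str:
    """Return the MD5 digest of the object as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()


def is_authentic(data: bytes, mark: str) -> bool:
    """Recompute the digest and compare it with the stored mark."""
    return hashlib.md5(data).hexdigest() == mark


# Hypothetical stand-in for the bytes of a photograph.
original = b"crime scene photo bytes"
mark = make_validation_mark(original)

# Any change to the object, however small, breaks the mark (it is fragile).
tampered = original.replace(b"photo", b"phot0")
```

Here `is_authentic(original, mark)` holds while `is_authentic(tampered, mark)` does not, which is exactly the fragility a validation mark needs.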

Filtering marks

A filtering or classification mark carries classification codes to allow media players to filter out any inappropriate material.

The mark needs to be robust and visible.

Secret marks

A secret mark is used for covert communication.

Steganography.

Robustness matters not at all.

Invisibility is vitally important.

Example: "Hidden in the X-rated pictures on several pornographic Web sites and the posted comments on sports chat rooms may lie the encrypted blueprints of the next terrorist attack against the United States or its allies."

Audio marking: Echo hiding

Embed echoes that are short enough to be imperceptible to the human ear:

[Figure: echo hiding, with labels p0, p1 and echo delays δ0, δ1.]
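As a rough sketch of the embedding side only, assuming the usual formulation in which a 0-bit and a 1-bit are encoded by echoes with two different delays: each block of samples gets a faint, delayed copy of the signal added to it. The block size, delays and echo amplitude below are illustrative placeholders, not values from the slides, and recovery of the bits is omitted.

```python
def embed_echo(samples, bits, block=8, d0=2, d1=3, alpha=0.5):
    """Add an echo to each block; the echo delay (d0 or d1) encodes one bit.

    samples: list of float sample values (hypothetical mono signal)
    bits:    list of 0/1 values, one per block
    """
    out = list(samples)
    for k, bit in enumerate(bits):
        delay = d1 if bit else d0
        start = k * block
        for n in range(start, min(start + block, len(samples))):
            if n - delay >= 0:
                # Echoed sample = original + attenuated, delayed original.
                out[n] = samples[n] + alpha * samples[n - delay]
    return out


# Two blocks, each starting with a unit impulse, carrying the bits 0 and 1.
signal = [1.0] + [0.0] * 7 + [1.0] + [0.0] * 7
echoed = embed_echo(signal, [0, 1])
```

With impulses as input the echoes are easy to see: the first block has its echo two samples after the impulse, the second block three samples after.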

Audio: Least Significant Bit

The LSB of an audio sample is the bit that contributes least to your perception.

It can be altered without adversely affecting quality!

Attack: randomly replace the least significant bit of every sample!

[Figure: audio samples p0, p1, p2.]
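A minimal sketch of LSB marking and of the randomization attack, assuming integer audio samples held in a Python list (the function names and sample values are made up for illustration):

```python
import random


def embed_lsb(samples, bits):
    """Overwrite the least significant bit of the first len(bits) samples."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples to hold the mark")
    out = list(samples)
    for k, bit in enumerate(bits):
        out[k] = (out[k] & ~1) | bit
    return out


def extract_lsb(samples, n):
    """Read back the least significant bit of the first n samples."""
    return [s & 1 for s in samples[:n]]


def randomize_lsb(samples, rng):
    """The attack: replace every LSB with a random bit."""
    return [(s & ~1) | rng.getrandbits(1) for s in samples]


samples = [1000, -2001, 37, 12, -8, 255, 0, 1]
mark = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed_lsb(samples, mark)
attacked = randomize_lsb(marked, random.Random(1))
```

Both embedding and the attack change each sample by at most 1, which is why the mark is inaudible and also why it is trivially destroyed.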

Image: Patchwork

Embed a single bit by manipulating the brightness of pixels.

Use a pseudo-random number sequence to trace out pairs (A, B) of pixels.

During embedding adjust the brightness of A up by a small amount, and B down by the same small amount:

Patchwork: Embedding algorithm

Embed(P, key):

1 Init RND(key); δ ← 5

2 i ← RND(); j ← RND()

3 Adjust the brightness of pixels a_i and b_j: a_i ← a_i + δ; b_j ← b_j − δ

4 repeat from 2 ≈ 10000 times

Patchwork: Recognition algorithm

Recognize(P, key):

1 Init RND(key); S ← 0

2 i ← RND(); j ← RND()

3 S ← S + (a_i − b_j)

4 repeat from 2 ≈ 10000 times

5 if S ≫ 0 ⇒ output "marked!"
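The two Patchwork routines can be sketched together in Python. Several things here are assumptions made for the sake of a runnable example: the image is a flat list of brightness values, both pixels of a pair come from the same image, the key is an integer seed, and "S ≫ 0" is interpreted as "S exceeds δ times the number of pairs" (half of the roughly 2δ per pair that embedding adds).

```python
import random

DELTA = 5        # brightness adjustment, as in the slides
PAIRS = 10000    # number of pixel pairs, as in the slides


def _pairs(key, size, count):
    """Regenerate the key-dependent sequence of pixel index pairs."""
    rng = random.Random(key)
    return [(rng.randrange(size), rng.randrange(size)) for _ in range(count)]


def embed(pixels, key, delta=DELTA, count=PAIRS):
    """Brighten pixel i and darken pixel j for every pair (clamped to 0..255)."""
    marked = list(pixels)
    for i, j in _pairs(key, len(pixels), count):
        marked[i] = min(255, marked[i] + delta)
        marked[j] = max(0, marked[j] - delta)
    return marked


def score(pixels, key, count=PAIRS):
    """Sum of brightness differences over the same pairs."""
    return sum(pixels[i] - pixels[j] for i, j in _pairs(key, len(pixels), count))


def recognize(pixels, key, delta=DELTA, count=PAIRS):
    """Assumed threshold: report a mark when the score exceeds delta * count."""
    return score(pixels, key, count) > delta * count


# Hypothetical 256x256 image with mid-range random brightness values.
image_rng = random.Random(2011)
image = [image_rng.randrange(50, 201) for _ in range(256 * 256)]
marked = embed(image, key=271828)
```

For an unmarked image, or with the wrong key, the differences average out and the score stays near zero; with the right key every pair contributes about 2δ. Note that the recognizer needs only the marked image and the key, not the original.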

Blind vs. Informed

Watermarking recognizers are either blind or informed.

To extract a blind mark you need the marked object and the secret key.

To extract an informed mark you need extra information, such as the original, unwatermarked object.

Watermarking text

Cover object types:

the text itself with formatting (ASCII text);
free-flowing text; or
an image of the text (PostScript or PDF).

Watermarking Text: PDF

Similar to marking images.

Example: encode a 0-bit or a 1-bit by changing word/line spacing.

[Figure: the lines "I saw the best minds / of my generation, / starving hysterical naked" set twice, once with line spacings of 12pt and 12pt and once with line spacings of 12pt and 14pt.]

Watermarking Text: formatted ASCII

Encode the mark in white-space: 1 space = 0-bit, 2 spaces = 1-bit (␣ marks a space):

Original text:

I␣saw␣the␣best␣minds
of␣my␣generation,
starving␣hysterical␣naked

Marked text:

I␣␣␣saw␣␣the␣best␣␣␣minds
of␣␣␣␣␣␣␣my␣␣␣generation,
starving␣hysterical␣naked
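The 1-space/2-space rule is simple enough to sketch directly. The sketch below assumes a single line of text whose words are separated by spaces, and the function names are hypothetical:

```python
import re


def embed_bits(text, bits):
    """Re-space a line of text: 1 space encodes a 0-bit, 2 spaces a 1-bit."""
    words = text.split()
    if not words or len(bits) > len(words) - 1:
        raise ValueError("not enough word gaps to hold the mark")
    parts = [words[0]]
    for k, word in enumerate(words[1:]):
        bit = bits[k] if k < len(bits) else 0   # unused gaps stay single
        parts.append(" " * (2 if bit else 1))
        parts.append(word)
    return "".join(parts)


def extract_bits(text, n):
    """Read n bits back from the widths of the first n gaps."""
    gaps = re.findall(r" +", text.strip())
    return [1 if len(gap) >= 2 else 0 for gap in gaps[:n]]


cover = "I saw the best minds of my generation"
mark = [1, 0, 1, 1, 0, 1]
marked = embed_bits(cover, mark)
```

The words themselves are untouched, so the mark disappears as soon as anyone normalizes the white-space.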

Watermarking Text: Synonym replacement

Replace words with synonyms.

Insert spelling or punctuation errors.

Original text:

I saw the best minds
of my generation,
starving hysterical naked

Marked text:

I observed the choice intellects
of my generation,
famished hysterical nude

Watermarking Text: Syntax

Encode a mark in the syntactic structure of an English text:

1 Devise an extract function which computes a bit from a sentence.
2 Modify the sentence until it embeds the right bit.

Original text:

I saw the best minds
of my generation,
starving hysterical naked

Marked text:

It was the best minds
of my generation that I saw,
starving hysterical naked

Watermarking Text: Atallah et al.

1 Chunk up the watermark, embed one piece per sentence.
2 A function computes one bit per syntax tree node.
3 Modify sentence until these bits embed a watermark chunk.
4 A marker sentence precedes every watermark-bearing sentence.

Original text:

I saw the best minds
of my generation,
starving hysterical naked

Marked text:

I saw the best minds of my
generation. They were starving
hysterical naked. None, baby,
none were smarter than them. Nor
more lacking in supply of essential
nutrients or in more need of
adequate clothing. Baby.

Watermarking Software (p. 478)

Static watermarks

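[Figure: a static embedder takes the program P, the watermark w and a key, and produces the marked program P′; a static extractor takes P′ and the key and returns w.]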

You care about

Encoding bitrate

Stealth

Resilience to attack

Ideas for Software Watermark Algorithms

Encode the watermark

in a permutation of a language structure

in an embedded media object

in a statistical property of the program

as a solution to a static analysis problem

in the topology of a CFG

Dynamic watermarks

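[Figure: a dynamic embedder takes the program P and the watermark w and produces P′; a dynamic extractor recovers w from P′. Both are given the input sequence I1, ..., Ik.]

Encode the watermark in the runtime state of the program.

Dynamic marks appear more robust, but are more cumbersome to use.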

Attacks against software watermarks

[Figure: the static pipeline again: the embedder takes P, w and a key and produces P′; the extractor takes P′ and the key and returns w.]

The adversary knows the algorithm.

The adversary has complete access to the program.

The adversary doesn’t know the key.

The adversary doesn’t know the embedding location (it’s key dependent).

Attacks — Rewrite attack

Alice has to assume that Bob will try to destroy her marks before trying to resell the program!

One attack will always succeed...

[Figure: a rewrite attack turns P′, which carries the mark 42, into P″; Extract on P″ returns "?".]

Ideally, this is the only effective attack.

Attacks — Additive attack

Bob can also add his own watermarks to the program:

[Figure: an additive attack takes P′, which carries Alice's mark 42, and adds further marks to produce P″; Extract on P″ returns "?".]

An additive attack can help Bob to cast doubt in court as to whose watermark is the original one.

Attacks — Distortive attack

A distortive attack applies semantics-preserving transformations to try to disturb Alice’s recognizer:

[Figure: Extract returns 42 on P′; a distortive attack applies semantics-preserving transformations to P′ and produces P″, on which Extract returns "?".]

Transformations: code optimizations, obfuscations, ...

Attacks — Collusive attack

Bob buys two differently marked copies and compares them to discover the location of the fingerprint:

[Figure: a collusive attack takes two copies, P1 (fingerprint 42) and P2 (fingerprint 17), and produces P″; Extract on P″ returns "?".]

Alice should apply a different set of obfuscations to each distributed copy, so that comparing two copies of the same program will yield little information.
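A toy illustration of why collusion works, and why per-copy obfuscation helps. The "programs" here are hypothetical lists of instruction strings, and the comparison is a naive position-by-position diff rather than a real binary-diffing tool:

```python
def differing_positions(copy_a, copy_b):
    """Return the indices at which two equally long copies differ."""
    if len(copy_a) != len(copy_b):
        raise ValueError("this naive diff needs copies of equal length")
    return [k for k, (a, b) in enumerate(zip(copy_a, copy_b)) if a != b]


# Two copies that differ only in the fingerprint constant (42 vs. 17).
p1 = ["load x", "push 42", "add", "store y", "ret"]
p2 = ["load x", "push 17", "add", "store y", "ret"]
leak = differing_positions(p1, p2)

# The same two copies after (made-up) per-copy obfuscation: register and
# instruction choices now differ too, so the diff no longer isolates the mark.
q1 = ["load r1", "push 42", "add", "store r2", "ret"]
q2 = ["load r7", "push 17", "sub -", "store r3", "ret"]
noise = differing_positions(q1, q2)
```

In the first pair the diff points straight at the fingerprint; in the second pair the fingerprint is only one of several differences.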

Watermarking Algorithms

Watermarking by Permutation (p. 486)

Algorithm wmDM: Reordering Basic Blocks (p. 488)

[Figure: the basic blocks B0 through B7 of a function, shown in their original layout order (B0, B1, B2, B3, B4, B5, B6, B7) and in a permuted order (B7, B4, B3, B6, B1, B2, B5, B0), with goto statements inserted to preserve the control flow.]

Performance overhead of 0-11% for three standard high-performance computing benchmarks.

Negligible slowdown for a set of Java benchmarks.

If you have m items to reorder you can encode

$\log_2(m!) \approx \log_2\left(\sqrt{2\pi m}\,(m/e)^m\right) = O(m \log m)$

watermarking bits.

What about stealth?
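A minimal sketch of how a number can be turned into an ordering of m items and back. It uses the factorial number system, which is an illustrative choice and not necessarily the encoding wmDM itself uses; the class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

final class PermutationMark {
    /** Whole bits that an ordering of m items can carry: floor(log2(m!)). */
    static int capacityBits(int m) {
        BigInteger factorial = BigInteger.ONE;
        for (int i = 2; i <= m; i++) factorial = factorial.multiply(BigInteger.valueOf(i));
        return factorial.bitLength() - 1;
    }

    /** Map a watermark w, 0 <= w < m!, to a permutation of the block indices 0..m-1. */
    static int[] encode(BigInteger w, int m) {
        if (w.signum() < 0) throw new IllegalArgumentException("w must be non-negative");
        // Factorial-base digits: digit[k] says which of the remaining items goes to position k.
        int[] digit = new int[m];
        for (int radix = 1; radix <= m; radix++) {
            BigInteger[] qr = w.divideAndRemainder(BigInteger.valueOf(radix));
            digit[m - radix] = qr[1].intValue();
            w = qr[0];
        }
        if (w.signum() != 0) throw new IllegalArgumentException("w must be smaller than m!");
        List<Integer> remaining = new ArrayList<>();
        for (int i = 0; i < m; i++) remaining.add(i);
        int[] perm = new int[m];
        for (int k = 0; k < m; k++) perm[k] = remaining.remove(digit[k]); // remove by index
        return perm;
    }

    /** Recover the watermark from the observed order of the items. */
    static BigInteger decode(int[] perm) {
        int m = perm.length;
        List<Integer> remaining = new ArrayList<>();
        for (int i = 0; i < m; i++) remaining.add(i);
        BigInteger w = BigInteger.ZERO;
        for (int k = 0; k < m; k++) {
            int idx = remaining.indexOf(perm[k]);
            remaining.remove(idx); // remove by index
            w = w.multiply(BigInteger.valueOf(m - k)).add(BigInteger.valueOf(idx));
        }
        return w;
    }
}
```

With m = 8 basic blocks, capacityBits(8) is 15, since 8! = 40320 lies between 2^15 and 2^16.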

Algorithm wmVVS: Watermarks in CFGs (p. 506)

Basic idea:

1. Embed the watermark in the CFG of a function.
2. Tie the CFG tightly to the rest of the program.

Issues:

1. How do you encode a number in a CFG?
2. How do you find the watermark CFG?
3. How do you attach the watermark CFG to the rest of the program?

Algorithm wmVVS: Embedding

Generate a stealthy watermark CFG:

1. basic blocks have out-degree of one or two
2. it is reducible
3. it is shallow (real code isn’t deeply nested)
4. it is small (real functions aren’t big)
5. it is resilient to edge-flips

[Figure: a block ending in "if a>=b goto Bj", with successors Bj and Bk, is equivalent to one ending in "if a<b goto Bk", with the two outgoing edges swapped.]

Reducible Permutation Graphs (RPGs)
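A sketch of how properties 1 and 2 could be checked for a candidate watermark CFG. It assumes a hypothetical adjacency-list representation in which block 0 is the entry, every block is reachable from it, and an exit block may have no successors. Reducibility is tested with the classic T1/T2 collapsing transformations; this is a textbook check, not a step taken from the slides.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CfgShape {
    /** Property 1 (relaxed for exit blocks): no block has more than two successors. */
    static boolean outDegreeAtMostTwo(int[][] succ) {
        for (int[] s : succ) if (s.length > 2) return false;
        return true;
    }

    /** Property 2: the graph collapses to a single node under T1 and T2. */
    static boolean isReducible(int[][] succ) {
        int n = succ.length;
        List<Set<Integer>> out = new ArrayList<>();
        List<Set<Integer>> in = new ArrayList<>();
        for (int i = 0; i < n; i++) { out.add(new HashSet<>()); in.add(new HashSet<>()); }
        for (int b = 0; b < n; b++)
            for (int s : succ[b])
                if (s != b) { out.get(b).add(s); in.get(s).add(b); } // T1: self-loops are dropped

        boolean[] gone = new boolean[n];
        int alive = n;
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int b = 1; b < n; b++) {           // the entry block 0 is never merged away
                if (gone[b] || in.get(b).size() != 1) continue;
                int p = in.get(b).iterator().next();
                // T2: merge b into its unique predecessor p.
                out.get(p).remove(b);
                for (int s : out.get(b)) {
                    in.get(s).remove(b);
                    if (s != p) { out.get(p).add(s); in.get(s).add(p); } // p->p would be a self-loop
                }
                gone[b] = true;
                alive--;
                changed = true;
            }
        }
        return alive == 1;
    }
}
```

An if-else diamond or a simple loop passes both checks; the two-entry loop 0→{1,2}, 1→{2}, 2→{1} is the standard irreducible counterexample.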

// Generated watermark function: its control-flow graph encodes the mark.
public static int bogus;

public static int m4(int i) {
    i = i & 0x7BFF;
    bogus += 2;
    i -= i >> 2;
    do {
        i = i >> 3;
        label: {
            if (++bogus <= 0) {
                i = i | 0x1000;
                if ((bogus += 6) == 0)
                    break label;
            }
            ++bogus;
            i = i * 88 >>> 1;
        }
        i = i | 0x4;
    } while ((bogus += 6) < 0);
    bogus += 2;
    return i;
}

public void P(boolean S) {
    if (S)
        System.out.println("YES");
    else
        System.out.println("NO");
}

public void main(String args[]) {
    for (int i = 1; i < args.length; i++) {
        if (args[0].equals(args[i])) {
            P(true);
            if (m4(3) < 0)          // call into the watermark function
                P(false);
            return;
        }
    }
    m3(-1);                         // call into a second generated function
    P(false);
}

public int bogus;

public int m4(int i) {
    i = i & 0x7BFF;
    bogus += 2;
    i -= i >> 2;
    do {
        if (i < -6)
            P(bogus < i);           // call back into the cover program
        i = i >> 3;
        label: {
            if (++bogus <= 0) {
                i = i | 0x1000;
                m3(0);
                if ((bogus += 6) == 0)
                    break label;
            }
            ++bogus;
            i = i * 88 >>> 1;
        }
        i = i | 0x4;
    } while (((bogus += 6) < 0)
             && (m3(9) >= 0));
    bogus += 2;
    return i;
}

public int m3(int i) {
    i = i ^ i >> 0x1F;
    i = i / 4 * 3;
    do {
        i -= i >> 3;
        if ((bogus += 11) <= 0)
            break;
        // ...

Algorithm wmVVS: Recognition

So, how do you find the watermark CFG among all the “real” CFGs?

Idea: Mark the basic blocks, a 0 for every cover program block, a 1 for every watermark block.

Recognition procedure:

1. compute the mark value for each basic block in the program
2. assume that any function with more than t% of its blocks marked is a watermark function
3. construct CFGs for the watermark functions
4. decode each one into an integer watermark

The embedder can split the watermark into pieces, for higher bitrate.
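Step 2 of the recognition procedure can be sketched as follows, assuming the per-block mark bits have already been computed by some marking function (not shown here, since the slides do not specify it). The class name, method name and map layout are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class WatermarkLocator {
    /**
     * marks maps a function name to one mark bit per basic block (true = 1).
     * Returns the functions in which more than tPercent of the blocks are marked.
     */
    static List<String> candidateFunctions(Map<String, boolean[]> marks, double tPercent) {
        List<String> candidates = new ArrayList<>();
        for (Map.Entry<String, boolean[]> entry : marks.entrySet()) {
            boolean[] bits = entry.getValue();
            if (bits.length == 0) continue;        // nothing to vote on
            int marked = 0;
            for (boolean bit : bits) if (bit) marked++;
            if (100.0 * marked / bits.length > tPercent) candidates.add(entry.getKey());
        }
        return candidates;
    }
}
```

Using a threshold instead of demanding that every block be marked leaves some room for blocks that an attacker has altered; the functions returned here would then go on to steps 3 and 4.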

Steganographic Embeddings (p. 522)

[Figure: a picture, kitten.jpg, carrying several kinds of marks.]

Meta-data mark (visible, fragile): "Cute kitten in window in Venice"

Secret mark (invisible, fragile): "Attack mice at dawn"

Filtering mark (visible, robust): PG-13

Validation mark (visible, fragile): MD5(kitten.jpg), 0xc6ba8f25d2dfc44cf518d7f327c8e83f

Licensing mark (invisible, robust): <license object="kitten.jpg"> <grant to="Alice"> <right> copy-once </right> </grant> </license>

Authorship mark ((in)visible, robust): © 2006 Collberg

Fingerprint mark (invisible, robust): Customer #27182818, Customer #31415926

Watermark Embeddings

Watermarks are:

  short identifiers
  difficult to locate
  hard to destroy

The adversary:

  knows that the object is marked
  knows the algorithm used
  doesn’t know the key
  is active

You care about:

  data-rate
  stealth
  resilience

Steganographic Embeddings

Stegomarks are:

  long identifiers
  difficult to locate

The adversary:

  wants to know if the object is marked
  knows the algorithm used
  doesn’t know the key
  is passive

You care about:

  data-rate
  stealth

Steganography: Prisoners’ Problem

[Figure: the prisoners Alice and Bob exchange messages while the warden, Wendy, looks on; the message they want to hide is "ESCAPE AT DAWN!".]

Steganography: Null cipher

Easter is soon, dear! So many flowers! Can you smell them? Are you cold at night? Prison food stinks! Eat well, still! Are you lonely? The prison cat is cute! Don’t worry! All is well! Wendy is nice! Need you! ):
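In the cover text above, the hidden message sits in the first letter of each sentence. A small sketch that extracts it (the class and method names are made up for illustration, and sentences are assumed to end in '.', '!' or '?'):

```java
final class NullCipher {
    /** Concatenate the first letter of every sentence in the cover text. */
    static String firstLetters(String cover) {
        StringBuilder hidden = new StringBuilder();
        boolean atSentenceStart = true;
        for (int i = 0; i < cover.length(); i++) {
            char c = cover.charAt(i);
            if (c == '.' || c == '!' || c == '?') {
                atSentenceStart = true;            // the next letter opens a new sentence
            } else if (atSentenceStart && Character.isLetter(c)) {
                hidden.append(Character.toUpperCase(c));
                atSentenceStart = false;
            }
        }
        return hidden.toString();
    }
}
```

Applied to the letter above, firstLetters returns ESCAPEATDAWN.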

Algorithm wmASB: Hidden Messages in x86 Binaries (p. 523)

[Figure: toolchain diagram. The sources x.c and y.c are compiled with cc into x.o and y.o and linked with ld; the executables a0.out, a1.out and a2.out are related by the steps wm -E, wm -D and strip, which act on the mark (wm) and the symbol table (sym).]

Basic idea: Play compiler! Whenever the compiler has a choice in which code to generate, or the order in which to generate it, pick the choice that embeds the next bits from the message W.

Four sources of ambiguity:

1. code layout (ordering of chains of basic blocks)
2. instruction scheduling (instruction order within basic blocks)
3. register allocation
4. instruction selection

wmASB: Embedding

1. Construct:

   1. a codebook B of equivalent instruction sequences, for example three sequences that all set ri to 5*x:

          mul ri,x,5

          shl ri,x,2
          add ri,ri,x

          add ri,x,x
          add ri,ri,ri
          add ri,ri,x

   2. a statistical model M of real code

2. Encrypt W with the key.

3. Canonicalize P:

   1. Sort block chains, procedures, modules.
   2. Order instructions in each block in a standard order.
   3. Replace each instruction with the first alternative from B.

4. Code layout: Embed bits from W by reordering code segments within the executable.

5. Instruction scheduling:

   1. Build the dependency graph.
   2. Generate all valid instruction schedules.
   3. Embed bits from W by picking a schedule.

   Use M to avoid picking unusual schedules.

6. Instruction selection: Use B to embed bits from W by replacing instructions. Use M to avoid unusual instruction sequences.
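A sketch of the instruction-selection idea: each site that has k equivalent alternatives in the codebook carries one base-k digit of the message. The mixed-radix scheme and all names here are assumptions for illustration; encryption of W and the statistical model M are left out, and index 0 of each alternative list plays the role of the canonical form.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CodebookEmbedder {
    /** The three equivalent sequences for ri = 5*x from the codebook example. */
    static final List<String> TIMES_FIVE = Arrays.asList(
        "mul ri,x,5",
        "shl ri,x,2; add ri,ri,x",
        "add ri,x,x; add ri,ri,ri; add ri,ri,x");

    /** Pick one alternative per site so that the choices spell w in mixed radix. */
    static List<String> embed(BigInteger w, List<List<String>> sites) {
        if (w.signum() < 0) throw new IllegalArgumentException("w must be non-negative");
        List<String> chosen = new ArrayList<>();
        for (List<String> alternatives : sites) {
            BigInteger[] qr = w.divideAndRemainder(BigInteger.valueOf(alternatives.size()));
            chosen.add(alternatives.get(qr[1].intValue()));
            w = qr[0];
        }
        if (w.signum() != 0) throw new IllegalArgumentException("message too long for this program");
        return chosen;
    }

    /** Read the digits back by looking up which alternative appears at each site. */
    static BigInteger extract(List<String> chosen, List<List<String>> sites) {
        BigInteger w = BigInteger.ZERO;
        for (int i = sites.size() - 1; i >= 0; i--) {
            List<String> alternatives = sites.get(i);
            int choice = alternatives.indexOf(chosen.get(i));
            w = w.multiply(BigInteger.valueOf(alternatives.size())).add(BigInteger.valueOf(choice));
        }
        return w;
    }
}
```

Three sites that each multiply by five can hold any value from 0 to 26, a little under five bits; a stealth-aware embedder would use fewer of the alternatives than this.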

wmASB: Stealth

Instruction selection:

  There are 3078 different encodings of three instructions for EAX=(EAX/2)!
  Most don’t occur in real code...

Instruction scheduling:

  Avoid bad schedules: no compiler would generate them!
  Avoid generating different schedules for two blocks with the same dependency graph!

Code layout:

  Compilers lay out code for locality: don’t deviate too much from that!
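Embedding through instruction scheduling can be sketched as follows: enumerate every instruction order that respects the dependency graph, then let the message select one. The deps layout and all names are hypothetical. Exhaustive enumeration is exponential, so it only makes sense for small blocks, and the stealth filtering described above is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

final class ScheduleEmbedder {
    /** All valid orders of instructions 0..n-1; deps[i] lists what must come before i. */
    static List<List<Integer>> validSchedules(int[][] deps) {
        List<List<Integer>> all = new ArrayList<>();
        extend(deps, new boolean[deps.length], new ArrayList<>(), all);
        return all;
    }

    private static void extend(int[][] deps, boolean[] placed,
                               List<Integer> prefix, List<List<Integer>> all) {
        if (prefix.size() == deps.length) {
            all.add(new ArrayList<>(prefix));
            return;
        }
        for (int i = 0; i < deps.length; i++) {
            if (placed[i] || !ready(deps[i], placed)) continue;
            placed[i] = true;
            prefix.add(i);
            extend(deps, placed, prefix, all);
            prefix.remove(prefix.size() - 1);      // backtrack (remove by index)
            placed[i] = false;
        }
    }

    private static boolean ready(int[] needs, boolean[] placed) {
        for (int d : needs) if (!placed[d]) return false;
        return true;
    }

    /** Embed value, 0 <= value < number of schedules, by picking that schedule. */
    static List<Integer> embed(int value, int[][] deps) {
        return validSchedules(deps).get(value);
    }

    /** Recover the value from the schedule found in the binary, or -1 if it is invalid. */
    static int extract(List<Integer> schedule, int[][] deps) {
        return validSchedules(deps).indexOf(schedule);
    }
}
```

For a block of four instructions in which instruction 2 depends on 0 and 1 and instruction 3 is independent, there are 8 valid schedules, so that block can carry 3 bits.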

Encoding rate:

  Unstealthy code: 1/27
  Stealthy: 1/89

Encoding space:

  58% from code layout
  25% from instruction scheduling
  17% from instruction selection

Real code doesn’t use unusual instruction sequences.

Real code contains many schedules for the same dependency graph.

Wanna design a watermarking algorithm?

Find a language structure into which to encode the mark (CFGs, threads, dynamic control flow...)

Construct an encoder/decoder (number↔CFG, ...)

Construct a tracer/locator to find locations for the mark (using key, every function, ...)

Construct an embedder/extractor to tie the mark to surrounding code

Decide on an attack model.

〈language structure, encoder/decoder, tracer/locator, embedder/extractor, attacker/protector〉