Statistics and Steganalysis
CSM25 Secure Information Hiding
Dr Hans Georg Schaathun
University of Surrey
Spring 2008
Dr Hans Georg Schaathun Statistics and Steganalysis Spring 2008 1 / 42
Learning Outcomes
After this session, everyone should
  understand how statistical methods apply to steganography
  understand how a statistical hypothesis can be used
  be able to implement the basic χ² test of steganalysis
Suggested Reading
Core Reading
Cox et al. Chapter 13.
Suggested Reading
«Higher-order statistical steganalysis of palette images» by Jessica Fridrich, Miroslav Goljan, David Soukal, in Proc. SPIE Electronic Imaging, Jan 2003, pp. 178-190
General Introduction Statistical models
Outline
1 General Introduction
  Statistical models
  Histogramme
2 The χ² test
  Pairs of Values
  A visual approach
  Hypothesis testing
  The error types
3 Postlogue
  Generalised χ² test
  Summary
The fundamental question
Wendy the Warden intercepts an image. Is it a stegogramme?
  Is it a probable natural image?
  Is it a probable stegogramme?
The answer depends on a model for natural images.
  Statistical models and probability distributions
With a perfect model, Alice and Bob could use a cipher whose ciphertexts are distributed as natural images.
If Wendy has a better model than Alice and Bob, then she can do effective steganalysis.
In reality, we do not know what a natural image looks like.
A visual example
Two different patterns in the LSB plane, separated by a sharp border. Why?
Is there a corresponding border in the full image?
  No explanation in the full image ⇒ probably stego ...
  ... but not certain
The remit of statistics
Statistics can estimate ‘normal’ behaviour and compare behaviours.
Advantages
  Automated decisions
  Extract detail
  Exact, quantifiable features
  Aggregate measures
General Introduction Histogramme
A typical image
Image histogram made by imhist in Matlab
  Gives the number of pixels per colour value
And a stego-image
What happened?
Histogram of the stego-image: more ragged
  Every other bar sticks out. Why?
  50.8% 1s in the binary message.
What is characteristic?
Pairs of values
Consider colour 2i (i = 0, 1, ..., 127). What happens under LSB embedding?
  2i → 2i or 2i + 1
  Never 2i → 2i − 1
Likewise 2i + 1 → 2i or 2i + 1
(2i, 2i + 1) is a Pair of Values
  A pixel in (2i, 2i + 1) before embedding ...
  ... is a pixel in (2i, 2i + 1) after embedding
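The pair property can be checked numerically. The following is a minimal sketch, assuming LSB replacement on an 8-bit greyscale image; the random `cover` and `bits` arrays are stand-ins invented for illustration, not data from the lecture.

```matlab
% Sketch: LSB replacement never moves a pixel out of its pair (2i, 2i+1).
cover = randi([0 255], 64, 64, 'uint8');   % hypothetical 8-bit cover image
bits  = randi([0 1], 64, 64, 'uint8');     % hypothetical message, one bit per pixel

% Clear the least significant bit, then write the message bit into it.
stego = bitor(bitand(cover, uint8(254)), bits);

% The pair index i is the value with its LSB dropped, i.e. floor(value / 2).
pairBefore = bitshift(cover, -1);
pairAfter  = bitshift(stego, -1);

assert(isequal(pairBefore, pairAfter));    % every pixel stays in its pair
assert(max(abs(double(stego(:)) - double(cover(:)))) <= 1);  % change of at most 1
```

Both assertions pass for any cover and any message, because only the last bit is ever rewritten.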
The χ² test Pairs of Values
Pairs of Values
The statistic
Image $X$. Random variable $Y_k = \#\{(x, y) \mid X_{xy} = k\}$.
  The $Y_k$ form the histogramme.
Recall that $(2l, 2l + 1)$ is a pair of values.
  The first 7 bits of a pixel are determined by the image colour,
    i.e. which pair.
  The last bit (LSB) is determined by the message,
    i.e. which half of the pair.
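As an illustration of these definitions, the sketch below counts the histogramme and splits each pixel value into its pair index and its LSB. It uses `accumarray` from base Matlab rather than `imhist`, and the random image is a made-up placeholder.

```matlab
% Sketch: histogramme Y_k, and the split of a pixel value into pair and LSB.
X = randi([0 255], 64, 64, 'uint8');       % hypothetical image, not lecture data

% Y(k+1) = number of pixels (x, y) with X(x, y) = k, for k = 0..255
% (Matlab indices start at 1, so colour k is stored at index k+1).
Y = accumarray(double(X(:)) + 1, 1, [256 1]);
assert(sum(Y) == numel(X));                % every pixel is counted exactly once

% First 7 bits: which pair l. Last bit: which half of the pair.
l   = bitshift(X, -1);                     % pair index, 0..127
lsb = bitand(X, uint8(1));                 % 0 = even half (2l), 1 = odd half (2l+1)
assert(isequal(X, 2 * l + lsb));           % value = 2l + LSB
```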
Pairs of Values
Expected behaviour
The sum $Y_{2l} + Y_{2l+1}$ is unaffected by embedding.
For a random message
  expect a 50-50 split between $2l$ and $2l + 1$,
  i.e. $E(Y_{2l}) = \frac{1}{2}(Y_{2l} + Y_{2l+1})$.
Can we make a statistic out of this?
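The two claims on this slide can be sketched on synthetic data: the pair totals survive embedding, and the even count in each pair ends up close to half the total. The image, the message and all variable names here are hypothetical, and a full-capacity random message is assumed.

```matlab
% Sketch: pair totals are invariant, and E(Y_2l) is half the pair total.
cover = randi([0 255], 128, 128, 'uint8');   % hypothetical cover image
bits  = randi([0 1], 128, 128, 'uint8');     % hypothetical random message
stego = bitor(bitand(cover, uint8(254)), bits);

% Row 1 holds the counts of the even values 2l, row 2 those of the odd values 2l+1.
Ycover = reshape(accumarray(double(cover(:)) + 1, 1, [256 1]), 2, 128);
Ystego = reshape(accumarray(double(stego(:)) + 1, 1, [256 1]), 2, 128);

totals = sum(Ycover, 1);                     % Y_2l + Y_{2l+1} for each pair
assert(isequal(totals, sum(Ystego, 1)));     % unaffected by embedding

expectedEven = totals / 2;                   % E(Y_2l) under a random message
deviation    = Ystego(1, :) - expectedEven;  % observed minus expected
fprintf('largest deviation from the 50-50 split: %.1f pixels\n', max(abs(deviation)));
```

The deviations printed at the end are what the χ² statistic aggregates into a single number.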
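The χ² statistic
\[ S = \sum_{o \in \Omega} \frac{(F_o - E(F_o))^2}{E(F_o)} \quad \text{(general $\chi^2$ statistic)} \]
\[ S = \sum_{l=0}^{127} \frac{\left(Y_{2l} - \tfrac{1}{2}(Y_{2l} + Y_{2l+1})\right)^2}{\tfrac{1}{2}(Y_{2l} + Y_{2l+1})} \quad \text{(pairs of values)} \]
Definition
\[ S_{\mathrm{PoV}} = \sum_{l=0}^{127} \frac{1}{2} \cdot \frac{(Y_{2l} - Y_{2l+1})^2}{Y_{2l} + Y_{2l+1}} \]
$\#\Omega - 1$ degrees of freedom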
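A minimal sketch of how the definition might be evaluated follows. It implements the formula exactly as written on the slide; skipping pairs whose total is zero is an assumption made here to avoid dividing by zero, in line with the later remark that pixel values which do not occur may have to be excluded. The test image is a random placeholder.

```matlab
% Sketch: the Pairs-of-Values statistic S_PoV from a 256-bin histogramme.
img = randi([0 255], 128, 128, 'uint8');     % hypothetical image under test
Y   = reshape(accumarray(double(img(:)) + 1, 1, [256 1]), 2, 128);

Yeven = Y(1, :);                             % Y_2l
Yodd  = Y(2, :);                             % Y_{2l+1}
total = Yeven + Yodd;

used = total > 0;                            % pairs that actually occur
S    = sum(0.5 * (Yeven(used) - Yodd(used)).^2 ./ total(used));
dof  = nnz(used) - 1;                        % #Omega - 1 over the pairs in use

fprintf('S_PoV = %.2f with %d degrees of freedom\n', S, dof);
```

When all 128 pairs occur, `dof` is 127, the value used on the following slides.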
The χ² PDF [figure]
The Pairs-of-Values χ² Distribution
χ² PDF, 127 degrees of freedom
  Red: 2% probability
  + Green: 5%
  + Blue: 10%
Cumulative Distribution Function (CDF)
  Area under the curve
χ² in Matlab
Defined in the Statistics toolbox
Simplified functions available on the website:
  chi2cdf
  chi2pdf
  chi2inv
You may have to exclude pixel values which do not occur;
  this may give fewer degrees of freedom.
The χ² test A visual approach
The p-value
Let S be a stochastic χ² distributed variable.
Let s be the observed χ² statistic.
Define the p-value: p = P(S > s)
  i.e. a low p-value ⇒ s is unusually large.
  This is improbable if the image is a stegogramme.
  Conclusion: probably a natural image.
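A short sketch of the computation, taking the p-value as the upper tail P(S > s). That is the reading which matches the rejection rule "p < α ⇒ reject H0" combined with "reject when the statistic exceeds the threshold T" on the hypothesis-testing slides. The observed value `s` below is invented for illustration.

```matlab
% Sketch: upper-tail p-value of an observed chi-square statistic.
s   = 150;                        % hypothetical observed statistic
dof = 127;                        % degrees of freedom used on the slides

p = 1 - chi2cdf(s, dof);          % p = P(S > s) for S chi-square distributed
% Without the Statistics toolbox, gammainc(s/2, dof/2) gives the same CDF value.

fprintf('p-value = %.4f\n', p);
```

A natural image typically gives a very large s and hence a p-value near zero, whereas a fully embedded image gives a moderate s and a p-value well away from zero.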
Plots of the χ² statistic and the p-value [figures]
  No message
  30% of capacity
  60% of capacity
  100% of capacity
The χ² test Hypothesis testing
The null hypothesis
Null hypothesis
  H0: The image X is a stegogramme.
Statistic with known distribution under H0
  S is χ² distributed with 127 degrees of freedom.
We decide on a threshold T such that
  Pr(S > T | H0) is small.
If the observed s > T, we reject H0.
Level of Significance
Before testing, choose the desired level of significance α.
The threshold T is taken such that Pr(S > T | H0) ≤ α.
  If we observe s > T, we reject H0 at significance level α.
  If we observe s ≤ T, we cannot reject H0 at significance level α.
Equivalently, compare the p-value against α:
  p < α ⇒ reject H0
Remark
  If H0 is true, the probability that the hypothesis test gives the wrong conclusion is at most α.
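The equivalence of the two decision rules can be sketched as follows, with the p-value taken as the upper tail P(S > s). The level `alpha` and the observed statistic `s` are illustrative values chosen here, not figures from the lecture.

```matlab
% Sketch: threshold rule and p-value rule give the same decision.
alpha = 0.05;                     % level of significance, fixed before testing
dof   = 127;                      % degrees of freedom used on the slides
T     = chi2inv(1 - alpha, dof);  % threshold with Pr(S > T | H0) = alpha

s = 150;                          % hypothetical observed statistic
p = 1 - chi2cdf(s, dof);          % upper-tail p-value

rejectByThreshold = s > T;
rejectByPValue    = p < alpha;
assert(rejectByThreshold == rejectByPValue);   % the two rules agree

if rejectByThreshold
    disp('Reject H0: probably a natural image');
else
    disp('Cannot reject H0: the image may be a stegogramme');
end
```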
Choosing the level of significance
Say you gather the data first, and then choose the level of significance.
  How does this influence the test?
  What is the error probability?
Tuning α to the observations means you can always reject the null hypothesis.
  The (a priori) error probability under H0 is 100%,
  or bounded by the maximum α you would have accepted.
The level of significance is only meaningful if chosen in advance.
Common misconceptions
After the test, when we have or have not rejected H0:
  The probability that H0 is correct is not α.
  The probability that H0 is false is not α either.
Remark
  There is no simple relation between the level of significance and the probability of any hypothesis being right or wrong.
In Matlab
Consider the relation between threshold and level of significance:
  Pr(S > T | H0) ≤ α
  alpha = 1 - chi2cdf(T, 127)
  T = chi2inv(1 - alpha, 127)
To plot the PDF:
  X = 0:1:300;
  plot(X, chi2pdf(X, 127))
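As a usage sketch of the two relations, the loop below converts a few significance levels into thresholds and back again. The three levels are the 10%, 5% and 2% tail probabilities shown on the χ² distribution slide; the variable names are chosen here for illustration.

```matlab
% Sketch: thresholds for several levels of significance, and the round trip.
dof    = 127;
alphas = [0.10 0.05 0.02];

T    = chi2inv(1 - alphas, dof);           % one threshold per level
back = 1 - chi2cdf(T, dof);                % recover each alpha from its threshold
assert(all(abs(back - alphas) < 1e-8));    % chi2inv and chi2cdf are inverses

for k = 1:numel(alphas)
    fprintf('alpha = %.2f  ->  T = %.2f\n', alphas(k), T(k));
end
```

A smaller α pushes the threshold further into the upper tail, so fewer images are declared natural.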
The χ² test The error types
Hypothesis tests
Hypothesis testing is a recurring theme in statistics.
Typical hypotheses
  Treatment A makes patients recover more quickly than no treatment.
  The climate in South-East Britain is as warm today as it was 100 years ago.
  The image sent by Alice is a stegogramme.
When the hypothesis has been phrased, experiments can tell us whether it is plausible or not.
Asymmetry of hypothesis testing
Treatment A makes patients recover more quickly than no treatment.
One error is more serious than the other.
  Type I: Accepting the hypothesis when it is wrong
    Patients get ineffective (or unhealthy) medicine.
  Type II: Rejecting the hypothesis when it is right
    More research will be done to optimise the treatment.
Here the claim about Treatment A plays the role of the alternative; H0 is that the treatment makes no difference, so accepting the claim means rejecting H0.

            H0 retained      H0 rejected
H0 true     No error         Error Type I
H0 false    Error Type II    No error
The weirdness of the steganalysis
H0: The message is a stegogramme.
We consider it (implicitly) serious to declare the message innocent when it is a stegogramme. Why?
  It makes for a strong surveillance regime.
  It might be appropriate for the prison scenario.
Real reason
  The probability distribution is known only for stegogrammes.
  We require a known distribution under H0.
Calculating probability of Type I Errors
Definition
  A Type I Error is the event that
    H0 is true; and
    H0 is rejected.
What is the error rate?
  We want to calculate the conditional probability
  Pr(Reject H0 | H0) = Pr(S > T | H0).
Because of H0, the distribution of the statistic S is known.
  Hence the error probability can be looked up.
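Written out for the Pairs-of-Values test, the look-up amounts to the following. The symbol $F_{127}$ is notation introduced here for the CDF of the χ² distribution with 127 degrees of freedom (what `chi2cdf(T, 127)` evaluates), and $T$ is the rejection threshold.

```latex
\[
  \Pr(\text{Type I Error}) = \Pr(\text{statistic} > T \mid H_0) = 1 - F_{127}(T),
  \qquad
  T = F_{127}^{-1}(1 - \alpha) \;\Longrightarrow\; \Pr(\text{Type I Error}) = \alpha .
\]
```

So choosing the threshold from the level of significance fixes the Type I error rate at α, provided the statistic really follows the assumed distribution under H0.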
Type II Errors
In theory: similar to Type I Errors.
In practice: what is the distribution of the statistic when H0 is false?
  Do we know this distribution at all?
Remark
  Very often, we will not know the error probability.
A problem of the χ² test
Accusing Alice of sending a stegogramme when she is not is called a false positive.
Suppose false positives are a serious matter.
  How can we limit the risk of false positives?
  False positives are Type II Errors.
  The distribution when H0 is false is unknown.
Remark
  We cannot (theoretically) bound the probability of false positives in the χ² test.
Postlogue Generalised χ² test
Randomised location
PoV assumes embedding in consecutive bits.
The generalised χ² test proposes a fix.
  Fridrich et al. (2003) suggest an implementation.
There is no rigorous hypothesis test or statistical theory behind it,
  but it works experimentally.
Postlogue Summary
Summary
Steganalysis can be cast as a problem of statistics;
  standard statistical theory applies.
The Pairs-of-Values χ² test is a simple example.
The weekly exercise is to implement and test this steganalysis technique.
  See the website for the detailed assignment.