Topics: Statistics & Experimental Design The Human Visual System Color Science Light Sources:...
Date posted: 12-Jan-2016
Topics:
Statistics & Experimental Design
The Human Visual System
Color Science
Light Sources: Radiometry/Photometry
Geometric Optics
Tone-transfer Function
Image Sensors
Image Processing
Displays & Output
Colorimetry & Color Measurement
Image Evaluation
Psychophysics
Design of experiments: Why is it important?
We wish to draw meaningful conclusions from the data collected
Statistical methodology is the only objective approach to analysis
Design of experiments:
1. Recognize the problem
2. Select the factors to be varied, and the levels and ranges over which they will be varied
3. Select the response variable
4. Choose the experimental design: Sample size? Blocking? Randomization?
5. Perform the experiment
6. Statistical analysis
7. Conclusions and recommendations
Let's start easy: We would like to compare the output of two systems. Design a testing protocol and run it several times.
Run   System A   System B
1     y1A        y1B
2     y2A        y2B
3     y3A        y3B
Visualize data: For small data sets, use a scatter plot.
[Figure: scatter plot of Output versus Run # for system_A and system_B]
Run   system_A   system_B
1     16.85      17.5
2     16.4       17.63
3     17.21      18.25
4     16.35      18
5     16.52      17.86
6     17.04      17.75
7     16.96      18.22
8     17.15      17.9
9     16.59      17.96
10    16.57      18.15
Visualize data: For larger data sets, use a histogram.
Divide the horizontal axis into intervals (bins).
Construct a rectangle over each interval with area proportional to the number (frequency, $n_i$) of observations in it.
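The binning step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `histogram_counts`, the bin range 16.0 to 18.5 and the choice of five bins are not from the slides, and the twenty outputs of both systems are pooled purely to have something to count. With equal-width bins, rectangle area is proportional to rectangle height, so the counts can be drawn directly.

```python
def histogram_counts(values, low, high, n_bins):
    """Count observations per equal-width bin on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not (low <= v <= high):
            continue  # ignore values outside the plotted range
        # A value equal to `high` is assigned to the last bin.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


# Outputs of system_A followed by system_B (pooled for illustration only).
outputs = [
    16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57,
    17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15,
]

LOW, HIGH, N_BINS = 16.0, 18.5, 5  # assumed plotting range and bin count
counts = histogram_counts(outputs, LOW, HIGH, N_BINS)

# Text rendering: one row per bin, one '#' per observation (frequency n_i).
bin_width = (HIGH - LOW) / N_BINS
for i, n_i in enumerate(counts):
    left = LOW + i * bin_width
    print(f"{left:5.1f} to {left + bin_width:5.1f} | {'#' * n_i}")
```

Changing `N_BINS` shows how strongly the apparent shape of a histogram depends on the bin width, which is one reason histograms are more useful for larger data sets.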
Statistical inference: Draw conclusions about a population using a sample from that population. Imagine a hypothetical population containing a large number N of observations. Denote the measure of location of the population as the mean $\mu$:
$\mu = \frac{1}{N} \sum_{i=1}^{N} y_i$
Statistical inference: Denote the spread of the population as the variance $\sigma^2$:
$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \mu)^2$
Statistical inference: A small group of observations is known as a sample. A statistic like the average $\bar{y}$ is calculated from a set of data considered to be a sample from a population.
Run   System A   System B
1     y1A        y1B
2     y2A        y2B
3     y3A        y3B
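Statistical inference: The sample variance supplies a measure of the spread of the sample. For a sample of n observations with average $\bar{y}$:
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$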
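As a quick check of these definitions, the sample average and sample variance of the two systems can be computed from the ten runs tabulated in the slides. This is a minimal sketch: the helper name `sample_variance` is made up for illustration, and it assumes the usual n - 1 divisor for the sample variance.

```python
from statistics import mean, variance

# Outputs of the ten runs listed in the slides.
system_a = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
system_b = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]


def sample_variance(values):
    """Sample variance with the n - 1 divisor, written out explicitly."""
    n = len(values)
    ybar = sum(values) / n
    return sum((y - ybar) ** 2 for y in values) / (n - 1)


mean_a, mean_b = mean(system_a), mean(system_b)
var_a, var_b = sample_variance(system_a), sample_variance(system_b)

print(f"System A: average = {mean_a:.3f}, s^2 = {var_a:.4f}")
print(f"System B: average = {mean_b:.3f}, s^2 = {var_b:.4f}")

# The standard library's variance() uses the same n - 1 definition.
assert abs(var_a - variance(system_a)) < 1e-12
```

The averages come out near 16.76 and 17.92, and the sample variances near 0.100 and 0.061.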
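Probability distribution functions (continuous): $P(a \le x \le b)$ is the area under the density curve between a and b.
Probability distribution functions (discrete): $P(x = x_i) = p(x_i)$ [Figure: $p(x_i)$ plotted against $x_i$]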
Mean, variance of pdf: The mean is a measure of central tendency or location.
The variance measures the spread or dispersion.
Normal distribution: $\sigma$ = standard deviation = $\sqrt{\sigma^2}$, $\mu$ = mean
Normal distribution, $N(\mu, \sigma^2)$: From the previous examples we can see that the mean $\mu$ and the variance $\sigma^2$ completely characterize the distribution.
Knowing the pdf of the population from which the sample is drawn lets us determine the pdf of a particular statistic.
Normal distribution: The probability that a positive deviation from the mean exceeds one standard deviation $\sigma$ is 0.1587, or about 1/6. This is the fraction of the total area under the curve beyond $\mu + \sigma$. (The same holds for a negative deviation.)
The probability that a deviation in either direction will exceed one standard deviation $\sigma$ is 2 x 0.1587 = 0.3174.
The chance that a positive deviation from the mean will exceed $2\sigma$ is 0.02275, or about 1/40.
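These tail areas can be reproduced without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `upper_tail` is an assumption made for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)


def upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 1.0 - std_normal.cdf(z)


p_one_sigma = upper_tail(1.0)      # positive deviation beyond one sigma
p_either_side = 2.0 * p_one_sigma  # deviation beyond one sigma, either direction
p_two_sigma = upper_tail(2.0)      # positive deviation beyond two sigma

print(f"P(Z > 1)   = {p_one_sigma:.4f}")
print(f"P(|Z| > 1) = {p_either_side:.4f}")
print(f"P(Z > 2)   = {p_two_sigma:.5f}")
```

Because deviations are measured in units of $\sigma$, the same three numbers apply to any normal distribution, whatever its mean and variance. The unrounded two-sided value is about 0.3173; the 0.3174 quoted above is twice the rounded one-sided value.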
Normal distribution: Sample runs differ as a result of experimental error.
Experimental error can often be described by a normal distribution.
Standard normal distribution, N(0,1): Values for N(0,1) are found in tables.
Standard normal distribution, N(0,1), example:
Suppose the outcome of a given experiment is approximately normally distributed with $\mu = 4.0$ and $\sigma = 0.3$. What is the probability that the outcome exceeds 4.4?
Standardizing gives $z = (4.4 - 4.0)/0.3 \approx 1.33$. Looking up this value in the standard normal table, we find that the probability is about 9%.
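The table lookup in this example can be checked numerically. This is a small sketch using `statistics.NormalDist`; it assumes the question asks for the probability that the outcome exceeds 4.4, since the probability of any single exact value is zero for a continuous distribution.

```python
from statistics import NormalDist

mu, sigma = 4.0, 0.3   # parameters given in the example
threshold = 4.4

# Standardize: z = (y - mu) / sigma
z = (threshold - mu) / sigma

# Upper-tail area, computed two equivalent ways.
p_from_z = 1.0 - NormalDist().cdf(z)
p_direct = 1.0 - NormalDist(mu, sigma).cdf(threshold)

print(f"z = {z:.3f}")
print(f"P(outcome > {threshold}) = {p_from_z:.4f}")
```

Both routes give roughly 0.09, which matches the 9% read from the table.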
$\chi^2$ distribution: Another sampling distribution that can be defined in terms of normal random variables.
Suppose $z_1, z_2, \ldots, z_k$ are normally and independently distributed random variables with mean $\mu = 0$ and variance $\sigma^2 = 1$ (NID(0,1)). Then let's define
$\chi^2 = z_1^2 + z_2^2 + \cdots + z_k^2$
where $\chi^2$ follows the chi-square distribution with k degrees of freedom.
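The definition can be illustrated by simulation: draw k independent N(0,1) values, square them, and add them up. The sketch below is a rough illustration only; the function name `chi_square_draw`, the seed, and the number of draws are arbitrary choices. It relies on the standard fact that a chi-square variable with k degrees of freedom has mean k.

```python
import random


def chi_square_draw(k, rng):
    """One chi-square value: the sum of k squared NID(0,1) draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(0)   # fixed seed so the run is repeatable
k = 5
n_draws = 20000

draws = [chi_square_draw(k, rng) for _ in range(n_draws)]
sample_mean = sum(draws) / n_draws

print(f"k = {k}: simulated mean = {sample_mean:.2f} (theory: {k})")
```

Repeating this for k = 1, 5, 10 and 15 and drawing histograms of `draws` shows how the skewed shape for small k becomes more symmetric as k grows.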
[Figure: $\chi^2$ distribution for k = 1, k = 5, k = 10, k = 15]
Student's t distribution: In practice we don't know the theoretical parameter $\sigma$.
This means we can't really compute the standardized variable z and refer the result to the table of the standard normal distribution.
Assume that the experimental standard deviation s can be used as an estimate of $\sigma$.
Student's t distribution: Define a new variable t by replacing $\sigma$ with its estimate s in the standardized variable.
It turns out that t has a known distribution.
It was deduced by Gosset in 1908
[Figure: Student's t distribution for k = 1, k = 10, k = 100]
Probability points are given in tables.
The form depends on the degree of uncertainty in $s^2$, measured by the number of degrees of freedom, k.
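The effect of the degrees of freedom can be seen in a small simulation. This sketch assumes the one-sample form $t = (\bar{y} - \mu)/(s/\sqrt{n})$ with k = n - 1 degrees of freedom, which may differ from the exact definition used on the original slide. The function name, seed, and trial count are illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu):
    """Assumed form: t = (ybar - mu) / (s / sqrt(n)), with s estimating sigma."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))


rng = random.Random(1)   # fixed seed so the run is repeatable
n = 5                    # sample size, giving k = n - 1 = 4 degrees of freedom
n_trials = 20000

t_values = [
    t_statistic([rng.gauss(0.0, 1.0) for _ in range(n)], 0.0)
    for _ in range(n_trials)
]

# For a standard normal variable, about 5% of values fall beyond +/-1.96.
tail_fraction = sum(abs(t) > 1.96 for t in t_values) / n_trials
print(f"k = {n - 1}: fraction with |t| > 1.96 = {tail_fraction:.3f}")
```

With only 4 degrees of freedom the fraction comes out well above 5%, reflecting the heavier tails caused by the uncertainty in $s^2$. Increasing `n` moves it toward the normal value.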
Inferences about differences in means: A statistical hypothesis is a statement about the parameters of a probability distribution.
Let's go back to the example we started with, i.e., the comparison of two imaging systems.
We may think that the mean performance measurements of the two systems are equal:
$H_0: \mu_A = \mu_B$
$H_1: \mu_A \neq \mu_B$
Hypothesis testing: The first statement is the null hypothesis; the second statement is the alternative hypothesis. In this case it is a two-sided alternative hypothesis.
How do we test a hypothesis? Take a random sample, compute an appropriate test statistic, and reject or fail to reject the null hypothesis $H_0$.
We need to specify a set of values for the test statistic that leads to rejection of H0. This is the critical region.
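Hypothesis testing: Two errors can be made:
Type I error: the null hypothesis is rejected when it is true
Type II error: the null hypothesis is not rejected when it is false
In terms of probabilities:
$\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$
$\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$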
Hypothesis testing: We need to specify a value for the probability of type I error, $\alpha$. This is known as the significance level of the test.
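What the significance level means in practice can be checked by simulation: when the null hypothesis is true, a test run at level $\alpha$ should reject in roughly a fraction $\alpha$ of repeated experiments. To keep the sketch short it uses a one-sample z-test with known $\sigma$, which is an assumption made for illustration and not the two-sample test used for the systems comparison. The values of `mu0`, `sigma`, `n`, and the seed are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)   # two-sided critical value, about 1.96


def rejects_h0(sample, mu0, sigma):
    """Two-sided z-test of H0: mu = mu0 with known sigma (illustrative test)."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return abs(z) > Z_CRIT


rng = random.Random(2)   # fixed seed so the run is repeatable
mu0, sigma, n, n_trials = 4.0, 0.3, 10, 5000

# Every sample is drawn with the true mean equal to mu0, so H0 is true
# and every rejection counted here is a type I error.
rejections = sum(
    rejects_h0([rng.gauss(mu0, sigma) for _ in range(n)], mu0, sigma)
    for _ in range(n_trials)
)
type_i_rate = rejections / n_trials

print(f"observed type I error rate: {type_i_rate:.3f} (alpha = {ALPHA})")
```

The observed rate should sit close to 0.05. Choosing a smaller $\alpha$ lowers it, at the cost of a higher type II error probability for a fixed sample size.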
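The test statistic for comparing the two systems is:
$t_0 = \frac{\bar{y}_A - \bar{y}_B}{S_p \sqrt{\frac{1}{k_A} + \frac{1}{k_B}}}$
where $k_A$ and $k_B$ are the numbers of runs for each system, $S_A^2$ and $S_B^2$ are the sample variances, and the pooled variance is
$S_p^2 = \frac{(k_A - 1) S_A^2 + (k_B - 1) S_B^2}{k_A + k_B - 2}$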
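Hypothesis testing: To determine whether to reject $H_0$, we would compare $t_0$ to the t distribution with $k_A + k_B - 2$ degrees of freedom.
If $|t_0| > t_{\alpha/2,\,k_A+k_B-2}$, we reject $H_0$ and conclude that the means are different.
We have (computed from the ten runs of each system):
                  System A   System B
Average           16.76      17.92
Sample variance   0.100      0.061
Number of runs    10         10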
Hypothesis testing: We have $k_A + k_B - 2 = 18$ degrees of freedom.
Choose $\alpha = 0.05$.
We would reject $H_0$ if $|t_0| > t_{0.025,18} = 2.101$.
Hypothesis testing: Since $t_0 = -9.13 < -t_{0.025,18} = -2.101$, we reject $H_0$ and conclude that the means are different.
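The test can be reproduced from the ten runs of each system. The following is a minimal sketch assuming the pooled-variance two-sample t statistic with $k_A + k_B - 2$ degrees of freedom. The function name `pooled_t_statistic` is made up for illustration, and the critical value 2.101 is simply copied from the slides, not computed.

```python
from math import sqrt
from statistics import mean, variance

# Outputs of the ten runs listed in the slides.
system_a = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
system_b = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]


def pooled_t_statistic(a, b):
    """Return (t0, degrees of freedom) for the pooled two-sample t-test."""
    k_a, k_b = len(a), len(b)
    dof = k_a + k_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((k_a - 1) * variance(a) + (k_b - 1) * variance(b)) / dof
    t0 = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / k_a + 1 / k_b))
    return t0, dof


T_CRIT = 2.101   # t_{0.025,18}, taken from the slides

t0, dof = pooled_t_statistic(system_a, system_b)
reject_h0 = abs(t0) > T_CRIT

print(f"t0 = {t0:.2f} with {dof} degrees of freedom")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Working from the unrounded data gives a statistic of about -9.11. The -9.13 quoted above is consistent with rounding the averages and the pooled standard deviation before dividing. Either way the statistic lies far inside the critical region.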
Hypothesis testing doesn't always tell the whole story. It's better to provide an interval within which the value of the parameter is expected to lie: a confidence interval.
In other words, it's better to find a confidence interval on the difference $\mu_A - \mu_B$.
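Confidence interval: Using data from the previous example, the $100(1-\alpha)$ percent confidence interval on $\mu_A - \mu_B$ is
$\bar{y}_A - \bar{y}_B \pm t_{\alpha/2,\,k_A+k_B-2} \, S_p \sqrt{\frac{1}{k_A} + \frac{1}{k_B}}$
where $S_p$ is the pooled standard deviation of the two samples.
So the 95 percent confidence interval estimate on the difference in means extends from -1.43 to -0.89.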
Note that since $\mu_A - \mu_B = 0$ is not included in this interval, the data do not support the hypothesis that $\mu_A = \mu_B$ at the 5% level of significance.
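The interval can be recomputed from the raw runs. This is a sketch that assumes the pooled-variance interval for a difference in means; the function name `difference_ci` is illustrative, and the critical value $t_{0.025,18} = 2.101$ is copied from the slides, not computed.

```python
from math import sqrt
from statistics import mean, variance

# Outputs of the ten runs listed in the slides.
system_a = [16.85, 16.40, 17.21, 16.35, 16.52, 17.04, 16.96, 17.15, 16.59, 16.57]
system_b = [17.50, 17.63, 18.25, 18.00, 17.86, 17.75, 18.22, 17.90, 17.96, 18.15]


def difference_ci(a, b, t_crit):
    """Confidence interval on mu_A - mu_B using the pooled standard deviation."""
    k_a, k_b = len(a), len(b)
    pooled_var = ((k_a - 1) * variance(a) + (k_b - 1) * variance(b)) / (k_a + k_b - 2)
    half_width = t_crit * sqrt(pooled_var * (1 / k_a + 1 / k_b))
    diff = mean(a) - mean(b)
    return diff - half_width, diff + half_width


T_CRIT = 2.101   # t_{0.025,18}, taken from the slides

lower, upper = difference_ci(system_a, system_b, T_CRIT)
contains_zero = lower <= 0.0 <= upper

print(f"95% CI on mu_A - mu_B: ({lower:.2f}, {upper:.2f})")
print(f"interval contains 0: {contains_zero}")
```

The interval agrees with the one quoted above and excludes zero, so it leads to the same conclusion as the hypothesis test while also showing how large the difference is likely to be.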
Notes: Fundamental tools used throughout the course, especially in lab experiences. $\mu$ is referred to as a parameter. At some point we need to address Gauge R&R.