Mario F. Triola, Essentials of Statistics, 3rd Edition

Universiteit van Amsterdam

Page 1:

Essentials of Statistics

Page 2:

Today's program

- Organization and structure of the course
- Why statistics?
- Preview of the material: chapters 1, 2 and 3

Page 3:

Course website:

Page 4:

Course website:

Page 5:

Book

Literature: Mario Triola, Essentials of Statistics, 3rd edition, Addison-Wesley Higher Education, 2008

Page 6:

Schedule

Page 7:

Organization and structure (1)

Check the website yourself for:
- Introduction
- Assessment and deadlines
- Sick-leave policy
- Schedule
- Etc.

Page 8:

Organization and structure (2)

Handing out and handing in:
- Week 1: assignments for chapters 1, 2 and 3
- Week 2: worked solutions for chapters 1, 2 and 3; make a copy for self-assessment during the discussion

Page 9:

Organization and structure (3)

- Tutorials mandatory?
- Guarantee of success?

Page 10:

Is there a relationship?

[Scatter plot "relatie pr.cf en tnt.cf stat 0708": practical grade (practicumcijfer, vertical axis, 5.0 to 10.0) against exam grade (tentamencijfer, horizontal axis, 0.0 to 10.0)]

Page 11:

Why statistics?

- Reading and writing articles in the field of IK. Example: an article from MIS Quarterly
- Reading and writing in daily life. Example: a table from a neighborhood action committee
- Basic prerequisite: logical thinking and reasoning. Example: the Monty Hall problem

Page 12:

Table (1), MIS Quarterly article

Page 13:

Table (2), MIS Quarterly article

Page 14:

Table, neighborhood committee

Page 15:

Intuition is unreliable

Monty Hall problem
- Quiz: there are 3 closed doors.
- Behind one door is a car; behind the other two doors there is nothing.
- You may choose one door.
- What is your chance of winning the grand prize?

Page 16:

But then ...

AFTER YOUR CHOICE, the quizmaster opens one of the two remaining doors and shows that there is nothing behind it.

Problem: you may now still switch doors. Do you?

Page 17:

Analysis

Suppose the grand prize is behind door 1:

1. You chose door 1 (car). The quizmaster opens one of the other doors, with nothing behind it. Switching loses.

2. You chose door 2 (empty). The quizmaster opens door 3, with nothing behind it. Switching wins the grand prize!

3. You chose door 3 (empty). The quizmaster opens door 2, with nothing behind it. Switching wins the grand prize!

Each initial choice is equally likely, so switching wins in 2 of the 3 cases.
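The three-case analysis can also be checked by simulation. The sketch below is not part of the lecture; the function names `play` and `win_rate`, the number of rounds, and the seed are illustrative assumptions. It plays the game many times with and without switching.

```python
import random

def play(switch: bool, rng: random.Random) -> bool:
    """Play one round and return True if the contestant wins the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The quizmaster opens an empty door that is neither the pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch: bool, rounds: int = 100_000, seed: int = 1) -> float:
    """Fraction of rounds won under a fixed strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

With enough rounds the two rates should settle near 1/3 for staying and 2/3 for switching, in line with the case analysis.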

Page 18:

Approach in the lectures

- No extensive discussion
- Instead, a preview of the material and a discussion of possible sticking points
- Now: chapters 1, 2 and 3

Page 19:

Sections of chapters 1, 2 and 3

- 1.1 Overview
- 1.2 Data types
- 1.3 Critical thinking
- 1.4 Design of experiments

- 2.1 Overview
- 2.2 Frequency distributions
- 2.3 Histograms
- 2.4 Statistical graphics

- 3.1 Overview
- 3.2 Measures of center
- 3.3 Measures of variation
- 3.4 Measures of relative standing
- 3.5 Exploratory data analysis

Page 20:

Triola, chapter 1

Important definitions used in statistics

Page 21:

Section 1.1 Important definitions

- Data
- Statistics
- Population
- Census
- Sample

Page 22:

Definition of statistics

A collection of methods for
- planning studies and experiments,
- obtaining data,
- and then organizing, summarizing, presenting, analyzing, interpreting,
- and drawing conclusions based on the data

Page 23:

Chapter Key Concepts

- Sample data must be collected in an appropriate way, such as through a process of random selection.
- If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.

Page 24:

Section 1.2 Data types

Definitions:
- Population parameter versus sample statistic
- Quantitative versus qualitative data
- Discrete versus continuous data
- Levels of measurement: nominal, ordinal, interval, ratio

Page 25:

Levels of Measurement

1. Nominal - categories only

2. Ordinal - categories with some order

3. Interval - differences but no natural starting point

4. Ratio - differences and a natural starting point

Page 26:

Section 1.3 Critical thinking

Abuse, inexpert use, and incorrect use of statistics

Page 27:

Misuse # 1 - Bad Samples

Voluntary response sample (or self-selected sample)

- one in which the respondents themselves decide whether to be included. In this case, valid conclusions can be made only about the specific group of people who agree to participate.

Page 28:

Misuse # 3 - Graphs

To correctly interpret a graph, you must analyze the numerical information given in the graph, so as not to be misled by the graph’s shape.

Page 29:

Other Misuses of Statistics

- Loaded Questions
- Order of Questions
- Refusals
- Correlation & Causality
- Self Interest Study
- Precise Numbers
- Partial Pictures
- Deliberate Distortions

Page 30:

Section 1.4 Design of the study

Types of studies
- Observational
- Experimental
- Retrospective
- Prospective (longitudinal, cohort)

Page 31:

Definition

Confounding occurs in an experiment when the experimenter is not able to distinguish between the effects of different factors.

Page 32:

Example: confounding effects

Page 33:

Controlling Effects of Variables

- Blinding: the subject does not know whether he or she is receiving a treatment or a placebo
- Rigorously Controlled Design: subjects are very carefully chosen
- Blocks: groups of subjects with similar characteristics
- Completely Randomized Experimental Design: subjects are put into blocks through a process of random selection

Page 34:

Sampling

Page 35:

Definitions

- Random Sample: members of the population are selected in such a way that each individual member has an equal chance of being selected
- Simple Random Sample (of size n): subjects are selected in such a way that every possible sample of the same size n has the same chance of being chosen
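The difference between the two definitions is easy to miss, so here is a minimal sketch of it. The population of ten numbered members, the sample size, and the coin-flip scheme are assumptions made for illustration; they do not come from the lecture.

```python
import random

rng = random.Random(0)
population = list(range(1, 11))   # hypothetical population: members numbered 1..10
n = 5

# Simple random sample: random.sample draws without replacement, so every
# subset of size n is equally likely to be chosen.
srs = rng.sample(population, n)

# Random but not simple: flip a coin and take either the first or the second half.
# Each member still has probability 1/2 of being selected, but only two of the
# 252 possible samples of size 5 can ever occur.
half = population[:n] if rng.random() < 0.5 else population[n:]

print("simple random sample:", srs)
print("coin-flip half:      ", half)
```

Both schemes give every member the same chance of selection, but only the first gives every possible sample of size n the same chance.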

Page 36:

Methods of Sampling

- Random
- Systematic
- Convenience
- Stratified
- Cluster
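As a rough sketch of how three of these methods differ in practice, the code below draws a systematic, a stratified, and a cluster sample from a made-up population. The population, the group labels "A" and "B", and all sizes are hypothetical. Convenience sampling is left out because it simply takes whichever members are easiest to reach.

```python
import random

rng = random.Random(7)
# Hypothetical population: 12 people, each tagged with a group label.
population = [(i, "A" if i <= 6 else "B") for i in range(1, 13)]

# Systematic: pick a random starting point, then take every k-th member.
k = 4
start = rng.randrange(k)
systematic = population[start::k]

# Stratified: draw a simple random sample from each subgroup (stratum).
strata = {
    "A": [p for p in population if p[1] == "A"],
    "B": [p for p in population if p[1] == "B"],
}
stratified = [p for group in strata.values() for p in rng.sample(group, 2)]

# Cluster: split the population into clusters, pick whole clusters at random,
# and keep every member of the chosen clusters.
clusters = [population[i:i + 3] for i in range(0, len(population), 3)]
cluster_sample = [p for c in rng.sample(clusters, 2) for p in c]

print(systematic, stratified, cluster_sample, sep="\n")
```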

Page 37:

Triola, chapter 2

Statistics for summarizing and displaying data

Page 38:

Section 2.1 Overview: Important Characteristics of Data

1. Center: A representative or average value that indicates where the middle of the data set is located.
2. Variation: A measure of the amount that the values vary among themselves.
3. Distribution: The nature or shape of the distribution of data (such as bell-shaped, uniform, or skewed).
4. Outliers: Sample values that lie very far away from the vast majority of other sample values.
5. Time: Changing characteristics of the data over time.

CVDOT

Page 39:

Section 2.2 Frequency distributions

- Plain (straight) tally of values in a table
- Grouping of values into categories (classes)

Page 40:

Frequency Distribution: Ages of Best Actresses

[Tables: original data and frequency distribution]

Page 41:

Related definitions

- Lower class limits
- Upper class limits
- Class boundaries
- Class midpoints
- Class width
- Relative frequencies
- Cumulative frequencies (cumulative percentages)
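To make these terms concrete, the sketch below builds a small frequency table. The ages are invented for illustration and are not the data set from the book; the choice of classes 20-29, 30-39, and so on is likewise an assumption.

```python
from collections import Counter

# Hypothetical ages, invented for this example.
ages = [22, 37, 28, 63, 32, 26, 31, 27, 27, 28,
        30, 26, 29, 24, 38, 25, 29, 41, 30, 35]

lower, width, n_classes = 20, 10, 5        # classes 20-29, 30-39, ..., 60-69
counts = Counter((a - lower) // width for a in ages)

total = len(ages)
cumulative = 0
rows = []
for i in range(n_classes):
    lo = lower + i * width                 # lower class limit
    hi = lo + width - 1                    # upper class limit
    boundaries = (lo - 0.5, hi + 0.5)      # class boundaries for whole-number data
    midpoint = (lo + hi) / 2               # class midpoint
    freq = counts.get(i, 0)
    cumulative += freq                     # cumulative frequency
    rows.append((lo, hi, midpoint, freq, freq / total, cumulative))
    print(f"{lo}-{hi}  boundaries={boundaries}  midpoint={midpoint}  "
          f"f={freq}  relative={freq / total:.0%}  cumulative={cumulative}")
```

The class width is the difference between two consecutive lower class limits, here 10. The last cumulative frequency equals the number of values.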

Page 42:

Frequency Tables

Page 43:

Section 2.3 Histograms

Graphical display of distributions

Page 44:

Histogram

A bar graph in which the horizontal scale represents the classes of data values and the vertical scale represents the frequencies

Page 45:

Relative Frequency Histogram

Has the same shape and horizontal scale as a histogram, but the vertical scale is marked with relative frequencies instead of actual frequencies

Page 46:

Critical Thinking: Interpreting Histograms

One key characteristic of a normal distribution is that it has a “bell” shape. The histogram below illustrates this.

[Figure: bell-shaped histogram]

Page 47:

Section 2.4 Statistical graphics

Other forms of visual display
- Polygon
- Ogive
- Dot plot
- Stemplot
- Pareto chart
- Pie chart
- Scatter plot
- Time series

Page 48:

Ogive

A line graph that depicts cumulative frequencies

[Figure 2-6 from page 58]

Page 49:

Dot Plot

Consists of a graph in which each data value is plotted as a point (or dot) along a scale of values

Page 50:

Other Graphs

Page 51:

Triola, chapter 3

Statistics for describing, exploring, and comparing data

Page 52:

Section 3.1 Overview

- Descriptive Statistics: summarize or describe the important characteristics of a known set of data
- Inferential Statistics: use sample data to make inferences (or generalizations) about a population

Page 53:

Section 3.2 Measures of center

- Mean, of a sample (x-bar) and of a population (mu)
- Median (x-tilde)
- Mode
- Midrange
- Weighted mean
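A short sketch of the five measures of center, using Python's standard `statistics` module. The sample values, grades, and weights are made up for illustration.

```python
import statistics

values = [2, 3, 3, 5, 7, 10]                  # hypothetical sample

mean = statistics.mean(values)                # sum of values divided by their count
median = statistics.median(values)            # middle value of the sorted data
mode = statistics.mode(values)                # most frequent value
midrange = (min(values) + max(values)) / 2    # halfway between minimum and maximum

# Weighted mean: each value counts in proportion to its weight.
grades = [6.0, 8.0]                           # hypothetical grades
weights = [1, 3]                              # hypothetical weights
weighted_mean = sum(w * x for w, x in zip(weights, grades)) / sum(weights)

print(mean, median, mode, midrange, weighted_mean)
```

For these values the mean is 5, the median 4, the mode 3, and the midrange 6; the weighted mean of the two grades is 7.5.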

Page 54:

Notation

$\bar{x}$ is pronounced ‘x-bar’ and denotes the mean of a set of sample values:

$\bar{x} = \frac{\sum x}{n}$

µ is pronounced ‘mu’ and denotes the mean of all values in a population:

$\mu = \frac{\sum x}{N}$

Page 55:

Round-off Rule for Measures of Center

Carry one more decimal place than is present in the original set of values.

Page 56:

Mean from a Frequency Distribution

Use the class midpoint of each class as the value of the variable x.
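With class midpoints standing in for x, the mean is estimated as the sum of (frequency × midpoint) divided by the sum of the frequencies. A minimal sketch, using a hypothetical frequency table that is not taken from the book:

```python
# Hypothetical frequency table: (lower class limit, upper class limit, frequency)
table = [(20, 29, 11), (30, 39, 7), (40, 49, 1), (50, 59, 0), (60, 69, 1)]

# Sum of frequency times class midpoint, and sum of frequencies.
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in table)
sum_f = sum(f for _, _, f in table)

mean_estimate = sum_fx / sum_f
print(mean_estimate)
```

The result is an approximation, because every value in a class is treated as if it sat at the class midpoint.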

Page 57:

Best Measure of Center

Page 58:

Skewness

Page 59:

Section 3.3 Measures of variation

- Range
- Standard deviation, of a sample (s) and of a population (sigma)
- Variance (s squared)
- Coefficient of variation (CV)

Page 60:

Key Concept

Because this section introduces the concept of variation, which is so important in statistics, this is one of the most important sections in the entire book.

Place a high priority on how to interpret values of standard deviation.

Page 61:

Definition

The standard deviation of a set of sample values is a measure of variation of values about the mean.

Page 62:

Sample Standard Deviation Formula

$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Page 63:

Population Standard Deviation

$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

This formula is similar to the previous formula, but the population mean and population size are used instead.
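The two formulas differ only in the divisor: n - 1 for a sample, N for a population. The sketch below computes both by hand for a small set of hypothetical values; the standard library functions `statistics.stdev` and `statistics.pstdev` implement the same two definitions and can serve as a cross-check.

```python
import math
import statistics

values = [1, 3, 14]                           # hypothetical data
n = len(values)
mean = sum(values) / n                        # 18 / 3 = 6.0

# Sum of squared deviations from the mean: 25 + 9 + 64 = 98
ss = sum((x - mean) ** 2 for x in values)

s = math.sqrt(ss / (n - 1))                   # sample standard deviation
sigma = math.sqrt(ss / n)                     # population standard deviation

print(s, statistics.stdev(values))            # sample formula, two ways
print(sigma, statistics.pstdev(values))       # population formula, two ways
```

Treating the same numbers as a sample always gives a slightly larger value than treating them as a population, because the divisor is smaller.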

Page 64:

Standard Deviation - Important Properties

The standard deviation is a measure of variation of all values from the mean.

The value of the standard deviation s can increase dramatically with the inclusion of one or more outliers (data values far away from all others).

The units of the standard deviation s are the same as the units of the original data values.

Page 65:

Variance - Notation

The variance is the standard deviation squared.

- $s^2$: sample variance
- $\sigma^2$: population variance

Page 66:

Estimation of Standard Deviation: Range Rule of Thumb

For estimating a value of the standard deviation s, use

$s \approx \frac{\text{range}}{4}$

where range = (maximum value) - (minimum value).

Page 67:

Estimation of Standard Deviation: Range Rule of Thumb

For interpreting a known value of the standard deviation s, find rough estimates of the minimum and maximum “usual” sample values by using:

Minimum “usual” value = (mean) - 2 × (standard deviation)

Maximum “usual” value = (mean) + 2 × (standard deviation)
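Both uses of the range rule of thumb fit in a few lines. The numbers below (the sample, the mean of 100, and the standard deviation of 15) are hypothetical, and `is_usual` is an illustrative helper name.

```python
# Use 1: estimate s from the range.
values = [1, 3, 14]                               # hypothetical sample
s_estimate = (max(values) - min(values)) / 4      # range / 4

# Use 2: interpret a known mean and standard deviation.
mean, s = 100.0, 15.0                             # hypothetical values
min_usual = mean - 2 * s
max_usual = mean + 2 * s

def is_usual(x: float) -> bool:
    """True if x lies within 2 standard deviations of the mean."""
    return min_usual <= x <= max_usual

print(s_estimate, min_usual, max_usual, is_usual(85), is_usual(140))
```

The estimate from the range is crude; for this small sample it gives 3.25, well below the sample standard deviation computed from the formula.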

Page 68:

The Empirical Rule

Page 69:

Definition

The coefficient of variation (or CV) for a set of sample or population data, expressed as a percent, describes the standard deviation relative to the mean.

Sample: $CV = \frac{s}{\bar{x}} \cdot 100\%$

Population: $CV = \frac{\sigma}{\mu} \cdot 100\%$
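Because the CV has no units, it allows variation to be compared across data sets measured on different scales. A minimal sketch with made-up heights and weights; `cv_percent` is an illustrative helper, not a library function.

```python
import statistics

def cv_percent(sample: list[float]) -> float:
    """Sample coefficient of variation, expressed as a percent."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100

heights_cm = [170, 180, 190]      # hypothetical: mean 180, s = 10
weights_kg = [60, 80, 100]        # hypothetical: mean 80, s = 20

cv_heights = cv_percent(heights_cm)
cv_weights = cv_percent(weights_kg)
print(f"heights: {cv_heights:.1f}%, weights: {cv_weights:.1f}%")
```

Here the weights vary much more relative to their mean (25%) than the heights do (about 5.6%), even though the two standard deviations cannot be compared directly.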

Page 70:

Section 3.4 Measures of relative standing

- z scores
- Quartiles
- Percentiles

Page 71:

Key Concept

This section introduces measures that can be used to compare values from different data sets, or to compare values within the same data set. The most important of these is the concept of the z score.

Page 72:

Definition

z Score (or standardized value): the number of standard deviations that a given value x is above or below the mean

Page 73:

Measures of Position: z score

Sample: $z = \frac{x - \bar{x}}{s}$

Population: $z = \frac{x - \mu}{\sigma}$

Round z to 2 decimal places.

Page 74:

Interpreting Z Scores

Whenever a value is less than the mean, its corresponding z score is negative.

- Ordinary values: z score between -2 and 2
- Unusual values: z score < -2 or z score > 2
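The z score and the "unusual" criterion translate directly into code. The mean of 100 and standard deviation of 15 are hypothetical, and the function names are illustrative.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return round((x - mean) / sd, 2)          # rounded to 2 decimal places

def is_unusual(z: float) -> bool:
    """Unusual values have a z score below -2 or above 2."""
    return z < -2 or z > 2

mean, sd = 100.0, 15.0                        # hypothetical values
z_high = z_score(136, mean, sd)               # 36 / 15 = 2.4
z_low = z_score(85, mean, sd)                 # -15 / 15 = -1.0

print(z_high, is_unusual(z_high))
print(z_low, is_unusual(z_low))
```

A value of 136 is 2.4 standard deviations above the mean and counts as unusual; a value of 85 lies below the mean, so its z score is negative, and it counts as ordinary.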

Page 75:

Quartiles

Q1, Q2, Q3 divide ranked scores into four equal parts

(minimum) 25% Q1 25% Q2 (median) 25% Q3 25% (maximum)

Page 76:

Percentiles

Just as there are three quartiles separating data into four parts, there are 99 percentiles denoted P1, P2, . . . P99, which partition the data into 100 groups.
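Quartiles and percentiles are both sets of cut points, and Python's `statistics.quantiles` returns them for any number of groups. This is only a sketch: the data are hypothetical, and the function's default interpolation method is one of several conventions, so its results may differ slightly from the procedure in the book.

```python
import statistics

data = list(range(1, 12))                         # hypothetical data: 1, 2, ..., 11

# n=4 gives the 3 cut points Q1, Q2, Q3 that split the data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

# n=100 gives the 99 cut points P1, ..., P99 that split the data into 100 groups.
percentiles = statistics.quantiles(data, n=100)

print(q1, q2, q3)
print(len(percentiles))
```

The second quartile coincides with the median, and the list of percentiles has exactly 99 entries.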

Page 77:

Section 3.5 EDA

- Outliers
- Boxplot

Page 78:

Important Principles

- An outlier can have a dramatic effect on the mean.
- An outlier can have a dramatic effect on the standard deviation.
- An outlier can have a dramatic effect on the scale of the histogram so that the true nature of the distribution is totally obscured.
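The first two principles can be seen by adding a single extreme value to a small data set. The values are made up for illustration; the median is included only as a point of comparison.

```python
import statistics

clean = [10, 11, 12, 12, 13]                  # hypothetical data
with_outlier = clean + [100]                  # same data plus one outlier

mean_clean = statistics.mean(clean)
mean_outlier = statistics.mean(with_outlier)
sd_clean = statistics.stdev(clean)
sd_outlier = statistics.stdev(with_outlier)
median_clean = statistics.median(clean)
median_outlier = statistics.median(with_outlier)

print(f"mean:   {mean_clean:.2f} -> {mean_outlier:.2f}")
print(f"stdev:  {sd_clean:.2f} -> {sd_outlier:.2f}")
print(f"median: {median_clean} -> {median_outlier}")
```

One outlier more than doubles the mean and inflates the standard deviation many times over, while the median stays at 12.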

Page 79:

Definitions

For a set of data, the 5-number summary consists of the minimum value; the first quartile, Q1; the median (or second quartile, Q2); the third quartile, Q3; and the maximum value.

A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.
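The 5-number summary contains everything needed to draw a boxplot. A minimal sketch with hypothetical data; the quartiles come from `statistics.quantiles` with its default method, which is an assumption here and may not match the book's procedure for every data set.

```python
import statistics

data = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]   # hypothetical data

q1, q2, q3 = statistics.quantiles(data, n=4)       # quartile cut points
five_number_summary = (min(data), q1, q2, q3, max(data))

print(five_number_summary)
```

In a boxplot of these data, the line would run from 2 to 22 and the box from Q1 to Q3, with an inner line at the median.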

Page 80:

Boxplots

Page 81:

Boxplots - cont

Page 82:

End of the preview of chapters 1, 2 and 3

Next week:
- Question hour on chapters 1, 2 and 3
- Preview of chapters 4 and 5