Post on 04-Jan-2016
DATA (Not Just a Lot of Numbers)
James Stewart Lothar Redlin Saleem Watson
College Algebra: A Course in Crisis?
Introductory collegiate mathematics is in the midst of a revolution…
-Nancy Baxter Hastings,
Dickinson College
Traditional College Algebra is a boring, archaic, torturous course that does not help students solve problems or become better citizens. It turns off students and discourages them from seeking more mathematics learning.
- Chris Arney,
Dean of Science and Mathematics, St. Rose College
College Algebra: A Course in Crisis?
NSF conference on “Rethinking the Courses Below Calculus” in Washington, D.C., in 2001. Some of the major themes to emerge from this conference:
• Spend less time on algebraic manipulation and more time on exploring concepts
• Reduce the number of topics but study those topics covered in greater depth
• Give greater priority to data analysis as a foundation for mathematical modeling
• Emphasize the verbal, numerical, graphical and symbolic representations of mathematical concepts
WHY DATA?
Over the past two decades computers have transformed public discourse by generating piles of data and myriad analyses of these data. Ordinary citizens must deal with numbers and data every day.
-Bernard Madison, University of Arkansas
Virtually any educated individual will need the ability:
1. to examine a set of data and recognize a behavioral pattern in it,
2. to assess how well a given functional model matches the data,
3. to recognize the limitations in the model,
4. to use the model to draw appropriate conclusions,
5. to answer appropriate questions about the phenomenon being studied.
-Sheldon Gordon, Farmingdale State University of New York
WHY DATA?
• Data relate the real world and algebraic equations.
• Students can collect data and make models from data.
• Students can see how the model gives us information about the thing being modeled.
WHY DATA?
Greater Depth: Connecting the Concepts
COLLECTING DATA
• From classmates (measurements)
Age, height, hand span, shoe size, hat size
COLLECTING DATA
• From classmates (Surveys)
Survey
1. What is the value (in cents) of the coins in your pocket or purse? __________
2. How far is your daily commute to school (in miles)? _________
3. How many siblings do you have (including yourself)? __________
4. How many hours a week do you spend on the Internet? __________
5. How many hours a week do you spend on homework? ________
6. Rate your happiness: not happy / happy / very happy
7. Rate your satisfaction with your school work: not satisfied / satisfied / very satisfied
COLLECTING DATA
• From simple experiments
– Bridge science
COLLECTING DATA
• From simple experiments
– How quickly can you name your favorite things?
– How many words can you recall from a memorized list (after a day, a week, a month).
Listing vegetables Memorizing a list
COLLECTING DATA
• From simple experiments
– How quickly does water leak from a tank? Torricelli’s Law
Torricelli’s Law: The experiment. Students performing the experiment.
COLLECTING DATA
• From simple experiments
– Radioactive decay, modeled with pennies
Radioactive Decay Coin Experiment
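The slide does not spell out the penny procedure, so the following is only a sketch under an assumed protocol: toss all remaining pennies each round and remove every head, so each penny "decays" with probability 1/2 per toss. The function names and the starting count of 100 are illustrative, not taken from the slides.

```python
import random


def penny_decay(n_pennies, rounds, seed=0):
    """Simulate the assumed protocol: each round, remove every penny that lands heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    remaining = n_pennies
    counts = [remaining]
    for _ in range(rounds):
        # Each remaining penny survives (tails) with probability 1/2.
        remaining = sum(1 for _ in range(remaining) if rng.random() < 0.5)
        counts.append(remaining)
    return counts


def half_life_model(n_pennies, rounds):
    """Idealized exponential model N(k) = N0 * (1/2)**k."""
    return [n_pennies * 0.5 ** k for k in range(rounds + 1)]


counts = penny_decay(100, 6)
model = half_life_model(100, 6)
for k, (c, m) in enumerate(zip(counts, model)):
    print(f"toss {k}: {c:3d} pennies left (model {m:6.2f})")
```

Comparing the simulated counts with the model column gives students an exponential function to fit to their own data.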
COLLECTING DATA
• From the Internet
– How many farms in the US?
Farming in the 19th century Farming in the 20th century
COLLECTING DATA
• From the Internet
– Population
Las Vegas 1900 Las Vegas 2000
• From Journal Articles
– Algebra and Alcohol
Concentration (mg/ml) after 95% ethanol oral dose
Time (hr)   15 ml   30 ml   45 ml   60 ml
0           0       0       0       0
0.067       0.032   0.071   -       -
0.133       0.096   0.019   -       -
0.167       -       -       0.28    0.30
0.2         0.13    0.25    -       -
0.267       0.17    0.30    -       -
0.333       0.16    0.31    0.42    0.46
0.417       0.17    -       -       -
0.5         0.16    0.41    0.51    0.59
0.667       -       -       0.61    0.66
Model: $S(t) = a t b^t$
COLLECTING DATA
DATA
(Figure: a jumble of unorganized numbers)
GOAL: Get Information from Data
Model
Equation
Scatter Plot
Regression
Matrix
Sample
Test of Hypothesis
Confidence Interval
Frequency Histogram
Mean
Standard Deviation
Proportion
Getting Information from Data
• Descriptive Information: Tells us something about the data itself
• Inferential Information: Tells us how to extend the information obtained from the data beyond the domain of the data.
The FORM of the Data
How does the data obtained from this survey differ for different questions?
• What is your age?
• What is your height?
• What is your hair color?
• From which source do you mostly obtain the news?
• Do you believe that the Universe began in a huge explosion?
The Form of the Data
The form of the data tells us the kind of information we can obtain.
• One-Variable Data
• Two-Variable Data
• Categorical Data
• Sample Data
One-Variable Data
Age (yr) 2 2 2 3 3 3 4 4 4 4 4 5
Income (thousands of dollars) 280 56 59 62 51
Selling Price (× 1000) 159 193 167 172 169 216 169 172
Descriptive Information
• Summary statistics:
Central tendency: mean, median
Dispersion: variance, standard deviation
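As a small illustration of these summary statistics, the sketch below applies Python's standard `statistics` module to the income data listed earlier. Treating the five incomes as a sample (so the variance divides by n - 1) is an assumption; the slides do not say which convention is intended.

```python
import statistics

incomes = [280, 56, 59, 62, 51]  # thousands of dollars, from the one-variable data slide

mean = statistics.mean(incomes)
median = statistics.median(incomes)
variance = statistics.variance(incomes)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(incomes)      # sample standard deviation

print(f"mean = {mean}, median = {median}")
print(f"variance = {variance:.1f}, standard deviation = {std_dev:.1f}")
```

The single large income (280) pulls the mean well above the median, which is one reason both measures of central tendency are worth reporting.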
One-Variable Data
Example: height of students. Mean: 60”, S.D.: 10”
Given this information, which picture is more likely?
One-Variable Data
Descriptive Information
• Frequency histogram
Graphical, gives more complete information—tells how the data is distributed.
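A frequency histogram can be sketched in a few lines. The example below tallies the Age data from the one-variable slide and prints a text histogram; the star-based display is just an illustrative stand-in for the calculator plot.

```python
from collections import Counter

ages = [2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 5]  # Age (yr) data from the slide

# Count how many times each age occurs.
frequency = Counter(ages)

for age in sorted(frequency):
    print(f"{age} yr | {'*' * frequency[age]} ({frequency[age]})")
```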
One-Variable Data
Descriptive Information
• Frequency histogram
One-Variable Data
Two-Variable Data
Age (yr) 2 2 2 3 3 3 4 4 4 4 4 5
Height (in) 32 31 36 38 35 41 47 43 42 38 39 45
Hours since 6:00 am    Temperature (°F)
0 59
2 62
4 68
6 65
8 58
10 60
12 62
Depth (ft)    Pressure (lb/in²)
0 14.7
10 19.2
20 23.7
30 28.2
40 32.7
50 37.2
60 41.7
Descriptive Information
• Scatter plot: Gives a description of the relationship between the variables.
• Regression Line (or other curve): Gives the curve that best fits (or best describes) the data
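As a sketch of what the calculator's linear regression does, the code below fits a least-squares line to the depth/pressure data by hand. The helper name `fit_line` is illustrative; the formulas are the standard least-squares ones.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


depth = [0, 10, 20, 30, 40, 50, 60]                      # ft
pressure = [14.7, 19.2, 23.7, 28.2, 32.7, 37.2, 41.7]    # lb/in²

slope, intercept = fit_line(depth, pressure)
print(f"pressure = {slope:.2f} * depth + {intercept:.1f}")
```

Because these data lie exactly on a line, the fit returns the slope 0.45 lb/in² per foot and the surface pressure 14.7 lb/in²; real measurements such as the TV/BMI data would scatter around the fitted line.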
Two-Variable Data
Descriptive information
• Depth/Pressure Data
Two-Variable Data
Depth (ft)    Pressure (lb/in²)
0 14.7
10 19.2
20 23.7
30 28.2
40 32.7
50 37.2
Descriptive information
• TV Hours/BMI
Two-Variable Data
Hours TV BMI
0 15
0 17
.5 15
.5 18
.75 16
1 16
1 15
1 17
1.25 18
1.5 19
: :
Two-variable data (Goal: Find a relationship between the variables)
Descriptive Information
• Regression Line (or other curve): Gives the curve that best fits (or best describes) the data
Two-Variable Data
(Plots: Depth / Pressure; Hours TV / BMI)
Two-variable data (Goal: Find a relationship between the variables)
Inferential Information
• Regression Line (or other curve): Use the curve to get information not in the data (extrapolation or interpolation using the regression curve).
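To make the distinction concrete, here is a minimal sketch that fits a line to the depth/pressure data and labels each prediction as interpolation (inside the observed depths) or extrapolation (outside them). The query depths of 25 ft and 100 ft are illustrative choices, not values from the slides.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(x, xs, ys):
    """Predict y at x and say whether x lies inside the range of the data."""
    slope, intercept = fit_line(xs, ys)
    kind = "interpolation" if min(xs) <= x <= max(xs) else "extrapolation"
    return slope * x + intercept, kind


depth = [0, 10, 20, 30, 40, 50, 60]                      # ft
pressure = [14.7, 19.2, 23.7, 28.2, 32.7, 37.2, 41.7]    # lb/in²

for query in (25, 100):
    value, kind = predict(query, depth, pressure)
    print(f"depth {query} ft: about {value:.2f} lb/in² ({kind})")
```

Extrapolated values deserve more caution than interpolated ones, since nothing in the data confirms that the pattern continues past the last observation.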
Two-Variable Data
Two-variable data (Goal: Find a relationship between the variables)
Inferential Information
• Regression Line (or other curve)
Two-Variable Data
(Figure labels: interpolate, extrapolate)
Descriptive information
• Tire Inflation-Tire Life Relation
– Quadratic functions
Two-Variable Data
Tire Pressure/Tire Life
Pressure (lb/in²)   Tire life (mi × 1000)
26                  50
28                  66
31                  78
35                  81
38                  74
42                  70
45                  59
QuadReg
y = ax² + bx + c
a = -.24324
b = 17.627
c = -239.47
$y = -0.24324x^2 + 17.627x - 239.47$
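Once a quadratic model is available, its vertex gives the pressure with the longest predicted tire life. The sketch below assumes the model is y = -0.24324x² + 17.627x - 239.47; the negative constant term is an assumption, chosen because it is the sign that makes the model agree with the table (about 80 thousand miles near 35 lb/in²).

```python
# Coefficients from the QuadReg output; the sign of C is assumed negative.
A, B, C = -0.24324, 17.627, -239.47


def tire_life(pressure):
    """Predicted tire life (thousands of miles) at a given pressure (lb/in²)."""
    return A * pressure ** 2 + B * pressure + C


# The vertex of a parabola y = ax² + bx + c is at x = -b / (2a).
best_pressure = -B / (2 * A)

print(f"best pressure: about {best_pressure:.1f} lb/in²")
print(f"predicted life: about {tire_life(best_pressure):.1f} thousand miles")
```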
Descriptive information
• Species-Area Relation
– Power functions
Two-Variable Data
Species-area data
Cave               Area (m²)   Number of species
La Escondida       18          1
El Escorpion       19          1
El Tigre           58          1
Mision Imposible   60          2
San Martin         128         5
El Arenal          187         4
La Ciudad          344         6
Virgen             511         7
PwrReg
y = a * x ^ b
a = 0.140019
b = 0.640512
$y = 0.14x^{0.64}$
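One common way to fit a power model y = a·x^b is to fit a straight line to (ln x, ln y), since ln y = ln a + b·ln x. The sketch below does this for the cave data; that the calculator's PwrReg uses the same log-log method is an assumption, although the coefficients it produces should land close to those on the slide.

```python
from math import exp, log

area = [18, 19, 58, 60, 128, 187, 344, 511]   # m²
species = [1, 1, 1, 2, 5, 4, 6, 7]


def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on the log-log data."""
    lx = [log(x) for x in xs]
    ly = [log(y) for y in ys]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                    # slope of the log-log line is the exponent
    a = exp(mean_y - b * mean_x)     # intercept of the log-log line is ln a
    return a, b


a, b = fit_power(area, species)
print(f"y = {a:.4f} * x^{b:.4f}")
```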
Descriptive information
• Length-at-Age Relation
– Polynomial functions
Two-Variable Data
Length-at-age data
Age (years)   Length (inches)   Age (years)   Length (inches)
1             4.8               9             18.2
2             8.8               9             17.1
2             8.0               10            18.8
3             7.9               10            19.5
4             11.9              11            18.9
5             14.4              12            21.7
6             14.1              12            21.9
6             15.8              13            23.8
7             15.6              14            26.9
8             17.8              14            25.1
Figure: 90 year old rock fish
$y = 0.0155x^3 - 0.372x^2 + 3.95x + 1.21$
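To assess how well a model matches the data, one simple check is the root-mean-square size of the residuals (data minus model). The sketch below assumes the cubic is y = 0.0155x³ - 0.372x² + 3.95x + 1.21; the signs are an assumption, chosen because they make the model reproduce the table (for example about 4.8 inches at age 1).

```python
from math import sqrt

ages = [1, 2, 2, 3, 4, 5, 6, 6, 7, 8,
        9, 9, 10, 10, 11, 12, 12, 13, 14, 14]            # years
lengths = [4.8, 8.8, 8.0, 7.9, 11.9, 14.4, 14.1, 15.8, 15.6, 17.8,
           18.2, 17.1, 18.8, 19.5, 18.9, 21.7, 21.9, 23.8, 26.9, 25.1]  # inches


def model(x):
    """Assumed cubic model, evaluated in Horner form."""
    return ((0.0155 * x - 0.372) * x + 3.95) * x + 1.21


# Residual = observed length minus predicted length.
residuals = [y - model(x) for x, y in zip(ages, lengths)]
rmse = sqrt(sum(r * r for r in residuals) / len(residuals))

print(f"typical miss: about {rmse:.2f} inches")
```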
Descriptive information
• Algebra and alcohol
– Surge functions
Two-Variable Data
Concentration (mg/ml) after 95% ethanol oral dose
Time (hr)   15 ml   30 ml   45 ml   60 ml
0           0       0       0       0
0.067       0.032   0.071   -       -
0.133       0.096   0.019   -       -
0.167       -       -       0.28    0.30
0.2         0.13    0.25    -       -
0.267       0.17    0.30    -       -
0.333       0.16    0.31    0.42    0.46
0.417       0.17    -       -       -
0.5         0.16    0.41    0.51    0.59
0.667       -       -       0.61    0.66
Model: $S(t) = a t b^t$
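A surge function of the form S(t) = a·t·b^t (with 0 < b < 1) rises quickly from zero and then decays. The sketch below uses hypothetical values a = 2.0 and b = 0.3, which are not fitted to the table, simply to show the shape and where the peak falls.

```python
import math


def surge(t, a, b):
    """Surge function S(t) = a * t * b**t."""
    return a * t * b ** t


def peak_time(b):
    """Solve S'(t) = a * b**t * (1 + t * ln b) = 0, giving t = -1 / ln b."""
    return -1.0 / math.log(b)


a, b = 2.0, 0.3  # hypothetical parameters, for illustration only
t_peak = peak_time(b)

print(f"peak at t = {t_peak:.3f} hr, S = {surge(t_peak, a, b):.3f}")
```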
Categorical Data
Student Number Hair color Eye Color
1 Dark Brown
2 Blond Brown
3 Dark Blue
Results of survey:
These data need organizing!
Categorical data (Goal: Organize the data/Get information)
Descriptive information
• Organize Data in a Matrix
Categorical Data
              Hair Color
Eye Color     Blonde   Dark   Red
Blue          9        3      0
Green         0        6      3
Brown         3        1      1
Categorical data (Goal: Organize the data/Get information)
Descriptive information
• Organize Data in a Proportionality Matrix
Categorical Data
              Hair Color
Eye Color     Blonde   Dark   Red
Blue          .75      .30    .00
Green         .00      .60    .75
Brown         .25      .10    .25
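The proportionality matrix comes from dividing each entry of the count matrix by its column total, so every column sums to 1. A minimal sketch (the function name is illustrative):

```python
# Survey counts: rows are eye colors, columns are hair colors.
counts = [
    [9, 3, 0],
    [0, 6, 3],
    [3, 1, 1],
]


def column_proportions(matrix):
    """Divide every entry by the total of its column."""
    col_totals = [sum(col) for col in zip(*matrix)]
    return [[value / total for value, total in zip(row, col_totals)]
            for row in matrix]


proportions = column_proportions(counts)
for row in proportions:
    print("  ".join(f"{p:.2f}" for p in row))
```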
Categorical data (Goal: Organize the data/Get information)
Get information from the data
• Using matrix multiplication
Categorical Data
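$$\begin{bmatrix} .75 & .30 & .00 \\ .00 & .60 & .75 \\ .25 & .10 & .25 \end{bmatrix} \begin{bmatrix} 500 \\ 800 \\ 600 \end{bmatrix} = \begin{bmatrix} 615 \\ 930 \\ 355 \end{bmatrix}$$
Rows: eye color (Blue, Green, Brown). Columns: hair color (Blonde, Dark, Red).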
Categorical data (Goal: Organize the data/Get information)
Get information from the data
• Using matrix multiplication
Categorical Data
500 × .75 + 800 × .30 + 600 × .00 = 615
Labels on each product: hair-color count (Dark hair, Blond hair, Red hair) × proportion with blue eyes
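The same computation for all three eye colors is a matrix-vector product. The sketch below multiplies the proportionality matrix by the vector of hair-color counts (500, 800, 600) in plain Python; the function name is illustrative.

```python
# Proportion of each eye color (rows) within each hair-color group (columns).
proportions = [
    [0.75, 0.30, 0.00],
    [0.00, 0.60, 0.75],
    [0.25, 0.10, 0.25],
]
hair_counts = [500, 800, 600]  # size of each hair-color group, in column order


def mat_vec(matrix, vector):
    """Multiply a matrix by a column vector: one dot product per row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


eye_counts = mat_vec(proportions, hair_counts)
print(eye_counts)
```

Each entry of the result is the expected number of people with one eye color, and the entries add up to the total population of 1900.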
Sample Data
We sample the wine. (We don’t drink the whole bottle and then decide that the wine is no good.)
Get information about a population from a sample.
GOAL: Get Information from Data
• No information if the sample is not random
Example: Samples of height
Take sample from the basketball team
Example: Proportion of male to female students
Take sample from the girls’ dormitory
• No information if the sample size is too small
Example: A sample of one
Sample Data
Examples
• Hypothesis: Equal number of male and female students
A random sample of 30 students is all female
Conclusion: Reject hypothesis
• Hypothesis: A coin is fair
The coin is tossed 20 times and results in 20 heads
Conclusion: Reject hypothesis
Intuitive basis for inference from a sample
Alternate examples
• Hypothesis: Equal number of male and female students
In a random sample of 30 students, 21 are female
Conclusion: Reject hypothesis?
• Hypothesis: A coin is fair
The coin is tossed 20 times and results in 16 heads
Conclusion: Reject hypothesis?
Intuitive basis for inference from a sample
• The number of heads in 30 coin tosses follows a binomial distribution
Statistical basis for inference from a sample
Number of Heads
The mean or expected number of heads
Our result
• The average height of male students in samples of 500 students follows a normal distribution.
The mean or expected average height
Our result
Our result
How do we make these intuitive decisions? Because we know that some events are less likely than others. We intuitively “know” the probability distribution of certain events.
• For more refined estimates we need to know the probability distribution more accurately.
Statistical basis for inference from a sample
Decision Rule: If the probability of getting the sample we actually got (or a more extreme sample) is very small (say .05 or less), we reject our hypothesis.
We use the calculator for graphing, for regression, for matrix operations, etc…
So let’s use the calculator to find probabilities.
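In place of a calculator, the decision rule can be sketched with an exact binomial tail probability. The code below computes the chance of a result at least as extreme as the one observed (one-sided, upper tail) for the two "alternate examples"; reading "more extreme" as one-sided is an assumption, and the function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail(k, n, p=0.5):
    """P(X >= k): probability of the observed count or a more extreme one."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def reject(k, n, p=0.5, alpha=0.05):
    """Decision rule from the slide: reject when the tail probability is alpha or less."""
    return upper_tail(k, n, p) <= alpha


for label, k, n in [("16 heads in 20 tosses", 16, 20),
                    ("21 females in a sample of 30", 21, 30)]:
    print(f"{label}: P = {upper_tail(k, n):.4f}, reject = {reject(k, n)}")
```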
Statistical basis for inference from a sample
Hypothesis: Proportion of females in population is 0.6.
Statistical basis for inference from a sample
Random sample of 100 has 70 females
1-Proportion Z Test (input)
p0: .6
Successes, x: 70
n: 100
Alternate Hyp: Prop ≠ p0
1-Proportion Z Test (result)
Prop ≠ .6
Z = 2.04124
P-value = .04124
P-hat = .7
n = 100
P-value .04 < .05: Reject hypothesis
Random sample of 50 has 35 females
1-Proportion Z Test (input)
p0: .6
Successes, x: 35
n: 50
Alternate Hyp: Prop ≠ p0
1-Proportion Z Test (result)
Prop ≠ .6
Z = 1.44338
P-value = .148915
P-hat = .7
n = 50
P-value .15 > .05: Fail to reject hypothesis
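The calculator output can be checked with a few lines of code. This sketch computes the one-proportion z statistic and a two-sided P-value using the normal approximation, via `statistics.NormalDist` from the Python standard library; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Return (z, two-sided P-value) for the hypothesis that the proportion is p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


for successes, n in [(70, 100), (35, 50)]:
    z, p_value = one_proportion_z_test(successes, n, 0.6)
    verdict = "reject" if p_value <= 0.05 else "fail to reject"
    print(f"{successes}/{n}: z = {z:.5f}, P-value = {p_value:.5f} -> {verdict}")
```

Both samples have the same sample proportion (.7), but only the larger sample gives enough evidence to reject the hypothesis.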
Hypothesis: Mean height of male students is 70”.
Statistical basis for inference from a sample
Random sample of 20 has mean height 70.13
T Test (input)
μ0: 70
List: list1
Freq: 1
Alternate Hyp: μ ≠ μ0
T Test (result)
μ0 = 70
t = .400
P-value = .694
df = 19
x̄ = 70.13
Sx = 1.45
n = 20
P-value .69 > .05: Fail to reject hypothesis
Random sample of 6 has mean height 72.6
T Test (input)
μ0: 70
List: list2
Freq: 1
Alternate Hyp: μ ≠ μ0
T Test (result)
μ0 = 70
t = 4.026
P-value = .010
df = 5
x̄ = 72.63
Sx = 1.60
n = 6
P-value .01 < .05: Reject hypothesis
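The t statistics on these screens can be reproduced from the summary values alone. The sketch below computes t = (x̄ - μ0) / (Sx / √n) and, instead of a P-value, compares |t| with two-sided 5% critical values copied from a standard t table (2.093 for df = 19, 2.571 for df = 5). Using critical values rather than the calculator's P-value is a simplification for illustration.

```python
from math import sqrt

# Two-sided 5% critical values from a standard t table, keyed by degrees of freedom.
T_CRITICAL_05 = {5: 2.571, 19: 2.093}


def t_statistic(sample_mean, mu0, sample_sd, n):
    """One-sample t statistic from summary statistics."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


def reject_at_05(t, df):
    """Reject when |t| exceeds the tabulated two-sided 5% critical value."""
    return abs(t) > T_CRITICAL_05[df]


for mean, sd, n in [(70.13, 1.45, 20), (72.63, 1.60, 6)]:
    t = t_statistic(mean, 70, sd, n)
    verdict = "reject" if reject_at_05(t, n - 1) else "fail to reject"
    print(f"n = {n}: t = {t:.3f}, df = {n - 1} -> {verdict}")
```

Because the inputs are rounded summary values, the first t comes out near .401 rather than the calculator's .400.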
Research articles report results in terms of p-values
Statistical basis for inference from a sample