The Practice of Statistics, 5th Edition
Starnes, Tabor, Yates, Moore
Bedford Freeman Worth Publishers
CHAPTER 1 Exploring Data
Introduction
Data Analysis:
Making Sense of Data
Hiring Discrimination Activity
An airline has just finished training 25 pilots – 15 male and 10 female – to
become captains. Unfortunately, only eight captain positions are available right
now. Airline managers announce that they will use a lottery to determine which
pilots will fill the available positions. The names of all 25 pilots will be written on
identical slips of paper. The slips will be placed in a hat, mixed thoroughly, and
drawn out one at a time until all eight captains have been identified.
A day later, managers announce the results of the lottery. Of the 8 captains
chosen, 5 are female and 3 are male. Some of the male pilots who weren’t
selected suspect that the lottery was not carried out fairly. One of these pilots
asks your statistics class for advice about whether to file a grievance with the
pilots’ union.
The key question in this possible discrimination case seems to be: Is it plausible
(believable) that these results happened just by chance? To find out, you and
your classmates will simulate the lottery process that airline managers said
they used.
1. Simulate the number of female pilots randomly selected.
a. Mix the slips of paper thoroughly. Without looking, remove 8 slips from the
cup. Count the number of female pilots selected.
b. Shuffle the cards. Without looking, remove 8 cards. Count the number of
female pilots selected.
c. Let the numbers 1 to 15 represent male pilots and the numbers 16 to 25 represent
female pilots. Generate 20 random integers from 1 to 25 with the following calculator keystrokes,
which enter randInt(1,25,20):
MATH, left arrow, 5, “1,25,20”, ), ENTER
Use the first 8 non-repeated numbers to count the number of female
pilots selected.
2. Each person in the group should perform simulation “b” three times.
Record the number of female pilots selected in each trial on the
dotplot.
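The hands-on simulation above can also be sketched in code. The following is a minimal illustration, not part of the original activity: it assumes Python's standard `random` module as a stand-in for the slips of paper, and the names `simulate_lottery`, `run_trials`, and `print_dotplot`, the seed, and the choice of 100 trials are all hypothetical.

```python
import random
from collections import Counter

MALE, FEMALE = "M", "F"
PILOTS = [MALE] * 15 + [FEMALE] * 10   # the 25 slips of paper
CAPTAINS = 8                           # slips drawn in each lottery


def simulate_lottery(rng):
    """Draw 8 pilots without replacement and return the number of females."""
    drawn = rng.sample(PILOTS, CAPTAINS)
    return drawn.count(FEMALE)


def run_trials(num_trials, seed=1):
    """Repeat the lottery and tally how often each female count occurs."""
    rng = random.Random(seed)          # fixed seed (an assumption) for repeatability
    return Counter(simulate_lottery(rng) for _ in range(num_trials))


def print_dotplot(counts):
    """Print a text dotplot: one dot per simulated lottery."""
    for k in range(CAPTAINS + 1):
        print(f"{k}: {'.' * counts[k]}")


counts = run_trials(100)
print_dotplot(counts)

# Share of simulated lotteries at least as extreme as the observed result.
share_5_plus = sum(counts[k] for k in range(5, CAPTAINS + 1)) / sum(counts.values())
print(f"Share of trials with 5 or more females: {share_5_plus:.2f}")
```

Each dot plays the role of one trial recorded on the class dotplot; more trials give a smoother picture of what a fair lottery tends to produce.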
3. Does it seem believable that airline managers carried out a fair lottery?
What advice would you give the male pilot who contacted you?
4. Would your advice change if the lottery had chosen 6 female (and 2
male) pilots? What about 7 female pilots? Explain.
[Dotplot axis: Number of Female Pilots Selected, from 0 to 8]
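As a cross-check on the class dotplot, the chance of each outcome under a fair lottery can be computed exactly. This is a different technique from the simulation the activity asks for: it counts the ways to choose the captains (the hypergeometric model), and the function names `prob_females` and `prob_at_least` are hypothetical.

```python
from math import comb

FEMALES, MALES, CAPTAINS = 10, 15, 8


def prob_females(k):
    """Probability that exactly k of the 8 captains are female in a fair lottery."""
    ways_k = comb(FEMALES, k) * comb(MALES, CAPTAINS - k)
    return ways_k / comb(FEMALES + MALES, CAPTAINS)


def prob_at_least(k):
    """Probability that k or more of the 8 captains are female."""
    return sum(prob_females(j) for j in range(k, CAPTAINS + 1))


for k in (5, 6, 7):
    print(f"P(at least {k} females) = {prob_at_least(k):.4f}")
# P(at least 5 females) = 0.1281
# P(at least 6 females) = 0.0221
# P(at least 7 females) = 0.0017
```

These values can be compared with the proportions seen in the simulated dotplot when answering questions 3 and 4.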
Learning Objectives
After this section, you should be able to:
✓ IDENTIFY the individuals and variables in a set of data
✓ CLASSIFY variables as categorical or quantitative
Data Analysis: Making Sense of Data
Data Analysis
Statistics is the science of data.
Data Analysis is the process of organizing, displaying, summarizing,
and asking questions about data.
Individuals
✓ objects described by a set of data
Variable
✓ any characteristic of an individual
Categorical Variable
✓ places an individual into
one of several groups or
categories.
Quantitative Variable
✓ takes numerical values for
which it makes sense to find
an average.
A variable generally takes on many different values.
• We are interested in how often a variable takes on each value.
Distribution
✓ tells us what values a variable takes and how often
it takes those values.
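The definition of a distribution can be made concrete with a short sketch. The MPG values from the slide's dotplot are not available in this text, so the example below instead assumes the ten "Number of Family Members" values from the U.S. Census table in this section; the variable names are hypothetical.

```python
from collections import Counter

# Number of family members for the 10 residents in the census table.
family_sizes = [2, 6, 2, 4, 3, 3, 4, 4, 1, 4]

# The distribution: each value the variable takes and how often it takes it.
distribution = Counter(family_sizes)

# Text dotplot: one dot per individual, including values that never occur.
for value in range(min(family_sizes), max(family_sizes) + 1):
    print(f"{value}: {'.' * distribution[value]}")
```

Reading down the output shows which values occur (1, 2, 3, 4, and 6) and how often, which is exactly what a dotplot displays.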
[Figure: dotplot of MPG, the variable of interest, showing its distribution]
Data Analysis
How to Explore Data
• Examine each variable by itself. Then study relationships among the variables.
• Start with a graph or graphs.
• Add numerical summaries.
US Census Data
Here is information about 10 randomly selected U.S. residents from a
recent census, imported using Fathom software.
State         Number of Family Members  Age  Gender  Marital status  Total Income ($)  Travel time to work (min)
Kentucky      2                         61   Female  Married         21,000            20
Florida       6                         27   Female  Married         21,300            20
Wisconsin     2                         27   Male    Married         30,000            5
California    4                         33   Female  Married         26,000            10
Michigan      3                         49   Female  Married         15,100            25
Virginia      3                         26   Female  Married         25,000            15
Pennsylvania  4                         44   Male    Married         43,000            10
Virginia      4                         22   Male    Single          3,000             0
California    1                         30   Male    Single          40,000            15
New York      4                         34   Female  Separated       30,000            40
a. Who are the individuals in this data
set?
b. What variables are measured? Identify
each as categorical or quantitative.
c. Describe the individual in the first row.
a. The individuals are the 10 randomly selected U.S.
residents from the recent census.
b. Categorical: state, gender, marital status. Quantitative: number of
family members, age, total income, travel time to work.
c. This individual is a 61-year-old married female from
Kentucky. She has two members in her household,
has a total income of $21,000, and travels 20 minutes
to work.
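The same analysis can be sketched in code. This is an illustrative sketch only: the column names and the `classify` helper are hypothetical, and classifying by Python type is a shortcut that works for this table but not in general (a ZIP code, for example, is numeric yet categorical because averaging it makes no sense).

```python
from collections import Counter
from statistics import mean

COLUMNS = ("state", "family_members", "age", "gender",
           "marital_status", "income", "travel_min")
ROWS = [
    ("Kentucky", 2, 61, "Female", "Married", 21000, 20),
    ("Florida", 6, 27, "Female", "Married", 21300, 20),
    ("Wisconsin", 2, 27, "Male", "Married", 30000, 5),
    ("California", 4, 33, "Female", "Married", 26000, 10),
    ("Michigan", 3, 49, "Female", "Married", 15100, 25),
    ("Virginia", 3, 26, "Female", "Married", 25000, 15),
    ("Pennsylvania", 4, 44, "Male", "Married", 43000, 10),
    ("Virginia", 4, 22, "Male", "Single", 3000, 0),
    ("California", 1, 30, "Male", "Single", 40000, 15),
    ("New York", 4, 34, "Female", "Separated", 30000, 40),
]

# Each individual (resident) is one dict; each key is a variable.
residents = [dict(zip(COLUMNS, row)) for row in ROWS]


def classify(variable):
    """Label a variable quantitative if every value is a number (a simplification)."""
    values = [r[variable] for r in residents]
    numeric = all(isinstance(v, (int, float)) for v in values)
    return "quantitative" if numeric else "categorical"


for variable in COLUMNS:
    values = [r[variable] for r in residents]
    if classify(variable) == "quantitative":
        # An average is meaningful for quantitative variables.
        print(f"{variable}: quantitative, mean = {mean(values)}")
    else:
        # Categorical variables are summarized by counts per category.
        print(f"{variable}: categorical, counts = {dict(Counter(values))}")
```

The split it reports matches answer b: state, gender, and marital status are categorical, and the remaining four variables are quantitative.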
From Data Analysis to Inference
1. Collect data from a representative sample of the population.
2. Perform data analysis, keeping probability in mind.
3. Make an inference about the population.
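The path from sample to inference can be illustrated with a small sketch. Everything here is an assumption made for illustration: the population is synthetic (made-up travel times, not real census data), and the seed, population size, and sample size are arbitrary.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed (an assumption) so the sketch is repeatable

# Synthetic population: 10,000 made-up travel times in minutes.
population = [rng.randint(0, 60) for _ in range(10_000)]

# Step 1: collect data from a random sample of the population.
sample = rng.sample(population, 100)

# Step 2: perform data analysis on the sample.
sample_mean = mean(sample)

# Step 3: use the sample result as an estimate for the whole population.
# The true mean is known here only because the population is synthetic.
population_mean = mean(population)
print(f"Sample mean: {sample_mean:.1f} min, population mean: {population_mean:.1f} min")
```

In practice the population mean is unknown, which is why probability is needed to judge how far a sample result is likely to be from it.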
Section Summary
In this section, we learned that…
✓ A dataset contains information on individuals.
✓ For each individual, data give values for one or more variables.
✓ Variables can be categorical or quantitative.
✓ The distribution of a variable describes what values it takes and
how often it takes them.
✓ Inference is the process of making a conclusion about a
population based on a sample set of data.
Data Analysis: Making Sense of Data
HOMEWORK
Page 6: 2, 4
Read Section 1.1