Agresti/Franklin Statistics, 1 of 33
Enrollment Fall 2005 (all students)
Classification   Men           Women         Total
Undergraduate    1,533 (52%)   1,416 (48%)   2,949
Professional*    17            22            39
Graduate         1,285         698           1,983
  Master         505           276           781
  Doctoral       780           422           1,202
Total²           2,835         2,136         4,971
Agresti/Franklin Statistics, 2 of 33
Geographic Origin³ (Fall 2005)
                 Undergraduates*   Graduates            Total
                                   Master    Doctoral
Texas            1,532 (51.3%)     474       482        2,488
Other U.S.       1,320 (44.2%)     157       178        1,655
International    96 (3.2%)         123       521        740
Not Designated   40 (1.3%)         27        21         88
Total            2,988             781       1,202      4,971
Agresti/Franklin Statistics, 3 of 33
Student Demographics (Fall 2005)
                     Undergrad       Grad
                     #       %       Master   %      Doctoral   %
Architecture         126     4%      74       9%     1          1%
Engineering          751     25%     36       5%     464        39%
Humanities           559     19%     16       2%     175        14%
Management           --      0%      471      60%    --         0%
Music                128     4%      123      16%    39         3%
Natural Sciences     704     23%     29       4%     346        29%
Social Sciences      693     23%     --       0%     135        11%
Interdisciplinary    21      1%      --       0%     42         3%
Continuing Studies   --      0%      32       4%     --         0%
Unclassified         6       1%      --       0%     --         0%
Total                2,988           781             1,202      100%
Agresti/Franklin Statistics, 4 of 33
Chapter 1
Statistics: The Art and Science of Learning from Data
Learn…
What Statistics Is
Why Statistics Is Important
Agresti/Franklin Statistics, 5 of 33
Chapter 1
Learn…
How Data is Collected
How Data is Used to Make Predictions
Agresti/Franklin Statistics, 7 of 33
Health Study
Does a low-carbohydrate diet result in significant weight loss?
Agresti/Franklin Statistics, 8 of 33
Market Analysis
Are people more likely to stop at a Starbucks if they’ve seen a recent TV advertisement for their coffee?
Agresti/Franklin Statistics, 9 of 33
Heart Health
Does regular aspirin intake reduce deaths from heart attacks?
Agresti/Franklin Statistics, 10 of 33
Cancer Research
Are smokers more likely than non-smokers to develop lung cancer?
Agresti/Franklin Statistics, 11 of 33
To search for answers to these questions, we…
Design experiments
Conduct surveys
Gather data
Agresti/Franklin Statistics, 12 of 33
Statistics is the art and science of:
• Designing studies
• Analyzing data
• Translating data into knowledge and understanding of the world
Agresti/Franklin Statistics, 13 of 33
Example from the National Opinion Research Center at the University of Chicago:
General Social Survey (GSS) provides data about the American public
Survey of about 2000 adult Americans
Agresti/Franklin Statistics, 16 of 33
Design
How to conduct the experiment
How to select the people for the survey
Agresti/Franklin Statistics, 17 of 33
Description
Summarize the raw data
Present the data in a useful format
Agresti/Franklin Statistics, 19 of 33
Example: Harvard Medical School study of Aspirin and Heart attacks
Study participants were divided into two groups:
• Group 1: assigned to take aspirin
• Group 2: assigned to take a placebo
Agresti/Franklin Statistics, 20 of 33
Example: Harvard Medical School study of Aspirin and Heart attacks
Results: the percentage of each group that had heart attacks during the study:
• 0.9% for those taking aspirin
• 1.7% for those taking placebo
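The two percentages can be compared by their difference and by their ratio. A minimal Python sketch, using only the two figures quoted on the slide; the variable names are illustrative, and because the group sizes are not given here, the sketch says nothing about whether the gap could be due to chance.

```python
# Heart attack rates reported on the slide, in percent.
aspirin_pct = 0.9
placebo_pct = 1.7

# Absolute difference, in percentage points (rounded to hide float noise).
difference = round(placebo_pct - aspirin_pct, 1)

# Ratio of the placebo rate to the aspirin rate.
ratio = placebo_pct / aspirin_pct

print(f"Difference: {difference} percentage points")
print(f"Placebo rate is about {ratio:.1f} times the aspirin rate")
```

Reporting both numbers is a deliberate choice: the difference shows how many heart attacks per 100 participants separate the groups, while the ratio shows the relative size of the two rates.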
Agresti/Franklin Statistics, 21 of 33
Example: Harvard Medical School study of Aspirin and Heart attacks
Can you conclude that it is beneficial for people to take aspirin regularly?
Agresti/Franklin Statistics, 23 of 33
Subjects
The entities that we measure in a study
Subjects could be individuals, schools, countries, days,…
Agresti/Franklin Statistics, 24 of 33
Population and Sample
Population: All subjects of interest
Sample: Subset of the population for whom we have data
Agresti/Franklin Statistics, 25 of 33
Geographic Origin (Fall 2005)
                 Undergraduates*   Graduates            Total
                                   Master    Doctoral
Texas            1,532 (51.3%)     474       482        2,488
Other U.S.       1,320 (44.2%)     157       178        1,655
International    96 (3.2%)         123       521        740
Not Designated   40 (1.3%)         27        21         88
Total            2,988             781       1,202      4,971
Agresti/Franklin Statistics, 26 of 33
Enrollment Fall 2005
Classification   Men           Women         Total
Undergraduate    1,533 (52%)   1,416 (48%)   2,949
Professional*    17            22            39
Graduate         1,285         698           1,983
  Master         505           276           781
  Doctoral       780           422           1,202
Total²           2,835         2,136         4,971
Agresti/Franklin Statistics, 27 of 33
Majors (Fall 2005)
                     Undergrad       Grad
                     #       %       Master   %      Doctoral   %
Architecture         126     4%      74       9%     1          1%
Engineering          751     25%     36       5%     464        39%
Humanities           559     19%     16       2%     175        14%
Management           --      0%      471      60%    --         0%
Music                128     4%      123      16%    39         3%
Natural Sciences     704     23%     29       4%     346        29%
Social Sciences      693     23%     --       0%     135        11%
Interdisciplinary    21      1%      --       0%     42         3%
Continuing Studies   --      0%      32       4%     --         0%
Unclassified         6       1%      --       0%     --         0%
Total                2,988           781             1,202      100%
Agresti/Franklin Statistics, 28 of 33
Example Format
• Picture the Scenario
• Question to Explore
• Think it Through
• Insight
• Practice the concept
Agresti/Franklin Statistics, 29 of 33
Example: The Sample and the Population for an Exit Poll
In California in 2003, a special election was held to consider whether Governor Gray Davis should be recalled from office.
An exit poll sampled 3160 of the 8 million people who voted.
Agresti/Franklin Statistics, 30 of 33
What’s the sample and the population for this exit poll?
The population was the 8 million people who voted in the election.
The sample was the 3160 voters who were interviewed in the exit poll.
Agresti/Franklin Statistics, 31 of 33
Descriptive Statistics
Methods for summarizing data
Summaries usually consist of graphs and numerical summaries of the data
Agresti/Franklin Statistics, 33 of 33
Inference
Methods of making decisions or predictions about a population based on sample information.
Agresti/Franklin Statistics, 34 of 33
Parameter and Statistic
A parameter is a numerical summary of the population
A statistic is a numerical summary of a sample taken from the population
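To make the distinction concrete, the sketch below treats the 4,971 students in the Fall 2005 enrollment table as the population. The proportion of women among all of them is a parameter; the proportion in a random sample is a statistic. The sample size of 200 and the seed are assumptions chosen only for illustration.

```python
import random

# Fall 2005 enrollment totals from the enrollment table in these slides.
MEN, WOMEN = 2835, 2136

# Treat all 4,971 enrolled students as the population (1 = woman, 0 = man).
population = [1] * WOMEN + [0] * MEN

# Parameter: proportion of women in the whole population.
parameter = sum(population) / len(population)

# Statistic: the same proportion computed from a random sample.
rng = random.Random(2005)              # arbitrary seed, for repeatability only
sample = rng.sample(population, 200)   # sample size of 200 is an assumption
statistic = sum(sample) / len(sample)

print(f"Parameter (population proportion): {parameter:.3f}")
print(f"Statistic (sample proportion):     {statistic:.3f}")
```

The parameter is fixed once the population is fixed; the statistic depends on which 200 students happen to be drawn.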
Agresti/Franklin Statistics, 35 of 33
Randomness
Simple Random Sampling: each subject in the population has the same chance of being included in that sample
Randomness is crucial to experimentation
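A simple random sample can be sketched in Python with the standard library. The example reuses the exit poll numbers from the earlier slide (3160 voters out of 8 million) and assumes, as a simplification, that voters can be labeled 0, 1, 2, … and drawn directly; a real exit poll is usually organized differently.

```python
import random

POPULATION_SIZE = 8_000_000   # people who voted in the 2003 recall election
SAMPLE_SIZE = 3160            # voters interviewed in the exit poll

# Label the voters 0 .. POPULATION_SIZE - 1 and draw without replacement.
# random.sample gives every voter the same chance of being selected.
rng = random.Random(0)        # fixed seed, only so the sketch is repeatable
sampled_ids = rng.sample(range(POPULATION_SIZE), SAMPLE_SIZE)

sampling_fraction = SAMPLE_SIZE / POPULATION_SIZE
print(f"Drew {len(sampled_ids)} distinct voter ids")
print(f"Sampling fraction: {sampling_fraction:.4%}")
```

Sampling from `range(...)` avoids building a list of 8 million labels in memory.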
Agresti/Franklin Statistics, 36 of 33
Variability
Measurements vary from person to person
Measurements vary from sample to sample
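Sample-to-sample variability can be seen by drawing several samples from the same population and comparing a summary of each. The sketch below again uses the Fall 2005 enrollment totals as the population; the number of samples (5), the sample size (100), and the seed are assumptions.

```python
import random

# Population built from the Fall 2005 enrollment totals (1 = woman, 0 = man).
population = [1] * 2136 + [0] * 2835

rng = random.Random(1)   # arbitrary seed, for repeatability only


def sample_proportion(size):
    """Proportion of women in one simple random sample of the given size."""
    sample = rng.sample(population, size)
    return sum(sample) / size


# Five independent samples of 100 students each.
proportions = [sample_proportion(100) for _ in range(5)]

for i, p in enumerate(proportions, start=1):
    print(f"Sample {i}: {p:.2f}")
print(f"Range across samples: {max(proportions) - min(proportions):.2f}")
```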
Agresti/Franklin Statistics, 37 of 33
Inferential Statistics are used:
a. To describe whether a sample has more females or males.
b. To reduce a data file to easily understood summaries.
c. To make predictions about populations using sample data.
d. To predict the sample data we will get when we know the population.
Agresti/Franklin Statistics, 38 of 33
Chapter 2
Exploring Data with Graphs and Numerical Summaries
Learn…
The Different Types of Data
The Use of Graphs to Describe Data
The Numerical Methods of Summarizing Data
Agresti/Franklin Statistics, 40 of 33
In Every Statistical Study:
Questions are posed
Characteristics are observed
Agresti/Franklin Statistics, 41 of 33
Characteristics are Variables
A Variable is any characteristic that is recorded for subjects in the study
Agresti/Franklin Statistics, 42 of 33
Variation in Data
The term “variable” highlights the fact that data values vary.
Agresti/Franklin Statistics, 43 of 33
Example: Students in a Statistics Class
Variables:
• Age
• GPA
• Major
• Smoking Status
• …
Agresti/Franklin Statistics, 44 of 33
Data values are called observations
Each observation can be:
• Quantitative
• Categorical
Agresti/Franklin Statistics, 45 of 33
Categorical Variable
Each observation belongs to one of a set of categories
Examples:
• Gender (Male or Female)
• Religious Affiliation (Catholic, Jewish, …)
• Place of residence (Apt, Condo, …)
• Belief in Life After Death (Yes or No)
Agresti/Franklin Statistics, 46 of 33
Quantitative Variable
Observations take numerical values
Examples:
• Age
• Number of siblings
• Annual Income
• Number of years of education completed
Agresti/Franklin Statistics, 47 of 33
Graphs and Numerical Summaries
Describe the main features of a variable
For Quantitative variables: key features are center and spread
For Categorical variables: key feature is the percentage in each of the categories
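For a quantitative variable, center and spread can be computed directly. The sketch below uses the 20 cereal sodium values listed on the dot plot slide in this deck and Python's standard `statistics` module; which summaries to report (mean and median for center, range and standard deviation for spread) is a choice made here for illustration.

```python
import statistics

# Sodium values for 20 cereals, as listed on the dot plot slide.
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

# Center: mean and median.
mean = statistics.mean(sodium)
median = statistics.median(sodium)

# Spread: range and sample standard deviation.
value_range = max(sodium) - min(sodium)
std_dev = statistics.stdev(sodium)

print(f"mean = {mean}, median = {median}")
print(f"range = {value_range}, standard deviation = {std_dev:.1f}")
```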
Agresti/Franklin Statistics, 48 of 33
Quantitative Variables
Discrete Quantitative Variables
and
Continuous Quantitative Variables
Agresti/Franklin Statistics, 49 of 33
Discrete
A quantitative variable is discrete if its possible values form a set of separate numbers such as 0, 1, 2, 3, …
Agresti/Franklin Statistics, 50 of 33
Examples of discrete variables
• Number of pets in a household
• Number of children in a family
• Number of foreign languages spoken
Agresti/Franklin Statistics, 51 of 33
Continuous
A quantitative variable is continuous if its possible values form an interval
Agresti/Franklin Statistics, 52 of 33
Examples of Continuous Variables
• Height
• Weight
• Age
• Amount of time it takes to complete an assignment
Agresti/Franklin Statistics, 53 of 33
Frequency Table
A method of organizing data
Lists all possible values for a variable along with the number of observations for each value
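A frequency table for a categorical variable can be sketched as follows. The counts come from the Fall 2005 enrollment table earlier in these slides; a proportion is a category's count divided by the total, and a percentage is the proportion times 100. The column layout is an illustrative choice.

```python
# Fall 2005 enrollment by classification, from the enrollment table.
counts = {"Undergraduate": 2949, "Professional": 39, "Graduate": 1983}

total = sum(counts.values())

# proportion = count / total; percentage = 100 * proportion
proportions = {category: count / total for category, count in counts.items()}

print(f"{'Classification':<15}{'Count':>7}{'Proportion':>12}{'Percent':>9}")
for category, count in counts.items():
    proportion = proportions[category]
    print(f"{category:<15}{count:>7}{proportion:>12.3f}{100 * proportion:>8.1f}%")
print(f"{'Total':<15}{total:>7}")
```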
Agresti/Franklin Statistics, 55 of 33
Example: Shark Attacks
What is the variable?
Is it categorical or quantitative?
How is the proportion for Florida calculated?
How is the % for Florida calculated?
Agresti/Franklin Statistics, 56 of 33
Example: Shark Attacks
Insights – what the data tells us about shark attacks
Agresti/Franklin Statistics, 57 of 33
Identify the following variable as categorical or quantitative:
Choice of diet (vegetarian or non-vegetarian):
a. Categorical
b. Quantitative
Agresti/Franklin Statistics, 58 of 33
Identify the following variable as categorical or quantitative:
Number of people you have known who have been elected to political office:
a. Categorical
b. Quantitative
Agresti/Franklin Statistics, 59 of 33
Identify the following variable as discrete or continuous:
The number of people in line at a box office to purchase theater tickets:
a. Continuous
b. Discrete
Agresti/Franklin Statistics, 60 of 33
Identify the following variable as discrete or continuous:
The weight of a dog:
a. Continuous
b. Discrete
Agresti/Franklin Statistics, 61 of 33
Section 2.2
How Can We Describe Data Using Graphical Summaries?
Agresti/Franklin Statistics, 62 of 33
Graphs for Categorical Data
Pie Chart: A circle having a “slice of pie” for each category
Bar Graph: A graph that displays a vertical bar for each category
Agresti/Franklin Statistics, 67 of 33
Graphs for Quantitative Data
Dot Plot: shows a dot for each observation
Stem-and-Leaf Plot: portrays the individual observations
Histogram: uses bars to portray the data
Agresti/Franklin Statistics, 69 of 33
Dotplot for Sodium in Cereals
Sodium Data:
0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180
Agresti/Franklin Statistics, 70 of 33
Stem-and-Leaf Plot for Sodium in Cereal
Sodium Data:
0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180
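A stem-and-leaf plot of the sodium data can be sketched in a few lines of Python. The split used here (stem = all digits except the last, leaf = the last digit) is an assumption; the plot on the original slide is not reproduced in this transcript.

```python
from collections import defaultdict

sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

# Assumption: stem = value without its last digit, leaf = last digit.
stems = defaultdict(list)
for value in sorted(sodium):
    stems[value // 10].append(value % 10)

# Print every stem from smallest to largest, including empty ones,
# so that gaps in the data remain visible.
for stem in range(min(stems), max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem:>2} | {leaves}")
```

Because each leaf is an actual digit of an observation, the original values can be read back from the plot.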
Agresti/Franklin Statistics, 71 of 33
Frequency Table
Sodium Data:
0 210 260 125 220 290 210 140 220 200 125 170 250 150 170 70 230 200 290 180
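For a quantitative variable with many distinct values, a frequency table usually groups the values into intervals. The sketch below counts the sodium values in intervals of width 40; the interval width is an assumption, and the row of asterisks is a rough text stand-in for a histogram bar.

```python
from collections import Counter

sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

WIDTH = 40   # interval width, chosen for illustration

# Map each value to the lower edge of its interval, then count.
bin_counts = Counter((value // WIDTH) * WIDTH for value in sodium)

# List every interval up to the maximum, including empty ones.
for lower in range(0, max(sodium) + 1, WIDTH):
    upper = lower + WIDTH - 1
    count = bin_counts.get(lower, 0)
    print(f"{lower:>3} to {upper:>3}: {count:>2}  {'*' * count}")
```

Changing `WIDTH` changes the picture: wider intervals give a more compact table but hide detail, narrower ones do the opposite.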
Agresti/Franklin Statistics, 73 of 33
Which Graph?
Dot-plot and stem-and-leaf plot:
• More useful for small data sets
• Data values are retained
Histogram:
• More useful for large data sets
• Most compact display
• More flexibility in defining intervals
Agresti/Franklin Statistics, 74 of 33
Shape of a Distribution
Overall pattern:
• Clusters?
• Outliers?
• Symmetric?
• Skewed?
• Unimodal?
• Bimodal?
Agresti/Franklin Statistics, 77 of 33
Identify the minimum and maximum sugar values:
a. 2 and 14
b. 1 and 3
c. 1 and 15
d. 0 and 16
Agresti/Franklin Statistics, 78 of 33
Consider a data set containing IQ scores for the general public:
What shape would you expect a histogram of this data set to have?
a. Symmetric
b. Skewed to the left
c. Skewed to the right
d. Bimodal