Section 2.1 – What Are the Types of...
Variables
• A variable is any characteristic that is recorded for subjects in a study. The word “variable” highlights that data values vary.
  ◦ Note that we studied several characteristics when we completed the class study. Some examples are:
    • Gender
    • Class year
    • GPA
  ◦ In general, you can think of a variable as being a survey question.
Categorical vs. Quantitative
• Categorical variables are such that each observation belongs to one of a set of categories.
  ◦ Gender?
  ◦ Others?
• Quantitative variables are such that each observation takes on a numerical value that represents a magnitude of the variable.
  ◦ What is your height?
  ◦ Others?
Descriptive Statistics
• How do we numerically summarize survey data? First ask yourself if the data comes from a categorical or quantitative variable.
  ◦ If the data comes from a categorical variable, we’ll want to describe the relative number of observations in each category (e.g., percentages).
  ◦ If the data comes from a quantitative variable, key features to describe are the center (e.g., mean/median) and spread (e.g., quartiles/standard deviation).
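The two kinds of summary can be sketched in Python as follows. This is a minimal illustration that uses only the standard library; the response values are made up, and the function names `category_percentages` and `quantitative_summary` are hypothetical. Textbooks and software differ in how they compute quartiles, so the quartile values may not match a hand calculation exactly.

```python
from collections import Counter
import statistics

# Hypothetical survey responses (made up for illustration only).
class_year = ["Freshman", "Sophomore", "Freshman", "Junior", "Senior", "Freshman"]
heights_in = [62.0, 65.5, 67.0, 68.0, 70.5, 72.0, 74.0]

def category_percentages(values):
    """Categorical variable: percentage of observations in each category."""
    counts = Counter(values)
    n = len(values)
    return {category: 100 * count / n for category, count in counts.items()}

def quantitative_summary(values):
    """Quantitative variable: center (mean, median) and spread (quartiles, SD)."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

print(category_percentages(class_year))
print(quantitative_summary(heights_in))
```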
Quantitative Variables are Discrete or Continuous
• A quantitative variable is discrete if its possible values form a set of separate numbers, such as 0, 1, 2, etc.
  ◦ How many siblings do you have?
  ◦ Others?
• A quantitative variable is continuous if its possible values form an interval.
  ◦ What is your height?
  ◦ Others?
Frequency Tables
• A frequency table is a listing of the possible values for a variable, together with the number of observations for each value.
  ◦ Count how often each value occurs.
  ◦ Use percentages or proportions (relative frequencies) to summarize the data.
• Make frequency tables for some categorical variables.
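As a rough sketch, a frequency table with counts and relative frequencies can be built in Python like this. The responses are invented for illustration, and `frequency_table` is a hypothetical helper name, not something from the class materials.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, count, proportion), most frequent value first."""
    counts = Counter(values)
    n = len(values)
    return [(value, count, count / n) for value, count in counts.most_common()]

# Hypothetical responses to "What is your class year?" (illustration only).
responses = ["Freshman", "Junior", "Freshman", "Senior", "Sophomore",
             "Freshman", "Junior", "Sophomore"]

print(f"{'Value':<10} {'Count':>5} {'Percent':>8}")
for value, count, proportion in frequency_table(responses):
    # The .1% format multiplies the proportion by 100 and appends a percent sign.
    print(f"{value:<10} {count:>5} {proportion:>8.1%}")
```

The proportions sum to 1 and the counts sum to the number of observations, which is a quick check that no response was dropped.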
Section 2.2 - How Can We Describe Data Using Graphical Summaries?
The type of graph that you should choose depends on whether the variable is quantitative or categorical.
Graphs for Categorical Data
• Pie Chart: A circle having a “slice of pie” for each category.
  ◦ The size of the slices is determined by the proportional size of each category.
  ◦ Labeling wedges with percents helps to make the information clearer.
Graphs for Categorical Data
• Bar Graph: A graph that displays a vertical bar for each category.
  ◦ The heights of the bars are determined by the proportional size of each category.
  ◦ The bars do not touch.
  ◦ If you order the bars from highest to lowest frequency, it is known as a Pareto chart.
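The Pareto ordering can be sketched in a few lines of Python. This is an illustration only: the responses are made up, the helper names `pareto_rows` and `text_bar_chart` are hypothetical, and the bars are drawn horizontally as text rather than vertically, purely to keep the example short.

```python
from collections import Counter

def pareto_rows(values):
    """Return (category, count) pairs ordered from highest to lowest count."""
    return sorted(Counter(values).items(), key=lambda item: item[1], reverse=True)

def text_bar_chart(rows, symbol="#"):
    """Draw one text bar per category; the bar length equals the count."""
    width = max(len(str(category)) for category, _ in rows)
    return "\n".join(
        f"{str(category):<{width}} | {symbol * count} ({count})"
        for category, count in rows
    )

# Hypothetical responses (illustration only).
responses = ["Junior", "Freshman", "Senior", "Freshman", "Sophomore",
             "Freshman", "Junior"]

print(text_bar_chart(pareto_rows(responses)))
```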
Pie Chart vs. Bar Chart
• A pie chart gives a quick picture of parts of the whole.
• It is difficult to compare small differences in a pie chart.
• Bar graphs are good for comparing two groups on a particular variable.
Graphs for Quantitative Data
• Dot Plot: Shows a dot for each observation. To construct:
  ◦ Label a horizontal line for the variable that you are measuring and mark regular values of the variable on it.
  ◦ For each observation, draw a dot above the line corresponding to the value of the observation.
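The two construction steps can be mimicked with a small text-based sketch in Python. It assumes single-digit integer observations so that the columns line up; the sibling counts are invented, and `text_dot_plot` is a hypothetical helper name.

```python
from collections import Counter

def text_dot_plot(values):
    """Stack one '*' per observation above an axis of integer values."""
    counts = Counter(values)
    axis = list(range(min(values), max(values) + 1))
    tallest = max(counts.values())
    lines = []
    # Build the rows from the top of the tallest stack down to the axis.
    for level in range(tallest, 0, -1):
        row = " ".join("*" if counts[v] >= level else " " for v in axis)
        lines.append(row.rstrip())
    # The last line is the labeled horizontal axis.
    lines.append(" ".join(str(v) for v in axis))
    return "\n".join(lines)

# Hypothetical answers to "How many siblings do you have?" (illustration only).
siblings = [0, 1, 1, 2, 1, 3, 2, 0, 1, 5]

print(text_dot_plot(siblings))
```

Values with no observations (here, 4) still appear on the axis, so gaps in the data stay visible.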
Pros and Cons of a Dotplot
• Gives a quick picture of the data.
• Difficult to graph precisely; better with discrete than continuous data.
Graphs for Quantitative Data
• Stem-and-Leaf Plot: Portrays the individual observations on branches.
  ◦ Each observation is represented by a stem and a leaf. Typically all of the digits except the last are the stem, and the last one is the leaf.
  ◦ Example: GPA
    • What is the range of answers?
    • What values should the stems be?
    • How about the leaves?
GPA Stem-and-Leaf Plot
Variable: GPA
Decimal point is at the colon. Leaf unit = 0.1
2 : 00001444
2 : 556677788899
3 : 0111222344444
3 : 6888
4 : 00
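As a sketch of how such a plot is put together, the Python below rebuilds the GPA display from its individual values. The values are read back off the plot itself and stored in tenths of a grade point (3.4 becomes 34), an assumption made here to avoid floating-point rounding; `split_stem_and_leaf` is a hypothetical helper name. Each stem is split in two, with leaves 0-4 on the first row and 5-9 on the second.

```python
from collections import defaultdict

def split_stem_and_leaf(tenths):
    """Build stem-and-leaf rows, splitting each stem into leaves 0-4 and 5-9."""
    rows = defaultdict(list)
    for value in sorted(tenths):
        stem, leaf = divmod(value, 10)  # 34 -> stem 3, leaf 4
        # The key records the stem and whether the leaf is in the upper half.
        rows[(stem, leaf >= 5)].append(str(leaf))
    return [f"{stem} : {''.join(leaves)}"
            for (stem, _), leaves in sorted(rows.items())]

# GPA values read off the plot, in tenths of a grade point.
gpa_tenths = (
    [20] * 4 + [21] + [24] * 3
    + [25] * 2 + [26] * 2 + [27] * 3 + [28] * 3 + [29] * 2
    + [30] + [31] * 3 + [32] * 3 + [33] + [34] * 5
    + [36] + [38] * 3
    + [40] * 2
)

for line in split_stem_and_leaf(gpa_tenths):
    print(line)
```

A half-stem with no observations is omitted from the output in this sketch; a hand-drawn plot would normally keep the empty row.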
Pros and Cons of a Stem-and-Leaf
• The leaves must be ordered, so the plot is difficult to sketch for large data sets.
• You see a value for every single observation.
• You decide how to distribute the digits and possibly split the stems.
• Nice to do side-by-side stem-and-leaf plots to compare two groups.
Graphs for Quantitative Data
• Histogram: Uses bars to portray the frequencies or relative frequencies of the data.
  ◦ Divide the range of values into equally sized sections.
  ◦ For each observation, place a tally mark in the corresponding section.
  ◦ Use the tally marks to set the heights of the bars in your histogram.
    • Use either counts or proportions.
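The tallying steps above might look like this in Python. It is a minimal sketch: the heights are invented, the starting point and interval width are arbitrary choices, and `histogram_counts` is a hypothetical helper name. Each interval includes its left endpoint and excludes its right endpoint.

```python
def histogram_counts(values, start, width, bins):
    """Tally observations into `bins` equal-width intervals [left, left + width)."""
    counts = [0] * bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < bins:  # observations outside the range are ignored
            counts[index] += 1
    return counts

# Hypothetical heights in inches (illustration only).
heights = [61, 63, 64, 66, 66, 67, 68, 68, 69, 70, 71, 73, 74, 76]
start, width, bins = 60, 4, 5

counts = histogram_counts(heights, start, width, bins)
proportions = [count / len(heights) for count in counts]

for i, (count, proportion) in enumerate(zip(counts, proportions)):
    left = start + i * width
    # The bar length is the count; the proportion is printed alongside it.
    print(f"[{left}, {left + width}) {'#' * count:<6} {count} ({proportion:.2f})")
```

Either `counts` or `proportions` can set the bar heights; the shape of the histogram is the same in both cases.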
Pros and Cons of a Histogram
• Changing the width of the bars can give you dramatically different graphs.
• Obscures individual data values.
• Helps to clarify the shape of the data.
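To see how much the bar width matters, the sketch below groups the same data with two different widths. The exam scores are made up for illustration, and `binned_counts` is a hypothetical helper name; intervals with no observations are simply left out of the output.

```python
from collections import Counter

def binned_counts(values, width):
    """Map each interval's left edge to the number of observations in it."""
    tallies = Counter((value // width) * width for value in values)
    return dict(sorted(tallies.items()))

# Hypothetical exam scores (illustration only).
scores = [52, 55, 58, 61, 64, 71, 72, 73, 75, 78, 79, 84, 88, 91, 95]

for width in (5, 20):
    print(f"bar width = {width}")
    for left, count in binned_counts(scores, width).items():
        print(f"  [{left:>3}, {left + width:>3}) {'#' * count}")
```

With a width of 5 the display is spread over many short bars, while a width of 20 collapses everything into three bars with one dominant peak.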
Which Graph Should You Choose to Graph Quantitative Data?
• Dot plot and stem-and-leaf plot:
  ◦ More useful for small data sets
  ◦ Data values are retained
• Histogram:
  ◦ More useful for large data sets
  ◦ Most compact display
  ◦ More flexibility in defining intervals
Recap of Types of Graphs
• Categorical
  ◦ Pie Chart
  ◦ Bar Graph
• Quantitative
  ◦ Dot Plot
  ◦ Stem-and-Leaf Plot
  ◦ Histogram
Shape of a Distribution of Quantitative Variables
• Overall Pattern of the Graph:
  ◦ Clusters?
  ◦ Outliers?
  ◦ Symmetric?
  ◦ Skewed?
  ◦ Unimodal?
  ◦ Bimodal?