Study Guide for Exams
Concepts:
A. Course objectives
B. The process and philosophy of science (observations, questions, hypotheses, theories, prediction, if-
then, correlational and experimental tests, data, facts, scientific “proof”, relationship of ideas and data -
ID, limitations of science)
C. Statistical basics (variables [measured, derived, dependent, independent, response, predictor], data,
case, observation)
data collection (population, sample, error, random, independence, sample size)
measurement scales and kinds of variables (nominal/categorical, ranked/ordinal, interval/ratio,
continuous, discrete)
practical: identifying variables and measurement scales
D. Frequency distributions (histogram)
E. Description of data
central tendency (mode, median, mean, weighted mean)
dispersion (maximum, minimum, range, interquartile range, sum-of-squares, standard deviation,
variance, coefficient of variation)
parameters and statistics
reporting sample means (necessity of measure of dispersion, error bars)
calculating descriptive statistics with a calculator and with SYSTAT (raw data file, frequency data
file)
F. Goodness-of-fit
G. Probability distributions
Discrete (binomial, Poisson)
a. binomial (mutually exclusive [either/or] categories; defined by p, n)
b. Poisson (rare and random events; defined by mean)
c. calculating terms of binomial and Poisson distributions
d. comparison of observed and expected distributions (influence of sample size)
H. SYSTAT
File Menu (New, Open, Save, Save As, Print, Exit)
Edit Menu (Undo, Cut, Copy, Paste, Copy Graph, Delete, Options)
Data Menu (Variable Properties, Transform, By Groups, Select Cases, Case Weighting By
Frequency)
Utilities Menu (Probability Calculator)
Graph Menu (Bar, Dot, Histogram, Box, Scatterplot)
Analyze Menu (One-Way Frequency Tables, Basic Statistics, Tables)
_______________________________________
Test-Taking Strategies For Multiple-Choice Exams: When taking multiple-choice exams, one often hears
the claim, “It’s better to stay with your first choice; don’t go back and change your answers.” Is there any
evidence for this claim? Surprisingly, psychological research has shown that, overall, about 50% of changes
go from wrong to right; 25% go from right to wrong; and 25% go from wrong to wrong. Women change their
answers more often than men, and women are more likely than men to go from right to wrong. Thus, there is
no evidence to support the claim but no study can actually prove the claim wrong either because we don’t
know how those who stay with their first answers fare.
The following questions have been taken from old exams in Biostats classes over the
last 10-15 years. Because Biostats changes to some degree each semester, some
questions may address items not covered during the latest semester. In addition, new
material covered in the latest Biostats class may not have representative questions
below.
Exam 1 Questions
1. The mode, median, and mean should be very nearly equivalent in this type of frequency distribution.
(a) binomial; (b) skewed; (c) normal; (d) Poisson; (e) probability
2. A frequency distribution is also known as a _____. (a) histogram; (b) central tendency; (c) table; (d)
plot; (e) parametric distribution
3. The shape of a Poisson distribution is determined by the _____. (a) mean, standard deviation; (b) p, n;
(c) geometric mean, SD, n; (d) mode, coefficient of variation; (e) mean
4. Which is the quickest (i.e., fewest steps) SYSTAT method of determining the sample size of several
categories? (a) scatterplot; (b) transform; (c) select cases; (d) Tables; (e) K-S test
5. This is the sum-of-squares divided by the sample size. (a) mode; (b) mean; (c) range; (d) median; (e)
variance
6. Which is not a “statistic” (term used in the narrow, technical sense)? (a) x̄; (b) s; (c) SD; (d) ; (e)
sample variance
7. This is the probability of obtaining a “4” with one roll of a single die. (a) 1 in 6; (b) 1 in 4; (c) 2 in 3;
(d) 1 in 8; (e) 1 in 2
8. This is an important measure of data dispersion. (a) mean; (b) variance; (c) mode; (d) median; (e)
Goodness-of-Fit
9. On a SYSTAT dot graph, these graphics portray variation around the mean. (a) parameters; (b)
interquartile plots; (c) z-scores; (d) descriptive statistics; (e) error bars
10. The position of a team in major league baseball standings is measured on this scale. (a) categorical; (b)
ordinal; (c) ratio; (d) interval; (e) continuous
11. This is the square root of the sum-of-squares divided by the sample size. (a) standard deviation; (b)
mean; (c) range; (d) median; (e) variance
12. This standardized expression permits one to directly compare the relative amount of variation
associated with two or more means of one variable. (a) average deviation; (b) variance; (c) median; (d)
coefficient of variation; (e) z-score
13. This is a measure of dispersion. (a) mean; (b) variance; (c) mode; (d) median; (e) regression plot
14. This is a measure of central tendency. (a) variance; (b) mode; (c) standard deviation; (d) range; (e) z-
score
15. In an interval scale of measurement, values are neither quantitative nor ranked, and there is no
mathematical or value relationship among them. (a) true; (b) false
16. The only time scientists will use a theory is if they know for sure that the theory is correct. (a) true; (b)
false
17. The basic reason scientific knowledge has advanced so remarkably through the years is because many
dedicated scientists have proved thousands of hypotheses and theories (a) true; (b) false
18. A “statistic” is _____. (a) a numerical property of a sample; (b) a numerical property of a population;
(c) a normal distribution; (d) a single case; (e) a single observation
19. Which is not an “essential descriptive statistic” as used in class? (a) weighted mode; (b) sample size;
(c) average; (d) mean; (e) standard deviation
20. The temperature of a human body in Celsius should be measured on a ratio scale. (a) true; (b) false
21. The various species contained within a particular genus of birds should be measured on a ranked scale.
(a) true; (b) false
22. Assuming one had the proper instrument, which could be measured as a continuous variable? (a) size
ranking; (b) number of RBCs; (c) frequency of predation events; (d) species; (e) color
23. This is the probability of obtaining two heads with one flip of two coins. (a) 0.50; (b) 0.25; (c) 0.10;
(d) 1.0; (e) 0.75
24. The term “random” refers to a condition where the value of one case does not affect the value of other
cases. (a) true; (b) false
25. These are the measured values of variables for individual cases. (a) tables; (b) observations; (c) data;
(d) error; (e) expected values
26. This is a very important measure of central tendency especially for continuous and many discrete
variables. (a) variance; (b) standard deviation; (c) range; (d) mean; (e) sum-of-squares
27. Scientists will only use a theory if they know for sure that the theory is absolutely true. (a) true; (b)
false
28. This term represents all of the individuals of a specified part of the statistical universe. (a) sample; (b)
range; (c) population; (d) mode; (e) frequency distribution
29. This is the difference between the maximum and minimum in a data set. (a) median; (b) variance; (c)
range; (d) geometric mean; (e) coefficient of variation
30. An observed frequency distribution of a given type will more closely conform to a theoretical
frequency distribution of the same type under this condition. (a) decreased N; (b) increased N; (c)
decreased range; (d) increased mean; (e) decreased mean
31. In this type of frequency distribution, the value of the mean is generally very low. (a) Poisson; (b)
discrete; (c) bimodal; (d) normal; (e) skewed
32. Statistical “error” often refers to the level of confidence that one has regarding how well the statistics
of _____ estimate the statistics of _____. (a) samples, populations; (b) populations, samples; (c)
parametrics, nonparametrics; (d) precision, accuracy; (e) accuracy, precision.
33. On a SYSTAT data sheet, individual variables are usually found in ______ whereas individual cases
are found in _____. (a) rows, columns; (b) rows, rows; (c) columns, columns; (d) columns, rows
34. An error bar usually illustrates _____. (a) a measure of central tendency; (b) a measure of dispersion;
(c) sample size; (d) a measure of ; (e) the coefficient of variation
35. In this scale of measurement, values are neither quantitative nor ranked, and there is no mathematical
or value relationship among them. (a) ordinal; (b) interval; (c) continuous; (d) categorical; (e) ratio
36. The volume of blood (ml) is measured on this scale. (a) categorical; (b) ordinal; (c) ratio; (d) interval;
(e) continuous
37. The species of tree is measured on this scale. (a) categorical; (b) ordinal; (c) ratio; (d) interval; (e)
continuous
38. The age of a viral particle is measured on this scale. (a) categorical; (b) ordinal; (c) ratio; (d) interval;
(e) continuous
Identify the variables and respective scales of measurement
39. It is thought that endotherms from northern areas have shorter appendages than endotherms from
southern areas. Test this hypothesis using the following data on wing lengths (mm) of house sparrows.
northern: 120, 113, 125, 118 variables_________________________________
southern: 116, 117, 121, 114 respective scales___________________________
40. Use the following data on number of ladybird beetles collected from sunflowers in different seasons to
test the hypothesis that sex ratio of beetles is unrelated to season.
          Spring  Summer  Fall
male      163     135     71     variables___________________________________
female    86      77      40     respective scales_____________________________
41. A mammalogist was interested in possible relationships of prey size and predator size. Use the data
below on otters and their prey to determine if there is a relationship between predator and prey masses
(g).
otter wgt. - 1500, 500, 750, 1000; variables____________________________
prey wgt. - 128, 190, 75, 125; respective scales______________________
42. Use the following data on ranked scores on a keyboarding skills test to test the hypothesis that high
school training improves keyboarding skills of college students.
with HS training: 44, 48, 36, 32, 51; variables____________________________
without training: 32, 40, 44, 44, 34; respective scales______________________
43. In a study of snake hibernation, fifteen pythons of similar size and age were randomly assigned to three
groups. One group was treated with drug A, one group with drug B, and the third group was not
treated. Their systolic blood pressure (mmHg) was measured 24 hours after administration of the
treatments. Does either drug affect blood pressure?
control: 130, 135, 132, 128, 130
drug A: 118, 120, 125, 119, 121 variables_______________________________
drug B: 105, 110, 98, 106, 105 respective scales_________________________
44. Use the following data on mean adult body weight (mg) and larval density (no./mm3) of fruit flies to
determine if there is a functional relationship between adult body mass and the density at which it was
reared.
Density 1.000 3.000 5.000 6.000 variables _____________________________
Weight 1.356 1.356 1.284 1.252 respective scales________________________
45. The following data are frequency of individuals with different hair colors according to sex. According
to these data, is human hair color dependent on sex?
sex black brown blond red
male 32 43 16 9 variables_______________________________
female 55 65 64 16 respective scales_________________________
46. Use the following data on human blood-clotting times (min.) of individuals given one of two different
drugs to test the hypothesis that drug B induces clotting at a faster rate.
Drug B: 8.8, 8.4, 7.9, 8.7; variables__________________________________
Drug G: 9.9, 9.0, 11.1, 9.6; respective scales____________________________
47. Calculate the essential descriptive statistics of systolic blood pressures (mmHg) in the following sample
of HU students: 144, 136, 163, 117, 133, 141, 152, 140, 140, 138, 127, 120, 161, 124, 137.
mean = _______ standard deviation = _______ sample size = _______
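As a worked illustration of the calculation asked for above, the following minimal Python sketch computes the three essential descriptive statistics for this sample. It assumes the sample standard deviation with an n - 1 denominator (what `statistics.stdev` computes); if the course convention divides the sum-of-squares by n instead, `statistics.pstdev` would be the matching call. The variable names are illustrative only.

```python
import statistics

# Systolic blood pressures (mmHg) from the sample of HU students above.
pressures = [144, 136, 163, 117, 133, 141, 152, 140, 140,
             138, 127, 120, 161, 124, 137]

n = len(pressures)                    # sample size
mean = statistics.mean(pressures)     # arithmetic mean
sd = statistics.stdev(pressures)      # sample SD (assumption: n - 1 denominator)

print(f"mean = {mean:.1f}  SD = {sd:.2f}  n = {n}")
```

The same three lines of calculation apply to any raw data list; only the list contents change.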
48. AmIV is a disease caused by a certain pathogenic amoeba that invades human red blood cells. The rate
and pattern of infection is known for some populations but unknown for most. A medical
parasitologist had some knowledge that infection in the Mexican population was uncommon, but he
did not know whether infection was random with respect to which individuals were infected. IF the
infection rate was in fact random, THEN the observed distribution of infection rates should closely
correspond to a theoretical distribution that describes the occurrence of a rare and random event.
Complete the following table showing the observed number and the expected number (to 3 decimal
places) of humans infected with amoebas in a sample collected in Mexico City. Based on your
analysis of these data, answer the questions and support your answers.
No. amoebas      Observed        Expected
per 1 × 10^6     no. infected    no. infected
cells            humans          humans
0                317             ______
1                41              ______
2                1               ______
3                12              ______
4                7               ______
5                0               ______
Total            _____           Total ______
Based on your analysis of the data, was the infection rate rare? Why?
Based on your analysis of the data, was the infection rate random? Why?
49. Calculate the probability of having a total of four boys and one girl in a family of five children.
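A question of this kind can be approached with the binomial formula. The sketch below is one way to evaluate a single binomial term in Python; the function name `binomial_probability` is hypothetical, and the value p = 0.5 (boys and girls equally likely, births independent) is an assumption that the question does not state.

```python
from math import comb

def binomial_probability(k: int, n: int, p: float) -> float:
    """Probability of exactly k 'successes' in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumption: each child is a boy with probability 0.5, independently.
p_four_boys = binomial_probability(k=4, n=5, p=0.5)
print(f"P(4 boys, 1 girl) = {p_four_boys:.5f}")
```

Changing `k`, `n`, or `p` gives any other term of the same distribution.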
50. The following sample contains data for the 5th digit claw length (mm) for two species of Asiatic bats.
Road-winged bat - 0.45, 0.39, 0.39, 0.42, 0.23
Flap-winged bat - 0.69, 0.99, 0.85, 0.98, 0.87, 0.95, 0.92, 0.81
Calculate the mean, standard deviation, and coefficient of variation for claw length in each bat
species.
Species Mean SD CV___
R-W bat _____ _______ ______
F-W bat _____ _______ ______
In which species is claw length relatively more variable? _______ Support your answer __________
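The comparison of relative variability can be sketched in Python as follows. The helper `coefficient_of_variation` is a hypothetical name, and the sketch assumes the CV is expressed as a percentage (100 × SD / mean) with the SD computed using an n - 1 denominator.

```python
import statistics

def coefficient_of_variation(values):
    """CV as a percentage: 100 * SD / mean (SD uses the n - 1 denominator)."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

road_winged = [0.45, 0.39, 0.39, 0.42, 0.23]
flap_winged = [0.69, 0.99, 0.85, 0.98, 0.87, 0.95, 0.92, 0.81]

for name, sample in (("R-W bat", road_winged), ("F-W bat", flap_winged)):
    print(f"{name}: mean = {statistics.mean(sample):.3f}  "
          f"SD = {statistics.stdev(sample):.3f}  "
          f"CV = {coefficient_of_variation(sample):.1f}%")
```

Because the CV scales the SD by the mean, the species with the larger CV is the relatively more variable one even if its SD is smaller in absolute terms.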
51. A particularly severe strain of brucellosis was detected in low frequencies in early 21st-century
Rwandan cattle herds. A government veterinarian needed to know whether the disease was being
spread by a correctable human ranching practice or if the disease was just occurring randomly among
herds. Data collected on frequency of infection in herds from throughout the country are shown below.
Determine if there is evidence that the disease was being spread by anything other than random
chance.
Choose the appropriate probability distribution formula and enter the value(s) of the variables
necessary for its solution. Show formula here ________________________________
Complete the following table showing the observed number and the expected number of cattle
infected with brucellosis in the following sample.
No. infected Observed Expected
cattle per no. no.
herd herds herds__
0 523 ______
1 71 ______
2 22 ______
3 11 ______
4 5 ______
Total _____ Total ______
Based on your analysis of the data, do you think there is evidence that the disease was being
spread by poor ranching practices? Support your answer.
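For a rare-and-random-event question like this one, the expected column can be generated from a Poisson distribution whose mean is estimated from the observed data. The following is a minimal sketch under that assumption; the names `observed`, `poisson_probability`, and `expected` are illustrative, and the judgment about whether observed and expected counts agree is left to the reader.

```python
import math

# Observed number of herds with 0, 1, 2, 3, 4 infected cattle (table above).
observed = {0: 523, 1: 71, 2: 22, 3: 11, 4: 5}

total_herds = sum(observed.values())
total_infected = sum(k * herds for k, herds in observed.items())
mean_per_herd = total_infected / total_herds   # the single Poisson parameter

def poisson_probability(k: int, mean: float) -> float:
    """P(X = k) for a Poisson distribution with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)

# Expected number of herds in each class if infection were rare and random.
expected = {k: total_herds * poisson_probability(k, mean_per_herd)
            for k in observed}

for k in observed:
    print(f"{k}  observed = {observed[k]:4d}  expected = {expected[k]:8.3f}")
```

The expected counts sum to slightly less than the total number of herds because the Poisson distribution also assigns a small probability to classes above 4.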
52. Calculate the essential descriptive statistics of systolic blood pressures (mmHg) in the following sample
of Japanese fishermen: 134, 146, 143, 117, 123, 124, 142, 147, 130, 138, 123, 120, 131, 134
mean = _______ SD = _______ n = _______
53. Calculate the essential descriptive statistics of the number of fluorescent worms per petri dish in the
following sample:
no. of worms    no. of dishes
0               3
10              41
20              53
30              52
40              92
50              114
60              292
70              7
mean = _______ SD = _______ n = _______
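Data presented as a frequency table must be weighted by frequency before the statistics are calculated (the role played by Case Weighting By Frequency in SYSTAT). The sketch below shows the equivalent arithmetic in Python; the variable names are illustrative, and the SD is assumed to use an n - 1 denominator.

```python
import math

# (number of worms per dish, number of dishes) pairs from the table above.
frequency_table = [(0, 3), (10, 41), (20, 53), (30, 52),
                   (40, 92), (50, 114), (60, 292), (70, 7)]

n = sum(f for _, f in frequency_table)                 # total dishes
mean = sum(x * f for x, f in frequency_table) / n      # frequency-weighted mean
sum_of_squares = sum(f * (x - mean) ** 2 for x, f in frequency_table)
sd = math.sqrt(sum_of_squares / (n - 1))               # assumption: n - 1 denominator

print(f"mean = {mean:.2f}  SD = {sd:.2f}  n = {n}")
```

Note that n is the total number of dishes, not the number of rows in the table; treating the eight rows as eight cases is a common mistake with frequency data.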
54. Calculate the probability of having a total of four boys and four girls in a litter of eight gerbils. Choose
the appropriate probability distribution formula and enter the value(s) of the variables necessary for
solution.
Show formula here ____________________________________
Answer = ________________
55. The following data are maximum sprint speeds (m/sec) of gravid females measured from two species
of captive lizards.
Coal skink - 0.45, 0.39, 0.59, 0.55, 0.23, 0.51, 0.28, 0.47, 0.34, 0.50, 0.65
Broad headed skink - 0.69, 0.99, 0.85, 0.98, 0.87, 0.95, 0.92, 0.81, 0.79, 1.09, 0.65
a. Calculate the mean, SD, and CV for sprint speed in each laboratory sample.
Laboratory Standard Coefficient
sample Mean deviation of variation
Coal skink _____ _______ ______
B-H skink _____ _______ ______
b. In which species is sprint speed relatively more variable? Support your answer.
Use the file GINMOVE.SYD which contains data on a population of spiny softshell turtles (Apalone
spinifera) inhabiting Gin Creek in Searcy, AR. Data were collected in 1995. Variables are: no = turtle
number, sex$ = sex of turtle (M, F), date = date (year [yy], month [mm], day [dd]), time = 24 hr time,
tb = body temperature (°C), tamb = ambient temperature (°C), tair = air temperature (°C), twat = water
temperature (°C), wlev = water level, clar$ = water clarity (clear, turbid, muddy), sky = sky condition,
wind = wind speed, hab$ = habitat (P, pools; R, riffles; Q, backwater), beh = behavior (buried,
basking, moving ), loc = location along the stream (m), loc2 = location 24 hr later (m), dist = distance
moved the previous 24 hours (m). N = 973
56. Calculate the descriptive statistics of body temperature for all female cases in May.
mean = _______ SD = _______ n = _______
57. In how many cases were turtles found in pool habitat? _______
Prepare graphs for the following (hognose.syz)
58. Daily movements associated with courtship range between 50-100 m. Construct a single graph
that illustrates the descriptive statistics of distance moved the previous 24 hr separately for males
and females when they are courting.
59. Construct a graph that illustrates the frequency of male residents located in May over all years by
date.
60. Graph the cases of Tb against cases of Tair when Tair exceeds 10 °C.
61. Construct a single graph that illustrates separately for active and inactive snakes the descriptive
statistics of body temperature when air temperature is below 30 °C.
Exam 2 Updated 9 February 2016
In addition to being comprehensive, the following new concepts are covered:
importance of normal distribution in statistics
properties of normal distribution (defined by mean and standard deviation)
areas of normal curve
standard normal distribution (z-scores)
testing for normality (probability plot and Kolmogorov-Smirnov test); skewness
test statistic
parametric and non-parametric tests
data transformation (logarithmic, square root, arcsine)
statistical inference
major categories of statistical inference
sampling distribution
central limit theorem
Student’s t-distribution
standard error of mean
95% confidence limits
reporting sample means
graphical error bars
hypothesis testing
research hypothesis
null hypothesis
test statistic
critical value
alpha level
one and two-tailed tests
type I & II errors
relationship of type I & II errors
power of a test
significance level
statistical significance
parametric and nonparametric tests
assumptions of a test
Bartlett’s test
Levene’s test
robust test
testing for differences
independent samples t-test
paired samples t-test
repeated measures tests
Mann-Whitney test
Wilcoxon test
graphical analysis of differences between means
Exam 2 Questions
1. In a standard normal distribution, a z-score of _____ on each side of the mean encloses 95% of the
cases. (a) 0.68; (b) 1.96; (c) 1.0; (d) 0.05; (e) 0.0
2. This is the most common data transformation used in biology. (a) square root; (b) Lilliefors; (c)
arcsine; (d) logarithm; (e) interquartile
3. In SYSTAT, this is the preferred quantitative method for students to determine if data are
normally distributed. (a) histogram; (b) Tables; (c) dot graph; (d) probability plot; (e)
Kolmogorov-Smirnov test
4. Statistical tests of this type are very powerful but have relatively rigid assumptions that must be
met. (a) parametric; (b) nonparametric; (c) normality; (d) distribution-free; (e) interval plot
5. The specific shape of a normal distribution is determined by these. (a) mean, sample size; (b)
mean, median; (c) mean, standard deviation; (d) mean; (e) variance, sample size
6. Data that are influenced by many small and unrelated random effects are frequently normally
distributed. As a consequence, normally distributed data are widespread and common in nature.
(a) true; (b) false
Use the attached SND Table (1-tailed, showing proportion included) to determine the percent of the area of
the standard normal distribution that is either included or excluded by each of the following z-scores.
Each question asks for the percent in either one or two tails of the distribution.
7. 1.00 (one-tail, included) (a) 99.0; (b) 95.0; (c) 68.0; (d) 47.5; (e) 34.0; (ab) 5.0; (ac) 2.5; (ad) 1.0
8. 1.96 (two-tails, included) (a) 99.0; (b) 95.0; (c) 68.0; (d) 47.5; (e) 34.0; (ab) 5.0; (ac) 2.5; (ad) 1.0
9. 1.96 (one-tail, excluded) (a) 99.0; (b) 95.0; (c) 68.0; (d) 47.5; (e) 34.0; (ab) 5.0; (ac) 2.5; (ad) 1.0
10. 2.58 (two-tails, excluded) (a) 99.0; (b) 95.0; (c) 68.0; (d) 47.5; (e) 34.0; (ab) 5.0; (ac) 2.5; (ad) 1.0
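When no printed SND table is at hand, the same areas can be obtained from the error function. The sketch below is one way to do this in Python; the function names are hypothetical, and it relies on the identity that the area between the mean and z equals 0.5 × erf(z / √2).

```python
import math

def area_mean_to_z(z: float) -> float:
    """Proportion of the standard normal curve between the mean and z (one tail)."""
    return 0.5 * math.erf(z / math.sqrt(2))

def percent_area(z: float, tails: int, included: bool) -> float:
    """Percent of the total area included or excluded by +/- z in one or two tails."""
    one_tail = area_mean_to_z(z) if included else 0.5 - area_mean_to_z(z)
    return 100 * tails * one_tail

for z, tails, included in [(1.00, 1, True), (1.96, 2, True),
                           (1.96, 1, False), (2.58, 2, False)]:
    label = "included" if included else "excluded"
    print(f"z = {z:.2f}, {tails}-tail, {label}: {percent_area(z, tails, included):.1f}%")
```

Each tail holds half of the total area, so the excluded portion of one tail is 0.5 minus the included portion, and two-tailed values are simply double the one-tailed ones.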
11. Statistical tests of this type are not very powerful but have relatively few assumptions. (a)
parametric; (b) nonparametric; (c) tests based on the normal distribution; (d) cross-tabulation; (e)
sum-of-squares
12. When using SYSTAT to quantitatively test whether data in a frequency distribution table are
normally distributed, this function from the data menu must be enabled. (a) select cases; (b)
transform; (c) case weighting by frequency; (d) by groups; (e) variable properties
13. Use the attached Table (1-tailed, showing proportion included) to determine the percent (to the
nearest 0.1) of the area of the standard normal distribution which is enclosed by each of the
following z-scores. Each question asks for the percent in either one or two tails of the distribution.
a. 0.34 (one-tail) = __________% b. 1.96 (two-tails) = __________%
14. Use the attached Table (1-tailed, showing proportion included) to determine the percent (to the
nearest 0.1) of the area of the standard normal distribution which is excluded by each of the
following z-scores. Each question asks for the percent in either one or two tails of the distribution.
a. 1.96 (one-tail) = __________% b. 2.58 (two-tails) = __________%
Use HOGNOSE.SYD - data on radiotracked hognose snakes (Heterodon platirhinos) recorded near
Riverside Park north of Searcy, AR. Variables are: yr = year, date (month, day), no = snake ID, sex$ = sex,
rc = recapture status, xloc = x-axis location, yloc = y-axis location, hab1$ = habitat1, hab2$ = habitat2,
grdcov$ = groundcover, act$ = activity (inactive or active), beh$ = behavior, dist = distance moved in the
last 24 hr (m), tb = body temperature, tair = air temperature, status$ = residency status (resident or
nonresident). N = 783
15. For resident snakes, quantitatively test variable DIST to determine if it is normally distributed.
Is the distribution normal? _________ sample size = ______
Support your answer ________________________________________________
16. Which is false regarding data that are suitable for parametric tests? (a) sampled data are
independent of each other; (b) data are randomly sampled; (c) sampled data are normally
distributed; (d) sampled data are measured on an ordinal scale; (e) sampled data are measurements
of continuous variables
17. This is the mathematical relationship between the standard deviation and the standard error of the
mean.
18. This frequency distribution could be described as a normal distribution whose shape varies with
sample size.
19. H0: σa² = σb² is the proper null hypothesis for this statistical test.
20. This is the result when a true null hypothesis is rejected.
21. This term describes a general property of statistical tests in which the probability of rejecting a
false null hypothesis is relatively high.
22. This process is an example of statistical inference.
23. This is the value of the alpha level that most scientists use when testing a null hypothesis.
24. This is an example of a population parameter.
25. This is a general hypothesis of no difference or no relationship.
26. The goal of this statistical test is to detect differences among variances of normally distributed
data sets.
27. The goal of this statistical test is to detect differences between the means of two separate groups.
Data are not normally distributed nor are the group variances equal.
28. This term describes a general property of statistical tests that are relatively insensitive to
deviations from their assumptions.
29. Statistical tests of this type are very powerful but have relatively rigid assumptions.
30. H0: μa ≤ μb is a proper null hypothesis for this non-parametric statistical test.
31. How many asterisks indicate a significance level of P<0.01?
32. This calculated value is used in conjunction with a statistical table to determine the probability of a
null hypothesis being true.
33. If the research hypothesis is A>B, the null hypothesis is ____.
34. The shape of this theoretical probability distribution is determined by the mean and standard
deviation.
35. The cases of this distribution consist of individual sample means taken from a population.
36. How is the chance of making a type 2 error affected when alpha is decreased?
37. The ability to consistently apply this attribute of a good scientific research hypothesis contributes
to differentiating everyday scientists from great scientists.
38. The goal of this statistical test is to determine if the means of two separate groups are different.
Data are not normally distributed but the group variances are equal.
39. If the probability of rejecting a false null hypothesis is relatively high for a given test, we would
say the test is ____.
40. This is a general hypothesis of no difference between groups.
41. In the early 1900s, this biologist contributed greatly to the areas of population genetics and
statistical applications in biological research.
42. How many asterisks indicate a significance level of P<0.001?
43. This general term describes the conclusion about any null hypothesis that has been statistically
rejected.
44. This is a quantitative test for determining if sample data are normally distributed.
45. These two values represent the approximate 95% confidence intervals for this mean: x̄ = 7.4, SD
= 0.40, N =
46. H0: σa² = σb² is the proper null hypothesis for this statistical test.
47. This is the alpha level that most biologists use when testing a null hypothesis.
48. This is the result when a true null hypothesis is rejected.
49. The risk of making a Type 2 error can be reduced by ________.
50. If the null hypothesis is A=B, the research hypothesis is ____.
51. The goal of this statistical test is to detect differences between the means of repeated
measurements on individuals. Data are skewed and the variances are unequal.
52. This is the test statistic for a Mann-Whitney test.
53. Nonparametric tests address either questions of differences or questions of _____.
54. This parametric test is considered to be robust.
55. This is the name of the tabled value of a test statistic at the specified alpha level.
56. This is the result when a false null hypothesis is not rejected.
57. The goal of this statistical test is to detect differences between two dependent means when the data
meet parametric test assumptions.
58. Statistical tests of this general type are not very powerful but they are easy to use because they
have relatively few assumptions.
59. This is an example of a statistic.
60. T-tests assume that variances between groups are homogeneous. How would you test this
assumption?
61. H0: μa ≤ μb is a suitable null hypothesis for this nonparametric test.
62. Data that are suitable for this category of statistical tests must be normally distributed and
continuous.
63. This is the quantitative relationship between the standard error of the mean and the standard
deviation.
64. This mathematical theorem predicts that sample means from a non-normally distributed
population will have a normal distribution if the sample size is large enough.
65. This is the symbol for a parametric variance.
66. The goal of this statistical test is to detect differences among variances of skewed data sets.
67. This process is an example of statistical inference.
68. The goal of this statistical test is to detect differences between two dependent means when the data
meet parametric test assumptions.
69. Statistical tests of this general type are not very powerful but they are easy to use because they
have relatively few assumptions.
70. The goal of this statistical test is to detect differences between the means of repeated
measurements on individuals. Data are not normally distributed; the variances are equal.
71. This principle states that means of samples from a non-normally distributed population will have a
normal distribution if the sample size is large enough.
72. This term describes the statistical conclusion regarding a null hypothesis that has been rejected at a
probability level of 0.05.
73. This is a nonparametric test for determining whether data are normally distributed.
74. Statistical tests address either questions of differences or questions of _____.
75. What are the primary attributes of a good scientific research hypothesis?
_______________________________
Problems
76. Use the following data on salivary gimetz concentration (mg/100 ml) in male and female college
students to test the research hypothesis that sex affects gimetz levels.
males: 220.1, 218.6, 229.6, 228.8, 222.0, 224.1, 226.5
females: 23.4, 221.5, 230.2, 224.3, 223.8, 230.8
77. Use the data in HOGNOSE.SYS to test the hypothesis that active (act$=act) male resident snakes
(status$=res) move greater distances each day than do active female residents.
78. Density of voles is hypothesized to vary differently from year to year in grassland habitats that
have either been unburned, burned annually, or burned every 4-5 years. To test this idea, an
ecologist measured the population density of voles each year for 10 years in each of three different
habitats. The data are below. Does variation in population density of voles differ among the
habitats?
Density (number per hectare)
burned periodically 348, 244, 198, 321, 276, 239, 287, 311, 302, 271
burned 4-5 years 147, 172, 133, 111, 109, 113, 096, 115, 110, 107
unburned 167, 231, 098, 177, 216, 179, 195, 154, 163, 134
79. Use the following data on ranked scores on a keyboarding skills test to test the hypothesis that
high school training improves keyboarding skills of college students.
with training: 44, 48, 36, 32, 51, 45, 54, 56
without training: 32, 40, 44, 44, 34, 30, 26
Common errors when working exam problems (listed in approximate
decreasing order of potential point loss)
Incorrect or incomplete reading of the problem
Not knowing what the variables and respective measurement scales are
Not knowing the assumptions of chosen test
Incorrect correspondence between stated variables and null hypothesis
Rejecting H0 when P>0.05 or not rejecting H0 when P≤0.05
Not knowing how to work with logarithms
Incorrect data entry
80. Use the data in GINMOVE.SYD to test the hypothesis that male and female turtles move different
distances each day.
81. Using the following data on volume (cubic microns) of avian erythrocytes taken from normal
(diploid) and intersex (triploid) individuals, test the hypothesis that ploidy affects erythrocyte
volume.
diploid: 248, 236, 269, 254, 249, 251, 260, 245, 239, 255
triploid: 380, 391, 377, 392, 398, 374
82. Use the following data on clutch size to test the hypothesis that variability in clutch size differs
between zoo-bred and wild-bred snow geese.
zoo: 10, 11, 12, 11, 10, 11, 11
wild: 9, 8, 11, 12, 10, 13, 11, 10, 10
83. Five sophomores volunteered for an exercise physiology class project. Maximum oxygen
consumption (ml O2/min) was measured twice in each of the five sophomores over a period of one
month. One measurement was taken two days before a vigorous cardio exercise program began
and the second two days after the program ended. Use the following data to test the hypothesis
that training produced a greater ability to consume oxygen.
Individual no.    1     2     3     4     5
Before treatment  1920  2020  2060  1960  1960
After treatment   2250  2410  2260  2200  2360
84. Crop yields were measured in each of nine experimental plots over two successive years, one
using “old” fertilizer and one using “new” fertilizer. Use the following data to test the research
hypothesis that the new fertilizer produced greater yields.
plot 1 2 3 4 5 6 7 8 9
old fertilizer 1920 2020 2060 1960 1960 2140 1980 1940 1790
new fertilizer 2250 2410 2260 2200 2360 2320 2240 2300 2090
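As a rough illustration of the arithmetic behind a paired-samples comparison such as problem 83, the sketch below computes a paired t statistic from the before/after oxygen consumption values. It assumes the differences are approximately normally distributed, which the problem does not state, and the function name paired_t is hypothetical (it is not a SYSTAT command). The probability would still be read from a t table using the returned degrees of freedom.

```python
import math

def paired_t(before, after):
    """Return (t, df) for the paired differences after - before."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # sample variance of the differences (n - 1 in the denominator)
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    # standard error of the mean difference
    se_d = math.sqrt(var_d / n)
    return mean_d / se_d, n - 1

# Data from problem 83 (ml O2/min)
before = [1920, 2020, 2060, 1960, 1960]
after = [2250, 2410, 2260, 2200, 2360]

t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

The same function could be applied to the nine plots of problem 84, provided the assumptions of a paired t-test are judged to hold for those data.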
Exam 3
Correlation and Regression - Content
testing for relationships
correlation
positive correlation
negative correlation
causation
correlation coefficient
strength of relationship
coefficient of determination (r²)
Pearson’s correlation
Bonferroni probabilities
Spearman’s correlation
regression
dependent variable
independent variable
residual
least squares fit
intercept
slope
regression coefficient
predicting Y from X
inverse prediction
extrapolation
semilog and log-log regressions
exponential form of log-log equation
Questions from old exams
1. This statistical procedure is used when one desires to predict the value of a dependent variable
from knowledge of the value of an independent variable. (a) regression analysis; (b) correlation
analysis; (c) goodness-of-fit; (d) data transformation; (e) analysis of variance
2. This term refers to the prediction of “Y” from a known value of “X” that is beyond the range of
the actual data. (a) extrapolation; (b) guessing; (c) transformation; (d) goodness-of-fit; (e) type II
error
3. R-squared (r²) is also known as the _____. (a) coefficient of variation; (b) coefficient of
determination; (c) parametric measure; (d) critical value; (e) measure of statistical power
4. The strength of the relationship in a correlation analysis is shown by this value. (a) intercept; (b)
correlation coefficient; (c) slope; (d) probability; (e) regression coefficient
5. In a regression analysis, “Y” is the independent variable and “X” is the dependent variable. (a)
true; (b) false
6. In a regression analysis, the regression line is fitted to the data points by this method. (a)
Kolmogorov-Smirnov; (b) extrapolation; (c) ANOVA; (d) data transformation; (e) least squares
7. How heart rate relates to oxygen consumption varies from person to person. Age, weight, sex,
body composition, fitness level, and other factors all play a role. Drawing from population models
and their own research, the companies that manufacture heart rate monitors have developed
formulas that couple heart rate with those different variables and massage it all into an estimate of
calorie usage. The onboard calculators found on treadmills, elliptical trainers and other devices use
basically the same approach. Depending on the machines, however, they typically don’t allow you
to enter as much information about yourself as a heart monitor. The machine might ask for your
weight and age, for example, but not your sex or an estimate of your fitness level. Fewer variables
mean a rougher guess. In statistical terms, what is the meaning of the last sentence, “Fewer
variables mean a rougher guess?” (a) lower CV; (b) higher CV; (c) lower r²; (d) higher r²; (e) lower
probability
Problems
Example problems from lectures:
1. Use the following data on wing length (cm) and tail length (cm) in cowbirds to determine if there is a
relationship between the two variables. (Protocol link)
wing 10.4 10.8 11.1 10.2 10.3 10.2 10.7 10.45 10.8 11.2 10.6
tail 7.4 7.6 7.9 7.2 7.4 7.1 7.4 7.2 7.8 7.7 7.8
2. Use the following data taken from crabs to determine if there is a relationship between weight of gills
(g) and weight of body (g) and between weight of thoracic shield (g) and weight of body. (Protocol
link)
body 159 179 100 45 384 230 100 320 80 220 320
gill 14.4 15.2 11.3 2.5 22.7 14.9 11.4 15.81 4.19 15.39 17.25
thorax 80.5 85.2 49.9 21.1 195.3 111.5 56.6 156.1 39.0 108.9 160.1
3. The following data are ranked scores for ten students who took both a math and a biology aptitude
examination. Is there a relationship between math and biology aptitude scores for these students?
(Protocol link)
math 53 45 72 78 53 63 86 98 59 71
biology 83 37 41 84 56 85 77 87 70 59
4. Test the following data to determine if there is a relationship between the total length of aphid stem
mothers and the mean thorax length of their parthenogenetic offspring. (Protocol link)
mother 8.7 8.5 9.4 10.0 6.3 7.8 11.9 6.5 6.6 10.6
offspring 5.95 5.65 6.00 5.70 4.40 5.53 6.00 4.18 6.15 5.93
5. The following data are rate of oxygen consumption (ml/g/hr) in crows at different temperatures (°C).
Does temperature affect oxygen consumption in crows? Determine the equation for predicting oxygen
consumption from temperature. (Protocol link)
temp -18 -15 -10 -5 0 5 10 19
oxygen 5.2 4.7 4.5 3.6 3.4 3.1 2.7 1.8
6. Use the following data on mean adult body weight (mg) and larval density (no./mm³) of fruit flies to
determine if there is a functional relationship between adult body mass and the density at which it was
reared. Determine the equation for predicting body weight from larval density. (Protocol link)
density 1 3 5 6 10 20 40
weight 1.356 1.356 1.284 1.252 0.989 0.664 0.475
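For problems that ask for a prediction equation, such as example problem 5, the least squares fit can be sketched by hand as below. This is only an illustration under the assumption that a simple linear (untransformed) regression is appropriate for the crow data; the function name least_squares is hypothetical, and the significance test of the slope is not shown.

```python
def least_squares(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # sum of cross-products and sum of squares about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Data from example problem 5 (temperature vs. oxygen consumption in crows)
temp = [-18, -15, -10, -5, 0, 5, 10, 19]
oxygen = [5.2, 4.7, 4.5, 3.6, 3.4, 3.1, 2.7, 1.8]

intercept, slope = least_squares(temp, oxygen)
print(f"oxygen = {intercept:.3f} + ({slope:.4f}) * temp")
```

A negative slope here indicates that oxygen consumption declines as temperature rises; predictions outside the sampled temperature range would be extrapolation.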
Practice problems
Nos. 5, 7, 10, 12, 14, 25, 26, 28, 29, 31, 38, 39, 46, 52, 55, 58, 62, 65, 66, 72
ANOVA - Content
In addition to being comprehensive, the following new concepts are covered:
analysis of variance
F-ratio
F-distribution
Between-group variance
Within-group (error) variance
Post-hoc pairwise tests
one-way ANOVA
two-way ANOVA
Tukey test
factor
interaction
synergism
antagonism
residuals
Kruskal-Wallis test
DSCF test
ANCOVA
interaction plot
covariate
least squares means
the problem of multiple comparisons
Circular statistics
Principal components
MANOVA
Repeated measures ANOVA
Logistic regression
Non-linear regression
Multiple regression
Questions from old exams
1. In an ANCOVA, the covariate is a _____ variable. (a) dependent; (b) multivariate; (c) categorical; (d)
continuous; (e) derived
2. Which test is least powerful? (a) ANOVA; (b) Pearson’s correlation; (c) independent-samples t-test; (d) paired
t-test; (e) Mann-Whitney test
3. To determine the effect of two independent variables on a dependent variable, what is the advantage of doing a
single two-way ANOVA as opposed to two separate one-way ANOVAs? (a) a two-way ANOVA is more
robust; (b) a two-way ANOVA calculates the effect of a covariate; (c) a two-way ANOVA is easier to use on a
calculator; (d) a two-way ANOVA assesses possible interaction between the independent variables; (e) a two-
way ANOVA provides a test statistic
4. In an ANOVA, this is the normal variation expected in individuals that is not a result of being part of a “group.”
It results from such things as individual genetic makeup and environmental history. (a) standard deviation; (b)
SE; (c) between group variance; (d) error variance; (e) coefficient of variation
5. The goal of this test is to detect differences between >2 independent means. Data are not normally distributed
nor are the variances among groups equal.
Problems
Example problems from lectures
1. Random samples of a certain species of zooplankton were collected from five lakes and their selenium content
(ppm) was determined. Was there a difference among lakes with respect to selenium content? (Protocol link)
lake A: 23, 30, 28, 32, 35, 27, 30, 32
lake B: 34, 42, 39, 40, 38, 41, 40, 39
lake C: 15, 18, 12, 10, 8, 16, 20, 19
lake D: 18, 15, 9, 12, 10, 17, 10, 12
lake E: 25, 20, 22, 18, 30, 22, 20, 19
2. The following data are amount of food (kg) consumed per day by adult deer at different times of the year. Test
the null hypothesis that food consumption was the same for all the months tested. (Protocol link)
February May August November
4.7 4.6 4.8 4.9
4.9 4.4 4.7 5.2
5.0 4.3 4.6 5.4
4.8 4.4 4.4 5.1
4.7 4.1 4.7 5.6
4.2 4.8
3. In a study of snake hibernation, fifteen pythons of similar size and age were randomly assigned to three groups.
One group was treated with drug A, one group with drug B, and the third group was not treated. Their systolic
blood pressure (mmHg) was measured 24 hours after administration of the treatments. Do the drugs affect
blood pressure? If so, do they have similar effects? (Protocol link)
control: 130, 135, 132, 128, 130
drug A: 118, 120, 125, 119, 121
drug B: 105, 110, 98, 106, 105
4. Fourteen hucksters were assigned at random to one of three experimental groups and fed a different diet for six
months. Use the following data on huckster mass (kg) at the end of the experiment to determine if diet affected
body size. Which diet produced the heaviest hucksters? (Protocol link)
diet 1 diet 2 diet 3
60.8 68.7 102.6
57.0 67.7 102.1
65.0 74.0 100.2
58.6 66.3 96.5
61.7 69.8
5. Twenty-four freshwater clams were randomly assigned to four groups of six each. One group was placed in
pond water, one group was placed in deionized water, one group was placed in a solution of 0.5 mM sodium
sulfate, and one group was placed in a solution of 0.74 mM sodium chloride. At the end of a specified time
period, blood potassium levels (M K+) were determined. Did treatment affect blood potassium levels? (Protocol link)
pond water: 0.518, 0.523, 0.499, 0.502, 0.520, 0.507
deionized water: 0.308, 0.385, 0.301, 0.390, 0.307, 0.371
sodium sulfate: 0.393, 0.415, 0.351, 0.390, 0.385, 0.397
sodium chloride: 0.383, 0.405, 0.398, 0.352, 0.381, 0.407
6. An entomologist interested in the vertical distribution of a fly species collected the following data on numbers
of flies (no. flies/m³) from each of three different vegetation layers. Use these data to test the hypothesis that fly
abundance was the same in all three vegetation layers. (Protocol link)
herbs shrubs trees
14.0 8.4 6.9
12.1 5.1 7.3
5.6 5.5 5.8
6.2 6.6 4.1
12.2 6.3 5.4
7. Use USOPHEO.SYD to determine if body size is affected by sex and/or location. Read the description of the
data file before proceeding. (Protocol link)
8. Qualime epithelial cancer is hypothesized to result from either genotype or several environmental factors that
vary by season. To address this hypothesis, use the data below on QSA level (g/g; the diagnostic test indicator
of qualime cancer) that were collected on 20 individuals in different seasons. (Protocol link)
QSA Genotype Season QSA Genotype Season QSA Genotype Season QSA Genotype Season
478 ZZ Winter 425 ZW Summer 428 ZZ Summer 466 ZW Winter
538 ZZ Winter 467 ZW Summer 478 ZZ Summer 522 ZW Winter
502 ZZ Winter 444 ZW Summer 455 ZZ Summer 489 ZW Winter
496 ZZ Winter 438 ZW Summer 446 ZZ Summer 475 ZW Winter
483 ZZ Winter 431 ZW Summer 432 ZZ Summer 501 ZW Winter
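The F-ratio used in the one-way problems above is the between-group mean square divided by the within-group (error) mean square. The sketch below works through that arithmetic for the python blood pressure data of example problem 3. It assumes the data meet the assumptions of a parametric ANOVA, which should be checked first, and the function name one_way_f is hypothetical. Post-hoc pairwise tests (for example Tukey) are not shown.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # variation of group means around the grand mean
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        # variation of individuals around their own group mean (error)
        ss_within += sum((v - group_mean) ** 2 for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Data from example problem 3 (systolic blood pressure, mmHg)
control = [130, 135, 132, 128, 130]
drug_a = [118, 120, 125, 119, 121]
drug_b = [105, 110, 98, 106, 105]

f_ratio, df_b, df_w = one_way_f([control, drug_a, drug_b])
print(f"F = {f_ratio:.2f} with {df_b} and {df_w} df")
```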
Practice problems
Nos. 3, 21, 22, 27, 54, 56, 57, 69, 70
Final Exam
Questions
1. A “powerful” statistical test is a test in which _____. (a) the probability of rejecting a false null hypothesis is
high; (b) the probability of rejecting a true null hypothesis is high; (c) the probability of accepting a false
null hypothesis is high; (d) the probability of accepting a true null hypothesis is high
2. This statistical procedure is used when one desires to predict the value of a dependent variable from
knowledge of the value of an independent variable. (a) regression analysis; (b) correlation analysis; (c)
goodness-of-fit; (d) data transformation; (e) analysis of variance
3. One can get a general idea of whether two means are significantly different if, on a graph, the values of these
do not overlap. (a) mean ± 1 SE; (b) mean ± 2 SD; (c) 95% confidence limits; (d) ranges; (e) mean ± 1 SD
4. This term refers to the prediction of “Y” from a known value of “X” that is beyond the range of the actual
data. (a) extrapolation; (b) guessing; (c) transformation; (d) goodness-of-fit; (e) type II error
5. Which is false regarding data that are suitable for parametric tests? (a) sampled data are independent of each
other; (b) data are randomly sampled; (c) sampled data are normally distributed; (d) sampled data are
measured on an ordinal scale; (e) sampled data are measurements of continuous variables
6. This is the sum-of-squares divided by the sample size. (a) mode; (b) mean; (c) range; (d) median; (e)
weighted mean
7. R-squared (r²) is also known as the _____. (a) coefficient of variation; (b) coefficient of determination; (c)
parametric measure; (d) critical value; (e) measure of statistical power
8. The central limit theorem states that _____. (a) the means of samples from a normally distributed population
have a normal distribution; (b) the means of samples from a normally distributed population are always
skewed; (c) the means of samples from a normally distributed population are not normally distributed; (d)
the means of samples from a normally distributed population have no variance; (e) the means of samples
from a normally distributed population are significantly different from one another
9. The outcomes of statistical tests are usually found in this section of a primary literature paper. (a)
introduction; (b) materials and methods; (c) results; (d) discussion; (e) literature cited
10. A t-distribution with infinite degrees of freedom is identical to this distribution. (a) Poisson; (b) binomial; (c)
F; (d) Chi-square; (e) normal
11. An observed frequency distribution of a given type will more closely conform to a theoretical frequency
distribution of the same type under this condition. (a) decreased N; (b) increased N; (c) decreased range; (d)
increased mean; (e) decreased mean
12. Statistical “error” often refers to the level of confidence that one has regarding how well the statistics of
_____ estimate the statistics of _____. (a) samples, populations; (b) populations, samples; (c) parametrics,
nonparametrics; (d) precision, accuracy; (e) accuracy, precision.
13. The volume of blood (ml) is measured on this scale. (a) categorical; (b) ordinal; (c) ratio; (d) interval; (e)
continuous
14. Which of the following is a type I error? (a) rejection of a false null hypothesis; (b) rejection of a true null
hypothesis; (c) acceptance of a true null hypothesis; (d) acceptance of a false null hypothesis
15. The percentage results of political polls as reported on television usually have a “margin of error”
accompanying the percentages. What is a “margin of error?” (a) 95% confidence interval; (b) error
variance; (c) z-score; (d) normality test; (e) r²
16. In an ANCOVA, the covariate is a _____ variable. (a) dependent; (b) multivariate; (c) categorical; (d)
continuous; (e) derived
17. This calculated value is used in conjunction with a statistical table to determine the probability of a null
hypothesis being true.
18. The shape of this theoretical probability distribution is determined by the mean and standard deviation.
19. The cases of this distribution consist of individual sample means taken from a population.
The following questions have been taken from old exams in Biostats classes over the last 10-15 years.
Because Biostats changes to some degree each year, some questions may address items not covered
during the latest semester. In addition, new material covered in the latest Biostats class may not have
representative questions below.
20. The strength of the relationship in a correlation analysis is shown by this value. (a) intercept; (b) correlation
coefficient; (c) slope; (d) probability; (e) regression coefficient
21. The goal of this statistical test is to determine if the means of two separate groups are different. Data are not
normally distributed but the group variances are equal.
22. How many asterisks indicate a significance level of P<0.001?
23. This general term describes the conclusion about any null hypothesis that has been statistically rejected.
24. H0: σa² = σb² is the proper null hypothesis for this statistical test.
25. This is the alpha level that most biologists use when testing a null hypothesis.
26. This principle states that sample means from a normally distributed population will be normally distributed
regardless of sample size.
27. This is the result when a true null hypothesis is rejected.
28. The risk of making a Type 2 error can be reduced by ________.
29. If the null hypothesis is A=B, the research hypothesis is ____.
30. The goal of this statistical test is to detect differences between the means of repeated measurements on
individuals in one group. Data are skewed and the group variances are unequal.
31. This is the test statistic for a Mann-Whitney test.
32. Nonparametric tests address either questions of differences or questions of _____.
33. This parametric test is considered to be robust.
34. This is the name of the tabled value of a test statistic at the specified alpha level.
35. The goal of this statistical test is to detect differences between two dependent means when the data meet
parametric test assumptions.
36. T-tests assume that variances between groups are homogeneous. How would you test this assumption?
37. H0: μa ≤ μb is a suitable null hypothesis for this nonparametric test.
38. This is the numerical relationship between the standard error of the mean and the standard deviation.
39. This mathematical theorem predicts that sample means from a non-normally distributed population will have
a normal distribution if the sample size is large enough.
40. This frequency distribution is basically a normal distribution whose shape varies with sample size.
41. The goal of this statistical test is to detect differences among variances of skewed data sets.
42. Which test is least powerful? (a) ANOVA; (b) Pearson’s correlation; (c) independent-samples t-test; (d)
paired t-test; (e) Mann-Whitney test
43. To determine the effect of two independent variables on a dependent variable, what is the advantage of
doing a single two-way ANOVA as opposed to two separate one-way ANOVAs? (a) a two-way ANOVA is
more robust; (b) a two-way ANOVA calculates the effect of a covariate; (c) a two-way ANOVA is easier to
use on a calculator; (d) a two-way ANOVA assesses possible interaction between the independent variables;
(e) a two-way ANOVA provides a test statistic
44. In a standard normal distribution, a z-score of _____ on each side of the mean encloses 95% of the cases. (a)
0.68; (b) 1.96; (c) 1.0; (d) 0.05; (e) 0.0
45. In a regression analysis, “Y” is the independent variable and “X” is the dependent variable. (a) true; (b) false
46. The shape of a Poisson distribution is determined by the _____. (a) mean, standard deviation; (b) p, n; (c)
geometric mean, SD, n; (d) mode, coefficient of variation; (e) mean
47. This is an important measure of data dispersion. (a) mean; (b) variance; (c) mode; (d) median; (e) Goodness-
of-Fit
48. On a SYSTAT dot graph, these graphics portray variation around the mean. (a) parameters; (b) interquartile
plots; (c) z-scores; (d) descriptive statistics; (e) error bars
49. This is the square root of the sum-of-squares divided by the sample size. (a) standard deviation; (b) mean;
(c) range; (d) median; (e) variance
50. In an interval scale of measurement, values are neither quantitative nor ranked, and there is no mathematical
or value relationship among them. (a) true; (b) false
51. This is an important measure of data central tendency. (a) scatterplot; (b) variance; (c) standard deviation;
(d) median; (e) sum-of-squares
52. The basic reason scientific knowledge has advanced so remarkably through the years is because many
dedicated scientists have proved thousands of hypotheses and theories (a) true; (b) false
53. The temperature of a human body in Celsius should be measured on a ratio scale. (a) true; (b) false
54. The various species contained within a particular genus of birds should be measured on a ranked scale. (a)
true; (b) false
55. The most common data transformation used in biology is the logarithmic transformation. (a) true; (b) false
56. This is the probability of obtaining two heads with one flip of two coins. (a) 0.50; (b) 0.25; (c) 0.10; (d) 1.0;
(e) 0.75
57. Data that are influenced by many small and unrelated random effects are frequently normally distributed.
As a consequence, normally distributed data are widespread and common in nature. (a) true; (b) false
58. The discipline of statistics concerns _____. (a) using quantitative properties of samples to answer questions
about populations; (b) tallying sports information; (c) how to confuse and frustrate students; (d) how to
support one’s predetermined ideas with numbers; (e) how to maximize business profit
59. This standardized expression permits one to directly compare the relative amount of variation associated
with two or more means of one variable. (a) average deviation; (b) variance; (c) median; (d) coefficient of
variation; (e) z-score
60. In SYSTAT, this is the preferred quantitative method for students to determine if data are normally
distributed. (a) histogram; (b) Tables; (c) dot graph; (d) probability plot; (e) Kolmogorov-Smirnov test
61. In an ANOVA, this is the normal variation expected in individuals that is not a result of being part of a
“group.” It results from such things as individual genetic makeup and environmental history. (a) standard
deviation; (b) SE; (c) between group variance; (d) error variance; (e) coefficient of variation
62. These provide a graphical portrayal of variation around the mean. (a) error bars; (b) sampling distributions;
(c) z-scores; (d) essential descriptive statistics; (e) parameters
63. In this scale of measurement, values are neither quantitative nor ranked, and there is no mathematical or
value relationship among them. (a) ordinal; (b) interval; (c) continuous; (d) categorical; (e) ratio
64. The age of a viral particle is measured on this scale. (a) categorical; (b) ordinal; (c) ratio; (d) interval; (e)
continuous
65. In a regression analysis, the regression line is fitted to the data points by this method. (a) Kolmogorov-
Smirnov; (b) extrapolation; (c) ANOVA; (d) data transformation; (e) least squares
66. This is a measure of dispersion. (a) mean; (b) variance; (c) mode; (d) median; (e) regression plot
67. Who made this statement, “Isn’t that what science is all about...eliminating possibilities?” (a) Student; (b)
Sean Connery; (c) Dr. Who; (d) William Gosset; (e) Ronald Fisher
68. A robust statistical test is a test which _____. (a) is sensitive to deviations from the assumptions; (b) is
insensitive to deviations from the assumptions; (c) has no assumptions; (d) has a high probability of
accepting a true null hypothesis; (e) has a low probability of rejecting a true null hypothesis
69. Which is not an example of statistical inference? (a) calculating a sample mean; (b) estimating a population
mean; (c) estimating a population variance; (d) testing a statistical hypothesis; (e) estimating a population
median
70. “How heart rate relates to oxygen consumption varies from person to person. Age, weight, sex, body
composition, fitness level, and other factors all play a role. Drawing from population models and their own
research, the companies that manufacture heart rate monitors have developed formulas that couple heart rate
with those different variables and massage it all into an estimate of calorie usage. The onboard calculators
found on treadmills, elliptical trainers and other devices use basically the same approach. Depending on the
machines, however, they typically don’t allow you to enter as much information about yourself as a heart
monitor. The machine might ask for your weight and age, for example, but not your sex or an estimate of
your fitness level. Fewer variables mean a rougher guess.” In statistical terms, what is the meaning of the
last sentence, “Fewer variables mean a rougher guess?” (a) lower CV; (b) higher CV; (c) lower r²; (d) higher
r²; (e) lower probability
71. The computer output below contains the results of a log-log regression analysis of hemoglobin concentration
on blood volume. What is the proper exponential regression equation for these results? (a) Hemo =
1.45Vol^2.56; (b) Hemo = 2.56Vol^1.45; (c) Hemo = 28.2Vol^2.56; (d) Hemo = 0.33Vol^1.45; (e) Hemo = 505Vol^0.14
Predictor Coef SE Coef T P
Constant 1.45 0.33 1.091 0.317
Volume 2.56 0.14 22.469 0.000
Analysis of Variance
Source DF SS MS F P
Regression 1 156.661 156.661 504.866 0.000
Residual Error 6 1.862 0.310
Total 234 582.46
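The back-transformation behind question 71 can be sketched as follows. A log-log regression has the form log(Y) = log(a) + b*log(X), so the exponential form is Y = a*X^b, where a is the antilog of the fitted constant. The sketch assumes common (base-10) logarithms were used, which the output does not state; with natural logarithms the antilog would differ. The function name power_equation is hypothetical.

```python
def power_equation(log_intercept, slope, base=10.0):
    """Convert log(Y) = log_intercept + slope*log(X) to Y = a * X**b."""
    a = base ** log_intercept  # antilog of the fitted constant
    b = slope                  # the slope becomes the exponent
    return a, b

# Coefficients from the output table (Constant and Volume rows)
a, b = power_equation(1.45, 2.56)
print(f"Hemo = {a:.1f} * Vol^{b}")
```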
Choose the most appropriate statistical test
a. independent t-test ab. Kruskal-Wallis bcd. Pearson correlation ad. 2-way ANOVA
b. paired t-test bc. Bartlett’s cde. Spearman correlation ae. Wilcoxon
c. Mann-Whitney cd. goodness-of-fit abcd. Tukey test
d. Levene’s de. test of independence bcde. Regression
e. 1-Way ANOVA abc. Kolmogorov-Smirnov ac. ANCOVA
72. The concentration of unicellular algae (measured as chlorophyll concentration per liter) was measured in two
independent samples each at two different depths in each of four lakes. We wish to know if there is a
difference in algae concentration between the two depths. The data are normally distributed and the group
variances are equal.
lake surface 1 m lake surface 1 m
1 425 130 3 100 30
1 433 147 3 113 29
2 500 215 4 312 103
2 488 221 4 325 100
73. White-throated sparrows occur in 2 distinct color morphs, referred to as brown and white. It was suspected
that females select mates of the opposite morph (i.e., white females select brown males and vice versa).
This phenomenon is known as negative assortative mating. In 30 mated pairs, the color combinations were as
follows. Do these data support the assumption that negative assortative mating occurs in this species?
                 males
                 white  brown
females  white   27     43
         brown   44     25
74. Four tomato plants were treated with chlorogenic acid to determine if this would influence the activity (% of
maximum) of the enzyme o-diphenol oxidase in the leaves. A control group of four plants was not treated.
We do not know if the variable (activity) is normally distributed, nor is it possible to determine this. Does
the treatment affect activity of the enzyme?
treated untreated treated untreated
35 10 38 11
45 18 29 8
75. We suspect that a certain strain of laboratory rats has a genetic tendency to make left turns in a “T” maze.
Of 12 rats that were tested in such a maze, 8 chose to go into the left arm and 4 chose the right arm. Do
these results support our suspicion about a left-turning tendency?
76. Determine if the following data on sprint speed (m/sec) in five-lined skinks are normally distributed: 1.7,
0.8, 1.1, 0.9, 1.2, 1.6, 0.9, 0.8, 1.0, 1.4, 0.7, 1.1, 0.7.
77. Random samples of a certain species of zooplankton were collected from 5 randomly selected lakes and
their selenium content was determined. We wish to know if there is a difference among lakes with respect
to selenium content in this species (i.e., is there a significant “lake effect”?). The data are normally
distributed and the group error variances are equal.
lake A lake B lake C lake D lake E
23 34 15 18 25
30 42 18 15 20
35 38 8 10 30
27 41 16 17 22
30 40 20 10 20
32 39 19 12 19
78. The goal is to detect differences among variances of normally-distributed data.
79. The goal is to detect differences among >2 means. Data are ranked.
80. The goal is to determine if each of 2 data sets are normally distributed.
81. The goal is to detect whether frequency data differ from a theoretical distribution.
82. The goal is to detect differences between >2 independent means. Data are not normally distributed nor are
the variances among groups equal.
83. The goal is to detect relationships between 2 or more ordinal variables. Data are not normally distributed.
84. The goal is to detect differences between the means of 2 separate groups. Data are not normally distributed
nor are the group variances equal.
Computer output interpretation
_____________________________________________________
Using the output below, answer the following questions:
TABLE OF SEX (ROWS) BY LOCATION (COLUMNS)
FREQUENCIES
AR KS NFL SFL STX VA TOTAL
-------------------------------------------------------------
F |90 21 28 70 67 42 | 318
| |
M |90 20 47 47 79 48 | 331
-------------------------------------------------------------
TOTAL 180 41 75 117 146 90 649
TEST STATISTIC VALUE DF PROB
PEARSON CHI-SQUARE 10.489 5 0.063
85. The value of the test statistic is _____. (a) 649; (b) <0.10; (c) 0.063; (d) 5; (e) 10.489
86. The null hypothesis should _____. (a) be rejected; (b) not be rejected
87. The conclusion is _____. (a) mean SEX is different from mean LOC; (b) SEX is unrelated to LOC; (c)
observed LOC = expected LOC; (d) AR = KS = NFL = SFL = STX = VA; (e) males are larger than females
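To see where the Pearson chi-square value in the output comes from, the sketch below recomputes it from the frequency table: each expected frequency is (row total x column total) / grand total, and the statistic is the sum of (observed - expected)² / expected over all cells. The function name chi_square_independence is hypothetical; the probability itself would come from a chi-square table or from the software.

```python
def chi_square_independence(table):
    """Return (chi-square, df) for a test of independence on a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected frequency under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# SEX (rows: F, M) by LOCATION (columns: AR, KS, NFL, SFL, STX, VA)
observed = [
    [90, 21, 28, 70, 67, 42],
    [90, 20, 47, 47, 79, 48],
]

chi2, df = chi_square_independence(observed)
print(f"chi-square = {chi2:.3f}, df = {df}")
```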
_________________________________________________
Using the output below, answer the following questions:
H2OOUT versus H2OIN
Predictor Coef SE Coef T P
Constant 0.433 0.472 1.091 0.317
H2OIN 0.317 0.912 22.469 0.000
S = 0.988 R-Sq = 98.8% R-Sq(adj) = 98.6%
Analysis of Variance
Source DF SS MS F P
Regression 1 156.661 156.661 504.866 0.000
Residual Error 6 1.862 0.310
Total 234 582.46
88. The regression equation is _____. (a) Y=0.433+0.317X; (b) Y=0.472+0.912X; (c) Y=0.317-0.433X; (d)
Y=156.661+0.317X; (e) Y=504.866+0.433X
89. The value of Y (H2OOUT) when X (H2OIN) = 9.4 is _____. (a) 9.04; (b) -2.55; (c) 507.78; (d) 159.58; (e)
3.400
90. The dependent variable is _____. (a) RESIDUAL; (b) H2OIN; (c) H2OOUT; (d) X; (e) CONSTANT
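Predicting Y from X with a fitted regression simply substitutes X into Y = intercept + slope*X. The sketch below does this with the Constant and H2OIN coefficients from the output above; the function name predict is hypothetical. Because the output does not show the range of the original H2OIN data, whether X = 9.4 lies inside that range (prediction) or outside it (extrapolation) cannot be judged here.

```python
def predict(intercept, slope, x):
    """Return the predicted Y for a given X from a linear regression."""
    return intercept + slope * x

# Coefficients read from the regression output (Constant, H2OIN)
y_hat = predict(0.433, 0.317, 9.4)
print(f"predicted H2OOUT = {y_hat:.3f}")
```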
Terms and concepts
process and philosophy of science (observations,
questions, hypotheses, theories, prediction, if-
then, correlational and experimental tests, data,
facts, scientific “proof”, relationship of ideas and
data - ID, limitations of science)
variables (measured, derived, dependent,
independent, response, predictor), data, case,
observation
data collection (population, sample, error,
random, independence, sample size)
measurement scales and kinds of variables
(nominal/categorical, ranked/ordinal,
interval/ratio, continuous, discrete)
practical: identifying variables and measurement
scales
Frequency distributions (histogram)
central tendency (mode, median, mean, weighted mean)
dispersion (maximum, minimum, range, interquartile range, sum-of-squares, standard deviation, variance, coefficient of variation)
parameters and statistics
reporting sample means (necessity of measure of dispersion, error bars)
calculating descriptive statistics with a calculator and with SYSTAT (raw data file, frequency data file)
Goodness-of-fit
Probability distributions
binomial (mutually exclusive [either/or] categories; defined by p, n)
Poisson (rare and random events; defined by mean)
calculating terms of binomial and Poisson distributions
comparison of observed and expected distributions (influence of sample size)
importance of normal distribution in statistics
properties of normal distribution (defined by mean and standard deviation)
areas of normal curve
standard normal distribution (z-scores)
testing for normality (Probability plot and Kolmogorov-Smirnov test)
skewness
test statistic
parametric and non-parametric tests
data transformation (logarithmic, square root, arcsine)
statistical inference
major categories of statistical inference
sampling distribution
central limit theorem
Student’s t-distribution
standard error of mean
95% confidence limits
reporting sample means
graphical error bars
hypothesis testing
research hypothesis
null hypothesis
test statistic
critical value
alpha level
one and two-tailed tests
type I & II errors
relationship of type I & II errors
power of a test
significance level
statistical significance
parametric and nonparametric tests
assumptions of a test
Bartlett’s test
Levene’s test
robust test
testing for differences
independent samples t-test
paired samples t-test
repeated measures tests
Mann-Whitney test
Wilcoxon test
graphical analysis of differences between means
correlation, causation
regression (assumptions; null hypothesis, intercept, slope)
residuals
regression equation (linear, semi-log, log-log; exponential)
prediction
extrapolation
model building
use of exponential regression in biology
analysis of variance
F-ratio
F-distribution
Between-group variance
Within-group (error) variance
Post-hoc pairwise tests
one-way ANOVA
two-way ANOVA
Tukey test
factor
interaction
synergism
antagonism
residuals
Kruskal-Wallis test
DSCF test
ANCOVA
interaction plot
covariate
least squares means
the problem of multiple comparisons
circular statistics
Principal components
MANOVA
Repeated measures ANOVA
Logistic regression
Polynomial regression
Multiple regression
analysis of covariance (ANCOVA)