1. Summarizing Descriptive Relationships: Let's Review
Dr. Pedro L. Martinez
Introduction to Educational Research
2. What We Have Learned!
1. Inferential statistics help us determine whether what we have
observed in a sample represents a similar phenomenon in the population.
2. The assumption is that our sample is quite similar to the population
being studied, and we operate under the premise that we have obtained a
normal distribution in our sample when looking at scores of any type.
3. More
3. Normal distributions have standard deviations and are symmetrical.
4. There is a probability that our sample does not give us a normal
distribution (error); if that is the case, we cannot make inferences.
4. What Do We Do?
A. We try to make our distributions as normal as possible so that they
represent the one in the population.
B. When this does not happen, we rectify the problem by using other
statistics that are associated with central tendency measures.
Solutions: z scores and the family of t scores.
5. What Are z and t Scores?
A z score is used to determine where one particular score stands in
relation to the rest of the scores in a distribution. Central tendency
measures give us parameters but not the distance of each score from the
mean. However, using the mean and the standard deviation allows us to
calculate the z score. z scores also help us compare two individual
scores on two different variables (e.g., a Math test score and a
Spelling score).
6. The Answer Is: Standardization
z scores help us standardize scores in order to compare individual
scores on different variables. Scores on different tests use different
scales, which does not permit us to compare scores from different
distributions unless we standardize them. Standardization is the process
of converting each individual score in a distribution into a z score,
thus telling you how far from the mean a given score is.
7. Example
Student X has taken two exams, one in Biology and the other in
Statistics. Here are the scores, out of the total number of items on
each exam:
Biology: 65 out of 100 items
Statistics: 42 out of 200 items
Question: In which test did Student X do better? What do you mean by
better?
8. Let's Look at Each of These Distributions:
             Score   Mean   SD
Biology        65     60    10
Statistics     42     37     5
So in which test did Student X perform better?
9. What Did You Mean by Better?
1. If I am asking about the percentage of correct answers, then my
obvious answer is __________.
2. But wait a minute, that is not fair: the Statistics exam was more
difficult!
3. What is the dilemma?
10. Your Answer Should Be:
A) How did
Student X do in comparison to other students? B) We could answer
this by looking at the mean and standard deviation in each of the
two exams. They are different distributions.
11. Conclusions
1. Can we compare two scores if the
scales are different? A) Depending on your answer what is the next
step? B) Think before you answer!
12. Standardization
You cannot compare two different scores when the scales are different.
We need the same scale (standardization). When we take raw scores from a
test, we can convert them into standard deviation units through the use
of z scores.
Formula: $z = \dfrac{\text{raw score} - \text{mean}}{\text{standard deviation}}$
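The formula above can be applied directly to Student X's two exams. The following is a minimal sketch; the function name `z_score` is an illustrative choice, and the scores, means, and standard deviations are the ones given in the Biology and Statistics example.

```python
def z_score(raw_score, mean, sd):
    """Return how many standard deviations raw_score lies from the mean."""
    return (raw_score - mean) / sd


# Values taken from the Biology / Statistics example for Student X.
z_biology = z_score(65, mean=60, sd=10)
z_statistics = z_score(42, mean=37, sd=5)

print(f"Biology z = {z_biology:.2f}")        # Biology z = 0.50
print(f"Statistics z = {z_statistics:.2f}")  # Statistics z = 1.00
```

Once both scores are in standard deviation units, they sit on the same scale and can be compared directly.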
13. Let's Pretend That
Student X took a spelling test and received a z score of 1.0. What can
you tell from this score?
1) __________________________
2) __________________________
3) __________________________
14. Now That You Know How to Find a z Score, Let's Consider the Following:
When a distribution of scores is standardized, the average (mean) of the
distribution is 0 and the standard deviation is 1.0.
What does a z score tell us if z = -1.5?
What does a z score tell us if z = .29?
A z score can also tell us:
A) Whether an individual does better or worse than the average person.
B) How much a score is above or below the average.
C) Whether the score is better or worse relative to the rest of the
scores.
15. The z Score Formula Depends on Whether You Are Observing a Population or a Sample
A normal distribution that is standardized (so that it has a mean of 0
and an SD of 1) is called the standard normal distribution, or the
normal distribution of z-scores. If we know the mean μ ("mu") and
standard deviation σ ("sigma") of a set of scores that are normally
distributed, we can standardize each "raw" score, x, by converting it
into a z score using the following formula on each individual score:
$z = \dfrac{x - \mu}{\sigma}$
For a sample, the formula is
$z = \dfrac{x - \bar{x}}{s}$
where x-bar and s are used as estimators for the population's true mean
and standard deviation. Both formulas essentially calculate the same
thing: the distance of a score from the mean, in standard deviation
units.
16. What Is It From This Score That I Do Not Know?
1) I don't know if the student did better or worse than the average
score.
2) I don't know how much the score is below or above the mean.
3) I don't know how much better or worse this score is in comparison to
the rest of the scores in the distribution for that Spelling test.
17. Suppose I Told You the Following:
The average score on that Spelling test was 12, and the total number of
items on the test was 50. The test taker is 7 years old!
Don't despair: statisticians have already figured out, and can predict,
the percentage of scores that will fall between the mean and a given z
score!
18. z Scores Can Provide You With:
1) A way to determine percentile scores.
2) A fixed reference point: the mean of a z score distribution is equal
to 0.
3) From this point we can determine that 50% of scores fall on either
side of the mean. Can you explain why?
19. Figuring Out Percentiles with z Scores: Step 1
The average SAT score for a white male is 517. Suppose I want to know
what score marks the 90th percentile.
Step 1: Use the z score table and find the value closest to the 90th
percentile (.9000). The closest is .8997, which sits at the intersection
for a z score of 1.28. So a z score of 1.28 corresponds to the 90th
percentile.
What would be a z score that represents the 75th percentile?
20. Step 2: Convert the z Score to a Raw Score
We know which z score represents the 90th percentile, and we know the
mean is 517. But we do not know the raw score that marks the 90th
percentile. X values can be changed into z-scores just as z-scores can
be changed into X values.
Step 2: Convert the z score into the original unit of measurement. We
use this formula:
21. z-scores (cont.)
The formula for changing z-scores into X values is
$X = \mu + (z)(\sigma)$
With a mean of 517 and a standard deviation of 100:
X = 517 + (1.28)(100)
X = 517 + 128
X = 645
X is the raw score that lies z standard deviations from the mean.
22. Answer Using This Formula
X = 517 + (1.28)(100)
X = 517 + 128
X = 645
The score that marks the 90th percentile for white males who took the
SAT in 2008 is 645.
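The two steps (look up the z score for a percentile, then convert it to a raw score) can be sketched in code. This is a minimal sketch that swaps the printed z table for Python's `statistics.NormalDist`, so the z value is slightly more precise than the table's 1.28; the function name `percentile_to_raw` is an illustrative choice, and the mean of 517 and standard deviation of 100 come from the SAT example.

```python
from statistics import NormalDist


def percentile_to_raw(percentile, mean, sd):
    """Convert a percentile (as a proportion, e.g. 0.90) to a raw score."""
    # Step 1: find the z score that marks this percentile.
    z = NormalDist().inv_cdf(percentile)
    # Step 2: convert the z score to the original unit, X = mean + z * sd.
    return mean + z * sd


z_90 = NormalDist().inv_cdf(0.90)
score_90 = percentile_to_raw(0.90, mean=517, sd=100)

print(f"z for the 90th percentile: {z_90:.2f}")
print(f"raw score at the 90th percentile: {score_90:.0f}")
```

Rounded to a whole score, the result agrees with the hand calculation of 645.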
23. We Can Also Use a z Score to Convert a Known Raw Score into a Percentile Score
Suppose Student X in my SAT distribution has a score of 425 on the SAT
Math test, and I want to know how many students scored above or below
this score. Then:
Step 1: Convert the raw score into a z score.
24. Step 1: Convert the Raw Score to a z Score
$z = \dfrac{425 - 517}{100} = \dfrac{-92}{100} = -.92$
25. Step 2: Use Appendix A
Find the z score equivalent to .92 by moving vertically down the left
column. z scores are not reported as negative in the table because the
normal distribution is symmetrical, so the proportion falling beyond a
given z score is the same on either side of the mean.
So what does the z score tell me? Using the table, it tells me that
82.12% of scores fall below a z score of .92. It also tells me that
17.88% of the distribution falls beyond a z value of .92. Because our z
score is negative (-.92), that 17.88% is the share of scores falling
below it.
26. Step 3
A z score of -.92 corresponds to a raw score of 425 on the SAT Math
exam. A score of 425 on this test marks the 17.88th percentile in the
distribution of white males taking the exam in 2008.
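The same three steps can be checked in code. This is a minimal sketch that uses `statistics.NormalDist` in place of Appendix A; the function name `raw_to_percentile` is an illustrative choice, and the mean of 517 and standard deviation of 100 are the values used in the SAT example.

```python
from statistics import NormalDist


def raw_to_percentile(raw_score, mean, sd):
    """Return the proportion of a normal distribution below raw_score."""
    z = (raw_score - mean) / sd   # Step 1: raw score to z score
    return NormalDist().cdf(z)    # Step 2: area below that z score


below_425 = raw_to_percentile(425, mean=517, sd=100)

print(f"proportion scoring below 425: {below_425:.4f}")
print(f"proportion scoring above 425: {1 - below_425:.4f}")
```

The cumulative distribution function handles negative z scores directly, so there is no need to look up the positive value and reason by symmetry as with the printed table.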
27. z-scores (cont.)
We are able to transform every raw score in our distribution into a
distribution of z-scores. This new distribution of z-scores will have 3
main properties:
1. It will have the same shape as the distribution of X values (if the X
distribution was normal, the z distribution will be normal).
2. It will always have a mean of zero.
3. It will always have a standard deviation of one (example on problem
This z-score distribution is called a standardized distribution (being
standardized now enables us to compare distributions that we weren't
able to compare before).
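The second and third properties can be checked numerically. The following is a minimal sketch; the list `raw_scores` is made-up illustrative data (not from the slides), and the population standard deviation (`pstdev`) is assumed as the divisor.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert every raw score in a list into a z score."""
    m = mean(scores)
    sd = pstdev(scores)  # assumption: treat the scores as a whole population
    return [(x - m) / sd for x in scores]


raw_scores = [50, 55, 60, 65, 70]  # hypothetical test scores
z_scores = standardize(raw_scores)

# Up to floating-point rounding, the mean is 0 and the SD is 1.
print(round(mean(z_scores), 6), round(pstdev(z_scores), 6))
```

If the scores were treated as a sample instead, `statistics.stdev` would replace `pstdev` in both places.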
28. z Scores Can Also Determine the Proportion of Scores That Fall Between Two Scores
Suppose that John received a score of 417 on the SAT exam. His cousin
Mark received a score of 567. The Joes family is always quarreling over
whose son is the brightest. Mark gets smart and says to John, "I blew
you away; there must be 50% of the students who took this test between
you and me." John is upset and wants to show his cousin he is wrong!
What must he do?
29. Comparing Raw Scores Using z Scores
The formula for changing X values into z-scores is
$z = \dfrac{X - \mu}{\sigma}$
Step 1: Convert both raw scores to z scores.
John's score:
$z = \dfrac{417 - 517}{100} = \dfrac{-100}{100} = -1.00$
30. Mark's Score
$z = \dfrac{567 - 517}{100} = \dfrac{50}{100} = .50$
31. Step 2: Using Appendix A
Find the z scores that correspond to -1.00 and .50. Appendix A tells us
that .8413 of the distribution falls below a z value of 1.00. Remember
that the mean splits the distribution 50/50, so 50% of scores fall below
the mean. .8413 - .50 = .3413, which tells us that 34.13% of the normal
distribution falls between the mean and a z score of 1.00. By symmetry,
the same proportion falls between a z score of -1.00 and the mean.
32. Step 2 Continued
Using the same process, we know that 69.15% of the distribution falls
below a z score of .50. Thus 19.15% of the scores fall between the mean
and a z score of .50. Recall that one z score is positive and the other
is negative, so we add the two areas to find the total proportion of
scores between them: .3413 + .1915 = .5328, or 53.28%.
John must accept defeat!
33. You Are on Your Own.
Mark has another cousin, Martin, who scored 617 on the Math test.
1. Determine the proportion of the population that scored between 617
and 567 (Mark's score).
34. Sweet Revenge!
The answer is 14.98%. In your face, Mark!
35. Take a Break
Stop here!
36. Scatter Plots
We can prepare a scatter plot by placing one point for each pair of two
variables that represent an observation in the data set. The
scatter plot provides a picture of the data including the
following: 1. Range of each variable; 2. Pattern of values over the
range; 3. A suggestion as to a possible relationship between the
two variables; 4. Indication of outliers (extreme points).
37. Here Are the Types of Scatter Plots You Are Likely to See:
This
could show how the distance travelled in a vehicle increases as
time increases, if the vehicle maintains a constant speed. This
could show the increase in a student's height as their grade
38. Scattergrams or Scatter Plots
Scatter plots are used by researchers to look for correlations. A
correlation is a relationship between the data, which can suggest
that one event may affect another event. For example, you might
want to discover whether more hours of studying will affect your
Math mark in school. Perhaps a scientist wants to find out if the
distance people live from a major city affects their health.
39. X and Y Axis
In order to use scatter plots in
this way, you must have two sets of numerical data. One set is
plotted on the x-axis of a graph, and the other set is plotted on
the y-axis. The resulting scatter plot will often show at a glance
whether a relationship exists between the two sets of data.
40. Example: Relationship Between Hours Studying and Test Score
Here's an example. Suppose you want to find out whether more
hours spent studying will have an effect on a person's mark. You
set up an experiment with some people, recording how many hours
they spent studying and then recording what happened to their mark.
A correlation is a relationship between the data, which can suggest
that one event may affect another event.
41. Seeing Patterns to Determine Relationship
You can see the data in the
table at the right. It's difficult to see any pattern in the table,
although it's clear that different things happened to different
people. One person studied for 1 hour and had their mark go up 2%,
while another person who also studied for 1 hour saw a drop of
1%!
42. Line of Best Fit
Here is the graph again.
We've shown a line that seems to describe the direction the points
are heading in. This is called the line of best fit.
43. Representation of Positive Correlations
44. Positive Correlations
45. Negative
46. No Correlation
47. Covariance
The covariance is a measure of the linear relationship between two
variables. A positive value indicates a direct or increasing linear
relationship and a negative value indicates a decreasing linear
relationship. The covariance calculation is defined by the equation
$$\mathrm{Cov}(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{X})(y_i - \bar{Y})}{n - 1}$$
where $x_i$ and $y_i$ are the observed values, $\bar{X}$ and $\bar{Y}$
are the sample means, and $n$ is the sample size.
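The covariance equation translates almost term by term into code. The following is a minimal sketch; the `hours` and `marks` lists are made-up illustrative data (not from the slides), and the function name `sample_covariance` is an illustrative choice.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of paired deviations divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    paired_deviations = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return paired_deviations / (n - 1)


hours = [1, 2, 3, 4, 5]        # hypothetical hours spent studying
marks = [52, 55, 61, 64, 68]   # hypothetical test marks

cov = sample_covariance(hours, marks)
print(f"Cov(hours, marks) = {cov:.2f}")  # Cov(hours, marks) = 10.25
```

The positive result reflects that marks tend to rise with hours in this made-up data; reversing the order of `marks` would flip the sign.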
48. Covariance: Scatter Plots of Idealized Positive and Negative Covariance
[Figure 3.5a: Positive Covariance, Y plotted against X]
[Figure 3.5b: Negative Covariance, Y plotted against X]
49. Correlation Coefficient
The sample correlation coefficient, $r_{xy}$, is computed by the
equation
$$r_{xy} = \frac{\mathrm{Cov}(x, y)}{s_x s_y}$$
50. Correlation Coefficient
1. The correlation ranges from -1 to +1:
$r_{xy} = +1$ indicates a perfect positive linear relationship; the X
and Y points would plot an increasing straight line.
$r_{xy} = 0$ indicates no linear relationship between X and Y.
$r_{xy} = -1$ indicates a perfect negative linear relationship; the X
and Y points would plot a decreasing straight line.
2. Positive correlations indicate positive or increasing linear
relationships, with values closer to +1 indicating data points closer to
a straight line and values closer to 0 indicating greater deviations
from a straight line.
3. Negative correlations indicate decreasing linear relationships, with
values closer to -1 indicating points closer to a straight line and
values closer to 0 indicating greater deviations from a straight line.
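The range of the correlation coefficient can be illustrated by dividing the sample covariance by the two sample standard deviations. The following is a minimal sketch; the data lists are made-up illustrative values (not from the slides), and the function name `sample_correlation` is an illustrative choice.

```python
from statistics import mean, stdev


def sample_correlation(xs, ys):
    """Sample correlation: covariance divided by the product of the SDs."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))


hours = [1, 2, 3, 4, 5]          # hypothetical hours spent studying
marks = [52, 55, 61, 64, 68]     # hypothetical marks, roughly increasing
falling = [10, 8, 6, 4, 2]       # hypothetical values on a decreasing line

r_marks = sample_correlation(hours, marks)
r_falling = sample_correlation(hours, falling)

print(f"r(hours, marks)   = {r_marks:.3f}")    # close to +1, not exactly +1
print(f"r(hours, falling) = {r_falling:.3f}")  # points lie on a falling line
```

Because the `falling` values lie exactly on a decreasing straight line, their correlation with `hours` is -1 up to floating-point rounding, while the `marks` values scatter slightly around an increasing line.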
51. Scatter Plots and Correlation (Figure 3.6)
[Panel (a): r = .8, Y plotted against X]
52. Scatter Plots and Correlation (Figure 3.6)
[Panel (b): r = -.8, Y plotted against X]
53. Scatter Plots and Correlation (Figure 3.6)
[Panel (c): r = 0, Y plotted against X]
54. Linear Relationships
Linear relationships can be represented by the basic equation
$$Y = \beta_0 + \beta_1 X$$
where Y is the dependent or endogenous variable that is a function of X,
the independent or exogenous variable. The model contains two
parameters, $\beta_0$ and $\beta_1$, that are defined as model
coefficients. The coefficient $\beta_0$ is the intercept on the Y-axis
and the coefficient $\beta_1$ is the change in Y for every unit change
in X.
55. Linear Relationships (continued)
The nominal assumption made in linear applications is that different
values of X can be set and there will be a corresponding mean value of Y
that results because of the underlying linear process being studied. The
linear equation model computes the mean of Y for every value of X. This
idea is the basis for Pearson's Product Moment Coefficient in obtaining
and partitioning the relationship between variables and how much effect
they have on one another. In educational issues, there is no
56. Least Squares Regression
Least Squares Regression is a technique used to obtain estimates (i.e.,
numerical values) for the linear coefficients $\beta_0$ and $\beta_1$.
These estimates are usually defined as $b_0$ and $b_1$, respectively.
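The slides do not give the estimating formulas, so the following is a minimal sketch using the standard simple-regression results $b_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $b_0 = \bar{y} - b_1 \bar{x}$. The data lists are made-up illustrative values, and the function name `least_squares` is an illustrative choice.

```python
def least_squares(xs, ys):
    """Return (b0, b1), the least squares intercept and slope estimates."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx             # estimated change in Y per unit change in X
    b0 = y_bar - b1 * x_bar    # estimated intercept on the Y-axis
    return b0, b1


hours = [1, 2, 3, 4, 5]        # hypothetical hours spent studying
marks = [52, 55, 61, 64, 68]   # hypothetical test marks

b0, b1 = least_squares(hours, marks)
print(f"fitted line: mark = {b0:.1f} + {b1:.1f} * hours")
```

With these made-up numbers, the fitted slope says that each extra hour of study goes with about 4.1 more marks on average.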
57. Cross Tables
Cross tables present
the number of observations that are defined by the joint occurrence
of specific intervals for two variables. The combination of all
possible intervals for the two variables defines the cells in a
table.
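A cross table can be built by counting how often each combination of intervals occurs. The following is a minimal sketch using `collections.Counter`; the interval labels and observations are made-up illustrative data (not from the slides).

```python
from collections import Counter

# Each observation pairs an interval of one variable with an interval of
# the other: (hours studied, mark). All values here are hypothetical.
observations = [
    ("0-2 h", "below 60"),
    ("0-2 h", "below 60"),
    ("0-2 h", "60 or above"),
    ("3-5 h", "60 or above"),
    ("3-5 h", "60 or above"),
    ("3-5 h", "below 60"),
]

# Each key is one cell of the cross table; each value is its count.
cross_table = Counter(observations)

rows = ["0-2 h", "3-5 h"]
columns = ["below 60", "60 or above"]
for row in rows:
    counts = [cross_table[(row, col)] for col in columns]
    print(row, counts)
```

Every observation lands in exactly one cell, so the cell counts add up to the number of observations.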
58. Key Words
Least Squares Estimation Procedure
Least Squares Regression
Sample Correlation Coefficient
Sample Covariance
Scatter Plot