Download - Zscores

1. Summarizing Descriptive Relationships: Let's Review
Dr. Pedro L. Martinez, Introduction to Educational Research

2. What We Have Learned!
   1. Inferential statistics helps us determine whether what we have observed in a sample represents a similar phenomenon in the population.
   2. The assumption is that our sample is quite similar to the population being studied, and we operate under the premise that we have obtained a normal distribution in our sample when looking at scores of any type.

3. More
   3. Normal distributions have standard deviations and are symmetrical.
   4. There is a probability that our sample does not give us a normal distribution (error); if that is the case, we cannot make inferences.

4. What Do We Do?
   A. We try to make our distribution as normal as possible so that it represents the one in the population.
   B. When this does not happen, we rectify the problem by using other statistics that are associated with central tendency measures.
   Solutions: z scores (z) and the family of t scores (t).

5. What Are z and t Scores?
A z score is used to determine where one particular score stands in relation to the rest of the scores in a distribution. Central tendency measures give us parameters but not the distance of each score from the mean. However, using the mean and the standard deviation allows us to calculate the z score. z scores also help us compare two individual scores on two different variables (e.g., a Math test score and a Spelling score).

6. The Answer Is: Standardization
z scores help us standardize scores in order to compare individual scores on different variables. Different tests use different scales, which does not permit us to compare scores from different distributions unless we standardize them. Standardization is the process of converting each individual score in a distribution to a z score, thus telling you how far from the mean a given score is.

7. Example
Student X has taken two exams, one in Biology and the other in Statistics. Here are the scores, given as the number of correct answers in each exam:
   Biology: 65 out of 100 items
   Statistics: 42 out of 200 items
Question: In which test did Student X do better? What do you mean by better?

8. Let's Look at Each of These Distributions:
                Score   Mean   SD
   Biology        65     60    10
   Statistics     42     37     5
So in which test did Student X perform better?

9. What Did You Mean by Better?
   1. If I am asking about the percentage of correct answers, then my obvious answer is __________.
   2. But wait a minute, that is not fair: the Statistics exam was more difficult!
   3. What is the dilemma?

10. Your Answer Should Be:
   A) How did Student X do in comparison to other students?
   B) We could answer this by looking at the mean and standard deviation in each of the two exams. They are different distributions.

11. Conclusions
   1. Can we compare two scores if the scales are different?
      A) Depending on your answer, what is the next step?
      B) Think before you answer!

12. Standardization
You cannot compare two different scores when the scales are different. We need the same scale (standardization). When we take raw scores from a test, we can convert them into standard deviation units through the use of z scores.
Formula: $z = \dfrac{\text{raw score} - \text{mean}}{\text{standard deviation}}$
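
The z score formula can be tried on Student X's two exams from the earlier example. The following is a minimal Python sketch; the function name `z_score` is an illustrative choice and does not come from the slides.

```python
def z_score(raw, mean, sd):
    """Return how many standard deviations `raw` lies from `mean`."""
    return (raw - mean) / sd

# Scores, means, and SDs taken from the Biology/Statistics example.
z_biology = z_score(65, 60, 10)
z_statistics = z_score(42, 37, 5)

print(f"Biology z = {z_biology:.2f}")
print(f"Statistics z = {z_statistics:.2f}")
```

On the raw percentages Biology looks better, but in standard deviation units the Statistics score sits further above its own mean.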

13. Let's pretend that Student X took a spelling test and received a z score of 1.0. What can you tell from this score?
   1) __________________________
   2) __________________________
   3) __________________________

14. Now That You Know How to Find a z Score, Let's Consider the Following:
When a distribution of scores is standardized, the average (mean) of the distribution is 0 and the standard deviation is 1.0.
What does a z score tell us if z = -1.5?
What does a z score tell us if z = .29?
It can also tell us:
   A) Whether an individual does better or worse than the average person.
   B) How much a score is above or below the average.
   C) Whether the score is better or worse relative to the rest of the scores.

15. The z Score Formula Depends on Whether You Are Observing a Population or a Sample
A normal distribution that is standardized (so that it has a mean of 0 and an SD of 1) is called the standard normal distribution, or the normal distribution of z-scores. If we know the mean $\mu$ ("mu") and standard deviation $\sigma$ ("sigma") of a set of scores which are normally distributed, we can standardize each "raw" score, $x$, by converting it into a z score using the following formula on each individual score:
$z = \dfrac{x - \mu}{\sigma}$
For a sample, the formula is
$z = \dfrac{x - \bar{x}}{s}$
where $\bar{x}$ and $s$ are used as estimators for the population's true mean and standard deviation. Both formulas essentially calculate the same thing: how many standard deviations a score lies from the mean.

16. What Is It from This Score That I Do Not Know?
   1. I don't know if the student did better or worse than the average score.
   2. I don't know how much the score is below or above the mean.
   3. I don't know how much better or worse this score is in comparison to the rest of the scores in the distribution from that given Spelling test.

17. Suppose I Told You the Following:
The average score on that Spelling test was 12 and the total number of items on the test was 50. The test taker is 7 years old! Don't despair: statisticians have already figured this out and can predict the percentage of scores that will fall between the mean and a z score!

18. z Scores Can Provide You With:
   1) A way to determine percentile scores.
   2) A mean that is equal to 0 in z score units.
   3) From this point, we can determine that 50% of scores fall on either side of the mean. Can you explain why?

19. Figuring Out Percentiles with z Scores: Step 1
The average SAT score for a white male is 517. Suppose I want to know what score marks the 90th percentile.
Step 1: Use the z score table and find the z score that falls closest to the 90th percentile. The closest table value is .8997. The z score is 1.28 (intersection). So a z score of 1.28 corresponds to the 90th percentile. What would be a z score that represents the 75th percentile?
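
The table lookup in Step 1 can be checked in software. This sketch uses `NormalDist` from Python's standard `statistics` module in place of the printed z table; the helper name `z_for_percentile` is hypothetical and is not part of the slides.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def z_for_percentile(percentile):
    """Return the z score below which `percentile` percent of scores fall."""
    return standard_normal.inv_cdf(percentile / 100)

z_90 = z_for_percentile(90)
print(f"z at the 90th percentile = {z_90:.2f}")

# Going the other way: the area below z = 1.28, as listed in the table.
print(f"area below z = 1.28: {standard_normal.cdf(1.28):.4f}")
```

The same helper can be called with any other percentile to check a value read from the table.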

20. Step 2: Convert z Score to a Raw Score
We know what z score represents the 90th percentile, and we know the mean is 517. But we do not know the raw score that marks the 90th percentile. X values can be changed into z-scores just as z-scores can be changed into X values.
Step 2: Convert the z score into the original unit of measurement. We use this formula:

21. z-Scores (cont.)
The formula for changing z-scores into X values is
$X = \mu + (z)(\sigma)$
$X = 517 + (1.28)(100)$
$X = 517 + 128$
$X = 645$
Here X is the raw score that corresponds to the z score, $\mu$ is the mean (517), and $\sigma$ is the standard deviation (100).

22. The Answer Using This Formula Is
X = 517 + (1.28)(100)
X = 517 + 128
X = 645
The score that marks the 90th percentile for white males who took the SAT in 2008 is 645.
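
The conversion from a z score back to a raw score is a single line of arithmetic. In this short sketch the standard deviation of 100 is read off the worked arithmetic in the slides, and the function name `raw_from_z` is an illustrative assumption.

```python
def raw_from_z(z, mean, sd):
    """Convert a z score back into the original unit of measurement."""
    return mean + z * sd

MEAN = 517  # mean SAT score given in the slides
SD = 100    # standard deviation implied by the worked arithmetic

x_90 = raw_from_z(1.28, MEAN, SD)
print(f"Raw score at the 90th percentile: {x_90:.0f}")
```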

23. We Can Also Use a z Score to Convert a Known Raw Score into a Percentile Score
Suppose Student X in my SAT distribution has a score of 425 on the SAT Math test, and I want to know how many students scored above or below this score. Then:
Step 1: Convert the raw score into a z score.

24. Step 1: Convert Raw Score to a z Score
$z = \dfrac{425 - 517}{100} = \dfrac{-92}{100} = -.92$

25. Step 2: Use Appendix A
Find the z score that is equivalent to .92 by moving vertically down the left column. The table does not report negative z scores because the normal distribution is symmetrical, so the proportion falling beyond a given distance above the mean is the same as the proportion falling beyond that distance below it. So what does the z score tell me? Using the table, it tells me that 82.12% of scores fall below a z score of .92. It also tells me that 17.88% of the distribution falls beyond a z value of .92. By symmetry, 17.88% of scores fall below a z score of -.92.

26. Step 3
A z score of -.92 corresponds to a raw score of 425 on the SAT Math exam. A score of 425 on this test marks the 17.88th percentile in the distribution of white males taking the exam in 2008.
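
The three steps for turning a raw score into a percentile can be sketched in Python as follows. The standard library's `NormalDist` stands in for Appendix A here, and the standard deviation of 100 is the value used in the slides' arithmetic.

```python
from statistics import NormalDist

MEAN, SD = 517, 100  # values used in the SAT example
raw = 425

# Step 1: convert the raw score into a z score.
z = (raw - MEAN) / SD

# Step 2: look up the area below that z score (replaces the table lookup).
proportion_below = NormalDist().cdf(z)
proportion_above = 1 - proportion_below

# Step 3: report the result as percentages.
print(f"z = {z:.2f}")
print(f"{proportion_below * 100:.2f}% scored below {raw}")
print(f"{proportion_above * 100:.2f}% scored above {raw}")
```

Because `cdf` accepts negative z values directly, the symmetry argument needed for the printed table is handled automatically.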

27. z-Scores (cont.)
We are able to transform every raw score in our distribution into a z-score, giving a distribution of z-scores. This new distribution of z-scores will have 3 main properties:
   1. It will have the same shape as the distribution of X values (if the X distribution was normal, the z distribution will be normal).
   2. It will always have a mean of zero.
   3. It will always have a standard deviation of one (example on problem
This z-score distribution is called a standardized distribution (being standardized now enables us to compare distributions that we weren't able to compare before).
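
The second and third properties can be verified numerically. The sketch below standardizes a small set of made-up raw scores (the data are hypothetical and chosen only for illustration) using the sample mean and sample standard deviation.

```python
from statistics import fmean, stdev

raw_scores = [12, 15, 9, 20, 14]  # hypothetical raw scores

mean = fmean(raw_scores)
sd = stdev(raw_scores)  # sample standard deviation (divides by n - 1)

# Standardize every raw score.
z_scores = [(x - mean) / sd for x in raw_scores]

z_mean = fmean(z_scores)
z_sd = stdev(z_scores)

print(f"mean of z scores = {z_mean:.4f}")
print(f"SD of z scores   = {z_sd:.4f}")
```

Apart from floating-point rounding, the mean comes out as 0 and the standard deviation as 1, whatever raw scores are used.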

28. z Scores Can Also Determine the Proportion of Scores That Fall Between Two Scores
Suppose that John received a score of 417 on the SAT exam. His cousin Mark received a score of 567. The Joes family is always quarreling over whose son is the brightest. Mark gets smart and says to John, "I blew you away; there must be 50% of the students who took this test between you and me." John is upset and wants to show his cousin he is wrong! What must he do?

29. Comparing Raw Scores Using z Scores
The formula for changing X values into z-scores is $z = \dfrac{X - \mu}{\sigma}$
Step 1: Convert both raw scores to z scores.
John's score: $z = \dfrac{417 - 517}{100} = \dfrac{-100}{100} = -1.00$

30. Mark's Score
$z = \dfrac{567 - 517}{100} = \dfrac{50}{100} = .50$

31. Step 2: Using Appendix A
Find the z scores that correspond to -1.00 and .50. Appendix A tells us that .8413 of the distribution falls below a z value of 1.00. Remember that the mean splits the distribution 50/50, so 50% of scores fall below the mean. Subtracting, .8413 - .50 = .3413, which tells us that 34.13% of the normal distribution falls between the mean and a z score of 1.00 and, by symmetry, between the mean and a z score of -1.00.

32. Step 2 Continued
Using the same process, we know that 69.15% of the distribution falls below a z score of .50. Thus 19.15% of the scores fall between the mean and a z score of .50. Recall that one z score is positive and the other is negative, so the two areas lie on opposite sides of the mean. If we add both areas, we find the total proportion of the distribution between these two scores: .3413 + .1915 = .5328, or 53.28%. John must accept defeat!
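
The proportion of scores between John and Mark can also be computed directly as a difference of two cumulative areas. In this sketch `NormalDist` from the standard library stands in for Appendix A, and the function name `proportion_between` is hypothetical.

```python
from statistics import NormalDist

def proportion_between(low, high, mean, sd):
    """Proportion of a normal distribution lying between two raw scores."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)

# John scored 417 and Mark scored 567; mean 517 and SD 100 as in the slides.
share = proportion_between(417, 567, mean=517, sd=100)
print(f"{share * 100:.2f}% of scores fall between John and Mark")
```

Subtracting the two areas below each score gives the same result as adding the two areas on either side of the mean.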

33. You Are on Your Own.
Mark has another cousin, Martin, who scored 617 on the Math test.
   1. Determine the proportion of the population that scored between 617 and 567.

34. Sweet Revenge!
The answer is 14.98%. In your face, Mark!

35. Take a Break
Stop here!

36. Scatter Plots
We can prepare a scatter plot by placing one point for each pair of values of two variables that represents an observation in the data set. The scatter plot provides a picture of the data including the following:
   1. Range of each variable;
   2. Pattern of values over the range;
   3. A suggestion as to a possible relationship between the two variables;
   4. Indication of outliers (extreme points).

37. Here Are the Types of Scatter Plots You Are Likely to See:
This could show how the distance travelled in a vehicle increases as time increases, if the vehicle maintains a constant speed.
This could show the increase in a student's height as their grade

38. Scattergrams or Scatter Plots
Scatter plots are used by researchers to look for correlations. A correlation is a relationship between the data, which can suggest that one event may affect another event. For example, you might want to discover whether more hours of studying will affect your Math mark in school. Perhaps a scientist wants to find out if the distance people live from a major city affects their health.

39. X and Y Axis
In order to use scatter plots in this way, you must have two sets of numerical data. One set is plotted on the x-axis of a graph, and the other set is plotted on the y-axis. The resulting scatter plot will often show at a glance whether a relationship exists between the two sets of data.

40. Example: Relationship Between Hours Studying and Test Score
Here's an example. Suppose you want to find out whether more hours spent studying will have an effect on a person's mark. You set up an experiment with some people, recording how many hours they spent studying and then recording what happened to their mark. A correlation is a relationship between the data, which can suggest that one event may affect another event.

41. Seeing Patterns to Determine Relationship
You can see the data in the table at the right. It's difficult to see any pattern in the table, although it's clear that different things happened to different people. One person studied for 1 hour and had their mark go up 2%, while another person who also studied for 1 hour saw a drop of 1%!

42. Line of Best Fit
Here is the graph again. We've shown a line that seems to describe the direction the points are heading in. This is called the line of best fit.

43. Representation of Positive Correlations

44. Positive Correlations

45. Negative

46. No Correlation

47. Covariance
The covariance is a measure of the linear relationship between two variables. A positive value indicates a direct or increasing linear relationship and a negative value indicates a decreasing linear relationship. The covariance calculation is defined by the equation
$\mathrm{Cov}(x, y) = s_{xy} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{X})(y_i - \bar{Y})}{n - 1}$
where $x_i$ and $y_i$ are the observed values, $\bar{X}$ and $\bar{Y}$ are the sample means, and $n$ is the sample size.
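
The sample covariance can be computed step by step. The sketch below uses made-up hours-studied and mark data (hypothetical values, not taken from the slides) and divides by n - 1 as in the sample formula.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / (n - 1)

hours = [1, 2, 3, 4, 5]  # hypothetical hours studied
marks = [2, 4, 5, 4, 5]  # hypothetical change in mark

cov_xy = sample_covariance(hours, marks)
print(f"Cov(x, y) = {cov_xy}")
```

A positive result like this one indicates that the two variables tend to increase together.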

48. Covariance: Scatter Plots of Idealized Positive and Negative Covariance
[Figure 3.5a: Positive Covariance, Y plotted against X] [Figure 3.5b: Negative Covariance, Y plotted against X]

49. Correlation Coefficient
The sample correlation coefficient, $r_{xy}$, is computed by the equation
$r_{xy} = \dfrac{\mathrm{Cov}(x, y)}{s_x s_y}$

50. Correlation Coefficient
The correlation ranges from -1 to +1:
   $r_{xy} = +1$ indicates a perfect positive linear relationship: the X and Y points would plot an increasing straight line.
   $r_{xy} = 0$ indicates no linear relationship between X and Y.
   $r_{xy} = -1$ indicates a perfect negative linear relationship: the X and Y points would plot a decreasing straight line.
   1. Positive correlations indicate positive or increasing linear relationships, with values closer to +1 indicating data points closer to a straight line and values closer to 0 indicating greater deviations from a straight line.
   2. Negative correlations indicate decreasing linear relationships, with values closer to -1 indicating points closer to a straight line and values closer to 0 indicating greater deviations from a straight line.
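
The sample correlation coefficient is the covariance divided by the product of the two sample standard deviations. The following sketch applies that definition to hypothetical data; all function names are illustrative assumptions.

```python
from math import sqrt

def sample_covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_sd(values):
    """Sample standard deviation: square root of the covariance of a variable with itself."""
    return sqrt(sample_covariance(values, values))

def correlation(xs, ys):
    """Sample correlation coefficient r_xy = Cov(x, y) / (s_x * s_y)."""
    return sample_covariance(xs, ys) / (sample_sd(xs) * sample_sd(ys))

hours = [1, 2, 3, 4, 5]  # hypothetical data
marks = [2, 4, 5, 4, 5]  # hypothetical data

r = correlation(hours, marks)
r_increasing = correlation([1, 2, 3, 4], [3, 5, 7, 9])  # points on a rising line
r_decreasing = correlation([1, 2, 3, 4], [9, 7, 5, 3])  # points on a falling line

print(f"r = {r:.4f}")
print(f"rising line: r = {r_increasing:.2f}, falling line: r = {r_decreasing:.2f}")
```

Points that lie exactly on a straight line give the extreme values of +1 or -1, while scattered data give a value in between.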

51. Scatter Plots and Correlation (Figure 3.6): (a) r = .8

52. Scatter Plots and Correlation (Figure 3.6): (b) r = -.8

53. Scatter Plots and Correlation (Figure 3.6): (c) r = 0

54. Linear Relationships
Linear relationships can be represented by the basic equation
$Y = \beta_0 + \beta_1 X$
where Y is the dependent or endogenous variable that is a function of X, the independent or exogenous variable. The model contains two parameters, $\beta_0$ and $\beta_1$, that are defined as model coefficients. The coefficient $\beta_0$ is the intercept on the Y-axis and the coefficient $\beta_1$ is the change in Y for every unit change in X.

55. Linear Relationships (continued)
The nominal assumption made in linear applications is that different values of X can be set and there will be a corresponding mean value of Y that results because of the underlying linear process being studied. The linear equation model computes the mean of Y for every value of X. This idea is the basis for Pearson's Product Moment Coefficient in obtaining and partitioning the relationship between variables and how much effect they have on one another. In educational issues, there is no

56. Least Squares Regression
Least Squares Regression is a technique used to obtain estimates (i.e., numerical values) for the linear coefficients $\beta_0$ and $\beta_1$. These estimates are usually denoted $b_0$ and $b_1$, respectively.
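
For a single predictor, the least squares estimates have a simple closed form: the slope is the sample covariance divided by the sample variance of X, and the intercept makes the line pass through the two means. The sketch below applies those textbook formulas to hypothetical data; the slides themselves do not show the computation.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the fitted line y = b0 + b1 * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    b1 = s_xy / s_xx           # slope: change in y per unit change in x
    b0 = y_bar - b1 * x_bar    # intercept: line passes through the means
    return b0, b1

hours = [1, 2, 3, 4, 5]  # hypothetical data
marks = [2, 4, 5, 4, 5]  # hypothetical data

b0, b1 = least_squares(hours, marks)
print(f"fitted line: y = {b0:.1f} + {b1:.1f} x")
```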

57. Cross Tables
Cross tables present the number of observations that are defined by the joint occurrence of specific intervals for two variables. The combination of all possible intervals for the two variables defines the cells in a table.
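
A cross table can be built by counting how often each pair of intervals occurs together. This is a small sketch using `Counter` from the standard library; the interval labels and observations are invented for illustration.

```python
from collections import Counter

# Each observation pairs an hours-studied interval with a mark interval (hypothetical data).
observations = [
    ("0-1 h", "below 50"), ("0-1 h", "below 50"), ("0-1 h", "50 or above"),
    ("2-3 h", "below 50"), ("2-3 h", "50 or above"), ("2-3 h", "50 or above"),
]

# Count the joint occurrences; each distinct pair is one cell of the table.
cells = Counter(observations)

rows = sorted({hours for hours, _ in observations})
cols = sorted({mark for _, mark in observations})

print("hours".ljust(8) + "".join(col.ljust(14) for col in cols))
for row in rows:
    print(row.ljust(8) + "".join(str(cells[(row, col)]).ljust(14) for col in cols))
```

Every combination of a row interval and a column interval defines one cell, and the cell counts add up to the total number of observations.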

58. Key Words
   Least Squares Estimation Procedure
   Least Squares Regression
   Sample Correlation Coefficient
   Sample Covariance
   Scatter Plot

59. References for Additional Help
http://www.worsleyschool.net/science/files/scat
http://www.oswego.edu/~srp/stats/z.htm