SW388R6 Data Analysis and Computers I Slide 1 Percentiles and Standard Scores Sample Percentile...


SW388R6 Data Analysis and Computers I

Slide 1

Percentiles and Standard Scores

Sample Percentile Homework Problem
Solving the Percentile Problem with SPSS
Sample Standard Score Homework Problem
Solving the Standard Score Problem with SPSS
Logic for Percentile Problems
Logic for Standard Score Problems

Slide 2

Based on percentiles for the variable "occupational prestige score" [prestg80] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?

A value of 66 for the variable "occupational prestige score" would position a survey respondent in the top 10% of cases in the data set.

o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Percentiles - 1

This is the general framework for the problems in the homework assignment on percentiles and zscores. You will be asked whether or not a particular value or score can accurately be characterized as placing a subject in the top 5% or 10% of the cases in the dataset.

Slide 3

Based on percentiles for the variable "occupational prestige score" [prestg80] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?

A value of 66 for the variable "occupational prestige score" would position a survey respondent in the top 10% of cases in the data set.

o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Percentiles - 2

The first paragraph identifies:

  • The data set to use, e.g. GSS2000R.Sav
  • The statistic to use, e.g. percentile or zscore
  • The variable used in the analysis

Slide 4

Based on percentiles for the variable "occupational prestige score" [prestg80] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?
A value of 66 for the variable "occupational prestige score" would position a survey respondent in the top 10% of cases in the data set.

o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Percentiles - 3

The second paragraph identifies:

  • The value of the variable to test
  • The percentage expected for the value tested

Slide 5

Based on percentiles for the variable "occupational prestige score" [prestg80] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?

A value of 66 for the variable "occupational prestige score" would position a survey respondent in the top 10% of cases in the data set.

o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Percentiles - 4

The answer to a problem will be True if the computed percentile for the tested value supports the finding in the problem statement.

The answer to a problem will be Incorrect application of a statistic if the computed statistic violates the level of measurement requirement, i.e. the variable is not ordinal or interval level.

The answer to a problem will be False if the computed percentile for the tested value does not support the finding in the problem statement.

True with caution is not needed for percentile problems because percentiles are legitimate for ordinal level variables as well as interval level variables.

Slide 6

Solving the problem with SPSS: Level of measurement

The calculation of percentiles requires that the variable be ordinal or interval level. "Occupational prestige score" [prestg80] is interval, satisfying the requirement.

Slide 7

Solving the problem with SPSS: Computing percentiles - 1

To add the percentile value to each case in the SPSS data set, select Rank Cases from the Transform menu.
Our first task in SPSS is to compute the percentiles for each case.

Slide 8

Solving the problem with SPSS: Computing percentiles - 2

First, select and move the variable prestg80 to the Variable(s) list box.

Second, click on the Rank Types button to choose the method for rank ordering cases.

Slide 9

Solving the problem with SPSS: Computing percentiles - 3

Mark the check box for Fractional rank as %. This will compute the percentile for each of the values of the variable prestg80.

Clear the check box for Rank, since this is information we do not need to solve the problem.

Click on the Continue button to close the dialog box.

Slide 10

Solving the problem with SPSS: Computing percentiles - 4

Back in the Rank Cases dialog, click on the Ties button to specify the way rank is assigned to scores that have the same numeric value.

Slide 11

Solving the problem with SPSS: Computing percentiles - 5

First, we mark the High option button. Since SPSS computes percentages at each rank by dividing the rank value by the total number of cases, this will give us the same percentages that we would get from a cumulative frequency distribution.

Second, we click on the Continue button to close the dialog box.

Slide 12

Solving the problem with SPSS: Computing percentiles - 6

Click on the OK button to obtain the output.

Slide 13

Solving the problem with SPSS: Computing percentiles - 7

The output contains a summary of the command options we selected. The percentile values are added to the data set.

Slide 14

Solving the problem with SPSS: Sorting the percentiles - 1

We can identify the score it will take to be in the top 10% if we sort the data.
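For readers who want to check the Rank Cases settings outside SPSS, here is a minimal Python sketch of the same idea: a fractional rank as a percentage, with ties assigned the High rank, followed by a search for the lowest score that is still in the top 10%. The function names and the toy scores are hypothetical illustrations, not the GSS2000R.Sav data, and the sketch assumes there are no missing values.

```python
from collections import Counter


def percent_rank_high(values):
    """Percentile for each distinct value, with ties given the highest rank.

    This equals the cumulative percentage of cases at or below the value.
    """
    n = len(values)
    counts = Counter(values)
    cumulative = 0
    percentiles = {}
    for value in sorted(counts):
        cumulative += counts[value]      # highest rank among the tied cases
        percentiles[value] = 100.0 * cumulative / n
    return percentiles


def lowest_value_in_top(values, top_percent):
    """Smallest value whose percentile is at or above 100 - top_percent."""
    cutoff = 100.0 - top_percent
    percentiles = percent_rank_high(values)
    return min(v for v, p in percentiles.items() if p >= cutoff)


# Hypothetical toy scores (an assumption for illustration only).
scores = [20, 30, 30, 40, 50, 50, 50, 60, 65, 70]
ranks = percent_rank_high(scores)
threshold = lowest_value_in_top(scores, 10)
print(ranks)
print(threshold)
```

A tested value is in the top 10% when it is at or above `threshold`, which mirrors the sort-and-scroll procedure in the slides.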
Scroll the data set to the right to see the percentile variable, Pprestg8. Then right click on the column header Pprestg8 and select Sort Descending from the popup menu.

Slide 15

Solving the problem with SPSS: Answering the question - 1

Being in the top 10% means that a case is in the 90th percentile or higher.

Scroll down the data set until you locate the last value before the percentile drops below 90. In this example, 87.84 is below the 90th percentile, so our answer is the value corresponding to the 92.16 percentile.

Slide 16

Solving the problem with SPSS: Answering the question - 2

Highlight the row corresponding to percentile 92.16 and scroll the data set to the left to locate the prestg80 variable.

The value in the prestg80 column on the highlighted row is 65. A score of 65 (or higher) would position a survey respondent in the top 10% of the cases. Since the tested value of 66 is higher than 65, the answer to the problem is True.

Slide 17

Removing the percentile variable

We do not need the variable that SPSS created for percentiles, so we will remove it from the data set.

First, click on the column header, Pprestg8, to select the variable to delete.

Second, select the Clear command from the Edit menu, or press the Delete key on your keyboard.

Slide 18

Based on standard scores for the variable "income" [rincom98] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?

A value of 23, or $110,000 or over, for the variable "income" would position a survey respondent in the top 5% of cases in the data set.

o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Standard Scores - 1

This is the general framework for the problems in the homework assignment on percentiles and standard scores.
You will be asked whether or not a particular value or score can accurately be characterized as placing a subject in the top 5% or 10% of the cases in the dataset.

Slide 19

Based on standard scores for the variable "income" [rincom98] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?

A value of 23, or $110,000 or over, for the variable "income" would position a survey respondent in the top 5% of cases in the data set.

o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Standard Scores - 2

The first paragraph identifies:

  • The data set to use, e.g. GSS2000R.Sav
  • The statistic to use, e.g. percentile or zscore
  • The variable used in the analysis

Slide 20

Based on standard scores for the variable "income" [rincom98] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?

A value of 23, or $110,000 or over, for the variable "income" would position a survey respondent in the top 5% of cases in the data set.

o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Standard Scores - 3

The second paragraph identifies:

  • The value of the variable to test
  • The percentage expected for the value tested

Slide 21

Based on standard scores for the variable "income" [rincom98] in the dataset GSS2000R.Sav, is the following statement true, false, or an incorrect application of a statistic?

A value of 23, or $110,000 or over, for the variable "income" would position a survey respondent in the top 5% of cases in the data set.
o True
o True with caution
o False
o Incorrect application of a statistic

Homework problems: Standard Scores - 4

The answer to a problem will be True if the computed percentile for the tested value supports the finding in the problem statement.

The answer to a problem will be Incorrect application of a statistic if the computed statistic violates the level of measurement requirement, i.e. the variable is not ordinal or interval level, or if the variable is not normally distributed.

The answer to a problem will be False if the computed percentile for the tested value does not support the finding in the problem statement.

The answer to a problem will be True with caution if the computed percentile for the tested value supports the finding in the problem statement, but the variable used is ordinal level.

Slide 22

Solving the problem with SPSS: Level of measurement

The calculation of standard scores requires that the variable be ordinal or interval level. "Income" is ordinal, satisfying the requirement.

Since not all data analysts agree with the convention of computing z-scores for ordinal variables, a caution will be added to any true findings.

Slide 23

Solving the problem with SPSS: Evaluating normality - 1

Using standard scores to determine the location of a value in the distribution assumes that the distribution of the variable is normal. If the distribution is not normal, we should use percentiles rather than standard scores.

We will generate descriptive statistics to evaluate normality at the same time we add zscores to the data set.

Select the Descriptive Statistics > Descriptives command from the Analyze menu.

Slide 24

Solving the problem with SPSS: Evaluating normality - 2

Second, click on the Options button to select the statistics we want.
First (before clicking Options), move the variable we will use in the analysis, rincom98, to the Variable(s) list box.

Slide 25

Solving the problem with SPSS: Evaluating normality - 3

First, in addition to the statistics SPSS has checked by default, mark the Kurtosis and Skewness check boxes on the Distribution panel.

Second, click on the Continue button to close the dialog box.

Slide 26

Solving the problem with SPSS: Evaluating normality - 4

To add the standard scores, or zscores, for rincom98 to the data set, mark the checkbox Save standardized values as variables.

Then click on the OK button to obtain the output.

Slide 27

Solving the problem with SPSS: Evaluating normality - 5

Obtaining accurate probabilities for standard scores, or zscores, requires that the distribution of the variable satisfy the criteria for a normal distribution.

"Income" satisfied the criteria for a normal distribution. The skewness of the distribution (-.686) was between -1.0 and +1.0 and the kurtosis of the distribution (-.253) was between -1.0 and +1.0.

Slide 28

Solving the problem with SPSS: Zscores in the data editor

Scroll the data editor window to the right to see the variable SPSS created for the zscores. SPSS's convention for naming the variable is to prepend the variable name with a Z, e.g. Zrincom98.

We need to identify the zscores whose probability of being exceeded is 0.05 or less (the top 5%). While we could use a table of normal probabilities from a textbook, we will use SPSS to compute the probabilities.

Slide 29

Solving the problem with SPSS: Computing probabilities for zscores - 1

To add the normal distribution probability for each zscore, select Compute from the Transform menu.
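The normality screen and the zscores produced by the Descriptives steps above can also be sketched in plain Python. This is only an illustration under stated assumptions: the function names and the two small samples are hypothetical, and the skewness and kurtosis use the bias-adjusted sample formulas, which are assumed here (not confirmed by the slides) to correspond to the statistics SPSS reports.

```python
import math


def zscores(values):
    """Standard scores using the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / sd for x in values]


def skewness(values):
    """Bias-adjusted sample skewness (assumed to match SPSS output)."""
    n = len(values)
    z = zscores(values)
    return n / ((n - 1) * (n - 2)) * sum(v ** 3 for v in z)


def kurtosis(values):
    """Bias-adjusted sample excess kurtosis (assumed to match SPSS output)."""
    n = len(values)
    z = zscores(values)
    first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum(v ** 4 for v in z)
    return first - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def roughly_normal(values, limit=1.0):
    """Rule of thumb from the slides: both statistics between -1.0 and +1.0."""
    return abs(skewness(values)) <= limit and abs(kurtosis(values)) <= limit


# Hypothetical samples, not the GSS2000R.Sav data.
bell = [1, 2, 2, 3, 3, 3, 4, 4, 5]      # peaked in the middle
flat = [1, 2, 3, 4, 5, 6, 7, 8, 9]      # evenly spread
print(roughly_normal(bell), roughly_normal(flat))
```

When `roughly_normal` returns False, the slides' advice applies: use percentiles rather than standard scores.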
Slide 30

Solving the problem with SPSS: Computing probabilities for zscores - 2

In the Compute Variable dialog, first, type the variable name you want to assign to the zscore probabilities in the Target Variable text box. I will use prob followed by the name of the zscore variable, e.g. probZrincom98.

Second, select CDF & Noncentral CDF from the Function group list box.

Third, select Cdfnorm in the Functions list box.

Fourth, move Cdfnorm to the Numeric Expression: text box using the triangle button.

The Cdfnorm function is the cumulative distribution function of the standard normal distribution; it returns the cumulative normal probability for a zscore.

Slide 31

Solving the problem with SPSS: Computing probabilities for zscores - 3

When you move the CDFNORM( ) function to the Numeric Expression: text box, SPSS will put a question mark in the parentheses to indicate that it needs more information, e.g. the name of the variable that it will compute probabilities for.

First, scroll the list of variables to the bottom and click on the Zrincom98 variable.

Second, click on the right arrow button to replace the question mark with the variable name.

Slide 32

Solving the problem with SPSS: Computing probabilities for zscores - 4

CDFNORM(Zrincom98) will calculate the probability from the left tail of the normal distribution up to the z-score value. Since we want the probability above the z-score value, we subtract CDFNORM(Zrincom98) from 1. The 1, or 100%, represents the total probability under the normal curve.

The formula for the probabilities is complete. Click on the OK button to close the dialog box.

Slide 33

Solving the problem with SPSS: Sorting the probabilities for zscores - 1

The variable is added to the data set.
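The expression 1 - CDFNORM(Zrincom98) can be checked outside SPSS. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for CDFNORM; the function name `upper_tail_probability` is a hypothetical helper, and the zscore 1.77545 is the one quoted in the slides for the tested income value.

```python
from statistics import NormalDist


def upper_tail_probability(z):
    """Probability of a standard normal value above z: 1 - CDF(z)."""
    return 1.0 - NormalDist().cdf(z)


# Zscore taken from the slides; they report a probability of .037912 for it.
p = upper_tail_probability(1.77545)
in_top_5 = p <= 0.05
print(round(p, 6), in_top_5)
```

Comparing `p` with 0.05 (or 0.10) replaces the sort-and-scroll step when only a single value needs to be tested.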
Our task of evaluating the probability associated with the value stated in the problem will be easier if we sort the data. Right click on the column header for probZrincom98 and select Sort Ascending from the pop-up menu.

Slide 34

Solving the problem with SPSS: Sorting the probabilities for zscores - 2

First, scroll down the values of probZrincom98 to locate the probabilities that are less than or equal to 0.05 (the top 5%). In this example, the zscore of 1.77545 has a probability (.037912) less than .05.

Second, click on the row number to highlight the values on the row. This will enable us to see what value for Rincom98 corresponds to this zscore and probability.

Slide 35

Solving the problem with SPSS: Answering the question

Scroll the data set to the right until the Rincom98 column is visible.

The standard score, or zscore, for the value of 23 on the variable "income" is 1.78. The probability of a zscore of 1.78 or higher is 0.04 (after rounding to 2 decimal places). This probability would position a survey respondent in the top 5% of cases in the data set.

The answer to the question is True with caution, since the variable is ordinal.

Slide 36

Removing the standard score variables

We do not need the variables that were created to solve the standard score problem, so we will delete them from the data set.

First, click on the column headers, Zrincom98 and probZrincom98, to select the variables to delete.

Second, select the Clear command from the Edit menu, or press the Delete key on your keyboard.

Slide 37

Logic for percentile problems: Level of measurement and percentile value

Measurement level of variable?

  • Nominal (dichotomous): Inappropriate application of a statistic
  • Interval/ordinal: Value is in top 5% or 10%?
      • Yes: True
      • No: False

Slide 38

Logic for standard score problems: Level of measurement and assumption of normality

Measurement level of variable?

  • Nominal (dichotomous): Inappropriate application of a statistic
  • Interval/ordinal: Skewness and Kurtosis between -1.0 and +1.0?
      • No: Inappropriate application of a statistic. When the variable is not normally distributed, use percentiles instead of zscores.
      • Yes: go on to the decision about the location of the value

Slide 39

Logic for standard score problems: Decision about location of value

Value is in top 5% or 10%?

  • Yes: True. Add caution for ordinal variable.
  • No: False
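The two decision flows above can be written as small functions. This is a hedged sketch of the logic as the slides describe it; the function names, the string labels for measurement level, and the argument names are hypothetical choices made for illustration.

```python
INCORRECT = "Incorrect application of a statistic"


def percentile_answer(level, in_top):
    """Percentile problems: only the level of measurement is checked."""
    if level not in ("ordinal", "interval"):
        return INCORRECT
    return "True" if in_top else "False"


def standard_score_answer(level, skewness, kurtosis, in_top):
    """Standard score problems: level of measurement, then normality."""
    if level not in ("ordinal", "interval"):
        return INCORRECT
    if not (-1.0 <= skewness <= 1.0 and -1.0 <= kurtosis <= 1.0):
        return INCORRECT        # not normal: use percentiles instead
    if not in_top:
        return "False"
    return "True with caution" if level == "ordinal" else "True"


# The two worked examples from the slides.
print(percentile_answer("interval", True))
print(standard_score_answer("ordinal", -0.686, -0.253, True))
```

The `in_top` argument is the outcome of the earlier steps: whether the tested value reaches the required percentile, or whether its upper-tail probability is at or below the stated percentage.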