Research Methods Hand Book


  • 8/6/2019 Research Methods Hand Book


    Table of Contents

    Part I Introduction
      Preface
      Research Design
        Elements of the Scientific Model
        Levels of Data
        Association
        External and Internal Validity
      Reporting Results
        Paper Format
        Table Format
        Critical Review Checklist

    Part II Describing Data
      Measures of Central Tendency
      Measures of Variation
      Standardized Z-Scores
      Proportions

    Part III Hypothesis Testing
      Steps to Hypothesis Testing
      Comparing Means
        Interval Estimation For Means
        Comparing a Population Mean to a Sample Mean
        Comparing Two Independent Sample Means
        Computing F-ratio
        Two Independent Sample Means (Cochran and Cox)
        One-Way Analysis of Variance
      Proportions
        Interval Estimation for Proportions
        Interval Estimation for the Difference Between Two Proportions
        Comparing a Population Proportion to a Sample Proportion
        Comparing Proportions From Two Independent Samples
      Chi-Square
        Chi-square Goodness of Fit Test
        Chi-square Test of Independence
        Coefficients for Measuring Association
      Correlation
        Pearson's Product Moment Correlation Coefficient
        Hypothesis Testing for Pearson r
        Spearman Rho Coefficient
        Hypothesis Testing for Spearman Rho
      Simple Linear Regression
      Sample Size Estimation

    Tables
      T Distribution Critical Values
      Z Distribution Critical Values


      Chi-square Distribution Critical Values
      F Distribution Critical Values

    Appendix
      Data File Basics
      Basic Formulas
      Glossary of Symbols
      Order of Mathematical Operations
      Definitions
      Practice Data Files

    Definitions

    1-tailed test: The probability of a Type I error is contained in one tail of the sampling distribution. Generally used when the direction of the difference between two populations can be supported by theory or other knowledge gained prior to testing for statistical significance.

    2-tailed test: The probability of a Type I error is split between both tails of the sampling distribution (e.g., alpha .05 means .025 is in one tail and .025 is in the other tail). Generally used when the direction of the difference between two populations cannot be supported by theory or other knowledge gained prior to testing for statistical significance.

    Alpha: The probability of a Type I error. Represents the threshold for claiming statistical significance.

    Association: Changes in one variable are accompanied by changes in another variable.

    Central Limit Theorem: As sample size increases, the sampling distribution of the mean approximates a normal distribution and is usually close to normal at a sample size of 30.

    Critical Value: The point on the x-axis of a sampling distribution beyond which the tail area equals alpha (or half of alpha in each tail for a 2-tailed test). It is interpreted in units of standard error. As an example, a critical value of 1.96 is interpreted as 1.96 standard errors above the mean of the sampling distribution.

    Continuous Variable: A variable that can take on any numerical value, even between one value and another. Examples: grade point average, distance in kilometers, loan interest rates.

    Dependent Variable: A measure not under the control of the researcher that reflects responses caused by variations in another measure (the independent variable).

    Descriptive Statistics: Statistics that classify and summarize numerical data.

    Discrete Variable: A variable that is limited to a finite number of distinct values, such as religion or number of parks in a city (you can't have 1.5 parks).

    Homoscedasticity: The variance of the Y scores in a correlation is uniform across the values of the X scores. In other words, the Y scores are equally spread above and below the regression line.

    Independent Variable: A measure that can take on different values, which are subject to manipulation by the researcher.

    Inferential Statistics: Statistics that use characteristics of a random sample along with measures of sampling error to predict the true values in a larger population.

    Interpretation Bias: Errors in data collection that occur when knowledge of the results of one test affects the interpretation of a second test.

    Interval Data: Objects classified by type or characteristic, with logical order and equal differences between levels of data.

    Kurtosis: The peakedness of a distribution. A leptokurtic distribution is more peaked than a normal distribution, a mesokurtic distribution has the peakedness of a normal distribution, and a platykurtic distribution is flatter.

    Mean: The arithmetic average of the scores in a sample distribution.

    Median: The point on a scale of measurement below which fifty percent of the scores fall.

    Measurement Scale: A reflection of how well a variable and/or concept can be measured. Generally categorized in order of precision as nominal, ordinal, interval, and ratio data.

    Mode: The most frequently occurring score in a distribution.

    Mu: The arithmetic average of the scores in a population.

    Nominal Data: Objects classified by type or characteristic.

    Normal Distribution: A bell-shaped frequency distribution of scores that is symmetric about its center, where the mean, median, and mode coincide.

    Ordinal Data: Objects classified by type or characteristic with some logical order.


    Parameter: The measure of a population characteristic.

    Population: Contains all members of a group.

    Power: Power is 1-Beta and is defined as the probability of correctly finding statistical significance. A common value for power is .80.

    P-value: The probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. It is compared against alpha and is commonly, if loosely, read as the probability of a Type I error.

    Random Sampling: Each and every element in a population has an equal opportunity of being selected.

    Ratio Data: Objects classified by type or characteristic, with logical order and equal differences between levels, and having a true zero starting point.

    Reliability: The extent to which a measure obtains similar results over repeated trials.

    Research Question: Defines the purpose of the study by clearly identifying the relationship(s) the researcher intends to investigate.

    Response Bias: Errors in data collection caused by differing patterns and completeness of data collection that are dominated by a specific subgroup within the sample.

    Response Variable: The measure not controlled in an experiment. Commonly known as the dependent variable.

    Sample: A subset of a population.

    Sample Distribution: A frequency distribution of sample data.

    Sampling Distribution: A probability distribution of a statistic (such as the mean) computed from an infinite number of samples of a given sample size.

    Skewness: Skewness provides an indication of how asymmetric the distribution is for a given sample. When estimated using the third moment, a value of 0 indicates a symmetric distribution, such as the normal distribution. A positive value indicates a positive skew (the right tail is longer than the left). A negative value indicates a negative skew (the left tail is longer than the right). Skewness values greater than 1 or less than -1 are generally taken to indicate a non-normal distribution.

    Spurious Correlation: An association between an independent and dependent variable that is better explained by a third variable.

    Statistic: Measure of a sample characteristic.

    Statistical Significance: Interpreted as the probability of a Type I error. Test statistics that meet or exceed a critical value are interpreted as evidence that the differences exhibited in the sample statistics are not due to random sampling error and therefore are evidence supporting the conclusion that there is a real difference in the populations from which the sample data were obtained.

    Type I Error: Rejecting a true null hypothesis. Commonly interpreted as the probability of being wrong when concluding there is statistical significance. Its probability is referred to as alpha or the significance level, and the p-value is compared against it.

    Type II Error: Retaining a false null hypothesis. Its probability is referred to as Beta.

    Unit of Analysis: The object under study. This could be people, schools, cities, etc.

    Validity: The extent to which a measure accurately represents an abstract concept.

    Variable: A characteristic that can take different values from one observation to another.
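
    The 1-tailed test, 2-tailed test, and Critical Value entries above can be tied together with a small calculation. The following is a minimal sketch, assuming a standard normal (Z) sampling distribution; the function name z_critical is hypothetical and is not part of the handbook or its software.

```python
from statistics import NormalDist


def z_critical(alpha: float, tails: int) -> float:
    """Return the positive critical z value for a 1- or 2-tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A 2-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / tails
    # inv_cdf gives the z value with (1 - tail_area) of the area to its left.
    return NormalDist().inv_cdf(1 - tail_area)


two_tailed = z_critical(0.05, tails=2)
one_tailed = z_critical(0.05, tails=1)
print(round(two_tailed, 2), round(one_tailed, 3))
```

    With alpha at .05, the 2-tailed value rounds to the 1.96 quoted in the Critical Value entry, while the 1-tailed value is smaller because all of alpha sits in one tail.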

    Research Design

    The objective of science is to explain reality in such a fashion that others may develop their own conclusions based on the evidence presented. The goal of this handbook is to help you learn a systematic approach to understanding the world around us that employs specific rules of inquiry, known as the scientific model.

    The scientific model helps us create research that is quantifiable (measured in some fashion), verifiable (others can substantiate our findings), replicable (others can repeat the study), and defensible (provides results that are credible to others; this does not mean others have to agree with the results). For many the scientific model may seem too complex to follow, but it is often used in everyday life and should be evident in any research report, paper, or published manuscript. The corollaries of common sense and proper paper format with the scientific model are given below.

    Corollaries among the Scientific Model, Common Sense, and Paper Format

    Scientific Model        Common Sense            Paper Format
    Research Question       Why                     Intro
    Develop a theory        Your answer             Intro
    Identify variables      How                     Method
    Identify hypotheses     Expectations            Method
    Test the hypotheses     Collect/analyze data    Results
    Evaluate the results    What it means           Conclusion
    Critical review         What it doesn't mean    Conclusion

    Overview of first four elements of the Scientific Model

    The following discussion provides a very brief introduction to the first four elements of the scientific model. The elements that pertain to hypothesis testing, evaluating results, and critical review are the primary focus in Part III of the Handbook.

    1) Research Question

    The research question should be a clear statement about what you intend to investigate. It should be specified before research is conducted and openly stated in reporting the results. One conventional approach is to put the research question in writing in the introduction of a report starting with the phrase "The purpose of this study is . . . ." This approach forces the researcher to:

    a. identify the research objective (allows others to benchmark how well the study design answers the primary goal of the research)

    b. identify key abstract concepts involved in the research

    Abstract concepts: The starting point for measurement. Abstract concepts are best understood as general ideas in linguistic form that help us describe reality. They range from the simple (hot, long, heavy, fast) to the more difficult (responsive, effective, fair). Abstract concepts should be evident in the research question and/or purpose statement. An example of a research question is given below along with how it might be reflected in a purpose statement.

    Research Question: Is the quality of public sector and private sector employees different?

    Purpose statement: The purpose of this study is to determine if the quality of public and private sector employees is different.

    2) Develop Theory

    A theory is one or more propositions that suggest why an event occurs. It is our view or explanation for how the world works. These propositions provide a framework for further analysis and are developed as a non-normative explanation for "What is," not "What should be." A theory should have logical integrity and include assumptions that are based on paradigms. These paradigms are the larger frame of contemporary understanding shared by the profession and/or scientific community and are part of the core set of assumptions on which we may be basing our inquiry.


    3) Identify Variables

    Variables are measurable abstract concepts that help us describe relationships. This measuring of abstract concepts is referred to as operationalization. In the previous research question "Is the quality of public sector and private sector employees different?" the key abstract concepts are employee quality and employment sector. To measure "quality" we need to identify and develop a measurable representation of employee quality. Possible quality variables could be performance on a standardized intelligence test, attendance, performance evaluations, etc. The variable for employment sector seems to be fairly self-evident, but a good researcher must be very clear on how they define and measure the concepts of public and private sector employment.

    Variables represent empirical indicators of an abstract concept. However, we must always assume there will be incomplete congruence between our measure and the abstract concept. Put simply, our measurement has an error component. It is unlikely to measure all aspects of an abstract concept and can best be understood by the following:

    Abstract concept = indicator + error

    Because there is always error in our measurement, multiple measures/indicators of one abstract concept are considered better (more valid and reliable) than one. As shown below, one would expect that as more valid indicators of an abstract concept are used, the effect of the error term would decline:

    Abstract concept = indicator1 + indicator2 + indicator3 + error

    Levels of Data

    There are four levels of variables. These levels are listed below in order of their precision, from least to most precise. It is essential to be able to identify the levels of data used in a research design. They are directly associated with determining which statistical methods are most appropriate for testing research hypotheses.

    Nominal: classifies objects by type or characteristic (sex, race, models of vehicles, political jurisdictions)

    Properties:

    1. categories are mutually exclusive (an object or characteristic can only be contained in one category of a variable)
    2. no logical order

    Ordinal: classifies objects by type or kind but also has some logical order (military rank, letter grades)

    Properties:

    1. categories are mutually exclusive
    2. logical order exists
    3. scaled according to amount of a particular characteristic they possess

    Interval: classified by type and logical order, but also requires that differences between levels of a category are equal (temperature in degrees Celsius or Fahrenheit, calendar year)

    Properties:

    1. categories are mutually exclusive
    2. logical order exists
    3. scaled according to amount of a particular characteristic they possess
    4. differences between each level are equal
    5. no true zero starting point

    Ratio: same as interval but has a true zero starting point (income, education, exam score, distance in kilometers, age in years). Identical to an interval-level scale except that ratio-level data begin with the option of total absence of the characteristic. For most purposes, we assume interval/ratio are the same.

    The following table provides examples of variable types:

    Variable        Level
    Country         Nominal
    Letter Grade    Ordinal
    Age             Ratio
    Temperature     Interval
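
    Because the level of data drives the choice of statistic, the ordering of the four levels can be made explicit in code. The sketch below is illustrative only: the Level enum and the function central_tendency_measures are hypothetical names, and the mapping from level to measure reflects a common rule of thumb (mode for any level, median from ordinal up, mean from interval up) rather than anything stated in this section.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of data, ordered from least to most precise."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


def central_tendency_measures(level: Level) -> list[str]:
    """List the measures of central tendency usually considered meaningful."""
    measures = ["mode"]              # counting categories works at every level
    if level >= Level.ORDINAL:
        measures.append("median")    # requires a logical order
    if level >= Level.INTERVAL:
        measures.append("mean")      # requires equal differences between levels
    return measures


# The four example variables from the table above.
variables = {
    "Country": Level.NOMINAL,
    "Letter Grade": Level.ORDINAL,
    "Age": Level.RATIO,
    "Temperature": Level.INTERVAL,
}

for name, level in variables.items():
    allowed = ", ".join(central_tendency_measures(level))
    print(f"{name}: {level.name.lower()} -> {allowed}")
```

    Using an ordered enum keeps the "at least ordinal" and "at least interval" checks to a single comparison, which mirrors how each level adds a property to the one before it.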

    Reliability and Validity

    The accuracy of our measurements is affected by reliability and validity. Reliability is the extent to which the repeated use of a measure obtains the same values when no change has occurred (can be evaluated empirically). Validity is the extent to which the operationalized variable accurately represents the abstract concept it intends to measure (cannot be confirmed empirically; it will always be in question). A lack of reliability negatively impacts all studies and is very much a part of any methodology/operationalization of concepts. As an example, reliability can depend on who performs the measurement (e.g., subjective measures) and when, where, and how data are collected (from whom, written, verbal, time of day, season, current public events).

    There are several different conceptualizations of validity. Predictive validity refers to the ability of an indicator to correctly predict (or correlate with) an outcome (e.g., GRE and performance in graduate school). Content validity is the extent to which the indicator reflects the full domain of interest (e.g., past grades only reflect one aspect of student quality). Construct validity (correlational validity) is the degree to which one measure correlates with other measures of the same abstract concept (e.g., days late or absent from work may correlate with performance ratings). Face validity evaluates whether the indicator appears to measure the abstract concept (e.g., a person's religious preference is unlikely to be a valid indicator of employee quality).

    4) Identify measurable hypotheses

    A hypothesis is a formal statement that presents the expected relationship between an independent and dependent variable. A dependent variable is a variable that contains variations for which we seek an explanation. An independent variable is a variable that is thought to affect (cause) variations in the dependent variable. This causation is implied when we have statistically significant associations between an independent and dependent variable, but it can never be empirically proven: proof is always an exercise in rational inference.

    Association

    Statistical techniques are used to explore connections between independent and dependent variables. This connection between or among variables is often referred to as association. Association is also known as covariation and can be defined as measurable changes in one variable that occur concurrently with changes in another variable. A positive association is represented by change in the same direction (income rises with education level). Negative association is represented by concurrent change in opposite directions (hours spent exercising and % body fat). Spurious associations are associations between two variables that can be better explained by a third variable. As an example, if after taking cold medication for seven days the symptoms disappear, one might assume the medication cured the illness. Most of us, however, would probably agree that the change experienced in cold symptoms is probably better explained by the passage of time than by pharmacological effect (i.e., the cold would resolve itself in seven days regardless of whether the medication was taken or not).
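
    The direction of an association can be read from the sign of a correlation coefficient. The sketch below computes Pearson's r by hand for the two examples mentioned above. All data values are invented for illustration and the function name pearson_r is hypothetical; a coefficient of this kind describes association only and says nothing about spuriousness or causation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product moment correlation for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (numerator of r).
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Hypothetical values, made up for this example only.
education_years = [10, 12, 14, 16, 18]
income_thousands = [25, 32, 41, 48, 60]
exercise_hours = [0, 2, 4, 6, 8]
body_fat_pct = [30, 27, 25, 21, 18]

positive = pearson_r(education_years, income_thousands)
negative = pearson_r(exercise_hours, body_fat_pct)
print(f"education vs income: r = {positive:.2f} (positive association)")
print(f"exercise vs body fat: r = {negative:.2f} (negative association)")
```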

    Causation

    There is a difference between determining association and causation. Causation, often referred to as a relationship, cannot be proven with statistics. Statistical techniques provide evidence that a relationship exists through the use of significance testing and strength of association metrics. However, this evidence must be bolstered by an intellectual exercise that includes the theoretical basis of the research and logical assertion. The following presents the elements necessary for claiming causation:

    External and Internal Validity

    There are two types of study designs, experimental and quasi-experimental.


    Experimental: The experimental design uses a control group and applies treatment to a second group. It provides the strongest evidence of causation through extensive controls and random assignment to remove other differences between groups. Using the evaluation of a job training program as an example, one could carefully select and randomly assign two groups of unemployed welfare recipients. One group would be provided job training and the other would not. If the two groups are similar in all other relevant characteristics, you could assume any difference between the groups' employment one year later was caused by job training.
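
    Random assignment in the job training example can be sketched in a few lines. This is a minimal illustration, assuming a simple even split into treatment and control groups; the participant labels and the function name randomly_assign are hypothetical.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)       # a seed makes the assignment reproducible
    shuffled = list(participants)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical pool of 20 study participants.
participants = [f"participant_{i:02d}" for i in range(1, 21)]
treatment, control = randomly_assign(participants, seed=42)

print("job training:", len(treatment), "participants")
print("no training: ", len(control), "participants")
```

    Because chance alone decides who lands in each group, other characteristics should be balanced between groups on average, which is what allows a difference in outcomes to be attributed to the treatment.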

    Whenever you use an experimental design, both the internal and external validity can become very important factors.

    Internal validity: The extent to which accurate and unbiased associations between the IV and DV were obtained in the study group.

    External validity: The extent to which the association between the IV and DV is accurate and unbiased in populations outside the study group.

    Quasi-experimental: The quasi-experimental design does not have the controls employed in an experimental design (most social science research). Although internal validity is lower than can be obtained with an experimental design, external validity is generally better, and a well designed study should allow for the use of statistical controls to compensate for extraneous variables.

    Types of quasi-experimental design:

    1. Cross-sectional study: obtained at one point in time (most surveys)
    2. Case study: in-depth analysis of one entity, object, or event
    3. Panel study: (cohort study) repeated cross-sectional studies over time with the same participants
    4. Trend study: tracking indicator variables over a period of time (unemployment, crime, dropout rates)

    Reporting Results

    The following pages provide an outline for presenting the results of your research. Regardless of the specific format you or others use, the key points to consider in reporting the results of research are:

    1) Clearly state the research question up front

    2) Completely explain your assumptions and method of inquiry so that others may duplicate the study

    3) Objectively and accurately report the results of your analysis

    4) Present data in tables that are

    a) Accurate
    b) Complete
    c) Titled and documented so that they could stand on their own without a report

    5) Correctly reference sources of information and related research

    6) Openly discuss weaknesses/biases of your research

    7) Develop a defensible conclusion based on your analysis, not personal opinion

    Paper Format

    The following outline may be a useful guide in formatting your research report. It incorporates elements of the research design and steps for hypothesis testing (in italics). You may wish to refer back to this outline after you have developed an understanding of hypothesis testing (Part III).

    I. Introduction: A definition of the central research question (purpose of the paper), why it is of interest, and a review of the literature related to the subject and how it relates to your hypotheses.

    Elements: Purpose statement
              Theory
              Abstract concepts

    II. Method: Describe the source of the data, sample characteristics, statistical technique(s) applied, level of significance necessary to reject your null hypotheses, and how you operationalized abstract concepts.

    Elements: Independent variable(s) and level of measurement
              Dependent variable(s) and level of measurement
              Assumptions
                Random sampling
                Independent subgroups
                Population normally distributed
              Hypotheses
                Identify statistical technique(s)
                State null hypothesis or alternative hypothesis
              Rejection criteria
                Indicate alpha (amount of error you are willing to accept)
                Specify one or two-tailed tests

    III. Results: Describe the results of your data analysis and the implications for your hypotheses. It should include such elements as univariate, bivariate, and multivariate analyses; significance test statistics; the probability of error; and the related status of your hypothesis tests.

    Elements: Describe sample statistics
              Compute test statistics
              Decide results

    IV. Conclusion: Summarize and evaluate your results. Put in plain words what your research found concerning your central research question. Identify alternative variables, implications for further study, and at least one paragraph on the weaknesses of your research and findings.

    Elements: Interpretation (What do the results mean?)
              Weaknesses

    V. References: Only include literature you cite in your report.

    VI. Appendix: Additional tables and information not included in the body of the report.

    Table Format

    In the professional world, presentation is almost everything. As a result, you should develop the ability to create a one-page summary of your research that contains one or more tables representing your key findings. An example is given below:

    Survey of Travel Reimbursement Office Customers

    Customer Characteristics

                                          Sex                             Age
                        Total (1)   Female     Male    18-29    30-39    40-49    50-59      60+
    Sample Size ->           3686     1466     2081     1024     1197      769      493       98
    % of Total (1) ->        100%      41%      59%      29%      33%      22%      14%       3%
    Margin of Error ->       1.6%     2.6%     2.2%     3.1%     2.8%     3.5%     4.4%     9.9%

    Staff Professional?
      Yes                   89.4%    89.6%    89.5%    89.5%    89.4%    89.5%    89.6%    90.0%
      No                    10.6%    10.4%    10.5%    10.5%    10.6%    10.5%    10.4%    10.0%

    Treated Fairly?
      Yes                   83.1%    82.8%    83.8%    82.6%    83.0%    83.5%    84.3%    87.9%
      No                    16.9%    17.2%    16.2%    17.4%    17.0%    16.5%    15.7%    12.1%

    Served Quickly?
      Yes                   71.7%    69.3%    74.1%    73.3%    69.3%    72.1%    74.2%    78.0%
      No                    28.3%    30.7%    25.9%    26.7%    30.7%    27.9%    25.8%    22.0%

    (1) Total number of cases is based on responses to the question concerning being served quickly. Non-responses to survey items cause the sample sizes to vary.
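
    The handbook does not say here how the Margin of Error row was computed. One plausible reading, offered only as an assumption, is the conservative 95% margin of error for a proportion, which uses p = .5 and z = 1.96. The sketch below applies that formula to the sample sizes in the table; the function name max_margin_of_error is hypothetical.

```python
from math import sqrt

Z_95 = 1.96  # critical z value for a 95% confidence level (assumed)


def max_margin_of_error(n: int, z: float = Z_95) -> float:
    """Largest possible margin of error for a proportion, reached at p = 0.5."""
    return z * sqrt(0.5 * 0.5 / n)


# Sample sizes copied from the table above.
sample_sizes = {
    "Total": 3686, "Female": 1466, "Male": 2081,
    "18-29": 1024, "30-39": 1197, "40-49": 769, "50-59": 493, "60+": 98,
}

for group, n in sample_sizes.items():
    print(f"{group:>6}: n = {n:>4}, margin of error = {max_margin_of_error(n):.1%}")
```

    Under this assumption the formula reproduces most of the printed row (for example, about 1.6% for the total and 9.9% for the 60+ group); the Male column comes out slightly lower than the printed 2.2%, so the original may have used a different rounding or method.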

    Critical Review Checklist

    Use the following checklist when evaluating research prepared by others. Is there anything else you would want to add to the list?

    1. Research question(s) clearly identified
    2. Variables clearly identified
    3. Operationalization of variables is valid and reliable
    4. Hypotheses evident
    5. Statistical techniques identified and appropriate
    6. Significance level (alpha) stated a priori
    7. Assumptions for statistical tests are met
    8. Tables are clearly labeled and understandable
    9. Text accurately describes data from tables
    10. Statistical significance properly interpreted
    11. Conclusions fit analytical results
    12. Inclusion of all relevant variables
    13. Weaknesses addressed
    14. _________________________________________
    15. _________________________________________
    16. _________________________________________
    17. _________________________________________
    18. _________________________________________
    19. _________________________________________
    20. _________________________________________


    Measures of Central Tendency

    Mode: The most frequently occurring score. A distribution of scores can be unimodal (one score occurred most frequently), bimodal (two scores tied for most frequently occurring), or multimodal. In the table below the mode is 32. If there were also two scores with the value of 60, we would have a bimodal distribution (32 and 60).

    Median: The point on a rank-ordered list of scores below which 50% of the scores fall. It is especially useful as a measure of central tendency when there are very extreme scores in the distribution, such as would be the case if we had someone in the age distribution provided below who was 120. If the number of scores is odd, the median is the score located in the position represented by (n+1)/2. In the table below the median is located in the 4th position, (7+1)/2, and would be reported as a median of 42. If the number of scores is even, the median is the average of the two middle scores. As an example, if we dropped the last score (65) in the table below, the median would be represented by the average of the 3rd (6/2) and 4th scores, or 37, (32+42)/2. Always remember to order the scores from low to high before determining the median.

    Variable: Age (also known as X)

    24
    32       Mode
    32
    42       Median
    55
    60
    65

    n = 7    Number of scores (or cases)
    310      Sum of scores (Xi = each score)
    44.29    Mean

    Mean: The sum of the scores (the sum of all Xi) is divided by the number of scores (n) to compute an arithmetic average of the scores in the distribution. The mean is the most often used measure of central tendency. It has two properties: 1) the sum of the deviations of the individual scores (Xi) from the mean is zero, 2) the sum of squared deviations from the mean is smaller than what can be obtained from any other value created to represent the central tendency of the distribution. In the table above the mean age is 44.29 (310/7).
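    The three measures can be checked against the age table with a short Python sketch. It uses only the standard library; the variable names are illustrative and not part of the handbook.

```python
import statistics

# Ages from the table, already ordered from low to high.
ages = [24, 32, 32, 42, 55, 60, 65]

mode_age = statistics.mode(ages)      # most frequently occurring score
median_age = statistics.median(ages)  # 4th of 7 ordered scores
mean_age = statistics.mean(ages)      # 310 / 7

# Dropping the last score leaves an even number of scores, so the
# median becomes the average of the two middle scores (32 and 42).
median_without_65 = statistics.median(ages[:-1])

print(mode_age, median_age, round(mean_age, 2), median_without_65)
```

    Under these assumptions the sketch reproduces the mode of 32, the median of 42, the mean of 44.29, and the median of 37 for the six-score case.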

    Weighted Mean: When two or more means are combined to develop an aggregate mean, the influence of each mean must be weighted by the number of cases in its subgroup.


    Example

    Wrong Method:

    Correct Method:
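    The handbook's worked figures for this example did not survive in the transcript, so the following sketch uses hypothetical subgroup means and sizes (invented for illustration only) to contrast the two methods: averaging the means directly versus weighting each mean by its number of cases.

```python
def weighted_mean(means, sizes):
    """Combine subgroup means, weighting each by its number of cases."""
    total = sum(m * n for m, n in zip(means, sizes))
    return total / sum(sizes)

# Hypothetical subgroups (not taken from the handbook).
group_means = [80.0, 90.0]
group_sizes = [10, 30]

# Wrong method: a simple average of the means ignores subgroup size.
naive_mean = sum(group_means) / len(group_means)

# Correct method: weight each mean by its subgroup size.
combined_mean = weighted_mean(group_means, group_sizes)

print(naive_mean, combined_mean)
```

    With these made-up numbers the unweighted average is 85.0 while the weighted mean is 87.5, because the larger subgroup pulls the aggregate toward its own mean.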

    Measures of Variation

    Range: The difference between the highest and lowest score (high-low). It describes the span of scores but cannot be compared to distributions with a different number of observations. In the table below, the range is 41 (65-24).

    Variance: The average of the squared deviations between the individual scores and the mean. The larger the variance the more variability there is among the scores. When comparing two samples with the same unit of measurement (age), the variances are comparable even though the sample sizes may be different. Generally, however, smaller samples have greater variability among the scores than larger samples. The sample variance for the data in the table below is 251.57, which is the sum of squared deviations (1509.43) divided by n-1 (6). The formula is almost the same for estimating population variance. See formula in Appendix.

    Standard deviation: The square root of variance. It provides a representation of the variation among scores that is directly comparable to the raw scores. The sample standard deviation in the following table is 15.86 years.

    Variable: Age (mean = 44.29)

    Age (Xi)    Deviation from mean    Squared deviation
    24          -20.29                 411.68
    32          -12.29                 151.04
    32          -12.29                 151.04
    42           -2.29                   5.24
    55           10.71                 114.70
    60           15.71                 246.80
    65           20.71                 428.90

    n = 7       Sum of squared deviations = 1509.43

    251.57 sample variance

    15.86 sample standard deviation
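    The range, sample variance, and sample standard deviation reported for the age data can be reproduced with the sketch below. It assumes the sample formula divides the sum of squared deviations by n-1, which matches the reported 251.57; the variable names are illustrative.

```python
import statistics

ages = [24, 32, 32, 42, 55, 60, 65]

age_range = max(ages) - min(ages)           # high - low
mean_age = statistics.mean(ages)

# Sum of squared deviations of each score from the mean.
sum_squared_dev = sum((x - mean_age) ** 2 for x in ages)

# Sample variance divides by n - 1; the standard deviation is its square root.
sample_variance = sum_squared_dev / (len(ages) - 1)
sample_sd = sample_variance ** 0.5

print(age_range, round(sum_squared_dev, 2),
      round(sample_variance, 2), round(sample_sd, 2))
```

    The standard library functions statistics.variance and statistics.stdev use the same n-1 convention and return the same values.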

    Standardized Z-Score

    A standardized z-score represents both the relative position of an individual score in a distribution as compared to the mean and the variation of scores in the distribution. A negative z-score indicates the score is below the distribution mean. A positive z-score indicates the score is above the distribution mean. Z-scores will form a distribution identical in shape to the distribution of raw scores; the mean of z-scores will equal zero and the variance of a z-distribution will always be one, as will the standard deviation.

    To obtain a standardized score you must subtract the mean from the individual score and divide by the standard deviation. Standardized scores provide you with a score that is directly comparable within and between different groups of cases.

    Variable: Age

    Name     Age    Deviation from mean    Deviation / std. dev.    Z
    Doug     24     -20.29                 -20.29/15.86             -1.28
    Mary     32     -12.29                 -12.29/15.86             -0.77
    Jenny    32     -12.29                 -12.29/15.86             -0.77
    Frank    42      -2.29                  -2.29/15.86             -0.14
    John     55      10.71                  10.71/15.86              0.68
    Beth     60      15.71                  15.71/15.86              0.99
    Ed       65      20.71                  20.71/15.86              1.31

    As an example of how to interpret z-scores, Ed is 1.31 standard deviations above the mean age for those represented in the sample. Another simple example is exam scores from two history classes with the same content but different instructors and different test formats. To adequately compare student A's score from class A with student B's score from class B you need to adjust the scores by the variation (standard deviation) of scores in each class and the distance of each student's score from the average (mean) for the class.
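    As a minimal sketch, the z-scores in the table can be recomputed by subtracting the mean from each age and dividing by the sample standard deviation. The dictionary layout is an illustrative choice, not something the handbook prescribes.

```python
import statistics

ages = {"Doug": 24, "Mary": 32, "Jenny": 32, "Frank": 42,
        "John": 55, "Beth": 60, "Ed": 65}

values = list(ages.values())
mean_age = statistics.mean(values)
sd_age = statistics.stdev(values)   # sample standard deviation (n - 1)

# z = (score - mean) / standard deviation, rounded as in the table.
z_scores = {name: round((age - mean_age) / sd_age, 2)
            for name, age in ages.items()}

for name, z in z_scores.items():
    print(f"{name:6s} {z:6.2f}")
```

    Before rounding, the z-scores sum to zero, which is the first property of the mean noted earlier carried over to the standardized scale.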


    Proportions

    A proportion weights the frequency of occurrence against the total possible. It is often reported as a percentage.

    Example:

    Frequency: 154 out of 233 adults support a sales tax increase

    Proportion: occurrence/total or 154/233 = .66


    Percent: .66 * 100 = 66% of adults support sales tax increase

    Interpreting Contingency Tables

    (count)               Female    Male    Row Total
    Support tax           75        50      125
    Do not support tax    25        50      75
    Column Total          100       100     200

    Column %:
    Of those who are female, 75% (75/100) support the tax
    Of those who are male, 50% (50/100) support the tax

    Row %:
    Of those supporting the tax, 60% (75/125) are female
    Of those not supporting the tax, 67% (50/75) are male

    Row Total %:
    Overall 62.5% (125/200) support the tax
    Overall 37.5% (75/200) do not support the tax

    Column Total %:
    Overall 50% (100/200) are female and 50% (100/200) are male
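    The column, row, and total percentages above can be derived from the cell counts alone. The following sketch shows one way to do so with a nested dictionary; the data structure and names are assumptions made for illustration.

```python
# Cell counts from the contingency table: counts[row][column].
counts = {
    "Support tax":        {"Female": 75, "Male": 50},
    "Do not support tax": {"Female": 25, "Male": 50},
}

row_totals = {row: sum(cells.values()) for row, cells in counts.items()}
col_totals = {col: sum(counts[row][col] for row in counts)
              for col in ("Female", "Male")}
grand_total = sum(row_totals.values())

# Column %: each cell as a share of its column total.
column_pct = {row: {col: 100 * n / col_totals[col] for col, n in cells.items()}
              for row, cells in counts.items()}

# Row %: each cell as a share of its row total.
row_pct = {row: {col: 100 * n / row_totals[row] for col, n in cells.items()}
           for row, cells in counts.items()}

# Row total %: each row total as a share of all cases.
row_total_pct = {row: 100 * total / grand_total
                 for row, total in row_totals.items()}

print(column_pct["Support tax"]["Female"], row_pct["Support tax"]["Female"],
      row_total_pct["Support tax"])
```

    Column percentages answer "of the females, how many support the tax?", while row percentages answer "of the supporters, how many are female?"; the two use the same cell count with different denominators.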

    Part II Describing Data

    Central Tendency

    Variation

    Proportions

    Part II presents techniques used to describe samples. For interval level data, measures of central tendency and variation are common descriptive statistics. Measures of central tendency describe a series of data with a single attribute. Measures of variation describe how widely the data elements vary. Standardized scores combine both central tendency and variation into a single descriptor that is comparable across different samples with the same or different units of measurement. For nominal/ordinal data, proportions are a common method used to describe frequencies as they compare to a total.

    Hypothesis Testing Basics

    The Normal Distribution

    Although there are numerous sampling distributions used in hypothesis testing, the normal distribution is the most common example of how data would appear if we created a frequency histogram where the x axis represents the values of scores in a distribution and the y axis represents the frequency of scores for each value. Most scores will be similar and therefore will group near the center of the distribution. Some scores will have unusual values and will be located far from the center or apex of the distribution. These unusual scores are represented below as the shaded areas of the distribution. In hypothesis testing, we must decide whether the unusual values are simply different because of random sampling error or they are in the extreme tails of the distribution because they are truly different from others. Sampling distributions have been developed that tell us exactly what the probability of this sampling error is in a random sample obtained from a population that is normally distributed.

    Properties of a normal distribution

    Forms a symmetric bell-shaped curve
    50% of the scores lie above and 50% below the midpoint of the distribution
    Curve is asymptotic to the x axis
    Mean, median, and mode are located at the midpoint of the x axis

    Using theoretical sampling probability distributions

    Sampling distributions allow us to approximate the probability that a particular value would occur by chance alone. If you collected means from an infinite number of repeated random samples of the same sample size from the same population you would find that most means will be very similar in value; in other words, they will group around the true population mean. Most means will collect about a central value or midpoint of a sampling distribution. The frequency of means will decrease as one travels away from the center of a normal sampling distribution. In a normal probability distribution, about 95% of the means resulting from an infinite number of repeated random samples will fall within 1.96 standard errors above or below the midpoint of the distribution, which represents the true population mean, and only 5% will fall beyond (2.5% in each tail of the distribution).

    The following are commonly used points on a distribution for deciding statistical significance:

    90% of scores +/- 1.65 standard errors

    95% of scores +/- 1.96 standard errors

    99% of scores +/- 2.58 standard errors

    Standard error: Mathematical adjustment to the standard deviation to account for the effect sample size has on the underlying probability distribution. It represents the standard deviation of the sampling distribution.
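    The cut-off points listed above can be recovered from the standard normal distribution. The sketch below uses NormalDist from Python's standard library (available in Python 3.8 and later); the helper name two_tailed_critical_z is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def two_tailed_critical_z(confidence):
    """Return z such that `confidence` of the area lies within +/- z."""
    alpha = 1 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return standard_normal.inv_cdf(1 - alpha / 2)

z90 = two_tailed_critical_z(0.90)
z95 = two_tailed_critical_z(0.95)
z99 = two_tailed_critical_z(0.99)

print(round(z90, 3), round(z95, 3), round(z99, 3))
```

    The 90% point computes to roughly 1.645, which tables commonly show as 1.65; the 95% and 99% points round to 1.96 and 2.58.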

    Alpha and the role of the distribution tails

    The percentage of scores beyond a particular point along the x axis of a sampling distribution represents the percent of the time during an infinite number of repeated samples one would expect to have a score at or beyond that value on the x axis. This value on the x axis is known as the critical value when used in hypothesis testing. The midpoint represents the actual population value. Most scores will fall near the actual population value but will exhibit some variation due to sampling error. If a score from a random sample falls 1.96 standard errors or farther above or below the mean of the sampling distribution, we know from the probability distribution that there is only a 5% chance of randomly selecting a set of scores that would produce a sample mean that far from the true population mean. When conducting significance testing, if we have a test statistic that is 1.96 standard errors above or below the mean of the sampling distribution, we assume we have a statistically significant difference between our sample mean and the expected mean for the population. Since we know a value that far from the population mean will only occur randomly 5% of the time, we assume the difference is the result of a true difference between the sample and the population mean, and is not the result of random sampling error. The 5% is also known as alpha and is the probability of being wrong when we conclude statistical significance.

    1-tailed vs. 2-tailed statistical tests

    A 2-tailed test is used when you cannot determine a priori whether a difference between population parameters will be positive or negative. A 1-tailed test is used when you can reasonably expect a difference will be positive or negative. If you retain the same critical value for a 1-tailed test that would be used if a 2-tailed test were employed, the alpha is halved (i.e., .05 alpha would become .025 alpha).

    Hypothesis Testing

    The chain of reasoning and systematic steps used in hypothesis testing that are outlined in this section are the backbone of every statistical test regardless of whether one writes out each step in a classroom setting or uses statistical software to conduct statistical tests on variables stored in a database.


    Chain of reasoning for inferential statistics

    1. Sample(s) must be randomly selected
    2. Sample estimate is compared to the underlying sampling distribution for the same sample size
    3. Determine the probability that a sample estimate reflects the population parameter

    The four possible outcomes in hypothesis testing

                            Actual Population Comparison

    DECISION                Null Hyp. True              Null Hyp. False
                            (there is no difference)    (there is a difference)

    Rejected Null Hyp.      Type I error (alpha)        Correct Decision

    Did not Reject Null     Correct Decision            Type II Error

    (Alpha = probability of making a Type I error)

    Regardless of whether statistical tests are conducted by hand or through statistical software, there is an implicit understanding that systematic steps are being followed to determine statistical significance. These general steps are described on the following page and include 1) assumptions, 2) stated hypothesis, 3) rejection criteria, 4) computation of statistics, and 5) decision regarding the null hypothesis. The underlying logic is based on rejecting a statement of no difference or no association, called the null hypothesis. The null hypothesis is only rejected when we have evidence beyond a reasonable doubt that a true difference or association exists in the population(s) from which we drew our random sample(s).

    Reasonable doubt is based on probability sampling distributions and can vary at the researcher's discretion. Alpha .05 is a common benchmark for reasonable doubt. At alpha .05 we know from the sampling distribution that a test statistic will only occur by random chance five times out of 100 (5% probability). Since a test statistic that results in an alpha of .05 could only occur by random chance 5% of the time, we assume that the test statistic resulted because there are true differences between the population parameters, not because we drew an extremely biased random sample.

    When learning statistics we generally conduct statistical tests by hand. In these situations, we establish before the test is conducted what test statistic is needed (called the critical value) to claim statistical significance. So, if we know for a given sampling distribution that a test statistic of plus or minus 1.96 would only occur 5% of the time randomly, any test statistic that is 1.96 or greater in absolute value would be statistically significant. In an analysis where a test statistic was exactly 1.96, you would have a 5% chance of being wrong if you claimed statistical significance. If the test statistic was 3.00, statistical significance could also be claimed but the probability of being wrong would be much less (about .003 if using a 2-tailed test on a normal sampling distribution, or roughly three-tenths of one percent; 0.3%). Both .05 and .003 are known as alpha; the probability of a Type I error.
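    The probabilities attached to test statistics of 1.96 and 3.00 can be checked with a short sketch. It assumes a normal (z) sampling distribution; a t distribution with few degrees of freedom would give somewhat larger values. The function name two_tailed_p is hypothetical.

```python
from statistics import NormalDist

def two_tailed_p(test_statistic):
    """Area in both tails beyond +/- the test statistic (standard normal)."""
    upper_tail = 1 - NormalDist().cdf(abs(test_statistic))
    return 2 * upper_tail

p_at_196 = two_tailed_p(1.96)
p_at_300 = two_tailed_p(3.00)

print(round(p_at_196, 3), round(p_at_300, 4))
```

    Under this assumption a statistic of 1.96 corresponds to a two-tailed probability of about 0.05, and a statistic of 3.00 to about 0.0027, a fraction of one percent.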

    When conducting statistical tests with computer software, the exact probability of a Type I error is calculated. It is presented in several formats but is most commonly reported as "p


    For nominal/ordinal data use

    Difference of proportions, chi square and related measures of association

    State the Hypothesis

    Null Hypothesis (Ho): There is no difference between ___ and ___.

    Alternative Hypothesis (Ha): There is a difference between __ and __.

    Note: The alternative hypothesis will indicate whether a 1-tailed or a 2-tailed test is utilized to reject the null hypothesis.

    Ha for 1-tailed test: The __ of __ is greater (or less) than the __ of __.

    Set the Rejection Criteria

    This determines how different the parameters and/or statistics must be before the null hypothesis can be rejected. This "region of rejection" is based on alpha (α) -- the error associated with the confidence level. The point of rejection is known as the critical value.

    Compute the Test Statistic

    The collected data are converted into standardized scores for comparison with the critical value.

    Decide Results of Null Hypothesis

    If the test statistic equals or exceeds the region of rejection bracketed by the critical value(s), the null hypothesis is rejected. In other words, the chance that the difference exhibited between the sample statistics is due to sampling error is remote; there is an actual difference in the population.

    Comparing Means

    Interval Estimation for One Mean (Margin of Error)
    Comparing a Population Mean to a Sample Mean (T-test)
    Comparing Two Independent Sample Means-Equal Variance (T-test)
    Computing F-ratio
    Comparing Two Independent Samples Without Equal Variance (T-test)
    Comparing Multiple Means (ANOVA)

    Interval Estimation For Means (Margin of Error)

    Interval estimation involves using sample data to determine a range (interval) that, at an established level of confidence, will contain the mean of the population.

    Steps

    1. Determine confidence level (df=n-1; alpha .05, 2-tailed)
    2. Use either z distribution (if n>120) or t distribution (for all sizes of n).
    3. Use the appropriate table to find the critical value for a 2-tailed test
    4. Multiple hypotheses can be compared with the estimated interval for the population to determine their significance. In other words, differing values of population means can be compared with the interval estimation to determine if the hypothesized population means fall within the region of rejection.

    Estimation Formula

    where

    = sample mean

    CV = critical value (consult distribution table for df=n-1 and chosen alpha, commonly .05)

    (when using the t distribution)

    (when using the z distribution; assumes large sample size)

    Example: Interval Estimation for Means

    Problem: A random sample of 30 incoming college freshmen revealed the following statistics: mean age 19.5 years; standard deviation 1.2. Based on a 5% chance of error, estimate the range of possible mean ages for all incoming college freshmen.

    Estimation

    Critical value (CV)

    df = n-1 or 29

    Consult t-distribution for alpha .05, 2-tailed

    CV = 2.045

    Standard error

    Estimate

    or

    Interpretation

    We are 95% confident that the actual mean age of all incoming freshmen will be somewhere between 19 years (the lower limit) and 20 years (the upper limit) of age.
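    The formula images for this example are missing from the transcript, so the sketch below is a reconstruction under stated assumptions: the interval is taken to be the sample mean plus or minus the critical value times the standard error, and the standard error for the t distribution is assumed to be s/sqrt(n-1), the form that reproduces the test statistic of 8.065 reported in the population-mean example later in this part. Setting use_n_minus_1 to False gives the more common s/sqrt(n). The critical value 2.045 is taken from the text rather than computed.

```python
import math

def mean_interval(sample_mean, sd, n, critical_value, use_n_minus_1=True):
    """Return (lower, upper) limits for the population mean.

    Assumption: standard error = sd / sqrt(n - 1) by default, or
    sd / sqrt(n) when use_n_minus_1 is False.
    """
    denominator = n - 1 if use_n_minus_1 else n
    standard_error = sd / math.sqrt(denominator)
    margin = critical_value * standard_error
    return sample_mean - margin, sample_mean + margin

# Values from the problem: n = 30, mean 19.5, sd 1.2, CV = 2.045 (df = 29).
lower, upper = mean_interval(19.5, 1.2, 30, 2.045)

print(round(lower, 2), round(upper, 2))
```

    Either standard error convention gives limits that round to the 19 and 20 years stated in the interpretation.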

    Comparing a Population Mean to a Sample Mean (T-test)

    Assumptions

    Interval/ratio level data

    Random sampling

    Normal distribution in population

    State the Hypothesis

    Null Hypothesis (Ho): There is no difference between the mean of the population and the mean of the sample.

    Alternative Hypothesis (Ha): There is a difference between the mean of the population and the mean of the sample.

    Ha for 1-tailed test: The mean of the sample is greater (or less) than the mean of the population.

    Set the Rejection Criteria

    Determine the degrees of freedom = n-1

    Determine level of confidence -- alpha (1 or 2-tailed)

    Use the t distribution table to determine the critical value

    Compute the Test Statistic

    Standard error of the sample mean

    Test statistic

    Decide Results of Null Hypothesis

    There is/is not a significant difference between the mean of one population and the mean of another population from which the sample was obtained.

    Example: Population Mean to a Sample Mean

    Problem: Compare the mean age of incoming students to the known mean age for all previous incoming students. A random sample of 30 incoming college freshmen revealed the following statistics: mean age 19.5 years, standard deviation 1 year. The college database shows the mean age for previous incoming students was 18.

    State the Hypothesis

    Ho: There is no significant difference between the mean age of past college students and the mean age of current incoming college students.

    Ha: There is a significant difference between the mean age of past college students and the mean age of current incoming college students.

    Set the Rejection Criteria

    Significance level .05 alpha, 2-tailed test

    Degrees of Freedom = n-1 or 29

    Critical value from t-distribution = 2.045

    Compute the Test Statistic

    Standard error of the sample mean

    Test statistic

    Decide Results of Null Hypothesis

    Given that the test statistic (8.065) exceeds the critical value (2.045), the null hypothesis is rejected in favor of the alternative. There is a statistically significant difference between the mean age of the current class of incoming students and the mean age of freshman students from past years. In other words, this year's freshman class is on average older than freshmen from prior years.

    If the results had not been significant, the null hypothesis would not have been rejected. This would be interpreted as the following: There is insufficient evidence to conclude there is a statistically significant difference in the ages of current and past freshman students.
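    The computation behind this example can be sketched as follows. The formula images are missing from the transcript, so the standard error form s/sqrt(n-1) is an assumption, chosen because it reproduces the reported test statistic of 8.065 once the standard error is rounded to three decimals. The critical value 2.045 is taken from the text.

```python
import math

def one_sample_t(sample_mean, population_mean, sd, n):
    """Return (t, standard_error), assuming standard error = sd / sqrt(n - 1)."""
    standard_error = sd / math.sqrt(n - 1)
    t = (sample_mean - population_mean) / standard_error
    return t, standard_error

t_stat, se = one_sample_t(sample_mean=19.5, population_mean=18, sd=1, n=30)

# The handbook's figure appears to use the standard error rounded to 0.186.
t_from_rounded_se = (19.5 - 18) / round(se, 3)

critical_value = 2.045  # t table, df = 29, alpha .05, 2-tailed
reject_null = abs(t_stat) >= critical_value

print(round(se, 3), round(t_stat, 3), round(t_from_rounded_se, 3), reject_null)
```

    Without intermediate rounding the statistic is about 8.078; either value far exceeds the critical value, so the decision to reject the null hypothesis is unaffected.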

    Comparing Two Independent Sample Means

    (T-test) With Homogeneity of Variance

    State the Hypothesis

    Null Hypothesis (Ho): There is no difference between the mean of one population and the mean of another population.

    Alternative Hypothesis (Ha): There is a difference between the mean of one population and the mean of another population.

    Ha for 1-tailed test: The mean of one population is greater (or less) than the mean of the other population.

    Set the Rejection Criteria

    Determine the "degrees of freedom" = (n1+n2)-2

    Determine level of confidence -- alpha (1 or 2-tailed test)

    Use the t-distribution table to determine the critical value

    Compute the Test Statistic

    Standard error of the difference between two sample means

    Test statistic

    Decide Results of Null Hypothesis

    There is/is not a significant difference between the mean of one population and the mean of another population from which the two samples were obtained.

    Example: Two Independent Sample Means

    Problem: You have obtained the number of years of education from one random sample of 38 police officers from City A and the number of years of education from a second random sample of 30 police officers from City B. The average years of education for the sample from City A is 15 years with a standard deviation of 2 years. The average years of education for the sample from City B is 14 years with a standard deviation of 2.5 years. Is there a statistically significant difference between the education levels of police officers in City A and City B?

    Assumptions

    Random sampling

    Independent samples

    Interval/ratio level data

    Organize data

    City A (labeled sample 1)

    City B (labeled sample 2)

    State Hypotheses

    Ho: There is no statistically significant difference between the mean education level of police officers working in City A and the mean education level of police officers working in City B.

    For a 2-tailed hypothesis test

    Ha: There is a statistically significant difference between the mean education level of police officers working in City A and the mean education level of police officers working in City B.

    For a 1-tailed hypothesis test

    Ha: The mean education level of police officers working in City A is significantly greater than the mean education level of police officers working in City B.

    Set the Rejection Criteria

    Degrees of freedom = 38+30-2 = 66

    If using 2-tailed test

    Alpha .05, tcv = 2.000

    If using 1-tailed test

    Alpha .05, tcv = 1.671

    Compute Test Statistic

    Standard error

    Test Statistic

    Decide Results

    If using 2-tailed test

    There is no statistically significant difference between the mean years of education for police officers in City A and mean years of education for police officers in City B. (note: the test statistic 1.835 does not meet or exceed the critical value of 2.000 for a 2-tailed test.)

    If using 1-tailed test

    Police officers in City A have significantly more years of education than police officers in City B. (note: the test statistic 1.835 does exceed the critical value of 1.671 for a 1-tailed test.)
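    A minimal sketch of this example follows. Because the formula images are missing, the pooled standard error shown here is an assumption: it weights each sample variance by n-1, divides by n1+n2-2, and multiplies by (1/n1 + 1/n2). That form reproduces the reported 1.835 when the standard error is rounded to three decimals. Critical values are taken from the text.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, standard_error, df) for two independent samples,
    assuming equal population variances (pooled estimate)."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, standard_error, df

# City A is sample 1, City B is sample 2.
t_stat, se, df = pooled_t(15, 2, 38, 14, 2.5, 30)
t_from_rounded_se = (15 - 14) / round(se, 3)

significant_2_tailed = abs(t_stat) >= 2.000  # tcv, alpha .05, 2-tailed
significant_1_tailed = t_stat >= 1.671       # tcv, alpha .05, 1-tailed

print(df, round(se, 3), round(t_from_rounded_se, 3),
      significant_2_tailed, significant_1_tailed)
```

    The same statistic is not significant under the 2-tailed criterion but is significant under the 1-tailed criterion, which is why the direction of the alternative hypothesis has to be fixed before the test is run.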

    Computing F-ratio

    The F-ratio is used to determine whether the variances in two independent samples are equal. If the F-ratio is not statistically significant, you may assume there is homogeneity of variance and employ the standard t-test for the difference of means. If the F-ratio is statistically significant, use an alternative t-test computation such as the Cochran and Cox method.

    Set the Rejection Criteria

    Determine the "degrees of freedom" for each sample

    df = n1 - 1 (numerator = n for sample with larger variance)

    df = n2 - 1 (denominator = n for sample with smaller variance)

    Determine the level of confidence -- alpha

    Compute the test statistic

    where

    = largest variance

    = smallest variance

    Compare the test statistic with the F critical value (Fcv) listed in the F distribution. If the F-ratio equals or exceeds the critical value, the null hypothesis (Ho) (there is no difference between the sample variances) is rejected. If there is a difference in the sample variances, the comparison of two independent means should involve the use of the Cochran and Cox method.

    Example: F-Ratio

    Sample A: variance = 20, n = 10

    Sample B: variance = 30, n = 30

    Set Rejection Criteria

    df for numerator (Sample B) = 29

    df for denominator (Sample A) = 9

    Consult F-Distribution table for df = (29,9), alpha.05

    Fcv= 2.70

    Compute the Test Statistic

    Compare

    The test statistic (1.50) did not meet or exceed the critical value (2.70). Therefore, there is no statistically significant difference between the variance exhibited in Sample A and the variance exhibited in Sample B. Assume homogeneity of variance for tests of the difference between sample means.
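    The F-ratio computation can be sketched as below, reading the two sample values of 20 and 30 as variances. The larger variance goes in the numerator, as the procedure describes, and each degrees-of-freedom value is n-1 for the corresponding sample. The critical value 2.70 is taken from the text, and the function name f_ratio is hypothetical.

```python
def f_ratio(variance_a, n_a, variance_b, n_b):
    """Return (F, df_numerator, df_denominator) with the larger
    variance in the numerator."""
    if variance_a >= variance_b:
        return variance_a / variance_b, n_a - 1, n_b - 1
    return variance_b / variance_a, n_b - 1, n_a - 1

# Sample A: variance 20, n = 10. Sample B: variance 30, n = 30.
f_stat, df_num, df_den = f_ratio(20, 10, 30, 30)

f_critical = 2.70  # value reported in the example
variances_differ = f_stat >= f_critical

print(f_stat, df_num, df_den, variances_differ)
```

    Since 1.50 falls short of the critical value, homogeneity of variance is assumed and the pooled t-test applies.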

    Two Independent Sample Means (Cochran and Cox)

    (T-test) Without Homogeneity of Variance

    State the Hypothesis

    Null Hypothesis (Ho): There is no difference between the mean of one population and the mean of another population.

    Alternative Hypothesis (Ha): There is a difference between the mean of one population and the mean of another population.

    Ha for 1-tailed test: The mean of one population is greater (or less) than the mean of the other population.

    Set the Rejection Criteria

    Determine the "degrees of freedom"

    Determine level of confidence -- alpha (1 or 2-tailed)

    Use the t distribution to determine the critical value

    Compute the Test Statistic

    Standard error of the difference between two sample means

    Test statistic

    Decide Results of Null Hypothesis

    There is/is not a significant difference between the mean of one population and the mean of another population from which the two samples were obtained.

    Example: Two Independent Sample Means (No Variance Homogeneity)

    The following uses the same means and sample sizes from the example of two independent means with homogeneity of variance but the standard deviations (and hence variance) are significantly different (hint: verify the F-ratio is 4 with df=29,37, since the sample with the larger variance, City B, supplies the numerator).

    Problem: You have obtained the number of years of education from one random sample of 38 police officers from City A and the number of years of education from a second random sample of 30 police officers from City B. The average years of education for the sample from City A is 15 years with a standard deviation of 1 year. The average years of education for the sample from City B is 14 years with a standard deviation of 2 years. Is there a statistically significant difference between the education levels of police officers in City A and City B?

    Assumptions

    Random sampling

    Independent samples

    Interval/ratio level data

    Organize data

    City A (labeled sample 1)

    City B (labeled sample 2)

    State Hypotheses

    Ho: There is no statistically significant difference between the mean education level of police officers working in City A and the mean education level of police officers working in City B.

    For a 2-tailed hypothesis test

    Ha: There is a statistically significant difference between the mean education level of police officers working in City A and the mean education level of police officers working in City B.

    For a 1-tailed hypothesis test

    Ha: The mean education level of police officers working in City A is significantly greater than the mean education level of police officers working in City B.

    Set the Rejection Criteria

    Degrees of freedom

    If using 2-tailed test

    Alpha .05, tcv = 2.021

    If using 1-tailed test

    Alpha .05, tcv = 1.684

    Compute Test Statistic

    Standard error

    $S_{\bar{x}_1-\bar{x}_2} = \sqrt{\frac{1^2}{38-1} + \frac{2^2}{30-1}} = \sqrt{0.027 + 0.138} = 0.406$

    Test Statistic

    $t = \frac{15 - 14}{0.406} = 2.463$

    Decide Results

    If using 2-tailed test

    There is a statistically significant difference between the mean years of education for police officers in City A and mean years of education for police officers in City B. (Note: the test statistic 2.463 exceeds the critical value of 2.021 for a 2-tailed test and 1.684 for a 1-tailed test.)

    If using 1-tailed test

    Same results
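
    The arithmetic in this example can be checked with a short script. This is a minimal sketch, assuming the standard error uses n - 1 in each denominator (which reproduces the reported statistic); the function name cochran_cox_t is hypothetical and not part of the handbook.

```python
import math

def cochran_cox_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (standard_error, t) for two independent sample means
    without assuming equal variances (n - 1 in each denominator)."""
    se = math.sqrt(sd1 ** 2 / (n1 - 1) + sd2 ** 2 / (n2 - 1))
    t = (mean1 - mean2) / se
    return se, t

# City A: mean 15, sd 1, n 38; City B: mean 14, sd 2, n 30
se, t = cochran_cox_t(15, 1, 38, 14, 2, 30)

# Critical values quoted in the example (alpha .05)
t_cv_two_tailed = 2.021
t_cv_one_tailed = 1.684

print(f"standard error = {se:.3f}, t = {t:.3f}")
print("reject Ho (2-tailed):", abs(t) >= t_cv_two_tailed)
print("reject Ho (1-tailed):", t >= t_cv_one_tailed)
```

    Without intermediate rounding the statistic is about 2.462; the 2.463 in the example follows from rounding the standard error to 0.406 before dividing.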

    One-Way Analysis of Variance (ANOVA)

    Used to evaluate multiple means of one independent variable to avoid conducting multiple t-tests

    Assumptions

    Random sampling

    Interval/ratio level data

    Independent samples

    State the Hypothesis

    Null Hypothesis (Ho): There is no difference between the means of __________________.

    Alternative Hypothesis (Ha): There is a difference between the means of __________________.

    Set the Rejection Criteria

    Determine the degrees of freedom for the F Distribution

    df for numerator = K - 1 where K = number of groups/means tested

    df for denominator = N - K where N = total number of scores

    Determine the level of confidence -- alpha

    Establish the F critical value (Fcv) from the F Distribution table

    Compute the Test Statistic

    $F = \frac{SS_{between}/(K-1)}{SS_{within}/(N-K)}$

    $\bar{x}_k$ = each group mean

    $\bar{x}_G$ = grand mean for all the groups = (sum of all scores)/N

    $n_k$ = number in each group

    Note: F = Mean Squares between groups / Mean Squares within groups

    Sum of Squares between groups = $\sum_k n_k(\bar{x}_k - \bar{x}_G)^2$

    Sum of Squares within groups = $\sum_k \sum_i (x_{ik} - \bar{x}_k)^2$

    Decide Results of Null Hypothesis

    Compare the F statistic to the F critical value. If the F statistic (ratio) equals or exceeds the Fcv, the null hypothesis is rejected. This suggests that the population means of the groups sampled are not equal -- there is a difference between the group means.

    Example: ANOVA

    Problem: You obtained the number of years of education from one random sample of 38 police officers from City A, the number of years of education from a second random sample of 30 police officers from City B, and the number of years of education from a third random sample of 45 police officers from City C. The average years of education for the sample from City A is 15 years with a standard deviation of 2 years. The average years of education for the sample from City B is 14 years with a standard deviation of 2.5 years. The average years of education for the sample from City C is 16 years with a standard deviation of 1.2 years. Is there a statistically significant difference between the education levels of police officers in City A, City B, and City C?

                              City A    City B    City C
    Mean (years)              15        14        16
    Standard Deviation        2         2.5       1.2
    S2 (Variance)             4         6.25      1.44
    N (number of cases)       38        30        45
    Sum of Squares            152       187.5     64.8
    Sum of Scores (mean*n)    570       420       720

    State the Hypothesis

    Ho: There is no statistically significant difference among the three cities in the mean years of education for police officers.

    Ha: There is a statistically significant difference among the three cities in the mean years of education for police officers.

    Set the Rejection Criteria

    Numerator Degrees of Freedom

    df=k-1 where k=3 (number of independent samples)

    df=2

    Denominator Degrees of Freedom

    df=n-k where n=113 (sum of n for all independent samples)

    df=110

    Establish Critical Value

    At alpha .05, df=(2,110)

    Consult f-distribution, Fcv = 3.08

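    Compute the Test Statistic

    Estimate Grand Mean

    $\bar{x}_G = \frac{570 + 420 + 720}{113} = 15.133$

    Estimate F Statistic

    Sum of Squares between groups = $38(15-15.133)^2 + 30(14-15.133)^2 + 45(16-15.133)^2 \approx 73.0$

    Sum of Squares within groups = $152 + 187.5 + 64.8 = 404.3$

    $F = \frac{73.0/2}{404.3/110} = \frac{36.5}{3.6755} = 9.931$

    Decide Results of Null Hypothesis

    Since the F-statistic (9.931) exceeds the F critical value (3.08), we reject the null hypothesis and conclude there is a statistically significant difference among the three cities in the mean years of education for police officers.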
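
    The F statistic in this example can be reproduced from the summary table alone. The sketch below assumes, as the table's "Sum of Squares" row suggests, that each group's within-group sum of squares equals n times its variance; the function name anova_from_summary is hypothetical.

```python
def anova_from_summary(means, sds, ns):
    """One-way ANOVA F ratio computed from group means, SDs and sizes."""
    k = len(means)
    n_total = sum(ns)
    # Grand mean = (sum of all scores) / N, with sum of scores = mean * n
    grand_mean = sum(m * n for m, n in zip(means, ns)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for m, n in zip(means, ns))
    # Assumption: within-group sum of squares per group = n * variance
    ss_within = sum(n * sd ** 2 for sd, n in zip(sds, ns))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return grand_mean, ss_between, ss_within, ms_between / ms_within

# City A, City B, City C
grand_mean, ss_between, ss_within, f_stat = anova_from_summary(
    means=[15, 14, 16], sds=[2, 2.5, 1.2], ns=[38, 30, 45]
)

f_cv = 3.08  # critical value quoted in the example for df=(2,110)
print(f"grand mean = {grand_mean:.3f}")
print(f"SS between = {ss_between:.3f}, SS within = {ss_within:.1f}")
print(f"F = {f_stat:.3f}, reject Ho: {f_stat >= f_cv}")
```

    Small differences in the third decimal of F relative to the example come from intermediate rounding.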

    Proportions

    Interval Estimation for Proportions (Margin of Error)
    Interval Estimation for Two Proportions (Margin of Error)
    Comparing a Population to Sample (Z-test)
    Comparing Two Independent Samples (Z-test)
    Estimating Appropriate Sample Size

    Interval Estimation for Proportions (Margin of Error)

    Interval estimation (margin of error) involves using sample data to determine a range (interval) that, at an established level of confidence, will contain the population proportion.

    Steps

    Determine the confidence level (alpha is generally .05)

    Use the z-distribution table to find the critical value for a 2-tailed test given the selected confidence level (alpha)

    Estimate the sampling error

    $S_p = \sqrt{\frac{pq}{n}}$

    where

    p = sample proportion

    q = 1 - p

    n = sample size

    Estimate the confidence interval

    CI = p ± (CV)(Sp)

    CV = critical value

    Interpret

    Based on alpha .05, you are 95% confident that the proportion in the population from which the sample was obtained is between __ and __.

    Note: Given the sample data and level of error, the confidence interval provides an estimated range of proportions that is most likely to contain the population proportion. The term "most likely" is measured by alpha (i.e., in most cases there is a 5% chance -- alpha .05 -- that the confidence interval does not contain the true population proportion).

    Example: Interval Estimation for Proportions

    Problem: A random sample of 500 employed adults found that 23% had traveled to a foreign country. Based on these data, what is your estimate for the entire employed adult population?

    n = 500, p = .23, q = .77

    Use alpha .05 (i.e., the critical value is 1.96)

    Estimate Sampling Error

    $S_p = \sqrt{\frac{(.23)(.77)}{500}} = .0188$

    Compute Interval

    CI = .23 ± (1.96)(.0188) = .23 ± .037, or .193 to .267

    Interpret

    You are 95% confident that the actual proportion of all employed adults who have traveled to a foreign country is between 19.3% and 26.7%.
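
    The interval above can be verified with a few lines of code. This is a minimal sketch; the function name proportion_interval is hypothetical, and the default critical value of 1.96 corresponds to alpha .05.

```python
import math

def proportion_interval(p, n, cv=1.96):
    """Confidence interval for a proportion: p +/- cv * sqrt(p*q/n)."""
    q = 1 - p
    sampling_error = math.sqrt(p * q / n)
    margin = cv * sampling_error
    return p - margin, p + margin

# 23% of a random sample of 500 employed adults
low, high = proportion_interval(0.23, 500)
print(f"95% CI: {low:.3f} to {high:.3f}")
```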

    Interval Estimation for the Difference Between Two Proportions (Margin of Error)

    Statistical estimation involves using sample data to determine a range (interval) that, at an established level of confidence, will contain the difference between two population proportions.

    Steps

    Determine the confidence level (generally alpha .05)

    Use the z distribution table to find the critical value for a 2-tailed test (at alpha .05 the critical value would equal 1.96)

    Estimate Sampling Error

    $S_{p_1-p_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$

    where $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$

    Estimate the Interval

    CI = (p1-p2) ± (CV)(Sp1-p2)

    p1-p2 = difference between two sample proportions

    CV = critical value

    Interpret

    Based on alpha .05, you are 95% confident that the difference between the proportions of the two subgroups in the population from which the sample was obtained is between __ and __.

    Note: Given the sample data and level of error, the confidence interval provides an estimated range of proportions that is most likely to contain the difference between the population subgroups. The term "most likely" is measured by alpha, or in most cases there is a 5% chance (alpha .05) that the confidence interval does not contain the true difference between the subgroups in the population.
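
    No worked example accompanies this procedure, so the sketch below uses made-up inputs (p1 = .40 from 200 cases, p2 = .30 from 250 cases) purely for illustration. It assumes the unpooled sampling error sqrt(p1*q1/n1 + p2*q2/n2); the function name difference_interval is hypothetical.

```python
import math

def difference_interval(p1, n1, p2, n2, cv=1.96):
    """Confidence interval for the difference between two proportions."""
    q1, q2 = 1 - p1, 1 - p2
    # Assumed (unpooled) sampling error of the difference
    sampling_error = math.sqrt(p1 * q1 / n1 + p2 * q2 / n2)
    margin = cv * sampling_error
    diff = p1 - p2
    return diff - margin, diff + margin

# Hypothetical illustration values, not taken from the handbook
low, high = difference_interval(0.40, 200, 0.30, 250)
print(f"95% CI for p1 - p2: {low:.4f} to {high:.4f}")
```

    With these illustrative inputs the interval excludes zero, which is consistent with a difference between the two subgroups at alpha .05.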

    Comparing a Population Proportion to a Sample Proportion (Z-test)

    Assumptions

    Independent random sampling

    Nominal level data

    Large sample size

    State the Hypothesis

    Null Hypothesis (Ho): There is no difference between the proportion of the population (P) and the proportion of the sample (p).

    Alternate Hypothesis (Ha): There is a difference between the proportion of the population and the proportion of the sample.

    Ha for 1-tailed test: The proportion of the sample is greater (or less) than the proportion of the population.

    Set the Rejection Criteria

    Establish alpha and 1-tailed or 2-tailed test

    Use z-distribution table to estimate critical value

    At alpha .05, 2-tailed test Zcv = 1.96

    At alpha .05, 1-tailed test Zcv = 1.65

    Compute the Test Statistic

    Standard error

    $S_p = \sqrt{\frac{PQ}{n}}$

    P = population proportion

    Q = 1 - P

    n = sample size

    Test statistic

    $Z = \frac{p - P}{S_p}$

    p = sample proportion

    Decide Results of Null Hypothesis

    There is/is not a significant difference between the proportion of the population and the proportion of the sample.

    Example: Sample Proportion to a Population

    Problem: Historical data indicates that about 10% of your agency's clients believe they were given poor service. Now under new management for six months, a random sample of 110 clients found that 15% believe they were given poor service.

    Pu = .10

    Ps = .15

    n = 110

    State the Hypothesis

    Ho: There is no statistically significant difference between the historical proportion of clients reporting poor service and the current proportion of clients reporting poor service.

    If 2-tailed test

    Ha: There is a statistically significant difference between the historical proportion of clients reporting poor service and the current proportion of clients reporting poor service.

    If 1-tailed test

    Ha: The proportion of current clients reporting poor service is significantly greater than the historical proportion of clients reporting poor service.

    Set the Rejection Criteria

    If 2-tailed test, Alpha .05, Zcv = 1.96

    If 1-tailed test, Alpha .05, Zcv = 1.65

    Compute the Test Statistic

    Estimate Standard Error

    $S_p = \sqrt{\frac{(.10)(.90)}{110}} = .029$

    Test Statistic

    $Z = \frac{.15 - .10}{.029} = 1.724$

    Decide Results of Null Hypothesis

    If a 2-tailed test was used

    Since the test statistic of 1.724 did not meet or exceed the critical value of 1.96, you must conclude there is no statistically significant difference between the historical proportion of clients reporting poor service and the current proportion of clients reporting poor service.

    If a 1-tailed test was used

    Since the test statistic of 1.724 exceeds the critical value of 1.65, you can conclude the proportion of current clients reporting poor service is significantly greater than the historical proportion of clients reporting poor service.
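
    The example's statistic can be checked in code. This sketch assumes the standard error is based on the population proportion and n = 110, the sample size listed with the example data; the function name one_proportion_z is hypothetical.

```python
import math

def one_proportion_z(p_population, p_sample, n):
    """Z-test comparing a sample proportion to a population proportion."""
    q_population = 1 - p_population
    standard_error = math.sqrt(p_population * q_population / n)
    z = (p_sample - p_population) / standard_error
    return standard_error, z

se, z = one_proportion_z(0.10, 0.15, 110)

# The example rounds the standard error to three decimals before dividing
z_rounded_se = (0.15 - 0.10) / round(se, 3)

print(f"standard error = {se:.4f}")
print(f"z (unrounded) = {z:.3f}, z (SE rounded to .029) = {z_rounded_se:.3f}")
print("reject Ho, 2-tailed (1.96):", abs(z) >= 1.96)
print("reject Ho, 1-tailed (1.65):", z >= 1.65)
```

    Either way the conclusion matches the example: the statistic falls between the 1-tailed and 2-tailed critical values.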

    Comparing Proportions From Two Independent Samples (Z-test)

    Assumptions

    Independent random sampling

    Nominal level data

    Large sample size

    State the Hypothesis

    Null Hypothesis (Ho): There is no difference between the proportion of one population (p1) and the proportion of another population (p2).

    Alternate Hypothesis (Ha): There is a difference between the proportion of one population (p1) and the proportion of another population (p2).

    Ha for 1-tailed test: The proportion of p1 is greater (or less) than the proportion of p2.

    p1 > p2 or p1 < p2

    Set the Rejection Criteria

    Establish alpha and 1-tailed or 2-tailed test

    Use z-distribution table to estimate critical value

    At alpha .05, 2-tailed test Zcv = 1.96

    At alpha .05, 1-tailed test Zcv = 1.65

    Compute the Test Statistic

    Standard error of the difference between two proportions

    $S_{p_1-p_2} = \sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$

    where

    $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $q = 1 - p$

    Test statistic

    $Z = \frac{p_1 - p_2}{S_{p_1-p_2}}$

    Decide Results of Null Hypothesis

    There is/is not a significant difference between the proportions of the two populations.

    Example: Two Sample Proportions

    Problem: A survey was conducted of students from the Princeton public school system to determine if the incidence of hungry children was consistent in two schools located in lower-income areas. A random sample of 80 elementary students from school A found that 23% did not have breakfast before coming to school. A random sample of 180 elementary students from school B found that 7% did not have breakfast before coming to school.

    State the Hypothesis

    Ho: There is no statistically significant difference between the proportion of students in school A not eating breakfast and the proportion of students in school B not eating breakfast.

    Ha: There is a statistically significant difference between the proportion of students in school A not eating breakfast and the proportion of students in school B not eating breakfast.

    Set the Rejection Criteria

    Alpha .05, Zcv = 1.96

    Compute the Test Statistic

    Estimate of Standard Error

    $p = \frac{80(.23) + 180(.07)}{80 + 180} = .119$, $q = 1 - .119 = .881$

    $S_{p_1-p_2} = \sqrt{(.119)(.881)\left(\frac{1}{80} + \frac{1}{180}\right)} \approx .043$

    Test Statistic

    $Z = \frac{.23 - .07}{.043} = 3.721$

    Decide Results of the Null Hypothesis

    Since the test statistic 3.721 exceeds the critical value of 1.96, you conclude there is a statistically significant difference between the proportion of students in school A not eating breakfast and the proportion of students in school B not eating breakfast.
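
    A quick computational check of this example follows. It is a sketch that assumes a pooled standard error, an assumption chosen because it comes closest to the reported statistic; the function name two_proportion_z is hypothetical.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Z-test for two independent sample proportions (pooled standard error)."""
    p_pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    standard_error = math.sqrt(p_pooled * q_pooled * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    return p_pooled, standard_error, z

# School A: 23% of 80 students; School B: 7% of 180 students
p_pooled, se, z = two_proportion_z(0.23, 80, 0.07, 180)

print(f"pooled p = {p_pooled:.3f}, standard error = {se:.4f}, z = {z:.3f}")
print("reject Ho at alpha .05 (2-tailed):", abs(z) >= 1.96)
```

    Computed without intermediate rounding, z is about 3.67; the 3.721 reported in the example corresponds to a standard error of .043. The decision to reject the null hypothesis is the same in both cases.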

    Chi-Square

    Goodness of Fit
    Chi-Square Test of Independence
    Measuring Association
    Standardized Residuals

    Chi-square Goodness of Fit Test

    Comparing frequencies of nominal data for a one-sample case.

    Assumptions

    Independent random sampling

    Nominal level data

    State the Hypothesis

    Null Hypothesis (Ho): There is no significant difference between the categories of observed frequencies for the data collected.

    Alternative Hypothesis (Ha): There is a significant difference between the categories of observed frequencies for the data collected.

    Set the Rejection Criteria

    Determine degrees of freedom k - 1

    Establish the confidence level (.05, .01, etc.)

    Use the chi-square distribution table to establish the critical value

    Compute the Test Statistic

    $\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$

    n = sample size

    k = number of categories or cells

    Fo = observed frequency

    Fe = expected frequency (determined by the established probability of occurrence for each category/cell -- the unbiased probability for distribution of frequencies n/k)

    Decide Results of Null Hypothesis

    The null hypothesis is rejected if the test statistic equals or exceeds the critical value. If the null hypothesis is rejected, this is an indication that the differences between the expected and observed frequencies are too great to be attributed to sampling fluctuations (error). There is a significant difference in the population between the frequencies observed in the various categories.

    Example: Chi-Square Goodness-of-Fit Test

    Problem: You wish to evaluate variations in the proportion of defects produced from five assembly lines. A random sample of 100 defective parts from the five assembly lines produced the following contingency table.

    Line A    Line B    Line C    Line D    Line E
    24        15        22        20        19

    State the Hypothesis

    Ho: There is no significant difference among the assembly lines in the observed frequencies of defective parts.

    Ha: There is a significant difference among the assembly lines in the observed frequencies of defective parts.

    Set the Rejection Criteria

    df=5-1 or df=4

    At alpha .05 and 4 degrees of freedom, the critical value from the chi-square distribution is 9.488

    Compute the Test Statistic

                     Line A    Line B    Line C    Line D    Line E    Total
    Fo               24        15        22        20        19
    Fe (100/5=20)    20        20        20        20        20
    (Fo-Fe)^2/Fe     .8        1.25      .2        0         .05       2.30

    Decide Results of Null Hypothesis

    Since the chi-square test statistic 2.30 does not meet or exceed the critical value of 9.488, you would conclude there is no statistically significant difference among the assembly lines in the observed frequencies of defective parts.
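
    The goodness-of-fit calculation above can be reproduced as follows. This sketch assumes equal expected frequencies (n/k) for every category, as in the example; the function name chi_square_gof is hypothetical.

```python
def chi_square_gof(observed):
    """Chi-square goodness of fit with equal expected frequencies (n/k)."""
    n = sum(observed)
    k = len(observed)
    expected = n / k
    # One (Fo - Fe)^2 / Fe term per category
    contributions = [(fo - expected) ** 2 / expected for fo in observed]
    return contributions, sum(contributions), k - 1

# Defective parts observed on Lines A through E
contributions, chi2, df = chi_square_gof([24, 15, 22, 20, 19])

critical_value = 9.488  # alpha .05, df=4, as quoted in the example
print("per-line contributions:", [round(c, 2) for c in contributions])
print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject Ho:", chi2 >= critical_value)
```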

    Chi-square Test of Independence

    Comparing frequencies of nominal data for k-sample cases

    Assumptions

    Independent random sampling

    Nominal/Ordinal level data

    No more than 20% of the cells have an expected frequency less than 5

    No empty cells

    State the Hypothesis

    Null Hypothesis (Ho): The observed frequencies occur independently of the variables tested.

    Alternative Hypothesis (Ha): The observed frequencies depend on the variables tested.

    Set the Rejection Criteria

    Determine degrees of freedom df=(# of rows - 1)(# of columns - 1)

    Establish the confidence level (.05, .01, etc.)

    Use the chi-square distribution table to estimate the critical value

    Compute the Test Statistic

    $\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$

    Fo = observed frequency

    Fe = expected frequency for each cell

    Fe = (frequency for the row)(frequency for the column)/n

    Decide Results of Null Hypothesis

    The frequencies exhibited are/are not independent of the variables tested. There is/is not a significant association in the population between the variables.

    Example: Chi-Square Test of Independence

    Problem: You wish to evaluate the association between a person's sex and their attitudes toward school spending on athletic programs. A random sample of adults in your school district produced the following table.

                        Female    Male    Row Total
    Spend more money    15        25      40
    Spend the same      5         15      20
    Spend less money    35        10      45
    Column Total        55        50      105

    State the Hypothesis

    Ho: There is no association between a person's sex and their attitudes toward spending on athletic programs.

    Ha: There is an association between a person's sex and their attitudes toward spending on athletic programs.

    Set the Rejection Criteria

    Determine degrees of freedom df=(3 - 1)(2 - 1) or df=2

    Alpha = .05

    Based on the chi-square distribution table, the critical value = 5.991

    Compute the Test Statistic

    Frequency Observed

                        Female    Male    Row Total
    Spend more money    15        25      40
    Spend the same      5         15      20
    Spend less money    35        10      45
    Column Total        55        50      105

    Frequency Expected

                        Female                 Male                   Row Total
    Spend more money    55*40/105 = 20.952     50*40/105 = 19.048     40
    Spend the same      55*20/105 = 10.476     50*20/105 = 9.524      20
    Spend less money    55*45/105 = 23.571     50*45/105 = 21.429     45
    Column Total        55                     50                     105

    Chi-square Calculations

                        Female                  Male
    Spend more money    (15-20.952)^2/20.952    (25-19.048)^2/19.048
    Spend the same      (5-10.476)^2/10.476     (15-9.524)^2/9.524
    Spend less money    (35-23.571)^2/23.571    (10-21.429)^2/21.429

    Chi-square

                        Female    Male
    Spend more money    1.691     1.860
    Spend the same      2.862     3.149
    Spend less money    5.542     6.096

    Total chi-square = 21.200

    Decide Results of Null Hypothesis

    Since the chi-square test statistic 21.2 exceeds the critical value of 5.991, you may conclude there is a statistically significant association between a person's sex and their attitudes toward spending on athletic programs. As is apparent in the contingency table, males are more likely to support spending than females.
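
    The expected frequencies and the chi-square statistic in this example can be reproduced with a short script. This is a minimal sketch using only the observed table; the function name chi_square_independence is hypothetical.

```python
def chi_square_independence(table):
    """Chi-square test of independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = []
    chi2 = 0.0
    for i, row in enumerate(table):
        expected_row = []
        for j, fo in enumerate(row):
            # Fe = (row frequency)(column frequency) / n
            fe = row_totals[i] * col_totals[j] / n
            expected_row.append(fe)
            chi2 += (fo - fe) ** 2 / fe
        expected.append(expected_row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, chi2, df

# Rows: spend more, spend the same, spend less; columns: female, male
observed = [[15, 25], [5, 15], [35, 10]]
expected, chi2, df = chi_square_independence(observed)

critical_value = 5.991  # alpha .05, df=2, as quoted in the example
print("expected:", [[round(fe, 3) for fe in row] for row in expected])
print(f"chi-square = {chi2:.3f}, df = {df}, reject Ho: {chi2 >= critical_value}")
```

    The unrounded statistic differs from 21.200 only in the third decimal because the example sums cell values that were already rounded.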

    Coefficients for Measuring Association

    The following are a few of the many measures of association used with chi-square and other contingency table analyses. When using the chi-square statistic, these coefficients can be helpful in interpreting the relationship between two variables once statistical significance has been established. The logic for using measures of association is as follows:

    Even though a chi-square test may show statistical significance between two variables, the relationship between those variables may not be substantively important. These and many other measures of association are available to help evaluate the relative strength of a statistically significant relationship. In most cases, they are not used in interpreting the data unless the chi-square statistic first shows there is statistical significance (i.e., it doesn't make sense to say there is a strong relationship between two variables when your statistical test shows this relationship is not statistically significant).

    Phi

    Only used on 2x2 contingency tables. Interpreted as a measure of the relative strength of an association between two variables ranging from 0 to 1.

    $\phi = \sqrt{\frac{\chi^2}{n}}$

    Pearson's Contingency Coefficient (C)

    It is interpreted as a measure of the relative strength of an association between two variables. The coefficient will always be less than 1 and varies according to the number of rows and columns.

    $C = \sqrt{\frac{\chi^2}{\chi^2 + n}}$

    Cramer's V Coefficient (V)

    Useful for comparing multiple X2 test statistics and is generalizable across contingency tables of varying sizes. It is not affected by sample size and therefore is very useful in situations where you suspect a statistically significant chi-square was the result of large sample size instead of any substantive relationship between the variables. It is interpreted as a measure of the relative strength of an association between two variables. The coefficient ranges from 0 to 1 (perfect association). In practice, you may find that a Cramer's V of .10 provides a good minimum threshold for suggesting there is a substantive relationship between two variables.

    $V = \sqrt{\frac{\chi^2}{n(q-1)}}$

    where q = smaller # of rows or columns
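
    As an illustration, the coefficients can be computed for the sex-by-spending table from the test of independence example (chi-square of 21.2, n = 105, 3 rows by 2 columns). The sketch assumes the standard textbook definitions phi = sqrt(X2/n), C = sqrt(X2/(X2+n)) and V = sqrt(X2/(n(q-1))); the function name association_coefficients is hypothetical.

```python
import math

def association_coefficients(chi2, n, rows, cols):
    """Return (phi, C, V) for a contingency table's chi-square statistic."""
    q = min(rows, cols)  # smaller of the number of rows or columns
    # Phi is only reported for 2x2 tables
    phi = math.sqrt(chi2 / n) if (rows == 2 and cols == 2) else None
    contingency_c = math.sqrt(chi2 / (chi2 + n))
    cramers_v = math.sqrt(chi2 / (n * (q - 1)))
    return phi, contingency_c, cramers_v

phi, contingency_c, cramers_v = association_coefficients(21.2, 105, rows=3, cols=2)
print("phi:", phi)
print(f"Pearson's C = {contingency_c:.3f}")
print(f"Cramer's V = {cramers_v:.3f}")
```

    Under these assumptions Cramer's V is well above the .10 threshold mentioned above, which suggests the association in that example is substantive as well as statistically significant.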

    Correlation

    Comparing Paired Observations
    Significance Testing Correlation (r)
    Comparing Rank-Paired Observations
    Significance Testing Paired Correlation (rs)
    Bivariate Regression

    Pearson's Product Moment Correlation Coefficient

    Verify Conditions for using Pearson r

    Interval/ratio data must be from paired observations.

    A linear relationship should exist between the variables -- verified by plotting the data on a scattergram.

    Pearson r computations are sensitive to extreme values in the data

    Compute Pearson's r

    $r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}}$

    n = number of paired observations

    X = variable A

    Y = variable B

    Interpret the Correlation Coefficient

    A positive coefficient indicates the values of variable A vary in the same direction as variable B. A negative coefficient indicates the values of variable A and variable B vary in opposite directions.
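
    A small sketch of the computation follows. It assumes the usual computational (raw-score) formula for Pearson's r, and the five paired observations are made up for illustration; the function name pearson_r is hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's r from paired observations using the raw-score formula."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired observations (variable A, variable B)
variable_a = [1, 2, 3, 4, 5]
variable_b = [2, 4, 5, 4, 5]
r = pearson_r(variable_a, variable_b)
print(f"r = {r:.4f}")
```

    The positive sign indicates that, in these illustrative data, the two variables vary in the same direction.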

    Characterizations of Pearson r

    .9 to 1 very high correlation

    .7 to .9 high correlation
