0324305419_65709

8/13/2019 0324305419_65709

1/36

Dr S.L Gupta

Data Analysis: Analyzing Individual Variables and Basics of Hypothesis Testing

Chapter 19


Data Analysis: Two Key Considerations

(1) Is the variable to be analyzed by itself (univariate analysis) or in relationship to other variables (multivariate analysis)?

(2) What level of measurement was used?

If you can answer these two questions, data analysis is easy...


CATEGORICAL MEASURES: A commonly used expression for nominal and ordinal measures.

CONTINUOUS MEASURES: A commonly used expression for interval and ratio measures.


Basic Univariate Statistics: Categorical Measures

FREQUENCY ANALYSIS: A count of the number of cases that fall into each of the response categories.


Frequency Analysis


Use of Percentages

Percentages are very useful for interpreting the results of categorical analyses and should be included whenever possible.

Unless your sample size is VERY large, however, report percentages as whole numbers (i.e., no decimals).


Frequency Analysis

Researchers almost always work with valid percentages, which are simply percentages computed after taking out cases with missing data on the variable being analyzed.

Note: In the example, there were no missing cases. As a result, the Percent column entries were identical to the Valid Percent column entries.
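The difference between the Percent and Valid Percent columns can be illustrated with a minimal Python sketch. The function name `frequency_table`, the category labels, and the use of `None` to mark missing data are assumptions made for illustration; only the 30-out-of-100 "financed" figure comes from the example.

```python
from collections import Counter

def frequency_table(responses):
    """Return {category: (count, percent, valid_percent)}.

    None marks a case with missing data (an assumption of this sketch).
    Percentages are rounded to whole numbers.
    """
    total = len(responses)
    valid = [r for r in responses if r is not None]
    counts = Counter(valid)
    table = {}
    for category, count in counts.items():
        percent = round(100 * count / total)             # base: all cases
        valid_percent = round(100 * count / len(valid))  # base: non-missing cases
        table[category] = (count, percent, valid_percent)
    return table

# No missing cases: Percent and Valid Percent are identical.
complete = ["financed"] * 30 + ["not financed"] * 70
print(frequency_table(complete))

# Hypothetical variant with 10 missing cases: the two columns diverge.
with_missing = ["financed"] * 30 + ["not financed"] * 60 + [None] * 10
print(frequency_table(with_missing))
```

In the second call the "financed" category is 30% of all cases but 33% of valid cases, which is why the valid percentage is the one usually reported.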


Uses of Frequency Analysis

Univariate categorical analysis

Identify blunders and cases with excessive item nonresponse

Identify outliers

Determine empirical distribution of a variable


Frequency Analysis


Confidence Interval

A projection of the range within which a population parameter will lie at a given level of confidence based on a statistic obtained from a probabilistic sample.

This is why you need to draw a probability sample!


Confidence Intervals for Proportions

CONFIDENCE INTERVAL: $p \pm z\sqrt{\dfrac{p(1-p)}{n}}$

where z = z-score associated with the desired level of confidence; p = the proportion obtained from the sample; and n = the number of valid cases overall on which the proportion was based.



EXAMPLE: In Exhibit 19.2, we saw that 30% of the people in the sample had financed the most recent car purchase. Assuming that the 100 respondents had been secured using a probability sampling plan, what is the 95% confidence interval for the population parameter?



Therefore, we can be 95% confident that the proportion of people in the population who would respond that they had financed their most recent car purchase is between .21 and .39, inclusive.
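The arithmetic behind this interval can be checked with a short Python sketch using the standard large-sample formula $p \pm z\sqrt{p(1-p)/n}$. The function name `proportion_ci` is hypothetical; the inputs (p = .30, n = 100, 95% confidence) are the ones given in the example.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p, n, confidence=0.95):
    """Large-sample confidence interval for a proportion."""
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

low, high = proportion_ci(0.30, 100)
print(f"{low:.2f} to {high:.2f}")  # 0.21 to 0.39
```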


CAUTION in Interpreting Confidence Intervals

The confidence interval only takes sampling error into account.

It DOES NOT account for other common types of error (e.g., response error, nonresponse error).

The goal is to reduce TOTAL error, not just one type of error.


Basic Univariate Statistics: Continuous Measures

DESCRIPTIVE STATISTICS: Statistics that describe the distribution of responses on a variable. The most commonly used descriptive statistics are the mean and standard deviation.


Converting Continuous Measures to Categorical Measures

Sometimes it is useful to convert continuous measures to categorical measures. This is legitimate, because measures at higher levels of measurement (in this case, continuous measures) have all the properties of measures at lower levels of measurement (categorical measures).

Why do this? Ease of interpretation for managers.


TWO-BOX TECHNIQUE: A technique for converting an interval-level rating scale into a categorical measure, usually used for presentation purposes. The percentage of respondents choosing one of the top two positions on a rating scale is reported.




Please rate the quality of service provided by Better Smiles Dental Office on the following scales:

                     very poor   poor   neutral   good   very good
Dental technicians      (2)       (6)     (36)    (32)     (24)
Receptionist           (10)      (16)     (18)    (36)     (20)
Dentist                (17)      (17)     (35)    (21)     (10)

Frequency count of respondents selecting each response category shown in parentheses.




                     two-box   mean (s.d.)
Dental technicians     56%     3.70 (0.97)
Receptionist           56%     3.40 (1.25)
Dentist                31%     2.90 (1.21)

(n=100)
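The two-box percentages and means can be recomputed from the frequency counts on the rating-scale slide. The following Python sketch assumes the scale is scored 1 (very poor) through 5 (very good) and that the standard deviation uses the n − 1 divisor; neither detail is stated on the slides, and the name `summarize` is hypothetical.

```python
from math import sqrt

# Frequency counts from the rating-scale slide, ordered very poor (1) to very good (5).
counts = {
    "Dental technicians": [2, 6, 36, 32, 24],
    "Receptionist": [10, 16, 18, 36, 20],
    "Dentist": [17, 17, 35, 21, 10],
}

def summarize(freqs):
    """Return (two-box percentage, mean, sample standard deviation)."""
    n = sum(freqs)
    # Two-box: share of respondents in the top two scale positions.
    two_box = 100 * (freqs[-1] + freqs[-2]) / n
    mean = sum(score * f for score, f in enumerate(freqs, start=1)) / n
    variance = sum(
        f * (score - mean) ** 2 for score, f in enumerate(freqs, start=1)
    ) / (n - 1)  # assumed n - 1 divisor
    return two_box, mean, sqrt(variance)

for item, freqs in counts.items():
    two_box, mean, sd = summarize(freqs)
    print(f"{item}: two-box {two_box:.0f}%, mean {mean:.2f}, s.d. {sd:.2f}")
```

The two-box percentages and means match the table; the standard deviations agree to within about 0.01, with the last digit depending on the divisor and rounding used.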




Confidence Intervals for Means

CONFIDENCE INTERVAL: $\bar{x} \pm z\,\dfrac{s}{\sqrt{n}}$

where z = z-score associated with the desired level of confidence; s = the sample standard deviation; and n = the total number of cases used to calculate the mean.


EXAMPLE: A sample of 100 car owners revealed that the mean number of family members was 4.0, with a sample standard deviation of 1.9 family members. Assuming that the 100 respondents had been secured using a probability sampling plan, what is the 95% confidence interval for the mean number of family members in the population?




Therefore, we can be 95% confident that the mean number of family members in the population lies somewhere between 3.6 and 4.4, inclusive.
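This interval can be reproduced with a short Python sketch using the standard large-sample formula $\bar{x} \pm z\,s/\sqrt{n}$. The function name `mean_ci` is hypothetical; the inputs (mean 4.0, s = 1.9, n = 100) are the ones given in the example.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(mean, s, n, confidence=0.95):
    """Large-sample confidence interval for a mean."""
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return mean - margin, mean + margin

low, high = mean_ci(4.0, 1.9, 100)
print(f"{low:.1f} to {high:.1f}")  # 3.6 to 4.4
```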


Hypothesis Testing

THE ISSUE: How can we tell if a particular result in the sample represents the true situation in the population or simply occurred by chance?


Hypotheses

Unproven propositions about some phenomenon of interest.


Hypothesis Testing

Null Hypothesis (Ho): The hypothesis that a proposed result is not true for the population. Researchers typically attempt to reject the null hypothesis in favor of some alternative hypothesis.

Alternative Hypothesis (HA): The hypothesis that a proposed result is true for the population.


Typical Hypothesis Testing Procedure

(1) Specify Null and Alternative Hypotheses after Analyzing the Research Problem

(2) Choose an Appropriate Statistical Test Considering the Research Design and after Determining the Sampling Distribution That Applies Given the Chosen Test Statistic

(3) Specify the Significance Level (Alpha) for the Problem Being Investigated

(4) Collect the Data and Compute the Value of the Test Statistic Appropriate for the Sampling Distribution

(5) Determine the Probability of the Test Statistic under the Null Hypothesis Using the Sampling Distribution Specified in Step 2

(6) Compare the Obtained Probability with the Specified Significance Level and Then Reject or Do Not Reject the Null Hypothesis on the Basis of the Comparison
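The last two steps of the procedure can be sketched in Python for the case where the test statistic follows a standard normal distribution. This is only an illustration: the z value is hypothetical, the two-sided form is an assumption, and the names `two_sided_p_value` and `decide` are not from the slides.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability, under the null hypothesis, of a z statistic at least this extreme."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p_value, alpha=0.05):
    """Compare the obtained probability with the chosen significance level."""
    return "reject H0" if p_value < alpha else "do not reject H0"

z = 2.10                      # hypothetical test statistic from the collected data
p_value = two_sided_p_value(z)
print(round(p_value, 3), decide(p_value))
```

With the hypothetical z of 2.10 the p-value is about .036, which is below an alpha of .05, so the null hypothesis would be rejected; with alpha set at .01 it would not be.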


Significance Level (α)

The acceptable level of Type I error selected by the researcher, usually set at 0.05. Type I error is the probability of rejecting the null hypothesis when it is actually true for the population.


p-value

The probability of obtaining a given result if in fact the null hypothesis were true in the population. A result is regarded as statistically significant if the p-value is less than the chosen significance level of the test.


Common Misinterpretations of What Statistically Significant Means

Viewing p-values as if they represent the probability that the results occurred because of sampling error (e.g., p=.05 implies that there is only a .05 probability that the results were caused by chance).

Assuming that statistical significance is the same thing as managerial significance.

Viewing the α or p levels as if they are somehow related to the probability that the research hypothesis is true (e.g., a p-value such as p<.001 is highly significant and therefore more valid than p


Testing Hypotheses about Individual Variables

Chi-square Goodness-of-Fit Test for Frequencies: A statistical test to determine whether some observed pattern of frequencies corresponds to an expected pattern.
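As a rough illustration, the usual goodness-of-fit statistic sums $(O - E)^2 / E$ over the response categories and compares the total with a chi-square critical value. In the Python sketch below the observed and expected counts are hypothetical, the function name is an assumption, and 5.991 is the tabled critical value for 2 degrees of freedom at alpha = .05.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [55, 30, 15]   # hypothetical counts in three categories (n = 100)
expected = [40, 40, 20]   # hypothetical expected pattern for the same n

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1
critical_value = 5.991    # chi-square table, df = 2, alpha = .05

print(statistic, degrees_of_freedom)
print("reject H0" if statistic > critical_value else "do not reject H0")
```

Here the statistic is 9.375, which exceeds the critical value, so the observed pattern would be judged to differ from the expected one.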





Kolmogorov-Smirnov Test: A statistical test used with ordinal data to determine whether some observed pattern of frequencies corresponds to some expected pattern; also used to determine whether two independent samples have been drawn from the same population or from populations with the same distribution.
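For the one-sample case, the test statistic is commonly taken as the largest absolute gap between the observed and expected cumulative proportions across the ordered categories. The Python sketch below assumes that form; the counts, the expected proportions, and the function name are hypothetical, and the critical value uses the common large-sample approximation 1.36/√n at alpha = .05.

```python
from math import sqrt

def ks_statistic(observed_counts, expected_proportions):
    """Largest absolute gap between observed and expected cumulative proportions."""
    n = sum(observed_counts)
    cumulative_observed = 0.0
    cumulative_expected = 0.0
    d = 0.0
    for count, proportion in zip(observed_counts, expected_proportions):
        cumulative_observed += count / n
        cumulative_expected += proportion
        d = max(d, abs(cumulative_observed - cumulative_expected))
    return d

observed = [10, 20, 30, 40]            # hypothetical counts in four ordered categories
expected = [0.25, 0.25, 0.25, 0.25]    # hypothetical expected pattern (equal preference)

d = ks_statistic(observed, expected)
critical_value = 1.36 / sqrt(sum(observed))   # large-sample approximation, alpha = .05

print(round(d, 3), round(critical_value, 3))
print("reject H0" if d > critical_value else "do not reject H0")
```

With these numbers D is 0.2 against a critical value of about 0.136, so the hypothetical observed pattern would be judged different from the expected one.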





Z-test for Comparing Sample Proportion against a Standard

$z = \dfrac{p - \pi}{\sigma_p}$, with $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$

where p = proportion from the sample, π = the proportion standard to be achieved, σ_p = the standard error of the proportion, and n = number of respondents in the sample.
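A minimal Python sketch of this test follows, assuming the standard error is computed from the standard as $\sqrt{\pi(1-\pi)/n}$ and that the alternative hypothesis is one-sided (the sample proportion exceeds the standard). The standard of .20 is hypothetical, and the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(p, pi, n):
    """z statistic and one-sided p-value for H_A: population proportion > pi."""
    sigma_p = sqrt(pi * (1 - pi) / n)   # standard error under the null hypothesis
    z = (p - pi) / sigma_p
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

# Sample proportion .30 from 100 respondents, tested against a hypothetical standard of .20.
z, p_value = z_test_proportion(0.30, 0.20, 100)
print(round(z, 2), round(p_value, 4))
```

With these inputs z is 2.5, and the one-sided p-value is well below .05, so the null hypothesis that the standard has not been exceeded would be rejected.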





t-test for Comparing Sample Mean against a Standard (Small Sample, n ≤ 30)

$t = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$, with $s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$

where x̄ = sample mean, μ = the population standard, s_x̄ = the standard error of the mean, s = sample standard deviation, and n = sample size.
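The small-sample statistic can be sketched in Python as follows, taking the standard error of the mean as $s/\sqrt{n}$ and using n − 1 degrees of freedom. All input values are hypothetical, the function name is an assumption, and 2.064 is the tabled two-tailed critical value of t for 24 degrees of freedom at alpha = .05.

```python
from math import sqrt

def t_statistic(x_bar, mu, s, n):
    """Return (t, degrees of freedom) for a sample mean tested against a standard."""
    standard_error = s / sqrt(n)
    return (x_bar - mu) / standard_error, n - 1

# Hypothetical small sample: mean 4.5, standard 4.0, s = 1.0, n = 25.
t, df = t_statistic(4.5, 4.0, 1.0, 25)
critical_value = 2.064   # t table, df = 24, two-tailed alpha = .05

print(t, df)
print("reject H0" if abs(t) > critical_value else "do not reject H0")
```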





z-test for Comparing Sample Mean against a Standard (Large Sample, n > 30)

$z = \dfrac{\bar{x} - \mu}{s_{\bar{x}}}$, with $s_{\bar{x}} = \dfrac{s}{\sqrt{n}}$

where x̄ = sample mean, μ = the population standard, s_x̄ = the standard error of the mean, s = sample standard deviation, and n = sample size.
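For a large sample the same ratio is referred to the standard normal distribution. The Python sketch below reuses the family-size figures from the earlier confidence-interval example (mean 4.0, s = 1.9, n = 100); the standard of 3.5 family members, the two-sided alternative, and the function name are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu, s, n):
    """z statistic and two-sided p-value for a sample mean tested against a standard."""
    standard_error = s / sqrt(n)
    z = (x_bar - mu) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Sample mean 4.0, s = 1.9, n = 100, tested against a hypothetical standard of 3.5.
z, p_value = z_test_mean(4.0, 3.5, 1.9, 100)
print(round(z, 2), round(p_value, 4))
print("reject H0" if p_value < 0.05 else "do not reject H0")
```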
