Session 1: Overview of Quantitative Research Methods in Innovation Studies

CENTRE FOR INNOVATION, RESEARCH AND COMPETENCE IN THE LEARNING ECONOMY


Taehyun Jung

CIRCLE, Lund University

13.15-15.00 December 10 2012

For Survey of Quantitative Research, NORSI

CIRCLE, Lund University, Sweden

Contents

Motivation – data analytic trends
Qualitative v. Quantitative Research
Empirical research design
– Validity & Reliability
Structure and Elements of Empirical Research
– Research process
– Example
– Research Question
– Data


Motivation



Motivation > Data Analytic Trends in strategic management


Qualitative v. Quantitative Research



Qualitative v. Quantitative research

Qualitative Research
– aims at understanding. It primarily answers "how?" questions
– interpretive approach to data: studies `things' within their context and considers the subjective meanings that people bring to their situation
– Case studies
  Cf. “the method does not imply any particular form of data collection - which can be qualitative or quantitative” (Yin 1993)

Quantitative Research
– aims at (causal) explanation. It primarily answers "why?" questions
– statistical, quantitative research methods and analysis
– Social surveys and experiments

Complementary - not contradictory
– different kinds of research questions and objects of research
– different perspectives on the same research objects / questions (methodological triangulation)


The quantitative method

Based on the idea that social phenomena can be quantified, measured and expressed numerically.

The information about a social phenomenon is expressed in numeric terms that can be analyzed by statistical methods.

The observations can be directly numeric information or can be classified into numeric variables.


Quantitative research

Strengths...
– Enables the research and description of social structures and processes that are not directly observable.
– Well-suited for quantitative description, comparisons between groups, areas etc.
– Description of change.
– Analysis and explanation of (causal) dependencies between social phenomena.

...and Weaknesses.
– Simplifies and “compresses” the complex reality: abstract and constrained perspective.
– Only applicable for measurable (quantifiable) phenomena
– Presumes relatively extensive knowledge on the subject matter in order to be able to ask “correct” questions.
– Difficult to study processes or “dynamic” phenomena: produces static view of the reality
– Description of actors’ perspectives, intentions and meanings difficult.


Description and explanation

What is going on (descriptive research)?
– E.g. social, innovation indicators
– to describe the invention rate in a country, to examine trends over time or to compare the rates in different countries
– Good description provokes the `why' questions of explanatory research

Why is it going on (explanatory research)?
– focuses on why questions
– why the invention rate is as high as it is, why some types of invention are increasing or why the rate is higher in some countries than in others?
– Answering the `why' questions involves developing causal explanations. Causal explanations argue that phenomenon Y (e.g. income level) is affected by factor X (e.g. gender).

Most research includes both description and explanation

Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.


Three types of causal relationships




Causation

Correlation and causation:
– There is a correlation between the number of fire engines at a fire and the amount of damage caused by the fire (the more fire engines, the more damage).
– Is it therefore reasonable to conclude that the number of fire engines causes the amount of damage?
– Clearly the number of fire engines and the amount of damage will both be due to some third factor, such as the seriousness of the fire.

Prediction and causation:
– Knowing the type of school attended improves our capacity to predict academic achievement.
– But this does not mean that the school type affects academic achievement. Predicting performance on the basis of school type does not tell us why private school students do better.
– Good prediction does not depend on causal relationships. Nor does the ability to predict accurately demonstrate anything about causality.
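The fire-engine example can be reproduced with simulated data. The sketch below is only an illustration under made-up assumptions: severity, engines, and damage are invented variables, and both engines and damage are generated from severity alone, so engines have no effect on damage by construction. The raw correlation is still strong, and it should largely vanish once severity is held fixed (here by correlating the residuals of two simple regressions on severity).

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing a least-squares line on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)
# Assumed data-generating process: severity is the only cause of both outcomes.
severity = [rng.uniform(0, 10) for _ in range(5000)]
engines = [s + rng.gauss(0, 1) for s in severity]
damage = [5 * s + rng.gauss(0, 5) for s in severity]

raw_r = pearson(engines, damage)
partial_r = pearson(residuals(engines, severity), residuals(damage, severity))
print(f"engines vs damage, raw: {raw_r:.2f}")
print(f"engines vs damage, severity held fixed: {partial_r:.2f}")
```

In real data the third factor is rarely measured this cleanly, which is why the slides go on to treat design, not just statistical adjustment, as the safeguard.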



We have to infer cause

While we can observe correlation we cannot observe cause. We have to infer cause.
– These inferences are `necessarily fallible . . . [they] are only indirectly linked to observables' (Cook and Campbell, 1979: 10).
– Because our inferences are fallible we must minimize the chances of incorrectly saying that a relationship is causal when in fact it is not.

One of the fundamental purposes of research design in explanatory research is to avoid invalid inferences.

Adopting a sceptical approach to explanations
– scientific knowledge must always be provisional (Popper)
– rather than seeking evidence that is consistent with our theory we should seek evidence that provides a compelling test of the theory
– strategies for doing this:
  – eliminating rival explanations of the evidence
  – deliberately seeking evidence that could disprove the theory

Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.


Think of the alternative hypotheses and avoid the logical fallacy of affirming the consequent

If A then B. B is true. Therefore A is true.

If A [or C, or D, or E, or F, or . . .] then B. We observe B. Therefore A [or C, or D, or E, or F, or . . .] is true.

“There always may be an unthought-of explanation”
– The more alternative explanations that have been eliminated and the more we have tried to disprove our theory, the more confidence we will have in it, but we should avoid thinking that it is proven
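Why affirming the consequent is a fallacy can be checked mechanically with a truth table. The short sketch below (the helper name implies is just an illustrative choice) lists every truth assignment in which the premises hold but the conclusion fails. It finds one such case for "if A then B; B; therefore A" and none for the refutation pattern "if A then B; not B; therefore not A", which is the logic behind deliberately seeking disconfirming evidence.

```python
from itertools import product

def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q

assignments = list(product([True, False], repeat=2))

# Affirming the consequent: premises (A -> B) and B, conclusion A.
affirming_counterexamples = [
    (a, b) for a, b in assignments if implies(a, b) and b and not a
]

# Refutation (modus tollens): premises (A -> B) and not B, conclusion not A.
refutation_counterexamples = [
    (a, b) for a, b in assignments if implies(a, b) and (not b) and a
]

print("affirming the consequent fails when (A, B) =", affirming_counterexamples)
print("refutation counterexamples:", refutation_counterexamples)
```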



Empirical research design



Research design

Logical structure of the research (data).

“The function of a research design is to ensure that the evidence obtained enables us to answer the initial question as unambiguously as possible.” (David de Vaus: Research Design in Social Research, 2001)
– given this research question (or theory),
– what type of evidence is needed to answer the question (or test the theory) in a convincing way?

Empirical support for practically any hypothesis can usually be obtained by manipulating data.

Good research design prevents this kind of manipulative use of data by taking into account possible alternative explanations and enabling comparisons and judgments between them.


Validity and Reliability

Validity: are conclusions true?
– Degree to which you are truly measuring what you intend to measure
– Does the instrument measure what it is meant to measure?
– An instrument can be reliable, but not valid
  Example: Measure anxiety with the temperature readings on a thermometer
– If an instrument is valid, it must also be reliable

Reliability: can findings be repeated?
– If the design of a research study is reliable, then its findings should be repeatable, replicable, generalizable
– Can the study be replicated?
– Will the research yield stable, consistent results when applied repeatedly?
  Example: “How many books have you borrowed this year?” A study in which this is an important question might be unreliable: subjects likely will not recall the exact number and will guess different numbers at different times
– "repeatability" or "consistency".


Validity and Reliability

The center of the target: the concept that you are trying to measure

Shots (dots) = observation

Source: http://www.socialresearchmethods.net/kb/relandval.php
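The target picture can be mimicked numerically. In the sketch below, a hypothetical instrument returns the true value plus a fixed bias plus random noise; all numbers are invented for illustration. The average miss stands in for (lack of) validity and the spread of the shots for (lack of) reliability.

```python
import random
import statistics

def measure(true_value, bias, noise_sd, n, rng):
    """n readings from a hypothetical instrument: truth + fixed bias + noise."""
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]

def summarize(readings, true_value):
    """Return (average miss from the target centre, spread of the shots)."""
    return statistics.fmean(readings) - true_value, statistics.stdev(readings)

rng = random.Random(1)
TRUE_VALUE = 50.0  # the centre of the target (assumed)

# Tight cluster, wrong place: reliable but not valid.
reliable_not_valid = summarize(measure(TRUE_VALUE, 10, 1, 1000, rng), TRUE_VALUE)
# Tight cluster on the centre: reliable and valid.
reliable_and_valid = summarize(measure(TRUE_VALUE, 0, 1, 1000, rng), TRUE_VALUE)
# Scattered and off-centre: neither.
neither = summarize(measure(TRUE_VALUE, 10, 10, 1000, rng), TRUE_VALUE)

for label, (miss, spread) in [("reliable, not valid", reliable_not_valid),
                              ("reliable and valid", reliable_and_valid),
                              ("neither", neither)]:
    print(f"{label}: average miss {miss:.1f}, spread {spread:.1f}")
```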


Campbell’s Validity Typology

As developed by Campbell (1957), Campbell & Stanley (1963), Cook & Campbell (1979), with very minor changes in Shadish, Cook & Campbell (2002)
– Internal Validity
– Statistical Conclusion Validity
– Construct Validity
– External Validity

Each of the validity types has prototypical threats to validity: common reasons why we are often wrong about each of the four inferences.


Internal validity

Did the treatment affect the outcome (Campbellian)?
– whether observed covariation between A (the presumed treatment) and B (the presumed outcome) reflects a causal relationship from A to B, as those variables were manipulated or measured.

Approximate truth about inferences regarding cause-effect or causal relationships
– identify causal relationships and rule out other explanations for relationships
– not relevant in most observational or descriptive studies
– Central focus for studies that assess the effects of social programs or interventions

Goal is to be sure that the conclusions drawn from experimental results accurately reflect what went on in the experiment itself
– whether observed changes can be attributed to your program or intervention (i.e., the cause) and not to other possible causes (or “alternative explanations” or “confounding factors” for the outcome)


Threats to Internal validity could plausibly have caused an observed relationship even if the treatment had never taken place (Campbell 91)

Selection threat
– groups exposed to treatments non-randomly may differ in ways that mimic what treatment might achieve
– Participant characteristics confounded with treatment conditions because of use of intact or self-selected participants, or more generally, whenever predictor variables represent measured characteristics as opposed to independently manipulated treatments.

History threat
– treatment groups may differ over time because an event happened to the units assigned to one treatment but not the other
– Events, in addition to an assigned condition, to which participants are exposed between repeated measurements that could influence performance.

Maturation threat
– treatment groups may grow apart over time because they spontaneously mature at different rates
– Observed changes as a result of ongoing, naturally occurring processes rather than condition effects.


Threats to Internal validity (cont’d)

Instrumentation Threat
– E.g. Changed definitions of ‘innovation’

Attrition (mortality) Threat
– Differential drop out across conditions at one or more time points that may be responsible for differences.
– E.g. Innovative performances of new firms in year 1 and year 5

Regression threat
– "regression artifact" or "regression to the mean": a statistical phenomenon
– if a variable is extreme on its first measurement, it will tend to be closer to the average on a second measurement, and (a fact that may superficially seem paradoxical) if it is extreme on a second measurement, it will tend to have been closer to the average on the first measurement
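Regression to the mean is easy to see in a simulation. The sketch below assumes each unit has a stable component plus independent noise at each of two measurements (all values invented). Units selected for being extreme on one measurement sit closer to the average on the other, in both directions, with no treatment involved.

```python
import random
import statistics

rng = random.Random(2)
N = 10_000
stable = [rng.gauss(0, 1) for _ in range(N)]      # assumed stable component
first = [s + rng.gauss(0, 1) for s in stable]     # measurement 1 = stable + noise
second = [s + rng.gauss(0, 1) for s in stable]    # measurement 2 = stable + new noise

def top_decile_means(select_on, other):
    """Pick the top 10% on one measurement; return their mean on both."""
    cutoff = sorted(select_on)[int(0.9 * len(select_on))]
    chosen = [(s, o) for s, o in zip(select_on, other) if s >= cutoff]
    return (statistics.fmean([s for s, _ in chosen]),
            statistics.fmean([o for _, o in chosen]))

sel_first, follow_up = top_decile_means(first, second)
sel_second, looking_back = top_decile_means(second, first)

print(f"top 10% on first: {sel_first:.2f}, same units on second: {follow_up:.2f}")
print(f"top 10% on second: {sel_second:.2f}, same units on first: {looking_back:.2f}")
```

A program evaluated only on units chosen for extreme first scores would appear to have an effect for this reason alone.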


Best ruled out through random assignment
Or control group design
– In this scenario, you would have two groups: one receives your program and the other one doesn't. In fact, the only difference between these groups should be the program. If that's true, then the control group would experience all the same history and maturation threats, would have the same testing and instrumentation issues, and would have similar rates of mortality and regression to the mean. In other words, a good control group is one of the most effective ways to rule out the single-group threats to internal validity. Of course, when you add a control group, you no longer have a single group design.
– Cf. Jaffe’s matching design
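A small simulation shows what the control group buys. The sketch assumes a hypothetical program with a true effect of 1.0 and a history shock of 2.0 that hits every unit between the two measurements; both numbers are invented. A single-group before/after comparison absorbs the shock into the apparent effect, while the comparison with a randomly assigned control group does not.

```python
import random
import statistics

rng = random.Random(3)
N = 4000
HISTORY = 2.0   # assumed shock affecting every unit between measurements
EFFECT = 1.0    # assumed true effect of the program

units = list(range(N))
rng.shuffle(units)
treated = set(units[: N // 2])   # random assignment to the program

pre, post = {}, {}
for u in range(N):
    baseline = rng.gauss(10, 1)
    pre[u] = baseline + rng.gauss(0, 1)
    post[u] = (baseline + HISTORY
               + (EFFECT if u in treated else 0.0)
               + rng.gauss(0, 1))

# Single-group design: before/after change among treated units only.
single_group = statistics.fmean([post[u] - pre[u] for u in treated])

# Control-group design: treated vs. control at the second measurement.
with_control = (statistics.fmean([post[u] for u in treated])
                - statistics.fmean([post[u] for u in range(N) if u not in treated]))

print(f"single-group estimate: {single_group:.2f} (true effect {EFFECT})")
print(f"estimate with control group: {with_control:.2f}")
```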


Construct Validity

Given there is a valid causal relationship, is the interpretation of the constructs involved in that relationship correct?

The degree to which inferences can legitimately be made from the operationalizations in your study to the theoretical constructs on which those operationalizations were based
– how accurately our talk matches what we actually did
– generalizing from your program or measures to the concept of your program or measures
– an assessment of how well you translated your ideas or theories into actual programs or measures
– E.g. innovation (concept) measured by patents (measure)


Threats to construct validity

Inadequate Preoperational Explication of Constructs
– you didn't do a good enough job of defining (operationally) what you mean by the construct
– Failure to adequately explicate a construct may lead to incorrect inferences about the relationship between the operation and construct.

Mono-Operation Bias
– Mono-operation bias pertains to the independent variable, cause, program or treatment in your study
– If you only use a single version of a program in a single place at a single point in time, you may not be capturing the full breadth of the concept of the program

Mono-Method Bias
– When all operationalizations use the same method (e.g., self-report), that method is part of the construct actually studied


The "Social" Threats to Construct Validity

Hypothesis Guessing
– Participants are likely to base their behavior on what they guess about the study, not just on your treatment.

Evaluation Apprehension
– Many people are anxious about being evaluated.
– For example women taking a math test may not perform to their full potential because of concerns regarding women’s stereotyped difficulties with math. In this situation, evaluation apprehension is called stereotype threat

Experimenter Expectancies
– Sometimes the researcher can communicate what the desired outcome for a study might be (and participant desire to "look good" leads them to react that way).
– For instance, the researcher might look pleased when participants give a desired answer. If this is what causes the response, it would be wrong to label the response as a treatment effect.


Statistical Conclusion Validity

“Was the original statistical inference correct?”

The validity of inferences about the correlation (covariation) between treatment and outcome.
– The power of the analysis focuses on the sensitivity or ability to detect a relationship
– Did the investigators arrive at the correct conclusion regarding whether or not a relationship between the variables exists or the extent of the relationship?
– Not concerned with the causal relationship between variables, but whether or not there is any relationship, either causal or not

Closely tied to Internal Validity
– Statistical conclusion validity (SCV) asks if the two variables are correlated. Internal validity (IV) asks if that correlation is due to causation

Type I Error
– Conclude that a relationship exists between two variables, when in fact there is no relationship.

Type II Error
– Conclude that there is no relationship when one exists.
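Type I and Type II error rates can be estimated by repeating a simulated experiment many times. The sketch below is an illustration under simplifying assumptions: two groups, normally distributed outcomes with a known standard deviation of 1, a two-sided z-test at the 5% level, and an invented effect size of 0.5. With no true effect, the rejection rate approximates the Type I error rate. With a true effect, one minus the rejection rate approximates the Type II error rate.

```python
import random
import statistics

# Two-sided 5% critical value of the standard normal distribution.
Z_CRIT = statistics.NormalDist().inv_cdf(0.975)

def rejects_null(effect, n, rng, sd=1.0):
    """Simulate one two-group experiment; apply a z-test with known sd."""
    control = [rng.gauss(0.0, sd) for _ in range(n)]
    treated = [rng.gauss(effect, sd) for _ in range(n)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    z = diff / (sd * (2 / n) ** 0.5)
    return abs(z) > Z_CRIT

def rejection_rate(effect, n, trials=2000, seed=4):
    """Share of simulated experiments that declare a relationship."""
    rng = random.Random(seed)
    return sum(rejects_null(effect, n, rng) for _ in range(trials)) / trials

type_i_rate = rejection_rate(effect=0.0, n=20)     # no true relationship
power_small = rejection_rate(effect=0.5, n=20)     # true effect, small sample
power_large = rejection_rate(effect=0.5, n=100)    # true effect, larger sample

print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Type II error rate, n=20 per group: {1 - power_small:.3f}")
print(f"Type II error rate, n=100 per group: {1 - power_large:.3f}")
```

The gap between the two sample sizes is the low-power problem in miniature: the same true effect is missed far more often in the small study.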


Threats to Statistical Conclusion Validity

– Low Statistical Power (very common)
– Violated Assumptions of Statistical Tests (especially problems of nesting, e.g. students nested in classes)
– Unreliability of Measures
– Restriction of Range
– Unreliability of Treatment Implementation
– Extraneous Variance in the Experimental Setting
– Heterogeneity of Units
– Inaccurate Effect Size Estimation


Threats to statistical conclusion validity and their remedies

Threats leading to overly conservative bias (threat: remedy)
– Small sample size: increase sample size
– Increased error from irrelevant, unreliable, or invalid measures: improve measurements
– High variability due to participant diversity: control individual differences (control for covariates; use a design that blocks, matches, or uses repeated measures)
– Violation of statistical assumptions: transform data or use different analysis methods

Threats leading to overly liberal bias (threat: remedy)
– Repeated statistical tests: use adjusted test procedures
– Violation of statistical assumptions: transform data or use different analysis methods
– Biased estimates of effects: use corrected values to estimate effects in population
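Why repeated tests produce a liberal bias follows from simple arithmetic. The sketch below assumes independent tests and uses the Bonferroni correction as one example of an adjusted procedure (the slide does not name a specific one). It computes the chance of at least one false positive across 20 tests, with and without the adjustment.

```python
def familywise_error(alpha, m):
    """Chance of at least one false positive across m independent tests,
    each run at significance level alpha, when no real effect exists."""
    return 1 - (1 - alpha) ** m

ALPHA = 0.05   # per-test significance level
M = 20         # number of tests (illustrative)

unadjusted = familywise_error(ALPHA, M)
bonferroni = familywise_error(ALPHA / M, M)   # Bonferroni: test each at alpha / m

print(f"{M} tests at alpha={ALPHA}: {unadjusted:.2f}")
print(f"{M} tests at alpha={ALPHA / M}: {bonferroni:.3f}")
```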


External Validity

“Can the finding be generalized across populations, settings, or time?”

The validity of inferences about whether the cause-effect relationship holds over variation in persons, settings, treatment variables, and measurement variables.

Generalization and applicability of your research to similar problems/settings


“validity is subjective rather than objective” (Cronbach 1982)
– Validity is a property of a conclusion to a critical audience
– Validity is assimilated to credibility


Structure and Elements of Empirical Research



The logic of the research process




Research process

Articulate research problem
Select research design
– Proper empirical setting
– Measures
– Analytic methods
Collect data
Analyze data
Infer the results


Structure of a typical empirical paper

Cohen, W. M., Nelson, R. R., & Walsh, J. P. (2002). Links and Impacts: The Influence of Public Research on Industrial R&D. Management Science, 48(1), 1-23. Cited 1,233 times (Google Scholar, counted Dec. 6, 2012)

Research questions
– “to characterize the extent and nature of the contribution of public research to industrial R&D”
  1. “how public research tends to be used in industrial R&D labs”
  2. “the overall importance of public research, as well as that of specific fields of basic and applied research and engineering”
  3. “the importance of the different pathways through which public research may impact industrial R&D, including publications, informal interactions, consulting, and the hiring of university graduates”
  4. “what roles different kinds of firms (e.g., large versus small and start-ups versus established firms) play in bridging public research and industrial R&D.”


Cohen, Nelson, & Walsh (2002)

Data
– a survey of R&D managers administered in 1994
  The population sampled are all the R&D units located in the U.S. conducting R&D in manufacturing industries as part of a manufacturing firm
  The sample was randomly drawn from the eligible labs listed in Bowker's Directory of American Research and Technology (1994) or belonging to firms listed in Standard and Poor's COMPUSTAT, stratified by three-digit SIC industry
  – We sampled 3,240 labs, and received 1,478 responses, yielding an unadjusted response rate of 46% and an adjusted response rate of 54%
  – For the analysis in this paper, we restricted our sample to firms whose focus industry was in the manufacturing sector and were not foreign owned, yielding a sample of 1,267 cases
– Sample characteristics
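The reported response rates can be checked with quick arithmetic. The unadjusted rate is responses divided by labs sampled. Adjusted rates usually divide by a smaller denominator that drops sampled units found to be ineligible; the slide does not report that denominator, so the figure back-solved below from the rounded 54% is only a rough implied value, not a number from the paper.

```python
sampled = 3240          # labs sampled (from the slide)
responses = 1478        # responses received (from the slide)
analysis_cases = 1267   # cases kept for the paper's analysis (from the slide)

unadjusted_rate = responses / sampled
kept_share = analysis_cases / responses

# Back-of-envelope assumption: adjusted rate = responses / eligible labs.
# The 0.54 is the rounded rate from the slide, so this is approximate.
implied_eligible = responses / 0.54

print(f"unadjusted response rate: {unadjusted_rate:.0%}")
print(f"share of responses kept for analysis: {kept_share:.0%}")
print(f"implied eligible labs (approximate): {implied_eligible:.0f}")
```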


Sample characteristics




Analysis - RQ1
– Public research outscores, however, consultants/contract R&D as a source of knowledge for both suggesting new R&D projects (p < 0.0001) and contributing to project completion (n.s.)… Although rivals constitute a more important source for project ideas than public research institutions (41% versus 32%, p < 0.0001), public research institutions are markedly more important than rivals as a source of knowledge contributing to project completion: 36% for public research versus 12% for competitors (p < 0.0001), suggesting that the impact of public research on firms' R&D is at least comparable to that of rivals' R&D


Articulate research problem
– In the form of research question(s) and/or hypotheses
– Determine the appropriate type of research design
– To formalize the research topic into an operational guide for the study, connecting the conceptual framework to the methods

Focused and testable
– E.g. what is the best
– Also clarifies the specific type of data to be collected


Data

Secondary data
– Financial data, indicators, patent documents, Thomson Web of Science, etc.

Self-report measures
– Survey & questionnaire
  Advantages
  – Sample large populations (cheap on materials & effort)
  – Efficiently ask a lot of questions
  Disadvantages
  – Self-report is fallible
  – Response biases are unavoidable
– Interviews


Sampling (survey & questionnaire)

Selecting respondents from population of concern
– Specify population
– Sampling framework
– Sampling bias

(Simple) random sampling
– randomly selected individuals. Each individual in the population has the same probability of being in the sample. All possible samples of size n have the same chance of being drawn

Systematic selection
Stratified sampling
Convenience sampling
Voluntary Response Sampling
Snowball sampling
– especially useful when you know little about the population.
– E.g. Name three experts in nanotechnology
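The three probability-based schemes on this slide can be sketched in a few lines. Everything about the sampling frame below is hypothetical: 1,000 made-up labs, each tagged with an invented industry label that serves as the stratum. The function names are illustrative choices, not a standard API.

```python
import random
from collections import Counter, defaultdict

rng = random.Random(5)
# Hypothetical sampling frame: (lab id, made-up industry label).
INDUSTRIES = ["chem", "chem", "chem", "elec", "elec", "auto"]
frame = [(f"lab{i:04d}", rng.choice(INDUSTRIES)) for i in range(1000)]

def simple_random(frame, n, rng):
    """Every unit, and every possible sample of size n, is equally likely."""
    return rng.sample(frame, n)

def systematic(frame, n, rng):
    """Random starting point, then every k-th unit on the list."""
    k = len(frame) // n
    start = rng.randrange(k)
    return frame[start::k][:n]

def stratified(frame, fraction, rng):
    """Simple random sample of the same fraction within each stratum."""
    strata = defaultdict(list)
    for unit in frame:
        strata[unit[1]].append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, round(fraction * len(members))))
    return sample

srs = simple_random(frame, 100, rng)
sys_sample = systematic(frame, 100, rng)
strat = stratified(frame, 0.10, rng)

print("frame:     ", Counter(industry for _, industry in frame))
print("stratified:", Counter(industry for _, industry in strat))
```

Stratifying guarantees that each industry appears in the sample in proportion to its share of the frame, which a simple random sample only achieves on average.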


Bad sampling

Convenience sampling: Just ask whoever is around.
– Example: “Man on the street” survey (cheap, convenient, often quite opinionated or emotional → now very popular with TV “journalism”)
– Which men, and on which street?
– Ask about gun control or legalizing marijuana “on the street” in Berkeley, CA and in some small town in Idaho and you would probably get totally different answers.
– Even within an area, answers would probably differ if you did the survey outside a high school or a country-western bar.
– Bias: Opinions limited to individuals present

Voluntary Response Sampling:
– Individuals choose to be involved. These samples are very susceptible to being biased because different people are motivated to respond or not. They are often called “public opinion polls” and are not considered valid or scientific.
– Bias: Sample design systematically favors a particular outcome.


General Survey Biases

Sampling bias
– are respondents representative of population of interest? How were they selected?
– do all persons in the population have an equal chance of getting selected?

Non-response & self selection bias
– People who feel they have something to hide or who don’t like their privacy being invaded probably won’t answer. Yet they are part of the population.

Response bias
– Social desirability: Fancy term for lying when you think you should not tell the truth. Like if your family doctor asks: “How much do you drink?” Or a survey of female students asking: “How many men do you date per week?”
– Recency effects: People also simply forget and often give erroneous answers to questions about the past.

Wording (or framing) effects:
– Questions worded like “Do you agree that it is awful that…” are prompting you to give a particular response.


Inference

The techniques of inferential statistics allow us to draw inferences or conclusions about a population from a sample.
– Your estimate of the population is only as good as your sampling design. Work hard to eliminate biases.
– Your sample is only an estimate, and if you randomly sampled again, you would probably get a somewhat different result.
– The bigger the sample the better: a larger sample reduces random sampling error, although it does not remove bias from a poor sampling design.

[Figure: diagram labeled Population, Sample, and inference]
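Both points about sampling variability can be seen by drawing many samples from the same population. The sketch below uses an invented population of 10,000 values and compares how much the sample mean moves around from sample to sample at two sample sizes. The spread should shrink roughly with the square root of the sample size, so a 16-fold larger sample gives about a 4-fold smaller spread.

```python
import random
import statistics

rng = random.Random(6)
# Invented population: 10,000 values with mean near 50 and sd near 10.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def spread_of_sample_means(n, repeats=500):
    """Draw `repeats` random samples of size n; return the standard
    deviation of their means (an empirical standard error)."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(25)
spread_large = spread_of_sample_means(400)

print(f"population mean: {statistics.fmean(population):.2f}")
print(f"spread of sample means, n=25:  {spread_small:.2f}")
print(f"spread of sample means, n=400: {spread_large:.2f}")
```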


Types of variables

Interval / quantitative / scale
– Something that can be counted or measured for each individual on a scale of equal units. Can be added, subtracted, averaged, etc., across individuals in the population.
– Example: How tall you are, your age, your blood cholesterol level, the number of credit cards you own.

Categorical
– Something that falls into one of several categories. What can be counted is the count or proportion of individuals in each category.
– Nominal: no inherent order
– Ordinal: ordered but cannot measure the differences in meaningful units
– Dichotomous or dummy: only two values (e.g. yes or no, male and female, promoted and not promoted)
– Example: Your blood type (A, B, AB, O), your hair color, your ethnicity, whether you paid income tax last tax year or not.
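The distinction matters in practice because the two kinds of variable are summarized and coded differently. The sketch below uses made-up data: it averages an interval variable, counts the categories of a nominal one, and turns the nominal variable into 0/1 dummy variables with one category left out as the reference (a common convention, assumed here; the helper name dummy_code is illustrative).

```python
import statistics
from collections import Counter

heights_cm = [162.0, 175.5, 180.2, 168.4, 171.9]   # interval: can be averaged
blood_types = ["A", "O", "AB", "B", "O"]           # nominal: can only be counted

def dummy_code(values, reference):
    """0/1 indicator variables for a nominal variable, omitting the
    reference category (all zeros means 'reference')."""
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

mean_height = statistics.fmean(heights_cm)
type_counts = Counter(blood_types)
coded = dummy_code(blood_types, reference="O")

print(f"mean height: {mean_height:.1f} cm")
print("blood type counts:", dict(type_counts))
print("first person coded:", coded[0])
```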


Next session

Session 2:
– Correlation
– Statistical Inference and Hypothesis Testing
– t-Test
– Confidence Interval
– Chi-square Statistic

Session 3
– Simple Regression Model
