Session 1: Overview of Quantitative Research Methods in Innovation Studies
CENTRE FOR INNOVATION, RESEARCH AND COMPETENCE IN THE LEARNING ECONOMY
Taehyun Jung
CIRCLE, Lund University
13.15-15.00 December 10 2012
For Survey of Quantitative Research, NORSI
CIRCLE, Lund University, Sweden
Motivation
– Data analytic trends
Qualitative v. Quantitative Research
Empirical research design
– Validity & Reliability
Structure and Elements of Empirical Research
– Research process
– Example
– Research Question
– Data
Contents
Motivation
Motivation > Data Analytic Trends in strategic management
Qualitative v. Quantitative Research
Qualitative Research
– aims at understanding. It primarily answers "how?" questions
– interpretive approach to data: studies ‘things’ within their context and considers the subjective meanings that people bring to their situation
– Case studies
Cf. “the method does not imply any particular form of data collection - which can be qualitative or quantitative” (Yin 1993)
Quantitative Research
– aims at (causal) explanation. It primarily answers "why?" questions
– statistical, quantitative research methods and analysis
– Social surveys and experiments
Complementary - not contradictory
– different kinds of research questions and objects of research
– different perspectives on the same research objects / questions (methodological triangulation)
Qualitative v. Quantitative research
Based on the idea that social phenomena can be quantified, measured and expressed numerically.
The information about a social phenomenon is expressed in numeric terms that can be analyzed by statistical methods.
The observations can be directly numeric information or can be classified into numeric variables.
The quantitative method
Strengths...
– Enables the research and description of social structures and processes that are not directly observable.
– Well-suited for quantitative description, comparisons between groups, areas etc.
– Description of change.
– Analysis and explanation of (causal) dependencies between social phenomena.
...and Weaknesses.
– Simplifies and “compresses” the complex reality: abstract and constrained perspective.
– Only applicable for measurable (quantifiable) phenomena
– Presumes relatively extensive knowledge on the subject matter in order to be able to ask “correct” questions.
– Difficult to study processes or “dynamic” phenomena: produces static view of the reality
– Description of actors’ perspectives, intentions and meanings difficult.
Quantitative research
What is going on (descriptive research)?
– E.g. social, innovation indicators
– to describe the invention rate in a country, to examine trends over time or to compare the rates in different countries
– Good description provokes the ‘why’ questions of explanatory research
Why is it going on (explanatory research)?
– focuses on why questions
– why the invention rate is as high as it is, why some types of invention are increasing or why the rate is higher in some countries than in others?
– Answering the ‘why’ questions involves developing causal explanations. Causal explanations argue that phenomenon Y (e.g. income level) is affected by factor X (e.g. gender).
Most research includes both description and explanation
Description and explanation
Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.
Three types of causal relationships
Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.
Correlation and causation:
– There is a correlation between the number of fire engines at a fire and the amount of damage caused by the fire (the more fire engines the more damage)
Is it therefore reasonable to conclude that the number of fire engines causes the amount of damage?
Clearly the number of fire engines and the amount of damage will both be due to some third factor - such as the seriousness of the fire
Prediction and causation:
– Knowing the type of school attended improves our capacity to predict academic achievement.
But this does not mean that the school type affects academic achievement. Predicting performance on the basis of school type does not tell us why private school students do better.
Good prediction does not depend on causal relationships. Nor does the ability to predict accurately demonstrate anything about causality.
causation
Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.
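The fire-engine example can be made concrete with a small simulation. This is only a sketch under stated assumptions: the variable names, the coefficients and the noise levels are invented for illustration and do not come from De Vaus. In the simulated data, fire severity drives both the number of engines and the damage, and engines have no effect on damage at all.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)

# Hypothetical data: severity is the third factor behind both variables.
severity = [rng.uniform(0, 10) for _ in range(5000)]
engines = [s + rng.gauss(0, 1) for s in severity]      # depends on severity only
damage = [5 * s + rng.gauss(0, 5) for s in severity]   # depends on severity only

# Correlation over all fires: strong, although engines do not cause damage.
r_all = pearson(engines, damage)

# Hold the third factor roughly constant: fires of similar severity.
band = [(e, d) for s, e, d in zip(severity, engines, damage) if 4.5 <= s <= 5.5]
r_band = pearson([e for e, _ in band], [d for _, d in band])

print(f"all fires: r = {r_all:.2f}")
print(f"similar severity only: r = {r_band:.2f}")
```

Once severity is held roughly constant, most of the correlation disappears, which is what a spurious relationship looks like in data.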
While we can observe correlation we cannot observe cause. We have to infer cause.
– These inferences are ‘necessarily fallible . . . [they] are only indirectly linked to observables’ (Cook and Campbell, 1979: 10).
– Because our inferences are fallible we must minimize the chances of incorrectly saying that a relationship is causal when in fact it is not.
One of the fundamental purposes of research design in explanatory research is to avoid invalid inferences.
Adopting a sceptical approach to explanations
– scientific knowledge must always be provisional (Popper)
– rather than seeking evidence that is consistent with our theory we should seek evidence that provides a compelling test of the theory
– strategies for doing this:
   – eliminating rival explanations of the evidence
   – deliberately seeking evidence that could disprove the theory
We have to infer cause
Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.
If A then B. B is true. Therefore A is true
If A [or C, or D, or E, or F, or . . .] then B. We observe B. Therefore A [or C, or D, or E, or F, or . . .] is true
“There always may be an unthought-of explanation”
– The more alternative explanations that have been eliminated and the more we have tried to disprove our theory, the more confidence we will have in it, but we should avoid thinking that it is proven
Think of the alternative hypotheses and avoid the logical fallacy of affirming the consequent
Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.
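The fallacy of affirming the consequent can be checked mechanically with a truth table. The sketch below is illustrative only, and the helper name implies is an assumption. It looks for cases where both premises ("if A then B" and "B") hold but the conclusion "A" fails, and contrasts this with the valid form used when trying to disprove a theory ("if A then B", "not B", therefore "not A").

```python
from itertools import product

def implies(p, q):
    """Material implication: 'if p then q'."""
    return (not p) or q

truth_values = [False, True]

# Affirming the consequent: premises (A -> B) and B, conclusion A.
# Any row where the premises hold but A is false is a counterexample.
counterexamples = [
    (a, b)
    for a, b in product(truth_values, repeat=2)
    if implies(a, b) and b and not a
]

# Valid form behind falsification: premises (A -> B) and not B, conclusion not A.
modus_tollens_valid = all(
    not a
    for a, b in product(truth_values, repeat=2)
    if implies(a, b) and not b
)

print(counterexamples)       # [(False, True)]
print(modus_tollens_valid)   # True
```

The single counterexample (A false, B true) is the case where some other explanation C, D, E, ... produced B.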
Empirical research design
Logical structure of the research (data).
“The function of a research design is to ensure that the evidence obtained enables us to answer the initial question as unambiguously as possible.” (David de Vaus: Research Design in Social Research, 2001)
– given this research question (or theory),
– what type of evidence is needed to answer the question (or test the theory) in a convincing way?
Empirical support for practically any hypothesis can usually be obtained by manipulating data.
Good research design prevents this kind of manipulative use of data by taking into account possible alternative explanations and enabling comparisons and judgments between them.
Research design
Validity: are conclusions true?
– Degree to which you are truly measuring what you intend to measure
– Does the instrument measure what it is meant to measure?
– An instrument can be reliable, but not valid
   Example: Measure anxiety with the temperature readings on a thermometer
– If an instrument is valid, it must also be reliable
Reliability: can findings be repeated?
– If the design of a research study is reliable, then its findings should be repeatable, replicable, generalizable
– Can the study be replicated?
– Will the research yield stable, consistent results when applied repeatedly?
   Example: “How many books have you borrowed this year?” A study in which this is an important question might be unreliable - subjects likely will not recall the exact number, will guess different numbers at different times
– "repeatability" or "consistency".
Validity and Reliability
The center of the target: the concept that you are trying to measure
Shots (dots) = observation
Validity and Reliability
Source: http://www.socialresearchmethods.net/kb/relandval.php
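The target picture can be imitated numerically. In the sketch below (all names and numbers are assumptions made for illustration), each "shot" is a measurement equal to the true value plus a systematic bias plus random noise. A large bias with little noise corresponds to a reliable but invalid instrument, and a lot of noise corresponds to an unreliable one.

```python
import random
import statistics

TRUE_VALUE = 50.0  # centre of the target (hypothetical concept score)

def shoot(bias, noise_sd, n=2000, seed=1):
    """Return (mean error, spread) of n simulated measurements."""
    rng = random.Random(seed)
    shots = [TRUE_VALUE + bias + rng.gauss(0, noise_sd) for _ in range(n)]
    mean_error = statistics.fmean(shots) - TRUE_VALUE
    spread = statistics.stdev(shots)
    return mean_error, spread

# Tight cluster, but away from the centre: reliable, not valid.
reliable_not_valid = shoot(bias=10.0, noise_sd=0.5)
# Shots scattered widely around the centre: not reliable.
unreliable = shoot(bias=0.0, noise_sd=10.0)
# Tight cluster on the centre: reliable and valid.
reliable_and_valid = shoot(bias=0.0, noise_sd=0.5)

for label, (err, spread) in [
    ("reliable, not valid", reliable_not_valid),
    ("unreliable", unreliable),
    ("reliable and valid", reliable_and_valid),
]:
    print(f"{label}: mean error = {err:.2f}, spread = {spread:.2f}")
```

Mean error stands in for (lack of) validity and spread for (lack of) reliability; this is a simplification of the concepts, not a definition of them.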
As developed by Campbell (1957), Campbell & Stanley (1963), Cook & Campbell (1979), with very minor changes in Shadish, Cook & Campbell (2002)
– Internal Validity
– Statistical Conclusion Validity
– Construct Validity
– External Validity
Each of the validity types has prototypical threats to validity—common reasons why we are often wrong about each of the four inferences.
Campbell’s Validity Typology
did the treatment affect the outcome (Campbellian)?
– whether observed covariation between A (the presumed treatment) and B (the presumed outcome) reflects a causal relationship from A to B, as those variables were manipulated or measured.
approximate truth about inferences regarding cause-effect or causal relationships
– identify causal relationships and rule out other explanations for relationships
– not relevant in most observational or descriptive studies
– Central focus for studies that assess the effects of social programs or interventions
Goal is to be sure that the conclusions drawn from experimental results accurately reflect what went on in the experiment itself
– whether observed changes can be attributed to your program or intervention (i.e., the cause) and not to other possible causes (or “alternative explanations” or “confounding factors” for the outcome)
Internal validity
Selection threat
– groups exposed to treatments non-randomly may differ in ways that mimic what treatment might achieve
– Participant characteristics confounded with treatment conditions because of use of intact or self-selected participants, or more generally, whenever predictor variables represent measured characteristics as opposed to independently manipulated treatments.
History threat
– treatment groups may differ over time because an event happened to the units assigned to one treatment but not the other
– Events, in addition to an assigned condition, to which participants are exposed between repeated measurements that could influence performance.
Maturation threat
– treatment groups may grow apart over time because they spontaneously mature at different rates
– Observed changes as a result of ongoing, naturally occurring processes rather than condition effects.
Threats to Internal validity: could plausibly have caused an observed relationship even if the treatment had never taken place (Campbell 91)
Instrumentation Threat
– E.g. Changed definitions of ‘innovation’
Attrition (mortality) Threat
– Differential drop out across conditions at one or more time points that may be responsible for differences.
– E.g. Innovative performances of new firms in year 1 and year 5
Regression threat
– "regression artifact" or "regression to the mean": a statistical phenomenon
– if a variable is extreme on its first measurement, it will tend to be closer to the average on a second measurement, and (a fact that may superficially seem paradoxical) if it is extreme on a second measurement, it will tend to have been closer to the average on the first measurement
Threats to Internal validity (cont’d)
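Regression to the mean is easy to reproduce in a simulation. The sketch below is an illustration under assumptions: each unit (say, a firm) has a stable component plus independent noise in each of two measurements, and nothing changes between them. The names and distributions are invented.

```python
import random
import statistics

rng = random.Random(42)
n = 10000

# Hypothetical units: stable component plus independent measurement noise.
stable = [rng.gauss(0, 1) for _ in range(n)]
first = [s + rng.gauss(0, 1) for s in stable]
second = [s + rng.gauss(0, 1) for s in stable]

def top_decile(values):
    """Indices of the units in the top 10% of the given measurement."""
    cutoff = sorted(values)[int(0.9 * len(values))]
    return [i for i, v in enumerate(values) if v >= cutoff]

# Select units that were extreme on the FIRST measurement.
top_first = top_decile(first)
mean_first_of_top_first = statistics.fmean([first[i] for i in top_first])
mean_second_of_top_first = statistics.fmean([second[i] for i in top_first])

# The seemingly paradoxical reverse: select on the SECOND measurement.
top_second = top_decile(second)
mean_second_of_top_second = statistics.fmean([second[i] for i in top_second])
mean_first_of_top_second = statistics.fmean([first[i] for i in top_second])

print(f"selected on first:  first = {mean_first_of_top_first:.2f}, "
      f"second = {mean_second_of_top_first:.2f}")
print(f"selected on second: second = {mean_second_of_top_second:.2f}, "
      f"first = {mean_first_of_top_second:.2f}")
```

In both directions the extreme group looks less extreme on the other measurement, even though no treatment was applied. A program evaluated only on units selected for extreme scores could mistake this drift for a treatment effect.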
Best ruled out through random assignment
Or control group design
– In this scenario, you would have two groups: one receives your program and the other one doesn't. In fact, the only difference between these groups should be the program. If that's true, then the control group would experience all the same history and maturation threats, would have the same testing and instrumentation issues, and would have similar rates of mortality and regression to the mean. In other words, a good control group is one of the most effective ways to rule out the single-group threats to internal validity. Of course, when you add a control group, you no longer have a single group design.
– Cf. Jaffe’s matching design
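Why random assignment rules out the selection threat can be sketched with a simulation. Everything here is assumed for illustration: an unobserved trait called motivation raises the outcome, the program has a true effect of 2.0, and the naive estimate is the difference in mean outcomes between treated and control units.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed true program effect

def naive_estimate(assign, n=20000, seed=7):
    """Difference in mean outcome, treated minus control."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        motivation = rng.gauss(0, 1)            # unobserved trait
        in_program = assign(motivation, rng)    # True or False
        outcome = 3.0 * motivation + rng.gauss(0, 1)
        if in_program:
            outcome += TRUE_EFFECT
            treated.append(outcome)
        else:
            control.append(outcome)
    return statistics.fmean(treated) - statistics.fmean(control)

# Self-selection: the more motivated units choose the program.
self_selected = naive_estimate(lambda motivation, rng: motivation > 0)

# Random assignment: a coin flip decides, independent of motivation.
randomized = naive_estimate(lambda motivation, rng: rng.random() < 0.5)

print(f"self-selected groups: estimated effect = {self_selected:.2f}")
print(f"randomized groups:    estimated effect = {randomized:.2f}")
```

With self-selection the groups already differ in motivation, so the naive comparison overstates the effect. With random assignment the groups differ only by chance, and the estimate lands near the assumed true effect.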
Given there is a valid causal relationship, is the interpretation of the constructs involved in that relationship correct?
the degree to which inferences can legitimately be made from the operationalizations in your study to the theoretical constructs on which those operationalizations were based
– how accurately our talk matches what we actually did
– generalizing from your program or measures to the concept of your program or measures
– an assessment of how well you translated your ideas or theories into actual programs or measures
– E.g. innovation (concept) measured by patents (measure)
Construct Validity
Inadequate Preoperational Explication of Constructs
– you didn't do a good enough job of defining (operationally) what you mean by the construct
– Failure to adequately explicate a construct may lead to incorrect inferences about the relationship between the operation and construct.
Mono-Operation Bias
– Mono-operation bias pertains to the independent variable, cause, program or treatment in your study
– If you only use a single version of a program in a single place at a single point in time, you may not be capturing the full breadth of the concept of the program
Mono-Method Bias
– When all operationalizations use the same method (e.g., self-report), that method is part of the construct actually studied
Threats to construct validity
Hypothesis Guessing
– Participants are likely to base their behavior on what they guess about the study, not just on your treatment.
Evaluation Apprehension
– Many people are anxious about being evaluated.
– For example women taking a math test may not perform to their full potential because of concerns regarding women’s stereotyped difficulties with math. In this situation, evaluation apprehension is called stereotype threat
Experimenter Expectancies
– Sometimes the researcher can communicate what the desired outcome for a study might be (and participant desire to "look good" leads them to react that way).
– For instance, the researcher might look pleased when participants give a desired answer. If this is what causes the response, it would be wrong to label the response as a treatment effect.
The "Social" Threats to Construct Validity
“was the original statistical inference correct?”
The validity of inferences about the correlation (covariation) between treatment and outcome.
– The power of the analysis focuses on the sensitivity or ability to detect a relationship
– Did the investigators arrive at the correct conclusion regarding whether or not a relationship between the variables exists or the extent of the relationship?
– Not concerned with the causal relationship between variables, but whether or not there is any relationship, either causal or not
Closely tied to Internal Validity
– Statistical conclusion validity (SCV) asks if the two variables are correlated. Internal validity (IV) asks if that correlation is due to causation
Type I Error
– Conclude that a relationship exists between two variables, when in fact there is no relationship.
Type II Error
– Conclude that there is no relationship when one exists.
Statistical Conclusion Validity
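Type I and Type II error rates can be estimated by repeating a simulated study many times. The sketch below is illustrative and rests on assumptions: two groups with a known standard deviation of 1, a two-sided z-test at the 5% level, and invented effect and sample sizes.

```python
import random
from statistics import NormalDist, fmean

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value

def rejection_rate(effect, n, trials=2000, seed=3):
    """Share of simulated studies that declare a relationship."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        control = [rng.gauss(0, 1) for _ in range(n)]
        treated = [rng.gauss(effect, 1) for _ in range(n)]
        # z statistic for a difference in means with known sd = 1
        z = (fmean(treated) - fmean(control)) / (2 / n) ** 0.5
        if abs(z) > Z_CRIT:
            rejections += 1
    return rejections / trials

# No true relationship: every rejection is a Type I error.
type_i_rate = rejection_rate(effect=0.0, n=20)

# A true relationship exists: every non-rejection is a Type II error.
power_small_sample = rejection_rate(effect=0.3, n=20)
power_large_sample = rejection_rate(effect=0.3, n=200)

print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Type II error rate, n = 20:  {1 - power_small_sample:.3f}")
print(f"Type II error rate, n = 200: {1 - power_large_sample:.3f}")
```

The Type I rate stays near the chosen alpha regardless of sample size, while the Type II rate depends heavily on it, which is why low statistical power is a threat to statistical conclusion validity.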
Low Statistical Power (very common)
Violated Assumptions of Statistical Tests (especially problems of nesting: students nested in classes)
Unreliability of Measures
Restriction of Range
Unreliability of Treatment Implementation
Extraneous Variance in the Experimental Setting
Heterogeneity of Units
Inaccurate Effect Size Estimation
Threats to Statistical Conclusion Validity
Threats leading to overly conservative bias | Remedies
Small sample size | Increase sample size
Increased error from irrelevant, unreliable, or invalid measures | Improve measurements
High variability due to participant diversity | Control individual differences: control for covariates; use a design that blocks, matches, or uses repeated measures
Violation of statistical assumptions | Transform data or use different analysis methods
Threats leading to overly liberal bias | Remedies
Repeated statistical test | Use adjusted test procedures
Violation of statistical assumptions | Transform data or use different analysis methods
Biased estimates of effects | Use corrected values to estimate effects in population
Threats to statistical conclusion validity and their remedies
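The "repeated statistical test" threat comes down to simple arithmetic. The slide does not name a specific adjusted procedure; the sketch below uses the Bonferroni correction as one common choice and assumes, for simplicity, that the tests are independent. The number of tests (20) is invented.

```python
ALPHA = 0.05  # per-test significance level

def familywise_error(alpha_per_test, m):
    """Chance of at least one false positive in m independent tests
    when no true relationship exists."""
    return 1 - (1 - alpha_per_test) ** m

m = 20  # hypothetical number of tests run on the same data

unadjusted = familywise_error(ALPHA, m)       # about 0.64
bonferroni = familywise_error(ALPHA / m, m)   # about 0.049

print(f"{m} tests at alpha = {ALPHA}: P(at least one false positive) = {unadjusted:.3f}")
print(f"{m} tests at alpha = {ALPHA / m}: P(at least one false positive) = {bonferroni:.3f}")
```

Testing each hypothesis at alpha divided by the number of tests keeps the overall false-positive risk at or below alpha, at the cost of lower power for each individual test.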
“Can the finding be generalized across populations, settings, or time?”
The validity of inferences about whether the cause-effect relationship holds over variation in persons, settings, treatment variables, and measurement variables.
Generalization and applicability of your research to similar problems/settings
External Validity
“validity is subjective rather than objective” (Cronbach 1982)
– Validity is a property of a conclusion to a critical audience
– Validity is assimilated to credibility
Structure and Elements of Empirical Research
The logic of the research process
Source: De Vaus, D. (2001). Research design in social research: SAGE Publications Ltd.
Articulate research problem
Select research design
– Proper empirical setting
– Measures
– Analytic methods
Collect data
Analyze data
Infer the results
Research process
Cohen, W. M., Nelson, R. R., & Walsh, J. P. (2002). Links and Impacts: The Influence of Public Research on Industrial R&D. Management Science, 48(1), 1-23. (1,233 citations on Google Scholar as of Dec. 6, 2012)
Research questions
– “to characterize the extent and nature of the contribution of public research to industrial R&D”
1. “how public research tends to be used in industrial R&D labs”
2. “the overall importance of public research, as well as that of specific fields of basic and applied research and engineering”
3. “the importance of the different pathways through which public research may impact industrial R&D, including publications, informal interactions, consulting, and the hiring of university graduates”
4. “what roles different kinds of firms (e.g., large versus small and start-ups versus established firms) play in bridging public research and industrial R&D.”
Structure of a typical empirical paper
Data
– a survey of R&D managers administered in 1994
The population sampled are all the R&D units located in the U.S. conducting R&D in manufacturing industries as part of a manufacturing firm
The sample was randomly drawn from the eligible labs listed in Bowker's Directory of American Research and Technology (1994) or belonging to firms listed in Standard and Poor's COMPUSTAT, stratified by three-digit SIC industry
– We sampled 3,240 labs, and received 1,478 responses, yielding an unadjusted response rate of 46% and an adjusted response rate of 54%
– For the analysis in this paper, we restricted our sample to firms whose focus industry was in the manufacturing sector and were not foreign owned, yielding a sample of 1,267 cases
– Sample characteristics
Cohen, Nelson, & Walsh (2002)
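The reported response rates can be checked with a few lines of arithmetic. The slide does not say how the adjusted rate was computed. The sketch assumes, as is common practice, that the adjustment removes ineligible units from the denominator, so the "implied eligible" figure below is a back-calculation from the rounded 54% and not a number from the paper.

```python
sampled = 3240      # labs sampled (from the slide)
responses = 1478    # responses received (from the slide)

unadjusted_rate = responses / sampled
print(f"unadjusted response rate: {unadjusted_rate:.1%}")

# Assumption: adjusted rate = responses / eligible labs.
reported_adjusted_rate = 0.54   # rounded figure from the slide
implied_eligible = responses / reported_adjusted_rate
print(f"implied number of eligible labs: roughly {implied_eligible:.0f}")
```

The unadjusted figure reproduces the 46% on the slide. The implied eligible count is only approximate because it is derived from a rounded percentage.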
Sample characteristics
Cohen, Nelson, & Walsh (2002)
Analysis - RQ1
– Public research outscores, however, consultants/contract R&D as a source of knowledge for both suggesting new R&D projects (p < 0.0001) and contributing to project completion (n.s.)… Although rivals constitute a more important source for project ideas than public research institutions (41% versus 32%, p < 0.0001), public research institutions are markedly more important than rivals as a source of knowledge contributing to project completion: 36% for public research versus 12% for competitors (p < 0.0001), suggesting that the impact of public research on firms' R&D is at least comparable to that of rivals' R&D
Cohen, Nelson, & Walsh (2002)
Articulate research problem
– In the form of research question(s) and/or hypotheses
– Determine the appropriate type of research design
– To formalize the research topic into an operational guide for the study, connecting the conceptual framework to the methods
Focused and testable
– E.g. what is the best
– Also clarifies the specific type of data to be collected
Secondary data
– Financial data, indicators, patent documents, Thomson Web of Science, etc.
Self-report measures
– Survey & questionnaire
   Advantages
   – Sample large populations (cheap on materials & effort)
   – Efficiently ask a lot of questions
   Disadvantages
   – Self-report is fallible
   – Response biases are unavoidable
– Interviews
Data
Selecting respondents from population of concern
– Specify population
– Sampling framework
– Sampling bias
(Simple) random sampling
– randomly selected individuals. Each individual in the population has the same probability of being in the sample. All possible samples of size n have the same chance of being drawn
Systematic selection
Stratified sampling
Convenience sampling
Voluntary Response Sampling
Snowball sampling
– especially useful when you do not know the population very well.
– E.g. Name three experts in nanotechnology
Sampling
Survey & questionnaire
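Three of the probability-based designs listed above can be sketched with the standard library. The sampling frame here is hypothetical (1,000 firms in three invented industries), and the stratified draw uses proportional allocation, which is one option among several.

```python
import random
from collections import Counter, defaultdict

rng = random.Random(11)

# Hypothetical sampling frame: 1000 firms, each in one of three industries.
frame = [{"id": i, "industry": ("A", "B", "C")[i % 3]} for i in range(1000)]
n = 90  # desired sample size

# (Simple) random sampling: every subset of size n is equally likely.
simple = rng.sample(frame, n)

# Systematic selection: random start, then every k-th unit in the frame.
step = len(frame) // n
start = rng.randrange(step)
systematic = frame[start::step][:n]

# Stratified sampling: a simple random sample within each industry,
# with stratum sample sizes proportional to stratum size.
strata = defaultdict(list)
for unit in frame:
    strata[unit["industry"]].append(unit)

stratified = []
for units in strata.values():
    share = round(n * len(units) / len(frame))
    stratified.extend(rng.sample(units, share))

print(Counter(u["industry"] for u in simple))      # varies by chance
print(Counter(u["industry"] for u in stratified))  # fixed by design
```

In the simple random sample the industry mix varies from draw to draw, while stratification fixes it by design. Convenience, voluntary response and snowball sampling are not shown because they do not rest on known selection probabilities.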
Convenience sampling: Just ask whoever is around.
– Example: “Man on the street” survey (cheap, convenient, often quite opinionated or emotional → now very popular with TV “journalism”)
– Which men, and on which street?
– Ask about gun control or legalizing marijuana “on the street” in Berkeley, CA and in some small town in Idaho and you would probably get totally different answers.
– Even within an area, answers would probably differ if you did the survey outside a high school or a country-western bar.
– Bias: Opinions limited to individuals present
Voluntary Response Sampling:
– Individuals choose to be involved. These samples are very susceptible to being biased because different people are motivated to respond or not. They are often called “public opinion polls” and are not considered valid or scientific.
– Bias: Sample design systematically favors a particular outcome.
Bad sampling
Sampling bias
– are respondents representative of population of interest? How were they selected?
– do all persons in the population have an equal chance of getting selected?
Non-response & self selection bias
– People who feel they have something to hide or who don’t like their privacy being invaded probably won’t answer. Yet they are part of the population.
Response bias
– Social desirability: Fancy term for lying when you think you should not tell the truth. Like if your family doctor asks: “How much do you drink?” Or a survey of female students asking: “How many men do you date per week?”
– Recency effects: People also simply forget and often give erroneous answers to questions about the past.
Wording (or framing) effects:
– Questions worded like “Do you agree that it is awful that…” are prompting you to give a particular response.
General Survey Biases
The techniques of inferential statistics allow us to draw inferences or conclusions about a population from a sample.
– Your estimate of the population is only as good as your sampling design. Work hard to eliminate biases.
– Your sample is only an estimate, and if you randomly sampled again, you would probably get a somewhat different result.
– The bigger the sample the better: larger random samples give more precise estimates, but they do not remove sampling bias.
inference
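The point that a second random sample gives a somewhat different result, and that bigger samples vary less, can be shown by drawing many samples from one population. The population below is simulated and all numbers are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(5)

# Hypothetical population of 50,000 values (mean about 100, sd about 15).
population = [rng.gauss(100, 15) for _ in range(50000)]

def spread_of_sample_means(n, draws=300):
    """Standard deviation of the sample mean across repeated random samples."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

spread_small = spread_of_sample_means(25)    # small samples
spread_large = spread_of_sample_means(400)   # samples 16 times larger

print(f"n = 25:  sample means vary with sd = {spread_small:.2f}")
print(f"n = 400: sample means vary with sd = {spread_large:.2f}")
```

With 16 times as many observations the sample mean varies roughly a quarter as much, in line with the square-root rule for the standard error. A larger sample shrinks random error only; a biased sampling design stays biased at any size.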
[Figure: a Sample drawn from the Population, with inference running from the sample to the population]
interval / quantitative / scale
– Something that can be counted or measured for each individual on a scale of equal units. Can be added, subtracted, averaged, etc., across individuals in the population.
– Example: How tall you are, your age, your blood cholesterol level, the number of credit cards you own.
categorical
– Something that falls into one of several categories. What can be counted is the count or proportion of individuals in each category.
– Nominal: no inherent order
– Ordinal: ordered but cannot measure the differences in meaningful units
– Dichotomous or dummy: only two values (e.g. yes or no, male and female, promoted and not promoted)
– Example: Your blood type (A, B, AB, O), your hair color, your ethnicity, whether you paid income tax last tax year or not.
Types of variables
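The variable types determine which summaries make sense. A minimal sketch with invented records: a quantitative variable can be averaged, a nominal variable can only be counted per category, and a dichotomous variable can be coded as a 0/1 dummy whose mean is a proportion.

```python
import statistics
from collections import Counter

# Hypothetical records, invented for illustration.
people = [
    {"height_cm": 172.0, "blood_type": "A", "paid_tax": "yes"},
    {"height_cm": 165.5, "blood_type": "O", "paid_tax": "no"},
    {"height_cm": 180.0, "blood_type": "A", "paid_tax": "yes"},
    {"height_cm": 158.5, "blood_type": "AB", "paid_tax": "yes"},
]

# Quantitative: arithmetic on the values is meaningful.
mean_height = statistics.fmean(p["height_cm"] for p in people)

# Nominal: only counts and proportions per category are meaningful.
blood_counts = Counter(p["blood_type"] for p in people)
blood_shares = {k: v / len(people) for k, v in blood_counts.items()}

# Dichotomous: code as a 0/1 dummy; its mean is the share of "yes".
paid_tax_dummy = [1 if p["paid_tax"] == "yes" else 0 for p in people]
share_paid = statistics.fmean(paid_tax_dummy)

print(mean_height)    # 169.0
print(blood_shares)   # {'A': 0.5, 'O': 0.25, 'AB': 0.25}
print(share_paid)     # 0.75
```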
Session 2:
– Correlation
– Statistical Inference and Hypothesis Testing
– t-Test
– Confidence Interval
– Chi-square Statistic
Session 3:
– Simple Regression Model
Next session