Chapter 3. Empirical Research and the Logic(s) of Inference
polsci.org/pluemper/esr_03.pdf

Scientific Goals

In principle, scientists are ultimately interested in formulating valid theories: theories that simplify reality, make it understandable, and can inform behaviour and action.

Many scientific contributions do not directly fulfil this ultimate goal but still facilitate scientific progress:
- concept development
- measurement
- empirical analyses
- methodological development
- theory tests

Uncertainty

In all the above dimensions, science deals with uncertainty.
- Theories are not ‘correct’ with certainty.
- Empirical models are likely to be misspecified.
- Measurements come with unknown errors.
- Methods do not work perfectly.

How do scientists reduce these uncertainties and make inferences with some certainty? Before we deal with analyses and research designs, we take a closer look at how scientists make inferences.

Wikipedia on Inferences

Inferences are steps in reasoning, moving from premises to logical consequences; etymologically, the word infer means to "carry forward". Inference is traditionally divided into deduction and induction, a distinction that in Europe dates at least to Aristotle (300s BCE).

Deduction is inference deriving logical conclusions from premises known or assumed to be true, with the laws of valid inference being studied in logic. Induction is inference from particular premises to a universal conclusion. A third type of inference is sometimes distinguished, notably by Charles Sanders Peirce, who separated abduction from induction; abduction is inference to the best explanation.

Various fields study how inference is done in practice. Human inference (i.e. how humans draw conclusions) is traditionally studied within the field of cognitive psychology; artificial intelligence researchers develop automated inference systems to emulate human inference.

Statistical inference uses mathematics to draw conclusions in the presence of uncertainty. This generalizes deterministic reasoning, with the absence of uncertainty as a special case. Statistical inference uses quantitative or qualitative (categorical) data which may be subject to random variations.

Deduction and Induction

Deductive reasoning ("top-down logic") contrasts with inductive reasoning ("bottom-up logic") in the following way: in deductive reasoning, a conclusion is reached reductively by applying general rules which hold over the entirety of a closed domain of discourse, narrowing the range under consideration until only the conclusion(s) is left.

In inductive reasoning, the conclusion is reached by generalizing or extrapolating from specific cases to general rules, i.e., there is epistemic uncertainty.

However, the inductive reasoning mentioned here is not the same as induction used in mathematical proofs: mathematical induction is actually a form of deductive reasoning.

Deduction

Deduction starts with theory development, continues with the consistent derivation of predictions (hypotheses) from this theory, and ends with the empirical testing of the validity of the theory. At the end of this process, the theory may be, but need not be, changed.

Problem: Since empirical evidence tends to be consistent with more than one theory, support for a theory is usually not perfectly conclusive.

The Logic of Deduction

Deductive arguments are evaluated in terms of their validity and soundness.

An argument is “valid” if it is impossible for its premises to be true while its conclusion is false. In other words, the conclusion must be true if the premises are true. An argument can be “valid” even if one or more of its premises are false.

An argument is “sound” if it is valid and the premises are true.

It is possible to have a deductive argument that is logically valid but is not sound. Fallacious arguments (fallacies) often take that form.

Two Examples

The following is an example of an argument that is “valid”, but not “sound”:
- Everyone who eats carrots is a quarterback.
- John eats carrots.
- Therefore, John is a quarterback.
Logically valid, but not sound: the first premise is false.

Note: the next example is a borderline case. Since causal mechanisms in social science theories are usually probabilistic, predictions are also probabilistic.
- A bullet in the head is likely to cause death.
- John has been shot in the head.
- Therefore, he is likely to be dead.
Logically correct, valid and ‘sound’, depending on our understanding of ‘likely’. But that does not mean that John is dead. And if he is dead, he need not have died of his bullet wound. He may have died of cancer.

Induction

Induction begins with the observation of a phenomenon. The observation is interpreted, and a theory is then developed which, if valid, would explain the phenomenon.

Problem: Induction is not deductively valid. That is, the interpretation may be wrong. Inductive results cannot be generalized from the sample to a broader population (unless the analysed sample was drawn at random).

What do we mean by ‘induction is not deductively valid’?

Suppose someone shows us a coin and tests whether the coin is fair or two-headed. They flip the coin ten times, and ten times it comes up heads. At this point, there is a strong reason to believe it is two-headed. After all, the chance of a fair coin producing ten heads in a row is 1/1024, or about 0.00098: less than one in one thousand. Then, after 100 flips, every toss has come up heads. Now there is “virtual” certainty that the coin is two-headed. Still, one can neither logically nor empirically rule out that the next toss will produce tails. No matter how many times in a row it comes up heads, this remains the case. If one programmed a machine to flip a fair coin over and over continuously, at some point the result would be a string of 100 heads. In the fullness of time, all combinations will appear.

Inductive reasoning is a form of argument that, in contrast to deductive reasoning, allows for the possibility that a conclusion can be false even if all of the premises are true. Instead of being valid or invalid, inductive arguments are either strong or weak, according to how probable it is that the conclusion is true. We may call an inductive argument plausible, probable, reasonable, justified or strong, but never certain or necessary. Logic affords no bridge from the probable to the certain.
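The arithmetic of the coin example can be checked with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and the 50:50 prior belief that the coin is two-headed is an assumption that does not appear in the slides. The point it illustrates is that the probability of a two-headed coin approaches 1 as heads accumulate but never reaches it.

```python
from fractions import Fraction


def prob_all_heads(n_flips: int) -> Fraction:
    """Probability that a FAIR coin shows heads on every one of n_flips tosses."""
    return Fraction(1, 2) ** n_flips


def posterior_two_headed(n_heads: int, prior: Fraction = Fraction(1, 2)) -> Fraction:
    """Bayesian posterior that the coin is two-headed after n_heads heads in a row.

    `prior` is the assumed probability (before any flip) that the coin is
    two-headed; the default of 1/2 is an illustrative assumption.
    """
    likelihood_two_headed = Fraction(1)          # a two-headed coin always shows heads
    likelihood_fair = prob_all_heads(n_heads)    # a fair coin does so with (1/2)^n
    numerator = prior * likelihood_two_headed
    return numerator / (numerator + (1 - prior) * likelihood_fair)


if __name__ == "__main__":
    for n in (10, 100):
        print(n, float(prob_all_heads(n)), float(posterior_two_headed(n)))
```

With exact fractions, the posterior after 100 heads is still strictly smaller than 1, which is the sense in which the inductive conclusion is strong but never certain.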

The Scientific Method

Path starting from observation:
1. observation of real world phenomena
2. identification of a puzzle or a question to which the answer appears to be unknown
3. formulation of an ad hoc explanation
4. identify a case or a set of cases to which the explanation applies
5. collect data to explore the phenomenon
6. generalize findings in respect to
   - causal mechanism
   - effect strengths
   - population
7. results in a theory that explains the selected cases

Path starting from deduction:
1. logical deduction of predictions from assumptions
2. formulate a potential causal mechanism
3. developing predictions into hypotheses
4. identification of the population of cases to which the ‘theory’ applies
5. develop a model that explains the variation of outcomes in the population of cases
6. collect data that matches the model
7. test the prediction of the theory embedded in a model using a random draw of cases from the population
8. generalize sample results to population
9. results in a tested theory, verified or falsified for the chosen empirical model and the sample
10. effects are average treatment effects for the sample

Types of Inferences

- Descriptive Inference
- Statistical Inferences
- Causal Inference

1. Descriptive Inferences

Descriptive inference is concerned with the historical accuracy of scientific information.

Example: The vote share of the social democratic party declines.

Descriptive inference = using observations about the world to learn about other unobserved facts.

Example: Vote Intention

Somehow, we believe (and are led to believe) that this is the distribution of vote intentions among UK voters, but YouGov has interviewed at most a few thousand people. And, believe it or not, the raw results YouGov obtained were probably very different from the published figures. What is going on?

A Slightly More Sensitive Example

The average global temperature is not the average of measured temperatures. NASA (or whoever else) neither uses the measured temperatures directly nor simply averages the adjusted versions of the measured temperatures.

What is going on in both cases?

How would we make predictions of Austria’s electoral outcome based on the distribution of YOUR party preferences?

2. Statistical Inferences

Inferential statistical analysis uses sample properties to infer properties of a population. It is assumed that the sample is a subset of a larger population.
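As a minimal sketch of inferring a population property from a sample property, the following computes a normal-approximation (Wald) confidence interval for a population proportion. The function name and the numbers (800 of 2,000 respondents) are hypothetical and chosen only for illustration; the interval is only meaningful if the sample can be treated as a simple random draw from the population.

```python
import math


def proportion_interval(successes: int, n: int, z: float = 1.96):
    """Normal-approximation (Wald) interval for a population proportion.

    successes: number of sampled cases with the property of interest
    n:         sample size
    z:         standard-normal quantile (1.96 corresponds to roughly 95%)
    """
    p_hat = successes / n                          # sample proportion
    std_err = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
    return p_hat - z * std_err, p_hat + z * std_err


if __name__ == "__main__":
    # Hypothetical poll: 800 of 2000 respondents name a given party.
    low, high = proportion_interval(800, 2000)
    print(f"estimated share between {low:.3f} and {high:.3f}")
```

In this illustrative case the interval is about two percentage points on either side of the sample share of 0.40, which shows how a few thousand interviews can support a statement about millions of voters.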

Example: Per Capita Income and Life Expectancy: Country Data

3. Causal Inferences

The social sciences have moved increasingly toward a quantitative framework for assessing causality. Much of this has been described as a means of providing greater rigor to social science methodology.

Political science was significantly influenced by the publication of Designing Social Inquiry, by Gary King, Robert Keohane, and Sidney Verba, in 1994. King, Keohane, and Verba (often abbreviated as KKV) recommended that researchers applying both quantitative and qualitative methods adopt the language of statistical inference to be clearer about their subjects of interest and units of analysis.

Proponents of quantitative methods have also increasingly adopted the potential outcomes framework, developed by Donald Rubin, as a standard for inferring causality.

Causal inference uses research designs to infer the existence of a causal effect from data analysis.

Example: Does Smoking Cause Cancer?

Or are smoking and cancer co-determined by a genetic variable, as the industry has suggested?

Rosenbaum asks by how much an unobserved lung cancer propensity factor of heavy smokers has to exceed that of non-smoking individuals to render the causal effect of smoking statistically insignificant.

Rosenbaum (2001: 114) concludes: “To attribute the higher rate of death from lung cancer to an unobserved covariate u rather than to an effect of smoking, that unobserved covariate would need to produce a sixfold increase in the odds of smoking, and it would need to be a near perfect predictor of lung cancer.”

Strategies for Inferences

Statistical and Observational Inference
- Sampling Strategies: Randomization, Stratification, Convenience

Statistical Inference
- Significance
- Effect Size
- Robustness

Causal Inference
- Randomization of Treatment
- Balancing of Treatment and Control Group

Random Sampling

Statistical inference makes propositions about a population, using data drawn from the population with some form of sampling. Given a hypothesis about a population, for which we wish to draw inferences, statistical inference consists of (first) selecting a statistical model of the process that generates the data and (second) deducing propositions from the model.

If a random sample is a perfect random draw from the population and if the sample is representative (i.e. large enough), then the sample properties are very similar to the population properties.

Criteria:
- perfect random draws from the population are always difficult and often impossible
- all cases included in the population need to have an ex ante identical probability of being drawn into the sample
- the necessary number of observations in the sample depends on the scarcity of the scarcest relevant factor
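One way to read the last criterion is sketched below: the rarer a relevant group is in the population, the larger a simple random sample must be before it is likely to contain any member of that group at all. This is an illustrative interpretation, not a rule stated in the slides; the function names, the 1% share, and the 95% threshold are assumptions, and draws are treated as independent (a large population).

```python
import math


def prob_at_least_one(p_rare: float, n: int) -> float:
    """Probability that a simple random sample of size n contains at least
    one case from a group with population share p_rare."""
    return 1 - (1 - p_rare) ** n


def min_sample_size(p_rare: float, confidence: float = 0.95) -> int:
    """Smallest n for which the sample contains at least one rare case
    with probability `confidence` (solves 1 - (1 - p)^n >= confidence)."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_rare))


if __name__ == "__main__":
    # Hypothetical rare factor present in 1% of the population.
    n = min_sample_size(0.01)
    print(n, round(prob_at_least_one(0.01, n), 4))
```

Containing a single rare case is a very low bar. Estimating anything about the rare group requires many such cases, so the sample size needed in practice grows well beyond this minimum.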

Alternatives to Random Sampling: Stratification

In statistics, stratified sampling is a method of sampling from a population which can be partitioned into subpopulations. In statistical surveys, when subpopulations within an overall population vary, it could be advantageous to sample each subpopulation (stratum) independently.

Stratification is the process of dividing members of the population into homogeneous subgroups before sampling. The strata should define a partition of the population. That is, they should be collectively exhaustive and mutually exclusive: every element in the population must be assigned to one and only one stratum. Then simple random sampling or systematic sampling is applied within each stratum.

The objective is to improve the precision of the sample by reducing sampling error. Stratification can produce a weighted mean that has less variability than the arithmetic mean of a simple random sample of the population.
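The claim about reduced variability can be illustrated with a small simulation. This is a sketch under stated assumptions, not a general proof: the population is synthetic (two hypothetical strata with very different means), allocation is proportional to stratum size, and all names and parameter values are invented for the example.

```python
import random
import statistics


def compare_estimators(n_reps: int = 500, n_sample: int = 100, seed: int = 0):
    """Return (variance of SRS mean, variance of stratified weighted mean)
    across n_reps repeated samples from a synthetic two-stratum population."""
    rng = random.Random(seed)
    # Synthetic population: strata are internally homogeneous but differ strongly.
    stratum_a = [rng.gauss(20, 5) for _ in range(7000)]
    stratum_b = [rng.gauss(60, 5) for _ in range(3000)]
    population = stratum_a + stratum_b

    weight_a = len(stratum_a) / len(population)   # population share of stratum A
    weight_b = 1 - weight_a
    n_a = round(n_sample * weight_a)              # proportional allocation
    n_b = n_sample - n_a

    srs_means, stratified_means = [], []
    for _ in range(n_reps):
        # Simple random sample from the whole population.
        srs_means.append(statistics.fmean(rng.sample(population, n_sample)))
        # Independent simple random samples within each stratum, then weighting.
        mean_a = statistics.fmean(rng.sample(stratum_a, n_a))
        mean_b = statistics.fmean(rng.sample(stratum_b, n_b))
        stratified_means.append(weight_a * mean_a + weight_b * mean_b)

    return statistics.pvariance(srs_means), statistics.pvariance(stratified_means)


if __name__ == "__main__":
    srs_var, strat_var = compare_estimators()
    print(f"SRS variance: {srs_var:.3f}, stratified variance: {strat_var:.3f}")
```

The gain comes from removing the between-stratum component of sampling variance, so it shrinks when strata are not much more homogeneous than the population as a whole.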

Stratified samples are not perfect

Wiki: Stratified sampling is not useful when the population cannot be exhaustively partitioned into disjoint subgroups. It would be a misapplication of the technique to make subgroups' sample sizes proportional to the amount of data available from the subgroups, rather than scaling sample sizes to subgroup sizes.

Data representing each subgroup are taken to be of equal importance if suspected variation among them warrants stratified sampling. If subgroup variances differ significantly and the data need to be stratified by variance, it is not possible to simultaneously make each subgroup sample size proportional to subgroup size within the total population.

Randomized Treatment

A randomized controlled trial (or randomized control trial; RCT) is a type of scientific experiment that aims to reduce certain sources of bias when testing the effectiveness of new treatments. This is accomplished by randomly allocating subjects to two or more groups, treating them differently, and then comparing them with respect to a measured response.

One group (the experimental group) has the intervention being assessed, while the other (usually called the control group) has an alternative condition, such as a placebo or no intervention. The groups are followed under conditions of the trial design to see how effective the experimental intervention was. Treatment efficacy is assessed in comparison to the control. There may be more than one treatment group or more than one control group.

THUS: if the number of participants is large enough, randomization makes it very likely that treatment and control group are highly similar.
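The role of the number of participants can be sketched with a simulation: a fixed pool of subjects with one background covariate is repeatedly split at random into treatment and control, and the average gap in covariate means between the groups is recorded. The covariate (mean 50, standard deviation 10), the pool sizes, and the function name are hypothetical choices for illustration only.

```python
import random
import statistics


def mean_imbalance(n_subjects: int, n_reps: int = 200, seed: int = 1) -> float:
    """Average absolute difference in covariate means between randomly
    assigned treatment and control groups, over n_reps random assignments."""
    rng = random.Random(seed)
    # Fixed pool of subjects with one background covariate (e.g. age).
    covariate = [rng.gauss(50, 10) for _ in range(n_subjects)]
    half = n_subjects // 2

    gaps = []
    for _ in range(n_reps):
        rng.shuffle(covariate)                    # random assignment to groups
        treated, control = covariate[:half], covariate[half:]
        gaps.append(abs(statistics.fmean(treated) - statistics.fmean(control)))
    return statistics.fmean(gaps)


if __name__ == "__main__":
    for n in (20, 200, 2000):
        print(n, round(mean_imbalance(n), 3))
```

The gap shrinks as the pool grows but is never exactly zero in a finite sample, so balance holds in expectation rather than in every single assignment. The same logic applies to covariates that were never measured.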

Validity of Inferences: The Goal of Empirical Research Designs

Inferential validity refers to the extent to which research designs guarantee that inferences are reliably close to the truth.

Note: Research designs that cannot reliably guarantee ‘correct inferences’ can still produce unbiased results. However, in the absence of optimal research designs, the validity of inferences remains uncertain.

Three Dimensions of Validity

1. Concept validity refers to the accuracy with which theoretical concepts are transferred into operational definitions and then measured.
2. Internal validity refers to the unbiasedness of the estimated or computed ‘treatment effects’ generated by research designs (e.g. randomization of treatment).
3. External validity refers to the generalizability of the results and the inferences from the sample to the population. It requires perfect randomization of the sample from a correctly defined population.

Conclusion

Social science methodology relies on
- random sampling
- stratification
- randomization of treatment
- control of confounders
to maximize the probability of valid inferences.

Methods rely on at least one of these techniques to increase the validity of inferences.

However, the best method would combine random sampling, randomization of treatment, and control of confounders (not needed if the sample is infinitely large) and therefore be conceptually, internally, and externally valid at the same time.
