
Post on 17-Jan-2018


R. G. Bias | School of Information | SZB 562BB | Phone: 512 471 7046 | rbias@ischool.utexas.edu 1

INF 397C

Introduction to Research in Library and Information Science

Fall, 2005

Day 5

5 things today

1. Y’all teach me what Dr. Rice Lively said
2. Probability
3. Work the sample problems
4. Graphs/tables/figures/charts
5. Start to look at experimental design

Probability

• Remember all those decisions we talked about last week.
• VERY little of life is certain.
• It is PROBABILISTIC. (That is, something might happen, or it might not.)

Prob. (cont’d.)

• Life’s a gamble!
• Just about every decision is based on probable outcomes.
• None of you raised your hands in Week 1 when I asked for “statistical wizards.” Yet every one of you does a pretty good job of navigating an uncertain world.
– None of you touched a hot stove (on purpose).
– All of you made it to class.

Probabilities

• Always between zero and one.
• Something with a probability of “one” will happen (e.g., death, taxes).
• Something with a probability of “zero” will not happen (e.g., my becoming a Major League Baseball player).
• Something that’s unlikely has a small, but still positive, probability (e.g., the probability of one particular other person having the same birthday as you is 1/365 = .0027, or .27%).

Just because . . .

• . . . there are two possible outcomes doesn’t mean there’s a “50/50 chance” of each happening.

• When driving to school today, I could have arrived alive, or been killed in a fiery car crash. (Two possible outcomes, as I’ve defined them.) Not equally likely.

• But the odds of a flipped coin being “heads,” . . . .

Let’s talk about socks

Prob (cont’d.)

• Probability of something happening is
– # of “successes” / # of all events
– P(one flip of a coin landing heads) = ½ = .5
– P(one die landing as a “2”) = 1/6 = .167
– P(some score in a distribution of scores is greater than the median) = ½ = .5
– P(some score in a normal distribution of scores is greater than the mean but has a z score of 1 or less) = . . . ?
– P(drawing a diamond from a complete deck of cards) = ?
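The "successes over all events" ratio can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the variable names are made up for illustration, and the normal-curve question is answered with the standard library's `statistics.NormalDist`, assuming a standard normal distribution (mean 0, standard deviation 1).

```python
from fractions import Fraction
from statistics import NormalDist

# Successes / all equally likely events.
p_heads = Fraction(1, 2)      # one flip landing heads
p_die_two = Fraction(1, 6)    # one die landing as a "2"
p_diamond = Fraction(13, 52)  # 13 diamonds in a complete 52-card deck

# Area under the standard normal curve between z = 0 (the mean) and z = 1.
standard_normal = NormalDist()
p_between_mean_and_z1 = standard_normal.cdf(1) - standard_normal.cdf(0)

print(p_heads, p_die_two, p_diamond)
print(round(p_between_mean_and_z1, 4))
```

The diamond probability reduces to 1/4, and the area between the mean and z = 1 comes out at roughly .34 of all scores.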

Probabilities – and & or

• From Runyon:
– Addition Rule: The probability of selecting a sample that contains one or more elements is the sum of the individual probabilities for each element less the joint probability.
• p(A or B) = p(A) + p(B) – p(A and B)
• When A and B are mutually exclusive, p(A and B) = 0.
– Multiplication Rule: The probability of obtaining a specific sequence of independent events is the product of the probability of each event.
• p(A and B and . . .) = p(A) x p(B) x . . .
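Both rules can be verified by brute-force counting. The sketch below is illustrative and not from Runyon: it builds a 52-card deck, checks the addition rule on "king or heart" (events that overlap in one card), and checks the multiplication rule on two independent coin flips. The helper `p` and the event names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def p(event):
    """Probability of an event when drawing one card: successes / all cards."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))


def is_king(card):
    return card[0] == "K"


def is_heart(card):
    return card[1] == "hearts"


# Addition rule: p(A or B) = p(A) + p(B) - p(A and B).
p_king_or_heart = p(lambda c: is_king(c) or is_heart(c))
by_addition_rule = p(is_king) + p(is_heart) - p(lambda c: is_king(c) and is_heart(c))

# Multiplication rule for independent events: two heads in two flips.
flips = list(product("HT", repeat=2))
p_two_heads = Fraction(sum(1 for f in flips if f == ("H", "H")), len(flips))
by_multiplication_rule = Fraction(1, 2) * Fraction(1, 2)

print(p_king_or_heart, by_addition_rule)
print(p_two_heads, by_multiplication_rule)
```

Counting directly and applying the rule give the same answer in both cases. Dropping the "less the joint probability" term would count the king of hearts twice.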

More prob.

• From Slavin:
– Addition Rule: If X and Y are mutually exclusive events, the probability of obtaining either of them is equal to the probability of X plus the probability of Y.
– Multiplication Rule: The probability of the simultaneous or successive occurrence of two events is the product of the separate probabilities of each event.

Yet more prob.

• http://www.midcoast.com.au/~turfacts/maths.html
– The product or multiplication rule. "If two chances are mutually exclusive the chances of getting both together, or one immediately after the other, is the product of their respective probabilities."
– The addition rule. "If two or more chances are mutually exclusive, the probability of making ONE OR OTHER of them is the sum of their separate probabilities."

Additional Resources

• Phil Doty, from the ISchool, has taught this class before. He has welcomed us to use his online video tutorials, available at http://www.gslis.utexas.edu/~lis397pd/fa2002/tutorials.html
– Frequency Distributions
– z scores
– Intro to the normal curve
– Area under the normal curve
– Percentile ranks, z-scores, and area under the normal curve
• Pretty good discussion of probability: http://ucsub.colorado.edu/~maybin/mtop/ms16/exp.html

Think this through.

• What are the odds (“what are the chances”) (“what is the probability”) of getting two “heads” in a row?
• Three heads in a row?
• Three flips the same (heads or tails) in a row?
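One way to think these through is to list every equally likely sequence of flips and count the ones that qualify. The following is a minimal sketch assuming a fair coin; the helper `probability` is a made-up name, not something from the course materials.

```python
from fractions import Fraction
from itertools import product


def probability(n_flips, event):
    """Share of all equally likely H/T sequences of length n_flips that satisfy event."""
    outcomes = list(product("HT", repeat=n_flips))
    hits = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(hits, len(outcomes))


two_heads = probability(2, lambda o: o == ("H", "H"))
three_heads = probability(3, lambda o: o == ("H", "H", "H"))
three_same = probability(3, lambda o: len(set(o)) == 1)  # HHH or TTT

print(two_heads, three_heads, three_same)  # 1/4 1/8 1/4
```

"Three the same" is twice as likely as "three heads" because two of the eight sequences qualify (HHH and TTT), which is the addition rule applied to two mutually exclusive outcomes.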

So then . . .

• WHY were the odds in favor of having two people in our class with the same birthday?
• Think about the problem!
• What if there were 367 people in the class?
– P(2 people with same b’day) = 1.00

Happy B’day to Us

• But we had 50.
• Probability that the first person has a birthday: 1.00.
• Prob of the second person having the same b’day: 1/365
• Prob of the third person having the same b’day as Person 1 or Person 2 is 1/365 + 1/365 minus the chances of all three of them having the same birthday.
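The usual shortcut for this problem is to compute the probability that NO two people share a birthday and subtract it from 1. The sketch below is an illustration, not the method shown in class; it assumes 365 equally likely birthdays and ignores leap days, and `p_shared_birthday` is a hypothetical helper name.

```python
def p_shared_birthday(n_people, days_in_year=365):
    """P(at least two of n_people share a birthday), assuming equally likely days."""
    p_all_different = 1.0
    for k in range(n_people):
        # Person k+1 must avoid the k birthdays already taken.
        p_all_different *= (days_in_year - k) / days_in_year
    return 1.0 - p_all_different


for n in (2, 23, 50, 367):
    print(n, round(p_shared_birthday(n), 4))
```

Under these assumptions a class of 50 has about a 97% chance of a shared birthday, the chance first passes 50% at 23 people, and with 367 people it is exactly 1.00.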

Sooooo . . .

• http://www.people.virginia.edu/~rjh9u/birthday.html

Practice Problems

1. If I have a z score of .75, what percentage of the scores have I “beaten”? ___
2. My score was one and a half standard deviations above the mean. What’s my z score? ___
3. I beat 12% of the people on a calculus test. What was my z score? ___
4. What if I beat 88%? What was my z score? ___
5. What’s the probability of flipping a coin three times and getting all tails? ___
6. What’s the probability of flipping a coin three times and getting first a head, then a tail, then a head? ___
7. What’s the probability of flipping a coin three times and getting two heads and a tail? ___
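For the z-score problems (1 through 4), a z table can be cross-checked in Python. This is a sketch that assumes the scores are normally distributed; it uses `statistics.NormalDist` from the standard library, and the variable names are invented for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Problem 1: percentage of scores below z = .75.
pct_beaten_at_z_075 = standard_normal.cdf(0.75) * 100

# Problem 2: a z score IS the number of standard deviations from the mean.
z_one_and_a_half_sd_above = 1.5

# Problems 3 and 4: go from "proportion beaten" back to z.
z_for_12_pct = standard_normal.inv_cdf(0.12)
z_for_88_pct = standard_normal.inv_cdf(0.88)

print(round(pct_beaten_at_z_075, 1))
print(round(z_for_12_pct, 3), round(z_for_88_pct, 3))
```

Problems 3 and 4 mirror each other: beating 12% and beating 88% give z scores of the same size with opposite signs, because the normal curve is symmetric around the mean.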

Graphs

• Graphs/tables/charts (when done well) do a good job of depicting all the data.

• But they cannot be manipulated mathematically.

• Plus it can be ROUGH when you have LOTS of data.

• Let’s look at your examples.

Your Charts/Graphs/Tables

Some rules . . .

• . . . for building graphs/tables/charts:
– Label axes.
– Divide up the axes evenly.
– Indicate when there’s a break in the rhythm!
– Keep the “aspect ratio” reasonable.
– Histogram, bar chart, line graph, pie chart, stacked bar chart: which when?
– Keep the user in mind.

The Scientific Method

More than anything else . . .

• . . . scientists are skeptical.
• P. 28: “Scientific skepticism is a gullible public’s defense against charlatans and others who would sell them ineffective medicines and cures, impossible schemes to get rich, and supernatural explanations for natural phenomena.”

Research Methods

S, Z, & Z, Chapters 1, 2, 3, 7, 8

Researchers are . . .
- Like detectives – gather evidence, develop a theory.
- Like judges – decide if evidence meets scientific standards.
- Like juries – decide if evidence is “beyond a reasonable doubt.”

Science . . .

• . . . is a cumulative affair. Current research builds on previous research.
• The Scientific Method:
– is empirical (acquires new knowledge via direct observation and experimentation)
– entails systematic, controlled observations.
– is unbiased, objective.
– entails operational definitions.
– is valid, reliable, testable, critical, skeptical.

CONTROL

• . . . is the essential ingredient of science, distinguishing it from nonscientific procedures.
• The scientist, the experimenter, manipulates the Independent Variable (IV – the “treatment,” with at least two levels: “experimental and control conditions”) and controls other variables.

More control

• After manipulating the IV (because the experimenter is independent – he/she decides what to do) . . .

• He/she measures the effect on the Dependent Variable (what is measured – it depends on the IV).

Key Distinction

• IV vs. Individual Differences variable
• The scientist MANIPULATES an IV, but SELECTS an Individual Differences variable (or “subject” variable).
• Can’t manipulate a subject variable.
– “Select a sample. Have half of ‘em get a divorce.”
• Consider an Individual Difference, or Subject Variable, as a TYPE of IV.

Operational Definitions

• Explains a concept solely in terms of the operations used to produce and measure it.
– Bad: “Smart people.”
– Good: “People with an IQ over 120.”
– Bad: “People with long index fingers.”
– Good: “People with index fingers at least 7.2 cm.”
– Bad: “Ugly guys.”
– Good: “Guys rated as ‘ugly’ by at least 50% of the respondents.”

Validity and Reliability

• Validity: the “truthfulness” of a measure. Are you really measuring what you claim to measure? “The validity of a measure . . . the extent that people do as well on it as they do on independent measures that are presumed to measure the same concept.”
• Reliability: a measure’s consistency.
• A measure can be reliable without being valid, but not vice versa.

Theory and Hypothesis

• Theory: a logically organized set of propositions (claims, statements, assertions) that serves to define events (concepts), describe relationships among these events, and explain their occurrence.
– Theories organize our knowledge and guide our research.
• Hypothesis: A tentative explanation.
– A scientific hypothesis is TESTABLE.

Goals of Scientific Method

• Description
– Nomothetic approach – establish broad generalizations and general laws that apply to a diverse population
– Versus idiographic approach – interested in the individual, their uniqueness (e.g., case studies)
• Prediction
– Correlational study – when scores on one variable can be used to predict scores on a second variable. (Doesn’t necessarily tell you “why.”)
• Understanding – cont’d. on next page
• Creating change
– Applied research

Understanding

• Three important conditions for making a causal inference:
– Covariation of events. (IV changes, and the DV changes.)
– A time-order relationship. (First the scientist changes the IV – then there’s a change in the DV.)
– The elimination of plausible alternative causes.

Confounding

• When two potentially effective IVs are allowed to covary simultaneously.
– Poor control!

• Remember week 1 – Men, overall, did a better job of remembering the 12 “random” letters. But the men had received a different “clue” (“Maybe they’re the months of the year.”)

• So GENDER (what type of IV? A SUBJECT variable, or indiv. differences variable) was CONFOUNDED with “type of clue” (an IV).

Intervening Variables

• Link the IV and the DV, and are used to explain why they are connected.
• Here’s an interesting question: WHY did the authors put this HERE in the chapter?
– Because intervening variables are important in theories.

A bit more about theories

• Good theories provide “precision of prediction”
• The “rule of parsimony” is followed
– The simplest alternative explanations are accepted
• A good scientific theory passes the most rigorous tests
• Testing will be more informative when you try to DISPROVE (falsify) a theory

Populations and Samples

• Population: the set of all cases of interest
• Sample: the subset of the population that we choose to study.

Population | Sample
Parameters | Statistics

Ch. 3 -- Ethics

• Read the chapter.
• Understand informed consent, p. 57 – a person’s expressed willingness to participate in a research project, based on a clear understanding of the nature of the research, the consequences of declining, and other factors that might influence the decision.
• Odd quote, p. 69 – Debriefing should be informal and indirect.
• Know that UT has an IRB: http://www.utexas.edu/research/rsc/humanresearch/

Ch. 7 – Independent Groups Design

• Description and Prediction are crucial to the scientific study of behavior, but they’re not sufficient for understanding the causes. We need to know WHY.

• Best way to answer this question is with the experimental method.

• “The special strength of the experimental method is that it is especially effective for establishing cause-and-effect relationships.”

Good Paragraph

• P. 196, para. 2 – Discusses how experimental methods and descriptive methods aren’t all THAT different – well, they’re different, but related. And often used together.

Good page – P. 197

• Why we conduct experiments
• If results of an experiment (a well-run experiment!) are consistent with theory, we say we’ve supported the theory. (NOT that it is “right.”)
• Otherwise, we modify the theory.
• Testing hypotheses and revising theories based on the outcomes of experiments – the long process of science.

Logic of Experimental Research

• Researchers manipulate an independent variable in an experiment to observe the effect on behavior, as assessed by the dependent variable.

Independent Groups Design

• Each group represents a different condition as defined by the independent variable.

Random . . .

• Random Selection vs. Random Assignment
– Random Selection = every member of the population has an equal chance of being selected for the sample.
– Random Assignment = every member of the sample (however chosen) has an equal chance of being placed in the experimental group or the control group.

• Random assignment allows for individual differences among test participants to be averaged out.
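The two "randoms" are separate steps, and a short sketch makes the difference concrete. Everything here is illustrative: the population of 1,000 labeled people, the sample size of 20, and the helper `random_assignment` are assumptions made up for the example, and the seeds are fixed only so the run is repeatable.

```python
import random


def random_assignment(participants, seed=None):
    """Shuffle the sample, then split it into experimental and control groups."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical population of 1,000 people.
population = [f"person_{i}" for i in range(1000)]

# Random SELECTION: who gets into the study at all.
sample = random.Random(0).sample(population, 20)

# Random ASSIGNMENT: which condition each sampled person ends up in.
experimental, control = random_assignment(sample, seed=1)

print(len(experimental), len(control))
```

Selection bears on how well the sample represents the population. Assignment bears on whether the two groups are comparable to each other, which is what averages out individual differences.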

Let’s step back a minute

• An experiment is personkind’s way of asking nature a question.

• I want to know if one variable (factor, event, thing) has an effect on another variable – does the IV systematically influence the DV?

• I manipulate some variables (IVs), control other variables, and count on random assignment to wash out the effects of all the rest of the variables.

Block Randomization

• Another way to wash out error variance.
• Assign subjects to blocks of subjects, and have whole blocks see certain conditions.

• (Very squirrelly description in the book.)
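Since the description is squirrelly, here is one common reading of block randomization, offered as an assumption rather than as what the slide or the book specifies: each "block" is a random ordering of ALL the conditions, and participants are assigned to conditions in that order as they show up. The function name and the condition labels are hypothetical.

```python
import random


def block_randomize(conditions, n_blocks, seed=None):
    """Build an assignment schedule in which each block contains every condition once, in random order."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule


# Three hypothetical conditions, four blocks: 12 participants in arrival order.
schedule = block_randomize(["A", "B", "C"], n_blocks=4, seed=0)
print(schedule)
```

Under this reading, group sizes come out equal, and anything that drifts over the course of the study (time of day, experimenter fatigue) is spread evenly across the conditions.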

Challenges to Internal Validity

• Testing intact groups. (Why is the group a group? Might be some systematic differences.)
• Extraneous variables. (Balance ‘em.) (E.g., experimenter.)
• Subject loss
– Mechanical loss, OK.
– Selective loss, not OK.
• Demand characteristics (cues and other info participants pick up on) – use a placebo, and double-blind procedure

• Experimenter effects – use double-blind procedure

Role of Data Analysis in Exps.

• Primary goal of data analysis is to determine if our observations support a claim about behavior. Is that difference really different?

• We want to draw conclusions about populations, not just the sample.

• Two different ways – statistics and replication.

Two methods of making inferences

• Null hypothesis testing
– Assume IV has no effect on DV; differences we obtain are just by chance (error variance)
– If the difference is unlikely enough to happen by chance (and “enough” tends to be p < .05), then we say there’s a true difference.
• Confidence intervals
– We compute a confidence interval for the “true” population mean, from sample data. (95% level, usually.)
– If two groups’ confidence intervals don’t overlap, we say (we INFER) there’s a true difference.
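The confidence-interval approach can be sketched in a few lines. Loud assumptions: the two sets of scores are invented purely for illustration, and the interval uses the large-sample multiplier 1.96 (mean plus or minus 1.96 standard errors). With groups this small a t-based multiplier would give somewhat wider intervals, so treat this as a rough sketch rather than the procedure from the readings.

```python
from math import sqrt
from statistics import mean, stdev


def ci95(scores):
    """Approximate 95% confidence interval for the population mean (normal approximation)."""
    m = mean(scores)
    standard_error = stdev(scores) / sqrt(len(scores))
    margin = 1.96 * standard_error
    return m - margin, m + margin


def intervals_overlap(a, b):
    """True if intervals a = (low, high) and b = (low, high) share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Made-up scores for a control group and an experimental group.
control_scores = [12, 15, 14, 13, 16, 15, 14, 13]
experimental_scores = [20, 22, 19, 21, 23, 20, 22, 21]

control_ci = ci95(control_scores)
experimental_ci = ci95(experimental_scores)
print(control_ci, experimental_ci)
print(intervals_overlap(control_ci, experimental_ci))
```

For these invented numbers the intervals sit well apart, so we would infer a true difference. Overlapping intervals would mean only that we have not found evidence of one.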

What data can’t tell us

• Proper use of inferential statistics is NOT the whole answer.
– Scientist could have done a trivial experiment.
– Also, study could have been confounded.
– Also, could by chance find this difference. (Type I and Type II errors – hit this for real in week 5.)

This is HUGE.

• When we get a NONsignificant difference, or when the confidence intervals DO overlap, we do NOT say that we ACCEPT the null hypothesis.
– Hinton, p. 37 – “On this evidence I accept the null hypothesis and say that we have not found evidence to support Peter’s view of hothousing.”
• We just cannot reject it at this time.
• We have insufficient evidence to infer an effect of the IV on the DV.

Notice

• Many things influence how easy or hard it is to discover a difference.
– How big the real difference is.
– How much variability there is in the population distribution(s).
– How much error variance there is.
– Let’s talk about variance.

Sources of variance

• Systematic vs. Error
– Real differences
– Error variance

• What would happen to the standard deviation if our measurement apparatus was a little inconsistent?

• There are OTHER sources of error variance, and the whole point of experimental design is to try to minimize ‘em.

Get this: The more error variance, the harder for real differences to “shine through.”
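To put a number on the measurement-apparatus question: if measurement error is independent of the true scores and simply adds to them (both assumptions made for this sketch, not claims from the slides), the variances add, so the observed standard deviation is the square root of the sum of the squared SDs. The function name and the example values are made up.

```python
from math import hypot


def observed_sd(true_sd, error_sd):
    """SD of observed scores when independent measurement error is added to true scores."""
    # Variances add: observed_var = true_sd**2 + error_sd**2.
    return hypot(true_sd, error_sd)


print(observed_sd(3.0, 0.0))  # a perfectly consistent apparatus: 3.0
print(observed_sd(3.0, 4.0))  # a noisy apparatus: 5.0
```

The real differences between conditions stay the same size, but they now have to show up against a wider spread of scores.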

One way to reduce the error variance

• Matched groups design
– If there’s some variable that you think MIGHT cause some variance,
– Pre-test subjects on some matching test that equates the groups on a dimension that is relevant to the outcome of the experiment. (Must have a good matching test.)
– Then assign matched groups. This way the groups will be similar on this one important variable.
– STILL use random assignment to the groups.
– Good when there are a small number of possible test subjects.

Another design

• Natural Groups design
– Based on subject (or individual differences) variables.
– Selected, not manipulated.
– Remember: This will give us description, and prediction, but not understanding (cause and effect).

We’ve been talking about . . .

• Making two groups comparable, so that the ONLY systematic difference is the IV.
– CONTROL some variables.
– Match on some.
– Use random assignment to wash out the effects of the others.
– What would be the best possible match for one subject, or one group of subjects?

Themselves!

• When each test subject is his/her own control, then that’s called a
– Repeated measures design, or a
– Within-subjects design.

(And the independent groups design is called a “between subjects” design.)

Repeated Measures

• If each subject serves as his/her own control, then we don’t have to worry about individual differences, across experimental and control conditions.
• EXCEPT for newly introduced sources of variance – order effects:
– Practice effects
– Fatigue effects

Counterbalancing

• ABBA
• Used to overcome order effects.
• Assumes practice/fatigue effects are linear.
• Some incomplete counterbalancing ideas are offered in the text.
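Why the linearity assumption matters can be shown with simple arithmetic. In the sketch below (an illustration, with invented effect sizes and a hypothetical helper name), each trial position adds some practice effect to whatever condition is run there. Under ABBA ordering a linear effect gives both conditions the same total boost, while a nonlinear one does not.

```python
def order_effect_totals(order, effect):
    """Sum the order effect each condition picks up, given the effect at each trial position."""
    totals = {}
    for position, condition in enumerate(order):
        totals[condition] = totals.get(condition, 0) + effect(position)
    return totals


# Linear practice effect: 2 points of improvement per trial already completed.
linear = order_effect_totals("ABBA", lambda position: 2 * position)

# Nonlinear practice effect: improvement grows with the square of the position.
nonlinear = order_effect_totals("ABBA", lambda position: position ** 2)

print(linear)     # {'A': 6, 'B': 6}
print(nonlinear)  # {'A': 9, 'B': 5}
```

With the linear effect, A (positions 1 and 4) and B (positions 2 and 3) each collect 6 points, so the order effect cancels out. With the nonlinear effect, A collects 9 and B only 5, so ABBA no longer balances the conditions.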

Which method when?

• Some questions DO lend themselves to repeated measures (within-subjects) design
– Can people read faster in condition A or condition B?
– Is memorability improved if words are grouped in this way or that?
• Some questions do NOT lend themselves to repeated measures design
– Do these instructions help people solve a particular puzzle?
– Does this drug reduce cholesterol?

Hinton typo

• P. 62, para. 1: “. . . population standard deviation, µ, divided by . . . .” (µ is the symbol for the population mean; the population standard deviation is σ.)

Midterm

• Emphasize
– How to lie with statistics – concepts
– To know a fly – concepts
– SZ&Z – Ch. 1, 2, 7, 8
– Hinton – Ch. 1, 2, 3, 4, 5
• De-emphasize
– SZ&Z – Ch. 3
– Other readings
• Totally ignore for now
– SZ&Z – Ch. 14
– Hinton – Ch. 6, 7, 8

Some questions we’d like to ask Nature