Unit 01 intro to statistics - 1 per page

Post on 12-Aug-2015


1

Welcome to Stat 101: Introduction to Quantitative Methods for

Psychology and the Behavioral Sciences

Kevin Rader, krader@fas.harvard.edu

2

Unit Outline

• Course Logistics and Details

• The What and Why of the Field of Statistics?

• Much needed terminology

• Snapshot of what’s to come in the course

3

Stat 100 vs. 101 vs. 102 vs. 104

• Each course assumes no prior knowledge of statistics

• They all cover the same basic statistical concepts (about 3/4 of the course), though they each emphasize different topics throughout

• Stat 104 will cover more material and is more mathematically rigorous; the others are at a similar level.

• Stat 101 will use mostly examples from psychology, general

social/behavioral sciences, and public health.

• Stat 102 emphasizes medical and lab science examples

• Stat 104 emphasizes examples from economics/finance

• Stat 100 is a more general course with a wide range of examples

• More questions: ask after class.

4

Kevin’s Contact Info

• My office: Science Center, Room SC-105 (likely to change)

• Office Hours (stop in unannounced):

• Tues 12:30-1:30pm and Thurs 11:30am-12:30pm

• Also by appointment (via email)

• Phone numbers:

• Statistics Department: (617) 495-5496

• My office (SC-105): NA

• Email: krader@fas.harvard.edu (preferred over phone)

5

Teaching Staff

• Teaching Fellows:

Joseph Lee: lee26@fas.harvard.edu

Lazhi Wang: wang75@fas.harvard.edu

• Teaching assistants will be teaching sections, holding office hours, answering questions via email, and grading assignments and exams.

6

Course Website

Course website:

http://isites.harvard.edu/icb/icb.do?keyword=k97307

• There you will find (eventually):

• Syllabus

• Administrative Announcements

• Lecture Notes

• SPSS Tutorial (including download and install instructions)

• Assigned Homeworks

• HW #1 will be posted soon: Due Fri, Sept 13th

• Other Study Material (practice exams, web links, etc...)

7

Lecture Notes

• Paper copies will NOT be handed out at the beginning of

lecture after this week (we will provide copies on Thursday).

• They’re organized in Units, which follow chapters in the textbook (will diverge a bit at end of semester)

• Lecture notes will be posted at least 24 hours in advance

• Notes are somewhat concise – you are encouraged to add

your own annotations and develop your own notes

• Occasionally mistakes appear in lecture notes; corrected

versions will be posted after class

8

Class Meetings

• Lectures:

• Tues & Thurs, 10–11:30am, Science Center SC-Hall A

• Sections

• Optional (but strongly recommended) weekly section to discuss

homework, do extra problems, and review difficult concepts.

• No section this week (begin week of Sept 9).

• Look for announcement on the course website for permanent times

(OH’s too).

• SPSS Tutorials

• To be held in SC-B09

• One on Thursday afternoon and one on Monday afternoon. Times TBD.

9

Textbook

• Statistical Methods for the Social Sciences, Agresti & Finlay,

4th edition. Amazon Link:

www.amazon.com/Statistical-Methods-Social-Sciences-Edition/dp/0205646417/

• Text’s website - http://bcs.whfreeman.com/ips7e/

(6th edition will work fine, 5th ed is prob OK too)

• About half of the assigned homework problems will be

assigned from the text, so it’s a good idea to have a copy.

• It’s a great reference for more details on what is seen in the

lectures. Fairly straightforward explanations.

10

Computing and Calculations

• For all exams (and some homework), you will need a calculator with log, exponential, and square-root functions.

• Statistical Computing Package: SPSS

• Can be downloaded from:

http://downloads.fas.harvard.edu/download

• Tutorial Document on website:

http://isites.harvard.edu/icb/icb.do?keyword=k97307&pageid=icb.page624278

• HW #1 will include an introduction to the software. There

are also the SPSS Tutorial Sessions…

11

Exams

• Tues, October 10th: Midterm I, 10-11:30am (in class)

• Tues, Nov 12th: Midterm II, 10-11:30am (in class)

• Tues, Dec 12-20th: Final Exam, Date and Time TBD

• You will be allowed 1 “cheat sheet” for Midterm I, 2 sheets for Midterm II, and 3 sheets for the Final Exam (front and back OK).

12

Homeworks

• Posted to course website on Fridays: http://isites.harvard.edu/icb/icb.do?keyword=k97307&pageid=icb.page624266

• Hard Copies must be handed in to the 3rd floor HW

boxes.

• Late homeworks will only be accepted with an official

University excuse (either from UHS or from your

resident dean’s office). NO HW Scores will be

dropped!

13

HW Collaboration

• You are encouraged to discuss homework with other

students (and with the instructor and TFs, of course), but

you must write your final answers yourself, in your own

words.

• Solutions prepared “in committee” or by copying or

paraphrasing someone else’s work are not acceptable; your

handed-in assignment must represent your own thoughts.

All computer output you submit must come from work that

you have done yourself.

• Please indicate on your problem sets the names of the

students with whom you worked.

Group Project

• Will be a roughly 3-5 page paper of text (graphs, tables,

etc… are in addition to that) based on a data analysis of

your choosing.

• Groups of size 2 or 3 required. Very helpful to bounce ideas off each other.

• Due towards end of reading period (Dec. 10).

• More details about the project to come later in the semester

(around Oct 31st).

14

15

Course Grading

Component     Weighting1   Weighting2   Weighting3
Homeworks        30%          30%          30%
Midterm 1        10%          20%          20%
Midterm 2        20%          10%          20%
Final Exam       30%          30%          20%
Project          10%          10%          10%
Total           100%         100%         100%

Your overall score for the course will be the maximum of the scores computed under the 3 weighting schemes presented above. Final course letter grades are not assigned according to fixed percentages of A's, B's, etc. (i.e., the course is not 'curved'). Letter grades follow the old-fashioned boundaries: A- to A for a final score of 90 - 100, B- to B+ for 80 - 90, etc. Slight adjustments may be made to the boundaries of letter grades (a boundary may be moved down a bit).
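
As a rough illustration of the "maximum of the 3 weighting schemes" rule, here is a minimal Python sketch. The weights are taken from the grading table; the component scores for the example student are made up, and the names `WEIGHTING_SCHEMES` and `overall_score` are hypothetical helpers, not anything the course provides.

```python
# Component weights under each scheme, copied from the grading table.
WEIGHTING_SCHEMES = {
    "Weighting1": {"homework": 0.30, "midterm1": 0.10, "midterm2": 0.20,
                   "final": 0.30, "project": 0.10},
    "Weighting2": {"homework": 0.30, "midterm1": 0.20, "midterm2": 0.10,
                   "final": 0.30, "project": 0.10},
    "Weighting3": {"homework": 0.30, "midterm1": 0.20, "midterm2": 0.20,
                   "final": 0.20, "project": 0.10},
}


def overall_score(scores):
    """Return (scheme_name, score) for the scheme that gives the highest total.

    `scores` maps each component name to a percentage between 0 and 100.
    """
    totals = {
        name: sum(weight * scores[component]
                  for component, weight in weights.items())
        for name, weights in WEIGHTING_SCHEMES.items()
    }
    best = max(totals, key=totals.get)
    return best, totals[best]


# Hypothetical student: a weak Midterm 1, stronger results elsewhere.
example = {"homework": 90, "midterm1": 70, "midterm2": 85,
           "final": 88, "project": 95}

scheme, score = overall_score(example)
print(f"{scheme}: {score:.1f}")
```

For this made-up student the first scheme wins because it puts the least weight on the weakest component (Midterm 1).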

Course Goals for Students

• To learn and understand descriptive statistics and graphical

summaries, basic probability theory, and statistical inference.

• To introduce a range of quantitative tools and methods of

analysis commonly used in the social, psychological and

behavioral sciences with an emphasis on application of methods

to real data.

• To become statistically skilled. At the end of the course, you should be able to address a research question by choosing a good data source, determining and performing the most appropriate analysis, and reporting the findings in a technically accurate way. You should also understand the limitations of your results.

16

17

Unit 1: Intro to Statistics

Chapter 1 in the Text

18

So what is statistics?

(and why is it so cool?)

• The study of the methods for obtaining, organizing, analyzing, and interpreting data.

• Why bother? Principles provide a framework for

• Collecting data and the design of experiments and observational studies (Design)

• Describing and summarizing data (Description)

• Drawing inferences about populations as a whole and predicting future events (Inference)

• Short story: Statistics is the science of using data to prove a point (and, hopefully, to reach a correct conclusion).

19

Proving Points with Statistics

20

Other questions that could be addressed statistically

Social Sciences

• What are the features of online ads that are more likely to capture

your attention?

• How (if at all) is happiness associated with income, job

satisfaction, social life, religious beliefs, or political ideology?

Health

• Does a smoking ban in bars lower the rate of lung cancer?

• How can we study whether a new therapy is better than a standard

therapy for treating depression?

Sports

• Is David Ortiz truly a clutch hitter? Is Alex Rodriguez anti-clutch?

• Can we predict which teams in the NFL will improve from last

year?

21

In Class Exercise: “Research”

Question about Harvard Students

• Let’s brainstorm! In small groups (2 to 4 students)

discuss something you would like to find out about

your fellow Harvard classmates (the whole student

body or a subgroup).

• Kevin’s Boring Example: Who sends more text

messages: men or women?

• Please think of a more interesting example…

Any group want to share their example with the class?

22

Population vs. Sample

• Population: entire group of individuals on which we desire information.

• Technicality: actual vs. conceptual populations

• For our Harvard study:

• Sample: a part of the population on which we actually collect data.

• For our Harvard study:
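
To make the population/sample distinction concrete, here is a small Python sketch. It assumes a hypothetical population of 6,700 student ID numbers (the size is invented) and draws a simple random sample, which is only one of several ways a sample could be selected. The helper name `draw_sample` is also hypothetical.

```python
import random

# Hypothetical population: ID numbers for 6,700 students (size is made up).
population = list(range(1, 6701))


def draw_sample(population, n, seed=0):
    """Draw a simple random sample of n individuals, without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(population, n)


# The sample is the part of the population we would actually collect data on.
sample = draw_sample(population, n=100)
print(len(population), len(sample))  # prints "6700 100"
```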

23

Parameters and Statistics

• Descriptive statistics: summarize the data in the actual sample of data.

• Inferential statistics: provide predictions or generalizations about the population based on the data we collected in the sample.

• Parameter: a numerical summary of the population.

• For our Harvard study:

• Statistic: a numerical summary of the sample data.

• For our Harvard study:
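
The parameter/statistic distinction can be sketched in a few lines of Python. Everything here is simulated: the population of daily text-message counts is randomly generated for illustration, and in a real study the parameter would normally be unknown because we only observe the sample.

```python
import random
import statistics

rng = random.Random(101)  # fixed seed so the simulation is reproducible

# Simulated population: daily text-message counts for 6,700 hypothetical students.
population = [rng.randint(0, 120) for _ in range(6700)]

# Parameter: a numerical summary of the whole population.
parameter = statistics.mean(population)

# Statistic: the same summary, computed only on the sample we collected.
sample = rng.sample(population, 100)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.1f}")
print(f"sample mean (statistic):     {statistic:.1f}")
```

Reporting the sample mean on its own is descriptive statistics; using it to say something about the population mean is inferential statistics.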

How does this apply to the framework of statistical studies (3 parts)?

Design: Study planning and implementation

• How should the study be conducted?

• How should we select students (subjects) for the study? How many should be included?

• What information (data) do we need to collect?

Description: Graphical and numerical methods for summarizing data

• What are the characteristics of the subjects in our sample?

• What are the summary measurements of the data we collected?

• How do our measurements relate?

Inference: predictions about the population based on the sample

• How do these measurements generalize to all students of interest (the population)?

• Could we predict how the measurements would relate in the larger population?

• Anything further to investigate in the future?

25

Software can help…a lot!

Here is just a quick preview

of what a dataset (a

collection of measurements

for the subjects in a

sample) looks like in SPSS:

Each column represents a

variable (a characteristic

measured on the subjects)

Each row represents a

different subject, and

contains the observations

for that subject
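
The same row/column layout can be mimicked outside SPSS. Below is a minimal Python sketch, not SPSS syntax, in which each dictionary is one row (a subject) and each key is one column (a variable). The variable names and all values are invented for illustration.

```python
# Each dict is one row (one subject); each key is one column (one variable).
# All variable names and values below are made up.
dataset = [
    {"id": 1, "gender": "F", "class_year": 2016, "texts_per_day": 42},
    {"id": 2, "gender": "M", "class_year": 2015, "texts_per_day": 17},
    {"id": 3, "gender": "F", "class_year": 2017, "texts_per_day": 63},
    {"id": 4, "gender": "M", "class_year": 2016, "texts_per_day": 30},
]


def column(rows, variable):
    """Return every subject's observation for one variable (one column)."""
    return [row[variable] for row in rows]


print(dataset[0])                        # all observations for the first subject
print(column(dataset, "texts_per_day"))  # prints [42, 17, 63, 30]
```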

26

Take Home Message: 3 Major

Overarching Topics in the Course

1) Design: planning and obtaining data for your research study.

2) Description: summarizing the data in your sample.

3) Inference: making predictions based on the data and generalizing the results for the population.

Last Word

If you are planning on taking this course, you should…

• Download and install SPSS from FAS IT:

http://downloads.fas.harvard.edu/download

• Go to the course website and follow the SPSS tutorial document, and/or attend an SPSS tutorial session (the schedule will be posted on the website later today).

• Read through (or at least browse) chapters 1 and 2 in the

text.

• Be aware that HW #1 will be posted by the end of the

week (it is due next Friday, Sept. 13th).

• Be happy!

27