Statistics 100 Lecture Set 1

Page 1: Statistics 100 Lecture Set 1

Statistics 100 Lecture Set 1

Page 2: Statistics 100 Lecture Set 1

Lecture Set 1

• Course outline and important details about the course

• Chapter 1 … today

• Will be doing chapter 2 in the next lecture set

• Some suggested problems:

– Chapter 1: 1.3, 1.5, 1.11, 1.13, 1.17

Page 3: Statistics 100 Lecture Set 1
Page 4: Statistics 100 Lecture Set 1

Important Stuff

• Statistics and Actuarial Science Stats Lab (Statistics Workshop)

– What is Stats Lab for? One-on-one help is available during its operation hours.

– Where is it? The Stats Lab is located in K9516 (inside K9510)…

– How does the Stats Lab work?

– The Statistics Workshop opens for regular use from the second week of classes. The hours will depend on the amount of T.A. time available and will be posted at the end of the first week of classes. The Workshop will be open only when there is a T.A. on duty.

• Typically, Mon-Fri:   9:30-16:30

Page 5: Statistics 100 Lecture Set 1

Important Stuff

• Text: Statistics: Concepts and Controversies, 8th edition, by Moore and Notz

• People have asked about 7th edition …

• Read Chapters 1 and 2 this week (they are short)

Page 6: Statistics 100 Lecture Set 1

Important Stuff

• Course web page can be found at www.stat.sfu.ca/~dbingham/stat100

• Download lecture notes the day before class

• Will also have announcements (e.g., exam dates)

• Also has my office hours posted (Monday and Wednesday 1:00-2:00)

Page 7: Statistics 100 Lecture Set 1

Important Stuff

Grading Scheme:

– Assignments – 10%

– Midterm 1 – 20%

– Midterm 2 – 20%

– Final – 50%
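The weights above combine into a final mark as a weighted average. The following is a minimal sketch of that arithmetic, assuming every component is scored out of 100; the function name `course_grade` and the example scores are hypothetical and only illustrate the calculation.

```python
# Weights taken from the grading scheme on this slide.
WEIGHTS = {
    "assignments": 0.10,
    "midterm1": 0.20,
    "midterm2": 0.20,
    "final": 0.50,
}


def course_grade(scores):
    """Weighted average of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS)


# Hypothetical scores for one student (illustrative only).
example = {"assignments": 90, "midterm1": 70, "midterm2": 80, "final": 75}
print(f"course grade: {course_grade(example):.1f}")
```

Because the final carries half the weight, a change of 10 points on the final moves the course grade by 5 points, while the same change on the assignments moves it by only 1.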

Tentative mid-term dates

– Mid-Term 1: Monday, February 17

– Mid-Term 2: Monday, March 17

Page 8: Statistics 100 Lecture Set 1

Important Stuff

• Assignments: 8-10 of them

• Usually will be due Wednesdays, before 4:30 in boxes outside lab

• The boxes are labeled (by class and alphabetically)

• Note:

1. Late assignments will not be accepted

2. Assignments placed in the wrong box (e.g., stat 270) will not be accepted

Page 9: Statistics 100 Lecture Set 1

Important Stuff

• The classroom is likely to be full

• Be courteous … when you come in, do not sit in the aisle seat (unless you are left-handed)

• Do not put your bag down on a seat ….

• Turn off cell phones, do NOT text, …

• People with laptops …

Page 10: Statistics 100 Lecture Set 1

Important Stuff

Other stuff

– Class email list: I will occasionally email the class with hints and other information.

– If the email is not from me or Robin Insley (lab instructor), then it is likely spam

Page 11: Statistics 100 Lecture Set 1

What is this course about

• Statistical methods are used everywhere

– Health studies

– Industry

– Economics

– Studying manuscripts

• Most courses I teach are concerned with statistical methods

– How to fit models to data

– How to use statistics to make decisions

• This course is not about those things

• This course is about statistical reasoning

Page 12: Statistics 100 Lecture Set 1

How to do well

• Study and practice

• Ask questions

• Office hours and the drop-in lab

Page 13: Statistics 100 Lecture Set 1

PART I: Producing Data

• Not every product you could buy is well-made

– Cars, phones, clothes, food

– Cheaply & poorly made vs. carefully & properly made

• Data are the same way

– Not all numbers should be viewed as having equal quality

– How they are collected says a lot about the information that they convey and our degree of belief

• Chapter 1 introduces data collection

Page 14: Statistics 100 Lecture Set 1

Chapter 1: Where do Data Come From?

Page 15: Statistics 100 Lecture Set 1

Chapter 1: Where do Data Come From?

What are the data?

Page 16: Statistics 100 Lecture Set 1

Example

• Does living near high-voltage power lines cause childhood leukemia?

• A study was conducted to see whether there is evidence that magnetic fields are related to leukemia

• Researchers compared 628 children who had leukemia and 620 who did not

• Measured magnetic field in the rooms in their houses

• What are the data?

Page 17: Statistics 100 Lecture Set 1

Example

• What are Data?

Each row is an individual; each column is a variable.

Name     Leukemia   Magnetic field in bedroom   …
Susan    Yes        0.15 μT
Bobby    No         0.12 μT
...      ...        ...
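One way to make the individuals/variables distinction concrete is to write the table as a small data structure. The sketch below is only an illustration: the class name `Child` and the field names are hypothetical, and only the two rows shown on the slide are included (the study itself compared 628 children with leukemia and 620 without).

```python
from dataclasses import dataclass, fields


@dataclass
class Child:
    """One individual, i.e. one row of the data set."""
    name: str                # label identifying the individual
    leukemia: bool           # variable: does the child have leukemia?
    bedroom_field_ut: float  # variable: magnetic field in bedroom, in microtesla


# Only the two example rows from the slide.
data = [
    Child("Susan", True, 0.15),
    Child("Bobby", False, 0.12),
]

# Variables are the attributes recorded for every individual.
variables = [f.name for f in fields(Child)]

print("number of individuals:", len(data))
print("variables:", variables)
```

Each `Child` object is an individual, and each field is a variable measured on that individual.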

Page 18: Statistics 100 Lecture Set 1

Some Definitions

• Interested in something about a population.

• Population is a collection of individuals.

• Individuals are the objects described by the data

• Data sets contain information/facts relating to individuals.

• Variables are attributes of an individual (e.g., hair color, pain severity, ...).

Page 19: Statistics 100 Lecture Set 1

How are data collected?

• A good deal of effort is spent trying to figure out what data to collect

– Which individuals are measured?

– What should be measured to answer the questions of interest?

– What population were the data collected from?

– What is the population of interest?

– Can we afford to conduct the study?

Page 20: Statistics 100 Lecture Set 1

How are data collected?

• Purpose of Study

– Learn something about a group of individuals

– Population = group of individuals that you want to know about

– Sample = group of individuals that you actually measure

– Examples…

– Why not just measure the entire population (census)?
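The population/sample distinction can be sketched with a small simulation. Everything here is an assumption made for illustration: the population is 10,000 simulated values, the sample size is 100, and the sample is drawn with `random.Random.sample`, which gives every individual the same chance of being chosen.

```python
import random
import statistics

rng = random.Random(100)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 individuals, one numeric variable each.
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Sample: the individuals we actually measure (100 drawn without replacement).
sample = rng.sample(population, 100)

population_mean = statistics.mean(population)  # what a census would report
sample_mean = statistics.mean(sample)          # what the sample reports

print(f"population mean (census): {population_mean:.2f}")
print(f"sample mean (n = 100):    {sample_mean:.2f}")
```

The sample mean will usually land close to the population mean while requiring only 1% of the measurements, which is one answer to the question of why a census is often not worth the cost.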

Page 21: Statistics 100 Lecture Set 1

Observational studies

• Observational study: observes individuals and measures variables but does not attempt to influence the responses.

• The outcome of interest is called the response variable.

• Observational studies (“you can observe a lot by watching”)

– Identify an individual, watch/measure variables

– Do not interfere, merely observe (collect data)

– Generally inexpensive… very common

Page 22: Statistics 100 Lecture Set 1

Example (back to leukemia study)

• Chapter 1 has a discussion of leukemia and power lines

• Looking for association between magnetic fields and leukemia

• Measured electromagnetic fields & lots of other variables

• Found no link, despite anecdotes

• Notice that researchers did not interfere in the study (e.g., did not intentionally expose children to magnetic fields)

Page 23: Statistics 100 Lecture Set 1

Sample Surveys

• Sample survey: a collection of individuals (the sample) is chosen from the population in a specific, quantifiable manner and then measured

– Special kind of observational study

– Use a sample carefully chosen from the population to best represent the population

– Idea is that the sample should be representative of the population, so that we can learn about the population from the sample

• Examples:

– Political polls: how can we tell who will win an election?

– Government surveys: inform policy

– Market research

– NOT the leukemia study (more on this in chapters 2-4)

Page 24: Statistics 100 Lecture Set 1

Census

• Census: A sample survey where the sample is (ideally) the entire population

• Example: Statistics Canada conducts the Census of Population and the Census of Agriculture to develop a statistical portrait of Canada and its people

Page 25: Statistics 100 Lecture Set 1

Census

• Interesting side note: In the summer of 2010, the Canadian Federal Government announced that the 2011 long-form census questionnaire will no longer be mandatory

• What does this mean?

• “I want to take this opportunity to comment on a technical statistical issue which has become the subject of media discussion. This relates to the question of whether a voluntary survey can become a substitute for a mandatory census. It can not.” — Munir Sheikh, Chief Statistician of Canada
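A simulation can suggest why a voluntary survey is not a substitute for a mandatory one. The sketch below rests entirely on invented assumptions: a simulated income variable, and a response rule in which lower-income households respond less often (40% versus 80%). It is not a model of the actual 2011 census.

```python
import random
import statistics

rng = random.Random(2011)  # fixed seed so the sketch is repeatable

# Hypothetical population: household incomes in thousands of dollars.
population = [rng.lognormvariate(4.0, 0.5) for _ in range(50_000)]
median_income = statistics.median(population)


def responds_voluntarily(income):
    """Assumed response rule: lower-income households respond less often."""
    p = 0.8 if income >= median_income else 0.4
    return rng.random() < p


# Voluntary survey: only those who choose to respond are measured.
voluntary = [x for x in population if responds_voluntarily(x)]

true_mean = statistics.mean(population)
voluntary_mean = statistics.mean(voluntary)

print(f"mandatory census mean: {true_mean:.1f}")
print(f"voluntary survey mean: {voluntary_mean:.1f} ({len(voluntary)} responses)")
```

Under these assumptions the voluntary survey still collects tens of thousands of responses, yet its mean is too high, because who responds is related to the variable being measured. A large number of responses does not repair that.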

Page 27: Statistics 100 Lecture Set 1

Experiments

• Experiment: a study in which a treatment is deliberately imposed on individuals in order to observe their responses.

• Why do this?

• Why was this not done in the leukemia study?

• Experiments:

– Clinical trials

– Agriculture

– Manufacturing

Page 28: Statistics 100 Lecture Set 1

Example (Pain Reduction and Reiki)

• Is Reiki an effective pain management tool?

• Reiki treatment is touch therapy used as an alternative to pain medication.

• A pilot study involving 20 volunteers experiencing pain was conducted

• All treatments were provided by a certified Reiki therapist

• Pain was measured before and after the Reiki treatment

• What kind of study is this?

• Is this a good study (more on this later)?

• If study was repeated, would we see the same results?

Page 29: Statistics 100 Lecture Set 1

Example (Saving for Retirement)

• What are the attitudes of low wage earners about saving for retirement?

• Americans earning $35,000 or less were asked how they are likely to accumulate enough money to retire.

• What are the data?

• What is the population?

• What kind of study is this?

Page 30: Statistics 100 Lecture Set 1

Chapter 1: Where do Data Come From?

Observational study, likely a sample survey

Page 31: Statistics 100 Lecture Set 1

• Which is worse:

– Not knowing the answer to a question

– Thinking you know the answer, but being wrong

"We know he's been absolutely devoted to trying to acquire nuclear weapons, and we believe he has, in fact, reconstituted nuclear weapons."

Dick Cheney, March 16, 2003

Page 33: Statistics 100 Lecture Set 1

• There are many ways to collect data

– Some studies provide good information

– Most don’t

– How can you tell which is which?

– Being skeptical about studies and identifying good sampling techniques is key

Page 34: Statistics 100 Lecture Set 1

Brief Moment of Statistical Relevance

Page 35: Statistics 100 Lecture Set 1

Brief Moment of Statistical Relevance

• Two highlights from the commercial:

– Shoe is proven to “…work your hamstrings and calves 11% harder…”

– Shoe is proven to “…tone your butt up to 25% more than regular sneakers just by walking…”

• Some other facts not in the commercial:

– The study was based on a sample of 5 women who walked on a treadmill for 500 steps wearing either the EasyTone or another Reebok walking shoe, and while barefoot.

– From the Reebok fine print: “The shoes are designed only for walking, and because of the instability design, wearers are discouraged from running, jumping and engaging in other athletic activities while wearing them.”
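To see why a sample of 5 deserves skepticism, consider a rough simulation in which two shoes are identical by construction. All numbers are assumptions chosen for illustration (a "muscle activity" score with mean 100 and standard deviation 20), and the sketch uses two independent groups, which simplifies the actual study in which the same women wore each shoe.

```python
import random
import statistics

rng = random.Random(5)  # fixed seed so the sketch is repeatable


def observed_difference(n):
    """Percent difference in mean score between two shoes that are truly identical."""
    shoe_a = [rng.gauss(100, 20) for _ in range(n)]
    shoe_b = [rng.gauss(100, 20) for _ in range(n)]
    mean_a = statistics.mean(shoe_a)
    mean_b = statistics.mean(shoe_b)
    return 100 * (mean_a - mean_b) / mean_b


def share_at_least(n, threshold=11.0, trials=1000):
    """Fraction of simulated studies showing a difference of at least `threshold` percent."""
    hits = sum(observed_difference(n) >= threshold for _ in range(trials))
    return hits / trials


small = share_at_least(5)    # studies with 5 people per shoe
large = share_at_least(200)  # studies with 200 people per shoe

print(f"n = 5:   share of studies showing >= 11% is {small:.3f}")
print(f"n = 200: share of studies showing >= 11% is {large:.3f}")
```

Under these assumptions, a noticeable share of 5-person studies show an "11% harder" result even though the shoes do not differ, while studies with 200 people per shoe almost never do.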

Page 38: Statistics 100 Lecture Set 1