Introduction to Regression Analysis

Page 1: Introduction to Regression Analysis

INTRODUCTION TO REGRESSION ANALYSIS

SZILVESZTER MOLNAR

Page 2: Introduction to Regression Analysis

BUSINESS QUESTIONS

▸ What is the trend in my sales?

▸ As a business owner, I would like to know my average month-to-month revenue growth over the past 3 years.

Page 3: Introduction to Regression Analysis

BUSINESS QUESTIONS

Page 4: Introduction to Regression Analysis

TOOLS

Page 5: Introduction to Regression Analysis

HOTEL EXAMPLE

▸ How hotel price relates to distance from the city centre

Page 6: Introduction to Regression Analysis

HOTEL DATA

hotel_name           distance  price
Trump International  0.8       300
Four Seasons         1.2       178
Ritz-Carlton         1.5       200
IBIS                 1.7       70
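
The R snippets on the later slides refer to a data frame called hotels. Below is a minimal sketch of how the four rows shown above could be entered by hand. The full data set behind the plots is not part of this transcript, so this is only a stand-in with the same column names.

```r
# Stand-in for the hotels data: only the four rows shown on the slide.
hotels <- data.frame(
  hotel_name = c("Trump International", "Four Seasons", "Ritz-Carlton", "IBIS"),
  distance   = c(0.8, 1.2, 1.5, 1.7),  # distance from the city centre
  price      = c(300, 178, 200, 70),   # price, in the currency of the data
  stringsAsFactors = FALSE
)

str(hotels)            # check the column types
summary(hotels$price)  # quick look at the price range
```

With only four rows, any plot or regression built on this stand-in will look different from the ones on the slides.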

Page 7: Introduction to Regression Analysis

PRICE DISTRIBUTION

Page 8: Introduction to Regression Analysis

R

library(ggplot2)

# Histogram of hotel prices
qplot(hotels$price, geom = "histogram",
      fill = I("deepskyblue1"),
      col = I("black"),
      xlab = "Hotel Price",
      ylab = "Count")

Page 9: Introduction to Regression Analysis

DISTANCE DISTRIBUTION
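
The transcript has no code for the distance histogram. One way to draw it is sketched below with base R's hist(), which needs no extra packages. This is an assumption about how such a plot could be made, not the code used for the slide, and the data frame is the four-row stand-in from the HOTEL DATA slide.

```r
# Stand-in data: the four rows from the HOTEL DATA slide.
hotels <- data.frame(
  distance = c(0.8, 1.2, 1.5, 1.7),
  price    = c(300, 178, 200, 70)
)

# Bin the distances without drawing, so the counts can be inspected first.
h <- hist(hotels$distance, plot = FALSE)
h$counts

# Draw the histogram in the same colours as the price plot.
plot(h, col = "deepskyblue1", border = "black",
     main = "", xlab = "Distance from city centre", ylab = "Count")
```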

Page 10: Introduction to Regression Analysis

RELATION?

▸ Can we see any relation between price and distance?

Page 11: Introduction to Regression Analysis

SCATTERPLOT

Page 12: Introduction to Regression Analysis

R

# Scatterplot of price against distance
ggplot(data = hotels, aes(x = distance, y = price)) +
  geom_point(size = 1.5, colour = "deepskyblue1")

Page 13: Introduction to Regression Analysis

TREND IN THE DATA

Page 14: Introduction to Regression Analysis

R

# Scatterplot with a LOESS smoother overlaid
ggplot(data = hotels, aes(x = distance, y = price)) +
  geom_point(size = 1.5, colour = "deepskyblue1") +
  geom_smooth(method = "loess", colour = "red")

Page 15: Introduction to Regression Analysis

LO(W)ESS

▸ Locally Weighted Scatterplot Smoothing
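
LOESS builds its curve point by point: for each distance it fits a small weighted regression to the nearest observations, giving closer points more weight. A rough sketch with base R's loess() follows. The data are simulated and purely hypothetical, and the span value is just the function's default written out.

```r
# Simulated stand-in data (hypothetical, not the hotel data from the slides).
set.seed(1)
sim <- data.frame(distance = seq(0.5, 10, by = 0.25))
sim$price <- 200 - 30 * log(sim$distance) + rnorm(nrow(sim), sd = 10)

# Fit a LOESS curve: each fitted value comes from a weighted local regression
# on the nearest points; span sets the share of the data used per local fit.
fit <- loess(price ~ distance, data = sim, span = 0.75)

# Evaluate the smooth curve at a few distances inside the observed range.
grid <- data.frame(distance = c(1, 3, 5))
predict(fit, newdata = grid)
```

A smaller span follows the data more closely and gives a wigglier curve; a larger span gives a smoother one.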

Page 16: Introduction to Regression Analysis

QUESTION

▸ How much do prices decrease on average as we move 1 km further away from the city centre?

Page 17: Introduction to Regression Analysis

LINEAR REGRESSION

Page 18: Introduction to Regression Analysis

LINEAR REGRESSION

▸ It's similar to LOESS, but it's a ... LINE

Page 19: Introduction to Regression Analysis

LINEAR REGRESSION

Page 20: Introduction to Regression Analysis

LINEAR REGRESSION

▸ E[y|x]

▸ E[y|x] = α + βx
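
The slides do not say how α and β are estimated; the usual choice, and the one R's lm() makes, is ordinary least squares. With a single x, the least-squares slope is cov(x, y) / var(x) and the intercept is mean(y) - β · mean(x). A small sketch, using the four rows from the HOTEL DATA slide purely as a toy example:

```r
# The four rows from the HOTEL DATA slide, used purely as a toy example.
x <- c(0.8, 1.2, 1.5, 1.7)   # distance
y <- c(300, 178, 200, 70)    # price

# Least-squares slope and intercept for a single explanatory variable.
beta  <- cov(x, y) / var(x)
alpha <- mean(y) - beta * mean(x)

# lm() should report the same two numbers.
fit <- lm(y ~ x)
c(alpha = alpha, beta = beta)
coef(fit)
```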

Page 21: Introduction to Regression Analysis

HOTEL EXAMPLE

E[price | distance] = 119 - 2.6 * distance
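
Reading the fitted equation: 119 is the expected price at distance 0, and the slope of -2.6 means the expected price falls by 2.6 (in whatever currency the data use) for each extra unit of distance, which answers the earlier question if distance is measured in km. A small sketch that evaluates it; the helper name expected_price is made up for illustration.

```r
# Coefficients as reported on the slide.
alpha <- 119
beta  <- -2.6

# Hypothetical helper: expected price at a given distance from the centre.
expected_price <- function(distance) alpha + beta * distance

expected_price(c(0, 1, 2))             # expected prices at distance 0, 1 and 2
expected_price(2) - expected_price(1)  # change per extra unit, equal to beta
```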

Page 22: Introduction to Regression Analysis

LINEAR REGRESSION

Page 23: Introduction to Regression Analysis

R

# Fit a linear regression of price on distance
regression <- lm(price ~ distance, data = hotels)
summary(regression)

# Linear regression visualised over a scatterplot
ggplot(data = hotels, aes(x = distance, y = price)) +
  geom_point(size = 1.5, colour = "deepskyblue1") +
  geom_smooth(method = "lm", colour = "red", se = FALSE)
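
Besides summary(), the fitted model can be queried directly. The sketch below pulls out the two coefficients and predicts prices at new distances. It uses the four-row stand-in from the HOTEL DATA slide, so the numbers will not match the 119 and -2.6 quoted earlier, which presumably came from the full data set.

```r
# Stand-in data: the four rows from the HOTEL DATA slide.
hotels <- data.frame(
  distance = c(0.8, 1.2, 1.5, 1.7),
  price    = c(300, 178, 200, 70)
)

regression <- lm(price ~ distance, data = hotels)

# Estimated intercept (alpha) and slope (beta).
coef(regression)

# Expected price for hypothetical hotels 1 and 2 units from the centre.
predict(regression, newdata = data.frame(distance = c(1, 2)))
```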

Page 24: Introduction to Regression Analysis

LINEAR REGRESSION TAKE AWAY

E[y|x] = α + βx

Page 25: Introduction to Regression Analysis

LINEAR REGRESSION ...

▸ Piecewise linear regression
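
Piecewise linear regression lets the slope change at a breakpoint (a knot) while keeping the line connected. One common way to fit it with lm() is to add a hinge term, pmax(distance - knot, 0). The sketch below assumes a single knot at 2 and uses simulated, hypothetical data; neither comes from the slides.

```r
# Simulated stand-in data (hypothetical): a steep price drop up to distance 2,
# then a flatter decline.
set.seed(42)
sim <- data.frame(distance = seq(0.2, 6, by = 0.2))
sim$price <- 250 - 60 * pmin(sim$distance, 2) -
  5 * pmax(sim$distance - 2, 0) + rnorm(nrow(sim), sd = 5)

knot <- 2  # assumed breakpoint; in practice read off the plot or set by domain knowledge

# The hinge term is 0 before the knot and (distance - knot) after it,
# so its coefficient is the change in slope at the knot.
fit <- lm(price ~ distance + pmax(distance - knot, 0), data = sim)

b <- unname(coef(fit))
slope_before <- b[2]          # slope for distance <= knot
slope_after  <- b[2] + b[3]   # slope for distance > knot
c(before = slope_before, after = slope_after)
```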

Page 26: Introduction to Regression Analysis

QUESTIONS

Thanks :)