Posted on 16-Apr-2017
INTRODUCTION TO REGRESSION ANALYSIS
SZILVESZTER MOLNAR
BUSINESS QUESTIONS
▸ What is the trend in my sales?
▸ As a business owner, I would like to know my average month-over-month revenue growth over the past 3 years.
TOOLS
HOTEL EXAMPLE
▸ How hotel price relates to distance from the city centre
HOTEL DATA
hotel_name            distance  price
Trump International   0.8       300
Four Seasons          1.2       178
Ritz-Carlton          1.5       200
IBIS                  1.7       70
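The R snippets on the following slides expect a data frame called `hotels`. A minimal sketch of how it could be built from the rows shown above follows. It assumes these four rows only (the full dataset is not shown on the slide), and the unit comments are assumptions as well.

```r
# Only the four rows visible on the slide; the full dataset is not shown.
hotels <- data.frame(
  hotel_name = c("Trump International", "Four Seasons", "Ritz-Carlton", "IBIS"),
  distance   = c(0.8, 1.2, 1.5, 1.7),  # distance from the city centre (km assumed)
  price      = c(300, 178, 200, 70),   # price; currency is not stated on the slide
  stringsAsFactors = FALSE
)

str(hotels)      # column names and types
summary(hotels)  # quick numeric summary of each column
```

With so few rows, any estimate computed from this sample will differ from the figures quoted later in the deck.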
PRICE DISTRIBUTION
R
library(ggplot2)  # provides qplot() and ggplot()
# Histogram of hotel prices
qplot(hotels$price, geom = "histogram",
      fill = I("deepskyblue1"),
      col = I("black"),
      xlab = "Hotel Price",
      ylab = "Count")
DISTANCE DISTRIBUTION
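The slide shows only the chart, so here is a sketch of how a matching histogram for distance might be drawn. It uses `ggplot()` with `geom_histogram()` rather than `qplot()`, rebuilds `hotels` from the four rows listed on the data slide, and picks `bins = 5` and the axis labels as assumptions.

```r
library(ggplot2)

# Four rows from the data slide; the full dataset is not shown.
hotels <- data.frame(
  distance = c(0.8, 1.2, 1.5, 1.7),
  price    = c(300, 178, 200, 70)
)

# Histogram of distance from the city centre (bin count chosen arbitrarily)
p <- ggplot(hotels, aes(x = distance)) +
  geom_histogram(bins = 5, fill = "deepskyblue1", colour = "black") +
  labs(x = "Distance from city centre", y = "Count")

print(p)
```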
RELATION?
▸ Can we see any relationship between price and distance?
SCATTERPLOT
R
# Scatterplot of price against distance
ggplot(data = hotels, aes(x = distance, y = price)) +
  geom_point(size = 1.5, colour = "deepskyblue1")
TREND IN THE DATA
R
# Scatterplot with a LOESS smoother overlaid
ggplot(data = hotels, aes(x = distance, y = price)) +
  geom_point(size = 1.5, colour = "deepskyblue1") +
  geom_smooth(method = "loess", colour = "red")
LO(W)ESS
▸ Locally Weighted Scatterplot Smoothing
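To make "locally weighted" concrete, here is a simplified sketch of the idea. For each target distance it fits a weighted straight line in which nearby hotels count more (tricube weights). This is not the exact algorithm behind `geom_smooth(method = "loess")`. The helper `local_fit` is hypothetical, and `span = 1` is an assumption chosen so that the four-row sample from the data slide is enough to fit a line.

```r
# Four rows from the data slide; the full dataset is not shown.
hotels <- data.frame(
  distance = c(0.8, 1.2, 1.5, 1.7),
  price    = c(300, 178, 200, 70)
)

# Hypothetical helper: local linear estimate of the average y at x0.
local_fit <- function(x, y, x0, span = 1) {
  d <- abs(x - x0)                          # distance of each point from the target
  k <- ceiling(span * length(x))            # number of neighbours in the window
  h <- sort(d)[k]                           # bandwidth: distance to the k-th nearest point
  w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)  # tricube weights, zero outside the window
  fit <- lm(y ~ x, weights = w)             # weighted least-squares line
  unname(predict(fit, newdata = data.frame(x = x0)))
}

# Evaluate the smoother on a grid of distances
grid <- seq(0.8, 1.7, by = 0.1)
smoothed <- sapply(grid, function(x0) local_fit(hotels$distance, hotels$price, x0))
data.frame(distance = grid, smoothed_price = smoothed)
```

Because the weights change with every target point, the fitted curve can bend, which is what separates it from a single global line.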
QUESTION
▸ How much do prices decrease, on average, as we move 1 km further away from the city centre?
LINEAR REGRESSION
▸ It's similar to LOESS, but it fits a single straight LINE through all the data.
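▸ E[y|x]: the expected (average) value of y given x
▸ E[y|x] = α + βx, where α is the intercept and β is the slope (the average change in y for a one-unit increase in x)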
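As a worked illustration, the least-squares values of α and β can be computed by hand as β = cov(x, y) / var(x) and α = mean(y) − β · mean(x). The sketch below assumes ordinary least squares (which is what `lm()` uses) and only the four rows from the data slide, so its numbers will not match estimates obtained from the full dataset.

```r
# Four rows from the data slide; the full dataset is not shown.
hotels <- data.frame(
  distance = c(0.8, 1.2, 1.5, 1.7),
  price    = c(300, 178, 200, 70)
)

x <- hotels$distance
y <- hotels$price

beta  <- cov(x, y) / var(x)        # slope: average change in y per unit of x
alpha <- mean(y) - beta * mean(x)  # intercept: the line passes through the means
c(alpha = alpha, beta = beta)

# Cross-check against R's built-in fit
coef(lm(price ~ distance, data = hotels))
```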
HOTEL EXAMPLE
E[price | distance] = 119 - 2.6 * distance
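To read this equation, plug in a few distances. The short sketch below uses the two coefficients quoted on the slide; the function name `expected_price` is hypothetical.

```r
# Coefficients as quoted on the slide: intercept 119, slope -2.6
expected_price <- function(distance) 119 - 2.6 * distance

expected_price(c(0, 1, 2))             # 119.0 116.4 113.8
expected_price(2) - expected_price(1)  # -2.6
```

Each additional unit of distance lowers the expected price by 2.6, and 119 is the expected price at distance 0.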
R
# Fit a linear regression of price on distance
regression <- lm(price ~ distance, data = hotels)
summary(regression)
# Linear regression line drawn over the scatterplot
ggplot(data = hotels, aes(x = distance, y = price)) +
  geom_point(size = 1.5, colour = "deepskyblue1") +
  geom_smooth(method = "lm", colour = "red", se = FALSE)
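Once the model is fitted, the answer to the earlier question (the average price change per extra km) is the `distance` coefficient. Here is a sketch of how it might be pulled out and used for prediction, again assuming only the four rows from the data slide, so the numbers will differ from those on the hotel example slide.

```r
# Four rows from the data slide; the full dataset is not shown.
hotels <- data.frame(
  distance = c(0.8, 1.2, 1.5, 1.7),
  price    = c(300, 178, 200, 70)
)

regression <- lm(price ~ distance, data = hotels)

# Slope: estimated average change in price per extra unit of distance
slope <- coef(regression)["distance"]
slope

# Predicted average price at 1 and 2 units from the centre
pred <- predict(regression, newdata = data.frame(distance = c(1, 2)))
pred

# Confidence intervals for intercept and slope (very wide with four points)
confint(regression)
```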
LINEAR REGRESSION TAKEAWAY
E[y|x] = α + βx
LINEAR REGRESSION ...
▸ Piecewise linear regression
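One common way to fit a piecewise linear regression is to add a "hinge" term that is zero up to a breakpoint and grows linearly after it, so the slope is allowed to change at that point. The sketch below is an illustration only. The breakpoint `knot = 1.3` is an arbitrary assumption, the column name `beyond_knot` is hypothetical, and the data are just the four rows from the data slide.

```r
# Four rows from the data slide; the full dataset is not shown.
hotels <- data.frame(
  distance = c(0.8, 1.2, 1.5, 1.7),
  price    = c(300, 178, 200, 70)
)

knot <- 1.3  # hypothetical breakpoint, chosen only for illustration

# Hinge term: 0 before the knot, (distance - knot) after it
hotels$beyond_knot <- pmax(hotels$distance - knot, 0)

piecewise <- lm(price ~ distance + beyond_knot, data = hotels)

# Slope before the knot, and slope after it (base slope plus the hinge coefficient)
slopes <- c(
  before = unname(coef(piecewise)["distance"]),
  after  = unname(coef(piecewise)["distance"] + coef(piecewise)["beyond_knot"])
)
slopes
```

In practice the breakpoint would be chosen from domain knowledge or from the shape of the LOESS curve rather than fixed in advance.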
QUESTIONS
Thanks :)