Real price predictor

REAL PRICE PREDICTOR TOC PROJECT

description

An introduction to the project we built. Check it out at https://github.com/WemyJu/TOC_proj

Transcript of Real price predictor

Page 1: Real price predictor

REAL PRICE PREDICTOR TOC PROJECT

Page 2: Real price predictor

WHO ARE WE?

Students from NCKU IIM

Page 3: Real price predictor

WHAT’S THIS?

We use the real price data from DataGarage to generate regression functions.

Page 5: Real price predictor

WHAT CAN WE DO WITH THIS?

1. Learn the market price of real estate

2. Predict the price of real estate

For example,

predict_price(Taipei City (台北市), Wenshan District (文山區), other necessary information…) = 1000000

Page 9: Real price predictor

WHY DID WE CHOOSE THIS PROBLEM?

We discussed it with students in a related department.

If they care about a problem like this,

it means we’re dealing with data that the real world really cares about!!!

or… at least real estate appraisers do.

Page 11: Real price predictor

HOW DO WE ACHIEVE IT?

Statistics + Programming

Page 12: Real price predictor

HOW DO WE ACHIEVE IT?

1. Parse the real price data

2. Classify the data

3. Generate a regression for each region

4. Predict the price

Page 13: Real price predictor

HOW DO WE ACHIEVE IT?

It sounds easy, doesn’t it?

Page 14: Real price predictor

HOW DO WE ACHIEVE IT?

The devil is in the details!!!

Page 18: Real price predictor

PARSE THE REAL PRICE DATA

We don’t use a method like this. Instead…

Page 21: Real price predictor

PARSE THE REAL PRICE DATA

If the tool already exists,

why should we write it ourselves?

Page 22: Real price predictor

PARSE THE REAL PRICE DATA

With this API, we only need to build a URL and parse the filtered data instead of the raw data.

Page 26: Real price predictor

PARSE THE REAL PRICE DATA

Take hw3 for example: suppose we want to find the records whose 土地區段位置或建物區門牌 (land section location or building address) contains 文山區 (Wenshan District).

http://www.datagarage.io/api/5365dee31bc6e9d9463a0057?selector=土地區段位置或建物區門牌=/文山區/

It’s far easier than filtering the data ourselves, isn’t it?

We parse the data from this URL and get what we want!
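The selector URL above can be assembled programmatically. The following is a minimal sketch, not the project's actual code: the helper name `build_selector_url` is hypothetical, and percent-encoding the non-ASCII field name and pattern is an assumption about what the DataGarage endpoint accepts (browsers apply the same encoding automatically).

```python
from urllib.parse import quote, unquote

# Dataset endpoint taken from the slide.
BASE_URL = "http://www.datagarage.io/api/5365dee31bc6e9d9463a0057"


def build_selector_url(field, pattern, base_url=BASE_URL):
    """Append a ?selector=<field>=/<pattern>/ query to base_url (hypothetical helper)."""
    selector = f"{field}=/{pattern}/"
    # Percent-encode the non-ASCII characters but keep '=' and '/' readable.
    return f"{base_url}?selector={quote(selector, safe='=/')}"


url = build_selector_url("土地區段位置或建物區門牌", "文山區")
print(url)           # ASCII-only URL that can be passed to an HTTP client
print(unquote(url))  # decodes back to the readable form shown on the slide
```

Fetching and parsing the response is left out on purpose, because the response format is not shown in the slides.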

Page 27: Real price predictor

PARSE THE REAL PRICE DATA

So we only parse the data we need from DataGarage.

This saves plenty of processing time.

Page 30: Real price predictor

PARSE THE REAL PRICE DATA

Although we say that we use this tool…

the fact is that…

we wrote it!!!

Page 32: Real price predictor

PARSE THE REAL PRICE DATA

The author of DataGarage merged our pull request!!!

Page 35: Real price predictor

CLASSIFY THE DATA

Well… It’s kind of hard to explain…

Let an example tell the story.

Page 36: Real price predictor

CLASSIFY THE DATA

土地區段位置或建物區門牌 : 桃園縣楊梅市金山街 298 巷 31~60 號 , 鄉鎮市區 : 楊梅市 , 交易年月 : 10302

土地區段位置或建物區門牌 : 桃園縣楊梅市金山街 298 巷 31~60 號 , 鄉鎮市區 : 楊梅市 , 交易年月 : 10302

土地區段位置或建物區門牌 : 臺中市太平區建成街 128 巷 1~30 號 , 鄉鎮市區 : 太平區 , 交易年月 : 10302

土地區段位置或建物區門牌 : 桃園縣八德市銀和街 71 巷 1~30 號 , 鄉鎮市區 : 八德市 , 交易年月 : 10302

土地區段位置或建物區門牌 : 臺中市西屯區臺灣大道四段 1261~1290 號 , 鄉鎮市區 : 西屯區 , 交易年月 : 10301

土地區段位置或建物區門牌 : 桃園縣楊梅市金山街 298 巷 31~60 號 , 鄉鎮市區 : 楊梅市 , 交易年月 : 10302

This is our raw data

Page 38: Real price predictor

CLASSIFY THE DATA

桃園縣

台中市

楊梅市

八德市

西屯區

太平區

土地…門牌 : 桃園縣楊梅市金山街 298 巷 31~60 號 , 鄉鎮市區 : 楊梅市 , 交易年月 : 10302

土地…門牌 : 桃園縣楊梅市金山街 298 巷 31~60 號 , 鄉鎮市區 : 楊梅市 , 交易年月 : 10302

土地…門牌 : 桃園縣楊梅市金山街 298 巷 31~60 號 , 鄉鎮市區 : 楊梅市 , 交易年月 : 10302

土地…門牌 : 桃園縣八德市銀和街 71 巷 1~30 號 , 鄉鎮市區 : 八德市 , 交易年月 : 10302

土地…門牌 : 臺中市太平區建成街 128 巷 1~30 號 , 鄉鎮市區 : 太平區 , 交易年月 : 10302

土地…門牌 : 臺中市西屯區臺灣大道四段 1261~1290 號 , 鄉鎮市區 : 西屯區 , 交易年月 : 10301

classifiedData['桃園縣']['楊梅市'][0] = { 土地…門牌 : 桃園縣楊梅市金山街 298 巷 31~60 號 , 鄉鎮市區 : 楊梅市 , 交易年月 : 10302 }

Page 40: Real price predictor

CLASSIFY THE DATA

How?

Regular expression!!!
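A regular-expression classification of this kind might look like the sketch below. It is an illustration under assumptions, not the project's code: the pattern `COUNTY_RE` and the function `classify` are hypothetical, the county or city is taken as the shortest address prefix ending in 縣 or 市, and the township is read from the record's own 鄉鎮市區 field rather than parsed from the address.

```python
import re
from collections import defaultdict

# Assumption: every address begins with its county/city name, which ends in 縣 or 市.
COUNTY_RE = re.compile(r"^(.+?[縣市])")


def classify(records):
    """Group records as classified[county][township] -> list of records."""
    classified = defaultdict(lambda: defaultdict(list))
    for record in records:
        match = COUNTY_RE.match(record["土地區段位置或建物區門牌"])
        if match is None:
            continue  # skip addresses that do not start with a county/city name
        classified[match.group(1)][record["鄉鎮市區"]].append(record)
    return classified


# Records copied from the raw-data slide.
raw_data = [
    {"土地區段位置或建物區門牌": "桃園縣楊梅市金山街298巷31~60號", "鄉鎮市區": "楊梅市", "交易年月": "10302"},
    {"土地區段位置或建物區門牌": "臺中市太平區建成街128巷1~30號", "鄉鎮市區": "太平區", "交易年月": "10302"},
    {"土地區段位置或建物區門牌": "桃園縣八德市銀和街71巷1~30號", "鄉鎮市區": "八德市", "交易年月": "10302"},
    {"土地區段位置或建物區門牌": "臺中市西屯區臺灣大道四段1261~1290號", "鄉鎮市區": "西屯區", "交易年月": "10301"},
]

classified_data = classify(raw_data)
print(classified_data["桃園縣"]["楊梅市"][0])
```

Using nested `defaultdict`s keeps the two-level lookup from the slide (`classifiedData[county][township][index]`) without having to create the inner containers by hand.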

Page 43: Real price predictor

GENERATE REGRESSION FOR EACH REGION

What is regression?

According to the definition on businessdictionary.com, regression is a

“Statistical approach to forecasting change in a dependent variable (sales revenue, for example) on the basis of change in one or more independent variables (population and income, for example).”

Source: http://www.businessdictionary.com/definition/regression-analysis-RA.html

Page 44: Real price predictor

GENERATE REGRESSION FOR EACH REGION

This is a regression model with two variables

Page 47: Real price predictor

GENERATE REGRESSION FOR EACH REGION

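But…

In this problem, we must consider more than two variables.

These are the fields we take into account:

土地區段位置或建物區門牌 (land section location or building address)

鄉鎮市區 (township or district)

總價元 (total price, in NT dollars)

有無管理組織 (whether there is a management organization)

建物型態 (building type)

土地移轉總面積平方公尺 (total land area transferred, in square meters)

車位移轉總面積平方公尺 (total parking space area transferred, in square meters)

建物移轉總面積平方公尺 (total building area transferred, in square meters)

建築完成年月 (year and month of construction completion)

交易年月 (year and month of transaction)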

Page 50: Real price predictor

GENERATE REGRESSION FOR EACH REGION

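This is our model:

Yi (total price in each section, 各區段總價元) =

X1i (housing price index) + X2i (building manager: 1 if present, 0 if not) + X3i (land area transferred) + X4i (parking space area transferred) + X5i (building area transferred) + X6i (building age) + X7i (residential high-rise 住宅大樓: 1 if yes, 0 if no) + X8i (studio 套房) + X9i (mid-rise apartment 華廈) + X10i (walk-up apartment 公寓) + X11i (townhouse 透天厝) + X12i (shop 店鋪)

There are 12 variables…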
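The slide lists the twelve regressors without coefficients. Written as a standard multiple linear regression, the fitted model would presumably take the form below. The intercept \beta_0, the coefficients \beta_k, and the error term \varepsilon_i are notation assumed here and do not appear on the slide; they match the shape of the illustrative equation y = -5 + 2*x1 - 2*x2 + 1.8*x3 ... used later in the deck.

```latex
Y_i = \beta_0 + \sum_{k=1}^{12} \beta_k X_{ki} + \varepsilon_i
```

Here i indexes the transactions within one region, so each region gets its own set of estimated coefficients.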

Page 51: Real price predictor

GENERATE REGRESSION FOR EACH REGION

Thanks to the great libraries for statistics in Python
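The deck does not show the fitting code. As a rough sketch of what fitting one regression per region involves, the example below uses NumPy's least-squares solver; this is a stand-in chosen for brevity, and the project itself lists statsmodels as a prerequisite, so its actual code probably differs. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def fit_ols(features, prices):
    """Fit prices ~ intercept + features by ordinary least squares."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(prices, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the slopes


def fit_per_region(samples_by_region):
    """Fit one regression per region from {region: (features, prices)}."""
    return {region: fit_ols(X, y) for region, (X, y) in samples_by_region.items()}


# Made-up rows generated exactly from y = -5 + 2*x1 - 2*x2 + 1.8*x3.
true_coef = np.array([-5.0, 2.0, -2.0, 1.8])
X = np.array([[1, 0, 10], [0, 1, 20], [1, 1, 30], [0, 0, 40], [1, 0, 50]], dtype=float)
y = true_coef[0] + X @ true_coef[1:]

models = fit_per_region({"西屯區": (X, y)})
print(np.round(models["西屯區"], 3))
```

Because the synthetic prices are generated without noise, the solver should recover the generating coefficients up to floating-point error; real transaction data would of course leave residuals.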

Page 54: Real price predictor

PREDICT THE PRICE

We’ll use the example from the previous pages.

Page 55: Real price predictor

PREDICT THE PRICE

If the user inputs an address in 台中市西屯區 (Xitun District, Taichung City),

then we’ll get

this regression

Page 58: Real price predictor

PREDICT THE PRICE

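After the user inputs these data:

有無管理組織 (whether there is a management organization)

建物型態 (building type)

土地移轉總面積平方公尺 (total land area transferred, in square meters)

車位移轉總面積平方公尺 (total parking space area transferred, in square meters)

建物移轉總面積平方公尺 (total building area transferred, in square meters)

屋齡 (building age)

交易年月 (year and month of transaction)

we’ll quantize these data.

x1 = … x2 = … x3 = … and so on.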
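The quantization step can be pictured as turning the user's answers into the twelve numbers x1 to x12 of the model. The sketch below is an assumption-laden illustration: the function `quantize`, its parameter names, and the substring matching on the building type are all hypothetical, and the housing price index is passed in directly because the slides do not say how it is looked up from the transaction date.

```python
# Building types in the order used by the model (X7 to X12).
BUILDING_TYPES = ["住宅大樓", "套房", "華廈", "公寓", "透天厝", "店鋪"]


def quantize(price_index, has_manager, land_area, parking_area,
             building_area, age, building_type):
    """Return [x1, ..., x12] for one property (hypothetical helper)."""
    # One 0/1 dummy variable per building type; substring matching is an
    # assumption that tolerates suffixes after the type name.
    dummies = [1 if name in building_type else 0 for name in BUILDING_TYPES]
    return [
        price_index,              # x1: housing price index
        1 if has_manager else 0,  # x2: management organization present
        land_area,                # x3: land area transferred
        parking_area,             # x4: parking space area transferred
        building_area,            # x5: building area transferred
        age,                      # x6: building age
    ] + dummies                   # x7 to x12: building type dummies


# Made-up input values for illustration only.
x = quantize(price_index=100.0, has_manager=True, land_area=30.0,
             parking_area=0.0, building_area=80.0, age=12,
             building_type="公寓")
print(x)
```

Encoding the building type as separate 0/1 columns follows the model slide, where each type has its own variable.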

Page 60: Real price predictor

PREDICT THE PRICE

Substitute those values into a regression like

y = -5 + 2*x1 - 2*x2 + 1.8*x3 ...

Then we’ll get the predicted price.
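The substitution step is plain arithmetic. The sketch below evaluates only the three terms that the slide's illustrative equation spells out; the remaining terms are elided on the slide and are left out here. The function name `evaluate_regression` and the input values are made up for illustration.

```python
def evaluate_regression(intercept, coefficients, values):
    """Return intercept + sum(coefficient * value) over the given terms."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one value per coefficient")
    return intercept + sum(c * x for c, x in zip(coefficients, values))


# Coefficients from the slide's example: y = -5 + 2*x1 - 2*x2 + 1.8*x3 ...
intercept = -5
coefficients = [2, -2, 1.8]

# Made-up quantized inputs x1, x2, x3.
values = [1, 0, 10]

predicted = evaluate_regression(intercept, coefficients, values)
print(predicted)
```

The length check guards against pairing a value with the wrong coefficient, which would otherwise go unnoticed because `zip` stops at the shorter list.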

Page 61: Real price predictor

HOW TO USE

Page 63: Real price predictor

PREREQUISITES

1. Python 3

2. NumPy

3. statsmodels

All the instructions are on GitHub.

Page 65: Real price predictor

PREREQUISITES

Wait a moment…

Where is this repo???

Page 66: Real price predictor

PREREQUISITES

https://github.com/WemyJu/TOC_proj/

Page 67: Real price predictor

HOW TO USE

Regression Generator

You can generate the regression information and find the result in the folder regression_resutlt.

Price predictor

Enter the values as the interactive shell asks, and you’ll get the predicted price.

If the regression functions have not been generated yet, they will be generated automatically from the default data.

Page 68: Real price predictor

FOR FURTHER INFORMATION

https://github.com/WemyJu/TOC_proj

Page 69: Real price predictor

Q & A

Page 70: Real price predictor