Boğaziçi University

Department of Management Information Systems

MIS 463 Decision Support Systems for Business

PROJECT FINAL-REPORT

GOAL PREDICTION IN FOOTBALL GAMES

Project Team No: 5

Yerzhan BERDIMBET

Sezgin DEGİRMENCİ

Yakup Can KARADENİZ

Neslişah KOCADEMİR

Instructor : Aslı Sencer

İstanbul - December, 2015

I. INTRODUCTION

In a globalized world with rapidly developing technology and a huge flow of information, people who bet on football games should be able to make sound estimates of the scores. Many bettors nevertheless estimate wrongly, and as a result lose large amounts of money. To reduce the chance of a wrong estimate, we created a decision support system that helps bettors avoid further losses by giving them better-founded predictions.

I.1 The Decision Environment

The main decision concerns goal prediction, in other words, whether a team will score in a given game. Users can also benefit from this decision indirectly when estimating the result of their team's game. In addition, bookmakers offer a bet on whether a team scores or not, which suits our system well.

Everybody who uses the system is a decision maker. The main decision makers are users who want to estimate whether their team will score, based on the information they provide before running the system.

The system decides whether a team will score against another team. It does not predict the exact game result. The user can run the system before a particular game starts.

Many independent variables affect the game result, but for some of them there is not enough information before the game starts. The squad of a team also plays a big role and cannot be known many days in advance, so the best prediction is made just before the start of a game.

Even with every independent variable that affects the game result, we cannot predict sudden events during the game, such as the injury of a star player in the first 15 minutes. It is also harder to predict whether a team scores if it was recently promoted from a lower league, because the data for such teams might be inaccurate.

A great deal of information exists about the factors that make a team score or fail to score. However, it is difficult to specify these factors, and gathering all of this information takes too much time.

Every estimate we make carries a small amount of risk that we cannot control. For example, a player can receive a red card, after which the team might play defensively and not score at all. The main cost of an erroneous estimate is the amount invested in the bet.

Betting is an enormous sector around the world in which people invest, gain, and lose huge amounts of money. Experts have developed advanced statistical models for it. Our system is designed to support the soundness of users' decisions.

I.2 Mission of Project

The mission of the project is to provide goal estimates for football games that have a high probability of being correct. Despite the difficulty of estimating the results of football games, we apply a scientific approach to provide accurate estimates to people who bet and to others interested in this kind of information. In this way, we aim to protect people from further large losses.

As a sub-goal, beyond preventing losses, we can also try to increase users' profits from bets. Moreover, by developing the system and the project further, we will be able to expand to other types of sports.

I.3 Scope of Project

Our goal prediction system focuses on the Turkish Superlig. We predict whether a team from this league scores or does not score in its upcoming game. The data were gathered from the teams' matches over the previous 4 years.

I.4 Methodology

We implemented a data mining approach with decision tree classification in this project. Decision trees are commonly used in data mining to create a model that predicts the value of a target (dependent) variable based on the values of several input (independent) variables. In our case the target is 1 if a team will score and 0 if it will not.

After creating the database of the system, we used IBM SPSS to develop a model, namely a decision tree, on which we based our code and user interface.

Football game scores from former seasons, team costs, league points, and bet rates were collected from websites and entered manually into the database. After filtering, processing, and normalizing the data, we applied the decision tree approach, which helps us decide whether a team scores in an upcoming game.

Then we designed and developed a user-friendly graphical interface, which lets users select teams from the Turkish Superlig and see the estimate by clicking a button.

II. LITERATURE SURVEY

For this literature survey, we reviewed former studies on football match prediction and similar statistical studies. In the internet age, betting companies have become widespread, and many people bet online on games ranging from football to basketball and even ice hockey. However, most people make wild guesses on these games without weighing the possibilities properly, so they lose money and reduce their chance of profit. Gomes, Portela, and Santos (2015) aimed to help users increase their profits on football bets by advising them which bet to choose. In this study we likewise aim to help people who lose money on bets because of limited information, by giving them better choices that minimize their losses and increase their profits. In a similar study, Stanford graduates Cheng, Dade, Lipman, and Mills (2013) tried to predict NBA game outcomes more accurately than the experts who set the betting lines, and thereby to offer better predictions than professional analysts and basketball experts.

For data collection, taking previous years' data from websites is a popular and important approach. We collected our data from www.canliskor.com, and other researchers on this topic mostly took their data from sports websites as well. Liu and Lai (2010) obtained their data from www.cfbstats.com, an American college football statistics repository. Not only for American football but also for football (soccer) and basketball, previous researchers almost always turned to sports websites for match scores and statistical records: Haghighat, Rastegari, and Nourafza (2013) collected data from reliable websites dedicated to the specific sport, and Langseth (2013) collected English Premier League data from www.football-data.co.uk.

All of the variables in our study are supported by previous studies: former researchers in this area used all or some of our variables. The first and major commonality is that all the researchers took the match result records of previous years for football teams, basketball teams, and so on. In our study, we collected records from 4 seasons of the Spor Toto Süper Lig of Turkey, just as Gomes et al. (2013) and Langseth (2013) collected records of the English Premier League.

The first and most important variable in our study is the goals scored by the home team and the away team in a game. This variable is an essential part of our study and was collected by almost all previous researchers in this area, because the scores of teams in previous years are an important source for predicting the scores of future games and deciding whether a team will win or lose. This holds not only for football but also for basketball and the NHL (National Hockey League): Cheng et al. (2013) collected basketball game scores from previous years, and Pischedda (2014) used NHL scores from previous seasons.

The second variable of our study is the position of the team in the league, because we think that position reflects the performance of the team in that season. Joseph, Fenton, and Neil (2006) likewise take the position of a team in the league into consideration when predicting football game results, and Pischedda (2014) paid attention to the position of the teams within the NHL.

Thirdly, we added bet rates to our research, because the bookmakers who set bet rates consider data that we cannot easily reach and evaluate, such as current injuries of important players, players suspended due to red or yellow cards, the resignation of a coach, and the transfer of a player in the middle of the season. For example, Gomes et al. (2013) used betting rates from several betting sources, not just one, as a variable in their decision support system for predicting football game results.

The fourth and last variable is team value, which is the total of the values of the team's players. A player's value reflects the player's abilities, performance, goals scored, and so on, so if a team has more valuable players than the other team, its chance of winning and scoring is higher. Joseph et al. (2006) used team quality in their project. Although this variable is not the same as ours, it is similar, because they take the value of the team and the players' quality and performance as indications of team quality.

Although it is not required for this part, we would also like to mention the methodologies of former studies and our own. In this study we use a decision tree because it fits the format of our topic. Decision trees can handle both nominal and numeric input attributes, and the decision tree representation is rich enough to represent any discrete-value classifier. As the overview chart below shows, the decision tree is one of the most popular methods in this type of study and was used in several of the previous studies. For example, Gomes et al. (2013) used a decision tree in their football prediction study, and Pischedda (2014) used a decision tree for the NHL, whose game rules and principles are more complex than those of football. These are just a few examples; our other referenced articles provide more.

An Overview Chart of Variables Used in Former Studies and Mutual Ones Used in Our Study

Study: Decision Support System for Predicting Football Game Result. Gomes, J., Portela, F., Santos, M. F. (2013)
Methods: Naïve Bayes, Decision Tree, Support Vector Machine
Variables: Date, home/away team, full-time home goals, full-time away goals, half-time home goals, half-time away goals, home team shots, away team shots, fouls, corners, red and yellow cards, betting odds…
Mutual: Full-time home score and away score (simply the final score of the match); betting odds

Study: Predicting the Betting Line in NBA Games. Cheng, B., Dade, K., Lipman, M. & Mills, C. (2013)
Methods: Support Vector Machine
Variables: Home team score, away team score, fouls, blocks, rebounds, attendance…
Mutual: Home team score; away team score

Study: NHL Match Outcomes with ML Models. Pischedda, G. (2014)
Methods: Decision Tree and Multi-Layer Artificial Neural Networks
Variables: Goals for, goals against, goal differential, power play success rate, power kill success rate, shot %, save %, winning streak, league position…
Mutual: Goals for (equivalent of home score); goals against (equivalent of away score); league position

Study: Predicting Football Results Using Bayesian Nets and Other Machine Learning Techniques. Joseph, A., Fenton, N. E., Neil, M. (2006)
Methods: Bayesian Nets and Decision Trees
Variables: Final league position, average performance, home or away, representative quality of attacking force (high, medium, low), team quality…
Mutual: League position; team quality (can be considered similar to the value of the team)

Study: Beating the Bookie: A Look at Statistical Models for Prediction of Football Matches. Langseth, H. (2013)
Methods: Maher model and Gaussian model in combination with the aggressive Markowitz strategy
Variables: Number of goals saved, shots fired, shots on target, away team's defensive ability, attacking strength of home team, home advantage…
Mutual: Attacking strength of team (extracted from the number of goals scored in previous games, so closely similar to our home and away team scores)

Study: Predicting Sports Events from Past Results. Buursma, D. (2011)
Methods: Classification via Regression
Variables: Goals scored by home team, goals scored by away team, goals conceded by home team, goals conceded by away team, average number of points gained by home and away teams
Mutual: Goals scored by home team; goals scored by away team; goals conceded by home team; goals conceded by away team; number of points gained by teams in the league

III. DEVELOPMENT OF THE DSS

III.1 DSS Architecture

First, a user selects the team for which he wants to estimate whether it will score or not. Then the user decides whether the team plays at home or away, and selects the rival team.

Then, according to the inputs from the user, the system evaluates the possibility that the selected team scores and the possibility that the rival team concedes.

Finally, the system displays on the screen whether the selected team will score or not.
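
The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's JavaScript code: the function name predict_selected_team, the 0.5 cut-off for "will score", and the two stand-in trees with constant outputs are all hypothetical.

```python
from typing import Callable, Dict

# A "tree" here is any function that maps a feature dict to P(team scores).
Tree = Callable[[Dict[str, float]], float]


def predict_selected_team(venue: str, features: Dict[str, float],
                          home_tree: Tree, away_tree: Tree) -> dict:
    """Route the request to the tree that matches the selected team's venue."""
    if venue not in ("home", "away"):
        raise ValueError("venue must be 'home' or 'away'")
    tree = home_tree if venue == "home" else away_tree
    probability = tree(features)
    # Assumed cut-off: report "will score" when the probability is at least 0.5.
    return {"will_score": probability >= 0.5, "probability": probability}


# Stand-in trees with made-up constant outputs, only to exercise the routing.
demo_home_tree = lambda features: 0.8
demo_away_tree = lambda features: 0.4

result = predict_selected_team("away", {}, demo_home_tree, demo_away_tree)
print(result)
```

The point of the sketch is the routing step: the user's home/away choice determines which of the two models is consulted.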

III.2 Technical Issues

To develop the model, we used decision tree classification with a data mining approach in IBM SPSS, and then wrote our code in JavaScript based on this model. We developed a user-friendly interface using a template.

The system is a web-based application. As the database we use an Excel file containing the variables, which we update when necessary.

III.3 Data Source and Flow Mechanisms

The inputs for the independent variables are obtained from the www.canliskor2.com and www.mackolik.com websites, then calculated and entered manually.

Most of the data are collected from www.canliskor2.com: game scores, bet rates, team ranks, team points, etc. The normalized variables are also derived from these data.

The screenshot shows how the data look on the website.

If you click on any game, the section “Oran Karşılaştırması” (odds comparison) shows the bet rates.

We obtained team costs from www.mackolik.com. If you click on a team, the market value of its players is shown as “P. Değeri”. To find the market values of players in a different season, select the related season in the top-right corner.

The bet rates for upcoming games are entered by the user on our website, because bet rates change and are not available several days in advance of the game.

Our database is an Excel file; the screenshot shows part of it.

III.4 Model and Algorithms

In our project we use data mining classification with the decision tree technique in IBM SPSS. A decision tree is a structure that includes a root node, branches, and leaf nodes. Each internal node denotes a test on an attribute, each branch denotes an outcome of the test, and each leaf node holds a class label. The topmost node in the tree is the root node. Tree models in which the target variable can take a finite set of values are called classification trees. A decision tree classifies cases into groups or predicts values of a dependent (target) variable based on the values of independent (predictor) variables. The SPSS procedure also provides validation tools for exploratory and confirmatory classification analysis.
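
The structure just described (a root, internal nodes that test an attribute, branches for the outcomes, and leaves that hold a class label) can be written down as a small data structure. The following Python sketch is illustrative only: the Node class, the classify function, the attribute name goals_per_game, and the threshold in the toy tree are invented for the example and are not taken from the SPSS model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    # Leaf nodes carry a class label; internal nodes carry a test on an attribute.
    label: Optional[int] = None
    attribute: Optional[str] = None
    # Each branch is (upper_bound, child) and is taken when value <= upper_bound.
    branches: Optional[List[Tuple[float, "Node"]]] = None


def classify(node: Node, case: Dict[str, float]) -> int:
    """Walk from the root to a leaf and return the leaf's class label."""
    while node.label is None:
        value = case[node.attribute]
        for upper_bound, child in node.branches:
            if value <= upper_bound:
                node = child
                break
        else:
            raise ValueError("no branch covers this value")
    return node.label


# Toy tree with an invented threshold, for illustration only.
toy_tree = Node(
    attribute="goals_per_game",
    branches=[(1.0, Node(label=0)), (float("inf"), Node(label=1))],
)

print(classify(toy_tree, {"goals_per_game": 0.7}))
print(classify(toy_tree, {"goals_per_game": 1.8}))
```

Ordered upper bounds are used here because the trees in this report split numeric variables into consecutive intervals; a nominal attribute would need a lookup by category instead.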

We use two decision trees, because we have two different dependent variables to estimate. The first tree estimates whether the home team scores, and the second estimates whether the away team scores. In both cases the dependent variable is binary: 1 stands for “scores” and 0 stands for “does not score”. Both decision trees use the same 15 independent variables:

1. Team 1 (home team)

2. Team 2 (away team)

3. Bet 1 (bet rate for win of home team)

4. Bet X (bet rate for draw)

5. Bet 2 (bet rate for win of away team)

6. Team 1 cost (market cost of home team players)

7. Team 2 cost (market cost of away team players)

8. Team 1 rank (rank of home team in the league)

9. Team 2 rank (rank of away team in the league)

10. Team 1 points / played games (how many points on average home team gains)

11. Team 2 points / played games (how many points on average away team gains)

12. Home goal scored ave / game (how many goals on average home team scores)

13. Home goal conceded ave / game (how many goals on average home team concedes)

14. Away goals scored ave / game (how many goals on average away team scores)

15. Away goals conceded ave / game (how many goals on average away team concedes)
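
Variables 10 to 15 in the list above are per-game averages, which can be derived from running season totals. The Python sketch below shows one way to compute them. The totals are invented, the dictionary keys and the per_game helper are hypothetical names, and it is an assumption that the averages are taken over the games played so far; the report does not spell out the exact window.

```python
def per_game(total: float, games_played: int) -> float:
    """Average per played game; 0.0 before the first game to avoid dividing by zero."""
    return total / games_played if games_played else 0.0


# Invented season-to-date totals for one hypothetical home team.
points, goals_scored, goals_conceded, games = 21, 18, 9, 12

features = {
    "Team1PointsPerGame": per_game(points, games),
    "HomeGoalScoredAvg": per_game(goals_scored, games),
    "HomeGoalConcededAvg": per_game(goals_conceded, games),
}
print(features)
```

Dividing by games played puts teams that have played different numbers of games on the same scale, which is the role the averaged variables play in the model.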

The first decision tree’s dependent variable is Team1Scoredbinary (home team scored or not), and the second decision tree’s dependent variable is Team2Scoredbinary (away team scored or not).

By using this model and then combining the estimates from both decision trees, we give more accurate information on whether a team will score or not.

The following decision trees model the concept “team scores or not”, which indicates whether a team from the Turkish Superlig is likely to score a goal. Each internal node represents a test on an attribute, and each leaf node represents a class.

Classification Tree for Home Team (Team 1)

Model Summary

Specifications
  Growing Method: CHAID
  Dependent Variable: Team1Scoredbinary
  Independent Variables: Team1, Team2, Bet1, BetX, Bet2, Team1Cost, Team2Cost, Team1Rank, Team2Rank, Team1Pointsplayedgamesintheseason, Team2Pointsplayedgamesintheseason, Homegoalscoredavegame, Homegoalconcededavegame, Awaygoalsscoredavegame, Awaygoalsconcededavegame
  Validation: None
  Maximum Tree Depth: 3
  Minimum Cases in Parent Node: 30
  Minimum Cases in Child Node: 10

Results
  Independent Variables Included: Homegoalscoredavegame, Team2Pointsplayedgamesintheseason, Awaygoalsconcededavegame, Awaygoalsscoredavegame, BetX, Bet2
  Number of Nodes: 23
  Number of Terminal Nodes: 16
  Depth: 3

Risk (Growing Method: CHAID; Dependent Variable: Team1Scoredbinary)
  Estimate: .213
  Std. Error: .011

Classification (Growing Method: CHAID; Dependent Variable: Team1Scoredbinary)

  Observed             Predicted .0   Predicted 1.0   Percent Correct
  .0                   103            208             33.1%
  1.0                  71             930             92.9%
  Overall Percentage   13.3%          86.7%           78.7%
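
The figures in the Risk and Classification tables can be reproduced from the four cell counts. The Python sketch below recomputes them as a consistency check. The function name summarize and the dictionary keys are hypothetical, and the binomial standard-error formula is an assumption about how the reported value is obtained; it is used here because it matches the reported .011.

```python
import math


def summarize(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Accuracy, risk (misclassification rate), its standard error, and percent correct per class."""
    n = tn + fp + fn + tp
    accuracy = (tn + tp) / n
    risk = 1 - accuracy
    return {
        "n": n,
        "accuracy": accuracy,
        "risk": risk,
        # Assumed formula: binomial standard error of the risk estimate.
        "risk_std_error": math.sqrt(risk * (1 - risk) / n),
        "correct_when_not_scored": tn / (tn + fp),
        "correct_when_scored": tp / (fn + tp),
    }


# Cell counts from the home-team classification table (observed by predicted).
home = summarize(tn=103, fp=208, fn=71, tp=930)
for name, value in home.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

The risk estimate is simply one minus the overall percent correct, so the two tables carry the same information about overall accuracy.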

Here, the decision tree first divides the root node “Team1scoredornot” into 4 nodes based on “Homegoalscoredavegame”, that is, how many goals the home team scored per game on average.

a) The first node has the condition “Homegoalscoredavegame <= 0.88”. It contains 117 samples in total, of which 50.4% (59 samples) have “Team1scoredornot = 1”.

b) The second node has the condition “Homegoalscoredavegame > 0.88 and Homegoalscoredavegame <= 1.24”. It contains 293 samples in total, of which 69.3% (203 samples) have “Team1scoredornot = 1”.

c) The third node has the condition “Homegoalscoredavegame > 1.24 and Homegoalscoredavegame <= 1.71”. It contains 490 samples in total, of which 77.1% (378 samples) have “Team1scoredornot = 1”.

d) The fourth node has the condition “Homegoalscoredavegame > 1.71”. It contains 412 samples in total, of which 87.6% (361 samples) have “Team1scoredornot = 1”.

The same logic applies to the remaining nodes.
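
The four first-level nodes amount to a lookup from the home team's scoring average to a share of games in which it scored. A minimal Python sketch of that lookup follows. It covers only the first level of the tree, the names THRESHOLDS, P_SCORES, and root_split_probability are hypothetical, and the project's own implementation is in JavaScript and follows the tree to its full depth.

```python
import bisect

# Upper bounds of the first three intervals; the fourth is everything above 1.71.
THRESHOLDS = [0.88, 1.24, 1.71]
# Share of samples with Team1scoredornot = 1 in nodes a) to d).
P_SCORES = [0.504, 0.693, 0.771, 0.876]


def root_split_probability(home_goal_scored_avg: float) -> float:
    """First-level estimate only; the full tree refines it with two more levels."""
    # bisect_left keeps a value equal to a threshold in the lower node (<=).
    node = bisect.bisect_left(THRESHOLDS, home_goal_scored_avg)
    return P_SCORES[node]


print(root_split_probability(2.0))
print(root_split_probability(0.88))
```

A home team averaging 2.0 goals falls in node d), and a team averaging exactly 0.88 stays in node a) because that node's condition is "<= 0.88".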

For example, suppose we want to estimate whether a home team will score, and the home team scores 2 goals per home game on average.

Following the tree model, we find that the percentage for outcome 0 (the team does not score) is 12.4 and the percentage for outcome 1 (the team scores) is 87.6. The P-value of the split is 0.000, which indicates a statistically significant split. Thus, we say that the team will score with a probability of around 87.6%.
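
To see why a P-value this small is plausible, one can run a plain Pearson chi-square test on the counts reported for nodes a) to d). The Python sketch below does that with the standard library only. It is a simplified check and not the exact SPSS computation: CHAID also merges categories and adjusts its p-values, so the reported figure need not equal the one computed here.

```python
import math

# (not scored, scored) counts in the four child nodes of the root split,
# derived from the node sizes and "scored" counts listed for nodes a) to d).
nodes = [(117 - 59, 59), (293 - 203, 203), (490 - 378, 378), (412 - 361, 361)]

n = sum(a + b for a, b in nodes)
col_totals = [sum(a for a, _ in nodes), sum(b for _, b in nodes)]

chi2 = 0.0
for row in nodes:
    row_total = sum(row)
    for observed, col_total in zip(row, col_totals):
        expected = row_total * col_total / n
        chi2 += (observed - expected) ** 2 / expected

# Survival function of the chi-square distribution with 3 degrees of freedom
# ((4 - 1) rows times (2 - 1) columns), written in closed form.
p_value = math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.1f}, p-value = {p_value:.2e}")
```

The statistic is far above the usual critical values for 3 degrees of freedom, so the p-value rounds to 0.000 at three decimals, in line with the value shown for the split.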

Classification Tree for Away Team (Team 2)

Model Summary

Specifications
  Growing Method: CHAID
  Dependent Variable: Team2Scoredbinary
  Independent Variables: Team1, Team2, Bet1, BetX, Bet2, Team1Cost, Team2Cost, Team1Rank, Team2Rank, Team1Pointsplayedgamesintheseason, Team2Pointsplayedgamesintheseason, Homegoalscoredavegame, Homegoalconcededavegame, Awaygoalsscoredavegame, Awaygoalsconcededavegame
  Validation: None
  Maximum Tree Depth: 3
  Minimum Cases in Parent Node: 30
  Minimum Cases in Child Node: 10

Results
  Independent Variables Included: Awaygoalsscoredavegame, Homegoalconcededavegame, BetX
  Number of Nodes: 16
  Number of Terminal Nodes: 10
  Depth: 3

Risk (Growing Method: CHAID; Dependent Variable: Team2Scoredbinary)
  Estimate: .293
  Std. Error: .013

Classification (Growing Method: CHAID; Dependent Variable: Team2Scoredbinary)

  Observed             Predicted .0   Predicted 1.0   Percent Correct
  .0                   57             356             13.8%
  1.0                  28             871             96.9%
  Overall Percentage   6.5%           93.5%           70.7%

The logic of analyzing the tree and obtaining a probability is the same as for the “Classification Tree for Home Team”; only the decision tree itself is different.
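
One way to read the away-team classification table is to compare the tree's overall percent correct with a naive rule that always predicts the more frequent outcome. The Python sketch below makes that comparison from the four cell counts. The comparison is an illustration added here, not part of the original analysis, and the variable names are hypothetical.

```python
# Observed-by-predicted counts from the away-team classification table.
not_scored_pred_0, not_scored_pred_1 = 57, 356
scored_pred_0, scored_pred_1 = 28, 871

n = not_scored_pred_0 + not_scored_pred_1 + scored_pred_0 + scored_pred_1
tree_accuracy = (not_scored_pred_0 + scored_pred_1) / n

# Baseline: always predict the more frequent class ("away team scores").
baseline_accuracy = (scored_pred_0 + scored_pred_1) / n

print(f"tree accuracy:     {tree_accuracy:.1%}")
print(f"baseline accuracy: {baseline_accuracy:.1%}")
```

Because away teams scored in most of the recorded games, the overall percentage alone says little about how well the tree separates the two outcomes; the per-class percentages in the table are the more informative figures.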

III.5 User Interface and Reports

This is our homepage. It has a “HOW IT WORKS?” button that takes you to the related part.

This part explains how making a prediction on the site works. First you select the home and away teams, then you enter the bet rates for the upcoming game to make the prediction more accurate, and then you get the results with their probabilities of occurrence.

When you click the “START!” button, you are taken to the prediction part of the site.

This is the prediction part. Here you select the home and away teams and enter the bet rates of the game for the selected teams. After entering Bet 1 (bet rate for a win of the home team), Bet X (bet rate for a draw), and Bet 2 (bet rate for a win of the away team), you click the “SHOW RESULT” button and the prediction is shown.

As an example, take Besiktas vs. Mersin with Bet 1 = 1.50, Bet X = 2.40, and Bet 2 = 4.70.

Now you can see the results. For this example, the probability that Besiktas will score is 84.8%, and the probability that Mersin will score is 55.8%. The calculations are based on the decision tree models we developed, which are implemented in our code.

The 4 team members are listed in this part of the site, together with the fields each of us worked on.

IV. ASSESSMENT

First, we formed a team and discussed the project topics we could choose for this course. After deciding on the project topic, we made a project plan and a master plan, which we updated periodically.

We chose a project topic that required a methodology that had not been used before in this course, so we could not consult former MIS 463 students. It was therefore challenging for our team to be the first group to use a data mining approach with decision tree classification. After meeting a professor from the related field of study, we started our investigation. Another challenge was the size of our team: we worked on the project with just 3 members, and we built it as a web-based project, which is normally required of teams with 5 members.

We expected the coding part to be really hard for our team, but by studying hard and combining the fields each of us is best at, we were able to follow the master plan and finish our project on time. We are also very thankful to our coding instructors, especially the instructor of our web-based application programming course, because we consulted them often and they provided critical information when we could not find answers to our problems.

V. PROJECT PLAN

We meet twice a week, generally on Fridays and Wednesdays. Since we take almost the same courses, we have a chance to see each other and talk about our project every day. On weekends, we use online co-working tools.

We use the decision tree methodology. Since no database was available, we had to create our database manually. We initially decided to work on a database of 3 leagues, but Aslı Sencer advised us to narrow the scope of the project and work on 1 league, so we decided to work on the Spor Toto Süper Lig. Creating the database manually took a very long time. We changed our task allocation, and all of the group members entered data into the database. We entered all match scores and bet rates from the 2011-2012 to the 2015-2016 seasons. We developed the classifying decision tree model in IBM SPSS and used it in the background of our website.

Meeting Place: Hisar Campus

Meeting Time: Fridays, 13:00 – Wednesdays, 14:00

Coordinator: Yerzhan Berdimbet

Task Allocation:

Development: Neslişah Kocademir, Sezgin Değirmenci, Yakup Can Karadeniz, Yerzhan Berdimbet

Design: Yakup Can Karadeniz

Creating algorithms: Yerzhan Berdimbet

Database: Sezgin Değirmenci, Yakup Can Karadeniz, Neslişah Kocademir, Yerzhan Berdimbet

Documentation: Neslişah Kocademir

Data Collection: Sezgin Değirmenci

MASTER PLAN

Project Code: Group 5

Project Title: Goal Prediction in football games

Team Members: Yerzhan Berdimbet, Neslisah Kocademir, Sezgin Degirmenci, Yakup Can Karadeniz

Phase                                                          Planned Start   Planned Finish   Actual Start   Actual Finish   Complete %
Team Formation                                                 29 Sept.        9 Oct.           29 Sept.       5 Oct.          100
Project Proposal                                               10 Oct.         25 Oct.          10 Oct.        25 Oct.         100
  Problems: We had a problem writing decision-tree solutions in Excel cells, but we handled this problem via IBM SPSS.
Presentation                                                   15 Oct.         26 Oct.          20 Oct.        26 Oct.         100
Literature Review (Library, Web, former studies)               6 Oct.          26 Oct.          6 Oct.         26 Oct.         100
Interviews with experts, decision makers in the related area   6 Oct.          26 Oct.          6 Oct.         24 Oct.         100
Development of the model                                       10 Oct.         26 Oct.          6 Oct.         26 Oct.         100
Midreport                                                      29 Oct.         22 Nov.          29 Oct.        22 Nov.         100
Presentation                                                   19 Nov.         22 Nov.          19 Nov.        22 Nov.         100
Data Collection and Organization                               10 Nov.         15 Nov.          10 Nov.        18 Nov.         100
Coding interfaces                                              20 Nov.         10 Dec.          20 Nov.        19 Dec.         100
Validation (Optional)
Final Report                                                   10 Dec.         15 Dec.          20 Dec.        21 Dec.         100
Presentation                                                   15 Dec.         20 Dec.          21 Dec.

VI. CONCLUSION

We created the “Goal Prediction in Football Games” decision support system, and in general it works as we planned.

There was no delay in the project. We had to spend more time than expected on creating our database, but we completed the tasks on schedule. To make good use of our limited time, we set new deadlines for the following steps. The planned and actual dates of the steps of our project are stated in the master plan.

A weakness of our DSS is that we have to enter data manually to keep it up to date. In the future, we could connect it to a suitable database so that bet rates and game scores are pulled automatically to keep the variables current. Also, although 15 independent variables work well for this project, using more could improve accuracy.

Another weakness of this DSS is that at the beginning of the season there is little data, so we need to wait until several games have been played before making a prediction. Also, when a team has just been promoted from the lower league and has not played in the main league for years, it is harder to make a correct prediction. However, this problem is not specific to our DSS; it affects every DSS project in this field that we examined in the literature review.

On the other hand, the main strengths of this DSS are its up-to-date results and user-friendly interface. Most former studies in this field depend only on data from previous years, whereas our project also uses current bet rates. If an important player of a team is injured or receives a red card a few days before the next game, this affects the bet rates of the team, and by using these current bet rates we obtain more current results. Another difference of our project is its user-friendly interface, which is well designed and simple enough that even a child can understand it. We focused on the interface because similar websites have complicated designs that not everyone can understand, since they contain many symbols and much jargon, as we showed in the class presentation.

REFERENCES

------------Journal paper--------------------

[1] Gomes, J., Portela, F. and Santos, M. F. (2016) “Real-Time Data Mining Models to Predict Football 2-Way Result”, Jurnal Teknologi, Penerbit UTM Press. (accepted for publication)

[2] Haghighat, M., Rastegari, H. and Nourafza, N. (2013) “A Review of Data Mining Techniques for Result Prediction in Sports”, ACSIJ Advances in Computer Science: an International Journal, vol.2, issue 5, no.6, pp.1-6.

[3] Joseph, A., Fenton, N. E. and Neil, M. (2006) “Predicting Football Results Using Bayesian Nets and Other Machine Learning Techniques”, Knowledge Based Systems, vol.19, no.7, pp.544-553.

[4] Pischedda, G. (2014) “Predicting NHL Match Outcomes with ML Models”, International Journal of Computer Applications, vol.101, no.9, pp.15-22.

----------Conference Proceedings---------

[5] Buursma, D. (2011) “Predicting Sports Events from Past Results”, 14th Twente Student Conference on IT, Enschede, Netherlands.

[6] Langseth, H. (2013) “Beating the Bookie: A Look at Statistical Models for Prediction of Football Matches”, presented at the 12th Scandinavian AI Conference, Aalborg, Denmark.

-----------Web sources-----------------------

[7] Cheng, B., Dade, K., Lipman, M. and Mills, C. (2013) “Predicting the Betting Line in NBA Games”. Retrieved from Stanford University, Program on Computer Science Web Site: http://cs229.stanford.edu/proj2013/ChengDadeLipmanMills-PredictingTheBettingLineInNBAGames.pdf

[8] Liu, B. and Lai, P. (2010) “Beating the NCAA Football Point Spread”. Retrieved from Stanford University, Program on Computer Science Web Site: http://cs229.stanford.edu/proj2010/LiuLai-BeatingTheNCAAFootballPointSpread.pdf