DEGREE PROJECT IN ELECTRICAL ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2016

Knowledge discovery and machine learning for capacity optimization of Automatic Milking Rotary System

TIAN XIE

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL ENGINEERING


Knowledge discovery and machine learning for capacity optimization of Automatic Milking Rotary System

TIAN XIE

Master in System on Chip Design
Date: December 2016
Supervisor: Peter Mellgren (DeLaval), Saikat Chatterjee (KTH)
Examiner: Saikat Chatterjee
School of Electrical Engineering


Abstract

Dairy farming, as one part of agriculture, has a history of thousands of years. The increasing demand for dairy products and the rapid development of technology have brought dairy farming tremendous changes. Starting with hand milking, dairy farming has progressed through vacuum bucket milking, pipeline milking, and now parlour milking. Automatic milking systems provide farmers with high-efficiency milking, effective herd management and, above all, growing income.

DeLaval Automatic Milking Rotary (AMRTM) is the world's leading automatic milking rotary system. It presents an ultimate combination of technology and machinery which brings dairy farming significant benefits. The AMRTM technical milking capacity is 90 cows per hour. However, constrained by farm management, cow condition and system configuration, the actual capacity is lower than the technical value. In this thesis, an optimization system is designed to analyze and improve AMRTM performance. The research focuses on cow behavior and AMRTM robot timeout. By applying knowledge discovery in databases (KDD), building a machine learning cow behavior prediction system and developing modeling methods for system simulation, optimizing solutions are proposed and validated.

Keywords: Dairy farming, DeLaval Automatic Milking Rotary (AMRTM), capacity, cow behavior, robot timeout, KDD, machine learning, modeling.


Sammanfattning

Dairy farming is part of our agriculture's thousands of years of history. Growing demand for dairy products, together with the rapid development of technology, has brought enormous changes to dairy farming. Milk production began with hand milking; milking methods have since evolved through different techniques, such as vacuum milking and pipeline milking, up to today's milking rotary. Automatic milking systems now provide farmers with highly efficient milking, effective herd management and, above all, growing income.

DeLaval Automatic Milking Rotary (AMRTM) is the world's leading automatic milking rotary system. It presents an ultimate combination of technology and machinery that gives dairy farming significant benefits. The technical milking capacity of the AMRTM is 90 cows per hour, but it is constrained by farm management, cow condition and system configuration, so the actual capacity is lower than the technical value. This thesis investigates how an optimization system can analyze and improve AMRTM performance by focusing on cow behavior and robot timeout. By applying knowledge discovery in databases (KDD), building a machine learning system that predicts cow behavior, and developing modeling methods for system simulation, optimization solutions are proposed and validated.

Keywords: Dairy farming, DeLaval Automatic Milking Rotary, capacity, cow behavior, robot timeout, KDD, machine learning, modeling methods.


Acknowledgements

This thesis is the result of eight months of work. I would like to thank everyone who supported and helped me along the way to my master's degree.

First and foremost, I am grateful for the guidance and support from my DeLaval supervisor Peter Mellgren. His visionary thinking and patience led me in the right direction and gave me the opportunity to realize my ideas. I would also like to thank my KTH supervisor Saikat Chatterjee, who provided me with significant opinions and thoughtful advice.

The DeLaval AMRTM department gave me a creative and positive working environment, and I really enjoyed every day of it. I thank Thomas Olsson for providing me with a comprehensive database, Arto Rajala and Fredrik Kange for letting me join the badminton club, and all my colleagues for their help in improving my thesis work.

These two years of study at KTH taught me a lot and gave me countless memories. I wish all my friends a good future.

Last but not least, I would like to thank my family and Congyu for their love, understanding and support. Thank you all for making me a better person.


Contents

1 Introduction
  1.1 Background
  1.2 AMRTM capacity between ideal and reality
  1.3 Purpose
  1.4 Goals
  1.5 Methodology
  1.6 Outline

2 Theoretic Background
  2.1 AMRTM general description
  2.2 Database

3 Methods
  3.1 Data analysis
  3.2 Machine learning classification
    3.2.1 Binary classification
    3.2.2 Decision tree
    3.2.3 Support vector machine
    3.2.4 Extreme learning machine (ELM)
  3.3 AMRTM system simulation

4 AMRTM optimization system design
  4.1 KDD process
  4.2 Bad cow definition
  4.3 Robot timeout analysis
  4.4 Machine learning classification
    4.4.1 Creating database
    4.4.2 Decision tree
    4.4.3 Support vector machine
    4.4.4 Extreme learning machine
  4.5 AMRTM system simulation

5 Result
  5.1 Bad cow definition
    5.1.1 Single-variable definition
    5.1.2 Multi-variable definition
  5.2 Machine learning prediction
    5.2.1 Decision tree
    5.2.2 Support vector machine
    5.2.3 Extreme learning machine
  5.3 Robot timeout
  5.4 AMRTM system simulation

6 Discussion
  6.1 Bad cow selection
    6.1.1 Bad cow classification with single variable
    6.1.2 Bad cow classification with multi-variable
  6.2 Machine learning classification
  6.3 Comparison on different optimizing levels
  6.4 Robot timeout

7 Conclusion and future work
  7.1 Conclusion
  7.2 Future work

Bibliography

A Simulation time and capacity
B ACA2 Robot Timeout Analysis


Chapter 1

Introduction

1.1 Background

DeLaval is a world leader in the dairy farming industry, providing integrated milking solutions designed to improve dairy farmers' production, animal welfare and overall quality of life [1]. DeLaval provides solutions for customers in more than 100 countries, including milking systems, cooling and feeding systems, housing systems and farm management support systems. The company has over a century of history, since Gustaf de Laval founded it in 1883. Today, DeLaval has more than 4600 employees and achieves about 1 billion EUR in net annual sales.

DeLaval AMRTM is the world's first automatic milking rotary system. It is designed to accelerate customers' transition from milking management to general farm management. AMRTM provides customers with more efficient labour usage, better cow health and higher milk quality at a lower milk harvesting cost. The system is intended for loose housing or grassland farms with more than 300 lactating cows [2], in both voluntary and batch milking.

The AMRTM bidirectional herringbone rotary platform has 24 bails. With five hydraulic functional robots equipped (TPM1, TPM2, ACA1, ACA2 and TSM, introduced in Chapter 2), the automatic milking process can be handled simultaneously for a modern 24-hour operational dairy farm. Every entering cow has a unique electronic identification which contains information such as teat positions for robot camera recognition. During milking operation, cow behavior, milk quality and platform performance are monitored and recorded in the DelProTM database. Robot arms with specific end effectors are responsible for cleaning teats, attaching milking cups and protecting teats against bacteria. After attachment, the milking process starts immediately. Qualified milk is stored in the milk tank through vacuum pumps. After a milked cow leaves the platform, the milk cups and floor are automatically flushed and prepared for the next cow.

1.2 AMRTM Capacity between ideal and reality

The capacity is an intuitive measurement of AMRTM performance and the feature customers care most about. It is defined as the number of cows milked in one hour (cows/hour), or the number of eligible rotations (rotations during which at least one robot operated) in one hour (rotations/hour). The theoretical capacity of 90 cows/hour is calculated by assuming each rotation is finished in 40 seconds:

Capacity = 1 hour / rotation duration = 3600 s / 40 s = 90 cows/hour    (1.1)

The actual capacity is restricted by various factors. Previous work shows that robot success rate and cow traffic waiting time, both mentioned in the AMR instruction book [2], have a high impact on capacity. The robot success rate (92-98%) is determined by the equipment maintenance condition (especially the teat location camera system) and unique cow features (age, udder shape, teat locations and adaptability to the AMR system). Traffic waiting occurs when cows stand in a queue for entering or stray on the way to the platform (a reasonable variation is 10-15% of the actual capacity). It can be reduced by efficient farm traffic solutions and the learning ability of cows. An example of the actual capacity calculation is given:

Assume 2.4 milkings/cow/24 hours, a 10% traffic waiting time, and a 95% robot success rate. The theoretical system capacity is 1620 milkings/18 hours (the maximum operation time is 18 hours within 24 hours).

Robot success rate (95%):

1620 × 0.95 = 1539    (1.2)

Cow traffic waiting time (−10%):

1539 × (100% − 10%) = 1385.1    (1.3)

2.4 milkings/cow/24 hours:

1385.1 / 2.4 = 577.125 (577 cows/24 hours)    (1.4)

Capacity per hour:

1385.1 / 18 = 76.95 cows/hour    (1.5)

As the calculation shows, the actual capacity is reasonably lower than the theoretical value and varies with specific farm conditions. Observation on real farms and data mining show the following: if one robot is late, the entire system has to suspend until it finishes; the milk cup attachment robot (ACA) has the right to extend its operation time in order to achieve a higher successful attachment rate; naughty cows which kick off the milk cups or milk slowly need more time, even a second round, to complete milking; moreover, human-involved operation suspensions also cost extra time. How to analyze and optimize the capacity of AMRTM based on current data is the primary challenge.
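The worked example in Equations 1.2-1.5 can be reproduced with a short script. The function below is an illustrative sketch (not code from the thesis); the default values are the figures used in the example.

```python
def actual_capacity(theoretical_milkings=1620, success_rate=0.95,
                    traffic_waiting=0.10, milkings_per_cow=2.4,
                    operating_hours=18):
    """Reduce the theoretical milkings per 18-hour day by the robot
    success rate and the cow traffic waiting time, then convert the
    result to cows per day and cows per hour."""
    milkings = theoretical_milkings * success_rate   # Eq. (1.2): 1539
    milkings *= 1.0 - traffic_waiting                # Eq. (1.3): 1385.1
    cows_per_day = milkings / milkings_per_cow       # Eq. (1.4): ~577.125
    cows_per_hour = milkings / operating_hours       # Eq. (1.5): ~76.95
    return cows_per_day, cows_per_hour
```

Running it with the defaults reproduces 577 cows per 24 hours and about 77 cows per hour, matching the example.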

1.3 Purpose

This thesis presents a novel approach to finding and verifying efficient solutions for optimizing AMRTM capacity. Cow on-platform behavior and AMRTM robot performance are the main research directions. Bad cows which cannot meet the milking requirements need to be separated and treated with a special milking procedure. Supervised machine learning algorithms are studied and adopted to predict cow behavior. Testing different strategies for robot operation is another approach to optimizing capacity. An AMRTM system simulation is designed to examine the proposed solutions by recreating a model of the milking sequence and procedure. Meanwhile, it should keep system operating features as close to the actual system as possible. Efficient theoretical analysis and simulation can verify possible solutions and provide the best support for future farm testing.

1.4 Goals

The project has four phases: literature study; examining current data; creating algorithms and simulation for optimization; and testing in actual farms. For each phase, specific goals are listed:

1. Literature study:

• Understanding AMRTM system operation.

• Researching supervised machine learning algorithm.

2. Examining current data:

• Analyzing cow behavior on AMRTM platform.

• Defining bad cow.

• Analyzing robot timeout.

3. Creating algorithms and simulation for optimization:

• Building a machine learning cow behavior prediction system.

• Designing AMRTM system simulation to examine optimizing solutions.

• Comparing machine learning algorithms and discussing different optimizingsolutions.

4. Testing in real farms. Due to time constraints, testing in actual farms could not be carried out in this thesis project.

1.5 Methodology

This thesis follows a quantitative research method. The phenomena, empirical hypotheses and positivism assumptions are verified through statistical data analysis and mathematical modeling. Measurement data is the most important part of quantitative research, since it connects empirical observation with experimental expression.

Statistical methods are applied to analyze, interpret, organize and present data [3]. Through statistical results, we can find the internal relations between different variables and exclude interruptions, enabling better sorting and optimizing algorithms.

Machine learning is a "field of study that gives computers the ability to learn without being explicitly programmed" [4]. It shows major advantages in addressing classification and regression problems. For a particular learning problem, the learning style (supervised or unsupervised learning) is determined by whether the answer to the problem is known or unknown. It is widely used in data mining, marketing, medical treatment and self-driving cars.

Page 12: Knowledge discovery and machine learning for capacity ...1064081/FULLTEXT01.pdf · ity is 90 cows per hour. However, constrained by farm management, cow’s condition and system configuration,

4 CHAPTER 1. INTRODUCTION

System modeling and simulation, as effective methods for estimating final product behavior without costly modification, are suitable for testing different optimizations, machine learning predictions and robot timeouts. Modeling is the abstraction of the actual system procedure and constraints; simulation is the implementation of a designed model.

1.6 Outline

The remainder of this master thesis report is structured as follows:
Chapter 2 provides a comprehensive study of AMRTM system operation.
Chapter 3 describes the data analysis, machine learning and simulation methods applied in this project.
Chapter 4 details the design and implementation of the AMRTM optimization system.
Chapter 5 presents the experiment results.
Chapter 6 discusses the obtained results.
Chapter 7 summarizes the optimization results and leads to future work.


Chapter 2

Theoretic Background

2.1 AMRTM general description

A 24-bail AMRTM platform is shown in Figure 2.1. The internal rotary parlour is divided into two areas by functionality. In area A, robot activities use 5 bails, and the entrance/exit area occupies 4 bails. The remaining 15 bails in area B are available for milking and manual operation. For instance, if a cow kicks off the milk cup or is set to manual milking, the milker can attach it by hand in area B.

Figure 2.1: AMRTM overview

The 'adventure' for a cow begins at the lower-left corner. When it enters the platform, the DeLaval DelProTM Farm Manager starts to synchronize its information. The cow is then moved to TPM1 for teat cleaning and milking preparation. During each rotation, five robots operate simultaneously. After the slowest robot finishes its task, the platform rotates one bail clockwise. Milking starts as soon as the ACA2 robot attaches the milk cups and stops when the milk yield reaches the expected amount or time runs out. The last step before exiting the platform at the lower-right corner is teat spraying, in order to inhibit bacteria. Cow behaviour, milk quality and time usage in each milking turn are saved in the 'CowMilking' database.

The hydraulic functional robot is shown in Figure 2.2. Each robot consists of the same robot arm structure (top graph in Figure 2.2) and a different function-specific end effector (A in the top graph in Figure 2.2).

The end effector of the teat preparation robot (TPM) has a 3D camera (A in the bottom-left graph in Figure 2.2) for finding the teats and a teat cleaning cup (B in the bottom-left graph) for cleaning one teat at a time. Two TPM robots are installed in a 24-bail parlour; each cleans two parallel teats during one procedure.

The cup attachment robot (CAM or ACA) has a double magnet gripper (B in the bottom-middle graph in Figure 2.2) which is used to fetch two teat cups at a time from the milking point controller and attach them to the teats. Each teat cup is retracted individually when milking is finished or a kick-off happens. Two ACA robots are installed in a 24-bail parlour; each attaches two parallel teats during one procedure. By configuration, ACA2 can extend its operating time to attach ACA1's unfinished teats.

The teat spray robot (TSM) is equipped with two spray nozzles inside the nose cone (B in the bottom-right graph in Figure 2.2). It is used to protect the teats against bacteria until the next milking.

Figure 2.2: AMRTM milking robot

The AMRTM operating process, named Piece Of Cake (POC), contains the platform rotation duration, robot functional duration, and control operation duration. As shown in Figure 2.3, a new POC starts before the platform has rotated into position. The time gap between the POC start time and the robot active time is 'RobotsNotReadyDuration'. At the end of robot operation, the platform is ready again for a new rotation. The last finished robot duration determines the robot functional duration, called 'SlowestRobotDuration'. The control signal consists of milking wait, unknown wait, and OC wait. By definition, one AMRTM operating process (POC) equals:

POC = RobotsNotReadyDuration + SlowestRobotDuration + MilkingWait + UnknownWait + OCWait + RotationToEarlyWarningDuration    (2.1)

In Equation 2.1, 'RobotsNotReadyDuration', 'UnknownWait', 'OCWait', 'MilkingWait' and 'RotationToEarlyWarningDuration' are defined as system intrinsic times which remain the same during simulation. 'SlowestRobotDuration', as the main part of the POC, is influenced by the milking sequence and robot performance. The optimization methods will aim to minimize 'SlowestRobotDuration'.

Figure 2.3: AMRTM procedure
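Equation 2.1 can be sketched as a simple sum. The function below is an illustration only (the parameter names mirror the terms in the text, not an actual AMR API), and the example split of a 40-second POC into components is arbitrary, chosen purely to show the arithmetic.

```python
def poc_duration(robots_not_ready, slowest_robot, milking_wait,
                 unknown_wait, oc_wait, rotation_to_early_warning):
    """One POC is the sum of its component durations (Equation 2.1),
    all in seconds. Only slowest_robot varies with the milking sequence
    and robot performance; the other terms are system intrinsic times."""
    return (robots_not_ready + slowest_robot + milking_wait
            + unknown_wait + oc_wait + rotation_to_early_warning)

# An arbitrary split of the 40 s rotation behind the 90 cows/hour figure:
total = poc_duration(3, 30, 3, 1, 2, 1)
```

Because the other five terms are fixed during simulation, any reduction in `slowest_robot` translates directly into a shorter POC and a higher capacity.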

2.2 Database

The AMRTM system saves all the crucial data generated during the milking process and transmits it to a cloud service. The database, which consists of system operating time stamps and cow milking information, has been converted into the CowMilking and POC databases.

In the CowMilking database, each row includes all the information on one cow's milking procedure at a given time: the cow's on-platform information, for instance, group, unique identity number, incomplete teats, kickoff teats, process Id, TPM Id & result & duration, ACA Id & result & duration, and milk yield.

The POC (Piece Of Cake) database arranges data based on all the information for one AMRTM platform rotation: rotation start time & end time & duration, TPM cow Id & result & duration, ACA cow Id & result & duration, and slowest robot duration (the rotation starts after all robots have finished their duties).
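The two row layouts described above can be pictured as records. The dataclasses below are an illustrative sketch; the field names are assumptions paraphrased from the description, not the actual schema.

```python
from dataclasses import dataclass

@dataclass
class CowMilkingRow:
    """One cow's milking procedure (hypothetical subset of fields)."""
    group: int
    cow_id: int            # unique identity number
    process_id: int
    incomplete_teats: int
    kickoff_teats: int
    tpm_duration: float    # seconds
    aca_duration: float    # seconds
    yield_milk: float

@dataclass
class POCRow:
    """One AMRTM platform rotation (hypothetical subset of fields)."""
    start_time: float
    end_time: float
    slowest_robot_duration: float

    @property
    def duration(self) -> float:
        # Rotation duration derived from the recorded time stamps.
        return self.end_time - self.start_time

r = POCRow(start_time=0.0, end_time=42.5, slowest_robot_duration=31.0)
```

The POC row's derived `duration` is what the capacity definition in Chapter 1 counts against the 40-second target.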


Chapter 3

Methods

3.1 Data analysis

A bad cow is defined as a cow whose behaviour on the AMRTM platform negatively influences the system capacity. Knowledge Discovery in Databases (KDD) was adopted to find patterns in cow behaviour. The basic problem addressed by the KDD process is one of mapping low-level data (which are typically too voluminous to understand and digest easily) into other forms that might be more compact, abstract and useful [5]. The general KDD process contains data preparation, pattern searching, knowledge evaluation, and refinement. The knowledge obtained at the end of the process fully depends on the user's purposes.

Figure 3.1: KDD process

According to the actual application, the KDD process implementation is shown in Figure 3.1. First, understand the project scope and prior knowledge and select the target database for knowledge discovery. Second, pre-process the selected database, which includes formatting data, cleaning unrelated information, and removing missing and incomplete data. Third, extract project-oriented features to reduce the complexity and distraction in the database. Fourth, use data mining methods to find patterns and rules based on the specific requirements; the most common data mining algorithms can be discussed in two groups: statistics, neighbourhoods and clustering (classical techniques), and trees, networks and rules (next-generation techniques). Fifth, evaluate the obtained patterns by returning to steps one to four; this can be done multiple times to interpret the patterns and generate knowledge about the database. Sixth, apply the knowledge in other implementations and collate it for knowledge management.
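Steps two and three of the process above can be sketched in a few lines. This is a hypothetical illustration on CowMilking-like records; the field names and record format are assumptions, not the thesis code.

```python
def preprocess(records):
    """Step 2: remove rows with missing or incomplete fields."""
    return [r for r in records if all(v is not None for v in r.values())]

def select_features(records, features):
    """Step 3: keep only the project-oriented features."""
    return [{k: r[k] for k in features} for r in records]

# Two illustrative rows; the second has a missing duration and is dropped.
raw = [{"cow_id": 1, "aca_duration": 42.0, "kickoff_teats": 0},
       {"cow_id": 2, "aca_duration": None, "kickoff_teats": 1}]
clean = select_features(preprocess(raw), ["cow_id", "aca_duration"])
# clean == [{"cow_id": 1, "aca_duration": 42.0}]
```

The cleaned, feature-reduced records are then what the data mining step (step four) operates on.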

3.2 Machine learning classification

3.2.1 Binary classification

Binary classification is a method for classifying target objects into two categories based on classification rules. It has been widely used in machine learning problems, such as spam email detection (spam/ham) [6], disease diagnosis (benign/malignant) [7], and general procedure control (pass/fail) [8]. An object database which contains multiple features and has been binary labeled is used to train a supervised machine learning model. By feeding new data into the well-trained model, a predicted label for the new object is generated at the model output.

A well-constructed classification method can assign objects to their proper class. However, misclassification is inevitable and determines the overall performance, so analyzing the obtained result is an integral part of classification. The confusion matrix is widely used to present the performance of a classification algorithm, specifically in supervised machine learning and statistics. As shown in Table 3.1, a general confusion matrix consists of four types of instances: true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). Each row represents the true condition and each column the predicted condition. Correct predictions (TP, TN) are the values on the diagonal of the matrix and wrong predictions (FP, FN) are located off the diagonal.

Table 3.1: Confusion matrix

                    Predicted positive    Predicted negative
Actual positive     TP                    FN
Actual negative     FP                    TN

General performance measurement equations derived from the confusion matrix are listed below:

True positive rate (TPR) or recall is true positives over actual positives:

TPR = TP/P = TP/(TP + FN) (3.1)

True negative rate (TNR) or specificity (SPC) is true negatives over actual negatives:

TNR = TN/N = TN/(FP + TN) (3.2)


Positive predictive value (PPV) or precision is true positives over predicted positives:

PPV = TP/(TP + FP)    (3.3)

Negative predictive value (NPV) is true negatives over predicted negatives:

NPV = TN/(TN + FN) (3.4)

False positive rate (FPR) or fall-out is false positives over actual negatives (1 − SPC).

False discovery rate (FDR) is false positives over predicted positives (1 − PPV).

False negative rate (FNR) or miss rate is false negatives over actual positives (1 − TPR).

Accuracy (ACC), as a general performance estimate, is calculated as correct predictions over all instances:

ACC = (TP + TN)/(P +N) (3.5)

Further measures, such as the null error rate (the accuracy obtained by always predicting the majority class), the receiver operating characteristic (which visualizes performance by plotting the true positive rate against the false positive rate) and the F1 score, are adopted based on the application requirements.
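The rates in Equations 3.1-3.5 follow directly from the four confusion-matrix counts. The helper below is an illustrative sketch, not thesis code, and the example counts are invented for demonstration.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Performance measures of Equations 3.1-3.5 from raw counts."""
    return {
        "TPR": tp / (tp + fn),                  # recall / sensitivity
        "TNR": tn / (tn + fp),                  # specificity
        "PPV": tp / (tp + fp),                  # precision
        "NPV": tn / (tn + fn),
        "FPR": fp / (fp + tn),                  # fall-out, 1 - TNR
        "ACC": (tp + tn) / (tp + fp + tn + fn),
    }

m = confusion_metrics(tp=8, fp=2, tn=7, fn=3)
# m["ACC"] == 0.75 and m["PPV"] == 0.8 for these counts
```

Note that ACC alone can mislead on imbalanced classes (compare it with the null error rate), which is why the per-class rates matter for the bad-cow problem.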

3.2.2 Decision tree

A decision tree is a classifier expressed as a recursive partition of the instance space [9]. Like a real tree, a decision tree has a root (root node), trunks (internal nodes) and leaves (leaf nodes). An example decision tree for mammal classification is shown in Figure 3.2. Body temperature is the root node, which has no incoming edges. It splits the tree into warm-blooded and cold-blooded vertebrates as child nodes. Since all cold-blooded vertebrates are non-mammals, the right child is a leaf node, which has no outgoing edges. Inside the warm-blooded category, 'gives birth', an internal node with both incoming and outgoing edges, separates mammals from non-mammals. Both child nodes of 'gives birth' are leaf nodes.

Figure 3.2: A decision tree for the mammal classification problem[10]

Even with the same set of features, a decision tree can grow differently. Algorithms with a greedy strategy are used to obtain a locally optimal splitting solution.


Typical decision trees such as CART [11], ID3 [12] and C4.5 [13] use Hunt's algorithm. Considering the feature expressions, the attributes for constructing a decision tree can be divided into four categories: binary, nominal, ordinal and continuous. In this thesis, continuous attributes, which split a node by comparing a feature X with a condition C (X > C or X < C), are applied. In order to obtain the best split, an impurity measurement on the child nodes is carried out. At a node t, the impurity is defined as i(t) = F[P(1|t), ..., P(N|t)], where P(n|t) is the proportion of instances in node t belonging to class n, n = 1, ..., N. Commonly used impurity measures include:

Entropy(t) = − Σ_{n=1}^{N} P(n|t) log2 P(n|t)    (3.6)

Gini(t) = 1 − Σ_{n=1}^{N} [P(n|t)]^2    (3.7)

In this thesis, CART for classification was implemented. The decrease in impurity for a split at node t is defined as:

∆i(t) = i(t) − P_L i(t_L) − P_R i(t_R)    (3.8)

Where P_L and P_R are the proportions of instances in t going to the left and right child nodes of t. The best split for node t is therefore achieved by selecting the split that maximizes ∆i(t).
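The split-selection rule above, with the Gini impurity of equation (3.7) and the impurity decrease of equation (3.8), can be sketched as follows. This is a minimal Python illustration (the thesis implementation was in MATLAB); the function names and the exhaustive threshold scan are illustrative assumptions.

```python
import numpy as np

def gini(labels):
    # Gini(t) = 1 - sum_n P(n|t)^2, eq. (3.7)
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def impurity_decrease(labels, feature, threshold):
    # Delta-i(t) = i(t) - P_L * i(t_L) - P_R * i(t_R), eq. (3.8)
    left = labels[feature < threshold]
    right = labels[feature >= threshold]
    p_left = len(left) / len(labels)
    p_right = len(right) / len(labels)
    return gini(labels) - p_left * gini(left) - p_right * gini(right)

def best_split(labels, feature):
    # candidate thresholds: every distinct feature value except the smallest,
    # so both child nodes are always non-empty; keep the one maximizing Delta-i
    candidates = np.unique(feature)[1:]
    scores = [impurity_decrease(labels, feature, c) for c in candidates]
    return candidates[int(np.argmax(scores))]
```

A greedy tree builder applies `best_split` at each node and recurses on the two children, which is exactly the locally optimal strategy described above.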

Two widely used optimization methods for CART are a minimum number of points and cross validation. A fully grown decision tree might suffer from overfitting. One solution is setting a minimum number of points in the child nodes when splitting; in this case, over-specific splits can be eliminated. Another approach is adopting cross validation on the training data to obtain the minimum misclassification error.

Compared to other algorithms like CRUISE [14] and GUIDE [15], CART does not need the features to be selected in advance, which means it identifies the most significant variables and eliminates non-significant ones [16][17].

3.2.3 Support vector machine

The support vector machine (SVM) is a supervised learning algorithm used for regression and classification problems. Given a training dataset, each instance X_i has N features with a corresponding label y_i ∈ {+1, −1} for binary classification. In SVM, a hyperplane is used to separate instances into their respective classes. Support vectors are the instances closest to the hyperplane. The aim of SVM is to find a hyperplane which maximizes the margin between the support vectors and the hyperplane.

The hyperplane can be described as:

w · x − b = 0    (3.9)

Where w is the normal to the hyperplane. For a linearly separable binary classification problem, the training dataset can be formulated as:

w · x_i − b ≥ +1,  for y_i = +1
w · x_i − b ≤ −1,  for y_i = −1

which combine to:

y_i(w · x_i − b) − 1 ≥ 0    (3.10)


Then the margin can be defined as the distance between the two parallel planes w · x_i − b = +1 and w · x_i − b = −1, which is 2/‖w‖ by vector geometry. Maximizing the margin is equivalent to minimizing ‖w‖, or equivalently minimizing (1/2)‖w‖² for Quadratic Programming (QP) optimization. The objective function can be formulated as:

min (1/2)‖w‖²,  s.t. y_i(w · x_i − b) − 1 ≥ 0    (3.11)

For non-linearly separable classification, slack variables ξ_i with the soft margin penalty term C ∑_{i=1}^{N} ξ_i [18] are introduced into the objective function:

min (1/2)‖w‖² + C ∑_{i=1}^{N} ξ_i,  s.t. y_i(w · x_i − b) ≥ 1 − ξ_i,  ξ_i ≥ 0    (3.12)

A kernel function [19] transforms the features into a higher dimensional space which is more suitable for the non-linearly separable problem. By introducing the kernel function K(x), the classification constraint becomes:

y_i(w · K(x_i) − b) − 1 ≥ 0    (3.13)

Commonly used kernel functions are:

• linear: K(x_i, x_j) = x_i^T x_j.

• polynomial: K(x_i, x_j) = (β x_i^T x_j + r)^d.

• radial basis function (RBF): K(x_i, x_j) = exp(−β ‖x_i − x_j‖²).

• sigmoid: K(x_i, x_j) = tanh(β x_i^T x_j).
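The four kernels listed above can be written out directly; a small Python sketch (β, r and d are free hyperparameters, and the default values below are illustrative, not the thesis settings):

```python
import numpy as np

def linear(xi, xj):
    return xi @ xj

def polynomial(xi, xj, beta=1.0, r=1.0, d=2):
    return (beta * (xi @ xj) + r) ** d

def rbf(xi, xj, beta=0.5):
    # decays with the squared distance ||xi - xj||^2, so K(x, x) = 1
    return np.exp(-beta * np.sum((xi - xj) ** 2))

def sigmoid(xi, xj, beta=0.1):
    return np.tanh(beta * (xi @ xj))
```

Each function returns the scalar K(x_i, x_j); an SVM solver only ever touches the data through such pairwise evaluations, which is what makes the implicit high-dimensional mapping tractable.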

3.2.4 Extreme learning machine (ELM)

The learning speed of feedforward neural networks was constrained by slow gradient-based learning algorithms and the fact that all parameters needed to be tuned iteratively. Huang and Babri [20] proved that a single-hidden-layer feedforward neural network (SLFN) can learn N distinct observations with at most N hidden neurons and almost any nonlinear activation function. In the paper by Huang, Chen and Siew [21], SLFNs with randomly chosen hidden nodes and analytically calculated output weights are shown to be universal approximators with various activation functions. Since the input weights and biases can be randomly assigned, the efficiency of SLFNs improved compared to the approaches [22][23][24] which need to tune the parameters of new hidden neurons.

Based on the statements above, the theory of the extreme learning machine (ELM) for SLFNs was proposed, which randomly chooses hidden nodes and analytically determines the output weights of SLFNs [25]. In paper [25], the performance of ELM, SVM and BP was tested in real applications. Training an ELM only needs seconds, compared to hours for SVM and BP; the simple and effective structure of ELM also avoids common issues of gradient-based learning algorithms, such as local optima. Work [26] showed that ELM can be implemented in multi-layer networks, in the form of one ELM per hidden layer or in combination with other learning methods. One optimization approach for ELM, called O-ELM [27], adopts genetic algorithms to determine the relevant input variables and the structure and parameters of the ELM.


The upper bound of the required number of hidden nodes Ñ is the number of distinct training samples, that is, Ñ ≤ N. The output function for SLFNs is:

O(j) = ∑_{i=1}^{Ñ} β_i g(W_i · X_j + b_i),  j = 1, ..., N    (3.14)

Where W_i and b_i are the weight vector and bias connecting the input nodes and hidden node i. Considering the output nodes as linear, β_i is the weight connecting hidden node i and the output nodes, and g(x) is the activation function. There exist β_i, W_i, b_i which make:

∑_{i=1}^{Ñ} β_i g(W_i · X_j + b_i) = t_j,  j = 1, ..., N    (3.15)

Where t_j is the target output. Compactly, 3.15 can be written as:

Hβ = T (3.16)

Where

H = ⎡ g(W_1 · X_1 + b_1)  ···  g(W_Ñ · X_1 + b_Ñ) ⎤
    ⎢         ⋮            ⋱           ⋮          ⎥
    ⎣ g(W_1 · X_N + b_1)  ···  g(W_Ñ · X_N + b_Ñ) ⎦    (3.17)

β = [β_1^T, ..., β_Ñ^T]^T,  T = [t_1^T, ..., t_N^T]^T    (3.18)

In most cases, the amount of training data is larger than the number of hidden nodes. Training an SLFN is then equivalent to finding a least-squares solution β of the linear system Hβ = T [25]. According to the theorem in [25], the smallest-norm least-squares solution is:

β = H†T (3.19)

Where H† is the Moore-Penrose generalized inverse of matrix H .
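The whole ELM training procedure, equations (3.14)-(3.19), reduces to one matrix equation. A hedged Python sketch (the thesis used a MATLAB model; the tanh activation and Gaussian weight initialization here are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, T, n_hidden):
    # Randomly assign input weights W and biases b (never tuned afterwards),
    # then solve the output weights analytically: beta = pinv(H) @ T, eq. (3.19)
    n_features = X.shape[1]
    W = rng.standard_normal((n_hidden, n_features))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W.T + b)          # hidden-layer output matrix, eq. (3.17)
    beta = np.linalg.pinv(H) @ T      # Moore-Penrose generalized inverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    # eq. (3.14): O = g(X W^T + b) beta
    return np.tanh(X @ W.T + b) @ beta
```

Because the only "learning" is one pseudoinverse, training takes a single linear-algebra call, which is the source of the seconds-versus-hours speed advantage cited above.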

3.3 AMRTM system simulation

The AMRTM system simulation builds up the model for examining possible optimization solutions. The simulation follows the real AMRTM system operation process and contains the important features, including the cow entrance sequence, robots working in parallel, batch milking sessions and the actual POC duration structure.

The idea of bad cow classification and rearrangement comes from observation of the milking process. As mentioned in chapter 2, each POC duration contains a rotation duration, a robot functional duration and a control duration (wait time). Five robots (TPM1&2, ACA1&2 and TSR) work simultaneously, and the last finished robot determines the slowest robot duration.


The theory of the AMRTM system simulation is rearranging the cow entrance sequence. Classified bad cows are separated from the good ones and milked together at the end of each milking session. A simplified model is built to verify that rearranging the cow entrance sequence can save time. Firstly, the Ovesholm (OHM) milking information in March is selected as the simulation target. The bad cows are defined as those whose average ACA2 function durations are larger than 40 seconds (the classification work is presented in chapter 4). As shown in Table 3.2, in order to simplify the demonstration, the real robot timeouts for good and bad cows are replaced by the average performance in the good and bad groups.

Table 3.2: Cow’s robot average performance

Secondly, it is assumed that the entrance sequence follows the pattern that good cows and bad cows go into the system alternately (such as 'GBGBGBGBGBG...', where 'G' denotes a good cow and 'B' denotes a bad cow). The milking process is recreated and shown in Table 3.3. At POC 1, the first cow (G1) goes into the system and is rotated to the teat preparation module (TPM1) immediately by the platform. Then a bad cow (B1) goes into the system and the platform rotates it to TPM1; meanwhile, the first cow (G1) is rotated to TPM2. The last cow (G4) goes into the system at POC 7, and the sample milking process stops at POC 10 when the last cow (G4) finishes teat attachment at ACA2. All the bad cow positions and the slowest durations caused by bad cows are marked in red. At the end of the table, the total simulation time is calculated by summing up all the slowest durations.

Table 3.3: Simplified entrance sequence AMRTM system simulation


Then the entrance sequence is rearranged as shown in Table 3.4. The bad cows (B1, B2 and B3) are milked after all the good cows have gone into the system. We can observe that the bad-cow-dominated slowest durations (red fonts) are reduced from 7 POCs to 6 POCs, and the simulation saves 21 seconds compared with Table 3.3.

Table 3.4: Rearranged entrance sequence AMRTM system simulation

Under real farm conditions, bad cows only account for around 5% of a farm's herd. The distribution of bad cows will be more random and sparse. In this case, more POC slowest durations will be dominated by bad cows than in the example shown. Sorting the bad cows decreases their dominating effect on the slowest robot duration for good cows. However, the average robot duration for bad cows is still longer than for good ones. The optimization methods for limiting bad cow robot duration are introduced in the next chapter.
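The rearrangement effect can be reproduced with a toy model of the rotation schedule: at each POC the last five entered cows occupy the parallel bails, and the slowest of their robot durations dominates that POC. This is a hedged sketch; the 40 s / 60 s per-cow durations and helper names are illustrative assumptions, not the measured averages of Table 3.2.

```python
GOOD, BAD = 40, 60          # assumed slowest-robot duration per cow type (s)
N_BAILS = 5                 # TPM1&2, ACA1&2 and TSR work in parallel

def total_time(sequence):
    # pad with empty positions so the last cow passes through all bails
    padded = sequence + [None] * (N_BAILS - 1)
    total = 0
    for poc in range(len(padded)):
        # cows currently on the platform: the last N_BAILS entries
        window = padded[max(0, poc - N_BAILS + 1): poc + 1]
        durations = [GOOD if c == 'G' else BAD for c in window if c is not None]
        total += max(durations)   # the slowest robot dominates this POC
    return total

alternating = list('GBGBGB')   # bad cows scattered through the sequence
sorted_seq = list('GGGBBB')    # bad cows moved to the end of the session
# sorting reduces the number of POCs whose slowest duration is
# dominated by a bad cow, so total_time(sorted_seq) is smaller
```

With these assumed durations, the alternating sequence is bad-cow-dominated at nine of ten POCs, while the sorted one is dominated at only seven, mirroring the 7-POC-to-6-POC reduction reported above.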


Chapter 4

AMRTM optimization system design

The intention of the entire project is to increase the actual capacity of the AMRTM with as high a successful milking rate as possible, by investigating cow milking sequence and robot timeout data. As shown in Figure 4.1, the start of the project was to examine the current data and extract valuable information using the KDD process. The basic understanding of the database and the formatting work were done during data preparation. According to the cows' milking behavior, bad cows were defined and verified through statistical methods. Then a machine learning prediction system was adopted to classify and predict cow behavior on different farms based on the single-variable bad cow definition. Meanwhile, the analysis of the robot timeout setting was carried out. All the possible solutions were tested through the AMRTM system simulation. The output contains the simulation time, the capacity, a bad cow report and a robot timeout report. The detailed experimental procedure is illustrated in the sections below.

Figure 4.1: System overview

4.1 KDD process

As illustrated in Figure 4.2, the POC and CowMilking databases are handled separately to obtain the refined databases and the knowledge for machine learning and system simulation.

After the CowMilking KDD process, a refined CowMilking database was created in order to simplify the database and increase processing speed; single- and multi-variable bad cow definitions were created; and an ML database was generated for training and testing machine learning algorithms with the bad cow definition.



After the POC KDD process, a refined POC database, robot timeout information and session information were produced. The knowledge on robot timeouts was applied for analysis. The session information was used for creating the system simulation.

Figure 4.2: Data preparation

4.2 Bad cow definition

A bad cow was defined as one whose behavior on the AMRTM had a negative impact on the capacity. Through the KDD process and expert consultation, the main effect features, ordered from high to low impact, were 'ACA1&ACA2 durations', 'kickoff teats', 'incomplete milking teats', 'TPM result teats' and 'ACA result teats'. As explained in the theoretical background, 'ACA1&ACA2 durations' were decisive factors in the 'slowest robot duration', which was the key component of the POC duration. The rest of the parameters also contributed to the bad cow definition: 'TPM result teats' and 'ACA result teats' had more direct effects on the robot success rate, while 'kickoff teats' and 'incomplete milking teats' reflected the cow's on-platform behavior.

In order to understand the general cow behavior, the milking data was analyzed monthly. All the milking turns for each individual cow were summed up, and the average numerical range for each feature was calculated. The distribution of a cow's robot durations reflected its performance in a certain period of time more accurately. The relationship among mean, median and skewness was introduced to refine the bad cow definition.

Based on the distribution of average ACA2 durations and the initial robot timeout setting, the single-variable bad cow classification boundary was set where the average ACA2 duration was 40 seconds.

For the multi-variable bad cow definition, the classification boundary needed to fulfil all of the following conditions:

• ACA2 average duration > 40 sec


• Kickoff teats probability > 50%

• Incomplete milking teats probability > 50%

• TPM success probability < 50%

• ACA success probability < 50%
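Since all conditions must hold simultaneously, the multi-variable definition is a conjunction of five threshold tests. A minimal sketch (the field names are illustrative, not the thesis database schema):

```python
def is_bad_cow(cow):
    # all five conditions above must hold at once (logical AND);
    # durations in seconds, the rest as probabilities in [0, 1]
    return (cow['aca2_avg_duration'] > 40
            and cow['kickoff_prob'] > 0.5
            and cow['incomplete_prob'] > 0.5
            and cow['tpm_success_prob'] < 0.5
            and cow['aca_success_prob'] < 0.5)
```

Because every test must pass, this rule is stricter per feature than the single-variable definition, but combining several weak signals lets it flag cows the ACA2-only threshold alone would rank differently.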

4.3 Robot timeout analysis

Initially, the TPM robots were programmed to terminate operation after a certain amount of time even if the task was not finished. However, the ACA2 duration was assigned depending on the ACA1 task completion status. Generally, the TPM interrupt threshold was 30 seconds, the ACA1 interrupt time was 45 seconds, and the ACA2 duration could increase to 90 seconds.

Two important robot timeout evaluation criteria were duration and success rate. Theoretically, a higher success rate is achieved by increasing the robot operating duration. By doing so, however, the rotation needs to wait a longer time until all the robots have stopped.

Considering that the average ACA duration was longer than the TPM's, the investigation of the ACA2 duration had two approaches. On the one hand, by increasing the ACA operating duration, the attachment success rate could improve, meaning ACA2 kept the chance of finishing ACA1's remaining tasks. This idea needed actual testing on the farm to obtain data, which is not included in this thesis. On the other hand, limiting the maximum duration, especially of ACA2, also contributed to time saving and mechanical maintenance. Nevertheless, there was a trade-off between shortening the duration and maintaining the success rate. The detailed discussion is given in the result and discussion chapters.

4.4 Machine learning classification

4.4.1 Creating database

The first step of machine learning classification was collecting data and selecting relevant features. The data set was subdivided into a training set, a validation set and a test set. The training set was used for training the specific machine learning model. The validation set was used for evaluating the generalization error of the selected model. The test set was the target data which needed to be classified by the well-trained model with optimal parameters. For an insufficient data set, cross-validation and bootstrapping could be adopted to increase data diversity.

The training and validation data sets were collected from three farms (OHM, FIN and LAP). They contained 781 bad cows and 6407 good cows. In order to build an unbiased dataset, the number of good cows was cut down to match the number of bad cows. The final data set contained 750 bad cows and 750 good cows (training set: 600 bad cows and 600 good cows; validation set: 150 bad cows and 150 good cows). The OHM milking data in May was selected as the test set. It contained 30 bad cows and 524 good cows.
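The class-balancing step described above (undersampling the majority good-cow class to the size of the bad-cow class) can be sketched as follows; the function name and the random undersampling strategy are illustrative assumptions.

```python
import random

def balance(good, bad, seed=0):
    # undersample the majority (good) class so both classes contribute
    # the same number of instances to training/validation
    random.seed(seed)
    n = min(len(good), len(bad))
    return random.sample(good, n), random.sample(bad, n)
```

Applied to the 6407 good and 781 bad cows above, this yields equally sized groups, which is what keeps the trained classifier from trivially favoring the dominant good-cow class.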

4.4.2 Decision tree

The decision tree implemented in this thesis was the classification and regression tree (CART) with the Gini impurity measure in MATLAB. The decision tree grew by selecting the
maximum information gain from all possible binary splits at each node and repeating for the two child nodes. Overfitting is a general problem for a fully grown decision tree, since some child nodes might contain very few observations. In order to reduce the generalization error, a constraint on the minimum number of observations in a leaf node ('MinLeafSize') was set as an optimized splitting criterion. The resubstitution error and the confusion matrix were used to examine the performance of the decision tree.

Figure 4.3: Decision tree implementation

As shown in Figure 4.3, 'DT Train' contained the training and validation process. The decision tree was trained by restricting 'MinLeafSize' and then validated. Considering that a real farm normally has a biased herd dominated by good cows (>90% of the herd scale) and that the main purpose of prediction was the bad cows, I created a criterion for selecting the best model with well-trained parameters. To qualify as the best model, the validation result had to fulfil the conditions that the accuracy of both good cow prediction (PPV) and bad cow prediction (NPV) was over 80% and that the number of correctly predicted bad cows was maximal. Then the actual farm data was fed into the best model to predict the cow behavior in the next month.

4.4.3 Support vector machine

The support vector machine (SVM) implementation was based on LibSVM [28] in MATLAB. In order to reduce information loss, all the features were converted to sparse format before being used. At the beginning, the RBF kernel was selected, since it is the most widely used kernel function; however, an overfitting problem occurred during implementation. The polynomial kernel function was then selected and tested in the 1st, 2nd and 3rd degree. The grid search method was applied for optimizing the penalty parameter C in the soft margin. Since tuning the parameters of an SVM is very challenging and time consuming, the other parameters (gamma and coef0 in the polynomial kernel function (gamma ∗ u′ ∗ v + coef0)^degree) were arbitrarily selected.


Figure 4.4: SVM implementation

As shown in Figure 4.4, the support vector machine was trained by tuning the polynomial degree and grid searching the penalty parameter C, and then validated. The best model was obtained with the same criteria as for the decision tree. Then the actual farm data was tested to get the cow behavior prediction for the next month.
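The grid search over degree and C can be written as a generic double loop. A hedged sketch: `train_fn` and `score_fn` stand in for the LibSVM train/validate calls, and the assumption that the −20 to 20 range refers to log2 exponents of C (a common LibSVM convention) is mine, not stated in the thesis.

```python
def grid_search(train_fn, score_fn, degrees=(1, 2), c_exponents=range(-20, 21)):
    # sweep C on a log2 grid for each polynomial degree, keep the best score
    best = (None, None, -1.0)          # (degree, C, score)
    for d in degrees:
        for e in c_exponents:
            C = 2.0 ** e
            model = train_fn(degree=d, C=C)
            score = score_fn(model)
            if score > best[2]:
                best = (d, C, score)
    return best
```

The exhaustive sweep is affordable here because only two hyperparameters are varied; each extra tuned parameter would multiply the number of training runs, which is why gamma and coef0 were left fixed.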

4.4.4 Extreme learning machine

The extreme learning machine implementation was based on the MATLAB model provided by Nanyang Technological University [29]. The input features were normalized into the range between −1 and 1. In this thesis, the effects of the activation function and the number of hidden neurons on single-hidden-layer ELM cow behavior prediction were examined. Three activation functions (tribas, radbas and sig) were chosen and compared over the same range of hidden neurons (1-800).

Figure 4.5: ELM implementation

As shown in Figure 4.5, the extreme learning machine was trained by tuning the activation function and the number of hidden neurons, and then validated. The best model was obtained with the same criteria as for the decision tree. Then the actual farm data was tested to get the cow behavior prediction for the next month.

4.5 AMRTM system simulation

The AMRTM system simulation was designed to test the results of the feasibility study, the bad cow definition and the robot timeout analysis. It simulated the AMRTM milking procedure by letting cows enter the system in a designed sequence. Meanwhile, the system operating schedule
remained as close to the actual one as possible. The simulation implementation is shown in Figure 4.6. The input databases were 'CowMilking data', 'Session information' and 'Constant time'.

'CowMilking data' included all the milking records in a specific month, arranged by the milking sequence. Each record had been labeled based on the cow classification (good or bad). 'Session information', which contained each session's last milking record, was used for splitting the milking records on the session scale. Two milking sequences were adopted: original (following the actual milking sequence in each session) and sorting (bad cows were separated from the good cows and rearranged to the end of each session). 'Constant time' contained the waiting times which should be kept constant, since changing the milking sequence does not influence them.

Following the selected milking sequence, one cow went onto the platform during each rotation. The newly entered cow was assigned to TPM1 directly and moved to TPM2 at the next rotation. When the last cow in a session had gone onto the platform, the entrance positions were set to empty until the last cow left the ACA2 bail. By doing so, the new session started without influence from the previous session.

The simulation reproduced each POC procedure by calculating the slowest robot duration and combining it with the constant time. After running the program, a report in '.txt' format was generated, containing the farm information, the simulation duration, the bad cow frequency and the yield loss on the session scale.

Figure 4.6: AMRTM system simulation

In order to further investigate the sorting strategy, two categories were proposed and validated. The implementation is shown in Figure 4.7. Bad cows could be excluded from the milking sequence or manually milked by farmers. Both strategies could increase capacity and save time. However, we should keep in mind that excluding bad cows will cause milk loss and manually milking bad cows might require extra labor.

Page 30: Knowledge discovery and machine learning for capacity ...1064081/FULLTEXT01.pdf · ity is 90 cows per hour. However, constrained by farm management, cow’s condition and system configuration,

22 CHAPTER 4. AMRTM OPTIMIZATION SYSTEM DESIGN

Figure 4.7: Optimization on sorting bad cows

The functionality of the ACA2 robot caused an extendable duration. Theoretically, it sacrificed time to achieve a higher attachment rate. An optimization of the entire ACA2 duration aimed to simulate the trade-off between attachment success rate and duration. As shown in Figure 4.8, the ACA2 robot terminated operation when the duration reached the threshold time.

Figure 4.8: Aca2 duration threshold


Chapter 5

Result

5.1 Bad cow definition

5.1.1 Single-variable definition

According to the single-variable definition (average ACA2 duration larger than 40 seconds), the cow information for OHM, FIN and LAP is presented in Table 5.1. The first column shows the source of the milking data, named by 'Farm name' and 'Month'. Then the number of milked cows, the number of classified bad cows and the proportion of bad cows are given. The bad cow proportions in OHM and FIN were around 5% of the herd size. LAP had a larger number of bad cows, at around 15%.

Table 5.1: Single variable bad cow definition

5.1.2 Multi-variable definition

The multi-variable bad cow definition was a more comprehensive model, since not only the ACA2 duration but also the kickoff teats, the TPM & ACA successfully attached teats, and the incomplete milking teats were applied to define a bad cow. The classification result is shown in Table 5.2; more bad cows were classified compared to the single-variable definition. Averaging the three months of data, the bad cows were 12.5% for the OHM farm, 25.5% for the FIN farm and 23.92% for the LAP farm.



Table 5.2: Multi-variable bad cow definition

5.2 Machine learning prediction

5.2.1 Decision tree

According to the decision tree implementation, the relation between the resubstitution error and the minimum number of observations in a leaf is presented in Figure 5.1. The error rate increases roughly linearly below 100 minimum observations per leaf node and becomes stable around 0.08. When the minimum number of observations in a leaf node exceeds 750, the error rate reaches 0.45 instantaneously. It is evident that 750 observations in a leaf node is a threshold value for the resubstitution error.

Figure 5.1: Resubstitution error with min-leaf size variation

Considering the characteristic of the milking database that good cows dominate the herd (over 90%), the overall accuracy is not valid for evaluating the performance of a machine learning algorithm: it is easy to get a good overall accuracy by classifying all the cows as good. A new performance measurement was therefore created to deal with the biased milking database. The classification accuracy was considered as both the positive predictive value (PPV) and the negative predictive value (NPV) over the minimum leaf size. As shown in Figure 5.2, the blue legend denotes the true negative ratio (NPV) and the red legend denotes the true positive ratio (PPV). Generally, PPV had better prediction accuracy than NPV. The overfitting problem occurred when the minimum leaf size was zero. With the minimum leaf size increasing, PPV dropped from 95% to 90%, but NPV had the opposite trend, rising from 60% to 80%, which meant the overfitting had been limited. Then both predictive values stabilized, at 94% for PPV and 77% for NPV. When the minimum number of observations per leaf went beyond 750, PPV reached 100% and NPV declined to 0%, which meant the model was underfitting. This explains the sharp increase in resubstitution error in Figure 5.1.

Figure 5.2: DT classification accuracy with min-leaf size variation

5.2.2 Support vector machine

In this thesis, a support vector machine with a polynomial kernel was implemented. Firstly, the effect of the polynomial kernel degree on the classification accuracy was examined. As shown in Table 5.3, the best prediction results for the 1st, 2nd and 3rd degree polynomial kernels with default parameters are presented. The 2nd degree polynomial kernel had outstanding performance compared to the other two degrees. On the contrary, the 3rd degree polynomial kernel seemed unable to learn the prediction rules from the selected features.

Table 5.3: The best prediction results of polynomial kernel in 1st, 2nd and 3rd degree.

According to the performance of the three polynomial kernel degrees, the 1st and 2nd degrees were selected for optimizing the soft margin penalty parameter C. The grid search for the optimal C in the range of −20 to 20 is shown in Figure 5.3. The 1st degree polynomial kernel had stable performance on both PPV and NPV (over 90% accuracy) in the range of −15 to 5. However, the accuracy of the 2nd degree polynomial kernel fluctuated greatly when C went beyond 0.


Figure 5.3: Grid search for soft margin penalty parameter C.

5.2.3 Extreme learning machine

Three activation functions (radbas, sig and tribas) were tested in the extreme learning machine implementation. The PPV and NPV for the three activation functions had the same trend when increasing the number of hidden neurons. Since the ELM algorithm randomly generates the input biases and weights, 20 trials for each number of hidden neurons were run in the training phase. As shown in Figure 5.4, the average and best prediction records are presented for both PPV and NPV. By observation, PPV generally had better prediction accuracy than NPV. Both PPV and NPV decreased when the number of hidden neurons went beyond 300. The downward trend of NPV, at around 20%, was more noticeable than the roughly 5% of PPV.

Figure 5.4: ELM

5.3 Robot Timeout

The analysis results for the ACA2 duration are shown in Table 5.4. The first column indicates the different ACA2 time intervals. The success rate was calculated by dividing the number of successful ACA2 attachments by the total number of ACA2 attachments in each time interval. The last column shows the time saved by limiting the ACA2 duration to the threshold value, where the threshold value means the maximum operating time for ACA2. The observations from the three farms showed that the success rate was almost 100% when the ACA2 duration was under 40 seconds. When the ACA2 robot needed to extend its time to finish the task, the success rate started dropping. In order to increase the capacity of the AMRTM, the ACA2 duration should be limited.
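The two quantities behind Table 5.4 (per-interval success rate and time saved by a duration cap) can be sketched as follows. This is a hedged sketch: the interval edges, function names and synthetic data are illustrative assumptions, not the thesis values.

```python
import numpy as np

def interval_success(durations, success, edges=(0, 40, 50, 60, 90)):
    # bucket ACA2 attachments by duration interval [lo, hi) and compute
    # the success rate in each bucket (NaN if the bucket is empty)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (durations >= lo) & (durations < hi)
        rate = float(success[mask].mean()) if mask.any() else float('nan')
        rows.append((lo, hi, rate))
    return rows

def time_saved(durations, threshold):
    # capping ACA2 at `threshold` seconds saves the excess of every
    # attachment that ran longer than the cap
    return float(np.sum(np.clip(durations - threshold, 0, None)))
```

Comparing `interval_success` against `time_saved` for several thresholds makes the trade-off explicit: a lower cap saves more time but truncates exactly the long attachments whose success rate is already dropping.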


Table 5.4: Aca2 duration analysis

5.4 AMRTM system simulation

The simulation results are presented in two aspects: system operating time and capacity. The simulation database contained three months' (March, April and May) milking records and system operating time stamps from four farms (OHM, OTT, FIN and LAP). Based on the optimization implementation, enabling or disabling TPM while milking bad cows divided the simulation into two branches. Within each branch, the optimization levels included: bad cow sorting; bad cows manually milked in 20 seconds; bad cows manually milked in 10 seconds; and bad cows manually milked in 6 seconds.

The actual simulation times for the four farms at different optimization levels (original, sorting, ACA 20 (manually milked in 20 s), ACA 10 (manually milked in 10 s) and ACA 6 (manually milked in 6 s)) with TPM enabled/disabled over three consecutive months are presented in Figure 5.5. Sorting had the same impact with TPM enabled and disabled, since the bad cows were still milked by robots. The difference between the two approaches showed when bad cows were milked manually. With TPM enabled, the task for the milker was attaching the milk cups; the TPM duration dominated the robot operation time and halted the declining trend of the simulation time. With TPM disabled, the milker needed to take over both the TPM and ACA robot tasks (cleaning teats and attaching milk cups). Without the robot influence, the simulation time decreased further.


Figure 5.5: Actual AMRTM simulation time

The actual simulation capacities for the four farms at different optimization levels (original, sorting, ACA 20, ACA 10, ACA 6) with TPM enabled/disabled over three consecutive months are presented in Figure 5.6. The capacity was calculated by dividing the number of milking records by the simulation time. Since the number of milking records was constant, decreasing the simulation time increased the capacity.

Figure 5.6: Actual AMRTM capacity


Chapter 6

Discussion

6.1 Bad cow selection

In order to sort cows and optimize the system performance, we need to select bad cows properly. This discussion looks at the different variants of bad cows and how the composition of bad cows varies over time. Two selection methods (single-variable and multi-variable) were compared.

6.1.1 Bad cow classification with single variable

The bad cow definition based on a single variable (average ACA2 duration > 40 s) was applied in the AMRTM system simulation. The number of bad cows was in a reasonable range (5%-10% of the herd) on the OHM, OTT and FIN farms. However, the proportion of bad cows on LAP reached around 15% of the herd (114 bad cows out of 816 cows in March). The influence of the bad cow proportion on the optimization methods is explained in Section 6.3.

The variation of bad cows on the four farms is shown in Figure 6.1 to illustrate the bad cow relation across two consecutive months. March is the first sampling month, so it has no bad cow composition. The following months contain three classes of bad cows: new bad cows that appeared in that month (blue bar); bad cows that appeared in both that month and the previous month (red bar); and bad cows that performed well in the previous month (green bar).

From the statistical analysis, half of the bad cows still performed badly in the next month. Around 20% of bad cows were newly introduced cows each month. Predicting which of last month's good cows would turn into bad cows in the next month played a significant role in sorting cows and optimizing system capacity, since those cows made up 30% of the bad cows. OTT, as a test farm, contained fewer cows and underwent more system tests and updates, which influenced its bad cow variation and system performance.
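The month-to-month composition above can be computed with plain set operations; a minimal sketch, assuming each month is reduced to the set of cow IDs whose average ACA2 duration exceeds 40 s (the cow IDs and durations below are illustrative, not thesis data):

```python
def bad_cows(avg_aca2_by_cow: dict, threshold_sec: float = 40.0) -> set:
    """Single-variable rule: a cow is 'bad' if its average ACA2 duration exceeds the threshold."""
    return {cow for cow, dur in avg_aca2_by_cow.items() if dur > threshold_sec}

def composition(prev_month: set, this_month: set, herd_prev: set) -> dict:
    """Split this month's bad cows into the three classes plotted in Figure 6.1."""
    return {
        "repeat": this_month & prev_month,                    # bad in both months (red bar)
        "turned_bad": this_month & (herd_prev - prev_month),  # good last month (green bar)
        "new": this_month - herd_prev,                        # newly introduced cows (blue bar)
    }

# Illustrative data: cow id -> average ACA2 duration in seconds.
march = bad_cows({"c1": 55, "c2": 30, "c3": 48, "c4": 20})
april = bad_cows({"c1": 60, "c2": 45, "c3": 25, "c5": 70})
print(composition(march, april, herd_prev={"c1", "c2", "c3", "c4"}))
# {'repeat': {'c1'}, 'turned_bad': {'c2'}, 'new': {'c5'}}
```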


Figure 6.1: Bad cow variation in four farms

6.1.2 Bad cow classification with multi-variable

The multi-variable bad cow definition gave us a more comprehensive understanding of bad performance. Instead of focusing only on a cow's ACA2 duration, the kick-off probability, the incomplete-milking probability and the robot result teats were taken into consideration. However, the multi-variable-defined bad cows made up 15% of the farm herd, and even 20% on the LAP farm. Meanwhile, the bad behaviors other than a long average ACA2 duration had no direct influence on the system operating time, meaning they did not affect robot performance. In this case, milking the multi-variable-defined bad cows together was difficult to implement (too large a group) and might yield a smaller capacity gain than the single-variable definition.

6.2 Machine learning classification

The analysis of the machine learning classification focused on prediction accuracy. Three machine learning algorithms (Decision Tree, Support Vector Machine and Extreme Learning Machine) were trained with the same training data; the test data (OHM cows' behavior in May) was then fed to the trained models, which output the best predicted results for the cows' behavior in the next month. Following the machine learning implementation methods in Chapter 4, the best classification results are shown in Table 6.1. Because the herd size varied each month, the prediction was based only on the cows present in the test month, meaning that cows newly entering in the next month were left out of consideration. The behavior in the previous month was used as a reference prediction to compare against the machine learning results.

The statistical analysis of the prediction results is shown in Table 6.2. Bad cow precision is the proportion of correctly predicted bad cows over all predicted bad cows. Good cow recall is the proportion of correctly predicted good cows over the actual good cows. Bad cow recall is the proportion of correctly predicted bad cows over the actual bad cows. The overall classification accuracy over both good and bad cows is presented at the end of the table. The best prediction obtained so far used an extreme learning machine with 83 neurons and the tribas activation function. In terms of bad cow recall, the machine learning predictions beat the last-month model (predicting directly from last month's behavior), which means the machine learning algorithms can find more of the bad cows.
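The three table metrics follow directly from the confusion counts; a minimal sketch treating "bad" as the positive class (the label vectors are illustrative):

```python
def metrics(actual, predicted, positive="bad"):
    """Precision and recall for the positive class, plus overall accuracy."""
    tp = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    fp = sum(a != positive and p == positive for a, p in zip(actual, predicted))
    fn = sum(a == positive and p != positive for a, p in zip(actual, predicted))
    correct = sum(a == p for a, p in zip(actual, predicted))
    return {
        "bad_precision": tp / (tp + fp),  # predicted bad that really were bad
        "bad_recall": tp / (tp + fn),     # actual bad that were found
        "accuracy": correct / len(actual),
    }

actual    = ["bad", "bad", "good", "good", "good", "bad"]
predicted = ["bad", "good", "bad", "good", "good", "bad"]
print(metrics(actual, predicted))
```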


Table 6.1: The best machine learning predictions and behaviour in last month

However, misclassification of actual good cows limited the performance of the machine learning classification on the test data (OHM milking records in May).

Table 6.2: The best machine learning model and last month model performance results

Since the split criterion of a decision tree resembles the criterion of the bad cow definition, a decision tree was selected first to predict cow behavior. However, after optimizing the minimum number of observations in a leaf node, the resulting tree generally used only the average ACA2 duration as its split criterion, discarding the predictive effect of the other features. The best decision tree (CART) result used the average ACA2 duration and the average ACA2 result teats.
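CART picks each split by scanning thresholds and minimizing impurity; a minimal sketch of one Gini-based split on a single feature, with a minimum leaf size analogous to the min-observations tuning mentioned above (the durations and labels are illustrative):

```python
def gini(labels):
    """Gini impurity of a list of class labels ('bad'/'good')."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_bad = labels.count("bad") / n
    return 1.0 - p_bad**2 - (1.0 - p_bad)**2

def best_split(values, labels, min_leaf=1):
    """Find the threshold minimizing weighted Gini impurity (CART-style)."""
    n = len(values)
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue  # respect the minimum observations per leaf
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (t, score)
    return best

# Average ACA2 durations (s) and the observed class next month (illustrative).
durations = [22, 28, 35, 41, 47, 63]
labels = ["good", "good", "good", "bad", "bad", "bad"]
print(best_split(durations, labels, min_leaf=2))  # (35, 0.0): a perfect split
```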

In order to examine all the selected features and obtain an optimal prediction, a support vector machine was adopted. With a 2nd-degree polynomial kernel and an optimized C parameter, a better prediction was generated. However, the execution time of the SVM was longer than that of the decision tree, especially when applying the grid search method.
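The grid search cost noted above is the product of the grid sizes times the cost of one cross-validated fit; a minimal skeleton of the exhaustive loop (the scoring function here is a stand-in with a known optimum, not the thesis's cross-validated SVM accuracy):

```python
import itertools

def grid_search(c_grid, degree_grid, cv_score):
    """Exhaustively evaluate every (C, degree) pair and keep the best score."""
    best_params, best_score = None, float("-inf")
    for C, degree in itertools.product(c_grid, degree_grid):
        score = cv_score(C, degree)  # in practice: k-fold CV accuracy of an SVM
        if score > best_score:
            best_params, best_score = (C, degree), score
    return best_params, best_score

# Stand-in score peaking at C=10, degree=2 (illustrative only).
toy_score = lambda C, degree: -abs(C - 10) - abs(degree - 2)
print(grid_search([0.1, 1, 10, 100], [1, 2, 3], toy_score))
# ((10, 2), 0)
```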

The wish to combine speed and accuracy propelled me to find a more suitable algorithm for predicting bad cows. The extreme learning machine rests on the idea that the hidden-layer parameters of a neural network (input weights and biases) can be randomly assigned and need not be tuned. This feature yields an extremely fast learning speed, and the prediction quality is comparable with the DT and SVM.
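With the hidden layer fixed at random, ELM training reduces to a single least-squares solve for the output weights; a minimal sketch with a tanh activation (the thesis used the tribas activation and 83 neurons; the toy data and sizes here are illustrative):

```python
import numpy as np

def elm_train(X, y, hidden=20, seed=0):
    """Randomly assign hidden weights/biases, then solve the output weights by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))   # input weights: fixed, never tuned
    b = rng.normal(size=hidden)                 # hidden biases: fixed, never tuned
    H = np.tanh(X @ W + b)                      # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # the only trained parameters
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.sign(np.tanh(X @ W + b) @ beta)

# Toy two-class problem: label +1 if the first coordinate is positive.
X = np.array([[1.0, 0.2], [2.0, -1.0], [0.5, 1.5],
              [-1.0, 0.3], [-2.0, -0.5], [-0.4, 1.0]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
W, b, beta = elm_train(X, y)
print(elm_predict(X, W, b, beta))  # recovers the training labels
```

With more hidden neurons than samples the least-squares fit interpolates the targets, which is why no iterative tuning of the hidden layer is needed.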

6.3 Comparison on different optimizing levels

The optimization methods were divided into two branches: manually milking bad cows with TPM enabled, and manually milking bad cows with TPM disabled. Each branch contained four optimization levels: sorting the bad cows and milking them at the end of each session; manually milking the sorted bad cows in 20 seconds; manually milking the sorted bad cows in 10 seconds; and manually milking the sorted bad cows in 6 seconds. Capacity and saved time, as the measurements of the different optimization levels, are analyzed in this section.

The capacities on the different farms are shown in Table 6.3. The original and sorting levels did not change the robot durations, so their increased capacity came from rearranging the milking sequence. At the manual bad cow milking levels, enabling TPM prevented the capacity from increasing further, since the TPM operating duration was longer than 20 seconds; disabling TPM kept the rising trend, at the cost of the milker finishing both cleaning and attaching. At the sorting level, the capacity increased only by around 1 ∼ 3 cows/hour. Considering that bad cows made up about 5% of the herd (10 ∼ 30 cows), manual milking was proposed. Based on the milker's proficiency, the manual milking time was set at 20, 10 and 6 seconds. With TPM enabled, the capacity increased by 3 ∼ 6 cows/hour; with TPM disabled, it increased by 4 ∼ 11 cows/hour. It is important to notice that the number of bad cows reached 15% of the LAP herd, which caused the capacity to increase by 8 cows/hour with TPM enabled and 14 cows/hour with TPM disabled.

Table 6.3: The capacity on different level optimization


The optimized saving time shows how much time could be saved by implementing the optimization methods. In Table 6.4, enabled and disabled TPM follow the same trend as in the optimized capacity table. By sorting bad cows, 2 ∼ 12 minutes per session could be saved. Depending on the number of bad cows, manual milking saved 14 ∼ 26 minutes with TPM enabled and 20 ∼ 39 minutes with TPM disabled. As an example, OHM (on average 30 bad cows out of 550 cows) saved 25 minutes with TPM enabled and a maximum of 39.2 minutes by manual milking in 6 seconds. Because of the number of bad cows on LAP, enabling TPM could save 72 minutes and disabling TPM could save over 100 minutes. Considering the actual conditions, the bad cow classification rule for LAP needed to be modified: instead of only requiring an ACA2 operation time longer than 40 seconds, a cow should both meet the ACA2 rule and fall within the worst 5% of the herd.
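The modified LAP rule combines the duration threshold with a herd-size cap; a minimal sketch selecting at most the slowest 5% among the cows exceeding 40 s (the herd data is illustrative):

```python
def select_bad_cows(avg_aca2_by_cow, threshold_sec=40.0, max_fraction=0.05):
    """Meet the ACA2 rule AND stay within a fixed fraction of the herd:
    keep only the slowest qualifying cows up to the cap."""
    cap = int(len(avg_aca2_by_cow) * max_fraction)
    over = [(dur, cow) for cow, dur in avg_aca2_by_cow.items() if dur > threshold_sec]
    return {cow for _, cow in sorted(over, reverse=True)[:cap]}

# 40 cows, 6 over the 40 s threshold; the 5% cap keeps only the slowest 2.
herd = {f"c{i}": 30.0 for i in range(34)}
herd.update({"s1": 41, "s2": 45, "s3": 50, "s4": 55, "s5": 60, "s6": 90})
print(sorted(select_bad_cows(herd)))  # ['s5', 's6']
```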

Table 6.4: The session saved time of different level optimization

6.4 Robot timeout

The robot timeout analysis mainly focused on ACA2, which had the potential for extended operation times. The durations were divided into 6 intervals, from operation times over 40 seconds to over 90 seconds. The successful attachment rate and the time saved per month are presented in Figure 6.2. As the interval threshold increases, the success rate decreases from under 60% (over 40 seconds) to 1% (over 90 seconds). By capping the operation time at the interval threshold, up to 7.56 hours per month could be saved, i.e. about 7 minutes per session. The tradeoff between success rate and saved time must be taken into consideration.
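The tradeoff can be tabulated from raw attempt logs; a minimal sketch computing, for one threshold, the success rate of the attempts it would cut off and the time saved by capping them (the records below are illustrative (duration in s, success flag) pairs):

```python
def timeout_tradeoff(attempts, threshold_sec):
    """attempts: list of (duration_sec, succeeded) pairs.
    Returns (success rate among attempts over the threshold,
             hours saved if those attempts were capped at the threshold)."""
    over = [(d, ok) for d, ok in attempts if d > threshold_sec]
    if not over:
        return 0.0, 0.0
    rate = sum(ok for _, ok in over) / len(over)
    saved_h = sum(d - threshold_sec for d, _ in over) / 3600
    return rate, saved_h

attempts = [(30, True), (35, True), (45, True), (50, False), (95, False), (130, False)]
rate, saved = timeout_tradeoff(attempts, 40)
print(rate, round(saved, 3))  # 0.25 0.044
```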


Figure 6.2: ACA2 timeout analysis


Chapter 7

Conclusion and future work

7.1 Conclusion

The bad cows defined by the single variable were used to classify the herd. In this case, the optimization is maximized by considering only the factor (the ACA2 duration) that directly influences the slowest robot duration. The multi-variable bad cow definition provides a more comprehensive assessment of cow behavior for the farm, but due to the large proportion of cows it selects and its indirect effect on capacity, it is not suitable for classifying bad cows. According to the experiments, the ideal proportion of bad cows is around 5% ∼ 10% of the herd size; a larger proportion disturbs the milking sequence and increases the labor. Sorting eliminates the bad cows' influence on the good ones, but the bad cows still need to be milked automatically, and the experimental results on the different farms show a very limited capacity gain. Therefore, manual milking of bad cows at different optimization levels was proposed. With TPM enabled during manual milking, the capacity increased by 3 ∼ 6 cows/hour and 14 ∼ 30 minutes were saved per session. With TPM disabled, the capacity increased by 4 ∼ 11 cows/hour and the saved time per session reached 20 ∼ 39 minutes. As for the robot timeout analysis, setting a maximum operating threshold for ACA2 saves a relatively small amount of time; it can be applied as a complementary method alongside manual milking of bad cows to further optimize the capacity, since it limits unexpected ACA2 durations in the good cow sequence.

The fundamental condition for achieving the optimization results above is the assumption that all bad cows are classified correctly. To take the farm's implementation feasibility into consideration, a cow's behavior in the next month is predicted from its behavior in the current month. After building the data sets for training, validation, and testing, three machine learning algorithms (DT, SVM and ELM) were implemented and optimized to obtain the best prediction results. In the current work, the bad cow classification results of the machine learning algorithms are better than the control group (using the current month's behavior directly as the prediction for the next month). However, the machine learning algorithms perform poorly at predicting good cows, which limits their overall performance to below the control group.


7.2 Future work

The AMRTM system simulation demonstrated the optimization potential of classifying bad cows, sorting, and manual milking. Future work based on the simulation has two main directions: collecting new milking data and testing the simulation procedure on real farms. Since human-involved procedures (manually attaching milk cups, and setting cows with bad robot performance to manual milking) affect the authenticity of the bad cow classification, new milking data should be collected with fewer human factors. By implementing the different optimization levels on actual farms, the feedback on actual capacity, cow behavior and potential problems can be used to improve the simulation system.

For the machine learning part, the training data set created in this thesis contained 8 features, all related to the AMRTM system procedure. In order to build a more comprehensive training data set, the variety of features needs to be increased; other important features such as body temperature, weight, health condition and feeding information should be included. The comparison among the machine learning algorithms was based on the prediction accuracy on good and bad cows. In the future, the classification patterns of the different learning algorithms need to be researched to better understand the algorithms and derive a more comprehensive solution. The final aim is to build a general decision algorithm on top of all the algorithms that can learn from their different prediction patterns.



Appendix A

Simulation time and capacity


Table A.1: Simulation time (Enable TPM). For each optimization level, the two values are Duration (hours) / Saved time (min/session).

Farm  Month  Sessions  Original (h)  Sorting          Manual 20 s      Manual 10 s      Manual 6 s
OHM   March  61        441.02        429.56 / 11.27   414.59 / 26.00   414.28 / 26.31   414.15 / 26.43
OHM   April  59        421.09        408.75 / 12.54   394.24 / 27.30   393.97 / 27.57   393.87 / 27.68
OHM   May    62        442.91        431.80 / 10.75   417.59 / 24.51   417.26 / 24.82   417.26 / 24.82
OTT   March  61        291.58        275.86 / 15.46   266.98 / 24.20   266.68 / 24.49   266.57 / 24.60
OTT   April  59        275.11        268.76 / 6.46    264.99 / 10.29   264.71 / 10.57   264.61 / 10.67
OTT   May    62        278.01        272.37 / 5.46    268.77 / 8.94    268.50 / 9.20    268.39 / 9.30
FIN   March  64        546.08        543.85 / 2.10    518.02 / 26.31   517.88 / 26.44   517.83 / 26.49
FIN   April  61        523.32        521.85 / 1.44    497.67 / 25.23   497.35 / 25.54   497.22 / 25.67
FIN   May    63        548.41        546.93 / 1.41    519.08 / 27.93   518.74 / 28.26   518.60 / 28.39
LAP   March  51        661.45        649.54 / 14.01   598.16 / 74.46   597.91 / 74.75   597.81 / 74.86
LAP   April  58        632.56        615.15 / 18.01   554.83 / 80.41   554.61 / 80.64   554.52 / 80.73
LAP   May    60        646.95        635.02 / 11.94   584.69 / 62.27   584.40 / 62.56   584.28 / 62.67


Table A.2: Simulation time (Disable TPM). For each optimization level, the two values are Duration (hours) / Saved time (min/session).

Farm  Month  Sessions  Original (h)  Sorting          Manual 20 s       Manual 10 s       Manual 6 s
OHM   March  61        441.02        429.56 / 11.27   408.68 / 31.82    403.51 / 36.90    401.44 / 38.94
OHM   April  59        421.09        408.75 / 12.54   388.21 / 32.34    383.00 / 37.46    380.92 / 39.50
OHM   May    62        442.91        431.80 / 10.75   411.23 / 30.66    405.93 / 35.78    403.82 / 37.83
OTT   March  61        291.58        275.86 / 15.46   252.22 / 38.71    244.56 / 46.25    241.50 / 49.26
OTT   April  59        275.11        268.76 / 6.46    256.46 / 18.96    251.87 / 23.64    250.03 / 25.50
OTT   May    62        278.01        272.37 / 5.46    261.01 / 16.45    256.90 / 20.42    255.26 / 22.01
FIN   March  64        546.08        543.85 / 2.20    497.07 / 48.21    473.42 / 71.47    463.96 / 80.78
FIN   April  61        523.32        521.85 / 1.44    479.06 / 43.53    458.14 / 64.11    449.77 / 72.34
FIN   May    63        548.41        546.93 / 1.41    498.46 / 47.57    475.52 / 69.41    466.35 / 78.15
LAP   March  51        661.45        649.54 / 14.01   574.48 / 102.32   556.01 / 124.05   548.62 / 132.74
LAP   April  48        632.56        615.15 / 21.76   528.84 / 129.65   508.55 / 155.01   500.43 / 165.16
LAP   May    60        646.95        635.02 / 11.94   564.80 / 82.16    548.50 / 98.45    541.98 / 104.97


Table A.3: Capacity (Disable TPM). For each optimization level, the two values are Duration (hours) / Capacity (cows/hour).

Farm  Month  Milk records  Orig. capacity  Sorting          Manual 20 s      Manual 10 s      Manual 6 s
OHM   March  37228         84.41           429.56 / 86.66   408.68 / 91.09   403.51 / 92.26   401.44 / 92.74
OHM   April  34577         82.11           408.75 / 84.59   388.21 / 89.07   383.00 / 90.28   380.92 / 90.77
OHM   May    36027         81.34           431.80 / 83.43   411.23 / 87.61   405.93 / 88.75   403.82 / 89.22
OTT   March  23917         82.03           275.86 / 86.70   252.22 / 94.83   244.56 / 97.79   241.50 / 99.04
OTT   April  23912         86.92           268.76 / 88.97   256.46 / 93.24   251.87 / 94.94   250.03 / 95.64
OTT   May    24563         88.35           272.37 / 90.18   261.01 / 94.11   256.90 / 95.61   255.26 / 96.23
FIN   March  43329         80.48           535.69 / 80.88   516.91 / 83.82   510.77 / 84.83   508.32 / 85.24
FIN   April  41846         81.00           514.72 / 81.30   496.23 / 84.33   490.41 / 85.33   488.08 / 85.74
FIN   May    43201         79.65           540.20 / 79.97   520.07 / 83.07   513.83 / 84.08   511.34 / 84.49
LAP   March  47125         71.25           649.54 / 72.55   574.48 / 82.03   556.01 / 84.76   548.62 / 85.90
LAP   April  44491         70.34           615.15 / 72.33   528.84 / 84.13   508.55 / 87.49   500.43 / 88.91
LAP   May    46314         71.59           635.02 / 72.93   564.80 / 82.00   548.50 / 84.44   541.98 / 85.45


Table A.4: Capacity (Enable TPM). For each optimization level, the two values are Duration (hours) / Capacity (cows/hour).

Farm  Month  Milk records  Orig. capacity  Sorting          Manual 20 s      Manual 10 s      Manual 6 s
OHM   March  37228         84.41           429.56 / 86.66   414.59 / 89.80   414.28 / 89.86   414.15 / 89.89
OHM   April  34577         82.11           408.75 / 84.59   394.24 / 87.70   393.97 / 87.76   393.87 / 87.79
OHM   May    36027         81.34           431.80 / 83.43   417.59 / 86.27   417.26 / 86.34   417.26 / 86.34
OTT   March  23917         82.03           275.86 / 86.70   266.98 / 89.58   266.68 / 89.68   266.57 / 89.72
OTT   April  23912         86.92           268.76 / 88.97   264.99 / 90.24   264.71 / 90.33   264.61 / 90.37
OTT   May    24563         88.35           272.37 / 90.18   268.77 / 91.39   268.50 / 91.48   268.39 / 91.52
FIN   March  43329         80.48           535.69 / 80.88   522.29 / 82.96   522.19 / 82.98   522.15 / 82.98
FIN   April  41846         81.00           514.72 / 81.30   501.26 / 83.48   500.94 / 83.53   500.82 / 83.56
FIN   May    43201         79.65           540.20 / 79.97   525.46 / 82.22   525.13 / 82.27   524.99 / 82.29
LAP   March  47125         71.25           649.54 / 72.55   598.16 / 78.78   597.91 / 78.82   597.81 / 78.83
LAP   April  44491         70.34           615.15 / 72.33   554.83 / 80.19   554.61 / 80.22   554.52 / 80.23
LAP   May    46314         71.59           635.02 / 72.93   584.69 / 79.21   584.40 / 79.25   584.28 / 79.27


Appendix B

ACA2 Robot Timeout Analysis


Table B.1: ACA2 Robot Timeout Analysis

OHM 0301-0401 (total data: 22148)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                2                 3/150          0.14
>80                7.66              21/274         0.714
>70                16.29             58/356         1.59
>60                23.15             97/419         2.65
>50                23.35             199/852        4.23
>40                43.34             615/1419       7.23
<=40               98.05             20325/20729

OHM 0401-0501 (total data: 18619)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                0.78              1/128          0.117
>80                9.45              21/222         0.6
>70                17.9              53/296         1.32
>60                26.83             99/369         2.22
>50                24.36             202/829        3.67
>40                44.11             592/1342       6.54
<=40               98.4              17008/17277

OHM 0501-0601 (total data: 18243)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                1.48              2/135          0.12
>80                10.6              26/244         0.637
>70                29.8              62/331         1.45
>60                27.5              114/414        2.48
>50                22.9              218/951        4.15
>40                42.4              662/1561       7.47
<=40               98.4              16415/16682

OTT 0301-0401 (total data: 10508)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                0                 10358          0
>80                5.12              17/332         0.313
>70                25.12             148/589        1.58
>60                43.3              352/813        3.46
>50                61.7              757/1227       6.18
>40                67.6              1649/2438      10.82
<=40               98.9              7981/8070

OTT 0401-0501 (total data: 11084)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                0                 10978          0
>80                5.12              8/189          0.189
>70                25.12             77/346         0.9125
>60                37.8              169/447        1.98
>50                54.74             346/632        3.43
>40                58.19             810/1392       5.93
<=40               99.6              9653/9692

OTT 0501-0601 (total data: 10912)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                0                 14531          0
>80                3.05              5/164          0.156
>70                8.06              59/326         0.806
>60                34.2              144/421        1.83
>50                50.9              298/586        3.2
>40                52.8              707/1339       5.56
<=40               98.6              9438/9573

GAL 0301-0401 (total data: 25307)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                0.77              1/130          0.1
>80                10.84             23/295         0.69
>70                20.76             87/419         1.68
>60                27.6              174/630        3.05
>50                47.44             418/881        5.07
>40                67                1079/1543      8.24
<=40               99.95             23754/23764

GAL 0401-0501 (total data: 24350)
ACA2 duration (s)  Success rate (%)  Success/Total  Saved time (h)
>90                1.7               2/116          0.09
>80                6.8               21/309         0.69
>70                14.7              72/491         1.78
>60                24.9              170/682        3.34
>50                43.7              407/931        5.49
>40                64.8              970/1497       8.66
<=40               99.9              22841/22853


TRITA-EE 2016:205

ISSN 1653-5146

www.kth.se