ML LAId bare

Cambridge Wireless SIG Meeting

Mary-Ann & Phil Claridge

23 November 2017

www.mandrel.com @MandrelSystems

1

© 2017 Mandrel Systems www.mandrel.com @MandrelSystems

Welcome To Our Toolbox

Our Opinionated Views!

• “Data” IDE

• Wrangling

• Mainline Exploration and Prototyping – For Programmers

• Supporting Cast

• Up and Coming – Hard Thinking

• Datasets & Kaggle

• Demos

2


Data “IDE”

H2O

Weka

Honorable mention:

BigML (free to use, not open source)

3


H2O

• Good
  • Good analysis of performance of generated AI/ML.
  • Some ML knowledge required to interpret terminology and results.
  • Easy install (local or cloud)
    • Download, unzip
    • java -jar h2o.jar
  • Sparkling Water provides integration with Spark for large data sets.

• Bad
  • Model generation focused on Java

• Recommended
  • Excellent results for many commercial projects from open source.

• Demo later!

4


Installing H2O

http://h2o-release.s3.amazonaws.com/h2o/rel-weierstrass/7/index.html

5


Weka

• Good
  • Lots of basic data wrangling with no code.
    • E.g. string to vector.
  • High degree of control to evaluate different ML algorithms
  • Auto-WEKA

• Bad
  • Now a little dated
  • Focus on traditional machine learning

• Recommended
  • Learn how ML works, and ML algorithms, without coding.
  • Some quick and dirty wrangling.
  • Good set of free training videos that focus on ML, not programming.

• https://www.cs.waikato.ac.nz/ml/index.html
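As a rough illustration of what a "string to vector" step does, here is a minimal bag-of-words sketch in plain Python. It is not Weka's implementation; the function names and the two sample sentences are made up for this example.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def build_vocabulary(documents):
    """Return the sorted list of distinct words across all documents."""
    return sorted({word for doc in documents for word in tokenize(doc)})


def to_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


# Hypothetical sample documents, for illustration only.
docs = ["the cat sat", "the dog sat on the cat"]
vocab = build_vocabulary(docs)
vectors = [to_vector(doc, vocab) for doc in docs]
```

Each document becomes a fixed-length list of word counts, which is the kind of numeric input a traditional ML algorithm can work with.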

6


Aside: BigML (not open source)

• Good
  • Point and click model building
  • Download ready-to-use models in most languages including Python, C#, Java, JavaScript and Excel!
  • Decision tree (inc. ensemble) + neural nets
  • Fantastic graphics.

• Bad
  • Not open source (but free to use for small data sets).

• Recommended
  • Three days' pay for three hours' work
  • For many commercial applications.
  • Explaining basic ML to a non-technical audience.
  • White box vs black box.
  • Sufficient for poor quality or low volume data where more sophisticated tools bring no benefit
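One way to picture the "white box" point: a decision tree exported as code is just nested if/else rules that a person can read and check. The sketch below is hypothetical. The function name, the feature names and every threshold are invented for illustration and are not taken from any BigML model or from the results in this talk.

```python
def predict_diabetes(glucose, bmi, age):
    """Toy decision tree written as readable rules.

    All thresholds are made up for illustration; they are not fitted to data.
    """
    if glucose < 120:
        if bmi < 30:
            return 0
        return 1 if age >= 40 else 0
    if bmi < 28:
        return 0
    return 1


# Each prediction can be traced by hand through the rules above.
low_risk = predict_diabetes(glucose=100, bmi=25, age=30)
high_risk = predict_diabetes(glucose=150, bmi=35, age=50)
```

A neural net trained on the same data would give no comparable trail of rules to follow, which is the black box side of the comparison.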

7


Wrangling

Small: Python + Pandas

Large: Python + Spark

Large, complex and production: Scala + Spark

Honorable supporting cast:

• Anaconda: Pre-built Python environment

• Parquet: Database table as a file. Fast & small

• IntelliJ: Commercial IDE for Python, Java, Scala, JavaScript and web

8


Programming Basic ML + Neural Nets

• Python and …
  • Scikit-learn
  • TensorFlow (+ Keras)

• Scala …
  • Spark.ML

9


AI Gym

Reinforcement Learning

• Reinforcement learning

• Go play.

• A very different kind of AI

• No demo today !

• https://gym.openai.com/docs/
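The reinforcement learning loop works like this: the agent picks an action, the environment returns an observation and a reward, and this repeats until the episode ends. The sketch below mimics that loop with a toy environment in plain Python so it runs without installing anything. `CoinFlipEnv` and `run_episode` are hypothetical names, and the `reset`/`step` shape only loosely follows the interface described in the Gym docs.

```python
import random


class CoinFlipEnv:
    """Toy environment: guess a coin flip, +1 reward per correct guess."""

    def __init__(self, episode_length=10, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.steps = 0
        return 0  # the observation carries no information in this toy

    def step(self, action):
        """Apply an action (0 or 1); return (observation, reward, done)."""
        coin = self.rng.randint(0, 1)
        reward = 1 if action == coin else 0
        self.steps += 1
        done = self.steps >= self.episode_length
        return coin, reward, done


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    observation = env.reset()
    total_reward = 0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


# A random policy: no learning yet, just the agent-environment loop itself.
env = CoinFlipEnv(episode_length=10, seed=0)
policy_rng = random.Random(1)
total = run_episode(env, lambda obs: policy_rng.randint(0, 1))
```

A learning agent would replace the random policy with one that is updated from the rewards it sees. There is no labelled training set, which is what makes this a very different kind of AI from the classifiers elsewhere in this deck.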

10


Demo Data - Pima Indians Diabetes Data Set

https://archive.ics.uci.edu/ml/datasets/pima+indians+diabetes

1. Number of times pregnant
2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3. Diastolic blood pressure (mm Hg)
4. Triceps skin fold thickness (mm)
5. 2-Hour serum insulin (mu U/ml)
6. Body mass index (weight in kg/(height in m)^2)
7. Diabetes pedigree function
8. Age (years)
9. Class variable (0 or 1)
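A minimal sketch of reading data in this nine-column layout with Python's standard library, assuming a comma-separated file with no header row. The short column names are chosen for this example, and the two sample rows are illustrative values standing in for the real file.

```python
import csv
import io

# Short column names chosen for this sketch, in the order listed above.
COLUMNS = [
    "pregnancies", "glucose", "blood_pressure", "skin_thickness",
    "insulin", "bmi", "pedigree", "age", "outcome",
]

# Illustrative rows in the same layout (a stand-in for the real file).
SAMPLE = "6,148,72,35,0,33.6,0.627,50,1\n1,85,66,29,0,26.6,0.351,31,0\n"


def load_rows(handle):
    """Parse headerless CSV rows into dicts of floats keyed by column name."""
    rows = []
    for record in csv.reader(handle):
        if not record:
            continue  # skip blank lines
        rows.append({name: float(value) for name, value in zip(COLUMNS, record)})
    return rows


rows = load_rows(io.StringIO(SAMPLE))
# Split each row into the eight input features and the class label.
features = [[row[name] for name in COLUMNS[:-1]] for row in rows]
labels = [int(row["outcome"]) for row in rows]
```

To read a downloaded copy instead, pass an open file handle to `load_rows` in place of the `io.StringIO` wrapper.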

11


For more data sets: Kaggle!

Model Accuracy

Tool     Model                        Accuracy
H2O      Ten level tree               97%
H2O      Neural Net                   97.5%
BigML    48 Node Decision Tree        82.7%
BigML    Neural Net (shallow)         77%
Keras    3 Layer Simple Neural Net    77.73%
Keras    Deeper neural net            To follow
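Accuracy here means the fraction of predictions that match the true class. The sketch below computes it in plain Python, along with the score obtained by always predicting the most common class, which is a handy baseline when reading accuracy figures. The labels and predictions are made-up values, not results from the talk.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions equal to the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy of empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the most common class."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


# Made-up labels and predictions, for illustration only.
y_true = [0, 0, 0, 1, 1, 0, 1, 0]
y_pred = [0, 0, 1, 1, 0, 0, 1, 0]
acc = accuracy(y_true, y_pred)
baseline = majority_baseline(y_true)
```

Which rows the score is measured on (training data or held-out data) also matters when comparing tools.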

12


Next

• Demos
  • H2O
  • Keras

• Q&A

13


Screen Shots

14


Demo H2O - import Pima Indians diabetes data set, and build decision tree

[H2O screenshots, slides 15 to 23]


BigML – same process as H2O demo above

[BigML screenshots, slides 24 to 37]

