User-Operated Model-Building Systems - Data Science: Inconvenient Truths

PROOF OF FAILURE Clare Corthell Machine Learning Engineer & Data Scientist @clarecorthell www.datasciencemasters.org

Deal Intelligence Platform
find and evaluate private companies

Machine Learning Need

• Very little structured information

• Disaggregated data

• Need for categorization

=> Data Structuring & Creation

WHAT DOES THE COMPANY DO?
“industry” dimension

INDUSTRIES AS BINARY CATEGORIES
you’re in or out
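A minimal sketch of what "in or out" could look like as data, assuming each industry is an independent yes/no flag, so one company may belong to several industries or to none. The industry names and the `encode_industries` helper are hypothetical, chosen for illustration from the examples later in the talk; they are not the platform's actual schema.

```python
# Hypothetical industry list, borrowed from the talk's two examples.
INDUSTRIES = ["wind turbines", "wearables"]


def encode_industries(company_industries, all_industries=INDUSTRIES):
    """Return one independent in/out flag per industry.

    Each industry is its own binary question, so a company can be
    "in" for several industries at once, or for none of them.
    """
    members = set(company_industries)
    return {name: (name in members) for name in all_industries}


print(encode_industries(["wearables"]))
```

Framed this way, each industry can get its own binary model, and a mistake in one category does not force a change in another.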

inputs → model → outputs

decision:
• reinforce
• ship

PERFECTION! UTOPIA!

HUMAN INFERENCE WITHOUT HUMANS!
and 60 fewer people on payroll

- $4.2m / yr

ANALYSTS
the user is not the database

EXAMPLE 1: WIND TURBINES
Wind Turbines.

definition

EXAMPLE 2: WEARABLES

on your body?
electronic?
new materials?
what are they?

definition

REINFORCEMENT
doesn’t always work

REINFORCEMENT PITFALLS

- (Technical) Overfitting
- Humans have to question their own assumptions
- Dimensional encoding issues (is this expressible in features?)
- Human definitions are inadequate

SOOOOOOOOOOOOOOO…

SVM > neural nets

things I’ve heard recently

WHY IS SVM BETTER?
Feature inspectability

• sometimes for debugging
• mostly for humans
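To make "feature inspectability" concrete, here is a rough sketch of a linear SVM trained with plain subgradient descent on the hinge loss, followed by a look at which words carry the most weight. Everything in it is an assumption made for illustration: the toy company descriptions, the bag-of-words features, the hyperparameters, and the helper names `train_linear_svm` and `top_features`. It is not the system described in the talk, and inspectability of this kind applies to linear models, not to kernel SVMs in general.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: +1 means "in" the wind-turbine industry, -1 means "out".
DOCS = [
    ("offshore wind turbine blades", 1),
    ("wind turbine maintenance services", 1),
    ("turbine manufacturer for wind farms", 1),
    ("fitness tracker wristband", -1),
    ("smart watch for fitness", -1),
    ("mobile payments app", -1),
]


def featurize(text):
    """Bag-of-words counts; the simplest feature a human can read back."""
    return Counter(text.lower().split())


def train_linear_svm(docs, epochs=50, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss."""
    w = defaultdict(float)
    b = 0.0
    for _ in range(epochs):
        for text, y in docs:
            x = featurize(text)
            margin = y * (sum(w[t] * c for t, c in x.items()) + b)
            # L2 regularization: shrink every weight a little on each step.
            for t in list(w):
                w[t] *= 1.0 - lr * lam
            # Hinge loss: only examples inside the margin push the weights.
            if margin < 1.0:
                for t, c in x.items():
                    w[t] += lr * y * c
                b += lr * y
    return dict(w), b


def top_features(w, k=2):
    """Words with the largest positive weight, i.e. the strongest "in" signals."""
    return sorted(w, key=w.get, reverse=True)[:k]


weights, bias = train_linear_svm(DOCS)
print(top_features(weights))
```

Because the model is a single weight per word, an analyst can read the learned weights directly and check whether the model is keying on the words they would expect.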

Humans don’t know what transformation the black box exerts on inputs. But sometimes, they need to know.

Their investors, their customers, their data analysts, their operators, their CEO — all want to know.

MONSTERS IN THE BLACK BOX

because

HUMANS SHOULD BE HUMANS
COMPUTERS SHOULD BE COMPUTERS.

Sometimes, our identities get a little mixed up.

1. Set Expectations
make sure the organization understands failures

2. Reduce the “Trickery”*
We build systems for humans. They need to understand how the levers and knobs affect the outcome.

*h/t Sean Taylor

datasciencemasters.org


@clarecorthell

mattermark.com