© 2018 FORRESTER. REPRODUCTION PROHIBITED.
Why AI Goes Wrong And How To Avoid It
Brandon Purcell
June 18, 2018
Source: https://twitter.com/jackyalcine/status/615329515909156865
We probably don’t need to worry about this in the near future…
Google’s response:
“There is still clearly a lot
of work to do with
automatic image labeling,
and we're looking at how
we can prevent these
types of mistakes from
happening in the future."
But this is happening today
Companies are learning that the road to hell is paved with good intentions
And they are paying for it in three ways:
›Reputational erosion
›Revenue loss
›Regulatory fines
Reputational risk: the erosion of brand equity
Microsoft Deletes ‘Teen Girl' AI After It Became A Hitler-Loving Sex Robot Within 24 Hours
Amazon Prime And The Racist Algorithms
"I would definitely stop doing business with any company altogether if I found out that they discriminate against anyone!” – 27 year old female consumer
Source: Google Finance
Ethical failures erode shareholder value
› Article 9(1): "Processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of genetic data, biometric data for the purpose of uniquely identifying a natural person, data concerning health or data concerning a natural person's sex life or sexual orientation shall be prohibited."
› For breaches of key points of GDPR, such as the basic principles for processing data, obtaining consent, and the requirements for international transfers, fines can reach the higher of 4% of annual global revenue or €20,000,000.
Biased AI could result in severe regulatory penalties
Why is this happening?
Source: https://www.forrester.com/report/The+Ethics+Of+AI+How+To+Avoid+Harmful+Bias+And+Discrimination/-/E-RES130023
Models can learn three types of bias
Algorithmic bias
A model is only as good as the
data used to train it
A crash course in machine learning
                      Supervised learning                Unsupervised learning
Purpose               To predict / classify              To explore / understand
Training data         Labelled (knows the "answer")      Not labelled (no "right answer")
Accuracy              Measurable                         Qualitatively evaluated
Marketing use cases   Predict which customers are        Behavioral customer
                      likely to respond / churn / buy    segmentation
The birds and the bees of model-making: supervised machine learning
Labeled data is split into training data and validation data. A machine learning algorithm learns from the training data to produce a classification model, which is evaluated against the validation data. The model is then applied to unlabeled data; the final output is newly classified data.
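The flow above can be sketched end to end. This is a minimal illustration, not a real model: the customer records are invented and a simple 1-nearest-neighbor rule stands in for the learning algorithm.

```python
# A minimal sketch of the supervised-learning flow on the slide:
# labeled data -> train/validation split -> model -> accuracy -> classify new data.
# The (monthly_spend, tenure) records and the 1-NN "algorithm" are toy stand-ins.
import random

random.seed(0)

# Labeled data: (feature vector, label), e.g. does this customer churn?
labeled = [((20, 1), "churn"), ((25, 2), "churn"), ((30, 1), "churn"),
           ((90, 24), "stay"), ((85, 30), "stay"), ((95, 28), "stay"),
           ((22, 3), "churn"), ((88, 26), "stay")]

random.shuffle(labeled)
train, validation = labeled[:6], labeled[6:]   # hold out data to measure accuracy

def predict(x, training_set):
    """1-nearest-neighbor: return the label of the closest training example."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(training_set, key=lambda ex: dist(ex[0], x))[1]

# Supervised accuracy is measurable because the validation data knows the "answer".
correct = sum(predict(x, train) == y for x, y in validation)
print(f"validation accuracy: {correct}/{len(validation)}")

# Final output: newly classified (previously unlabeled) data.
for x in [(28, 2), (92, 25)]:
    print(x, "->", predict(x, train))
```

Holding out validation data is what lets you report the "Measurable" accuracy from the table above instead of grading the model on examples it has already seen.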
FaceApp demonstrates the problem of algorithmic bias
Bad training data created a racist filter
Algorithmic bias is caused by unrepresentative training data
[Chart: distribution of the training data vs. the total population]
Training data should be IID – independent and identically distributed
[Chart: distribution of the training data vs. the total population]
Much better!
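A representativeness check along these lines can be sketched in a few lines. The group names, shares, and counts below are invented; the idea is simply to compare each group's share of the training data against its share of the total population and flag big gaps.

```python
# A sketch of a representativeness check: compare how often each group appears
# in the training data vs. the total population. Large gaps are the kind of
# unrepresentative sampling that produces algorithmic bias. All data is made up.
from collections import Counter

# Hypothetical share of each group in the total population.
population_share = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

# Hypothetical group membership of each training example.
training_labels = ["group_a"] * 900 + ["group_b"] * 80 + ["group_c"] * 20

counts = Counter(training_labels)
total = len(training_labels)

for group, pop_share in population_share.items():
    train_share = counts[group] / total
    ratio = train_share / pop_share            # 1.0 would be perfectly representative
    flag = "  <-- under-represented" if ratio < 0.5 else ""
    print(f"{group}: train {train_share:.0%} vs population {pop_share:.0%}{flag}")
```

The 0.5 cutoff is an arbitrary illustration; the point is to quantify the gap between the two distributions before training, not after the model misbehaves.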
What happens when historical biases are captured in the data?
Human bias
Even with good training data, models can pick up on human biases
Man is to woman as
Computer programmer is to _________
Google’s Word2Vec model for natural
language processing is sexist
They can be sexist…
Man is to woman as
Computer programmer is to homemaker
Google’s Word2Vec model for natural
language processing is sexist
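A toy sketch of the vector arithmetic behind that analogy: compute vec("woman") - vec("man") + vec("programmer") and return the nearest remaining word by cosine similarity. The 3-d vectors below are hand-made to reproduce the reported behavior; real Word2Vec vectors are learned from large text corpora, which is exactly how human bias gets into them.

```python
# Toy word-analogy arithmetic in the style of Word2Vec. The embedding values
# are invented for illustration, not learned.
import math

vecs = {
    "man":        [1.0, 0.0, 0.2],
    "woman":      [0.0, 1.0, 0.2],
    "programmer": [1.0, 0.1, 0.9],
    "homemaker":  [0.1, 1.0, 0.9],
    "banana":     [0.3, 0.3, 0.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def analogy(a, b, c):
    """b is to a as c is to ?  Computed as a - b + c, nearest word by cosine."""
    target = [ai - bi + ci for ai, bi, ci in zip(vecs[a], vecs[b], vecs[c])]
    candidates = [w for w in vecs if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vecs[w], target))

print(analogy("woman", "man", "programmer"))  # -> homemaker (by construction here)
```

With a real embedding (e.g. gensim's `KeyedVectors.most_similar` with `positive`/`negative` word lists) the same arithmetic surfaced the sexist completion on the slide.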
Amazon Prime same day delivery shows the problem of human bias
› Rolled out in 2015 to compete with instant gratification
factor of brick & mortar retailers
› 27 metropolitan areas
› Postal codes with 77 million people
› Excludes predominantly black postal codes in 6 major
cities: Atlanta, Boston, Chicago, Dallas, New York, and
Washington, D.C.
A tale of two cities
Source: https://www.bloomberg.com/graphics/2016-amazon-same-day/
The blue-shaded areas got same-day delivery
Amazon did not intend to exclude predominantly black postal codes
› “Demographics play no role in it. Zero.” - Craig Berman, Amazon’s VP, Global Communications
› Model based on concentration of Prime members
› Inherited historical human bias in the form of redlining and de facto segregation
› Included variables that are a proxy for race
Inherited human bias perpetuates that bias in a vicious cycle
[Cycle diagram: data → insights → action. Data with human bias yields models with human bias, which drive discriminatory action, and that action generates more biased data.]
The perpetuation of human bias in the criminal justice system…
› COMPAS - Correctional Offender Management Profiling
for Alternative Sanctions
Can have devastating consequences
› Black defendants were almost twice as likely as white defendants to be falsely labelled future criminals
› White defendants were more likely to be mislabeled as low risk
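The disparity above boils down to comparing false positive rates across groups: of the people who did not reoffend, what share was flagged high-risk in each group? A minimal sketch of that computation, using eight fabricated records (not COMPAS data):

```python
# Per-group false positive rate check, in the spirit of ProPublica's COMPAS
# analysis. The records are fabricated purely to show the computation.
records = [
    # (group, predicted_high_risk, actually_reoffended)
    ("A", True, False), ("A", True, False), ("A", True, True), ("A", False, False),
    ("B", True, False), ("B", False, False), ("B", False, False), ("B", True, True),
]

def false_positive_rate(group):
    """Share of non-reoffenders in a group who were still flagged high-risk."""
    negatives = [r for r in records if r[0] == group and not r[2]]  # did not reoffend
    false_pos = [r for r in negatives if r[1]]                      # but flagged anyway
    return len(false_pos) / len(negatives)

for g in ("A", "B"):
    print(f"group {g}: false positive rate = {false_positive_rate(g):.0%}")
```

In this toy data group A's rate is double group B's even though both groups contain the same mix of outcomes, which is the shape of the disparity reported for COMPAS.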
Combatting human bias requires a deep understanding of the problem and the data
›Are you including variables that are proxies for
race, age, or other protected classes?
›Can you exclude these variables?
›Or can you modify the training data to reflect a
more just outcome?
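One way to approach the first question above is a simple proxy check: correlate each candidate model input with the protected attribute, and treat a strong correlation as a sign the feature is a stand-in for the protected class. A minimal sketch with invented binary data and an arbitrary 0.5 threshold:

```python
# Proxy-variable check: how strongly does a candidate feature correlate with
# membership in a protected class? The data below is invented for illustration.
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# 1 = lives in a postal code the model excludes; 1 = member of the protected class.
candidate_feature = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
protected_class   = [1, 1, 0, 0, 0, 1, 0, 1, 1, 0]

r = pearson(candidate_feature, protected_class)
print(f"correlation with protected class: {r:.2f}")
if abs(r) > 0.5:
    print("likely proxy: exclude or rework this feature")
```

A feature can still be a proxy through nonlinear combinations that a pairwise correlation misses, so this kind of check is a first screen, not a guarantee.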
Code the change you want to see in the
world
Useful (intentional) bias
But sometimes it is ok to exploit differences between customers
›Who should you market these items to?
Models help you identify and take advantage of different preferences and behaviors
›Good luck selling Waldo’s sweater to Charlie
Brown!
When is it ok to treat different
customers differently…
and when isn’t it?
Define roles and responsibilities for ensuring the ethics of algorithms
• Defining “ethical” needs to be an executive-level conversation
• Business units should be responsible for overseeing ethical deployment and measurement
• Data scientists should be the first line of defense against algorithmic bias
Embrace diversity by soliciting a diverse array of viewpoints
• Employ diverse perspectives at the data scientist, LoB, and executive levels
• Listen to the Voice of the Customer for their opinions
• Consult experts in algorithmic bias:
  • Algorithmic Justice League – Joy Buolamwini
  • University of Massachusetts at Amherst – Themis
  • IEEE – Global Initiative for Ethical Considerations in Artificial Intelligence and Autonomous Systems
Most importantly, make your models FAIR