Approachable AI
Network: MSFTGUEST, Code: msevent44uc
Introductions
Andrew Kraemer, Senior Lead Consultant – Data Science, Data and AI Practice
Lee Harper, Principal Data Scientist, Data and AI Practice
Agenda
9:00–9:30 Welcome, What is AI, what problems can it solve?
9:30–10:15 Lifting the covers on the Data Science process
10:15 Break
10:30–11:00 Staffing your AI capability
11:00–11:15 AI tools in Microsoft Azure
11:15 Break
11:25–11:45 Demonstration – Deploying an IoT Model using Azure
11:45–12:30 Frontiers of AI, Q and A
What is AI?
What Problems Can it Solve?
What Isn’t AI?
What Is AI?
Depends upon who you ask!
Artificial intelligence (AI) is the simulation of human cognitive
processes by machines, especially computer systems.
In practice, mathematical models are used to make decisions
based on available data, in the presence of uncertainty.
Is AI Just Hype?
• According to Gartner, 85% of AI
projects fall short of expectations
• Disconnect between expectations
and reality
• There are many problems where AI
has been proven to add substantial
value
Artificial Intelligence Breakthroughs
Why Now?
Importance of AI to an Organization
AI
value
86% of CEOs consider digital technologies and AI to be the No. 1 priority for their companies
Source: PwC General Directors Survey
AI Value Drivers:
New types of income
Improved customer experience
Lower operating costs
Increased productivity
Increased asset efficiency
Risk reduction
Types of Advanced Analytics
Most AI in use today falls
into these categories
Computer Vision – Information From Images
Disease Diagnosis
Optical Character Recognition
Object Detection and Recognition
Natural Language Processing
• Sentiment analysis: how are citizens feeling?
• Automatic translation
• Chatbots and smart assistants
• Speaker and voice recognition
Prediction: What Will Happen
Regression / Forecasting: predict a future value
- School enrollment forecasting
- House price estimation
Classification: predict probability of outcome
- Identify students at risk of failing to graduate
- Predictive maintenance on shale gas fields
The Machine Learning Workflow
Example Problem
A car dealership group has the following problem statement:
“We have a large potential customer base. We would like to figure out which of our customers are most likely to buy a new car from us, so that we can produce a highly targeted marketing campaign. We want to increase sales relative to our current segmentation strategy.”
Customer Profile
• A large automobile dealership group
• Almost 50 dealerships across multiple geographies
• Around 500,000 active customers
• Data on over 1.5 million customers
• Around 70 million customer interactions recorded
• Data exists on 5 different systems of record
What Data Might We Get?
• There may not be common elements across all tables – making combining difficult
• Data may arrive completely unprocessed
Data sources feeding the data set: CDK Dealership Management, Salesforce, Third Party Financial Data, Operations Management, Internet Enquiries, Phone Calls
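To make the combining problem concrete, here is a minimal sketch. It assumes two hypothetical extracts (a CRM export and a service log) that share no ID column and can only be linked through an inconsistently formatted email address. All field names and records are invented for illustration and do not come from the dealership's systems.

```python
# Hypothetical records from two systems of record with no shared ID column.
crm_rows = [
    {"crm_id": "C-001", "email": "Pat.Lee@Example.com "},
    {"crm_id": "C-002", "email": "sam@example.com"},
]
service_rows = [
    {"ticket": 9001, "contact_email": "pat.lee@example.com"},
    {"ticket": 9002, "contact_email": "unknown@example.com"},
]


def normalize_email(raw):
    """Trim whitespace and lowercase so equivalent addresses compare equal."""
    return raw.strip().lower()


def link_records(crm, service):
    """Attach each service ticket to a CRM customer when the emails match."""
    by_email = {normalize_email(row["email"]): row["crm_id"] for row in crm}
    linked, unmatched = [], []
    for row in service:
        crm_id = by_email.get(normalize_email(row["contact_email"]))
        if crm_id is None:
            unmatched.append(row["ticket"])  # no common element found
        else:
            linked.append((crm_id, row["ticket"]))
    return linked, unmatched


linked, unmatched = link_records(crm_rows, service_rows)
```

Records that cannot be linked are kept in `unmatched` rather than dropped, so the size of the gap between systems stays visible.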
Exploratory Data Analysis
• Identify biggest areas for improvement — is the problem worth solving?
• Investigate feasibility of the AI project
• Provide instant analytical insights
[Bar chart: Conversion Rate at Time of Car Sale for Dealership A, Dealership B, and Dealership C; axis from 0 to 60]
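A quick exploratory check of this kind could look like the sketch below. It assumes a hypothetical list of leads tagged with a dealership and a sold/not-sold flag; the numbers are invented and are not the figures from the chart.

```python
from collections import defaultdict

# Hypothetical leads: (dealership, whether the lead ended in a sale).
leads = [
    ("Dealership A", True), ("Dealership A", False),
    ("Dealership A", False), ("Dealership A", False),
    ("Dealership B", True), ("Dealership B", True),
    ("Dealership B", False), ("Dealership B", False),
    ("Dealership C", False), ("Dealership C", False),
]


def conversion_rates(rows):
    """Return the percentage of leads that converted, per dealership."""
    totals = defaultdict(int)
    sales = defaultdict(int)
    for dealership, sold in rows:
        totals[dealership] += 1
        sales[dealership] += int(sold)
    return {name: 100.0 * sales[name] / totals[name] for name in totals}


rates = conversion_rates(leads)
```

A large spread between dealerships would suggest the problem is worth solving before any model is built.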
Assembling Customer Journeys
• How do we assign data to a customer?
• Event based data: Service email → Called to book service → Vehicle Service → Special offer → Visited dealer → Took a test drive → ?
• Also have non-event based attributes (e.g., demographic data)
• AI model is going to learn the history and demographic factors that tend to lead to a car being purchased
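One way to assemble a journey is sketched below, assuming the events from the different systems have already been linked to a customer ID. The tuple layout and the dates are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical events pooled from several systems: (customer_id, date, event).
events = [
    (2, date(2019, 3, 1), "Special offer"),
    (1, date(2019, 2, 10), "Called to book service"),
    (1, date(2019, 1, 5), "Service email"),
    (1, date(2019, 2, 20), "Vehicle Service"),
    (2, date(2019, 3, 9), "Visited dealer"),
]


def assemble_journeys(rows):
    """Group events by customer and order each customer's events by date."""
    journeys = defaultdict(list)
    for customer_id, when, event in rows:
        journeys[customer_id].append((when, event))
    return {cid: [e for _, e in sorted(items)] for cid, items in journeys.items()}


journeys = assemble_journeys(events)
```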
Extracting Features From Data
• AI models read mathematics, not timelines!
• Feature engineering is the “art” of the AI field
Customer journey: Service email → Called to book service → Vehicle Service → Special offer → Visited dealer → Took a test drive
• Utilize many customer journey descriptors; all can be expressed mathematically
- Number of cars previously purchased
- Length of customer tenure
- Price of any previously purchased cars
Can be used for traditional segmentation
- Did a person come in for oil changes
- Distance from home address to dealership
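Turning a timeline into numbers might look like the sketch below. The event names, amounts, and feature names are hypothetical, and only a few of the descriptors listed on the slide are computed.

```python
from datetime import date

# Hypothetical journey for one customer: (date, event, amount or None).
journey = [
    (date(2015, 6, 1), "Car purchased", 21000.0),
    (date(2016, 6, 3), "Oil change", None),
    (date(2018, 9, 15), "Car purchased", 27000.0),
    (date(2019, 4, 2), "Took a test drive", None),
]


def extract_features(events, as_of):
    """Summarize a timeline as a flat dictionary of numbers."""
    prices = [amount for _, name, amount in events if name == "Car purchased"]
    first_seen = min(when for when, _, _ in events)
    return {
        "cars_purchased": len(prices),
        "tenure_days": (as_of - first_seen).days,
        "mean_purchase_price": sum(prices) / len(prices) if prices else 0.0,
        "had_oil_change": int(any(name == "Oil change" for _, name, _ in events)),
    }


features = extract_features(journey, as_of=date(2019, 6, 1))
```

Each customer ends up as one row of numbers, which is the form a model can consume.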
Setting up the AI Model
Training set
Test set
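A minimal sketch of the split, assuming a simple random hold-out: the 25% test fraction, the seed, and the helper name `train_test_split` are illustrative choices, not values from the slides.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


customer_ids = list(range(1, 21))  # 20 hypothetical customers
train_ids, test_ids = train_test_split(customer_ids)
```

Fixing the seed keeps the split reproducible, so later comparisons between models use the same held-out customers.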
Training the AI Model
• Only use the training set
Features + Labels → AI Learns the Patterns
Features:
Customer ID   Number of cars purchased   Customer tenure / days
1             1                          10
2             4                          1024
3             2                          5000
4             2                          740
Labels:
Customer ID   Car purchased in last 5 years?
1             1
2             1
3             0
4             0
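As an illustration of "learning the patterns", the sketch below fits a small logistic regression to the four training rows from the slide using plain gradient descent. Logistic regression is an assumption chosen for simplicity; the slides do not say which model was used, and four rows are far too few for a real model.

```python
import math

# The four training rows from the slide: (cars purchased, tenure in days).
X = [[1, 10], [4, 1024], [2, 5000], [2, 740]]
y = [1, 1, 0, 0]  # car purchased in last 5 years?


def scale_columns(rows):
    """Min-max scale each column to [0, 1] so gradient descent behaves."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(r, lows, spans)] for r in rows]


def predict_proba(weights, bias, row):
    """Logistic model: probability that the label is 1."""
    z = bias + sum(w * v for w, v in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, bias, rows, labels):
    """Average negative log-likelihood of the labels under the model."""
    total = 0.0
    for row, label in zip(rows, labels):
        p = predict_proba(weights, bias, row)
        total -= label * math.log(p) + (1 - label) * math.log(1 - p)
    return total / len(rows)


def train(rows, labels, steps=500, learning_rate=0.1):
    """Fit weights by plain batch gradient descent on the log loss."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(steps):
        errors = [predict_proba(weights, bias, r) - t for r, t in zip(rows, labels)]
        for j in range(len(weights)):
            grad = sum(e * r[j] for e, r in zip(errors, rows)) / len(rows)
            weights[j] -= learning_rate * grad
        bias -= learning_rate * sum(errors) / len(rows)
    return weights, bias


X_scaled = scale_columns(X)
initial_loss = log_loss([0.0, 0.0], 0.0, X_scaled, y)
weights, bias = train(X_scaled, y)
final_loss = log_loss(weights, bias, X_scaled, y)
```

Comparing `final_loss` with `initial_loss` shows whether training moved the model closer to the labels; it says nothing yet about performance on unseen customers.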
Lead Scoring A Customer
• Use the test set – not seen by the model during training
• Best simulation of how it will perform in real life
[Diagram: the model scores two test-set customers, one at 78% (since > 50%) and one at 24% (since < 50%)]
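The scoring step can be sketched as follows, assuming the 50% cutoff marks a predicted purchase. The 78% and 24% scores come from the slide; the other scores and all of the actual outcomes are hypothetical.

```python
def predict_label(probability, threshold=0.5):
    """Convert a model score into a yes/no prediction."""
    return 1 if probability > threshold else 0


def accuracy(scores, actual_labels, threshold=0.5):
    """Share of held-out customers whose prediction matches what happened."""
    hits = sum(predict_label(s, threshold) == a for s, a in zip(scores, actual_labels))
    return hits / len(scores)


# The two scores from the slide, plus hypothetical extra scores and outcomes.
test_scores = [0.78, 0.24, 0.61, 0.35]
test_labels = [1, 0, 0, 0]
test_accuracy = accuracy(test_scores, test_labels)
```

Because the test set was never used for training, this accuracy is a fairer estimate of real-life performance than the training loss.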
How can we use the model?
Automation
• Incorporate this score into your marketing automation workflow
• Could target highest 25% of customers with one campaign
• Could then target next 25% of customers with different campaign
Augment human decision making
• Add the model score to customer profiles, so that staff can see it
• Empowers call center staff, aftersales staff, or pre-sales staff with more information about the customer
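The automation option could be sketched as below. The customer IDs, scores, and campaign names are hypothetical; the only idea taken from the slide is ranking by score and splitting off the highest 25% and the next 25%.

```python
def assign_campaigns(scores_by_customer):
    """Rank customers by score and split the top half into two campaigns."""
    ranked = sorted(scores_by_customer, key=scores_by_customer.get, reverse=True)
    quarter = len(ranked) // 4
    return {
        "campaign_1": ranked[:quarter],             # highest 25% of scores
        "campaign_2": ranked[quarter:2 * quarter],  # next 25% of scores
    }


# Hypothetical model scores for eight customers.
scores = {"c1": 0.91, "c2": 0.15, "c3": 0.78, "c4": 0.40,
          "c5": 0.66, "c6": 0.24, "c7": 0.83, "c8": 0.52}
campaigns = assign_campaigns(scores)
```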
10 Minute Break
Staffing Your AI Capability
What Kind of Roles Exist?
Raw Data → Modelling → Deployed Model
The Data Engineer
• Skilled at transforming raw data to usable data
• Skilled at automating data transformation and storage processes
• Can handle data at huge scale (terabytes +)
Azure SQL DB
Azure Data Lake
Azure Data Factory
The Data Scientist
• Skilled at designing and building AI models to solve real problems
• Adept problem solvers and/or mathematicians
• May be a specialist (e.g., natural language processing expert) or generalist
Azure Machine Learning Service
Machine Learning Studio
The Machine Learning Engineer
• Skilled at operationalizing machine learning models at scale
• Skilled at software engineering and cloud infrastructure
• Integrates machine learning into smart applications
Azure Machine Learning Service
Azure Kubernetes Service
Azure DevOps
Azure Containers
The Data Science Researcher
• Skilled at developing new mathematical techniques and technologies
• Publishes novel research, whitepapers or files for patents
• Likely has a PhD in a mathematical subject area
• Probably works in academia, at a large technology company (Microsoft, Facebook, etc.), or at an AI startup
Azure Virtual Machines
Building an Internal AI Capability
Raw Data → Modelling → Deployed Model
Initial Team Members: Data Scientist (Modelling)
Build a Successful AI Capability
“I know enough to be dangerous…”
• Mentorship and continuing education
• Many analytical people can develop prototype models based on low risk use cases
• Peer review from other technical staff and business stakeholders
• Professional Data Scientists optimize those models, and prepare them for deployment
Kaggle – A Favorite of Citizen Data Scientists
The winning submissions will then go to the company’s data science team for further research and productionization
AI Tools on Azure
Why Do AI In the Cloud?
• Data security can be guaranteed
• Control over who can access data
• Compliance with data privacy legislation
• Data and AI governance is easier to manage
• Powerful computing resources are available
• Everyone can work with the same data sources
• Making models available through deployment is made easier
What AI Tools Are Available in Azure?
Zero Coding
Azure Machine Learning
Coding
Azure Machine Learning
Data Science Virtual Machine
Cognitive Services
The Data Science Virtual Machine
Comes preloaded with [tool logos shown on the slide], and many more!
Azure Machine Learning
• Data science platform, includes powerful AutoML functionality
• Manage the entire machine learning lifecycle
• Code and no-code solutions can collaborate in the same place
Azure Databricks
• A powerful integrated data engineering and data science platform
• Allows for the use of distributed computing for big data processing
• Requires expertise in Python or Scala and in Spark syntax
[Illustration: a 10 TB dataset compared with a single machine that has a 1 TB hard drive and 8 GB of RAM]
Cognitive Services
• Microsoft has built some models for some specific tasks
• People or intelligent apps can access these models through API calls
- Content Moderator: recognizes adult content or profanity
- Speech services: convert spoken word to text or vice versa
- Translator Text: automatic language detection and translation
- QnA Maker: create a chatbot that can answer questions
- Computer Vision: identify features in images, transcribe text from pictures of documents (OCR)
10 Minute Break
Demos
Frontiers of AI
Designing AI to Earn Trust
Machine Learning Operations (MLOps)
Integrating key practices of software engineering and DevOps with individuals who contribute to the AI workflow
Improve productivity and insight quality through the automation of AI
Prescriptive Analytics
• What actions can we take to achieve a desired outcome?
• AI will be able to tell us the series of actions to minimize the chance of the student dropping out
• Achieves ultimate personalization
Student journey: Student applies → Student accepted → Student attends course A → Student attends course B → Student receives bill for $1000 → Student pays bill on time → ?
Advanced Natural Language Processing – AI With Memory
• Recent advancements (late 2019)
• Essential for accurate translation and text comprehension
• Uses an advanced neural network known as a “Transformer”
Autonomous Robotics
Q and A
How do we use the model when it’s built?
• Azure provides powerful tools for using live models
• Build and deploy models using Azure Machine Learning
• Make predictions using secure Azure web services
JSON Request → AI Model in Web Service → JSON Response
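The request and response flow can be sketched locally as below. This is not the Azure Machine Learning API: the handler, the payload fields, and the stand-in scoring rule are all hypothetical, and a deployed service would add authentication and load a trained model.

```python
import json


def score_customer(features):
    """Stand-in for a trained model; a real service would load one instead."""
    # Hypothetical rule used only so the example runs without a model file.
    return 0.78 if features["cars_purchased"] >= 2 else 0.24


def handle_request(request_body):
    """Parse a JSON request, score it, and return a JSON response string."""
    payload = json.loads(request_body)
    probability = score_customer(payload["features"])
    return json.dumps({"customer_id": payload["customer_id"],
                       "probability": probability})


request_body = json.dumps({"customer_id": 42,
                           "features": {"cars_purchased": 3}})
response = json.loads(handle_request(request_body))
```

Keeping the contract as plain JSON means any application that can send a web request can consume the model's score.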
Example Problem
A higher education system has the following problem:
“We have a number of students that do not complete their 4-year degree programs. We would like to be able to predict which students are likely to drop out in a given year, so that we can provide additional support and resources.”
What Data Might We Get?
Data sources feeding the data set: Accounting Data, Course Catalog, CRM Data, Demographic Data, Enrollment, 3rd Party Data
- There may not be common elements across all tables – making combination difficult
- Data may arrive completely unprocessed
Exploratory Data Analysis
• Identify biggest areas for improvement – is the problem worth solving?
• Investigate feasibility of the AI project
• Provide instant analytical insights
[Bar chart: Number of Students Dropping Out 2014-2019 for Campus A, Campus B, and Campus C; axis from 0 to 600]
Assembling Student Data
• How do we assign data to a student?
• Event based data: Student applies → Student accepted → Student attends course A → Student attends course B → Student receives bill for $1000 → Student pays bill on time → ?
• Also have non-event based attributes (e.g., demographic data)
• AI model is going to learn which event history and demographic factors tend to lead to the outcome of a person dropping out of college
Extracting Features From Data
• AI models read mathematics, not timelines!
• Feature engineering is the “art” of the AI field
Student journey: Student applies → Student accepted → Student attends course A → Student attends course B → Student receives bill for $1000 → Student pays bill on time
• Utilize many student journey descriptors – all can be expressed mathematically
- Cumulative GPA
- Semester hours
- Number of completed semesters
- Number of late payments
- Distance of home from campus
Setting up the AI Model
Training set
Test set
Training the AI Model
• Only use the training set
Features + Labels → AI Learns the Patterns
Features:
Student ID   Cumulative GPA   Average hours per semester
1            3.20             12.6
2            3.96             15.8
3            2.46             13.2
4            3.42             13.4
Labels:
Student ID   Dropped out?
1            1
2            0
3            0
4            1
Scoring A Student
• Use the test set – not seen by the model during training
• Best simulation of how it will perform in real life
[Diagram: the model scores two test-set students, one at 78% (since > 50%) and one at 24% (since < 50%)]
How can we use the model?
Automation
• Send automatic check-in email campaign targeting students at risk of dropping out
• Could target highest 10% at risk
• Or could target all students with a score greater than, say, 75%
Augment human decision making
• Add the model score to student profiles, so that staff can see it
• Empower them to be aware of students that may be experiencing difficulties
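The two targeting rules for the check-in campaign can be sketched as below. The student IDs and risk scores are hypothetical, and the helper name `students_to_contact` is invented for illustration; only the 10% share and the 75% cutoff come from the slide.

```python
def students_to_contact(scores_by_student, min_score=None, top_fraction=None):
    """Select students by score cutoff or by share of the highest scores."""
    if (min_score is None) == (top_fraction is None):
        raise ValueError("pass exactly one of min_score or top_fraction")
    ranked = sorted(scores_by_student, key=scores_by_student.get, reverse=True)
    if min_score is not None:
        return [s for s in ranked if scores_by_student[s] > min_score]
    count = max(1, int(len(ranked) * top_fraction))
    return ranked[:count]


# Hypothetical dropout-risk scores for ten students.
risk = {"s01": 0.92, "s02": 0.12, "s03": 0.78, "s04": 0.33, "s05": 0.81,
        "s06": 0.05, "s07": 0.47, "s08": 0.76, "s09": 0.20, "s10": 0.61}
by_cutoff = students_to_contact(risk, min_score=0.75)
by_share = students_to_contact(risk, top_fraction=0.10)
```

A fixed share keeps the outreach workload predictable, while a score cutoff lets the number of contacted students grow or shrink with the level of risk in the cohort.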