Cognitive Computing Solutions from Scry Analytics
Dr. Alok Aggarwal
Founder and CEO
May 2017
AGENDA
• Brief Overview of Artificial Intelligence
• Collatio: Creating a Single Source of Truth System
• Jidoka: AI solutions to drive Automated Actionable Insights
Artificial Intelligence (AI): Brief History
In 1950, Alan Turing suggested the “Imitation Game” to define AI: imagine three rooms, each connected via computer screen & keyboard; a judge decides which of the two parties in the conversation is a man and which is a computer (the Turing Test).
AI has generated lots of excitement & optimism, and also massive setbacks:
• Over 250,000 research articles related to AI
• Research is progressing to capture full human intelligence, which includes language, audio, and video processing; knowledge & learning; perception and the ability to move & manipulate objects
• Still out of reach: intuition; reasoning; creativity; planning; emotions; intelligent actions (e.g., sense of humor)
• We rarely use intuition, creativity or intelligent actions in “routine work”
Summary of Key Terms
Category | Brief Description
Services Automation | Fixed & Static Rules; Structured Data
Classical Quant. Modeling | Fixed Rules with Probability-Statistics
Supervised Machine Learning | Pattern Recognition by Training
Unsupervised Mach. Learning | Self Learnt Pattern Recognition (in time)
Reinforcement Learning | Self Learnt Pattern Recognition (by doing)
Natural Language Processing | Extracting Intelligence from Text
Speech & Audio Processing | Extracting Intelligence from Speech & Audio
Video & Image Processing | Extracting Intelligence from Video & Images
Other Intelligent Actions | Modeling Intuition, Creativity, Reasoning, etc.
[Grouping labels shown beside the table: Cognitive Computing; Artificial Intelligence; Intelligent Services Automation; Machine Learning]
Advances in Technology 1951 – 2015
“Democratization” of Hardware
• 1965 – Moore’s Law: For a given price, processing power (or memory) doubles every 18 months
• 1965 – 2015: Improvement by a factor of 100 million in CPU & memory
• 2006 – Power & memory are available (& scalable) as a commodity; 100 CPUs now rentable for only an hour from Amazon & others
Parallel & Distributed Computing
• 2004 – Parallel & distributed computing “reduced to practice” by Google
• 2005 – By modifying MapReduce, Yahoo! introduced open source Hadoop
• 2015 – Google’s “Search” uses MapReduce & 110,000 computers
Improved Algorithms & Open Source Software
• 1989 – Free & Open Source Software libraries contain many algorithms, leading to “democratization”
• 2005 – Artificial Intelligence algorithms have advanced substantially
• 2010 – Real time analysis of disparate data in large volume now a reality
Exponential Increase in Data Security & Anonymization
• Need to bring varied data sources together (e.g., structured, unstructured, audio, video) so as to get a unified view with respect to volume, variety, veracity, and often, velocity
• Databases may be in different locations or countries and rules may prohibit these from being brought together; we may need data virtualization & determination of the attributes that connect them
• Data may need to be anonymized; need to avoid Target & Netflix type issues
About Scry Analytics
Founder - Evalueserve; Dir. of Res. - IBM Watson Res. Ctr.; Founder - IBM India Research
• Machine Learning (Paper Mills), Advanced algorithms for Hierarchical Memories & IBM Watson, etc. (1984 - 97)
• Sabbatical at MIT (1990-1991); went back and deployed ML-genetic algorithms for Paper Mill Scheduling (1992-95)
• Founded IBM India Research Lab. (1998 - 2000); grew it to 30 PhDs and 30 Masters
• Co-founded Evalueserve (2000 - 13); research & analytics services co. with 3,200 employees worldwide
• Founded Scry Analytics in February 2014; Scry – Crystal ball gazing or Fortune Telling
What We Do and How
• Create single source of truth (~ 97% accurate) data systems so as to provide data relevance, reliability and quality across full range of data (e.g., structured, unstructured, machine logs)
• Use Cognitive Computing and Subject Matter Expertise to build library of automated solutions to provide actionable insights for improving risk, complaints, regulatory compliance, revenue and profits
• Do a quick proof of concept (3-4 months) and then scale for production
Prospecting & Marketing SMB Loans
• Small & Medium Business (SMB) Lending Group of a Bank wants to market actively to new customers
• Our solution uses the base of current customers, i.e., those who have taken loans & have a checking account
• It creates “knowledge graphs” and uses Machine Learning to provide predictive & prescriptive insights to determine prospects for giving loans
• Checking Account: (a) is the SMB paying loans to some other lender, (b) given the average balance in the account, can this SMB use more loan, (c) which other SMBs are working with this SMB, etc.
• Creating a “Knowledge Graph”: Machine Learning algorithms become more accurate when the number of SMBs who are customers of the bank is very large, because then most prospect SMBs are already “working with these customers” & their payment behavior becomes more evident
• Credit History of SMBs: For obtaining past credit history
• External Data: E.g., Glassdoor, Yelp, Privco, DnB, Social Media data, industry & geography data
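The knowledge-graph idea above can be sketched in a few lines. This is only an illustration, not Scry's implementation: the transfer records, the company names and the ranking rule (count of links to existing customers, then transfer volume) are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical checking-account transfers: (payer, payee, amount).
# All names and amounts are made up for illustration.
TRANSFERS = [
    ("Acme Bakery", "Flour Co", 1200.0),
    ("Acme Bakery", "City Lender", 800.0),
    ("Corner Cafe", "Flour Co", 400.0),
    ("Flour Co", "Truck Rentals", 950.0),
    ("Corner Cafe", "Truck Rentals", 300.0),
]
CUSTOMERS = {"Acme Bakery", "Corner Cafe"}  # SMBs that already bank with us
LENDERS = {"City Lender"}                   # known outside lenders


def build_graph(transfers):
    """Undirected weighted graph: node -> {neighbor: total amount exchanged}."""
    graph = defaultdict(lambda: defaultdict(float))
    for payer, payee, amount in transfers:
        graph[payer][payee] += amount
        graph[payee][payer] += amount
    return graph


def rank_prospects(graph, customers, lenders):
    """Rank non-customer SMBs by how many existing customers they work with."""
    scores = []
    for node, neighbors in graph.items():
        if node in customers or node in lenders:
            continue
        linked = [n for n in neighbors if n in customers]
        if linked:
            volume = sum(neighbors[n] for n in linked)
            scores.append((node, len(linked), volume))
    # More customer links first, then larger transfer volume, then name
    scores.sort(key=lambda s: (-s[1], -s[2], s[0]))
    return scores


def customers_paying_other_lenders(graph, customers, lenders):
    """Existing customers whose account shows payments to another lender."""
    return sorted(c for c in customers if any(n in lenders for n in graph[c]))


graph = build_graph(TRANSFERS)
prospects = rank_prospects(graph, CUSTOMERS, LENDERS)
paying_elsewhere = customers_paying_other_lenders(graph, CUSTOMERS, LENDERS)
```

In a real system the ranking would presumably come from a trained model using the graph as one input; the hand-written sort key here only stands in for that step.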
Complaint Categorization & Resolution System
Scry Analytics’ proprietary Natural Language Processing & Deep Learning solution provides:
• Automatic classification, review and categorization of complaints
• Identification w.r.t. sub-categories & product types
• Historical trends w.r.t. issues, their categories and sub-categories
• Comparison & benchmarking of complaints w.r.t. peer firms
• Decision Support for “Next Best Action” so as to resolve a new complaint
Categorization of Complaints w.r.t. Firms, e.g., Wells Fargo
If a complaint is resolved with a customer directly, CFPB’s website may say “closed with explanation”
Benchmarking Complaints Among Peers, e.g., Citibank vs. Bank of America
[Charts: Citibank; Bank of America]
Next Best Action Decision Support for Complaint Resolution
I HAD A CREDIT BALANCE ON MY ACCOUNT OF XXXX CENTS. THIS MONTH MY ACCOUNT WAS CHARGED TO REMOVE THE XXXX CENTS. I CONTACTED CHASE AND WAS TOLD ANY CREDIT BALANCE UNDER XXXX (XXXX - XXXX) WOULD BE ADJUSTED AND RETAINED BY CHASE AS INTEREST IF NOT USED WITH 60 DAYS. I WAS NOT OFFERED A CREDIT BY CHASE OR ANY SUPPORTING DOCUMENTATION. I REVIEWED MY ACCOUNT OPENING MATERIALS AND COULD NOT FIND ANYWHERE THAT STATED THEY CAN KEEP MY CREDIT IF I DO N'T USE IT WITH 60 DAYS. XXXX CENTS IS NOT A LOT OF MONEY, HOWEVER WHEN YOU HAVE XXXX CUSTOMERS IN XXXX STATES XXXX - XXXX CENTS CAN ADD UP TO XXXX DOLLARS IN UNDESERVED PROFITS TO THE BANK THAT DOES NOT BELONG TO THEM. TO ME THIS APPEARS TO BE THEFT BY CHASE BANK AND XXXX CUSTOMERS ARE BEING DECEIVED ; BECAUSE TO MOST PEOPLE PENNIES DON'T MATTER. PLEASE INVESTIGATE
Output of the Decision Support System for the new complaint above:
• Category – Communication Issues
• Next Best Action using similar issues in the past: Firm has responded to consumer & chooses not to respond publicly
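A minimal sketch of “next best action using similar issues in the past”. The deck describes a proprietary NLP & Deep Learning solution; the code below swaps in a much simpler technique (bag-of-words cosine similarity with nearest-neighbour lookup) purely to show the shape of the idea. The past complaints, category names and resolutions in `PAST` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical resolved complaints: (text, category, how it was closed).
PAST = [
    ("bank kept my small credit balance without telling me",
     "Communication Issues",
     "Firm has responded to consumer & chooses not to respond publicly"),
    ("mortgage payment was applied late and I was charged a fee",
     "Payment Processing",
     "Closed with explanation"),
    ("credit card interest rate was raised without notice",
     "Rates & Fees",
     "Closed with monetary relief"),
]


def bag_of_words(text):
    """Lower-cased word counts; a stand-in for a real text representation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def next_best_action(new_text, past=PAST):
    """Return (category, resolution) of the most similar past complaint."""
    query = bag_of_words(new_text)
    best = max(past, key=lambda rec: cosine(query, bag_of_words(rec[0])))
    return best[1], best[2]


category, action = next_best_action(
    "I had a credit balance on my account and the bank kept it"
)
```

With the three sample records, the new complaint shares the most words with the first one, so its category and resolution are returned.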
Detecting Customers likely to Re-Finance
Mortgage Lending Group in Bank wants to pro-actively determine which of its customers are likely to re-finance their current mortgage; “customers” are those with a checking account & mortgage with the Bank. A similar use case arises in figuring out who will pay off auto loans, credit cards and/or renew term deposits.
• Checking Account: (a) is the customer paying his/her bills on time, (b) account & credit card balances, (c) spending behavior, (d) loss of job, (e) application fee paid to another lender
• Credit History: FICO and credit-related attributes; other aspects of credit history
• Current Loan: (a) has asset value gone up, (b) will Adjustable Rate Mortgage go up soon, (c) have mortgage rates gone down, (d) is monthly payment high w.r.t. avg. balance in checking/saving account
• Social Media & Ads: (a) current lender getting bad reviews in social media, on CFPB website, or is sued for deceptive lending, (b) another lender giving huge incentives, etc.
• Other Attributes: (a) does the customer re-finance often, (b) is it a construction loan, (c) owns other homes, (d) payment to architect, (e) demographics, e.g., education, age
Improving Compliance & Efficiency in Contact Centers
• Manual and time intensive process – CCAs often do not write correct reasons for delinquency (e.g., inability to pay vs. unwilling to pay)
• Ambiguity – Language ambiguity, particularly while talking via phone
• Determining resolution w.r.t. delinquency is not easy
• Determining compliance of CCAs is time consuming and laborious; hence, only 2% are usually checked
• Automated categorization of reasons for delinquency (e.g., inability to pay may be because of job loss, alimony or medical reasons); recommendations by Scry’s Cognitive Computing solution, based on the degree of similarity, achieve 90%+ accuracy
• Solution checks all CCA discussions and provides “percentage of compliance” for each discussion
• Runs in REAL TIME by doing limited speech-to-text, ML, and NLP etc.
Agent Name | Total Calls | Scry Compliance | Complete Greeting | Company Identification | Agent Name Identification | Ask Caller Identification
Avg | 720.7 | 56.12 | 56.48 | 50.81 | 52.47 | 54.74
Kam Mortenson | 622 | 48.8 | 54.8 | 43.5 | 57.5 | 48.6
Tiny Mcelwee | 892 | 59.9 | 54 | 42 | 40.3 | 53.7
Lyman Broadus | 549 | 65.9 | 62.1 | 51.4 | 44.5 | 47
Kasie Mullen | 1056 | 40 | 63.3 | 51.4 | 41.7 | 68.9
Charis Guice | 490 | 57 | 57.3 | 58.4 | 59.6 | 48.4
Maragret Mani | 840 | 61.9 | 52 | 60.9 | 66.5 | 61.6
Linnea Cuthbertson | 596 | 68.8 | 50.3 | 49.6 | 66.2 | 40.3
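The Avg row of the agent table appears to be a call-weighted mean of the per-agent percentages rather than a plain mean. The sketch below reproduces it from the seven rows shown; the function names are illustrative, and the weighting is an inference from the numbers, not something the deck states.

```python
# Rows copied from the agent table: (agent, total calls, five compliance percentages
# in column order: Scry Compliance, Complete Greeting, Company Identification,
# Agent Name Identification, Ask Caller Identification)
ROWS = [
    ("Kam Mortenson", 622, [48.8, 54.8, 43.5, 57.5, 48.6]),
    ("Tiny Mcelwee", 892, [59.9, 54, 42, 40.3, 53.7]),
    ("Lyman Broadus", 549, [65.9, 62.1, 51.4, 44.5, 47]),
    ("Kasie Mullen", 1056, [40, 63.3, 51.4, 41.7, 68.9]),
    ("Charis Guice", 490, [57, 57.3, 58.4, 59.6, 48.4]),
    ("Maragret Mani", 840, [61.9, 52, 60.9, 66.5, 61.6]),
    ("Linnea Cuthbertson", 596, [68.8, 50.3, 49.6, 66.2, 40.3]),
]


def average_calls(rows):
    """Plain mean of the Total Calls column."""
    return round(sum(calls for _, calls, _ in rows) / len(rows), 1)


def weighted_means(rows):
    """Mean of each percentage column, weighted by each agent's call count."""
    total_calls = sum(calls for _, calls, _ in rows)
    n_cols = len(rows[0][2])
    return [
        round(sum(calls * pcts[i] for _, calls, pcts in rows) / total_calls, 2)
        for i in range(n_cols)
    ]


def simple_means(rows):
    """Unweighted mean of each percentage column, for comparison."""
    n_cols = len(rows[0][2])
    return [round(sum(pcts[i] for _, _, pcts in rows) / len(rows), 2) for i in range(n_cols)]


print(average_calls(ROWS))   # 720.7
print(weighted_means(ROWS))  # [56.12, 56.48, 50.81, 52.47, 54.74]
```

Weighting by calls means an agent with 1,056 calls moves the average roughly twice as much as one with 490, which is why the plain mean of the first column (57.47) differs from the 56.12 shown.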
Scry Collatio Cognitive Computing – Creating Single Source of Truth
[Architecture diagram. Data sources: On Premise RDBMS, On Premise Unstructured Data, External RDBMS, External Unstructured Data, Online databases. Ingestion: Connectors & Scrapers Library, Data Lake. Components: Pre-Built SSoT Containers, Ontology & Live Manual, Data Governance, Data Exploration, Role Based Access & Security, Data Collation Engine, Data Quality (DQ) Engine, Business Rules (BR) Engine, Business Dashboards. Access: User Groups via Graphical User Interface.]
Scry Collatio Components: Connectors, Scrapers & Data Ingestion
• Connectors
  • Relational DBs, e.g., MSSQL, MySQL, PostgreSQL, Oracle
  • Many online databases
  • External systems such as Salesforce, ServiceNow, etc.
• Scrapers for various websites etc.
• Schema Checks and Alerts
• Full & Incremental Loads on hourly/daily/weekly basis
• Persistent Data Structure and Data Lineage
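As a rough illustration of “Schema Checks and Alerts” and “Full & Incremental Loads”, the sketch below uses two in-memory SQLite databases as stand-ins for a source RDBMS and the data lake. The `customers` table, its columns and the timestamp watermark scheme are assumptions for the example; Collatio's actual connectors are not described in the deck.

```python
import sqlite3

EXPECTED_COLUMNS = ["id", "name", "updated_at"]  # hypothetical expected schema


def schema_alerts(conn, table, expected):
    """Compare the table's actual columns with the expected ones."""
    # PRAGMA table_info returns one row per column; index 1 is the column name
    actual = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    alerts = [f"missing column: {c}" for c in expected if c not in actual]
    alerts += [f"unexpected column: {c}" for c in actual if c not in expected]
    return alerts


def incremental_load(source, lake, watermark):
    """Copy rows changed after `watermark`; return the new watermark."""
    rows = source.execute(
        "SELECT id, name, updated_at FROM customers "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    ).fetchall()
    lake.executemany(
        "INSERT OR REPLACE INTO customers (id, name, updated_at) VALUES (?, ?, ?)",
        rows,
    )
    lake.commit()
    return rows[-1][2] if rows else watermark


def demo():
    source = sqlite3.connect(":memory:")
    lake = sqlite3.connect(":memory:")
    ddl = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at TEXT)"
    source.execute(ddl)
    lake.execute(ddl)
    source.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, "Acme", "2017-05-01T00:00:00"), (2, "Bolt", "2017-05-01T01:00:00")],
    )
    alerts = schema_alerts(source, "customers", EXPECTED_COLUMNS)
    mark = incremental_load(source, lake, "")    # full load: every timestamp > ""
    source.execute(
        "UPDATE customers SET name = 'Acme Ltd', "
        "updated_at = '2017-05-01T02:00:00' WHERE id = 1"
    )
    mark = incremental_load(source, lake, mark)  # picks up only the changed row
    final = lake.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    return alerts, mark, final


alerts, mark, final = demo()
```

A full load is simply the incremental load with an empty watermark; scheduling it hourly, daily or weekly would be handled outside this function.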
Scry Collatio Components: Role Based Access, Encryption & Security
• Two level encryption for data (if required)
• Authentication
• User Administration
  • Ability to manage users
  • Ability to manage roles and role hierarchy
• Authorization
  • Pages – Restrict access to pages and specific functions on various pages
  • Data Set – Restrict access down to the granularity of specific data rows, columns, datasets, data groups, tables & DBs
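One way to picture the authorization model above (role hierarchy, page-level restrictions, row- and column-level data restrictions) is the following sketch. The `Role` class, the role names and the sample rows are hypothetical and only illustrate the concept.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Role:
    name: str
    parent: Optional["Role"] = None   # role hierarchy: a child inherits its parent's grants
    pages: set = field(default_factory=set)
    columns: set = field(default_factory=set)
    row_filter: Optional[Callable[[dict], bool]] = None  # None means no row restriction

    def all_pages(self):
        inherited = self.parent.all_pages() if self.parent else set()
        return inherited | self.pages

    def all_columns(self):
        inherited = self.parent.all_columns() if self.parent else set()
        return inherited | self.columns


def can_open(role, page):
    """Page-level authorization."""
    return page in role.all_pages()


def visible_rows(role, rows):
    """Row-level filter first, then column-level projection."""
    cols = role.all_columns()
    allowed = [r for r in rows if role.row_filter is None or role.row_filter(r)]
    return [{k: v for k, v in r.items() if k in cols} for r in allowed]


viewer = Role("viewer", pages={"dashboard"}, columns={"region", "balance"})
ny_analyst = Role(
    "ny_analyst",
    parent=viewer,
    pages={"explore"},
    columns={"customer"},
    row_filter=lambda r: r["region"] == "NY",
)

ROWS = [
    {"customer": "A", "region": "NY", "balance": 10, "ssn": "x"},
    {"customer": "B", "region": "CA", "balance": 20, "ssn": "y"},
]
analyst_view = visible_rows(ny_analyst, ROWS)
viewer_view = visible_rows(viewer, ROWS)
```

Here the analyst inherits the viewer's pages and columns, sees one extra column, and is limited to NY rows; the `ssn` column is never granted to either role and so is never returned.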
Scry Collatio Components: Data Exploration & Preparation
• Process structured or unstructured data
• Ability to upload files or connect to RDBMS to create projects and datasets
• Explore data, extract meta data & related statistics
• Identify row & column similarities & correlated attributes among datasets
• Create derived datasets – join/merge tables
• Identify patterns, charts, graphs in the data
• Keyword matching, topic modelling, & summarization of text documents
• Learns new connections and deploys them for all new, incremental data
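A small sketch of how “column similarities & correlated attributes among datasets” might be detected. It uses Jaccard overlap of column values as the similarity measure, which is an assumption for illustration; the deck does not say which measure Collatio uses. The two sample datasets are made up.

```python
from itertools import product


def jaccard(a, b):
    """Share of distinct values that two columns have in common."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_columns(left, right, threshold=0.5):
    """left/right map column name -> list of values; return candidate join keys."""
    matches = []
    for (lname, lvals), (rname, rvals) in product(left.items(), right.items()):
        score = jaccard(lvals, rvals)
        if score >= threshold:
            matches.append((lname, rname, round(score, 2)))
    return sorted(matches, key=lambda m: -m[2])


# Hypothetical datasets, each given as column name -> values
loans = {"cust_id": [1, 2, 3, 4], "amount": [100, 250, 75, 300]}
accounts = {"customer": [2, 3, 4, 5], "balance": [900, 40, 75, 10]}

candidates = similar_columns(loans, accounts)
```

Pairs that clear the threshold are candidates for the join/merge step in the list above; differently named columns (`cust_id` vs. `customer`) can still be matched because only the values are compared.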
Scry Collatio Components: Automated Data Quality (DQ) Engine
• Automated DQ Configuration Management
• Automated cleansing using proprietary machine learning & NLP algorithms
• Maintains history & audit of executions & of changes made to improve DQ
• Dashboards to visualize overall DQ and DQ at table/column level
• Ability to extract data that failed DQ rules
• Allows for cleansing data manually
• Learns how DQ exceptions were fixed by Data Governance Team in the past & applies it
• Provides data quality statistics regularly
• Outliers
• Ranges
• Duplicate Values
• Duplicate Rows
• Address validation
• Numeric Conversion
• Date Conversion
• Orphans
• Nulls
• Empty Strings
• Sets
• Candidate Key
• Length check
• Date Range
• Phone formats
• Address formats
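A few of the checks listed above (duplicate rows, nulls, empty strings, ranges, phone formats, orphans) can be sketched as plain rules over rows. The field names, the age range, the phone pattern and the sample rows are all hypothetical; the ML-based cleansing the deck mentions is not shown.

```python
import re

PHONE = re.compile(r"^\+?\d[\d\s-]{6,}$")  # hypothetical, deliberately loose format


def dq_failures(rows, parent_ids):
    """Return (row index, rule name) for every rule a row fails."""
    failures = []
    seen = set()
    for i, row in enumerate(rows):
        key = tuple(sorted(row.items()))
        if key in seen:
            failures.append((i, "duplicate_row"))
        seen.add(key)
        if row["name"] is None:
            failures.append((i, "null"))
        elif row["name"].strip() == "":
            failures.append((i, "empty_string"))
        if not 0 <= row["age"] <= 120:
            failures.append((i, "range"))
        if not PHONE.match(row["phone"]):
            failures.append((i, "phone_format"))
        if row["branch_id"] not in parent_ids:  # no matching parent record
            failures.append((i, "orphan"))
    return failures


def dq_score(rows, failures):
    """Percentage of rows that passed every rule."""
    failed = {i for i, _ in failures}
    return 100 * (len(rows) - len(failed)) / len(rows)


ROWS = [
    {"name": "Ann", "age": 34, "phone": "+1 914 555 0100", "branch_id": 1},
    {"name": "", "age": 29, "phone": "914-555-0101", "branch_id": 2},
    {"name": None, "age": 150, "phone": "n/a", "branch_id": 9},
    {"name": "Ann", "age": 34, "phone": "+1 914 555 0100", "branch_id": 1},
]
failures = dq_failures(ROWS, parent_ids={1, 2})
failed_rows = [ROWS[i] for i in sorted({i for i, _ in failures})]  # data that failed DQ rules
score = dq_score(ROWS, failures)
```

Keeping the failures as (row, rule) pairs makes it straightforward to build both the table/column-level statistics and the extract of failed data mentioned in the bullets.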
Scry Collatio Components: Business Rules (BR) Engine
• Allows configuring, scheduling & executing of business rules using the User Interface on the fly
• Dashboards to visualize overall BR, table/column specific BR results
• History/Audit of the execution and changes to Business Rules
• Extract data that failed Business Rules
• Provides statistics regarding Business Rules on a regular basis
Scry Collatio End to End Process: End to End Automated Process
• Schema Checks, Data Ingestion & hourly/daily snapshots (5 mins)
• Data Quality Execution (15-30 mins)
• Business Rules (BR) Execution (5-15 mins)
• Data Collation (Source to FACT): Business Group 1 (15-30 mins); Business Group 2 (15-30 mins); Business Group 3 (15-30 mins)
• Refresh DQ Dashboards
• Refresh BR Dashboards (5 mins)
• Refresh Business Dashboards and Data Objects for further analysis (5-10 mins)
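The stage durations above can be combined into a rough end-to-end estimate. The sketch assumes, without confirmation from the deck, that ingestion, DQ, BR, collation and the business dashboard refresh run one after another, that the three business groups are collated in parallel, and that the DQ and BR dashboard refreshes happen off the critical path.

```python
# (min, max) minutes per stage, taken from the list above
INGEST = (5, 5)
DQ = (15, 30)
BR = (5, 15)
COLLATION_GROUPS = [(15, 30), (15, 30), (15, 30)]  # Business Groups 1-3
REFRESH_BUSINESS = (5, 10)


def in_sequence(*stages):
    """Stages run back to back: minimums and maximums add up."""
    return (sum(s[0] for s in stages), sum(s[1] for s in stages))


def in_parallel(stages):
    """Stages run side by side: the slowest one determines the duration."""
    return (max(s[0] for s in stages), max(s[1] for s in stages))


# Assumed critical path with the business groups collated in parallel
parallel_estimate = in_sequence(
    INGEST, DQ, BR, in_parallel(COLLATION_GROUPS), REFRESH_BUSINESS
)

# Same path if the business groups had to be collated one after another
sequential_estimate = in_sequence(
    INGEST, DQ, BR, *COLLATION_GROUPS, REFRESH_BUSINESS
)

print(parallel_estimate)    # (45, 90)
print(sequential_estimate)  # (75, 150)
```

Under these assumptions a full refresh would take roughly 45 to 90 minutes, which fits inside a daily schedule but exceeds an hourly one at the upper end; a different stage ordering would change the figures.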
Thank You
Alok Aggarwal
SCRY ANALYTICS INNOVATIVE SOLUTIONS & SERVICES
• Math and Statistical Algorithms
• Machine Learning, Natural Language Processing & Information Retrieval Algorithms
• Custom Creation of User Interfaces
• Build, Maintain and Upgrade Integrated Solutions
• Automatic Fill-In & Decision Support
• Munge, Harmonize and Cleanse Data
• Process Management & Rules-Based Frameworks
+1 914 980 4717
+1 408 872 1078
alok.aggarwal@scryanalytics.com
ScryAnalytics.com