Post on 29-Sep-2020
USING NLP TO CLASSIFY COMPLAINTS
Is it complaining or a complaint?
TABLE OF CONTENTS
1 Introduction: Who are you?
2 Definitions: What exactly is NLP? What is the difference between complaining and a complaint?
3 The Problem: How did this project come to be? What is it even about?
4 Data and Features: What dataset? How did you use the data to create a training set?
5 The Model: What models did you test and use?
6 Monitoring: How did you monitor your model’s performance?
MEET THE SPEAKER
Kyra Koch, Data Scientist at TIAA
Graduated with honors and a computer science degree from Clemson University
Currently pursuing a master's degree in computer science with an emphasis in machine learning from Georgia Tech
Has worked on many different proofs of concept and projects, such as anomaly detection, forecasting models, and similarity identification, at TIAA
TERMINOLOGY
nat·u·ral lan·guage proc·ess·ing /ˈnaCH(ə)rəl ˈlaNGgwij ˈprōˌsesiNG/
noun: a subfield of computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data
com·plaint /kəmˈplānt/
noun: a statement that a situation is unsatisfactory or unacceptable, with an emphasis on regulatory complaints
com·plain·ing /kəmˈplāniNG/
verb: the expression of dissatisfaction or annoyance, often subjective and unimportant from a regulatory perspective
COMPLAINING
“Please take your disgusting commercials off my TV.” -Complaining Customer
“**** you **** you.” -Complaining Customer
“Stop sending these god**** ******* emails you ********.” -Complaining Customer
“You are just a bunch of idiots unable to help me.” -Complaining Customer
“More ********. Not from you. You're cool. From your idiot bosses. Please tell them to go straight to **** for me, do not pass "Go," do not collect $200. All they did was send me right back to square one, the ********.” -Complaining Customer
“DON'T LIKE THE NEW WEBSITE.” -Complaining Customer
“I NEED MY ******* MONEY RIGHT NOW!” -Complaining Customer
COMPLAINTS
“My only option for access is by telephone, but I don't have telephone service where I am.” -Customer Complaint
“Somehow there is a failure to communicate.” -Customer Complaint
“Three weeks and counting, nobody can yet answer my questions about the discrepancies in the loan payment statements I got. I'm also on my 4th call about it now.” -Customer Complaint
“The current investment representative has not been available nor taken time to plan with me.” -Customer Complaint
“The font on the actual message is too small. It is hard to read for a person with not the best of sight.” -Customer Complaint
“Just checked my account and the changes we have discussed have not occurred.” -Customer Complaint
PROBLEM
Can we use machine learning to classify complaints for quality assurance?
DATA
What is the text?
FINRA Summary
4,336 ROWS
3 REVIEWERS: Alpha/Bravo/Charlie
1 DETERMINATION: Consensus/Disagreement
1 FINAL VOTE
The dataset was cleaned by removing stop words, excess whitespace, punctuation and numbers.
If there was a discrepancy in the classification between the three reviewers, that entry was marked as ‘disagreement’ and used in the testing set.
If the three different reviewers are in agreement, we can say with confidence that the label is correct.
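The cleaning and reviewer-agreement steps described above could be sketched roughly as follows. This is a minimal illustration, not the speaker's actual pipeline: the stop-word list, the function names `clean_text` and `determination`, and the label strings are all assumptions made for the example.

```python
import re
import string

# Hypothetical, deliberately tiny stop-word list; the talk does not say which list was used.
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of", "for", "my", "i", "it", "in", "on"}


def clean_text(text):
    """Lowercase, drop punctuation and digits, collapse whitespace, remove stop words."""
    text = text.lower()
    # Replace punctuation and digits with spaces so neighbouring words stay separated.
    text = re.sub(r"[%s\d]" % re.escape(string.punctuation), " ", text)
    tokens = text.split()  # split() with no arguments also collapses excess whitespace
    return " ".join(token for token in tokens if token not in STOP_WORDS)


def determination(votes):
    """Return 'consensus' when every reviewer gave the same label, else 'disagreement'."""
    return "consensus" if len(set(votes)) == 1 else "disagreement"


cleaned = clean_text("Just checked my account, and the 2 changes   have NOT occurred.")
agreed = determination(["complaint", "complaint", "complaint"])
split_vote = determination(["complaint", "non-complaint", "complaint"])
```

Here `cleaned` keeps only the content words of the message, while `agreed` and `split_vote` show how three reviewer votes would map to the Consensus/Disagreement determination.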
What is FINRA?
FINRA is a private, not-for-profit corporation authorized by Congress to act as a self-regulatory organization for the financial industry.
FEATURE GENERATION AND MODEL DEVELOPMENT
How can we use the data we have to get the computer to solve our problem?
TERMINOLOGY
bag of words /bag əv wərds/
noun: a way of simplifying text to be used in natural language processing and information retrieval
doc·u·ment /ˈdäkyəmənt/
noun: a single text within the corpus
cor·pus /ˈkôrpəs/
noun: a collection of written texts
term fre·quen·cy-in·verse doc·u·ment fre·quen·cy /tərm ˈfrēkwənsē inˈvərs ˈdäkyəmənt ˈfrēkwənsē/
noun: a numerical statistic that is intended to reflect how important or unique a word is to a document in the corpus
BAG OF WORDS
In essence, a bag of words model is quite simply a frequency count of how often each word occurs in a single document in the corpus.
For example:
2 documents in the corpus: “this is a test test” and “this is another test”
The resulting dataset will look as follows:
message                 “this”  “is”  “a”  “test”  “another”
this is a test test     1       1     1    2       0
this is another test    1       1     0    1       1
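A word-count table like the one above can be produced with a few lines of standard-library Python. This is a minimal sketch that assumes whitespace tokenization; the function name `bag_of_words` is made up for the example and is not taken from the talk.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, rows); each row holds the per-word counts for one document."""
    tokenized = [doc.split() for doc in documents]
    # Build the vocabulary in order of first appearance so the columns are stable.
    vocabulary = []
    for tokens in tokenized:
        for token in tokens:
            if token not in vocabulary:
                vocabulary.append(token)
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing words count as 0
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


vocabulary, rows = bag_of_words(["this is a test test", "this is another test"])
for word_counts in rows:
    print(dict(zip(vocabulary, word_counts)))
```

Each row is one document and each column is one word of the vocabulary, which is the shape a downstream classifier expects.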
TF-IDF
Term frequency-inverse document frequency is a method of identifying unique or important words.
$\text{tf-idf}_{t,d} = \text{tf}_{t,d} \cdot \log\left(\frac{N}{\text{df}_t}\right)$
$\text{tf}_{t,d} = \frac{\text{number of times term } t \text{ appears in document } d}{\text{total number of terms in document } d}$
$\text{idf}_{t} = \log_e\left(\frac{N}{\text{df}_t}\right) = \log_e\left(\frac{\text{total number of documents}}{\text{number of documents with term } t \text{ in it}}\right)$
For example:
Using the same example as the previous slide, the resulting dataset will look as follows:
message                 “this”  “is”  “a”     “test”  “another”
this is a test test     0       0     0.1386  0       0
this is another test    0       0     0       0       0.1733
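To make the arithmetic concrete, the sketch below applies the tf and idf definitions above (term count divided by document length, natural log of documents over documents containing the term) to the same two documents. The function name `tf_idf` and the whitespace tokenization are assumptions for illustration; library implementations often use smoothed variants that give different numbers.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document using tf * ln(N / df)."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)
    # df[t]: number of documents that contain term t at least once.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total_terms = len(tokens)
        scores.append({
            term: (count / total_terms) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


scores = tf_idf(["this is a test test", "this is another test"])
for doc_scores in scores:
    print({term: round(value, 4) for term, value in doc_scores.items()})
```

Words that occur in every document ("this", "is", "test") score 0 because ln(2/2) = 0. "a" scores (1/5) · ln 2 ≈ 0.1386 in the first document, and "another" scores (1/4) · ln 2 ≈ 0.1733 in the second.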
DATASETS
TRAINING
Used for building the model and letting it learn
Is usually representative of the total population
55% of the data
VALIDATION
Used for iterating on the model and fine-tuning parameters
25% of the data
TESTING
Used after fine-tuning the model to ensure that the parameters are not overfitted to the training and validation sets
20% of the data
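A 55/25/20 split along these lines could be sketched as below. This is only an illustration under simplifying assumptions: it shuffles uniformly at random with a fixed seed and does not reproduce the routing of reviewer-disagreement rows into the testing set that the talk describes. The name `split_dataset` and its parameters are hypothetical.

```python
import random


def split_dataset(rows, train_frac=0.55, val_frac=0.25, seed=0):
    """Shuffle rows and cut them into training, validation, and testing lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains (about 20%)
    return train, val, test


# Row indices stand in for the 4,336 labeled messages.
train, val, test = split_dataset(range(4336))
print(len(train), len(val), len(test))
```

With 4,336 rows this yields 2,385 training, 1,084 validation, and 867 testing rows.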
MODEL
ran·dom fo·rest /ˈrandəm ˈfôrəst/
noun: an ensemble of decision trees, each built using a random subset of the features, which reduces a single decision tree’s tendency to overfit the training dataset
de·ci·sion tree /dəˈsiZHən trē/
noun: a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility
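As a rough illustration of how a random forest can be trained on word counts, here is a sketch using scikit-learn. The talk does not name the library it used, so scikit-learn is an assumption, and the toy messages and labels are hypothetical stand-ins for the real data. The settings of 100 trees and a minimum leaf size of 2 are the ones reported in the talk.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy messages and labels (1 = complaint, 0 = complaining); not real data.
texts = [
    "stop sending these emails",
    "do not like the new website",
    "take your commercials off my tv",
    "you are a bunch of idiots",
    "nobody can answer my questions about the loan payment statements",
    "the changes we discussed have not occurred on my account",
    "the font on the message is too small to read",
    "my representative has not been available to plan with me",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Turn each message into a row of word counts (bag of words).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# 100 trees with at least 2 samples per leaf; random_state fixed for reproducibility.
model = RandomForestClassifier(n_estimators=100, min_samples_leaf=2, random_state=0)
model.fit(X, labels)

new_message = ["my payment statement shows a discrepancy"]
prediction = model.predict(vectorizer.transform(new_message))
print(prediction[0])
```

The same fitted `vectorizer` must be reused at prediction time so that new messages are mapped onto the columns the forest was trained on.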
VALIDATION
After training a random forest of 100 trees with a minimum leaf size of 2, we ran the validation set through the model.
Overall Accuracy: 78.51%
                        Predicted Non-Complaint    Predicted Complaint
Actual Non-Complaint    312 (76.85%)               94 (23.15%)
Actual Complaint        139 (20.50%)               539 (79.50%)
Percentages are shares of each actual class (row).
The model has somewhat more trouble classifying non-complaints (76.85% correct) than complaints (79.50% correct). However, the two percentages are within a reasonable margin of each other, so this is not a cause for concern.
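The reported figures can be checked from the four cell counts of the confusion matrix. The sketch below reads the validation counts as 312, 94, 139, and 539 and treats "complaint" as the positive class. The function name `summarize` and the dictionary keys are made up for the example.

```python
def summarize(tn, fp, fn, tp):
    """Overall accuracy and per-class recall from confusion-matrix counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        # Share of actual non-complaints that were predicted as non-complaints.
        "non_complaint_recall": tn / (tn + fp),
        # Share of actual complaints that were predicted as complaints.
        "complaint_recall": tp / (fn + tp),
    }


validation = summarize(tn=312, fp=94, fn=139, tp=539)
for name, value in validation.items():
    print(f"{name}: {value:.2%}")
```

This reproduces 78.51% overall accuracy, 76.85% for non-complaints, and 79.50% for complaints.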
* Note: All data provided in this presentation is artificial to minimize risk.
TESTING
After fine-tuning the parameters and adjusting the appropriate thresholds, the model was tested on the final holdout set.
Overall Accuracy: 79.58%
                        Predicted Non-Complaint    Predicted Complaint
Actual Non-Complaint    291 (75.58%)               94 (24.42%)
Actual Complaint        83 (17.22%)                399 (82.78%)
Percentages are shares of each actual class (row).
Maintained overall accuracy
Distribution was better
Consistent
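The talk mentions adjusting thresholds but does not say how. One common approach, shown here purely as an assumption, is to score each message with a predicted complaint probability and pick the cut-off that maximizes accuracy on the validation set. The probabilities, labels, candidate thresholds, and function names below are all hypothetical.

```python
def accuracy_at(threshold, probabilities, labels):
    """Accuracy when messages scoring at or above the threshold are called complaints."""
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(int(pred == y) for pred, y in zip(predictions, labels))
    return correct / len(labels)


def best_threshold(probabilities, labels, candidates):
    """Return the candidate threshold with the highest accuracy (first one on ties)."""
    return max(candidates, key=lambda t: accuracy_at(t, probabilities, labels))


# Hypothetical validation-set complaint probabilities and true labels (1 = complaint).
val_probabilities = [0.15, 0.35, 0.45, 0.55, 0.62, 0.80, 0.91]
val_labels = [0, 0, 1, 0, 1, 1, 1]
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]

threshold = best_threshold(val_probabilities, val_labels, candidates)
print(threshold, accuracy_at(threshold, val_probabilities, val_labels))
```

Choosing the threshold on the validation set and only then scoring the holdout set keeps the final test result an honest estimate. A different criterion, such as balancing the two per-class rates, would fit the same structure.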
DEPLOYMENT
API
MONITORING
Model version: V0.4.1
Counts shown: 989, 489, 426, 3,651
Correctly classified: 83.52%
Misclassification Rate: 16.48%
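A misclassification rate like the one above can be tracked by comparing the model's label with the human reviewer's label for each message, and the disagreements can be queued for manual review. The sketch below is an assumption about how such monitoring might be wired up. The record layout, message IDs, and function name are hypothetical, and the sample log is invented.

```python
def misclassification_report(records):
    """records: iterable of (message_id, human_label, model_label) tuples.

    Returns the share of records where the model and the human disagree,
    plus the IDs of those records so they can be sent for manual review.
    """
    records = list(records)
    disagreements = [r for r in records if r[1] != r[2]]
    rate = len(disagreements) / len(records) if records else 0.0
    return rate, [r[0] for r in disagreements]


# Hypothetical monitoring log; labels are "complaint" or "non-complaint".
log = [
    ("m1", "complaint", "complaint"),
    ("m2", "non-complaint", "non-complaint"),
    ("m3", "complaint", "non-complaint"),
    ("m4", "non-complaint", "non-complaint"),
    ("m5", "complaint", "complaint"),
    ("m6", "non-complaint", "non-complaint"),
]

rate, review_queue = misclassification_report(log)
print(f"Misclassification rate: {rate:.2%}; review: {review_queue}")
```

Computing this per model version and per time window makes it easier to see whether a new release or a shift in incoming messages changed performance.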
KEY TAKEAWAYS
Steps Taken to Create and Utilize an NLP Model
1 Text Data (Data Formatting): The data comes in as raw text that needs to be formatted
2 Bag of Words (Word Count): The text is transformed into a word count dataset
3 Random Forest (Ensemble): Put the data through a collection of decision trees
4 Quality Control (Human Resources): Have a manual review of instances where the model predicts differently than the human
5 Monitoring (Monitor): Monitor model performance and make regular adjustments