Data-Driven CRM Optimization for eReading



Marketing Email Targeting and Personalization

Jake Stolee (Kobo), Jared Eccles (Kobo), Darius Braziunas (Kobo), Nathan Taback (U of T)

Rakuten Kobo, 135 Liberty St. Suite 101, Toronto ON, M6K 1A7

Introduction

CRM at Kobo: Use emails to promote things like:
• New books
• Sales
• Releases by specific authors

The project: Explore approaches for targeting and personalizing promotional emails. Build a system which:

1) Generates a list of the most applicable users for a given marketing campaign.

2) For each recipient, provides the optimal ordering of promoted books.

Exploring Scoring Methods

• Item-item similarity scores can be aggregated to provide the similarity between a promoted book list (“P”) and a user’s library (“L”); this is referred to as a user’s “affinity” to P.

• Cut off at a threshold, or take the top “U” most applicable users.

Proposed Scoring Methods:

Item Jaccard: The number of users who purchased both items, divided by the number of users who purchased at least one of the items.

Aggregate over all items in a user’s library and all items in the promoted book list.

• Model the probability that a user will make a purchase from a specific email marketing campaign.

• Train a discriminative model that is able to predict $p(\text{Purchase} \mid \mathbf{x})$, given some vector of features, $\mathbf{x}$, that captures information about both the user and the campaign.
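As a rough illustration of the discriminative model described above, the sketch below fits a logistic regression by batch gradient descent in plain Python and uses it to estimate p(Purchase | x). The two features (an affinity score and a scaled count of past email purchases), the six training rows, and the names `fit_logistic` and `predict_proba` are all invented for this example; they are not the poster's features, data, or final model.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, bias, x):
    """Estimate p(Purchase | x) for one feature vector x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias

# Synthetic rows (assumption): [affinity score, scaled past email purchases].
X = [[0.9, 1.0], [0.8, 0.5], [0.7, 1.0], [0.2, 0.0], [0.1, 0.5], [0.3, 0.0]]
y = [1, 1, 1, 0, 0, 0]  # label: converted {0, 1}

weights, bias = fit_logistic(X, y)
p_high = predict_proba(weights, bias, [0.85, 1.0])
p_low = predict_proba(weights, bias, [0.15, 0.0])
print(p_high > p_low)
```

Ranking users by the predicted probability would then play the same role as ranking them by an affinity score.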

Feature Generation:

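Item Jaccard example: 4 users purchased only “The Girl on the Train”, 6 users purchased only “The Hunger Games”, and 10 users purchased both:

$$J(x, y) = \frac{10}{20} = 0.5$$

$$J(\mathbf{x}, \mathbf{y}) = \frac{|\mathbf{x} \cap \mathbf{y}|}{|\mathbf{x} \cup \mathbf{y}|}, \qquad \mathbf{x} \in L,\ \mathbf{y} \in P$$

Cosine:

$$\mathrm{aff}_{C} = \cos\theta = \frac{\bar{\mathbf{x}} \cdot \bar{\mathbf{y}}}{\lVert \bar{\mathbf{x}} \rVert \, \lVert \bar{\mathbf{y}} \rVert}$$

Weighted Jaccard:

$$\mathrm{aff}_{WJ} = J_w(\bar{\mathbf{x}}, \bar{\mathbf{y}}) = \frac{\sum_{i=1}^{N} \min(\bar{x}_i, \bar{y}_i)}{\sum_{i=1}^{N} \max(\bar{x}_i, \bar{y}_i)}$$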
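The Item Jaccard score can be sketched with Python sets, one set of buyer IDs per book. The helper name `item_jaccard` and the integer user IDs are made up for this example; the counts (4, 6, and 10 users) are the ones from the poster's example.

```python
def item_jaccard(buyers_x, buyers_y):
    """Users who bought both items over users who bought at least one."""
    union = buyers_x | buyers_y
    if not union:
        return 0.0  # neither item has any buyers
    return len(buyers_x & buyers_y) / len(union)

# Hypothetical user IDs: 10 users bought both books,
# 4 bought only the first and 6 bought only the second.
both = set(range(10))
girl_on_the_train = both | {100, 101, 102, 103}
hunger_games = both | {200, 201, 202, 203, 204, 205}

score = item_jaccard(girl_on_the_train, hunger_games)
print(score)  # 0.5
```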

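Email Targeting Using “Affinity Scores”

$$\mathrm{aff}_{JSum} = \sum_{\mathbf{x} \in L} \sum_{\mathbf{y} \in P} J(\mathbf{x}, \mathbf{y}), \qquad \mathrm{aff}_{JAvg} = \frac{\mathrm{aff}_{JSum}}{|L|\,|P|}$$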
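A minimal sketch of how the aggregated affinities might be used for targeting, assuming purchases are available as a mapping from each item to the set of users who bought it. The item and user names, the toy purchase data, and the helper names (`aff_jsum`, `aff_javg`, `top_users`) are all hypothetical.

```python
from itertools import product

def item_jaccard(buyers_x, buyers_y):
    """Jaccard similarity between the buyer sets of two items."""
    union = buyers_x | buyers_y
    return len(buyers_x & buyers_y) / len(union) if union else 0.0

def aff_jsum(library, promoted, buyers):
    """Sum J(x, y) over every library item x and every promoted item y."""
    return sum(item_jaccard(buyers[x], buyers[y])
               for x, y in product(library, promoted))

def aff_javg(library, promoted, buyers):
    """aff_JSum divided by |L| * |P| (0.0 if either list is empty)."""
    if not library or not promoted:
        return 0.0
    return aff_jsum(library, promoted, buyers) / (len(library) * len(promoted))

def top_users(libraries, promoted, buyers, u):
    """Return the u users with the highest aff_JAvg, best first."""
    scores = {user: aff_javg(lib, promoted, buyers)
              for user, lib in libraries.items()}
    return sorted(scores, key=scores.get, reverse=True)[:u]

# Toy data (assumption): item -> set of users who purchased it.
buyers = {
    "A": {"u1", "u2", "u3"},
    "B": {"u1", "u2"},
    "C": {"u3", "u4"},
    "P1": {"u1", "u2", "u5"},
}
libraries = {"u1": ["A", "B"], "u3": ["A", "C"], "u4": ["C"]}
promoted = ["P1"]

targeted = top_users(libraries, promoted, buyers, u=2)
print(targeted)  # ['u1', 'u3']
```

A threshold cut-off would instead keep every user whose score exceeds a chosen value, rather than a fixed number of users.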

Data Set (Hadoop): feature groups User Account, User Purchase Behaviour, User Email Purchase Behaviour, User Reading Behaviour, Affinity Scores, Campaign Info; Label: Converted {0,1}

Currently In Progress/To Be Completed:
• Algorithm evaluation (logistic regression & neural network classifiers, among others).
• Model selection/evaluation: A/B testing against the current affinity-based approach.

Email Personalization Using “Affinity Scores”

• Sum J(x, y) between a given promoted book and every book in a user’s library.
• The resulting score for each promoted book can be used to order the book list for every user.
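The ordering step above could look roughly like the following: each promoted book is scored by summing its Jaccard similarity to every book in the user's library, and the list is sorted by that score. The book names, buyer sets, and the function name `order_promoted` are illustrative assumptions.

```python
def item_jaccard(buyers_x, buyers_y):
    """Jaccard similarity between the buyer sets of two items."""
    union = buyers_x | buyers_y
    return len(buyers_x & buyers_y) / len(union) if union else 0.0

def order_promoted(library, promoted, buyers):
    """Score each promoted book against the whole library, then sort.

    Returns (ordered list, scores); higher scores come first and ties
    keep their original order because sorted() is stable.
    """
    scores = {
        p: sum(item_jaccard(buyers[b], buyers[p]) for b in library)
        for p in promoted
    }
    ordered = sorted(promoted, key=lambda p: scores[p], reverse=True)
    return ordered, scores

# Toy data (assumption): item -> set of user IDs who purchased it.
buyers = {
    "lib1": {1, 2, 3, 4},
    "lib2": {3, 4, 5},
    "promo_a": {1, 2, 3},
    "promo_b": {5, 6},
    "promo_c": {7},
}
library = ["lib1", "lib2"]
promoted = ["promo_c", "promo_a", "promo_b"]

ordered, scores = order_promoted(library, promoted, buyers)
print(ordered)  # ['promo_a', 'promo_b', 'promo_c']
```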

Software/Tools Used and more…

[Figure: User A’s Library, Promoted List, and User B’s Library, annotated with pairwise similarity scores]

Data sources: Email Logs (HDFS), Tracking Messages (HDFS), SQL

Weighted Jaccard (between mean item vectors): Compute the mean item vectors for a given list and user’s library, then calculate the generalized “weighted” Jaccard similarity between the two mean item vectors.
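A small sketch of the weighted Jaccard score, assuming each item is a 0/1 list with one entry per user. The toy vectors (four users, two books per list) and the function names are invented for the example.

```python
def mean_vector(vectors):
    """Element-wise mean of equally sized item vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def weighted_jaccard(x, y):
    """Sum of element-wise minima over sum of element-wise maxima."""
    denom = sum(max(a, b) for a, b in zip(x, y))
    if denom == 0:
        return 0.0  # both vectors are all zeros
    return sum(min(a, b) for a, b in zip(x, y)) / denom

# Toy item vectors (assumption): entry i is 1 if user i bought the book.
library_items = [[1, 1, 0, 0], [1, 0, 1, 0]]
promoted_items = [[1, 1, 0, 0], [0, 1, 0, 1]]

x_bar = mean_vector(library_items)   # [1.0, 0.5, 0.5, 0.0]
y_bar = mean_vector(promoted_items)  # [0.5, 1.0, 0.0, 0.5]
aff_wj = weighted_jaccard(x_bar, y_bar)
print(round(aff_wj, 4))  # 0.3333
```

On plain 0/1 vectors this reduces to the ordinary Jaccard similarity, which is why it is described as a generalization.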

[Figure: example users with aff_JAvg = 0.391 (✓) and aff_JAvg = 0.150 (✘)]

[Figure: Resulting order based on User A’s library]

Cosine (between mean item vectors): Calculate the cosine of the angle between mean item vectors for the list and user library.
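The cosine score can be sketched the same way, again assuming 0/1 item vectors indexed by user. The toy vectors and helper names below are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equally sized item vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(x, y):
    """Cosine of the angle between x and y (0.0 if either is all zeros)."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)

# Toy item vectors (assumption): entry i is 1 if user i bought the book.
library_items = [[1, 1, 0, 0], [1, 0, 1, 0]]
promoted_items = [[1, 1, 0, 0], [0, 1, 0, 1]]

aff_c = cosine(mean_vector(library_items), mean_vector(promoted_items))
print(round(aff_c, 4))  # 0.6667
```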

“User” Jaccard: Treat customer libraries and promoted lists as bit vectors, where an element indicates whether a specific book is present or not. For a given user library vector, $\mathbf{u}$, and the list vector, $\boldsymbol{\ell}$, compute:

$$\mathrm{aff}_{U} = J(\mathbf{u}, \boldsymbol{\ell})$$

• Treat items as bit vectors (of size $N$): every element indicates whether a specific user purchased the item or not.

• These vectors can be used to compute item-item vector “similarity scores”.

A Machine Learning Approach to Targeting

[Figure: Example (“50 Book Pledge”) Campaign. Panel 1: density of the normalized score (aff_C), x-axis from 0.0 to 1.0. Panel 2: mean score difference between groups, by score type (aff_JAvg, aff_C, aff_WJ, aff_JSum, aff_U).]