Predicting Click Through Rate for Job Listings Manish Gupta Yahoo! HotJobs Jan 22, 2009.


Predicting Click Through Rate for Job Listings

Manish Gupta

Yahoo! HotJobs

Jan 22, 2009


CTR and its applications

• CTR = ratio of clicks (to open the full description of an entity) to views of its reduced version

• Rank results
• Impacts publisher revenue in pay-for-performance models
• Bidding in ad exchanges
• Trends can help detect click fraud


CTR for new job listings

• Avg CTR = 2.29%
• MLE would have high variance
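To make the variance point concrete, here is a minimal sketch. It assumes clicks follow a binomial distribution with a fixed click probability, which is a modelling assumption and not something stated on the slide. Under that assumption the MLE is clicks / views and its standard error is sqrt(p(1 - p) / views), so a new listing with few views gets a very noisy estimate. The function names are illustrative.

```python
import math

AVG_CTR = 0.0229  # average CTR quoted on the slide


def mle_ctr(clicks, views):
    """Maximum-likelihood CTR estimate: observed clicks divided by views."""
    return clicks / views


def binomial_std_error(p, views):
    """Standard error of the MLE, assuming clicks ~ Binomial(views, p)."""
    return math.sqrt(p * (1.0 - p) / views)


# Show how the uncertainty shrinks as a listing accumulates views.
for views in (10, 100, 1000, 10000):
    se = binomial_std_error(AVG_CTR, views)
    print(f"views={views:>6}  std_error={se:.4f}  relative={se / AVG_CTR:.2f}")
```

At 100 views the standard error is about 0.015, which is of the same order as the average CTR itself. That is one way to read the motivation for predicting CTR from features instead of from the listing's own short click history.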


CTR for job listings


Related work

• Regelson and Fain – Estimate CTR using topic clusters (job categories)

• Richardson et al. – Describe features for predicting CTR for ads

• Our baseline: avg CTR for a test job (2.29%)


Refined Problem definition

• Ideal: Predict CTR(job j, position p, user cluster u, context c)
– Data sparsity
– Huge feature vector
• Predict CTR(job)
– Use CTR versus position curve
• Predict CTR(job, position)
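The slide does not say how the CTR versus position curve is applied. One plausible reading, sketched below purely as an assumption, is to estimate an average CTR per result position over all jobs, express a job's quality as actual clicks over the clicks that curve would predict, and then scale the curve by that factor to get CTR(job, position). The record layout `(position, views, clicks)` and all function names are hypothetical.

```python
def position_curve(impressions):
    """Average CTR per position from (position, views, clicks) records."""
    views, clicks = {}, {}
    for pos, v, c in impressions:
        views[pos] = views.get(pos, 0) + v
        clicks[pos] = clicks.get(pos, 0) + c
    return {pos: clicks[pos] / views[pos] for pos in views if views[pos] > 0}


def job_factor(job_history, curve):
    """Actual clicks over curve-expected clicks; 1.0 means an average job."""
    expected = sum(v * curve[pos] for pos, v, _ in job_history)
    actual = sum(c for _, _, c in job_history)
    return actual / expected if expected > 0 else 1.0


def ctr_at(job_history, curve, position):
    """Position-specific CTR: job-level factor times the position curve."""
    return job_factor(job_history, curve) * curve[position]


# Toy data, invented for illustration only.
all_impressions = [(1, 1000, 50), (2, 1000, 30), (3, 1000, 20)]
curve = position_curve(all_impressions)
job_history = [(2, 100, 6)]  # this job was only ever shown at position 2
print(ctr_at(job_history, curve, 1), ctr_at(job_history, curve, 3))
```

The appeal of this factorization is that a single position-independent number per job is learned, which sidesteps the data sparsity of estimating every (job, position) pair directly.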


Data set

• Used HotJobs data from Aug 11, 2008 to Aug 31, 2008 to predict CTR of jobs on Sep 1, 2008

• 40K jobs from 7K+ companies
• 32K as train set and 8K as test set
• Jobs have location, company name, category, creation date, posting date, optional position-wise click history, job source, title, snippet & job description.


Different models

• Weka: Linear Regression and SMOReg
• Treenet: Gradient Boosted Decision Trees
• Feature selection:
– Weka: wrapper with evaluator=linear regression and search=GreedyStepwise
– Treenet: Variable importance metrics
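The wrapper approach named above can be illustrated with a small pure-Python sketch. This is not Weka's implementation: it is a simplified forward greedy search that scores each candidate feature subset by fitting ordinary least squares and measuring mean squared error on a single held-out split (the hold-out split, the stopping threshold `min_gain`, and all names are assumptions made for the example).

```python
def fit_linear(X, y):
    """Ordinary least squares with intercept, via Gauss-Jordan on the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def mse(w, X, y):
    """Mean squared error of weights w (intercept first) on (X, y)."""
    preds = [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def take_columns(X, idx):
    return [[r[i] for i in idx] for r in X]


def greedy_forward_select(X_tr, y_tr, X_va, y_va, min_gain=1e-9):
    """Add the feature that most reduces validation MSE until nothing helps."""
    chosen = []
    best = mse(fit_linear(take_columns(X_tr, []), y_tr), take_columns(X_va, []), y_va)
    while len(chosen) < len(X_tr[0]):
        scores = []
        for f in range(len(X_tr[0])):
            if f in chosen:
                continue
            idx = chosen + [f]
            try:
                w = fit_linear(take_columns(X_tr, idx), y_tr)
            except ValueError:
                continue  # skip subsets that make the fit degenerate
            scores.append((mse(w, take_columns(X_va, idx), y_va), f))
        if not scores:
            break
        score, f = min(scores)
        if best - score <= min_gain:
            break
        chosen.append(f)
        best = score
    return chosen, best


# Toy data: the target depends on features 0 and 1 only; feature 2 is irrelevant.
X = [[float(i % 7), float((3 * i) % 5), float((i * i) % 11)] for i in range(40)]
y = [0.5 * r[0] - 0.2 * r[1] for r in X]
chosen, best = greedy_forward_select(X[0::2], y[0::2], X[1::2], y[1::2])
print(sorted(chosen), best)
```

A wrapper of this kind retrains the model for every candidate subset, so its cost grows quickly with the number of features; with hundreds of candidate features that cost is worth keeping in mind.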


Features

• Features from Similar Jobs (60)
– CTR of jobs with same title/company/state/city+state/category and their cardinalities, posted in past one/two weeks or all jobs, based on the click history of past one/two/three weeks
• Features from Related Jobs (288)
– CTR_mn of related jobs with m = |A-B| and n = |B-A| and cardinalities (0 ≤ m, n ≤ 5), posted in past one/two weeks or all jobs, based on the click history of past one/two/three weeks
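The slide does not define the sets A and B. The sketch below assumes, hypothetically, that A is the set of title words of the job being scored and B is the set of title words of another job, so m counts words only in the target title and n counts words only in the other title. It also assumes two jobs are "related" only if they share at least one word. For each (m, n) bucket it aggregates a CTR and a cardinality, mirroring the CTR_mn features listed above.

```python
from collections import defaultdict


def title_words(title):
    """Lower-cased set of whitespace-separated title words (assumed tokenization)."""
    return set(title.lower().split())


def relation(title_a, title_b):
    """Return (m, n) = (|A - B|, |B - A|) for the two titles' word sets."""
    a, b = title_words(title_a), title_words(title_b)
    return len(a - b), len(b - a)


def related_ctr_features(target_title, history, max_diff=5):
    """Aggregate CTR and job count per (m, n) bucket.

    history: iterable of (title, views, clicks) for previously posted jobs.
    """
    views, clicks, count = defaultdict(int), defaultdict(int), defaultdict(int)
    target = title_words(target_title)
    for title, v, c in history:
        if not (target & title_words(title)):
            continue  # assumption: no shared word means "not related"
        m, n = relation(target_title, title)
        if m > max_diff or n > max_diff:
            continue
        views[(m, n)] += v
        clicks[(m, n)] += c
        count[(m, n)] += 1
    return {
        key: {
            "ctr": clicks[key] / views[key] if views[key] else None,
            "cardinality": count[key],
        }
        for key in count
    }


# Toy click history, invented for illustration only.
history = [
    ("Java Developer", 200, 6),
    ("Senior Java Developer Lead", 100, 5),
    ("Java Developer", 300, 4),
    ("Registered Nurse", 500, 20),
]
features = related_ctr_features("Senior Java Developer", history)
print(features)
```

Under this reading the (0, 0) bucket coincides with jobs that have the same title, which is the "similar jobs" case.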


Features

• Job Title Features (11)
– #words, #capitalized words, isAllCaps, hasHighPunct, hasLongWords, hasNumbers, vocabulary features
• Daily CTR Features for past 3 weeks (21)
• Other Features (10)
– Job Category, age, location specificity, job source, and job description page features
• Other potential features
– high-marketing-pitch words, brand value of company, spam feedback, seasonal variations
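The job title features are simple surface statistics, and a rough sketch of how they might be computed follows. The feature names come from the slide, but the thresholds (`long_word_len`, `punct_ratio`) and the exact definitions are assumptions, since the slide gives none. The vocabulary features are omitted because they are not described.

```python
import string


def title_features(title, long_word_len=10, punct_ratio=0.2):
    """Surface features of a job title; thresholds are illustrative guesses."""
    words = title.split()
    letters = [ch for ch in title if ch.isalpha()]
    punct = sum(ch in string.punctuation for ch in title)
    return {
        "num_words": len(words),
        "num_capitalized_words": sum(w[0].isupper() for w in words),
        "is_all_caps": bool(letters) and all(ch.isupper() for ch in letters),
        "has_high_punct": len(title) > 0 and punct / len(title) > punct_ratio,
        "has_long_words": any(len(w) >= long_word_len for w in words),
        "has_numbers": any(ch.isdigit() for ch in title),
    }


print(title_features("HIRING NOW!!! $$$ 100 DRIVERS"))
print(title_features("Software Engineer"))
```

Features such as isAllCaps and hasHighPunct plausibly act as proxies for aggressive or spam-like titles, though the slides do not state the intent behind them.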


Experiments and results

• Baseline: Predict avg CTR for a test job (2.29%)
• Predicting avg category-wise CTR (A)
• Linear Regression over 390 features (B) – uses only 142 regressors.
• GBDT using Treenet over 390 features (C) – uses 300 regressors (at 256_600_0.01_100)
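The two non-learned predictors above (the global average and the category-wise average) are easy to sketch, along with the kind of regression metrics one might score them with. The metrics chosen here (RMSE and MAE), the unweighted per-job averaging, and the fallback to the global mean for unseen categories are all assumptions; the data is a toy set invented for illustration and has no relation to the reported results.

```python
import math
from collections import defaultdict


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def global_mean(train):
    """Baseline: one average CTR for every test job."""
    return sum(ctr for _, ctr in train) / len(train)


def category_means(train):
    """Predictor (A): average CTR per job category."""
    totals, counts = defaultdict(float), defaultdict(int)
    for category, ctr in train:
        totals[category] += ctr
        counts[category] += 1
    return {c: totals[c] / counts[c] for c in totals}


# Toy (category, CTR) pairs, invented for illustration only.
train = [("sales", 0.01), ("sales", 0.03), ("tech", 0.04), ("tech", 0.06)]
test = [("sales", 0.02), ("tech", 0.05), ("legal", 0.035)]

baseline = global_mean(train)
by_category = category_means(train)
y_true = [ctr for _, ctr in test]
pred_baseline = [baseline for _ in test]
# Unseen categories fall back to the global mean (an assumption).
pred_category = [by_category.get(cat, baseline) for cat, _ in test]

print("baseline RMSE", rmse(y_true, pred_baseline), "MAE", mae(y_true, pred_baseline))
print("category RMSE", rmse(y_true, pred_category), "MAE", mae(y_true, pred_category))
```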


Analysis of regressor distribution


Important features

• Similar Jobs features
– Same company, title, city+state using 1 week click history
• Other features
– Creation date, job description page size, date of update, posting date, job category
• Related Jobs features
– Related_11, related_12 jobs posted in past 1/3 weeks over 1/3 week click history


Pruning the feature set


Pruning the feature set

• Wrapper-based feature selection with linear regression and with Treenet’s variable importance (E) – 11 features.


In absence of click history …

• Linear regression with 369 features (F) – uses 187 regressors.

• Treenet uses 282 regressors at 256_600_0.01_20 (G)


Analysis of regressor distribution

None of the sets alone helps!


Pruning the feature set


Variable importance curves


Conclusion and future work

• More features
• Dyadic models to predict user-personalized CTR with (job feature vector, user feature vector) dyads.
• Auto model updates to correct model drift

• We built a machine learning system to predict CTR for job listings and presented our results using various regression metrics.


Thanks for your time