Human-Centric Machine Learning

© 2017 NAVER LABS. All rights reserved. Matthias Gallé Naver Labs Europe @mgalle Human-Centric Machine Learning Rakuten Technology Conference 2017


Matthias Gallé, Naver Labs Europe

@mgalle

Human-Centric Machine Learning

Rakuten Technology Conference 2017

Supervised Learning

Where f is typically such that

$$f = \arg\min_{f \in F} \frac{1}{N} \sum_{i=1}^{N} L(f(x_i), y_i) + \lambda R(f)$$

I know what I want (and can formalize it)

I have time & money to label lots of data

X, Y → f(x)
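The objective above can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration, not something stated in the talk: F is the family of 1-D linear models f(x) = w·x + b, L is the squared loss, R(f) = w², and the minimization is done with plain gradient descent. The function names `objective` and `fit` are hypothetical.

```python
from typing import List, Tuple


def objective(w: float, b: float, xs: List[float], ys: List[float], lam: float) -> float:
    """(1/N) * sum_i L(f(x_i), y_i) + lam * R(f), with squared loss and R(f) = w^2."""
    n = len(xs)
    data_term = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return data_term + lam * w ** 2


def fit(xs: List[float], ys: List[float], lam: float = 0.01,
        lr: float = 0.05, steps: int = 2000) -> Tuple[float, float]:
    """Approximate the argmin over (w, b) by gradient descent on the objective."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the averaged squared loss plus the L2 penalty on w.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n + 2 * lam * w
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy labelled data (X, Y) generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit(xs, ys, lam=0.0)          # no regularization: recovers the generating line
w_reg, b_reg = fit(xs, ys, lam=0.1)  # with regularization: the slope is shrunk towards 0
```

The regularization weight λ (`lam`) trades data fit against model complexity; both the loss and the regularizer have to be written down explicitly, which is the "can formalize it" requirement of the slide.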

Example: Machine Translation

Given a text s and its proposed translation p, how do we measure the distance between p and a reference translation t?

BLEU: n-gram overlap between t and p
• typically: 1 ≤ n ≤ 4
• precision only
• brevity penalty

METEOR:
• bonus points for matching stems and synonyms
• use paraphrases

P. Koehn, Statistical Machine Translation (www.statmt.org/book/slides/08-evaluation.pdf)
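As a rough illustration of the BLEU ingredients listed above (n-gram overlap for 1 ≤ n ≤ 4, precision only, brevity penalty), here is a simplified sentence-level sketch. It assumes a single reference, whitespace tokenization and no smoothing, so it is not the official corpus-level metric; the names `ngrams` and `bleu` are hypothetical.

```python
import math
from collections import Counter
from typing import Counter as CounterType, List, Tuple


def ngrams(tokens: List[str], n: int) -> CounterType[Tuple[str, ...]]:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(reference: str, proposal: str, max_n: int = 4) -> float:
    """Simplified BLEU of proposal p against one reference t."""
    t, p = reference.split(), proposal.split()
    if not p:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        p_counts, t_counts = ngrams(p, n), ngrams(t, n)
        total = sum(p_counts.values())
        # Clipped matches: an n-gram is credited at most as often as it occurs in t.
        matched = sum(min(count, t_counts[gram]) for gram, count in p_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # no smoothing: any empty precision gives a zero score
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: precision alone would reward very short proposals.
    brevity = 1.0 if len(p) >= len(t) else math.exp(1 - len(t) / len(p))
    return brevity * math.exp(sum(log_precisions) / max_n)


t = "the cat sat on the mat"
exact = bleu(t, "the cat sat on the mat")    # identical to the reference
truncated = bleu(t, "the cat sat on the")    # perfect precision, but too short
unrelated = bleu(t, "a dog")                 # no overlap at all
```

In the truncated case every n-gram precision is perfect and only the brevity penalty lowers the score, which shows why a precision-only measure needs that extra term.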

Consequences of not formalizing correctly

Users do not use your model
• Computer-Assisted Translation used rule-based systems for years

Ad-hoc solutions
• Quality Prediction
• Automatic Post-Edition

Unsupervised Learning

Where Z(X) captures some prior:
• Compression
• Clustering
• Coverage
• …

I am not sure what I want

I have a (big) corpus with assumed patterns

X → Z(X)
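To make the mapping from X to Z(X) tangible, here is a minimal sketch that uses clustering as the prior. It is only an illustration under assumptions not made in the talk: the data are 1-D numbers, the method is plain k-means with a deterministic initialization, and `kmeans_1d` is a hypothetical helper.

```python
from typing import List, Tuple


def kmeans_1d(xs: List[float], k: int, iters: int = 20) -> Tuple[List[float], List[int]]:
    """Plain k-means on 1-D data; returns (cluster centers, cluster index per point)."""
    s = sorted(xs)
    # Deterministic initialization: spread the initial centers over the sorted data.
    centers = [s[(2 * j + 1) * len(s) // (2 * k)] for j in range(k)]
    for _ in range(iters):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)]
    assignment = [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]
    return centers, assignment


# Unlabelled corpus X with an assumed pattern: two groups of values.
X = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, Z = kmeans_1d(X, k=2)  # Z(X): one cluster index per element of X
```

No labels Y are involved: the only supervision is the assumption that the data form k compact groups, which is the kind of prior the slide refers to.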

Example: Exploratory Search

Whenever your task is:
• Ill-defined:
  – Broad / under-specified
  – Multi-faceted
• Dynamic:
  – Searcher’s understanding inadequate at the beginning
  – Searcher’s understanding evolves as results are gradually retrieved

The answer to what you are searching for is “I know it when I see it”

https://en.wikipedia.org/wiki/I_know_it_when_I_see_it

Interactive Learning

Exploratory Search: examples

E-Discovery

Sensitivity Review

• Vo, Ngoc Phuoc An, et al. "DISCO: A System Leveraging Semantic Search in Document Review." COLING (Demos). 2016.
• Privault, Caroline, et al. "A new tangible user interface for machine learning document review." Artificial Intelligence and Law 18.4 (2010): 459-479.
• Ferrero, Germán, Audi Primadhanty, and Ariadna Quattoni. "InToEventS: An Interactive Toolkit for Discovering and Building Event Schemas." EACL 2017 (2017): 104.

Example: Active Learning

Give initiative to the algorithm: allow actions of the type “please, label instance x”

The cognitive effort of labelling a document is 3-5x higher than that of labelling a word [1]

Feature labelling:
• type(feedback) ≠ type(label)
• information load of a word label is small
• word sense disambiguation

[1] Raghavan, Hema, Omid Madani, and Rosie Jones. "Active learning with feedback on features and instances." Journal of Machine Learning Research 7.Aug (2006): 1655-1686.
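The “please, label instance x” action can be sketched as a pool-based loop in which the algorithm picks the instance to be labelled. The sketch below uses uncertainty sampling, one common selection strategy that the talk does not necessarily endorse, on a deliberately simple 1-D threshold classifier. The `oracle` callable stands in for the human annotator; `fit_threshold` and `active_learn` are hypothetical names.

```python
from typing import Callable, List, Tuple

Labeled = List[Tuple[int, int]]  # (instance, label) pairs


def fit_threshold(labeled: Labeled) -> float:
    """Decision boundary: midpoint between the largest negative and smallest positive."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    return (max(negatives) + min(positives)) / 2


def active_learn(pool: List[int], oracle: Callable[[int], int],
                 seed: Labeled, budget: int) -> Tuple[float, Labeled]:
    """Query the oracle `budget` times, always on the most uncertain instance."""
    labeled = list(seed)
    seen = {x for x, _ in seed}
    unlabeled = [x for x in pool if x not in seen]
    for _ in range(budget):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # Most uncertain instance = the one closest to the current boundary.
        x = min(unlabeled, key=lambda u: abs(u - threshold))
        unlabeled.remove(x)
        labeled.append((x, oracle(x)))  # "please, label instance x"
    return fit_threshold(labeled), labeled


pool = list(range(11))                    # instances 0..10
oracle = lambda x: int(x >= 7)            # simulated annotator; true boundary lies between 6 and 7
seed = [(0, 0), (10, 1)]                  # one labelled example per class to start
threshold, labeled = active_learn(pool, oracle, seed, budget=3)
```

With three queries the loop asks about instances 5, 7 and 6, which is enough to place the boundary between 6 and 7; labelling the pool in arbitrary order would typically need more annotations to reach the same boundary.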

Conclusion

If you really want to solve a problem, don’t be a prisoner of your performance indicator

Ask yourself:

1. Does it really capture success? Does it align with human judgment?

2. What does the [machine | human] do best?

3. Can you remove the burden from humans with smarter algorithms?

Further reading & Acknowledgments

Jean-Michel Renders, Marc Dymetman, Ariadna Quattoni

http://www.europe.naverlabs.com/Blog

Q&A


Appendix


P. Koehn, Statistical Machine Translation (www.statmt.org/book/slides/08-evaluation.pdf)