Towards Anytime Active Learning: Interrupting Experts to ...
Transcript of Towards Anytime Active Learning: Interrupting Experts to ...
Towards Anytime Active Learning: Interrupting Experts to Reduce Annotation Costs
Maria E. Ramirez Loaiza
Aron Culotta
Mustafa Bilgic
Machine Learning Lab
Data X Learner Expert
Active Learning
𝑓 𝑥 = 𝑦
y y
How to spend the budget?
x x
Budget
2 www.cs.iit.edu/~ml Anytime Active Learning
Example: Text Classification
Which document and how much to read?
3 www.cs.iit.edu/~ml Anytime Active Learning
Example: Diagnostic Images
Which image and how much time to spend? 6 www.cs.iit.edu/~ml Anytime Active Learning
The Problem
• Anytime active learning
– Learner interrupts the expert at anytime, ask label
• The problem:
– Which instances to pick
– How much time to spend on each
7 www.cs.iit.edu/~ml Anytime Active Learning
Outline
• Anytime active learning in detail
• Experimental Results
• Future Work
• Conclusion
8 www.cs.iit.edu/~ml Anytime Active Learning
The Problem: Formally
• 𝑥𝑖𝑘 represents interruption
– Subinstance
– Interrupt at time k
• Value of information (VOI)
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Curabitur et viverra sem. Cras metus felis, vulputate vel leo nec, vestibulum suscipit lorem. Integer dolor nisi, porttitor in auctor a, malesuada id massa. Proin dictum nibh at convallis ornare. Morbi lectus justo, auctor eu tincidunt in, semper eget leo. Mauris massa.
k =50 words
Subinstances 𝒙𝒊𝒌
9 www.cs.iit.edu/~ml
Current Error Expected Future Error
Anytime Active Learning
Our Approach
• Search over all possible subinstances
• Models tradeoff between VOI and cost
10 www.cs.iit.edu/~ml
Labeling Cost VOI of Subinstance
Anytime Active Learning
Full Fixed
Search Space
𝑥110 𝑥1
20 … 𝑥1𝑘 … 𝑥1
100 𝑥1
𝑥210 𝑥2
20 … 𝑥2𝑘 … 𝑥2
100 𝑥2
⋮
⋮
𝑥𝑖10 𝑥𝑖
20 … 𝑥𝑖𝑘 … 𝑥𝑖
100 𝑥𝑖
⋮
⋮
𝑥𝑛10 𝑥𝑛
20 … 𝑥𝑛𝑘 … 𝑥𝑛
100 𝑥𝑛
Our framework
11 www.cs.iit.edu/~ml Anytime Active Learning
Results – Fixed Size Interruption
0.50
0.55
0.60
0.65
0.70
0 2000 4000 6000 8000 10000 12000 14000 16000
Acc
ura
cy
Number of Words
Accuracy - Movie - MNB with Fixed Subinstance Size
RND-FULL
RND-FIX-10
RND-FIX-30
VOI-FULL
VOI-FIX-10
VOI-FIX-30
Interrupted
Full
12 www.cs.iit.edu/~ml Anytime Active Learning
Results – Anytime Active Learning
0.50
0.55
0.60
0.65
0.70
0 2000 4000 6000 8000 10000 12000 14000 16000
Acc
ura
cy
Number of Words
Accuracy - Movie - MNB with Adaptive Subinstance Size
ANYTIME-γ=0.0001
ANYTIME-γ=0.0005
ANYTIME-γ=0.01
VOI-FULL
VOI-FIX-10
VOI-FIX-30
Full
13 www.cs.iit.edu/~ml
Interrupted
Anytime Active Learning
• Incorporate label error
• Other ways to interrupt
– Structured reading
– Automatic summary
• Other domains
– e.g., vision
• User study
Future Work
Sections Sections
14 www.cs.iit.edu/~ml Anytime Active Learning
Summary