Learning Data Manipulation for Augmentation and Weighting
Zhiting Hu*, Bowen Tan*, Ruslan Salakhutdinov, Tom Mitchell, Eric Xing
Carnegie Mellon University, Petuum Inc.
Learning Data Manipulation
• Data manipulation is often a crucial step to improve performance
• Data augmentation for small data problems
  • Rule-based: image rotation, cropping; text synonym, …
  • Learning-based [Ratner et al., 17; Cubuk et al., 19]
• Data weighting for class-imbalance problems
  • Rule-based: inverse class frequency, …
  • Learning-based: meta-learning [Ren et al., 18]; self-paced learning [Jiang et al., 18]
[Figure: an example and its augmented copy (x → x′); examples weighted 1 vs. 10]
Specific to one manipulation type
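As a concrete instance of the rule-based weighting mentioned on this slide, the following is a minimal sketch of inverse class frequency weights. The function name and the normalization n / (k · count), which makes the weights sum to the number of examples, are assumptions chosen for illustration; the slides do not specify a particular formula.

```python
from collections import Counter


def inverse_class_frequency_weights(labels):
    """Return one weight per example, inversely proportional to its class count.

    Assumed normalization: weight = n / (k * count[class]), where n is the
    number of examples and k the number of classes.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]


# Toy imbalanced label set: 2 positives, 8 negatives.
labels = ["pos"] * 2 + ["neg"] * 8
weights = inverse_class_frequency_weights(labels)
```

The rule is fixed in advance and never looks at model performance, which is what the learning-based methods on the slide try to improve on.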
Can we have a more generic learning method?
Background
Connecting the Dots between MLE and RL [Tan et al., 2019]
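Parameterize Manipulation
[Hu et al., NeurIPS2019]
• Standard MLE
$$\mathcal{L}(q, \theta) = \mathbb{E}_{q(x,y)}\left[ R_\delta(x, y \mid \mathcal{D}) \right] - \alpha\, \mathrm{KL}\left( q(x, y) \,\|\, p(x)\, p_\theta(y \mid x) \right) + \mathrm{H}(q)$$
$$R_\delta(x, y \mid \mathcal{D}) = \begin{cases} 1 & \text{if } (x, y) \in \mathcal{D} \\ -\infty & \text{otherwise} \end{cases}$$
Get valid rewards only when samples match training data
Relax it
• Data augmentation
$$R_\phi^{\text{aug}}(x, y \mid \mathcal{D}) = \begin{cases} 1 & \text{if } x \sim g_\phi(x \mid x^*, y),\ (x^*, y) \in \mathcal{D} \\ -\infty & \text{otherwise} \end{cases}$$
$g_\phi$: any label-preserving transformations
• Data weighting
$$R_\phi^{w}(x, y \mid \mathcal{D}) = \begin{cases} \phi_i & \text{if } (x, y) = (x_i^*, y_i^*),\ (x_i^*, y_i^*) \in \mathcal{D} \\ -\infty & \text{otherwise} \end{cases}$$
$\phi_i$: weight of the $i$-th training example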
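The three reward definitions on this slide can be sketched in plain Python to make the relaxation concrete. This is an illustration under stated assumptions, not the authors' code: samples are (x, y) tuples, `toy_transform_support` is a hypothetical stand-in for the set of outputs that a label-preserving transformation g_φ(x | x*, y) can produce, and `phi` is a plain list of per-example weights.

```python
import math


def delta_reward(sample, dataset):
    """Standard MLE reward: 1 only for exact training examples."""
    return 1.0 if sample in dataset else -math.inf


def augmentation_reward(sample, dataset, transform_support):
    """Relaxed reward: 1 if x can be produced from some (x*, y) in the data."""
    x, y = sample
    for x_star, y_star in dataset:
        if y == y_star and x in transform_support(x_star, y_star):
            return 1.0
    return -math.inf


def weighting_reward(sample, dataset, phi):
    """Relaxed reward: the learnable weight phi[i] of the matching example."""
    for i, example in enumerate(dataset):
        if sample == example:
            return phi[i]
    return -math.inf


def toy_transform_support(x_star, y_star):
    # Hypothetical stand-in for the support of g_phi(x | x*, y).
    return {x_star, x_star.replace("good", "nice")}


dataset = [("food was good", "pos"), ("service was slow", "neg")]
phi = [0.3, 1.7]

r_exact = delta_reward(("food was good", "pos"), dataset)
r_unseen = delta_reward(("food was nice", "pos"), dataset)
r_aug = augmentation_reward(("food was nice", "pos"), dataset, toy_transform_support)
r_weight = weighting_reward(("service was slow", "neg"), dataset, phi)
```

The paraphrase "food was nice" gets no valid reward under the delta reward but is accepted under the augmentation reward, while the weighting reward replaces the constant 1 with a per-example value.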
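Use a Reward Learning Algorithm
[Hu et al., NeurIPS2019]
• Intrinsic reward learning [Zheng et al., 18]
  • Augment extrinsic reward for better performance: $R^{\text{ex+in}}(x, y) = R^{\text{ex}}(x, y) + R^{\text{in}}_\phi(x, y)$
  • Update model: $\theta' = \theta + \gamma \nabla_\theta \mathcal{L}^{\text{ex+in}}(\theta, \phi)$ (train model with augmented reward)
  • Update reward: $\phi' = \phi + \gamma \nabla_\phi \mathcal{L}^{\text{ex}}(\theta'(\phi))$ (new model $\theta'$ gets higher extrinsic reward)
• Map to data manipulation
  • Update model: $\theta' = \theta + \gamma \nabla_\theta \mathcal{L}_{\text{train}}(\theta, \phi)$ (train model on manipulated data)
  • Update data: $\phi' = \phi + \gamma \nabla_\phi \mathcal{L}_{\text{val}}(\theta'(\phi))$ (new model $\theta'$ gets better validation-set performance)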
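The two-step update on this slide can be traced on a toy problem small enough to differentiate by hand. The sketch below is an illustrative assumption, not the authors' implementation: the model is a single scalar θ estimating a mean, the training objective is a weighted negative squared error, the weights are taken as softmax(φ), and the gradient of the validation objective with respect to φ is derived analytically through θ′(φ) rather than by automatic differentiation. All names and hyperparameters are hypothetical.

```python
import math


def softmax(phi):
    m = max(phi)
    exps = [math.exp(p - m) for p in phi]
    total = sum(exps)
    return [e / total for e in exps]


def learn_weights(train_x, val_x, steps=1000, gamma=0.5, lr_phi=1.0):
    theta = 0.0
    phi = [0.0] * len(train_x)
    val_mean = sum(val_x) / len(val_x)
    for _ in range(steps):
        w = softmax(phi)
        weighted_mean = sum(wi * xi for wi, xi in zip(w, train_x))
        # Update model: theta' = theta + gamma * dL_train/dtheta, where
        # L_train = -0.5 * sum_i w_i * (theta - x_i)^2, so the gradient
        # equals (weighted_mean - theta) because the weights sum to 1.
        theta_new = theta + gamma * (weighted_mean - theta)
        # Update data: chain rule through theta'(phi), with
        # L_val = -0.5 * mean_v (theta' - v)^2.
        dval_dtheta = val_mean - theta_new
        dtheta_dphi = [gamma * wi * (xi - weighted_mean)
                       for wi, xi in zip(w, train_x)]
        phi = [p + lr_phi * dval_dtheta * g for p, g in zip(phi, dtheta_dphi)]
        theta = theta_new
    return theta, softmax(phi)


# Training set with one outlier (5.0); the validation set is clean.
train_x = [1.0, 1.1, 0.9, 5.0]
val_x = [1.0, 1.05]
theta, weights = learn_weights(train_x, val_x)
```

In this toy setting the outlier's weight shrinks because lowering it moves the updated θ′ toward the validation mean, which is the behaviour the "Update data" step is meant to produce. A realistic model would need automatic differentiation to obtain the gradient through θ′(φ).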
Parameterization of Text Data Augmentation
Raw data: x = Food was good, but the service was very disappointing. y = neg
Random mask: x = Food [Mask] [Mask], but the service was very disappointing.
Augment data: x = Food tasted nice, but the service was very disappointing.
Fine-tune BERT to condition on the label y
[Hu et al., NeurIPS2019]
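The random-mask step of this pipeline can be sketched with the standard library alone. The helper below is hypothetical and only covers masking; filling the masked positions would require the label-conditional BERT described on the slide, which is not reproduced here. Positions are sampled uniformly without replacement, which is an assumption (the slide's example happens to mask two adjacent tokens).

```python
import random

MASK = "[Mask]"


def random_mask(tokens, num_masks, rng):
    """Replace num_masks randomly chosen tokens with the mask symbol."""
    positions = set(rng.sample(range(len(tokens)), num_masks))
    return [MASK if i in positions else tok for i, tok in enumerate(tokens)]


tokens = "Food was good , but the service was very disappointing .".split()
masked = random_mask(tokens, num_masks=2, rng=random.Random(0))
```

Passing an explicit `random.Random` instance keeps the masking reproducible, which helps when comparing augmentation strategies on a small training set.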
Low-Data Text Classification
[Hu et al., NeurIPS2019]
[Figure: bar charts of accuracy on SST-5, IMDB, and TREC for four methods: Base (BERT), Synonym, Fixed, Ours]
Train size: 40 per class; Val size: 2
Use a label-conditional BERT as the augmentation model $g_\phi(x \mid x^*, y)$
Class-Imbalance Image Classification (weighting)
[Hu et al., NeurIPS2019]
CIFAR10 binary image classification
[Figure: accuracy vs. imbalance ratio (20:1000, 50:1000, 100:1000) for Base (ResNet), Ours, Ren, and Proportion]