Autoencoders and Jet Physics - ias.ust.hkias.ust.hk/program/shared_doc/2019/201901hep... · W rks...
Transcript of Autoencoders and Jet Physics - ias.ust.hkias.ust.hk/program/shared_doc/2019/201901hep... · W rks...
Autoencoders and Jet Physics
Jennifer Thompsn Universität Heidelberg
Based n: QCD r What? T. Heimel, G. Kasieczka, T. Plehn, JT arXiv:1808.08979
IAS HKUST wrkshp11/01/2019
2
Motivation: New Physics Searches● Particle physics is becoming very data-driven
➔ Even more so at future colliders● Can make the most of this
➔ With model independent searches➔ With data driven methods➔ With anomaly detection
● Unsupervised machine learning arXiv:1808.0897%9, T. Heimel, G. Kasiecska, T. Plehn, JT
arXiv:1808.089792, M. Farina, Y. Nakai, D. Shih
arXiv:180%.102731v2 J. Hajer, Y-Y. Li, T. Liu, H. Wang
arXiv:1%08.02479, E.Metodiev, N. Nachman, J. Thaaler
● In this talk: Focus on autoencoders
SM events
HiddenBSM
“Background” events
Anomaly detector
3
Autoencoder Anomaly Search
Idea: Train a network to detect generic anomalies
0 10 20 30
0
5
10
15
20
25
30
3510 7
10 6
10 5
10 4
10 3
10 2
p T[G
eV]
S it’s used t seeing this22 but nt this
Average QCD image
Preprocessing from arXiv:1803.00170% S. Macaluso and D. ShihImage credit: Michel Luchman
4
Search Strategy
Idea: Train a network to detect generic anomalies
0 10 20 30
0
5
10
15
20
25
30
3510 7
10 6
10 5
10 4
10 3
10 2
p T[G
eV]
Anomaly Detector
QCD What
5
Network Architecture
Apply (test)● Background + signal● Reconstruction
✔ Background
✗ Signal
Train● Background only● Compress to latent space● Learn reconstruction● Loss =
0 10 20 30
0
5
10
15
20
25
30
3510 7
10 6
10 5
10 4
10 3
10 2
p T[G
eV]
0 10 20 30
0
5
10
15
20
25
30
3510 7
10 6
10 5
10 4
10 3
10 2
p T[G
eV]
∑pixels
(kTinput
−kTout
)2
6
● So far we have considered images
● We also looked at constituent 4-vectors
● Implemented in with the CoLa/LoLa framework
● Diffeerent architecture● CoLa and LoLa layers at start● No convolutional layers
A Note on Data Format
0 10 20 30
0
5
10
15
20
25
30
3510 7
10 6
10 5
10 4
10 3
10 2
p T[G
eV]
arXiv:1707.089699v3 A. Butteer, G. Kasieczka, T. Plehn, and M. Russell
Cola LoLa
7
CoLa and LoLa Framework
kμ , i→~k μ , i=kμ , jC ji
CLa – simplifieed jet algrithm, trainable cmbinatins
LLa
~k μ , j→ k̂μ , j=(
m j2
pT , jw jmE Em
w jmd d jm
2 )
Traditinal
8
CoLa and LoLa Framework
kμ , i→~k μ , i=kμ , jC ji
CLa
LLa
~k μ , j→ k̂μ , j=(
m j2
pT , jw jmE Em
w jmd d jm
2 )
Traditinal Fr the AutencderMust be invertible
Nt invertible
~k μ , j=(
~E j~k 1 j~k 2 j~k 3 j
)→(~E j~k 1 j~k 2 j~k 3 j
√m j2)
9
Tops vs QCD
0 10 20 30
0
5
10
15
20
25
30
3510 7
10 6
10 5
10 4
10 3
10 2
p T[G
eV]
0 10 20 30
0
5
10
15
20
25
30
3510 7
10 6
10 5
10 4
10 3
10 2
p T[G
eV] Use public data set fr tps vs QCD
https://goo.gl/XGYju3
1t
● Pythia sample with Delphes detectr simulatin 14 TeV● 14 TeV hadrnic tps● 550GeV < pT , j<650GeV
0 100 200 300jet mass [GeV]
0.000
0.005
0.010
0.015
0.020
0.025
norm
alize
d di
strib
utio
n
QCDTop
0 20 40 60 80 100 120number of constituents
0.000
0.005
0.010
0.015
0.020
0.025
0.030
norm
alize
d di
strib
utio
n
QCD
Top
10
Thee Latent Space
0.0 0.2 0.4 0.6 0.8 1.0Signal efficiency S
101
102
Back
grou
nd re
ject
ion
1/B
40
50
60
70
80
90
100
110
120
Num
ber o
f con
stitu
ents
0.0 0.2 0.4 0.6 0.8 1.0Signal efficiency S
100
101
102
Back
grou
nd re
ject
ion
1/B
5791113151719212325
Bottl
enec
k siz
e
Very stable w.r.t number f cnstituents Plateau with btteleneck size
0 20 40 60 80 100 120number of constituents
0.000
0.005
0.010
0.015
0.020
0.025
0.030
norm
alized
distr
ibutio
n
QCD
Top
6 9 12 15 18 21 24Bottleneck size
0.74
0.76
0.78
0.80
0.82
0.84
0.86
0.88
Area u
nder
ROC c
urve
AUC
loss 1
2
3
4
5
valid
ation
loss
105
11
Test application: Tops vs QCD
0.0 0.2 0.4 0.6 0.8 1.0Signal efficiency S
100
101
102
103
Back
grou
nd re
ject
ion
1/B Constituents
Bottleneck 6AUC 0.93
ImagesBottleneck 32AUC 0.89
Cnstituents utperfrm
imagesWithut seeing tp events
AUC~0.9Supervised
Sees bth tp and QCD eventsAUC~0.98
12
Does the Network use Mass?
0 50 100 150 200 250 300jet mass [GeV]
0.000
0.002
0.004
0.006
0.008
0.010
norm
alize
d di
strib
utio
n
100%
70%
40%10%
5%
least QCD-like
0 100 200 300jet mass [GeV]
0.000
0.005
0.010
0.015
0.020
0.025
norm
alize
d di
strib
utio
n
QCDTopLw QCD jet mass
Thoe higher the jet mass, the less QCD-like✔ QCD jet-mass spectrum learnt
✗ Want t learn mre than jet mass
13
Decorrelating from the Jet Mass
We can use an adversary to decorrelate
● Introduce a second network
● Input = diffeerence between initial and finnal image ● Predict the (binned) jet mass●
Mass shaping
Smth distributi
n
Ladv=(~M (|kT ,i
adv−kT , i
auto|)−M )
2
Recnstructed mass
True mass
More adversarial training in particle physocsarXiv:180%.08%733 C. Englert, P, Galler, P. Harris, M. Spannowsky
14
Balancing Loss Functions Loss = : How do we set λ?
- Scan with the background-only hypothesis
- Want both loss functions to be of a similar size
0 50 100 150 200 250 300jet mass [GeV]
0.000
0.002
0.004
0.006
0.008
0.010
0.012
norm
alize
d di
strib
utio
n
= 1 10 3
AUC = 0.63
3% TopsQCD3% Tops(5% least QCD-like)QCD(5% least QCD-like)
0 50 100 150 200 250 300jet mass [GeV]
0.000
0.002
0.004
0.006
0.008
0.010
norm
alize
d di
strib
utio
n
= 5 10 4
AUC = 0.70
3% TopsQCD3% Tops(5% least QCD-like)QCD(5% least QCD-like)
0 50 100 150 200 250 300jet mass [GeV]
0.000
0.002
0.004
0.006
0.008
0.010
norm
alize
d di
strib
utio
n
= 1 10 4
AUC = 0.78
3% TopsQCD3% Tops(5% least QCD-like)QCD(5% least QCD-like)
Trade ff between perfrmance and mass-shaping
Lauto−λ Ladv
15
Adversarial Autoencoder
0.0 0.2 0.4 0.6 0.8 1.0Signal efficiency S
100
101
102
103
Back
grou
nd re
ject
ion
1/B
ConstituentsBottleneck 15
= 1 10 3
AUC 0.67
ImagesBottleneck 32
= 5 10 4
AUC 0.70
Wrks well fr images
Hard fr cnstituents:Explicit
mass-dependence
Typical area f interest
Fcus n images
16
Thee Adversarial Autoencoder in a Bump Hunt
1) Train on background
● Monte Carlo
● Data ➔ Background region
2) Decorrelate from jet mass
3) Bump hunt in jet mass
0 50 100 150 200 250 300jet mass [GeV]
0.000
0.002
0.004
0.006
0.008
0.010
norm
alize
d di
strib
utio
n
= 5 10 4
AUC = 0.70
3% TopsQCD3% Tops(5% least QCD-like)QCD(5% least QCD-like)
Queestin: can we train in the signal regin?
17
Training on the Signal Region
Train on 97% QCD, 3% tops
● Not all events encoded➔ Bottlleneck too small
● Network learns QCD➔ Dominant component
● Still see emergent peak
0 50 100 150 200 250 300jet mass [GeV]
0.000
0.002
0.004
0.006
0.008
0.010
norm
alize
d di
strib
utio
n
= 5 10 4
AUC = 0.65
3% TopsQCD3% Tops(5% least QCD-like)QCD(5% least QCD-like)
Train n any backgrund-dminated regin
18
Dark Showers
● Dark showers have many possible collider signatures
➔ Increased hadronic activity, displaced vertices,
● We consider a dark SU(3) symmetry
➔ Implemented in Pythia
● 2 benchmark points (fix5ed 200 GeV dark quark mass)
➔ 100 GeV dark meson mass
➔ 10 GeV dark meson mass
● Dark meson able to decay back to SM
●
ETmiss
575GeV < pT , j<625GeV
qv
�qv
q
�q
S
S
19
Dark Shower vs QCD
0 20 40 60 80 100 120number of constituents
0.000
0.005
0.010
0.015
0.020
0.025
norm
alize
d di
strib
utio
n
QCD m = 10GeV
m = 100GeV
0 50 100 150 200 250 300jet mass [GeV]
0.0000
0.0025
0.0050
0.0075
0.0100
0.0125
0.0150
0.0175
norm
alize
d di
strib
utio
n
QCD
m = 10GeV
m = 100GeV
100 GeV mesn mass = small mass peak
Similar t QCD
20
Dark Shower Results
0 50 100 150 200 250 300jet mass [GeV]
0.000
0.002
0.004
0.006
0.008
0.010
norm
alize
d di
strib
utio
n
5% least QCD-likefull sample
3% m = 100GeV3% m = 10GeVQCD
0.0 0.2 0.4 0.6 0.8 1.0Signal efficiency S
100
101
102
103
Back
grou
nd re
ject
ion
1/B
Adversarialm = 100GeV
Adversarialm = 10GeV
Non-Adversarialm = 100GeV
Non-Adversarialm = 10GeV
Discriminatin pwer in challenging prcesses
21
Conclusions
● Autoencoders allow for ● model-independent searches● training entirely on data
● Adversarial autoencoders allow jet mass-decorrelation● Bump hunt in jet mass
● See discrimination power in● Tops vs QCD● Dark showers