Slides: clgiles.ist.psu.edu/IST597/materials/slides/lect11/ororbia-lra-bio-inspired-talk.pdf

Local Representation Alignment: A Biologically Motivated Algorithm for Training Neural Systems

Alexander G. Ororbia II

The Neural Adaptive Computing (NAC) Laboratory

Rochester Institute of Technology


Collaborators

• The Pennsylvania State University
  • Dr. C. Lee Giles
  • Dr. Daniel Kifer
• Rochester Institute of Technology (RIT)
  • Dr. Ifeoma Nwogu (Computer Vision)
  • Dr. Travis Desell (Neuro-evolution, distributed computing)
• Students
  • Ankur Mali (PhD student, Penn State, co-advised w/ Dr. C. Lee Giles)
  • Timothy Zee (PhD student, RIT, co-advised w/ Dr. Ifeoma Nwogu)
  • Abdelrahman Elsiad (PhD student, RIT, co-advised w/ Dr. Travis Desell)


Objectives

• Context: Credit assignment & algorithmic alternatives
  • Backpropagation of errors (backprop)
  • Feedback alignment algorithms
  • Target propagation (TP)
  • Contrastive Hebbian learning (CHL)
  • Equilibrium propagation (EP)
  • Contrastive Divergence (CD)
• Discrepancy Reduction – a family of learning procedures
  • Error-Driven Local Representation Alignment (LRA/LRA-E)
  • Adaptive Noise Difference Target Propagation (DTP-σ)
• Experimental Results & Variations
• Conclusions


Backprop, CHL, LRA

SGD, Adam, RMSprop

MSE, MAE, CNLL

MNIST

MLP, AE, BM, RNN

MLP = Multilayer perceptron
AE = Autoencoder
BM = Boltzmann machine


Problems with Backprop

Global optimization, back-prop through whole graph.

• The global feedback pathway
  • Vanishing/exploding gradients
  • In recurrent networks, this is worse!
• The weight transport problem
• High sensitivity to initialization
• Activation constraints/conditions
  • Requires the system to be fully differentiable → difficulty in handling discrete-valued functions
  • Requires sufficient linearity → adversarial samples


Feedforward Inference

Illustration: forward propagation in a multilayer perceptron (MLP) to collect activities

(Shared across most algorithms, i.e., backprop, random feedback alignment, direct feedback alignment, local representation alignment)
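The inference pass that all of these algorithms share can be sketched as follows. This is a minimal NumPy illustration, not the speaker's code: the function name `forward`, the tanh nonlinearity, the omission of biases, and the `(n_prev, n_next)` weight layout are all assumptions made for the example.

```python
import numpy as np

def forward(x, weights, phi=np.tanh):
    """Run feedforward inference and keep every layer's activities.

    x       : (batch, n_in) input array
    weights : list of (n_prev, n_next) matrices (assumed layout, no biases)
    Returns the pre-activations hs and post-activations zs, with zs[0] = x.
    """
    zs, hs = [x], []
    for W in weights:
        h = zs[-1] @ W      # pre-activation of this layer
        hs.append(h)
        zs.append(phi(h))   # post-activation of this layer
    return hs, zs

# Toy usage: a 4-5-3 network on a batch of two inputs.
rng = np.random.default_rng(0)
sizes = [4, 5, 3]
weights = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
hs, zs = forward(rng.normal(size=(2, 4)), weights)
```

The stored `hs` and `zs` are what the credit assignment procedures later consume.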


Backpropagation of Errors


Conducting credit assignment using the activities produced by the inference pass


Pass error signal back through post-activations (get derivatives w.r.t. pre-activations)


Pass error signal back through (incoming) synaptic weights to get error signal transmitted to post-activations in layer below


Repeat the previous steps, layer by layer (recursive treatment of backprop procedure)
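The backprop steps walked through above can be put together in a short NumPy sketch. It assumes a tanh MLP without biases and a squared-error output loss; the function name `backprop_grads` and these choices are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def backprop_grads(x, y, weights):
    """Gradients of 0.5 * ||z_L - y||^2 for a tanh MLP (biases omitted)."""
    # Inference pass: collect the post-activations of every layer.
    zs = [x]
    for W in weights:
        zs.append(np.tanh(zs[-1] @ W))
    # Error signal at the output post-activations.
    e = zs[-1] - y
    grads = [None] * len(weights)
    for l in reversed(range(len(weights))):
        # Back through the post-activation: derivative w.r.t. pre-activations.
        d = e * (1.0 - zs[l + 1] ** 2)
        grads[l] = zs[l].T @ d
        # Back through the incoming weights: error for the layer below.
        e = d @ weights[l].T
    return grads
```

The loop body is the "repeat layer by layer" step: each iteration reuses the error handed down by the layer above.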


Random Feedback Alignment


Pass error signal back through post-activations (get derivatives w.r.t. pre-activations)


Pass error signal back through fixed, random alignment weights (replaces backprop’s step of passing error through transpose of feedforward weights)


Repeat previous steps (similar to backprop)
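As a rough sketch of random feedback alignment under the same assumptions as before (tanh MLP, no biases, squared-error output loss), the only change from backprop is the matrix that carries the error downward. The name `rfa_updates` and the `feedback` list layout are hypothetical.

```python
import numpy as np

def rfa_updates(x, y, weights, feedback):
    """Weight updates for a tanh MLP using random feedback alignment.

    weights[l]  : (n_l, n_{l+1}) feedforward matrix (biases omitted)
    feedback[l] : (n_{l+1}, n_l) fixed random matrix used in place of
                  weights[l].T; feedback[0] is never read
    """
    zs = [x]
    for W in weights:
        zs.append(np.tanh(zs[-1] @ W))
    e = zs[-1] - y
    updates = [None] * len(weights)
    for l in reversed(range(len(weights))):
        # Back through the post-activation, exactly as in backprop.
        d = e * (1.0 - zs[l + 1] ** 2)
        updates[l] = zs[l].T @ d
        if l > 0:
            # Fixed random alignment weights replace the weight transpose.
            e = d @ feedback[l]
    return updates
```

Passing `feedback[l] = weights[l].T` recovers the backprop gradients, which makes the single point of difference easy to see.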


Direct Feedback Alignment


Pass error signal back through post-activations (get derivatives w.r.t. pre-activations)


Pass error signal along first set of direct alignment weights to second layer


Pass error signal along next set of direct alignment weights to first layer


Treat the signals propagated along direct alignment connections as proxies for error derivatives and run them through post-activations in each layer, respectively
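The direct feedback alignment steps above might look like this in NumPy. Again this is only a sketch for a tanh MLP without biases and with a squared-error output loss; `dfa_updates` and the layout of the `direct` list are assumptions made for illustration.

```python
import numpy as np

def dfa_updates(x, y, weights, direct):
    """Weight updates for a tanh MLP using direct feedback alignment.

    weights[l] : (n_l, n_{l+1}) feedforward matrix (biases omitted)
    direct[l]  : (n_out, n_{l+1}) fixed random matrix that carries the
                 output signal straight to hidden layer l+1
                 (one entry per hidden layer)
    """
    zs = [x]
    for W in weights:
        zs.append(np.tanh(zs[-1] @ W))
    L = len(weights)
    # Output error passed back through the output post-activation.
    d_out = (zs[-1] - y) * (1.0 - zs[-1] ** 2)
    updates = [None] * L
    updates[L - 1] = zs[L - 1].T @ d_out
    for l in range(L - 1):
        # Direct alignment weights deliver a proxy error derivative.
        proxy = d_out @ direct[l]
        # Run the proxy through this layer's post-activation.
        d = proxy * (1.0 - zs[l + 1] ** 2)
        updates[l] = zs[l].T @ d
    return updates
```

Unlike backprop and random feedback alignment, no hidden layer waits on the layer above it, so the hidden-layer updates could in principle be computed in parallel.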


Side-by-side comparison: Backpropagation of Errors, Random Feedback Alignment, Direct Feedback Alignment


Global versus Local Signals

Global optimization, back-prop through whole graph. Local optimization, back-prop through sub-graphs.


Global feedback pathway

Will these yield coherent models?


Equilibrium Propagation

Negative phase, Positive phase


The Discrepancy Reduction Family

• General process (Ororbia et al., 2017 Adapt)
  • 1) Search for latent representations that better explain input/output (targets)
  • 2) Reduce mismatch between currently “guessed” representations & target representations
    • Sum of internal, local losses (in nats) → total discrepancy (akin to “pseudo-energy”)
    • Coordinated local learning rules
• Algorithms
  • Difference target propagation (DTP) (Lee et al., 2014)
  • DTP-σ (Ororbia et al., 2019)
  • LRA (Ororbia et al., 2018; Ororbia et al., 2019)
  • Others – targets could come from an external, interacting process
    • NPC (neural predictive coding, Ororbia et al., 2017/2018/2019)
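One way to write the "sum of internal, local losses" is shown below. The notation is an assumption made for this note rather than the slides' own: $z_\ell$ is the guessed representation at layer $\ell$, $\hat{z}_\ell$ is its target, $\mathcal{L}_\ell$ is that layer's local loss, and $L$ is the number of layers.

```latex
D = \sum_{\ell=1}^{L} \mathcal{L}_\ell\left(z_\ell, \hat{z}_\ell\right)
```

Step 1 of the general process produces the targets $\hat{z}_\ell$, and step 2 adjusts each layer's parameters to lower its own term of $D$.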


Adaptive Noise Difference Target Propagation (DTP-σ)

Image adapted from (Lillicrap et al., 2018)

Figure labels: $z_L$, $\hat{z}_L$; $z_{L-1}$, $\hat{z}_{L-1}$; $g(z_L)$, $g(\hat{z}_L)$
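The quantities labeled in the figure are those of the difference correction used in difference target propagation (Lee et al., 2014): the target for the layer below is its own activity plus the difference between the decoded target and the decoded activity of the layer above. A minimal sketch follows; the function name and the toy linear decoder `g` are hypothetical, and the adaptive-noise component of DTP-σ is not shown.

```python
import numpy as np

def dtp_target_below(z_below, z_top, z_top_target, g):
    """Difference-corrected target for the layer below.

    g maps upper-layer activities back to the lower layer (a learned
    decoder in DTP; any callable works for this illustration).
    """
    return z_below + g(z_top_target) - g(z_top)

# Toy usage with a hypothetical linear decoder from 2 units to 3 units.
G = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
g = lambda z: z @ G
z_below = np.array([[0.5, -0.2, 0.1]])
z_top = np.array([[1.0, 2.0]])
z_top_target = np.array([[1.5, 2.0]])
target_below = dtp_target_below(z_below, z_top, z_top_target, g)
```

When the upper target equals the upper activity, the correction vanishes and the lower target is just the lower activity, whatever the quality of `g`.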


Error-Driven Local Representation Alignment (LRA-E)


Transmit error along error feedback weights, and error correct the post-activations using the transmitted displacement/delta


Calculate local error in layer below, measuring discrepancy between original post-activation and error-corrected post-activation


Repeat the past several steps, error-correcting each layer further down within the network/system
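The LRA-E steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the published algorithm: it uses a tanh MLP without biases, a squared-error discrepancy in place of the Cauchy loss, a hypothetical step size `beta`, and it leaves out any update of the error feedback weights themselves.

```python
import numpy as np

def lra_e_step(x, y, weights, err_weights, beta=0.1):
    """One LRA-E-style target computation for a tanh MLP (biases omitted).

    weights[k]     : (n_k, n_{k+1}) feedforward matrix
    err_weights[k] : (n_{k+1}, n_k) error feedback matrix; index 0 unused
    beta           : hypothetical step size for the error correction
    """
    zs = [x]
    for W in weights:
        zs.append(np.tanh(zs[-1] @ W))
    L = len(weights)
    targets = [None] * (L + 1)
    errors = [None] * (L + 1)
    targets[L] = y
    errors[L] = zs[L] - y                            # output-layer error
    for l in range(L, 1, -1):
        delta = errors[l] @ err_weights[l - 1]       # transmit error downward
        targets[l - 1] = zs[l - 1] - beta * delta    # error-corrected activity
        errors[l - 1] = zs[l - 1] - targets[l - 1]   # local discrepancy
    # Each layer's weights move its output toward its own local target.
    updates = [zs[k].T @ (errors[k + 1] * (1.0 - zs[k + 1] ** 2))
               for k in range(L)]
    return targets, errors, updates
```

Each update depends only on a layer's input, its own activity, and its own local error, so no derivative is chained through more than one layer.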


Optional…substitute & repeat!


Aligning Local Representations

• Credit assignment by optimizing subgraphs linked by error units

The Cauchy local loss:
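The slide's equation did not survive extraction. As an assumption, a common form of a Cauchy-style (log) penalty between an activity `z` and its target `y` is the sum over units of log(1 + (y - z)^2); the sketch below implements that form and its derivative with respect to `z`. The function names are hypothetical.

```python
import numpy as np

def cauchy_loss(z, y):
    """Assumed Cauchy-style local loss: sum of log(1 + (y - z)^2)."""
    return np.sum(np.log1p((y - z) ** 2))

def cauchy_grad(z, y):
    """Derivative of cauchy_loss with respect to z."""
    diff = z - y
    return 2.0 * diff / (1.0 + diff ** 2)
```

Compared with a squared error, the derivative is bounded, so a unit that is far from its target does not dominate the local error signal.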


Aligning Local Representations

• Credit assignment by optimizing subgraphs linked by error units, motivated/inspired by (Rao & Ballard, 1999)

There is more than one way to compute these changes


Some Experimental Results


Experimental Results

Example images: MNIST (digits 7, 3); Fashion MNIST (Trousers, Dress, Shirt)

(Ororbia et al., 2018 Bio)


Acquired Filters

Third-level filters acquired after a single pass through the data by a tanh network trained with a) backprop, b) LRA.


Visualization of Topmost Post-Activities


Angle between LRA, DFA, & DTP-σ against Backprop

Measuring Total Discrepancy in LRA-E
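For the angle comparison against backprop, one plausible way to measure the angle between two update directions is to flatten them and take the arccosine of their cosine similarity. This helper is an illustrative assumption about how such a plot could be produced, not the procedure used for the slide.

```python
import numpy as np

def update_angle_degrees(a, b):
    """Angle in degrees between two updates, treated as flat vectors."""
    a = np.ravel(a).astype(float)
    b = np.ravel(b).astype(float)
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    # Clip to guard against round-off pushing cos slightly outside [-1, 1].
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```

An angle below 90 degrees means the alternative update still has a positive component along the backprop gradient direction.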


Equilibrium Propagation (8 layers): MNIST: 59.03%, Fashion MNIST: 67.33%
Equilibrium Propagation (3 layers): MNIST: 6.00%, Fashion MNIST: 16.71%

Training Deep (& Thin) Networks

(Ororbia et al., 2018 Credit)


Training Networks from Null Initialization

Panels: LWTA, SLWTA

(Ororbia et al., 2018 Credit)


Training Stochastic Networks

(Ororbia et al., 2018 Credit)


If time permits…let’s talk about modeling time…


Training Neural Temporal/Recurrent Models

The Parallel Temporal Neural Coding Network (P-TNCN) (Ororbia et al., 2018)

(Ororbia et al., 2018 Continual)

• Integrating LRA into recurrent networks – result = Temporal Neural Coding Network


Removing Back-Propagation through Time!

• Each step in time entails: 1) generate hypothesis, 2) error correction in light of evidence


Conclusions

• Backprop has issues, alignment algorithms fix one issue
  • Other algorithms such as DTP or EP are slow
• Discrepancy reduction
  • Local representation alignment
  • Adaptive noise difference target propagation (DTP-σ)
• Showed promising results, stable and performant compared to alternatives such as Equilibrium Propagation & alignment algorithms
  • Can work with non-differentiable operators (discrete/stochastic)
  • Can be used to train recurrent/temporal models too!


Questions?


References

• (Ororbia et al., 2018, Credit) -- Alexander G. Ororbia II, Ankur Mali, Daniel Kifer, and C. Lee Giles. “Deep Credit Assignment by Aligning Local Distributed Representations”. arXiv:1803.01834 [cs.LG].

• (Ororbia et al., 2018, Continual) -- Alexander G. Ororbia II, Ankur Mali, C. Lee Giles, and Daniel Kifer. “Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations”. arXiv:1810.07411 [cs.LG].

• (Ororbia et al., 2017, Adapt) -- Alexander G. Ororbia II, Patrick Haffner, David Reitter, and C. Lee Giles. “Learning to Adapt by Minimizing Discrepancy”. arXiv:1711.11542 [cs.LG].

• (Ororbia et al., 2018, Lifelong) -- Alexander G. Ororbia II, Ankur Mali, Daniel Kifer, and C. Lee Giles. “Lifelong Neural Predictive Coding: Sparsity Yields Less Forgetting when Learning Cumulatively”. arXiv:1905.10696 [cs.LG].

• (Ororbia et al., 2018, Bio) -- Alexander G. Ororbia II and Ankur Mali. “Biologically Motivated Algorithms for Propagating Local Target Representations”. In: Thirty-Third AAAI Conference on Artificial Intelligence.