1. Introductiondmqm.korea.ac.kr/uploads/seminar/Grad_CAM_191205... · 2019-12-06 · 2....

발표자 : 백인성

1. Introduction


출처: https://alexisbcook.github.io/2017/global-average-pooling-layers-for-object-localization/Cheng, Chi-Tung, et al. "Application of a deep learning algorithm for detection and visualization of hip fractures on plain pelvic radiographs." European radiology (2019): 1-9.

<고관절 골절 탐지>→ 병 원인 진단

<분류 모델 원인 해석>→ 예측 모델 결과에 대한 신뢰성


1. 예측 모델을 사용하는 현업자들에게 이해 시키기 쉽게

2. 구축한 예측 모델 결과가 타당함을 직관적으로 보이기 위해

3. 예측 결과에 대한 원인을 분석하고 향후 대처 할 수 있게


출처: Selvaraju, Ramprasaath R., et al. "Grad-cam: Visual explanations from deep networks via gradient-based localization."Proceedings of the IEEE International Conference on Computer Vision. 2017.

2. Convolutional Neural Network(CNN)


[Input Imgae][Convolutional Layer] [Neural Network]

토르 : 10%

아이언맨 : 80%

스파이더맨: 8%

캡틴아메리카: 2%


Red Channel

Green Channel

Blue Cannel

Convolution①

[Input Image]

Max-pooling Convolution②

Fully Connected


0 0 2 1 1

0 1 1 2 2

0 3 2 1 0

1 1 1 3 0

0 1 1 1 0

1 1 2 2 1

0 0 1 3 0

0 1 1 0 0

2 2 1 3 1

0 0 1 1 0

1 0 0 1 1

1 1 1 0 3

0 2 3 2 3

1 1 1 4 0

1 2 0 1 0

Red Channel

Green Channel

Blue Cannel

1 0 2

0 -1 1

-1 1 2

0 0 1

0 2 -1

0 1 0

-1 1 -2

0 0 1

2 0 0

채널 별 Filter(3x3)

*

*

*

11 4 3

3 10 6

7 9 -1

2 2 7

4 6 3

4 0 6

0 2 8

3 4 -2

-1 5 -7

13 8 18

10 20 7

10 14 -2

Pixel값 * Filter

채널 별 동일pixel값 합

최종 결과


0 0 2 1 1

0 1 1 2 2

0 3 2 1 0

1 1 1 3 0

0 1 1 1 0

1 1 2 2 1

0 0 1 3 0

0 1 1 0 0

2 2 1 3 1

0 0 1 1 0

1 0 0 1 1

1 1 1 0 3

0 2 3 2 3

1 1 1 4 0

1 2 0 1 0

Red Channel

Green Channel

Blue Cannel

1 0 2

0 -1 1

-1 1 2

0 0 1

0 2 -1

0 1 0

-1 1 -2

0 0 1

2 0 0

채널 별 Filter

*

*

*

11 4 3

3 10 6

7 9 -1

2 2 7

4 6 3

4 0 6

0 2 8

3 4 -2

-1 5 -7

13 8 18

10 20 7

10 14 -2

Pixel값 * Filter

0 × 1 + 0 × 0 + 2 × 2 + 0 × 0 + 1 × (−1)+ 1 × 1 + 0 × (−1) + 3 × 1 + 2 × 2 = 11


최종 결과


0 0 2 1 1

0 1 1 2 2

0 3 2 1 0

1 1 1 3 0

0 1 1 1 0

1 1 2 2 1

0 0 1 3 0

0 1 1 0 0

2 2 1 3 1

0 0 1 1 0

1 0 0 1 1

1 1 1 0 3

0 2 3 2 3

1 1 1 4 0

1 2 0 1 0

Red Channel

Green Channel

Blue Cannel

1 0 2

0 -1 1

-1 1 2

0 0 1

0 2 -1

0 1 0

-1 1 -2

0 0 1

2 0 0

채널 별 Filter

*

*

*

11 4 3

3 10 6

7 9 -1

2 2 7

4 6 3

4 0 6

0 2 8

3 4 -2

-1 5 -7

13 8 18

10 20 7

10 14 -2

Pixel값 * Filter


최종 결과

11 + 2 + 0


2 8 3

10 4 1

-3 1 -2

2 8 3

10 4 1

-3 1 -2

<Max Pooling> <Average Pooling>

10

10

8

4

6

3

4

1

→ Pooling 크기 내 최대값으로 요약 → Pooling 크기 내 평균값으로 요약

3. Class Activation Map(CAM)


출처: Zhou, Bolei, et al. "Learning deep features for discriminative localization." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.


Convolution① Max-pooling Convolution②

Fully Connected②

[Input Image]

Fully Connected①



Global Average Pooling


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

Global Average Pooling → 각 Feature별 평균 값을 구함

아이언맨

토르

스파이더맨

캡틴아메리카


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1


아이언맨

토르

스파이더맨

캡틴아메리카

예측


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

Weight①

Weight②

Weight③

Weight④

Weight⑤

Weight⑥



3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측



3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

Weight①

Weight②

Weight③

Weight④

Weight⑤

Weight⑥



2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

Weight①

Weight②

Weight③

Weight④

Weight⑤

Weight⑥

3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

Weight①

Weight②

Weight③

Weight④

Weight⑤

Weight⑥

분류 결과에 따라 CAM에 활용되는 Weight가 달라짐


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

각 Feature 별 heatmap을 그림

Weight①X

Weight②X

Weight③X

Weight④X

Weight⑤X

Weight⑥X

=

=

=

=

=

=

동일 위치Pixel별 합

아이언맨 얼굴을예측 원인으로 판단


0 1 6 1

2 1 -2 3

3 2 7 2

0 3 0 2

1 1 2 2 1

0 0 1 3 0

0 1 1 0 0

2 2 1 3 1

0 0 1 1 0

1 -1

0 2

*

2 6 6

3 7 7

3 7 7

2 6

3 7

Convolution① Max-pooling Convolution① Feature Map

0 1

1 -1

*

filter① Filter②

4. Grad_CAM



Fully Connected②

[Input Image]

Fully Connected①


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

Weight①

Weight②

Weight③

Weight④

Weight⑤

Weight⑥

𝑆𝑐 =

𝑘

𝑊𝑘𝑐 1

𝑍

𝑖

𝑗

𝐴𝑖,𝑗𝑘

• c = 예측 Class• 𝑊𝑘

𝑐 = c Class 예측하는 k번째Feature Map에 대한 weight

• 𝐴𝑘 = k번째 Feature Map

• 𝐴𝑖,𝑗 = Feature Map 내 i,j 위치 값

• Z = Feature Map 별 합

<C class에 대한 CAM Score>= C


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

Weight①

Weight②

Weight③

Weight④

Weight⑤

Weight⑥

𝑆𝑐 =

𝑘

𝑾𝒌𝒄 1

𝑍

𝑖

𝑗

𝐴𝑖,𝑗𝑘








3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

𝑆𝑐 =

𝑘

𝑾𝒌𝒄 1

𝑍

𝑖

𝑗

𝐴𝑖,𝑗𝑘







K=1

K=2

K=3

K=4

K=5

K=6

𝑾𝟏𝟐

𝑾𝟐𝟐

𝑾𝟑𝟐

𝑾𝟒𝟐

𝑾𝟓𝟐

𝑾𝟔𝟐


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

𝑆𝑐 =

𝑘

𝑊𝑘𝑐 1

𝑍

𝑖

𝑗

𝑨𝒊,𝒋𝒌

↓ 𝑨𝒊,𝒋𝒌







K=1

K=2

K=3

K=4

K=5

K=6

𝑊12

𝑊22

𝑊32

𝑊42

𝑊52

𝑊62


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

𝑆𝑐 =

𝑘

𝑊𝑘𝑐 𝟏

𝒁

𝒊

𝒋

𝑨𝒊,𝒋𝒌

↓ 𝐴𝑖,𝑗𝑘

K=1

K=2

K=3

K=4

K=5

K=6

↓ 𝟏

𝒁σ𝒊σ𝒋𝑨𝒊,𝒋

𝒌







𝑊12

𝑊22

𝑊32

𝑊42

𝑊52

𝑊62


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

2

6

3

1

4

1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

𝑊12


K=1

K=2

K=3

K=4

K=5

K=6

↓ 1

𝑍σ𝑖σ𝑗𝐴𝑖,𝑗

𝑘

𝑊22

𝑊32

𝑊42

𝑊52

𝑊62

𝑆𝑐 =

𝑘

𝑊𝑘𝑐 1

𝑍

𝑖

𝑗

𝐴𝑖,𝑗𝑘







만약 Global Average Pooling을사용하지 않는다면?


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

아이언맨

토르

스파이더맨

캡틴아메리카

예측

𝑊12


K=1

K=2

K=3

K=4

K=5

K=6

𝑊22

𝑊32

𝑊42

𝑊52

𝑊62

𝑆𝑐 =

𝑘

𝑊𝑘𝑐 1

𝑍

𝑖

𝑗

𝐴𝑖,𝑗𝑘







만약 Global Average Pooling을사용하지 않는다면?

→ Feature Map 별 Weight도사용하지 못함

→ Weight를 정의하는 방식을바꾸자

???


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

예측


K=1

K=2

K=3

K=4

K=5

K=6

Flatten

아이언맨

토르

스파이더맨

캡틴아메리카

𝑦1

𝑦2

𝑦3

𝑦4


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

예측


K=1

K=2

K=3

K=4

K=5

K=6

Flatten

𝑖

𝑗

𝜕𝑦𝑐

𝜕𝐴𝑖,𝑗𝑘

Gradient를 통해 기울기를 구하자

아이언맨

토르

스파이더맨

캡틴아메리카

𝑦1

𝑦2

𝑦3

𝑦4


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

예측


K=1

K=2

K=3

K=4

K=5

K=6

Flatten

𝑖

𝑗

𝜕𝑦𝑐


Gradient를 통해 기울기를 구하자

아이언맨

토르

스파이더맨

캡틴아메리카

𝑦1

𝑦2

𝑦3

𝑦4

𝑦𝑐 = 𝑊𝑥 + 𝑏

𝑦2 = 𝑊𝑥1,𝑦2𝑥1 + 𝑏

𝑥1

𝑊𝑥1,𝑦2


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1

예측


K=1

K=2

K=3

K=4

K=5

K=6

Flatten

Gradient를 통해 기울기를 구함

𝑆𝑐 =

𝑘

𝑊𝑘𝑐 1

𝑍

𝑖

𝑗

𝐴𝑖,𝑗𝑘

<C class에 대한 CAM Score>

𝐿𝐺𝑟𝑎𝑑−𝐶𝐴𝑀𝑐 = 𝑅𝑒𝐿𝑈

𝑘

𝑎𝑘𝑐𝐴𝑘

𝑎𝑘𝑐 =

1

𝑍

𝑖

𝑗

𝜕𝑦𝑐


아이언맨

토르

스파이더맨

캡틴아메리카

↓𝑦𝐶

>< C class에 대한Grad_CAM Score

><

𝑦1

𝑦2

𝑦3

𝑦4


3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1


Weight①X

Weight②X

Weight③X

Weight④X

Weight⑤X

Weight⑥X

=

=

=

=

=

=




3 2 0

-1 … 1

1 0 1

1 1 0

5 … 1

5 0 2

1 -1 0

-2 … 1

1 0 9

0 2 0

1 … 1

2 0 0

1 1 0

0 … 1

0 0 4

0 2 0

0 … 1

0 0 -1


1

𝑍

𝑖

𝑗

𝜕𝑦𝑐

𝜕𝐴𝑖,𝑗1X =

=

=

=

=

=



1

𝑍

𝑖

𝑗

𝜕𝑦𝑐

𝜕𝐴𝑖,𝑗2X

1

𝑍

𝑖

𝑗

𝜕𝑦𝑐

𝜕𝐴𝑖,𝑗3X

1

𝑍

𝑖

𝑗

𝜕𝑦𝑐

𝜕𝐴𝑖,𝑗4X

1

𝑍

𝑖

𝑗

𝜕𝑦𝑐

𝜕𝐴𝑖,𝑗5X

1

𝑍

𝑖

𝑗

𝜕𝑦𝑐

𝜕𝐴𝑖,𝑗6X


출처: https://jsideas.net/grad_cam/

5. Conclusion

감사합니다.

Appendix


출처: https://datascienceschool.net/view-notebook/e2d30432911f4498b873232dd7d99cce/


출처: https://towardsdatascience.com/counting-no-of-parameters-in-deep-learning-models-by-hand-8f1716241889

https://jsideas.net/grad_cam/

https://you359.github.io/cnn%20visualization/GradCAM/

https://hugrypiggykim.com/2018/03/28/grad-cam-gradient-weighted-class-activation-mapping/

https://jsideas.net/grad_cam/

https://you359.github.io/cnn visualization/GradCAM/

https://hugrypiggykim.com/2018/03/28/grad-cam-gradient-weighted-class-activation-mapping/

1. Introductiondmqm.korea.ac.kr/uploads/seminar/Grad_CAM_191205... · 2019-12-06 · 2....

Documents

Transcript of 1. Introductiondmqm.korea.ac.kr/uploads/seminar/Grad_CAM_191205... · 2019-12-06 · 2....