최신인공지능기술연구동향과미래전망 그리고미래를대비한네이버의 · 8....

최신인공지능기술연구동향과미래전망, 그리고미래를대비한네이버의 AI

Jung-Woo Ha, PhD

Research Head, Clova AI, NAVER & LINE

5 November 2019

Contents

• 최근 AI 연구 및 기술 동향

• AI 기술/서비스의 미래 키워드

• NAVER의 AI – Clova

2

NAVER as a Global Player

3

최근 AI연구및기술동향

Powerful Pretrained Models

5

Backbone Heads

Object detectorInput Output

Classification, Segmentation, Pose, Face, GM, …

Powerful Pretrained Models (Computer Vision)

• ResNet, ResNeXt, WideResNet

• DenseNet, AmoebaNet with Gpipe, and EfficientNet

• InstagramNet

• Squeeze-Excitation

• Optimization

• Regularization

• Data augmentation

• Neural architecture search

6

[Tan & Le. ICML 2019]

Powerful Pretrained Models (NLP)

• W2V [Mikolov et al. 2013]

• GloVe [Pennington et al. 2014]

• ELMo [Peters et al. 2018]

• BERT [Devlin et al. 2018]• RoBERTa [Liu et al. 2019]

• ELECTRA [Anonymous 2019]

• GPTv2 [Radford et al. 2019]

• XLNet [Yang et al. 2019]

• ERNIE 2.0 [Sun et al. 2019]

• LaRva [Lee et al. 2019]

7

[Google AI Blog 2018]

Written by AI (OpenAI GPTv2)

8

How it can?

• More Parameters (Neural Network)

• More Data

• More GPUs

9

AutoML

• AutoML?• Automatic machine learning• Too large search space for finding good models• AutoML Workshop @ ICML

• Hyperparameter Optimization• Hyperband [Li et al. 2016]• BOHB [Falkner et al. 2018]

• Neural Architecture Search (NAS)• NAS with RL [Zoph & Le 2016]• PathNet [Fernando et al. 2017]• DARTS [Liu et al. 2019]• ProxylessNAS [Cai et al. 2019]

• ETC• AutoAugment [Cubuk et al. 2018]• Fast AutoAugment [Lim et al. 2019]

10

Lightweight

• Powerful? Excellent but…..

• Practical?• Many GPU

• Inference

• ROI

• On-device

• Lightweight !!!

11

Lightweight

• Small models for CNN backbones• MobileNet V1 / V2 / V3 [Howard et al. 2017 / Sandler et al. 2018 / Howard et al. 2019]• ShuffleNet V1 / V2 [Zhang et al. 2017 / Ma et al. 2018]• Han et al. 2019

• Distillation• Efficient knowledge transfer-based training for small models• ResNet-152 ResNet-50 // ResNet-50 MobileNet• Hinton 2015• Margin ReLU [Heo et al. 2019]

• Lightweight BERT• DistilBERT [Sanh et al. 2019]• PKD [Sun et al. 2019]• MobileBERT [Anonymous 2019]

• Face: EXTD [Yoo et al. 2019]

• NAS: model size & latency

12

AI 기술/서비스의미래키워드

Current AI

• AlphaGo (2016) AlphaFold (2018) AlphaStar* (2019)

14

AI: Beyond Technology

• Innovation itself

• New services

• New products

• New values: Enhance conventional things• Manufacturing

• Process

• Cost reduction

• Efficiency

• Embedded to enhance

15

Personalized

• AI assistant

• Each user-customized service

• Content

• Advertisement

• Interface: voice, gesture, and thinking

16

Small Data

• Many data but small labeled data

• Few shot learning

• Golden samples

• Active learning

• Self-supervised or semi-supervised

• Unsupervised

• Augmentation

• Generative Models

17

On-Device

• Edge computing – 5G

• On-device inference

• On-device training

• Low power

• Lightweight

• AI Chip

18

https://www.youtube.com/watch?v=RjMS15V_7nQ

https://www.youtube.com/watch?v=RjMS15V_7nQ

Data Privacy

• On-device

• Personal data Sensitive data, Privacy

• Federated learning• https://venturebeat.com/2019/06/03/how-federated-learning-could-shape-the-

future-of-ai-in-a-privacy-obsessed-world/

• Google FL WS: https://sites.google.com/view/federated-learning-2019/presentations

• Nvidia: https://venturebeat.com/2019/10/13/nvidia-uses-federated-learning-to-create-medical-imaging-ai/

• Secure knowledge transfer

19[https://ai.googleblog.com/2017/04/federated-learning-collaborative.html]

https://venturebeat.com/2019/06/03/how-federated-learning-could-shape-the-future-of-ai-in-a-privacy-obsessed-world/

https://sites.google.com/view/federated-learning-2019/presentations

https://venturebeat.com/2019/10/13/nvidia-uses-federated-learning-to-create-medical-imaging-ai/

https://ai.googleblog.com/2017/04/federated-learning-collaborative.html

Human vs. AI? Human with AI!!!

• Labeling with active learning• Easy data Auto labeling by AI

• Difficult data by Human

• Contact center AI• 단순한 질의는 AI로

• 정신노동에 가까운 응대도 AI로

• 구체적 설명이 필요한 것은 사람담당자가

20

네이버의 AI – Clova 중심으로

NAVER & LINE

22

AI

AI of NAVER & LINE

23

NAVER

Korea No.1 Internet Platform Company

LINE

Global Messenger Platform Company

Based on Japan

Clova

Vision of Clova: AI for Everyone

Innovative AI

Technologies

Practical &

Valuable AI

Services

B2B

B2CBetter World

24

Clova AI Core Technologies to Services

25

Deep

Learning

Speech

NLP

Recomme-

ndation

ML

Platform

Computer

Vision

OCR

SpeechRecognition

Chatbot

FaceSDK

Voice Synthesis

AI ARS

NSML

CutMix: New Robust Data Augmentation

• Data augmentation: Core techs for enhancing backbone performances

• Improvement, robustness, transferability, and computational efficiency

26

[Yun et al. ICCV 2019 (Oral)]

CutMix: New Robust Data Augmentation

27

[Yun et al. ICCV 2019 (Oral)]

Distillation: Student Better Than A Teacher

Our approach: Margin-ReLU and located at pre-ReLU

28

[Heo et al. ICCV 2019]

Distillation: Student Better Than A Teacher

29

[ImageNet-1k] [Other Tasks: Object Detection & Segmentation]

[Heo et al. ICCV 2019]

Lightweight: Change Your Backbone

30

[Classification (ImageNet-1k)]

Models Top-1 err (%)

Params(M)

Flops(G)

MobileNet_v2 28.0 3.51 0.33

ShuffleNet_v2 27.4 3.50 0.30

MNasNet 26.0 4.2 0.32

ClovaAI-A 22.8 3.99 0.75

ClovaAI-B 23.6 3.76 0.42

ResNet-50 24.7 23.5 4.12

ResNet-101 23.6 42.5 7.85

[Detection (MSCOCO)]

Models mAP(%) Params(M)

MobileN_v1 + SSDLite

22.2 5.1


22.1 4.3


22.0 4.97

ClovaAI 23.5 4.8

SSD300 23.2 36.1

YOLOv2 21.6 50.7

Models mAP(%) Params(M)

S3FD+VGG 85.9 22

MobileFaceNet 74.1 1.2

PyramidBox 88.7 22

EXTD(ClovaAI)

86.6 0.16

[Face (MSCOCO)]

• Lightweight backbone as a pretrained model

• Object detector, OCR, FaceSDK, pose estimator, video recognition, etc.

• See the details at DEVIEW 2019

EXTD: EXtremely Tiny face Detector

Recurrent usages of convolution blocks + FPN (os SSD)

31

[Yoo et al. Arxiv 2019]

EXID via EXTD

32

[EXTD: 0.16M] [MobileFaceNet: 1.5M]

0.1초 얼굴 인식

33

Detection in OCR

CRAFT: Character region awareness for text detection

34

[Baek et al. CVPR 2019]

OCR Challenges

35

OCR Services

36

[디지털타임즈, 2019.7.16]

손글씨 폰트 생성

37

SOTA Image Retrieval in Smart Lens

38

[Jeon et al. Arxiv 2019]

Merchandise

(recall@1 vs Alibaba 79%)86%

Video Intelligence

39

Clique – Interactive 3D Avatar

40

Clique – Interactive 3D Avatar

41

HDTS: SOTA Speech Synthesis

42

Required recorded voice data: 10hrs 4hrs 40mins[E. Song et al. ArXiv 2018, Yamamoto et al. Interspeech 2019]

화상 회의록 요약

43

Audio-Visual Speech Enhancement

44

• Speech separation given lip regions in the video[Afouras et al. Interspeech 2018, Chung et al. ICASSP 2019 & Interspeech 2019]

LaRva: Language Representation by Clova

• BigLM based on BERT

• New task definition• GLUE vs. KLUE and JLUE

• New encoding, corpus-level curriculum learning, n-gram masking

• Distributed learning for LaRva

• Fast LaRva

45

LaRva: Language Representation by Clova

• Giant-scale general-purpose language model by improving BERT

• Not Multi-lingual BERT but LaRva

46

[WikiSQL] [KorQuAD]

DUET to AI Call of NAVER Glace

47

NAVER AI @ DEVIEW 2019 with Mr. President

48

Project 1784

49

Global AI R&D Belt

50

NAVER Smart Machine Learning (NSML)

NSML: ML research platform that enables you to focus on your model!!

51

[Sung et al. MLSYS 2017@NIPS 2017]

NAVER Smart Machine Learning (NSML)

52

쉬운 설치와 실행

53

기술보다 사용자 경험 중심의 설계

• AI Researcher, Engineer의 삶에 도움이 되는 설계• AI Researcher, AI Engineer가 더 나은 모델을 만들어 내도록 설계

• 시스템 사용 경험을 통해 Insight를 얻을 수 있도록

• AI Team에 도움이 되는 설계• 문제와 연구자를 Decouple하여 업무 환경 유동성 확보

54

Meeting with NSML

55

효율적/효과적 자원배분

56

리더보드: 데이터/문제 중심 협업의 핵심

57

AutoML

58

AutoML: 연구+프러덕트화 효율성 극대화

실제 OCR 적용 케이스

59

Active Learning 중심의 데이터 레이블링

60

Achievements of AI Research

AI 연구실적 발표: NeurIPS, ICML, ICLR, CVPR, I(E)CCV, ACL, EMNLP, AAAI, IJCAI, ICASSP, Interspeech, KDD, …

61

1 3 5

15

26

2015 2016 2017 2018 2019

Top AI 학회 논문 발표

NSML

7. 풍부한 해커톤 경험

62

Selected Publications of Clova AI in 2019

63

1. Song et al. Hierarchical Context enabled Recurrent Neural Network for Recommendation, AAAI 2019.

2. Park et al. Adversarial Dropout for Recurrent Neural Networks, AAAI 2019.

3. Park et al. Paraphrase Diversification using Counterfactual Debiasing, AAAI 2019.

4. Heo et al. Knowledge Distillation with Adversarial Samples Supporting Decision Boundary, AAAI 2019.

5. Heo et al. Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons, AAAI 2019.

6. Oh et al. Modeling Uncertainty with Hedged Instance Embeddings, ICLR 2019.

7. Gu et al. DialogWAE: Multimodal Response Generation with Conditional Wasserstein Auto-Encoder, ICLR 2019.

8. Lee et al. Large-Scale Answerer in Questioner's Mind for Visual Dialog Question Generation, ICLR 2019.

9. Kim et al. Curiosity-Bottleneck: Exploration By Distilling Task-Specific Novelty, ICML 2019.

10. Chung et al. Perfect match: Improved cross-modal Embeddings for audio-visual synchronisation, ICASSP 2019

11. Moon et al. Domain Mismatch Robust Acoustic Scene Classification Using Channel Information Conversion, ICASSP 2019

12. Baek et al. Character Region Awareness for Text Detection, CVPR 2019.

13. Seo et al. Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index, ACL 2019.

14. Chung et al. Who said that?: Audio-visual speaker diarisation of real-world meeting, Interspeech 2019.

15. Afouras et al. My lips are concealed: Audio-visual speech enhancement through obstructions, Interspeech 2019.

16. Yamamoto et al. Probability Density Distillation with Generative Adversarial Networks for High-quality Parallel Waveform Generation, Interspeech 2019.

17. Hwang et al. Parameter enhancement for MELP speech codec in noisy communication environment, Interspeech 2019.

18. Kim et al. Tripartite Heterogeneous Graph Propagation for Large-scale Social Recommendation, RECSYS 2019.

19. Yun et al. CutMix: CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features, ICCV 2019.

20. Baek et al. What is wrong with scene text recognition model comparisons? dataset and model analysis, ICCV 2019.

21. Yoo et al. Photorealistic Style Transfer via Wavelet Transforms, ICCV 2019.

22. Heo et al. A Comprehensive Overhaul of Feature Distillation. ICCV 2019.

23. Cho et al. Mixture Content Selection for Diverse Sequence Generation, EMNLP 2019.

24. Chen et al. NL2pSQL: Generating Pseudo-SQL Queries from Under-Specified Natural Language Questions, EMNLP 2019.

25. Kim. Subword Language Model for Query Auto-Completion. EMNLP 2019.

26. Lee et al. Unpaired Sketch-to-Line Translation via Synthesis of Sketches. SIGGRAPH Asia 2019.

Send Your CV Today!

64

[email protected]

[Clova AI Github] [Facebook Clova AI Research] [Clova AI Tech Blog]

mailto:[email protected]

최신인공지능기술연구동향과미래전망 그리고미래를대비한네이버의 · 8....

Documents

Transcript of 최신인공지능기술연구동향과미래전망 그리고미래를대비한네이버의 · 8....