GIT Web Talk...NumPy Pandas Matplotlib Other Libraries Container Application App Deps and Libs Base...
Transcript of GIT Web Talk...NumPy Pandas Matplotlib Other Libraries Container Application App Deps and Libs Base...
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.1
일시
주최및협력
2020.09.17 (목) 14:00 ~ 15:00
제32회GIT Web Talk
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.2
AI/ML 주요개념
Artificial Intelligence (AI)Machines imitating intelligent human behavior. The term AI is primarily used by the business community.지능적인 인간 행동을 모방하는 기계
Machine Learning (ML)Subset of AI, gives computers the ability to learn without being explicitly programmed. The term ML is primarily used by the technical community. AI의 부분집합으로, 컴퓨터가 명시적으로 프로그래밍하지 않고도 학습할 수있음
Deep Learning (DL)Subset of ML, uses layers to progressively extract higher level features from the raw input. Applications include computer vision, image recognition, etc.ML의 부분집합으로, 레이어를 사용하여 raw input에서 더 높은 레벨의feature를 점진적으로 추출합니다.
MachineLearning
DeepLearning
Artificial Intelligence
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.3
영역 별 AI/ML의 기대 효과
●진단 효율성 증대
●빠르고 더 나은 진단
●더 나은 진료와 결과
Healthcare
●자동화된 승인 프로세스
●실제 사례 기반 보험 서비스 설계
Insurance
●더 나은 고객 접근성과
경험 제공
●네트워크 성능 개선
●장애와 위협 감지
Telcos
●더 개인화된 서비스
●리스크 분석 효율화
●오류, 사기 방지
●예측 적중률 증가
Financial services
●자율 주행
●예측된 차량 점검
●개선된 공급 체계
Automotive
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.4
The intelligent application journey
The intelligent application journey
•[Source] https://www.gartner.com/smarterwithgartner/top-trends-on-the-gartner-hype-cycle-for-artificial-intelligence-2019/
ThinkingAbout It
EvaluatingOptions
Proof of Concept
/ PilotProductionExploring
Options
You Are Here
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.5
Intelligent applications are distributed systems
•The model is a tiny part that gets a lot of focus
configurationdata collection
feature extractionprocess management
analysis tools
monitoring
serving infrastructure
machine resource managementdata verification
(Adapted from Sculley et al., “Hidden Technical Debt in Machine Learning Systems.” NIPS 2015)
Model Code
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.6
AI/ML 라이프사이클과 주요 요소
Business leadership
Data engineer
Data scientists
App developer
IT operations
Set goals
Gather and prepare data
Develop ML model
Deploy ML models in app dev pr
ocess
Implement Apps &
inference
ML models Monitoring &Management
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.7
The ML lifecycle
feature engineering model training and tuning
data collection and cleaning
modelvalidation
modeldeployment
monitoring,validation
codifying problem and metrics
codifying problem and metrics feature engineering model training
and tuningmodel
validationdata collection and cleaning
modeldeployment
monitoring,validation
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.8
You have seen this before
feature engineering model training and tuning
data collection and cleaning
modelvalidation
modeldeployment
monitoring,validation
codifying problem and metrics
codifying problem and metrics feature engineering model training
and tuningmodel
validationdata collectionand cleaning
modeldeployment
monitoring,validation
feature engineering model training and tuning
data collection and cleaning
modelvalidation
modeldeployment
monitoring,validation
codifying problem and metrics
bugfix or feature request develop test UATarchitecture
design production monitoring
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.9
The data scientist's domain
feature engineering model training and tuning
data collection and cleaning
modelvalidation
modeldeployment
monitoring,validation
codifying problem and metrics
codifying problem and metrics
feature engineering
model training and tuning
modelvalidation
data collection and cleaning
modeldeployment
monitoring,validation
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.10
The process is iterative
feature engineering model training and tuning
data collection and cleaning
modelvalidation
modeldeployment
monitoring,validation
codifying problem and metrics
codifying problem and metrics
feature engineering
model training and tuning
modelvalidation
data collection and cleaning
modeldeployment
monitoring,validation
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.11
Jupyter(Hub/Notebooks) in data science
feature engineering model training and tuning
data collection and cleaning
modelvalidation
modeldeployment
monitoring,validation
codifying problem and metrics
codifying problem and metrics
feature engineering
model training and tuning
modelvalidation
data collection and cleaning
modeldeployment
monitoring,validation
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.12
Everyone does the same thing… differently
feature engineering
model training and tuning
modelvalidation
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.13
Pooling and Sharing Resources with JupyterHub on OpenShift
Jupyter Hub DATA SCIENTIST
DATA SCIENTIST
DATA SCIENTIST
DATA SCIENTIST DATA SCIENTIST DATA SCIENTIST
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.14
Two primary bottlenecks
feature engineering model training and tuning
data collection and cleaning
modelvalidation
modeldeployment
monitoring,validation
codifying problem and metrics
codifying problem and metrics
feature engineering
model training and tuning
modelvalidation
data collection and cleaning
modeldeployment
monitoring,validation
●Getting started● Getting into production
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.15
The two bottlenecks
Getting Started:
• Self service• Repeatable/shareable environments• Repeatable/shareable experiments• Access to specialty resources (GPU, etc)
Getting into production:
• Self service• Ease of instrumentation• Access to specialty resources (GPU, etc)• (Automated) scale-out
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.16
Containers for models
Python 2.x, 3.x
NumPyPandasMatplotlibOther Libraries
Container
Application
App Deps and Libs
Base Image● Simpler, lighter, and denser than VMs● Base image guarantees starting consistency● Package apps with all dependencies● Reusable, portable, shareable as a unit
Model
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.17
What is DevOps?
CODE
BUILD TEST DEPLOY
MONITORREVIEW
Self-service provisioning
Automatedbuild & deploy
CI/CDpipelines
Consistentenvironments
Configuration management
App logs & metrics
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.18
One platform for all the domains
CODE
BUILD TEST DEPLOY
MONITORREVIEW
data engineering
CODE
BUILD TEST DEPLOY
MONITORREVIEW
data science
CODE
BUILD TEST DEPLOY
MONITORREVIEW
appdevelopment
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.19
AI/ML 도전과제
불확실한 수익은 초기 투자를
확보하기 어렵게 만들 수 있습니다.
명확하지 않은 ROI
많은 양의 데이터가 수집되지만 올바른 데이터를 찾고 준비하는 것은
어렵습니다.
쉽게 사용할 수 있는 데이터 부족
느리고 수동으로 조작하는 격리된제어체계로 인해 신속하게 구현할
수 없습니다.
격리된 제어 체계
핵심 기술이 부족하여 운영을 유지할 인재를 찾고 확보하기 어렵습니
다.
인재 부족
인프라의 빠른 가용성 확보가 어렵고 레거시 시스템이 발목을 잡습니
다.
준비되지 않은 인프라
AI/ML PlatformRed Hat has integrations and joint solutions with key AI/ML software and hardware ecosystems to ensure customer success.
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.21
분야별 도전과제
Set goals
Gather and prepare data
Develop ML model Deploy ML models in app dev process
Implement Apps & inference
ML models Monitoring & Management
데이터 수집 데이터 복제 데이터 관리 아키텍처 설계
데이터 접근/무결성
데이터엔지니어/개발자와의협업
적절한 툴의 선택과 배포 IT운영으로부터의 독립성 설계
하드웨어 성능 최적화
개발 프로세스배포 및 설계 학습, 테스트, 재학습에 대한모델 설계
모델과 앱 코드간 충돌 분석과 설계
API 구축
분리된 ML 워크플로우 관리 개발 프로세스에 배포 반복적 재학습과 재배포에 대한 조절
보안, 성능 관리 호환성 낮은 여러 도구에 대한 지원
느리고 오래된 IT 인프라와프로세스 관리
Data engineer
Data scientists
App developer
IT operations
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.22
컨테이너와 쿠버네티스를 AI/ML에 사용하려는 이유
AI/ML 솔루션 스택의 자동 확장 및
고 가용성
자동화 된 컴퓨팅 리소스 관리로
신속하게 대응
Agility Scalability
필요에 따라 AI/ML 환경을 프로비저닝
Flexibility
데이터 센터, 엣지 및 퍼블릭 클라우드에서ML 모델을 일관되게 개발 및 배포
Portability
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.23
ML 아키텍처 컨셉
ML Software Tools (e.g. TensorFlow, Jupyter Notebooks, Python, etc.)
Infrastructure
ML data services - databases (SQL, NoSQL, etc.), data lake, etc.
Compute acceleration (GPU, FPGA, TPU)
Hybrid, multi cloud platform with self service capabilities
Set goals
Gather and prepare data
Develop ML model Deploy ML models in app dev process
Implement Apps & inference
ML models monitoring & management
Virtual Private Public Hybrid EdgePhysical
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.24
ML 아키텍처와 레드햇 포트폴리오
Set goals
Gather and prepare data
Develop ML model Deploy ML models in app dev process
Implement Apps & inference
ML models monitoring & management
Software-definedInfrastructure
Data Services - Databases (SQL, NoSQL, etc.), Data Lake, etc.
Compute Acceleration (GPU, FPGA, TPU)
Hybrid, Multi Cloud Platform with self service capabilities
ML Software tools
TECHNOLOGY PARTNER
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.25
인텔리전트 애플리케이션
고객이지능형응용프로그램을구축하고자체핵심비즈니스에 AI를 적용할수있습니다.
• 고객이 Red Hat의 지능형 플랫폼 위에서 지능형 애플리케이션을 구축하
고 운영 할 수 있도록 지원
• Red Hat의 미들웨어 제품은 일련의 핵심 기능을 제공
• 광범위한 오픈 소스 구성 요소
• 서비스 카탈로그를 통해 통합 된 Red Hat의 광범위한 ISV 에코 시스템
제공
RED HAT MIDDLEWARE PRODUCT FUNCTIONAL COMPONENT
RED HAT DATA GRID IN-MEMORY PIPELINE STORAGE
RED HAT FUSE DATA TRANSFORMATION, SERVICE INTEGRATION
RED HAT AMQ DATA INGESTION, EVENT DISTRIBUTION
RED HATDECISION MANAGER*
RULES & MODEL EXECUTION, COMPLEX EVENT PROCESSING
RED HATPROCESS AUTOMATION MANAG
ER
BUSINESS PROCESS AUTOMATION
RED HAT 3SCALE API MANAGEMENT
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.26
NVIDIA GPU IN OPENSHIFT
프로덕션 환경에서 컨테이너화된 AI 워크로드 제공
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.27
Enablement of GPUs in an OpenShift Cluster
NFD Operator recognizes GPUs and labels the nodes
GPU Operator builds the GPU enablement stack
CUDA driver (or container)
K8s device plugin for GPU
GPU node_exporter for Prometheus
Label: GPU
CRIO GPU runtime plugin
1
2
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.28
Deploying GPU Workloads onto OpenShift
Pod Deployment
Job
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.29
OPEN DATA HUB
• OpenShift, Container Storage를 활용하여 AI를 서비스 플랫폼으로 구축하기위한 청사진
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.30
OPEN DATA HUB 아키텍처
Data in Motion
Security & GovernanceArtificial Intelligence & Machine Learning
Open source community project
●AI / ML 라이프 사이클을 가속화 하기위한
Red Hat 포트폴리오
● 오픈 소스 툴링의 가능성 가시화
● 오픈 소스 AI / ML 툴의 자동 배포
Metadata Management
Data Analysis
Model Lifecycle
Kubeflow, Seldonmlflow
ML Applications
Open Data Hub AI Library
Interactive Notebooks
JupyterHub, Hue
Business Intelligence
Superset
Big Data Processing
Spark Spark SQL Thrift
Streaming
Kafka Streams Elasticsearch
Data Exploration
Hue Kibana
Hive Metastore
Storage
Data Lake
Red Hat® Ceph Storage
In-Memory
Red Hat® Data Grid (Infinispan)
Relational Databases
PostgreSQL MySQL
Red Hat® AMQ Streams (Kafka Strimzi)
Red Hat® Ceph S3 API Kafka Connect Logstash Fluentd rsyslog
Red Hat® OpenShift OAuth
Red Hat® Single Sign-On (Keycloak)
Red Hat® Ceph Object Gateway
Red Hat® 3scale
Monitoring & Orchestration
Prometheus
Grafana
Argo Workflows
Jenkins CI/CD
Data scientist
Business Analyst
Data Engineer
Data Steward DevOps Engineer
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.31
Open Data Hub: Projects not products
●Unified analytics engine● Large-scale data access
● Multi-user Jupyter● Used for data science and research
● Monitoring and alerting toolkit● Used to diagnose problems
● Analytics platform for all metrics● Query, visualize and alert on metrics
● Deploying machine learning models
as micro-services● Full model lifecycle management
● Distributed Object Store● S3 Interface
● Distributed event streaming● Pub/Sub Messaging
● Container-native workflow engine● Declaratively deploy ML pipelines
and models
* *
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.32
Key workflows in the ML lifecycle enabled by Red Hat portfolio
Set goals
Gather and prepare data
Develop ML model Deploy ML models in app dev process
Implement apps
ML models monitoring & management
Jupyter-notebook as-a-service
Data acquisition and preparation
ML model serving and inferencing
Continuous development, integration, and deployment for ML models in
production
Why Red Hat for AI/ML?
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.34
레드햇 포트폴리오 상의 ML 라이프사이클의주요 워크플로우
● ML 모델링을 위한 Jupyter-notebook-as-a-Service (e.g.)
● 데이터 수집 및 준비
● ML 모델 제공 및 추론
● 프로덕션 환경에서 ML 모델을 위한 지속적인 개발, 통합 및 배포
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.35
JUPYTER-NOTEBOOK-AS-A-SERVICE
Project Echo
Project Tango
CPUs, GPUs, Memory, NVMe
• 인증과 RBAC
• 프로젝트간자원격리(데이터, 네트워크, 접근등)
• 보호된상태에서의자원공유
• 시스템자원격리
• 자원할당
• 우선순위와선점설정
Data Scientist
Data Scientist
Jupyter Hub
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.36
데이터 수집과 준비
Open Data Hub Operator
DB
AI/ML platformExternal data sources
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.37
Source-to-Image for models
Model Serving Image
S2I Base Image
BUILD
requirements.txtnotebook.ipnyb
Experiment
Serving Endpoint
Metrics Endpoint
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.38
Models as stateless microservices
MODEL
MODEL SERVICE
MODEL MODEL
APP
SCALE HORIZONTALLY PHASED ROLLOUTS
MODEL
MODEL*
MODEL1 MODEL2 MODEL3
MULTIPLE TRIALS
APP APP
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.39
AI/ML 모델의 지속적인 개발, 통합, 배포
Data Scientist
Check-in to source repo
Model test, iteration and integration
Promote and serve models into
production as servicesContinuous monitoring and change management
Source-2-image
OCI container
Deloy notebook container
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.40
이미지 레지스트리 연계 방안 (1/2)
개발 OCP
Private Docker Registry
Project
PodPod
Image Build & PushResource Build
CI/CD Pipeline
운영 OCP
Project
PodPodPod
Image mirror
Private Docker Registry
Image deploy
CI/CD 파이프라인설계시개발계 Image registry에서운영계 Image registry로이미지복사하는과정을추가하여설계하는방안
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.41
이미지 레지스트리 연계 방안 (2/2)
개발 OCP
Private Docker Registry
Project
PodPod
Image Build & PushResource Build
CI/CD Pipeline
운영 OCP
Project
PodPodPod
Add Image stream
Internal ImageRegistry
Image deploy
CI/CD 파이프라인설계시운영계 OCP의 Image stream에개발계 Image registry의이미지를연결하여배포하는방안
1. Image를별도로복사하는과정이불필요
2. OpenShift 프로젝트(Namespace)마다별도로태깅하여서로다른프로젝트에서는접근불가
Image stream link
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.42
CONTAINER SECURITY OPERATOR (CSO) IN OCP CONSOLE
Fetches vulnerability data from Quay / Clair
콘솔 대시보드 우측에 표기되는 모든 컨테이너취약점 검토
Scans images running on OpenShift and exposes data via API
Operator monitors pod objects and updates vulnerability data
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.43
Monitoring performance and drift
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.44
AI 서비스는 어디에서나 동작해야 합니다.
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.45
APPLICATION PORTABILITY WITH CONTAINERS
RHEL Containers + RHEL Host = Guaranteed PortabilityAcross Any Infrastructure
LAPTOP
Container
Application
OS dependencies
Guest VM
RHEL
BARE METAL
Container
Application
OS dependencies
RHEL
VIRTUALIZATION
Container
Application
OS dependencies
Virtual Machine
RHEL
PRIVATE CLOUD
Container
Application
OS dependencies
Virtual Machine
RHEL
PUBLIC CLOUD
Container
Application
OS dependencies
Virtual Machine
RHEL
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.46
OPENSHIFT의 AI/ML의 워크로드 정리
Persistentstorage
Existing automation toolsets
SCM(GIT)
CI/CD
Service layer
Registry
RHEL
Node
c
RHEL
Node
RHEL
Node
RHEL
Node
RHEL
Node
RHEL
Node
C
C
C C
C
C
C CC C
Red HatEnterprise Linux
Master
API/Authentication
Data store
Scheduler
Health/scaling
클라우드, 데이터 센터 및 엣지에AI/ML 배포
AI/ML을서비스로 노출,
로드 밸런싱, 확장 가능
공유 리소스에서 효율적으로 조정되는 마이크로 서비스로서의
AI/ML
소프트웨어 개발 생명 주기(SDLC) 제공
AI/ML 운영환경
Physical Virtual Private Public Hybrid
Data scientists
IT operations
Go to AI/ML World with Red HatImplementing AI/ML into your own business could have huge benefits for you and your customers.
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.48
OPENSHIFT의 AI/ML의 워크로드
Healthcare and public sector
Automotive Financial Oil and gas
RBC BankContainerized Apache Spark
Discover Financial ServicesUse Case
HCA HealthcareData driven diagnosis
Boston Children’s HospitalData driven diagnosis
Mass Open CloudUse Case
Ministry of Defence (Israel)Jupyter notebooks as a service
BMW GroupConnected Drive &
Autonomous Driving
ExxonMobilDemocratize data science for oil and gas exploration
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.49
발표 제목 입니다
• 내용 입니다
Dr. Edmund JacksonVP & Chief Data ScientistHCA Healthcare
패혈증 검사에 최대 18시간의 단축이 있었습니다.예측 데이터 분석을 사용하여 패혈증 탐지에 표준화된디지털 접근 방식을 확립했습니다.
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.50
발표 제목 입니다
• 내용 입니다
우리는 전세계 시장에서 고객에게 서비스를 제공해야하는 경우 다른 시장에서클러스터를 현지화 할 수있습니다.
Dr. Alexander LenkLead Architect Connected Vehicle, Digital Backend, Big Data, BlockchainBMW Group
BMWConnected Drive
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.51
The intelligent application journey
ThinkingAbout It
EvaluatingOptions
Proof of Concept ProductionExploring
Options
How do we get here?
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.52
AI/ML discovery session
A one day, no-cost planning session
GOAL: DETAIL: RED HAT PROVIDES:
To understand the customer’s business drivers and technical use cases to propose an Open AI/ML architecture.
Discussion guided by Red Hat AI/ML Consulting Architect
Attendees from Data Science team, Business decision makers, Engineering, Operations, andApplication Development
Deep dive into customer use cases
Our vision for organization-wide AI/ML adoption
Tailored proposal to solve customer use cases
High-level adoption roadmap
Red Hat AI/ML service offerings
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.53
Who attends the AI/ML discovery session?
Customer●Data scientists● Data engineers● IT operations● Intelligent App developers● Business sponsors
Red Hat● AI Center of Excellence (leads)● Red Hat Services● Open Innovation Labs● Solutions Architect (optional)● Account Team (optional)
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.54
AI/ML Architecture Review
Vendor lock-in and high costs of scaling AI/ML applications
Open and flexible architecture for AI/ML applications
Inputs Activities Outcomes• Customer AI/ML use cases• AI/ML platform
requirements
• Day 1: Customer use cases deep dive• Day 2: AI/ML architecture review workshop• Day 5: Live demo in customer lab• Mentoring and roadmap creation
• AI/ML platform strategy• AI/ML architecture and road
map• Live demo in customer lab
Scope• Red Hat OpenShift• 1 week
Key contacts• Neeraj Kuppam - Services Portfolio Team• William Benton - AI Center of Excellence
Solutions workshopWe offer workshops to understand your requirements and develop a solution strategy and execution roadmap.
Visit our websitesLearn more about our AI/ML capabilities, and see success stories from existing customers.
openshift.com/ai-ml, opendatahub.io
Watch our videosVisit our YouTube channel to discover a wide range of videos answering all your AI/ML questions.
Get started and harness the potential of AI/ML with Red Hat
NEXT STEPS
ⓒ 2020. Goodmorning Information Technology Co., Ltd all rights reserved.56
감사합니다