PowerAI - NVIDIA… · Tuning Engine AI Vision Computer Vision App Development Toolkit IBM...
![Page 1](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/1.jpg)
PowerAI
World’s Fastest AI Platform for Enterprise
Sumit Gupta
VP, HPC, AI, and Analytics
IBM Cognitive Systems
May 2017
![Page 2](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/2.jpg)
New additions to PowerAI
![Page 3](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/3.jpg)
Transmission Line Inspection
![Page 4](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/4.jpg)
Data Lake
Transform & Prep Data (ETL)
Trained Model
Images of Damaged Components
Model Training
Transform & Prep Data (ETL)
Off-Line Training
Production
Live Video
![Page 5](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/5.jpg)
Data Lake & Data Stores: Hadoop HDFS, NoSQL DBs
Distributed Computing: Spark, MPI
ML & DL Libraries & Frameworks: TensorFlow, Caffe, SparkML
Cognitive APIs (e.g. Watson), In-House Cognitive APIs: Speech, Vision, NLP, Sentiment
Applications: Segment Specific (Finance, Retail, Healthcare, etc.)
Accelerated Infrastructure: Accelerated Servers, Storage
Transform & Prep Data (ETL)
![Page 6](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/6.jpg)
Data Lake & Data Stores
Distributed Computing
ML & DL Libraries & Frameworks
Cognitive APIs (e.g. Watson)
In-House Cognitive APIs
Applications
Accelerated Servers, Storage
Data Prep, ETL, Curation, Data Labeling
Performance to Reduce Training Time
Multi-tenant, Cluster Virtualization, DL Framework Scaling
Feature extraction, Selecting Right Model, Hyper-parameter tuning
Finding Right “Tagged” Data, Model Integrity
Use Case Identification, Access to Enough Data
Transform & Prep Data (ETL)
![Page 7](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/7.jpg)
PowerAI: Enterprise Class, Ease of Use, Faster Training
Enterprise Software Distribution
Binary Package of Major Deep Learning Frameworks with Enterprise Support
Tools for Ease of Development
Graphical tools to Enhance Data Scientist Developer Experience
Faster Training Times for Data Scientists
Performance Optimized for Single Node & Distributed Computing Scaling
![Page 8](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/8.jpg)
PowerAI: Making AI More Accessible to Developers
• AI Vision: Targeted at Application Developers
• Data Extraction, Transformation and Preparation tool
• DL Insight
• Distributed Deep Learning
Multi-tenant, Enterprise-ready Deep Learning Platform for Data Scientists
![Page 9](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/9.jpg)
PowerAI
DL Frameworks + Libraries (TensorFlow, Caffe, ..)
IBM Data Science Experience (DSX)
Distributed Computing with Spark & MPI
DL Developer Tools
Spectrum Scale High-Speed File System via HDFS APIs
Cluster of NVLink Servers
PowerAI Enterprise (Coming soon)
IBM Enterprise Support
Application Dev Services
Enterprise Support & Services to Augment Enterprise Expertise
Packaged, Pre-Compiled Deep Learning Frameworks (TensorFlow, Caffe, Torch, ..)
Optimized for Scaling & Fast Training Time
Data Scientists Productivity Tools Targeted to DL Developers
IBM Confidential
![Page 10](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/10.jpg)
DL Frameworks (TF, Caffe, etc.)
Data Prep & ETL via Spectrum Conductor with Spark
Input Data
Deep Learning GUI: Data & Model Management, ETL Tools, Monitor, Visualize, Advise
DL Insight: Tuning Engine
AI Vision: Computer Vision App Development Toolkit
IBM Spectrum Conductor with Spark: System mgmt, Distributed ETL, Distributed Training, Hyper-Parameter Optimization
Distributed Training
![Page 11](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/11.jpg)
![Page 12](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/12.jpg)
Tumor Proliferation Assessment – mitosis detection
Images from electron-microscope
Size of image: 70K * 60K

| Framework | Format | Input Size (Faster R-CNN) |
| --- | --- | --- |
| Caffe | LMDB | 1K*1K |
| TensorFlow | TFRecord | 1K*1K |

Data Transformation
Data Distribution among training, validation and testing
Data Shuffle
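
The three preparation steps on this slide (transformation to the 1K*1K input size, distribution among training, validation and testing, and shuffling) can be sketched in a few lines of Python. This is an illustrative sketch only, not the PowerAI tooling: it assumes "70K * 60K" means 70,000 x 60,000 pixels, that tiles are non-overlapping, and that the split is 80/10/10. The slide does not state the split ratios, and the function names are hypothetical.

```python
import random


def tile_origins(width, height, tile=1000):
    """Yield (x, y) top-left corners of non-overlapping tile x tile patches."""
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            yield (x, y)


def shuffle_and_split(items, percents=(80, 10, 10), seed=0):
    """Shuffle items reproducibly, then split into train/validation/test."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split repeatable
    n_train = len(items) * percents[0] // 100
    n_val = len(items) * percents[1] // 100
    return (
        items[:n_train],
        items[n_train:n_train + n_val],
        items[n_train + n_val:],
    )


# Assumed image size: 70,000 x 60,000 pixels, cut into 1,000 x 1,000 tiles.
tiles = list(tile_origins(70_000, 60_000))
train, val, test = shuffle_and_split(tiles)
print(len(tiles), len(train), len(val), len(test))  # prints: 4200 3360 420 420
```

Each tile origin would then be cropped from the source image and written to the framework's own format (LMDB for Caffe, TFRecord for TensorFlow).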
![Page 13](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/13.jpg)
Import data from different formats
Transform, split and shuffle data
![Page 14](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/14.jpg)
![Page 15](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/15.jpg)
Random
TPE (Tree-based Parzen Estimator)
Bayesian
Multi-tenant Spark Cluster (IBM Spectrum Conductor with Spark)
Spark search jobs are generated dynamically and executed in parallel
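
The idea of generating search jobs and running them in parallel can be illustrated with the simplest of the three strategies, random search. The sketch below is not the DL Insight implementation: it uses Python's standard `concurrent.futures` thread pool as a stand-in for Spark jobs, and the search space, the `toy_validation_loss` function, and all names are hypothetical placeholders for a real training run.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sample_config(rng):
    """Draw one hyper-parameter configuration (hypothetical search space)."""
    return {
        "learning_rate": 10 ** rng.uniform(-5, -1),  # log-uniform in [1e-5, 1e-1]
        "batch_size": rng.choice([16, 32, 64, 128]),
    }


def toy_validation_loss(config):
    """Stand-in for a training run; lowest near lr=1e-3 and batch_size=64."""
    lr_term = (math.log10(config["learning_rate"]) + 3) ** 2
    bs_term = (math.log2(config["batch_size"]) - 6) ** 2
    return lr_term + 0.1 * bs_term


def random_search(n_trials=32, workers=4, seed=0):
    """Generate n_trials configurations and evaluate them in parallel."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_trials)]
    # Each evaluation is independent, so the jobs can run side by side.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        losses = list(pool.map(toy_validation_loss, configs))
    best = min(range(n_trials), key=losses.__getitem__)
    return configs[best], losses[best]


best_config, best_loss = random_search()
print(best_config, best_loss)
```

Random search parallelizes trivially because no trial depends on another. TPE and Bayesian methods use earlier results to propose later configurations, so they are typically run in parallel batches instead.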
![Page 16](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/16.jpg)
Data preparation
Model training/tuning
Inference
Marked result
![Page 17](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/17.jpg)
AI Vision
Data Lake & Data Stores
Distributed Computing
ML & DL Libraries & Frameworks
Accelerated Servers, Storage
Service Management Layer
Data set management
Training task management
Model management
Inference API management
Image preprocessing management
Data label management
Vision Recognition Layer
Self-defined Training with visualized monitoring
Custom Learning for Image Classification
Inference API deployment
Image Labeling and Preprocessing
Video Labeling Service
Custom Learning for Object Detection
![Page 18](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/18.jpg)
AI Vision
Result on public cloud API: white, red, yellow and teal bird
Result on public cloud API: white and black short beak bird
I’m Aethopyga
I’m Pycnonotus
We need to get a new model to classify birds with professional knowledge.
Acridotheres, Acrocephalus, Aethopyga, Butorides, Corvus… >20 categories
User defines categories in AI Vision
Aethopyga: 0.90708
Pycnonotus: 0.99988
![Page 19](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/19.jpg)
AI Vision
Medical image analysis for cytologic examination
AI Talents:
We need tools to speed up
(study number from China)
![Page 20](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/20.jpg)
SAMPLE USE CASE: SALES ORDER PROCESSING
• Traditional capture is difficult on Sales Orders (SO)
• Sales orders contain line data; one SO can have hundreds or thousands of different line items
• Large enterprises might have tens of thousands of clients ordering items or services by email
• Each client might have multiple locations that each have unique order template(s)
• Sample calculation: 40 000 clients x 20 locations -> 800 000 unique Sales Order templates
• To implement using traditional capture by templating:
  • 10 hours / template -> 8 million hour exercise -> very bad business case!
• Each order could have hundreds of complex order items
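
The sample calculation above can be reproduced directly. The figures come from the slide; the function and variable names are only illustrative.

```python
def templating_effort(clients, locations_per_client, hours_per_template):
    """Return (number of templates, total hours) for template-based capture."""
    templates = clients * locations_per_client
    total_hours = templates * hours_per_template
    return templates, total_hours


templates, total_hours = templating_effort(
    clients=40_000, locations_per_client=20, hours_per_template=10
)
print(templates, total_hours)  # prints: 800000 8000000
```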
![Page 21](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/21.jpg)
EXAMPLE: SALES ORDER PROCESSING USING DATACAP & ELINAR.AI
Old orders/invoices + extracted information
= Several weeks of SuperComputer capacity (Power8 Minsky + power.ai)
Trained AI Model
Datacap Validation & Verification
Incoming Order/Invoice
Datacap OCR/Layout
Datacap Extraction
Customer ERP/Finance
Order/Invoice History
AI Training
![Page 22](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/22.jpg)
DETAILS ON IMPLEMENTATION
• Lots of training material needed
• IBM Datacap is used to create page layout.xml for each order
• Previously human-extracted values need to be matched into each layout.xml for training purposes
• Clever data preparation allows higher quality/accuracy
• We can use simple rules to tag certain types of data before it is fed into the neural network; for example, Unit of Measurement (UOM) and ZIP code are easy
• The neural network can use these “hints” to increase training accuracy when the data set is small; for example, if a page has 23 UOM tokens it is quite obvious that there are 23 different order lines
• Implemented using Torch LSTMs
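
The rule-based "hints" described above can be pictured as a pre-tagging pass over the page tokens. The sketch below is an assumption-laden illustration, not the Elinar pipeline: the UOM vocabulary, the 5-digit ZIP pattern, the tag names, and the sample page are all hypothetical.

```python
import re

# Hypothetical UOM vocabulary; a real deployment would use the customer's own list.
UOM_TOKENS = {"pcs", "kg", "m", "l", "box", "ea"}
# Assumes 5-digit postal codes; other countries need other patterns.
ZIP_PATTERN = re.compile(r"^\d{5}$")


def tag_tokens(tokens):
    """Attach a coarse rule-based tag to each token before it reaches the network."""
    tagged = []
    for token in tokens:
        if token.lower() in UOM_TOKENS:
            tagged.append((token, "UOM"))
        elif ZIP_PATTERN.match(token):
            tagged.append((token, "ZIP"))
        else:
            tagged.append((token, "OTHER"))
    return tagged


def estimate_order_lines(tagged):
    """Use the number of UOM tokens as a hint for the number of order lines."""
    return sum(1 for _, tag in tagged if tag == "UOM")


page = "Widget 12 pcs 4.50 Cable 3 m 1.20 Ship to 00100 Helsinki".split()
tagged = tag_tokens(page)
print(estimate_order_lines(tagged))  # prints: 2
```

The tags would be supplied to the network as extra input features alongside the tokens, so that a small training set does not have to teach it what a unit or a postal code looks like.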
![Page 23](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/23.jpg)
COMING SOON: ELINAR A.I. MINER FOR GDPR DATA
• Set of AIs that can reliably extract personal data and privacy information from:
  • Business documents and records
  • Databases and NoSQL data sources
  • Images
• Pipeline uses Neural Networks implemented using Caffe and Torch, augmented with IBM BigInsights text miners and business rules
• Fully developed on IBM Power platform, AIs using power.ai
![Page 24](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/24.jpg)
WHY IBM POWER.AI?
• Nice packaging that has everything a Deep Learning Nerd needs :)
• Very fast time to value due to simple installation; everything works “out-of-the-box”
• Leverages unique Power8 CPU-GPU NVLink communications on “Minsky” and P100 GPUs
• Allows developers to run the insanely powerful “Minsky” supercomputer with standard AI tooling like Caffe and Torch
• We previously developed on high-end x86; there is no going back
• Can run larger models faster
![Page 25](https://reader033.fdocuments.net/reader033/viewer/2022043000/5f76784d5cced8183a63a3ae/html5/thumbnails/25.jpg)
Thank You