Open Standard APIs for Embedded Vision Processing · Need for Camera Control API • We have choice...
© Copyright Khronos Group 2014 - Page 1
Open Standard APIs for Embedded Vision Processing
Neil Trevett Vice President Mobile Ecosystem, NVIDIA
President, Khronos Group
Speakers This Morning
• Neil Trevett
- Vice President Mobile Ecosystem, NVIDIA
- President, Khronos
- Chair, OpenCL Working Group
• Mikael Sevenier
- Chair, Camera working group
• Jim Steele
- CTO, Sensor Platforms
- Chair, StreamInput
Khronos Connects Software to Silicon
• Open Consortium creating ROYALTY-FREE, OPEN STANDARD APIs for hardware acceleration
• Defining the roadmap for low-level silicon interfaces needed on every platform
• Graphics, compute, rich media, vision, sensor and camera processing
• Rigorous specifications AND conformance tests for cross-vendor portability
• Acceleration APIs BY the Industry FOR the Industry
• Well over a BILLION people use Khronos APIs Every Day…
Khronos Standards
• Visual Computing
- 3D Graphics
- Heterogeneous Parallel Computing
• 3D Asset Handling
- 3D authoring asset interchange
- 3D asset transmission format with compression
• Acceleration in HTML5
- 3D in browser – no Plug-in
- Heterogeneous computing for JavaScript
• Sensor Processing
- Vision Acceleration
- Camera Control (Camera Control API)
- Sensor Fusion
Over 100 companies defining royalty-free APIs to connect software to silicon
Sensors & Vision Driving Key Mobile Use Cases
• Augmented Reality
• Natural UI with Face, Body and Gesture Tracking
• Computational Photography and Videography
• 3D Scene and Object Reconstruction
[Figure: the use cases above shown along a Time axis]
Vision Pipeline Challenges and Opportunities
• Sensor Proliferation: diverse sensor awareness of the user and surroundings
- Light / Proximity
- 2 cameras
- 3 microphones
- Touch
- Position: GPS, WiFi (fingerprint), Cellular trilateration, NFC/Bluetooth Beacons
- Accelerometer
- Magnetometer
- Gyroscope
- Pressure / Temp / Humidity
• Growing Camera Diversity: capturing color, range and lightfields
- Camera sensors >20MPix
- Novel sensor configurations
- Stereo pairs
- Active Structured Light
- Active TOF
- Plenoptic Arrays
• Diverse Vision Processors: driving for high performance and low power
- Camera ISPs
- Dedicated vision IP blocks
- DSPs and DSP arrays
- Programmable GPUs
- Multi-core CPUs
• Flexible sensor and camera control to generate required image stream (Camera Control API)
• Use best processing available for image stream processing – with code portability
• Control/fuse vision data by/with all other sensor data on device
OpenVX – Power Efficient Vision Acceleration
• Acceleration API for real-time vision
- Focus on mobile and embedded systems
• Enable diverse efficient implementations
- From CPUs, through GPUs and DSPs to dedicated hardware
• Foundational API for vision acceleration
- Can be used by middleware libraries or by applications directly
• Complementary to OpenCV
- Which is great for prototyping
• Khronos open source sample implementation
- To be released with final specification
- A sample, not a reference: the specification remains the definitive definition of OpenVX operation
[Figure: software stack showing Application; OpenCV open source library and other higher-level CV libraries; open source sample implementation and hardware vendor implementations]
OpenVX Graphs – The Key to Efficiency
• Vision processing directed graphs for power and performance efficiency
- Each Node can be implemented in software or accelerated hardware
- Nodes may be fused by the implementation to eliminate memory transfers
- Processing can be tiled to keep data entirely in local memory/cache
• EGLStreams can provide data and event interop with other Khronos APIs
- BUT use of other Khronos APIs is not mandated
• VXU Utility Library for access to single nodes
- Easy way to start using OpenVX by calling each node independently
[Figure: Example OpenVX Graph, showing Native Camera Control, four OpenVX Nodes and Heterogeneous Processing]
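The graph idea on this slide can be illustrated with a toy sketch. This is not the OpenVX API (which is a C API built around calls such as vxCreateGraph, vxVerifyGraph and vxProcessGraph); the `Graph` class, the `blur3` and `threshold` nodes, and the list-of-integers "image" are all hypothetical stand-ins. The point is only that nodes are declared up front, the whole graph is checked and ordered once, and it can then be executed repeatedly.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# (name, function, input data names, output data name) -- hypothetical layout
Node = Tuple[str, Callable, List[str], str]


class Graph:
    """Toy directed graph of processing nodes (illustration only, not OpenVX)."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self.order: List[int] = []

    def add_node(self, name: str, func: Callable,
                 inputs: Sequence[str], output: str) -> None:
        self.nodes.append((name, func, list(inputs), output))

    def verify(self, graph_inputs: Sequence[str]) -> None:
        """Order nodes so each runs after its producers; reject cycles and gaps."""
        available = set(graph_inputs)
        pending = list(range(len(self.nodes)))
        order: List[int] = []
        while pending:
            ready = [i for i in pending
                     if all(x in available for x in self.nodes[i][2])]
            if not ready:
                raise ValueError("cycle or unconnected input in graph")
            for i in ready:
                order.append(i)
                available.add(self.nodes[i][3])
            pending = [i for i in pending if i not in ready]
        self.order = order

    def process(self, data: Dict[str, list]) -> Dict[str, list]:
        """Run the nodes in the verified order and return all data objects."""
        values = dict(data)
        for i in self.order:
            _, func, inputs, output = self.nodes[i]
            values[output] = func(*(values[x] for x in inputs))
        return values


def blur3(row: List[int]) -> List[int]:
    """3-tap box blur on a 1D 'image', replicating the border pixels."""
    n = len(row)
    return [(row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) // 3
            for i in range(n)]


def threshold(row: List[int], level: int = 50) -> List[int]:
    return [255 if p > level else 0 for p in row]


graph = Graph()
# Nodes may be added in any order; verify() works out the execution order.
graph.add_node("threshold", threshold, ["blurred"], "mask")
graph.add_node("blur", blur3, ["input"], "blurred")
graph.verify(["input"])
result = graph.process({"input": [0, 0, 300, 0, 0]})
print(result["mask"])
```

Because the implementation sees the whole graph before running it, it has the opportunity to choose where each node runs, fuse nodes or tile the work, which a call-by-call API cannot do.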
OpenVX 1.0 Function Overview
• Core data structures
- Images and Image Pyramids
- Processing Graphs, Kernels, Parameters
• Image Processing
- Arithmetic, Logical, and statistical operations
- Multichannel Color and BitDepth Extraction and Conversion
- 2D Filtering and Morphological operations
- Image Resizing and Warping
• Core Computer Vision
- Pyramid computation
- Integral Image computation
• Feature Extraction and Tracking
- Histogram Computation and Equalization
- Canny Edge Detection
- Harris and FAST Corner detection
- Sparse Optical Flow
OpenVX Specification Evolution
• OpenVX 1.0 defines framework for creating, managing and executing graphs
• Focused set of widely used functions that are readily accelerated
• Implementers can add functions as extensions
• Widely used extensions adopted into future versions of the core
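Of the functions listed above, integral image computation is compact enough to sketch. The version below is a plain-Python illustration of the standard summed-area-table technique, not the OpenVX kernel; the function names and the padded (h+1) x (w+1) layout are choices made for this sketch.

```python
from typing import List


def integral_image(img: List[List[int]]) -> List[List[int]]:
    """Summed-area table with one extra row and column of zeros.

    ii[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    """
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii: List[List[int]], x0: int, y0: int, x1: int, y1: int) -> int:
    """Sum of pixels with x0 <= x < x1 and y0 <= y < y1, using four lookups."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
print(box_sum(table, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

Once the table is built, the sum over any rectangle costs four lookups regardless of its size, which is why the function is a common building block for feature detectors.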
Example Graph - Stereo Machine Vision
[Figure: OpenVX Graph with inputs Camera 1 and Camera 2 and nodes Stereo Rectify with Remap (one per camera), Compute Depth Map (User Node), Image Pyramid, Delay, Compute Optical Flow, and Detect and track objects (User Node); output is Object coordinates]
Tiling extension enables user nodes (extensions) to also optimally run in local memory
OpenVX and OpenCV are Complementary
• Governance
- OpenCV: Community driven open source with no formal specification
- OpenVX: Formal specification defined and implemented by hardware vendors
• Conformance
- OpenCV: No conformance tests for consistency and every vendor implements different subset
- OpenVX: Full conformance test suite / process creates a reliable acceleration platform
• Portability
- OpenCV: APIs can vary depending on processor
- OpenVX: Hardware abstracted for portability
• Scope
- OpenCV: Very wide; 1000s of imaging and vision functions; multiple camera APIs/interfaces
- OpenVX: Tight focus on hardware accelerated functions for mobile vision; use external camera API
• Efficiency
- OpenCV: Memory-based architecture; each operation reads and writes memory
- OpenVX: Graph-based execution; optimizable computation, data transfer
• Use Case
- OpenCV: Rapid experimentation
- OpenVX: Production development & deployment
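The Efficiency contrast can be made concrete with a small sketch. It compares running each operation as a separate pass over a buffer with a single fused pass. The functions and the "buffers written" counter are hypothetical, the operations are restricted to per-pixel ones, and neither function represents OpenCV or OpenVX code.

```python
from typing import Callable, List, Sequence, Tuple

PixelOp = Callable[[int], int]


def run_unfused(pixels: Sequence[int],
                ops: Sequence[PixelOp]) -> Tuple[List[int], int]:
    """Apply each op as a full pass, writing every intermediate buffer."""
    data = list(pixels)
    buffers_written = 0
    for op in ops:
        data = [op(p) for p in data]  # one full read and one full write per op
        buffers_written += 1
    return data, buffers_written


def run_fused(pixels: Sequence[int],
              ops: Sequence[PixelOp]) -> Tuple[List[int], int]:
    """Apply all ops to each pixel in turn, writing only the final buffer."""
    out = []
    for p in pixels:
        for op in ops:
            p = op(p)  # intermediate value never leaves the inner loop
        out.append(p)
    return out, 1


ops = [lambda p: p * 2, lambda p: p + 1]
unfused, passes = run_unfused([1, 2, 3], ops)
fused, fused_passes = run_fused([1, 2, 3], ops)
print(unfused == fused, passes, fused_passes)
```

Both variants produce the same pixels; the fused one avoids the intermediate buffers. Fusing neighbourhood operations such as filters would additionally need tiling with overlapping borders, which this sketch leaves out.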
OpenVX Participants and Timeline
• Provisional 1.0 specification released November 2013 for industry feedback
• Aiming for specification finalization and conformance tests 3Q14
• Itseez is working group chair (the convener of OpenCV)
• Qualcomm and TI are specification editors
OpenCL – Portable Heterogeneous Computing
• Portable Heterogeneous programming of diverse compute resources
- Targeting supercomputers -> embedded systems -> mobile devices
• One code tree can be executed on CPUs, GPUs, DSPs and hardware
- Dynamically interrogate system load and balance work across available processors
• OpenCL = Two APIs and C-based Kernel language
- Platform Layer API to query, select and initialize compute devices
- Kernel language - Subset of ISO C99 + language extensions
- C Runtime API to build and execute kernels across multiple devices
[Figure: OpenCL Kernel Code dispatched to CPUs, a GPU, a DSP and dedicated hardware (HW)]
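The kernel execution model can be sketched without an OpenCL installation. In real OpenCL the kernel is written in OpenCL C, reads its index with get_global_id(0) and is launched with clEnqueueNDRangeKernel; the Python below is only a sequential emulation of that model. `split_range` and its device weights are hypothetical, showing one way an application could divide an index range between two devices.

```python
from typing import Callable, List, Sequence, Tuple


def vadd_kernel(gid: int, a: Sequence[int], b: Sequence[int],
                out: List[int]) -> None:
    """One work-item: computes the single output element chosen by its id."""
    out[gid] = a[gid] + b[gid]


def enqueue_range(kernel: Callable, start: int, stop: int, *args) -> None:
    """Sequential stand-in for a device: run the kernel once per global id."""
    for gid in range(start, stop):
        kernel(gid, *args)


def split_range(global_size: int,
                weights: Sequence[int]) -> List[Tuple[int, int]]:
    """Split [0, global_size) into contiguous chunks proportional to weights."""
    total = sum(weights)
    bounds = [0]
    acc = 0
    for w in weights:
        acc += w
        bounds.append(global_size * acc // total)
    return [(bounds[i], bounds[i + 1]) for i in range(len(weights))]


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10] * 8
out = [0] * 8

# Hypothetical weights: the first "device" is assumed three times as fast.
chunks = split_range(len(a), [3, 1])
for start, stop in chunks:
    enqueue_range(vadd_kernel, start, stop, a, b, out)
print(chunks, out)
```

Each work-item touches only its own element, so the chunks can run on different devices in any order and still produce the same result.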
OpenCL as Foundation for Parallel Compute
• 100+ tool chains and languages leveraging OpenCL
- Heterogeneous solutions emerging for the most popular programming languages
• OpenCL provides vendor optimized, cross-platform, cross-vendor access to heterogeneous compute resources
• Languages and tools built on OpenCL
- SYCL: C++ syntax compiler extensions
- WebCL: JavaScript binding for initiation of OpenCL C kernels
- River Trail: Language extensions to JavaScript
- C++ AMP (Shevlin Park): Uses Clang and LLVM
- Harlan: High level language for GPU programming
- Aparapi: Java language extensions for parallelism
- PyOpenCL: Python wrapper around OpenCL
- Compiler directives for Fortran, C and C++
- Language for image processing and computational photography
• SPIR: Standard Portable Intermediate Representation (extending LLVM for parallel computation)
- SPIR 1.2 Released in January 2014
OpenVX and OpenCL are Complementary
• Use Case
- OpenCL: General Heterogeneous programming
- OpenVX: Domain targeted Vision processing
• Architecture
- OpenCL: Language-based – needs online compilation
- OpenVX: Library-based - no online compiler required
• Target Hardware
- OpenCL: ‘Exposed’ architected memory model – can impact performance portability
- OpenVX: Abstracted node and memory model - diverse implementations can be optimized for power and performance
• Precision
- OpenCL: Full IEEE floating point mandated
- OpenVX: Minimal floating point requirements – optimized for vision operators
• Ease of Use
- OpenCL: General-purpose math libraries with no built-in vision functions
- OpenVX: Fully implemented vision operators and framework ‘out of the box’
It is possible to use OpenCL to build OpenVX Nodes
Need for Camera Control API
• We have a choice of APIs for image and vision processing
- BUT no open standard API for camera control to FEED these APIs!
• Need advanced control of ISP and camera subsystem
- Generate sophisticated image stream for advanced imaging & vision apps
• No system API fulfills all developer requirements
- Advanced, high-frequency burst control of camera and sensor operation
- Portable support for diversity of sensors: e.g. depth sensors and sensor arrays
- Tight system integration: e.g. synch of camera and MEMS sensors
[Figure: Scope of Camera Control API. Blocks shown: Lens, Flash, Focus, Aperture; Sensor, Color Filter Array; Pre-processing (Bayer); Image Signal Processor (ISP); Post-processing (RGB/YUV); Image/Vision Applications. Controls shown: lens, sensor, aperture control and 3A - Auto Exposure (AE), Auto White Balance (AWB), Auto Focus (AF)]
Advanced Camera Control Use Cases
• High-dynamic range (HDR) and computational flash photography
- High-speed burst with individual frame control over exposure and flash
• Subject isolation and depth detection
- High-speed burst with individual frame control over focus
• Rolling shutter elimination
- High-precision intra-frame synchronization between camera and motion sensor
• Augmented Reality
- 60Hz, low-latency capture with motion sensor synchronization
- Multiple Region of Interest (ROI) capture
- Synchronized stereo sensors for scene scaling
- Detailed feedback on camera operation per frame
• Time-of-flight or structured light depth camera processing
- Aligned stacking of data from multiple sensors
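The rolling shutter use case above depends on knowing when each image row was captured and what the motion sensor read at that instant. The sketch below shows one way to do that with timestamps. It assumes rows are read out at a uniform rate and that gyro samples can be linearly interpolated; the function names, the microsecond units and the sample values are all invented for illustration.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple


def row_timestamps(frame_start_us: float, readout_us: float,
                   rows: int) -> List[float]:
    """Approximate capture time of each row, assuming uniform row readout."""
    step = readout_us / rows
    return [frame_start_us + r * step for r in range(rows)]


def interpolate(samples: Sequence[Tuple[float, float]], t: float) -> float:
    """Linearly interpolate time-sorted (timestamp, value) samples at time t.

    Times outside the sampled interval are clamped to the nearest sample.
    """
    times = [s[0] for s in samples]
    i = bisect_right(times, t)
    if i == 0:
        return samples[0][1]
    if i == len(samples):
        return samples[-1][1]
    (t0, v0), (t1, v1) = samples[i - 1], samples[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical gyro readings: (timestamp in microseconds, angular rate)
gyro = [(0, 0.0), (10000, 1.0), (20000, 3.0)]

# A 4-row frame whose readout starts at t = 0 and takes 20 ms in total.
per_row_rate = [interpolate(gyro, t) for t in row_timestamps(0, 20000, 4)]
print(per_row_rate)
```

This only works if camera and motion sensor timestamps come from a common clock, which is the kind of tight system integration the slide asks for.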
Camera API Architecture will be FCAM-based
• No global state
- State travels with image requests
- Every stage in the pipeline may have different state
- Enables fast, deterministic state changes
• Synchronize devices
- Lens, flash, sound capture, gyro…
- Devices can schedule Actions
- E.g. to be triggered on exposure change
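A minimal sketch of the "state travels with image requests" idea follows. The class names are loosely modelled on FCam concepts (shots, frames, actions), but `Shot`, `Action`, `Frame`, `MockSensor` and `hdr_burst` as written here are hypothetical, and the sensor is a mock that only advances a clock. It is not the FCam or Khronos camera API.

```python
from collections import deque
from dataclasses import dataclass, replace
from typing import Deque, List, Sequence, Tuple


@dataclass(frozen=True)
class Action:
    device: str     # e.g. "flash" or "lens"
    offset_us: int  # trigger time relative to the start of exposure
    command: str


@dataclass(frozen=True)
class Shot:
    """A capture request that carries all of its own state."""
    exposure_us: int
    gain: float = 1.0
    actions: Tuple[Action, ...] = ()


@dataclass(frozen=True)
class Frame:
    shot: Shot      # the exact request that produced this frame
    start_us: int
    fired: Tuple[Tuple[int, str, str], ...]


class MockSensor:
    """Stand-in sensor: no global settings, only a queue of requests."""

    def __init__(self) -> None:
        self._queue: Deque[Shot] = deque()
        self._clock_us = 0

    def capture(self, shot: Shot) -> None:
        self._queue.append(shot)

    def get_frame(self) -> Frame:
        shot = self._queue.popleft()
        start = self._clock_us
        fired = tuple((start + a.offset_us, a.device, a.command)
                      for a in shot.actions)
        self._clock_us += shot.exposure_us
        return Frame(shot, start, fired)


def hdr_burst(base: Shot, stops: Sequence[int]) -> List[Shot]:
    """Bracket the base exposure by whole EV stops (each stop doubles it)."""
    return [replace(base, exposure_us=int(base.exposure_us * 2 ** ev))
            for ev in stops]


sensor = MockSensor()
burst = hdr_burst(Shot(exposure_us=10000), [-2, 0, 2])
# Attach a flash action to the longest exposure only.
burst[-1] = replace(burst[-1], actions=(Action("flash", 0, "fire"),))
for shot in burst:
    sensor.capture(shot)
frames = [sensor.get_frame() for _ in burst]
print([f.shot.exposure_us for f in frames], frames[-1].fired)
```

Because every frame comes back tagged with the request that produced it, the application never has to guess which settings were in effect, even when consecutive frames use different ones.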
Khronos Camera API Requirements
• Application control over ISP processing (including 3A)
- Including multiple, re-entrant ISPs
• Control multiple sensors with synch and alignment
- E.g. Stereo pairs, Plenoptic arrays, TOF or structured light depth cameras
• Enhanced per frame detailed control
- Format flexibility, Region of Interest (ROI) selection
• Global timing & synchronization
- E.g. Between cameras and MEMS sensors
• Flexible processing/streaming
- Multiple input and output streams with RAW, Bayer or YUV Processing
- Streaming of rows (not just frames)
Enable new camera functionality not available on current platforms and align with future platform directions for easy adoption
Camera API Design Milestones and Philosophy
• C-language API starting from proven designs
- e.g. FCAM
• Design alignment with widely used hardware standards
- e.g. MIPI CSI
• Focus on mobile, power-limited devices
- But do not preclude other use cases such as automotive, surveillance, DSLR…
• Minimize overlap and maximize interoperability with other Khronos APIs
- But other Khronos APIs are not mandated
• Support vendor-specific extensions
• Milestones
- Apr13: Working group proposed
- Jul13: Group charter approved
- 4Q13: Architectural Design
- 1Q14: First draft specification
- 2Q14: Sample implementation and tests
- 3Q14: Specification ratification
Potential Adoption on Android
• Android exposes Java camera APIs to developers
- Controls underlying Camera HAL
• Camera HAL v1 API simplified basic point and shoot apps
- Difficult or impossible to do much else
• Camera HAL v3 API is a fundamentally different API
- Streams-based to enable more sophisticated camera applications
• FCAM is an open source project developed by Nokia and Stanford
• HAL V3 adopts many FCAM ideas and can use EGL in its implementation
• Khronos Camera API builds on FCAM with a goal of being forward compatible with Android architecture
• Khronos Camera API may be used to IMPLEMENT Android Camera HAL – and provide an advanced native camera API in NDK
StreamInput
Jim Steele CTO, Sensor Platforms
Chair, StreamInput Working Group
Sensor Industry Fragmentation …
Low-level Sensor Abstraction API
• Apps Need Sophisticated Access to Sensor Data
- Without coding to specific sensor hardware
• Apps request semantic sensor information
- StreamInput defines possible requests, e.g.
- Read Physical or Virtual Sensors e.g. “Game Quaternion”
- Context detection e.g. “Am I in an elevator?”
• StreamInput processing graph provides optimized sensor data stream
- High-value, smart sensor fusion middleware can connect to apps in a portable way
• Apps can gain ‘magical’ situational awareness
• Advanced Sensors Everywhere
- Multi-axis motion/position, quaternions, context-awareness, gestures, activity monitoring, health and environmental sensors
• Sensor Discoverability
• Sensor Code Portability
Sensor Types
• Basic sensor data:
- Acceleration, Magnetic Field, Angular Rates
- Pressure, Ambient Light, Proximity, Temperature, Humidity, RGB light, UV light
- Heart rate, Blood Oxygen Level, Skin Hydration, Breathalyzer
• Sensor fusion
- Orientation (Quaternion or Euler Angles), Gravity, Linear Acceleration
- Position
• Context awareness
- Device Motion: general movement of the device: still, free-fall, …
- Carry: how the device is being held by a user: in pocket, in hand, …
- Posture: how the body holding the device is positioned: standing, sitting, step, …
- Transport: about the environment around the device: in elevator, in car, …
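The sensor fusion outputs listed above are related: given an orientation quaternion, the gravity vector in the device frame can be derived, and linear acceleration is the accelerometer reading minus gravity. The sketch below assumes a unit quaternion (w, x, y, z) that rotates device-frame vectors into the world frame, and an accelerometer that reads +9.81 m/s² along the world up axis when the device is at rest. Both conventions, and all names, are assumptions of this sketch rather than anything StreamInput defines.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]
Quat = Tuple[float, float, float, float]  # (w, x, y, z), assumed unit length

G = 9.81  # m/s^2


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate(q: Quat, v: Vec3) -> Vec3:
    """Rotate v by unit quaternion q: v' = v + w*t + qv x t, with t = 2*(qv x v)."""
    w, x, y, z = q
    qv = (x, y, z)
    t = tuple(2.0 * c for c in cross(qv, v))
    u = cross(qv, t)
    return tuple(v[i] + w * t[i] + u[i] for i in range(3))


def conjugate(q: Quat) -> Quat:
    """Inverse rotation for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def gravity_in_device_frame(q_device_to_world: Quat) -> Vec3:
    """Express the at-rest accelerometer vector (world up, length G) in device axes."""
    return rotate(conjugate(q_device_to_world), (0.0, 0.0, G))


def linear_acceleration(accel: Vec3, q_device_to_world: Quat) -> Vec3:
    """Accelerometer reading with the gravity component removed."""
    g = gravity_in_device_frame(q_device_to_world)
    return tuple(a - gi for a, gi in zip(accel, g))


flat = (1.0, 0.0, 0.0, 0.0)                 # device lying flat
s = math.sqrt(0.5)
tilted = (s, s, 0.0, 0.0)                   # rotated 90 degrees about its x axis

print(linear_acceleration((0.5, 0.0, G), flat))
print(gravity_in_device_frame(tilted))
```

With the device flat, gravity lies along the device z axis; after the 90 degree rotation about x it lies along the device y axis, so the same accelerometer reading would be split differently between gravity and linear acceleration.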
StreamInput: Potential Sensor Fusion Stack
[Figure: stack layers shown: Applications; OS Sensor APIs (E.g. Android SensorManager or iOS CoreMotion); Middleware (E.g. Context-awareness engines, gaming engines); StreamInput; Sensor Hubs; Sensors]
• Low-level native API defines access to fused sensor data stream and context-awareness
• StreamInput implementations compete on sensor stream quality, reduced power consumption, environment triggering and context detection – enabling sensor subsystem vendors to increase ADDED VALUE
• Platforms can provide increased access to improved sensor data stream – driving faster, deeper sensor usage by applications
• Middleware engines need platform-portable access to native, low-level sensor data streams
• Mobile or embedded platforms without sensor fusion APIs can provide direct application access to StreamInput
• Hardware transport interfaces are defined by each system, e.g. IIO or HID sensor
• Embedded processors or peripheral hardware implementing StreamInput provide a standard interface to other system processors
Khronos APIs for Augmented Reality
[Figure: blocks shown: MEMS Sensors; Sensor Fusion; Camera Control API (advanced camera control and stream generation); Vision Processing; Application (on CPUs, GPUs and DSPs); Audio Rendering; 3D Rendering and Video Composition (on GPU). EGLStream streams data between APIs, with precision timestamps on all sensor samples]
AR needs not just advanced sensor processing, vision acceleration, computation and rendering - but also for all these subsystems to work efficiently together
Summary
• Khronos is building a family of interoperating APIs for portable and power-efficient vision processing
• OpenVX 1.0 has been provisionally released and non-members are invited to provide feedback on the forums - http://www.khronos.org/message_boards/forumdisplay.php/110-OpenVX-General
• Khronos camera and sensor fusion APIs are currently in design and complement and integrate with OpenVX
• Any company is welcome to join Khronos to influence the direction of mobile and embedded vision processing!
- $15K annual membership fee for access to all Khronos API working groups
- Well-defined IP framework protects your IP and conformant implementations
• www.khronos.org