NATURAL USER INTERFACES
The Second Revolution in Human-Computer Interaction
Natural User Interface Track
Bill Curtis, AMD Senior Fellow
3 | Natural User Interface | June 2011
AGENDA
How do we control "things"?
Human-computer interface revolution #1: Interactive computing
Human-computer interface revolution #2: Natural user interface (NUI)
Three layer NUI model
"Revolutionary" platforms for NUI
Summary – why it's important to "think revolutionary"
HOW DO WE CONTROL “THINGS”?
HOW DO WE CONTROL “THINGS”?
Mechanical machines have always used direct, intuitive controls
Doorknob: US patent, 1878 – "Improvement in a door holding device"
Machine tools – 19th century
HOW DO WE CONTROL “THINGS”?
As complexity increased, we still used familiar wheels, knobs, keys, buttons, levers
HOW DO WE CONTROL “THINGS”?
Even the most complex electronic systems follow mechanical control patterns.
Fit-for-purpose systems, no matter how complicated, use direct, intuitive controls.
HOW DO WE CONTROL “THINGS”?
Direct control concepts also apply to consumer electronics.
Fit-for-purpose remotes are perfectly designed for each device, but multi-purpose is a big problem!
HOW DO WE CONTROL “THINGS”?
How does all this apply to computing?
HOW DO WE CONTROL “THINGS”?
Human-Computer Interface: "HCI" began as "CI" – early computers were not interactive
– Machine output: print, plot, character CRT
– Machine input: cards, tapes, console
For >40 years, we've been trying to make HCI interactive and intuitive by simulating the real world and emulating direct controls
– Metaphorical output: 2D and 3D graphics, video, audio, physical device controls
  Realistic rendering and instrumentation – desktop, appliances with buttons and knobs, game worlds
– Indirect human input: manipulate rendered output
  Keyboards, pointing devices, handheld controllers, voice
Result: Interactive computing!
REVOLUTION 1 | Interactive Computing
FIRST REVOLUTIONARY * CHANGE IN HCI | Interactive Computing
Desktop metaphor + Mouse
* Revolution – "a fundamental change in the way of thinking about or visualizing something: a change of paradigm" (Merriam-Webster's Collegiate Dictionary)
"Though the world does not change with a change of paradigm, the scientist afterward works in a different world." – Thomas Samuel Kuhn, The Structure of Scientific Revolutions, 1962
REVOLUTION 1 | Interactive Computing
Started ~20 years ago – based on ~25 years of invention and evolution
[Timeline, 1960–2020: Doug Engelbart patents the mouse (1964); pointing devices; raster graphics; Xerox Alto (1974); IBM PC; Unix graphics workstations; SPARCstation; Apple Macintosh; Microsoft Windows 3.1; Windows NT; Internet privatization (NSFNET reverts to research); 1 million Internet hosts; Windows 95, IE, HTML, Linux; Macintosh x86. Invention: >15 years. Evolution: 10 years. Revolution: multi-billion $ industry.]
REVOLUTION 1 | Interactive Computing
Why did the UI revolution take >25 years to reach consumers?
To become a multi-billion $ industry, three things had to mature:
– Complete platform: CPU, GPU; interactive computing; multi-purpose OS
– Mature ecosystem: software technology; industry-wide apps; Web, HTML, browser
– User acceptance: productivity + fun; familiarity; affordability
REVOLUTION 2 | Natural User Interface
SECOND REVOLUTIONARY * CHANGE IN HCI | Natural User Interface
Computers start to communicate more like people
More natural, more intuitive
REVOLUTION 2 | Natural UI
Starting now – based on ~40 years of invention and evolution
[Timeline, 1960–2020: capacitive touch R&D (1); resistive touch patents; first practical speech recognition; computer vision R&D; 3D motion capture; Newton MessagePad; dictation apps; voice controls (car, phone); Microsoft Tablet; multi-touch capacitive (2); depth cams; iPhone (2007); iPad; Kinect. Invention: >30 years. Evolution: 10 years. Revolution: multi-$B industry.]
1 – E.A. Johnson (1967). "Touch Displays: A Programmed Man-Machine Interface", Ergonomics 10 (2): 271–277
2 – http://www.billbuxton.com/multitouchOverview.html
REVOLUTION 2 | Natural UI
Why did the UI revolution take >25 years to reach consumers?
To become a multi-billion $ industry, the same three things must mature:
– Complete platform: CPU, GPU; touch, voice, sensors; tailored OS
– Mature ecosystem: software frameworks; tailored apps; curated software
– User acceptance: mobility + fun; ease of use; affordability
REVOLUTION 2 | Natural UI
The second “NUI Revolution” is just getting started
Where is it heading?
THREE LAYER NUI MODEL
THREE LAYER NUI MODEL
1. HCI – sensors detect human behavior
   Detect and process human behavior: >40 years of evolution – vision, sound, physical, environmental, biometric
2. NUI – Natural User Interface: software interprets human behavior
   Emulates human communication: multi-sensory, contextual, intuitive, learning
3. Ambient Computing – NUI extends across multiple devices
   Computing becomes part of everyday life: networked, cloud-based, always active
"The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it." – Mark Weiser, Xerox PARC, 1991
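The three layers can be sketched as a toy pipeline: sensors emit raw events, middleware interprets them as intent, and the ambient layer fans the intent out to every device the user touches. All class, method, and gesture names here are hypothetical illustrations, not part of any real NUI framework:

```python
# Toy sketch of the three-layer NUI model. All names are illustrative.

class HCILayer:
    """Layer 1: sensors detect raw human behavior (touch, sound, vision...)."""
    def read(self, raw_event):
        # A real system would poll cameras, microphones, accelerometers, etc.
        return {"sensor": raw_event[0], "data": raw_event[1]}

class NUILayer:
    """Layer 2: middleware interprets raw sensor data as user intent."""
    def interpret(self, event):
        # e.g. map a swipe gesture detected by the vision sensor to a command
        gestures = {"swipe_left": "previous_page", "swipe_right": "next_page"}
        return gestures.get(event["data"], "unknown")

class AmbientLayer:
    """Layer 3: intent follows the user across devices via cloud context."""
    def __init__(self):
        self.devices = []
    def register(self, device):
        self.devices.append(device)
    def dispatch(self, intent):
        # Broadcast the interpreted intent to every registered device
        return [(d, intent) for d in self.devices]

hci, nui, ambient = HCILayer(), NUILayer(), AmbientLayer()
ambient.register("tablet")
ambient.register("wall_display")
event = hci.read(("camera", "swipe_right"))
intent = nui.interpret(event)
print(ambient.dispatch(intent))  # both devices receive "next_page"
```

The point of the layering is that each boundary is replaceable: a new sensor only touches layer 1, a new gesture vocabulary only touches layer 2, and new devices only register with layer 3.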
THREE LAYER NUI MODEL
1. HCI Layer – Human-Computer Interface
Detect and process human behavior
Physical: mouse, keyboard; multi-touch; tactile, haptics; position sensors; game controllers; physical objects
Visual: free-space gestures; person recognition; eye, gaze tracking; activity modeling; background removal; photo, video search
Auditory: context-free commands; speaker independent; no training; voice search; ambient sound recognition (always listening)
Environmental: GPS, RFID; magnetometer; temperature, pressure; gyros, accelerometers; molecular detection
Biometric: brain-computer interface (BCI); implantables; neuroprosthetics; security; gaming; medical
THREE LAYER NUI MODEL
2. NUI Layer – Middleware and Application Framework
Human behavior translates into action
NUI apps – examples: education, gaming, conferencing, location-based, collaboration, security, healthcare, multimedia
NUI platform middleware – examples: point, select, manipulate; image processing; gesture recognition; voice, sound recognition; recognition – object, face; ambient monitoring; human factors; common controls
[Diagram stack: ambient computing cloud services, NUI apps, NUI platform middleware, HCI sensors]
THREE LAYER NUI MODEL
3. Ambient Computing Layer – multiple devices plus cloud services
NUI becomes part of everyday life
Ambient computing cloud services:
– User's identity: identity, rules, preferences, state
– Cloud context: device registry, services registry, current location and status, social connections
– Device context: ambient apps, events and triggers, NUI services (server offload), multi-user apps
ILLUSTRATION OF AMBIENT COMPUTING
Corning's video – "A Day Made of Glass"
Video is shown with permission of the Corning Glass Technologies Group
ILLUSTRATION OF AMBIENT COMPUTING
Ambient Computing Situations in the Video
There's more going on here than just glass and "touch screens everywhere"
User’s identity is passively recognized on multiple devices– Car, store computer recognized Jennifer. Could be via mobile device or facial recognition.
Cloud context creates consistent experience across multiple devices– Bathroom mirror, car navigation, highway display signs, surfaces, store computer
Device context flows between small screen and large screen devices– Stove, bus stop route display, office Surface, flexible display– Device-to-device communication
What’s missing? NUI is limited to touch. No voice or gesture HCI.
“REVOLUTIONARY” PLATFORMS FOR NUI
REVOLUTIONARY PLATFORMS FOR NUI
Compute
– Realism – high fidelity video, audio
– Natural input – goal: intuitive human communication
– Acceleration (APU) – data parallel algorithms
– Efficiency – NUI duty cycle can be 100%
Software
– Tailored OS and apps – fit-for-purpose controls
– Ecosystem – apps written for the platform
Sensors
– High fidelity – video, audio
– Low latency – I/O at greater than human speed
– High bandwidth – systems for continuous duty cycle
REVOLUTIONARY PLATFORMS FOR NUI
Computational horsepower for NUI: the case for Fusion
Higher performance, lower power for visual NUI computing
Computer vision acceleration
– Turn the graphics pipeline around
Fusion: optimize user experience per unit of energy
– Many HCI / NUI algorithms are well suited for data parallel execution
Fusion: high performance GPU memory access
– Improves GP-GPU performance and programming productivity
Future Fusion: architectural optimization for HCI / NUI
– Short term: algorithm architecture and implementation (e.g., OpenCL™, OpenCV)
– Long term: GPU architecture, camera input data path
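"Well suited for data parallel execution" means the same operation runs independently on every pixel. Background removal by thresholded frame differencing is a minimal example: the NumPy version below vectorizes it on the CPU purely for illustration, and the same per-pixel map is what an OpenCL kernel would do on the GPU side of an APU. The scene and threshold values are made up for the sketch:

```python
import numpy as np

# Background removal by thresholded frame differencing: each pixel is
# processed independently, so the whole operation is one data-parallel map.
def foreground_mask(frame, background, threshold=30):
    # Widen to int16 so the subtraction of uint8 values cannot wrap around
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold  # boolean mask of "moving" pixels

background = np.full((4, 4), 100, dtype=np.uint8)  # static scene
frame = background.copy()
frame[1:3, 1:3] = 200          # a bright 2x2 "person" enters the scene
mask = foreground_mask(frame, background)
print(int(mask.sum()))         # 4 foreground pixels detected
```

Because every output pixel depends only on its own inputs, the work divides evenly across thousands of GPU lanes, which is the property the Fusion argument rests on.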
[APU diagram: rendering + vision]
FUSION: USER EXPERIENCE PER UNIT OF ENERGY
[Chart: high-end (~130 watt ASIC) GPU performance industry trend, 2005–2013, single precision – peak GFlops (left axis, 0–4500) and GFlops/Watt (right axis, 0–30) both rising. Not reflective or predictive of specific AMD products; trend lines need final calibration. CPU GFlops/Watt are far lower – rule of thumb: ~20X better GFlops/Watt on the GPU.]
SOFTWARE – PROGRAMMING FOR AN ACCELERATED PLATFORM
Some of the use-cases AMD partners are working on that REQUIRE acceleration:
Gestures (camera)
– Wide field of view
– Depth of field: 10 inches to 20 feet
– Multi-user tracking
– Very low latency
– Detailed kinetic models (fingers, eyes, mouth)
– Stereopsis (depth) with cheap 2D cameras
Eye tracking
– Eliminate or automate calibration and training
– Real-time area of interest
– Practical UI controls
Sounds
– Large vocabulary, multi-lingual speech recognition
– Speaker independent
– Eliminate training
– Ambient sound classification
– Multi-person speech separation
Face / object recognition
– Fast, low power object classification
– Error rate low enough for secure login
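"Stereopsis with cheap 2D cameras" works by finding, for each patch in the left image, how far it has shifted in the right image; that shift (disparity) is inversely proportional to depth. A toy sum-of-absolute-differences block matcher on synthetic 1D rows shows the idea (the patch size, search range, and images are invented for the sketch; each patch search is independent, which is why this workload parallelizes well):

```python
import numpy as np

# Toy block-matching stereo: find the horizontal shift (disparity) that best
# aligns a patch from the left image within the right image.
def disparity(left_row, right_row, x, patch=3, max_d=4):
    ref = left_row[x:x + patch]
    # SAD cost for each candidate shift d (assumes x - max_d + 1 >= 0)
    costs = [np.abs(ref - right_row[x - d:x - d + patch]).sum()
             for d in range(max_d)]
    return int(np.argmin(costs))  # best-matching shift = disparity

left = np.array([0, 0, 0, 9, 8, 7, 0, 0, 0, 0], dtype=np.int32)
right = np.roll(left, -2)        # same scene, seen 2 pixels shifted
print(disparity(left, right, 3)) # recovers the 2-pixel disparity
```

Real pipelines add sub-pixel refinement, smoothness constraints, and occlusion handling, but the inner loop is this same per-patch cost search repeated over the whole image.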
SENSOR I/O
What's unique about sensor I/O for NUI?
– High bandwidth – cameras can stream gigabits per second (1080p60 with a 24-bit payload is ~3 Gb/s)
– Many sensors – multiple cameras, multiple microphones, gyro, accelerometer, magnetometer, barometer, thermometer, near-field comms, GPS, ambient light, …
– Low latency – goal: real-time response to human input (60 fps isn't enough for fast gestures)
– Continuous duty cycle – sensors for NUI are active all the time
Platform design implications
– Efficient interfaces – low overhead, low power (e.g., avoid USB for internal sensors)
– Local processing – round trips to networked services increase latency
– Partitioned design – sensor processing in parallel with application processing
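The bandwidth and latency figures above follow directly from the video parameters, as a quick sanity check shows:

```python
# Sanity check of the slide's sensor I/O figures.
width, height, fps, bits_per_pixel = 1920, 1080, 60, 24

# Raw 1080p60 pixel stream: ~3 Gb/s, matching the "~3Gb/s" claim
gbps = width * height * fps * bits_per_pixel / 1e9
print(round(gbps, 2))             # 2.99

# Frame period at 60 fps: a gesture system sampling at frame rate can
# never respond faster than this, hence "60 fps isn't enough"
frame_period_ms = 1000 / fps
print(round(frame_period_ms, 1))  # 16.7
```

Note this is the uncompressed payload only; sensor-interface framing and multiple simultaneous cameras push the platform requirement higher still.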
SUMMARY | Why It’s Important to “Think Revolutionary”
SUMMARY: SWING FOR THE FENCES!
This is a revolution – not an incremental change – and it'll play out over the next 20 years
– Legacy compatibility is OK, but don't dumb down revolutionary NUI products to fit the old HCI paradigm
Use the whole platform
– Yes, you have to write data parallel code
Do not compromise the user experience
– Intuitive, truly natural, no "training", multi-sensory, multi-user, multi-cultural
Go for mass markets
– Consumers love this stuff! Build Fords and Toyotas, not just Maybachs and Bentleys
Tell us what you need in software support and future APUs
– We're just gettin' started!
QUESTIONS
Disclaimer & Attribution
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. There is no obligation to update or otherwise correct or revise this information. However, we reserve the right to revise this information and to make changes from time to time to the content hereof without obligation to notify any person of such revisions or changes.
NO REPRESENTATIONS OR WARRANTIES ARE MADE WITH RESPECT TO THE CONTENTS HEREOF AND NO RESPONSIBILITY IS ASSUMED FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
ALL IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. IN NO EVENT WILL ANY LIABILITY TO ANY PERSON BE INCURRED FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
AMD, the AMD arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. All other names used in this presentation are for informational purposes only and may be trademarks of their respective owners.
© 2011 Advanced Micro Devices, Inc. All rights reserved.