Human Interface Guidelines


    Intel Perceptual Computing SDK

    Human Interface Guidelines

    Revision 3.0

    February 25, 2013


    Legal Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark* and MobileMark*, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Any software source code reprinted in this document is furnished under a software license and may only be used or copied in accordance with the terms of that license.

Intel, the Intel logo, and Ultrabook are trademarks of Intel Corporation in the US and/or other countries.

Copyright 2012-2013 Intel Corporation. All rights reserved.

*Other names and brands may be claimed as the property of others.


    Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

    Notice revision #20110804


    Table of Contents

Introduction
    Welcome
    About the Camera
High-Level Design Principles
    Input Modalities
    Design Philosophy
    Multimodality
Gesture Design Guidelines
    Capture Volumes
    Occlusion
    High-Level Mid-Air Gesture Recommendations
    Recognized Poses
    Universal Gesture Primitives
    Other Considerations
    Samples and API
Voice Design Guidelines
    High-Level Voice Design Recommendations
    Voice Recognition
    Speech Synthesis
    Samples and API
Face Tracking Design Guidelines
    High-Level Recommendations
    Samples and API
Visual Feedback Guidelines
    High-Level Recommendations
    Representing the User
    Representing Objects
    2D vs. 3D
    Traditional UI Elements
    Integrating the Keyboard, Mouse, and Touch
Questions or Suggestions?


    Introduction

    Welcome

Participate in the revolution of Perceptual Computing! Imagine new ways of navigating your world with more senses and sensors integrated into the computing platform of the future. Give your users a new, natural, engaging way to experience applications, and have fun while doing it. At Intel we are excited to provide the tools as the foundation for this journey with the Intel Perceptual Computing SDK and look forward to seeing what you come up with. Over the next few months, you will be able to incorporate new capabilities into your applications, including close-range hand gestures, finger articulation, speech recognition, face tracking, and augmented reality experiences, to fundamentally change how people interact with their PCs.

Perceptual Computing is about bringing exciting user experiences through new human-computing interactions where devices sense and perceive the user's actions in a natural, immersive, and intuitive way.

This document is intended to help you create innovative, enjoyable, functional, consistent, and powerful user interfaces for the Perceptual Computing applications of the future. In particular, it will help you:

• Develop compelling user experiences appropriate for the platform.

• Design intuitive and approachable interactions.

• Make proper use of different input modalities.

Remember, Perceptual Computing is a new field, and the technology gets better literally every week. Don't just design for today; as a designer and developer you will need to be creatively agile in designing for extensibility, modularity, and scalability for tomorrow's capabilities. We'll share new updates with you as they become available!


    About the Camera

Intel has announced the release of a peripheral device for use in Perceptual Computing applications: the CREATIVE* Interactive Gesture Camera. This is the first, but not necessarily only, technology platform from Intel that will be able to sense gesture, voice, and other input modalities. The guidelines in this document apply to this device, but also apply, in a broader sense, to other potential technology platforms.

The following are some of the critical specifications of the CREATIVE* Interactive Gesture Camera:

• Size: 4.27 in × 2.03 in × 2.11 in (10.8 cm × 5.2 cm × 5.4 cm)

• Weight: 9.56 oz (271 grams)

• Power: Single USB 2.0 (


    Physical Device Configuration

You'll want your app to work on a variety of platforms. Users might be running your application on a notebook, Ultrabook device, All-in-one, convertible, tablet, or traditional PC and monitor. These different platforms present different ergonomic limitations. Keep in mind the following variables:

Screen size

Smaller laptops and Ultrabook systems commonly have 13-inch screens and, occasionally, have even smaller screens. Desktops may have 24-inch screens or larger. This presents a design challenge for generating application UI and related artwork and for designing interactions. You must be flexible in supporting different display sizes.

Screen distance

Users are normally closer to laptop screens than desktop ones because laptop screens and keyboards are attached. Likewise, a laptop screen is often lower than a desktop one, relative to the user's face and hands.

When using a laptop, a user's hands tend to be very close to the screen. The screen is usually lower, relative to the user's head.

When using a desktop, a user's hands are farther away from the screen. The screen is also higher, relative to the user's head.

Camera configuration

The Perceptual Computing camera is designed to be mounted on top of the monitor. Design your application assuming that this is the location of the camera. The camera is typically pointed at the user such that the user's head and the upper portion of the user's torso are in view. This supports common use cases such as video-conferencing. The camera will be placed at different heights on different platforms. For a large desk-mounted display, the camera height could be even with the top of the user's head, oriented to look down at the user. For an Ultrabook device on the user's lap, the camera could be much lower, angled up at the user. Your application should support these different camera configurations.


    Proper camera mounting on a stand-alone monitor. Proper camera mounting on a laptop.

You should be flexible in supporting different screen sizes and camera configurations since this will impact the user's interaction space.


    High-Level Design Principles

To design a successful app for the Perceptual Computing platform, you must understand its strengths. The killer apps for Perceptual Computing will not be the ones that we have seen on traditional platforms, or even on more recent platforms such as phones and tablets.

    Input Modalities

What sets the Perceptual Computing platform apart from traditional platforms are the new and different input modalities. You'll want to understand the strengths of these modalities, and incorporate them into your app appropriately. It can be especially powerful to combine multiple modalities. For example, users can often coordinate simultaneous physical and voice input operations, making interaction richer and less taxing.

• Mid-air hand gestures. Allows for very rich and engaging interaction with 2D or 3D objects. Allows easier, more literal direct manipulation. However, mid-air gesture can be tiring over long periods, and precision is limited.

• Touch. Also very concrete and easy to understand, with the additional benefit of having tactile feedback to touch events. However, touch is limited to 2D interaction. It is not as flexible as mid-air gesture.

• Voice. Human language is a powerful and compelling means of expression. Voice is also useful when a user is not within range of a computer's other sensors. Environmental noise and social appropriateness should be considered.

• Mouse. The best modality for the accurate indication of a 2D point. Large-scale screen movements can be made with small mouse movements.

• Keyboard. Currently the best and most common modality for consistent and accurate text input. Useful for easy and reliable shortcuts.


    Design Philosophy

Designing and implementing applications for the Perceptual Computing platform requires a very different mindset than designing for traditional platforms, such as Windows* or Mac* OS X, or even newer platforms like iOS* or Android*. When designing your app, you'll want it to be:

• Reality-inspired, but not a clone of reality. You should draw inspiration from the real world. Perceptual Computing builds off of our natural skills used in everyday life. Every day we use our hands to pick up and manipulate objects and our voices to communicate. Leverage these natural human capabilities. However, do not slavishly imitate reality. In a virtual environment, we can relax the rules of the physical world to make interaction easier. For example, it is very difficult for a user to precisely wrap their virtual fingers around a virtual object in order to pick it up. With the Intel Perceptual Computing SDK, it may be easier for a user to perform a grasp action within a short proximity of a virtual object in order to pick it up.

• Literal, not abstract. Visual cues and interaction styles built from real-world equivalents are easier to understand than abstract symbolic alternatives. Also, symbolism can vary by geography and culture, and doesn't necessarily translate. Literal design metaphors, such as switches and knobs, are culturally universal.

• Intuitive. Your application should be approachable and immediately usable. Visual cues should be built in to guide the user. Voice input commands should be based around natural language usage, and your app should be flexible and tolerant in interpreting input.

• Consistent. Similar operations in different parts of your application should be performed in similar ways. Where guidelines for interaction exist, as described in this document, you should follow them. Consistency across applications in the Perceptual Computing ecosystem builds understanding and trust in the user.

• Extensible. Keep future SDK enhancements in mind. Unlike mouse interfaces, the power, robustness, and flexibility of Perceptual Computing platforms will improve over time. How will your app function in the future when sensing of hand poses improves dramatically? How about when understanding of natural language improves? Design your app such that it can be improved as technology improves and new senses are integrated together.

• Reliable. It only takes a small number of false positives to discourage a user from your application. Focus on simplicity where possible to minimize errors.

• Intelligently manage persistence. For example, if a user's hand goes out of the field of view of the camera, make sure that your application doesn't crash or do something completely unexpected. Intelligently handle such situations and provide feedback.

• Designed to strengths. Mid-air gesture input is very different from mouse input or touch input. Each modality has its strengths and weaknesses; use each when appropriate.

• Contextually appropriate. Are you designing a game? A medical application? A corporate content-sharing application? Make sure that the interactions you provide match the context. For example, you expect to have more fun interactions in a game, but may want more straightforward interactions in a more serious context. Pay attention to modalities (e.g., don't rely on voice in a noisy environment).

Take user-centered design seriously. Even the best designs need to be tested by the intended users. Don't do this right before you plan to launch your application or product. Unexpected issues will come up and require you to redesign your application. Make sure you know who your audience is before choosing the users you work with.

    Multimodality

As we add more to our SDK, you will have additional sensors and inputs to play with. Make sure to design smartly: don't use all types of input just for the sake of it, but also make sure to take advantage of combining different input modalities both synchronously and asynchronously. This will make it a more exciting and natural experience for the user, and can minimize fatigue of the hands, fingers, or voice. Having a few different modalities working in unison can also inspire confidence in the user that they are conveying the proper information. For example, use your hand to swipe through images, and use your voice to email the ones you like to a friend. Design in such a way that extending to different modalities and combinations of modalities is easy. Make sure that it is comfortable for the user to switch between modalities both mentally and physically. Also keep in mind that some of your users may prefer certain modalities over others, or have differing abilities.


    Gesture Design Guidelines

In this section we describe best practices for designing and implementing mid-air hand input (gesture) interactions.

Capture Volumes

It is important to be aware of the sensing capabilities of your platform when designing and implementing your application. A camera has a certain field-of-view, or capture volume, beyond which it can't see anything. Furthermore, most depth-sensing cameras have minimum and maximum sensing distances. The camera cannot sense objects closer than the minimum distance or farther than the maximum distance.

The capture volume of the camera is visualized as a frustum defined by near and far planes and a field-of-view.

The user is performing a hand gesture that is captured in the camera's capture volume.

The user is performing a hand gesture outside of the capture volume. The camera will not see this gesture.

Capture volume constraints limit the practical range of motion of the user and the general interaction space. Especially in games, enthusiastic users can inadvertently move outside of the capture volume. Feedback and interaction must take these situations into account.

When performing gestures, it is expected that the user leans back in the chair in a relaxed position. The user's hands move around a virtual plane roughly 12 inches away from the camera. This virtual plane serves multiple purposes: (a) it activates hand tracking when the user's hand is within 12 inches of the camera; (b) the swipe gestures use the plane to distinguish between a left swipe and a right swipe.

It is also recommended that the user's head always be at least eight inches away from the user's hands. The hand-tracking software cannot reliably distinguish a hand from a head if they are too close to each other.
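To make the capture-volume constraint concrete, the following minimal sketch shows one way an application might test whether a tracked camera-space point lies inside a frustum defined by near and far planes and a field of view. The angles and distances are illustrative placeholders, not the specifications of any particular camera.

```cpp
#include <cmath>

// Illustrative capture-volume description; all values are placeholders.
struct CaptureVolume {
    float nearM;        // near plane distance from the camera, in meters
    float farM;         // far plane distance from the camera, in meters
    float halfFovXRad;  // half of the horizontal field of view, in radians
    float halfFovYRad;  // half of the vertical field of view, in radians
};

// Returns true if a camera-space point (x right, y up, z forward from the
// camera) lies inside the frustum, i.e. the camera can plausibly see it.
bool insideCaptureVolume(const CaptureVolume& v, float x, float y, float z) {
    if (z < v.nearM || z > v.farM) return false;                   // too close or too far
    if (std::fabs(x) > z * std::tan(v.halfFovXRad)) return false;  // outside left/right
    if (std::fabs(y) > z * std::tan(v.halfFovYRad)) return false;  // outside top/bottom
    return true;
}
```

An application can use a test like this to warn the user as a hand approaches the edge of the volume, rather than waiting until tracking is lost.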

Occlusion

For applications involving mid-air gestures, keep in mind the problem of a user's hands occluding the user's view of the screen. It is awkward if users raise their hand to grab an object on the screen, but can't see the object because their hand caused the object to be hidden. When mapping the hand to the screen coordinates, map them in such a way that the hand is not in the line of sight of the screen object to be manipulated.
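One simple way to realize this is to offset the on-screen representation away from where the physical hand sits in the user's line of sight. The sketch below is a hypothetical mapping, not SDK code; the offset value is an illustrative starting point.

```cpp
// Map a normalized hand position (0..1 in the camera image) to screen
// coordinates, raising the cursor so the manipulated object is not hidden
// directly behind the user's physical hand.
struct ScreenPoint { int x; int y; };

ScreenPoint mapHandToScreen(float normX, float normY, int screenW, int screenH) {
    const float kVerticalOffset = 0.15f;   // assumed offset: 15% of screen height
    float sx = normX * screenW;
    float sy = (normY - kVerticalOffset) * screenH;
    if (sy < 0.0f) sy = 0.0f;              // clamp to the top of the screen
    return ScreenPoint{ static_cast<int>(sx), static_cast<int>(sy) };
}
```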


    High-Level Mid-Air Gesture Recommendations

For many Perceptual Computing applications, mid-air gestures will be the primary input modality. Consider the following points when designing your interaction and when considering gesture choices:

• Where possible, make use of our universal gesture primitives. Introduce your own gesture primitives only when there is a compelling reason to do so. A small set of general-purpose natural gestures is preferable to a larger set of specialized gestures. As more apps come out, users will come to expect certain primitives, which will improve the perceived intuitiveness.

• Stay away from abstract gestures that require users to memorize a sequence or a pose. Abstract gestures are gestures that do not have a real-life equivalent and don't fit any existing mental models. An example of a confusing pose is thumbs down to delete something. An example of a better delete gesture is to place or throw an item in a trash can.

• Poses vs. gestures. Be aware of the different types of gestures. Poses are sustained postures, ones like clenching a fist to select an item, and dynamic gestures are those like swiping to turn a page. Figure out which make more sense for different interactions, and be clear in communicating which is needed at any given point.

• Innate vs. learned gestures. Some gestures will be natural to the user (e.g., grabbing an object on the screen), while some will have to be learned (e.g., waving to escape a mode). Make sure you keep the number of gestures small for a low cognitive load on the user.

• Be aware of which gestures should be actionable. What will you do if the user fixes her hair, drinks some coffee, or turns to talk to a friend? Make sure to make your gestures specific enough to be safe in these situations and not mess up the experience.

• Relative vs. absolute motion. Relative motion allows users to reset their current hand representation on the screen to a location more comfortable for their hand (e.g., as one would lift a mouse and reposition it so that it is still on a mouse pad). Absolute motion preserves spatial relationships. Applications should use the motion model that makes the most sense for the particular context.

• Design your gestures to be ergonomically comfortable. If the user gets tired or uncomfortable, they will likely stop using your application.

• Gesturing left-and-right is easier than up-and-down. Whenever presented with a choice, design for movement in the left-right directions for ease and ergonomic considerations.

• Two hands when appropriate. Some tasks, like zooming, are best performed with two hands. Support bi-manual interaction where appropriate.

• Handedness. Be aware of supporting both right- and left-handed gestures.

• Flexible thresholds. Make sure your code can accommodate hands of varying sizes and amounts of motor control. Some people may have issues with the standard settings and the application will need to work with them. For example, to accommodate an older person with hand jitter, the jitter threshold should be customizable. Another example is accommodating a young child or an excitable person who makes much larger gestures than you might expect.


• Teach the gestures to the users. Provide users with a tutorial for your application, or show obvious feedback that guides them when first using the application. You could have an option to turn this training off after a certain amount of time or number of uses.

• Give an escape plan. Make it easy for the user to back out of a gesture or a mode, or reset. Consider providing the equivalent of a traditional home button.

• Be aware of your gesture engagement models. You may choose to design a gesture such that the system only looks for it once the user has done something to engage the system first (e.g., spoken a command, made a thumbs up pose).

• Design for the right space. Be aware of designing for a larger world space (e.g., with larger gestures, more arm movement) versus a smaller, more constrained space (e.g., manipulating a single object). Distinguish between environmental and object interaction.


    Recognized Poses

A pose and a gesture are two distinct things. A pose is a sustained posture, while a gesture is a movement between poses. Here are the poses that we currently recognize as part of the SDK.

Openness

Using our SDK, you are able to discern between an open hand and a closed hand by looking at the LABEL_OPEN and LABEL_CLOSE attributes, respectively.

Thumbs Up and Thumbs Down

Thumbs up and thumbs down poses can be recognized by looking at the LABEL_POSE_THUMB_UP and LABEL_POSE_THUMB_DOWN attributes, respectively. These could be used, for example, to confirm or cancel a verbal command.

Peace

The peace sign pose can be recognized by looking at the LABEL_POSE_PEACE attribute. This could be used as a trigger command, for example.

Big5

The Big 5 pose can be recognized by looking at the LABEL_POSE_BIG5 attribute. Depending on the context of the application, this pose could be used to stop some sort of action (or to turn off voice commands, for example), or to initiate a gesture.
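As a sketch of how an application might dispatch on these poses, the snippet below maps each recognized pose to an application action. The enum and the stub actions are defined locally for illustration only; a real application would read the corresponding LABEL_* values from the SDK's gesture module.

```cpp
#include <iostream>

// Local stand-in for the pose labels described above (illustrative only).
enum class HandPose { Open, Close, ThumbUp, ThumbDown, Peace, Big5 };

void onPoseRecognized(HandPose pose) {
    switch (pose) {
        case HandPose::ThumbUp:   std::cout << "confirm verbal command\n"; break;
        case HandPose::ThumbDown: std::cout << "cancel verbal command\n";  break;
        case HandPose::Peace:     std::cout << "trigger action\n";         break;
        case HandPose::Big5:      std::cout << "stop current action\n";    break;
        case HandPose::Open:                      // openness is typically consumed
        case HandPose::Close:                     // by grab/release logic instead
            break;
    }
}
```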


    Universal Gesture Primitives

We have defined some gestures that are reserved for pre-defined actions. In general, these gestures should be used only for these actions. Conversely, when these actions exist in your application, they should generally be performed using the given gestures. Providing feedback for these gestures is critical, and is discussed in the Visual Feedback Guidelines section. We don't require that you conform to these guidelines, but if you depart from these guidelines you should have a compelling user experience reason to do so. This set of universal gestures will become learned by users as standard and will become more expansive over time.

Partial support for these gestures exists in the SDK. Some gestures are supported in their entirety, some are supported in a limited number of poses, and some are not yet supported. We plan to provide more complete support as the SDK matures.

Grab and Release

The gesture for grabbing an on-screen object is shown below. The user should start with a unique pose in order to start the sequence. The user should then have her fingers and thumb apart, and then bring them together into the grab pose. The reverse action, moving the fingers and thumb apart, releases the object. Limited grab and release functionality can be achieved through the openness parameter (the value from 0 to 100 that indicates the level of palm openness) and fingertips (e.g., LABEL_FINGER_THUMB, LABEL_FINGER_INDEX) exposed by the SDK. For more reliable detection, you can also detect the top, middle, and bottom of the hand (e.g., LABEL_HAND_MIDDLE).

    A user grabs an object.
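A minimal sketch of grab/release detection driven by the 0-100 openness value described above. The thresholds are illustrative assumptions; using two different thresholds (hysteresis) avoids rapid grab/release flicker when the hand hovers near a single cutoff.

```cpp
// Feed the latest openness reading each frame; update() returns true while
// the hand is considered to be grabbing.
class GrabDetector {
public:
    bool update(int openness) {
        if (grabbing_) {
            if (openness > kReleaseAbove) grabbing_ = false;  // fingers opened: release
        } else {
            if (openness < kGrabBelow)    grabbing_ = true;   // fingers closed: grab
        }
        return grabbing_;
    }
private:
    static const int kGrabBelow    = 25;  // assumed closed-hand threshold
    static const int kReleaseAbove = 60;  // assumed open-hand threshold
    bool grabbing_ = false;
};
```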


    Move

After grabbing an object, the user moves her hand to move the object. Some of the general guidelines for the design of basic grabbable objects are:

• It should be obvious to the user which objects can be moved and which cannot be moved.

• If the interface relies heavily on grabbing and moving, it should be obvious to the user where a grabbed object can be dropped. It may be useful to provide snappable behavior.

• Objects should be large enough to account for slight hand jitter.

• Objects should be far enough apart so users won't inadvertently grab the wrong object.

If the hand becomes untracked while the user is moving an object, the moved object should reset to its origin, and the tracking failure should be communicated to the user. This functionality can be realized through hand tracking with an openness value indicating a closed hand.

    A user moves an object.

    Pan

If the application supports panning, this should be done using a flat hand. Panning engages once the hand is made mostly flat. Translation of the flat hand pans the view. Once the hand relaxes into a natural, slightly curled pose, which can be determined by the hand openness parameter, panning ends. Note that if a single pan does not move the view far enough, the user will have to relax the hand, move it back, and pan again.

    A user pans the view.


    Zoom

If the application supports zooming, this should be done using two flat hands. Zooming engages once both hands become mostly flat. Zooming is then coupled to the distance between the two hands (similar to pinch-zooming for touch). Zoom functionality requires an action to disengage the zooming; otherwise, the user cannot escape without changing the zoom.

Resizing an object is very similar. Instead of keeping the two hands open, one hand will grab one side of an object, while the second hand grabs the other side of the object. Then the user moves the hands relative to one another, either closer together to shrink the object, or farther apart to grow the object. Once the user releases one hand, the resize operation ends.

A user zooms the view.
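The following sketch shows one way to couple zoom to the distance between two flat hands, as described above. The Hand struct and all names are illustrative, not SDK types; hand positions are assumed to be camera-space coordinates in meters.

```cpp
#include <cmath>

struct Hand { float x, y, z; bool isFlat; };

class ZoomController {
public:
    // Returns the zoom factor to apply to the view (1.0 = unchanged).
    float update(const Hand& left, const Hand& right, float currentZoom) {
        bool bothFlat = left.isFlat && right.isFlat;
        float dist = std::hypot(left.x - right.x, left.y - right.y);
        if (bothFlat && !engaged_) {          // both hands flat: engage zooming
            engaged_ = true;
            engageDist_ = dist;
            engageZoom_ = currentZoom;
        } else if (!bothFlat && engaged_) {   // a hand relaxed: disengage zooming
            engaged_ = false;
        }
        if (!engaged_ || engageDist_ <= 0.0f) return currentZoom;
        return engageZoom_ * (dist / engageDist_);  // zoom follows hand separation
    }
private:
    bool  engaged_    = false;
    float engageDist_ = 0.0f;
    float engageZoom_ = 1.0f;
};
```

Relaxing either hand disengages the controller, which satisfies the requirement that the user can escape without changing the zoom.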

    Wave

The gesture for resetting, escaping a mode, or moving up a hierarchy is shown below. The user quickly waves her hand back and forth. This is a general-purpose get-me-out-of-here gesture. You can find this in the SDK under LABEL_HAND_WAVE.

A user waves to reset a mode.


    Circle

The circle gesture, LABEL_HAND_CIRCLE, is recognized when the user extends all fingers and moves the hand in a circle. This could be used for selection or resetting, for example.

A user circles with her hand to move to the next level of a game.

    Swipe

Swipes are basic navigation gestures. However, it is technically challenging to recognize swipes accurately. In many cases, when the user performs multiple swipes in a row, a left swipe looks exactly like a right swipe from the camera's point of view; the same applies to up and down swipes. You can find swipe in the SDK under LABEL_NAV_SWIPE_LEFT, LABEL_NAV_SWIPE_RIGHT, LABEL_NAV_SWIPE_UP, and LABEL_NAV_SWIPE_DOWN.

To avoid confusion, the user should perform the swipe gestures as follows: imagine there is a virtual plane about 12 inches away from the camera. The swipes must first go into the plane, travel inside the plane from left to right or right to left, and then go out of the plane.
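The sketch below illustrates how the virtual plane disambiguates swipe direction: horizontal travel only counts while the hand is inside a depth band roughly 12 inches (about 0.30 m) from the camera. The depth band and travel threshold are illustrative assumptions, not SDK values.

```cpp
enum class Swipe { None, Left, Right };

class SwipeDetector {
public:
    // x: hand position in meters; z: hand distance from the camera in meters.
    Swipe update(float x, float z) {
        const float kPlaneNear = 0.25f, kPlaneFar = 0.35f;  // assumed plane depth band
        const float kMinTravel = 0.10f;                     // assumed minimum swipe travel
        bool inPlane = (z >= kPlaneNear && z <= kPlaneFar);
        Swipe result = Swipe::None;
        if (inPlane && !wasInPlane_) {
            entryX_ = x;                                    // hand pushed into the plane
        } else if (!inPlane && wasInPlane_) {
            float travel = x - entryX_;                     // hand pulled out of the plane
            if (travel >  kMinTravel) result = Swipe::Right;
            if (travel < -kMinTravel) result = Swipe::Left;
        }
        wasInPlane_ = inPlane;
        return result;
    }
private:
    bool  wasInPlane_ = false;
    float entryX_     = 0.0f;
};
```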

    Other Considerations

    Hand Agnosticism

All one-handed gestures can be performed with either the right or left hand. For two-handed gestures where the sequence of operations matters (e.g., grabbing an object with both hands for the resize gesture), the hand choice for starting the operation does not matter.



    Finger Count Independence

For many gestures, the number of fingers extended does not matter. For example, the pan operation can be performed with all fingers extended, or only a few. Restrictions in finger count exist only where necessary to avoid conflict. For example, having the index finger extended could be reserved for pointing at a 2D location, in which case it can't also be used for panning.

    Flexibility in Interpretation of Pose

Hands can be in poses similar to, but slightly different from, the poses described. For example, accurate panning can be accomplished with the fingers pressed together or fanned apart.

    Rate Controlled or Absolute Controlled Rotation and Translation

You can use an absolute-controlled model or a rate-controlled model to control gesture-adjusted parameters such as rotation, translation (of object or view), and zoom level. In an absolute model, the magnitude to which the hand is rotated or translated in the gesture is translated directly into the parameter being adjusted, i.e., rotation or translation. For example, a 90-degree rotation by the input hand results in a 90-degree rotation of the virtual object. In a rate-controlled model, the magnitude of rotation/translation is translated into the rate of change of the parameter, i.e., rotational velocity or linear velocity. For example, a 90-degree rotation could be translated into a rate of change of 10 degrees/second (or some other constant rate). With a rate-controlled model, users release the object or return their hands to the starting state to stop the change.
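A short sketch contrasting the two models for a rotation parameter. The gain constant mirrors the 90-degree-to-10-degrees-per-second example above; all names are illustrative.

```cpp
// handAngleDeg: how far the hand is rotated from its starting pose.
// dtSeconds:    frame time for the rate-controlled update.

float absoluteControlledAngle(float objectStartDeg, float handAngleDeg) {
    // Absolute model: hand rotation maps directly onto object rotation.
    return objectStartDeg + handAngleDeg;
}

float rateControlledAngle(float objectCurrentDeg, float handAngleDeg, float dtSeconds) {
    // Rate model: hand rotation sets a rotational velocity; a 90-degree hand
    // rotation here yields 10 degrees/second of object rotation.
    const float kDegPerSecPer90Deg = 10.0f;
    float velocity = (handAngleDeg / 90.0f) * kDegPerSecPer90Deg;
    return objectCurrentDeg + velocity * dtSeconds;
}
```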


    How to Minimize Fatigue

Gestural input is naturally fatiguing as it relies on several large muscles to sustain the whole arm in the air. It is a serious problem and should not be disregarded; otherwise, users may quickly abandon the application. By carefully balancing the following guidelines, you can alleviate the issue of fatigue as much as possible:

• Allow users to interact with elbows rested on a surface. Perhaps the best way to alleviate arm fatigue is by resting elbows on a chair's arm rest. Support this kind of input when possible. This, however, reduces the usable range of motion of the hand to an arc in the left and right direction. Evaluate whether interaction can be designed around this type of motion.

• Make gestures short-lasting. Long-lasting gestures, especially ones where the arms must be held in a static pose, quickly induce fatigue in the user's arm and shoulder (e.g., holding the arm up for several seconds to make a selection).

• Design for breaks. Users naturally, and often subconsciously, take quick breaks (e.g., professors writing on the blackboard). Short, frequent breaks are better than long, infrequent ones.

• Do not require precise input. Users naturally tense up their muscles when trying to perform very precise actions (much like trying to reduce camera shake when taking a picture in the dark). This, in turn, accelerates fatigue. Allow for gross gestures and make your interactive objects large.

• Do not require many repeating gestures. If you require users to constantly move their hands in a certain way for a long period of time (e.g., while moving through a very long list of items by panning right), they will become tired and frustrated very quickly.

Samples and API

In the /doc folder of the SDK, you can find a file called sdksamples.pdf. This gives you examples that show finger tracking, pose/gesture recognition, and event notification (gesture_viewer, gesture_viewer_simple) in both C++ and C#. You can run the applications and view the source code in the Intel/PCSDK/sample folder.


Also, in sdkmanual-gesture.pdf, you can find the most current version of the gesture module, which consumes RGB, depth, or IR streams as input and returns blob information, geometric node tracking results, pose/gesture notification, and alert notification.

For an example of 2D pan, zoom, and rotate, see: http://github.com/IntelPerceptual/PerceptualP5/tree/master/PanZoomRotate

For an online tutorial on close-range hand/finger tracking, see: http://software.intel.com/en-us/sites/default/files/article/328725/perc-gesturerecognition-tutorial-final.pdf


    Voice Design Guidelines

In this section we describe best practices for designing and implementing voice command and control, dictation, and text-to-speech for your applications. As of now, English is the only supported language.

    High-Level Voice Design Recommendations

• Test your application in noisy backgrounds and different environmental spaces to ensure robustness of sound input.

• Watch out for false positives. For example, don't let a specific sound delete a file without verification, as this sound could unexpectedly crop up as background noise.

• Always show the listening status of the system. Is your application listening? Not listening? Processing sound?

• People do not speak the way they write. Be aware of pauses and interjections such as "um" and "uh."

• Teach the user how to use your system as they use it. Give more help initially, then fade it away as the user gets more comfortable (or have it as a customizable option).

    Voice Recognition

Command Mode vs. Dictation Mode

Be aware of the different listening modes your application will be in. Once listening, your application can be listening in command mode or dictation mode. Command mode is for issuing commands (e.g., "Start computer," "Email photo," "Volume up"). In command mode, the SDK module recognizes phrases only from a predefined list of context phrases that you have set. The developer can use multiple command lists, which we will call grammars. Good command application design would create multiple grammars and activate the one that is relevant to the current application state (this limits what the user can do at any given point in time based on the command grammar used). To invoke command mode, provide a grammar.

Dictation mode is for general language from the user (e.g., entering the text for a Facebook status update). Dictation mode has a predefined vocabulary. It is a large, generic vocabulary containing 50k+ common words (with some common named entities). Highly domain-specific terms (e.g., medical terminology) may not be widely represented. Absence of a grammar will invoke the SDK in dictation mode. Dictation is limited to 30 seconds. Currently, grammar mode and dictation mode cannot be run at the same time.
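A minimal sketch of the "one grammar per application state" pattern recommended above. How the active phrase list is handed to the recognizer is SDK-specific and omitted here; the state names and phrases are purely illustrative.

```cpp
#include <map>
#include <string>
#include <vector>

enum class AppState { MainMenu, PhotoViewer };

// One command grammar per application state (illustrative phrases).
std::map<AppState, std::vector<std::string>> buildGrammars() {
    return {
        { AppState::MainMenu,    { "start program", "open photos", "exit" } },
        { AppState::PhotoViewer, { "next photo", "previous photo",
                                   "email photo", "back to menu" } },
    };
}

// Called whenever the application changes state; returns the phrases the
// voice module should listen for in that state.
const std::vector<std::string>& activeGrammar(
        const std::map<AppState, std::vector<std::string>>& grammars,
        AppState state) {
    return grammars.at(state);
}
```

Keeping each grammar small and state-specific limits what the recognizer must distinguish at any moment, which reduces false positives.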


    Constructing Grammars

Keep the following points in mind when constructing your grammars:

• Don't assume that your command phrasing is natural! The language you use is very important. Ask other people (friends, family, people on forums, study participants) how they would want to interact with your system or initiate certain events.

• Provide many different options for your grammar to require less effort from the user, and try to make interaction more natural. For example, instead of constraining the user to say "Program start," you could also accept "Start program," "Start my program," "Begin program," etc.

• Complicated words/names are not easily recognized. Make your grammar include commonly used words. However, very short words can be difficult to recognize because of sound ambiguity with other words.

• Be aware of the length of the phrases in your grammar. Longer phrases are easier to distinguish between, but you also don't want users to have to say long phrases too often.

• Beware of easily confusable commands. For example, "Create playlist" and "Create a list" will likely sound the same to your application. One would be used in a media player setting, and the other could be in a word processor setting, but if they are both in one grammar the application could have undesired responses.

• Experiment with different lengths of end-of-sentence detection. Responsiveness is important, and the end-of-sentence parameter (endofSentence in PXCVoiceRecognition::ProfileInfo) can help adjust the responsiveness of the application.

    User Feedback

• Let the user know what commands are possible. It is not obvious to the user what your application's current grammar is.

• Let the user know how to initiate listening mode. Make your application's listening status clear.

• Let the user know that their commands have been understood. The user needs to know this to trust the system, and to know which part is broken if something doesn't go the way they planned. One easy way to do this is to relay back a command. For example, the user could say "Program start," and the system could respond by saying "Starting program, please wait."

• Give users the ability to make the system stop listening.

• Give users the ability to edit/redo/change their dictation quickly. It might be easier for the user to edit their dictation with the mouse, keyboard, or touchscreen at some point.

• If you give verbal feedback, make sure it is necessary, important, and concise! Don't overuse verbal feedback as it could get annoying to the user.

• If sound is muted, provide visual components and feedback.


    How To Minimize Fatigue

Remember, the user should not have to be speaking constantly. Speech is best used for times when dictation is being used, or when triggers are necessary to accomplish actions. Speech can be socially awkward in public, and background noise can easily get in the way of successful voice recognition. Be aware of speech's best uses. Speech is best used as a shortcut for a multi-menu action (something that requires more than a first-level menu and a single mouse click). To scroll down a menu, it would make more sense to use a gesture rather than repeatedly have the user say "Down, Down, Down."

    Speech Synthesis

You can also generate speech using the built-in Nuance speech synthesis that comes with our SDK. Currently a female voice is used for TTS.

• Make sure to use speech synthesis where it makes sense. Have an alternative for people who cannot hear well, or if speakers are muted.

• Listening to long synthesized speech will be tiresome. Synthesize and speak only short sentences.

    Samples and API

You can run and view the source code for the voice_recognition and voice_synthesis projects in Intel/PCSDK/sample.

In sdksamples.pdf, you can check out the audio_recorder sample. You can also find more information in sdkmanual-core.pdf on audio abstraction with the PXCAudio, PXCAccelerator, and PXCCapture::AudioStream interfaces.

An article on how to use the SDK for Voice Recognition can be found here: http://software.intel.com/en-us/sites/default/files/article/328725/voicerecognitionhowto.pdf


    Face Tracking Design Guidelines

In a future release we will provide more guidelines for designing interactions based on face tracking, face detection, and face recognition. Stay tuned!

    High-Level Recommendations

• More expressions will be added to the SDK in the future; smiles and winks can currently be detected.

• Natural expressions in front of a computer will be difficult to detect; users should be prompted to show exaggerated expressions.

• Give feedback to the user to make sure they are at a typical working distance from the computer, for optimal feature detection.

• Give feedback to the user about any orientation or lighting issues; provide error messages or recommendations.

• For optimal tracking, have ambient light or light facing the user's face (avoid shadows).

• Try to make the interface background as close to white as possible (the screen can serve as a second light to ensure good reading).

• Notify the user if they are moving too fast to properly track facial features.

    Samples and API

Check out the face_detection and landmark_detection samples (in Intel/PCSDK/sample and discussed in sdksamples.pdf) to run the application and see the source code.

You can also find more information in sdkmanual-face.pdf.

An article on how to use the Face Detection module can be found here: http://software.intel.com/en-us/articles/intel-perceptual-computing-sdk-how-to-use-the-face-detection-module


    Visual Feedback Guidelines

You'll want your Perceptual Computing application to appear and behave very differently from a traditional desktop PC style application. Familiar concepts, such as cursors, clicking, icons, menus, and folders, don't necessarily apply to an environment in which gesture and voice are the primary interaction modalities. In this section we provide design guidelines for developing your application to visually conform to the Perceptual Computing interaction model.

    High-Level Recommendations

• Don't have a delay between the user's input (whether it's gesture, voice, or anything else) and the visual feedback on the display.

• Smooth movements. Apply a filter to the user's movements if necessary to prevent jarring visual movements (a minimal smoothing sketch follows this list).

• Combine different kinds of feedback. This can convince the user that the interactions are more realistic. Stay tuned to the next version of this manual for more advice on how to deal with audio feedback.

• Show what is actionable. You don't want the user trying to interact with something that they can't interact with.

• Show the current state of the system. Is the current object selected? If so, what can you do to show this visually? Ideas include using different colors, tilting the object, orienting the object differently, and changing object size.

• Show recognition of commands or interactions. This will let the user know they are on the right or wrong track.

• Show progress. For example, you could show an animation or a timer for short timespans.

• Consider physics. Think about the physics that you want to use to convey a more realistic and satisfying experience to the user. You could simulate magnetic snapping to an object to make selection easier, for example. While the user is panning through a list, you could accelerate the list movement and slow it down after the user has finished panning.
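As referenced above, a minimal exponential smoothing filter is one possible way to realize the "smooth movements" recommendation. The smoothing factor is an illustrative starting point; tune it so latency stays low (per the first recommendation) while jitter is suppressed.

```cpp
// Exponentially smoothed 2D hand position; call update() once per frame.
struct SmoothedPoint {
    float x = 0.0f, y = 0.0f;
    bool  initialized = false;

    void update(float rawX, float rawY, float alpha = 0.35f) {
        if (!initialized) { x = rawX; y = rawY; initialized = true; return; }
        x += alpha * (rawX - x);   // move a fraction of the way toward the new sample
        y += alpha * (rawY - y);
    }
};
```

A lower alpha gives a smoother but laggier cursor; a higher alpha tracks the hand more tightly but lets jitter through.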

    Representing the User

A user must be represented in the virtual world. The user embodiment allows the user to interact with elements in the scene. In traditional environments this embodiment is a mouse cursor. In a Perceptual Computing environment, the representation of the user should reflect the modalities used to interact and the nature of the application in question. Typically, where hand gestures are used, a representation of the hands should be shown on the screen. The hand representation depends on the application. In a magic game, the user may be represented as a glowing wand held by a hand. In a 3D modeling application, the user may be represented by an articulated hand model. You could have the cursor be a static object, or also have the cursor change orientation, size, or color depending on the movement or depth of the user's hands.


The hand representation should be neither very realistic, nor very simplistic. A very realistic hand risks the uncanny valley effect¹, which would disturb users. A too-simplistic hand will be inadequate to communicate the complex state of the hand and risks being too close to a cursor.

If head location is relevant to interaction (e.g., you are using face tracking), a representation of the head may need to be incorporated. Similar rules hold for other modalities.

Hand representation consisting of an articulated hand model. This would be appropriate for applications involving direct object manipulation.

Hand representation consisting of a magic wand. This would be appropriate for a magic game.

Hand representation when tracking has failed. The user is told that tracking has failed, so they know to act to fix tracking.

¹ The uncanny valley is a hypothesis in the field of robotics and 3D computer animation which holds that when human replicas look and act almost, but not perfectly, like actual human beings, it causes a response of revulsion among human observers. The "valley" refers to the dip in the graph of the comfort level of humans as a function of a robot's human likeness [Wikipedia 2013].


Sensor limitations can result in cases where the user is not being tracked. For example, a user may be too far from the camera, or may have moved to the side and is out of the view of the camera. Users often don't understand when this has happened. Your application should tell users when tracking has failed, why tracking has failed, and what they can do to correct the situation. This feedback can be incorporated into the design of the user representation (e.g., showing the relation between the user and the interaction bounding box/camera field of view visually). Other measures can be taken when tracking fails. In a game, where lost tracking can result in the user losing the game, the action can be dramatically slowed down until tracking is re-established.

In general, you should recognize the limitations of the sensors and ensure that the experiences you are trying to create work intelligently with the technology that you currently have. For example, it would be a poor design to build an interaction that, in real life, would require fast, sensitive tracking when the available tracking only enables something slow. You may want to modify the interaction and visual representations to work within the current abilities of the technology.

    Representing Objects

The ideal representation of objects in the scene is influenced greatly by the method in which we interact with them. In a Perceptual environment, we are able to interact much more richly with objects. We can push, grab, twist, or stretch them. This is much more than can be done with a mouse. On the other hand, a hand has much less precision than a mouse. The representation of objects should reflect these realities. Objects in your application should:

• Take advantage of the rich manipulation abilities of the human hand

• Convey visibly the interactive possibilities, so users can understand what can be done

• Be of a size that can be manipulated easily

• Not demand a degree of precise manipulation that results in a large number of errors or a large amount of fatigue

    Gestural Actions on Objects

Some action states to consider while interacting with objects in your application may include:

• Targeting
• Hovering
• Selecting
• Dragging
• Releasing
• Resizing
• Rotating


    2D vs. 3D

A Perceptual Computing graphical application can be shown in either a 2D or a 3D interaction environment. 2D environments are easier to understand and navigate on a 2D display, so they should be used when there isn't a compelling need for a 3D environment. When using gesture to interact with a 2D environment, however, consider using subtle 3D cues to enhance interaction. For example, a grabbed object can be made slightly larger with a drop shadow, to indicate it has been lifted off the 2D surface. Full 3D environments play to many of the strengths of a Perceptual environment, and should be used when the use case demands it. Some applications, especially games, benefit from operating in 3D.

    Traditional UI Elements

The interactive elements in a primarily gesture-driven interface are different from those in a primarily mouse-driven environment. This section suggests some of the more traditional UI elements for use with mid-air gesture. These can be useful for clarity and efficiency, and many users are familiar with these models. Of course, it isn't good to just rely on what people are accustomed to if there are better solutions, but don't discount some of the UI elements people are already using.

Horizontal Lists

Horizontal lists can be good because they rely on the more natural left-right motion with the right hand. A welcome improvement to linear lists is presenting choices on a slight arc, which allows the user to make a choice while resting their elbow on a hard surface. Note, however, that this approach is handedness-dependent. A left-handed user might not find it comfortable. Consider accommodating left-handed users by optionally mirroring the interface.

    Example of a horizontal list sweep.


    Radial Lists

Radial lists (also known as pie menus) are useful, especially for gestural input, as they are less error-prone: the distance the user has to traverse in order to reach any option is short, and a user doesn't have to aim precisely to select an option. They can also take up less space than linear lists. When constructing radial lists, maximize the selectable area for each option by making the whole slice of the list selectable.

Example of a radial list with "paste" currently selected.
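A sketch of the "whole slice is selectable" idea: only the angle of the hand cursor relative to the menu center matters, with a small central dead zone so a resting hand selects nothing. Slice numbering and the dead-zone radius are illustrative choices.

```cpp
#include <cmath>

// Returns the 0-based index of the slice under the cursor, or -1 if the
// cursor is inside the central dead zone.
int radialSliceAt(float cursorX, float cursorY,
                  float centerX, float centerY,
                  int sliceCount, float deadZoneRadius) {
    float dx = cursorX - centerX;
    float dy = cursorY - centerY;
    if (std::sqrt(dx * dx + dy * dy) < deadZoneRadius)
        return -1;                                    // too close to the center
    float angle = std::atan2(dy, dx);                 // -pi..pi
    if (angle < 0.0f) angle += 2.0f * 3.14159265f;    // 0..2*pi
    float sliceSpan = 2.0f * 3.14159265f / sliceCount;
    return static_cast<int>(angle / sliceSpan) % sliceCount;
}
```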

    Sliders

Typically, sliders are used for adjusting values within a given range. You may want to use a slider for absolute panning instead of using relative panning, depending on your application. Follow these guidelines:

• Create discrete sliders as opposed to continuous ones. Gestural input lacks the fidelity required to make fine selections without inducing fatigue (a snapping sketch follows the figure below).

• Try to keep sufficient distance between steps to avoid demanding too much precision on the part of the user.

The top slider has fewer steps, allowing the user to easily select the one they want using mid-air gesture. The numerous steps on the lower slider make it much harder to select the desired value.
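As referenced above, a discrete slider can simply snap the continuous hand position along the track to the nearest step, so the user never needs fine precision. The step count is whatever the application chooses (kept small, per the guideline); this sketch assumes at least two steps.

```cpp
// handT is the normalized hand position along the slider track
// (0 = left end, 1 = right end); returns the nearest step index.
int snapToSliderStep(float handT, int stepCount) {
    if (handT < 0.0f) handT = 0.0f;   // clamp to the slider's track
    if (handT > 1.0f) handT = 1.0f;
    return static_cast<int>(handT * (stepCount - 1) + 0.5f);
}
```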

    Integrating the Keyboard, Mouse, and Touch

Don't ignore the mouse, keyboard, and touchpad or touchscreen. People are used to these form factors, and each has its own specialized purpose. Often, it makes much more sense to type in information using the keyboard, rather than using an onscreen keyboard (although in some situations, like when the user only has to input a few letters, using gesture makes sense). Keys can still be used as failsafe shortcuts or escapes. To find a very precise 2D location, the mouse and touchscreen can still be very useful and efficient.



    Questions or Suggestions?

This document provides guidelines that are rooted in many years of research in human-computer interaction, user interface design, and multi-modal input. However, if you feel that certain guidelines do not fit your use case, or you have proposals for modifications or additions, please post to the forum thread "Human Interaction Guidelines - Questions and Suggestions," and we will be happy to discuss the issues with you.

Other helpful information:

• Our website: http://intel.com/software/perceptual

• For information and updates on the SDK, follow us on Twitter at: @PerceptualSDK

• All manuals mentioned in this document that were downloaded with the SDK are also available online: http://software.intel.com/en-us/articles/intel-perceptual-computing-sdk-manual-page

• Check out our tutorials! http://software.intel.com/en-us/articles/intel-perceptual-computing-sdk-tutorials

• Check out our github repository: http://github.com/IntelPerceptual

• We also have a social hub where you can find links to our videos and connect with us on Facebook, Twitter, and Google+: http://about.me/IntelPerceptual

• Frequently Asked Questions: http://software.intel.com/articles/perc-faq

• And last but not least, participate in our Intel Developer Zone Intel Perceptual Computing SDK forum to share information with fellow developers and ask questions: http://software.intel.com/en-us/forums/intel-perceptual-computing-sdk
