Arm as a Touchscreen


ABHIJEET S. KAPSE

Table of Contents

Abstract
1. Introduction
2. Skinput
   2.1 What Is Skinput
   2.2 Principle of Skinput
3. Working of Skinput
   3.1 Pico-Projector
   3.2 Bio-Acoustics
       3.2.1 Transverse Wave Propagation
       3.2.2 Longitudinal Wave Propagation
       3.2.3 Bioacoustic Sensor
   3.3 Bluetooth
4. Experiments
   4.1 Experimental Conditions
   4.2 Analysis
   4.3 BMI Effect
5. Advantages
6. Disadvantages
7. Applications
8. Future Implementation
Conclusion
References


ABSTRACT

The popularity of mobile devices increases day by day thanks to advantages such as portability, mobility, and flexibility, but their limited size leaves very little interactive surface area. We cannot simply make a device larger without losing the benefit of its small size. Microsoft has therefore developed Skinput, a technology that appropriates the human body for acoustic transmission, allowing the skin to be used as an input surface. The human body produces different vibrations when we tap on different body parts. With the help of this unique property, we can map different locations to different functions of small devices such as mobile phones or music players. When we tap on our body, mechanical vibrations propagate through it; these vibrations are captured by a sensor array, and via the armband the sensor signals are sent to the mobile device, where software detects which location on the body was tapped. The desired operation is then performed according to that location. When augmented with a pico-projector, the device can provide a direct-manipulation graphical user interface on the body. This approach provides an always-available, naturally portable, on-body finger input system.


CHAPTER 1

INTRODUCTION

The world has gone crazy over an invention known as the mobile phone. Mobile devices became popular in a short time due to the advantages they came with: portability, flexibility, mobility, and responsiveness. These devices fit easily in our pocket, so we do not need to carry any extra surface area with us. Devices with significant computational power and capabilities can now be easily carried on our bodies. However, their small size typically leads to limited interaction space (e.g., diminutive screens, buttons, and jog wheels) and consequently diminishes their usability and functionality, since we cannot simply make buttons and screens larger without losing the primary benefit of small size.

There are alternative approaches that enhance interaction with small mobile systems. One option is to opportunistically appropriate surface area from the environment for interactive purposes, for example a technique that allows a small mobile device to turn the table on which it rests into a gestural finger-input canvas. However, tables are not always present, so this technique cannot be used everywhere, and in a mobile context users are unlikely to want to carry appropriated surfaces with them (at that point, one might as well just have a larger device). There is, however, one surface that has previously been overlooked as an input canvas and that happens to always travel with us: our skin.

Appropriating the human body as an input device is appealing not only because we have roughly two square meters of external surface area, but also because much of it is easily accessible by our hands (e.g., arms, upper legs, torso) and can be used without any visual contact. Furthermore, proprioception (our sense of how our body is configured in three-dimensional space) allows us to interact accurately with our bodies in an eyes-free manner. For example, we can readily flick each of our fingers, touch the tip of our nose, and clap our hands together without visual assistance. Few external input devices can claim this accurate, eyes-free input characteristic while providing such a large interaction area. We could use any part of our body as an input surface, but for comfortable operation the arm is the natural choice. In this paper, we present our work


on Skinput – a method that allows the body to be appropriated for finger input using a novel,

non-invasive, wearable bio-acoustic sensor.

The technology was developed by Chris Harrison, Desney Tan, and Dan Morris at Microsoft Research's Computational User Experiences Group. Skinput combines three technologies: a pico-projector, bioacoustic sensors, and Bluetooth. The pico-projector displays the mobile screen on our skin, and we tap on our body according to our need. Tapping produces vibrations that travel through the body; these ripples are captured by bioacoustic sensors mounted on an armband. The armband is connected to the mobile device by a wireless connection, i.e., Bluetooth. The mobile device runs software that matches each vibration signal against stored signals, and the desired operation is performed. A Support Vector Machine algorithm, i.e., a supervised learning algorithm, is used to train the software: at an initial stage we store signal data from each location on the arm, which serves as the reference signals. Skinput employs acoustics, taking advantage of the human body's natural sound-conductive properties (e.g., bone conduction). This allows the body to be annexed as an input surface without the need for the skin to be invasively instrumented with sensors, tracking markers, or other items.

The contributions of this paper are the design of a novel, wearable sensor for bio-acoustic signal acquisition, and an analysis approach that enables the Skinput system to resolve the location of finger taps on the body. When coupled with a pico-projector, the skin can operate as an interactive canvas supporting both input and graphical output.


CHAPTER 2

Skinput

2.1 What Is Skinput

Touchscreens have revolutionized the way we communicate with electronics, but sometimes they can get a little cramped: wouldn't it be great if the iPhone's screen were just a little bit bigger? One creative solution is Skinput, a device that uses a pico-projector to beam graphics (keyboards, menus, etc.) onto a user's palm and forearm, transforming the skin into a computer interface. Skinput is a combination of two words: skin and input. This technology uses the largest organ of our body, the skin, as an input surface for mobile gadgets. Chris Harrison and a team at Microsoft Research developed Skinput, a way in which your skin can become a touchscreen device, or your fingers the buttons of an MP3 controller.

Figure 1: Display on a palm using Skinput technology


Skinput represents one way to decouple input from electronic devices with the aim of allowing

devices to become smaller without simultaneously shrinking the surface area on which input can

be performed.

2.2 Principle Of Skinput

Due to the unique structure of the arm, with its varying bone thickness, muscle and fat tissue concentrations, and the like, a tap at each different place along the arm delivers a unique combination of transverse and longitudinal waves up the arm to the torso. Transverse waves are the ripples of loose skin expanding away from the point of impact. Longitudinal waves are vibrations emitted by the (recently struck) bone along its entire length, from the center of the arm toward the skin.

Skinput relies on an armband, currently worn around the biceps. It detects vibrations in the arm and compares them with predefined control commands (e.g., up, down, back, enter). Additionally, thanks to the sense of proprioception (the ability to sense the position of our body parts without looking), Skinput does not preoccupy the user's vision (much like touch typing).

The current Skinput prototype relies on an array of small, cantilevered piezo films (MiniSense 100, Measurement Specialties, Inc.). This setup was found favourable for measuring the specific wave frequencies while providing a satisfactory signal-to-noise ratio. The sensors output acoustic wave signals, which are then processed, segmented, and classified by the software in order to execute a predefined command.


CHAPTER 3

Working of Skinput

3.1 Pico-Projector

Pico-projectors are tiny battery-powered projectors, as small as a mobile phone or even smaller: they can even be embedded inside phones or digital cameras. Pico-projectors are small, but they can show large displays (sometimes up to 100"). While great for mobility and content sharing, pico-projectors offer low brightness and resolution compared to larger projectors. Although a recent innovation, pico-projectors were already selling at a rate of about a million units a year in 2010, and the market is expected to continue growing quickly.

Figure 2: Pico-projector

We are using DLP (Digital Light Processing). Pioneered by TI, the idea behind DLP is to use tiny mirrors on a chip that direct the light. Each mirror controls the amount of light its pixel in the target picture receives: the mirror has two states, on and off, and it refreshes many times per second, so if it is on 50% of the time, the pixel appears at 50% brightness. Color is achieved by using a color wheel between the light source and the mirrors; this splits the light into red, green, and blue, and each mirror controls all three light beams for its designated pixel. So with the help of this tiny projector we display the required menu bar on our arm.
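To make the duty-cycle idea concrete, here is a toy sketch of how a desired brightness could map to a pattern of on/off mirror states. This is an illustration only, not TI's actual firmware: the slot count and the even-spreading scheme are assumptions.

```python
def mirror_schedule(brightness, slots=32):
    """Return an on/off pattern whose duty cycle approximates `brightness`.

    Each mirror is only ever fully on or off, so perceived brightness is the
    fraction of refresh slots in which it is on; spreading the "on" slots
    evenly keeps flicker less visible.
    """
    on = round(brightness * slots)
    return [1 if (i * on) // slots != ((i + 1) * on) // slots else 0
            for i in range(slots)]

pattern = mirror_schedule(0.5)       # a pixel at 50% brightness
print(sum(pattern) / len(pattern))   # -> 0.5
```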

3.2 Bio-Acoustics

Acoustics is the interdisciplinary science that deals with the study of all mechanical waves in

gases, liquids, and solids including vibration, sound, ultrasound and infrasound. A scientist who

works in the field of acoustics is an acoustician while someone working in the field of acoustics

technology may be called an acoustical engineer. The application of acoustics can be seen in

almost all aspects of modern society with the most obvious being the audio and noise control

industries. Bioacoustics is a cross-disciplinary science that combines biology and acoustics.

Usually it refers to the investigation of sound production, dispersion through elastic media, and

reception in animals, including humans.

When a finger taps the skin, several distinct forms of acoustic energy are produced. Some energy

is radiated into the air as sound waves; this energy is not captured by the Skinput system. Among

the acoustic energy transmitted through the arm, the most readily visible are transverse waves,

created by the displacement of the skin from a finger impact. When shot with a high-speed

camera, these appear as ripples, which propagate outward from the point of contact. The

amplitude of these ripples is correlated to both the tapping force and to the volume and

compliance of soft tissues under the impact area. In general, tapping on soft regions of the arm

creates higher amplitude transverse waves than tapping on boney areas (e.g., wrist, palm,

fingers), which have negligible compliance.

In addition to the energy that propagates on the surface of the arm, some energy is transmitted

inward, toward the skeleton. These longitudinal (compressive) waves travel through the soft

tissues of the arm, exciting the bone, which is much less deformable than the soft tissue but can respond to mechanical excitation by rotating and translating as a rigid body. This excitation vibrates the soft tissues surrounding the entire length of the bone, resulting in new longitudinal

waves that propagate outward to the skin.

3.2.1 Transverse Wave Propagation:

Figure 3: Finger impacts displace the skin, creating transverse waves (ripples). The sensor is

activated as the wave passes underneath it.

3.2.2 Longitudinal Wave Propagation:

Figure 4: Finger impacts create longitudinal (compressive) waves that cause internal skeletal

structures to vibrate. This, in turn, creates longitudinal waves that emanate outwards from the

bone (along its entire length) toward the skin.


We highlight these two separate forms of conduction (transverse waves moving directly along the arm surface, and longitudinal waves moving into and out of the bone through soft tissues) because these mechanisms carry energy at different frequencies and over different distances.

Roughly speaking, higher frequencies propagate more readily through bone than through soft

tissue, and bone conduction carries energy over larger distances than soft tissue conduction.

While we do not explicitly model the specific mechanisms of conduction, or depend on these

mechanisms for our analysis, we do believe the success of our technique depends on the complex

acoustic patterns that result from mixtures of these modalities. Similarly, we also believe that

joints play an important role in making tapped locations acoustically distinct. Bones are held

together by ligaments, and joints often include additional biological structures such as fluid

cavities. This makes joints behave as acoustic filters. In some cases, these may simply dampen

acoustics; in other cases, they will selectively attenuate specific frequencies, creating location-specific acoustic signatures.

Figure 5: Armband consisting of the vibration sensor array

3.2.3 Bioacoustic Sensor:


The MiniSense 100 is a low-cost cantilever-type vibration sensor loaded by a mass to offer high sensitivity at low frequencies. The pins are designed for easy installation and are solderable. Horizontal and vertical mounting options are offered, as well as a reduced-height version. The active sensor area is shielded for improved RFI/EMI rejection. Its rugged, flexible PVDF sensing element withstands high shock overload. The sensor has excellent linearity and dynamic range and may be used for detecting either continuous vibration or impacts.

Some features of the MiniSense 100 are given below:

High Voltage Sensitivity (1 V/g)

Over 5 V/g at Resonance

Horizontal or Vertical Mounting

Shielded Construction

Solderable Pins, PCB Mounting

Low Cost

< 1% Linearity

Up to 40 Hz (2,400 rpm) Operation Below Resonance
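As a small illustration of how the sensitivity figures above translate into readings, the sketch below converts an output voltage into acceleration; the function name and the sample reading are hypothetical, while the 1 V/g and 5 V/g numbers are simply the figures quoted in the list.

```python
# Interpreting MiniSense 100 output; helper and sample values are illustrative.
BASELINE_SENSITIVITY_V_PER_G = 1.0   # quoted sensitivity below resonance
RESONANCE_SENSITIVITY_V_PER_G = 5.0  # quoted sensitivity at resonance

def acceleration_g(voltage_v, sensitivity=BASELINE_SENSITIVITY_V_PER_G):
    """Convert a sensor output voltage to acceleration in g."""
    return voltage_v / sensitivity

print(acceleration_g(0.25))                                  # ~0.25 g
print(acceleration_g(0.25, RESONANCE_SENSITIVITY_V_PER_G))   # ~0.05 g
```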

3.3 Bluetooth

Bluetooth is a wireless technology standard for exchanging data over short distances (using

short-wavelength radio transmissions in the ISM band from 2400–2480 MHz) from fixed and

mobile devices, creating personal area networks (PANs) with high levels of security. Created by

telecoms vendor Ericsson in 1994, it was originally conceived as a wireless alternative to RS-232

data cables. It can connect several devices, overcoming problems of synchronization. Bluetooth

takes small-area networking to the next level by removing the need for user intervention and

keeping transmission power extremely low to save battery power.

Bluetooth is essentially a networking standard that works at two levels:

It provides agreement at the physical level: Bluetooth is a radio-frequency standard.

It provides agreement at the protocol level, where products have to agree on when bits are sent, how many will be sent at a time, and how the parties in a conversation can be sure that the message received is the same as the message sent.

Bluetooth is intended to get around the problems that come with infrared systems. The older

Bluetooth 1.0 standard has a maximum transfer speed of 1 megabit per second (Mbps), while

Bluetooth 2.0 can manage up to 3 Mbps. Bluetooth 2.0 is backward-compatible with 1.0 devices.

One of the ways Bluetooth devices avoid interfering with other systems is by sending out very

weak signals of about 1 milliwatt. By comparison, the most powerful cell phones can transmit a

signal of 3 watts. The low power limits the range of a Bluetooth device to about 10 meters (32

feet), cutting the chances of interference between your computer system and your portable

telephone or television. Even with the low power, Bluetooth doesn't require line of sight between

communicating devices. The walls in your house won't stop a Bluetooth signal, making the

standard useful for controlling several devices in different rooms.

Bluetooth can connect up to eight devices simultaneously. With all of those devices in the same

10-meter (32-foot) radius, you might think they'd interfere with one another, but it's unlikely.

Bluetooth uses a technique called spread-spectrum frequency hopping that makes it rare for more

than one device to be transmitting on the same frequency at the same time. In this technique, a

device will use 79 individual, randomly chosen frequencies within a designated range, changing

from one to another on a regular basis. In the case of Bluetooth, the transmitters change

frequencies 1,600 times every second, meaning that more devices can make full use of a limited

slice of the radio spectrum. Since every Bluetooth transmitter uses spread-spectrum transmitting

automatically, it’s unlikely that two transmitters will be on the same frequency at the same time.

This same technique minimizes the risk that portable phones or baby monitors will disrupt

Bluetooth devices, since any interference on a particular frequency will last only a tiny fraction

of a second.
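The "rare collisions" claim can be sanity-checked with a tiny Monte Carlo sketch: if two transmitters each pick one of 79 channels independently, they should land on the same channel in only about 1/79 (roughly 1.3%) of hop slots. This is an illustration under that independence assumption, not a model of Bluetooth's actual hop-selection algorithm.

```python
import random

def collision_fraction(hops_per_second=1600, channels=79, seconds=100):
    """Fraction of hop slots in which two independent hoppers collide."""
    slots = hops_per_second * seconds
    collisions = sum(random.randrange(channels) == random.randrange(channels)
                     for _ in range(slots))
    return collisions / slots

print(collision_fraction())   # ~0.0127, i.e. about 1 slot in 79
```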

So we connect the armband and the mobile device using Bluetooth. Whatever data the sensors capture is transferred to the mobile device, which samples the data, compares it with the stored data, and performs the task indicated by the algorithm.


CHAPTER 4

Experiments

4.1 Experimental Conditions:

To evaluate the performance of the system, we recruited 13 participants (7 female) from the Greater Seattle area. These participants represented a diverse cross-section of potential ages and body types. Ages ranged from 20 to 56 (mean 38.3), and computed body mass indexes (BMIs) ranged from 20.5 (normal) to 31.9 (obese).

We selected three input groupings from the multitude of possible location combinations to test.

We believe that these groupings, illustrated in Figure 7, are of particular interest with respect to

interface design, and at the same time, push the limits of our sensing capability. From these three

groupings, we derived five different experimental conditions, described below.

Fingers (Five Locations): One set of gestures we tested had participants tapping on the tips of each of their five

fingers (Figure “Fingers”). The fingers offer interesting affordances that make them compelling

to appropriate for input. Foremost, they provide clearly discrete interaction points, which are

even already well-named (e.g., ring finger). In addition to five finger tips, there are 14 knuckles

(five major, nine minor), which, taken together, could offer 19 readily identifiable input locations

on the fingers alone. Second, we have exceptional finger-to-finger dexterity, as demonstrated

when we count by tapping on our fingers. Finally, the fingers are linearly ordered, which is

potentially useful for interfaces like number entry, magnitude control (e.g., volume), and menu

selection. At the same time, fingers are among the most uniform appendages on the body, with

all but the thumb sharing a similar skeletal and muscular structure. This drastically reduces


acoustic variation and makes differentiating among them difficult. Additionally, acoustic

information must cross as many as five (finger and wrist) joints to reach the forearm, which

further dampens signals. For this experimental condition, we thus decided to place the sensor

arrays on the forearm, just below the elbow. Despite these difficulties, pilot experiments showed

measurable acoustic differences among fingers, which we theorize are primarily related to finger

length and thickness, interactions with the complex structure of the wrist bones, and variations in

the acoustic transmission properties of the muscles extending from the fingers to the forearm.

Whole Arm (Five Locations): Another gesture set investigated the use of five input locations on the forearm and hand: arm, wrist, palm, thumb, and middle finger (Figure, “Whole Arm”). We

selected these locations for two important reasons. First, they are distinct and named parts of the

body (e.g., “wrist”). This allowed participants to accurately tap these locations without training

or markings. Additionally, these locations proved to be acoustically distinct during piloting, with

the large spatial spread of input points offering further variation. We used these locations in three

different conditions.

One condition placed the sensor above the elbow, while another placed it below. This was

incorporated into the experiment to measure the accuracy loss across this significant articulation

point (the elbow). Additionally, participants repeated the lower placement condition in an eyes-free context: participants were told to close their eyes and face forward, both for training and

testing. This condition was included to gauge how well users could target on-body input

locations in an eyes-free context (e.g., driving).

Forearm (Ten Locations): In an effort to assess

the upper bound of our approach’s sensing resolution, our fifth and final experimental condition

used ten locations on just the forearm (Figure 6, “Forearm”). Not only was this a very high

density of input locations (unlike the whole-arm condition), but it also relied on an input surface

(the forearm) with a high degree of physical uniformity (unlike, e.g., the hand). We expected that

these factors would make acoustic sensing difficult. Moreover, this location was compelling due

to its large and flat surface area, as well as its immediate accessibility, both visually and for

finger input. Simultaneously, this makes for an ideal projection surface for dynamic interfaces.

To maximize the surface area for input, we placed the sensor above the elbow, leaving the entire

forearm free. Rather than naming the input locations, as was done in the previously described

conditions, we employed small, colored stickers to mark input targets. This was both to reduce

confusion (since locations on the forearm do not have common names) and to increase input


consistency. As mentioned previously, we believe the forearm is ideal for projected interface elements; the stickers served as low-tech placeholders for projected buttons.

Design and Setup:

We employed a within-subjects design, with each participant performing tasks in each of the five

conditions in randomized order: five fingers with sensors below elbow; five points on the whole

arm with the sensors above the elbow; the same points with sensors below the elbow, both

sighted and blind; and ten marked points on the forearm with the sensors above the elbow.

Participants were seated in a conventional office chair, in front of a desktop computer that

presented stimuli. For conditions with sensors below the elbow, we placed the armband ~3 cm away from the elbow, with one sensor package near the radius and the other near the ulna. For conditions with the sensors above the elbow, we placed the armband ~7 cm above the elbow,

such that one sensor package rested on the biceps. Right-handed participants had the armband

placed on the left arm, which allowed them to use their dominant hand for finger input. For the

one left-handed participant, we flipped the setup, which had no apparent effect on the operation

of the system. Tightness of the armband was adjusted to be firm, but comfortable. While

performing tasks, participants could place their elbow on the desk, tucked against their body, or

on the chair’s adjustable armrest; most chose the latter.

Procedure:

For each condition, the experimenter walked through the input locations to be tested and

demonstrated finger taps on each. Participants practiced duplicating these motions for

approximately one minute with each gesture set. This allowed participants to familiarize

themselves with our naming conventions (e.g. “pinky”, “wrist”), and to practice tapping their

arm and hands with a finger on the opposite hand. It also allowed us to convey the appropriate

tap force to participants, who often initially tapped unnecessarily hard. To train the system,

participants were instructed to comfortably tap each location ten times, with a finger of their

choosing. This constituted one training round. In total, three rounds of training data were

collected per input location set (30 examples per location, 150 data points total). An exception to

this procedure was in the case of the ten forearm locations, where only two rounds were


collected to save time (20 examples per location, 200 data points total). Total training time for

each experimental condition was approximately three minutes. We used the training data to build

an SVM classifier. During the subsequent testing phase, we presented participants with simple

text stimuli (e.g. “tap your wrist”), which instructed them where to tap. The order of stimuli was

randomized, with each location appearing ten times in total. The system performed real-time

segmentation and classification, and provided immediate feedback to the participant (e.g. “you

tapped your wrist”). We provided feedback so that participants could see where the system was

making errors (as they would if using a real application).

Figure 6: Accuracy of the three whole-arm-centric conditions. Error bars represent standard

deviation.

If an input was not segmented (i.e. the tap was too quiet), participants could see this and would

simply tap again. Overall, segmentation error rates were negligible in all conditions, and not

included in further analysis. In this section, we report on the classification accuracies for the test

phases in the five different conditions. Overall, classification rates were high, with an average

accuracy across conditions of 87.6%. Additionally, we present preliminary results exploring the

correlation between classification accuracy and factors such as BMI, age, and sex.


Five Fingers

Despite multiple joint crossings and ~40 cm of separation between the input targets and sensors,

classification accuracy remained high for the five-finger condition, averaging 87.7%

(SD=10.0%, chance=20%) across participants. Segmentation, as in other conditions, was

essentially perfect. Inspection of the confusion matrices showed no systematic errors in the

classification, with errors tending to be evenly distributed over the other digits. When

classification was incorrect, the system believed the input to be an adjacent finger 60.5% of the

time; only marginally above prior probability (40%). This suggests there are only limited

acoustic continuities between the fingers. The only potential exception to this was in the case of

the pinky, where the ring finger constituted 63.3% of the misclassifications.

Whole Arm

Participants performed three conditions with the whole-arm location configuration. The below-elbow placement performed the best, posting a 95.5% (SD=5.1%, chance=20%) average

accuracy. This is not surprising, as this condition placed the sensors closer to the input targets

than the other conditions. Moving the sensor above the elbow reduced accuracy to 88.3%

(SD=7.8%, chance=20%), a drop of 7.2%. This is almost certainly related to the acoustic loss at the elbow joint and the additional 10 cm of distance between the sensor and input targets. Figure 6 shows these results. The eyes-free input condition yielded lower accuracies than other conditions, averaging 85.0% (SD=9.4%, chance=20%). This represents a 10.5% drop from its

vision-assisted, but otherwise identical, counterpart condition. It was apparent from watching

participants complete this condition that targeting precision was reduced. In sighted conditions,

participants appeared to be able to tap locations with perhaps a 2 cm radius of error. Although not

formally captured, this margin of error appeared to double or triple when the eyes were closed.

We believe that additional training data, which better covers the increased input variability,

would remove much of this deficit. We would also caution designers developing eyes-free, on-body interfaces to carefully consider the locations participants can tap accurately.


Figure 7: Higher accuracies can be achieved by collapsing the 10 input locations into groups. A–E and G were designed to be spatially intuitive. F was created following analysis of per-location accuracy data.


Forearm

Classification accuracy for the ten-location forearm condition stood at 81.5% (SD=10.5%,

chance=10%), a surprisingly strong result for an input set we devised to push our system’s

sensing limit (κ=0.72, considered very strong). Following the experiment, we considered

different ways to improve accuracy by collapsing the ten locations into larger input groupings.

The goal of this exercise was to explore the tradeoff between classification accuracy and number

of input locations on the forearm, which represents a particularly valuable input surface for

application designers. We grouped targets into sets based on what we believed to be logical

spatial groupings. In addition to exploring classification accuracies for layouts that we

considered to be intuitive, we also performed an exhaustive search over all possible groupings.

For most location counts, this search confirmed that our intuitive groupings were optimal;

however, this search revealed one plausible, although irregular, layout with high accuracy at six

input locations (Figure 7, F). Unlike in the five-fingers condition, there appeared to be shared


acoustic traits that led to a higher likelihood of confusion with adjacent targets than distant ones.

This effect was more prominent laterally than longitudinally. Figure 7 illustrates this with lateral

groupings consistently outperforming similarly arranged, longitudinal groupings (B and C vs. D

and E). This is unsurprising given the morphology of the arm, with a high degree of bilateral

symmetry along the long axis.
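The exhaustive grouping search described above can be reproduced in miniature: enumerate every partition of the ten locations into the desired number of groups, and score each partition by the accuracy obtained when a prediction counts as correct whenever it falls in the same group as the true location. The sketch below is an assumed reconstruction, since the paper does not publish its search code; `confusion` stands for the 10x10 confusion matrix from the test phase.

```python
def partitions(items):
    """Yield every set partition of `items` (Bell-number many)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):       # put `first` into an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part           # or into a block of its own

def collapsed_accuracy(confusion, grouping):
    """Accuracy when within-group confusions count as correct."""
    group_of = {loc: g for g, block in enumerate(grouping) for loc in block}
    total = sum(sum(row) for row in confusion)
    correct = sum(cell
                  for i, row in enumerate(confusion)
                  for j, cell in enumerate(row)
                  if group_of[i] == group_of[j])
    return correct / total

def best_grouping(confusion, n_groups):
    """Exhaustively search all groupings of the locations into n_groups."""
    locs = list(range(len(confusion)))
    return max((p for p in partitions(locs) if len(p) == n_groups),
               key=lambda p: collapsed_accuracy(confusion, p))
```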

4.2 Analysis:

The audio stream was segmented into individual taps using an absolute exponential average of all sensor channels (Figure 8, red waveform). When an intensity threshold was exceeded (Figure 8, upper blue line), the program recorded the timestamp as a potential start of a tap. If the intensity did not fall below a second, independent "closing" threshold (Figure 8, lower purple line) between 100 and 700 ms after the onset crossing (a duration we found to be common for finger impacts), the event was discarded. If start and end crossings were detected that satisfied these criteria, the acoustic data in that period (plus a 60 ms buffer on either end) was considered an input event (Figure 8, vertical green regions). Although simple, this heuristic proved to be robust.
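A compact sketch of that segmentation heuristic is given below. The 100-700 ms window and the 60 ms buffer come from the description above; the sampling rate, smoothing factor, and the two threshold values are assumptions.

```python
import numpy as np

def segment_taps(channels, fs, open_thr, close_thr, alpha=0.05):
    """channels: (n_sensors, n_samples) acoustic data; fs: sample rate in Hz."""
    summed = np.abs(channels).sum(axis=0).astype(float)
    env = np.empty_like(summed)
    acc = 0.0
    for t, x in enumerate(summed):           # absolute exponential average
        acc = alpha * x + (1 - alpha) * acc
        env[t] = acc

    lo, hi, buf = int(0.100 * fs), int(0.700 * fs), int(0.060 * fs)
    events, t = [], 0
    while t < len(env):
        if env[t] > open_thr:                # potential start of a tap
            window = range(t + lo, min(t + hi, len(env)))
            end = next((u for u in window if env[u] < close_thr), None)
            if end is not None:              # closed within 100-700 ms: keep it
                events.append((max(t - buf, 0), min(end + buf, len(env))))
                t = end                      # resume scanning after the event
            # otherwise the onset is discarded and scanning continues
        t += 1
    return events
```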

After an input has been segmented, the waveforms are analyzed. We employ a brute force

machine learning approach, computing 186 features in total, many of which are derived

combinatorially. For gross information, we include the average amplitude, standard deviation

and total (absolute) energy of the waveforms in each channel (30 features). From these, we

calculate all average amplitude ratios between channel pairs (45 features). We also include an

average of these ratios (1 feature). We calculate a 256-point FFT for all 10 channels, although

only the lower 10 values are used (representing the acoustic power from 0 to 193 Hz), yielding

100 features. These are normalized by the highest amplitude FFT value found on any channel.


Figure 8: Ten channels of acoustic data generated by three finger taps on the forearm, followed

by three taps on the wrist. The exponential average of the channels is shown in red. Segmented

input windows are highlighted in green. Note how different sensing elements are activated by the

two locations.

We also include the center of mass of the power spectrum within the same 0–193 Hz range for

each channel, a rough estimation of the fundamental frequency of the signal displacing each

sensor (10 features). Subsequent feature selection established the all-pairs amplitude ratios and

certain bands of the FFT to be the most predictive features. These 186 features are passed to a

support vector machine (SVM) classifier. A full description of SVMs is beyond the scope of this paper (see Burges [5] for a tutorial). Our software uses the implementation provided in the Weka machine learning toolkit. It should be noted, however, that other, more sophisticated

classification techniques and features could be employed. Thus, the results presented in this

paper should be considered a baseline. Before the SVM can classify input instances, it must first

be trained to the user and the sensor position. This stage requires the collection of several

examples for each input location of interest. When using Skinput to recognize live input, the

same 186 acoustic features are computed on the fly for each segmented input. These are fed into the trained SVM for classification. We use an event model in our software: once an input is

classified, an event associated with that location is instantiated. Any interactive features bound to

that event are fired.
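The sketch below pulls these pieces together: the 186-dimensional feature vector, training, and the event model. It is an illustration only; the paper used Weka's SVM implementation, so scikit-learn's SVC is a stand-in here, the synthetic windows are placeholders for real sensor data, and the handler actions are hypothetical.

```python
import numpy as np
from itertools import combinations
from sklearn.svm import SVC

def tap_features(window):
    """window: (10, n_samples) segmented tap -> the 186 features described above."""
    feats = []
    avg = np.abs(window).mean(axis=1)
    feats += list(avg)                          # 10 average amplitudes
    feats += list(window.std(axis=1))           # 10 standard deviations
    feats += list(np.abs(window).sum(axis=1))   # 10 total absolute energies
    ratios = [avg[i] / avg[j] for i, j in combinations(range(10), 2)]
    feats += ratios                             # 45 all-pairs amplitude ratios
    feats.append(float(np.mean(ratios)))        # 1 average of those ratios
    spectra = np.abs(np.fft.rfft(window, n=256, axis=1))[:, :10]
    feats += list((spectra / spectra.max()).ravel())    # 100 low FFT bins
    feats += list((spectra * np.arange(10)).sum(axis=1)
                  / spectra.sum(axis=1))        # 10 spectral centers of mass
    return np.array(feats)                      # 186 features in total

# Training phase: several labelled taps per location (synthetic stand-ins here).
rng = np.random.default_rng(0)
locations = ["arm", "wrist", "palm", "thumb", "middle finger"]
X = np.array([tap_features(rng.normal(size=(10, 2048)) * (k + 1))
              for k, _ in enumerate(locations) for _ in range(30)])
y = [loc for loc in locations for _ in range(30)]
clf = SVC().fit(X, y)

# Live input: classify a segmented tap, then fire the event bound to it.
handlers = {loc: (lambda l=loc: print("event:", l)) for loc in locations}
tap_window = rng.normal(size=(10, 2048)) * 3.0
handlers[clf.predict([tap_features(tap_window)])[0]]()
```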


SVM

A support vector machine (SVM) is a concept in statistics and computer science for a set of

related supervised learning methods that analyze data and recognize patterns, used for

classification and regression analysis. The standard SVM takes a set of input data and predicts,

for each given input, which of two possible classes forms the input, making the SVM a non-

probabilistic binary linear classifier. Given a set of training examples, each marked as belonging

to one of two categories, an SVM training algorithm builds a model that assigns new examples

into one category or the other. An SVM model is a representation of the examples as points in

space, mapped so that the examples of the separate categories are divided by a clear gap that is as

wide as possible. New examples are then mapped into that same space and predicted to belong to

a category based on which side of the gap they fall on.
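As a minimal, self-contained illustration of that idea, the toy example below trains a linear SVM on two invented 2-D classes separated by a gap and classifies new points by which side of the boundary they fall on.

```python
from sklearn.svm import SVC

# Two invented classes with a clear gap between them.
X = [[0, 0], [1, 1], [0, 1], [4, 4], [5, 5], [4, 5]]
y = ["A", "A", "A", "B", "B", "B"]

clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([[0.5, 0.5], [4.5, 4.5]]))   # -> ['A' 'B']
print(clf.support_vectors_)                    # the points defining the margin
```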

4.3 BMI Effect:

Early on, we suspected that our acoustic approach was susceptible to variations in body

composition. This included, most notably, the prevalence of fatty tissues and the density/mass of

bones. These, respectively, tend to dampen or facilitate the transmission of acoustic energy in the

body. To assess how these variations affected our sensing accuracy, we calculated each

participant’s body mass index (BMI) from self-reported weight and height. Data and observations

from the experiment suggest that high BMI is correlated with decreased accuracies.

Figure 9: Accuracy was significantly lower for participants with BMIs above the 50th percentile.


The participants with the three highest BMIs (29.2, 29.6, and 31.9, representing borderline obese to obese) produced the three lowest average accuracies. Figure 9 illustrates this significant disparity: here participants are separated into two groups, those with BMI greater than and less than the US national median, age and sex adjusted [6] (F1,12=8.65, p=.013). Other factors such as age and sex, which

may be correlated to BMI in specific populations, might also exhibit a correlation with

classification accuracy. For example, in our participant pool, males yielded higher classification

accuracies than females, but we expect that this is an artifact of BMI correlation in our sample,

and probably not an effect of sex directly.
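For reference, the BMI figures quoted in this section follow the standard formula, weight in kilograms divided by the square of height in meters; the sample values below are illustrative, not participant data.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) / height (m) squared."""
    return weight_kg / height_m ** 2

print(round(bmi(92, 1.70), 1))   # -> 31.8, in the obese range like the 31.9 above
```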


CHAPTER 5

Advantages

Easy to use: Skinput technology is very easy to understand and use; it takes only about 20 minutes to figure out how to work it.

No interaction with the gadget: Normally, to use a mobile application we reach into our pocket, take out the device, unlock it, and then open the application. With Skinput we need no direct interaction with the gadget; we just tap a finger and the system performs the desired function.

No worry about the keypad: People with large fingers have trouble operating touchscreens. Skinput provides a very large interaction surface area, so this problem is resolved for them.

Operation without visual contact: Some operations, such as controlling a music player, need only four or five buttons, so each fingertip can serve as a button. Such functions need no display and can be operated without any visual contact.

Easy to access when your phone is not at hand.

Allows users to interact more personally with their device.

Larger buttons reduce the risk of pressing the wrong one.

Through the sense of proprioception, once users learn where the locations are on their skin they no longer have to look down to use Skinput, which reduces people looking down at their phone while driving.

It can be used for a more interactive gaming experience.


CHAPTER 6

Disadvantages

Skinput has its downsides, especially the big armband, which only looks easy to put on. Many people would not wear a very big band around their arm all day just to have this product.

Not everybody can use this product; the elderly, for example, already have a hard time adapting to technology. We also have to take into consideration the inconvenience it would cause to people with invisible disabilities.

The technology works only with direct skin exposure; we cannot wear a full-sleeve shirt while using it.

Currently only five buttons can be recognized with accuracy above 95%, while a phone needs at least ten buttons to dial a number or send a text message, so such tasks remain a problem.

The easy accessibility will cause people to be more socially distracted.

If the user's body mass index is above 30, Skinput's accuracy drops to about 80%.

The armband is currently bulky.

The visibility of the buttons projected on the skin can be reduced if the user has a tattoo on their arm.


CHAPTER 7

Applications

We can use Skinput technology with any mobile device; we just need different software for different platforms, for example an Android application for phones running the Android operating system, or .jar or .sis software for Symbian phones.

We can use this technology in iPods or other music devices that support Bluetooth. Such music devices need only four or five buttons, so our fingertips can serve as the input, and we can operate these devices without any visual contact.

Gaming devices can also use this technology, so we can play games very easily without any joysticks or touchscreens.

People with physical disabilities can operate this system very easily.

Simpler browsing systems that require only a few buttons (at most ten) can also be replaced by this technology.


CHAPTER 8

FUTURE IMPLEMENTATION

In order to assess the real-world practicality of Skinput, we are currently building a successor to

our prototype that will incorporate several additional sensors, particularly electrical sensors and

inertial sensors (accelerometers and gyroscopes). In addition to expanding the gesture vocabulary

beyond taps, we expect this sensor fusion to allow considerably more accuracy—and more

robustness to false positives—than each sensor alone. This revision of our prototype will also

allow us to benefit from anecdotal lessons learned since building our first prototype: in

particular, early experiments with subsequent prototypes suggest that the hardware filtering we

describe above can be effectively replicated in software, allowing us to replace our relatively

large piezoelectric sensors with micro-machined accelerometers.

This considerably reduces the size and electrical complexity of our armband. Furthermore,

anecdotal evidence has also suggested that vibration frequency ranges as high as several

kilohertz may contribute to tap classification, further motivating the use of broadband

accelerometers. Finally, our multi-sensor armband will be wireless, allowing us to explore a wide

variety of usage scenarios, as well as our general assertion that always-available input will

inspire radically new computing paradigms.


Conclusion

In this paper, we have presented our approach to appropriating the human body as an input

surface. We have described a novel, wearable bio-acoustic sensing array that we built into an

armband in order to detect and localize finger taps on the forearm and hand. Results from

experiments have shown that the system performs very well for a series of gestures, even when

the body is in motion. Additionally, we have presented initial results demonstrating other

potential uses of our approach, which we hope to further explore in future work. These include

single-handed gestures, taps with different parts of the finger, and differentiating between

materials and objects. We conclude with descriptions of several prototype applications that

demonstrate the rich design space we believe Skinput enables.


REFERENCES

1) Harrison, C., Tan, D., and Morris, D. "Skinput: Appropriating the Skin as an Interactive Canvas." Microsoft Research, 2011.

2) Harrison, C., and Hudson, S.E. "Scratch Input: Creating Large, Inexpensive, Unpowered and Mobile Finger Input Surfaces." In Proc. UIST '08.

3) Amento, B., Hill, W., and Terveen, L. "The Sound of One Hand: A Wrist-Mounted Bio-Acoustic Fingertip Gesture Interface." In Proc. CHI '02.

4) Hahn, T. "Future Human Computer Interaction with Special Focus on Input and Output Techniques." HCI, March 2006.

5) Burges, C.J. "A Tutorial on Support Vector Machines for Pattern Recognition." Data Mining and Knowledge Discovery, 2.2, June 1998, 121-167.

6) "Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults." National Heart, Lung, and Blood Institute, June 17, 1998.

7) Deyle, T., Palinko, S., Poole, E.S., and Starner, T. "Hambone: A Bio-Acoustic Gesture Interface." In Proc. ISWC '07, 1-8.

8) Erol, A., Bebis, G., Nicolescu, M., Boyle, R.D., and Twombly, X. "Vision-Based Hand Pose Estimation: A Review." Computer Vision and Image Understanding, 108, Oct. 2007.

9) Fabiani, G.E., McFarland, D.J., Wolpaw, J.R., and Pfurtscheller, G. "Conversion of EEG Activity into Cursor Movement by a Brain-Computer Interface (BCI)." IEEE Trans. on Neural Systems and Rehabilitation Engineering, 12.3, 331-8, Sept. 2004.

10) Grimes, D., Tan, D., Hudson, S.E., Shenoy, P., and Rao, R. "Feasibility and Pragmatics of Classifying Working Memory Load with an Electroencephalograph." In Proc. CHI '08, 835-844.

11) Harrison, C., and Hudson, S.E. "Scratch Input: Creating Large, Inexpensive, Unpowered and Mobile Finger Input Surfaces." In Proc. UIST '08, 205-208.