Sixth Sense Technology - 123SeminarsOnly

engineerbeets, AI and Robotics

Nov 15, 2013



When we encounter something or some place, we use our five natural senses to perceive information about it; that information helps us make decisions and choose the right actions to take.

The SixthSense prototype comprises a pocket projector, a mirror and a camera (mobile phone).

The aim of Sixth Sense technology is to rethink the ways in which humans and computers interact, partially by redefining both human and computer. If we achieve this goal, we can continually learn from our surroundings.

There is no link between our digital devices and our interactions with the physical world. SixthSense bridges this gap.

SixthSense technology is a wearable gestural interface that enhances the physical world around us with digital information and lets us use natural hand gestures to interact with that information (like a touch screen).

Some of the more practical uses of this technology

Sixth Sense Technology is currently being used on a very small scale, with efforts being made to make it more plausible for other situations, from healthcare to daily life:

Reading a newspaper and viewing videos instead of the photos in the paper.

Live sports updates while reading the newspaper.

The device can also tell the arrival, departure or delay time of an airplane.

For book lovers: open any book and find the Amazon ratings of the book. Pick any page and the device gives additional information on the text, comments and many more add-on features.

Brief Summary

A 28-year-old MIT student named Pranav Mistry has invented a new technology. He calls it his Sixth Sense Technology. In short, what he has created is a device that you can take with you anywhere and that will aid you in day-to-day activities: anything from simply projecting an image on the wall in front of you to taking a picture simply by using your hands.

Pranav Mistry is the inventor of SixthSense, a wearable device that enables new interactions between the real world and the world of data. Pranav Mistry is a PhD student in the Fluid Interfaces Group at MIT's Media Lab. Before his studies at MIT, he worked with Microsoft as a UX researcher. Mistry's objective is to integrate digital information with our real-world interactions.



Presented at TED (Technology, Entertainment, Design), SixthSense is a working prototype of a multifunctional device that can become part of our lives in five to ten years.

The set, named Sixth Sense, consists of a camera (it captures the movement of the hands), a projector (it produces the image on any surface), a mobile phone (it stays in your pocket and is needed only to communicate with a database server) and four fingers wearing colourful caps, whose movements are perceived by the camera.



Blind people and the World Wide Web


Alasdair King, Gareth Evans, Paul Blenkhorn, UMIST, Manchester, UK. Links may be different from the original 2004 article.

1 Blind people and the World Wide Web

Perhaps you've read a book recently? Perhaps when you finished you picked up a newspaper and got the sports headlines, or went online and surfed some travel sites to book next year's summer holiday? Your local newsstand easily has a hundred newspapers and magazines. If you have web access, you have billions of sites available to you. Unless, of course, you're blind, when accessing printed or net resources suddenly becomes a very different proposition.

Traditionally, blind people have had only limited means of accessing printed material. Braille is the most famous access method, but only a tiny proportion of blind people can read Braille (some 2% in the UK). Recent years have seen the wider adoption of audio recordings, but like Braille these suffer from a lack of immediacy (you want the news today, not to wait a week for it to be translated) and a blind user is usually reliant on sighted people, often volunteers, to produce the material. This reliance and the higher costs of producing alternative format materials such as audiotapes necessarily reduce the material available. This is a poor comparison with what is available to sighted users and their choice of material.

The rise of affordable personal computing in general and the Internet in particular promised an incredible improvement in access to written materials. With a personal computer, some easily available technology, and a web browser, you are no longer restricted to tapes sent through the post or the passive technology of the radio: you now have access to billions of web pages, personal, corporate, educational, entertaining, all available from your home. And there is no better time for this huge revolution: the great majority of blind people in developed countries become so because of the effects of age. With average life expectancies increasing, the number of potential blind Internet users grows and grows. There are some one million registered visually impaired people in the UK, of whom 750,000 are over 75. They want access to the same material they've always had, whether it's the London Times or the National Enquirer, but the material may not be available in an alternative format. Relying on what other people choose to translate for your benefit reduces your choice and freedom. Besides, sighted people have taken to the Internet in their millions for booking holidays, researching family trees and countless other uses: blind people need the same opportunities, and the technology makes it possible.

This is not to say, alas, that the web is a happy land where a blind person can surf and browse with all the freedom and ease of a sighted person. To understand why, we need to examine how blind people access computers in the first place.

2 How blind people access computers

The last decade has seen the triumph of the rich graphical desktop, replete with colourful icons, controls and buttons all around the screen, controlled by the mouse pointer moving about the screen clicking and dragging. This is not, on the face of it, a usable environment for blind people, but use it they must.

Many people with a significant visual impairment have some degree of residual vision. There are assistive technology solutions for them: a screen magnifier application, such as ZoomText from Ai Squared, magnifies a small area of the display, potentially filling the entire computer screen. The user can move the area being magnified around the desktop. This allows the user to control the computer interface directly, and is a good solution for people with gradually deteriorating vision, especially those who are already familiar with their computer interface but are starting to have trouble seeing it. However, for those with a significant visual impairment or complete blindness, there are two different options.

The first is to use a screen reader. This is an application that attempts to describe to the blind user in speech what the graphical user interface is displaying. It turns the visual output of the standard user interface into a format that is accessible to a blind user. In practice this means driving a Braille output device (a row of Braille cells with mechanical pins that pop up and simulate Braille characters under the user's fingers) or, more commonly, a text-to-speech synthesizer. We will deal exclusively with these text-to-speech users in the rest of this article because they form the great majority of users, actual and potential. The screen reader acts almost as a sighted companion to the blind user, reading out what is happening on the screen: boxes, command buttons, menu items, and text. Ultimately screen readers have to access the raw video output from the operating system to the screen and analyse it for information that should be presented to the user. This is a complex process, as you would expect from an application that is attempting to communicate the complicated graphical user interface in a wholly non-visual way. There are many screen readers available, including JAWS from Freedom Scientific, Window-Eyes from GW Micro, and Thunder. If you have Windows 2000 or XP, you'll find that Microsoft have included a basic screen reader in the operating system, called Narrator: try activating it, opening Notepad and typing some text or checking your email without looking at your screen.

The goal of a screen reader is to make it appear to the user as if the current application was itself a talking application designed specifically for blind users. This is difficult to accomplish. Applications often have particular user controls or methods of operation that must be supported by the screen reader. For example, a spreadsheet program operates very differently from an email client. This forces screen reader developers to adapt their programs to support specific applications, typically the market leaders like Microsoft Word. It also means that applications that utilise simple interface components like menus and text boxes will work best with screen readers. Those with non-standard interface components like 3D animations may be difficult for a screen reader to access.

The second way for a blind person to use a computer is to take advantage of self-voicing applications. These are usually applications written specifically for blind people that provide their output through synthesised or recorded speech. The obvious advantage is that the application designer can ensure that what is communicated to the user is exactly what the designer wants communicated (although this assumes that the designer's conception of what the user needs or wants to hear is correct!). Aside from the extra design and development required to produce a self-voicing application, the main drawback is that the application cannot be used at the same time as the user's screen reader. If the application usurps the screen reader, the user's primary interface to the computer, it takes upon itself the responsibility for being at least as comfortable and usable for the user as their screen reader. Users become accustomed to their particular screen reader and its operation and will have it configured just as they want it. The hotkeys of a self-voicing application may be different; the voice may be different, and have different characteristics. For example, many screen reader users set them to read out as fast as possible, which sounds odd if you have never heard it before but makes sense if you are accustomed to it. With a self-voicing application, the user may even have to switch off their screen reader, which is most undesirable if they want to use another non-self-voicing application at the same time.

Whether using a screen reader or a self-voicing application, the use of the sense of hearing rather than vision has great implications for the design of the interface. The visual sense has an enormous capacity for communicating information quickly and easily. Look at an application on your computer display and you will immediately notice the menus, icons, buttons and other interface controls arrayed about the screen. Each represents a function that is available to you, and a quick glance allows you to locate the function you want and immediately activate it with the mouse. Say the application is a word processor: you can go straight from reading the text of your document to any one of the functions offered by the interface. Now imagine that to find the print function you will have to start at the top left corner of the screen and go through each control in turn, waiting until its function is described to you, until you find the function you require. Of course, experienced blind computer users will not rely on navigating through menus for every function. They will utilise shortcut keys, such as "Control-P" to print a document, develop combinations of keystrokes to complete their most common tasks, and learn the location of commonly used functions in menus and applications. This requires, however, a consistent user interface, where shortcut keys and keystroke combinations can be relied upon to perform the same function each time and menu items are always located in the same place.

The important constraint on the use of computers by blind users is that they rely on hearing, rather than sight. Why is this such a problem? First, blind users are constrained into examining one thing at a time in an order not of their own making: they do not know the structure of things before they explore them. This is the problem with unfamiliar, rich, new interfaces. Second, blind users have to listen to a surprising amount of text to give them the same amount of information as a sighted user might be able to gain in a quick glance. Sighted users might be able to glance through a large document, scanning the chapter and paragraph headings for a key word or phrase, because they can see the headings instantly distinct from the body text and what words they contain. A blind user, even if they can jump from heading to heading, has to wait for the slower screen reader to speak the heading: setting it to read as fast as possible might seem more sensible now.

These two constraints, fixed order of access and time to obtain information, mean that interfaces that rely on hearing must comply with a principle of maximum output in minimum speech. This greatly changes usability: superfluous information is not just a distraction, as a page with lots of links might be for a sighted user, but a real barrier to using the interface. Blind users must not be asked to use a complex interface with many options. If a user misses some output, it will need to be read out again, so an explicit way to repeat things is required. Most importantly, users need control over what is being said: sighted users can move their gaze wherever they want whenever they want, and blind users need some similar control of the focus. Imagine reading something where you can only see one word at a time with no way to go back or forwards. Non-visual interfaces need to provide means to navigate through the document, stop, go back, skip items, repeat and explore the text available. This affects how blind people browse web pages, as we will find out next.
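
The navigation controls listed above can be illustrated with a minimal Python sketch. It is not taken from any screen reader: the class name ReadingCursor and its methods are hypothetical, and it assumes the document has already been split into a non-empty list of lines that a speech synthesizer would read one at a time.

```python
from dataclasses import dataclass


@dataclass
class ReadingCursor:
    """Tracks which line of a document would be spoken next (hypothetical)."""

    lines: list[str]
    position: int = 0

    def current(self) -> str:
        # "Repeat": return the line under the cursor without moving.
        return self.lines[self.position]

    def forward(self, count: int = 1) -> str:
        # Move ahead; count > 1 skips items. Clamped to the last line.
        self.position = min(self.position + count, len(self.lines) - 1)
        return self.current()

    def back(self, count: int = 1) -> str:
        # Move back, clamped to the first line.
        self.position = max(self.position - count, 0)
        return self.current()


cursor = ReadingCursor(["Heading", "First paragraph", "Second paragraph"])
cursor.forward()   # moves to "First paragraph"
cursor.current()   # repeats "First paragraph"
cursor.back()      # returns to "Heading"
```

The point of the sketch is that the user, not the application, decides what is spoken next, which is the nearest speech equivalent of moving one's gaze.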

3 Access to the web

So, knowing some of the problems that blind people have with accessing computers in general, how can they access the wonders of the World Wide Web in particular? What are the particular characteristics of browsers and, more especially, web pages?

Websites vary enormously, but with a quick browse around the most popular sites you will quickly notice a common characteristic: a very heavily visual graphical interface: images, including animated advertising banners; non-linear page layouts, like a newspaper front page with items and indices arranged around the screen; navigation menus and input controls for search functions and user input. And these are simple static items: advanced sites now take advantage of dynamic web page features like whole user interfaces written in Flash. For every Google, applauded for a simple and accessible user interface, there is another website with tabs, buttons, pop-ups and other great features for sighted users.

It is important to realise that not only are web pages full of rich features, but that their arrangement in the pages is completely non-standard. We have described how blind users can use complex graphical applications by the use of hotkeys and learning the user interface. This required a consistent user interface. Surf about some more websites, and you will realise quickly that no such consistent user interface exists for web pages. In fact, a single web page can be as rich a user interface as a standalone application. Imagine arriving at an online bookshop's website, with all those images, links, titles and text paragraphs, and having to start at the top left-hand corner and progress one item at a time through the page to find the login to check your last order. No shortcut keys are available for useful functions like "search this website" or "contact the website owner" that might be available on the page. Every website has a different user interface which must be explored and understood to use it, which places great demands on blind users to make the necessary effort. So, how does a blind user start to get to grips with all this?

The immediate response might be to use the user's screen reader to access a conventional browser like Internet Explorer. This has problems: we know that each application makes different demands on the screen reader, and the heavily visual and non-standard interfaces of web pages pose considerable difficulties to a screen reader. Navigating the web can be compared to trying to use the largest and most complex application that a blind person will ever attempt. A specific problem with Internet Explorer is that the need to allow the user to move around the document, which we have described, is complicated by the lack of a caret on a web page: an indicator of the position at which you will enter or delete text, usually shown as a flashing vertical bar in a text editor. Sighted users can simply glance at a different area to change their focus, but screen reader users need to move the focus of the screen reader to the area of interest, and this is normally done by moving the caret. Browser windows, however, do not have carets: you can only scroll the whole page up and down and look for the text of interest. The only items you can select individually are links or form items. A screen reader could simply choose to read a web page displayed in Internet Explorer from the very top of the page to the bottom, but this would be immensely time-consuming for the user. Tables and frames and forms further complicate a web page. This is not to say that using a screen reader is impossible: advanced screen readers do provide special navigation modes for web pages with a great deal of success. After all, web browsing is one of the common applications which a screen reader developer will try to support. However, complex navigation mechanisms are the result, and whilst these are excellent for experienced and highly skilled users, they are not necessarily ideal for the newly blind user who may be coming to the technology late in life. Web access is a general, not specialist need, and needs to support a general, non-specialist group of users.

So, why not write a self-voicing web browser? Some have been developed, such as the IBM Homepage Reader. They can be geared to the needs of the user group, although the general problems with self-voicing applications described above apply. The developer, even if they could use existing browser and parser technology, has to design a complex new interface for the complex graphical web pages to be browsed. Users will usually have a screen reader already: why make them learn to use a new application, rather than further develop and utilise their existing screen reader skills? Web pages are complex enough and vary in their own interfaces: can we keep the user in familiar territory at least in trying to navigate through these rich pages?

One alternative solution that we propose is to translate the web page content into a 'screen-reader-friendly' format. To do this we take a normal graphical web page and strip out superfluous information like decorative images or table-based visual formatting to produce a simpler navigable document that is in accord with the principle of maximum output in minimum speech we established earlier. We can give the user a cursor and let them loose in the translated document, so they have control over what is being read out and they can explore the document on their own terms. The user interface is simplified. The user can use their familiar, trusted screen reader, so the necessary learning curve for getting used to browsing the web through this interface should be less steep. A number of applications based on this theory have been developed, including Franck Audio Data's WebFormator and the Baum Web Wizard. Our version is called WebbIE.

4 WebbIE

WebbIE re-presents the information from a web page in an accessible format suitable for a screen reader: a panel of plain text (see Figure 1). We did not want to build a browser, so we utilised the Microsoft WebBrowser object, which gives a program its own internal Internet Explorer. This can fetch a web page and parse it into a standard World Wide Web Consortium (W3C) document format, which can then be queried by WebbIE for information on the web page. This takes care of the back-end processing and leaves us free to work on the user interface. Since IE is so widespread a browser, we can expect almost every page to support it, which means that we do not have to worry about unsupported web page features. The drawback is that the application will only work on Windows machines with Internet Explorer 5 or higher, but this includes a very significant number of machines.

Of course, we still have to decide what to do with the web page features that have been parsed and provided to us. We use the WebBrowser to obtain information on the links and forms on the page, since they are vital for navigation and use (for example search engines, commerce sites, or database queries). All types of HTML link are supported, including images and image maps. We then obtain the HTML for the body of the page from the WebBrowser and parse it directly, generating a text-only document more like a plain text file than a web page. This allows us to discard images and redundant mark-up, like tables used for visual formatting, while still communicating important structural mark-up characteristics like headers or lists, mostly by using simple new lines for new paragraphs, headings or list items. Following the principle of maximum output in minimum speech, the user can choose what non-text content is displayed, so they can either have image descriptions provided to them or discarded. The output is plain text, which can be moved around or searched like a text document but retains its vital HTML functionality with forms and links (see Figure 1).

Figure 1: WebbIE in action at Amazon, showing the original website and WebbIE's presentation of it. Note the links (LINK) and the form elements (TEXT INPUT, SELECT ITEM, SUBMIT).
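
As a rough illustration of this kind of translation, the following Python sketch flattens HTML into plain-text lines using only the standard library's html.parser module. It is not WebbIE's code (WebbIE relies on the Microsoft WebBrowser object): the names PlainTextSketch and to_plain_text, the exact "LINK: " prefix and the choice of block-level tags are assumptions made for the example, and form elements are not handled.

```python
from html.parser import HTMLParser

# Assumed set of tags that start a new line in the text-only view.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
SKIPPED_TAGS = {"script", "style"}


class PlainTextSketch(HTMLParser):
    """Flattens HTML into plain-text lines, marking links with LINK."""

    def __init__(self, show_images=False):
        super().__init__()
        self.show_images = show_images  # user's choice for non-link images
        self.lines = []
        self._buffer = []
        self._skip_depth = 0
        self._href = None  # set while inside an <a href=...> element

    def _flush(self):
        # Collapse whitespace and emit the buffered text as one line.
        text = " ".join(" ".join(self._buffer).split())
        if text:
            prefix = "LINK: " if self._href is not None else ""
            self.lines.append(prefix + text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "a" and attributes.get("href"):
            self._flush()
            self._href = attributes["href"]
        elif tag == "img":
            alt = attributes.get("alt") or ""
            # Link images are always described; others only on request.
            if alt and (self._href is not None or self.show_images):
                self._buffer.append(alt)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self._skip_depth = max(self._skip_depth - 1, 0)
        elif tag == "a" and self._href is not None:
            if not "".join(self._buffer).strip():
                # No text and no ALT text: only the URL is left to show.
                self._buffer.append(self._href)
            self._flush()
            self._href = None
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._buffer.append(data)


def to_plain_text(html, show_images=False):
    parser = PlainTextSketch(show_images)
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text
    return parser.lines
```

Headings, paragraphs and list items all become simple new lines here, mirroring the description above; a fuller version would also need to represent form controls.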

Additional navigation features are provided to let the user move a caret around the text, allowing the user to control what their screen reader reads to them. For example, the user can skip over any links to the next piece of text, which comes in useful for those pages that use navigation bars at the top of the page. Filling in forms is done within WebbIE: the user moves the cursor to a form item and hits return. They complete a simple text box and the page is updated with the input for review.
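
The skip-over-links feature can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name next_text_line is hypothetical, and it assumes the page has already been reduced to a list of text lines in which link lines begin with the marker "LINK".

```python
def next_text_line(lines, position, link_prefix="LINK"):
    """Return the index of the first non-link line after position.

    If every remaining line is a link, position is returned unchanged.
    """
    for index in range(position + 1, len(lines)):
        if not lines[index].startswith(link_prefix):
            return index
    return position


page = ["LINK: Home", "LINK: Books", "LINK: Music", "Today's offers", "LINK: More"]
next_text_line(page, 0)  # jumps past the navigation bar to "Today's offers"
```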

When the user initiates an action that results in an exit from the page, such as clicking on a link to another web page or hitting the submit button on a form, the action is passed back to the WebBrowser object, which processes it as a normal user-generated event and gets the new web page or submits the form depending on the event. WebbIE is updated with the result returned from the website. The user therefore enjoys a fully functional text-only web browser.

Better still, if plug-ins or support applications are installed, the WebBrowser will trigger their action automatically when their content type is encountered, which means that you can access things like streaming audio from news or radio sites. You can make the native Internet Explorer window visible if you want to access the web page itself, for example to reach JavaScript controls or Java applets. The IE favorites are available to use and amend. It also supports secure access to web pages (the SSL protocol), so a blind user can shop safely.

We mentioned that many visually impaired people have some functional vision, and that they can benefit from magnification technology. WebbIE does provide some support for these users by allowing them to increase the font size of the text and change (invert) the colours, but it does not magnify images, so it is not a magnification program per se. Users might find the simpler WebbIE interface to web pages easier to use than scrolling around a magnified but still very complex graphical web page.

5 Problems that WebbIE faces

The problems that WebbIE encounters with some web pages are interesting because they illustrate problems that any non-visual web browser will encounter. These can be divided into three sets of problems: how to present web pages to users; poor use of HTML; and inaccessible content.

Presenting web pages to users

The principle of maximum output in minimum speech demands that we decide which mark-up features of HTML are communicated to the user and how they can best be presented. To give an example, HTML defines bulleted lists of items: do we need to communicate to the user that this is a bulleted list, by adding to the text something like "Bullet item:" at the beginning of each line, or do we simply provide a new line? There are six types of heading in HTML, usually displayed with different fonts and sizes in text: does the user need to know the type of their current heading? Is it more important that the structural features of the web page are communicated to the user, or is it better that the user can work through the web page as quickly as possible to find what they want? With WebbIE we have generally gone with the latter approach, so the majority of mark-up features (headings, paragraph breaks or lists, for example) are simply presented with new lines. An interesting related problem concerns items that are useful for sighted users, such as navigation bars at the top of each page, but undesirable for blind users, who then are faced with a long list of the links at the top of each page (some sites may have more than 50 such links). WebbIE allows users to skip to the first non-link line, but a more elegant solution might be the mark-up of the navigation bar as a navigation feature that can be skipped unless requested specifically. The W3C provides a mechanism for this in HTML, but few website designers take advantage of it. In general the problem of how to squeeze the output of a rich visual medium through a more restrictive speech-based interface requires compromises and design decisions with no "right" answer. Testing with real-life users is the best solution.

Poor use of HTML

There are three main problems with poor use of HTML, and they are all preventable. The first is especially simple. Images are useless to blind users, but HTML allows for a text description to be applied to an image, called an ALT tag (strictly, the alt attribute). Most images can be discarded when a web page is presented in WebbIE, but images that are used as links must be somehow communicated to the user to inform them where a link will take them. Adding the ALT tag to the link image allows WebbIE to provide useful information to the user, for example, "Link to catalog". Without the tag the only information that can be provided is the destination URL of the link. More and more sites use server-side processing to produce dynamic pages, which can lead to a list of identical links to a web page called something like "". This is most disheartening for a blind user, who will have to find the content they want by trial and error or give up on the site completely.

The next two problems are more annoyances than show-stoppers. Frames and the use of tables to provide visual structure can be worked around by WebbIE, but can produce disjointed content when presented to the user solely as text. For example, HTML always describes tables from left to right, row by row. If a designer has intended the layout of items in a table to have a semantic meaning for visual users, for example putting links in one row and descriptions of the links in the row below, they will look attractively lined up for sighted users but a blind user will meet all the links and then all the descriptions with no obvious connection between them. Frames require WebbIE to go and get the frame content separately, and often produce a long list of internal navigation links at the top of every page, but these problems are generally surmountable. More difficult are a few websites that seek to prevent users from accessing frame content directly, and use JavaScript to forcibly reinstate the frames. WebbIE cannot get at the content within the frames, and the user is left with nothing but a description of the frames.
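
The table problem is easy to reproduce with a small Python sketch. The table contents and the function names read_by_rows and read_by_columns are invented for illustration; the column-wise reading is shown only for contrast and is not something the article says WebbIE does.

```python
# A layout table as a sighted designer might build it:
# links in the first row, their descriptions in the second.
table = [
    ["LINK: Fiction", "LINK: Travel", "LINK: Science"],
    ["Novels and short stories", "Guides and maps", "Popular science titles"],
]


def read_by_rows(rows):
    """Source order: left to right, row by row."""
    return [cell for row in rows for cell in row]


def read_by_columns(rows):
    """Contrasting order that keeps each column's cells together.

    zip stops at the shortest row, so this assumes a rectangular table.
    """
    return [cell for column in zip(*rows) for cell in column]


print(read_by_rows(table))     # all three links, then all three descriptions
print(read_by_columns(table))  # each link followed by its own description
```

Read row by row, the listener hears three links followed by three descriptions, with nothing to tie each description to its link; that pairing exists only in the visual layout.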

Inaccessible content

The HTML coding problems might be resolved with more attention to accessibility by web designers, but experience suggests that this is optimistic. We will therefore continue to develop WebbIE to handle problematic HTML code. However, the third type of access problem is more fundamental. HTML is essentially a text-based format designed for the presentation of text documents. This is perfect for blind users but not for website designers, who have turned to many alternative technologies to provide active functions to their pages and achieve the desired visual effects. These technologies include Adobe Acrobat files, Java applets, and most recently entire websites presented entirely in Shockwave Flash. Strategies for coping with these largely depend on the extent to which the companies behind these proprietary formats have worked to make them accessible to screen readers. WebbIE users can switch to the Internet Explorer view and access the content if their screen reader can access the format. Embedded objects like Java applets have ALT tags like images, but this does not allow access to the content (though since many applets simply provide animation or other visual effects to an image, that may be unimportant). A more subtle problem is the use of JavaScript in pages where it replaces the functions normally reserved for HTML, such as submitting forms or linking to another web page. For example, if a designer uses the JavaScript onClick command to create a link rather than the normal HTML tag, WebbIE will have great difficulty in identifying and following the link. An accessible interface to the JavaScript functions of a web page would be possible to develop, but whether it would be desirable or even usable by blind users is another matter. New technologies are introduced to the web all the time: all we can ask is that accessibility issues are addressed by their developers.
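
To make the onClick problem concrete, here is a minimal Python sketch that lists elements whose only "link" behaviour lives in an onclick handler. It illustrates why such links are hard to identify and is not WebbIE's approach: the names ScriptLinkFinder and find_script_only_links are hypothetical, and treating an empty, "#" or "javascript:" href as unusable is an assumption of the example.

```python
from html.parser import HTMLParser


class ScriptLinkFinder(HTMLParser):
    """Collects elements that act as links only through an onclick handler."""

    def __init__(self):
        super().__init__()
        self.script_only = []  # (tag, onclick code) pairs

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so onClick becomes onclick.
        attributes = dict(attrs)
        onclick = attributes.get("onclick")
        href = attributes.get("href") or ""
        no_real_target = (
            not href or href == "#" or href.lower().startswith("javascript:")
        )
        if onclick and no_real_target:
            self.script_only.append((tag, onclick))


def find_script_only_links(html):
    finder = ScriptLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.script_only
```

Detecting such an element is the easy part. The destination is buried in script code (for instance a call such as go('/checkout')), so a text-only browser cannot list or follow it the way it can an ordinary href.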

Alasdair King, February 2004. Last updated June 2008.