mapping effective production scheduling rules to problems - Filebox

electricfuture, Artificial Intelligence and Robotics

14 Nov 2013





In today’s human-machine work systems, the human operator has increasingly
more data and information available. For example, in a typical aircraft cockpit, flight
management and control gadgets provide pilots with such an enormous amount of data
that the cognitive workload on the pilots is often a major concern. Similar workload
concerns are found within information-centric organizations such as the military. The
use of more sophisticated sensor systems and the subsequent increase in data has led to
increasing cognitive demands on the individual soldier. The soldier must efficiently and
effectively synthesize an increased volume of data to support real-time decision making
on the battlefield. Similarly, in air traffic control towers or command centers that
monitor urban traffic, the human operator can be overloaded by the amount of available
information. Information overloading can impede the operator’s ability to perform
critical cognitive tasks and result in an increased human error rate.

The management and presentation of data and information in information
work systems present a unique and challenging interface design dilemma. This dilemma
occurs because the amount of data and information generated by technology is increasing
while the human’s ability to process this data and information remains constant.

In order to alleviate this problem, better engineered interface technologies must be
developed, tested, and analyzed to support human performance in a more highly
demanding cognitive environment. This burgeoning need for better
technologies to support human cognition has led to the development of many types of
cognitive aids or devices. Among these devices is augmented reality (AR), which is
designed to aid or augment human cognition.

The goal of AR is simple. Through real-time superimposition of meaningful
computer-generated graphical artifacts onto real world visual scenes, AR helps the human
user to better understand his or her environment (Azuma, 1997). This enhanced
understanding aids cognition and facilitates the completion of real world tasks. Although
AR has potential benefits, the implementation of AR technologies can result in task
performance costs. The use of AR to fuse virtual artifacts and real scenes may also
increase visual clutter (Yeh and Wickens, 1998). This increased clutter can increase
cognitive processing demands and lead to a decline in human performance.

Ergonomic research has been conducted to understand and reconcile the benefits
and costs associated with AR. Several studies have acknowledged the benefits of AR.
For example, target detection performance is aided through the use of superimposed
cueing imagery that guides attention through a real scene (Yeh and Wickens, 2000;
Merlo, Wickens, and Brandenburg, 2003). Conversely, performance costs have also been
identified. One identified cost of superimposed imagery is cognitive tunneling, a
phenomenon where attentional resources are misallocated within a real scene (Yeh and
Wickens, 1998; Yeh and Wickens, 2000).

The existing research on the ergonomic analysis of AR is relatively sparse in spite
of the abundance of literature on virtual environments and their applications. Stedmon
and Stone (2001), Stedmon, Hill, Kalawsky, and Cook (1999), and Azuma (1997) note
this situation by stating that ergonomic issues in the design of AR environments have not
received the attention that they should. This dissertation responds to the need for further
studies of ergonomic issues in AR environments. Through an applied and theoretical
investigation of task performance in an AR environment, this research represents an
extension of other research (Yeh and Wickens, 1998; Yeh, 2000; Maltz and Shinar,
2003) that has investigated the augmentation of reality with cueing on simple target
detection tasks. In particular, this research explores the implications of cueing on change
detection task performance within an AR domain.

1.1. Background

AR is a mixed reality technology representing a merger of real and virtual
elements. AR is also regarded as an advanced human-computer interaction (HCI)
strategy (Shneiderman, 1998) that supplements the richness of reality, which is often
difficult to duplicate in a synthetic environment, with virtual objects (Hermans, 2000).
Typically, this is accomplished through the use of a head mounted display (HMD). AR
provides the user with appropriate computer generated information about the real
environment that cannot be directly sensed or otherwise known without some sort of
augmentation (Furmanski, Azuma, and Daily, 2002). AR enhances the user’s perceptual
capabilities by bringing this information into the user’s real world (Azuma, 1997).

Some AR applications use cueing to provide additional information to the user.
Cueing represents the superimposition of virtual elements on real objects to guide or
direct attention to certain target regions in the real world that are important (Wickens and
Hollands, 1999). An example of an AR environment featuring cueing is presented in
Figure 1.1. This figure illustrates an implementation of AR that annotates the user’s view
with cues that identify viewed entities. Cueing is represented in the figure by the arrows
that provide information and direct attention to salient aspects of the viewed scene.

Figure 1.1. Soldier Vision’s Vision System

Cueing can improve target detection performance and target search times in AR
environments (Merlo, Wickens, and Yeh, 1999). AR interface designers can employ
cueing as an augmentation mechanism to facilitate target identification and detection
tasks in many domains.

While the literature states that cueing increases awareness of the objects that are
cued, the exacerbation of change blindness may be an incurred cost of cueing. Change
blindness is the induced failure to detect changes in a display (Rensink, O’Regan, and
Clark, 1997). This blindness to change is due to the failure to notice unattended objects
in the direct line of sight. Change blindness typically occurs when the visual scene is
changed and the eyes are in motion. In certain domains, understanding the implications
of cueing on change detection is critical. For example, in military applications, the
ability to quickly perceive and identify changes in enemy locations and movements can
have life or death implications.

Regardless of the type of environment, it has been shown that humans have
difficulty in detecting changes in scenes (Rensink et al., 1997). An object often changes
position within a display or disappears completely and the observer’s eyes come to rest
without noticing the change in magnitude or position. Substantial change blindness
research has been conducted to understand this internal representation construct.
n (2000) provides an overview of the findings applied to the way humans build
internal representations with an emphasis on the criticality of attention in detecting changes.

In general, both cueing and change detection performance require attentional
resources. The present work will investigate cueing used in an AR environment and its
impact on change detection performance. It is believed that the use of AR as a strategy to
merge computer-generated artifacts with the user’s view in order to direct the user’s
attention within a real scene may impact the user’s ability to detect changes.
Understanding this potential impact of AR on change detection performance is the focus
of this research.


1.2. Problem Statement

The fusion of real-world visual scenes with computer-generated artifacts could
introduce attentional and cognitive load concerns (Biocca, Tang, Lamas, Gregg, Brady,
and Gai, 2001). For example, AR systems may create excessive cognitive demands on
users (Lampton, Bliss, and Morris, 2002). The possibility of an excessive cognitive
load lessens the promises of AR technologies
(Tang, Owen, Biocca, and Mou, 2003).

The question of whether the superimposed artificial artifacts (cues) represent
information clutter, a distraction, or other extraneous information that can potentially
increase cognitive workload and, thus, degrade detection performance in an AR
domain presents the opportunity for the current research. This research will investigate
the use of cueing in an AR environment and its impact on change detection performance.

Some important research results related to the previous questions have been
obtained by Wickens et al. (1999). The issue of whether cueing can draw resources so
that changes cannot be detected remains a puzzle. In addition, this prior research is limited
with respect to validating the impact of cueing on change detection performance.

Yeh, Wickens, and Seagull (1999) have identified several important cueing
properties: reliability, precision, and saliency. Cueing reliability is defined as the
correctness of the cue. Cueing saliency refers to the conspicuity of the cue. Cueing
precision represents the specificity of the cue or the ability of the cue to reduce the search
space. In the present research, it is the author’s conjecture that both the manipulation of
the cueing properties and the type of change (presence, position, or feature) can have
implications in change detection performance in an AR environment.


This research has significant implications in the development of design guidelines
for AR interfaces. Results can facilitate the development of AR interfaces and support
the development of design guidelines for the use of AR. Ultimately, this research adds to
the body of knowledge through the provision of additional insights into the fundamental
cognitive ergonomics of AR technologies.

1.3. Experimental Goals and Hypothesis

The primary goal of this research is to assess human performance in a change
detection task using AR support with cueing. This goal is accomplished through two
complementary objectives: (a) the exploration of cueing and its properties in AR
environments; and (b) the investigation of the effect of cueing on change detection and
identification performance. The hypotheses and summaries of the corresponding
experiments are detailed below.


Cueing reliability has an effect on change detection and identification


Experiment 1 investigates change detection and identification performance in a
simulated AR display environment. This AR display environment features the user’s
forward field of view (FFOV) superimposed with cueing to identify viewed aspects.
Since systems employing cueing almost always provide imperfect aiding (Maltz and
Shinar, 2003), this experiment mimics a system with differing levels of reliability. The
aim is to investigate the impact of cueing reliability on change detection and
identification performance.


Three levels of reliability (low, medium, and high) were chosen to simulate the
actual range of the performance capabilities of AR systems. Reliability represents the
correctness of the cue. Cueing could be incorrect if an entity is miscued or not cued at
all. Figures 1.2, 1.3, and 1.4 illustrate the selected levels of reliability of low, medium,
and high, respectively.

Detection and identification performance were compared at three levels of
reliability across four change types. These four change types included position, feature,
presence, and none.
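
As a rough illustration of this design, the following sketch enumerates the reliability by change type conditions and simulates imperfect cueing, in which an entity may be miscued or not cued at all. The reliability probabilities, the even split between miscued and uncued errors, the entity names, and the function names are all hypothetical placeholders; they are not values or procedures taken from the experiment.

```python
import itertools
import random

# Hypothetical probabilities that a cue is correct at each reliability level.
RELIABILITY_LEVELS = {"low": 0.50, "medium": 0.75, "high": 0.95}
CHANGE_TYPES = ["position", "feature", "presence", "none"]


def build_conditions():
    """Cross every reliability level with every change type (3 x 4 cells)."""
    return list(itertools.product(RELIABILITY_LEVELS, CHANGE_TYPES))


def assign_cues(entities, reliability, rng):
    """Label each entity as 'correct', 'miscued', or 'uncued'.

    With probability `reliability` the cue is correct; otherwise the entity
    is either miscued or left uncued (assumed here to be equally likely).
    """
    cues = {}
    for entity in entities:
        if rng.random() < reliability:
            cues[entity] = "correct"
        else:
            cues[entity] = rng.choice(["miscued", "uncued"])
    return cues


conditions = build_conditions()
rng = random.Random(0)  # fixed seed so the sketch is repeatable
example = assign_cues(["tank", "truck", "soldier"], RELIABILITY_LEVELS["high"], rng)
```

Each tuple in `conditions` names one cell of the design, for example `("low", "position")`, and `assign_cues` could be called once per trial to decide how each viewed entity is cued.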

Figure 1.2. An Example of an Implementation of Low Reliability Cueing


Figure 1.3. An Example of an Implementation of Medium Reliability Cueing

Figure 1.4. An Example of an Implementation of High Reliability Cueing



Cueing saliency has an effect on change detection and identification


Experiment 2 provides a further study of the impact of cueing on change detection
and identification performance. The saliency of the cue is manipulated to understand its
implications on change detection and identification performance. Three levels of
saliency were chosen that reflect possible cueing implementations in an AR environment.

Saliency is defined as the cue’s ability to make the cued entity “pop out” from the rest of
the scene. Three levels of saliency were selected that varied the size and color of the
cueing. Figures 1.5 and 1.6 illustrate a lower and a higher salient cueing implementation,
respectively. Change detection and identification performance were compared at three
levels of saliency across the four change types discussed in support of hypothesis one.

Figure 1.5. An Example of an Implementation of Lower Salient Cueing


Figure 1.6. An Example of an Implementation of Higher Salient Cueing


Cueing precision has an effect on change detection and identification


Experiment 3 provides another investigation of change detection and
identification performance in a simulated AR display environment. The AR display
environment featured the user’s forward field of view (FFOV) superimposed with cueing
to identify the contained entities. Three levels of precision were selected to simulate the
possible range in capabilities of AR systems.

Precision represents the specificity of the cue. The cueing is implemented at
various levels in order to reduce the user’s search space. Specificity could involve the
cueing of all entities, only a subset of the entities, or none of
the entities contained within the participant’s FFOV, as illustrated in Figures 1.7,
1.8, and 1.9, respectively.


Detection and identification performance were compared at three levels of
precision across four change types. These four change types included position, feature,
presence, and none.
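
The three precision levels can be pictured as three ways of choosing which entities in the FFOV receive a cue. The sketch below is only illustrative: the function name, the entity names, the subset size, and the random choice of the subset are assumptions, since the text does not state how the cued subset was selected.

```python
import random


def select_cued_entities(entities, precision, subset_size=2, rng=None):
    """Return the entities that receive a cue at a given precision level.

    'all'    -> every entity is cued (least precise)
    'subset' -> only some entities are cued (more precise)
    'none'   -> no entity is cued
    The random subset and its size are hypothetical choices for illustration.
    """
    rng = rng or random.Random()
    entities = list(entities)
    if precision == "all":
        return entities
    if precision == "subset":
        return rng.sample(entities, min(subset_size, len(entities)))
    if precision == "none":
        return []
    raise ValueError(f"unknown precision level: {precision!r}")


ffov_entities = ["tank", "truck", "soldier", "building"]
cued_all = select_cued_entities(ffov_entities, "all")
cued_subset = select_cued_entities(ffov_entities, "subset", rng=random.Random(0))
cued_none = select_cued_entities(ffov_entities, "none")
```

The fewer entities that are cued, the smaller the region the participant needs to search, which is the sense in which the subset condition is the more precise one.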

Figure 1.7. An Example of an Implementation of Less Precise Cueing (All)

Figure 1.8. An Example of an Implementation of More Precise Cueing



Figure 1.9. An Example of an Implementation of No Cueing (None)

1.4. Dissertation Outline

Chapter 1 outlines the rationale, background, and objectives of the research.
Chapter 2 highlights relevant literature associated with AR, AR applications, and human
factors and psychophysical issues in AR design and applications. A description of the
task domain is found in Chapter 3. Chapter 4 features a description of the experimental
design, the tasks, and the results. Chapter 5 details additional quantitative analyses that
examine the impact of cueing on change detection performance. Chapter 6 provides a
discussion of the results. Finally, Chapter 7 provides a summary of the research and
offers suggestions for further studies that may extend this research.




2.1. Overview


has become a useful tool in human-machine work organizations since the
invention of computer desktop displays, simulated and animated displays, large
room/wall projection displays, multimedia information management, head-mounted
displays, and virtual reality displays. Although these developments are useful for human
performance, there are many unresolved issues that adversely affect the utility of these
tools. Research in human factors and psychology has attempted to address many of the
issues on display effectiveness. However, very little research has attempted to address
psychophysical factors in augmented reality. This chapter provides a brief overview of
the current information on augmented reality.

2.2. Relevant Literature

2.2.1. Augmented Reality

AR is a technology that enhances or augments a user's view of the real world with
additional information generated from a computer model (Barfield, Rosenberg, and
Lotens, 1995). The enhancement may consist of virtual artifacts fitted into the
environment or a display of non-geometric information about existing real objects. AR
allows a user to work with and examine real 3-D objects while receiving additional
information about these objects or a task at hand. AR brings information into the user's
real world by exploiting people’s visual and spatial skills. Thus, AR allows the user to
stay in touch with the real environment. Augmented reality does not create a totally
artificial environment. It adds to an individual’s sense of what’s already in his or her
environment. This is in contrast to virtual reality (VR), where the user is completely
immersed in an artificial world (Durlach and Mavor, 1995). In VR systems, there is no
way for the user to interact with objects in the real world. Using AR technology, users
can interact with a mixed virtual and real world in a natural way. AR systems bring the
computer to the user's real work environment, whereas VR systems try to bring the world
into the user's computer. This paradigm for user interaction and information visualization
constitutes the core of a very promising new technology for many applications.
However, real applications impose very strong demands on AR technologies that cannot
yet be met.

AR is considered a mixed reality technology (Azuma, Baillot, Behringer, Feiner,
Julier, and MacIntyre, 2001). Milgram and Kishino (1994) have developed a taxonomy
system that portrays this situation (Figure 2.1). As shown in Figure 2.1, the “real”
environment reflects a reality that is the actual world, context, or situation perceived by
an individual’s natural senses. In the real environment, individuals encounter the totality
of all things possessing actuality, existence, or essence. When individuals try to expand
their ability to perceive more information from the real world, in depth or size, with
the use of supporting technology, they are in an augmented reality mode. Examples of
this mode include large screen displays, true depth displays (Mellor, 1995), and
multimedia systems. The large display is a richer rendering of real world data with 3-D
models or with “look and speak” interface systems (Bowskill and Downie, 1995). At the
far end of the continuum is the virtual environment (VE). Appino, Lewis, Koved, Ling,
Rabenhorst, and Codella (1992) define a virtual world as an interactive,
multi-dimensional, computer-generated environment. As the name implies, information
in the virtual world is artificially created to supplement reality. Thus, AR is a system in
which virtual and real agents are combined to create a more realistic human
interface that uses human skills in perception, motor coordination, and cognition. The
purpose of AR is to add virtualism to realism.

Figure 2.1. Reality-Virtuality Continuum
(Adapted from Milgram and Kishino, 1994)

As shown in Figure 2.1, AR is a special type of VE representation. In the
broadest sense, VEs are tools that assist the user in task performance by providing
support, including feedback and information (Shewchuk, Chung, and Williges, 2002). As
a pseudo-virtual environment, an AR system utilizes virtual objects and
computer-generated data to supplement the real world (Azuma, 1997; Wellner, MacKay, and
Gold, 1993; Azuma et al., 2001). The virtual objects, although primarily visual supplements,
are applicable to all of the senses (Azuma et al., 2001). These objects are typically
superimposed on reality and viewed through the use of a transparent or video-based
head-mounted display (HMD) or other appropriate mechanisms (Blade and Padgett, 2002).
This creates a view of a real scene that is the product of the fusion of the real and the
virtual (Vallino and Kutulakos, 2001).

The objective of the merging of real and virtual objects is to extend the human
user’s perception of and interaction with reality (Azuma, 1997). AR supplements the
richness of reality, which is often difficult to duplicate in a synthetic environment, with
virtual objects (Hermans, 2000). In addition, AR provides the user with appropriate
computer-generated information about a real environment that cannot be directly sensed or
otherwise known without some sort of augmentation (Furmanski, Azuma, and Daily,
2002). Thus, AR enhances the user’s situation awareness and perception of the real
world (Behringer, Reinhold, Klinker, Gudrum, and Mizell, 1999).

AR can improve the user’s performance of real-world tasks through sensory
enhancement, a phenomenon that leads to cognitive amplification (Vallino and
Kutulakos, 2001; Stedmon and Stone, 2001). An exploration of how AR interacts with
human abilities to improve task performance shows that the benefits of AR mostly
support perceptual and cognitive tasks. At least four such benefits have been identified
by Neumann and Majoros (1998):

Information access: AR can trigger the insertion of appropriate virtual
objects, sparing the user a search for needed information. The elimination
of the search for information decreases cognitive workloads.

Reduced error likelihood: Situationally independent AR can enable the
novice to perform at an expert level (e.g., providing very efficient retrieval
of information from memory). In addition, AR can facilitate the transition
from “information novice” to “information expert.”

Enhanced motivation: AR can engage the user in closed-loop information
exploration tasks through its rich information display. Users can be
motivated if they are exposed to novel displays and visualizations that
encourage direct perception and the use of information in context.

Concurrent training and performance: AR can reduce training
requirements and provide scenes that are annotated with information that
is normally presented in training, thus significantly facilitating task
performance.

Some Sample Applications of AR

Although AR technologies have been around for several decades, the
development and application of the technology is still in its infancy (Stedmon, Kalawsky,
Hill, and Cook, 1999), and the validation of potential benefits continues to be studied
(Neumann and Majoros, 1998).

The potential for AR is great. AR technologies are
currently leveraged to enhance user perception and task performance in a number of
domains. AR technologies provide critical real-time data on the battlefield, to physicians
in the operating room, and to trainees in the workplace (Robinett, 1992). In today’s
environment where information is power, AR can be the means to insert information at
the “right” time into the “seen world.” In general, the success of AR applications is
dependent on the way information is transmitted to the user.

There are many other applications of AR (Azuma, 1997). Doctors have used AR
as a visualization and training aid for surgery (Edwards, Hill, Hawkes, Spink, Colchester,
Strong, and Gleeson, 1995). It is now possible to collect 3-D datasets of a patient in real
time using non-invasive sensors such as Magnetic Resonance Imaging (MRI), Computed
Tomography (CT) scans, or ultrasound imaging. AR may also be useful for training
purposes (Kancherla, 1995). Virtual instructions remind a novice surgeon of required
steps in a surgical procedure, preventing inattention to a patient due to consultation of a
manual.
Another category of AR applications is the assembly, maintenance, and repair of
complex machinery. Instructions may be easier to understand if they are in the format of
3-D drawings superimposed upon actual equipment that show both step-by-step tasks and
how to do them. These superimposed 3-D drawings can be animated, making the
directions even more explicit. Several research projects have demonstrated prototypes in
this area. Feiner, MacIntyre, Haupt, and Solomon (1993) have successfully used AR for
printer maintenance applications.

AR has been used to support creative and abstract generalizations of visualization.
For example, an architect with a transparent HMD is able to look out of a window
and see how a proposed new skyscraper will change his or her view. If a database
containing information about a building's structure is available, AR provides architects
with "X-ray vision" inside a building, displaying the location of pipes, electric lines, and
structural supports (Feiner, Webster, Krueger III, MacIntyre, and Keller, 1995).
Researchers at the University of Toronto have built a system termed Augmented Reality
through Graphic Overlays on Stereovideo (ARGOS) (Milgram, Drascic, Grodski, Restogi,
Zhai, and Zhou, 1995), which is used to make images easier to understand during
difficult viewing conditions (Drascic, Grodski, Milgram, Ruffo, Wong, and Zhai, 1993).
The ARGOS system has demonstrated that stereoscopic AR is an easier and more
accurate way of performing robot path planning than that found in systems that use
traditional monoscopic interfaces.

2.2.2. Augmented Cognition (AC)

AR is also an outgrowth of research in cognitive amplification (Bowskill and
Downie, 1995). AC is a specific example of what Brooks (1996) terms intelligence
amplification (IA), which uses the computer to make tasks easier for human beings to
perform. AC or “intelligence augmentation” (Maes, 1994) is concerned with constructing
integrated forms of human(s) and machine(s) to make humans more intelligent and more
aware of information, problems, and/or tasks. The notion of augmenting human
intelligence is not new. Engelbart (1962) proposed the development of a conceptual
framework to “augment human intellect.” Engelbart (1962) defined this augmentation as
increasing a human being’s capability to approach a complex problem situation, to
achieve a level of comprehension to fit his/her needs, and to formulate solutions to
problems.
Kuutti and Kaptelinin (1997) state that augmentation is a powerful idea. The
premise of augmentation is that a human being is more capable when the functionality of
a given cognitive tool is added to his/her cognitive abilities. Burdea and Coiffet (1994)
add that a valid approach lies in augmenting human intelligence through a highly
enriched virtual environment. Enriching the task environment with appropriate
information results in an improvement in human task performance (Burdea and Coiffet,
1994).
AR can also be used to amplify cognition through the simultaneous presentation
of visual information in different windows or via multiple modalities. The dual coding
theory of Paivio (1986, 1991) deals with the processing of verbal (auditory) and pictorial
(visual) information. Paivio states that long-term memory contains two independent and
interacting subsystems. One presents linguistic information (e.g., spoken information),
and the other presents non-linguistic information (e.g., pictures or object manipulations).
When the same information is coded both verbally and nonverbally, it will have greater
mnemonic power than when only coded one way. In addition, according to this theory, if
nonverbal information is processed in more than one sensorimotor modality (e.g., the visual
and proprioceptive receiving and processing stimuli that originate from the central
nervous system), this will result in additional effects in recall. Thus, the more codes that
represent information in memory, the better information will be remembered. Table 2.1
shows the benefits and disadvantages of presenting visual information.

Another explanation of the information power of AC can be attributed to the text
representation theory of Van Dijk and Kintsch (1983) that distinguishes between good
remembrances or recall and a successfully performed task. The theory states that people
need a good situational representation of a task to perform it well.


Table 2.1. Benefits and Losses of Presenting Visual Information in a Parallel
Manner (Paivio, 1991)

Benefits

Serves as external memory

Provides context information

Decreases search for relevant information

Increases possibility to compare and associate related information

Losses

Processes more shallow

Increases chance of an information overload

Situational representations also present the situation or context described by a text.
Reading a text about how to use a washing machine provides users with a propositional
representation, while a picture of a washing machine creates a situational presentation.
This presentation is the aspect of knowledge related to augmented cognition, which is the
portrayal of situational information. When an individual has to perform a task, it is ideal
to have good context information (situational representation). A study by Perrig and
Kintsch (1985) showed that situational representation leads to more accurate inference,
recognition, and problem solving.
While AC holds many promises, challenges exist in its development and
application (Barfield and Caudell, 2001). Kuutti and Kaptelinin (1997) note that there
are several conceptual and theoretical problems with current cognitive augmentation
approaches, including the following:



No accounting for the fact that individuals dynamically extend themselves
by including parts of their environments.

No realization that artifacts shape internal cognitive processes.

No consideration of the cognitive capabilities of humans to manifest
themselves in the cultural context by and to be transformed by their culture.

The existence of individuals who inherently augment their cognition.

The dismissal of the developmental aspects of a distribution (human mind
and artifact) that occurs due to the sheer nature of distributed cognition.

Kuutti and Kaptelinin (1997) further argue that the desired augmentation
approach should consider the human mind as a part of the context in which information
exists. From an implementation perspective, Bradley (2002) notes that challenges for
augmentation include an appropriate HCI, a mechanism for understanding context, and
the integration of information with the task (ethnography).

The Defense Advanced Research Projects Agency (DARPA) (Horvitz, Pavel, and
Schmorrow, 2001) in its augmented cognition program has executed a number of
experiments that explore ways for people to more easily encode, store, and retrieve
information. These experiments have been conducted using the Info Cockpit generic
test-bed concept, an enhanced computational environment that uses spatial location and place
to augment human memory. Preliminary results indicate that users of Info Cockpit
environments demonstrate over a 50% improvement in memory in contrast to users of
traditional desktop computers. In addition, preliminary results from the use of an
augmented reality system for Coast Guard harbor navigation have shown significant
improvement in human performance of maritime navigation tasks. It is DARPA’s
belief that the continued success of augmented cognition concepts will improve 21st
century warrior interaction with computer-based systems and revolutionize military
decision making. Notably, some of DARPA’s applications use AR as an information
management support environment.

The application of augmented cognition extends beyond the military domain.
Researchers in the field of assisted cognition are designing artificial intelligence systems
to care for Alzheimer’s patients with no direct human assistance (Kautz, Fox, Etzioni,
Borriello, and Arnstein, n.d.). Assisted cognition systems, composed of intelligent
personal digital assistants and “smart homes”, will enable aging adults to stay home
longer and maintain their independence.


2.2.3. Virtual Environments (VE)

In the broadest sense, VEs are tools used to assist the user in task performance by
providing support, including feedback and information (Shewchuk, Chung, and
Williges, 2002). Shewchuk, Chung, and Williges (2002) also state that there are different
means by which VEs can provide augmentation support. These means include

Visualization: User interaction in the VE

Simulation: Simulation of virtual interactions

Information Provision: VEs can provide information to guide and support
users in task performance. AR is promising because it provides real and
virtual information for the execution of real-world tasks.

Telerobotics: Through reconstruction of a remote environment, VEs can
facilitate the control of robotics in a real environment via a virtual
environment.

Barfield and Caudell (2001) and Azuma (1997) note that

VEs can maintain the user’s immersion in the real world.

VEs can register or align an image in three dimensions, thereby generating
a consistent representation of all objects from all views in the real world.

VEs allow real-time information processing by the cooperation of virtual

Wickens and Hollands (1999) note that virtual environments appear to offer three
fundamental benefits. These benefits include


the advantage of an ego-centered frame of reference for many guidance tasks,

usefulness for training, and

on-line comprehension.

Although the ultimate potential and development of VE is without question, many
challenges exist that inhibit full exploitation (Blade and Padgett, 2002). Schmalstieg,
Fuhrmann, and Hesina (2000) observed that AR has a chance to become a viable HCI for
universal productivity-enabling applications, where a single system covers a variety of
tasks. This has been demonstrated in the application of VR in areas including training,
medical visualization, the maintenance and repair of complex equipment, annotation, and
path planning (Azuma, Baillot, Behringer, Feiner, Julier, and MacIntyre, 2001).


2.3. Human Factors Issues of AR and AC

2.3.1. Human Factors C

AR presents human factors opportunities. Shewchuk, Chung, and Williges (2002)
assert that the “success of AR applications depends mainly on the way information is
transmitted to the human operators rather than the immersive feeling or fidelity within the
virtual environment.” The effective fusion of reality (the real scene) with appropriate
supplemental virtual objects in a single display space introduces a unique HCI design
dilemma (Feiner, 1994; Milgram, Zhai, Drascic, and Grodski, 1993). This dilemma
pertains to human factors and related psychophysical considerations.

Although AR is in use, technology issues and challenges exist. Stedmon,
Kalawsky, Hill, and Cook (1999) state that the key technological limitations of AR
include the fusion of the virtual and real within the same real-world scene and the
fundamental issues of image registration and collimation. Thus, AR is faced with several
developmental hurdles. Image registration, tracking and sensing, calibration, portability,
and so forth are mentioned in the literature as key technological challenges (Robinett,
1992; Torrance, 1995; Tuceryan, Greer, Whitaker, Breen, Crampton, Rose, and Ahlers,
1995). A great deal of research is being conducted to address these challenges.

When properly implemented, AR has been shown to eliminate human factors
challenges such as vertigo and motion sickness that can be induced by use of fully
immersive technologies (e.g., VR) (Stedmon, Hill, Kalawsky, and Cook, 1999; Stedmon
and Stone, 2001). However, there still exists a dearth of research into the human factors
issues of AR systems (Stedmon, Hill, Kalawsky, and Cook, 1999). Hix and Gabbard
(2002) state that VE developers have focused on the production of innovative gadgetry
and interaction techniques with little attention to user issues. Valimont, Vincenzi, and
Gangadharan (2002) concur that human factors and cognitive issues have not received a
significant amount of research attention.

AR systems also present problems in processing visual information, most of which
relate to perception and cognition. At least seven of these factors have been identified:


Acquisition of spatial knowledge (Peruch, Versher, and Gauthier, 1995):
Gathering information in virtual space.

Spatial reasoning (Norman, 1994; Pernkoff, 1987): Making decisions
about objects in space with respect to time and distance.

Spatial perception (Henry and Furnee, 1993): The ability to make sense of
virtual stimuli as induced by AR.

Spatial cognition (Psotka and Davison, 1993): Translating spatial
information into knowledge for action.

Spatial awareness (Cole, Meritti, Coleman, and Ikehara, 1991): The
ability of “being there” and “being here”.

Spatial orientation (Ellis, 1991; Ellis, Bucher, and Menges, 1995): The
ability to perform a task in various geometric cardinalities such as
translation and/or rotation.

Spatial information modeling (Bartram, 1980): The ability to represent
virtual information as data models.

Acquisition of spatial knowledge, spatial reasoning, spatial cognition, and spatial
information modeling are all cognition problems. They all deal with how to use our
recognition memory and representation schema to process spatial information based on
salient features. Spatial awareness, spatial orientation, and spatial perception are all
perception problems. They deal with the first human contact with sensory stimuli and the
manner in which the intensity of the stimuli affects the view of the world around the
users. The way spatial information is represented, processed, and transmitted to the user
contributes significantly to the human operator’s cognitive workload and other related
performance-inhibiting factors.

In summary, human factors issues are a critical component in the continued
development of AR technology. Currently, these issues have not received as much
attention as they should (Stedmon and Stone, 2001; Stedmon, Hill, Kalawsky, and Cook,
1999). There is a significant knowledge gap in the understanding of human factors,
perceptual studies, and cognitive science that can facilitate the design of effective AR
systems. Stedmon and Stone (2001) further assert that the full exploitation of AR design
will not be realized without greater attention to human factors issues and challenges.

2.3.2. Psychophysical Issues

Schiffman (1996) defines psychophysics as the study of the relationship between
environmental stimulation (physical aspects) and sensory experience (cognitive or
psychological aspects). Psychophysics is concerned with describing how an organism
uses its sensory systems to detect events in its environment. This description is
functional with respect to sensory system processes. Psychophysical methods allow
researchers to study how well people sense and resolve intensive, spatial, and temporal
variations of input to human sensors. For example, researchers are able to determine the
minimal intensity of virtual information by resolving differences in wavelengths through
color, or differences in size through geometric features, and so forth. Researchers can
also determine the minimum change in stimulus intensity that is required to notice a
change in an individual’s perception (Simpson and Fitter, 1973). Researchers can
compare how well each person senses the environmental information, with or without
augmented apparatus such as eyeglasses. The ability to detect, discriminate, and
recognize objects in space with minimum energy levels and to resolve fine details
constitutes the study of psychophysics (Schiffman, 1996).

Emura and Tachi (1994) found that subjects using AR judged distance, velocity, and
acceleration quite consistently, but with systematic errors. Delayed visual feedback
(which produces a disagreement between the seen and felt time of occurrence of an
event) drastically impairs performance on many tasks. Delays of 100 milliseconds can
render the rapid and accurate visual control of behavior impossible, and delays greater
than one second essentially eliminate the visual control of behavior.

One of the most widely used models in psychophysics is Signal Detection Theory
(SDT). SDT uses a combination of statistical decision theory and the concept of the ideal
observer to model an observer's sensitivity to events in its environment. SDT is stimulus
oriented, because properties of the stimuli are used to determine human performance in a
detection task. In SDT, the proportion of correct decisions measures performance in event
detection or discrimination tasks. The deficiencies of the proportion correct as a measure
warrant the use of the receiver operating characteristic (ROC). As Licklider (1963) states,
"The nature of [the ear's] solution to the time-frequency problem is, in fact, one of the
central problems in the psychology of hearing." This problem is still unresolved, primarily
due to the inconsistent results of experiments that degrade performance and make it
difficult to compare models (Dai and Wright, 1995).
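As a minimal sketch of why the proportion correct alone is insufficient, the Python fragment below computes the standard equal-variance SDT indices, sensitivity (d') and response criterion (c), from the four outcome counts of a yes/no detection task. The function name, the example counts, and the log-linear correction of 0.5 per cell are illustrative assumptions and are not taken from the studies cited above.

```python
from statistics import NormalDist

def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for a yes/no detection task.

    Assumes the equal-variance Gaussian SDT model. The +0.5 / +1.0
    log-linear correction is an assumption of this sketch; it keeps the
    rates away from 0 and 1, where the inverse normal CDF is undefined.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)            # sensitivity
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))  # response bias
    return d_prime, criterion

# Hypothetical observer: 40 hits, 10 misses, 10 false alarms, 40 correct rejections
d_prime, criterion = sdt_indices(40, 10, 10, 40)
print(f"d' = {d_prime:.2f}, c = {criterion:.2f}")
```

Two observers with the same proportion correct can differ in criterion, which is the motivation for separating sensitivity from bias and for plotting hit rate against false-alarm rate in an ROC.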

2.4. Human Factors and Psychological Issues Relevant to the Current Research

2.4.1. Display and Sensing Issues

When information cues are sensed through sensors at the remote site and then
displayed to an operator, there is always some loss of information. This loss occurs
because the sensors do not pick up all the cues adequately and the cues cannot be
reproduced for an operator with a high degree of fidelity. Some cues, such as visual cues,
are easy to pick up through video cameras. Others, such as tactile cues, are much more
difficult to sense since the technology to sense and display such cues is not yet
adequately advanced. Of the many cues that may be transmitted from a remote site,
visual feedback is considered the most vital for teleoperation (Sheridan, 1992).
According to Massimino and Sheridan (1994), a decrease in the quality of the viewing
conditions in the presence of force feedback will have a smaller negative effect on
performance than when visual information alone is present.


A monoscopic video camera and monitor are conventionally used in designing
AR systems due to their cost, availability, and suitability. A standard monoscopic video
system presents some deficiencies in display:

(a) Diminished depth information

Depth information provided by AR systems may
be insufficient (Pepper, 1984). A monoscopic video display is very effective in providing
visual information about distance viewing through a variety of monocular cues for
judging depth such as interposition, light and shadow, linear perspective, and size
constancy of familiar objects. However, the monoscopic video lacks important binocular
cues such as retinal disparity and binocular convergence. In many cases the appearance of
images at different depth planes can greatly aid in image interpretation, which is
especially useful when the objects and their background are unfamiliar, viewed from
unusual angles, or viewed under low visibility conditions.

(b) Height in the visual field

Ordinal information about the distance of objects is
available from the vertical location in the visual field of the bases of objects resting on
the (horizontal) ground. The bases of more distant objects are located higher in the visual
field than the bases of closer objects. Scaled information is available relative to the
observer's eye height. The cue is not useful at very close distances. It is maximally
effective at distances of about 2 m and decreases in effectiveness at a distance of
approximately 100 m (Welch, 1978).
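The geometry behind this cue can be sketched briefly. For an object resting on flat ground, the base appears below the horizon by an angle of declination equal to atan(eye height / distance), so the angle shrinks rapidly with distance. The Python fragment below is only an illustration of that relationship; the eye height of 1.6 m and the function names are assumptions, not values from Welch (1978).

```python
import math

def declination_deg(eye_height_m, distance_m):
    """Angle below the horizon (degrees) of an object's base on flat ground."""
    return math.degrees(math.atan2(eye_height_m, distance_m))

def distance_from_declination(eye_height_m, declination_degrees):
    """Invert the relation: ground distance scaled by the observer's eye height."""
    return eye_height_m / math.tan(math.radians(declination_degrees))

EYE_HEIGHT_M = 1.6  # assumed standing eye height
for d in (2.0, 10.0, 100.0):
    print(f"{d:6.1f} m -> {declination_deg(EYE_HEIGHT_M, d):6.2f} deg below horizon")
```

Because the angle changes very little between distant ground positions, small errors in the perceived angle translate into large distance errors far from the observer, which is consistent with the cue losing effectiveness at long range.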

2.4.2. Motion Cues

The apparent motion of objects caused by movement of the observer through the
environment provides information about the layout (e.g., near objects appear to move
past a moving observer more rapidly than far objects). Information is not available about
objects that are too close relative to the velocity of movement to be tracked. This
situation decreases sensory effectiveness to about 100 m, depending on the velocity of
movement by the user (Utsumi, Milgram, Takemura, and Kishino, 1994).
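A short worked example may help make the motion cue concrete. For an observer translating at speed v, a stationary point at distance d and bearing θ from the direction of travel sweeps across the visual field at an angular velocity of v·sin(θ)/d. The sketch below evaluates this for a few distances; the walking speed of 1.4 m/s and the function name are illustrative assumptions.

```python
import math

def parallax_deg_per_s(speed_m_s, distance_m, bearing_deg=90.0):
    """Angular velocity (deg/s) of a stationary point seen by a moving observer.

    bearing_deg is measured from the direction of travel; 90 degrees means
    the point is directly to the side of the observer.
    """
    omega_rad = speed_m_s * math.sin(math.radians(bearing_deg)) / distance_m
    return math.degrees(omega_rad)

WALKING_SPEED_M_S = 1.4  # assumed observer speed
for d in (1.0, 10.0, 100.0):
    print(f"{d:6.1f} m -> {parallax_deg_per_s(WALKING_SPEED_M_S, d):7.2f} deg/s")
```

The inverse dependence on distance shows both limits described above: very near objects move too fast to track, while distant objects produce angular velocities too small to be informative.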

2.4.3. Time Delay Issues

Task performance with AR systems is impacted by a time delay between the
control input by the operator, information transmission, and the consequent feedback of
control actions visible on the display (Olano, Cohen, Mine, and Bishop, 1996). A
continuous closed-loop control becomes unstable at a particular frequency when the time
delay in the control loop exceeds half the time period at that frequency. AR users will
usually experience a time delay in communication between the operator and the AR
gadget, and the AR equipment is sensitive to many factors that include information
bandwidth, speed of data transmission, computer processing capacity, and so forth. Apart
from the speed of communication, signal processing and data storage in buffers at various
stages between the local site and the remote site also consume considerable time
(Smith, McCray, and Smith, 1962; Smith, Wargo, Jones, and Smith, 1963).
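The half-period condition can be restated in terms of phase. A pure delay of τ seconds adds a phase lag of 360·f·τ degrees at frequency f, and the lag reaches 180 degrees (the point at which corrective feedback arrives in opposition to the error) exactly when τ equals half the period, that is, at f = 1/(2τ). The sketch below only illustrates this arithmetic; the function names and the example delays are assumptions, and whether a real loop actually becomes unstable also depends on its gain at that frequency.

```python
def phase_lag_deg(frequency_hz, delay_s):
    """Phase lag (degrees) contributed by a pure time delay at a given frequency."""
    return 360.0 * frequency_hz * delay_s

def critical_frequency_hz(delay_s):
    """Frequency at which the delay equals half the period (180 degrees of lag)."""
    return 1.0 / (2.0 * delay_s)

for delay in (0.05, 0.1, 0.5, 1.0):  # assumed loop delays in seconds
    f_crit = critical_frequency_hz(delay)
    print(f"delay {delay:4.2f} s -> 180 deg lag at {f_crit:5.2f} Hz")
```

Longer delays push the critical frequency down, which is one way to read the observation that operators under delay must slow their control movements.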

2.4.4. Bandwidth of Communication

Communication bandwidth is a significant factor in limiting the transmission of
visual data between human operators and digital information generation mechanisms.
For example, if the bandwidth demand of video data is high, this can impose substantial
time delays on the operators. Decreasing frame rate, resolution, number of bits/pixel, or
the use of image compression devices can reduce bandwidth demand. In general,
operator performance has been shown to be adversely affected by decreases in the frame
rate, resolution, and grayscale (Sheridan, 1992). For example, Massimino and Sheridan (1994)
reported more than a 100% increase in the task completion time when the video frame
rate (in a monoscopic video) was dropped from 30 frames/second to 3 frames/second. In
addition, a low bandwidth also contributes to a time delay, as a single visual display
frame takes longer to reach a local site.
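The trade-off described above can be sketched with simple arithmetic: uncompressed video demand is the product of resolution, bits per pixel, and frame rate, and the time for one frame to arrive is the frame size divided by the link rate. The resolution, bit depth, and link rate below are hypothetical values chosen for illustration; only the 30 and 3 frames/second rates come from the text.

```python
def video_bitrate_bps(width, height, bits_per_pixel, frames_per_s,
                      compression_ratio=1.0):
    """Bandwidth demand in bits per second for a video stream."""
    return width * height * bits_per_pixel * frames_per_s / compression_ratio

def frame_transfer_time_s(width, height, bits_per_pixel, link_bps,
                          compression_ratio=1.0):
    """Time for a single frame to cross a link of the given capacity."""
    return width * height * bits_per_pixel / compression_ratio / link_bps

# Assumed 640x480, 8-bit grayscale video over a hypothetical 10 Mbit/s link
full_rate = video_bitrate_bps(640, 480, 8, 30)
reduced_rate = video_bitrate_bps(640, 480, 8, 3)
per_frame_delay = frame_transfer_time_s(640, 480, 8, 10_000_000)
print(f"30 frames/s: {full_rate / 1e6:.1f} Mbit/s")
print(f" 3 frames/s: {reduced_rate / 1e6:.1f} Mbit/s")
print(f"one frame over the link: {per_frame_delay * 1000:.0f} ms")
```

Dropping the frame rate reduces demand proportionally, but the per-frame transfer time is unchanged, so a narrow link contributes delay even at low frame rates.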

2.4.5. Attention

Early theories of attention (Welford, 1960; Broadbent, 1958) located the process
between perception and memory. This result demonstrated that when a stimulus is
received, information in memory associated with that stimulus is automatically activated.
Stimuli received after this activation (and associated with the activated memory) benefit
from better reaction time and accuracy. The converse is also true. Reaction time suffers
if the stimulus received is not associated with the activated memory (i.e., an
unexpected signal).

The importance of attention in AR performance measurement cannot be
overstated. In real environments, human senses receive and process information
stimuli in parallel, and few of these stimuli are used in the relevant task (Cho and Ulrich,
1996). Moreover, during information processing, individuals shift their sensory receptors
to different positions (as in the case of head or eye movements). This is known as overt

The concept of automaticity is directly related to attention as well. Automatic
processes, such as reading familiar words, hearing one’s name, and so forth, consume



resources. These resources can be used to perform other tasks. Numerous
automatic processes can be performed simultaneously; walking while whistling is an
example. Conscious processes, on the other hand, require attention. They are either too
unfamiliar or too complex to be automatic, and generally consume most of the available
resources in the cognitive system. Multiple conscious processes can only occur
simultaneously if they are very simple, although an individual can perform a number of
automatic processes along with a conscious process (listening to the news on the radio
while operating the directional signals and chewing gum as a car is kept on the road).

Both natural and artificial environments provide far more information than can be
assimilated simultaneously. Selective attention is the process by which the salient cues
are selected and combined to provide the information required to support the control of
action. The ability to selectively attend to salient cues and obtain the information
required develops with experience. For example, expert squash players obtain
information about the subsequent flight of an approaching ball from the movement of
their opponent's arm well before the ball is struck, while novices appear unable to access
this information and must rely on the movement of the racquet and the ball (Abernethy,

Visual search patterns have also been studied to understand expert-novice
differences in obtaining information from complex displays. The search patterns are in
part determined by knowledge and prior experience. For example, skilled radiologists
employ search patterns that roughly correspond to the probability distribution of the
location of abnormalities (Kundel, 1974). Other differences appear to correspond to
differences in skill. For example, novice drivers are more fixated on the road edge and
speedometer, while experienced drivers spend more time looking further ahead (Mourant
and Rockwell, 1972).


Miller (1956) is one of many researchers who have shown that the span of
immediate memory is about 7 items in length. Individuals can perceive large quantities of
sensations, and can hold vast amounts of information in their long-term memory.
However, immediate memory is the bottleneck in the information processing system.
While 7 ± 2 is the number of items normally kept in immediate memory, there is far more
to short-term memory capacity. Items can be grouped or “chunked” to stretch the
limitation. Often there is insufficient time or attention to apply a scheme to transfer
items to long-term memory. The natural assumption has been that time, in the form of
memory decay, is the significant factor in memory loss. Waugh and Norman (1965),
however, have shown that interference is the primary factor, although the decay theory is
difficult to test. Forgetting is strongly influenced by intervening items prior to
transference to long-term memory. The manner in which AR and information
amplification affect memory functions remains a subject of investigation in human
factors research.
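Chunking can be illustrated with a toy example. Twelve separate digits exceed the 7 ± 2 span, but the same digits regrouped into three familiar four-digit units fit comfortably. The digit string and the chunk size below are arbitrary choices for illustration, and the helper function is hypothetical.

```python
def chunk(items, size):
    """Split a sequence into consecutive groups of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

digits = "149217761945"   # 12 individual items, beyond the 7 +/- 2 span
chunks = chunk(digits, 4)  # regrouped into four-digit units
print(len(digits), "items ->", len(chunks), "chunks:", chunks)
```

The same principle suggests that an AR display which groups related annotations into a few meaningful units may place less load on immediate memory than one that presents each item separately, although that remains a design hypothesis rather than a finding reported here.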

2.4.7. Change Blindness

Events involve a change that occurs over time, a moving object, a moving
observer, or all of these phenomena. Objects in human visual environments often do not
occupy a single location with respect to depth. Movement of objects is distinguished from
movement of observers by differences in the optical transformations that occur in the
visual field. Movement of objects within the environment causes local changes in the
optic array, while observer movement yields a global optic flow. Thus, deformations in
the local and global optic flow provide the observer with information about events. This
phenomenon in observation suggests that attention is focally allocated to a local region of
the visual field where stimuli are processed in more detail.

The hypothesis in change blindness is that an observer’s eyes come to rest without
noticing the change in magnitude or position when an object changes position within the
display or disappears completely. This condition is often attributed to the inattention of
the temporal blindness effect (Rensink, O’Regan, and Clark, 1997). Change blindness
occurs while the visual scene is changed and the eyes are in motion. It is more likely to
occur when observers do not explicitly focus attention on the changing object (Rensink,
2000). The necessity of optic flow for the perception of objects and events is evident in
the observation that large changes go unnoticed if the optic flow is disrupted. This
phenomenon can be achieved experimentally by altering presentations during saccadic
eye movement, or more simply, by introducing a gray “blanking image” between
successive displays in which changes occur.
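The blanking procedure described above can be sketched as a repeating presentation schedule: the original image, a gray blank, the modified image, and another blank. The fragment below generates such a schedule as (label, duration) pairs. The durations of 240 ms per image and 80 ms per blank are placeholder values assumed for this sketch, and the function name is hypothetical; no display code is included.

```python
from itertools import cycle, islice

def flicker_sequence(image_ms=240, blank_ms=80):
    """Endless original / blank / modified / blank cycle as (label, duration_ms)."""
    phases = [
        ("original", image_ms),
        ("blank", blank_ms),     # gray blanking image masks the transient
        ("modified", image_ms),
        ("blank", blank_ms),
    ]
    return cycle(phases)

# Take one full cycle of the schedule
first_cycle = list(islice(flicker_sequence(), 4))
for label, duration_ms in first_cycle:
    print(f"{label:>8}: {duration_ms} ms")
```

In an experiment, the dependent measure would typically be the number of cycles (or elapsed time) before the observer reports the change.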

Theories in the visual attention literature offer an understanding of change
detection and strategies for enhancing detection of change (Rensink, 1997). An object
must be directly attended to in order for the user to recognize a change when local cues
that may garner a user’s attention are absent. Thus, human beings do not automatically
encode entire visual scenes. This incomplete encoding can result in blindness to a change in the
visual scene (transitions between quantities, not the sheer presence of quantities)
(Rensink, 2000) or, as referred to in the literature, “change blindness”. Simply put, change
blindness represents a failure to see unattended changes (Rensink et al., 1997).

Rensink (2000) states that change blindness is a salient phenomenon pertinent to
the visual experience. Mitroff and Simons (2002) note that change blindness is the
inability to detect what should be an obvious visual change during a disruption such as a
saccade, flicker, or a blink. From an applied perspective, change blindness denotes the
failure to report the presence of significant changes in the visual input under particular
experimental conditions (Rensink, 2000).

Substantial change blindness research has been conducted to understand this
internal representation construct. O’Regan et al. (1999) provide an overview of the
findings applied to the way human beings build internal visual representations of the
world:

the internal representation of the visual world is much sparser than
suggested by subjective experience

attention is paramount in encoding an aspect of the scene in this
representation

visual transients as a residual of scene changes can bring attention to the
location of the change.

The applicability of these findings to AR design is significant, especially when
addressing issues of cognitive augmentation design. Experimental findings by Tan, Gray,
Young, and Irawan (2001) suggest the applicability of cues in enhancing detection of
change in a display. For example, haptic cueing can provide an alternative to auditory
and visual cueing in the design of a human-system interface. For this purpose, Intille
(2002) advocates the exploitation of change blindness in interface design as a strategy to
keep information current without attracting a user’s focus of attention. Design strategies
such as blanking an image, changing views, displaying mud splashes, changing
information slowly, exploiting eye blinks or saccades, and the use of occlusion have been
suggested as a means to enhance detection of change and maintain the “calmness” of the
interface environment. These strategies will minimize feelings of information overload
(Intille, 2002).

Understanding and exploiting the aspects of change blindness can facilitate AR
design, ensuring that augmentation information is “seen” and does not conflict with
task performance. Such understanding and exploitation present a unique set of
human factors considerations, specifically, the appropriate presentation characteristics of
the augmented information so that the user “sees” or “detects” changes. The application
of change blindness phenomena is illustrated by Popular Science (Ditlea, 2002) (Figure
2.2). Figure 2.2 illustrates the dynamic nature of change blindness with respect to
information processing. For example, during changes in a “real scene”, a soldier must be
able to detect changes in the augmentation layer because in a single “blink of the eye” the
augmentation annotating an entity as “friendly” could change to reflect the entity as


“Hostages are being held on an upper floor of a building on National Street.
Through augmented reality glasses linked to computers worn on their bodies, the
troops sent in to free the hostages view a world brimming with
information that can assist them in this dangerous mission. Where
they look, the environment is automatically annotated. Risky hot spots are
highlighted in red: snipers positioned on buildings on either side of the street,
and the spot where a car bomb just exploded. A pop-up window provides the
floor plan of the building the soldiers must swarm, with the safest route blocked
out. The Black Hawk chopper hovering overhead beams intelligence to
commanders at the control center, who then

the soldiers’ annotated
in real time

Figure 2.2. Vision of AR

2.5. Summary of Human Factors and Psychophysical Issues in AR Design and


The major human factors and psychophysical issues relevant to AR/AC research
are summarized in Tables 2.2 and 2.3.


Table 2.2. Some Technology and Human Factors Challenges in AR Research




Visualization Challenges

Image Registration: refers to the proper alignment of the real and virtual worlds. One of the most critical issues in AR is the registration or tracking problem (Ansar and Daniilidis,

Latency: Delay causes registration errors. Delay can reduce task performance (Azuma et al., 2001)

Perception: Perceptual biases can interfere with task performance (Drascic and Milgram, 1996)

Depth perception: Depth perception is a difficult registration problem. Consistent registration plays a role in depth perception (Azuma et al., 2001)

Adaptation: Adaptation to AR equipment can negatively impact the user’s performance (Azuma et al.,

Delays as small as 10 ms can make a statistically significant difference in the performance of a task that guides a ring over a bent wire (Azuma et al., 2001)

18 perceptual issues have been defined in support of stereoscopic displays with an emphasis on AR (Drascic and Milgram, 1996)


Human Information Processing (Stedmon, Hill, Kalawsky, and Cook, 1999)

Potential psychological limitations of AR technology include any excessive human information processing

Experiments have been developed to investigate the impact of AR on human information processing and the most appropriate symbologies for displaying information via the AR

Data have not shown any significant correlations with

Further research required

User Interface Challenges: AR presents unique challenges

management of interaction between the physical world and virtual information, without changing the physical world (Azuma et al., 2001)

Data density: Augmentation may render a cluttered and unreadable display

Filtering techniques have been explored to reduce the volume of displayed information while keeping important information in


Table 2.2. (Continued).




Physical Ergonomic Challenges (Azuma et al., 2001, and McCauley Bell, 2002)

Fatigue and eye strain: Uncomfortable AR displays may not be suitable for long-term use

Usability of a VE system is influenced by the ergonomic strength and usability of the physical aspects of the

The exploration of physical ergonomic challenges remains an open research

Table 2.3. Psychophysical Factors Relevant to AR Performance










elaboration, markers

Perception of brightness,
hues, resolution, and
color fidelity


Surface orientation

Information display size,
volume, and area. Image
magnification according
to angle and distance.



Rate of visual motion

Increase in depth from
point of

Reverse motion, display
momentum, and change
blindness effect.



Reduced amplification
due to decoupling effect
of accommodation or
binocular disparity

Vergence movements to
objects in different
planes, cue conflicts, and
eye strain.


Fidelity, content, and

Refresh rate, frame rate,
and update frequencies.

Channel capacity,

information cues versus


Concurrent actions

Buffering and “pop ups”.

Delayed or late updates


Table 2.3. (Continued).









Signal intensity

Cue saliency,

Information elaboration,
eliciting important

Noise, stimulus duration,
and intensity threshold

Target location

Realism of posture
with respect to angle,

Visible surface f
and viewing distance.

Constancy effect,
egocentric versus
exocentric depth

Target size

Salient features

Information density

Information design

Target background

Highlights and
distinguishing features

Cue differentiations and

Noise, clutter, texture,
and obscuration

2.6. Implications of Literature Review to Current Research

Shneiderman (1998) states that humans have known unique perceptual abilities
that are underutilized in most HCI designs. As highlighted in the reviewed literature, AR
is an HCI design that bridges this gap. AR seeks to “properly leverage” natural human
capabilities to provide an appropriate HCI that enables the human user to do more. In its
attempt to provide an extension of the user’s perception of the environment, AR
capitalizes on the human’s ability to scan, recognize, and rapidly recall images and detect
changes in size, color, shape, movement, or texture (Shneiderman, 1998). The
effectiveness of this extension, although possibly compromised in stressful conditions,
should take place in both the real and augmented layers. Clearly, close attention to human
factors considerations is of the utmost importance.


Wilson (1999) affirms that the application of human factors/ergonomic methods
in the development of VE, inclusive of AR, will drive the impact and value of these
technologies. While germane HCI guidelines are applicable to VEs,
inclusive of AR, there exist a number of variables that are specific to AR.

The vital lessons of the literature review for this research include:


The investigation of psychophysical research relevant to the proposed
experimental designs as shown in Table 2.4.

Table 2.4. Literature Review Lessons Learned




Time delay/ Information content:

Flanagan, McAnally, Martin,
Heeham, and Oldfield (1998),
Malcolm (1984), Perrott,
Sandralodabi, Saberi (1991), and
Perrott, Cisneros, McKinely,
D’Angelo (1996).

Perception speed

time situation

How does time delay affect
situation awareness and
subsequent target detection

Does too much information
content in the display affect
detection search time?


Massimino and Sheridan (1994),
Montgomery (1999), Ntuen and
s (2001).

Refresh rate

Information content in

Memory recall

Level of cognitive
amplification desired

Do peripheral displays augment
information requirements during
task processing?

How often should information
be presented or sampled?


Atchley and Kramer (2001), Maltz
and Meyers (2001), Nelson,
Hettinger, Cunningham, Brickman,
Haas, and McKinley (1998),
Theeuwes (1989), Watson and
Kramer (1999), Yeh and Wickens
(2001a; 2001b).

based versus cued
target detection

Divided attention

Selective attention

Does directed attention translate
to an increase in target detection

Which is better: whole object
detection or objects detected by
salient features?

Change blindness:

Levin and Simons (1997), Rensink
(2000; 2002), Rensink et al.

Perception of change

Change detection

Does AR complement temporal
changes in scene processing
during saccades?



There are various ways to amplify human cognition. These include
annotation of information (Rose, Breen, Ahlers, Crampton, Tuceryan,
Whitaker, and Greer, 1995), magnification of the glass approach or
imparted presence (Bowskill and Downie, 1995), intelligence
amplification with the use of the toolsmith metaphor (Brooks, 1996),
peripheral information space (Carr, 1995), and supplementation with
redundant information (Lampton, Knerr, Goldberg, Bliss, Moshell, and
Blau, 1994).


HMD is not the only means of rendering

Understanding the rules film editors use to combine different views into a
coherent whole can provide information augmentation in a motion picture
form (Levin and Simons, 1997). Other means of AR include body
computers and palmtop computers (Fitzmaurice, 1993).




3.1. Overview

The urban warfare environment is the selected experimental domain for this
study. Gritton and Anton (2003) highlight urban warfare as one of ten international
security developments that deserve information technology support for the warrior. To
this end, individuals in the defense industry are beginning to develop new concepts,
technologies, and systems that will render the dense and dynamic urban battlefield as
transparent as possible.

There is a large amount of developmental work that will provide the soldier with
up-to-the-minute data on the battlefield and the enemy. Devices such as micro air
vehicles are being investigated to provide the soldier on the urban battlefield real-time
video information about a situation by flight over the scene. Clearly, the soldier’s access
to a rich body of contextually appropriate information will be of advantage. However,
there is a concern with how to present this type of information to a soldier so that its
value can be extracted in real time (augmenting, not overloading, the soldier’s cognition)
without sacrificing battlefield vigilance.

Understanding how to enable performance in an urban warfare environment is
critical. Future strategies to deter terrorism and combat major urban crimes like car
hijacking will take place in large cities or urban thoroughfares with highly dense
populations. In addition, there exist within these environments landmarks that make it


easy for targeted criminals to hide and assimilate themselves among the non
population. This situation makes the environment extremely
complicated and inherently
dimensional. The cluttered environment poses risks to law enforcement agents and
land warriors who must identify target groups and bring them to justice. The situation
becomes even more risky in urban combat situations where soldiers are employed for
liberation and/or the elimination of insurgents. In narrow, crowded streets it is virtually
impossible for law enforcement teams or combat soldiers to be in the direct line of sight
of one another. In addition, the urban scene is highly dynamic and constantly changing.
Dangers, such as the position of snipers, can change from one minute to another.
Operation Restore Hope in Somalia in 1999 is an example of an urban warfare situation
in which approximately 27 U.S. Army soldier
s were killed by urban snipers.

To minimize risk in urban warfare environments, likely targets that
constitute risks must be detected, discriminated, and identified. Figure 3.1 shows an
example of an urban landscape with potential targets. Detecting these targets is a
cognitive task of the kind examined in psychophysical studies of target detection. Such
studies are worthwhile because of the potentially high payoff in saving lives during urban
warfare.


Example Applications of AR Relevant to this Study

AR has been identified as a tool for presenting information to the
soldier on the urban battlefield. The Battlefield Augmented Reality System (BARS) is an
example AR application under development by the United States Marines to
facilitate fighting in urban environments. BARS is slated to use AR technologies to
support target identification and detection in an urban fighting situation (Julier, Baillot,
Lanzagorta, Brown, and Rosenblum, n.d.).

Figure 3.1. Sample Urban Landscape for Target Detection Experiment


BARS will provide U.S. Marines with real-time, three-dimensional situational
awareness information without distracting them from their operating surroundings.
This will be accomplished by overlaying mission-critical information on
the Marines’ immediate visual environment. The overlay will provide appropriate
contextual information so that the Marines will not be distracted by scanning
supporting documents, interfaces, and so forth.

It is expected that BARS will provide strategic information including:

Global information about the user’s environment

The absolute location of the user and all other members of his team

Planning information

Local information about buildings

Routing information

Signpost information

Highly localized information that needs to be registered with specific
environmental features

Infrastructure and utility information

Virtual objects that are simulated within the environment

Other types of information

By integrating a Wearable Augmented Reality System (WARS) with a 3-D
Interactive Command Environment (3DICE), BARS is expected to give the urban
combatant mobile, “pilot-like” vision comparable to a head-up display in the cockpit
environment. Using BARS, Marines will acquire useful situation awareness information
such as personal locations, threats, and targets.

In general, the urban fighting environment presents unique challenges. The
combatant in the urban battlefield environment must be able to detect and identify targets
in the urban landscape. AR shows promise in facilitating combat performance
in this landscape by, among other things, improving target identification and detection.
The experiments in this research will help provide an understanding of the effectiveness of
superimposed computer imagery generated by AR in target identification and detection.
The experiments will be performed within the context of a visually dense and dynamic
environment as characterized by a selected urban battlefield environment.




4.1. General Design Notes

The following experimental design components are consistent across all the
experiments and are described fully in section 4.2:

Criteria used to screen participants

Apparatus used to perform experiments

Design for the experimental scenarios

Limitations constraining test bed development

Procedures for experimental execution

All statistical analyses were performed using MINITAB Statistical
Software (MINITAB 14, MINITAB, Inc., State College, PA).

4.2. Experiment 1

4.2.1. Methodology

The first experiment was designed to understand the impact of cueing reliability
on change detection performance in an AR environment. Two independent variables and
two dependent variables comprised the main components of a within-subjects factorial
design. Objective measures were supplemented with subjective measures. Analyses
were also conducted with information theory to further understand the implications of
cueing reliability on information loss during change detection and identification.
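
The text does not state which information-theoretic quantities were computed. As one hedged illustration, assuming the analysis tabulates the change type presented against the change type reported, transmitted information and information loss (equivocation) could be estimated from a stimulus-response confusion matrix. The function name and the matrix layout below are hypothetical and are not taken from the study.

```python
import math


def transmitted_information(confusion):
    """Estimate information measures (in bits) from a confusion matrix.

    confusion[i][j] is the number of trials on which stimulus i
    (e.g., a change type) was presented and response j was given.
    Returns (H_stimulus, H_response, transmitted, lost).
    This layout is an assumption made for illustration only.
    """
    total = sum(sum(row) for row in confusion)
    n_resp = len(confusion[0])

    # Marginal probabilities of stimuli (rows) and responses (columns).
    p_stim = [sum(row) / total for row in confusion]
    p_resp = [sum(row[j] for row in confusion) / total for j in range(n_resp)]
    p_joint = [count / total for row in confusion for count in row]

    def entropy(probs):
        # Shannon entropy; zero-probability cells contribute nothing.
        return -sum(p * math.log2(p) for p in probs if p > 0)

    h_s = entropy(p_stim)
    h_r = entropy(p_resp)
    h_joint = entropy(p_joint)

    transmitted = h_s + h_r - h_joint  # mutual information between S and R
    lost = h_s - transmitted           # equivocation: stimulus information not recovered
    return h_s, h_r, transmitted, lost
```

With four equally likely change types, perfect identification transmits 2 bits and loses none, whereas responses unrelated to the stimulus transmit 0 bits and lose all 2.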

The two independent variables were cueing reliability and temporal change type.
Cueing reliability was investigated at three levels: low (50%), medium
(70%), and high (90%). The temporal change type was investigated at four levels:
presence, position, feature, and none. A 3 x 4 factorial design with 12 treatment
combinations was used. Ten replicates were used for each treatment combination,
comprising five uniquely staged scenarios in duplicate. These 10 replicates were shown to
each participant, resulting in a total of 120 trials (Table 4.1). The 120 trials were
presented to each participant in a random order.

Table 4.1. Experimental Design for Experiment 1

                      Temporal Change Type
Cueing Reliability    Presence         Position         Feature          None
Low (L): 50%          10 replicates    10 replicates    10 replicates    10 replicates
Medium (M): 70%       10 replicates    10 replicates    10 replicates    10 replicates
High (H): 90%         10 replicates    10 replicates    10 replicates    10 replicates
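
The trial structure of the 3 x 4 design can be sketched in a few lines of Python. This is only an illustration of the arithmetic (12 treatment combinations x 5 scenarios x 2 presentations = 120 trials) and of per-participant randomization; the constant names, the dictionary layout, and the use of a seeded generator are assumptions, not the procedure actually used in the study.

```python
import itertools
import random

# Factor levels as described in the text.
RELIABILITY_LEVELS = {"low": 0.50, "medium": 0.70, "high": 0.90}
CHANGE_TYPES = ["presence", "position", "feature", "none"]
N_SCENARIOS = 5    # uniquely staged scenarios per treatment combination
N_DUPLICATES = 2   # each scenario shown twice, giving 10 replicates


def build_trial_list(seed=None):
    """Return the 120 trials for one participant in random order.

    The seed parameter is hypothetical; it only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    trials = [
        {
            "reliability": reliability,
            "change_type": change_type,
            "scenario": scenario,
            "presentation": presentation,
        }
        for reliability, change_type in itertools.product(RELIABILITY_LEVELS, CHANGE_TYPES)
        for scenario in range(1, N_SCENARIOS + 1)
        for presentation in range(1, N_DUPLICATES + 1)
    ]
    rng.shuffle(trials)  # a fresh random order for each participant
    return trials
```

Calling `build_trial_list()` once per participant yields a different presentation order each time while keeping 10 replicates in every cell of Table 4.1.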

The selected reliability levels were chosen to reflect the actual range of
performance capabilities of AR systems (Milbert, 2004). The selected temporal change
types represented the types of tactically significant changes that might be encountered on
a battlefield.


Two objective measures were collected from participants for each trial. These
objective measures were the change detection time and the identification of the detected
change. Subjective measurements (described in Appendix A) related to the effect of
cueing on change detection performance were also collected. Sample subjective
measurements included the description of the participant’s scan strategy for the detection
of changes and an understanding of the impact of cueing on change detection.

4.2.2. Participants

six volunteer participants, aged 18-47, were selected from the general
North Carolina A&T State University population. To be eligible for the experiments, the
participants needed normal or corrected-to-normal vision and normal color vision. All
participants met the prescribed criteria. The participants were not paid but received
course credit, where applicable.

4.2.3. Equipment Set-up

A Gateway E-Machine Pentium IV computer with a 15-inch (viewable space) flat-panel
monitor was used to display the scenarios. Other equipment used in support of
scenario taping is listed in Table 4.2.

4.2.4. Scenario Design

The experimental scenarios were designed to simulate the use of AR on the
battlefield. An urban battlefield environment was selected as the domain because of
recent interest and exploration in urban warfare. AR has been identified as a technology
to provide identification of friend and foe on the battlefield. Through the use of an HMD,
attentional cueing was superimposed on the soldier’s view (augmenting his/her “real”
view), thus providing distinct symbology to draw attention to and identify entities as
friend, foe, or unknown.

The experimental scenarios were designed to mimic the forward field of view
(FFOV) of a soldier while using attentional cueing to identify entities on the battlefield.
The scenarios were designed to provide a level of realism in concert with an urban
warfare setting.

The scenarios featured an urban battlefield setting populated with friendly
personnel, enemy personnel, and unknown personnel. Cues, as virtual artifacts, were
used to identify all of these individuals. The enemy personnel (the targets) possessed a
common attribute of no uniform top so that they were distinguishable regardless of cueing.

Cueing was designed to operate at different reliability levels: low (50%), medium
(70%), and high (90%). Targets (enemy personnel) could be cued, miscued (enemy
personnel cued as friendly personnel), or not cued. The enemy personnel were randomly
miscued to meet the prescribed reliability levels. In addition, the enemy personnel that
were uncued or miscued as friendly were chosen in a random fashion.
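
The exact procedure used to assign cue states is not given in the text. The following is a minimal sketch under two stated assumptions: that reliability is the proportion of targets that are correctly cued, and that each remaining target is assigned at random to either the miscued or the uncued state. The function name `assign_cues` and its parameters are hypothetical.

```python
import random


def assign_cues(n_targets, reliability, seed=None):
    """Assign a cue state to each of n_targets enemy personnel.

    Assumption: `reliability` is the fraction of targets cued correctly.
    The remainder are randomly labeled "miscued" (cued as friendly) or
    "uncued". Rounding uses Python's round(), which rounds exact halves
    to the nearest even integer.
    """
    rng = random.Random(seed)
    n_correct = round(n_targets * reliability)

    states = ["cued"] * n_correct
    states += [rng.choice(["miscued", "uncued"]) for _ in range(n_targets - n_correct)]

    # Shuffle so that the unreliable cues fall on randomly chosen targets.
    rng.shuffle(states)
    return states
```

For example, `assign_cues(10, 0.70)` returns a list in which seven targets are correctly cued and the other three are miscued or uncued at random.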

The scenarios were designed so that a single change in a single target (enemy
personnel) would occur approximately 15 seconds after commencement of the scenario.
The change was a feature change, a presence change, a positional change, or no change.
The changes reflected behaviors that are tactically significant in an urban battlefield
environment. Changes could include the following behaviors:

Feature Changes:
    No weapon to weapon
    Change in weapon type

Presence Change:
    Appearance of an enemy
    Disappearance of an enemy

Positional Change:
    Change in position of an enemy

Table 4.2. Equipment Used in Support of Scenario Taping

Videotaping Equipment                   Props to Support Actors in the Virtual Battle Space
Sony DCR-TRV22 Digital Video Camera     Mock Assault Rifle (2)
Sony AC-L15B AC Power Adapter           Mock Shotgun (2)
Sony NP-FM30 Battery Pack               Mock Handgun (2)
Sony BC-TRM Battery Charger             Mock RPG (1)
Duracell DR-SM50 Replacement Battery    Radio w/antenna (1)
                                        Gas Mask (1)
                                        Hand Grenade (1)
                                        Binoculars (1)
                                        Woods camouflage BDU (3)
                                        Desert camouflage BDU (1)
                                        Enemy caps (3)
                                        Turban (1)

Similar to previous change detection investigations (Rensink, 2002; Groff,
2002), scenarios were designed to reflect the following model:

Scenario Commencement ... Scene (S’)


The changes were designed so that the motion of the change was masked by a
visual distraction in the scene. These distractions were designed to mimic natural
behaviors on the battlefield (i.e., “look

The scenarios were videotaped at the ATK Ordnance and Ground Systems
Proving Grounds in Elk River, MN on Saturday, 21 August 2004. Actors used in the
scenarios were volunteers from the Minnesota National Guard.

4.2.5. Technical Limitations

The development of the scenarios was constrained by two technical limitations:

Number of scenarios: Because of limited time and funding to secure
additional personnel and facility capacity, only five unique scenarios per
treatment combination could be photographed.

Videotaping equipment: Only “home” level video equipment was available
for taping the scenarios. This restricted the level of choreography of the
scenarios and limited their complexity.

4.2.6. Experimental Procedures

The experiment took approximately 1.25 hours to complete, including the time during which
participants were given the instructions for the experiment.

Participants were greeted and thanked for their willingness to participate in the
experiment. Participants were given a short briefing about the experiment, including
eligibility requirements, experimental objectives, and time requirements. Participants who did
not meet the requirements or were unwilling to participate were dismissed from the study.


Participants were then given a more comprehensive briefing on the experiment
and the experimental objectives. Participants were provided with the required Institutional
Review Board (IRB) content. Participants were asked to sign a consent form that
indicated that they understood and voluntarily agreed to the conditions outlined in the
experimental briefing. Participants who agreed to the statements on the consent form
were told that they could withdraw from the study at any time.

Participants were given specific task instructions after they were given general
logistical instructions and time to adjust their workstations for comfort. Participants were
informed that they would be viewing 120 scenarios, each lasting approximately 30
seconds. The scenarios were designed to mimic their FFOV of a battlefield environment
(through an HMD). They were told that the scenarios would feature urban, woodland, and
open terrains, and that virtual cueing would be provided to aid them in identifying
individuals within their FFOV. The individuals in their FFOV were
either friendly, enemy, or unknown personnel. The participants were provided with examples of these
individuals and the cueing used to identify them. In addition, the participants were told
that the automation providing the cueing operated at various levels of saliency, precision,
and reliability. Thus, the participants were informed of the probability of