Saliency Map Tutorial






2012.06 ADSP

Saliency Map Tutorial

b97901095@ntu.edu.tw


Content

1 Motivation
2 Introduction to Visual Saliency
2.1 Definition
2.2 Saliency in Psychology
2.3 Saliency in Neuroanatomy
3 Introduction to Saliency Map
3.1 Definition
3.2 Bottom-Up Approach
3.3 Top-Down Approach
4 General Computational Framework
5 Proposed Computational Mechanisms
5.1 A Model of Saliency-based Visual Attention for Rapid Scene Analysis -- L. Itti, C. Koch and Ernst Niebur (1998)
5.1.1 Architecture of Model
5.1.2 Visual Preprocessing
5.1.3 Center-Surround Differences
5.1.4 Normalization
5.1.5 Conspicuity Maps
5.1.6 Saliency Map
5.1.7 Example
5.2 Graph-Based Visual Saliency (GBVS) -- Jonathan Harel, Christof Koch and Pietro Perona (2007)
5.2.1 Forming an Activation Map (s2)
5.2.1.1 A Markovian Approach
5.2.2 Normalizing an Activation Map (s3)
5.2.3 Example
6 Applications
7 Reference









1 Motivation

One of the most severe problems of perception is information overload. Peripheral sensors generate afferent signals more or less continuously, and it would be computationally costly to process all this incoming information all the time. Thus, it is important for the nervous system to decide which part of the available information is to be selected for further, more detailed processing, and which parts are to be discarded. Furthermore, the selected stimuli need to be prioritized, with the most relevant being processed first and the less important ones later, thus leading to a sequential treatment of different parts of the visual scene. This selection and ordering process is called selective attention. Among many other functions, attention to a stimulus has been considered necessary for it to be perceived consciously.

What determines which stimuli are selected by the attentional process and which will be discarded? Many interacting factors contribute to this decision. It has proven useful to distinguish between bottom-up and top-down factors. The former are all those that depend only on the instantaneous sensory input, without taking into account the internal state of the organism. Top-down control, on the other hand, does take into account the internal state, such as the goals the organism has at this time, personal history and experiences, etc. A dramatic example of a stimulus that attracts attention using bottom-up mechanisms is a firecracker going off suddenly, while an example of top-down attention is the focusing onto difficult-to-find food items by an animal that is hungry, ignoring more "salient" stimuli.
















2 Introduction to Visual Saliency

2.1 Definition

Our attention is attracted to visually salient stimuli. It is important for complex biological systems to rapidly detect potential prey, predators, or mates in a cluttered visual world. However, simultaneously identifying any and all interesting targets in one's visual field has prohibitive computational complexity, making it a daunting task even for the most sophisticated biological brains [1], let alone for any existing computer. One solution, adopted by primates and many other animals, is to restrict the complex object recognition process to a small area or a few objects at any one time. The many objects or areas in the visual scene can then be processed one after the other. This serialization of visual scene analysis is operationalized through mechanisms of visual attention. A common (although somewhat inaccurate) metaphor for attention is that of a virtual spotlight, shifting to and highlighting different sub-regions of the visual world, so that one region at a time can be subjected to more detailed visual analysis [2] [3] [4].

Visual attention may be a solution to the inability to fully process all locations in parallel. However, this solution produces a problem. If you are only going to process one region or object at a time, how do you select that target of attention? Visual saliency helps your brain achieve reasonably efficient selection. Early stages of visual processing give rise to a distinct subjective perceptual quality which makes some stimuli stand out from among other items or locations. Our brain has evolved to rapidly compute saliency in an automatic manner and in real-time over the entire visual field. Visual attention is then attracted towards salient visual locations.

Visual saliency is sometimes carelessly described as a physical property of a visual stimulus. It is important to remember that saliency is the consequence of an interaction of a stimulus with other stimuli, as well as with a visual system (biological or artificial). As a straightforward example, consider that a color-blind person will have a dramatically different experience of visual saliency than a person with normal color vision, even when both look at exactly the same physical scene (see, e.g., Fig.1, the first example image below). As a more controversial example, it may be that expertise changes the saliency of some stimuli for some observers. Nevertheless, because visual saliency arises from fairly low-level and stereotypical computations in the early stages of visual processing, the factors contributing to saliency are generally quite comparable from one observer to the next, leading to similar experiences across a range of observers and of behavioral conditions.


Table 1. Visual saliency examples and comments.

Fig.1. One item in the array of items strongly pops out and effortlessly and immediately attracts attention. Many studies have suggested that in simple displays like this, no scanning occurs: attention is immediately drawn to the salient item, no matter how many other items (called distractors) are present in the display [2] [5]. This suggests that the image is processed in parallel (all at once) to determine saliency at every location and to orient towards the most salient location.

Fig.2. In this display, the vertical bar is visually salient. Comparing this example to the previous one suggests that local visual properties of a given item do not determine how perceptually salient this item will be; rather, looking at a given item within its surrounding context is crucial. Compare, for example, the red bar in the top-left corner of this image to the salient bar in the image above: both bars are red, roughly horizontal, and they both have very similar local appearances. Yet the one in the top-left corner here has low saliency and attention is much more strongly attracted to the more salient vertical bar, while the red bar in the above image is highly salient.

Fig.3. In this display, there is again one bar that is unique and different from all the other ones. However, by design and through judicious choice of distracting items, there is little saliency to guide you towards the target bar (why that is will be discussed in the following section). The target is a so-called conjunction target: it is the only red and vertical bar [2]. Because saliency does not help you direct attention towards potentially interesting items in the display, you find yourself scanning the image, seemingly at random, looking for something interesting.


2.2 Saliency in Psychology

Distinctiveness, prominence, obviousness. The term is widely used in the study of perception and cognition to refer to any aspect of a stimulus that, for any of many reasons, stands out from the rest. Saliency may be the result of emotional, motivational or cognitive factors and is not necessarily associated with physical factors such as intensity, clarity or size. Although saliency is thought to determine attentional selection, saliency associated with physical factors does not necessarily influence selection of a stimulus [7].

2.3 Saliency in Neuroanatomy

The hippocampus participates in the assessment of saliency and context by using past memories to filter new incoming stimuli, placing those that are most important into long-term memory. The entorhinal cortex is the pathway into and out of the hippocampus and is damaged early on in Alzheimer's disease.

The pulvinar (in the thalamus) modulates physical saliency in attentional selection [6].





















3 Introduction to Saliency Map

3.1 Definition

The saliency map has its root in Feature Integration Theory [2] and first appears in the algorithmic model of Koch & Ullman [8]. It includes the following elements (see Fig.4):

(i) an early representation composed of a set of feature maps, computed in parallel, permitting separate representations of several stimulus characteristics.

(ii) a topographic saliency map where each location encodes the combination of properties across all feature maps as a conspicuity measure.

(iii) a selective mapping into a central non-topographic representation, through the topographic saliency map, of the properties of a single visual location.

(iv) a winner-take-all (WTA) network implementing the selection process based on one major rule: conspicuity of location (minor rules of proximity or similarity preference are also suggested).

(v) inhibition of this selected location that causes an automatic shift to the next most conspicuous location.

Feature maps code conspicuity within a particular feature dimension. The saliency map combines information from each of the feature maps into a global measure where points corresponding to one location in a feature map project to single units in the saliency map. Saliency at a given location is determined by the degree of difference between that location and its surround. The models of Clark & Ferrier (1988) [9], Sandon (1990) [10], Itti et al. (1998) [11], Itti & Koch (2000) [12], Walther et al. (2002) [13], Navalpakkam & Itti (2005) [14], Itti & Baldi (2006) [15], SERR Humphreys & Müller (1993) [16], Zhang et al. (2008) [17], and Bruce & Tsotsos (2009) [18] are all in this class. The drive to discover the best representation of saliency or conspicuity is a major current activity; whether or not a single such representation exists in the brain remains an open question with evidence supporting many potential loci (summarized in Tsotsos et al. 2005 [19]).
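The selection loop described in elements (iv) and (v) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the original Koch & Ullman circuit: the function name `attend`, the square suppression neighborhood, its `radius`, and the toy saliency values are all made up for the example.

```python
def attend(saliency, n_shifts, radius=1):
    """Return the first n_shifts attended (row, col) locations.

    Hypothetical helper: winner-take-all selection followed by
    inhibition of the selected neighborhood.
    """
    # Work on a copy so the caller's map is left untouched.
    s = [row[:] for row in saliency]
    h, w = len(s), len(s[0])
    visited = []
    for _ in range(n_shifts):
        # Winner-take-all: pick the most conspicuous location.
        r, c = max(((i, j) for i in range(h) for j in range(w)),
                   key=lambda p: s[p[0]][p[1]])
        visited.append((r, c))
        # Inhibition: suppress a (2*radius+1)-wide square around the winner,
        # so the next iteration shifts to the next most conspicuous location.
        for i in range(max(0, r - radius), min(h, r + radius + 1)):
            for j in range(max(0, c - radius), min(w, c + radius + 1)):
                s[i][j] = float("-inf")
    return visited


# Toy saliency map (values are illustrative only).
saliency_map = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.1, 0.2, 0.1, 0.6],
    [0.0, 0.0, 0.3, 0.4],
]
scanpath = attend(saliency_map, n_shifts=2)
print(scanpath)
```

With this toy map, attention first lands on the strongest peak and then, once that neighborhood is inhibited, moves to the strongest remaining location.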

Fig.4 The Saliency Map Model as originally conceived by Koch & Ullman 1985 (figure adapted from Koch & Ullman 1985).

3.2 Bottom-Up Approach

The core of visual saliency is a bottom-up, stimulus-driven signal that announces “this location is sufficiently different from its surroundings to be worthy of your attention”. This bottom-up deployment of attention towards salient locations can be strongly modulated or even sometimes overridden by top-down, user-driven factors [20] [21]. Thus, a lone red object in a green field will be salient and will attract attention in a bottom-up manner.

3.3 Top-Down Approach

On the other hand, if you are looking through a child’s toy bin for a red plastic dragon, amidst plastic objects of many vivid colors, no one color may be especially salient until your top-down desire to find the red object renders all red objects, whether dragons or not, more salient.










4 General Computational Framework

A simple framework to think about how saliency may be computed in biological brains has been developed over the past three decades (Treisman & Gelade, 1980 [22]; Koch & Ullman, 1985 [2]; Niebur & Koch, 1996 [23]; Itti & Koch, 2001 [21]). According to the framework, incoming visual information is first analyzed by early visual neurons, which are sensitive to the various elementary visual features of the stimulus. This analysis, operated in parallel over the entire visual field and at multiple spatial and temporal scales, gives rise to a number of cortical feature maps, where each map represents the amount of a given visual feature at any location in the visual field. Within each of the feature maps, locations which significantly differ from their neighbors are highlighted, as further discussed below. Finally, all highlighted locations from all feature maps combine into a single saliency map which represents a pure saliency signal that is independent of visual features (Koch & Ullman, 1985 [2]; Nothdurft, 2000 [24]). According to several models, the relative contributions of different feature maps to the final saliency map depend upon the current behavioral goals and subjective state of the observer (Wolfe, 1994 [25]; Navalpakkam & Itti, 2005 [26]). In the absence of any particular task, such as during casual viewing, attention is drawn towards the most salient locations in the saliency map, as detected, for example, via a winner-take-all mechanism (Didday, 1976 [27]; Koch & Ullman, 1985 [2]). This, in turn, triggers motor actions which direct the eyes and the head towards salient visual locations (Dominey & Arbib, 1992 [28]; Findlay & Walker, 1999 [29]). Note that a number of theories exist as to whether an explicit saliency map is necessary or not (Hamker, 1999 [30]; Li, 2002 [31]; see Saliency Map [39] for additional discussion).
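As a rough illustration of how task-dependent weights could change which location wins, the sketch below combines two toy feature maps with a weighted sum and then picks the maximum. The weighted-sum form, the helper names `combine` and `argmax2d`, and all numbers are assumptions made for this example; the models cited above differ in how they apply top-down modulation.

```python
def combine(feature_maps, weights):
    """Weighted average of equally sized feature maps (dict name -> 2D list)."""
    names = list(feature_maps)
    h = len(feature_maps[names[0]])
    w = len(feature_maps[names[0]][0])
    total = sum(weights[n] for n in names)
    return [[sum(weights[n] * feature_maps[n][i][j] for n in names) / total
             for j in range(w)] for i in range(h)]


def argmax2d(m):
    """Location of the largest value: a stand-in for winner-take-all."""
    return max(((i, j) for i in range(len(m)) for j in range(len(m[0]))),
               key=lambda p: m[p[0]][p[1]])


# Two toy feature maps: color highlights (0, 0), orientation highlights (1, 1).
maps = {
    "color":       [[0.8, 0.0], [0.0, 0.0]],
    "orientation": [[0.0, 0.0], [0.0, 1.0]],
}

# Casual viewing: equal weights, the strongest bottom-up signal wins.
free_viewing = argmax2d(combine(maps, {"color": 1.0, "orientation": 1.0}))

# Task "look for a color": the color map is weighted up (hypothetical weights).
color_search = argmax2d(combine(maps, {"color": 3.0, "orientation": 1.0}))

print(free_viewing, color_search)
```

The same input therefore yields different attended locations depending on the weights, which is the qualitative behavior the framework attributes to behavioral goals.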








5 Proposed Computational Mechanisms

5.1 A Model of Saliency-based Visual Attention for Rapid Scene Analysis -- L. Itti, C. Koch and Ernst Niebur (1998) [11]

Inspired by the behavior and the neuronal architecture of the early primate visual system, the model combines multi-scale image features into a single topographical saliency map. This is a completely bottom-up approach.

5.1.1 Architecture of Model

Fig.5 The architecture of the Itti & Koch model (figure adapted from L. Itti, C. Koch and Ernst Niebur (1998)).









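5.1.2 Visual Preprocessing

Fig.6

Define the intensity image $I = (r + g + b)/3$ ($r, g, b$ are the color values). $I$ is used to build a Gaussian pyramid $I(\sigma)$.

For each pixel in the pyramid, generate the color channels:

$R = r - (g + b)/2$

$G = g - (r + b)/2$

$B = b - (r + g)/2$

$Y = (r + g)/2 - |r - g|/2 - b$

The yellow channel $Y$ is the one defined in [11]; negative values are set to zero.

Four Gaussian pyramids $R(\sigma)$, $G(\sigma)$, $B(\sigma)$, $Y(\sigma)$ are created from these color channels, where $\sigma \in [0..8]$ is the scale.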
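The per-pixel channel computation can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `color_channels` is hypothetical, the yellow channel and the clamping of negative values to zero are taken from Itti et al. [11], and the normalization of r, g, b by intensity used in that paper is left out for brevity.

```python
def color_channels(r, g, b):
    """Intensity and broadly tuned color channels for one pixel.

    r, g, b are floats, e.g. in [0, 1]. Negative channel responses are
    clamped to zero (assumption taken from Itti et al., 1998).
    """
    raw = {
        "R": r - (g + b) / 2.0,
        "G": g - (r + b) / 2.0,
        "B": b - (r + g) / 2.0,
        "Y": (r + g) / 2.0 - abs(r - g) / 2.0 - b,
    }
    channels = {name: max(0.0, value) for name, value in raw.items()}
    channels["I"] = (r + g + b) / 3.0
    return channels


pure_red = color_channels(1.0, 0.0, 0.0)
pure_yellow = color_channels(1.0, 1.0, 0.0)
print(pure_red)
print(pure_yellow)
```

A pure red pixel responds only in the red channel, while a pure yellow pixel responds most strongly in the yellow channel and only partially in red and green.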


5.1.3 Center-Surround Differences

Fig.7

Each feature is computed by a set of linear “center-surround” operations. The concept is that visual neurons are typically most sensitive in a small region of the visual space (the center), while stimuli presented in a broader, weaker antagonistic region concentric with the center (the surround) inhibit the neuronal response. The operation is aimed at detecting locations which locally stand out from their surround.

Center-surround is implemented in the model as the difference between fine (center) and coarse (surround) scales: the center is a pixel at scale $c \in \{2, 3, 4\}$, and the surround is the corresponding pixel at scale $s = c + \delta$, with $\delta \in \{3, 4\}$. The across-scale difference between two maps, denoted “$\ominus$” below, is obtained by interpolation to the finer scale and point-by-point subtraction.
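The across-scale difference can be sketched on a toy image. The code below makes several simplifying assumptions that are not in the model: the pyramid uses 2x2 box averaging instead of Gaussian filtering, the coarse map is brought to the finer scale by pixel replication instead of interpolation, and the helper names are hypothetical. The scales c = 2 and s = 5 follow the choice of Itti et al. [11].

```python
def downsample(img):
    """Halve the resolution by averaging 2x2 blocks (stand-in for Gaussian blur)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1] +
              img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w)] for i in range(h)]


def pyramid(img, levels):
    """Return [scale 0, scale 1, ..., scale `levels`]."""
    pyr = [img]
    for _ in range(levels):
        pyr.append(downsample(pyr[-1]))
    return pyr


def upsample_to(img, h, w):
    """Replicate pixels so that img reaches size h x w (crude interpolation)."""
    fh, fw = h // len(img), w // len(img[0])
    return [[img[i // fh][j // fw] for j in range(w)] for i in range(h)]


def center_surround(pyr, c, s):
    """|center(c) - surround(s)| computed at the resolution of scale c."""
    center = pyr[c]
    h, w = len(center), len(center[0])
    surround = upsample_to(pyr[s], h, w)
    return [[abs(center[i][j] - surround[i][j]) for j in range(w)]
            for i in range(h)]


# 32x32 dark image with one 4x4 bright patch.
image = [[0.0] * 32 for _ in range(32)]
for i in range(12, 16):
    for j in range(12, 16):
        image[i][j] = 1.0

pyr = pyramid(image, levels=5)
fmap = center_surround(pyr, c=2, s=5)
print(fmap[3][3], fmap[0][0])
```

The bright patch, which differs strongly from the coarse-scale average, gets a large response, while the uniform background gets a small one.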





Fig.8 Achieving center-surround difference through across-scale difference.

The first set of feature maps is concerned with intensity contrast, which in mammals is detected by neurons sensitive either to dark centers on bright surrounds, or to bright centers on dark surrounds. Both types of sensitivities are simultaneously computed in a set of six maps $I(c,s)$, where $\ominus$ is the across-scale difference:

$$I(c,s) = |I(c) \ominus I(s)| \qquad (1)$$

The second set of maps is similarly constructed for the color channels, which in cortex are represented using a so-called color double-opponent system: in the center of their receptive field, neurons are excited by one color (e.g., red) and inhibited by another (e.g., green), while the converse is true in the surround. Such spatial and chromatic opponency exists for the red/green, green/red, blue/yellow and yellow/blue color pairs in human primary visual cortex. Accordingly, maps $RG(c,s)$ (for red/green, green/red double opponency) and $BY(c,s)$ (for blue/yellow, yellow/blue double opponency) are created as Eq.2 and Eq.3.

$$RG(c,s) = |(R(c) - G(c)) \ominus (G(s) - R(s))| \qquad (2)$$

$$BY(c,s) = |(B(c) - Y(c)) \ominus (Y(s) - B(s))| \qquad (3)$$

Local orientation is obtained from $I$ using oriented Gabor pyramids $O(\sigma, \theta)$, where $\sigma \in [0..8]$ represents the scale and $\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}$ is the preferred orientation. Orientation feature maps, $O(c,s,\theta)$, encode, as a group, local orientation contrast between the center and surround scales:

$$O(c,s,\theta) = |O(c,\theta) \ominus O(s,\theta)| \qquad (4)$$

In total, 42 feature maps are computed: six for intensity, 12 for color and 24 for orientation.
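The count of 42 follows from enumerating the center-surround scale pairs. The short check below assumes the scale choice of Itti et al. [11] (centers c in {2, 3, 4}, surrounds s = c + delta with delta in {3, 4}) and four orientations; the variable names are only illustrative.

```python
centers = (2, 3, 4)
deltas = (3, 4)
orientations = (0, 45, 90, 135)  # degrees

# All (center, surround) scale pairs.
pairs = [(c, c + d) for c in centers for d in deltas]

n_intensity = len(pairs)                        # I(c, s)
n_color = 2 * len(pairs)                        # RG(c, s) and BY(c, s)
n_orientation = len(orientations) * len(pairs)  # O(c, s, theta)
n_total = n_intensity + n_color + n_orientation

print(pairs)
print(n_intensity, n_color, n_orientation, n_total)
```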


5.1.4 Normalization

Fig.9

The map normalization operator is denoted $N(\cdot)$. It consists of three steps: normalize the values in the map to a fixed range $[0..M]$, in order to eliminate modality-dependent amplitude differences; find the location of the map’s global maximum $M$ and compute the average $\bar{m}$ of all its other local maxima; and globally multiply the map by $(M - \bar{m})^2$.
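The effect of this operator is easiest to see on two toy maps, one with a single peak and one with two peaks of similar height. The sketch below is an illustration under assumptions: the name `normalize_map` is hypothetical, a local maximum is taken to be a pixel strictly greater than its 8 neighbors (plateaus are ignored), and the average of the other local maxima is taken as zero when there are none.

```python
def normalize_map(fmap, M=1.0):
    """Sketch of N(.): rescale to [0, M], then multiply by (M - m_bar)**2."""
    h, w = len(fmap), len(fmap[0])
    lo = min(min(row) for row in fmap)
    hi = max(max(row) for row in fmap)
    if hi == lo:
        # A flat map carries no conspicuous location.
        return [[0.0] * w for _ in range(h)]
    scaled = [[M * (v - lo) / (hi - lo) for v in row] for row in fmap]

    def is_local_max(i, j):
        # Strictly greater than every existing 8-neighbor.
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (di or dj) and 0 <= ni < h and 0 <= nj < w:
                    if scaled[ni][nj] >= scaled[i][j]:
                        return False
        return True

    peaks = [scaled[i][j] for i in range(h) for j in range(w)
             if is_local_max(i, j)]
    if M in peaks:
        peaks.remove(M)  # drop the global maximum, keep the other maxima
    m_bar = sum(peaks) / len(peaks) if peaks else 0.0
    factor = (M - m_bar) ** 2
    return [[v * factor for v in row] for row in scaled]


single_peak = [[0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0]]

two_peaks = [[0.0] * 5 for _ in range(5)]
two_peaks[1][1] = 1.0
two_peaks[3][3] = 0.9

promoted = max(max(row) for row in normalize_map(single_peak))
suppressed = max(max(row) for row in normalize_map(two_peaks))
print(promoted, suppressed)
```

The map with one dominant peak keeps its full amplitude, whereas the map with two comparable peaks is strongly attenuated, so it contributes little when maps are later added together.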


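5.1.5 Conspicuity Maps

Fig.10

The feature maps are combined into three conspicuity maps at the scale 4 ($\sigma = 4$): $\bar{I}$ for intensity, $\bar{C}$ for color, and $\bar{O}$ for orientation. This is obtained through across-scale addition, $\oplus$, which consists of reducing each map to scale 4 and point-by-point addition.

$$\bar{I} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(I(c,s)) \qquad (5)$$

$$\bar{C} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} \left[ N(RG(c,s)) + N(BY(c,s)) \right] \qquad (6)$$

$$\bar{O} = \sum_{\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}} N\left( \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(O(c,s,\theta)) \right) \qquad (7)$$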


5.1.6 Saliency Map

Fig.11

The three conspicuity maps are normalized and summed into the final input $S$ to the saliency map.

$$S = \frac{1}{3} \left( N(\bar{I}) + N(\bar{C}) + N(\bar{O}) \right) \qquad (8)$$


5.1.7 Example

Table 2. Panels: Original Image; Saliency Map; Saliency map combined with original image.


5.2 Graph-Based Visual Saliency (GBVS)

Jonathan Harel, Christof Koch and Pietro Perona (2007) [32]

The algorithm consists of two steps: first forming activation maps on certain feature channels, and then normalizing them in a way which highlights conspicuity and admits combination with other maps. The model is simple, and biologically plausible insofar as it is naturally parallelized. This model predicts human fixations more reliably than the classical algorithms of Itti & Koch [11] [12] [15].

The leading models of visual saliency may be organized into these three stages:

(s1) extraction: extract feature vectors at locations over the image plane

(s2) activation: form an "activation map" (or maps) using the feature vectors

(s3) normalization/combination: normalize the activation map (or maps, followed by a combination of the maps into a single map)


5.2.1 Forming an Activation Map (s2)

5.2.1.1 A Markovian Approach

Define the dissimilarity of $M(i,j)$ and $M(p,q)$ as

$$d((i,j) \,\|\, (p,q)) \triangleq \left| \log \frac{M(i,j)}{M(p,q)} \right|$$

Consider now the fully-connected directed graph $G_A$, obtained by connecting every node of the lattice $M$, labelled with two indices $(i,j) \in [n]^2$, with all other $n-1$ nodes. The directed edge from node $(i,j)$ to node $(p,q)$ will be assigned a weight

$$w_1((i,j),(p,q)) \triangleq d((i,j) \,\|\, (p,q)) \cdot F(i-p,\, j-q), \quad \text{where } F(a,b) \triangleq \exp\left( -\frac{a^2 + b^2}{2\sigma^2} \right).$$

$\sigma$ is a free parameter of the algorithm; its exact value has no significant effect on the results.

Thus, the weight of the edge from node $(i,j)$ to node $(p,q)$ is proportional to their dissimilarity and to their closeness in the domain of $M$. Note that the edge in the opposite direction has exactly the same weight. We may now define a Markov chain on $G_A$ by normalizing the weights of the outbound edges of each node to 1, and drawing an equivalence between nodes and states, and between edge weights and transition probabilities. The equilibrium distribution of this chain, reflecting the fraction of time a random walker would spend at each node/state if he were to walk forever, would naturally accumulate mass at nodes that have high dissimilarity with their surrounding nodes, since transitions into such subgraphs are likely, and unlikely if nodes have similar $M$ values. The result is an activation measure which is derived from pairwise contrast.




We call this approach "organic" because, biologically, individual "nodes" (neurons) exist in a connected, retinotopically organized network (the visual cortex), and communicate with each other (synaptic firing) in a way which gives rise to emergent behavior, including fast decisions about which areas of a scene require additional processing. Similarly, our approach exposes connected (via F) regions of dissimilarity (via w), in a way which can in principle be computed in a completely parallel fashion. Computations can be carried out independently at each node: in a synchronous environment, at each time step, each node simply sums incoming mass, then passes along measured partitions of this mass to its neighbors according to outbound edge weights. The same simple process happening at all nodes simultaneously gives rise to an equilibrium distribution of mass.
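The mass-passing process can be sketched on a tiny feature map. The code below is a minimal illustration, not the reference GBVS implementation: it assumes the log-ratio dissimilarity and Gaussian closeness term from the GBVS paper, requires strictly positive map values, and uses a "lazy" update (each node keeps half of its mass at every step) so that the iteration settles even on graphs where a plain random walk would oscillate. The function name and parameter defaults are hypothetical.

```python
import math


def gbvs_activation(M, sigma=1.0, iters=200):
    """Equilibrium distribution of the dissimilarity-driven Markov chain."""
    h, w = len(M), len(M[0])
    nodes = [(i, j) for i in range(h) for j in range(w)]
    n = len(nodes)

    # Edge weights: dissimilarity times closeness (no self-edges).
    W = [[0.0] * n for _ in range(n)]
    for a, (i, j) in enumerate(nodes):
        for b, (p, q) in enumerate(nodes):
            if a != b:
                d = abs(math.log(M[i][j] / M[p][q]))
                F = math.exp(-((i - p) ** 2 + (j - q) ** 2) / (2.0 * sigma ** 2))
                W[a][b] = d * F

    # Normalize outbound weights of each node to sum to 1.
    P = []
    for row in W:
        total = sum(row)
        P.append([x / total for x in row] if total > 0 else [1.0 / n] * n)

    # Each step: every node passes mass along its outbound edges.
    mass = [1.0 / n] * n
    for _ in range(iters):
        moved = [sum(mass[a] * P[a][b] for a in range(n)) for b in range(n)]
        mass = [0.5 * mass[b] + 0.5 * moved[b] for b in range(n)]  # lazy update

    return [[mass[i * w + j] for j in range(w)] for i in range(h)]


# Uniform 3x3 feature map with one deviant value in the middle.
feature_map = [[1.0, 1.0, 1.0],
               [1.0, 4.0, 1.0],
               [1.0, 1.0, 1.0]]
activation = gbvs_activation(feature_map)
for row in activation:
    print([round(v, 3) for v in row])
```

In this toy case the deviant center node ends up holding about half of the total mass, since every other node can only send its mass there.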


5.2.2 Normalizing an Activation Map (s3)

The aim of the "normalization" step of the algorithm is much less clear than that of the activation step. It is, however, critical and a rich area of study. Earlier, three separate approaches were mentioned as existing benchmarks, and also the recent work of Itti on surprise [15] comes into the saliency computation at this stage of the process (although it can also be applied to s2 as mentioned above). We shall state the goal of this step as: concentrating mass on activation maps. If mass is not concentrated on individual activation maps prior to additive combination, then the resulting master map may be too nearly uniform and hence uninformative. Although this may seem trivial, it is on some level the very soul of any saliency algorithm: concentrating activation into a few key locations.

Armed with the mass-concentration definition, we propose another Markovian algorithm as follows:

This time, we begin with an activation map $A: [n]^2 \to \mathbb{R}$, which we wish to normalize. We construct a graph $G_N$ with $n^2$ nodes labelled with indices from $[n]^2$. For each node $(i,j)$ and every node $(p,q)$ (including $(i,j)$) to which it is connected, we introduce an edge from $(i,j)$ to $(p,q)$ with weight:

$$w_2((i,j),(p,q)) \triangleq A(p,q) \cdot F(i-p,\, j-q)$$

Again, normalizing the weights of the outbound edges of each node to unity and treating the resulting graph as a Markov chain gives us the opportunity to compute the equilibrium distribution over the nodes. Mass will flow preferentially to those nodes with high activation. It is a mass concentration algorithm by construction, and also one which is parallelizable, as before, having the same natural advantages. Experimentally, it seems to behave very favorably compared to the standard approaches such as "DoG" and "NL".
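The mass-concentration effect can be checked on a small activation map. The sketch below is illustrative only: it assumes the edge weight A(p,q) times the Gaussian closeness term F from the GBVS paper, includes self-edges, and uses plain repeated multiplication to approach the equilibrium; the function name, `sigma` and the toy values are assumptions.

```python
import math


def gbvs_normalize(A, sigma=1.0, iters=200):
    """Equilibrium distribution of the activation-driven Markov chain."""
    h, w = len(A), len(A[0])
    nodes = [(i, j) for i in range(h) for j in range(w)]
    n = len(nodes)

    # Edge weight into (p, q): its activation times closeness (self-edges kept).
    P = []
    for (i, j) in nodes:
        row = [A[p][q] * math.exp(-((i - p) ** 2 + (j - q) ** 2) /
                                  (2.0 * sigma ** 2))
               for (p, q) in nodes]
        total = sum(row)
        # Outbound weights of each node are normalized to unity.
        P.append([x / total for x in row] if total > 0 else [1.0 / n] * n)

    mass = [1.0 / n] * n
    for _ in range(iters):
        mass = [sum(mass[a] * P[a][b] for a in range(n)) for b in range(n)]

    return [[mass[i * w + j] for j in range(w)] for i in range(h)]


# Weak background activation with one stronger node in the middle.
activation_map = [[0.1, 0.1, 0.1],
                  [0.1, 0.5, 0.1],
                  [0.1, 0.1, 0.1]]
normalized = gbvs_normalize(activation_map)

share_before = 0.5 / sum(sum(row) for row in activation_map)
share_after = normalized[1][1]
print(round(share_before, 3), round(share_after, 3))
```

The strongest node holds a larger share of the total mass after this step than it did in the input map, which is the concentration behavior the text describes.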



5.2.3 Example

Table 3. Panels: Original Image; Saliency Map; Saliency map combined with original image.

6 Applications

Beyond the original application of the saliency map as the stage of a control system for covert attention, it has found use in other, related areas. Perhaps the most immediate extension is to predict eye movements [33] [34]. There are numerous technical applications in which the saliency map is typically used to prioritize selection, e.g. to identify the most important information in visual input streams and to use this to improve performance in generating or transmitting visual data [33]. Even an "inverse" saliency map has been used, to de-emphasize salient image regions and to direct attention to other regions [35]. Another original application of saliency maps is to generate synthetic vision for simulated actors in virtual environments [36]. Saliency maps have also been integrated in a VLSI hardware model of visual selective attention (Indiveri 2000 [37]).














7 Reference

[1] J. K. Tsotsos (1991). Is Complexity Theory appropriate for analysing biological systems? Behavioral and Brain Sciences 14(4):770-773.

[2] A. Treisman & G. Gelade (1980). A feature integration theory of attention. Cognitive Psychology 12:97-136.

[3] F. Crick (1984). Function of the thalamic reticular complex: the searchlight hypothesis. Proceedings of the National Academy of Sciences USA 81(14):4586-90.

[4] E. Weichselgartner & G. Sperling (1987). Dynamics of automatic and controlled visual attention. Science 238:778-780.

[5] J. M. Wolfe (1994). Guided Search 2.0: A Revised Model of Visual Search. Psychonomic Bulletin & Review 1(2):202-238.

[6] Tsakanikos, E. (2004). Latent inhibition, visual pop-out and schizotypy: is disruption of latent inhibition due to enhanced stimulus salience? Personality and Individual Differences 37, 1347-1358.

[7] Kapur, S. (2003). Psychosis as a state of aberrant salience: a framework linking biology, phenomenology, and pharmacology in schizophrenia. American Journal of Psychiatry 160, 13-23.

[8] Koch, C. and Ullman, S. (1985). Shifts in selective visual attention: towards the underlying neural circuitry. Human Neurobiology 4:219-227.

[9] Clark, J.J., Ferrier, N. (1988). Modal control of an attentive vision system. Proc. ICCV, Tarpon Springs Florida, p514-523.

[10] Sandon, P. (1990). Simulating visual attention, J. Cognitive Neuroscience 2, p213-231.

[11] Itti, L., C. Koch, et al. (1998). A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence 20(11), p1254-1259.

[12] Itti, L., Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention, Vision Res 40(10-12), p1489-506.

[13] Walther, D., L. Itti, et al. (2002). Attentional selection for object recognition - A gentle way. Biologically Motivated Computer Vision, Proceedings 2525, p472-479.

[14] Navalpakkam V, Itti L. (2005). Modeling the influence of task on attention, Vision Res. 45(2), p205-31.

[15] Itti, L., Baldi, P. (2006). Bayesian Surprise Attracts Human Attention. Advances in Neural Information Processing Systems 18, 547-554.

[16] Humphreys, G., Müller, H. (1993). Search via Recursive Rejection (SERR): A Connectionist Model of Visual Search, Cognitive Psychology 25, p45-110.

[17] Zhang, L., Tong, M. H., Marks, T.K., Shan, H., & Cottrell, G.W. (2008). SUN: A Bayesian framework for saliency using natural statistics. Journal of Vision 8(7):32, p1-20.

[18] Bruce, N.D.B., Tsotsos, J.K. (2009). Saliency, Attention, and Visual Search: An Information Theoretic Approach, Journal of Vision 9:3, p1-24.

[19] Tsotsos, J.K., Itti, L., Rees, G. (2005). A Brief and Selective History of Attention, in Neurobiology of Attention, Editors Itti, Rees & Tsotsos, Elsevier Press.

[20] Desimone, R., Duncan, J. (1995). Neural mechanisms of selective visual attention, Ann. Rev. of Neuroscience 18, p193-222.

[21] Itti, L., Koch, C. (2001). Computational modeling of visual attention, Nature Reviews Neuroscience 2, p1-11.

[22] Treisman, A., Gelade, G. (1980). A feature integration theory of attention, Cognitive Psychology 12, p97-136.

[23] Usher, M., Niebur, E. (1996). Modeling the temporal dynamic of IT neurons in visual search: A mechanism for top-down selective attention, J. Cognitive Neuroscience 8:4, p311-327.

[24] Nothdurft, H.C. (2000). Salience from feature contrast: variations with texture density. Vision Research 40.

[25] J. M. Wolfe (1994). Guided Search 2.0: A Revised Model of Visual Search. Psychonomic Bulletin & Review 1(2):202-238.

[26] V. Navalpakkam & L. Itti (2005). Modeling the influence of task on attention, Vision Research 45(2):205-231.

[27] R. L. Didday (1976). A model of visuomotor mechanisms in the frog optic tectum. Mathematical Biosciences 30:169-180.

[28] P. F. Dominey & M. A. Arbib (1992). A cortico-subcortical model for generation of spatially accurate sequential saccades. Cerebral Cortex 2(2):153-175.

[29] J. M. Findlay & R. Walker (1999). A model of saccade generation based on parallel processing and competitive inhibition. Behavioral and Brain Sciences 22:661-674.

[30] F.H. Hamker (1999). The role of feedback connections in task-driven visual search, in: D. Heinke, G.W. Humphreys, A. Olson (Eds.), Connectionist Models in Cognitive Neuroscience. Springer Verlag, London, pp. 252-261.

[31] Li Z (2002). A saliency map in primary visual cortex. Trends in Cognitive Sciences 6(1):9-16.

[32] D. Parkhurst, K. Law, & E. Niebur (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research 42(1):107-123.

[33] Underwood, G., Foulsham, T., van Loon, E., Humphreys, L. and Bloyce, J. (2006). Eye movements during scene inspection: A test of the saliency map hypothesis. European Journal of Cognitive Psychology 18(3):321-342.

[34] Su, S. L., Durand, F. and Agrawala, M. (2004). An Inverted Saliency Model for Display Enhancement. In Proceedings of 2004 MIT Student Oxygen Workshop, Ashland, MA.

[35] Courty, N. and Marchand, E. (2003). Visual perception based on salient features. Proc. of 2003 IEEE/RSJ Intl. Conference on Intelligent Robots and Systems, Las Vegas, Nevada.

[36] Indiveri, G. (2000). Modeling selective attention using a neuromorphic analog VLSI device. Neural Computation 12(12):2857-80.

[37] Robinson, D. L. and Petersen, S. E. (1992). The pulvinar and visual salience. Trends in Neurosciences 15(4):127-132.

[38] Laurent Itti. Visual Salience. Retrieved June 27, 2012, from http://www.scholarpedia.org

[39] Ernst Niebur. Saliency Map. Retrieved June 27, 2012, from http://www.scholarpedia.org

[40] Salience (neuroscience). Retrieved June 27, 2012, from http://en.wikipedia.org