Machine Neuroscience Technologies






Proposed Article
By Dr. James O. Gouge
gougejo@gmail.com


1. Identification and Significance of the Problem

One theory of human object recognition posits that the visual system recognizes objects by extracting simple geometric volumes (geons) from perceived objects, analyzing their interrelations, and formulating an identification. This recognition-by-component (RBC) theory (Biederman, 1987) can explain how things can successfully be recognized in spite of changes in size, orientation, or partial occlusion.

The Machine Intelligence Technology (MIT) software is unique; it goes beyond the current state of the art in object recognition. The software works well in unconstrained environments (lighting, shadows, clutter, etc.), lends itself to sensor and information fusion tasks, has modest memory requirements, is aspect invariant, and improves its performance by parameterizing itself during use.

The MIT software described herein can perform object identification on a personal computer (PC).

Dr. James O. Gouge holds many patents, has published articles, and has extensive experience in automatic target recognition. In previous experimental work with military and medical imaging using sonar, radar, ultrasound, and electro-optical images, Dr. Gouge achieved 100% target identification in the presence of considerable noise and clutter.



The MIT software algorithms make object identification decisions using models of human cognitive processes, and this functionality can be demonstrated on a personal computer (PC).


The MIT software has no peers. The algorithm is large-scale in the sense that it is applied to image processing problems previously characterized as large-scale, and it goes beyond the current state of the art in object recognition. The MIT software highlights the following properties:

1) MIT processes any digital image, including ultrasound, radar, visible light, and infra-red light. It can be implemented with any two-dimensional imaging sensor.

2) MIT is fast. The software uses integer processing, avoiding complicated mathematical manipulations.

3) MIT has modest memory requirements. A single MIT algorithm requires a little over a megabyte of memory, not counting the image buffer.





4) MIT processing is intensity invariant. Object classifications are not affected by variations in pixel intensity.

5) MIT processing is independent of camera range. Scale does not affect the outcome of classification. (The only range limit is due to the resolution of the image. Of course, as with any image processing approach, a sufficient number of pixels is required to identify an object.)

6) MIT is aspect angle insensitive. Rotation does not affect the outcome of classification. The algorithm is designed to be independent of in-plane rotations. Out-of-plane rotations are handled by a combination of invariant design and parameterization with real image data.

7) MIT requires no preprocessing, such as binary thresholding or edge detection. Since there is no preprocessor, there is no chance that critical information will be deleted before classification is attempted.

8) MIT uses no heuristics or mathematical models, eliminating the chance that an assumption will be invalid in any particular case. Since the algorithm is parameterized with real imagery, its operation can never be inconsistent with reality.

9) MIT requires no external information to classify an object. External sources of orientation or position information are not needed by the algorithm.

10) MIT can reject irrelevant features and objects, so-called "clutter". This results in patterns that are independent of background and independent of features that can vary. The software learns to discriminate between relevant and irrelevant features during its parameterization with real imagery, so no heuristics are implied here.

11) MIT can classify objects that are partially obscured or covered by a shadow. The edge of the shadow will not result in an incorrect classification.

12) MIT delineates object borders precisely. These precise borders can be used by another software module to characterize and identify the object.

13) MIT recognizes obscured objects by rejecting irrelevant features. There is even the potential for separating two overlapping objects using a novel technique we call Monoscopic Stereography (MS).

14) MIT output is a compact representation. The resultant feature class image includes only a few relevant class values. Everything that is irrelevant is set to a gray color.

15) MIT concentrates on classifying objects with single images. This ensures that identification of the object can be just as accurate as continual tracking of the object.

16) The MIT algorithm learns from doing; it can parameterize itself during the tracking of objects, resulting in improved performance over time.


Background

The existing state of the art in image processing relies heavily on brittle algorithms that can be confused by normal variables inherent in real-world images, namely three-dimensional rotational changes, variations in illumination, differences in scale and resolution, aspect ratio changes, and shadowing. These variables pose no major problems to biologic visual systems. The proposed approach uses the Gouge Transformation algorithms in concert with an architecture that emulates the biological vision processes, representing a new and revolutionary technology for robotic vision and object identification.





MIT is uniquely qualified to automate object detection and classification in the multi-imaging integrated sensor environment of robotic control and intelligence. The software can optimize still frame and video imagery resolution, and maximize the automation of detection, acquisition, classification, and tracking processes.

Machine Intelligent Processor (MIP)

MIT is made up of three major parts: the quantifier algorithms, the Gouge Transform algorithms, and the synthetic cerebral cortex (SCC) (see flow path below).



1.1.1 The Quantifiers

The quantifiers operate on the image. They comprise algorithms designed to quantify the spatial relationships and amplitude (luminance) of pixels contained in a processing "kernel". A kernel is an "n x n" group of pixels, which is moved across an image in a stepwise fashion from left to right and top to bottom, covering the total area of interest in an image. Kernels, in a sense, are "windows" that are used to transform pixel spatial relationships and luminance amplitudes into abstract digital forms. These "quantifiers" remain invariant regardless of orientation, shadowing, or other variations.
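
To make the kernelling idea concrete, the sketch below (illustrative Python, not the proprietary MIT code; the function names and the rank-order quantifier are assumptions) slides an n x n window across an image and computes one simple quantifier per window.

    import numpy as np

    def kernel_windows(image: np.ndarray, n: int, step: int = 1):
        """Yield every n x n kernel of a 2-D image, stepping left to
        right and top to bottom, as described above."""
        rows, cols = image.shape
        for r in range(0, rows - n + 1, step):
            for c in range(0, cols - n + 1, step):
                yield r, c, image[r:r + n, c:c + n]

    def rank_quantifier(kernel: np.ndarray) -> tuple:
        """Stand-in quantifier: the rank order of the kernel's pixels.
        Unchanged by uniform brightness shifts, a crude nod to the
        intensity invariance claimed in the text."""
        return tuple(kernel.ravel().argsort())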

1.1.2 The Gouge Transforms

The Gouge Transforms are eight algorithms that are fundamental to the overall system architecture. These algorithms effectively transform image features into a group of unsigned integers, which are used to address the synthetic cerebral cortex during the learning phase and/or the analysis phase of image processing. They provide repeatable results even when variations exist between the image being analyzed and the images used to train the analysis system.
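
The eight transforms themselves are not disclosed here, so the following is only an interface sketch: it folds a quantifier tuple into a fixed-width unsigned integer of the kind the text says is used to address the SCC. The folding constant and the 20-bit width are arbitrary assumptions.

    def transform_address(quantifier_output: tuple, bits: int = 20) -> int:
        """Fold quantifier output into an unsigned integer SCC address.
        Deterministic, so identical inputs always address the same
        cortical element."""
        addr = 0
        for v in quantifier_output:
            addr = (addr * 31 + int(v)) & ((1 << bits) - 1)  # unsigned, fixed width
        return addr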

[Flow path: 2-D Image -> Multi-Resolution Kernelling -> Quantifier Computation (Quantifiers: pixels) -> Gouge Transformation (features, relations) -> Synthetic Cerebral Cortex (Relational Processing, Learned Feature Mapping) -> Identified and Attributed Features -> Action Element and/or Display]




1.1.3 The Synthetic Cerebral Cortex (SCC)

Abstract information from the Gouge Transform algorithms is further processed in the synthetic cerebral cortex. The SCC architecture consists of the same three functional parts that exist in biological life forms. The first part corresponds to the single-layer striate cortex. There, the integers from the transform algorithms feed the next layer, analogous to the biological intermediate cortical region. Interconnects are generated during the relational learning process as needed to create pathways (digital synapses). The relational interconnections represent stored knowledge. In other words, learned knowledge is captured in the pathways themselves.

In the image analysis and learning process, the intermediate elements are connected to terminal action elements. One action element can generate new pixel information that, when overlaid onto the original image, highlights the specific learned features of objects. Other terminal action elements could link to programs that initiate, for example, tracking functions or weapon release actions.

MIT forms synaptic correspondences pseudorandomly in the manner of the biologic model. That means the process is repeatable but not predictable. The number of required synaptic interconnections is unknown initially and is established during the learning process; the number expands or reduces as needed during training.

MIT results in exceedingly flexible target recognizers that are neither brittle nor mathematically predictive and restricted, but the process is deterministic in the sense that the same input always results in an identical output.
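
One way to picture the "digital synapses" is as an associative table whose entries are created only when training demands them. The class below is an illustrative stand-in under that assumption, not the actual SCC design.

    class SyntheticCerebralCortex:
        def __init__(self):
            # address -> knowledge element (descriptive name, action code);
            # pathways exist only where learning created them.
            self.synapses = {}

        def learn(self, address: int, name: str, action_code: int = 0):
            self.synapses[address] = (name, action_code)  # grow a pathway

        def recall(self, address: int):
            # Deterministic: the same input always yields the same output.
            return self.synapses.get(address, ("unknown", 0))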

1.1.4 Description of Approach

MIT functionally emulates biologic vision architecture. Its quantifiers encode pixel luminance and spatial information, its Gouge Transformation extracts and quantifies feature attributes, and its SCC forms feature relationships. The present research formulates the transform algorithms and will integrate them into a process flow.

Three aspects of Dr. Gouge's software approach make it a unique image analysis tool and contribute to real-time performance and identification accuracy. These functionalities are 1) software learning, 2) error correction, and 3) speed critical computing.

1.1.4.1 Software Learning

An analyst initially solves the identification problem. To initialize the learning process, feature recognition routines called "quantifiers" cast target pixel patterns into digital form. The software "learns" a similarity metric by associating the quantifiers with identified feature attributes. Once initialized, the software is presented with additional target images and builds a growing set of feature identifications. Algorithm performance will improve with each presentation of images. This process is continued until the space of possible correspondences is sufficiently covered. If insufficient images are available during this "training" phase, the algorithm can learn during actual operation.
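
A minimal sketch of this initialization pass, reusing the illustrative helpers above (kernel_windows, rank_quantifier, and transform_address are assumptions, not the real routines): every kernel inside an analyst-labeled region is quantified, transformed to an address, and associated with the analyst's label.

    def train_on_image(scc, image, labeled_regions, n=4):
        """labeled_regions maps a feature name to a (row0, row1, col0, col1)
        area of interest circumscribed by the analyst."""
        for name, (r0, r1, c0, c1) in labeled_regions.items():
            for _, _, kernel in kernel_windows(image[r0:r1, c0:c1], n):
                scc.learn(transform_address(rank_quantifier(kernel)), name)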






1.1.4.2 Error Correction

Many image processing algorithms fail in unconstrained environments because they are not capable of error correction. MIT incorporates this capability through recursive processing. Recurrent operation is a way to produce a clear delineation of feature classes and fill in shadowed or blank feature areas. This important processing step, either ignored or addressed with ad hoc methods by others, "cleans up" unavoidable "errors" and ambiguities. In this way, MIT implicitly includes error correction.
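
In outline, the recurrent mode can be read as a fixed-point iteration: re-run a classification pass on its own class image until nothing changes. The sketch below assumes numpy arrays and a caller-supplied classify_pass function; it illustrates the control flow, not the MIT implementation.

    import numpy as np

    def recurrent_cleanup(classify_pass, class_image, max_passes=10):
        for _ in range(max_passes):
            refined = classify_pass(class_image)
            if np.array_equal(refined, class_image):  # no changes indicated
                break
            class_image = refined
        return class_image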


For example, consider the image of a monochromatic tropical water scene (Figure 1). The first stage process annotates the features of interest in color. A recursive second stage reprocesses the image to remove ambiguities. These steps can be repeated until no changes are indicated.




Figure 1: Two-stage process for classifying image features



Figure 1 demonstrates a recurrent mode where the image is re-run through the algorithm. The software annotates features of interest in color. In the top right image, blue corresponds to 'water' and 'water over sand.' Yellow corresponds to coral. Black corresponds to rock. Cyan corresponds to sky. White corresponds to clouds. Green corresponds to 'leaves.' Brown corresponds to tree bark. Magenta corresponds to borders between the coral and the water and borders between the rock and the water.


[Figure 1 panels: ORIGINAL IMAGE; FIRST STAGE IMAGE; GRAY SCALE RESULT; SECOND STAGE IMAGE. Callouts: Regions with ambiguous or wrong classifications show patterns that can be classified again. Multiple passes draw information from further away to disambiguate and separate classes. Untrained patterns could be classified in a third stage as "water-over-rock."]





In this example, the classification result of the original image is "noisy" and filled with "errors." Some classifications are wrong due to "collisions" created by the many-to-few mapping of the algorithm. Some classifications are correct but are not what one would expect. For example, there are classifications of "bark" mixed in with the leaves of the tree. This is understandable because parts of the leaves are made of the same material as the bark. It is also noted that, initially, portions of the shadow that are of non-constant intensity are classified as leaves. This is understandable because the shadow is the projection of the leaves onto the water.
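
The collisions follow from simple counting, as the toy arithmetic below shows (the 20-bit address width is the same arbitrary assumption used earlier): there are vastly more distinct kernel patterns than SCC addresses, so unrelated patterns must occasionally share an address and, until retraining, a label.

    patterns = 256 ** (4 * 4)        # possible 8-bit 4 x 4 kernels
    addresses = 1 << 20              # assumed 20-bit SCC address space
    print(patterns // addresses)     # average distinct patterns per address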


The recursive second stage process and subsequent stages clean up these "errors" and even identify additional features. For example, the classifier can initially distinguish water and rocks. However, there is no distinct pattern for "water over rocks." By the end of the second stage, however, it is clear that there are unique patterns appearing where there is water over the rock. The software has mapped these patterns to the "unknown" class, shown in gray because it was not told how to classify and color those patterns.


1.1.4.3 Speed Critical Computing

MIT is designed around speed critical binary computing. All arithmetic processes are performed on binary boundaries, requiring only single operations to perform additions, subtractions, multiplications, and divisions (a multiplication is a shift in one direction, a division is a shift in the other direction). Complex mathematics is avoided to keep computational overhead low.
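
For example, multiplying or dividing by a power of two is a single shift operation:

    x = 52
    doubled = x << 1    # multiply by 2: one left shift
    halved  = x >> 1    # integer divide by 2: one right shift
    scaled  = x << 3    # multiply by 8 (2**3), still a single operation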

Research Tasks

The objective of Dr. Gouge's research program was to develop the MIT image processing system and demonstrate target identification on sample digital radar images. The following tasks were performed in this research effort:

Select quantifier computations
Implement the Gouge Transform algorithms
Select representative images for training and demonstration
Train software to identify target set
Demonstrate target identification using representative images
Write Final Report


1.1.5 Task 1: Select Quantifier Computations

Dr. Gouge selected appropriate quantifier algorithms with the objective of maximizing MIT performance for the identification and classification task. The key is to find one or more quantifiers that are uniquely capable across a large set of image modalities (IR, radar, EO, etc.). We currently have a "toolbox" of quantifiers that has been shown to be very generic on the whole. Each quantifier can fail when used alone, but when several are used together, consistent feature extraction results.
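
One plausible reading of "used together" (an assumption; the proposal does not specify the combination rule) is to concatenate the outputs of several quantifiers on the same kernel so that no single quantifier's failure dominates the extracted feature:

    def combined_signature(kernel, quantifiers):
        """Run each quantifier from the toolbox on the same kernel and
        concatenate the results into one feature signature."""
        parts = []
        for q in quantifiers:          # e.g. [rank_quantifier, ...]
            parts.extend(q(kernel))
        return tuple(parts)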





1.1.6 Task 2: Implement Gouge Transform Algorithms

The Gouge Transforms establish relationships between image features of interest and their attributes. For instance, arcs and curved lines are detected and quantified by means of a spiral scan of a kernel "window", and straight lines by means of a raster scan. These scans are illustrated for a 4 x 4 pixel kernel in Figure 2 below. Other scan patterns detect and quantify object corners and intersecting lines.






Figure 2: Gouge Transform scan patterns
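
The exact scan geometry of Figure 2 is not reproduced here, so the sketch below generates one plausible pair of scan orders for an n x n kernel: a raster order for straight lines and a clockwise inward spiral for arcs and curves.

    def raster_scan(n: int):
        """Raster order: left to right, top to bottom."""
        return [(r, c) for r in range(n) for c in range(n)]

    def spiral_scan(n: int):
        """Clockwise inward spiral over an n x n kernel."""
        order, top, bottom, left, right = [], 0, n - 1, 0, n - 1
        while top <= bottom and left <= right:
            order += [(top, c) for c in range(left, right + 1)]
            order += [(r, right) for r in range(top + 1, bottom + 1)]
            if top < bottom:
                order += [(bottom, c) for c in range(right - 1, left - 1, -1)]
            if left < right:
                order += [(r, left) for r in range(bottom - 1, top, -1)]
            top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
        return order

    print(spiral_scan(4))   # 16 coordinates tracing the 4 x 4 kernel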

The transform algorithms are fundamental to the overall system architecture. The algorithms will be encoded for application and demonstration on a PC. The transform algorithms associate abstract quantifiers of pixel luminance data and spatial relationships, and change this information into a relational domain that remains stable regardless of orientation, shadowing, or other variations. The Gouge Transforms effectively represent the learned knowledge base.

1.1.7 Task 3: Select Images for Training and Testing

Representative images will be selected for training and testing. Dr. Gouge has access to a library of images from Unmanned Aerial Vehicle (UAV) flight-test programs. These images and/or representative customer-provided digital images will form the basis for demonstrating the accuracy of object identification and classification.







1.1.8 Task 4: Train Software to Identify Object Set

The algorithms are trained to identify one or more objects in a cluttered environment and to reject artifacts. The software must be properly parameterized (trained) to perform identification and classification. Unlike neural network training, the training process is almost instantaneous once the software is presented with the training set. The trained knowledge base forms the set of correlations between pixel relational features and targets. Training proceeds as follows:

1. The analyst circumscribes a desired feature or object with an area of interest. The area is segmented by the kernelling method, which transforms feature pixel luminance and spatial relationships into abstract numerical information.

2. The quantified feature information is evaluated by the Gouge Transformation in a spiral or linear fashion that extracts feature attributes.

3. The SCC relates feature attributes to knowledge elements that assign each feature a descriptive name and an optional action code. The action code specifies actions to be taken when a knowledge element is excited, and directs the machine to utilize the action value in some predetermined manner (see the sketch below).
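
A hedged sketch of the action-code idea in step 3 (the codes, handlers, and dispatch table are all hypothetical, not taken from the proposal): when a knowledge element is excited, its action code selects what the machine does with the value.

    def highlight_feature(name):   # e.g. overlay new pixel information
        print("highlight:", name)

    def start_tracking(name):      # e.g. hand off to a tracking function
        print("track:", name)

    ACTIONS = {0: lambda name: None,      # no action
               1: highlight_feature,
               2: start_tracking}

    def on_excited(knowledge_element):
        name, action_code = knowledge_element
        ACTIONS.get(action_code, lambda n: None)(name)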


The Phase II effort will focus on automating the knowledge capture and learning process.

MIT will yield accurate classification independent of approach angles. Since the quantifiers are invariant to in-plane rotations and insensitive to out-of-plane rotations, a discrete set of out-of-plane training examples, each from a different combination of rotation angles, will suffice for the algorithms to "learn" what a target looks like from every possible approach angle. The required set of training examples will be considerably smaller than a set needed for template-based training.


1.1.9 Task 5: Demonstrate Target Identification

Dr. Gouge will demonstrate object identification and classification using in-house radar imagery and/or representative customer-provided digital images. This demonstration will be performed on a laptop PC running the Windows operating system.


1.1.10 Task 6: Final Report

A final report including findings, drawings, and specifications developed under Tasks 1-5 of the Phase I program will be prepared.


1.1.11 Option: Task 7: Periscope Image Processing Demonstration

This optional task moves the feasibility demonstration to a higher level and prepares a seamless transition to the Phase II effort. MIT will be trained with actual Navy-provided imagery of sea and land targets and features. MIT processing will be evaluated on tactical imagery from representative periscope sensors, and deficiencies in its capabilities will be noted for correction in the follow-on Phase II effort. Integration of MIT into operational flight software will be addressed. A demonstration of MIT combat identification and classification potential will be negotiated at a Weapon System Support Activity site or software development facility. A report describing the results of this task will be delivered after concluding the 6-month effort. We propose the following tasks:

1. Select 10 to 20 digital images containing diverse poses of a demonstration target for training

2. Perform training function on the demonstration target. This process is repeated for each of the multiple images to ensure generalization of the identification process to imaging variables such as out-of-plane rotation angle. (In-plane rotations are handled by rotation-invariant quantifiers.)

3. Perform second and subsequent iterations of the training to resolve conflicts

4. Interface MIT with representative operational flight software

5. Demonstrate MIT on tactical target images

6. Evaluate performance against selected criteria, note deficiencies for follow-on development

7. Write Summary Report