Vision and Navigation of Marsokhod Rover


Marina Kolesnik

Space Research Institute

84/32 Profsoyuznaya St., Moscow, 117810 Russia


The exploration of the Martian surface with the help of a long-duration rover is planned as a part of the international space project "Mars-98". In order to provide the rover autonomy, we have developed a path generation algorithm that makes use of 3-D stereo reconstruction. Two main tasks are solved for successive obstacle avoidance: (1) recognition of the visible terrain in front of the rover; and (2) safe path generation and execution. The area-based stereo reconstruction algorithm [11], which combines the pyramidal data structure and the dynamic programming technique, has been used for the recognition of the local environment. Prohibited areas are identified on the elevation map with regard to the rover's locomotion capabilities to overcome them. The safest path generation is based on the Dijkstra algorithm applied to the non-obstacle areas. The computational complexity and memory requirements of the algorithm developed meet the implementation constraints of onboard real-time processing. We also provide the results of the tests which have been carried out in sandy and rocky sites (Kamchatka, Russia, 1993; Mojave desert, California, 1994; Tarusa, Russia, 1994) to prove the robustness of the vision-guided system.

Keywords: stereo vision system, image pyramid, dynamic programming technique, elevation map, directed graph, Dijkstra algorithm.





The heart of a vision-based navigation system is stereo reconstruction of the surface relief. Extensive research experience in the field of stereo analysis has been accumulated worldwide. The known algorithms for passive stereo matching can be classified in two basic categories:

1. Feature-based algorithms. These algorithms [8, 10, 12] extract features from the images, such as edges, segments, contours, or separate points (markers), and then match the corresponding features on the left and right images. The matching stage of all these algorithms is computationally fast, because only a small subset of the image pixels is used; however, in general, the process of feature extraction is time consuming. Another drawback is that the algorithms of this class may fail if the primitives cannot be reliably determined in the image pair. In particular, edge segment extraction is quite sensitive to any brightness distortion, as well as to imbalance of the vision system parameters. Furthermore, feature-based methods usually yield only sparse depth maps, which is unacceptable for solving the path planning task.

2. Area-based algorithms. Assuming that the left and right images of a pair are locally similar to each other, one can find a strong correlation between small gray-level areas on the different images [5, 9]. The underlying assumption appears to be a valid one for relatively textured areas; however, it may prove wrong at occlusion boundaries and within featureless regions.
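The area-based idea can be sketched as follows (our own illustration, not the paper's OCCAM code): each 3x3 window of a left-image row is compared against shifted windows of the right image, and the shift with the smallest sum of absolute differences (SAD) is taken as the disparity. The window size and the SAD criterion are assumptions.

```python
def match_row(left, right, y, max_disp, w=1):
    """Area-based matching of one image row: for each pixel of the left
    image, find the disparity d whose (2w+1)x(2w+1) window minimises the
    sum of absolute differences (SAD) against the right image."""
    width = len(left[0])
    disps = []
    for x in range(w, width - w):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            if x - d < w:                    # window would leave the image
                break
            cost = sum(abs(left[y + dy][x + dx] - right[y + dy][x - d + dx])
                       for dy in range(-w, w + 1) for dx in range(-w, w + 1))
            if cost < best_cost:
                best_cost, best_d = cost, d
        disps.append(best_d)
    return disps
```

On a synthetic pair in which the right image is the left one shifted by two pixels, the routine recovers disparity 2 wherever the true match lies inside the search range; as the text notes, it degrades on featureless or occluded regions, where several windows have nearly equal cost.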

The algorithms of both categories often use special methods to improve the matching reliability, such as the area-based correlation process [14]. The path planning comprises the following steps:

1. stereo matching to obtain the 3-D coordinates of the visible terrain points;





2. interpolation of these 3-D values to get a regular surface grid of heights (Digital Terrain Model);

3. elaboration of a navigation map of the terrain with several classes (flat, traversable, unknown) using two local thresholds for slope and height discontinuity;

4. path generation by observing a distance margin from the rover to the obstacles.

This interesting solution includes, however, the computationally expensive step (2), which is not really necessary for the motion. The last step (4) could also be simplified by minimizing the local risk for the rover.
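Step (3) above can be sketched as a per-cell classification of an elevation grid. The class names follow the text; the numeric thresholds, the cell size, and the 4-neighbour rule are illustrative assumptions.

```python
def navigation_map(elev, max_step=0.30, cell=0.25, max_slope=0.35):
    """Classify each cell of an elevation grid (metres; None = not visible).
    A cell is 'obstacle' when the height discontinuity to some 4-neighbour
    exceeds max_step; 'traversable' when the local slope (rise over the
    cell size) exceeds max_slope but the step is passable; cells without
    data stay 'unknown'; the rest are 'flat'."""
    rows, cols = len(elev), len(elev[0])
    out = [["unknown"] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            h = elev[i][j]
            if h is None:
                continue                      # invisible cell
            label = "flat"
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols and elev[ni][nj] is not None:
                    dh = abs(h - elev[ni][nj])
                    if dh > max_step:         # height discontinuity threshold
                        label = "obstacle"
                        break
                    if dh / cell > max_slope:  # slope threshold
                        label = "traversable"
            out[i][j] = label
    return out
```

A cell bordering a 0.5 m drop is labelled an obstacle, a 0.12 m rise over a 0.25 m cell is merely traversable, and cells with no stereo data remain unknown.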

The special issue in vision-guided navigation is the design of a relatively stable and fast algorithm for the stereo reconstruction which does not need large memory resources. The accuracy of the reconstructed 3-D shape need not be very high, because we need to detect only those obstacles whose vertical dimension is bigger than 30 cm. The time consumption of our stereo reconstruction algorithm is quite small due to the simple features being used and the combination of the image pyramid and dynamic programming techniques. The path is calculated in the spatial domain based on the map describing the distance to visible points, instead of using the Digital Terrain Model (heights) on the real surface grid. We emphasize visibility, because it is useless to investigate the invisible (unknown) regions in front of the rover, as the rover will not enter these areas. The Dijkstra algorithm [3] is applied to minimize a local risk for the rover along the path. These steps are believed to be much faster than other methods widely used.

This paper consists of three parts. In the first part we describe the stereo reconstruction and navigation algorithms. A brief description of the onboard computer and an analysis of the rover stereo vision system parameters are given in the second part. The experiments and processing results are presented in the third part. The algorithm performance compared to a known hardware-based solution is given in the conclusion.

1. Navigation Algorithm

The autonomous rover progression consists of the following steps to be repeated in a cycle:

recognition of the visible terrain in front of the rover (stereo reconstruction and obstacle detection);

safe path generation on the elevation map;

execution step by traversing along the path;

The general principle applied to match points in the right and left images is correlation. It consists of comparing the gray-level values of the images over a small (3x3 pixels) local window centered on each point of the left image, to find the most similar window on the right image. The disparities (parallaxes: the pixel shift from the left to the right image) thus obtained are then used for the reconstruction of the distance between the rover and the points of the terrain surface, based on the given camera geometry.

The stereo matching process [11] is implemented under the following assumptions concerning the stereo images and the surface to be reconstructed:

the original images are always noisy due to geometrical and photometric distortions;

A fast procedure is used to extract the brightness features which are more stable with respect to this noise. The parallaxes found at the coarse level of the image pyramid are used as the initial shift values while passing to the higher resolution layer. Local correlation analysis is combined with the dynamic programming method to match the corresponding points. Finally, the parallax map is recalculated to the real-scale distance map.
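Once the parallax map is available, conversion to range is plain triangulation. The sketch below uses the focal length (12 mm) and stereo basis (50 cm) quoted later in the paper; the pixel pitch is our assumption, derived from a 7 mm CCD width spread over the 128-pixel working resolution.

```python
F_MM = 12.0           # focal length, from the paper
BASE_MM = 500.0       # stereo basis (50 cm), from the paper
PIX_MM = 7.0 / 128    # assumed pixel pitch: 7 mm CCD over 128 columns

def disparity_px(range_mm_val):
    """Disparity (in pixels) of a point at the given range: d = f*B/Z."""
    return F_MM * BASE_MM / range_mm_val / PIX_MM

def range_mm(disp_px):
    """Inverse mapping: range recovered from a measured disparity."""
    return F_MM * BASE_MM / (disp_px * PIX_MM)
```

Because d is inversely proportional to Z, nearby terrain produces large, easily measured disparities, while the per-pixel range resolution degrades quadratically with distance, which is consistent with the paper's choice to detect only obstacles taller than 30 cm.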





At the first step the distance map is recalculated to create an elevation map with respect to the horizontal plane which is under the rover. The actual values of the rover inclinations (roll and pitch angles) are taken into account. The obstacles are detected on the elevation map according to the rover locomotion capacities to get over them. The regions which are detected as obstacles represent the prohibited zones in the rover's field of view. At the second step the path from the start point to the feasible destination points is generated by applying the Dijkstra algorithm to the elevation map. To implement this step, we consider those pixels which do not belong to the obstacles as the nodes of a directed graph. The start point of the path (see fig. 1) is considered as the graph origin. The length of the graph edge connecting one node with another is a positive number defined in terms of the corresponding values of the elevation map.

Fig. 1. Path planning task: directed graph on the image field.

As usual, the length of a particular path which joins any two given vertices of the graph is defined as the sum of the weights of the edges composing the path. Because the real path must follow continuously through the image field, the graph edges can be connected in the following way: each node in an image row can be connected only with the three nearest nodes from the previous row (see fig. 1). Under this restriction, the number of operations for searching the shortest path from the start point to the destination points is O(N), where N is the image size. The number of operations required in the calculation described above is strictly less than the number of operations used in the Voronoi Diagram method, which is implemented in paper [16].

Finally, a virtual destination is selected as a target point keeping the global rover displacement within the direction defined by the mission task. The whole path is then reconstructed from the target to the start position according to the best direction stored for each graph node.
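Under the row-connectivity restriction of fig. 1, the shortest-path search degenerates into a single sweep over the image rows, with back-pointers for reconstructing the path from the target to the start. The sketch below is our reconstruction: the edge weight is abstracted into a nonnegative per-node "risk" value, which is an assumption, since the paper defines the weights through the elevation map.

```python
def safest_path(weight):
    """Minimum-accumulated-weight path across a grid in which a node of
    row i may only be reached from the three nearest nodes of row i-1
    (the restricted graph of fig. 1).  weight[i][j] >= 0 is the local
    risk; returns the column index chosen in every row."""
    rows, cols = len(weight), len(weight[0])
    INF = float("inf")
    cost = [weight[0][:]]                  # accumulated cost per node
    back = []                              # best predecessor column per row
    for i in range(1, rows):
        crow, brow = [], []
        for j in range(cols):
            best, arg = INF, -1
            for pj in (j - 1, j, j + 1):   # three nodes of the previous row
                if 0 <= pj < cols and cost[i - 1][pj] < best:
                    best, arg = cost[i - 1][pj], pj
            crow.append(best + weight[i][j])
            brow.append(arg)
        cost.append(crow)
        back.append(brow)
    # pick the cheapest node in the last row, then walk the path backwards
    j = min(range(cols), key=lambda c: cost[-1][c])
    path = [j]
    for i in range(rows - 2, -1, -1):
        j = back[i][j]
        path.append(j)
    path.reverse()
    return path
```

Each node is relaxed exactly once, so the operation count is linear in the number of pixels, matching the O(N) estimate given in the text.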

2. Rover Stereo Vision System

The rover stereo vision system consists of two cameras and an onboard computer providing image capturing/processing facilities.

Fig. 2. Rover stereo vision system.

The stereo cameras were installed on the rover in such a way as to enable them to analyze the nearest rover environment (fig. 2). The blind area is within approximately 2 m from the rover. The cameras are installed on a vertical rack of 1 m. The cameras' inclination toward the horizon is 10 degrees. A rather large stereo basis (50 cm) made it possible to process stereo pairs with the resolution of 128x128 pixels. This is enough to recognize major obstacles during the rover motion: the difference between the parallax values corresponding to the top and the bottom of a stone of 30 cm height is equal to 3 pixels from a distance of 14 m from the rover's center.

In order to provide the autonomy of movement, control and timing of experiments, data collection and storing, etc., the rover is equipped with an onboard computer based on the 32-bit T805 transputer from INMOS Corporation [17], which can be regarded both as a special-purpose (i.e. image processing) and a general-purpose processor. Major characteristics of the IMS T805 are:

32-bit internal and external architecture.

30 MIPS (peak) instruction rate.

4 Kbyte on-chip RAM, directly addressable.

Internal timers.

4 fast serial links (10 Mbit/sec).

Less than 1 watt power consumption at 30 MHz.

The heart of the onboard computer is four transputer modules, which are exact copies of each other both electrically and even mechanically. There is no distinguishable difference among them as far as the access to the peripheral blocks is concerned, but, and this is a substantial point, only two out of the four transputer modules are powered at a time. Which two is determined by the actual state of the overswitch logic [2]. The transputer system has access to 256 Kbytes of local memory (upgradable).

Since the stereo matching process is based on the assumption of epipolar geometry of the original stereo images, the optical axes of the cameras must be strictly parallel to each other. Let us estimate the accuracy required for the alignment of the cameras, assuming that the vertical parallax for the corresponding points must stay within one pixel. Let us suppose that the world coordinate system coincides with the left camera position (fig. 3). Let us denote the camera viewing angles as a pan angle related to the Y direction, a tilt angle related to the X direction, and a roll angle related to the Z direction; small variations of these angles with respect to the parallel optical axes cause a pixel mismatching that can be calculated from the camera focal length and the difference between the ideal and the real projections of the point.

The focal length of the cameras installed on the rover equals 12 mm; the size of the CCD matrix is 7 mm x 3.5 mm. As it follows from the above, the most significant contribution to the vertical mismatch (which is crucial for the matching process) comes from the variation of one of these angles: if the mismatch is to stay within 1 pixel, that angle must be kept within about 5 arcsec.

It is still a question whether it is possible to keep the camera axes parallel with an accuracy of 5 arcsec after the rover landing. In addition, the temperature variations on the planetary surface will lead to thermal deformation of the imaging system. That is why an additional calibration of the stereo vision system may be necessary after the rover landing. One way to calibrate the system can be based on matching a set of points put on a target. Such points should be placed uniformly within the viewing field of the cameras. A search for the correspondences is performed along the horizontal and vertical directions on Earth. Later on, the calculated map of vertical parallaxes can be used onboard to compensate the vertical (y-direction) divergence for every stereo pair taken. Such preprocessing of the original images will prevent the occurrence of local area corruption in the matching process. The errors in the recalculated distance map due to the uncertainty in the camera orientation are negligible with respect to the rover positioning accuracy.

Fig. 3. The scheme of the cameras' orientation in the stereo vision system.
3. Experiment and Processing Results

Marsokhod consists of a chassis with 6 wheels, each of which is articulated so that it can be turned forward or backwards on each side of the central joint. The chassis holds onboard power and computing capabilities, and is equipped with stereo cameras mounted on a central mast.

Two basic principles underlie the autonomous locomotion:




The position of the rover during the motion should be risk-free at each moment of the operation. In an unexpected situation the autonomous movement can be terminated to return the rover under remote control.

Usually the rover operates within the normal path range of 10-12 m. In an emergency case while traversing along the generated path, if an obstacle is detected by the rover sensors (meaning that the actual rover inclination becomes higher than the threshold), the rover stops and then performs a one-step rotation in the direction away from the obstacle. At the next step the system searches for a path of about 3-4 m to avoid the obstacle. If the path generation is unsuccessful, the rover rotates one step further. If a path cannot be found during a complete turn-around, then the guidance returns to manual handling (i.e. to remote control). Every time the path is successfully executed, the rover restarts the execution of the motion scheme from the beginning, i.e. returns to the normal path range operations.

Fig. 4 shows the successive steps of the image processing needed to generate a path, which is highlighted in the left image (4c). In fig. 4b the prohibited area is shown in white; this is actually a deep pit in the ground. Areas shaded in black are safe for the rover motion. The start-off point of the generated path is located at 2.9 m in front of the rover. Fig. 5 demonstrates the processing result in another test site. A relatively flat sandy terrain with a smaller pit in the right-down corner of the image is shown in fig. 5a. The brightness of the pixels within the traversable area (5b) is proportional to the elevation values at the appropriate points of the scene. The beginning of the path shown is at 2.2 m distance from the rover.

Fig. 6 and 7 illustrate the navigation system's capability to work stably under different illumination. The stereo pair (fig. 6) shows a lava field in Kamchatka close to the volcano Tolbachik. Both stereo images look gray because of the black wet volcanic dust (it was raining during the test). The stereo pair (fig. 7) has been taken during the test in the Mojave desert (California, 1994). These images look rather different from the previous ones because of the bright sun. The overilluminated sandy areas make it somewhat hard to detect the correspondences; nevertheless, a safe path is generated.

Conclusion

In this paper we have discussed the concept of a stereo vision system and the navigation algorithm for an autonomous planetary rover. Both the good quality of the test results and the high performance of the software implementation have demonstrated the feasibility of real-time automatic navigation. The robustness of the algorithm developed has been proved in a number of experiments including different types of terrain (sandy and rocky scenes) along with a wide range of illumination. All software is written in the OCCAM language, taking into account the onboard computer architecture and its limitations. The navigation software has also been tested alone on a commercially available add-on transputer card for a PC. The overall processing time on such a card equals 10 sec (image resolution is 128x128 pixels) versus 1.3 min in onboard processing. The difference is due to the rather small local memory installed in the onboard computer. There is a known hardware solution (the MD96 board) for real-time correlation-based stereo that emerged from a European ESPRIT project [4]. The MD96 board is based on eight Motorola 96002 Digital Signal Processors and has a peak processing power of 240 MFLOPS, enabling it to perform stereo reconstruction of 128x128 images in 0.9 sec. In our configuration (1 transputer module, the same image resolution) the stereo reconstruction algorithm takes 8 sec. This result is comparable to the French hardware-based solution, but does not require additional costs.


References

1. Baker, H.H. & Binford, T.O. (August 1981). Depth from Edge and Intensity Based Stereo. Proc. Seventh Int. Joint Conf. on Art. Intel., Vancouver, 631.

2. Balazs, A., Biro, J. & Szalai, S. (15-21 May 1994). Onboard computer for Mars 96 rover. Proc. 2nd Int. Sympos. on Missions, Technologies and Design of Planetary Mobile Vehicles. Centre National d'Etudes Spatiales (CNES, France) & Russian Space Agency (RKA), Moscow-St. Petersburg (Russia).

3. Dijkstra, E.W. (1959). A note on two problems in connection with graphs. Numerische Mathematik, 1, 269-271.

4. Faugeras, O.D., et al. (1993). Real-time correlation based stereo: algorithm, implementations and applications. The Intern. Jour. of Comp. Vision.

5. Fua, P. (1993). A Parallel Stereo Algorithm that Produces Dense Depth Maps and Preserves Image Features. Machine Vision and Applications, 6(1), 35.

6. Gimel'farb, G.L. (1991). Intensity-based computer binocular stereo vision: signal models and algorithms. Int. J. of Imaging Systems and Technology, 3(3).

7. Grimson, W.E.L. (1981). From Images to Surfaces: A Computational Study of the Human Early Visual System. MIT Press, Cambridge, MA.

8. Grimson, W.E.L. (Jan. 1985). Computational Experiments with a Feature Based Stereo Algorithm. IEEE Trans. on Patt. Anal. and Mach. Intell., 7(1), 17.

9. Hannah, M.J. (Dec. 1989). A System for Digital Stereo Image Matching. Phot. Eng. and Rem. Sens., 55(12), 1765.

10. Kim, N.H. & Bovik, A.C. (1988). A Contour-Based Stereo Matching Algorithm Using Disparity Continuity. Pattern Recognition, 21(5), 505.

11. Kolesnik, M.I. (1993). Fast algorithm for the stereo pair matching with parallel processing. Lecture Notes in Computer Science, 719, 533.

12. Medioni, G. & Nevatia, R. (1984). Matching Images Using Linear Features. IEEE Trans. on Patt. Anal. and Mach. Intell., 6(6), 675.

13. Moravec, H.P. (Sept. 1980). Obstacle Avoidance and Navigation in the Real World by a Seeing Robot Rover. Ph.D. Thesis, Stanford Univ., Comp. Sc. Dept., Report.

14. Nishihara, H.K. (September 1984). Practical Real-Time Imaging Stereo Matcher. Optical Engineering, 23(5).

15. Ohta, Y. & Kanade, T. (March 1985). Stereo by Intra- and Inter-Scanline Search Using Dynamic Programming. IEEE Trans. on Pattern Anal. Machine Intell., 7(2).

16. Proy, C., et al. (16-22 Oct. 1993). Improving autonomy of Marsokhod 96. Cong. of the Int. Astronautical Federation, Graz, Austria.

17. Transputer Data Book (1992). 2nd Edition. London, UK: Prentice Hall.




Fig. 4a. Original stereo pair; resolution: 256x256 pixels. Rover's inclinations: pitch = 0.

Fig. 4b. Prohibited / traversable zones. Fig. 4c. Left image with the path. The black area in fig. 4b is suitable for the rover's motion. The start point of the path (fig. 4c) is at a distance of 2.90 m from the rover's center; the distance to the fracture point is 4.60 m, and the distance to the target point is 12.70 m.




Fig. 5a. Original stereo pair; resolution: 256x256 pixels. Rover's inclinations: pitch = 0.

Fig. 5b. Prohibited / traversable zones. Fig. 5c. Left image with the path. The gray area in fig. 5b, which is the part of the elevation map (practically flat here), is suitable for the rover's motion. The start point of the path (fig. 5c) is at a distance of 2.2 m from the rover's center. The distance to the target point is 9.35 m.

Fig. 6. Original stereo pair and generated path (left image with the path); resolution: 256x256 pixels.

Fig. 7. Original stereo pair and generated path (left image with the path); resolution: 256x256 pixels.