How to develop and maintain scientific software that others can use in an academic environment

Roberto A Lotufo

Fac. de Engenharia Elétrica e de Computação

Universidade Estadual de Campinas

Campinas - SP, Brasil

Oct 2002

Summary


Discussion on designing a commercial software library for image processing

Proposed methodology using XML tools

An application using the Mathematical Morphology toolbox for MATLAB, Python, and Tcl/Tk

Requirements of scientific commercial software

Reliable and robust

Documentation (for new and expert users)

  reference manual, tutorial, demonstrations

  different media: hardcopy, HTML, on-line help

  consistent with implementation

Updated frequently (software release panic)

  requires a longer life (adaptable to new technologies)

Competitive (efficient)

Multiplatform

By contrast: typical research software

Reliability: poor, only good enough for a paper

Robustness: badly tested (single user)

Documentation: none, or maybe a thesis

  if available, it is most probably inconsistent

New updates: normally introduce many bugs

Competitiveness: very high (short life)

Multiplatform: normally runs only on the researcher's PC

How information is stored


There are three main components of information

  Content

  Structure

  Presentation

Examples

  LaTeX versus TeX or HTML

  Algorithm, program in a specific language, and documentation

How information is stored


Keeping content, structure, and presentation separate increases reusability

  SGML (first to follow this idea)

  XML (for the Internet)

Software development


Try to apply the same motivation as XML to program coding

Content => Presentation

  algorithm => programming languages (many environments: MATLAB, Tcl), Makefiles, release packaging

  documentation => HTML, LaTeX
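
The Content => Presentation idea can be sketched as follows: one XML description of a function is rendered both as a code stub and as an HTML reference entry. This is only an illustration. The element and attribute names (`function`, `summary`, `arg`) and the helper functions are hypothetical and are not the schema or tools used by the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical master description of one function. The element and
# attribute names are invented for this sketch.
MASTER = """
<function name="dilate">
  <summary>Dilate an image by a structuring element.</summary>
  <arg name="f" type="image">input image</arg>
  <arg name="b" type="structuring element">structuring element</arg>
</function>
"""

def to_python_stub(fn):
    """Render the description as a Python function stub (one presentation)."""
    args = ", ".join(a.get("name") for a in fn.findall("arg"))
    summary = fn.findtext("summary").strip()
    return (f'def {fn.get("name")}({args}):\n'
            f'    """{summary}"""\n'
            f'    raise NotImplementedError\n')

def to_html(fn):
    """Render the same description as an HTML reference entry (another presentation)."""
    items = "".join(
        f"<li><b>{a.get('name')}</b> ({a.get('type')}): {a.text.strip()}</li>"
        for a in fn.findall("arg")
    )
    summary = fn.findtext("summary").strip()
    return f"<h2>{fn.get('name')}</h2><p>{summary}</p><ul>{items}</ul>"

fn = ET.fromstring(MASTER)
stub = to_python_stub(fn)
html = to_html(fn)
print(stub)
print(html)
```

Because both outputs come from the same source, a renamed argument changes the code and the documentation together.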

Characteristics of an image processing library

Large number of functions (on the order of hundreds)

Most function programs are very similar: same startup code, same finishing-up code, etc.

  it is common to use COPY&PASTE

Requires a user interface: script language, visual language, or GUI

Recommendations


Decrease the number of functions:

  Hierarchical coding

  Polymorphic functions

    e.g., same dilation function interface for binary, gray-scale, color, 1D, 2D, 3D, etc.

Automatic code generator versus copy&paste

  for software and documentation

Test suite

Automatic software packaging and releasing
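
As a rough illustration of the first recommendation, the sketch below gives one `dilate` interface that accepts binary or gray-scale data in 1D or 2D. It is a minimal pure-Python sketch, not the toolbox's implementation. The representation (nested lists, a structuring element given as a list of offsets that includes the origin) is an assumption made for the example.

```python
def _dilate2d(f, offsets):
    """Flat dilation of a 2D image stored as a list of rows.

    Each output pixel is the maximum of f(x - y) over the offsets y of the
    structuring element. Offsets falling outside the image are ignored.
    Assumes the structuring element contains the origin (0, 0).
    """
    rows, cols = len(f), len(f[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            vals = [f[r - dr][c - dc] for dr, dc in offsets
                    if 0 <= r - dr < rows and 0 <= c - dc < cols]
            row.append(max(vals))   # max works for bool and for int pixels
        out.append(row)
    return out

def dilate(f, offsets):
    """Single entry point: 1D or 2D input, binary or gray-scale values."""
    if f and isinstance(f[0], list):
        return _dilate2d(f, offsets)
    # Hierarchical coding: the 1D case reuses the 2D code on a one-row image.
    return _dilate2d([f], [(0, d) for d in offsets])[0]

binary_1d = dilate([False, False, True, False, False], [-1, 0, 1])
gray_1d = dilate([0, 5, 0, 2], [0, 1])
cross = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
gray_2d = dilate([[0, 0, 0], [0, 9, 0], [0, 0, 0]], cross)
print(binary_1d, gray_1d, gray_2d)
```

Only `_dilate2d` contains pixel logic, so every call, whatever the data type or dimension, exercises the same code.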

Code Proliferation and Reliability

Code proliferation must be avoided (very important)

  reuse of functions (operator decomposition)

  hierarchical calls

  polymorphic functions

If code is used only for a single situation in a single function, it is very likely to be poorly tested

Copy&Paste Syndrome


Easy to do, difficult to maintain

Increases the size of your document

  code proliferation

  lower reliability

If a later change is necessary, it is difficult to keep the initial consistency

Very susceptible to bugs

Code Generator as opposed to Copy&Paste

Shorter master documents

Forces consistency in the syntax, in the style, in the documentation, etc.

The rules, conventions, etc. are programmed (and can be changed globally and consistently if necessary)

Does not introduce bugs. Increases reliability.
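
A very small code generator along these lines might look like the sketch below. The shared startup and finishing code lives in a single template, and each operator contributes only its name and summary. The template text and the names `check_image`, `finish`, and `_core_*` are hypothetical placeholders, not the project's generator or conventions.

```python
from string import Template

# The common startup/finishing convention is written once, in the template.
WRAPPER = Template('''\
def $name(image, *args):
    """$doc"""
    check_image(image)              # common startup code
    result = _core_$name(image, *args)
    return finish(result)           # common finishing code
''')

# Hypothetical master list: one short entry per operator.
OPERATORS = {
    "dilate": "Dilate an image.",
    "erode": "Erode an image.",
    "gradient": "Morphological gradient.",
}

def generate(operators):
    """Expand the template once per operator and join the results."""
    return "\n".join(WRAPPER.substitute(name=n, doc=d)
                     for n, d in operators.items())

source = generate(OPERATORS)
compile(source, "<generated>", "exec")   # checks the generated text is valid Python
print(source)
```

Changing a convention, for example the name of the startup check, means editing the template once and regenerating, rather than editing every copied function.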

Test Suite


It is very difficult to test software

A test suite is a collection of scripts that execute the functions and compare the results with known data

  Difficult to design the data for testing

  Every function must have many test scripts

  Guarantees the correctness of the implementation

No one likes to write documentation and test suites
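
The definition of a test suite given above can be sketched in a few lines: each case names a function, its inputs, and the known expected data, and a runner reports every mismatch. The `threshold` function and the case table are toy examples invented for this illustration.

```python
def run_suite(cases):
    """Run each (name, func, args, expected) case and return the failures."""
    failures = []
    for name, func, args, expected in cases:
        try:
            got = func(*args)
        except Exception as exc:            # a crash counts as a failure
            failures.append((name, f"raised {exc!r}"))
            continue
        if got != expected:
            failures.append((name, f"expected {expected!r}, got {got!r}"))
    return failures

def threshold(f, t):
    """Toy function under test: binarize a 1D signal at level t."""
    return [v >= t for v in f]

# Several cases per function, including edge cases such as empty input.
CASES = [
    ("threshold-basic", threshold, ([1, 5, 3], 3), [False, True, True]),
    ("threshold-empty", threshold, ([], 3), []),
    ("threshold-all", threshold, ([4, 4], 0), [True, True]),
]

failures = run_suite(CASES)
print("failures:", failures)
```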


Test Suite


Easier for scripting languages

  input => output

More difficult for interactive systems

  (input, state) => (output, new state)

Interactive systems require a journal playback capability for the test suite
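
One way to picture journal playback: record the sequence of user events together with the outputs they produced, then replay the events later from the same initial state and check that the outputs still match. The sketch below uses an invented marker-placing tool as the interactive system. The event names and output strings are assumptions for illustration only.

```python
def step(event, state):
    """Toy interactive system: maps (input, state) to (output, new state)."""
    kind, value = event
    if kind == "click":
        new_state = state + [value]
        return f"marker {len(new_state)} at {value}", new_state
    if kind == "undo":
        new_state = state[:-1]
        return f"{len(new_state)} markers left", new_state
    raise ValueError(f"unknown event {kind!r}")

def record(events, state):
    """Run a session and journal each event with the output it produced."""
    journal = []
    for event in events:
        output, state = step(event, state)
        journal.append((event, output))
    return journal

def playback(journal, state):
    """Replay a journal; return the index of the first mismatch, or None."""
    for i, (event, expected) in enumerate(journal):
        output, state = step(event, state)
        if output != expected:
            return i
    return None

session = [("click", (1, 2)), ("click", (3, 4)), ("undo", None)]
journal = record(session, [])
print(playback(journal, []))
```

The state is threaded through every step, which is what makes this harder than the plain input => output case.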

Documentation


It may be the most expensive and important part of scientific software:

  must be written by experienced people

  must include typical examples and demos

  must be consistent with the implementation

  must be updated frequently

  must be formatted for different media:

    paper, on-line help, web browser, demonstration

Solution for documentation


Keep it in the same file as the function code

  any change in the code can easily be followed by the corresponding change in the documentation

Documentation is generated automatically

  the images and graphics in the documentation must be generated on-the-fly with the current version of the software

  avoid redundancy, e.g., the name, number, and type of parameters are documented automatically
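
In Python, the last point can be illustrated with the standard `inspect` module: the documentation lives in the same file as the code (the docstring), and the name, number, and type of the parameters are read from the function itself rather than typed again. The `dilate` signature and the `document` helper are hypothetical examples, not the toolbox's generator.

```python
import inspect

def dilate(f: list, b: list, iterations: int = 1) -> list:
    """Dilate image f by structuring element b."""
    raise NotImplementedError   # the body is irrelevant for this sketch

def document(func):
    """Build a plain-text reference entry from the function itself."""
    sig = inspect.signature(func)
    lines = [
        func.__name__,
        inspect.getdoc(func) or "",
        f"Parameters ({len(sig.parameters)}):",
    ]
    for p in sig.parameters.values():
        type_name = getattr(p.annotation, "__name__", str(p.annotation))
        if p.default is inspect.Parameter.empty:
            default = ""
        else:
            default = f" (default {p.default!r})"
        lines.append(f"  {p.name}: {type_name}{default}")
    return "\n".join(lines)

text = document(dilate)
print(text)
```

If a parameter is added or renamed, the generated entry changes with it, so the parameter list cannot drift out of step with the implementation.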


Image Processing Programming Language

Requires two levels:

User application level: scripting language

  easier to program and change, slower in execution

Developer level: compiled language

  faster in execution, more difficult to program

Solutions:

  Python and C/C++

  MATLAB and C/C++

  Tcl/Tk and C/C++

Software Model

[Diagram; recoverable labels: Knowledge, MATLAB, Legacy Library, MMachLib, GUI editor, API: Morphology Library, Interface, Tcl/Tk, Interface, Code and Doc Generator, Doc., Doc]

XML Technologies


XSL (stylesheet processor)

XML Schema, DTD (structure)

DOM, SAX (XML data structure)

XPath (query language)

There are many public domain implementations of these tools (Sun, IBM, others) in many languages (Java, C/C++, Tcl, Python)
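
As a brief illustration using only Python's standard library, the sketch below reads the same small document three ways: through DOM, through SAX, and through an XPath-style query (ElementTree supports only a limited XPath subset). The document content is invented for the example. XSL is not shown because the standard library has no XSLT processor.

```python
import xml.dom.minidom
import xml.sax
import xml.etree.ElementTree as ET

# Invented sample document for this sketch.
DOC = ('<toolbox><function name="dilate"/><function name="erode"/>'
       '<demo name="beef"/></toolbox>')

# DOM: the whole document is loaded as a tree of nodes.
dom = xml.dom.minidom.parseString(DOC)
dom_names = [e.getAttribute("name")
             for e in dom.getElementsByTagName("function")]

# SAX: the parser streams events to a handler; no tree is built.
class FunctionCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, tag, attrs):
        if tag == "function":
            self.names.append(attrs["name"])

collector = FunctionCollector()
xml.sax.parseString(DOC.encode("utf-8"), collector)

# XPath: select the function elements that carry a name attribute.
root = ET.fromstring(DOC)
xpath_names = [e.get("name") for e in root.findall("./function[@name]")]

print(dom_names, collector.names, xpath_names)
```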

XSL

[Diagram: the input XML and an XSLT stylesheet feed the XSL processor, which produces the output document]

Examples (MATLAB toolbox)


Toolbox documentation: main index

Demonstrations:

  Beef segmentation

  Flat zone concept

  Cornea cells

  Weave detection

Functions:

  Dilation

  Geodesic DT

Benchmark

Example (Tcl/Tk toolbox)


Tcl/Tk is a multiplatform scripting language with a user interface

  it is good for user-interaction stand-alone applications

Assisted segmentation

  ProntoRegion (Watershed)

  ProntoContour (Live-wire)

  NeuroLine (Brain MR segmentation)

History of the Project


Morphology Toolbox for Khoros

  USP - INPE - UNICAMP

  1993-97 - MMach, one public release per year

  1995-97 - MMachLib independent library (MS Windows - Unix)

Morphology Toolbox for MATLAB

  SDC Eng. - Softex project (Brazilian government)

  1997-today - development

  Nov 1998: first commercial release (www.mmorph.com)

Morphology Toolbox for Python

  Mar 2002: started.

(Book: Dougherty & Lotufo - Hands-on Morphological Image Processing, SPIE, 1st semester 2003)

Project Team: Toolbox for MATLAB

Roberto A Lotufo

  overall design, coding and documentation

Junior Barrera

  design of MM operators, documentation

Rubens C Machado

  design of software tools and coding

Francisco A Zampirolli

  coding of document generators

Project Team: Toolbox for Python

Roberto A Lotufo

  overall design, coding and documentation

Rubens C Machado

  overall design, design of software tools and coding

Alexandre Gonçalves Silva

  Python coding and document generators

Future Work


Still a lot of work:

  IDE for Adesso users

  Finish many loose (unfinished) points

  New port to a distributed image processing environment:

    AdessoWeb

    IPWeb

Conclusions


The methodology has proved to be very suitable

Automatic code and doc generators are crucial tools for commercial-quality image processing tools

Highly sophisticated commercial software can be maintained at very low cost:

  the MATLAB toolbox is released every 6 months.