C. Project Description


1. Results from Prior NSF Support

Summary of Previous and Current Awards for Hsu

INFORMATION TECHNOLOGY RESEARCH (ITR) FOR NATIONAL PRIORITIES, 2004-2007: Hsu is a co-PI on ASE(sim+dmc)-0428826, $750,000, “ITR: Parallel Data Mining for Nanoscale Kinetic Monte Carlo Simulation Models”. This project provides partial support to two undergraduate programmers in computer science, one female, and is generating data used in the thesis of one M.S. student in mathematics and one Ph.D. student in computer science.

Since the award date (September 2004), this project has initiated the implementation of dynamic Bayesian networks in Hsu’s software library, Bayesian Network tools in Java (BNJ), to be used in Rahman and Kara’s parallel kinetic Monte Carlo simulator (LEAP-KMC). This grant has also partially supported the publication of a paper on technique selection and one on Monte Carlo methods for probabilistic inference. Hsu has primary development responsibility for the machine learning aspect of this project and is investigating techniques based on k-nearest neighbor (k-NN), support vector machine (SVM), and symbolic regression approaches.
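To illustrate the instance-based (k-NN) approach, a new neighborhood configuration's activation energy can be estimated from the nearest previously computed configurations. The sketch below is a minimal illustration under assumed conventions: the bit-vector encoding, Hamming metric, and the database entries are hypothetical placeholders, not taken from the actual LEAP-KMC code.

```python
# Minimal k-nearest-neighbor sketch for estimating an activation energy
# barrier (eV) from a bit-vector encoding of a local neighborhood.
# All configurations and energies below are illustrative placeholders.

def hamming(a, b):
    """Distance between two equal-length occupancy bit vectors."""
    return sum(x != y for x, y in zip(a, b))

def knn_energy(query, database, k=3):
    """Average the energies of the k nearest stored configurations."""
    nearest = sorted(database, key=lambda rec: hamming(query, rec[0]))[:k]
    return sum(e for _, e in nearest) / k

# Toy database: (occupancy bit vector, precomputed barrier in eV)
db = [
    ((1, 1, 0, 0, 1, 0), 0.45),
    ((1, 0, 0, 0, 1, 0), 0.52),
    ((1, 1, 1, 0, 1, 0), 0.40),
    ((0, 0, 0, 1, 0, 1), 0.80),
]

estimate = knn_energy((1, 1, 0, 0, 1, 1), db, k=3)
```

A production version would replace the toy metric with one respecting lattice symmetry, or substitute an SVM or symbolic-regression model fitted to the same database.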

FRONTIERS IN INTEGRATIVE BIOLOGICAL RESEARCH (FIBR), 2004-2009: Hsu is a senior person on FIBR-0425759, “Molecular Evolutionary Ecology of Developmental Signaling Pathways in Complex Environments”. This project supports one M.S. student in mathematics and is generating data used in the student’s thesis and that of a second Ph.D. student in computer science. Since the award date (September 2004), this project has led to a new release (v3.2) of BNJ and structure learning modules to be used in causal modeling in ecological genomics. One journal paper and several conference papers are in preparation.

REU DEVELOPMENT (1999-2000): During Hsu’s one-year appointment at NCSA, he led a research program on applied KDD for commercial decision support, was responsible for industrial data mining projects, and developed machine learning and probabilistic reasoning software for large-scale KDD applications. This led to the PI’s participation in two summer NSF REU programs whose tutorial components were partly developed and piloted in a summer course on data mining at Kansas State. Hsu has been responsible for major contributions to the D2K reuse library, including modules for data clustering (1998), stochastic search wrappers for feature selection (1999), web clickstream mining (2000), Bayesian network structure learning for decision support (2000), and stochastic sampling-based inference in Bayesian networks (2001).

Specifics of Previous and Current Awards for Rahman

“Chemisorption Studies at Metal Surfaces”, CHE-9812397 (1999-2002), $280,000, and CHE-0205064 (2002-2005), $315,000 (PI); “Kansas Center for Advanced Scientific Computing”, NSF/EPSCoR 006169 (1996-1999), $302,837 (Co-PI); “Upgrading of a High Performance Computational Facility”, CDA-9724289 (1997-2000), $350,000 (PI); “US-Pakistan Workshop: 25th International Nathiagali Summer College” (2000), $15,000; 26th (2001), $15,000; 27th (2002), INT0215511, $20,000 (PI); “Evolution of Nanoscale Film Morphology”, ERC0085604 (2000-2003), $1,170,834 (with 3 Co-PIs); “Single Molecule Magnet for Quantum Computing”, NER/CIS-0304665 (2003-2004), $100,000 (PI); “Theoretical Studies of Intermetallic Surfaces”, US-Turkey Cooperative Research, INT-0244191 (2003-2005), $40,000 (PI); “ITR: Parallel Data Mining for Nanoscale Kinetic Monte Carlo Simulation Models”, ASE(sim+dmc)-0428826 (2004-2007), $750,000 (PI).

Highlights of Completed Projects (Rahman)

Funds from the above awards have provided financial support to eight graduate students and four post-doctoral associates. Five of these twelve individuals are female. Details can be found in the relevant publications. Some details of the work relevant to this proposal follow:
Atomistic studies of initial stages of homoepitaxial growth on Ag(111). Using molecular statics (MS) and dynamics (MD) simulations, we show that while at low temperatures the formation of (100)-microfaceted step edges is favored over the (111)-type, the situation is reversed at higher temperatures. These results point to the importance of temperature-dependent atomic vibrations in considerations of epitaxial growth.
Morphology of ledge patterns during step-flow growth. During step-flow growth of step-edge patterns on vicinals of Cu(001), in the presence of the meandering instability, we find an invariant shape of the step profile. The step morphologies change with increasing coverage from a somewhat triangular shape to a flatter, invariant steady-state form. Our KMC simulations show the kink Ehrlich-Schwoebel barrier to be critical in determining the ledge morphology.
Evolution of step morphology in thermal equilibrium. Our KMC simulations of thermally induced changes in the step profiles of vicinals of Cu(001), using a set of critical energy barriers obtained from reliable many-body potentials, yield good agreement with experimental results for the kink formation energy and for the exponents of time correlation functions obtained from STM data, over a large temperature range. A key element here is the ability to obtain macroscopic properties of steps, such as the stiffness parameter, from microscopic considerations.
Self-teaching kinetic Monte Carlo method. We are developing new KMC codes with automatic generation of microscopic events using many-body potentials and accurate methods for the calculation of energy barriers, as needed. Because of this automation and inherent pattern recognition ability, the code is expected to provide an accelerated, microscopic approach to examining issues related to non-equilibrium and equilibrium phenomena on metal surfaces.
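At the core of any such KMC code is rate-weighted event selection: each enabled microscopic event with barrier E and prefactor ν has rate ν·exp(-E/kBT); an event is chosen with probability proportional to its rate and the clock advances by an exponentially distributed waiting time. The following is a generic sketch of that selection step, not the actual self-teaching code; the event names and barrier values are illustrative only.

```python
import math
import random

KB = 8.617e-5  # Boltzmann constant, eV/K

def rate(prefactor, barrier_ev, temp_k):
    """Arrhenius rate for a single thermally activated event."""
    return prefactor * math.exp(-barrier_ev / (KB * temp_k))

def kmc_step(events, temp_k, rng=random):
    """One rejection-free KMC step: pick an event with probability
    proportional to its rate; return (event_name, time_increment)."""
    rates = [rate(nu, e, temp_k) for _, nu, e in events]
    total = sum(rates)
    # Select the event whose cumulative-rate interval contains u.
    u = rng.random() * total
    acc = 0.0
    for (name, _, _), r in zip(events, rates):
        acc += r
        if u <= acc:
            chosen = name
            break
    # Exponential waiting time; 1 - random() lies in (0, 1].
    dt = -math.log(1.0 - rng.random()) / total
    return chosen, dt

# Toy event list: (name, prefactor in 1/s, barrier in eV)
events = [("terrace hop", 1e12, 0.45), ("step-edge hop", 1e12, 0.65)]
name, dt = kmc_step(events, 400.0)
```

The "self-teaching" aspect would extend this loop so that, when an unrecognized local pattern is encountered, the event catalog is augmented on the fly with a freshly computed barrier.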
Diffusion of 2-D Cu clusters on Cu(111). Using a closed database consisting of 294 transition events involving periphery diffusion, we show that the dynamics of small clusters (8-38 atoms) is governed by their size and shape, and produce an effective diffusion barrier of 0.65 ± 0.02 eV, in good agreement with experiments. The larger islands (50-1000 atoms) show an interesting scaling with size and temperature.
Prefactors for interlayer diffusion on Ag/Ag(111). From calculated energy barriers and kinetic Monte Carlo simulations, we find that good agreement with experimental data requires that the prefactors for terrace and step-edge diffusion differ by two orders of magnitude.
Molecular dynamics of adatom and cluster diffusion on metal surfaces. These studies are providing insights into novel complex, multi-atom diffusion mechanisms that appear as a function of surface temperature. By providing a measure of when anharmonic effects become important, these studies establish the limits of validity of KMC simulations, for which the harmonic approximation is assumed. The above projects were performed under grant ERC0085604, which has now expired; no new funds for them are available.
Theoretical studies of chemisorption at metal surfaces. Rahman’s group is engaged in ab initio electronic structure calculations of a range of phenomena on metal surfaces, including examination of structural relaxation, changes in local electronic structure, reactivity, chemisorption, vibrational dynamics, and surface stress as induced by surface geometry, the presence of steps and kinks, and adsorbates. A number of efficient codes based on density functional theory, with both the local density and the generalized gradient approximation, are available to the group, and several current students are already very familiar with their usage. Rahman’s group is now collaborating with a team from Computing and Information Sciences on “Parallel Data Mining for Nanoscale Kinetic Monte Carlo Simulation Models”, which aims at scaling up simulations of 2-D epitaxial growth to much larger neighborhoods and irregular surface configurations.

2. Objectives, Expected Outcomes and Long-Term Goals

The main goal of the proposed work is to build models for nanoscale materials processes that are applicable to emerging technologies for computation. These models need to be able to represent three-dimensional phenomena at multiple time scales. Thus, our supporting technical objective is to extend current simulation infrastructures by developing more general geometric and temporal representations. Examples of processes that have been simulated in the past, but present a challenge to scale up, include:



- deposition of thin layers (e.g., metal-on-metal in semiconductor wafers and data storage media)
- initiation and propagation of surface defects
- dynamics, diffusion and adsorption of proteins, peptides, and other organic molecules

Such phenomena are not limited to crystal lattice structures, as in the case of many existing simulations, but can include adsorption of organic compounds onto inorganic surfaces, diffusion of proteins into media such as electrophoresis gels, etc. The types of computational models that are currently used to simulate material evolution are usually based on two-dimensional algorithms and propagate information in the third dimension using a dynamic programming or “sweep-plane” approach. A consequence of generalizing from these to real 3-D processes is that better representations are needed for material evolution and fault propagation neighborhoods.

An interdisciplinary team of researchers from Physics and Computer Science has formed to address the problem of generalizing and extending existing frameworks for “multiscale” simulation of physical phenomena such as the above, from the atomic level up. Previous and related work addresses the spatial multiscale aspect of simulating material evolution and grain boundary diffusion in solids and nanostructures. However, representing and calculating the dynamics, using a mixture of models at different temporal scales, presents yet another theoretical challenge.

For example, the events involved in the initiation of a surface defect in a data storage medium range over time scales from picoseconds (10^-12 s) to seconds, while the propagation of the defect and its impact on data storage failure may be measured on scales of up to 10^7 seconds. Additionally, some components of this time axis are independent of the spatial scale.

The emphasis of this new work is on formalizing multi-time representations for 3-D nanostructures, using temporal graphical models such as dynamic Bayesian networks and their relational and object-oriented extensions. Besides the new applications to simulation of nanoscale materials processes, the novel contributions of this approach include the scaling up of stochastic simulations for a more general class of discrete phenomena over time, which has potential benefits for time series prediction, process control, and planning.

Specific desired outcomes include:

1. implementations of existing and new machine learning algorithms for approximation of energy functions (macroscopic and microscopic rates) in simulation of experimental processes

2. representations and semistructured data models for multi-time nanoscale phenomena

3. fielded applications, such as parallel kinetic Monte Carlo simulators, using these representations

The computer scientists in the group are interested in the challenge of scaling up simulations that involve state spaces in excess of 2^100 configurations, with on the order of 10^12 floating point operations required to estimate the energy function for each atomic-scale configuration. Such problems are combinatorially intractable even when all symmetries, caching, and parallelization venues are exploited. In the material evolution domain, spatial decomposition can only reduce this to billions of distinct configuration classes. The challenge is then to inductively generalize over previously seen configurations, in order to discover equivalence classes of, or local regularities in, the energy function.
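One concrete way to exploit symmetries is to store each neighborhood under a canonical form, so that all rotations and reflections of the same occupancy pattern share one entry in the configuration cache. The sketch below uses a single 6-site ring as a stand-in for one shell of a hexagonal neighborhood; the real symmetry group of the 3-shell, 36-atom neighborhood is richer, so this is only an assumed, simplified model.

```python
def canonical(ring):
    """Return the lexicographically smallest rotation or reflection of a
    tuple of occupancy bits, so that symmetric configurations collide."""
    n = len(ring)
    candidates = [ring[i:] + ring[:i] for i in range(n)]
    mirrored = tuple(reversed(ring))
    candidates += [mirrored[i:] + mirrored[:i] for i in range(n)]
    return min(candidates)

# Two configurations related by rotation map to the same class:
a = canonical((1, 0, 0, 1, 1, 0))
b = canonical((0, 1, 1, 0, 1, 0))
```

For this 6-site ring, canonicalization reduces the 64 raw bit patterns to 13 equivalence classes; analogous reductions over the full neighborhood shrink the number of distinct energy evaluations that must ever be cached.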

The wide range of time scales for simulations presents both a challenge and an opportunity. On the one hand, simulation at the finest temporal grain (15-20 decimal orders of magnitude shorter in duration than the longest episodes to be modeled) is not feasible; therefore, learning mixed-time models can provide a way to obtain useful estimates for event times. Such an abstraction may be required, for instance, to model long-term nanoparticle diffusion and sintering (synthesis of materials from powder) at a macroscopic scale. On the other hand, such an abstraction is often enough to approximate the outcome of an experiment to sufficient accuracy. For example, the experiment might be to test the effectiveness of a new catalyst with respect to some qualitative outcome of interest, such as: “Does this nanoparticle resist self-adhesion in application for one year?”

The physicists in our group seek to extend the methodology of materials process modeling in revolutionary ways, supporting design and control of the properties of materials as needed for emerging technological applications. For them, the long-term goal is to have a set of general-purpose computational tools with which they can control the growth patterns of nanostructures: metallic films, nanoscale storage media, nanowires, C60-walled nanotubes, etc., as a function of temperature, substrate geometry, and material composition. This in turn requires a computational modeling tool that can approximately map the “system inputs” to the parameters of rate equations for the structures of interest.

Figure 1. Illustration of the 3-shell, 36-atom neighborhood representation currently used for a (111) fcc 2-D system.

Together, we have identified three orthogonal improvements to existing frameworks that are needed to achieve the shared goal of this computational modeling tool. First, previous and current funded research has pushed towards finer-grained simulations by means of parallelization and data mining: “more energy evaluations, generalized to cover a larger state space, in the same amount of time”. One goal of this research project is to create a robust and reliable mapping function from a 2-D geometric specification of a local 3-shell neighborhood about an active atom to the activation energy for the state. This neighborhood is depicted in Figure 1.
Second, an independent problem is to develop a true 3-D model rather than iterating over 2-D computations one layer at a time; this will allow interlayer (vertical) transitions in complex epitaxial growth simulations and support modeling of 3-D material evolution. Third, temporal abstractions are needed to simulate long-term events such as crack formation (both macroscopic and microscopic) from short-term, atomistic computations. As Figure 2 illustrates, this proposal addresses primarily the third necessary extension, and those aspects of the second that are relevant to dynamics. This block diagram is explained in more detail in Section 4 (Proposed Research).


Figure 2. Overview: block diagram of the improved system for nanoscale process modeling.

3. Present State of Knowledge

3.1 Background and Related Work

Theoretical physics research over the past ten years has included many efforts to model discrete phenomena at intermediary scales of time and distance. These range from basic processes, which are minimal both in duration and distance, to long-term effects.

Material evolution: Primitive processes that can be identified and simulated at the nanoscopic level in material evolution include: deposition, diffusion, nucleation, attachment, detachment, edge diffusion, diffusion down a step, nucleation on top of islands, and dimer diffusion. The prefactors and activation energy barriers for a typical state in a 36-atom neighborhood (3-shell atomic model) require tens of CPU seconds.

Coating and crack initiation and propagation: Research on surface coatings for prevention of fatigue crack initiation typically focuses on suppressing the development of the critical persistent slip band surface morphology through high-modulus coatings and the associated dislocation image forces. Brittle coatings of this type tend to have limited performance and impact on crack initiation resistance because cracks can initiate by other mechanisms. In comparison to traditional crack-resistant coatings, multilayered metallic thin films can be tailored to generate exceptional hardness and to retain substantial ductility from the constituents. It is this extensive flexibility that facilitates material-specific control of the critical surface morphology responsible for fatigue crack initiation. The propagation of cracks is facilitated by grain boundaries or easy planes (stacking faults), where the energy to separate two planes is the least.


Adatom decoration of stepped surfaces: Deposition on monoatomically stepped surfaces has been demonstrated with metals such as gold (Au) on platinum (Pt) and palladium (Pd). Vicinally stepped surfaces such as Pt(665) have in turn been used as templates for the formation of nanowires and similar metal deposits. Controllable synthesis of nanostructures is one reason why a general architecture for computational modeling of nanoscale phenomena is important: it is needed for better characterization of events, discovery of interesting properties, and prediction of longer-term behavior from specification.


Figure 3. Fatigue crack initiation.

Figure 4. Generic monoatomically stepped surface.

3.2 Need for Extended Geometric and Temporal Representations of Materials Processes

Early experiments using parameters for 2-D Cu(111) islands (of 10-1000 atoms) at 400 Kelvin indicate that the number of CPU-seconds required to achieve a requisite 95% is estimated to be greater than 4 × 10^4, or about half of a CPU-day on the test system. In preliminary benchmarking experiments, an existing high-performance FORTRAN code, courtesy of Oleg Trushin, was executed on a current-generation desktop PC (single Pentium IV processor) and began to saturate its cache of precomputed states only after the 1000 most frequent unique states had been visited. The first 10^4 seconds resulted in only about 300 unique states being visited, after which a dramatic increase in previously visited states resulted in over 3 million state transition evaluations being achieved in the second 10^4 seconds. Realizing this speedup was largely a matter of caching the results. This performance curve indicated that scaling up to hundred-atom neighborhoods would pose a significant challenge. Throughput of such a system at 40 seconds of wall clock time per state evaluation prompted us and our collaborators to look at parallelizing kinetic Monte Carlo (KMC) simulations. Independent of this effort, we began to develop a representation that supports machine learning using instance-based methods, multivariate regression, and kernel methods (support vector machines). The resultant system aims at a combined speedup of a few hundred times.

Scaling up through parallelization and data mining allows more detailed modeling to be done in the same amount of wall clock time. However, this only addresses the first of the technical improvements to existing frameworks stated at the end of Section 2 above. The need for an extended geometric model can be seen in Figure 3 and Figure 4 above: modeling of faults and wire growth is limited to monoatomically stacked or stepped structures unless an ad hoc mechanism is defined for vertical jumps, or a true 3-D model is developed. For general-purpose modeling tools, the latter is more extensible and therefore preferable. At least as important to this proposed research is the need for multi-time models in order to represent longer-term cumulative, periodic, or gradual complex effects.

4. Proposed Research

This section describes specific technical contributions and outcomes of the proposed work. We first review the system architecture depicted in Figure 2 and list the key tasks and desired outcomes of current work, to delineate it from incremental extensions and more significant, fundamental improvements.

The top box in Figure 2 lists five sources of input data for data mining using temporal graphical models:

1. the previously computed exact parameters for robust estimators that, as documented in Section 3, include prefactors and activation energy barriers; these also express some background knowledge

2. the process specification, a history of all relevant energy values, which in some cases may also include higher-level adjustable parameters such as priors for the rate equations to be expressed by the temporal graphical models. (In the current system, this is a simple qualitative selector and the rest of the specification is captured by the state specification. When different levels of temporal abstraction are introduced, however, Markovity may be violated.)

3. the spatial and geometric state specification, consisting of at least a bit vector representation of the “initial conditions”, describing the neighborhood about one or more active atoms

4. a semi-structured event language for propagation and caching of state transition information, prefactors and factors (in atemporal, i.e., spatial, abstractions), and mixture coefficients for the multi-time model

5. functional, parametric, or constraint network-based representations of simulations: these propagate information from higher and lower levels of a hierarchical temporal model such as a hierarchical hidden Markov model (HHMM)

The upper three boxes comprise the “user data”, whether given in configuration files or interactively; the lower box contains computed inputs and all of the variational factors that can be solved for within one atomistic simulation (at a single time “slice”).

The middle box decomposes the technical objective of finding the right temporal abstraction to simulate long(er)-term events into pattern recognition (inference), representation, and learning aspects. We now document these aspects. For clarity, we proceed in the order of representation (Section 4.1), learning (Section 4.2), and parameter estimation by inference (Section 4.3).

4.1 Representation: General Infrastructure for Nanoscale Materials Process Simulations

This section describes the proposed extensions to the computational framework on the input end that lead us to a new 3-D, multi-time model for material evolution and other phenomena (the central “representation” box of the middle layer in Figure 2).

In order to develop a data model that facilitates learning and multiple and mixed time scales, we must first consider the limitations of current practice:

1. Simulators of 3-D material evolution (simple homoepitaxial growth and more complex varieties, wire growth, self-assembly, etc.) currently use a “sweep-plane” representation that propagates the effects of isolated 2-D computations. While this approach is easier to parallelize using boundary methods, it cannot capture all the 3-D effects, such as some vertical jumps, that we are interested in. This has an unwanted side effect on multi-time modeling: for many growth and diffusion processes, the vertical axis is tied to the time parameter. Iterating at a fixed layer “thickness” means iterating at a uniform time granularity, whereas we seek to develop an adaptive multi-time model.

2. 3-D materials processes are numerous, and those that are being computationally simulated are becoming more complex. As we progress towards multi-time models, organizing and interfacing simulators will present an increasingly difficult information management task. An extensible ontology and semi-structured data model, such as the investigators are developing for the thin film and epitaxial growth domains, is needed for a more general family of phenomena.

A corollary of the above points is that the spatial state specification needs to be extended to a spherical neighborhood, rather than the “cylindrical” or “stack of hexes” neighborhood indicated by Figure 1. For this purpose we will adapt uniform (voxel) and adaptive (octree) representations [Foley et al., 1996] from the field of volume graphics. Voxel-based representations are more data parallel and easier to manipulate, but adaptive spatial decomposition makes a tradeoff between increased bookkeeping overhead and potential savings in dealing with volumes in bulk. We face this same representational tradeoff in designing our adaptive multi-time representation.

Therefore, making synergistic design choices for the process specification and geometric state specification (the upper level of boxes in Figure 2) has a high impact on the whole framework: not only on the simulation, but also on inductively learning from the outputs and reasoning with them to iteratively improve the adaptive subdivision. The consequence of this choice is that with the right (generative) data model, we can bootstrap the process of searching for a good spatial and temporal abstraction.

Probabilistic graphical models [1] provide one such generative representation. We will now consider a framework for learning them from simulation data and outline some challenges.

4.2 Learning: Graphical Models for New Applications of Parallel Kinetic Monte Carlo

Figure 5 depicts a typical BNJ workflow for learning graphical models from scientific (or industrial) data. In the next section we discuss the combined application of learning and inference modules in BNJ and give a list of technical development milestones.

Key technologies applied in the deployment of such an experimenter’s workbench are:

1. Parallel, distributed computation

2. Reusable software modules for high-level performance tuning in learning graphical models

3. A semi-structured data format for BNs (the XML Bayesian Network Interchange Format)

[1] The term “graphical” is overloaded in this context: in the cross-cutting fields of probabilistic reasoning and machine learning, it refers to models and algorithms based on graph (i.e., network) representations, not necessarily to computer graphics and graphical user interfaces (GUIs).

Figure 6 depicts a traditional dynamic Bayesian network (DBN), represented using a two-time-slice temporal Bayesian network and applied using an unrolling procedure as shown. Circles denote hidden states in this figure. Squares, which usually denote controllable quantities in a decision network, here represent observable (possibly controllable) variables, including energies of activation and prefactors.


Figure 5. Example BNJ workflow for learning graphical models from data.

Figure 6. Example dynamic Bayesian network for a generic material evolution process.

Specific important challenges to learning of temporal graphical models include:

1. aggregating and abstracting over time slices (as depicted by the inner rounded box labeled “temporal abstraction”) to obtain a multi-time model

2. incorporating other models of time: continuous time, extended lag or “long time lag” models, etc.

In addition, spatial decomposition for partial evaluation and modeling of 3-D geometry (not shown) are parts of our current work that will need to be integrated into the multi-time models, as described in Section 2.

4.3 Inference: Graphical Models for Estimation and Prediction at Mixed Time Scales

(Entries below read: general function > specific technique: software module status / year to be implemented.)

Inference
  Exact
    Clustering (Junction Tree): infrastructure complete (optimizations, animation in BNJ v3.1)
    Conditioning: infrastructure complete (optimizations still needed)
    Variable Elimination: infrastructure complete (optimizations still needed)
    Pointwise Product: Year 1
  Approximate
    Bounded loop cutset conditioning: Year 1
    Stochastic Sampling: adaptive importance sampling (Cheng and Druzdzel, 2000) done; others in progress
    Other (hybrids): Years 1-3

Learning
  Structure
    Constraint-based: Year 1 (CMU Causality Lab integration)
    Score-based: K2 done; others in Years 1-3
  Distributions: Years 1-2; gradient descent done; forward-backward algorithm in progress

Representation
  Object-oriented and relational (PRMs, OOBNs, etc.): Year 2
  Dynamic Bayesian networks: Years 1-2; Boyen-Koller, factored frontier
  Decision networks, influence diagrams: infrastructure extended; inference and learning, Years 1-2
  Continuous chance nodes: Year 1
  Continuous time: Years 2-3
  New and hybrid representations
    Hierarchical hidden Markov models: Years 1-3
    ILP, object/relational models: Years 1-3
    Latent variable models (Meek/Chickering, PC, FCI): Years 1-3

Applications
  3-D diffusion: initial implementation in Year 1
  Crack propagation: Years 1-2; initial implementation in first 18 months
  Other 3-D nanostructures (nanowire/nanotube): Years 2-3

Table 1. BNJ feature overview and development timetable for the EMT project.


This section briefly surveys the current state of practice in this area of research and the technological needs that must be addressed to meet the goal of developing graphical models and providing a usable software toolkit to the computational physics user community.

General-purpose software tools for data mining are abundant, but similar tools for learning models from data are not as accessible to computational science and engineering researchers and students. Outside the cross-cutting disciplines of artificial intelligence and statistical computation, this proves to be an educational gap. Specifically, many curricula in knowledge discovery in databases (KDD) using real data have covered only a few aspects of Naïve Bayes, clustering, and Bayesian statistics, and, in the case of time series prediction, autoregressive integrated moving average (ARIMA) process models and some simple state transition models, usually without covering learning and inference. As a general practice, most such teaching programs have not incorporated learning and inference in graphical models, even where efficient and scalable algorithms are available.


Table 1 lists the learning and inference modules that make up the middle section of Figure 2. Those modules that are most directly relevant to this EMT project are bolded.

4.4 Value Added to EMT: Beyond Material Evolution

A significant part of this proposal is to develop the framework for bridging the gap between the length scales feasible for ab initio calculations of complex systems and those necessary for examining processes such as the diffusion of atoms and vacancies in grain boundaries in Fe and Fe-Ni based alloys. One way to bridge this gap is through a combination of ab initio electronic structure calculation with kinetic Monte Carlo (kMC) simulations. In the proposed work we will first consider the simpler case of diffusion in grain boundaries in the homogeneous system consisting of pure Fe, and then move on to Fe-Ni based alloys. For both systems, activation energy barriers for selected cases will be obtained from ab initio calculations of the total energy using the nudged elastic band method [Johnson et al., 1998] and compared with those from model potentials. While this is a relatively new and reliable technique, it has already found its way into the state-of-the-art electronic structure codes used by Rahman and Kara.

The calculations for multicomponent systems are computationally intensive, but they have become feasible on present-day computers. Some model interaction potentials are already available and will be used. The proposed ab initio calculations will also facilitate the development of robust interatomic potentials for further application to the projects proposed here.

At the initial stage, energy barriers for important processes will provide the input for a kMC simulation of the diffusion of atoms and vacancies in the alloys under consideration, as a function of temperature. These calculations will be done using the model potentials. Classical molecular dynamics simulations will be carried out to obtain further insight into the importance and relevance of additional diffusion mechanisms at elevated temperatures, since ab initio results are most appropriate for low temperatures. Activation energy barriers for these new processes will then be obtained from ab initio calculations, and the database for kMC simulations will be enlarged. With further refinement, these sets of calculations at several length and time scales will greatly enhance our knowledge of the characteristics of diffusion in these alloys. Available experimental data on these systems will attest to the validity of the calculations; the calculations in turn will motivate experimentalists to investigate these processes further under controlled conditions.
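The core of a kMC step of the kind described above can be sketched compactly. The sketch below shows a standard rejection-free (BKL-style) event selection with Arrhenius rates; the barrier values and the attempt frequency are hypothetical placeholders for the ab initio and model-potential inputs discussed in the text, not values from this project:

```python
# Minimal sketch of one rejection-free kinetic Monte Carlo step for
# vacancy/atom hopping. Barriers (eV) and attempt frequency are assumed
# placeholders for the ab initio / model-potential values described above.
import math
import random

K_B = 8.617e-5          # Boltzmann constant, eV/K
NU = 1.0e13             # attempt frequency, 1/s (assumed)

def rate(barrier_ev, temperature_k):
    """Arrhenius rate for a single hop process."""
    return NU * math.exp(-barrier_ev / (K_B * temperature_k))

def kmc_step(barriers_ev, temperature_k, rng=random.random):
    """Select one event with probability proportional to its rate.
    Returns (event index, stochastic time increment in seconds)."""
    rates = [rate(b, temperature_k) for b in barriers_ev]
    total = sum(rates)
    r = rng() * total
    acc = 0.0
    for i, k in enumerate(rates):
        acc += k
        if r <= acc:
            break
    dt = -math.log(rng()) / total   # exponentially distributed waiting time
    return i, dt
```

Because low barriers dominate the rate catalog exponentially, the fidelity of the ab initio barrier database directly controls which diffusion mechanisms the simulation explores.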

Results from the ab initio study described in this proposal and the model interatomic potentials that we propose to develop will set the stage for a detailed, temperature-dependent investigation of fracture-related phenomena in steel alloys as a function of composition, segregation, and stoichiometry. These large-scale atomistic simulations will be carried out in several stages:

- MC simulations to determine segregation profiles at and near grain boundaries
- kMC simulations with energy barriers calculated using ab initio techniques and model potentials

For the study of fracture dynamics in systems containing grain boundaries, it is desirable to obtain a realistic, temperature-dependent configuration near and at the grain boundaries. It is well known that near a grain boundary or a surface, the concentration of the two elements in a binary alloy differs from that of the bulk, resulting in strong segregation profiles. We propose to perform grand canonical MC simulations for a series of cells containing different types of (symmetrical tilt) grain boundaries, at different Fe-Ni stoichiometries and temperatures. For each system and temperature, energetics and dynamics will be monitored and analyzed to extract information about the mode of crack propagation and fracture. Of particular interest will be the region near the crack front, where dislocation nucleation and motion are expected, as well as the general behavior, at the atomic level, of the displacement field near the slip plane.
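The acceptance step at the heart of such an alloy MC simulation can be sketched as follows. This is a deliberately simplified semi-grand-canonical Metropolis move on a 1-D lattice with assumed pair energies; the pair-energy table, chemical-potential difference, and lattice geometry are all hypothetical stand-ins for the real Fe-Ni cells described above:

```python
# Minimal sketch of a semi-grand-canonical Metropolis species flip for a
# binary alloy (0 = Fe, 1 = Ni) on a periodic 1-D lattice. Pair energies
# `eps` and chemical-potential difference `dmu` are assumed placeholders.
import math
import random

K_B = 8.617e-5   # Boltzmann constant, eV/K

def site_energy(lattice, i, eps):
    """Sum of pair energies between site i and its two 1-D neighbors."""
    n = len(lattice)
    return sum(eps[(lattice[i], lattice[j])]
               for j in ((i - 1) % n, (i + 1) % n))

def metropolis_flip(lattice, i, eps, dmu, t_k, rng=random.random):
    """Attempt to flip the species at site i; return True if accepted."""
    old = site_energy(lattice, i, eps)
    lattice[i] ^= 1
    new = site_energy(lattice, i, eps)
    # chemical-potential work term for changing the species count
    d_e = new - old + (dmu if lattice[i] == 1 else -dmu)
    if d_e <= 0 or rng() < math.exp(-d_e / (K_B * t_k)):
        return True
    lattice[i] ^= 1   # reject: restore previous species
    return False
```

Repeating such moves to equilibrium at each temperature and stoichiometry is what produces the segregation profiles near the grain boundary that the fracture study needs as input.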

A new feature we will add to our approach, in which ab initio techniques and model potentials are used in tandem, is a continuous quality test of the model potentials used in the simulations. This will be done by regularly taking newly developed structures from the fracture dynamics and comparing the energetics and forces from the model potentials with those obtained ab initio. If large discrepancies are observed, the model potentials will be re-fit, incorporating the newly calculated ab initio energies and forces, and the procedure will be continued until convergence is reached. This quality test and improvement of the tailored model potentials should yield robust results with predictive power.

In the proposed studies we will calculate the pre-exponential factors for the diffusion of Fe, Ni, and vacancies at grain boundaries. A number of steps will be involved in accurate, realistic calculations of the diffusion coefficient. Note that the kMC simulations proposed above, with ab initio calculations of the activation energy barriers for possible diffusion mechanisms, will not give any information about the diffusion prefactor. We are thus proposing the development of robust model potentials and their use in calculating the vibrational entropy contribution to the diffusion coefficient; the details can be found elsewhere [Kuerpick et al., 1997]. Recently, we have been engaged in a number of advanced and accelerated computational techniques [Rahman et al., 2004], such as kinetic Monte Carlo, the nudged elastic band method for calculation of diffusion paths and barriers, and accelerated molecular dynamics simulations for examining growth processes on metal surfaces. We propose to apply these techniques to examine the process of diffusion in the various phases of Fe-Ni.
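The vibrational contribution to the prefactor is conventionally computed in the harmonic (Vineyard) approximation, which the proposal's reference to vibrational entropy points toward. A minimal sketch, with hypothetical mode frequencies and geometry, is:

```python
# Minimal sketch of a harmonic (Vineyard-style) diffusion prefactor and the
# resulting Arrhenius diffusion coefficient. All numeric inputs in the
# usage below are hypothetical placeholders, not project results.
import math

def vineyard_prefactor(freqs_initial, freqs_saddle):
    """Effective attempt frequency: the ratio of the product of the 3N
    normal-mode frequencies at the initial state to the product of the
    3N-1 real modes at the saddle point (the imaginary reaction-path
    mode is omitted). Frequencies must share one unit, e.g. THz."""
    return math.prod(freqs_initial) / math.prod(freqs_saddle)

def diffusion_coefficient(prefactor_hz, hop_dist_m, barrier_ev, t_k,
                          k_b=8.617e-5, dim=3):
    """Arrhenius form D = (a^2 * nu / (2 d)) * exp(-E_a / (k_B T))."""
    return (hop_dist_m ** 2 * prefactor_hz / (2 * dim)
            * math.exp(-barrier_ev / (k_b * t_k)))
```

This separation makes explicit why the kMC barrier database alone is insufficient: the frequencies entering the prefactor require the model potentials proposed here.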

4.5 Evaluation Plan

Our evaluation approach can be divided into model development, refinement, and application phases.

DEVELOPMENT: The first 12 months shall produce the temporal data models (event language) and a standardized, documented training corpus for learning temporal probabilistic models from our own simulators for material evolution, generalizing to 3-D boundary diffusion. Meanwhile, we will adapt data models for crack propagation and other surface defects. In an overlapping 18-month phase, we will develop new simulators using the DBN structure learning and inference algorithms.
EVALUATION: We will develop algorithms for both learning and inference in graphical models and compare them to existing ones for DBN structure learning and for exact and approximate inference. This project focuses on extending structure learning to relational, spatial, and multi-time models and on evaluating robustness by statistical validation of discovered models. We propose to generalize and refine existing evaluation methods by (i) conducting ablation studies to test the graceful degradation of the system under resource bounds; and (ii) checking dynamical models against samples of exact computations. Because bootstrap methods for model evaluation are computationally intensive, we will use high-performance Grid applications, which we are developing on the existing cluster facility, to carry out these experiments.
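The bootstrap evaluation referred to above follows a standard pattern: resample the data with replacement, rescore the model on each replicate, and report a percentile interval. A minimal sketch, with the scoring function left as a caller-supplied stand-in for any model-evaluation metric, is:

```python
# Minimal sketch of percentile-bootstrap model evaluation: resample the
# data with replacement, rescore on each replicate, and return an
# approximate (1 - alpha) confidence interval. The score function is a
# hypothetical stand-in for any model-quality metric.
import random

def bootstrap_ci(data, score, n_boot=1000, alpha=0.05, seed=0):
    rng = random.Random(seed)
    scores = sorted(
        score([rng.choice(data) for _ in range(len(data))])
        for _ in range(n_boot))
    lo = scores[int(alpha / 2 * n_boot)]
    hi = scores[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

Each of the `n_boot` replicates requires a full rescoring pass, which is the computational cost that motivates the batch Grid infrastructure described in the evaluation plan.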

APPLICATION: To validate the resulting models, we will use exact parameters for the external energy function as calculated using conjugate gradient, molecular dynamics (MD) cooling, and global Markov Chain Monte Carlo (MCMC) optimization when computationally feasible. Model application is intended to lead to a process of iterative model improvement, wherein time series learning, representation, and reasoning are refined together.

5. Broader Impacts

5.1 Value Added beyond EMT

Probabilistic graphical models have been used in numerous recent applications to classification, forecasting, and causal or compositional inference in many domains. These include ecological, economic, climate, spectroscopic, and medical data, in all of which Bayesian networks have been shown to perform well at the above tasks. The PI recently received an NSF EPSCoR First Award (June 2002 to August 2003) for a research project on building probabilistic network models of cell cycle-regulated genes in yeast. Several additional research projects at KSU, Iowa State, and CMU focus on algorithms for learning graphical models from data. We intend to continue this work through the three years of this project, exposing students to open research problems and current challenges. Bayesian network structure learning is an important but intractable subproblem of probabilistic reasoning. Greedy score-based algorithms for structure learning are therefore sometimes used, but these are sensitive to the order in which variables are scored. Unfortunately, finding the optimal ordering of inputs entails search through the permutation space of variables. Furthermore, in real-world applications of structure learning, the gold-standard (ground-truth) network is typically unknown. In response, we have developed a scoring method for orderings that uses a well-known greedy algorithm for structure learning (K2) and exact and approximate inferential loss, given specified evidence. [Hsu et al., 2002] describes how this scoring module fits within an optimization framework that evaluates the fitness of variable orderings.
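The order sensitivity of K2 is visible in its greedy parent-selection loop, sketched below. The `score` argument stands in for the Bayesian (K2) scoring metric and is supplied by the caller; this sketch shows only the structural search, not the metric itself:

```python
# Minimal sketch of the greedy K2 parent search for one variable ordering.
# Only variables earlier in the ordering may become parents, which is why
# the resulting structure depends on the ordering. `score(v, parents)` is
# a hypothetical stand-in for the K2 (Bayesian) scoring metric.
def k2_structure(ordering, score, max_parents=2):
    """Return {variable: set of parents} for the given ordering."""
    parents = {v: set() for v in ordering}
    for i, v in enumerate(ordering):
        candidates = set(ordering[:i])          # only earlier variables
        best = score(v, parents[v])
        while candidates and len(parents[v]) < max_parents:
            gains = {c: score(v, parents[v] | {c}) for c in candidates}
            c_best = max(gains, key=gains.get)
            if gains[c_best] <= best:
                break                            # no candidate improves score
            best = gains[c_best]
            parents[v].add(c_best)
            candidates.remove(c_best)
    return parents
```

Running this loop under different orderings and comparing the resulting networks by inferential loss is, in outline, what the ordering-scoring method described above evaluates.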

On the computer science side, our project has three points of greatest educational impact:



- Fundamental concepts of graphical models: There are many courseware packages that introduce undergraduates to concepts of graph theory and probability theory, but integrative introductions to graphical models of probability are rare and tend to focus on just a few features. Furthermore, there are few such tools in the public domain; CMU's Causal and Statistical Reasoning tutor [Glymour and Scheines, 2004] is, to our knowledge, the only one that can demonstrate fundamental concepts using general-purpose networks.



- Learning and inference: The primary value added by BNJ is the ability to demonstrate and experiment with many algorithms for reasoning with graphical models and for learning model structure and parameters from data. This is presented in a single, unified framework with many reusable, extendable classes [Kruger, 1992; Gamma et al., 1995] for visualization and error measurement.



- Interfacing to real databases and working with network and data formats: BNJ can help introduce students to semistructured data, using a new XML Bayesian Network format that integrates and converts among existing formats from Microsoft Research, Hugin, Netica, etc., including several legacy formats; it can also use WEKA's ARFF format for training and inference data.

Applications, examples, and student workbooks developed using BNJ will give students in the computational sciences and applied mathematics an opportunity to study, produce, and use real models in interactive exercises. BNJ will also support guided study with a variety of example networks, including not only small networks from many online repositories but also network structures and distributions automatically generated to specification or learned from real-world data. This real-world data includes experimental data produced by research uses of BNJ; data from such projects is already being used in an educational setting in advanced undergraduate and graduate-level courses on machine learning and data mining. Equally important, each tutorial on an algorithm will include a discussion of the known conditions for its reliability. As part of this EMT project, we will develop supplementary data sets and a courseware module for BNJ that uses them to illustrate practical applications of data preparation, learning, and inference in graphical models.

Desired outcomes and benefits from use of BNJ-based materials

EDUCATIONAL BENEFITS: Our pedagogical goals are to help students attain understanding of and interest in the areas of computational science and engineering (CSE) and the applied mathematics that relates to CSE. Higher education should train students to be effective solvers and posers of problems. This is the focal challenge in every discipline of mathematics, science, and engineering. In the developing field of computational genomics, however, new application domains are identified and methodology advances continuously and very rapidly. This poses a novel and complex challenge: we must prepare students for lifelong learning in the computational sciences by imbuing them not only with technical background but with enthusiastic interest in the subject matter and its theoretical and practical significance. Specifically, we must help undergraduates in computational physics to develop an awareness of computer science and applied mathematics as important facets of their discipline and to appreciate them as integrative subjects for professional practice and potential postgraduate study.

REUSE: The most significant side benefit of the proposed work is our plan to develop an extensible architecture and reuse library for research in materials evolution, 3-D propagation of cracks, and other nanoscale processes. The architecture is based upon data models and application programmer interfaces (APIs) that we have developed and are continuing to refine in our research. Into this framework, our student participants can incorporate codes that implement algorithms for analyzing data, then test, correct, compare, and refine them. The suite of software tools shall in turn serve as a foundation for training users to design and carry out experiments in scientific applications of graphical models by writing or adapting programs for their own use. Transfer of this training technology to other computational science and engineering curricula, to serve the aims stated in Section 1, is one measure of success in achieving broader impacts.

We have developed new courses and a formal interdisciplinary curriculum at both the graduate and undergraduate levels and integrated them into a research program that emphasizes rigorous specification, development, and assessment of intelligent systems for computational science and engineering. We are delivering these to both traditional campus-based students and remote students, some of whom we anticipate will be veterans of the computational science and engineering industry seeking continuing education. We are also working to develop programs for outreach at pre-collegiate levels. Our desired pedagogical benefits, concrete outcomes, and approaches are:



- Curriculum improvement: courses, degree programs, and materials to facilitate active learning
- Early educational outreach: improved recruitment and retention of underrepresented groups
- Undergraduate involvement in research: collaborative experience; technology transfer
- Increased competence: mentoring from the pre-collegiate through the postdoctoral level

Test Sites and Advisory Group

We propose to leverage our existing research by incorporating our basic theoretical advances in machine learning and probabilistic reasoning into our intelligent systems courses, demonstrating their benefit to students through hands-on development experience with the software packages we have developed: Machine Learning in Java (MLJ) and Bayesian Network Tools in Java (BNJ). This leads to concurrent engineering of our research codes with the educational code base, offering undergraduates and new graduate students the chance to learn about state-of-the-field software tools from the original developers. We have found that providing visual explanations of technologies to students facilitates active learning by allowing them to interact with models, data, or algorithms. We shall provide students with visual programming infrastructures for the development of distributed, high-performance KDD and collaborative filtering systems and require them to develop user interfaces and visualizations of models. We expect the benefits of this visualization approach to accrue to new interdisciplinary teaching programs in problem solving in the colleges of engineering, arts and sciences, agriculture, and architecture.

The PI at KSU has been using BNJ in an introductory course in AI, an undergraduate course in data mining, and a graduate course on machine learning. To date, a total of 15 institutions (including KSU, Iowa State, and CMU) have elected to adopt BNJ as test sites for at least the duration of the proposed 3-year project. Other universities and companies have also made informal commitments to use BNJ on a trial basis. BNJ v3 is to be piloted as a teaching tool at a total of 25 universities.

5.2 Dissemination Plan for Research and Educational Modules

External Evaluation and Coordination with EMT Technical Leads

To address the comprehensive nature of this proposed project, the evaluation plan is multidimensional and is formed around three main elements: 1) the learning experience for the student; 2) the research environment; and 3) educational outreach. The framework of the evaluation design is based on the logic model recommended by NSF. The logic model highlights the breadth and depth of the possible impacts of the BNJ program as well as the long-term nature of the project. The model provides a visual representation of the interrelated aspects of the project and aligns with the Principal Investigators' approach to research on student understanding, development of strategies for building capacity in computational science and engineering (CSE), and the extensive approach to the development of instructional materials.

The outcomes of the BNJ program will be expressed in terms of student motivation and learning, and changes in faculty knowledge and application of the BNJ toolkit. The evaluation will document the impact on undergraduate students, graduate students, and faculty. The program can also be expected to influence curriculum and the general culture of the learning environment. Thus, the evaluation will capture both anticipated and unanticipated changes the program may bring about. The evaluation will use a variety of indicators to measure the breadth and depth of the impact of the Bayesian Network tools in Java on individuals and partnering institutions.

The evaluation plan is consistent with NSF and other professional guidelines on evaluation for a project of this magnitude. Ongoing data generation and analysis (formative feedback) will be provided to the project leadership. This system of recursive review and analysis will allow the project leaders to make necessary modifications to the program activities and products throughout the implementation of the project's goals. Summative evaluation feedback will be provided to the project personnel annually and at NSF's request.

TECHNOLOGY TRANSFER: Our key dissemination effort is the development of research codes applicable to real-world learning and inference problems, including collaborative filtering and the specific computational science and engineering applications we have discussed. This began over two years ago with the first experimental prototype of Bayesian Network tools in Java. BNJ has been downloaded over 5000 times from SourceForge since 04 May 2002 and has over 250 registered users worldwide at the time of this writing. We anticipate that these codes will be refined over the next three years through interaction with our local and international collaborators and with instructors of KDD-related courses at this and other universities, as documented in the supplementary letters of support.

LOCAL AND INTERNATIONAL MENTORING: In addition to funneling production-level research and development tools such as BNJ and MLJ back into the classroom, we have devoted focused effort to mentoring students with the potential to conduct research and become teachers in our subject area. This began in 1998 with our supervision of graduate research assistants and undergraduate programmers at NCSA and continued with our participation (1999-2000) in the Engineering Learning Enhancement Action/Resource Network (LEA/RN). Since 2001, the PI has served as the computer science undergraduate honors advisor, organized spring seminars and summer workshops for this program, mentored a student who received a Goldwater Scholarship and an NSF Graduate Fellowship for her proposed contributions to BNJ, and served as faculty advisor on a 2-student project in the Computing Research Association Collaborative Research Experience for Women (CREW) program (2002-2003). We also intend to work with undergraduates from the KSU Developing Scholars Program (DSP), which is devoted to research experiences for students from underrepresented groups.

All faculty participants will incorporate postgraduate and postdoctoral mentoring into synergistic activities such as interdepartmental seminars, an activity supported by our respective universities.

DEVELOPMENT OF SOFTWARE TOOLS AND INFRASTRUCTURE: Table 1 in Section 3.1 lists the courseware modules to be implemented. To produce the tutorials and student workbook, we will develop visualization classes, front-end applications, example networks, and examples generated by recording learning and inference using these networks. Most of these tutorials emphasize fundamental theory and published algorithms, as Table 1 illustrates, but recently published research algorithms for structure learning and for exact and approximate inference will also be included to give students exposure to comparative experimental methods. This project focuses on extending structure learning to semi-structured relational models and on evaluating robustness by statistical validation of discovered models. We propose to generalize and refine existing evaluation methods by (i) validating more model features; and (ii) checking automatically extracted models against published gold-standard networks. Because bootstrap methods for model evaluation are computationally intensive, the BNJ development team will develop a high-performance Grid interface to support batch (non-interactive) benchmarking and experimentation with the core BNJ infrastructure.

EDUCATIONAL OUTREACH: Our integrative research and education program includes planned demonstrations, workshops, and web learning materials for 8th-12th grade outreach. We aim to develop workshop activities for our university's science and technology program for teen women, and one of the co-PIs (Hsu) has administered the summer science institute for high school juniors and seniors in our department for the past two years. The latter has about 40% female participation, significantly higher than the admission or retention rates in the CS undergraduate program. We have noted a high level of interest among the women in such summer programs who are prospective majors in our CS program or in the computational physical sciences.

We introduce undergraduates, as early as their second year, to AI, machine learning, simulation, and visualization algorithms that they help implement and use in experiments. Furthermore, we believe that early undergraduate research experiences are a key to the retention of female students and other underrepresented groups in engineering, giving them exposure to both theoretical background and commercial and industrial applications. Our efforts are supported by collaboration with our university's Women in Engineering and Science Program. As a research scientist at NCSA, the KSU PI (Hsu) was able to hire 40-60% women research programmers at the 11th-12th grade through undergraduate levels. To bring the participation of women closer to this level in our programs and to encourage retention through graduate study, we have regularly lectured at our undergraduate engineering honors seminar and our ethics and applications seminar (KSU CIS 492: Computers and Society). The PI has served since 2001 as departmental honors chair. We are also planning integrative research experiences for undergraduates, including but not limited to honors students. All but one of the upper-division courses offered in our KDD program are open to undergraduates.

DISTRIBUTION OF EMT SOFTWARE TOOLS: The BNJ web site [Hsu et al., 2004] is http://bnj.sourceforge.net. Dissemination of results shall be achieved by development and distribution of open-source courseware built upon the KSU BNJ and CMU Causality Lab infrastructure, together with models in an open, semi-structured data format (the XML Bayesian Network Interchange Format). The software modules shall be added to BNJ, an open-source toolkit developed and disseminated by the research lab of one of the co-PIs (Hsu). Meanwhile, training materials produced for and as part of this research program, in the form of digitally recorded lectures, electronic manuals, and tutorials for the BNJ software, will be freely available. We will distribute these through our web-based distance learning infrastructure, as we have done for the past two years. A web and file server, tape backup system, and DVD+RW drive to be used for dissemination of the electronic materials and courseware are budgeted as small computer items, and the expenditure schedule is given in the budget justification.

Integrating and complementing our efforts toward dissemination and outreach is the planned development of web training materials on graphical models and on applied machine learning and probabilistic reasoning. We shall provide Java source code from our source code control (CVS) repository and from multiple mirrors of the open-source developer network, SourceForge. Rather than providing a heterogeneous, disorganized mixture of implementations of learning algorithms, our courseware project undertakes to document these algorithms in the simplest and clearest way, by illustrating common data structures and representations of hypotheses using many common technical computing tools. Our goal is to bring probabilistic learning and reasoning, and the tools supporting our proposed research, to as wide an audience as possible. Therefore we use a complementary approach of developing open-source software in general-purpose imperative languages such as Java and of prototyping in technical computing languages such as MATLAB, Mathematica, and R. This trades off the portability and interoperability of Java against the readability and accessibility of MATLAB and similar languages to students who may be new to both programming and intelligent systems.

WORKSHOPS: Regional workshops hosted on the KSU campus in Manhattan, KS, or at the University of Kansas test site in Lawrence, KS, will permit the lead PIs to meet annually for project coordination. The Kansas location is proximate to KSU and within short travel distance of several of the test sites. The workshops will include presentations and discussion panels on BNJ development and on instructional use of BNJ, including in research seminars and graduate-level courses. The PI (Hsu) has successfully proposed, organized, and hosted three such satellite workshops at international conferences (IJCAI 2001, AAAI/UAI/KDD 2002, IJCAI 2003). Organizational costs for these workshops, including print copies of the workshop proceedings, documentation, mailings and postage, and presentation equipment rental, will be covered by a nominal registration fee ($25 per attendee) for professional and faculty attendees not from the advisory group or test site group. Students will not be assessed a registration fee; instead, the nominal cost of their proceedings will be covered out of the requested operating budget.