Automatic Composition of Transition-based Semantic Web Services with Messaging

Daniela Berardi¹, Diego Calvanese², Giuseppe De Giacomo¹, Richard Hull³, Massimo Mecella¹
¹ Università di Roma “La Sapienza”, <lastname>@dis.uniroma1.it
² Libera Università di Bolzano/Bozen, calvanese@inf.unibz.it
³ Bell Labs, Lucent Technologies, hull@lucent.com
Abstract: In this paper we present Colombo, a framework in which web services are characterized in terms of (i) the atomic processes (i.e., operations) they can perform; (ii) their impact on the “real world” (modeled as a relational database); (iii) their transition-based behavior; and (iv) the messages they can send and receive (from/to other web services and “human” clients). As such, Colombo combines key elements from the standards and research literature on (semantic) web services. Using Colombo, we study the problem of automatic service composition (synthesis) and devise a sound, complete and terminating algorithm for building a composite service. Specifically, the paper develops (i) a technique for handling the data, which ranges over an infinite domain, in a finite, symbolic way, and (ii) a technique to automatically synthesize composite web services, based on Propositional Dynamic Logic.
1 Introduction
Service Oriented Computing (SOC [1]) is the computing paradigm that utilizes web services (also called e-Services or, simply, services) as fundamental elements for realizing distributed applications/solutions. Web services are self-describing, platform-agnostic computational elements that support rapid, low-cost and easy composition of loosely coupled distributed applications.
SOC poses many challenging research issues, the most hyped one being web service composition. Web service composition addresses the situation in which a client request cannot be satisfied by any single available service, but can be satisfied by suitably combining “parts of” available services. Composition involves two different issues [1]. The first, typically called composition synthesis, is concerned with synthesizing a specification of how to coordinate the component services to fulfill the client request. Such a specification can be produced either automatically, i.e., using a tool that implements a composition algorithm, or manually by a human. The second issue, often referred to as orchestration, is concerned with how to actually achieve the coordination among services, by executing the specification produced by the composition synthesis and by suitably supervising and monitoring both the control flow and the data flow among the involved services. Orchestration has been widely addressed by other research areas, and most of the work on service orchestration is based on research in workflows.
In this paper we address the problem of automatic composition synthesis of web services. Specifically, we introduce an abstract model, called Colombo, that combines four fundamental aspects of web services, namely: (i) A world state, representing the “real world”, viewed as a database instance over a relational database schema, referred to as the world schema. This is similar to the family of “fluents” found in semantic web services models such as OWL-S [15], and more generally, found in situation calculi [17]. (ii) Atomic processes (i.e., operations), which can access and modify the world state, and may include conditional effects and non-determinism. These are inspired by the atomic processes of OWL-S. (iii) Message passing, including a simple notion of ports and links, as found in web services standards (e.g., WSDL [3], BPEL4WS [2]) and some formal investigations (e.g., [6,10]). (iv) The behavior of web services (which may involve multiple atomic processes and message-passing activities), specified using a finite state transition system, in the spirit of [5,6,10]. The first three elements parallel in several respects the core elements of the emerging SWSL (Semantic Web Service Language) ontology for semantic web services [11]. The fourth element provides an abstract approach to formally model the internal process model of a web service, also reflected as an option in the SWSL ontology.
We also assume that: (v) Each web service instance has a “local store”, used to capture parameter values of incoming messages and the output values of atomic processes, and used to populate the parameters of outgoing messages and the input parameters of atomic processes. Conditional branching in a web service will be based on the values of the local store variables at a given time. (The conditions in atomic process conditional effects are based on both the world state and the parameter values used to invoke the process.) (vi) Finally, we introduce a simple form of integrity constraints on the world state.
A client of a web service interacts with it by repeatedly sending and receiving messages, until a certain situation is reached. In other words, the client behavior can also be abstractly represented as a transition system.
In order to address the problem of automatic web service composition, we introduce the notion of “goal service”, denoting the behavior of a desired composite service: it is specified as a transition-based web service that interacts with a client and invokes atomic processes. Our challenge is to build a mediator, which uses messages to interact with pre-existing web services (e.g., in an extended UDDI directory), such that the overall behavior of the mediated system faithfully simulates the behavior of the goal service.
The contribution of this paper is multifold: (i) Colombo unifies and extends the most important frameworks for services and service composition; (ii) it presents a technique to reduce infinite data values to finite symbolic data; (iii) it exploits and extends techniques based on Propositional Dynamic Logic to automatically synthesize a composite service (see [5]), under certain assumptions (we refer to this setting as Colombo$^{k,b}$); (iv) it provides an upper bound on the complexity of this problem. To the best of our knowledge, the work reported in this paper is the first one proposing an algorithm for web service composition where web services are described in terms of (i) atomic processes, (ii) transition-based process models, (iii) their impact on a database representing the “real world”, and (iv) message-based communication. As stated in [13], Service Oriented Computing can play a major role in transaction-based data management systems, since web services can be exploited to access and filter data. The framework developed in this paper shows the feasibility of such an idea.
BPEL4WS [2] allows for (manually) specifying the coordination among multiple web services, expressed in WSDL. The data manipulation internal to web services is based on a “blackboard approach”, i.e., a set of variables that are shared within each orchestration instance. Thus, on the one hand BPEL4WS provides constructs for dealing with data flow, but on the other hand, it has no notion of world state.
OWL-S [15] is an ontology language for describing semantic web services, in terms of their inputs, outputs, preconditions and (possibly conditional) effects, and of their process model. On the one hand OWL-S allows for capturing the notion of world state as a set of fluents, but on the other hand it is not clear how to deal with data flow (within the process model).
Several works on automatic composition of OWL-S services exist, e.g., [16,18]. Most results are based on the idea of sequentially composing the available web services, which are considered as black boxes, and hence atomically executed. Such an approach to composition is tightly related to Classical Planning in AI. Consequently, most goals express conditions on the real world that characterize the situation to be reached: therefore, the automatically devised composition can be exploited only once, by the client that has specified the goal. Conversely, in Colombo the goal is a specification of the transition system characterizing the process of a desired composite web service. Thus, it can be re-used by several clients that want to execute that web service.
Colombo extends the Roman model, presented in [5], mainly by introducing data and communication capabilities based on messages. The level of abstraction taken in [5] focuses on (deterministic, atomic) actions; therefore, the transition system representing web service behavior is deterministic. Also, all the interactions are carried out through action invocation, instead of message passing. Finally, in [5] there is no difference between the transition system representing the client behavior and the one specifying the goal, as there is in Colombo.
Colombo also has its roots in the Conversation model, presented in [6,10], extending it to deal with data and atomic processes. Web services are modeled as Mealy machines (equipped with a queue) and exchange sequences of messages of given types (called conversations) according to a predefined set of channels. It is shown how to synthesize web services as Mealy machines whose conversations (across a given set of channels) are compliant with a given specification. In [10] an extension of the framework is proposed where services are specified as guarded automata, having local XML variables in order to deal with data semantics.
In [19] web services (and the client) are represented as possibly non-deterministic transition systems, communicating through messaging, and composition is achieved by exploiting advanced model checking techniques. However, only limited support for data is present and there is no notion of local store. It would be interesting to apply our techniques for finitely handling data ranging over an infinite domain to their framework, in order to provide an extension to it.
Finally, it is interesting to mention the work in [8], where the authors focus on data-driven services, characterized by a relational database and a tree of web pages. In such a framework, the authors study the automatic verification of properties of a single service, which are defined both in a linear and in a branching time setting.
The rest of the paper is organized as follows. Section 2 illustrates Colombo with an example. Section 3 introduces the formal concepts of Colombo. In Section 4 the problem of web service composition is formally stated and an upper bound on its complexity is provided. Section 5 shows our technique for handling the data, which ranges over an infinite domain, in a finite, symbolic way. Section 6 presents our technique to automatically synthesize composite web services in Colombo based on Propositional Dynamic Logic. Section 7 concludes the paper and highlights future work. In the appendices, technical results are provided.
2 An Example
In this section, we illustrate Colombo and give an intuition of our automatic web service composition technique by means of an example involving web services that manage inventories, handle payment by credit or prepaid card, request shipments, and check shipment status.
The world schema is constituted by four relations, defined over (i) the boolean domain Bool, (ii) an infinite set of uninterpreted elements $Dom_=$ (on which only the equality relation is defined), denoted by alphanumeric strings, and (iii) an infinite densely ordered set $Dom_\leq$, denoted by numbers. An instance of the world schema is shown in Figure 1. For each relation, the key attributes are separated from the others by the thick separation between columns. The intuition behind these relations is as follows: Accounts stores credit card numbers and the information on whether they can be charged; PREPaid stores prepaid card numbers and the information on whether they can still be used; Inventory contains item codes, the warehouse they are available in, if any, and the price; Shipment stores order id’s, the source warehouse, the target location, status and date of shipping.
Figure 2 shows the alphabet $A$ of the atomic processes that are invoked by the available web services and are used in the goal service specification. Intuitively, $A$ represents the common understanding of an agreed-upon reference alphabet/semantics that cooperating web services should share [7]. For succinctness we use a pidgin syntax for specifying the atomic processes in that figure. We denote the null value using ω. The special symbol ‘-’ denotes elements of tuples that remain unchanged after the execution of the atomic process. Throughout the paper, when defining (conditional) effects of atomic processes, we specify the potential effects on the world state using syntax of the form ‘insert’, ‘delete’, and ‘modify’. These are suggestive of procedural database manipulations, but are intended as shorthand for declarative statements about the states of the world before and after an effect has occurred. Finally, the access function $f^R_j(a_1,\ldots,a_n)$ (see Section 3) is used to fetch the $(n+j)$-th element of the tuple in $R$ identified by the key $\langle a_1,\ldots,a_n\rangle$ (i.e., the $j$-th element of the tuple after the key).

Accounts
  CCNumber   || credit
  1234       || T
  ...        || ...

PREPaid
  PREPaidNum || credit
  5678       || T
  ...        || ...

Inventory
  code       || available | warehouse | price
  H.P.6      || T         | NGW       | 5
  H.P.1      || T         | SW        | 10
  ...        || ...       | ...       | ...

Shipment
  order#     || from | to  | status      | date
  22         || NGW  | NYC | “requested” | 16/07/2005
  ...        || ...  | ... | ...         | ...

Figure 1: World Schema Instance (key attributes are to the left of the double bar)
Figure 3 shows (the transition systems of) the available web services: Bank checks that a credit card can be used to make a payment; Storefront, given the code of an item, returns its price and the warehouse in which the item is available; Next Generation Warehouse (NGW) allows for (i) dealing with an order either by credit card or by prepaid card, according to the client’s preferences and to the item’s price, and for (ii) shipping the ordered item, if the payment card is valid; Standard Warehouse (SW) deals only with orders by credit card, and allows for shipping the ordered item, if the card is valid. Throughout the example we assume that other web services are able to change the status and, possibly, to postpone the date of item delivery using suitable atomic processes, which are not shown in Figure 2. In the figure, transitions concerning messages are labeled with an operation to transmit or to read a message, by prefixing the message with ! or ?, respectively.
All the available web services are also characterized by the following elements (for simplicity, not shown in the figure). (i) An internal local store, i.e., a relational database defined over the same domains as the world state (namely, the set Bool of booleans, the set $Dom_=$ of alphanumeric strings, and the set $Dom_\leq$ of numbers), used to store the parameter values of received messages that have been read and need to be processed during the execution of the web service. (ii) One port for each message (type) a service can transmit or receive. As an example, the web service Bank has two ports, one for receiving messages (of type) CCnum and another for sending messages (of type) approved. Each port for an incoming message has an associated queue (see below); a web service can always transmit messages, but can receive them only if the queue is not full. A received message is then read (and erased from the queue) when the process of the web service allows it. (iii) One queue (of length one) for each message type the web service can receive. The queues are used to store messages that have been received but not read yet. For example, the web service Bank has one queue, for storing messages (of type) CCnum.

CCCheck:
  I: c:Dom_=;                          % CC card number
  O: app:Bool;                         % CC approval
  effects:
    if f^{Accounts}_1(c) then
      either modify Accounts(c; T) or
      modify Accounts(c; F) and approved := T
    if ¬f^{Accounts}_1(c) then
      approved := F

checkItem:
  I: c:Dom_=;                          % item code
  O: avail:Bool; wh:Dom_=; p:Dom_≤     % resp. item availability,
                                       % selling warehouse and price
  effects:
    if f^{Inventory}_1(c) then
      avail := T and wh := f^{Inventory}_2(c) and p := f^{Inventory}_3(c)
      and either no-op on Inventory or
      modify Inventory(c; F, -, -)
    if ¬f^{Inventory}_1(c) or f^{Inventory}_1(c) = ω
      then avail := F

charge:
  I: c:Dom_=;                          % prepaid card number
  O: paymentOK:Bool;                   % prepaid card approval
  effects:
    if f^{PREPaid}_1(c) then
      either modify PREPaid(c; T) or modify PREPaid(c; F)
      and paymentOK := T
    if ¬f^{PREPaid}_1(c) then paymentOK := F

requestShip:
  I: wh:Dom_=; addr:Dom_=;             % resp. source warehouse
                                       % and target address
  O: oid:Dom_=; d:Dom_≤; s:Dom_=;      % resp. order id,
                                       % shipping date and status
  effects:
    ∃d,o oid := new(o) and
    insert Shipment(new(oid); wh, addr, “requested”, d)
    and d := f^{Shipment}_4(oid) and s := “requested”

checkShipStatus:
  I: oid:Dom_=;                        % order id
  O: s:Dom_=; d:Dom_≤;                 % resp. status and shipping date
  effects:
    if f^{Shipment}_1(oid) = ω then no-op and s, d uninit
    else s := f^{Shipment}_3(oid) and d := f^{Shipment}_4(oid)

Figure 2: Alphabet of Atomic Processes

[Figure 3 diagrams not reproduced; the transition labels of each panel are listed below.]

(a) Bank
  ? requestCCCheck(CCnum)
  CCCheck(CCnum; approved)
  ! replyCCCheck(approved)

(b) Storefront
  ? requestCheckItem(code)
  checkItem(code; avail, warehouse, price)
  ! replyCheckItem(avail, warehouse, price)

(c) Next Generation Warehouse
  ? requestOrder(payBy, cartNum, addr, price)
  (payBy == PREPAID) ∧ (price ≤ 10) / charge(cartNum; paymentOK)
  (payBy == CC) ∨ (price > 10) / ! requestCCCheck(cartNum)
  ? replyCCCheck(approved)
  approved == T / requestShip(wh, addr; oid, date, status)
  approved == F / ! failMsg()
  paymentOK == T / requestShip(wh, addr; oid, date, status)
  paymentOK == F / ! failMsg()
  ! shipStatus(oid, date, status)
  ? requestShipStatus(oid)
  checkShipStatus(oid; date, status)
  ! shipStatus(oid, date, status)

(d) Standard Warehouse
  ? requestOrder(CCNum, addr, price)
  ! requestCCCheck(CCnum)
  ? replyCCCheck(approved)
  approved == F / ! refuseMsg()
  approved == T / requestShip(wh, addr; oid, date, status)
  ! shipStatus(oid, date, status)
  ? requestShipStatus(oid)
  checkShipStatus(oid; date, status)
  ! shipStatus(oid, date, status)

Figure 3: Transition systems of the available services
Figure 4 shows (the transition system of) a goal service: it allows (i) to buy an item characterized by a given code; (ii) to pay for it either by credit card or prepaid card, depending on the client’s preferences, the item’s price and the warehouse in which the item is stored; and (iii) to check the shipment status. Note that the goal service specifies both message-based interactions with the client (e.g., ?requestPurchase(code, payBy) for receiving from the client the item code and the preferred payment method) and atomic processes that the available web services contained in the composition should execute.
With our composition technique, we are able to automatically construct a mediator such as $S_0$ shown in Figure 5. As an aid to the reader, we explicitly indicate in the figure the sender or the receiver of each message, in order to provide an intuition of the notion of linkage that will be introduced in the following sections. Note that, differently from the goal service, the mediator specifies message-based interaction only, involving either the client or a web service. The mediator is also characterized by a local store, a set of ports and a queue for each incoming message (type), not shown in the figure.
An example of the interactions between $S_0$, the client and the available web services is as follows. $S_0$ reads a requestPurchase(code, payBy) message that has been transmitted by a client (into the suitable queue) and stores it into its local store: such a message specifies the code of an item and the client’s preferred payment method. Then, $S_0$ transmits the message requestCheckItem(code) to Storefront, i.e., into its queue, and waits for the answer (for simplicity we assume that the queue is not full). Thus, Storefront reads from its queue the message (carrying the item’s code) and executes the atomic process checkItem(code) by accessing the tuple of relation Inventory having as key the given code: at this point, the information on the warehouse the item is available in (if any) and its price can be fetched and transmitted to the mediator. Hence, $S_0$ reads the message replyCheckItem(avail, warehouse, price) and stores the values of its parameters into its local store. If no warehouse contains the item (i.e., avail == F), $S_0$ transmits a responsePurchase(“fail”) message to the client, informing her that the request has failed; otherwise (i.e., if avail == T) $S_0$ transmits a responsePurchase(“provide cart num”) message to the client, asking her for the card number, and the interactions go on.

[Figure 4 diagram not reproduced; its transition labels are listed below.]
  ? requestPurchase(code, payBy)
  checkItem(code; avail, warehouse, price)
  (avail == F) ∨ ((payBy == PREPAID) ∧ (warehouse = SW)) / ! responsePurchase(“fail”)
  (avail == T) ∧ ((payBy == CC) ∨ (warehouse = NGW)) / ! responsePurchase(“provide cart number”)
  ? msgCartNum(cartNum)
  (payBy == CC) ∨ (price > 10) / CCCheck(cartNum; authorized)
  (payBy == PREPAID) ∧ (price ≤ 10) / charge(cartNum; authorized)
  authorized == T / requestShip(wh, addr; oid, date, status)
  ! shipStatus(oid, date, status)
  ? requestShipStatus(oid)
  checkShipStatus(oid; date, status)
  ! shipStatus(oid, date, status)

Figure 4: Transition system of the goal service
3 The Model
This section provides an overview of the formal model used in our investigation, focusing on Colombo$^{k,b}$.
Model of the “real world”. A world (database) schema is a finite set $W$ of relations having the form
$$R_k(A_1,\ldots,A_{m_k};\; B_1,\ldots,B_{n_k}),$$
where $A_1,\ldots,A_{m_k}$ is a key for $R_k$, and where each attribute $A_i$, $B_j$ is associated with Bool, $Dom_=$ or $Dom_\leq$. A world instance is a database instance over $W$.
We allow for constraints over relations (see below for the notion of “accessible term”). A key-accessible constraint is an expression of the form $\varphi = \forall x_1,\ldots,x_n\,(\psi)$, where the $x_i$’s are distinct variables, and where $\psi$ is a boolean expression over atoms over accessible terms over a set of constants and the variables $\{x_1,\ldots,x_n\}$. A world instance $I$ satisfies this constraint if, for all assignments $\alpha$ for the variables $x_1,\ldots,x_n$, the formula $\psi$ is true in $I$ when interpreted according to $\alpha$.
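As an illustration only, the following constraint is a hypothetical example over the world schema of Section 2 and is not taken from the paper. It uses the access functions and the init atom introduced below, and reads init(t) as “t is not the null value”, which is our interpretation. Under that reading it states that every item marked as available has a warehouse recorded:

```latex
\varphi \;=\; \forall x \,\Bigl( \neg\bigl(f^{Inventory}_1(x) = \mathrm{T}\bigr)
      \;\lor\; \mathit{init}\bigl(f^{Inventory}_2(x)\bigr) \Bigr)
```

The instance of Figure 1 would satisfy such a constraint, since both listed items are available and both have a warehouse.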
Atomic Processes. Atomic processes in Colombo, inspired by OWL-S atomic processes, may access/modify one or more of the relations in the world schema. In typical applications a given relation of the world schema may be accessible by just one web service, by several web services, or by all web services. Furthermore, when executing, the atomic processes can make a finitely bounded non-deterministic choice. This can be viewed as indicating that the world instance holds only partial information about the state actually observable by the atomic processes.

[Figure 5 diagram not reproduced; its transition labels, with the sender or receiver of each message, are listed below.]
  ? requestPurchase(code, payBy) [from client]
  ! requestCheckItem(code) [to Storefront]
  ? replyCheckItem(avail, warehouse, price) [from Storefront]
  (avail == F) / ! responsePurchase(“fail”) [to client]
  (avail == T) / ! responsePurchase(“provide cart num”) [to client]
  ? msgCartNum_msgIN(cartNum) [from client]
  (warehouse = SW) ∧ (payBy == PREPAID) / ! responsePurchase(“fail”) [to client]
  (warehouse = SW) ∧ (payBy == CC) / ! requestOrder(cartNum, addr, price) [to SW]
  ? requestCCCheck(cartNum) [from SW]
  ! requestCCCheck(cartNum) [to Bank]
  ? replyCCCheck(approved) [from Bank]
  ! replyCCCheck(approved) [to SW]
  ? refuseMsg() [from SW]
  ! responsePurchase(“fail”) [to client]
  ? shipStatus(oid, date, status) [from SW]
  ! shipStatus(oid, date, status) [to client]
  ? requestShipStatus(oid) [from client]
  ! requestShipStatus(oid) [to SW]
  ? shipStatus(oid, date, status) [from SW]
  ! shipStatus(oid, date, status) [to client]
  (warehouse = NGW) ∧ ((payBy == CC) ∨ (price > 10)) / ! requestOrder(“CC”, cartNum, addr, price) [to NGW]
  (warehouse = NGW) ∧ ((payBy == PREPAID) ∨ (price ≤ 10)) / ! requestOrder(“PREPAID”, cartNum, addr, price) [to NGW]
  ? requestCCCheck(cartNum) [from NGW]
  ! requestCCCheck(cartNum) [to Bank]
  ? replyCCCheck(approved) [from Bank]
  ! replyCCCheck(approved) [to NGW]
  ? failMsg() [from NGW]
  ! responsePurchase(“fail”) [to client]
  ? shipStatus(oid, date, status) [from NGW]
  ! shipStatus(oid, date, status) [to client]
  ? requestShipStatus(oid) [from client]
  ! requestShipStatus(oid) [to NGW]
  ? shipStatus(oid, date, status) [from NGW]
  ! shipStatus(oid, date, status) [to client]

Figure 5: Transition system of the mediator
The syntax for describing conditions, integrity constraints, and for describing the local stores of web services, is based on the use of symbols denoting constants (taken from $Dom = Bool \cup Dom_= \cup Dom_\leq$) and variables. (These variables are typed as Bool, Eq, Leq.)
At a given point in time during execution of a web service, there may be an assignment $\alpha$ of variables (e.g., in the local store of some web service) to elements of Dom. For a variable $v$, $\alpha$ may assign a value from Dom, or ω (null value).
Notation: Let $R(A_1,\ldots,A_n;\, B_1,\ldots,B_m)$ be a relation in the world schema $W$. We define a family of $n$-ary functions $f^R_j$ for $j \in [1..m]$, as follows. Let $I$ be an instance over $W$, and $a_1,\ldots,a_n$ be (not necessarily distinct) elements of Dom. Then the value of $f^R_j(a_1,\ldots,a_n)$ in $I$ is defined to be either (i) the null value $\omega$ if $\langle a_1,\ldots,a_n\rangle \notin \pi_{\{A_1,\ldots,A_n\}}(I(R))$, or (ii) the unique $b_j$ where $\langle a_1,\ldots,a_n,b_1,\ldots,b_m\rangle \in I(R)$ (for some $b_k$’s). We refer to the functions $f^R_j$ as the access functions.
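A minimal sketch of the access functions, assuming a relation is encoded as a Python dictionary from key tuples to tuples of non-key values. The names `Relation`, `OMEGA` and `access` are our own illustrative choices, and `None` stands in for the null value ω:

```python
from typing import Dict, Hashable, Optional, Tuple

# Hypothetical encoding: each key tuple maps to the tuple of non-key values.
Relation = Dict[Tuple[Hashable, ...], Tuple[Hashable, ...]]
OMEGA = None  # stands in for the null value ω


def access(relation: Relation, j: int, *key: Hashable) -> Optional[Hashable]:
    """Return f^R_j(key): the j-th non-key value (1-based), or OMEGA if the key is absent."""
    row = relation.get(tuple(key))
    if row is None:
        return OMEGA
    return row[j - 1]


# Inventory(code; available, warehouse, price), populated as in Figure 1.
inventory: Relation = {
    ("H.P.6",): (True, "NGW", 5),
    ("H.P.1",): (True, "SW", 10),
}

print(access(inventory, 2, "H.P.6"))  # warehouse of item H.P.6
print(access(inventory, 3, "H.P.1"))  # price of item H.P.1
print(access(inventory, 1, "X.Y.0"))  # key not present, so the null value
```

Because the key attributes determine the tuple, a dictionary lookup suffices; a key outside the projection on the key attributes yields the null value, as in case (i) of the definition.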
Given constants $C$ and variables $V$, the set of accessible terms over $C, V$ is defined recursively to include all terms constructed using $C$, $V$ and the $f^R_j$ functions. An atom over $C, V$ is an expression of the form (i) $init(t)$, (ii) $t = t'$, (iii) $t < t'$, or (iv) $t > t'$, where $t, t'$ are accessible terms. Atoms and propositional formulas constructed using them are given a truth value under an assignment $\alpha$ in the usual manner.
Definition: An atomic process is an object $p$ which has a signature of the form $(I, O, CE)$ with the following properties. The input signature $I$ and output signature $O$ are sets of typed variables. The conditional effect, $CE$, is a set of pairs of the form $(c, E)$, where $c$ is an (atomic process) condition and $E$ is a finite non-empty set of (atomic process) effect (specifications). Condition $c$ is a boolean expression over atoms over accessible terms over some family of constants and the input variables $u_1,\ldots,u_n$.
An effect $e \in E$ is a pair $(es, ev)$, where: $es$ (the effect on the world) is a set of expressions having the forms (i) insert $R(t_1,\ldots,t_k;\, s_1,\ldots,s_l)$; (ii) delete $R(t_1,\ldots,t_k)$; or (iii) modify $R(t_1,\ldots,t_k;\, r_1,\ldots,r_l)$; where the $t_i$’s and $s_j$’s are accessible terms over some set of constants and $u_1,\ldots,u_n$, and where each $r_j$ is either an accessible term or the special symbol ‘−’ (denoting that that position of the identified tuple in $R$ should be unchanged); and $ev$ (the effect on outputs) is a set of expressions of the form (iv) $v_j := t$, where $j \in [1..m]$ and $t$ is an accessible term over some set of constants and $u_1,\ldots,u_n$; or (v) $v_j := \omega$, where $j \in [1..m]$. (There must be exactly one expression for each $v_j$.)
The definition of the semantics of an atomic process execution is relatively straightforward: based on the values for the input variables and the current world instance, if a conditional effect $(c, E)$ has a true condition then one element $e \in E$ is nondeterministically chosen. If the application of $e$ on the world instance satisfies the global constraints $\Sigma$, then $e$ is applied to the world instance and used to determine the values of the output variables.
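The execution semantics just described can be sketched as follows. This is an illustrative encoding under our own assumptions, not the paper's implementation: conditions and effects are plain Python functions, the nondeterministic choice is made explicit by enumerating every possible outcome, and the names `outcomes`, `chargeable`, `keep_credit`, `exhaust_credit` and `refuse` are hypothetical. The example follows CCCheck from Figure 2 and additionally assumes that an unknown card is treated as not chargeable.

```python
from typing import Callable, Dict, Tuple

# Hypothetical encoding: relation name -> {key tuple: tuple of non-key values}.
World = Dict[str, Dict[tuple, tuple]]
Condition = Callable[[World, dict], bool]
Effect = Callable[[World, dict], Tuple[World, dict]]  # -> (new world, outputs)


def copy_world(world: World) -> World:
    """Copy the instance so that a rejected effect leaves the original untouched."""
    return {name: dict(rows) for name, rows in world.items()}


def outcomes(cond_effects, world, inputs, constraints=()):
    """Enumerate every (world', outputs) pair that one execution may produce."""
    results = []
    for condition, effects in cond_effects:
        if not condition(world, inputs):
            continue
        for effect in effects:  # each element of E is one nondeterministic choice
            new_world, outputs = effect(copy_world(world), inputs)
            if all(sigma(new_world) for sigma in constraints):
                results.append((new_world, outputs))
    return results


# CCCheck, following Figure 2 (unknown cards count as not chargeable: assumption).
def chargeable(world, inputs):
    return world["Accounts"].get((inputs["c"],), (None,))[0] is True


def keep_credit(world, inputs):
    world["Accounts"][(inputs["c"],)] = (True,)
    return world, {"approved": True}


def exhaust_credit(world, inputs):
    world["Accounts"][(inputs["c"],)] = (False,)
    return world, {"approved": True}


def refuse(world, inputs):
    return world, {"approved": False}


cc_check = [
    (chargeable, [keep_credit, exhaust_credit]),
    (lambda w, i: not chargeable(w, i), [refuse]),
]

world = {"Accounts": {("1234",): (True,)}}
for new_world, outputs in outcomes(cc_check, world, {"c": "1234"}):
    print(new_world["Accounts"], outputs)
```

Enumerating outcomes rather than picking one at random keeps the sketch deterministic; an actual execution would commit to a single element of the returned list.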
We write $(\alpha, I) \vdash_{p(r_1,\ldots,r_n;\, v_1,\ldots,v_m)} (\alpha', I')$ over $W, \Sigma$, if the pair $(\alpha', I')$ is one of the possible pairs resulting from the execution of $p$ as described above. The trace of this move is the syntactic object $p(c_1,\ldots,c_n;\, d_1,\ldots,d_m)$ where $c_i$ is the domain value identified by $\alpha(r_i)$ (recall that $\alpha$ is the identity on elements of Dom) and where $d_j$ is the domain value $\alpha'(v_j)$.
Messages, Ports, and Links. A message type has a name $m$ and a signature of the form $\langle d_1,\ldots,d_n\rangle$, where $n \geq 0$ and each $d_i \in \{Bool, Eq, Leq\}$.
In Colombo, a (service) port signature of a service $S$, denoted $Port$ or $Port_S$, is a set $P$ of pairs having the form $(m, in)$ or $(m, out)$, where the $m$’s are message types, and each pair in $P$ has a distinct message type. Let $F = \{S_1,\ldots,S_n\}$ be a family of services (with or without one client) having associated port signatures $\{P_1,\ldots,P_n\}$. A link for $F$ is a tuple of the form $(S_i, m, S_j, n)$ where $(m, out) \in P_i$, $(n, in) \in P_j$, and $m, n$ have identical signatures. (It can occur that $i = j$, although this is perhaps not typical in practice.) A linkage for $F$ is a set $L$ of links such that the first two fields of $L$ are a key for $L$, and likewise for the second two fields. It is not required that every port of a service $S$ occur in $L$.
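The conditions on links and linkages can be checked mechanically. The sketch below is illustrative only: the dictionary encoding of port signatures, the function name `is_linkage`, and the service name `Mediator` are our assumptions, and the message names are borrowed from Figure 3.

```python
from typing import Dict, Iterable, Tuple

Signature = Tuple[str, ...]                      # e.g. ("Eq",) or ("Bool",)
# service -> message type -> (direction, signature); one port per message type.
Ports = Dict[str, Dict[str, Tuple[str, Signature]]]
Link = Tuple[str, str, str, str]                 # (sender, m, receiver, n)


def is_linkage(links: Iterable[Link], ports: Ports) -> bool:
    """Check that every tuple is a link and that both field pairs act as keys."""
    sources, targets = set(), set()
    for sender, m, receiver, n in links:
        out_port = ports.get(sender, {}).get(m)
        in_port = ports.get(receiver, {}).get(n)
        if out_port is None or in_port is None:
            return False                          # unknown service or port
        if out_port[0] != "out" or in_port[0] != "in":
            return False                          # wrong port direction
        if out_port[1] != in_port[1]:
            return False                          # signatures must be identical
        if (sender, m) in sources or (receiver, n) in targets:
            return False                          # key condition violated
        sources.add((sender, m))
        targets.add((receiver, n))
    return True


ports: Ports = {
    "Mediator": {"requestCCCheck": ("out", ("Eq",)),
                 "replyCCCheck": ("in", ("Bool",))},
    "Bank": {"requestCCCheck": ("in", ("Eq",)),
             "replyCCCheck": ("out", ("Bool",))},
}
links = [
    ("Mediator", "requestCCCheck", "Bank", "requestCCCheck"),
    ("Bank", "replyCCCheck", "Mediator", "replyCCCheck"),
]
print(is_linkage(links, ports))
```

Keying the inner dictionary by message type reflects the requirement that each pair in a port signature has a distinct message type.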
In this paper we will assume that a linkage $L$ is established at the time of designing a system of interoperating services, and that $L$ does not change at runtime.
Local & Queue Store, Transmit, Read, Has-seen. Let $S$ be a non-client web service. The local store $LStore_S$ of $S$ is a finite set of typed variables. For each incoming port $(m, in)$ of $S$ we assume that there is a distinguished boolean variable $\pi_m$ in $LStore_S$ (which is set to true if there is at least one message in the queue). Also, each non-client service $S$ has a queue store $QStore$, used to hold the parameter values of incoming messages, which can be thought of as being held by a queue. (We focus on queues of length 1.)
As illustrated in Section 2, for passing messages between services we have two basic operations: transmit and read, denoted using $!m$ and $?m$, respectively. A transmit is based on an explicit step of the sending service, and is reflected as an asynchronous receive at the receiving service. In Colombo$^{k,b}$, a transmit will block if the corresponding queue of the receiver is full. (An alternative is to view the send as failed and let the sending service continue with other activities.) Similarly, in Colombo$^{k,b}$ the read operation will block until there is something in the appropriate queue (although other semantics are possible).
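A minimal sketch of a length-one queue with these blocking semantics, assuming that “blocking” is modeled by an operation reporting that it cannot proceed rather than by suspending a thread. The class name `InQueue` and its methods are hypothetical:

```python
from typing import Optional, Tuple


class InQueue:
    """Queue of length one attached to an incoming port."""

    def __init__(self) -> None:
        self._slot: Optional[Tuple] = None   # None means the queue is empty

    def has_message(self) -> bool:
        """Plays the role of the boolean variable associated with the port."""
        return self._slot is not None

    def transmit(self, params: Tuple) -> bool:
        """Sender side: deposit the parameters, or report False if blocked."""
        if self._slot is not None:
            return False                      # queue full: the transmit blocks
        self._slot = tuple(params)
        return True

    def read(self) -> Optional[Tuple]:
        """Receiver side: take the parameters, or return None if blocked."""
        if self._slot is None:
            return None                       # queue empty: the read blocks
        params, self._slot = self._slot, None
        return params


q = InQueue()
print(q.transmit(("1234",)))   # accepted: the queue was empty
print(q.transmit(("5678",)))   # refused: the queue is full
print(q.read())                # the first message is read and erased
print(q.has_message())         # the queue is empty again
```

A message without parameters is stored as the empty tuple, so emptiness is tested against `None` rather than by truthiness.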
With regard to client services in Colombo$^{k,b}$, we bundle the receive and the read as just receive. We do not model the local or queue stores of clients, but simply maintain a unary relation, denoted $HasSeen$ or $HasSeen_C$, which holds elements of Dom. Intuitively, at a given time in an execution of $C$, $HasSeen_C$ will include all of the constants appearing in the service specification ($Constants_C$), and also all domain elements that occur in messages that have been transmitted to $C$.
Abstract Model of Internal Service Process. In Colombo$^{k,b}$, a guarded automaton is a tuple $(Q, \delta, F, LStore, QStore)$ where $Q$ is a finite set of states, $F \subset Q$ is a set of final states, and $LStore$ ($QStore$) is the local (queue) store. The transition function $\delta$ contains tuples $(s, c, \mu, s')$ where $s, s' \in Q$, $c$ is a condition over $LStore \cup QStore$ (no access to the world instance), and $\mu$ is either a send, a read, or an atomic process invocation. The non-client services have a deterministic signature, i.e., it is assumed that for each state in $Q$, store contents and a world instance, at most one out-going transition can be labeled with a condition that evaluates to true. The Guarded Automaton signature of a (non-client) service $S$ is denoted $GA(S)$.
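The determinism requirement can be illustrated with a small sketch. The encoding below is our own assumption (guards as Python predicates over a store dictionary, actions as plain labels), and the state names `ordered`, `charged` and `waiting` are hypothetical; the two guards are those leaving the state reached after ?requestOrder in the Next Generation Warehouse of Figure 3.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Store = Dict[str, object]   # contents of the local and queue stores


@dataclass(frozen=True)
class Transition:
    source: str
    guard: Callable[[Store], bool]   # condition c over the stores
    action: str                      # label only: "!m", "?m", or a process name
    target: str


def enabled(transitions: List[Transition], state: str, store: Store) -> List[Transition]:
    """Transitions leaving `state` whose guard holds for the given store."""
    return [t for t in transitions if t.source == state and t.guard(store)]


def step(transitions: List[Transition], state: str, store: Store) -> Optional[Transition]:
    """Return the unique enabled transition, or None; reject overlapping guards."""
    choices = enabled(transitions, state, store)
    if len(choices) > 1:
        raise ValueError("guards overlap: not a deterministic signature")
    return choices[0] if choices else None


ngw = [
    Transition("ordered",
               lambda s: s["payBy"] == "PREPAID" and s["price"] <= 10,
               "charge", "charged"),
    Transition("ordered",
               lambda s: s["payBy"] == "CC" or s["price"] > 10,
               "!requestCCCheck", "waiting"),
]

print(step(ngw, "ordered", {"payBy": "PREPAID", "price": 5}).action)
print(step(ngw, "ordered", {"payBy": "PREPAID", "price": 20}).action)
```

The two guards are mutually exclusive for every store content, which is what the deterministic-signature assumption asks of each state.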
In Colombo$^{k,b}$, we assume for a client $C$ that in $GA(C)$ there are exactly two states, called ReadyToTransmit and ReadyToRead, where the first is the start state and also the final state. In Colombo$^{k,b}$ the client will toggle between the two states. We use the “has-seen” set $HasSeen$ as an abstract representation of the constants that the client has seen so far. The clients are non-deterministic, in terms of the message they choose to read, and in terms of the values they transmit.
The moves-to relation $\vdash$ will hold between pairs of the form $(id_S, I)$, $(id'_S, I')$, where $id_S$, $id'_S$ are instantaneous descriptions (id’s) for $S$ and $I$, $I'$ are world instances. This is defined in the usual way. The trace of a pair $(id_S, I)$, $(id'_S, I')$ where $(id_S, I) \vdash_S (id'_S, I')$ will provide, intuitively, a grounded record or log of salient aspects of the transition from $(id_S, I)$ to $(id'_S, I')$, including, e.g., what parameter values were input/output from an atomic process invocation, or were received, read or sent.
For clients, an id is a pair of the form $(s, HasSeen)$. The moves-to relation and trace are defined for clients in the natural manner.
System Execution and Equivalence. In general we focus on a system, which is a triple $\mathcal{S} = (C, F, L)$, where $C$ is a client, $F = \{S_1,\ldots,S_n\}$ is a finite family of web services, and $L$ is a linkage for $(C, F)$ (i.e., for $\{C\} \cup F$).
For this paper we make the assumption of No External Modifications: when discussing the execution of one or more systems $\mathcal{S}_1, \ldots, \mathcal{S}_k$, we assume that no other systems can modify the relations in the world schema that are accessed by the executions of $\mathcal{S}_1, \ldots, \mathcal{S}_k$.
The notion of (initial) instantaneous description (id) for system $\mathcal{S}$ is defined in a natural fashion to be a tuple $id_{\mathcal{S}} = (id_C, \{id_S \mid S \in \mathcal{F}\})$, based on a generalization of id for individual services. The moves-to relation for system $\mathcal{S}$, denoted $\vdash_{\mathcal{S}}$ or $\vdash$, is defined as a natural generalization of $\vdash$ for clients and services. More specifically, we have $(id_{\mathcal{S}}, I) \vdash (id'_{\mathcal{S}}, I')$ if (written informally)
(i) If a service performs an atomic process or a read, that is the only service that moves. For an atomic process the world instance can change, and for the read it cannot change.
(ii) If a service performs a transmit, then the target of that transmit (according to $L$) performs a receive in the same move. In this case the world instance cannot change.
In case (i), the trace of the pair $(id_{\mathcal{S}}, I) \vdash (id'_{\mathcal{S}}, I')$ is the trace of the individual service that changed; in case (ii), the trace is the pair $(!m(c_1, \ldots, c_n),\ ?n(c_1, \ldots, c_n))$ where the $!m$ part is the trace of the sending service and the $?n$ part is the trace of the receiving service.
An enactment of $\mathcal{S}$ is a finite sequence $E = (id_1, I_1), \ldots, (id_q, I_q)$, $q \geq 1$, where (a) $id_1$ is an initial id for $\mathcal{S}$, and (b) $(id_p, I_p) \vdash (id_{p+1}, I_{p+1})$ for each $p \in [1..(q-1)]$. The enactment is successful if $id_q$ is in a final state of $GA(C)$ and of each $GA(S)$.
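A minimal sketch of the success test, assuming (hypothetically) that an id is represented as a dictionary from the names of the client and the services to their current automaton states. The names `Id`, `Step` and `is_successful` are introduced for this example only, and the sketch does not verify that consecutive elements are related by the moves-to relation.

```python
from typing import Dict, List, Set, Tuple

Id = Dict[str, str]       # hypothetical: participant name -> current state
Step = Tuple[Id, object]  # (id, world instance); the instance is opaque here


def is_successful(enactment: List[Step],
                  final_states: Dict[str, Set[str]]) -> bool:
    """True if, in the last id, every participant is in one of its final states."""
    if not enactment:
        return False  # an enactment has at least one element (q >= 1)
    last_id, _world = enactment[-1]
    return all(last_id[name] in finals
               for name, finals in final_states.items())
```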
The notion of execution tree for $\mathcal{S}$ is, intuitively, an infinitely branching tree $T$ that records all possible enactments. The root is not labeled, and all other nodes are labeled by pairs of the form $(id, I)$ where $id$ is an id of $\mathcal{S}$ and $I$ a valid world instance. For children of the root, the id is the initial id of $\mathcal{S}$ and $I$ is arbitrary. An edge $((id, I), (id', I'))$ is included in the tree if $(id, I) \vdash (id', I')$; in this case the edge is labeled by $trace((id, I), (id', I'))$. A node $(id, I)$ in the execution tree is terminating if $id$ is in a final state of $GA(C)$ and of each $GA(S)$.
The essence of $T$, denoted $essence(T)$, is a collapsing of $T$, created as follows. The root and its children remain the same. Suppose that $v_1$ is a node of $T$ that is also in $essence(T)$, and let $v_1, \ldots, v_n, v_{n+1}$, $n \geq 1$, be a path, where $trace(v_i, v_{i+1})$ for each $i \in [1..(n-1)]$ involves message transmits or reads not involving the client, and $trace(v_n, v_{n+1})$ involves an atomic process invocation or a transmit to or from the client. Then include edge $(v_1, v_{n+1})$ in $essence(T)$, where $v_{n+1}$ has the same label as in $T$, and this edge is labeled with $trace(v_n, v_{n+1})$.
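The collapsing step can be sketched on a finite prefix of an execution tree. The Python fragment below is an illustration under stated assumptions: the tree is given as an adjacency dictionary, each edge carries a boolean flag saying whether its trace is observable (an atomic process invocation or a transmit to or from the client), and the function name `essence` and the edge encoding are introduced here. Since $T$ itself is infinitely branching, the sketch only applies to finite fragments.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, bool]  # (child node, trace label, observable?)


def essence(tree: Dict[str, List[Edge]],
            start_nodes: List[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Collapse runs of non-observable edges.

    `start_nodes` are the children of the (unlabeled) root, which are kept
    as they are. For every kept node v1, each path of internal edges ending
    in one observable edge (vn, vn+1) yields an edge (v1, vn+1) labeled
    with the trace of that last edge.
    """
    result: Dict[str, List[Tuple[str, str]]] = {}
    pending = list(start_nodes)
    while pending:
        v1 = pending.pop()
        if v1 in result:
            continue
        result[v1] = []
        stack = [v1]
        while stack:
            v = stack.pop()
            for child, trace, observable in tree.get(v, []):
                if observable:
                    result[v1].append((child, trace))
                    pending.append(child)  # child is a node of the essence
                else:
                    stack.append(child)    # skip over the internal step
    return result
```

Two systems would then be compared by checking the resulting collapsed trees for isomorphism, which this sketch does not attempt.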
Note that for a system $\mathcal{S} = (C, \mathcal{F}, L)$ each pair of execution trees $T$ and $T'$ of $\mathcal{S}$ are isomorphic, and also $essence(T)$ and $essence(T')$ are isomorphic.
Suppose now that world schema $W$ and global constraints $\Sigma$ are fixed, and let $A$ be an alphabet of atomic processes. Let $\mathcal{S} = (C, \{S \mid S \in \mathcal{F}\}, L)$ and $\mathcal{S}' = (C, \{S \mid S \in \mathcal{F}'\}, L')$ be two systems over $W$, $\Sigma$, $A$, and over the same client $C$. We say that $\mathcal{S}$ is equivalent to $\mathcal{S}'$, denoted $\mathcal{S} \equiv \mathcal{S}'$, if for some (any) execution trees $T$, $T'$ of $\mathcal{S}$, $\mathcal{S}'$, respectively, we have that $essence(T)$ is isomorphic to $essence(T')$.
Intuitively, this means that relative to what is observable in terms of client messaging and atomic process invocations (and their effects), the behaviors of $\mathcal{S}$ and $\mathcal{S}'$ are indistinguishable.
4 The Composition Synthesis Problem Statement
In this section we formally define the composition synthesis problem, and also a specialized version of it called the choreography synthesis problem. We then state our main results, giving decidability and complexity bounds for composition and choreography synthesis in the restricted context of Colombo$^{k,b}$. The proofs for these results are sketched in Sections 5 and 6.
For this section we assume that a world schema $W$, global constraints $\Sigma$, and an alphabet $A$ of atomic processes are all fixed.
For both synthesis problems, assume that a family of available (or pre-defined) services operating over $A$ is given (e.g., in an extended UDDI directory). We also assume that there is a “desired behavior”, described using a specialized system. In particular, a goal system is a triple $\mathcal{G} = (C, \{G\}, L)$ where $C$ is a client; $G$ is a web service over alphabet $A$, called the goal service; and $L$ is a linkage involving only $C$ and $G$.
In the general case, given goal system $\mathcal{G} = (C, \{G\}, L)$, the composition synthesis problem is to (a) select a family $S_1, \ldots, S_n$ of services from the pre-existing set, (b) construct a web service $S_0$ (the “mediator”) which can only send, receive and read messages, and (c) construct a linkage $L'$ over $C, S_0, S_1, \ldots, S_n$ such that $\mathcal{G}$ and $\mathcal{S} = (C, \{S_0, S_1, \ldots, S_n\}, L')$ are equivalent. The choreography synthesis problem is to (a) select a family $S_1, \ldots, S_n$ of services from the pre-existing set, and (b’) construct a linkage $L'$ over $C, S_1, \ldots, S_n$ such that $\mathcal{G}$ and $\mathcal{S} = (C, \{S_1, \ldots, S_n\}, L')$ are equivalent.
Decidability of the composition and choreography synthesis problems remains open for most cases of the general Colombo framework. We now describe a family of restrictions, in the context of Colombo$^{k,b}$, under which we can achieve decidability and complexity results for these problems. We feel that the results obtained here are themselves quite informative and non-trivial to demonstrate, and can also help show the way towards the development of less restrictive analogs.
Let $\mathcal{G} = (C, \{G\}, L)$ be a goal system. Two key assumptions of the goal system are as follows:
Blocking Behavior: (a) For each available service, if a state can be entered by a transition involving a message send, then the service either terminates at that state, or blocks and waits at that state for a message receive. (b) The client initiates by sending a message, and upon message receipt it either halts or sends a message.
Bounded Access: (a) There is a $k > 0$ such that in any enactment of the client $C$, the number of values that can be sent out is $\leq k$ + the number of values that are received by $C$. (b) For each $p > 0$ there is a $q > 0$ such that in each enactment of $G$, if at most $p$ new values come from the client, then only $q$ distinct key-based searches can be executed by the atomic process invocations in $G$.
The first restriction prevents concurrency in our systems, and the second one ensures that in any enactment of $G$, only a finite number of domain values are read (thus providing a uniform bound on the size of the “active domain” of any enactment).
For the case of composition synthesis, we restrict the form of mediators and linkages that we will look for, as follows:
Strict Mediation: A system $\mathcal{S} = (C, \{S_0, S_1, \ldots, S_n\}, L')$ is strict mediation if in $L'$ all messages are either sent by the mediator $S_0$ or received by the mediator.
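Strict Mediation is a purely syntactic condition on the linkage, so it can be checked directly. The sketch below assumes, for illustration only, that a linkage is given as a collection of channels, each recording a sender, a message type and a receiver; the names `Channel` and `is_strict_mediation` are not from the Colombo model.

```python
from typing import Iterable, NamedTuple


class Channel(NamedTuple):
    sender: str    # name of the sending participant (client or service)
    message: str   # message type carried by this channel
    receiver: str  # name of the receiving participant


def is_strict_mediation(linkage: Iterable[Channel], mediator: str) -> bool:
    """Every message is either sent by the mediator or received by it."""
    return all(ch.sender == mediator or ch.receiver == mediator
               for ch in linkage)
```

Under this restriction the client and the component services never talk to each other directly, which is what later allows the mediator's reads to be determined by the sends of the other participants.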
We also make a simplifying assumption that essentially blocks services outside of the relevant system(s) from modifying the world state.
Finally, we say that a mediator service is $(p, q)$-bounded if it has at most $p$ guarded automaton states and at most $q$ variables in its global store.
Theorem 4.1: Assume that all services are in Colombo$^{k,b}$, and assume No External Modifications. Let $\mathcal{G} = (C, \{G\}, L)$ be a goal system and $U$ a finite family of available web services, all of which satisfy Blocking Behavior and Bounded Access. For each $p$, $q$ it is decidable whether there is a set $\{S_1, \ldots, S_n\} \subseteq U$, a $(p, q)$-bounded mediator $S_0$, and a linkage $L'$ satisfying Strict Mediation, such that $\mathcal{S} = (C, \{S_0, S_1, \ldots, S_n\}, L')$ is equivalent to $\mathcal{G}$. An upper bound on the complexity of deciding this, and constructing a mediator if there is one, is doubly exponential time in the size of $p$, $q$, $\mathcal{G}$ and $U$.
We expect that the complexity bound can be refined, but this remains open at the time of writing. More generally, we conjecture that a decidability result and complexity upper bound can be obtained for a generalization of the above theorem, in which the bounds $p$, $q$ do not need to be mentioned. In particular, we believe that based on $\mathcal{G}$ and $U$ there are $p_0$, $q_0$ having the property that if there is a $(p, q)$-bounded mediator for any $p$, $q$, then there is a $(p_0, q_0)$-bounded mediator.
We now describe how the choreography synthesis problem can be reduced to a special case of the composition synthesis problem. Let $\mathcal{G} = (C, \{G\}, L)$ be a goal system. Suppose that there is a solution $\mathcal{S} = (C, \{S_1, \ldots, S_n\}, L')$ for the choreography synthesis problem. Then we can build a mediator $S_0$ and a Strict Mediation linkage $L''$ so that (a) $S_0$ has exactly one state, (b) the local store of $S_0$ has only variables of the form $\pi_m$ (which record whether a message of type $m$ has been received), and (c) $\mathcal{S}' = (C, \{S_0, S_1, \ldots, S_n\}, L'')$ is equivalent to $\mathcal{G}$. The converse also holds. Finally, note that the size of the global store of mediator $S_0$ is bounded by the total number of types of message that can be sent by the family $U$ of available services.
From these observations and a minor variation on the proof technique of Theorem 4.1 we can obtain the following.
Theorem 4.2: Assume that all services are in Colombo$^{k,b}$, and assume No External Modifications. Let $\mathcal{G} = (C, \{G\}, L)$ be a goal system and $U$ a family of available web services, all of which satisfy Blocking Behavior and Bounded World State Access. It is decidable whether there is a set $\{S_1, \ldots, S_n\} \subseteq U$ and a linkage $L'$ such that $\mathcal{S} = (C, \{S_1, \ldots, S_n\}, L')$ is equivalent to $\mathcal{G}$. An upper bound on the complexity of deciding this, and constructing a mediator if there is one, is doubly exponential time in the size of $\mathcal{G}$ and $U$.
5 From Infinite to Finite: the Case Tree
This section develops a key aspect needed for the proofs of Theorems 4.1 and 4.2, namely, it allows us to reason over a finite universe of domain values, rather than over the infinite universe $Dom$. The essence of the technique is that instead of reasoning over (the infinitely many) concrete values in $Dom$, we reason over a finite, bounded set of symbolic values. The technique for achieving this reduction is inspired by an approach taken in [14]. A key enabler for the reduction is the assumption that in Colombo$^{k,b}$ services, all conditions and data accesses rely on key-based look-ups; another enabler is the assumption of Bounded Access.
As part of the construction, we will create “symbolic images” of most of the constructs that we currently have for concrete values. For example, corresponding to a concrete world state $I$ we will have a symbolic world state $\hat{I}$, corresponding to a moves-to relation $\vdash$ in the concrete realm we shall have a moves-to relation $\hat{\vdash}$ in the symbolic realm, etc. In particular, given a (concrete) execution tree $T$ for some system $\mathcal{S}$ of services, which has infinite branching, it will turn out that the corresponding symbolic execution tree $\hat{T}$ will have a strong (homomorphic) relationship to $T$, but have finitely bounded branching. In general, results that hold in the concrete realm will have analogs in the symbolic realm.
We assume an infinite set $Symb$ of symbolic values (disjoint from $Dom$); these will sometimes behave as values, and other times behave as variables.
Let $C$ be a finite set of constants in $Dom$ and $Y$ a finite set of symbolic values. Let $Atoms(Y, C)$ be the set of all atoms over $Y$, $C$. This includes expressions of the following forms:
1. $incorp(y)$, with intuitive meaning that symbolic value $y$ has been “incorporated” into an enactment;
2. $bool(y)$, $eq(y)$ and $leq(y)$, indicating intuitively the domain type associated with $y$;
3. $y = T$ and $y = F$ (can be true only if $incorp(y)$ and $bool(y)$);
4. $y = y'$ (can be true only if $y$ and $y'$ “have” been incorporated and “have” the same type);
5. $y < y'$, $y > y'$ (can be true only if $leq(y)$ and $leq(y')$).
An sv-characterization (svc for short) for $Y$, $C$ is a maximal consistent conjunction over $Atoms(Y, C)$ and their negations. (Informally, the notion of “consistency” here prevents, e.g., $eq(y)$ and $leq(y)$ from holding together, or $y < y'$ and $y' < y$ from holding together, etc.) Note that we do not allow any $y$ to “have” the value $\omega$. This is because symbolic values range exclusively over concrete elements of $Dom$.
Let $Y$, $C$ be fixed, and $\sigma: Y \rightarrow Dom$. Then there is a unique svc $\hat{\gamma}$ such that $\hat{\gamma}[\sigma]$ is true. We denote this svc as $svc(\sigma)$. There is a natural equivalence relation $\sim_{Y,C}$ between assignments from $Y$ to $Dom$, defined by $\sigma \sim_{Y,C} \sigma'$ iff for all atoms $a \in Atoms(Y, C)$, $a[\sigma]$ iff $a[\sigma']$. Note that this is equivalent to stating that $svc(\sigma) = svc(\sigma')$.
Conversely, for an svc $\hat{\gamma}$, it is possible to construct a mapping $\sigma: Y \rightarrow Dom$ such that $svc(\sigma) = \hat{\gamma}$.
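The idea that infinitely many concrete assignments collapse into finitely many characterizations can be illustrated on a fragment of the atoms. The Python sketch below is a simplification made for this example: it considers only the equality and order atoms, assumes all symbolic values and constants belong to a single ordered type (integers here), and ignores $incorp$, the type atoms and Booleans. The function name `order_type` and the naming scheme for constants are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Tuple

Atom = Tuple[str, str, str]  # (term, relation, term), relation in {"<", "=", ">"}


def order_type(sigma: Dict[str, int],
               constants: Iterable[int]) -> FrozenSet[Atom]:
    """Equality/order atoms made true by the concrete assignment `sigma`.

    Two assignments with the same result satisfy the same atoms of this
    fragment, so they are equivalent in the sense of the relation above.
    """
    terms = dict(sigma)
    terms.update({f"c:{c}": c for c in constants})  # assumed naming of constants
    atoms = set()
    for (n1, v1), (n2, v2) in combinations(sorted(terms.items()), 2):
        rel = "<" if v1 < v2 else ">" if v1 > v2 else "="
        atoms.add((n1, rel, n2))
    return frozenset(atoms)
```

For instance, with the single constant 5, the assignments y1 = 3, y2 = 7 and y1 = 4, y2 = 6 yield the same set of atoms, whereas y1 = 10, y2 = 50 does not, because y1 then lies above the constant.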
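Let $Y$, $C$ be fixed, where $C$ includes at least all constants occurring in service $S$. Let $\hat{\gamma}$ be an svc over $Y$, $C$. Then an assignment $\hat{\alpha}: LStore_S \rightarrow (Y \cup \{T, F, \omega\})$ is valid for $\hat{\gamma}$ if (i) $\hat{\alpha}(v) \in Y \cup \{\omega\}$ for $v$’s not of the form $\pi_m$, and $\hat{\alpha}(\pi_m) \in Bool$ for each variable of the form $\pi_m$; (ii) $\hat{\gamma} \models incorp(\hat{\alpha}(v))$ for each $v$ not of the form $\pi_m$; and (iii) $\hat{\gamma} \models bool(\hat{\alpha}(v))$ iff $v$ is of type $Bool$, and likewise for $eq$ and $leq$. The notion of assignment $\hat{\beta}: QStore_S \rightarrow Y \cup \{\omega\}$ being valid is defined analogously.
A symbolic id of service $S$ is a 4-tuple $\hat{id} = (s, \hat{\alpha}, \hat{\beta}, \hat{\gamma})$ where $\hat{\gamma}$ is an svc, and $\hat{\alpha}$, $\hat{\beta}$ are valid assignments over $LStore$ and $QStore$ for $\hat{\gamma}$.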
We now turn to symbolic tuples, relational instances, and world states. A symbolic tuple has the form $\langle \tau_1, \ldots, \tau_n \rangle$, where $\tau_i \in Symb \cup Dom$ for each $i \in [1..n]$.
Let $R(A_1, \ldots, A_n; B_1, \ldots, B_m)$ be a relation schema in the world schema, with key $A_1, \ldots, A_n$. The notion of “symbolic instance” of $R$ abstractly represents the set of tuples that have been “visited” in $R$. We must also keep track of tuples that are currently “not in” $R$, which corresponds to tuples that have been deleted from $R$ by some atomic execution. Formally, a symbolic instance of $R$ is a pair $(In_R, Out_R)$, where $In_R$ is a finite set of symbolic tuples over $A_1, \ldots, A_n, B_1, \ldots, B_m$, and $Out_R$ is a set of symbolic tuples over $A_1, \ldots, A_n$. The instance $(In_R, Out_R)$ is well-formed for svc $\hat{\gamma}$ if (informally)
1. if $\hat{\gamma} \models \neg incorp(y_i)$, then $y_i$ should appear neither in $In_R$ nor in $Out_R$;
2. $\pi_{A_1, \ldots, A_n}(In_R) \cap Out_R$ is empty;
3. $In_R$ is closed under the tuple-generating dependencies having the form
$$R(\tau_1, \ldots, \tau_n, \eta_1, \ldots, \eta_m) \wedge \tau_j = \tau'_j \rightarrow R(\tau_1, \ldots, \tau'_j, \ldots, \tau_n, \eta_1, \ldots, \eta_m)$$
Intuitively, we are “closing” the symbolic instance to include all tuples that are equivalent under equalities implied by $\hat{\gamma}$;
4. $In_R$ “satisfies” the key dependency $A_1, \ldots, A_n \rightarrow B_1, \ldots, B_m$ “modulo” the equalities in $\hat{\gamma}$.
In the following we consider only well-formed symbolic instances.
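Two of the well-formedness conditions, the disjointness of condition 2 and the key dependency of condition 4, lend themselves to a direct check. The sketch below is illustrative and rests on assumptions: symbolic tuples are Python tuples of strings, and the equalities implied by the svc are supplied as a hypothetical function `canon` that maps each term to a representative of its equality class. It tests condition 2 modulo the same equalities, and it does not check conditions 1 and 3.

```python
from typing import Callable, Dict, Iterable, Tuple

SymTuple = Tuple[str, ...]


def is_well_formed(in_r: Iterable[SymTuple],
                   out_r: Iterable[SymTuple],
                   key_len: int,
                   canon: Callable[[str], str] = lambda t: t) -> bool:
    """Check conditions 2 and 4 for a symbolic instance (In_R, Out_R)."""
    seen: Dict[SymTuple, SymTuple] = {}
    for tup in in_r:
        key = tuple(canon(t) for t in tup[:key_len])
        rest = tuple(canon(t) for t in tup[key_len:])
        # Condition 4: equal keys (modulo equalities) force equal non-key parts.
        if seen.setdefault(key, rest) != rest:
            return False
    out_keys = {tuple(canon(t) for t in key) for key in out_r}
    # Condition 2: no key is both "visited in R" and "known to be absent".
    return out_keys.isdisjoint(seen.keys())
```

In a fuller treatment `canon` would be derived from the svc, for example by a union-find structure over the atoms of the form $y = y'$ that it contains.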
Let $\hat{\gamma}$ be an svc over $Y$, $C$. A (valid) symbolic instance of world schema $W$ is a mapping $\hat{I}$ that maps each relation $R \in W$ into a well-formed symbolic instance of $R$ over $Y$, $C$. (We also write, e.g., $\hat{I}(In_R)$ to refer to the $In$ component of $\hat{I}(R)$.)
Given an execution tree $T$ of a system $\mathcal{S}$ in Colombo$^{k,b}$ satisfying the restrictions mentioned in Section 4, we can inductively build up a symbolic execution tree $\hat{T}$ that corresponds to $T$ but uses symbolic values, symbolic ids, and symbolic world states. We let $Y$ be a set of symbolic values which is “large enough” to accommodate the (bounded) number of look-ups that might occur in an execution of $\mathcal{S}$, and let $C$ be the set of all constant values occurring in the specification of $\mathcal{S}$. At the root and children of the root the associated svc $\hat{\gamma}$ will satisfy $\neg incorp(y)$ for all symbolic values $y$. Intuitively, as we proceed down a path of $\hat{T}$, we will extend $\hat{\gamma}$ to incorporate symbolically the concrete values that have been read from the world state by atomic process invocations. Along each path the value of $\hat{\gamma}$ is refined by “incorporating” new symbolic values and assigning to them relationships to the other incorporated symbolic values and to $C$. This process is additive or monotonic, in the sense that once a symbolic value $y$ is incorporated into $\hat{\gamma}$, its relationships to the other previously incorporated symbolic values do not change. After an atomic process invocation we may also have to modify the symbolic instances $(In_R, Out_R)$ for each $R$ in the world schema.
A subtlety in extending the svc $\hat{\gamma}$ is that we must avoid running out of symbolic values. Suppose that $\hat{I}$ is a symbolic instance and $\hat{\gamma}$ an svc. Let $R(A_1, \ldots, A_n; B_1, \ldots, B_m)$ have key $A_1, \ldots, A_n$. We say that $(\hat{\gamma}, \hat{I})$ knows $f^R_j(\tau_1, \ldots, \tau_n)$ (where the $\tau_i$’s range over $Y \cup C$) if $\langle \tau_1, \ldots, \tau_n \rangle \in \pi_{A_1, \ldots, A_n}(\hat{I}(In_R))$.
Based on the above definitions, it is now possible to define the moves-to relation between symbolic ids of a service $S$. We focus on atomic process invocations here. Speaking informally, suppose that there is a transition from state $s$ via atomic process $a(u_1, \ldots, u_n; v_1, \ldots, v_m)$. We describe when $((s, \hat{\alpha}, \hat{\beta}, \hat{\gamma}), \hat{I}) \mathrel{\hat{\vdash}} ((s', \hat{\alpha}', \hat{\beta}', \hat{\gamma}''), \hat{I}'')$ will hold. First note that there is non-determinism here, corresponding to the “new” values that are read by the conditions or updates performed by $a$. For each family of non-deterministic choices, a new $\hat{\gamma}'$ and $\hat{I}'$ is constructed, corresponding to the “new” values seen and taking advantage of what $(\hat{\gamma}, \hat{I})$ “knows”. Then, for each conditional effect $(c, E)$ whose condition is “true” for $(\hat{\gamma}', \hat{I}')$, a pair $(\hat{\gamma}'', \hat{I}'')$ is constructed, where $\hat{\gamma}'' = \hat{\gamma}'$, and $\hat{I}''$ is constructed from $\hat{I}'$ according to the effect $E$. The relation $\hat{\vdash}$ for systems $\mathcal{S}$ is defined analogously.
We summarize our overview of this reduction from infinite to finite with the following.
Lemma 5.1: (Informally stated) Let $\mathcal{S}$ be a system of services in Colombo$^{k,b}$, and $T$ an execution tree for $\mathcal{S}$, and let the symbolic execution tree $\hat{T}$ be constructed as described above. Then there is a homomorphism $h$ from $T$ to $\hat{T}$ with the following properties: (i) $h$ “preserves levels” (i.e., the depth of node $h(n)$ in $\hat{T}$ is the same as the depth of $n$ in $T$). (ii) If $n$ is labeled by $(id, I)$, then $h(n)$ is labeled by $(\hat{id}, \hat{I})$ with svc $\hat{\gamma}$, where $(\hat{\gamma}, \hat{I})$ is “consistent” with $I$ (and also with the world state accesses that have occurred in the history above $n$). (iii) If $n'$ is a child of $n$ in $T$, then the $\hat{\vdash}$ relation holds between the labels of $h(n)$ and $h(n')$ in $\hat{T}$.
Importantly, the symbolic execution tree $\hat{T}$ described in the preceding lemma has bounded branching.
6 Characterization of Composition Synthesis in PDL
To complete the proofs of Theorems 4.1 and 4.2 we now show how the composition synthesis problem can be characterized by means of a Propositional Dynamic Logic (PDL) formula. For the necessary details about PDL, we refer to the Appendix and to [9, 12].
The intuition behind the encoding of composition synthesis in PDL is the following. The execution of the various services that participate in the composition is completely characterized, in the sense that a model of the formula corresponds to a single execution tree of the system, in which the mediator activates the component services by sending them suitable messages, and the component services execute the actions of the goal while exchanging messages with the mediator. In fact, a model of the formula simultaneously represents both the execution of the component services and the execution of the goal specification.
The set of non-deterministic outcomes that can be obtained every time an atomic process is executed by a component service (and by the goal) corresponds to the set of children nodes in the model of the PDL formula.
The only part of the execution that is left unspecified by the PDL formula is the execution of the mediator to be synthesized. Since the execution of the mediator is characterized by which messages are sent to which component services (and consequently, also by which messages are received in response), the PDL formula contains suitable parts that “guess” such messages, including their receiver. In each model of the formula, such a guess will be fixed, and thus a model will correspond to the specification of a mediator realizing the composition.
More precisely, the PDL formula we construct consists of (i) a general part imposing structural constraints on the model, (ii) a description of the initial state of each of the services, the goal, and the mediator, and (iii) a characterization of what happens every time an action is performed. In particular we have to consider the following types of actions:
1. client sends message,
2. client reads message,
3. mediator/goal sends message to client,
4. mediator/goal reads message from client,
5. mediator sends message to component service,
6. mediator reads message from component service,
7. service sends message to mediator,
8. service reads message from mediator,
9. service/goal executes atomic process.
For lack of space, here we will only give some hints on how the PDL encoding is defined. Some more details can be found in Appendix B. In specifying the encoding, we make use of the following meta-variables representing suitable PDL sub-formulas: (i) $\hat{\hat{\alpha}}$ denotes the PDL representation of an assignment over the set of variables of both the local stores $LStore$ and the queue stores $QStore$ of all services, including the goal. We also use $\hat{\hat{\alpha}}_p$ to denote the part of $\hat{\hat{\alpha}}$ relative to service $p$, for $p \in \{0, 1, \ldots, n, g\}$; (ii) $\hat{\hat{\gamma}}$ denotes the PDL representation of the sv-characterization $\hat{\gamma}$; (iii) $\hat{\hat{I}}$ denotes the PDL representation of a world state instance.
We make use of one proposition $st^i_j$ for each state $j$ of the guarded automaton for service $S_i$ (all these are pairwise disjoint), and of one proposition $exec_i$ for each service $S_i$ (either the mediator, a component service, or the goal), intended to be true when service $S_i$ is executing.
To determine the execution of the mediator, we will use the following “guessed” propositions: $DO(!m)$ (resp., $DO(?m)$), stating that next a send (resp., a read) by the mediator will be performed$^1$; $NEXT(st^0_i)$, stating that the mediator will make a transition to state $i$; $MAP(\vec{q}^{\,0}_m, \vec{u})$, stating that the mediator reads a message $m$ using variables $\vec{u}$ as output parameters for the message; $MAP(\vec{u}, \vec{q}^{\,i}_m)$, stating that the mediator sends a message $m$ to service $S_i$ using variables $\vec{u}$ as input parameters.
As an example of the kind of (sub)formulas we use, consider the characterization of executing an atomic process. Let us assume that the service $S_i$ is executing, mimicking the call of an atomic process in the goal $S_g$. In particular, let $S_i$ be in the state $st^i_h$ with a transition labeled by a guarded action $\phi / a(\vec{x}_i; \vec{y}_i)$ leading to a state $st^i_{h'}$, let the goal $S_g$ be in $st^g_k$ with a transition labeled by a guarded action $\phi' / a(\vec{x}_g; \vec{y}_g)$ leading to a state $st^g_{k'}$, and let us assume that both $\phi$ and $\phi'$ evaluate to true with respect to assignment $\hat{\hat{\alpha}}$ and svc $\hat{\hat{\gamma}}$. Then we have
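$$
\begin{aligned}
[*]\bigl(\,(exec_i \wedge exec_g \wedge st^i_h \wedge st^g_k \wedge \hat{\hat{\gamma}} \wedge \hat{\hat{\alpha}} \wedge \hat{\hat{I}}) \rightarrow{} & \langle a \rangle \top \wedge [-a]\bot \;\wedge \\
& [a](st^i_{h'} \wedge st^g_{k'}) \;\wedge \\
& \bigwedge_{(\hat{\hat{\gamma}}', \hat{\hat{\alpha}}', \hat{\hat{I}}') \in E} \langle a \rangle (\hat{\hat{\gamma}}' \wedge \hat{\hat{\alpha}}' \wedge \hat{\hat{I}}') \;\wedge \\
& [a]\Bigl(\bigvee_{(\hat{\hat{\gamma}}', \hat{\hat{\alpha}}', \hat{\hat{I}}') \in E} \hat{\hat{\gamma}}' \wedge \hat{\hat{\alpha}}' \wedge \hat{\hat{I}}'\Bigr) \;\wedge \\
& [a](exec_i \wedge exec_g)\,\bigr)
\end{aligned}
$$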
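where each $(\hat{\hat{\gamma}}', \hat{\hat{\alpha}}', \hat{\hat{I}}') \in E$ is the PDL representation of a triple $(\hat{\gamma}', \hat{\alpha}', \hat{I}')$ such that for the action $a(\vec{x}_i; \vec{y}_i) / a(\vec{x}_g; \vec{y}_g)$ we have that $(\hat{\gamma}, \hat{\alpha}, \hat{I}) \vdash (\hat{\gamma}', \hat{\alpha}', \hat{I}')$, where $\hat{\hat{\alpha}}'_i$ and $\hat{\hat{\alpha}}'_g$ are the only parts of $\hat{\hat{\alpha}}'$ that may be different from $\hat{\hat{\alpha}}$.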
This formula states that every time $S_i$ and $S_g$ are executing and they are in states $st^i_h$ and $st^g_k$ respectively, and $\hat{\hat{\alpha}}$ and $\hat{\hat{\gamma}}$ hold, then: (i) the atomic process $a$ is activated next (and no other actions are possible); (ii) executing $a$ leads $S_i$ and $S_g$ to the states $st^i_{h'}$ and $st^g_{k'}$, respectively; (iii) there is an execution branch for each $(\hat{\hat{\gamma}}', \hat{\hat{\alpha}}', \hat{\hat{I}}') \in E$; (iv) the only possible next $(\hat{\hat{\gamma}}', \hat{\hat{\alpha}}', \hat{\hat{I}}')$ must be in $E$; (v) the service $S_i$ and the goal $S_g$ will continue executing next.
$^1$ In fact, due to Strict Mediation, $DO(?m)$ is completely determined by the execution of a send by a component service.
As another example, consider the case where the mediator sends a message to a service $S_i$. Among others, we will have the following subformula:
$$
[*]\bigl(exec_0 \wedge DO(!m) \wedge \hat{\hat{\alpha}} \wedge MAP(\vec{u}, \vec{q}^{\,i}_m) \rightarrow [!m](\hat{\hat{\alpha}}' \wedge exec_i \wedge \neg exec_g)\bigr)
$$
where $\hat{\hat{\alpha}}'$ is the result of updating $\hat{\hat{\alpha}}$ by assigning the values in the variables $\vec{u}$ of the local store of $S_0$ to the queue $\vec{q}^{\,i}_m$ associated with the port for the message $m$ of the service $S_i$. Notice that in fact the only part of $\hat{\hat{\alpha}}'$ that is different from $\hat{\hat{\alpha}}$ is the part relative to the port variables for message $m$ in service $S_i$.
This formula states that, if the mediator is executing with current assignment $\hat{\hat{\alpha}}$ and it guessed to send the message $!m$ with parameters $\vec{u}$ to the service $S_i$, then next the assignment will be changed to $\hat{\hat{\alpha}}'$, which differs from $\hat{\hat{\alpha}}$ in the values assigned to the port for $m$ in $S_i$. Also, the execution is next left to $S_i$, while the goal will not be (in fact will continue not to be) in execution.
Finally, among the structural part of the formula, prominent parts are those of the form
$$
\langle * \rangle (exec_0 \wedge st^0_i \wedge \hat{\hat{\alpha}}_0 \wedge \hat{\hat{\gamma}} \wedge DO(!m)) \rightarrow [*](exec_0 \wedge st^0_i \wedge \hat{\hat{\alpha}}_0 \wedge \hat{\hat{\gamma}} \rightarrow DO(!m))
$$
which state that a guessed proposition, $DO(!m)$ in this case, must assume the same value everywhere the mediator is executing in a certain state $st^0_i$ with a certain assignment $\hat{\hat{\alpha}}_0$ for its $LStore$ and $QStore$ and with a certain sv-characterization $\hat{\hat{\gamma}}$.
Lemma 6.1: Assume that all services are in Colombo$^{k,b}$, and assume No External Modifications. Let $\mathcal{G} = (C, \{G\}, L)$ be a goal system and $U$ a finite family of available web services, all of which satisfy Blocking Behavior and Bounded Access. For each $p$, $q$, let $\Phi^{G,U}_{p,q}$ be the PDL formula constructed as above. Then, if $\Phi^{G,U}_{p,q}$ is satisfiable, there exists a system $\mathcal{S} = (C, \{S_0, S_1, \ldots, S_n\}, L')$, where $S_0$ is a $(p, q)$-bounded mediator, $S_1, \ldots, S_n \in U$, and the linkage $L'$ satisfies Strict Mediation, that is (symbolically) equivalent to $\mathcal{G}$.
Indeed, by the tree-model property of PDL, if $\Phi^{G,U}_{p,q}$ is satisfiable, then it admits a tree-like model. From such a model we can directly extract a symbolic execution tree for the goal and for $\mathcal{S}$. To determine which services actually take part in the composition, it is sufficient to consider those services $S_i$ for which $exec_i$ is true at least once.
Observe that, from a model of $\Phi^{G,U}_{p,q}$, one can also directly obtain a specification of $S_0$. This can be done by considering, for each of the $p$ states of $S_0$ and for each value of $\hat{\hat{\alpha}}_0$ and $\hat{\hat{\gamma}}$, which of the guessed propositions are true. (Notice that the part of the PDL formula related to such guesses ensures that the state together with $\hat{\hat{\alpha}}_0$ and $\hat{\hat{\gamma}}$ determines once and for all the value of the guessed propositions in the whole model.) From the guessed propositions one can define the transitions of the guarded automaton for $S_0$, extracting from $\hat{\hat{\alpha}}_0$ and $\hat{\hat{\gamma}}$ the guards, and from the $DO$ and $MAP$ propositions (see Appendix B) the actions and their parameters, respectively. Considering that the local store and the queue store for a $(p, q)$-bounded mediator whose linkage satisfies Strict Mediation are pre-determined, this provides a complete characterization of the mediator.
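The extraction just described can be sketched as a table look-up over the nodes of a model. The Python fragment below is only an illustration: the encoding of a model node as a dictionary with the keys `exec`, `state`, `alpha0`, `gamma` and `guess` is hypothetical, as are the function names. It collects the services for which the execution proposition holds at least once, and builds the mediator's decision table, failing if two nodes with the same state, assignment and svc carry different guesses (which the structural part of the formula is meant to rule out).

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

World = Dict[str, object]  # assumed encoding of one node of a tree-like model
Key = Tuple[Hashable, Hashable, Hashable]


def participating_services(worlds: Iterable[World]) -> Set[int]:
    """Indices i >= 1 of component services with exec_i true at least once."""
    return {i for w in worlds for i in w["exec"] if i > 0}


def extract_mediator(worlds: Iterable[World]) -> Dict[Key, object]:
    """Map (state, alpha_0, gamma) of the mediator to its guessed move."""
    table: Dict[Key, object] = {}
    for w in worlds:
        if 0 not in w["exec"]:  # index 0 is the mediator; skip other nodes
            continue
        key = (w["state"], w["alpha0"], w["gamma"])
        guess = w["guess"]      # e.g. the true DO/NEXT/MAP propositions
        if table.setdefault(key, guess) != guess:
            raise ValueError(f"guessed propositions differ for {key}")
    return table
```

Each entry of the resulting table corresponds to one guarded transition of the mediator: the key supplies the guard and the guess supplies the action, its parameters and the target state.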
7 Conclusion and Future Work
In this paper we have presented Colombo, a framework for automatic web service composition that addresses (i) message exchanges, (ii) data flow management, and (iii) effects on the real world, thus unifying the main approaches that are currently undertaken by the research community for the service composition problem. Through a complex example we have shown all the peculiarities of the approach. We have presented a novel technique, based on case tree building and on an encoding in PDL, for computing the composition of web services.
In future work we will remove some of the assumptions that we considered in this work (characterizing Colombo$^{k,b}$). We will consider complex types (i.e., arbitrary XML data types that can be transmitted between services), more general accesses to data stores, and queues of arbitrary, but finite, length.
Acknowledgement
The authors would like to thank Maurizio Lenzerini, Jianwen Su and the members of the SWSL working group for valuable discussions.
References
[1] G. Alonso, F. Casati, H. Kuno, and V. Machiraju. Web Services. Concepts, Architectures and Applications. Springer, 2004.
[2] T. Andrews, F. Curbera, H. Dholakia, Y. Goland, J. Klein, F. Leymann, K. Liu, D. Roller, D. Smith, S. Thatte, I. Trickovic, and S. Weerawarana. Business Process Execution Language for Web Services (BPEL4WS), Version 1.1. http://www-106.ibm.com/developerworks/library/ws-bpel/, 2004.
[3] Ariba, Microsoft, and IBM. Web Services Description Language (WSDL) 1.1. Available online: http://www.w3.org/TR/2001/NOTE-wsdl-20010315, 2001.
[4] C. Batini and M. Mecella. Enabling Italian e-Government Through a Cooperative Architecture. IEEE Computer, 34(2):40–45, 2001.
[5] D. Berardi, D. Calvanese, G. De Giacomo, M. Lenzerini, and M. Mecella. Automatic Composition of e-Services that Export their Behavior. In Proceedings of the 1st International Conference on Service Oriented Computing (ICSOC 2003), volume 2910 of LNCS, pages 43–58. Springer, 2003.
[6] T. Bultan, X. Fu, R. Hull, and J. Su. Conversation Specification: A New Approach to Design and Analysis of E-Service Composition. In Proceedings of the 12th International World Wide Web Conference (WWW 2003), pages 403–410. ACM, 2003.
[7] G. De Giacomo and M. Mecella. Service Composition. Technologies, Methods and Tools for Synthesis and Orchestration of Composite Services and Processes. Tutorial at 2nd International Conference on Service Oriented Computing (ICSOC 2004), 2004.
[8] A. Deutsch, L. Sui, and V. Vianu. Specification and Verification of Data-driven Web Services. In Proceedings of the 23rd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2004), pages 71–82. ACM, 2004.
[9] M. J. Fischer and R. E. Ladner. Propositional dynamic logic of regular programs. Journal of Computer and System Sciences, 18:194–211, 1979.
[10] X. Fu, T. Bultan, and J. Su. Analysis of interacting BPEL web services. In Proceedings of the 13th International World Wide Web Conference (WWW 2004), pages 621–630. ACM, 2004.
[11] B. Grosof, M. Gruninger, M. Kifer, D. Martin, D. McGuinness, B. Parsia, T. Payne, and A. Tate. Semantic Web Services Language Requirements. http://www.daml.org/services/swsl/requirements/swsl-requirements.shtml, 2004.
[12] D. Harel, D. Kozen, and J. Tiuryn. Dynamic Logic. The MIT Press, 2000.
[13] P. Helland. Data on the outside versus data on the inside. In CIDR, pages 144–153, 2005.
[14] R. Hull and J. Su. Domain independence and the relational calculus. Acta Informatica, 31(6):513–524, 1994.
[15] D. Martin, M. Paolucci, S. McIlraith, M. Burstein, D. McDermott, D. McGuinness, B. Parsia, T. Payne, M. Sabou, M. Solanki, N. Srinivasan, and K. Sycara. Bringing Semantics to Web Services: The OWL-S Approach. In 1st International Workshop on Semantic Web Services and Web Process Composition (SWSWPC 2004), 2004.
[16] S. McIlraith, T. Son, and H. Zeng. Semantic Web Services. IEEE Intelligent Systems, 16(2):46–53, 2001.
[17] R. Reiter. Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. The MIT Press, 2001.
[18] E. Sirin, B. Parsia, D. Wu, J. Hendler, and D. Nau. HTN planning for Web Service composition using SHOP2. J. Web Sem., 1(4):377–396, 2004.
[19] P. Traverso and M. Pistore. Automated Composition of Semantic Web Services into Executable Processes. In Proceedings of the Third International Semantic Web Conference, pages 380–394, 2004.
A Selected Formal Details of the Model
This appendix includes some additional formal definitions for selected aspects of the Colombo framework and the Colombo$^{k,b}$ model, and augments the material presented in Section 3.
A.1 Atomic Processes
Remark A.1: Unlike OWL-S atomic processes, we do not use a “pre-condition”, or equivalently, we assume that the pre-condition is uniformly true. We do this to enable a more uniform treatment of atomic process executions: when a web service invokes an atomic process in Colombo, the invoking service will transition to a new state whether or not the atomic process “succeeds”. Optionally, the designer of the atomic process can include an output boolean variable ‘flag’, which is set to true if the execution “succeeded” and is set to false if the execution “failed”. These are conveniences that simplify book-keeping, with no real impact on expressive power.
Definition: An atomic process is an object p which has a signature of the form (I, O, CE) with the following properties.
Input Signature: I is a sequence $\langle u_1{:}d_1, \ldots, u_n{:}d_n \rangle$ where the $u_i$'s are distinct variables, and each $d_i \in \{Bool, Eq, Leq\}$. For example, if $d_i = Eq$, then the value associated with $u_i$ in an invocation of p should be an element of $Dom_{=}$ (or ω).
Output Signature: O is a sequence $\langle v_1{:}d_1, \ldots, v_m{:}d_m \rangle$ where the $v_j$'s are distinct variables and each $d_j \in \{Bool, Eq, Leq\}$. For example, if $d_j = Eq$, then the value assigned to $v_j$ by an invocation of p will be an element of $Dom_{=}$ (or ω).
Conditional Effects: CE is a set of pairs of the form (c, E), where c is an (atomic process) condition and E is a finite non-empty set of (atomic process) effect (specifications).
Condition c is a boolean expression over atoms over accessible terms over some family of constants and the variables $u_1, \ldots, u_n$.
An effect e ∈ E is a pair (es, ev) where:
Effect on World State: es is a set of expressions having the forms
(i) insert $R(t_1, \ldots, t_k; s_1, \ldots, s_l)$
(ii) delete $R(t_1, \ldots, t_k)$
(iii) modify $R(t_1, \ldots, t_k; r_1, \ldots, r_l)$
where R ranges over relations in the world schema, R has a key of length k and l additional columns, where the $t_i$'s and $s_j$'s are accessible terms over some set of constants and $u_1, \ldots, u_n$, and where each $r_j$ is either an accessible term over some set of constants and $u_1, \ldots, u_n$ or the special symbol ‘−’ (denoting that that position of the identified tuple in R should be unchanged).
Effect on Output Variables: ev is a set of expressions of the form
(iv) $v_j := t$, where j ∈ [1..m] and t is an accessible term over some set of constants and $u_1, \ldots, u_n$,
(v) $v_j := \omega$, where j ∈ [1..m].
There must be exactly one expression for each $v_j$, j ∈ [1..m].
We now describe the semantics associated with atomic process execution.An
atomic process p with characteristics as specified above will be invoked in the con-
text of (i) an assignment α over a set X of variables (which typically corresponds to
the local store of a web service);(ii) a world state I,and (iii) a family Σ of integrity
constraints on the world schema.
A semantics is associated with the execution of p as follows.(This semantics is
straightforward but intricate,so we include the detailed definition to avoid ambiguity.)
Atomic process p is invoked in the context of variable set X using an expression having the form $p(y_1, \ldots, y_n; z_1, \ldots, z_m)$ where the $y_i$'s are distinct elements of X, and the $z_j$'s are distinct elements of X. The result of executing this specification will depend on α, I, and Σ, and will result in an assignment $\alpha'$ and world state $I'$ (which may be identical to α and I, respectively).
(a) If no conditions in CE are true in I under α then this execution of p has a “no-op” effect, i.e., $\alpha'$ is simply α and $I'$ is simply I. If two or more conditions in CE are true under α then again this execution of p has no-op effect.
For the remainder,assume that (c,E) is the pair in CE where c is the unique
condition in CE that is true in I under α.Assume further that (es,ev) is a (non-
deterministically chosen) element of E.
(b) If in any of the insert,delete and/or modify expressions,as interpreted using
α,there is an ω value occurring in a key field,then execution of p has no-op
effect.
(c) If there is a “conflict” between any of the insert,delete and/or modify ex-
pressions in es,as interpreted using α (e.g.,the expressions under α call for
inserting two tuples with the same key,or inserting a tuple with a given key but
also deleting a tuple with that key,etc.),then execution of p has no-op effect.
We now define the potential effect (on the world state) of executing es,assuming
assignment α,world state I,and constraint set Σ,and assuming that item (c) above
does not apply.After defining the notion of potential effect,we describe the conditions
under which it will actually be applied (namely,if applying it does not violate any
constraints in Σ.)
(d) For each expression insert $R(t_1, \ldots, t_k; s_1, \ldots, s_l)$, in $I'$ the tuple $\langle \alpha(t_1), \ldots, \alpha(t_k), \alpha(s_1), \ldots, \alpha(s_l) \rangle$ is in R. If there was a different tuple $\langle \alpha(t_1), \ldots, \alpha(t_k), c_1, \ldots, c_l \rangle$ in R in I, that tuple is not present in R in $I'$.
(e) For each expression delete $R(t_1, \ldots, t_k)$, in $I'$ there is no tuple in R with first k fields being $\alpha(t_1), \ldots, \alpha(t_k)$.
(f) For each expression modify $R(t_1, \ldots, t_k; r_1, \ldots, r_l)$, there are two cases. If in I there is no tuple in R having key $\langle \alpha(t_1), \ldots, \alpha(t_k) \rangle$, then in $I'$ there is no tuple in R having that key. If in I there is a tuple of the form $\langle \alpha(t_1), \ldots, \alpha(t_k), c_1, \ldots, c_l \rangle$, then in $I'$ there is a tuple having the form $\langle \alpha(t_1), \ldots, \alpha(t_k), c'_1, \ldots, c'_l \rangle$, where for each j ∈ [1..l], $c'_j = \alpha(r_j)$ if $r_j$ is not ‘−’, and $c'_j$ is $c_j$ otherwise.
(g) (Frame condition): The potential effect state $I'$ is identical to I except as indicated in items (d), (e), and (f).
Finally,we describe the conditions under which the effect of executing (es,ev)
should actually be applied to I and α,and describe the impact on both of those.
(h) Assume that item (b) does not apply, item (c) does not apply, that (es, ev) is a non-deterministic choice from E, and that the potential effect of executing es is as described in items (d), (e), (f), and (g). If $I'$ satisfies all of the integrity constraints in Σ, then the world state becomes $I'$ after this execution of (es, ev).
(i) Also, assume that the conditions of item (h) hold. Then the new assignment $\alpha'$ is constructed from α as follows. For each variable $z_j$, j ∈ [1..m], if ‘$v_j := t$’ occurs in ev, then $\alpha'(z_j)$ is given the value of $\alpha(t)$ as interpreted over $I'$; and is given the value ω otherwise. Assignment $\alpha'$ is identical to α on local variables of S not occurring among $z_1, \ldots, z_m$.
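The execution semantics in items (a) through (i) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not part of Colombo: a world state is a dictionary from relation names to `{key tuple: non-key tuple}`, conditions and effects are Python callables, `OMEGA` and `KEEP` stand in for ω and ‘−’, any two operations touching the same key are treated as a conflict, an integrity-constraint violation is treated as a no-op, and `ev` returns values keyed directly by the caller's variable names (skipping the $v_j$ to $z_j$ renaming). The non-deterministic choice from E is fixed by the `choose` parameter.

```python
OMEGA = None        # stand-in for the undefined value ω (assumption of this sketch)
KEEP = object()     # stand-in for the '−' placeholder of modify (assumption)


def apply_es(world, ops):
    """Compute the potential effect I' of ground operations, or None for a no-op.

    ops: ("insert", R, key, vals) | ("delete", R, key) | ("modify", R, key, vals)
    """
    touched = set()
    for op in ops:
        rel, key = op[1], op[2]
        if any(k is OMEGA for k in key):       # item (b): ω in a key field
            return None
        if (rel, key) in touched:              # item (c): simplified conflict test
            return None
        touched.add((rel, key))
    # Item (g), frame condition: copy everything, then change only what ops name.
    new_world = {rel: dict(rows) for rel, rows in world.items()}
    for op in ops:
        kind, rel, key = op[0], op[1], op[2]
        rows = new_world.setdefault(rel, {})
        if kind == "insert":                   # item (d): replaces a same-key tuple
            rows[key] = tuple(op[3])
        elif kind == "delete":                 # item (e)
            rows.pop(key, None)
        elif kind == "modify" and key in rows:  # item (f): no tuple, no change
            rows[key] = tuple(old if new is KEEP else new
                              for old, new in zip(rows[key], op[3]))
    return new_world


def run_atomic_process(ce, alpha, world, constraints, choose=lambda effects: effects[0]):
    """ce: list of (condition, effects); each effect is a pair (es, ev) of callables."""
    enabled = [effects for cond, effects in ce if cond(alpha, world)]
    if len(enabled) != 1:                      # item (a): zero or several true conditions
        return alpha, world
    es, ev = choose(enabled[0])                # non-deterministic choice, fixed here
    candidate = apply_es(world, es(alpha, world))
    if candidate is None or not all(check(candidate) for check in constraints):
        return alpha, world                    # treated as a no-op in this sketch
    new_alpha = dict(alpha)
    new_alpha.update(ev(alpha, candidate))     # item (i): outputs evaluated over I'
    return new_alpha, candidate
```

Returning the unchanged `(alpha, world)` pair for every failure case mirrors the design choice of Remark A.1: the invoking service moves on whether or not the atomic process "succeeds".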
Remark A.2: A broad variety of generalizations are possible in the Colombo framework, e.g., to move away from exclusively key-based look-ups; letting variables range over sets of tuples in addition to single tuples, etc. ✷
A.2 Linkages,Stores,Transmit,Read
Remark A.3: The notion of linkage is closely related to the notion of linkage in BPEL, and is used implicitly in the “service schema” of the Conversation model. The notion of link is inspired by, and closely related to, channels as typical of process algebras, and as found in the emerging SWSL ontology. In Colombo$^{k,b}$ we do not change the linkage at runtime, but in principle such dynamic changes can be supported in the Colombo framework. Other variations can be represented in Colombo, such as allowing multiple services to give input to a channel, or having multiple services read a message in a channel. ✷
Let S be a web service, which is not a client. The local store of S, typically denoted as LStore or $LStore_S$, is a finite set $\{v_1{:}d_1, \ldots, v_n{:}d_n\}$ where the $v_i$'s are distinct variables and the $d_i$'s are types from {Bool, Eq, Leq}. For each incoming port (m, in) of S we assume that there is a distinguished boolean variable $\pi_m$ in $LStore_S$, called the message-present flag or variable; intuitively, this will be set to true if a message has arrived into the queue of (m, in), and is set to false when that message is read by the service (see below).
In Colombo we assume that a message that has been transmitted is held in a queue associated to the incoming port of the receiving service. In general, the queues might be bounded or unbounded. For the current paper, in Colombo$^{k,b}$ we assume that the queues are bounded and have length one.
In addition to the local store, each non-client service S has a queue store, typically denoted by QStore or $QStore_S$. This store is used to hold the parameter values of incoming messages, which can be thought of as being held by a queue. Specifically, for each incoming port (m, in) of S, where m has signature $\langle d_1, \ldots, d_l \rangle$, we include l variables denoted as $v^m_k$, for k ∈ [1..l].
We use Store or $Store_S$ to denote the union $LStore_S \cup QStore_S$.
For passing messages between services we have two basic operations: transmit and read. The syntax of operator transmit, used in the process specification of the sending service S, is $!m(r_1, \ldots, r_l)$, where each $r_k$ is either a constant or a variable in $LStore_S$, of appropriate type. Let $(S, m, S', n)$ be a link between services S and $S'$, and let α be a variable assignment for $LStore_S$ at some point during an enactment of a system involving S and $S'$. Execution of $!m(r_1, \ldots, r_l)$ at this point will succeed iff the queue of $S'$ for (n, in) has room (in our case, if the queue is empty). In this case, the variable $v^n_k$ of $QStore_{S'}$ will be assigned the value $\alpha(r_k)$, for k ∈ [1..l], and the message-present flag $\pi_n$ of $S'$ is set to true. As far as the processing of $S'$ is concerned, the receiving of a transmitted message is essentially an asynchronous event, and is not explicitly represented in the process model specification for $S'$.
What happens if the queue for (n, in) is full? Several options are available in the general Colombo framework. A natural option, which makes the transmit operator similar to Colombo atomic processes, is to assume that this operator is “executed”, but that it has no impact on $S'$, and that a flag is set in S (e.g., so that the transmit could be attempted again later on). However, in Colombo$^{k,b}$, we assume that this operator blocks until the queue of $S'$ has room.
We now turn to the read operation. The syntax of this operator, specified in the process specification of the service $S'$ that receives messages transmitted over a link $(S, m, S', n)$, is $?n(v_1, \ldots, v_l)$, where each $v_k$ is a variable in $LStore_{S'}$ (but not any of the distinguished message-present flags), of appropriate type. Let α be the assignment in effect for $LStore_{S'}$ and β the assignment for $QStore_{S'}$, and assume that $\alpha(\pi_n)$ is true. The effect of executing $?n(v_1, \ldots, v_l)$ is that α is modified to become $\alpha'$, where $\alpha'(v_k) = \beta(v^n_k)$ for k ∈ [1..l], that $\alpha'(\pi_n)$ is set to false, and $\alpha'$ is identical to α elsewhere.
What happens if service $S'$ with assignment α on $LStore_{S'}$ attempts to execute $?n(v_1, \ldots, v_l)$, but $\alpha(\pi_n)$ is false? As with transmit, several options are available in the general Colombo framework. A natural option, which makes the read operator similar to Colombo atomic processes, is to assume that this read operator is “executed”, but that it has no impact on $S'$, except perhaps for setting a flag. Alternatively, the designer of $S'$ could include a test on $\pi_n$ before attempting to execute the read. However, in Colombo$^{k,b}$, we assume that the read operator ?n blocks until there is a message in the queue of (n, in).
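The transmit and read operations over length-one queues can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `Service`, the functions `transmit` and `read`, and the key names such as `pi_n` are hypothetical; blocking is modelled by returning `False` (the caller would retry later) instead of suspending; and a transmit argument is treated as a variable when it is a string naming an entry of the sender's local store, and as a constant otherwise. Resetting the queue variables to ω on read follows the Read Message case of the moves-to relation in Section A.3.

```python
OMEGA = None  # stand-in for the undefined value ω (assumption of this sketch)


class Service:
    def __init__(self, local_vars, in_ports):
        # in_ports: {message name: arity}; one length-one queue per incoming port
        self.lstore = {v: OMEGA for v in local_vars}
        self.qstore = {}
        for n, arity in in_ports.items():
            self.lstore[f"pi_{n}"] = False            # message-present flag π_n
            for k in range(1, arity + 1):
                self.qstore[(n, k)] = OMEGA           # queue variable v^n_k


def transmit(sender, receiver, n, args):
    """!m(r_1..r_l) over a link whose receiving port is (n, in).

    Returns False when the operator would block because the queue is full.
    """
    if receiver.lstore[f"pi_{n}"]:
        return False
    for k, r in enumerate(args, start=1):
        is_var = isinstance(r, str) and r in sender.lstore
        receiver.qstore[(n, k)] = sender.lstore[r] if is_var else r
    receiver.lstore[f"pi_{n}"] = True
    return True


def read(receiver, n, targets):
    """?n(v_1..v_l): returns False when the operator would block (no message)."""
    if not receiver.lstore[f"pi_{n}"]:
        return False
    for k, v in enumerate(targets, start=1):
        receiver.lstore[v] = receiver.qstore[(n, k)]
        receiver.qstore[(n, k)] = OMEGA               # queue is emptied by the read
    receiver.lstore[f"pi_{n}"] = False
    return True
```

With queues of length one, a second `transmit` to the same port cannot succeed until the receiver has executed `read`, which is the behaviour that makes message passing essentially synchronous.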
Remark A.4: The assumption that all queues have length one, along with a subsequent restriction on web services that they are “blocking” as just described, ends up implying that all message transmissions are essentially synchronous, as typical of process algebras, in that a message send and the receiving/reading of the message must happen with no intervening activities (neither atomic process invocations nor other message sends). However, we maintain the formalism of queues in Colombo$^{k,b}$, because we expect that the results obtained in the current paper can be generalized to support broader models for message passing as typically arise in the web service standards and research literature. In particular, it is easy to see how the notion of queue store can be extended to support queues with arbitrary bounded size. ✷
We now describe how message passing works with a client service. We assume that a client C has access to a finite set $Constants_C$ of elements from Dom, which are the constants available to C at any time. For client C we also maintain a unary relation, denoted HasSeen or $HasSeen_C$, which holds elements of Dom. Intuitively, at a given time in an execution of C, $HasSeen_C$ will include all of $Constants_C$, and also all domain elements that occur in messages that have been transmitted to C.
What happens if a service S with assignment α executes a transmit operation $!m(r_1, \ldots, r_l)$ directed at C? In Colombo$^{k,b}$ we assume that this always succeeds, and that $HasSeen_C$ is replaced with $HasSeen_C \cup \{\alpha(r_k) \mid k \in [1..l]\}$. Intuitively, then, when a message is transmitted to C it is also read by C immediately.
In Colombo$^{k,b}$, we assume that a client C can transmit a message at the beginning of its enactment, but after that, it can transmit a message only after it has received a message. We assume that C acts non-deterministically, and that after receiving a message it can execute any transmit $!m(c_1, \ldots, c_l)$ where (m, out) is a port of C (that occurs in some link of the system) and $c_k \in HasSeen_C$ for each k ∈ [1..l]. Aside from these restrictions on transmit and $HasSeen_C$, we do not model in Colombo$^{k,b}$ the internal workings of client C.
A.3 Internal process model,services,clients,systems
Let S be a service with signatures Port(S), GA(S). An instantaneous description, or id, of S over world schema W and constraints Σ is a tuple (s, α, β) where s is a state of GA(S), α is an assignment for $LStore_S$, β is an assignment for $QStore_S$. In general we consider pairs of the form $(id_S, I)$, where $id_S$ is an id over S and I is a world state over W that satisfies Σ. We sometimes use id to denote $id_S$, if S is understood from the context.
An id $id_S = (s, \alpha, \beta)$ is initial if s is the start state of S, $\alpha(\pi_n) = F$ for each n where (n, in) ∈ Port(S) and α is ω elsewhere, and β is uniformly set to ω.
We now define the “moves-to” relation and the “trace” for individual services. The moves-to relation $\vdash_S$ (or simply $\vdash$ if S is understood from the context) will hold between pairs of the form $(id_S, I), (id'_S, I')$ under conditions presented below, and corresponds intuitively to cases where service S can move from one internal state to the next, and/or where the global store can change (e.g., if a message is received). The definition is more-or-less standard, except that we build in the possibility that moves by other services might be interspersed (see items (b), (c), and (d) below). The trace of a pair $(id_S, I), (id'_S, I')$ where $(id_S, I) \vdash_S (id'_S, I')$ will provide, intuitively, a grounded record or log of salient aspects of the transition from $(id_S, I)$ to $(id'_S, I')$, including, e.g., what parameter values were input/output from an atomic process invocation, or were received, read or sent. We define $\vdash$ and trace simultaneously. Let $(id_S, I), (id'_S, I')$ satisfy $id_S = (s, \alpha, \beta)$ and $id'_S = (s', \alpha', \beta')$.
(a) Atomic Process: Suppose that GA(S) has a transition from s to $s'$ labeled by $(g, p(r_1, \ldots, r_n; v_1, \ldots, v_m))$ where $g[\alpha]$ evaluates to true; that
$$((\alpha, I), p(r_1, \ldots, r_n; v_1, \ldots, v_m)) \vdash (\alpha', I');$$
and that $\beta'$ is β. Then $(id_S, I) \vdash (id'_S, I')$. Also, the trace of pair $(id_S, I), (id'_S, I')$, denoted $trace((id_S, I), (id'_S, I'))$, is $(p(c_1, \ldots, c_n; d_1, \ldots, d_m), I, I')$, where $c_i = \alpha(r_i)$ for i ∈ [1..n] and $d_j = \alpha'(v_j)$ for j ∈ [1..m].
(b) Receive Message: (Cases (b), (c), and (d) are for the case of Colombo$^{k,b}$; variations of these will be appropriate for other variants of Colombo.) Suppose that Port(S) includes (n, in) for some message type n with arity l; $\alpha(\pi_n) = F$ (i.e., the queue for message n is empty, and also $\beta(v^n_k) = \omega$ for k ∈ [1..l]); $\alpha'(\pi_n) = T$ and $\alpha'$ is identical to α elsewhere; $\beta'(v^n_k)$ has arbitrary values (consistent with the signature of n) and $\beta'$ is identical to β elsewhere. Then $(id_S, I) \vdash (id'_S, I')$. In this case, $trace((id_S, I), (id'_S, I'))$ is $(receive\ n(c_1, \ldots, c_l), I')$, where $c_k = \beta'(v^n_k)$ for k ∈ [1..l].
(Note in this case, and cases (c) and (d) below, there are no restrictions relating I to $I'$. Intuitively, this freedom is incorporated to reflect the possibility that in a system of services, S might move into $(id_S, I)$ at some time $t_1$, then other services might make a variety of moves including some that change the world state to $I'$ at time $t_2$, and finally a service might send a message of type n to S at time $t_2$. So as far as S is concerned, it was in $(id_S, I)$ just after time $t_1$, and then at time $t_2$ it moves to $(id'_S, I')$.)
(c) Read Message: Suppose that GA(S) has a transition from s to $s'$ labeled by $(g, ?n(v_1, \ldots, v_l))$ where $g[\alpha]$ evaluates to true; $\alpha(\pi_n)$ is true; $\alpha'(v_k) = \beta(v^n_k)$ for k ∈ [1..l], $\alpha'(\pi_n) = F$, and $\alpha'$ equals α elsewhere; $\beta'(v^n_k) = \omega$ for k ∈ [1..l] and $\beta'$ equals β elsewhere. Then $(id_S, I) \vdash (id'_S, I')$. In this case, $trace((id_S, I), (id'_S, I'))$ is $(?n(d_1, \ldots, d_l), I')$, where $d_k = \alpha'(v_k)$ for k ∈ [1..l].
(d) Transmit Message: Suppose that GA(S) has a transition from s to $s'$ labeled by $(g, !m(r_1, \ldots, r_l))$ where $g[\alpha]$ evaluates to true; $\alpha'$ is identical to α; and $\beta'$ is identical to β. Then $(id_S, I) \vdash (id'_S, I')$. In this case, $trace((id_S, I), (id'_S, I'))$ is $(!m(c_1, \ldots, c_l), I')$, where $c_k = \alpha(r_k)$ for k ∈ [1..l].
An enactment of S is a finite sequence $E = \langle (id_1, I_1), \ldots, (id_q, I_q) \rangle$, q ≥ 1, where (a) $id_1$ is an initial id for S, and (b) $(id_p, I_p) \vdash (id_{p+1}, I_{p+1})$ for each p ∈ [1..(q − 1)]. The enactment is successful if $id_q$ is in a final state of GA(S).
The notion of execution tree for S is now defined. (This can be viewed as a stepping stone for defining execution tree for a system $\mathcal{S}$.) Intuitively, an execution tree is an infinitely branching tree T that records all possible enactments. The root is not labeled, and all other nodes are labeled by pairs of the form (id, I) where id is an id of S and I a valid world state. For the children of the root, the id is the initial id of S and I is arbitrary. An edge $((id, I), (id', I'))$ is included in the tree if $(id, I) \vdash (id', I')$; in this case the edge is labeled by $trace((id, I), (id', I'))$. A node (id, I) in the execution tree is terminating if id is in a final state of GA(S).
We now turn to clients. As noted earlier, we model clients as a special kind of web service. Clients correspond intuitively to a human (or automated) agent which interacts with one or more web services to accomplish some goals. While abstract properties of the internal model of non-client web services are specified in considerable detail (using notions of local store, automata-based process model, etc.), the abstract properties of the client are described with only salient details.
At this point we are focused primarily on Colombo$^{k,b}$, but make our definitions slightly more general, so that they can be used with other studies in the broader Colombo framework. An instantaneous description (id) for a client C (in the case of Colombo$^{k,b}$) is a pair (s, HasSeen), where s ∈ {ReadyToTransmit, ReadyToRead} and HasSeen is a unary relation over elements of Dom (holding, intuitively, all domain elements that C has “seen” up to this point in an execution). This id is initial if s = ReadyToTransmit and HasSeen is the set of constants present in the definition of C.
We now define the moves-to relation and notion of trace for clients, in the restricted case of Colombo$^{k,b}$. Let $(id_C, I), (id'_C, I')$ satisfy $id_C = (s, HasSeen)$ and $id'_C = (s', HasSeen')$. For clients in Colombo$^{k,b}$, we combine the read activity with the receive activity. (As with receive, read, and send for services, the values of I, $I'$ are not restricted for receive. We insist that $I = I'$ for send, to capture the intuition that in Colombo$^{k,b}$ nothing can happen in between the client reading a message and then transmitting another one.)
(a) Receive Message (which includes Read): Suppose that Port(C) includes (n, in) for some message type n with arity l; s = ReadyToRead; $s'$ = ReadyToTransmit; let $\langle d_1, \ldots, d_l \rangle$ be a sequence of l (not necessarily distinct) domain elements; and let $HasSeen' = HasSeen \cup \{d_1, \ldots, d_l\}$. Then $(id_C, I) \vdash (id'_C, I')$. In this case, $trace((id_C, I), (id'_C, I'))$ is $(receive\ n(d_1, \ldots, d_l), I')$.
(b) Transmit Message: Suppose that Port(C) includes (m, out) for some message type m with arity l; $HasSeen' = HasSeen$; $\langle c_1, \ldots, c_l \rangle$ is a sequence of (not necessarily distinct) elements from HasSeen; s = ReadyToTransmit; $s'$ = ReadyToRead; and $I = I'$. Then $(id_C, I) \vdash (id'_C, I')$. In this case, $trace((id_C, I), (id'_C, I'))$ is $(!m(c_1, \ldots, c_l), I')$.
Note that in Colombo$^{k,b}$ there are several forms of non-determinism in the execution of a client C. This includes which kind of message to send, and which elements from HasSeen to send in that message. Other forms of non-determinism are present based on how C interacts with other services; these include that there are no restrictions on: I, $I'$ for receives, the timing of receives, and the parameter values of incoming messages.
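The client model above is small enough to sketch directly. The following Python class is a hypothetical illustration (the names `Client`, `receive`, and `transmit` are not from the paper): it keeps the two-state id (s, HasSeen), alternates between ReadyToTransmit and ReadyToRead, and resolves the client's non-determinism with a random generator. It assumes that HasSeen is non-empty whenever a message of positive arity is sent, and that message names are strings.

```python
import random


class Client:
    def __init__(self, constants, out_ports):
        # out_ports: {message name: arity}; the initial id is (ReadyToTransmit, Constants_C)
        self.state = "ReadyToTransmit"
        self.has_seen = set(constants)
        self.out_ports = dict(out_ports)

    def receive(self, values):
        """Receive (which includes read): HasSeen grows by the message parameters."""
        if self.state != "ReadyToRead":
            raise RuntimeError("client is not ready to read")
        self.has_seen |= set(values)
        self.state = "ReadyToTransmit"

    def transmit(self, rng=random):
        """Pick a message type and parameters from HasSeen non-deterministically."""
        if self.state != "ReadyToTransmit":
            raise RuntimeError("client must receive a message before transmitting again")
        m = rng.choice(sorted(self.out_ports))
        pool = sorted(self.has_seen, key=repr)      # fixed order so a seeded rng is repeatable
        args = tuple(rng.choice(pool) for _ in range(self.out_ports[m]))
        self.state = "ReadyToRead"
        return m, args
```

Passing a seeded `random.Random` instance as `rng` makes one particular enactment reproducible; enumerating all choices instead of sampling would yield the branches of the client's execution tree.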
The notion of (successful) enactment and execution tree for clients is defined anal-
ogously as for services.
The notion of instantaneous description (id) for system $\mathcal{S}$ is defined in a natural fashion, based on a generalization of id for individual services. Specifically, an id for $\mathcal{S}$ is a tuple $id_{\mathcal{S}} = (id_C, \{id_S \mid S \in \mathcal{F}\})$ where $id_C$ is an id of C and $id_S$ is an id of S for each $S \in \mathcal{F}$. This id is initial if the ids for C and the S's are initial.
Because of the blocking behaviors incorporated into Colombo$^{k,b}$, it turns out that in an enactment at most one service (or the client) will be “executing” at any time (i.e., no concurrency).
B Selected Formal Details of the PDL Encoding
B.1 Preliminaries on PDL
Propositional Dynamic Logic (PDL) is a well-known logic of programs developed to verify properties of program schemas [12]. PDL formulas are formed by starting from a set $\mathcal{P}$ of atomic propositions and a set $\mathcal{A}$ of atomic actions, according to the following abstract syntax:
$$\phi \longrightarrow P \mid \neg\phi \mid \phi_1 \wedge \phi_2 \mid \phi_1 \vee \phi_2 \mid \langle r \rangle\phi \mid [r]\phi$$
$$r \longrightarrow a \mid \phi? \mid r_1; r_2 \mid r_1 \cup r_2 \mid r^*$$
where P is an atomic proposition in $\mathcal{P}$, a is an atomic action in $\mathcal{A}$. That is, PDL formulas are composed from atomic propositions by applying arbitrary propositional connectives, and modal operators $\langle r \rangle\phi$ and $[r]\phi$, where r is a program formed as a regular expression over the atomic actions in $\mathcal{A}$ and the tests $\phi?$.
Intuitively, the modal operator $\langle r \rangle\phi$ expresses that there exists an execution of r reaching a state where φ holds, while $[r]\phi$ expresses that all terminating executions of r reach a state where φ holds (i.e., it expresses a partial correctness condition). As for programs, a means “execute action a”; $\phi?$ means “proceed only if φ is true”; $r_1 \cup r_2$ means “choose non-deterministically between $r_1$ and $r_2$”; $r_1; r_2$ means “first execute $r_1$ then execute $r_2$”; $r^*$ means “execute r a non-deterministically chosen number of times (zero or more)”.
A PDL interpretation is a Kripke structure of the form $M = (\Delta^M, \cdot^M)$, where $\Delta^M$ is a non-empty set of states, and $\cdot^M$ is an interpretation function which interprets atomic propositions as $P^M \subseteq \Delta^M$ (denoting the states in $\Delta^M$ where P is true) and atomic actions as $a^M \subseteq \Delta^M \times \Delta^M$ (denoting the state transitions caused by the atomic action a). The interpretation function $\cdot^M$ is extended to arbitrary formulas and programs as follows:
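$$
\begin{array}{rcl}
P^M & \subseteq & \Delta^M \\
(\neg\phi)^M & = & \Delta^M \setminus \phi^M \\
(\phi_1 \wedge \phi_2)^M & = & \phi_1^M \cap \phi_2^M \\
(\phi_1 \vee \phi_2)^M & = & \phi_1^M \cup \phi_2^M \\
(\langle r \rangle\phi)^M & = & \{ s \mid \exists s'.\ (s, s') \in r^M \wedge s' \in \phi^M \} \\
([r]\phi)^M & = & \{ s \mid \forall s'.\ (s, s') \in r^M \rightarrow s' \in \phi^M \} \\
a^M & \subseteq & \Delta^M \times \Delta^M \\
(\phi?)^M & = & \{ (s, s) \mid s \in \phi^M \} \\
(r_1 \cup r_2)^M & = & r_1^M \cup r_2^M \\
(r_1; r_2)^M & = & r_1^M ; r_2^M \\
(r^*)^M & = & (r^M)^*
\end{array}
$$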
A PDL formula φ is satisfiable iff there exists an interpretation M such that $\phi^M$ is nonempty. Checking satisfiability of a PDL formula is EXPTIME-complete in the size of the formula [9].
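To make the semantic clauses concrete, the following Python sketch evaluates a PDL formula over a given finite Kripke structure. It is model checking against one fixed interpretation, not the EXPTIME satisfiability procedure mentioned above. The tuple encoding of formulas and programs (`"prop"`, `"dia"`, `"box"`, `"act"`, `"test"`, `"seq"`, `"choice"`, `"star"`) and the function names are assumptions of this illustration.

```python
def compose(r1, r2):
    """Relational composition, used for r1;r2."""
    return {(s, u) for (s, t) in r1 for (t2, u) in r2 if t == t2}


def star(rel, states):
    """Reflexive-transitive closure of rel over the given states."""
    closure = {(s, s) for s in states}
    while True:
        bigger = closure | compose(closure, rel)
        if bigger == closure:
            return closure
        closure = bigger


def prog(r, M):
    """Interpret program r as a set of state pairs. M = (states, props, acts)."""
    states, props, acts = M
    op = r[0]
    if op == "act":
        return set(acts.get(r[1], set()))
    if op == "test":
        return {(s, s) for s in holds(r[1], M)}
    if op == "seq":
        return compose(prog(r[1], M), prog(r[2], M))
    if op == "choice":
        return prog(r[1], M) | prog(r[2], M)
    if op == "star":
        return star(prog(r[1], M), states)
    raise ValueError(f"unknown program constructor: {op}")


def holds(f, M):
    """Return the set of states of M in which formula f is true."""
    states, props, acts = M
    op = f[0]
    if op == "prop":
        return set(props.get(f[1], set()))
    if op == "not":
        return set(states) - holds(f[1], M)
    if op == "and":
        return holds(f[1], M) & holds(f[2], M)
    if op == "or":
        return holds(f[1], M) | holds(f[2], M)
    if op == "dia":                                  # <r>f
        target = holds(f[2], M)
        return {s for (s, t) in prog(f[1], M) if t in target}
    if op == "box":                                  # [r]f
        target, rel = holds(f[2], M), prog(f[1], M)
        return {s for s in states
                if all(t in target for (s2, t) in rel if s2 == s)}
    raise ValueError(f"unknown formula constructor: {op}")
```

For instance, with states {0, 1, 2}, proposition P true only in 2, and action a relating 0 to 1 and 1 to 2, the formula encoded as `("box", ("act", "a"), ("prop", "P"))` holds in states 1 and 2; state 2 satisfies it vacuously because it has no a-successor.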
PDL enjoys two properties that are of particular interest for our aims. The first is the tree model property, which says that every model of a formula can be unwound to a (possibly infinite) tree-shaped model (considering states in $\Delta^M$ as nodes and relations interpreting actions as edges). The second is the small model property, which says that every satisfiable formula admits a finite model whose number of states $|\Delta^M|$ is at most exponential in the size of the formula itself.
We use the standard abbreviations for booleans (e.g., true, false, →). We also use “−” as an abbreviation for the program $(\bigcup_{a \in \mathcal{A}} a)$, which denotes the execution of the next action, “−a” to denote all the actions in $\mathcal{A}$ except a, and “u” as an abbreviation for the program $(\bigcup_{a \in \mathcal{A}} a)^*$. Notice that [u] represents the master modality, which can be used to state universal assertions [12].
B.2 Encoding in PDL
Assume that all services are in Colombo$^{k,b}$, and assume No External Modifications. Let $\mathcal{G} = (C, \{G\}, \mathcal{L})$ be a goal system and $\mathcal{U} = \{S_1, \ldots, S_n\}$ a finite family of available web services, all of which satisfy Blocking Behavior and Bounded Access. Let p be the number of states and q the size of the local store of the mediator to be synthesized.
We present selected parts of the PDL formula $\Phi^{\mathcal{G},\mathcal{U}}_{p,q}$ encoding the composition synthesis problem. The formula is formed by a conjunction including the following formulas.
General constraints. There are several bookkeeping constraints that need to be formulated in PDL. The most important are:
$$[*](exec_q \rightarrow \neg exec_p)$$
for p ≠ q and p, q = 0, 1, ..., n. This says that only one component service or the mediator can be in execution at each step (sometimes the goal also will be executing).
$$[*](st^p_i \wedge \neg exec_p \rightarrow [-]\,st^p_i)$$
which says that if $S_p$, p = 0, 1, ..., n, g, is not executing, it does not change state.
There are some requirements on final states.
$$[*](Final_g \rightarrow Final_1 \wedge \cdots \wedge Final_n \wedge FINAL_0)$$
which says that when the goal is in a final state then the component services and the (to be synthesized) mediator are also in a final state.
Final is defined in the obvious way:
$$[*](st^p \rightarrow Final_p)$$
for every final state $st^p$ of the service $S_p$ with p = 1, ..., n, g.
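Constraints on guessing the Mediator behavior. The following constraints force the mediator to make exactly the same choices every time it finds itself in the same circumstances.
$$\langle * \rangle(st^0_i \wedge FINAL_0) \rightarrow [*](st^0_i \rightarrow FINAL_0)$$
$$\langle * \rangle(exec_0 \wedge st^0_i \wedge \widehat{\alpha}_0 \wedge \widehat{\gamma} \wedge DO(!m)) \rightarrow [*](exec_0 \wedge st^0_i \wedge \widehat{\alpha} \wedge \widehat{\gamma} \rightarrow DO(!m))$$
$$\langle * \rangle(exec_0 \wedge st^0_i \wedge \widehat{\alpha}_0 \wedge \widehat{\gamma} \wedge NEXT(st^0_j)) \rightarrow [*](exec_0 \wedge st^0_i \wedge \widehat{\alpha} \wedge \widehat{\gamma} \rightarrow NEXT(st^0_j))$$
$$\langle * \rangle(exec_0 \wedge st^0_i \wedge \widehat{\alpha}_0 \wedge \widehat{\gamma} \wedge MAP(\vec{q}^{\,0}_m, \vec{u})) \rightarrow [*](exec_0 \wedge st^0_i \wedge \widehat{\alpha} \wedge \widehat{\gamma} \rightarrow MAP(\vec{q}^{\,0}_m, \vec{u}))$$
$$\langle * \rangle(exec_0 \wedge st^0_i \wedge \widehat{\alpha}_0 \wedge \widehat{\gamma} \wedge MAP(\vec{u}, \vec{q}^{\,i}_m)) \rightarrow [*](exec_0 \wedge st^0_i \wedge \widehat{\alpha} \wedge \widehat{\gamma} \rightarrow MAP(\vec{u}, \vec{q}^{\,i}_m))$$
Note that all guessed propositions are disjoint from guessed propositions in the same family. Note also that if $S_0$ is not executing then the values of the guessed propositions are irrelevant.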
Mediator reads message from a service. These are the subformulas that characterize the mediator reading from a component service.
If the mediator is prescribed to do ?m next then it does it, and goes to a guessed state:
$$[*](exec_0 \wedge DO(?m) \wedge NEXT(st^0_j) \rightarrow (\langle ?m \rangle\top \wedge [-?m]\bot \wedge [?m]\,st^0_j))$$
In doing ?m we guess in which variable of its local store the mediator puts the contents of the message:
$$[*](exec_0 \wedge DO(?m) \wedge \widehat{\alpha} \wedge \bigwedge_i MAP(\vec{q}^{\,0}_m, \vec{u}) \rightarrow [?m]\,\widehat{\alpha}')$$
where $\widehat{\alpha}'$ is the result of updating $\widehat{\alpha}$ by assigning the values in the queues $\vec{q}^{\,0}_m$ associated with the port for the message m to the variables $\vec{u}$ in the local store of $S_0$. Notice that in fact only $\alpha'_0$ is in general different from $\alpha_0$, while $\alpha'_i$ and $\alpha'_g$ remain equal to $\alpha_i$ and $\alpha_g$, respectively.
If the mediator does ?m then next it will continue to execute while the goal will not:
$$[*](exec_0 \wedge DO(?m) \rightarrow [?m](exec_0 \wedge \neg exec_g))$$
The svc and the world state instance remain unchanged:
$$[*](exec_0 \wedge DO(?m) \wedge \widehat{\gamma} \rightarrow [?m]\,\widehat{\gamma})$$
$$[*](exec_0 \wedge DO(?m) \wedge \widehat{I} \rightarrow [?m]\,\widehat{I})$$
Mediator sends a message to a service. These are the subformulas that characterize the mediator sending a message to a component service.
If we guess that the mediator does !m next then it does it, and goes to a guessed state:
$$[*](exec_0 \wedge DO(!m) \wedge NEXT(st^0_j) \rightarrow (\langle !m \rangle\top \wedge [-!m]\bot) \wedge [!m]\,st^0_j)$$
In doing !m we guess which variables of its local store the mediator puts into the contents of the message; next the chosen service will be in execution while the goal will not:
$$[*](exec_0 \wedge DO(!m) \wedge \widehat{\alpha} \wedge \bigwedge_i MAP(\vec{u}, \vec{q}^{\,i}_m) \rightarrow [!m](\widehat{\alpha}' \wedge exec_i \wedge \neg exec_g))$$
where $\widehat{\alpha}'$ is the result of updating $\widehat{\alpha}$ by assigning the values in the variables $\vec{u}$ of the local store of $S_0$ to the queue $\vec{q}^{\,i}_m$ associated with the port for the message m of the service $S_i$; the execution is given to service $S_i$. Notice that in fact the only part of $\alpha'$ that is different from α is the part relative to the port variables for message m in service $S_i$.
The svc and the world state instance remain unchanged:
$$[*](exec_0 \wedge DO(!m) \wedge \widehat{\gamma} \rightarrow [!m]\,\widehat{\gamma})$$
$$[*](exec_0 \wedge DO(!m) \wedge \widehat{I} \rightarrow [!m]\,\widehat{I})$$
Service $S_i$ receives a message. These are the subformulas that characterize a service receiving a message from the mediator.
Let $S_i$ be in execution in a state $st^i_h$ with current svc $\widehat{\gamma}$ and current assignment $\widehat{\alpha}$. Let us also assume that $S_i$ in $st^i_h$ has a transition labeled by a guarded action $\phi/?m(\vec{x})$ getting to a state $st^i_{h'}$ and that φ evaluates to true wrt $\widehat{\alpha}$ and $\widehat{\gamma}$. Then we have
$$[*](exec_i \wedge st^i_h \wedge \widehat{\gamma} \wedge \widehat{\alpha} \rightarrow (\langle ?m \rangle\top \wedge [-?m]\bot \wedge [?m]\,st^i_{h'} \wedge [?m]\,\widehat{\alpha}'))$$
where $\widehat{\alpha}'$ is obtained from $\widehat{\alpha}$ by copying the symbolic values from the port $\vec{q}^{\,i}_m$ to the variables $\vec{x}$. The above formula says that: (1) next the action ?m will be performed (and no other actions are possible); (2) the next state for $S_i$ will be $st^i_{h'}$; (3) the assignment is changed to $\widehat{\alpha}'$.
The execution remains with service $S_i$, which now follows the goal service $S_g$:
$$[*](exec_i \wedge \langle ?m \rangle\top \rightarrow [?m](exec_i \wedge exec_g))$$
The svc and the world state instance remain unchanged:
$$[*](exec_i \wedge \langle ?m \rangle\top \wedge \widehat{\gamma} \rightarrow [?m]\,\widehat{\gamma})$$
$$[*](exec_i \wedge \langle ?m \rangle\top \wedge \widehat{I} \rightarrow [?m]\,\widehat{I})$$
Service $S_i$ sends a message. These are the subformulas that characterize a service sending a message to the mediator.
Let $S_i$ be in execution in a state $st^i_h$ with current svc $\widehat{\gamma}$ and current assignment $\widehat{\alpha}$. Let us also assume that $S_i$ in $st^i_h$ has a transition labeled by a guarded action $\phi/!m(\vec{x})$ getting to a state $st^i_{h'}$ and that φ evaluates to true wrt $\widehat{\alpha}$ and $\widehat{\gamma}$. Then we have
$$[*](exec_i \wedge st^i_h \wedge \widehat{\gamma} \wedge \widehat{\alpha} \rightarrow (\langle !m \rangle\top \wedge [-!m]\bot \wedge [!m]\,st^i_{h'} \wedge [!m]\,\widehat{\alpha}'))$$
where $\widehat{\alpha}'$ is obtained from $\widehat{\alpha}$ by copying the symbolic values from the variables $\vec{x}$ to the port $\vec{q}^{\,0}_m$. The above formula says that: (1) next the action !m will be performed (and no other actions are possible); (2) the next state for $S_i$ will be $st^i_{h'}$; (3) the assignment is changed to $\widehat{\alpha}'$.
The execution is given to the mediator $S_0$, the goal execution is interrupted, and the execution of a read ?m in the mediator is prescribed:
$$[*](exec_i \wedge \langle !m \rangle\top \rightarrow [!m](exec_0 \wedge \neg exec_g \wedge DO(?m)))$$
The svc and the world state instance remain unchanged:
$$[*](exec_i \wedge \langle !m \rangle\top \wedge \widehat{\gamma} \rightarrow [!m]\,\widehat{\gamma})$$
$$[*](exec_i \wedge \langle !m \rangle\top \wedge \widehat{I} \rightarrow [!m]\,\widehat{I})$$
For brevity, we do not report the characterization of the client and the interaction with the client. As for the execution of atomic processes, we refer to the main body of the paper.