Karabo: The European XFEL software framework



Burkhard Heisen for WP76

November 2013



Design Concepts

A star (*) marks concepts that are not yet implemented in the current release.

Functional requirements

DAQ: data readout; online processing; quality monitoring (vetoing)

SC: processing pipelines; distributed and GPU computing; specific algorithms (e.g. reconstruction)

Control: drive hardware and complex experiments; monitor variables & trigger alarms

DM: storage of experiment & control data; data access, authentication, authorization etc.

Set up computation & show scientific results

Allow some control & show hardware status

Show online data whilst running

[Figure: A typical use case: Accelerator, Undulator, Beam Transport, DM, SC, Control, DAQ]

Tight integration of applications

Functionality: What are we dealing with?

1. Distributed end points and processes
2. Data containers (Hash, Schema, Image, …)
3. Data transport (data flow, network protocol)
4. Process control (automation, feedback)
5. States (finite state machines, sequencing, automation…)
6. Data acquisition (front end hardware)
7. Time synchronization/tagging (time stamps, cycle ids, etc.)
8. Real-time needs (where necessary)
9. Central services (archive, alarm, name resolution, …)
10. Security (who's allowed to do what from where?)
11. Statistics (control system itself, operation, …)
12. Logging (active, passive, central, local)
13. Processing workflows (parallelism, pipeline execution, provenance)
14. Clients / User interfaces (API, languages, macro writing, CLI, GUI)
15. Software management (coding, building, packaging, deployment, versioning, …)

Distributed end points and processes

Concept: Device Server Model

Similar to: TANGO, DOOCS, TINE*

Elements are controllable objects managed by a device server.

An instance of such an object is a device, with a hierarchical name.

Device classes can be loaded at runtime (plugins).

The actions pertaining to a device are given by its properties and commands, i.e. get, set, or monitor some property, or execute some command.

Properties, commands, and (optionally) associated FSM logic are statically defined and further described (attributes) in the device class. Dynamic (runtime) extension of properties and commands is possible.

Devices can be written in either C++ or Python (later maybe also Java) and run on Linux, Mac OS X or (later) Windows.

DETAIL: Distributed endpoints

Configuration - API

Class: MotorDevice

    static expectedParameters( Schema& s ) {

        // Reconfigurable property, settable only in state "Idle"
        FLOAT_ELEMENT(s).key("velocity")
            .description("Velocity of the motor")
            .unitSymbol("m/s")
            .assignmentOptional().defaultValue(0.3)
            .maxInc(10)
            .minInc(0.01)
            .reconfigurable()
            .allowedStates("Idle")
            .commit();

        // Read-only property
        INT32_ELEMENT(s).key("currentPosition")
            .description("Current position of the motor")
            .readOnly()
            .warnLow(10)
            […]

        // Command
        SLOT_ELEMENT(s).key("move")
            .description("Will move motor to target position")
            .allowedStates("Idle")
            […]
    }

    // Constructor with initial configuration
    MotorDevice( const Hash& config ) { […] }

    // Called at each (re-)configuration request
    onReconfigure( const Hash& config ) { […] }



Any Device uses a standardized API to describe itself. This information is used to automatically create GUI input masks or for auto-completion on the IPython console.

There is no need for device developers to validate any parameters. This is done internally, taking the expectedParameters as a white-list.

We distinguish between properties and commands and associated attributes; all of them can be expressed within the expected parameters function.

Properties and commands can be nested, such that hierarchical groupings are possible
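The white-list validation described here can be sketched in plain Python. This is only an illustration under stated assumptions, not Karabo's validator: the `EXPECTED` dictionary layout and the `validate` function are hypothetical, and only the attribute names and values of the velocity element (default, minInc, maxInc, allowedStates) come from the slide.

```python
# Hypothetical stand-in for the schema built by expectedParameters().
EXPECTED = {
    "velocity": {
        "type": float,
        "default": 0.3,
        "minInc": 0.01,
        "maxInc": 10,
        "allowedStates": ["Idle"],
    },
}


def validate(config, expected, state):
    """Return a validated copy of config or raise ValueError."""
    validated = {}
    for key, value in config.items():
        if key not in expected:  # white-list: unknown keys are rejected
            raise ValueError(f"unexpected parameter: {key}")
        attrs = expected[key]
        if state not in attrs.get("allowedStates", [state]):
            raise ValueError(f"{key} is not settable in state {state}")
        value = attrs["type"](value)  # coerce to the declared type
        if "minInc" in attrs and value < attrs["minInc"]:
            raise ValueError(f"{key} is below the minimum {attrs['minInc']}")
        if "maxInc" in attrs and value > attrs["maxInc"]:
            raise ValueError(f"{key} is above the maximum {attrs['maxInc']}")
        validated[key] = value
    # Optional parameters that were not supplied receive their default.
    for key, attrs in expected.items():
        if key not in validated and "default" in attrs:
            validated[key] = attrs["default"]
    return validated
```

With this sketch, `validate({}, EXPECTED, "Idle")` fills in the default velocity, while a key that is not in the white-list raises an error before any device code would see it.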

Legend: Attribute, Property, Command

DETAIL: Distributed end points and processes

Creating a new device

1. Write a class (say: MyDevice) that derives from Device.
2. Compile it into a shared library (say: libMyDevice.so).
3. Select a running Device-Server or start a new one.
4. Copy libMyDevice.so to the plugins folder of the Device-Server.
5. The Device-Server emits a signal to the broker that a new Device class is available; it ships the expected parameters as read from the static context of the MyDevice class.

[Figure: plugins folder with libMyDevice.so; signalNewDeviceClassAvailable (.xsd); Master, Central DB, GUI-Srv, GUI]

DETAIL: Distributed end points and processes

Creating a new device (continued)

6. Given the mask of possible parameters, the user may fill in a valid configuration and emit an instantiate signal to the broker.
7. The configuration is validated by the Device factory and, if valid, an instance of MyDevice is created.
8. The constructor of the device class is called and provided with the configuration.
9. The run method is called, which starts the state machine and finally blocks by activating the event loop.
10. The device asynchronously listens to allowed events (slots), guided by the internal state machine.

[Figure: signalInstantiate("MyDevice", xml); factory: create("MyDevice", xml); MyDevice1 in plugins; Master, Central DB, GUI-Srv, GUI]

Data containers (Hash, Image/Matrix/Vector)

Concept: Have some containers for which Karabo provides special support

Hash
  String-key, any-value associative container
  Keeps insertion order (iteration possible), hash performance for random lookup
  Provides (string-key, any-value) attributes per hash-key
  Fully recursive structure (i.e. Hashes of Hashes)
  Serialization: XML, Binary, HDF5, DB
  Usage: configuration, device-state cache, database interface, message protocol, etc.

Schema
  Describes possible/allowed structures for the Hash. In analogy: a Schema is to a Hash what an XSD document is to an XML file
  Associates meta-data (called attributes) to properties

Image/Matrix/Vector
  Some default containers needed for scientific computing
  Seamless switching between CPU and GPU representation
  Optimized serialization (network transfer)
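The Hash properties listed on this slide (string keys, any value, insertion order, per-key attributes, recursion) can be illustrated with a small Python class. This is a sketch only and not Karabo's Hash API: the method names and the `.` path separator are assumptions made for the example.

```python
class Hash:
    """Insertion-ordered, nested string-key container with per-key attributes."""

    def __init__(self):
        self._values = {}  # dicts keep insertion order (Python 3.7+)
        self._attrs = {}   # key -> {attribute name: attribute value}

    def set(self, path, value, sep="."):
        key, _, rest = path.partition(sep)
        if rest:
            child = self._values.get(key)
            if not isinstance(child, Hash):
                child = Hash()  # create intermediate nodes on demand
                self._values[key] = child
            child.set(rest, value, sep)
        else:
            self._values[key] = value

    def get(self, path, sep="."):
        key, _, rest = path.partition(sep)
        value = self._values[key]
        return value.get(rest, sep) if rest else value

    def set_attribute(self, path, name, value, sep="."):
        key, _, rest = path.partition(sep)
        if rest:
            self._values[key].set_attribute(rest, name, value, sep)
        elif key in self._values:
            self._attrs.setdefault(key, {})[name] = value
        else:
            raise KeyError(key)

    def get_attribute(self, path, name, sep="."):
        key, _, rest = path.partition(sep)
        if rest:
            return self._values[key].get_attribute(rest, name, sep)
        return self._attrs[key][name]

    def keys(self):
        return list(self._values)


config = Hash()
config.set("motor.velocity", 0.3)
config.set("motor.position", 10)
config.set_attribute("motor.velocity", "unitSymbol", "m/s")
```

Here `config.get("motor")` is itself a Hash, which is the "Hashes of Hashes" recursion, and the unit travels with the value as an attribute rather than as a separate key.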

Data transport (data flow, network protocol)

Concept: Separation between broker-based (less frequent, smaller data size) and point-to-point (frequent, large data size) communication

Communication is cross-network, cross-language and cross-platform

Broker based
  Highly available full N x N communication between devices of any category (Control, SC, DAQ, DM) via broker
  Patterns: signal/slots, request/response, simple call

Point-to-Point
  Transient (run-time) establishment of direct (brokerless) connections between devices
  TCP-based, high performance for huge data
  Asynchronous IO, memory optimization if local

DETAIL: Data transport

Communication: Event-Driven vs. Scheduled

Event-driven communication ("Push Model")
  [Figure: four devices (Device 1 to Device 4); one emits, the others are notified]
  A minimal set of information is passed
  System is scalable (maintains performance)
  Failure is harder to detect

Scheduled communication ("Poll Model")
  [Figure: four devices (Device 1 to Device 4) exchanging Request and Response]
  Direct feedback on request
  Nodes may be spammed (DoS)
  Growing systems lose performance
  Typically, lots of extra traffic is generated

DETAIL: Data transport

Broker based communication - API

Communication happens between ordinary (member or free-standing) functions

Functions on distributed instances are identified by a pair of strings, the instanceId and the functionName

The instanceId uniquely identifies a (e.g. device-)instance connected to a specific topic of the broker

The functionName uniquely identifies an ordinary function registered under a given instanceId

Functions of any signature (currently up to 4 arguments) can be registered to be remotely callable

Registration can be done at runtime without extra tools

Function calls can be done cross-network, cross-operating-system and cross-language (currently C++ and Python; Java will follow)

The language's native data types are directly supported as arguments

A generic, fully recursive, key to any-value container (Hash) is provided as a data type for complex arguments

DETAIL: Data transport

Broker based communication - Three Patterns

Signals & Slots

    SLOT ( function, [argTypes] )
    SIGNAL ( funcName, [argTypes] )
    connect ( signalInstanceId, signalFunc, slotInstanceId, slotFunc )
    emit ( signalFunc, [args] )

Device1 (emits):

    SIGNAL("foo", int, std::string);
    connect("Device1", "foo", "Device2", "onFoo");
    connect("", "foo", "Device3", "onGoo");
    connect("", "foo", "Device4", "onHoo");
    emit("foo", 42, "bar");

Device2 (notified):

    SLOT(onFoo, int, std::string);
    void onFoo(const int& i, const std::string& s) { }

Device3 (notified):

    SLOT(onGoo, int, std::string);
    void onGoo(const int& i, const std::string& s) { }

Device4 (notified):

    SLOT(onHoo, int, std::string);
    void onHoo(const int& i, const std::string& s) { }
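The signal/slot pattern can be mimicked in a single process to show the bookkeeping involved. The sketch below is an assumption-laden illustration, not Karabo code: the `Broker` class and its method names are hypothetical, delivery is a direct function call instead of a network message, and only the instance and function names are borrowed from the slide.

```python
class Broker:
    """In-process stand-in for the message broker (hypothetical)."""

    def __init__(self):
        self._slots = {}        # (instanceId, slotName) -> callable
        self._connections = {}  # (instanceId, signalName) -> [slot addresses]

    def register_slot(self, instance_id, name, func):
        self._slots[(instance_id, name)] = func

    def connect(self, signal_instance, signal, slot_instance, slot):
        key = (signal_instance, signal)
        self._connections.setdefault(key, []).append((slot_instance, slot))

    def emit(self, signal_instance, signal, *args):
        # The emitter does not know who listens; the broker fans out.
        for address in self._connections.get((signal_instance, signal), []):
            self._slots[address](*args)


received = []
broker = Broker()
broker.register_slot("Device2", "onFoo", lambda i, s: received.append(("Device2", i, s)))
broker.register_slot("Device3", "onGoo", lambda i, s: received.append(("Device3", i, s)))
broker.connect("Device1", "foo", "Device2", "onFoo")
broker.connect("Device1", "foo", "Device3", "onGoo")
broker.emit("Device1", "foo", 42, "bar")
```

After the `emit` call, both connected slots have been invoked once with the same arguments, in the order in which they were connected.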

DETAIL: Data transport

Broker based communication - Patterns

Direct Call

    call ( instanceId, funcName, [args] )

Device1 (calls):

    call("Device2", "onFoo", "bar");

Device2 (notified):

    SLOT(onFoo, std::string);
    void onFoo(const std::string& s) { }

Request / Reply

    request ( instanceId, funcName, [reqArgs] ).timeout( msec ).receive( [repArgs] )

Device1 (requests, is notified of the reply):

    int number;
    request("Device2", "onFoo", 21).timeout(100).receive(number);

Device2 (is notified of the request, replies):

    SLOT(onFoo, int);
    void onFoo(const int& i) { reply( i + i ); }

DETAIL: Data Transport

Illustration

[Figure: HV, Pump, Simulate, Store, Calibrate1, Calibrate2, Load, APD, Camera, Logger, RDB, Disk Storage, GUI Server, GUI(s), Terminal(s). Legend: Device-Server Application, Message Broker (Event Loop), Device Instance, Device Sub Control]

Process control (automation, feedback)

Concept: Single device processes vs. multiple device processes

Processes which involve a single device and e.g. some hardware
  Implementation of a software FSM that mirrors the hardware FSM
  Automation and feedback are implemented using software FSM events. Events may be internally triggered (auto) or exposed to the control system (interactive/manual)

Processes which involve coordination of multiple devices (non real-time)
  The process is abstracted into a parent device which sub-instructs children devices (composition).
  The control system protects children devices from direct user control.
  The parent device's FSM describes process automation/feedback.
  The parent device is device and device controller in one.


DETAIL: Process control

A standardized hardware device

Concepts
  The hardware is always safe, even without software
  Coupling between h/w devices at a "lower" level than Karabo can exist (real time)
  The authority (h/w or s/w) may be different and even change during runtime
  A generic state transition table design exists, which allows for flexible h/w control

[Figure: state diagram with states Ok, HardwareError, CommunicationError, Error, Readjusting; transitions onOutOfSync, onHwError, onComError, onException, reset, reset*, reset / action, none, [autonomous]]

Enter HardwareError
  1. Generic h/w error status bit is set (by PLC)

Exit HardwareError
  1. Clicking the reset button calls resetHardwareAction(), which should take any actions needed to 'reset' the h/w; if not successful, HardwareError is re-entered (eventually timeout?)

Enter CommunicationError
  1. Heartbeat from PLC not received by BeckhoffCom
  2. BeckhoffCom dead
  3. Broker dead

Exit CommunicationError
  1. reset*; the * means driven by internal recovery, where no user action is required (or possible)

Enter Error
  1. On an exception which is not caught in the FSM s/w thread
  2. s/w device's call to onError() (only used in composite devices)

Exit Error
  1. Clicking the reset button moves the s/w device to AllOk's Initialization, where the h/w status is requested and the correct state (or Error) is moved to, depending on the reply

States (finite state machines, sequencing, automation…)

Concept: Devices optionally run finite state machines (FSMs) inside

Devices can implement a custom FSM or inherit a common one

Events into the FSM can be triggered internally (automation, sequencing) or made device commands (remotely triggerable)

The FSM provides four hooks fitting into the event-driven API style of devices (onGuard, srcStateOnExit, onTransitionAction, tgtStateOnEntry)

Any (writable) property or command can be access-restricted according to the device's current state. This is done using the attribute allowedStates.

As allowedStates is an attribute (and thus part of the static XSD), any UI system is able to proactively reflect the currently (state-dependent) settable properties and commands. The command-line interface uses this information to provide state-aware auto-completion, whilst the GUI uses it for widget disabling (grey out).

Detail: Device - Finite state machine (FSM)

[Figure: Start Stop State Machine with states Initialization, OK (containing Stopped and Started) and Error; transitions none, start, stop, errorFound, reset]

    // Ok Machine
    FSM_TABLE_BEGIN(OkTransitionTable)
    //   SrcState  Event       TgtState  Action       Guard
    Row< Started,  StopEvent,  Stopped,  StopAction,  none >,
    Row< Stopped,  StartEvent, Started,  StartAction, none >
    FSM_TABLE_END
    FSM_STATE_MACHINE(Ok, OkTransitionTable, Stopped, Self)

    // Top Machine
    FSM_TABLE_BEGIN(TransitionTable)
    Row< Initialization, none,            Ok,    none,             none >,
    Row< Ok,             ErrorFoundEvent, Error, ErrorFoundAction, none >,
    Row< Error,          ResetEvent,      Ok,    ResetAction,      none >
    FSM_TABLE_END
    KARABO_FSM_STATE_MACHINE(StateMachine, TransitionTable, Initialization, Self)

Any device uses a standardized way to express its possible program flow

The state machine calls back device functions (guard, onStateExit, transitionAction, onStateEntry)

The GUI is state-machine aware and enables/disables buttons proactively

DETAIL: States

Finite state machines - There is a UML standard

State Machine: the life cycle of a thing. It is made of states, transitions and processes incoming events.

State: a stage in the life cycle of a state machine. A state (like a submachine) can have entry and exit behaviors

Event: an incident provoking (or not) a reaction of the state machine

Transition: a specification of how a state machine reacts to an event. It specifies a source state, the event triggering the transition, the target state (which will become the newly active state if the transition is triggered), guard and actions

Action: an operation executed during the triggering of the transition

Guard: a boolean operation being able to prevent the triggering of a transition which would otherwise fire

Transition Table: representation of a state machine. A state machine diagram is a graphical, but incomplete representation of the same model. A transition table, on the other hand, is a complete representation

DETAIL: States

FSM implementation example in C++ (header only)

    // Events: transition table element, regular callable function (triggers event)
    FSM_EVENT2(ErrorFoundEvent, onException, string, string)
    FSM_EVENT0(EndErrorEvent, endErrorEvent)
    FSM_EVENT0(StartEvent, slotMoveStartEvent)
    FSM_EVENT0(StopEvent, slotStopEvent)

    // States: transition table element, regular function hooks (will be called back)
    FSM_STATE_EE(ErrorState, errorStateOnEntry, errorStateOnExit)
    FSM_STATE_E(InitializationState, initializationStateOnEntry)
    FSM_STATE_EE(StartedState, startedStateOnEntry, startedStateOnExit)
    FSM_STATE_EE(StoppedState, stoppedStateOnEntry, stoppedStateOnExit)

    // Transition Actions
    // (declarations of ErrorFoundAction and EndErrorAction are not shown here)
    FSM_ACTION0(StartAction, startAction)
    FSM_ACTION0(StopAction, stopAction)

    // AllOkState Machine
    FSM_TABLE_BEGIN(AllOkStateTransitionTable)
    //   SrcState      Event       TgtState      Action       Guard
    Row< StartedState, StopEvent,  StoppedState, StopAction,  none >,
    Row< StoppedState, StartEvent, StartedState, StartAction, none >
    FSM_TABLE_END
    FSM_STATE_MACHINE(AllOkState, AllOkStateTransitionTable, StoppedState, Self)

    // StartStop Machine
    FSM_TABLE_BEGIN(StartStopTransitionTable)
    Row< InitializationState, none,            AllOkState, none,             none >,
    Row< AllOkState,          ErrorFoundEvent, ErrorState, ErrorFoundAction, none >,
    Row< ErrorState,          EndErrorEvent,   AllOkState, EndErrorAction,   none >
    FSM_TABLE_END
    KARABO_FSM_STATE_MACHINE(StartStopMachine, StartStopTransitionTable, InitializationState, Self)

    // Create and start the machine
    FSM_CREATE_MACHINE(StartStopMachine, m_fsm);
    FSM_SET_CONTEXT_TOP(this, m_fsm)
    FSM_SET_CONTEXT_SUB(this, m_fsm, AllOkState)
    FSM_START_MACHINE(m_fsm)

DETAIL: States

FSM implementation example in Python

    # AllOkState Machine
    allOkStt = [
        # SrcState        Event         TgtState        Action         Guard
        ('StartedState', 'StopEvent',  'StoppedState', 'StopAction',  'none'),
        ('StoppedState', 'StartEvent', 'StartedState', 'StartAction', 'none')
    ]
    FSM_STATE_MACHINE('AllOkState', allOkStt, 'StoppedState')

    # Events
    FSM_EVENT2(self, 'ErrorFoundEvent', 'onException')
    FSM_EVENT0(self, 'EndErrorEvent', 'slotEndError')
    FSM_EVENT0(self, 'StartEvent', 'slotStart')
    FSM_EVENT0(self, 'StopEvent', 'slotStop')

    # States
    FSM_STATE_EE('ErrorState', self.errorStateOnEntry, self.errorStateOnExit)
    FSM_STATE_E('InitializationState', self.initializationStateOnEntry)
    FSM_STATE_EE('StartedState', self.startedStateOnEntry, self.startedStateOnExit)
    FSM_STATE_EE('StoppedState', self.stoppedStateOnEntry, self.stoppedStateOnExit)

    # Transition Actions
    FSM_ACTION0('StartAction', self.startAction)
    FSM_ACTION0('StopAction', self.stopAction)

    # Top Machine
    topStt = [
        ('InitializationState', 'none', 'AllOkState', 'none', 'none'),
        ('AllOkState', 'ErrorFoundEvent', 'ErrorState', 'none', 'none'),
        ('ErrorState', 'EndErrorEvent', 'AllOkState', 'none', 'none')
    ]
    FSM_STATE_MACHINE('StartStopMachine', topStt, 'InitializationState')

    self.fsm = FSM_CREATE_MACHINE('StartStopMachine')
    self.startStateMachine()
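How a transition table drives a state machine can be shown with a few lines of self-contained Python. This is a flat (non-hierarchical) sketch under stated assumptions: the `TableFsm` class is hypothetical and does not use the FSM_* helpers from the slides; only the row layout (source state, event, target state, action, guard) and the state and event names are taken from them.

```python
class TableFsm:
    """Flat finite state machine driven by a transition table."""

    def __init__(self, table, initial):
        # Each row: (source state, event, target state, action, guard)
        self._rows = {(src, event): (tgt, action, guard)
                      for src, event, tgt, action, guard in table}
        self.state = initial

    def process_event(self, event):
        row = self._rows.get((self.state, event))
        if row is None:
            return False  # event is not allowed in the current state
        target, action, guard = row
        if guard is not None and not guard():
            return False  # the guard vetoed the transition
        if action is not None:
            action()      # transition action runs before the state changes
        self.state = target
        return True


log = []
table = [
    ("StartedState", "StopEvent", "StoppedState", lambda: log.append("stop"), None),
    ("StoppedState", "StartEvent", "StartedState", lambda: log.append("start"), None),
]
fsm = TableFsm(table, "StoppedState")
```

Because the table is plain data, a user interface could inspect it to find out which events are currently allowed, which is the idea behind state-aware widgets and auto-completion.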


Data acquisition

Concept: FEM -> PC-Layer -> Online-Cache

PCL machines run highly tuned devices which write data to file (online cache) as fast as possible.

The online cache is (one possible) data source for Karabo's workflow system.


Real time needs (where necessary)

Concept: Karabo itself does not provide real time processes

Real time processes (if needed) must be defined and executed in layers below Karabo. Karabo devices will only start/stop/monitor real time processes.

Examples: Beckhoff motor-coupling, Beckhoff feedback systems, etc.

Time synchronization (time stamps, cycle ids, etc.)

Concept: Any changed property will carry timing information as attribute(s)

Time information is assigned per property

Karabo's timestamp consists of the following information:
  Seconds since Unix epoch, uint64
  Fractional seconds (up to attosecond resolution), uint64
  Train ID, uint64

Time information is assigned as early as possible (best: already on hardware) but at the latest in the software device

On event-driven update, the device ships the property key, the property value and associated time information as property attribute(s)

Real-time synchronization is not the responsibility of Karabo

Correlation between control system (monitor) data and instrument data will be done using the archived central DB information (or information previously exported into HDF5 files)
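The three-part timestamp can be written down as a small value type. The following Python sketch is illustrative only: the class name, the field names and the attribute keys `sec`, `frac` and `tid` are assumptions, while the three uint64 components and the attosecond resolution come from the slide.

```python
from dataclasses import dataclass

ATTOSECONDS_PER_SECOND = 10**18  # fits into uint64 (2**64 is about 1.8e19)


@dataclass(frozen=True)
class Timestamp:
    seconds: int             # seconds since Unix epoch
    fractional_seconds: int  # attoseconds within the current second
    train_id: int

    def __post_init__(self):
        for name in ("seconds", "fractional_seconds", "train_id"):
            if not 0 <= getattr(self, name) < 2**64:
                raise ValueError(f"{name} does not fit into uint64")
        if self.fractional_seconds >= ATTOSECONDS_PER_SECOND:
            raise ValueError("fractional part must be below one second")

    def as_attributes(self):
        """Flatten into attributes shipped alongside a property value."""
        return {"sec": self.seconds,
                "frac": self.fractional_seconds,
                "tid": self.train_id}


stamp = Timestamp(seconds=1385856000, fractional_seconds=5 * 10**17, train_id=42)
```

Keeping the fraction as an integer count of attoseconds avoids the rounding that a floating-point number of seconds would introduce.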

DETAIL: Time synchronization

Distributed Train ID clock

Concept: A dedicated machine with a time receiver board (h/w) distributes clocks on the Karabo level

Scenario 1: No time information from h/w
  Example: commercial cameras
  The timestamp is associated with the event-driven data in the Karabo device
  If the clock signal is too late, the next trainId is calculated (extrapolated) given the previous one and the interval between trainIds
  The interval is configurable on the Clock device and must be stable within a run. An error is flagged if a clock tick is lost.

Scenario 2: Time information is already provided by h/w
  The timestamp can be taken from the h/w or the device (configurable). The rest is the same as in scenario 1.
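The extrapolation of a late train ID from the previous tick and the configured interval amounts to a single integer division. The sketch below is one possible reading of that rule, not the Clock device's code; the function name and its parameters are hypothetical, and any time unit works as long as all three time arguments share it.

```python
def extrapolate_train_id(last_train_id, last_tick_time, interval, now):
    """Estimate the current train ID when the clock tick is late.

    last_train_id  -- train ID carried by the last received tick
    last_tick_time -- arrival time of that tick
    interval       -- configured time between consecutive train IDs
    now            -- time at which the data to be stamped arrived
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    if now < last_tick_time:
        raise ValueError("data cannot precede the last tick")
    # Number of whole intervals that have elapsed since the last tick.
    elapsed_trains = int((now - last_tick_time) // interval)
    return last_train_id + elapsed_trains
```

For example, with an interval of 100 ms and the last tick (train 100) received at t = 10 000 ms, data arriving at t = 10 250 ms would be assigned train 102. The estimate is only meaningful while the interval is stable, which is why the slide requires a stable interval within a run.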



[Figure: Time receiver board; Clock; Device. Clock signals: 1. trainId, 2. epochTime, 3. interval. The device creates a timestamp and associates it to the trainId.]

Central services (archive, alarm, name resolution, …)

Concept: Karabo's central aspects will be reflected within a database

All properties of all devices will be archived into the DB in an event-driven way by default

Any property carries an "archive policy" attribute to reduce or switch off archiving

Karabo is user-centric (login at client start-up); the DB will provide all needed information to perform later access control on devices

Any user-specific GUI settings will be saved to the DB

The DB gives access to all (user-centric) pre-configurations of future device instances

Name resolution is handled by the message broker (filtering on broker, not client)

Besides the broker, other central services are technically not needed.

GUI clients do not talk to the broker directly but go through a GUI server

Distributed alarm conditions are planned to be handled by Python devices that can check any (distributed) condition and can be instantiated (armed) when needed

Central services - Name resolution/access

Concept: The only central service needed is the broker, others are optional

Start-up issues
  A fixed ID can (optionally) be provided prior to start-up (via command line or file)
  If no instance ID is provided, the ID is auto-generated locally
    Servers: hostname_Server_pid
    Devices: hostname-pid_classId_counter
  Any instance ID is validated (by request-response trial) prior to start-up
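The two auto-generated ID patterns translate directly into string formatting. The helper names below are hypothetical, and starting the per-process counter at 1 is an assumption; only the patterns `hostname_Server_pid` and `hostname-pid_classId_counter` come from the slide.

```python
import itertools
import os
import socket

_device_counter = itertools.count(1)  # assumption: counting starts at 1


def server_instance_id(hostname=None, pid=None):
    """Build an ID of the form hostname_Server_pid."""
    hostname = hostname or socket.gethostname()
    pid = os.getpid() if pid is None else pid
    return f"{hostname}_Server_{pid}"


def device_instance_id(class_id, hostname=None, pid=None, counter=None):
    """Build an ID of the form hostname-pid_classId_counter."""
    hostname = hostname or socket.gethostname()
    pid = os.getpid() if pid is None else pid
    counter = next(_device_counter) if counter is None else counter
    return f"{hostname}-{pid}_{class_id}_{counter}"
```

Host name and process ID make collisions between machines and processes unlikely, but they do not rule them out, which is why a generated ID would still go through the request-response check before start-up.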



Running system issues
  The engine for all inter-device communication is the DeviceClient class
  The DeviceClient abstracts the SignalSlotable layer into a set of functions: instantiate, kill, set, execute, get, monitor, etc.
  The DeviceClient can act without a central entity and be started anytime
  The DeviceClient can act as master itself and boost the performance of other DeviceClients
  Master DeviceClients can come and go; everything is handled transparently

Central services - Data archiving

Concept: A central data logger device collects event-driven data and persists it

The data logger is a device which listens to all other devices

The event-driven information is cached in the form of a Hash object for some time and then persisted to either file or DB or both

Information is stored on a per-parameter basis

Next to the parameter values, the currently valid schema is saved as well

[Figure: Logger, Central DB, GUI-Srv, Message Broker, Device-Server Instances with Device Instances, Master Device-Server Instance, GUI-Clients]

DETAIL: Access levels

We will initially have five access levels (enum) with intrinsic ordering
  ADMIN = 4
  EXPERT = 3
  OPERATOR = 2
  USER = 1
  OBSERVER = 0

Any Device can restrict access globally or on a per-parameter basis

Global restriction is enforced through the "visibility" property (base class)
  Only a requestor of the same or a higher access level can see/use the device
  The "visibility" property is part of the topology info (seen immediately by clients)

Parameter restriction is enforced through the "requiredAccessLevel" schema attribute
  Parameter restriction is typically set programmatically but may be re-configured at initialization time (or even runtime?)
  The "visibility" property might be re-configured if the requestor's access level is higher than the associated "requiredAccessLevel" (should typically be ADMIN)

The default access level for settable properties and commands is USER

The default access level for read-only properties is OBSERVER

The default value for the visibility is OBSERVER
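The intrinsic ordering of the five levels maps naturally onto an integer enum, so that both checks become comparisons. This is a sketch, not Karabo's implementation: the `may_access` helper is hypothetical, while the level names and numeric values are those listed on the slide.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    OBSERVER = 0
    USER = 1
    OPERATOR = 2
    EXPERT = 3
    ADMIN = 4


def may_access(requestor, visibility, required=AccessLevel.OBSERVER):
    """Check the device-wide visibility first, then the per-parameter level."""
    return requestor >= visibility and requestor >= required
```

With the defaults from the slide, a USER may set a settable property of a device whose visibility is OBSERVER, whereas an OBSERVER may only read it.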

DETAIL: Access levels (continued)

A role is defined in the DB and consists of a default access level and a device-instance-specific access list (overriding the default level), which can be empty.

  SPB_Operator
    defaultAccessLevel => USER
    accessList
      SPB_* => OPERATOR
      Undulator_GapMover_0 => OPERATOR

  Global_Observer
    defaultAccessLevel => OBSERVER

  Global_Expert
    defaultAccessLevel => EXPERT

After authentication, the DB computes the user-specific access levels considering current time, current location and associated role. It then ships a default access level and an access level list back to the user.

If the authentication service (or DB) is not available, Karabo falls back to a compiled default access level (in-house: OBSERVER, shipped versions: ADMIN)

For an ADMIN user it might be possible to temporarily (per session) change the access list of another user.
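Resolving the effective level for one device instance from a role can be sketched as a lookup with a fallback. The code below rests on two assumptions that the slide does not state: that `SPB_*` is a shell-style wildcard, and that the first matching access-list entry wins. The function name and the instance IDs used in the example (`SPB_Camera_1`, `FXE_Motor_3`) are hypothetical; the role definitions are copied from the slide.

```python
from fnmatch import fnmatchcase

ROLES = {
    "SPB_Operator": {
        "defaultAccessLevel": "USER",
        "accessList": {"SPB_*": "OPERATOR", "Undulator_GapMover_0": "OPERATOR"},
    },
    "Global_Observer": {"defaultAccessLevel": "OBSERVER", "accessList": {}},
    "Global_Expert": {"defaultAccessLevel": "EXPERT", "accessList": {}},
}


def access_level_for(role_name, instance_id, roles=ROLES):
    """Return the access level a role grants on one device instance."""
    role = roles[role_name]
    for pattern, level in role["accessList"].items():
        if fnmatchcase(instance_id, pattern):  # case-sensitive wildcard match
            return level
    return role["defaultAccessLevel"]  # no entry matched: use the default
```

An SPB_Operator would thus be OPERATOR on `SPB_Camera_1` and on `Undulator_GapMover_0`, but only USER on `FXE_Motor_3`.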

DETAIL: Security

[Figure: GUI or CLI, Central DB, GUI-Srv, Broker-Message, Device]

GUI or CLI (login): username, password, provider, ownIP*, brokerHost*, brokerPort*, brokerTopic*

Central DB
  1. Authorizes
  2. Computes context based access levels

Returned to the GUI or CLI: userId, sessionToken, defaultAccessLevel, accessList

Broker-Message
  Header […]
    __uid=42
    __accessLevel="admin"
  Body […]

Device
  Locking:
    if is locked:
      if __uid == owner then ok
  Access control:
    if __accessLevel >= visibility:
      if __accessLevel >= param.accessLevel then ok

Statistics (control system itself, operation, …)

Concept: Statistics will be collected by regular devices

The OpenMQ implementation provides a wealth of statistics (e.g. messages in system, average flow, number of consumers/producers, broker memory used, …)

Have a (broker-)statistics device that does system calls to retrieve the information

Similar idea for other statistical data


Logging (active, passive, central, local)

Concept: Categorized into the following classes

Active Logging: Additional code (inserted by the developer) accompanying the production/business code, which is intended to increase the verbosity of what is currently happening.
  Code Tracing: Macro based, no overhead if disabled, for low-level purposes
  Code Logging: Conceptual analog to Log4j, network appender, remote and at-runtime priority (re-)configuration

Passive Logging: Recording of activities in the distributed event-driven system. No extra coding is required from developers; passive logging transparently records system-relevant events.
  Broker-message logging: Low-level debugging purpose, start/stop, not active during production
  Transactional logging: Archival of the full distributed state

Processing workflows (parallelism, pipeline execution, provenance)

Concept: Devices as modules of a scientific workflow system

Configurable generic input/output channels on devices
  One channel is specific for one data structure (e.g. Hash, Image, File, etc.)
  New data structures can be "registered" and are immediately usable
  Input channel configuration: copy of connected output's data or share the data with other input channels, minimum number of data needed

ComputeFsm as base class; developers just need to code the compute method

IO system is decoupled from processing system (process whilst transferring data)

Automatic (API transparent) data transfer optimization (pointer if local, TCP if remote)

Broker-based communication for workflow coordination and meta-data sharing

GUI integration to set up workflows graphically (drag-and-drop featured)

Workflows can be stored and shared (following the general rules of data privacy and security), executed, paused and stepped

Parallel execution

DETAIL: Processing workflows

Parallelism and load-balancing by design

Devices within the same device-server (memory, CPU threads):
  Data will be transferred by handing over pointers to corresponding memory locations
  Multiple instances connected to one output channel will run in parallel using CPU threads

Devices in different device-servers (TCP, distributed processing):
  Data will be transferred via TCP
  Multiple instances connected to one output channel will perform distributed computing

An output channel technically is a TCP server; inputs are clients

The data transfer model follows an event-driven poll architecture, which leads to load-balancing and maximum per-module performance even on heterogeneous h/w

Configurable output channel behavior in case no input is currently available: throw, queue, wait, drop

DETAIL: Processing workflows

GPU enabled processing

Concept: GPU parallelization will happen within a compute execution

The data structures (e.g. image) are prepared for GPU parallelization

Karabo will detect at runtime whether the given hardware is capable of GPU computing; if not, it falls back to the corresponding CPU algorithm

Differences in runtime are balanced by the workflow system

[Figure: IO whilst computing; pixel parallel processing (one GPU thread per pixel); notification about new data possible to obtain; GPU; CPU]

Clients / User interfaces (API, languages, macro writing, CLI, GUI)

Concept: Two UIs, graphical (GUI) and scriptable command line (CLI)

GUI

Have one multi-purpose GUI system satisfying all needs

See following slides for details

Non-GUI

We distinguish an API for the programmatic setup of control sequences (others call those macros) from an API which allows interactive, command-line based control (IPython based)

The programmatic API exists for C++ and Python and features:

Querying of distributed system topology (hosts, device-servers, devices, their properties/commands, etc.): getServers, getDevices, getClasses

instantiate, kill, set, execute (in “wait” or “noWait” fashion), get, monitorProperty, monitorDevice

Both APIs are state and access-role aware; caching mechanisms provide a proper Schema and a synchronous (poll-feel) API, although the back-end is always event-driven

The interactive API integrates auto-completion and improved interactive functionality suited to IPython
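
How a "poll-feel" call can sit on top of an event-driven back-end is sketched below. The method names get and monitorProperty are taken from the list above, but their signatures, the class DeviceClientSketch, the helper _on_change and the device name "motor1" are assumptions for illustration.

```python
class DeviceClientSketch:
    """Hypothetical client: events fill a cache, get() reads it synchronously."""

    def __init__(self):
        self._cache = {}     # (deviceId, property) -> last known value
        self._monitors = {}  # (deviceId, property) -> list of callbacks

    def _on_change(self, device_id, prop, value):
        """Called by the (not shown) event-driven back-end on every update."""
        self._cache[(device_id, prop)] = value
        for callback in self._monitors.get((device_id, prop), []):
            callback(device_id, prop, value)

    def get(self, device_id, prop):
        """Looks like polling, but only reads the locally cached value."""
        return self._cache[(device_id, prop)]

    def monitorProperty(self, device_id, prop, callback):
        self._monitors.setdefault((device_id, prop), []).append(callback)


client = DeviceClientSketch()
seen = []
client.monitorProperty("motor1", "position", lambda d, p, v: seen.append(v))
client._on_change("motor1", "position", 1.5)  # simulated back-end events
client._on_change("motor1", "position", 2.0)
position = client.get("motor1", "position")
```

The call to get never touches the network; it returns whatever the last event delivered, which is what makes the API feel synchronous.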



GUI: What do we have to deal with?

Client-Server (network protocol, optimizations)

User management (login/logout, load/save settings, access role support)

Layout (panels, full screen, docking/undocking)

Navigation (devices, configurations, data, …)

Configuration (initialization vs. runtime, loading/saving, …)

Customization (widget galleries, custom GUI builder, composition, …)

Notification (about alarms, finished pipelines, …)

Log Inspection (filtering, configuration of log-levels, …)

Embedded scripting (IPython, macro recording/playing)

Online documentation (embedded wiki, bug-tracking, …)

Kerstin Weger (WP76)

Client-Server (network protocol, optimizations)

[Figure: Master, Central DB, GUI-Srv, Message Broker and GUI-Client; a client that only sees device “A” receives onChange information only related to “A”]

Concept: One server, many clients, TCP

Server knows what each client user sees (on a device level) and optimizes traffic accordingly

Client-Server protocol is TCP; messages are header/body style using Hash serialization (default binary protocol)

Client-side socket will be threaded to decouple it from the main event loop

On client start the server provides the current distributed state utilizing the DB; later, clients are updated through the broker

Image data is pre-processed on the server side and brought into QImage format before sending
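
The per-client traffic optimization amounts to a visibility filter on the server. The following sketch assumes hypothetical names (GuiServerSketch, set_visible_devices, on_change) and records outgoing messages in a list instead of writing them to TCP sockets.

```python
class GuiServerSketch:
    """Hypothetical server that forwards an update only to clients showing that device."""

    def __init__(self):
        self._visible = {}  # clientId -> set of deviceIds currently on screen
        self.sent = []      # (clientId, deviceId, property, value) actually transmitted

    def set_visible_devices(self, client_id, device_ids):
        self._visible[client_id] = set(device_ids)

    def on_change(self, device_id, prop, value):
        """Update arriving from the broker; fan out only where the device is visible."""
        for client_id, devices in self._visible.items():
            if device_id in devices:
                self.sent.append((client_id, device_id, prop, value))


server = GuiServerSketch()
server.set_visible_devices("client1", ["A"])
server.set_visible_devices("client2", ["A", "B"])
server.on_change("B", "temperature", 21.5)  # only client2 shows device B
server.on_change("A", "state", "Ok")        # both clients show device A
```

The client that only shows device "A" never receives the update for "B", matching the situation in the figure.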


User management (login/logout, load/save settings, access role support)

Concept: User centralized, login mandatory

Login necessary to connect to system

Access role will be computed (context based)

User-specific settings will be loaded from DB

View and control are adapted to access role

User- or role-specific configuration and wizards are available

[Figure: Central DB (1. authorizes, 2. computes context-based access role); labels: username, password; userId, accessRole, session]

Layout (panels, full screen, docking/undocking)

Concept: Six dockable and slidable (optionally tabbed) main panels

Panels are organized by functionality

Navigation

Custom composition area (sub-GUI building)

Configuration (non-tabbed, changes view based on selection elsewhere)

Documentation (linked and updated with current configuration view)

Logging

Notifications

Panels and their tabs can be undocked (the window then belongs to the OS’s window manager) and made full-screen (distribution across several monitors possible)

Custom composition area (central panel) will be optimized for phones and tablets

GUI behaves natively under Mac OS X, Linux and Windows

DETAIL: Layout

Default panel arrangement, docking and sliding

[Figure: default arrangement of the panels Navigation, Custom composition area, Configuration, Notifications, Logging / Scripting console and Documentation]

Navigation (devices, configurations, data, …)

Concept: Navigate device-servers, devices, configurations, data(-files), etc.

Different views (tabs) on data

Hierarchical distributed system view

Device ownership centric (view compositions)

Available configurations

Hierarchical file view (e.g. HDF5)

Automatic (by access level) filtering of items

Auto-select navigation item if context is selected somewhere else in GUI

Configuration (initialization vs. runtime, loading/saving, …)

Concept: Auto-generated default widgets for configuring classes and instances

Widgets are generated from device information (.xsd format)

2-column layout for class configuration (label, initialization-value)

3-column layout (label, value-on-device, edit-value) for instance configuration

Allows reading/writing properties (all data-types)

Allows executing commands (as buttons)

Is aware of the device’s FSM, enables/disables widgets accordingly

Is aware of the access level, enables/disables widgets accordingly

Single, selection and all apply capability
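
The two enable/disable rules above combine into a single predicate. This is a sketch under assumptions: the field names allowed_states and required_access_level, the numeric access levels and the property "targetPosition" are invented for illustration and are not taken from the device description format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyInfo:
    """Hypothetical subset of what a device description could carry per property."""
    name: str
    allowed_states: frozenset   # FSM states in which editing is permitted
    required_access_level: int  # minimum level needed to edit


def widget_enabled(info, device_state, user_access_level):
    """A widget is editable only if both the FSM state and the access level allow it."""
    return (device_state in info.allowed_states
            and user_access_level >= info.required_access_level)


target_position = PropertyInfo("targetPosition", frozenset({"Idle"}), 2)
enabled_for_expert = widget_enabled(target_position, "Idle", 3)
disabled_while_moving = widget_enabled(target_position, "Moving", 3)
disabled_for_observer = widget_enabled(target_position, "Idle", 1)
```

Re-evaluating this predicate on every state-change event is enough to keep the auto-generated widgets in step with the device.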


Customization (widget galleries, custom GUI builder, composition, …)

Concept: Combination of PowerPoint-like editor and online properties/commands with changeable widget types

Tabbed, static panel (does not change on navigation)

Two modes: Pre-configuration (classes) and runtime configuration (instances)

Visual composition of properties/commands of any devices

Visual composition of devices (workflow layouting)

Data-type aware widget factory for properties/commands (edit/display)

PowerPoint-like tools for drawing, arranging, grouping, selecting, zooming of text, shapes, pictures, etc.

Capability to save/load custom panels, open several simultaneously

DETAIL: Customization

Property/Command composition

[Figure: drag & drop of properties into display widgets (one shown as Trend-Line) and an editable widget]

DETAIL: Customization

Property/Command composition

[Figure: drag & drop of properties into display widgets (Image View, Histogram)]

DETAIL: Customization

Device (workflow) composition

[Figure: drag & drop of workflow nodes (devices) and drawing of connections]

DETAIL: Customization

Expert panels - Vacuum

[Figure: panel toolbar with annotations: change between “Design/Control” mode; open/save panel view; insert text, line, rectangle, …; cut, copy, paste, remove item; rotate, scale item; group items; bring to front/back]

Notification (about alarms, finished runs, …)

Concept: Single place for all system-relevant notifications, will link out to more detailed information

Can be of arbitrary type, e.g.:

Finished experiment run/scan

Finished analysis job

Occurrences of errors, alarms

Update notifications, etc.

Intended to be conceptually similar to present-day smartphone notification bars

Visibility and/or acknowledgment of notifications may be user and/or access role specific

May implement some configurable forwarding system (SMS, email, etc.)

Log Inspection (filtering, configuration of log-levels, …)

Concept: Device’s network appenders provide active logging information which can be inspected/filtered/exported

Tabular view

Filtering by: full-text, date/time, message type, description

Export logging data to file

Logging events are decoupled from main event loop (threading)

Uses Qt’s model/view with SQLite DB as model (MVC design)
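
With SQLite as the model, the filters listed above map naturally onto a WHERE clause. The sketch below uses only Python's standard sqlite3 module; the table layout, column names and sample rows are assumptions, not the GUI's actual schema, and the Qt view layer is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (timestamp TEXT, type TEXT, device TEXT, description TEXT)"
)
conn.executemany(
    "INSERT INTO log VALUES (?, ?, ?, ?)",
    [
        ("2013-11-01T10:00:00", "INFO", "motor1", "Move finished"),
        ("2013-11-01T10:05:00", "ERROR", "pump2", "Pressure too high"),
        ("2013-11-02T09:00:00", "ERROR", "motor1", "Limit switch reached"),
    ],
)


def filter_log(conn, message_type=None, text=None, since=None):
    """Build the WHERE clause from whichever filters are set."""
    clauses, params = [], []
    if message_type is not None:
        clauses.append("type = ?")
        params.append(message_type)
    if text is not None:
        clauses.append("description LIKE ?")
        params.append(f"%{text}%")
    if since is not None:
        # ISO 8601 strings sort chronologically, so a text comparison is enough.
        clauses.append("timestamp >= ?")
        params.append(since)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    query = ("SELECT timestamp, type, device, description FROM log"
             + where + " ORDER BY timestamp")
    return conn.execute(query, params).fetchall()


errors_on_day_two = filter_log(conn, message_type="ERROR", since="2013-11-02")
pressure_messages = filter_log(conn, text="Pressure")
```

Using bound parameters rather than string formatting keeps user-typed filter text from being interpreted as SQL.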


Embedded scripting (IPython, macro recording/playing)

Concept: Have the best of two worlds: embed Karabo-CLI into Karabo-GUI

Give users the possibility to work with both interfaces seamlessly

Integrate IPython console into Qt widget (as Karabo-CLI is IPython based)

Display for any GUI event the corresponding script commands

Have macro recording/playing possibilities

Online documentation (embedded wiki, bug-tracking, …)

Concept: Make the GUI a rich client having embedded internet access. Use it for web-based device documentation, bug tracking, feature requests, etc.

Any device class will have an individual (standardized) wiki page. Pages are automatically loaded (within the documentation panel) as soon as any property/command/device is selected elsewhere in GUI (identical to configuration panel behavior). Depending on access role, pages are immediately readable/editable.

Device wiki pages are also readable/editable via European XFEL’s document management system (Alfresco) using standard browsers

For each property/command the coded attributes (e.g. description, units, min/max values, etc.) are shown.

European XFEL’s bug tracking system will be integrated

Software management (coding, building, packaging, deployment, versioning, …)

Concept: Spiced up NetBeans-based build system, software-bundle approach

Clear splitting of Karabo-Framework (distributed system) from Karabo-Packages (plugins, extensions)

Karabo-Framework (SVN: karabo/karaboFramework/trunk)

Coding done using NetBeans (for C++ and Python), Makefile based

Contains: karabo-library (libkarabo.so), karabo-deviceserver, karabo-brokermessagelogger, karabo-gui, and karabo-cli

Karabo-library already contains Python bindings (i.e. can be imported into Python)

Makefile target “package” creates a self-extracting shell-script which can be installed on a blank (supported) operating system and is immediately functional

Embedded unit-testing, graphically integrated into NetBeans (C++ and Python)

Karabo-Packages (SVN: karabo/karaboPackages/category/packageName/trunk)

After installation of Karabo-Framework, packages can be built

SVN checkout of a package to any location and immediate make possible

Everything needed to start a full distributed Karabo instance available in package

A tool for package development is provided (templates, auto SVN integration, etc.)


DETAIL: Software management

The four audiences and their requirements

Framework Developer

SVN interaction, versioning, releases

Code development using NetBeans/Visual Studio

Addition of tests, easy addition of external dependencies

Tools for packaging the software into either binary + header or source bundles

Allow for being framework developer and package developer (see below) in one person at the same time

Package Developer

Flexible access to the Karabo framework ($HOME/.karabo encodes default location)

Allow "one package - one software" project mode (each device project has its own versioning cycle, individual NetBeans project)

Standards for in-house development or XFEL developers need to be fulfilled: use parametrized templates provided, development under NetBeans, use SVN, final code review

Possibility to add further external dependencies to the Karabo framework (see above)

System Integrator/Tester

Simple installation of Karabo framework and selected Karabo packages as binaries

Start broker, master, i.e. a full distributed system

Flexible setup of device-servers + plugins, allow hot-fixes, sanity checks

XFEL-User/Operator

Easy installation of pre-configured (binary framework + assortment of packages) Karabo systems

Run system (GUI, CLI)




DETAIL: Software management

Unit-testing

[Figure: unit-testing views for Python and C++]

DETAIL: Software management

Continuous integration

Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. [Wikipedia]

Required Features:

Support for different build systems and different OS

Automated builds: nightly builds

Continuous builds: on demand, triggered by SVN commit

Build matrix: different OS, compiler, compiler options

Web interface: configuration, results

Email notification

Build output logging: easy access to output of build errors

Reporting all changes from SVN since last successful build: easy trace of guilty developer

Plugin for any virtualization product (VirtualBox, VMware, etc.)

NetBeans plugin for build triggering

Easy uploading of build results (installation packages) to web repository

CI systems on the market: Hudson, CruiseControl, buildbot, TeamCity, Jenkins







Conclusions

XFEL.EU software will be designed to allow simple integration of existing algorithms/packages

The provided services focus on solving general problems like data-flow, configuration, project-tracking, logging, parallelization, visualization, provenance

The ultimate goal is to provide a homogeneous software landscape to allow fast and simple crosstalk between all computing-enabled categories (Control, DAQ, Data Management and Scientific Computing)

The distributed system is device-centric (not attribute-centric); devices inherently express functionality for communication, configuration and flow control


Thank you for your kind attention.
