OPENMRS DATA SYNCHRONIZATION


IMPLEMENTING OPENMRS IN LOOSELY CONNECTED ENVIRONMENTS

27-Nov-2008, Maros Cunderlik, openmrs.org


OpenMRS Software Architecture

Software Architecture

“Architecture is defined by the recommended practice as the fundamental organization of a system, embodied in its components, their relationships to each other and the environment, and the principles governing its design and evolution.”



WHAT does it do?


HOW does it do it?


WHY does it do it the way it does?


Architecture Documentation: ‘Views’


Logical: documents functional composition of the system elements and their relationships


Physical: distribution of the logical units onto physical resources


Servers, technologies, protocols, ports, etc.




References:

http://www.sei.cmu.edu/architecture/


http://www.sei.cmu.edu/architecture/published_definitions.html


OpenMRS: Logical View

OpenMRS: Physical View

OS: Windows or Unix

Application server: Tomcat, Java

Database: MySQL or other RDBMS

[Diagram labels: Data Entry, Clinical Decision Support, Reporting; Internet (HTTP); Tomcat + Spring + Hibernate (HTTP + Java Application Server + Java Persistence); JDBC; MySQL DB]

OpenMRS: Challenges in rural areas

Goals:

Allow convenient and up-to-date access to the system in rural districts and health centers

Data collected in rural areas must be available to central systems in a timely manner

Challenges?

Power

Connectivity (packet loss, corruption) and bandwidth

Travel: data cannot be easily shipped to/from central locations

HW and SW maintenance and upgrades in remote areas

HW failures, patches, SW upgrades, etc.

OpenMRS: Loosely connected

Solutions?

#1: Collect data on paper and ship it back to a central location

Pros vs. Cons?

#2: OpenMRS Lite

Make a lightweight copy of OpenMRS that supports minimal functionality and distribute it to remote areas

Pros vs. Cons?

#3: Separate Desktop Application

Make a completely separate application that works in disconnected mode

Pros vs. Cons?

#4: Connected installs of OpenMRS with data synchronization

Pros vs. Cons?

OpenMRS: #4 Data Synchronization

Reasons for #4:

Health centers need functionality beyond simple data entry (e.g. reporting, updated drug information), so building a separate application would be costly

Health centers also need to *receive* data about patients from other centers in their district/province, so a ‘one-way’ data flow is not sufficient

On-site connectivity: there *will* be on-site Internet connectivity, though it may be sporadic and at times unreliable

Given the limited dev resources, reuse as much of the core OpenMRS Java code as possible

OpenMRS: Data Synchronization Design

Q: What capabilities must exist in a system for two installations to exchange data?

A: Four things

1. Serialization

Facility to reliably export and then import business objects

2. Globally unique identification of data/records

Primary keys are unique only within a single *local* database

3. Change tracking mechanism

How do we know what changed on a given system since the ‘last time’?

4. Transport mechanism capable of working on unreliable networks


Data Synchronization: Implementation

#1: Serialization: Serializing object graphs can be tricky


public class Person {
    protected Address primaryAddress;
    protected Address secondaryAddress;

    public Address getPrimaryAddress() {..}
    public void setPrimaryAddress(Address a) {..}
    public Address getSecondaryAddress() {..}
    public void setSecondaryAddress(Address a) {..}
}

public static void main(String[] args) {
    Address a = new Address("Kigali");
    Person p = new Person();
    p.setPrimaryAddress(a);
    p.setSecondaryAddress(a);

    // Both fields reference the same Address object,
    // so the change is visible through either getter.
    a.setValue("Kirehe");
    assert(p.getPrimaryAddress().equals(p.getSecondaryAddress()));
}

<Person>
    <primaryAddress value='Kigali' />
    <secondaryAddress value='Kigali' />
</Person>


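..
Address a1 = p.getPrimaryAddress();
Address a2 = p.getSecondaryAddress();

// After a round trip through the XML form, a1 and a2 are two separate
// objects, so changing one no longer changes the other.
a1.setValue("Rwinkwavu");
assert(a1.equals(a2));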
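The pitfall in the Person/Address example can be reproduced without any serialization library. The following is a minimal sketch, not the OpenMRS implementation: `naiveRoundTrip` and `identityPreservingRoundTrip` are hypothetical helpers that stand in for "write the object out and read it back", and the simplified `Address` and `Person` classes use plain fields instead of getters and setters. It assumes both addresses are non-null.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class Address {
    private String value;
    Address(String value) { this.value = value; }
    String getValue() { return value; }
    void setValue(String value) { this.value = value; }
}

class Person {
    Address primaryAddress;
    Address secondaryAddress;
}

class SharedReferenceDemo {
    // Naive round trip: each field is written out by value and read back
    // as a fresh object, like the XML form shown above.
    static Person naiveRoundTrip(Person p) {
        Person copy = new Person();
        copy.primaryAddress = new Address(p.primaryAddress.getValue());
        copy.secondaryAddress = new Address(p.secondaryAddress.getValue());
        return copy;
    }

    // Identity-preserving round trip: remembers which source objects were
    // already copied, so an Address shared by two fields stays shared.
    static Person identityPreservingRoundTrip(Person p) {
        Map<Address, Address> seen = new IdentityHashMap<>();
        Person copy = new Person();
        copy.primaryAddress =
            seen.computeIfAbsent(p.primaryAddress, a -> new Address(a.getValue()));
        copy.secondaryAddress =
            seen.computeIfAbsent(p.secondaryAddress, a -> new Address(a.getValue()));
        return copy;
    }

    public static void main(String[] args) {
        Address a = new Address("Kigali");
        Person p = new Person();
        p.primaryAddress = a;
        p.secondaryAddress = a;

        Person naive = naiveRoundTrip(p);
        naive.primaryAddress.setValue("Rwinkwavu");
        System.out.println(naive.secondaryAddress.getValue()); // prints "Kigali"

        Person kept = identityPreservingRoundTrip(p);
        kept.primaryAddress.setValue("Rwinkwavu");
        System.out.println(kept.secondaryAddress.getValue()); // prints "Rwinkwavu"
    }
}
```

A serializer that has to preserve object graphs needs some equivalent of the `seen` map, for example by writing each object once and emitting references to it afterwards.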





Serialization Options:

Java native: java.io.Serializable

doesn’t work well for durable state; cannot reliably move from one JVM to another (e.g. when class versions differ)

3rd party tools: Simple, XStream

Custom

i.e. implement java.io.Externalizable

Data Synchronization: leverage Hibernate Persistence Mechanism

Pros:

Reuse what is already in use in OpenMRS

Also provides a simple solution to problem #3 (change tracking)

Cons:

Dependent on the persistence layer: any changes made outside of it will not be serialized or understood

Longer-term: replace with a robust serialization framework in core OpenMRS





#2: Record Uniqueness

How do we know the patient_id of a given patient in two different databases?

Example: 2 servers, Rwinkwavu and Kirehe

0. Both Rwinkwavu and Kirehe have the exact same number of patients in their tables

1. Rwinkwavu: Add patient Joe; the system assigns the next id, assume patient_id = 34

2. Kirehe: Add patient Patrick; the system assigns the next available primary key, say 34

3. If Rwinkwavu sends its patient data, patient #34 will be Joe, but Kirehe ‘thinks’ it is Patrick: how to fix?

Two common solutions:

Create mapping tables: server_id, table_id, pk

Cons: using one central mapping table creates a single point of failure; keeping a distributed version of the mapping table up to date is tricky

Use something that *is* globally unique: Universally Unique IDentifier (UUID)

java.util.UUID, http://en.wikipedia.org/wiki/UUID
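The Joe/Patrick collision can be sketched with `java.util.UUID`. This is an illustration only: `PatientRecord` and `UuidDemo` are hypothetical names rather than OpenMRS classes, and the sketch assumes each record receives a random UUID when it is created.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

class PatientRecord {
    final int localId;   // local primary key, unique only within one database
    final String uuid;   // globally unique identifier, assigned at creation
    final String name;

    PatientRecord(int localId, String name) {
        this.localId = localId;
        this.uuid = UUID.randomUUID().toString();
        this.name = name;
    }
}

class UuidDemo {
    public static void main(String[] args) {
        // Both servers independently hand out the same local key, 34.
        PatientRecord joe = new PatientRecord(34, "Joe");          // created at Rwinkwavu
        PatientRecord patrick = new PatientRecord(34, "Patrick");  // created at Kirehe

        // Keyed by local id, one record would overwrite the other.
        // Keyed by UUID, the two records stay distinct after a merge.
        Map<String, PatientRecord> merged = new HashMap<>();
        merged.put(joe.uuid, joe);
        merged.put(patrick.uuid, patrick);

        System.out.println(joe.localId == patrick.localId);
        System.out.println(joe.uuid.equals(patrick.uuid));
        System.out.println(merged.size());
    }
}
```

The local primary key can stay in place for joins within one database; only the exchange between servers needs to refer to records by UUID.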




#3: Change tracking mechanism

Servers need to somehow ‘tell’ each other the last time they ‘saw’ each other

Classic problem in distributed computing

Two (at least) common solutions:

#1: Versioning

#2: Change logs/journals

OpenMRS sync: change journal + Hibernate interceptor (for now)

We actually want versioning, but doing so requires extensive changes to the data model; compromise: a change journal table analogous to a DB transaction log
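A change journal can be sketched as an append-only list with a monotonically increasing sequence number. The code below is a simplified in-memory illustration under stated assumptions, not the OpenMRS sync schema: `JournalEntry`, `ChangeJournal`, `record`, and `since` are hypothetical names, and in a real system the persistence hook (such as a Hibernate interceptor) would call `record` whenever an object is saved or deleted.

```java
import java.util.ArrayList;
import java.util.List;

class JournalEntry {
    final long sequence;      // position in the journal, always increasing
    final String recordUuid;  // which record changed
    final String operation;   // "INSERT", "UPDATE" or "DELETE"

    JournalEntry(long sequence, String recordUuid, String operation) {
        this.sequence = sequence;
        this.recordUuid = recordUuid;
        this.operation = operation;
    }
}

class ChangeJournal {
    private final List<JournalEntry> entries = new ArrayList<>();
    private long nextSequence = 1;

    // Appends one change; assumed to be called from the persistence hook.
    synchronized JournalEntry record(String recordUuid, String operation) {
        JournalEntry entry = new JournalEntry(nextSequence++, recordUuid, operation);
        entries.add(entry);
        return entry;
    }

    // Returns every change after the sequence number the peer last saw.
    synchronized List<JournalEntry> since(long lastSeenSequence) {
        List<JournalEntry> result = new ArrayList<>();
        for (JournalEntry entry : entries) {
            if (entry.sequence > lastSeenSequence) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ChangeJournal journal = new ChangeJournal();
        journal.record("uuid-a", "INSERT");
        journal.record("uuid-b", "UPDATE");
        journal.record("uuid-a", "DELETE");

        // A peer that last saw sequence 1 receives the two later changes.
        for (JournalEntry entry : journal.since(1)) {
            System.out.println(entry.sequence + " " + entry.operation + " " + entry.recordUuid);
        }
    }
}
```

Each peer only has to remember the last sequence number it received, which answers the "what changed since last time" question without adding version columns to every table.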



#4: Robust Transport

Needs:

Cannot be a connected protocol (e.g. RPC)

Efficient on the wire

Backup mode for transport needs to be available in case of no connectivity

Able to withstand network transport corruption

OpenMRS Sync solution:

HTTP + checksum + compression

‘USB’ flat-file-based data interchange as backup
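The "checksum + compression" part of the transport can be sketched with the JDK's `java.util.zip` classes. The slides do not say which algorithms OpenMRS sync uses, so GZIP and CRC32 here are assumptions chosen for illustration, and `SyncPackage`, `SyncTransport`, `pack`, and `unpack` are hypothetical names. The same packed bytes could be sent over HTTP or written to a flat file on a USB stick.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class SyncPackage {
    final byte[] compressed;  // GZIP-compressed payload
    final long checksum;      // CRC32 of the compressed bytes

    SyncPackage(byte[] compressed, long checksum) {
        this.compressed = compressed;
        this.checksum = checksum;
    }
}

class SyncTransport {
    static long crc(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // Compresses the payload and attaches a checksum of the compressed bytes.
    static SyncPackage pack(String payload) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(payload.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] compressed = bytes.toByteArray();
        return new SyncPackage(compressed, crc(compressed));
    }

    // Verifies the checksum first; a mismatch means the sender should retransmit.
    static String unpack(SyncPackage pkg) {
        if (crc(pkg.compressed) != pkg.checksum) {
            throw new IllegalStateException("checksum mismatch, request retransmission");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(pkg.compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        SyncPackage pkg = pack("<Person><primaryAddress value='Kigali' /></Person>");
        System.out.println(unpack(pkg));
    }
}
```

Checking the checksum before decompressing lets the receiver reject a damaged package cheaply and ask for it again, which fits a transport that is not a long-lived connection.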



Addressing the Challenges

Power and maintenance

Use Ubuntu to minimize need for patches

Application will automatically start when the server boots and sync on schedule; no intervention needed

Investment in key infrastructure: solar power, satellite connections

TODOs: application self-update, database self-migrate

Connectivity, Travel

Sync transmissions are checksummed and compressed

Data can be retransmitted without concern about corruption/duplication

If prolonged outage: sync-via-USB available
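One way to make retransmission safe against duplication is for the receiver to remember which packages it has already applied. The slides do not describe how OpenMRS sync achieves this, so the following is only a sketch of the general idea: `SyncReceiver`, its `apply` method, and the package identifiers are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SyncReceiver {
    private final Set<String> appliedPackageIds = new HashSet<>();
    private final Map<String, String> recordsByUuid = new HashMap<>();

    // Applies a package once. Returns true if it was applied and
    // false if it was a duplicate of an earlier transmission.
    boolean apply(String packageId, String recordUuid, String recordData) {
        if (!appliedPackageIds.add(packageId)) {
            return false;  // already seen, safe to ignore
        }
        recordsByUuid.put(recordUuid, recordData);
        return true;
    }

    int recordCount() {
        return recordsByUuid.size();
    }

    public static void main(String[] args) {
        SyncReceiver receiver = new SyncReceiver();
        System.out.println(receiver.apply("pkg-1", "uuid-joe", "Joe"));          // prints "true"
        System.out.println(receiver.apply("pkg-1", "uuid-joe", "Joe"));          // prints "false"
        System.out.println(receiver.apply("pkg-2", "uuid-patrick", "Patrick"));  // prints "true"
        System.out.println(receiver.recordCount());                              // prints "2"
    }
}
```

With this shape the sender can simply resend anything it has not seen acknowledged, whether over HTTP or from a USB file, without creating duplicate records.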




DEMO


Data Synchronization: Vision
