GigaSpaces XAP Elastic Caching Edition




© Copyright 2012 GigaSpaces. All Rights Reserved.

GigaSpaces XAP

Elastic Caching Edition
Feature List



February 2012
Caching Topologies/Schemes
Data Replication
Data Partitioning
Near Cache
Disk Persistence
External Data Source
JDBC Storage Adapter
WAN Replication / Transactions / Locking
Cache Clusters over WAN
WAN Replication Topologies
Transaction Management
Object Locking
Sync/Async Operations
Partial Update
Dynamic Rebalancing
Automatic Rebalancing
The Elastic PU
Load Balancing/Failover
Load Balancing
Failover
Data Grid Node Failover and Recovery Process
Data Access and Query Support
Query Engine Supported Features
Bulk Operations
Data Streaming/Continuous Queries
Indexing
Query Execution Flow
Memory Usage / Serialization
Default Serialization Flow
Externalizable Serialization Flow
XAP .Net Serialization
Cross-Platform & Data Interoperability Support
Supported Data Types
Supported Arrays and Collections
Interoperability Example
Monitoring / Management
GigaSpaces Administration and Management API
GigaSpaces JMX API
Rich User Interface
Web-Based Admin Console
GigaSpaces Command Line Interface
Database Integration
The External Data Source API: Data-Load API, Read Through API, Write Through API
Write Behind API
Rapid Data Load




Introduction
GigaSpaces XAP Elastic Caching Edition delivers an in-memory data grid for fast data access, extreme performance, and scalability. XAP eliminates database bottlenecks and guarantees consistency, transactional security, reliability, and high availability of your data.
XAP Elastic Caching is the only product designed to dynamically scale the data layer, responding to loads in real time, while significantly boosting the performance of mission-critical applications. Combined with powerful event handling and comprehensive administration and monitoring capabilities, this provides enterprise-grade availability and reliability, guaranteeing virtually zero unanticipated downtime.
This paper provides an overview of the features and functionality offered by XAP Elastic Caching, including topologies, data replication, monitoring and management, access and query support, database integration, and more.

Caching Topologies/Schemes
1.1 Does the product support replicated cache topology? Yes
1.2 Does the product support distributed / partitioned cache topology? Yes
1.3 Does the product support near-cache cache topology? Yes

The GigaSpaces In-Memory Data Grid (IMDG) supports the following deployment topologies:
1. Fully replicated: Each member contains all of the data. Replication between nodes is done synchronously or asynchronously.
2. Partitioned: Each node contains a different subset of the data.
3. Partitioned with backup: The same as partitioned, except that each partition has one or more backup copies that receive the primary's data synchronously or asynchronously.
Regardless of the IMDG cluster deployment topology, a client can run a near cache (called a local cache/view).
All the topologies support co-locating business logic with any of the IMDG instances, enabling fast data processing and eliminating serialization and network overhead when data is accessed.

Figure 1 – Data-Grid Topologies
References
http://www.gigaspaces.com/wiki/display/XAP8/Caching+Scenarios

http://www.gigaspaces.com/wiki/display/XAP8/Space+Topologies

http://www.gigaspaces.com/wiki/display/XAP8/Terminology+-+Data+Grid+Topologies
http://www.gigaspaces.com/wiki/display/XAP8/Packaging+and+Deployment

Data Replication

IMDG replication is the process of duplicating or copying application data and operations from a
source IMDG instance to a target instance, or to multiple target instances. Replication is used mainly
for IMDG high availability (where a replica instance runs in backup mode), for load balancing, and for
sharing local site data with remote sites.
Replicating application data and operations is vital, because the IMDG deployment environment involves multiple clustered instances that need to share the same data. To perform the replication of application data and operations, a special replicator component runs inside the engine of each replicated IMDG instance. This component replicates the IMDG instance activity synchronously or asynchronously with other instances that belong to the same replication group in the IMDG cluster.
The replication mechanism is efficient by default: changes that do not need to be propagated to the target instance are not propagated. For example, if an object is written and removed within the same transaction, neither operation is replicated. You can modify this behavior by enabling the replicate-original-state option.
The replicator components discover the replica instances automatically, and reconnect automatically when a target instance is rediscovered.
When a client performs a destructive operation on the IMDG – such as write, update, take, notify
registration, or transaction commit – a replicator thread at the source instance is triggered, and
handles all the work required to copy the operations and associated objects into the target
instance(s).
When you perform IMDG operations under a transaction, the operations are replicated to the target instance only when the commit() method is called. The client receives acknowledgement of the commit operation only after both the source and the target instances have committed the transaction.
Synchronous and Asynchronous Replication Support
The GigaSpaces cluster provides both synchronous and asynchronous replication schemes. In the
synchronous replication scheme, the client receives acknowledgement of destructive operations
only after both sets of the IMDG instances – source and target – have performed the operation.

Figure 2 – Synchronous Replication
Figure 3 – Asynchronous Replication
When the target instance is defined as a backup but is not available, the client receives acknowledgement from the active primary instance. The operation is logged at the primary and sent to the backup instance when the source instance re-establishes the connection with the backup. The primary instance logs all destructive operations until the backup is started. The same behavior applies when running in asynchronous replication mode.
In the asynchronous replication scheme, destructive operations are performed in the source IMDG instance, and acknowledgement is returned to the client immediately. Operations are accumulated in the source IMDG instance and sent asynchronously to the target IMDG instance after a defined period of time has elapsed, or after a defined number of operations have been performed (whichever occurs first). The downside of this scheme is the possibility of data loss if the source instance fails while transferring the accumulated operations to the target instance. The asynchronous scheme might also introduce data inconsistency, because the source and the target do not have identical data at all times.
The following table provides a simple comparison between synchronous and asynchronous replication modes:

Data loss
Synchronous: Each IMDG instance operation waits until completion is confirmed at both the primary and backup IMDG instances. An incomplete operation is rolled back at both locations; therefore the remote copy is always an exact duplicate of the primary.
Asynchronous: Might lose data if there is an unplanned failover to the backup IMDG instance. In a failover situation, the backup IMDG instance is available to clients only after all data in its redo log has been processed, which might slow down the failover.

Distance
Synchronous: Sensitive to latency when performing write/update/remove operations, which is tied directly to distance.
Asynchronous: Highly tolerant of latency; the primary and backup IMDG instances can be located in different geographical sites (WAN).

Performance impact
Synchronous: The client must wait for confirmation of each IMDG instance operation from both the source and target IMDG instance(s). Performance depends mainly on source IMDG instance resources (CPU/memory), target IMDG instance resources (CPU/memory), and the network bandwidth and latency between them.
Asynchronous: The client is acknowledged immediately after the source IMDG instance has processed the incoming operation. Performance depends mainly on source IMDG instance resources (CPU/memory).

Data integrity
Synchronous: Very accurate.
Asynchronous: Less accurate.

Failover time
Synchronous: Very rapid. The backup IMDG instance does not need to process redo log data.
Asynchronous: The backup IMDG instance needs to process redo log data. The recovery time depends on the number of operations.

Bandwidth requirements
Synchronous: LAN.
Asynchronous: WAN/LAN.
References
http://www.gigaspaces.com/wiki/display/XAP8/Replication

http://www.gigaspaces.com/wiki/display/XAP8/Controlling+Replication+Granularity

http://www.gigaspaces.com/wiki/display/XAP8/Controlling+the+Recovery+Process

Data Partitioning
Data partitioning means grouping a large data set into smaller logical buckets to support enhanced scalability.
Data load balancing (also known as data partitioning or data routing) is done based on a user key or routing object (content-based load balancing). Users may supply their own routing implementation by implementing the routing field getter method. Custom load balancing can be achieved simply by implementing the hashCode method of the routing field's type.

Figure 4 – Data-Partitioning
@SpaceClass
public class Account {
    private Integer accountID;
    private String accountName;

    @SpaceRouting
    public Integer getAccountID() {
        return accountID;
    }

    public String getAccountName() {
        return accountName;
    }
}
Figure 5 – The SpaceRouting Annotation
When using the hash-based load-balancing policy, the client proxy spreads newly written objects across the relevant cluster IMDG instances. The target IMDG instance for an operation is determined using the hash code of the object's routing field value. This value, together with the total number of cluster partitions, is used to calculate the target partition for the operation:
Target partition ID = routing field hashCode % (number of partitions)
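As a rough illustration of this calculation (not the product's internal code; the partition count and the absolute-value guard are assumptions), the routing decision can be sketched in plain Java:

// Illustrative sketch of hash-based routing (not GigaSpaces internals).
// Assumes the routing field value is the Account ID shown above.
public final class RoutingSketch {

    // Returns the target partition ID for a given routing field value.
    static int targetPartition(Object routingValue, int partitionCount) {
        // Math.abs guards against negative hash codes (an assumption here).
        return Math.abs(routingValue.hashCode()) % partitionCount;
    }

    public static void main(String[] args) {
        Integer accountID = 1001;   // routing field value
        int partitions = 4;         // total cluster partitions
        System.out.println("Target partition: " + targetPartition(accountID, partitions));
    }
}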

Figure 6 – Data-Partitioning Example
The routing field's type must implement the hashCode method, and the routing field is used when performing both write and read operations. This approach assumes that routing field values are evenly distributed, so that data is spread evenly across all the cluster partitions.
The @SpaceRouting annotation is responsible for the affinity of the data and the activity associated with it. You can route write and read operations based on the @SpaceRouting annotation, and you can route the execution of specific business logic using the same mechanism, to run specific functionality on specific nodes.
This affinity is the cornerstone of the Space-Based Architecture GigaSpaces provides. You can co-locate your business logic and its associated data, because requests for a specific data item are always routed to the same logical partition (which might move to a different container as a result of rebalancing, based on SLA or user request).
Because routing is resolved via the lookup service rather than a static client-side routing table, the client always knows the location of each logical partition, and the extra latency associated with failover or relocation is zero.
References
http://www.gigaspaces.com/wiki/display/XAP8/Load-Balancing#Load-Balancing-HashBasedLoadBalancing

http://www.gigaspaces.com/wiki/display/XAP8/A+Typical+SBA+Application

http://www.gigaspaces.com/wiki/display/XAP8/Terminology+-+Space-Based+Architecture
http://www.gigaspaces.com/wiki/display/XAP8/Space+Topologies

Near Cache
Each client can cache recently used data that is fully stored in the IMDG (regardless of its topology) for faster read access.
GigaSpaces supports two client-side cache models:
1. Local cache – on-demand client cache load. Data is loaded in a lazy manner into the client-
side cache. This cache evicts its data based on available free memory on the client side.

Figure 7 – Local Cache
Below is a simple benchmark comparing the Ehcache get() operation with the GigaSpaces local cache using the GigaSpace.readById() operation. In this benchmark, the local cache/Ehcache size can accommodate the entire data set the application is accessing.

Figure 8 – Local Cache Benchmark
2. Local View – pre-fetch client cache load. Client cache is loaded based on predefined SQL
query definitions.

Figure 9 – Local View
Once changes are made to the IMDG data, the relevant objects within the local view/local cache are updated transparently (streaming). Updates are sent from the IMDG to the client local cache/view in batches to use the network efficiently. You can control the batch size when starting the local cache/view.
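A minimal client-side sketch of creating a local cache over a remote space is shown below; the configurer class and method names are taken from the openspaces client API of this generation and should be treated as assumptions, and the space URL is a placeholder:

import com.j_spaces.core.IJSpace;
import org.openspaces.core.GigaSpace;
import org.openspaces.core.GigaSpaceConfigurer;
import org.openspaces.core.space.UrlSpaceConfigurer;
import org.openspaces.core.space.cache.LocalCacheSpaceConfigurer;

public class LocalCacheSketch {
    // Builds a client-side local cache on top of a remote space proxy.
    // Reads served from the local cache avoid a network round trip.
    public GigaSpace createLocalCache() {
        IJSpace remoteSpace = new UrlSpaceConfigurer("jini://*/*/mySpace").space();
        LocalCacheSpaceConfigurer localCacheConfigurer =
                new LocalCacheSpaceConfigurer(remoteSpace);
        return new GigaSpaceConfigurer(localCacheConfigurer.localCache()).gigaSpace();
    }
}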
References
http://www.gigaspaces.com/wiki/display/XAP8/Local+Cache+and+Local+View

Disk Persistence
2.1 Does the product support disk persistence out-of-the-box (without any need for application coding) for replicated cache topology? Yes
2.2 Does the product support disk persistence out-of-the-box (without any need for application coding) for distributed / partitioned cache topology? Yes

GigaSpaces provides several persistency models, which enable different modes of interaction with a
database.
GigaSpaces exposes several interfaces and persistency implementations. Each provides a different
persistency model suitable for a different scenario. Two models are described below:
- External Data Source – Enables you to persist data into a user-defined model (using Hibernate config by default).
- JDBC Storage Adapter – Out-of-the-box model that plugs into any database or an embedded database (aka file persistence).
External Data Source
The External Data Source model includes a simple interface with a built-in implementation using Hibernate, which supports storing data in an existing data source and in the IMDG. Data is loaded from the data source during IMDG instance initialization (via the DataProvider implementation), and thereafter, the application works with the IMDG instance directly. Meanwhile, the data source is continually updated with all the changes made in the IMDG instance (via the DataPersister implementation).
JDBC Storage Adapter
This option does not require a mapping library to store data within the database. The mapping into database tables is done automatically by GigaSpaces XAP. This mode supports implicit data recovery from the database when the IMDG is started, stores IMDG object data within database tables as rows, and persists the replication redo log. The data source and IMDG are continually synchronized.
The GigaSpaces External Data Source (EDS) is an IMDG component that provides advanced
persistency capabilities for the Space-Based Architecture.
The External Data Source provides the DataProvider/DataPersister interfaces (with a built-in implementation using Hibernate) that can be used to store data in an existing data source and in the IMDG. Data is loaded from the data source during IMDG initialization (DataProvider), and thereafter, the application works with the IMDG directly. Meanwhile, the data source is continually updated with all the changes made in the IMDG (DataPersister).
Persistency can be configured to run in Synchronous or Asynchronous mode:
- Synchronous Mode – see Direct Persistency

Figure 10 – Synchronous Persistency Mode

- Asynchronous Mode – see Asynchronous Persistency with the Mirror

Figure 11 – Asynchronous Persistency Mode
The difference between the Synchronous and Asynchronous persistency modes is in how data is persisted back to the database. In Synchronous mode, data is persisted immediately once the operation is conducted, and the client application waits for the External Data Source to confirm the write. In Asynchronous mode (the mirror service), data is persisted in a reliable asynchronous manner using the Mirror Service as a write-behind activity. This mode provides maximum performance.
References
http://www.gigaspaces.com/wiki/display/XAP8/Persistency


http://www.gigaspaces.com/wiki/display/XAP8/External+Data+Source

WAN Replication / Transactions / Locking
3.1 Does the product offer bi-directional replication capability across WAN clusters for replicated cache? Yes
3.2 Does the product offer bi-directional replication capability across WAN clusters for distributed / partitioned cache? Yes
3.3 Does the product offer capability to synchronously replicate data over WAN? Yes
3.4 Does the product offer delta processing for updates for optimized WAN replication? Yes
3.5 Does the product support distributed transactions across cache clusters residing in different regions (WAN)? Yes
3.6 Does the product support entry-level locks? (i.e., ability to lock an entry for update.) Yes

Cache Clusters over WAN
Enterprise-wide applications require the sharing of data across remote geographical sites. This means that several instances of the application need to share data, while the communication bandwidth between the different sites may be low and the connection relatively slow.
In some cases, due to security and limited bandwidth, each site requires only a portion of the local
site data to be replicated to other remote sites. In such cases, data should be filtered before being
replicated to other remote sites based on user business logic.
Cluster-to-cluster synchronization over WAN can be between:
- A cluster running in one site of the enterprise and another cluster running at another remote site of the enterprise;
- A local cluster running within the enterprise and another cluster running in the cloud;
- A cluster running on cloud A and another cluster running on cloud B.
These scenarios are relevant for disaster recovery requirements and for active-active situations.
GigaSpaces XAP support for WAN data distribution is addressed by the following components:
- Asynchronous Replication – GigaSpaces enables batching IMDG operations and replicating them based on time intervals or number of operations. The IMDG includes a redo log mechanism that efficiently logs operations and entries and performs a handshake between the source and target IMDG instances for every replication event. Because the replication is asynchronous, client response time is not affected.
- Processing Unit – You can deploy your business logic into GigaSpaces XAP, co-locating the business logic and IMDG within the same memory address space. The business logic can react when a matching event occurs within the IMDG. This enables you to create specific business logic that interacts with remote sites, and synchronizes activities across the different sites.
- Data Replication Granularity – WAN environments create the need to define fine-grained replication, sometimes at the object level. GigaSpaces XAP supports IMDG instance, class, and object level (content-based) replication granularity that is decoupled from the actual application business logic and done through external configuration.
- Data Bulk Load – In WAN environments there is a need to pre-load the various nodes before they are available for public use. GigaSpaces includes a built-in mechanism to pre-load IMDG instances before they are accessible to clients.
- Master-Local Space or Local View (Hub & Spoke Architecture) – In WAN environments, you might want to form smart dynamic caches on the client side that are decoupled from the backend IMDG, to store only data relevant for the application instance. This client cache runs in the client memory address space and is updated using streaming techniques with a push/pull model. The local cache can be loaded on demand or pre-loaded using standard SQL.
- Automatic Data Recovery – GigaSpaces XAP provides a data and event notification registration recovery mechanism, enabling restarted IMDG instances to pre-load data from a partner IMDG instance.
- Loosely-Coupled Architecture – Because data is stored in memory within IMDG instances, you can construct loosely-coupled architectures, with data distributed without physical relationships in terms of data complexity, machine host names, network IPs, or actual IMDG instance location.
- System-Level Events and Fault-Detection – GigaSpaces XAP's "nervous system" – the Service Grid – provides system-level events, enabling external applications and internal components to be aware of the status of the various distributed components, such as their deployment, active mode, or failure, and act upon this information. These events can be accessed via the JMX API.
- Intuitive Provisioning via GUI – GigaSpaces XAP provides an extensive enterprise-level monitoring, provisioning, management and administration GUI tool. Using one management console, the user can view and query the status of deployed services and IMDG instances in real time.
WAN Replication Topologies
GigaSpaces XAP provides various WAN-based replication topologies. These support a wide range of deployment scenarios, from the very simple to the very complex. Each has a different impact on performance and deployment complexity.
WAN Gateway
This is the most advanced option. This topology includes the following:
- Active-Active (Multi-Master), Active-Passive (bi-directional replication) support.
- Automatic deployment. No static deployment or manual handling.
- Control of data flow over the WAN.
- Small number of connections over the WAN.
- Conflict resolution support (concurrent updates).
- Indirect data replication.
- Remote site bootstrapping.

Figure 12 – Master-Slave WAN Replication
Figure 13 – Multi-Master WAN Replication

For more information about the GigaSpaces WAN Gateway, see
www.gigaspaces.com/wiki/display/XAP8/Multi-Site+Replication+over+the+WAN

Local Cache over WAN

This topology includes the following:
- Multi-Master database deployment – sync data through the database replication module
- One-Master cache
- Simple to deploy and manage
- Primary and backups located within the same site
- One central location for data updates
- Local cache nodes at each site
- A copy of the recently used data co-located with the app server process
- Improves read activity performance significantly
- Avoids reading data over the WAN
- Hub & Spoke model – clients can come and go

Figure 14 – Local Cache over WAN
Synchronous Replication over WAN
You can deploy IMDG (replicated or partitioned) across multiple sites. This topology includes:
- One IMDG cluster – simple to deploy and manage
- Local cache/view improves read activity performance
- Avoids reading data over the WAN


Figure 15 – Synchronous Replication over the WAN
Transaction Management
The GigaSpaces IMDG supports local, distributed, and XA transaction managers, thereby supporting all standard transaction isolation levels.
The IMDG is based on an industry-proven model invented 30 years ago (Linda, also known as tuple space) that was designed to scale transactional distributed systems.
Transaction support is provided through an explicit API and through Spring automatic transaction demarcation annotations. The developer can entirely abstract transaction handling from the business logic, enabling Spring and GigaSpaces XAP to handle the transaction behavior completely transparently.
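A minimal sketch of the Spring-demarcated style described above; the transaction manager and GigaSpace wiring are assumed to be configured elsewhere:

import org.openspaces.core.GigaSpace;
import org.springframework.transaction.annotation.Transactional;

public class TransferService {
    private final GigaSpace gigaSpace;

    public TransferService(GigaSpace gigaSpace) {
        this.gigaSpace = gigaSpace;
    }

    // Both writes commit or roll back together; the GigaSpace proxy joins the
    // Spring-managed transaction automatically (assumes a GigaSpaces transaction
    // manager is declared in the Spring context).
    @Transactional
    public void writeBoth(Account a, Account b) {
        gigaSpace.write(a);
        gigaSpace.write(b);
    }
}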
Learn more:
http://www.gigaspaces.com/wiki/display/XAP8/Transaction+Management

http://www.gigaspaces.com/wiki/display/XAP8/Space+Locking+and+Blocking

Object Locking
Both optimistic and pessimistic locking models are supported. Special optimization is provided for the optimistic locking model and for batch operations. You can lock a specific object exclusively using the EXCLUSIVE read modifier and the readById operation.
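A minimal sketch of an exclusive read, assuming accountID is also the space ID and routing field of the Account class shown earlier, and that the call runs inside a Spring-managed transaction:

import com.j_spaces.core.client.ReadModifiers;
import org.openspaces.core.GigaSpace;
import org.springframework.transaction.annotation.Transactional;

public class LockingSketch {
    // Reads an Account under an exclusive lock; other transactions cannot
    // read or update it until this transaction completes.
    @Transactional
    public Account lockAccount(GigaSpace gigaSpace, Integer accountID) {
        long timeoutMs = 1000;   // how long to wait for the lock
        return gigaSpace.readById(Account.class, accountID, accountID,
                timeoutMs, ReadModifiers.EXCLUSIVE_READ_LOCK);
    }
}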
Learn more:
http://www.gigaspaces.com/wiki/display/XAP8/Optimistic+Locking

http://www.gigaspaces.com/wiki/display/XAP8/Pessimistic+Locking

Elastic Caching Features | 16
Sync/Async Operations
All data retrieval operations have both synchronous and asynchronous modes. This means you can perform a read or take call that waits for a matching object to appear within the IMDG (in case there is no matching object when the operation is called), or you can call the asyncRead or asyncTake operation and be called back via a listener when a matching object appears within the IMDG. Such asynchronous operations are very useful when building IMDG clusters that span multiple data centers over a WAN.
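A rough sketch of the asynchronous style described above; the listener-based asyncTake overload used here is an assumption about the API:

import com.gigaspaces.async.AsyncFutureListener;
import com.gigaspaces.async.AsyncResult;
import org.openspaces.core.GigaSpace;

public class AsyncTakeSketch {
    // Registers a callback that fires when a matching Account appears in the
    // IMDG, instead of blocking the calling thread.
    public void takeWhenAvailable(GigaSpace gigaSpace) {
        Account template = new Account();   // null fields act as wildcards
        gigaSpace.asyncTake(template, new AsyncFutureListener<Account>() {
            public void onResult(AsyncResult<Account> result) {
                if (result.getException() == null && result.getResult() != null) {
                    System.out.println("Took: " + result.getResult().getAccountName());
                }
            }
        });
    }
}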
Partial Update
You can perform data updates using the PARTIAL_UPDATE mode. When this mode is enabled, you place a null value in each IMDG object field you do not want replaced when updating specific object content. With this option, only the changes (delta) are replicated to the target IMDG instances (backups), improving replication speed.
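For illustration only (the write overload and the UpdateModifiers constant are assumptions about the API of this generation, and setters are assumed on the Account class shown earlier), a partial update might look like this:

import com.j_spaces.core.client.UpdateModifiers;
import net.jini.core.lease.Lease;
import org.openspaces.core.GigaSpace;

public class PartialUpdateSketch {
    // Updates only the non-null fields of the template object; null fields are
    // left untouched in the stored IMDG object, and only the delta is
    // replicated to the backup.
    public void renameAccount(GigaSpace gigaSpace, Integer accountID, String newName) {
        Account change = new Account();
        change.setAccountID(accountID);   // ID locates the existing object
        change.setAccountName(newName);   // the only field to be replaced
        // all other fields stay null, so they are not overwritten
        gigaSpace.write(change, Lease.FOREVER, 0, UpdateModifiers.PARTIAL_UPDATE);
    }
}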
Dynamic Rebalancing
4.1 Does the product support ability to add new replicated cache instance at run-time? Yes
4.2 Would the newly added replicated cache instance participate in load-balancing at runtime? Yes
4.3 Does the product support ability to add new distributed / partitioned cache instance at run-time? Yes
4.4 Does the product offer capability to automatically re-balance partitioned data across newly added instance? Yes

Automatic Rebalancing
GigaSpaces XAP supports automatic discovery, rebalancing, and expansion/shrinking of the IMDG
while the application is running. When deploying an IMDG, the system partitions the data (and co-
located business logic) into logical partitions. You can choose the number of logical partitions or let
GigaSpaces XAP calculate the number.
The logical partitions can initially run within the same GigaSpaces container, and later relocate to other containers (started dynamically on demand) on other machines, enabling the system to expand and increase its memory and CPU capacity while the application is running.
Usually, the number of IMDG logical partitions is the maximum expected number of machines that can
host the IMDG, multiplied by a scaling factor that is usually the number of containers per machine.
The number of logical partitions and replicas per partition should be determined at deployment. The
number of containers hosting the IMDG instances can be changed during runtime.
The component that is responsible for reshaping the IMDG in runtime is the Elastic Service Manager
(ESM):

Figure 16 – The Elastic Service Manager
What happens when a GigaSpaces container node is removed?
When removing a grid node (GSC) hosting an IMDG primary instance, clients fail over to a hot standby IMDG backup instance. A new backup instance of the failed IMDG partition is recreated on one of the available GSCs to comply with the predefined SLA.
What happens when a GigaSpaces container node is added?
When a new grid node (GSC) joins the cluster, the system can be configured to automatically
rebalance IMDG partitions/application instances to run on the newly introduced grid node. This can
also be done manually using the rebalance utility, by starting new containers and relocating existing
IMDG instances to the newly added nodes.
Dynamic rebalancing based on SLA
The ESM can rebalance the IMDG based on memory or CPU utilization (SLA). You should specify the
SLA at deployment:
ProcessingUnit pu = esm.deploy(new ElasticDataGridDeployment("mygrid")
        .capacity("1g", "2g")
        .maximumJavaHeapSize("250m")
        .highlyAvailable(true)
        .addSla(new MemorySla("75%")));
The ESM continually rebalances instances across all available machines. This service currently
supports the built-in IMDG PU. Future XAP versions will have the ESM supporting PUs with custom
business logic co-located with the data grid. The rebalance utility supports both the IMDG PU and
custom PUs with/without embedded IMDG.
The Elastic PU
Production machines might be restarted every few days, they might fail abnormally and be restarted, or new machines might be started and added to the grid. To enable even distribution of primary IMDG instances, or simply to stretch the running instances across all available machines, the Elastic PU spreads primary and backup IMDG instances evenly across all the machines running GSCs.

Figure 17 – Data Rebalancing
How does GigaSpaces XAP rebalancing work?
The GigaSpaces XAP runtime environment differentiates between a grid node (also called a Grid Service Container, or GSC) that runs within a single JVM instance, and an IMDG node (also called a logical partition). A partition can have one primary instance and zero or more backup instances.
A grid node hosts zero or more IMDG nodes (backup or primary instances belonging to different partitions). IMDG nodes (or any other deployed PU instances) can move (relocate) between grid nodes at runtime. The GigaSpaces IMDG can expand or shrink its capacity in real time, by adding or removing grid nodes and relocating existing logical partitions to newly started containers.
If the system decides to relocate a primary instance, it first switches its activity mode to backup, and the existing backup instance is switched into a primary. When the new backup is relocated, it recovers its data from the existing primary. This is how the system expands without disruption.
GigaSpaces XAP rebalancing gives you total control over which logical partitions to move, where to
move them, and whether to move primary or backup instances. You can increase the capacity or
rebalance the IMDG automatically or manually. This is a very important capability in production
environments. Without this control, the system might move partitions that are hosted within containers
that are not fully consumed, and move too many instances into the same container, which would
crash the system.
GigaSpaces XAP can dynamically adjust the high-availability SLA to cope with current system
memory resources. This means that if there is insufficient memory to instantiate all the backup
partitions, GigaSpaces XAP relaxes the SLA in runtime to enable the system to continue running.
When the system identifies that there are enough resources to accommodate all the backups it starts
the missing backups automatically.
See more:
www.gigaspaces.com/wiki/display/XAP8/Elastic+Processing+Unit

Load Balancing/Failover
5.1 Does the product support load-balancing across replicated cache instances? If so, please mention the schemes supported in comments. Yes
5.2 Does the product support load-balancing across partitioned cache instances? If so, please mention the schemes supported in comments. Yes
5.3 Does the product support connection failover in case of a replicated cache instance failure? Yes
5.4 Does the product support connection failover in case of a partitioned cache instance failure? Yes

GigaSpaces XAP provides SLA-based automated service failure detection, failover, and failback for
its IMDG and any other services deployed into its container. Failover happens within 1-2 seconds and
can be tuned with a reliable network to happen within less than one second.
The system automatically provisions a new IMDG backup instance when failure occurs (the existing
backup is promoted to a primary) within a few seconds, to support continuous high availability. The
machine/container used to host the newly started backup is determined based on predefined SLA.
Load Balancing
You can load-balance any activity (read/write) using any of the built-in load-balancing options. The supported load-balancing policies are:
local-space – This policy routes the operation to the local embedded space (without specifying the exact space name). It is used mainly with P2P clusters.
round-robin – The clustered proxy distributes the operations to the group members in a round-robin fashion. For example, if there are three spaces, one operation is performed on the first space, the next on the second space, the next on the third space, the next on the first space, and so on.
fixed-by-hash – Each space in the group is assigned a certain range of hash codes. The clustered proxy computes a hash from the first space object or template sent by the user, and directs that operation and all subsequent operations to the space that the hash code belongs to.
hash-based – As above, except a new hash is computed for each user operation, so each operation may be routed to a different space. This ensures, with high probability, that operations are evenly distributed. This is the default mode and the recommended mode.
See the Data Partitioning section for more details.
Failover
Failover is the mechanism used to route user operations to alternate IMDG instances if the target
IMDG (of the original operation) fails. Several IMDG instances can belong to a failover group, which
then defines their failover policy.
The component responsible for failover in GigaSpaces is the clustered proxy. This component maintains a list of IMDG instances that belong to the failover group. When an operation on an IMDG instance fails because the instance is unavailable, the clustered proxy tries to locate a live and accessible IMDG instance. If it finds such an instance, it re-invokes the operation on it. If it doesn't find any live IMDG instance, it throws the original exception back to the user.
Data Grid Node Failover and Recovery Process
Failover and recovery apply to an IMDG deployed with a backup instance (partitioned or non-
partitioned). The failover and recovery process includes the following steps (assuming the IMDG is
deployed into a GSC):
1. The IMDG backup instance identifies that the primary IMDG instance is not available and switches itself into primary mode. The active election settings that are part of the cluster configuration can be tuned to speed up this step.
2. The client is routed to the backup IMDG instance. You can change the failover timeout as part of the cluster configuration to tune this.
3. The Grid Service Manager (GSM) identifies that the backup IMDG instance is missing and provisions a new IMDG instance into an existing GSC – preferably an empty GSC, or based on the predefined SLA. You can tune this step by modifying the GSM settings.
4. An IMDG backup instance is started. It accesses the lookup service and registers itself.
5. The IMDG backup instance identifies that a primary IMDG instance exists (it goes through the primary election process) and moves itself into backup mode.
6. The IMDG backup instance reads existing IMDG objects from the primary IMDG instance (memory recovery). This process uses multiple threads (recovery threads). You can tune this by modifying the recovery batch size and the number of recovery threads.
7. The IMDG primary instance clears its redo log (during the previous step) and begins to accumulate incoming destructive events (write, update, take) within its redo log.
8. The IMDG backup instance finishes reading IMDG objects from the primary instance.
9. The IMDG primary instance replicates its redo log content to the backup. At this time, the IMDG backup instance also receives sync replication events from the primary.
10. When the redo log replication is completed, the recovery process is finished. The IMDG backup instance is ready to act as a primary IMDG instance.
References
http://www.gigaspaces.com/wiki/display/XAP8/Configuring+the+Processing+Unit+SLA

http://www.gigaspaces.com/wiki/display/XAP8/Failover+Detection+and+Tuning

http://www.gigaspaces.com/wiki/display/XAP8/Failover

http://www.gigaspaces.com/wiki/display/XAP8/Proxy+Connectivity

http://www.gigaspaces.com/wiki/display/XAP8/About+Network+Failure+Detection

http://www.gigaspaces.com/wiki/display/XAP8/Communication+Protocol

Data Access and Query Support
6.1 Does the product offer a cost-based optimizer for improved query performance? Yes
6.2 Does the product support OQL / SQL query capability? Yes
6.3 Does the product support Like queries (starts with & contains)? Yes
6.4 Does the product support bulk get interface for optimized retrieval? If so, please add comments on this support. Yes
6.5 Does the product support bulk put interface for optimized put? If so, please add comments on this support. Yes
6.6 Does the product support streaming updates based on OQL subscription? Yes
6.7 Does the product support streaming updates based on key, regex subscription? Yes

Queries can be executed against remote or embedded IMDG instances. You can use template-based (query by example) or pure SQL-based queries to locate matching objects within the IMDG. The product supports JPA, POJO/Spring, and JDBC/ODBC APIs. The ODBC API is provided through a third-party bridge. Queries are always executed in parallel, enabling the system to optimize use of multi-core machines and the network.
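For illustration (reusing the Account class from the Data Partitioning section; the query string is arbitrary), an SQL query against the IMDG might look like this:

import com.j_spaces.core.client.SQLQuery;
import org.openspaces.core.GigaSpace;

public class QuerySketch {
    // Reads up to 100 Account objects whose name starts with "A";
    // the query is executed in parallel across all partitions.
    public Account[] findAccounts(GigaSpace gigaSpace) {
        SQLQuery<Account> query =
                new SQLQuery<Account>(Account.class, "accountName like 'A%'");
        return gigaSpace.readMultiple(query, 100);
    }
}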
The GigaSpaces query engine is highly optimized, and supports primitive data types, nested complex objects, and custom queries, with a built-in high-performance indexing engine optimized for both equality and greater-than/less-than queries.
The GigaSpaces IMDG supports large queries with very large result sets using a smart data iterator,
enabling the client to iterate over a very large data set without any overhead.
The GigaSpaces Map-Reduce API enables transporting the query into the IMDG itself to be executed
within the IMDG, avoiding the need to transport the whole query result back into the client application.

Figure 18 – GigaSpaces Map-Reduce API
When the IMDG is configured to run with an LRU cache policy, the underlying database is queried automatically on a cache miss. The IMDG queries are translated transparently into database queries. Relevant data found within the database is loaded into the IMDG and handed back to the application, enabling the application to reuse this data without accessing the database again.
Query Engine Supported Features
- All basic logical operations to create conditions: =, <>, <, >, >=, <=, like, BETWEEN, NOT like, is null, is NOT null, IN
- AND/OR operators to join two or more conditions
- Indexing
- Like / regular expression queries
- Dynamic (parametric) queries
- Nested object queries
- Nested arrays, collections and maps queries
- Join (via JDBC)
- Order By
- Group By
- Sub Query (IN)
- Aggregate functions: COUNT, MAX, MIN, SUM, AVG (via JDBC)
Bulk Operations
Bulk operations for data retrieval are supported using readMultiple, takeMultiple, and the GSIterator.
Bulk operations for data updates are supported using writeMultiple and updateMultiple.
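A small sketch of the bulk calls named above (the batch source is hypothetical):

import org.openspaces.core.GigaSpace;

public class BulkSketch {
    // Writes a batch of accounts in bulk, then reads them all back using a
    // template object (null fields act as wildcards).
    public void loadAccounts(GigaSpace gigaSpace, Account[] batch) {
        gigaSpace.writeMultiple(batch);
        Account[] all = gigaSpace.readMultiple(new Account(), Integer.MAX_VALUE);
        System.out.println("Accounts in grid: " + all.length);
    }
}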

Figure 19 – IMDG Operations
References
http://www.gigaspaces.com/wiki/display/XAP8/SQLQuery

http://www.gigaspaces.com/wiki/display/XAP8/JDBC+Driver

http://www.gigaspaces.com/wiki/display/SBP/Custom+Query

http://www.gigaspaces.com/wiki/display/XAP8/POJO+Support#POJOSupport-CodeSnippets

Data Streaming/Continuous Queries
Data streaming and continuous queries are supported through the Session-Based Messaging API,
the Notify Container, and the local view:

Figure 20 – Notify Container


Figure 21 – Session-Based Messaging API
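As an illustration of the Notify Container, a minimal annotation-driven listener might look like the sketch below; the annotations come from the openspaces events package, and the Spring wiring that registers the listener is assumed and not shown:

import org.openspaces.events.EventDriven;
import org.openspaces.events.EventTemplate;
import org.openspaces.events.adapter.SpaceDataEvent;
import org.openspaces.events.notify.Notify;

@EventDriven
@Notify
public class AccountChangeListener {

    // Continuous query: only Account objects matching this template trigger
    // notifications that are streamed to this listener.
    @EventTemplate
    public Account template() {
        return new Account();   // empty template = all Account objects
    }

    // Called whenever a matching object is written or updated in the IMDG.
    @SpaceDataEvent
    public void onEvent(Account account) {
        System.out.println("Account changed: " + account.getAccountName());
    }
}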
References
http://www.gigaspaces.com/wiki/display/XAP8/Session+Based+Messaging+API

http://www.gigaspaces.com/wiki/display/XAP8/Notify+Container


http://www.gigaspaces.com/wiki/display/XAP8/Local+View




Indexing
7.1 Does the product support configuration based indexes? (i.e., addition of indexes without application coding) Yes
7.2 Does the product support programmatic index addition at runtime? Yes

When executing queries, the IMDG looks for matching objects stored within the IMDG. This can be
time consuming, especially when there are many potential matching objects. To improve the query
execution performance, it is possible to index one or more object properties. The IMDG maintains
additional data in-memory for indexed properties, which shortens the time required to determine a
match, thereby improving data retrieval performance.
Properties are not always indexed, nor are all the properties in all the classes always indexed. The
reason is that indexing has its downsides as well:
- An indexed property can speed up read/take operations, but might also slow down write/update operations.
- An indexed property consumes more resources, specifically increasing the memory footprint per entry.
Usually it is recommended to index properties that are used in common queries. However, in some
cases, you might prefer a smaller footprint or faster performance for a specific query, and then,
adding/removing an index should be considered.
You can specify which properties of a class are indexed, using annotations or gs.xml:
Via Annotations:
@SpaceClass
public class Person
{
    private String lastName;
    private String firstName;
    private Integer age;

    ...

    @SpaceIndex(type=SpaceIndexType.BASIC)
    public String getFirstName() {return firstName;}
    public void setFirstName(String firstName) {this.firstName = firstName;}

    @SpaceIndex(type=SpaceIndexType.BASIC)
    public String getLastName() {return lastName;}
    public void setLastName(String name) {this.lastName = name;}

    @SpaceIndex(type=SpaceIndexType.EXTENDED)
    public Integer getAge() {return age;}
    public void setAge(Integer age) {this.age = age;}
}

GigaSpaces White Paper | 25
Via XML:
<gigaspaces-mapping>
    <class name="com.gigaspaces.examples.Person" persist="false" replicate="false" fifo="false">
        <property name="lastName">
            <index type="BASIC"/>
        </property>
        <property name="firstName">
            <index type="BASIC"/>
        </property>
        <property name="age">
            <index type="EXTENDED"/>
        </property>
    </class>
</gigaspaces-mapping>

When a read, take, readMultiple, or takeMultiple call is performed, a template or SQLQuery is used to locate matching IMDG objects. The template object might have multiple fields; some might include values and some might not (null field values act as wildcards). The fields that do not include values are ignored during the matching process.
Query Execution Flow
When multiple class fields are indexed, the IMDG looks for the indexed field whose index (using the corresponding template field value as the index key) includes the smallest number of matching IMDG objects.
The smallest set of IMDG objects becomes the list of objects to match against (the matching candidates). When the candidate IMDG object list is constructed, it is scanned to locate IMDG objects that fully match the given template (all non-null template fields match the corresponding IMDG object fields).
Class fields that are not indexed are not used to construct the candidate list.
References
http://www.gigaspaces.com/wiki/display/XAP8/Indexing

Memory Usage / Serialization
8.1 Does the product store objects in serialized form in cache, when indexes are created on the value object? Other
8.2 Does the product store indexes in serialized form in cache? Yes

GigaSpaces uses a unique approach when transporting IMDG objects from one process to another (client to IMDG instance, or IMDG instance to IMDG instance). By default, objects are not serialized using the regular Java serialization approach, but using GigaSpaces serialization technology.
There are several options to control GigaSpaces serialization:
- Default mode – Active when your space class does not implement Externalizable. You can choose one of the serialization modes listed below to control the way the space object fields are serialized.
- Implement Externalizable – In this case, only the native serialization is supported. This mode enables you to have total control of the object transport.
With the default mode (when the space class does not implement Externalizable), you can control the serialization mode of space class non-primitive fields when they are written to or read from the space (remote or embedded) using the space-config.serialization-type property. Supported options are:
- 0 – Native Serialization (default)
- 1 – Light Serialization
- 2 – Full Serialization
- 3 – Compressed Serialization
Indexed data is stored within a special data structure that enables very fast scanning with minimum overhead and minimum footprint. Each index holds the indexed value in serialized form and references all relevant IMDG objects.
Default Serialization Flow
When a client performs an IMDG operation against a remote IMDG instance (an IMDG instance running on a different VM than the client program), the POJO/PONO/POCO non-primitive fields used with the write/update operation, or the POJO template non-primitive fields used with the read/take/notify operation, are serialized into a special object (packet) and sent to the IMDG. Primitive fields are copied into the packet object. When the POJO/PONO/POCO or the template packet arrives at the IMDG, non-primitive fields are de-serialized, and the fields (primitive and non-primitive types) are stored within the IMDG using a generic data structure, or are used to find matching objects (read/take/notify) in the IMDG. When a read/take operation is called, the matching object data is serialized and sent back to the client program. When the matching object arrives at the client program VM, it is de-serialized and used by the client application.

Figure 22 – Default Serialization Flow
The POJO/PONO/POCO field values and their meta-data information are extracted at runtime (marshaled), and transferred into the IMDG using the GigaSpaces generic portable object. When sent back to the client as a result of a read/take operation, the object is de-marshaled and used by the client application.
Externalizable Serialization Flow
When the Java space class (relevant for a Java POJO) implements the Externalizable interface, readExternal and writeExternal are called, and you can control the stream transferred across the network.


Figure 23 – Externalizable Serialization Flow
References
http://www.gigaspaces.com/wiki/display/XAP8/Controlling+Serialization
http://www.gigaspaces.com/wiki/display/XAP8/Externalizable+Support

XAP .Net Serialization

GigaSpaces PBS (Portable Binary Serialization) is the underlying technology used to serialize and transport non-Java objects from the client side to the IMDG side. It is a highly optimized serialization technology that enables a .Net or C++ client to communicate with the IMDG very efficiently.
To control space class field serialization, use the StorageType attribute.
The supported StorageType options are:

- StorageType=Object means that the .NET proxy serializes the property value in PBS, and the IMDG de-serializes it back into its Java counterpart and stores it in the IMDG as such.
- StorageType=Binary means that the .NET proxy serializes the property value in PBS, and the IMDG does not de-serialize it; it just keeps the bytes in a special binary container in the IMDG. On read/take, the binary content is passed as-is back to the .NET proxy, which de-serializes it back to a .NET object.
- StorageType=BinaryCustom is the same as Binary, except that the .NET proxy uses .NET serialization instead of PBS serialization.

Performance Considerations
- Binary is almost always the fastest, because the server side does not need to de-serialize/serialize on write/read.
- BinaryCustom is somewhat slower in most cases than Binary, because PBS serialization is faster and more efficient than default .NET serialization; .NET serialization is more generic and needs to take into account considerations that GigaSpaces does not (for example, GigaSpaces XAP only serializes metadata the first time, while .NET does it every time). However, on the IMDG side, the data is still stored as a BLOB.
- Object serialization is usually the slowest, because the IMDG must fully de-serialize the BLOB and reconstruct the POJO.

Note that for common primitive types (int, string, date/time, etc.), performance differences are negligible (if they exist at all), and there is no point in changing the default from Object. As the property payload increases, it makes more sense to start tuning its storage type.
References

http://www.gigaspaces.com/wiki/display/XAP8NET/Controling+Serialization


Cross-Platform & Data Interoperability Support
9.1 Does the product offer a Java client API? Yes
9.2 Does the product offer a .Net client API? Yes
9.3 Does the product require custom interfaces to be implemented for making objects portable across Java and .Net? If so, please add comments. No
9.4 Does the product offer any tools to auto-create Java/.Net custom objects from plain POJOs? If so, please add comments. Yes

GigaSpaces XAP supports interoperability, including enabling Java, .NET and C++ applications to
communicate and access each other easily and efficiently, while also maintaining the benefits of the
GigaSpaces scale-out application server:
- Transparency: Use the application's native OO classes to store data in-memory and share data across different applications. For example, an object can be written to the IMDG from a C++ client, processed using a Java client, and a .NET client can receive a notification for the updated data.
- Performance: Communication between different applications is performed through the IMDG directly, without the need for adapters or XML translation.
- SOA: All applications communicate using the IMDG, enabling each of them to exist as a loosely-coupled service.

Figure 24 – Interoperability Support
GigaSpaces data interoperability supports primitive types, strings, date/time, collections, and user-defined objects.
The GigaSpaces API is provided for Java (and all its variations, such as Scala, JRuby, etc.), .Net, and C++. You can deploy Java, .Net, and C++ applications and co-locate them with the IMDG.
Java API doc:
http://www.gigaspaces.com/wiki/display/XAP8
http://www.gigaspaces.com/docs/JavaDoc8/index.html
http://www.gigaspaces.com/wiki/display/XAP8/Packaging+and+Deployment

.Net API doc:
http://www.gigaspaces.com/wiki/display/XAP8NET
http://www.gigaspaces.com/docs/dotnetdocs8.0/
http://www.gigaspaces.com/wiki/display/XAP8NET/.NET+Processing+Unit+Container

CPP API doc:
http://www.gigaspaces.com/wiki/display/XAP8/XAP+CPP
http://www.gigaspaces.com/docs/cppdocs8.0/annotated.html
http://www.gigaspaces.com/wiki/display/XAP8/CPP+Processing+Unit

There is no need to generate matching Java classes for .Net/C++ classes when using the .Net/C++ GigaSpaces API.

Supported Data Types
The following types are supported by the IMDG for matching and interoperability:
CLS | C# | VB.Net | Java | Description
System.Byte | byte | Byte | byte | 8-bit integer.
Nullable<Byte> | byte? | Nullable(Of Byte) | java.lang.Byte | Nullable wrapper for byte.
System.Int16 | short | Short | short | 16-bit integer.
Nullable<Int16> | short? | Nullable(Of Short) | java.lang.Short | Nullable wrapper for short.
System.Int32 | int | Integer | int | 32-bit integer.
Nullable<Int32> | int? | Nullable(Of Integer) | java.lang.Integer | Nullable wrapper for int.
System.Int64 | long | Long | long | 64-bit integer.
Nullable<Int64> | long? | Nullable(Of Long) | java.lang.Long | Nullable wrapper for long.
System.Single | float | Single | float | Single-precision floating-point number (32 bits).
Nullable<Single> | float? | Nullable(Of Single) | java.lang.Float | Nullable wrapper for float.
System.Double | double | Double | double | Double-precision floating-point number (64 bits).
Nullable<Double> | double? | Nullable(Of Double) | java.lang.Double | Nullable wrapper for double.
System.Boolean | bool | Boolean | boolean | Boolean value (true/false).
Nullable<Boolean> | bool? | Nullable(Of Boolean) | java.lang.Boolean | Nullable wrapper for boolean.
System.Char | char | Char | char | A Unicode character (16 bits).
Nullable<Char> | char? | Nullable(Of Char) | java.lang.Character | Nullable wrapper for char.
System.String | string | String | java.lang.String | An immutable, fixed-length string of Unicode characters.
System.DateTime, Nullable<DateTime> | DateTime, DateTime? | DateTime, Nullable(Of DateTime) | java.util.Date | An instant in time, typically expressed as a date and time of day.
System.Decimal, Nullable<Decimal> | decimal, decimal? | Decimal, Nullable(Of Decimal) | java.math.BigDecimal | A decimal number, used for high-precision calculations.
System.Guid, Nullable<Guid> | Guid, Guid? | Guid, Nullable(Of Guid) | java.util.UUID | A 128-bit integer representing a unique identifier.
System.Object | object | Object | java.lang.Object | Any object.
Supported Arrays and Collections
The following collections are mapped for interoperability:
.Net | Java | Description
T[] | E[] | Fixed-size arrays of elements.
System.Collections.Generic.List<T>, System.Collections.ArrayList, System.Collections.Specialized.StringCollection | java.util.ArrayList | Ordered list of elements.
System.Collections.Generic.Dictionary<K,V>, System.Collections.Hashtable, System.Collections.Specialized.HybridDictionary, System.Collections.Specialized.ListDictionary | java.util.HashMap | Collection of key-value pairs.
System.Collections.Generic.SortedDictionary<K,V>, System.Collections.Specialized.OrderedDictionary | java.util.LinkedHashMap | Ordered collection of key-value pairs.
System.Collections.Specialized.NameValueCollection, System.Collections.Specialized.StringDictionary | java.util.Properties | Collection of key-value string pairs. (1)
Interoperability Example
.Net Class:
using GigaSpaces.Core.Metadata;

namespace MyCompany.MyProject.Entities
{
    [SpaceClass(AliasName = "com.mycompany.myproject.entities.Person")]
    public class Person
    {
        private string _name;

        [SpaceProperty(AliasName = "name")]
        public string Name
        {
            get { return this._name; }
            set { this._name = value; }
        }
    }
}

The matching Java class:
package com.mycompany.myproject.entities;

public class Person
{
    private String name;

    public String getName()
    {
        return this.name;
    }

    public void setName(String name)
    {
        this.name = name;
    }
}
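To illustrate this at the API level, here is a minimal Java sketch that reads a Person written by a .NET (or C++) client; the space name "mySpace" and the lookup URL are assumptions for this example.

import org.openspaces.core.GigaSpace;
import org.openspaces.core.GigaSpaceConfigurer;
import org.openspaces.core.space.UrlSpaceConfigurer;
import com.mycompany.myproject.entities.Person;

public class InteropReader
{
    public static void main(String[] args)
    {
        // Connect to a remote space named "mySpace" (name and lookup URL are assumptions)
        GigaSpace gigaSpace = new GigaSpaceConfigurer(
                new UrlSpaceConfigurer("jini://*/*/mySpace").space()).gigaSpace();

        // Template matching: read a Person named "John", regardless of whether it was
        // written by a .NET, C++ or Java client
        Person template = new Person();
        template.setName("John");
        Person person = gigaSpace.read(template);
        System.out.println(person != null ? "Read: " + person.getName() : "No matching Person");
    }
}

Because the AliasName decorations map the .Net class and property names to their Java counterparts, no additional conversion code is required on either side.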

Learn more:
http://www.gigaspaces.com/wiki/display/XAP8/Platform+Interoperability+in+GigaSpaces


http://www.gigaspaces.com/wiki/display/XAP8/Interoperability

Monitoring / Management
10.1 Built-in JMX support
Yes
10.2 Application-level JMX support
Yes
GigaSpaces Administration and Management API
The GigaSpaces Administration and Management API facilitates administration and monitoring of all
GigaSpaces services and components. The API provides information about, and the ability to operate on,
the currently running GigaSpaces Agents, GigaSpaces Managers, GigaSpaces Containers, Lookup
Services, Processing Units, and Spaces (IMDG instances).
Not only can you monitor the running system, but you can also take actions based on events triggered
by the runtime environment. The API lets you start GigaSpaces containers, deploy an application or
data grid, scale deployed applications and data grids, move a running application or data grid between
containers, and query the system for the status of its internal runtime components. You can extract
statistics about every component's behavior or be notified about any change within the runtime
environment.
This API enables a significant reduction of downtime costs when the downtime is a result of human
error, such as misconfiguration or an incorrect deployment procedure (starting the application before
the database), and more.
The API enables creating custom administration business logic that listens for garbage collection
spikes and, when one occurs, performs the following (a sketch follows the list):
 Auto-troubleshooting – Get the relevant machine information where the spike occurred, increase
the log level of that machine (for a limited duration), take a snapshot of both the system and
the space, and copy the log file data to a central location for further analysis.
 Take a corrective action – Add another node to reduce the load on the relevant machine or
other machines, so that the system continues working until the situation is corrected.
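The following Java sketch shows the flavor of such monitoring logic using the Admin API; the lookup group, the space name, and the idea of reacting to accumulated GC time are assumptions made for this example.

import org.openspaces.admin.Admin;
import org.openspaces.admin.AdminFactory;
import org.openspaces.admin.gsc.GridServiceContainer;
import org.openspaces.admin.space.Space;

public class GridMonitor
{
    public static void main(String[] args)
    {
        // Discover the currently running containers and spaces (lookup group is an assumption)
        Admin admin = new AdminFactory().addGroup("my-group").createAdmin();
        try
        {
            admin.getGridServiceContainers().waitFor(1);
            for (GridServiceContainer gsc : admin.getGridServiceContainers())
            {
                long gcTimeMs = gsc.getVirtualMachine().getStatistics().getGcCollectionTime();
                System.out.println("GSC on " + gsc.getMachine().getHostAddress()
                        + ", accumulated GC time (ms): " + gcTimeMs);
                // A real monitor would compare gcTimeMs deltas against a threshold and,
                // on a spike, collect logs and/or start another container on a free machine.
            }
            Space space = admin.getSpaces().waitFor("mySpace");
            System.out.println("Space 'mySpace' partitions: " + space.getNumberOfInstances()
                    + ", backups per partition: " + space.getNumberOfBackups());
        }
        finally
        {
            admin.close();
        }
    }
}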
References
http://natishalom.typepad.com/nati_shaloms_blog/2009/09/the-interactive-cloud-part-i-general-concept.html
http://www.gigaspaces.com/wiki/display/XAP8/Administration+and+Monitoring+API

GigaSpaces JMX API
You can also use the JMX API to manage the relevant settings it exposes.
http://www.gigaspaces.com/wiki/display/XAP8/Space+JMX+Management

Rich User Interface
The GigaSpaces XAP rich user interface enables you to easily manage every aspect of the system.
http://www.gigaspaces.com/wiki/display/XAP8/GigaSpaces+Management+Center

Web-Based Admin Console

http://www.gigaspaces.com/wiki/display/XAP8/The+XAP+Web+Based+Dashboard


Figure 25 – Web-Based Admin Console
GigaSpaces Command Line Interface
Use the command-line interface to manage the entire system.

http://www.gigaspaces.com/wiki/display/XAP8/Command+Line+Interface

Database Integration
11.1 Out-of-the-box Oracle DB and other RDBMS data synchronization
Yes
GigaSpaces XAP supports write-through, write-behind, read-through, and read-ahead (cache warm-
up) with an Oracle database. One of the unique features that GigaSpaces provides is the mirror
service, which enables reliable asynchronous replication of data from the front-end IMDG to an Oracle
database (or any other database).
The user can control the batching behavior (rate and batch size) of the asynchronous persistency.
GigaSpaces ensures that, in case of a failure of any of the involved components (cache node or
database), data that has not yet been updated in the database is updated by an alternate node or
when the database eventually comes back up.
GigaSpaces XAP enables pre-loading of data from an Oracle database (or any other RDBMS). This is
provided via an interface that you can implement. A default implementation that uses Hibernate to
load the data from a relational database is shipped with the product. The default implementation
loads the data with multiple threads in parallel to achieve very fast loading times, and it supports
Hibernate's stateless session to further optimize the loading process.
You can use out-of-the-box persistency with an Oracle database (or any other RDBMS), use your
existing Hibernate ORM mapping, or provide a completely customized persistency implementation. The
product can persist all its data to an Oracle database (or any other RDBMS) and load data from the
database on demand (lazy load) or in a predefined manner (a configuration sketch follows).
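As a hedged sketch, the default Hibernate-based external data source can be wired programmatically roughly as follows; in practice it is usually configured in the processing unit's Spring configuration, and the session factory setup, entity name, and setter names should be verified against the documentation linked below.

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;
import org.openspaces.persistency.hibernate.DefaultHibernateExternalDataSource;

public class EdsSetup
{
    public static DefaultHibernateExternalDataSource createDataSource()
    {
        // Hibernate session factory built from hibernate.cfg.xml (assumption)
        SessionFactory sessionFactory = new Configuration().configure().buildSessionFactory();

        DefaultHibernateExternalDataSource eds = new DefaultHibernateExternalDataSource();
        eds.setSessionFactory(sessionFactory);
        // Pre-load only the entities we care about, using several parallel threads
        eds.setInitialLoadEntries("com.mycompany.myproject.entities.Person");
        eds.setInitialLoadThreadPoolSize(4);
        return eds;
    }
}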
Learn more:
http://www.gigaspaces.com/wiki/display/XAP8/Persistency

http://www.gigaspaces.com/wiki/display/XAP8/External+Data+Source

http://www.gigaspaces.com/wiki/display/XAP8/External+Data+Source+API

The mirror service enables persisting data asynchronously and reliably, while maintaining the transaction
boundaries of the executed operations. This activity is performed through an external process rather
than an internal thread, so the system can delegate heavy-duty persistency and mapping
activities to a dedicated server if necessary. This enables the system to offload any persistency-
related activities to a different machine, preferably the database machine. Co-locating the mirror with
the database provides optimized database access, because all persistency-related activities are done in
batches, utilizing highly tuned database bulk operations. Un-persisted data is stored in memory or on
disk, on both the primary and backup instances of the cache.
Learn more:
http://www.gigaspaces.com/wiki/display/XAP8/GigaSpaces+for+Hibernate+ORM+Users
http://www.gigaspaces.com/wiki/display/XAP8/Asynchronous+Persistency+with+the+Mirror
http://www.gigaspaces.com/wiki/display/XAP8/Async+Persistency+-+Mirror+-+Advanced
The External Data Source API: Data-Load API, Read-Through API, Write-Through API
The External Data Source API provides three major functionalities, each defined by a specific interface
(a minimal skeleton is sketched after the list):
 Data Retrieval – determines how data is retrieved from the data source when it is needed by the IMDG.
 Data Persistency – determines how data is updated in the data source when it is changed in the IMDG.
 Initialization and Shutdown – determines:
 How the data source is configured – database name, location.
 Which data is loaded into the IMDG at startup.
 How to close the data source.
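A minimal skeleton of a custom implementation, assuming the com.gigaspaces.datasource interfaces of this product generation (ManagedDataSource covers initialization, initial load and shutdown; DataProvider and DataPersister would be implemented on the same class for data retrieval and data persistency). All JDBC details are placeholders.

import java.util.Properties;
import com.gigaspaces.datasource.DataIterator;
import com.gigaspaces.datasource.ManagedDataSource;
import com.mycompany.myproject.entities.Person;

public class PersonDataSource implements ManagedDataSource<Person>
{
    public void init(Properties properties)
    {
        // How the data source is configured: database name, location, credentials
        // e.g. open a JDBC connection pool using properties.getProperty("jdbc.url") -- placeholder
    }

    public DataIterator<Person> initialLoad()
    {
        // Which data is loaded into the IMDG at startup: return an iterator over
        // the Person rows that should be pre-loaded (placeholder loads nothing)
        return null;
    }

    public void shutdown()
    {
        // How to close the data source: release the connection pool
    }
}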
References
http://www.gigaspaces.com/wiki/display/XAP8/External+Data+Source+API
http://www.gigaspaces.com/wiki/display/XAP8/How+to+Customize+Initial+Load

Write-Behind API
Enables users to persist data in a reliable, asynchronous manner, offloading the persistency and
mapping business logic to a remote server.
References
http://www.gigaspaces.com/wiki/display/XAP8/Asynchronous+Persistency+with+the+Mirror
http://www.gigaspaces.com/wiki/display/XAP8/Async+Persistency+-+Mirror+-
+Advanced#AsyncPersistency-Mirror-Advanced-ImplementingaCustomMirrorDataSource

Rapid Data Load
The GigaSpaces IMDG includes a sophisticated rapid data load module that allows loading massive
amounts of data into the cache:

Figure 26 – Parallel Data Load API
About GigaSpaces
GigaSpaces Technologies is the pioneer of a new generation of application virtualization platforms and a leading provider of
end-to-end scaling solutions for distributed, mission-critical application environments, and cloud enabling technologies.
GigaSpaces is the only platform on the market that offers truly silo-free architecture, along with operational agility and
openness, delivering enhanced efficiency, extreme performance and always-on availability. The GigaSpaces solutions are
designed from the ground up to run on any cloud environment -- private, public, or hybrid -- and offer a pain-free, evolutionary
path to meet tomorrow's IT challenges.
Hundreds of organizations worldwide use GigaSpaces' technology to enhance IT efficiency and performance, among which
are Fortune Global 500 companies, including top financial service enterprises, e-commerce companies, online gaming
providers, and telecom carriers.