Compact off-heap structures in the Java language

Dec 2, 2013

© 2013 IBM Corporation
Compact off-heap structures
in the Java language
IBM Software Group: Java Technology Centre

Kavitha Varadarajan – Java Developer (vkavitha@in.ibm.com)
Prashanth S Krishna – ORB Developer (prashanth.krishna@in.ibm.com)
May 9th, 2013
Important Disclaimers
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES
ONLY.
WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION
CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED.
ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED
ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR
INFRASTRUCTURE DIFFERENCES.
ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.
IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT
PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.
IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF
THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
- CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR
THEIR SUPPLIERS AND/OR LICENSORS
Compact off-heap structures in the Java language
Challenge: In Java, the layout of objects is completely abstracted away from the application
code, which makes it inherently difficult for Java to interoperate with non-Java
sources of data, such as relational databases.

Native data structures must typically be copied into the Java heap and marshalled/boxed
into Java data structures before they can be manipulated by Java code.

Solution: With Packed Objects, we propose a new object model that provides language
support and enables:
1) Denser representation of data structures
2) Explicit specification of field layout
Agenda

Java to Native interaction today

Java to Native made easier!

Introduction to Packed Objects

Implementation today/ Future directions

Coding example

Serialization Example

Performance benefits – Sneak peek

Questions/suggestions
Java to Native interaction today – Towards a simpler model.....

There are several strategies today for accessing native memory in Java

Each has its own challenges:

JNI requires a Java programmer to write in C/C++

NIO direct ByteBuffer requires knowing precise offsets

sun.misc.Unsafe is undocumented, and requires privileged access

Each has its own programming model … and none of them are “regular” Java

Why struggle with …

JNICALL jshort getPort(JNIEnv* env, jclass clazz, jlong addr)

or

byteBuffer.getShort(DEST_OFFSET + PORT_OFFSET)

or

unsafe.getShort(addr + DEST_OFFSET + PORT_OFFSET)


instead of?

addr.port
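To make the offset-based style concrete, here is a minimal sketch of the ByteBuffer variant in standard Java. The field sizes (4-byte address, 2-byte port) and all constant and class names are assumptions chosen for illustration; the slides do not define the actual layout.

```java
import java.nio.ByteBuffer;

// Hypothetical layout for the packet_header used in these slides:
// source { 4-byte addr, 2-byte port } followed by destination { 4-byte addr, 2-byte port }.
final class PacketOffsets {
    static final int ADDR_OFFSET = 0;
    static final int PORT_OFFSET = 4;
    static final int ADDRESS_SIZE = 6;
    static final int SOURCE_OFFSET = 0;
    static final int DEST_OFFSET = SOURCE_OFFSET + ADDRESS_SIZE;
    static final int HEADER_SIZE = 2 * ADDRESS_SIZE;

    // The caller must know every offset; the compiler cannot check the layout.
    static short getDestPort(ByteBuffer header) {
        return header.getShort(DEST_OFFSET + PORT_OFFSET);
    }

    static void putDestPort(ByteBuffer header, short port) {
        header.putShort(DEST_OFFSET + PORT_OFFSET, port);
    }

    public static void main(String[] args) {
        // A direct buffer lives outside the Java heap.
        ByteBuffer header = ByteBuffer.allocateDirect(HEADER_SIZE);
        putDestPort(header, (short) 8080);
        System.out.println(getDestPort(header));
    }
}
```

A wrong constant here still compiles and silently reads the wrong bytes, which is the problem a field access such as addr.port is meant to remove.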
Speak to me in 'Java', I don't speak 'Native'

Java only speaks ‘Java’…

Data typically must be copied/(de)serialized/marshalled onto/off Java heap

Costly in path-length and footprint

A sample network packet copy – with today's Java

Java only speaks ‘Java’…

Data typically must be copied / (de)serialized / marshalled into / out of Java heap

Costly in path-length and footprint
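The copy described above can be sketched in standard Java as follows. This is an illustration under assumptions: the class names HeapAddress and HeapPacketHeader and the 12-byte layout (4-byte address plus 2-byte port, twice) are hypothetical, and a direct ByteBuffer stands in for the native data.

```java
import java.nio.ByteBuffer;

// Hypothetical heap-side POJOs mirroring the packet_header structure.
final class HeapAddress {
    final byte[] addr = new byte[4]; // a separate heap object with its own header
    short port;
}

final class HeapPacketHeader {
    final HeapAddress source = new HeapAddress();
    final HeapAddress destination = new HeapAddress();

    // Copies every field out of the native buffer into freshly allocated heap objects.
    static HeapPacketHeader unmarshal(ByteBuffer nativeData) {
        ByteBuffer in = nativeData.duplicate(); // same bytes, independent position
        in.position(0);
        HeapPacketHeader header = new HeapPacketHeader();
        in.get(header.source.addr);
        header.source.port = in.getShort();
        in.get(header.destination.addr);
        header.destination.port = in.getShort();
        return header;
    }

    public static void main(String[] args) {
        ByteBuffer nativeData = ByteBuffer.allocateDirect(12);
        nativeData.put(new byte[] {10, 0, 0, 1}).putShort((short) 1234);
        nativeData.put(new byte[] {10, 0, 0, 2}).putShort((short) 80);
        HeapPacketHeader header = unmarshal(nativeData);
        System.out.println(header.source.port + " -> " + header.destination.port);
    }
}
```

One 12-byte native record becomes five heap objects (the header, two addresses and two byte arrays), and every field is copied once on the way in and again on the way out.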
[Diagram: the native packet_header (source and destination, each holding an address and a port) is copied into the JVM as a graph of Java objects, each with its own metadata header.]
Object Locality and Cache Performance - Today

Objects in Java may not be laid out in memory in an optimal order for access

Some GC policies will try to improve the layout

Even when objects are close together, the headers are in the way

Ideally, the object graph would look like this:

But it may look more like this:

This is beyond the programmer's control
[Diagrams: two memory layouts of the packet_header object graph (packet_header, source, destination, src_addr, dest_addr, each preceded by its metadata header), first the ideal layout and then a more typical one.]
Modern Caching Architectures – Can Java take advantage?
** reference: http://www.multicoreinfo.com/prefetching-multicore-processors/
With Packed Objects – On Heap

Allows controlled layout of storage of data structures on the Java heap

Reduces footprint of data on Java heap

No (de)serialization required
[Diagram: the native packet_header is copied into the JVM as a single packed object: one metadata header followed by the source (src_addr, src_port) and destination (dest_addr, dest_port) fields.]
With Packed Objects – Off Heap

Enable Java to talk directly to the native data structure

Avoid overhead of data copy into / out of Java heap

No (de)serialization required
[Diagram: a JVM-side packet_header object holding only metadata refers directly to the native packet_header data (source: src_addr, src_port; destination: dest_addr, dest_port), without the copy steps.]
Object Locality and Cache Performance with Packed Objects

The layout of a packed object is precisely specified by the programmer

This is necessary when modelling non-Java data structures …


but can also be used to optimize Java-only data types

This allows OO paradigms to be used while reducing the costs:

Less “pointer chasing” to find the data

Less footprint overhead due to object headers

Increases the benefits from cache pre-fetching
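The locality argument can be approximated in standard Java today by flattening the data by hand. The sketch below is not the PackedObject API; FlatPoints is a hypothetical class that keeps the x and y values of all points back to back in one int[], so there is a single array header and no per-point reference to follow.

```java
// Illustration only: approximates a packed array of points in standard Java
// by storing coordinates contiguously in one primitive array.
final class FlatPoints {
    private final int[] data; // layout: x0, y0, x1, y1, ...

    FlatPoints(int count) {
        data = new int[2 * count]; // all coordinates start at 0
    }

    int size() { return data.length / 2; }
    int getX(int i) { return data[2 * i]; }
    int getY(int i) { return data[2 * i + 1]; }

    void set(int i, int x, int y) {
        data[2 * i] = x;
        data[2 * i + 1] = y;
    }

    // Sequential scan over contiguous memory: no reference is followed per element.
    long sumX() {
        long sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        FlatPoints points = new FlatPoints(3);
        points.set(0, 1, 10);
        points.set(1, 2, 20);
        points.set(2, 3, 30);
        System.out.println(points.sumX());
    }
}
```

The cost of doing this by hand is that the point stops being a type: index arithmetic replaces field access, which is the trade-off packed objects aim to avoid.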
[Diagram: packet_header as one packed object: a single metadata header followed by src_addr, src_port, dest_addr and dest_port.]
PackedObject Delivery and Intended Use
PackedObject is an experimental feature in the IBM J9 Virtual Machine.
Goal(s) of Feature:

Improve serialization and I/O of Java objects

Allow direct access to “native” (off-heap) data

Allow for explicit source-level representation of compact data-structures
Intended Use:

Provide an opportunity for feedback and experimentation

Not meant for production support

Not a committed language change
PackedObjects for IBM's Java
Present: features of today's Java work well in certain scenarios, poorly in others...

Object overhead: object metadata (aka headers) and references introduced by object-oriented paradigms

No direct access to off-heap data: Java Native Interface or Direct Byte Buffers required to read / write off-heap data

Redundant data copying: copying between on-heap and off-heap data is often required

Sub-optimal memory layout: related objects may not be adjacent in memory, slowing down access

...changing how Java data is represented and native data is accessed introduces new efficiencies into the Java language

Reduced headers & references

Direct access to off-heap data

Elimination of data copies

In-lined data allows for optimal caching and pre-fetching
PackedObjects

A new type in the Java language: PackedObject

Is not derived from Object, and hence disallows assignment and casting

BoxedPackedObject wrapper class acts as a bridge

A “regular” Object used to hold a reference to a PackedObject

[Class hierarchy diagram: Object (Number, Integer, Float, String, Object[], BoxedPackedObject) shown alongside a separate PackedObject root (DBRecord, PackedByte[], OSDataStruct).]
Using Packed Objects

Current solution is based on Java annotations

No changes to the Java compiler (javac) at this time

Unfortunately, this adds a lot of complexity to the programming model

There are many important restrictions and considerations

Please see the Java 8 User Guide for details

Illustrative code samples are available

New APIs for packed objects are available, including:

Creating and querying

Reflective access to fields

Helpers for packed arrays

New JNI extension available to integrate with packed objects
PackedObject Technology Status

Available for evaluation by early access customers as part of the IBM Java 8 Beta 2

Currently supported platforms:

Linux x86 32-bit

z/OS 31-bit

Standard GC policies only (i.e. optthruput, optavgpause, gencon)

This is an evolving feature:

Not yet supported for production use

Details will change in future releases

Your feedback can affect the future direction!
Future Work

Exploring language changes to simplify the programming model

Replacing annotations with new syntax (e.g. keywords, operators)

Will make it easier to write correct code and to highlight problems

Merging the PackedObject and Object hierarchies

Will allow the use of existing API (e.g. collections) without BoxedPackedObject

Will allow PackedObject subclasses to implement interfaces

Providing a more “Java-like” experience

Expanded platform support

64-bit and compressedrefs

POWER / PPC
Code Snippets
A Simple Packed Class: Point
package com.ibm.packedexample.concepts;

/* This class represents a point in a 2 dimensional space. */
@Packed
public final class Point extends PackedObject {
    public int x;
    public int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int hashCode() {
        return x ^ y;
    }

    public boolean equals(PackedObject obj) {
        if (!(obj instanceof Point)) return false;
        Point point = (Point) obj;
        return (x == point.x) && (y == point.y);
    }
}

Annotation marks this as a packed class.
Class must extend PackedObject and be final.
Most code is unchanged from “regular” Java.
Using a Nested Field: Box
package com.ibm.packedexample.concepts;

/* This class represents a simplified 2 dimensional box */
@ImportPacked({
    "com/ibm/packedexample/concepts/Point"
})
@Packed
public final class Box extends PackedObject {
    public Point origin;
    public Point extent;

    public Box(Point origin, Point extent) {
        this.origin.copyFrom(origin);
        this.extent.copyFrom(extent);
    }

    public Point getCorner() {
        return new Point(origin.x + extent.x, origin.y + extent.y);
    }
}

Need to explicitly import all packed classes used.
Nested fields aren't references, so assignment by reference is not allowed. Instead, use copyFrom() to transfer the data.
Accessing a nested field works as expected.
Using a Packed Array: Polygon
package com.ibm.packedexample.concepts;

/* This class represents a polygon of 2D points */
@ImportPacked({
    "com/ibm/packedexample/concepts/Point"
})
public class Polygon {
    public Point[] vertices;

    public Polygon(int vertexCount) {
        vertices = new Point[vertexCount];
    }

    public void setVertex(int index, Point point) {
        vertices[index].copyFrom(point);
    }

    public Point getVertex(int index) {
        return vertices[index];
    }
}

Need to explicitly import all packed classes used, even in classes that are not @Packed.
new Point[vertexCount] allocates data for vertexCount Point objects and initializes it all to 0.
Packed array elements aren't references, so assignment by reference is not allowed. Instead, use copyFrom() to transfer the data.
Accessing a packed array works as expected.
Using a Nested Packed Array: the Packed Header Example
package com.ibm.packedexample.concepts;

/* The imaginary packet_header example used in earlier slides */
@ImportPacked({
    "com/ibm/packedexample/concepts/Address"
})
@Packed
public final class PackedHeader extends PackedObject {
    public Address source;
    public Address destination;
}

@Packed
public final class Address extends PackedObject {
    @Length(4)
    public PackedByte[] addr;
    public short port;
}

Need to specify a constant length for nested packed arrays.
Only packed types can be nested, so we can't use byte[]. Special packed types are provided for this.
Serialization using Class libraries
Java serialization:

import java.io.File;
import java.io.FileOutputStream;
import java.io.ObjectOutputStream;

@ImportPacked({
    "com/ibm/packedexample/concepts/Point"
})
public class Serializer {
    public static void main(String[] args) throws Exception {
        // On-heap packed object
        Point point1 = new Point(100, 200);
        ObjectOutputStream oos = new ObjectOutputStream(
                new FileOutputStream(new File("ser.txt")));
        oos.writeObject(point1);
        oos.close(); // flush the stream and release the file
    }
}

Off-heap packed object: to allocate the Point in native memory instead, create it with

    Point point1 = PackedObject.newNativePackedObject(Point.class);

Note: Freeing the native memory allocated this way is the programmer's responsibility.
Deserialization using Class libraries
Java deserialization:

import java.io.File;
import java.io.FileInputStream;
import java.io.ObjectInputStream;

@ImportPacked({
    "com/ibm/packedexample/concepts/Point"
})
public class Deserializer {
    public static void main(String[] args) throws Exception {
        ObjectInputStream ois = new ObjectInputStream(
                new FileInputStream(new File("ser.txt")));
        // The packed object is deserialized on the heap
        Point point1 = (Point) ois.readObject();
        ois.close();
        System.out.println("Point, after deserialization " + point1.toString());
    }
}

Simple benchmark test: echoing POJOs vs. their Packed Object counterparts, using RMI-JRMP and RMI-IIOP.
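For readers without access to the beta, the same class-library round trip can be tried on any standard JVM with an ordinary POJO. The sketch below is only a stand-in: PlainPoint and RoundTrip are hypothetical names, the object is a regular Serializable class rather than a packed one, and an in-memory byte array replaces the ser.txt file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Ordinary (non-packed) stand-in for the Point class; hypothetical name.
final class PlainPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    PlainPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class RoundTrip {
    // Writes the object with standard Java serialization and returns the bytes.
    static byte[] serialize(PlainPoint point) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(point);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds a new heap object from the serialized bytes.
    static PlainPoint deserialize(byte[] data) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (PlainPoint) ois.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PlainPoint copy = deserialize(serialize(new PlainPoint(100, 200)));
        System.out.println(copy.x + ", " + copy.y);
    }
}
```

This gives a baseline for the conventional path, in which the stream carries class metadata along with the field values and a fresh heap object is built on the way back in.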
PackedObjects Performance

Trade Data Benchmark**:

Manages a large array (50 MB) of trade objects

Evaluate PackedObjects vs. conventional Java

Footprint is 20% better with PackedObjects

Benchmark elapsed time is 95% better with PackedObjects
** see http://mechanical-sympathy.blogspot.ca/2012/10/compact-off-heap-structurestuples-in.html
[Chart: Trade Data, PackedObjects vs Traditional Java. Bars show footprint and elapsed time as relative improvement, normalized to traditional Java.]
Summary
Data field allocation and storage
Current Java: Fields are either primitives or references to other objects. Non-primitive data types must be allocated as separate objects.
PackedObject: When allocating a packed object, all corresponding data fields get allocated simultaneously and packed into a single contiguous object.

Child objects
Current Java: Each child object has its own object header; the parent object has a reference to each of them.
PackedObject: Child objects are contained within the packed object, and do not have their own headers. No references necessary.

Arrays
Current Java: Each element in an object array is a reference to an object with its own header. The element objects may not be contiguous in memory.
PackedObject: Array elements are packed contiguously with a single header. No references necessary.

Off-heap
Current Java: Data outside of the Java heap cannot be directly accessed. Data must be copied in and out of a Java version, or accessed indirectly (e.g. JNI, NIO, Unsafe).
PackedObject: A packed object can be used the same way whether the data is on- or off-heap. The JVM transparently handles the access.
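The last row, the same access code regardless of where the data lives, can be imitated in standard Java with a flyweight over a ByteBuffer. This is a sketch under assumptions rather than the PackedObject mechanism: AddressView is a hypothetical class, and the 6-byte layout (4-byte address, then 2-byte port) is chosen for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical flyweight: one accessor class, with the bytes either on or off the Java heap.
final class AddressView {
    static final int ADDR_LENGTH = 4;
    static final int PORT_OFFSET = 4;
    static final int SIZE = 6; // assumed layout: 4-byte addr + 2-byte port

    private final ByteBuffer buffer;
    private final int base;

    AddressView(ByteBuffer buffer, int base) {
        if (base < 0 || base + SIZE > buffer.capacity()) {
            throw new IndexOutOfBoundsException("address does not fit at offset " + base);
        }
        this.buffer = buffer;
        this.base = base;
    }

    short port() {
        return buffer.getShort(base + PORT_OFFSET);
    }

    void port(short value) {
        buffer.putShort(base + PORT_OFFSET, value);
    }

    byte addr(int index) {
        checkIndex(index);
        return buffer.get(base + index);
    }

    void addr(int index, byte value) {
        checkIndex(index);
        buffer.put(base + index, value);
    }

    private static void checkIndex(int index) {
        if (index < 0 || index >= ADDR_LENGTH) {
            throw new IndexOutOfBoundsException("addr index " + index);
        }
    }

    public static void main(String[] args) {
        AddressView onHeap = new AddressView(ByteBuffer.allocate(SIZE), 0);
        AddressView offHeap = new AddressView(ByteBuffer.allocateDirect(SIZE), 0);
        onHeap.port((short) 80);
        offHeap.port((short) 80);
        // Same accessors, different backing storage.
        System.out.println(onHeap.port() == offHeap.port());
    }
}
```

Unlike a packed class, the layout here lives in hand-written constants and the view is an extra object per record, so this shows the intent of the row rather than its implementation.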
Copyright and Trademarks
© IBM Corporation 2013. All Rights Reserved.
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International
Business Machines Corp., and registered in many jurisdictions worldwide.
Other product and service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web – see the IBM “Copyright and
trademark information” page at URL:
www.ibm.com/legal/copytrade.shtml
Reference

Data Prefetching in the Era of Multicore Processors -
http://www.multicoreinfo.com/prefetching-multicore-processors/

IBM Java 8 Beta community https://ibm.biz/BdxPpH

Collaborate via the community

Refer to the community for detailed API documentation

More packed object examples can be found here:
https://www.ibm.com/developerworks/community/forums/html/topic?id=c577ef63-62aa-496a-a586-9703a762f306