Serialization

currygeckoSoftware and s/w Development

Dec 2, 2013 (3 years and 11 months ago)

Spring/2002


Distributed Software Engineering

C:\unocourses\4350\slides\DefiningThreads

Serialization

Flatten your object for automated storage or network transfer


Software object persistence


Persistence: saving information about an object so that it can be recreated at a different time, in a different place, or both.

Object serialization: a means of implementing persistence. It converts an object’s state into a byte stream that can be used later to reconstruct (deserialize) a virtually identical copy of the original object.

Default serialization for an object writes:

the class of the object,

the class signature,

the values of all non-transient and non-static fields.
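As a rough illustration of the definition above, the following sketch flattens an object into an in-memory byte stream and rebuilds it. The class names Point and RoundTripDemo and the helper roundTrip are hypothetical, introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, used only for this illustration.
class Point implements Serializable {
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class RoundTripDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point original = new Point(3, 4);
        Point copy = (Point) roundTrip(original);
        System.out.println(copy.x + "," + copy.y); // the state survives the trip
        System.out.println(copy == original);      // but the copy is a new object
    }
}
```

The copy carries the same field values as the original yet is a distinct object, which is what "virtually identical copy" means here.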


Serialization protocol


For serialization: java.io.ObjectOutputStream, via writeObject, which in turn relies on defaultWriteObject.

For deserialization: java.io.ObjectInputStream, via readObject, which in turn relies on defaultReadObject.

Every object instance that belongs to the graph of the object being serialized must be serializable as well.

The superclass should be Serializable too. If it is not, it needs an accessible parameterless constructor, and its fields are not written by default serialization.
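The object-graph rule can be observed with a small experiment. In this sketch the classes Engine, Car and GraphDemo are hypothetical; Engine deliberately does not implement Serializable.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class that is NOT Serializable.
class Engine {
    int horsepower = 150;
}

// Hypothetical serializable class whose graph contains an Engine.
class Car implements Serializable {
    private static final long serialVersionUID = 1L;
    String model = "demo";
    Engine engine = new Engine(); // reachable from Car, so part of its graph
}

class GraphDemo {
    // Returns null if root serializes cleanly, otherwise the message of the
    // NotSerializableException (which carries the offending class name).
    static String firstNonSerializable(Object root) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(root);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNonSerializable(new Car()));    // names the Engine class
        System.out.println(firstNonSerializable("plain text")); // Strings serialize fine
    }
}
```

Marking the engine field transient, or making Engine serializable, would let the Car be written.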


Serialization protocol


Customize the default: implement extended versions of the default methods in:

writeObject

readObject

But final fields cannot be assigned inside a custom readObject. They have to be restored by the default mechanism (defaultReadObject).

Create your own complete serialization by implementing the interface Externalizable.



Specifying persistent objects


The class of an object to be serialized must implement the interface:

java.io.Serializable

This interface is empty (a marker interface). It is used only to mark the objects of such a class as persistent.


Deserialization


Deserialization reads the values written during serialization.

Static fields in the class are left untouched.

If the class needs to be loaded, normal initialization of the class takes place, giving static fields their initial values.

Transient fields are initialized to default values.

Recreation of the object graph occurs in reverse order from its serialization.
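A minimal sketch of how static and transient fields behave on deserialization, assuming a hypothetical class Counter and demo class StaticDemo (neither appears in the original slides):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, used only for this illustration.
class Counter implements Serializable {
    private static final long serialVersionUID = 1L;
    static int instances = 0;   // static: not part of any object's state
    int value;                  // ordinary field: written and restored
    transient int scratch = 99; // transient: skipped when writing

    Counter(int value) {
        this.value = value;
        instances++;
    }
}

class StaticDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Counter original = new Counter(7);
        Counter.instances = 42; // change the static after construction
        Counter copy = (Counter) roundTrip(original);
        System.out.println(copy.value);        // restored from the stream
        System.out.println(copy.scratch);      // default value, not the initializer's 99
        System.out.println(Counter.instances); // still 42: neither rewritten nor incremented
    }
}
```

The static keeps whatever value the running JVM currently holds, and the transient field comes back as zero because neither the constructor nor the field initializer of a Serializable class runs during deserialization.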




Example

import java.io.Serializable;
import java.util.Calendar;
import java.util.Date;

public class PersistentTime implements Serializable {

    // The moment at which this object was constructed.
    private Date time;

    public PersistentTime() {
        time = Calendar.getInstance().getTime();
    }

    public Date getTime() {
        return time;
    }
}


Class java.io.ObjectOutputStream


An ObjectOutputStream instance writes primitive data types and graphs of Java objects to an OutputStream. The objects can be read (reconstituted) using an ObjectInputStream. Persistent storage of objects can be accomplished by using a file for the stream. If the stream is a network socket stream, the objects can be reconstituted on another host or in another process.

Only objects that support the java.io.Serializable interface can be written to streams. The class of each serializable object is encoded including the class name and signature of the class, the values of the object's fields and arrays, and the closure of any other objects referenced from the initial objects.


Class java.io.ObjectOutputStream


The method writeObject is used to write an object to the stream. Any object, including Strings and arrays, is written with writeObject. Multiple objects or primitives can be written to the stream. The objects must be read back from the corresponding ObjectInputStream with the same types and in the same order as they were written.

Primitive data types can also be written to the stream using the appropriate methods from DataOutput. Strings can also be written using the writeUTF method.
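The ordering rule can be sketched as follows: a primitive, a UTF string, an array and a String are written to one stream and then read back with the matching calls in the same order. The class name MixedStreamDemo and the values are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

class MixedStreamDemo {
    static String writeThenRead() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeInt(2002);                // DataOutput method for a primitive
                out.writeUTF("Spring");            // string written with writeUTF
                out.writeObject(new int[] {1, 2}); // arrays are objects
                out.writeObject("term");           // so are Strings
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                // Read with the same types, in the same order as written.
                int year = in.readInt();
                String season = in.readUTF();
                int[] numbers = (int[]) in.readObject();
                String word = (String) in.readObject();
                return season + "/" + year + " " + numbers.length + " " + word;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead()); // prints "Spring/2002 2 term"
    }
}
```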


Example

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class FlattenTime {

    public static void main(String[] args) {
        // Default file name; the first command-line argument overrides it.
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }

        PersistentTime time = new PersistentTime();

        // try-with-resources closes both streams even if writeObject fails.
        try (FileOutputStream fos = new FileOutputStream(filename);
             ObjectOutputStream out = new ObjectOutputStream(fos)) {
            out.writeObject(time);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}


import java.io.FileInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.util.Calendar;

public class InflateTime {

    public static void main(String[] args) {
        // Default file name; the first command-line argument overrides it.
        String filename = "time.ser";
        if (args.length > 0) {
            filename = args[0];
        }

        PersistentTime time = null;

        // try-with-resources closes both streams even if readObject fails.
        try (FileInputStream fis = new FileInputStream(filename);
             ObjectInputStream in = new ObjectInputStream(fis)) {
            time = (PersistentTime) in.readObject();
        } catch (IOException | ClassNotFoundException ex) {
            ex.printStackTrace();
            return; // nothing was read, so there is nothing to print
        }

        System.out.println("Flattened time: " + time.getTime());
        System.out.println("Current time: " + Calendar.getInstance().getTime());
    }
}


Serializable vs. Non-Serializable objects

java.lang.Object does not implement Serializable, so you must decide which of your classes need to implement it.

AWT and Swing components, strings, and arrays are defined as serializable.

Certain classes and their subclasses are not serializable: Thread, OutputStream, Socket.

When a serializable class contains instance variables that are not, or should not be, serializable, they should be marked with the keyword transient.


Transient fields


These fields will not be serialized.

When deserialized, these fields will be initialized to default values:

Null for object references

Zero for numeric primitives

False for boolean fields

If these values are unacceptable:

Provide a readObject() that invokes defaultReadObject() and then restores transient fields to their acceptable values.

Or, the fields can be initialized when used for the first time (lazy initialization).
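Both remedies can be sketched in one hypothetical class, Circle: the transient field area is rebuilt inside readObject, while the transient field label is filled in lazily on first use. The names Circle and TransientDemo are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class, used only for this illustration.
class Circle implements Serializable {
    private static final long serialVersionUID = 1L;
    private final double radius;
    private transient double area;  // derived value, restored in readObject
    private transient String label; // derived value, restored lazily

    Circle(double radius) {
        this.radius = radius;
        this.area = Math.PI * radius * radius;
    }

    double area() {
        return area;
    }

    String label() {
        if (label == null) { // null after deserialization: compute on first use
            label = "circle r=" + radius;
        }
        return label;
    }

    // Invoked by ObjectInputStream while this object is being rebuilt.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();           // restores radius
        area = Math.PI * radius * radius; // rebuild the transient field
    }
}

class TransientDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Circle copy = (Circle) roundTrip(new Circle(2.0));
        System.out.println(copy.area());  // recomputed, not zero
        System.out.println(copy.label()); // built on first access
    }
}
```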


Serial version UID


You should explicitly declare a serial version UID in every serializable class.

This eliminates the serial version UID as a potential source of incompatibility.

There is also a small performance benefit, as Java does not have to compute this number at run time.

private static final long serialVersionUID = rlv;

rlv can be any long value picked out of thin air. It does not have to be unique across the serializable classes in your development.

If you want to make a new version of the class incompatible with existing versions, choose a different UID. Deserialization of instances of the previous version will then fail with InvalidClassException.
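One way to see which UID the runtime associates with a class is java.io.ObjectStreamClass. In this sketch the classes Declared, Computed and UidDemo are hypothetical, and the value 20020101L is arbitrary.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical class with an explicit UID (the value is arbitrary).
class Declared implements Serializable {
    private static final long serialVersionUID = 20020101L;
    int payload;
}

// Hypothetical class without an explicit UID: the runtime derives one
// from the class's structure, so it can change when the class changes.
class Computed implements Serializable {
    int payload;
}

class UidDemo {
    // Look up the serial version UID the serialization runtime will use.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Declared.class)); // the declared constant
        System.out.println(uidOf(Computed.class)); // a computed value
    }
}
```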



Customizing ObjectOutputStream, ObjectInputStream

To provide special behavior when writing or reading an object's bytes, implement:

private void writeObject(ObjectOutputStream out) throws IOException;

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException;
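A minimal sketch of the two hooks, assuming a hypothetical serializable class Marker that holds a reference to a non-serializable class Coord. The default mechanism handles the ordinary field, and the hooks write and read the extra state by hand.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class that is NOT Serializable.
class Coord {
    final int x;
    final int y;

    Coord(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical class, used only for this illustration.
class Marker implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String label;
    private transient Coord position; // default serialization cannot write this

    Marker(String label, Coord position) {
        this.label = label;
        this.position = position;
    }

    String label() { return label; }
    Coord position() { return position; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // writes label
        out.writeInt(position.x); // then the extra state, by hand
        out.writeInt(position.y);
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();   // reads label
        int x = in.readInt();     // same order as written
        int y = in.readInt();
        position = new Coord(x, y);
    }
}

class MarkerDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Marker copy = (Marker) roundTrip(new Marker("home", new Coord(3, 4)));
        System.out.println(copy.label() + " " + copy.position().x + "," + copy.position().y);
    }
}
```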



Creating your own protocol: Externalizable

Instead of implementing the Serializable interface, implement Externalizable:

public interface Externalizable extends Serializable {

    void writeExternal(ObjectOutput out) throws IOException;

    void readExternal(ObjectInput in) throws IOException, ClassNotFoundException;

}
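A small sketch of an Externalizable class, assuming a hypothetical class Temperature. Note the public parameterless constructor: the runtime calls it first and then invokes readExternal to fill in the state.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class, used only for this illustration.
class Temperature implements Externalizable {
    private static final long serialVersionUID = 1L;
    private String station;
    private double celsius;

    // Required: deserialization creates the object through this constructor.
    public Temperature() {
    }

    Temperature(String station, double celsius) {
        this.station = station;
        this.celsius = celsius;
    }

    String station() { return station; }
    double celsius() { return celsius; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // The class alone decides what goes into the stream.
        out.writeUTF(station);
        out.writeDouble(celsius);
    }

    @Override
    public void readExternal(ObjectInput in)
            throws IOException, ClassNotFoundException {
        // Read back exactly what writeExternal wrote, in the same order.
        station = in.readUTF();
        celsius = in.readDouble();
    }
}

class ExternalDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Temperature copy = (Temperature) roundTrip(new Temperature("UNO", 21.5));
        System.out.println(copy.station() + " " + copy.celsius());
    }
}
```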





Performance


Serialization is a very expensive process. You need a clear reason to serialize instead of directly writing only what you need to save about the state of an object.


Default or customized serialization? Or: implementing Serializable judiciously

Allowing a class’s instances to be serializable can be as simple as adding the words “implements Serializable” to the class declaration.

This is a common misconception; the truth is far more complex.

While efficiency is one cost associated with it, there are other long-term costs that are much more substantial.

Using default serialization is very easy, but this ease is specious.


Serialization Costs


Your object’s private structure is out for viewing! It becomes part of the exported API.

A major cost is that it decreases the flexibility to change a class’s implementation once the class has been released.

It increases the likelihood of bugs and security holes.

It increases the testing burden associated with releasing a new version of the class.


Serialization caveats


Implementing Serializable is not a decision to be undertaken lightly.

Classes designed for inheritance should rarely implement Serializable, and interfaces should rarely extend it.

You should provide a parameterless constructor on non-serializable classes designed for inheritance, in case the class is subclassed and the subclass wants to provide serialization.

Inner classes should rarely, if ever, implement Serializable.

A static member class can be serializable.
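The advice about a parameterless constructor can be illustrated with a hypothetical non-serializable base class Shape and a serializable subclass Square. On deserialization the runtime runs Shape's parameterless constructor, so the inherited field is reset rather than restored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical base class designed for inheritance; NOT Serializable.
class Shape {
    String name;

    Shape() {                // accessible parameterless constructor
        this.name = "unnamed";
    }

    Shape(String name) {
        this.name = name;
    }
}

// Hypothetical subclass that chooses to be serializable.
class Square extends Shape implements Serializable {
    private static final long serialVersionUID = 1L;
    int side;

    Square(String name, int side) {
        super(name);
        this.side = side;
    }
}

class InheritDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Square copy = (Square) roundTrip(new Square("tile", 4));
        System.out.println(copy.side); // restored from the stream
        System.out.println(copy.name); // set by Shape(), not "tile"
    }
}
```

Without the parameterless constructor in Shape, deserializing a Square would fail with an InvalidClassException. A subclass that needs the inherited state preserved has to write and read it itself, for example in its own writeObject and readObject.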



Consider using a custom serialized form


The default serialized form of an object is an encoding of the physical representation of the object graph rooted at the object:

Data contained in the object.

Data contained in every object reachable from it.

Topology by which all of these objects are interlinked.

The ideal serialized form contains only the logical data represented by the object. It is independent of its physical representation.


Consider using a custom serialized form


Default serialization is likely to be appropriate if an object’s physical representation is identical to its logical content.

Appropriate: A Name class.

Not appropriate: A doubly linked List class.


Consider using a custom serialized form


Disadvantages of default serialization when physical and logical representation differ:

Permanently ties the exported API to the internal representation.

Can consume excessive space.

Can consume excessive time.

Can cause stack overflow.


Consider using a custom serialized form


A reasonable serialized form for a List is the number of entries followed by each of the entries.

Although the default serialized form is at least correct in the List case, it may not even be correct for an object whose invariants are tied to implementation-specific details.

Example: a hash table using buckets. The bucket an entry lands in is based on the hash code of the key, which may change from JVM to JVM, or between different runs on the same JVM. Thus the default serialized form can violate the invariants of the hash table in this case.
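The "count followed by entries" form might be sketched like this for a doubly linked list of strings. StringChain and ChainDemo are hypothetical names; the links are transient, so only the logical content reaches the stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical doubly linked list with a custom serialized form.
final class StringChain implements Serializable {
    private static final long serialVersionUID = 1L;

    // Physical representation; deliberately not Serializable.
    private static final class Node {
        final String value;
        Node next;
        Node prev;
        Node(String value) { this.value = value; }
    }

    private transient Node head;
    private transient Node tail;
    private transient int size;

    void add(String value) {
        Node node = new Node(value);
        if (tail == null) {
            head = node;
        } else {
            tail.next = node;
            node.prev = tail;
        }
        tail = node;
        size++;
    }

    int size() { return size; }

    String get(int index) {
        Node n = head;
        for (int i = 0; i < index; i++) {
            n = n.next;
        }
        return n.value;
    }

    // Logical form: the number of entries, then each entry in order.
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (Node n = head; n != null; n = n.next) {
            out.writeObject(n.value);
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            add((String) in.readObject()); // rebuilds links and size
        }
    }
}

class ChainDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StringChain chain = new StringChain();
        chain.add("a");
        chain.add("b");
        chain.add("c");
        StringChain copy = (StringChain) roundTrip(chain);
        System.out.println(copy.size() + " " + copy.get(0) + " " + copy.get(2));
    }
}
```

Because the nodes are never written as objects, the stream does not recurse through the next and prev links, which avoids the space, time and stack-depth costs listed earlier.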


readObject() and security attacks


Deserialization uses defaultReadObject() and readObject() to create a new instance of a class.

Thus readObject is a constructor!

So, readObject must behave like any other constructor:

Check the validity of its arguments where needed.

Make copies of parameters where needed.

Otherwise, it is a very simple job for an attacker to violate the object’s invariants:

Provide a hand-made serialization of the attack object.


Guide for writing a bulletproof readObject


Private reference fields should be initialized with defensive copies of the values read from the stream.

Check invariants and throw an InvalidObjectException if they fail.

As with constructors, do not invoke any overridable methods.

If an entire object graph must be checked for validity after deserialization, the ObjectInputValidation interface should be used.
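The first three guidelines might look like this in a hypothetical immutable class Period. Since forging a byte stream by hand is beyond a short example, the helper tamper (also hypothetical) stands in for an attacker by breaking the invariant through reflection before the object is written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.util.Date;

// Hypothetical class with the invariant: start is never after end.
final class Period implements Serializable {
    private static final long serialVersionUID = 1L;
    private Date start; // not final, so readObject can swap in copies
    private Date end;

    Period(Date start, Date end) {
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.after(this.end)) {
            throw new IllegalArgumentException("start is after end");
        }
    }

    Date start() { return new Date(start.getTime()); }
    Date end() { return new Date(end.getTime()); }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        if (start == null || end == null) {
            throw new InvalidObjectException("missing date");
        }
        // Defensive copies: the stream may hold other references to these Dates.
        start = new Date(start.getTime());
        end = new Date(end.getTime());
        // Re-check the invariant that the constructor enforces.
        if (start.after(end)) {
            throw new InvalidObjectException("start is after end");
        }
    }
}

class PeriodDemo {
    // Stand-in for a forged stream: break the invariant behind the constructor's back.
    static Period tamper(Period period) {
        try {
            Field end = Period.class.getDeclaredField("end");
            end.setAccessible(true);
            end.set(period, new Date(0L));
            return period;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the object can be written and read back; false if readObject rejects it.
    static boolean survivesRoundTrip(Period period) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(period);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Period good = new Period(new Date(1000L), new Date(2000L));
        System.out.println(survivesRoundTrip(good));         // accepted
        System.out.println(survivesRoundTrip(tamper(good))); // rejected by readObject
    }
}
```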


writeReplace()


Sometimes it is not appropriate to serialize the actual object, but rather some other, specifically chosen object.

<access> Object writeReplace() throws ObjectStreamException;

Returns an object that will replace the current object during serialization. Any object may be returned, including the current one.
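One common use of writeReplace is to write a small stand-in object instead of the real one. In this sketch the classes Range, RangeProxy and ProxyDemo are hypothetical; the stand-in uses readResolve, the matching hook on the reading side, to turn itself back into a Range through the ordinary constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical class with the invariant: low is never greater than high.
final class Range implements Serializable {
    private static final long serialVersionUID = 1L;
    private final int low;
    private final int high;

    Range(int low, int high) {
        if (low > high) {
            throw new IllegalArgumentException("low is greater than high");
        }
        this.low = low;
        this.high = high;
    }

    int low() { return low; }
    int high() { return high; }

    // Serialize a RangeProxy in place of this Range.
    private Object writeReplace() throws ObjectStreamException {
        return new RangeProxy(low, high);
    }

    // Hypothetical stand-in that carries only the logical state.
    private static final class RangeProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final int low;
        private final int high;

        RangeProxy(int low, int high) {
            this.low = low;
            this.high = high;
        }

        // On reading, rebuild the Range through its validating constructor.
        private Object readResolve() throws ObjectStreamException {
            return new Range(low, high);
        }
    }
}

class ProxyDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Range copy = (Range) roundTrip(new Range(1, 5));
        System.out.println(copy.low() + ".." + copy.high());
    }
}
```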


A comment about access qualifier


These methods can have any accessibility.

They will be used if they are accessible to the type of the object being serialized:

A private readResolve affects only the serialization of objects that are exactly of its class.

A package-private readResolve affects only subclasses within the same package.

A public or protected readResolve affects objects of all subclasses.


readResolve()


Recall that deserialization produces a new instance of a class.

If a given class should have only one instance (singleton pattern), then deserialization can produce a second, different instance!

In general, you need to be concerned about what is being created for instance-controlled classes.

Enter readResolve(): a method that returns the appropriate instance of the class in place of the one created by the readObject() or defaultReadObject() methods.

<access> Object readResolve() throws ObjectStreamException;
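For the singleton case, a minimal sketch could look like the following. Registry and SingletonDemo are hypothetical names introduced for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Hypothetical singleton, used only for this illustration.
final class Registry implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Registry INSTANCE = new Registry();

    private Registry() {
    }

    // Discard the freshly deserialized copy and hand back the one instance.
    private Object readResolve() throws ObjectStreamException {
        return INSTANCE;
    }
}

class SingletonDemo {
    // Serialize obj into memory and read it straight back.
    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(Registry.INSTANCE);
        System.out.println(copy == Registry.INSTANCE); // same object, thanks to readResolve
    }
}
```

Without readResolve, the round trip would return a second Registry object and the comparison would be false.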