Thursday, March 14, 2013

Serialization in Java

Serialization is one of the more advanced topics in Java. Before delving into its details, let's start with the definition.

Serialization is a technique to encode objects as a byte stream and to reconstruct objects from their encoded byte stream. Encoding an object into a byte stream is known as serialization, and the reverse process (converting a byte stream into an object) is known as deserialization.

 

So why do we need this feature?

  1. Transfer an object from one JVM to another JVM
  2. Store an object on a file system to be used later
  3. Deep copy a web of objects
  4. Extend the lifetime of an object (its state can be read back even after the application that created it has finished)
  5. Create an object without using a constructor (through deserialization)
All of these will become clearer by the end of this post.

Enable Serialization on a class

Enabling serialization on a class is trivial; you just need to implement the Serializable interface. That's it; there is no other baggage.

 import java.io.Serializable;  
   
 public class Person implements Serializable {  
      private String name;  
   
      public String getName() {  
           return name;  
      }  
   
      public void setName(String name) {  
           this.name = name;  
      }  
 }  

Points to be noted about the Person class:
  1. To enable serialization on a class, implement the Serializable interface (from the java.io package).
  2. The Person class has no methods other than its setters/getters; Serializable did not force it to implement anything. That is because Serializable is a marker/tagging interface (it declares no methods).
  3. The serialized form is a sequence of bytes. So an instance of the Person class can be stored on disk, buffered in memory, or even transferred over the network to a remote machine.

Serialization in action

Let's try to write an instance of Person to your local disk.

 import java.io.FileNotFoundException;  
 import java.io.FileOutputStream;  
 import java.io.IOException;  
 import java.io.ObjectOutputStream;  
   
 /**  
  * Class for serializing Person instance  
  * @author Siddheshwar  
  *  
  */  
 public class SerializationTest {  
      public static void main(String[] args) {  
           Person p = new Person();  
           p.setName("rai");  
           // try-with-resources (Java 7+) closes the stream even if writeObject() throws  
           try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("person.ser"))) {  
                oos.writeObject(p);  
           } catch (FileNotFoundException e) {  
                e.printStackTrace();  
           } catch (IOException e) {  
                e.printStackTrace();  
           }  
      }  
 }  

The above code creates an instance of the Person class and then writes the object to disk in a file named person.ser. Because the path is relative, the file is created in the working directory of the program (on my machine the path is F:\workspace\Project\person.ser). You can also give an absolute path for the file. To write the object, you call the writeObject() method on ObjectOutputStream.

Now, let's deserialize the person.ser file to create an instance of Person.

 import java.io.FileInputStream;  
 import java.io.FileNotFoundException;  
 import java.io.IOException;  
 import java.io.ObjectInputStream;  
   
 /**  
  * Deserialize person instance from the stream  
  *   
  * @author Siddheshwar  
  *   
  */  
 public class DeserializationTest {  
      public static void main(String[] args) {  
           // try-with-resources (Java 7+) closes the stream even if readObject() throws  
           try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("person.ser"))) {  
                Person per = (Person) ois.readObject();  
                System.out.println("val : " + per.getName());  
           } catch (FileNotFoundException e) {  
                e.printStackTrace();  
           } catch (IOException e) {  
                e.printStackTrace();  
           } catch (ClassNotFoundException e) {  
                e.printStackTrace();  
           }  
      }  
 }  
Output
val : rai

The above code converts person.ser back into a Person object by calling the readObject() method on ObjectInputStream. readObject() returns Object (the super class), so a cast is required to get a Person. The Person object is retrieved from a file. Bingo! Note the checked exceptions that deserialization can throw: ClassNotFoundException is thrown if the Person class cannot be found.

The ObjectOutputStream[Java SE7 Doc] and ObjectInputStream[Java SE7 Doc] APIs actually perform serialization and deserialization (along with, of course, the Serializable interface[Java SE7 doc]).
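
The same two calls also work on in-memory streams, which is one way to get the deep copy mentioned in the list of use cases at the top. The following is only a minimal sketch: the Team class and the deepCopy() helper are made up for this illustration and are not part of the Java API, and every object reachable from the one being copied is assumed to be Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class DeepCopyDemo {
    // Hypothetical class, used only for this illustration
    static class Team implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> members = new ArrayList<String>();
    }

    // Serializes the object into a byte array and reads a fresh copy back from it
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Deep copy failed", e);
        }
    }

    public static void main(String[] args) {
        Team original = new Team();
        original.members.add("rai");

        Team copy = deepCopy(original);
        copy.members.add("new member");   // changes only the copy's list

        System.out.println(original.members.size()); // prints 1
        System.out.println(copy.members.size());     // prints 2
    }
}
```

Because the copy is rebuilt from bytes, it shares no mutable state with the original. Note also that the Team constructor is not run for the copy, which is the "object without a constructor" case from the same list.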

Evolution is Evil for a serializable class

Evolution of your application is a normal thing. This means classes change over time. Let's take the case where the Person class evolves and gains one more attribute, age.
class Person implements Serializable {
    private String name;
    private double age;
    //setters/getters
}
Now let's run DeserializationTest again to convert person.ser into a Person object. But keep in mind that the Person class has changed since the file was written. Deserialization fails! Why so?
Sounds like evolution is bad.

Evolution in itself is not bad, but in this case the Person class is not prepared to handle evolution as far as serialization is concerned. The serialized form is that of the previous Person class, but you are trying to deserialize it with the new Person class (with an extra attribute). It fails because adding the attribute changed the version id that the runtime computes for the class, so the stored form is no longer considered compatible with it.

The code throws InvalidClassException (a subclass of IOException), so in our code it will be caught by the IOException catch block. So life with serialization is not as easy as it looked earlier.

You need to keep a few things in mind:
You should explicitly declare a version id (serialVersionUID). In its absence, a default value is computed from the details of the class; since the class has changed, the newly computed value does not match the one stored in the stream, and hence deserialization fails.

From the Javadoc of java.io.Serializable:


The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization. If the receiver has loaded a class for the object that has a different serialVersionUID than that of the corresponding sender's class, then deserialization will result in an InvalidClassException. A serializable class can declare its own serialVersionUID explicitly by declaring a field named "serialVersionUID" that must be static, final, and of type long:


ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;           

If a serializable class does not explicitly declare a serialVersionUID, then the serialization runtime will calculate a default serialVersionUID value for that class based on various aspects of the class, as described in the Java(TM) Object Serialization Specification. However, it is strongly recommended that all serializable classes explicitly declare serialVersionUID values, since the default serialVersionUID computation is highly sensitive to class details that may vary depending on compiler implementations, and can thus result in unexpected InvalidClassExceptions during deserialization. Therefore, to guarantee a consistent serialVersionUID value across different java compiler implementations, a serializable class must declare an explicit serialVersionUID value. It is also strongly advised that explicit serialVersionUID declarations use the private modifier where possible, since such declarations apply only to the immediately declaring class--serialVersionUID fields are not useful as inherited members. Array classes cannot declare an explicit serialVersionUID, so they always have the default computed value, but the requirement for matching serialVersionUID values is waived for array classes.

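 So all you need to do is declare serialVersionUID in the Person class before you serialize it, as in the code below. A file that was written by a class without an explicit serialVersionUID still carries the old computed value, so it has to be written again.

 With that declaration in place, the class can evolve by adding another attribute such as age. When an older stream is deserialized, the attributes missing from it get their default values. So evolution is NOT bad if you declare serialVersionUID in your serializable class.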

 import java.io.Serializable;  
   
 class Person implements Serializable {  
      private static final long serialVersionUID = 1L;  
      private String name;  
   
      public String getName() {  
           return name;  
      }  
   
      public void setName(String name) {  
           this.name = name;  
      }  
 }  
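
If you are curious which value the runtime actually uses for a class, ObjectStreamClass can report it. The following is a small sketch; the two nested classes and the uidOf() helper are hypothetical names chosen for this illustration. The computed default for the class without an explicit id depends on the class details (and possibly the compiler), so no particular value is claimed for it here.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SerialVersionUidDemo {
    // Hypothetical class without an explicit id: the runtime computes a default
    static class WithoutId implements Serializable {
        private String name;
    }

    // Hypothetical class with an explicit id: the runtime uses the declared value
    static class WithId implements Serializable {
        private static final long serialVersionUID = 1L;
        private String name;
    }

    // Returns the serialVersionUID the serialization runtime associates with the class
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("explicit id : " + uidOf(WithId.class));    // prints explicit id : 1
        System.out.println("computed id : " + uidOf(WithoutId.class)); // value depends on class details
    }
}
```

Adding a field to WithoutId and running this again is a quick way to see the computed value change, which is exactly what breaks deserialization of older files.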

Controlling Serialization

By default, serialization and deserialization are all-or-nothing: the object's state is written or read in a single method call, and that call alone gives you no way to write or read selectively.

                    writeObject(instance);
                    readObject();

Does this mean you can't control serialization at the field level?
Can you say, don't write this particular field?
Can you write your own custom writeObject and readObject methods?

 You can control serialization at the field level, you can exclude a particular field, and you can write your own writeObject and readObject methods. Let's discover:
  •  If you want to disable serialization of a field, declare it as transient. The writeObject call will not write attributes that are declared transient, and when you deserialize the object, that attribute gets its default value (null for an object reference). This is particularly helpful for sensitive information such as a password or an SSN.
                  class Person implements Serializable {
                          static final long serialVersionUID = 1L;
                          private String name;
                          private transient String password;   //don't write this field
                  }

  • You can also define your own writeObject() and readObject() methods in the class to get more control over what is written, in addition to declaring attributes as transient. These are private methods with fixed signatures that the serialization runtime calls; they are not overrides of inherited methods.
          Check out the implementation of ArrayList in the Java API.
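
Both points can be tried out in one small round trip. This is a sketch under assumptions: the Account class, its passwordResetRequired flag, and the roundTrip() helper are invented for this illustration. Only the signatures of writeObject() and readObject(), and the defaultWriteObject()/defaultReadObject() calls, come from the serialization API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {
    // Hypothetical class, used only for this illustration
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        transient String password;               // never written to the stream
        transient boolean passwordResetRequired; // rebuilt in readObject()

        Account(String name, String password) {
            this.name = name;
            this.password = password;
        }

        // Called by the serialization runtime; must be private with this signature
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();              // writes the non-transient fields (name)
            out.writeBoolean(password != null);    // extra data: was a password set?
        }

        // Must read back exactly what writeObject() wrote, in the same order
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();                // restores name; password stays null
            passwordResetRequired = in.readBoolean();
        }
    }

    // Serializes to a byte array and deserializes again
    static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(buffer.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Account restored = roundTrip(new Account("rai", "secret"));
        System.out.println(restored.name);                  // prints rai
        System.out.println(restored.password);              // prints null
        System.out.println(restored.passwordResetRequired); // prints true
    }
}
```

The password itself never reaches the stream; only the fact that one existed is written, so the restored object can ask for it again.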
