geekRai: March 2013

Tuesday, March 26, 2013

Algorithm Analysis and Big Oh Notation

In mathematics and computer science, an algorithm is a step-by-step procedure for calculation. Algorithms are used for calculation, data processing, and automated reasoning. - Wikipedia

Algorithms are independent of programming language as well as platform on which it run. Algorithm analysis is mainly about the resources like computational time, storage, computer hardware, bandwidth etc. But usually measuring computational time is most important.

Computational time or run time of algorithm is computed by counting up the number of steps an algorithm takes on a given problem instance. But defining a unified model to calculate run time of an algorithm is quite complicated.

Let's take a case of multiplying two numbers vs adding two numbers. Multiplying two numbers takes more time than adding two numbers on most processors. So addition and multiplication will not take same number of steps. Let's make things more simpler by analyzing addition of two numbers on two modern languages like C and Java. Even in this case addition of two numbers will not take exactly same number of steps. So this proves exact analysis of algorithms is going to be much more complicated as it will depend on factors like processors, languages, different versions of the same language and even on how smart the algorithm developer is etc.

To make things much more easier and reliable; RAM Model of Computation plays a vital role. These assumptions make analysis of algorithm easier without missing important aspects. Some of the basic assumptions of RAM model are:

Mathematical and logical operations like +,-,*,=, floor, ceiling, if etc take one time stamp.
Number of time a loop runs depends on the number of iterations. So Loop is not considered a simple operation and its composition of many single-step operations.
Memory access like load,store,copy of an item requires one unit of time (independent of the source like disk or cache)

So with this model, adding and multiplying two numbers is going to take same unit of time. And even adding two numbers will have same complexity on any platform and language.

So RAM model captures essential behavior of computers. And its useful and simple to work with.

Complexity

Using RAM model, number of steps an algorithm takes can be counted by executing it. But complexity also depends on the nature/variation of input data set. Sorting an already sorted integer array will take far lesser number of steps compared with a unsorted array. So you need to provide all possible data set to find out the run time complexity. This brings in the notion of best, worst, and average-case complexity of algorithms.

The worst-case complexity : maximum number of steps taken in any instance of size n.
The best-case complexity : minimum number of steps taken in any instance of size n.
The average-case complexity : average number of steps taken over all instances of size n.

Average-case and best-case are very subjective and difficult to calculate. The meaning of best and average depends on so many factors. So its better to avoid all such complexities associated with best and average case; and consider only worst-case.

( O ) Big-Oh Notation

f(n) = 53n^2 + 654n - logn

Performing precise worst-case analysis on above function will be a daunting task. But above equation tells that "time grows quadratically with n". So it is much easier to talk in terms of upper and lower bounds of time-complexity. Big Oh simplifies our analysis by ignoring levels of details that do not impact our comparison of algorithm. So Big Oh analysis of below two functions are same:

f(n) = 5n;

and g(n) = n;

So constant factors like 5 in above case are ignored when comparing two algorithms. Let's take a look at the formal definition of Big Oh notation :

f(n) = O(g(n)) means c.g(n) is an upper bound on f(n). Thus there exists some constants c such that f(n) is always <= c.g(n) for large n {Big Oh Notation}

f(n) = Ω(g(n)) means c.g(n) is a lower bound of f(n). Thus there exists some constants c such that f(n) is always >= c.g(n) {Omega Notation}

f(n) = Ɵ(g(n)) means c1.g(n) is an upper bound on f(n) and c2.g(n) is a lower bound on f(n). Thus there exists constants c1 and c2 such that f(n) <= c1.g(n) and f(n) >= c2.g(n) {Theta Notation}

The Big Oh notation enables us to ignore details and focus on the big picture. So it doesn't care if one sorting algorithm sorts 3 times faster than the other. It's more about which algorithm sorts faster when your input size is like 10,000 items. When we say "the running time is O(n^2)," we mean that there is a function f(n) that is O(n^2) such that for any value of n, no matter what particular input of size n is chosen, the running time on that input is bounded from above by the value of f(n). Equivalently, we mean that the worst-case running time is O(n^2). Big oh is used to describe the asymptotic behavior of an algorithm, that is rate of growth. It is also an upper bound which might not be tight.

Compare Efficiency of Algorithms : Big Oh notation & worst-case analysis.

Big Oh Big picture

Constant functions

f(n) =1

Multiplying two numbers, printing "Hello World", adding/averaging two numbers,
accessing an array element with index.

Logarithmic functions

f(n) = logn

Finding an item in BST, looking for a name in a address book. Grows faster than constant function

Linear functions

f(n) = n

Looking each item once ( twice, 10 times) in a list/array, counting number of items in a linked list

Superlinear functions

f(n) = nlgn

Quicksort, mergesort, heapsort. Grows little faster than than linear

Quadratic functions

f(n) = n ^ 2

Bubble sort, selection sort. Cost of looking at most or all pairs of items in an n-element universe

Cubic functions

f(n) = n^3

Iterate through all triplets of items in a list

Exponential functions

f(n) = c^n {c>1}

Iterating all subsets of n items

Factorial functions

f(n) = n!
Generate all permutations or ordering, Travelling salesman problem brute-force way. Functions like n!

n! >> 2^n >> n^3 >> n^2 >> nlogn >> n >> logn >> 1

References:

Big Oh cheat sheet : link
http://discrete.gr/complexity/

Friday, March 22, 2013

JVM Architecture

Java Virtual Machine (JVM) runs/executes java's compiled file(.class). Its job is to load class files and then execute the bytecode contained inside it. Below diagram shows the life cycle of a java program.
More details about bytecode on this post.

Java Virtual Machine is called virtual because it is an abstract computer (or machine) defined by specification. The implementation of the specification is also known as JVM. JVM in general could mean specification, implementation or instance. Let's cover these aspects in detail:

JVM Specification: [link]

JVM specification is template for implementing JVM tool. Specification defines certain features every JVM must have but leaves many choices to the designer of each implementation. It's the specification which says, how your JVM should behave? Like, if JVM runs out of memory, it should throw out an appropriate error. Also, JVM specification is different from the Java specification (Java specification controls the language; JVM specification controls the tool which executes programs)

JVM implementation

Specification is an abstract thing; it gets converted into a product after implementation. When you install JVM in your computer/laptop, we refer to a particular JVM implementation: Windows JVM for 32 bit machine, windows JVM for 64 bit machine, JVM for Mac OSX etc . JVM specification is flexible enough to allow implementation to be either completely in software or to a varying degree in hardware .

JVM instance

This is one of the most confusing aspect. JVM instance comes into picture when you run your Java application. Runtime instance job is, to run a Java application( i.e. java className). So when application gets launched runtime instance gets life and when application completes, the instance dies. Your application could be as small as a class which just prints "Hello World" or it could be as complicated as a distributed application (.war or .ear) which runs 24/7, until the world stops. So, if you start 3 applications at the same time using same JVM (implementation); it means that you have 3 JVM instances. Each Java application runs inside its own JVM.

Each Java application runs inside a run-time instance of some concrete implementation of the abstract specification of the JVM.

Architecture

JVM consists of two major subsystems and memory areas defined in specification.

source : artima.com

Class Loader Subsystem

This subsystem, loads class files from both the program and Java API. As per specification, only those files which are needed are loaded. It loads types (classes and interfaces) using fully qualified name. After loading, it parses information about type( from the binary data contained in the class file), and then places this type information into the method area. And as the program runs, JVM places all the created objects on the heap.

This post talks class loading and unloading in detail.

Execution Engine Subsystem
Execution engine executes bytecode instructions contained in the loaded classes. This component of JVM can have some aspect implemented as hardware. JVM can support multiple execution techniques:

1. bytecode interpreter : Interprets the byte code, one at a time.
2. just-in-time compiler: faster than interpreter but requires more memory. Bytecodes of the method
are compiled to native machine code when method is invoked for the first time. Also machine code
is cached so that it can be reused on the subsequent calls.
3. adaptive optimizer: In this technique, JVM starts by interpreting the bytecodes but monitors the
activity of the running program and identifies the most heavily used areas of code. As program
runs, the virtual machine compiles to native machine code and optimizes only those heavily used
areas of code. The rest of the bytecode is interpreted.

Runtime Data Areas

When a program runs, JVM organizes the memory it needs to execute a program into several runtime data areas. Specification of data area is quite abstract to let it get implemented on a wide variety of computers and devices. Each instance of the JVM has one method area and one heap. These areas are shared by all threads running inside the virtual machine. Each running thread has its own PC (program counter) register and Java Stack. If thread is executing a Java method (not a native method), the value of the PC register tells the next instruction to execute. Java stack stores the state of Java method invocation(not native invocation) for the thread. The state of a Java method invocation includes its local variables, the parameters with which it was invoked, its return value (if any), and intermediate calculations. The state of native method invocation is stored in an implementation dependent way in native method stacks, as well as in registers or other implementation dependent memory areas.

Reference:
http://www.artima.com/insidejvm/ed2/jvm5.html
http://www.artima.com/insidejvm/ed2/jvmP.html
http://kkarthikeyanblog.wordpress.com/2012/08/23/helloworld-in-jvms-view-how-java-program-executed-internally-in-jvm/

Related Post : Understanding Java Bytecode  Class Loading and unloading in JVM

Thursday, March 14, 2013

Serialization in Java

Serialization is one of the advanced topics of Java. Before delving into more detailed aspect of Serialization, let's start with the definition.

Serialization is a technique to encode objects as byte stream and reconstruct objects from their encoded byte stream. So encoding an object into a byte stream is known as Serialization and the reverse process ( converting a byte stream into object) is known as deserialization.

So why do we need this feature ?

Transfer object from one JVM to another JVM
Store object on a file system to be used later
Deep copying a web of objects
Extend the life time of an object (can access object even if application execution is complete)
Create an object without using constructor (through deserialization)

All these will become more clearer after covering this post.

Enable Serialization on a class

Enabling serialization on a class is trivial; you just need to implement Serializable interface.That's it; there is no other baggage.

 import java.io.Serializable;  
   
 public class Person implements Serializable {  
      private String name;  
   
      public String getName() {  
           return name;  
      }  
   
      public void setName(String name) {  
           this.name = name;  
      }  
 }

Points to be noted about Person class:

To enable serialization on a class implement Serializable interface ( from java.io package)
Person class doesn't have any other method apart from setters/getters. This means Serializable interface is a marker/tagging interface (doesn't have any method).
Serialized form is sequence of bytes. So instance of Person class can be stored on a disk, can be buffered or can even be transferred over network to a remote machine.

Serialization in action

Let's try to write an instance of person object on your local disk.

 import java.io.FileNotFoundException;  
 import java.io.FileOutputStream;  
 import java.io.IOException;  
 import java.io.ObjectOutputStream;  
   
 /**  
  * Class for serializing Person instance  
  * @author Siddheshwar  
  *  
  */  
 public class SerializationTest {  
      public static void main(String[] args) {  
           Person p = new Person();  
           p.setName("rai");  
           try {  
                FileOutputStream fs = new FileOutputStream("person.ser");  
                ObjectOutputStream oos = new ObjectOutputStream(fs);  
                oos.writeObject(p);  
                oos.close();  
           } catch (FileNotFoundException e) {  
                e.printStackTrace();  
           } catch (IOException e) {  
                e.printStackTrace();  
           }  
      }  
 }

Above code creates an instance of Person class and then writes the object on disk in file named as person.ser . File gets created inside the workspace at default location ( path on my machine is F:\workspace\Project\person.ser). You can also give an absolute path for the file. To write object, you need to call writeObject() method on ObjectOutputStream.

Now, let's deserialize person.ser file to create an instance of Person.

 import java.io.FileInputStream;  
 import java.io.FileNotFoundException;  
 import java.io.IOException;  
 import java.io.ObjectInputStream;  
   
 /**  
  * Deserialize person instance from the stream  
  *   
  * @author Siddheshwar  
  *   
  */  
 public class DeserializationTest {  
      public static void main(String[] args) {  
           try {  
                FileInputStream fis = new FileInputStream("person.ser");  
                ObjectInputStream ois = new ObjectInputStream(fis);  
                Person per = (Person) ois.readObject();  
                ois.close();  
                System.out.println(" val :" + per.getName());  
           } catch (FileNotFoundException e) {  
                e.printStackTrace();  
           } catch (IOException e) {  
                e.printStackTrace();  
           } catch (ClassNotFoundException e) {  
                e.printStackTrace();  
           }  
      }  
 }

Output
val : rai

Above code converts person.ser into a Person object. This is achieved by calling readObject() method on ObjectInputStream. Also readObject() method returns Object (super class); so casting is required to retrieve a person object. Person object is retrieved from a file, Bingo. Check out the kind of checked exception deserialization can throw. ClassNotFoundException will be thrown if Person class is not found.

ObjectOutputStream[Java SE7 Doc] and ObjectInputStream[Java SE7 Doc] API's actually perform serialization and deserializtion ( along with off course, Serializable interface[Java SE7 doc])

Evolution is Evil for a serializable class

Evolution of your application is a normal thing. This means classes change over time. Let's take case where Person class evolves and adds one more attribute, age.

class Person implements Serializable {
   private String name;
   private double age;
   //setters/getters
}

Now let's run the class DeserializationTest.java again to convert person.ser into a person object. But keep in mind that now Person class has changed. Deserialization fails! Why so?
Sounds like evolution is bad.

Evolution otherwise is not bad but in this case Person class is not in a state to handle evolution viz a viz serialization is concerned. Serialized form is that of the previous Person class but you are trying to deserialize it with new Person class (with an extra attribute). The reason for the failure is that binary compatibility of the Person class has broken after a new attribute got added.

Code throws InvalidClassException(sub class of IOException); so in our code it will be caught inside IOException catch block.So life with serialization is not as easy, as it looked earlier.

You need to keep in mind few things :
You must explicitly declare a unique version id; in absence of this id, value gets generated in default manner and as class has changed so generated value will not match and hence deserialization fails.

From Java Doc:

The serialization runtime associates with each serializable class a version number, called a serialVersionUID, which is used during deserialization to verify that the sender and receiver of a serialized object have loaded classes for that object that are compatible with respect to serialization. If the receiver has loaded a class for the object that has a different serialVersionUID than that of the corresponding sender's class, then deserialization will result in an InvalidClassException. A serializable class can declare its own serialVersionUID explicitly by declaring a field named "serialVersionUID" that must be static, final, and of type long:

ANY-ACCESS-MODIFIER static final long serialVersionUID = 42L;

If a serializable class does not explicitly declare a serialVersionUID, then the serialization runtime will calculate a default serialVersionUID value for that class based on various aspects of the class, as described in the Java(TM) Object Serialization Specification. However, it is strongly recommended that all serializable classes explicitly declare serialVersionUID values, since the default serialVersionUID computation is highly sensitive to class details that may vary depending on compiler implementations, and can thus result in unexpected InvalidClassExceptions during deserialization. Therefore, to guarantee a consistent serialVersionUID value across different java compiler implementations, a serializable class must declare an explicit serialVersionUID value. It is also strongly advised that explicit serialVersionUID declarations use the private modifier where possible, since such declarations apply only to the immediately declaring class--serialVersionUID fields are not useful as inherited members. Array classes cannot declare an explicit serialVersionUID, so they always have the default computed value, but the requirement for matching serialVersionUID values is waived for array classes.

So all you need to do is add serialVersionUID in person class before serialization.

So above code can undergo evolution of adding another attribute age. After deserialization you get default value for the missing attributes. So evolution is NOT bad if you declare serialVersionUID in your serializable class.

 import java.io.Serializable;  
   
 class Person implements Serializable {  
      private static final long serialVersionUID = 1L;  
      private String name;  
   
      public String getName() {  
           return name;  
      }  
   
      public void setName(String name) {  
           this.name = name;  
      }  
 }

Controlling Serialization

Serialization and Deserialization processes are atomic; as object state is written/read in one method call; you don't have any way to write or read selectively.

writeObject(instace);

readObject();

Does it mean; you can't control serialization at field level ?
Can you say, don't write this particular field?
Can you write your own custom writeObject and readObject methods ?

Answer to all above questions is big YES. Let's discover:

If you want to disable serialization on a field, declare it as transient. This means that writeObject call will not write attributes which are declared as transient. And when you deserialize it; that particular attribute gets default value. This is particularly helpful if you have some secured information like password, SSN number etc.

                  class Person implements Serializable {
      static final long serialVersionUID = 1L;
                          private String name;
                          private transient String password;   //don't write this field
                  }

You can also override writeObject() and readObject() to get more control on what you write along with declaring attribute as transient.

Check out Java API implementation of ArrayList.

Sunday, March 10, 2013

Inner classes in Java

Java provides a mechanism through which you can declare one class inside another class. It's known as Inner class or Nested class.

class OuterClass{ //Enclosing class

class InnerClass{ } //Inner class

}

So, here, we are dealing with class definition within another class. Inner class is a member of the enclosing class. Now obvious questions are : why do we need it ? when do we need to use it ? Is it same as composition ?

Let's understand different aspects of inner classes before addressing these questions:

case 1 : Normal/Plain Inner class (Nested class)

public class Outer{
        private String id;

        private class Inner{
            private String name;
            public void print(){
                System.out.format(" name :"+ this.name + " ;id: "+ id);
            }
        }
        public static void main(String[] args){
            Outer oc = new Outer();
            oc.id = "id1";

            Outer.Inner ic = oc.new Inner();
            ic.name = "rai";

            ic.print(); //prints name :rai ;id: id1
        }
    }

Above class is simplest possible way to see Inner class into action. Let's note few points about code :

Normal classes (non-inner) cannot be made private or protected (can only have public and package access) but inner classes can even use private access modifier as shown above. Making inner class private enables you to completely hide details about its implementation.
Inner class can only be accessed by its outer class.
If you want to invoke inner class from a static method (like main method); you need to do as shown in main method i.e. create an instance of outer class first and then through it create inner class instance as OuterClass.InnerClass obj = <OuterClassInstance>.new InnerClass();
Note, print method; you are able to directly access id attribute of outer class. This is possible because inner class instance is associated with an outer class instance (Outer.Inner ic = oc.new Inner();)

Above case, inner class instance is created inside static method of outer class; lets try to do the same inside a normal method of outer class:

public class Outer {
    public void setInner() {
       Inner i = new Inner();
       i.name = "rai";
       i.print();
    }

    private class Inner {
       private String name;

       public void print() {
           System.out.println(" name :" + name);
       }
    }

    public static void main(String[] args) {
       Outer oc = new Outer();
       oc.setInner();
    }
}

So when you try to instantiate an inner class from the non-static method of outer class; you just need to create instance in normal way. This makes sense as setInner method is called by an instance of outer class; so you are creating inner class instance linked with an outer class instance.

Now, what if, Outer class as well as Inner class has same attribute named as id. How to access both inside inner class?

public class Outer {
    private String name;

    private class Inner {
        private String name;

        public void print() {
            System.out.println(" Inner name :" + this.name);
            System.out.println(" Outer Name :"+ Outer.this.name);
        }
    }
}

case 2 : static Inner class or static Nested classes

Now let's analyze how a static inner class will behave and how can we access it? (need to brush up static keyword concepts? this tutorial). Static inner class is commonly referred to as static nested class. Objects of normal inner classes (non-static) implicitly keeps a reference to the object of the enclosing (Outer) class that created it. This is not true for static nested classes.

So when you don't need a connection between the inner class object and the outer class object, then you can declare the inner class as static. This means that you don't need an outer class object to instantiate inner class. And you can't access a non static outer class object from an object of a nested class.

public class Outer {
    private String name;

    private static class Inner {
        private String name;

        public void print() {
            System.out.println(" Inner name :" + name);
            //won't compile
            //System.out.println(" Outer Name :"+ Outer.this.name);
        }
    }

    public static void main(String args){
        Inner i = new Inner();
        i.print();
    }
}

Static nested classes don't have any link with the outer class; so it is analogous to a static method.

case 3 : Inner classes in methods/scopes

This is one of the trickiest and confusing aspect of inner classes; you can declare static classes inside body of a method or in a block. There are two possibilities :

local classes : declare an inner class within body of method with name
anonymous classes : declare an inner class within body of method without naming it

local class : Below example shows a validation example. It checks if a string representation of number is even or odd.

public class Outer{
   public static void ifEven(String number) {
       final String message = "Even Test !"; //note it's final
       class EvenValidator {
            int val = 0;
            EvenValidator(String number){
               val = Integer.parseInt(number);
            }
            public boolean isEvenNumber() {
               System.out.println(" message : "+ message);
                return val % 2 ==0 ? Boolean.TRUE : Boolean.FALSE;
            }
            //static component not allowed
            /*public static void f(){
               System.out.println("hi");
            }*/
         }
        EvenValidator myNumber1 = new EvenValidator(number);
        System.out.print(myNumber1.isEvenNumber());
    }
    public static void main(String... args) {
        ifEven("123");
    }
}

Output
message : Even Test !
false

Please note that local class can only access local variables and parameters of the enclosing block that are final (hence message is final). Also you can't declare or define any static members, as evident from commented static method f(). So local classes are like inner classes.

anonymous class : This takes local classes one step further by enabling you to declare and instantiate a class at the same time. So anonymous classes are like local classes except that they don't have a name.

interface Contents{
int getValue();
}

public class Anonymous {
   public Contents contents(final String message, int tmp){
       return new Contents(){
           private int i=10;
           public int getValue(){
               System.out.println(" message :"+ message); //final is must
               return i;
           }
       };
   }
   public static void main(String[] args) {
       Anonymous a = new Anonymous();
       Contents c = a.contents("hi");
       System.out.println(" val :"+ c.getValue());
   }
}

Output:
message :hi
val :10

Points which need to be noted :

The semicolon at the end of the anonymous inner class marks the end of the expression.
If you're defining an anonymous inner class and want to use an object that's defined outside the anonymous inner class, then the compiler requires that the argument reference be final. This is why message is declared as final.
You can't have a named constructor in an anonymous class as there's no name.
tmp parameter doesn't need to be declared as final as it's not getting used anywhere in the code.

Let's see a more familiar example which uses Runnable interface to create threads.

public class Anonymous {
    public Thread thread(){
       Runnable r = new Runnable(){
           public void run(){
               System.out.println("run");
           }
       };
       Thread t = new Thread(r);
       return t;
    }
    public static void main(String[] args) {
       Anonymous a = new Anonymous();
       Thread thread = a.thread();
       thread.start();
    }
}

case 4 : Inner class inside interfaces

Normally you can't put any code inside an interface, but a nested/inner class can be part of an interface. Any class you put inside an interface is automatically public and static. So you can nest a class inside an interface when you want to create some common code to be used with all different implementations of that interface.

interface ClassInInterface {
   class Dummy{ //static class
        public void sayHi(){
            System.out.println("Hi !");
        }
   }
}

public class ClassInsideInterface implements ClassInInterface{
   public static void main(String[] args){
        Dummy d = new Dummy();
        d.sayHi(); //prints Hi !
   }
}

.class files

Unlike normal classes which generates a single .class file; compiling a class which has inner classes generates separate class files. One class for outer class and one for each inner class. Below are two possibilities :

If inner class has a name ( like in case 1 and case 2): Outer.class and Outer$Inner.class.
If inner class is NOT named (i.e. anonymous inner class) : Anonymous.class and Anonymous$1.class

why ?

Inner class provides a view to the outer class, as it usually manipulates the outer class object that it was created within.
Inner class can independently extend a class (or implement an interface). So it solves multiple inheritance problem of Java. Interfaces to an extent solve this problem by allowing you to implement multiple interfaces. But inner classes actually allow multiple implementation inheritance.
Is it same as composition ? big NO.

---
do post your feedback !!!

Saturday, March 2, 2013

Generics in Java

Generics implement the Parameterized Type concept in Java and it's one of the major changes of Java 5 (Java SE5). This is derived from a concept known as template of C++ and initial motivation behind this feature was to allow creation of parameterized containers.

List list = new ArrayList(); //Before Java SE 1.5 or Without Generics
List<Type> list = new ArrayList<Type>(); // With Generics

Let's take some of the important aspects of Generics:

Primitive type parameters NOT allowed in Generics

This is one of the limitation of Generics; compiler doesn't allows you to use primitives as type parameter. Instead of primitives you need to use corresponding boxed type as shown below:

        List<int> c1 = new ArrayList<int>();   //Not allowed; doesn't compile
        List<Integer> c2 = new ArrayList<Integer>(); //perfect

So you can't declare containers of primitive but does it mean you can't even store/retrieve primitives ? Certainly NOT. Another Java SE5 feature comes to rescue. Through autoboxing and autounboxing; you can easily add/retrieve primitive types.

         c2.get(0);   //add int

int i = c2.get(i): //retrieve value to primitive type

Why Generics ?

Before Generics came into picture; Containers were used without type parameter. So you could put any data type in the list. But you need to cast when you retrieve data from container/collection (as shown below) :

public static void main(String[] args) {
        List list = new ArrayList();
        list.add("abc");
        list.add(420);
        list.add("NaN");

        String s = (String) list.get(0);
        int i = (Integer)list.get(1);

        System.out.println(s + " "+ i);

        //Integer nan = (Integer)list.get(2); throws ClassCastException
    }

Above code compiles properly but if you try to read data to an incompatible type; a runtime ClassCastException is thrown. Ideally it's better to get error as early as possible; preferably at compile time. So Generics got introduced to provide compile time (type) safety and eliminate the need for cast.

Generics Implementation

Before Generics introduction to the language, lots of code existed with raw type i.e. no type parameter. Challenge before designers was to add Generics feature in such a way that existing code remains legal and interoperable with the new changes.

List list = new ArrayList(); //before Java 5
List<E> list = new ArrayList<E>(); // Java 5+

So first case still compiles on Java SE5+; though you will get warning.
Obvious questions are : how is it possible ? And how both can co-exist ?

This is possible because Generics are implemented by erasure. This means that type constraint is enforced only at compile time and at run time the type information is erased or discarded. The .class file which gets generated as outcome of compiling will have same byte code for above 2 cases (i.e with raw and generics).

To verify same; analyze the content of .class file for above 2 lines inside a main method. javap utility inside bin directory can be used to analyze content of compiled file.

Generics vs Array

Generics and Array vary in couple of way. First, arrays are covariant but Generics by contrast are invariant. This point is shown in below diagram. Integer extends Number class so do the corresponding array classes; but the same is not true in case of Generics.

Another important difference is that arrays are reified (knows type at runtime) but Generics are implemented by erasure( type information erased at run time).

Let's see an example to understand the same:

public static void main(String[] args) {
       Number[] array = new Integer[5];
       array[0] = 5;
       array[1] = 6.5; //Runtime Exception

       List<Number> list = new ArrayList<Integer>(); //Doesn't compile
   }

Above code has 2 problems:

array object allows addition of a Number (6.5) at compile time but at run time it throws java.lang.ArrayStoreException. This is because run time type of array object is Integer[] not Number[] or Double[].
list object doesn't allow declaration as arraylist of Integer. It gives compilation error Type mismatch: cannot convert from ArrayList<Integer> to List<Number>

So Array objects preserve the rules about the type of object they contain. So you can't abuse them. On the other hand Generics moves such error detection to compile time.

Bounds/Wildcards

Advanced generics concepts are discussed in this separate post, here.

---
do post your feedback !!!

geekRai