Saturday, 13 April 2013

Difference between string object and string literal

Background


Everybody is familiar with Strings as a data type. Strings can be used as literals 
  • String name = "John";
or they can be used as objects
  • String name = new String("John");
The two forms may look interchangeable and often serve our purpose equally well, but do we really understand the differences between them? How does each kind of String behave, and how can that affect our programming logic?

Let's understand the differences between the two and when we would use each.

Note 1: The first thing to remember is that Strings are immutable, i.e. they cannot be changed or altered. Of course you can call operations such as substring() on a String, but these return an entirely different String. The point is that once you create a String you cannot modify it.
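A minimal sketch of what immutability means in practice. The class name ImmutabilityDemo and its demo() helper are made up for this illustration; substring() and toUpperCase() are the standard java.lang.String methods.

```java
public class ImmutabilityDemo {

    // Returns the original String plus two Strings derived from it.
    // (Hypothetical helper, named only for this example.)
    static String[] demo() {
        String name = "John";
        String part = name.substring(0, 2); // a new String, "Jo"
        String upper = name.toUpperCase();  // another new String, "JOHN"
        // name itself is untouched by both calls
        return new String[] { name, part, upper };
    }

    public static void main(String[] args) {
        for (String s : demo()) {
            System.out.println(s);
        }
    }
}
```

Running it prints John, Jo and JOHN on separate lines: the original value survives both operations because each call hands back a different String.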

Using String as a Literal

    You create a String literal simply as below - 
  • String name = "John";
  When you create a String literal, it is stored in the string literal pool. Up to Java 6, the HotSpot JVM kept this pool in the PermGen area (PermGen normally held the string literal pool and loaded classes).

Note 1: From Java 7, the String pool was moved from PermGen to the normal heap area. This was primarily because the PermGen area is of fixed size, and filling it up could lead to "java.lang.OutOfMemoryError: PermGen space", which could finally bring your application down.

Note 2: From Java 8 there is no PermGen area. Class metadata is stored in Metaspace, which is allocated from native memory, while the String pool stays in the normal heap.

So when you create a String as a literal, the reference points to an interned String object in the String pool (in PermGen before Java 7, in the normal heap from Java 7 onwards), i.e. name will refer to an interned String object. This means that the character sequence "John" is stored in a central place, and whenever the same literal "John" is used again the JVM does not create a new String object but reuses the reference to the 'cached' String.

Note : String literals are interned by default.


To clarify this further let's take an example - 

String name1 = "John";
String name2 = "John";
if (name1 == name2) {
    System.out.println("Both point to same object \n");
}
if (name1.equals(name2)) {
    System.out.println("Both have same value \n");
}

output :
 Both point to same object
 
Both have same value

Hope you get the point now. The object is the same (created once in the String pool) and is reused every time a literal with the same value appears. So to summarize, the objects are the same and, obviously, so is the value.
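The sharing is not limited to two variables in the same method. As a small sketch, the same literal written in two different classes also resolves to one pooled object. The class names SharedLiteralDemo, First and Second are invented for this illustration.

```java
public class SharedLiteralDemo {

    // Two unrelated (hypothetical) classes that each contain the literal "John".
    static class First {
        static String name() { return "John"; }
    }

    static class Second {
        static String name() { return "John"; }
    }

    // True when both classes hand back the very same String object.
    static boolean sameObject() {
        return First.name() == Second.name();
    }

    public static void main(String[] args) {
        System.out.println("Same object across classes: " + sameObject());
    }
}
```

This prints "Same object across classes: true", since both literals refer to the single interned "John".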

Let's understand creating a String as an object now.

Using String as an Object

     Creating String as an object is as follows -
  •    String name = new String("John");
Here name is an individual instance of the java.lang.String class. This object is created on the heap just like any other object. Two Strings created with the new keyword will always be two different objects, even if their values are the same.
Let's take an example to understand this - 


String name1 = new String("John");
String name2 = new String("John");
if (name1 == name2) {
    System.out.println("Both point to same object \n");
}
if (name1.equals(name2)) {
    System.out.println("Both have same value \n");
}

output :
Both have same value           

As expected, we did not get "Both point to same object" in our output, as the two objects are entirely different and are allocated separate memory. But note that they have the same value, i.e. John, and hence we got "Both have same value".


As I mentioned earlier, String literals are interned by default. But if you need to intern a String object you need to call intern() on it.

        String name1 = new String("John").intern();
        String name2 = new String("John").intern();
        if (name1 == name2) {
            System.out.println("Both point to same object \n");
        }
        if (name1.equals(name2)) {
            System.out.println("Both have same value \n");
        }
output :
 Both point to same object
 
Both have same value
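To tie the three cases together, here is a small sketch that compares a literal, an object created with new, and the result of intern() side by side. The class name InternDemo and the compare() helper are made up for this illustration.

```java
public class InternDemo {

    // Returns the results of three comparisons, in order:
    // literal == object, literal == interned, literal.equals(object).
    // (Hypothetical helper, named only for this example.)
    static boolean[] compare() {
        String literal = "John";              // interned by default
        String object = new String("John");   // separate object on the heap
        String interned = object.intern();    // reference to the pooled "John"
        return new boolean[] {
            literal == object,
            literal == interned,
            literal.equals(object)
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("literal == object      : " + r[0]);
        System.out.println("literal == interned    : " + r[1]);
        System.out.println("literal.equals(object) : " + r[2]);
    }
}
```

The first comparison is false and the other two are true: intern() returns the pooled instance that the literal already refers to, while new always produces a distinct object with the same value.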

String pool values are garbage collected

Irrespective of which version of Java you use and where your String pool resides, values in the String pool are eligible for GC and follow normal GC rules, i.e. if they are not reachable from the program's GC roots then they are eligible for garbage collection.

Though this is true, String literals are not candidates for GC in most cases. A String literal stays reachable from GC roots because the code that uses it holds an implicit reference to it. For example, a method that uses a string literal will always have a reference to it. So the literal can be garbage collected only if the code referencing it is itself collected, and that is not possible unless the class was dynamically loaded.

If the literal was defined in a class that was dynamically loaded (e.g. using Class.forName(...)), then it is possible to arrange that the class is unloaded. If that happens, then the String object for the literal will be unreachable, and will be reclaimed when the heap containing the interned String gets GC'ed. 

Also, this cleanup typically happens on a full GC.

Summary

   To summarize what we learned above - When we create a String as a literal, it is created as an interned object in the String pool (in the PermGen area before Java 7, in the normal heap area from Java 7) and is reused every time a literal with the same value appears. On the other hand, a separate object is created every time a String is created using the new keyword.

