Open Source For Geeks: Collections

Showing posts with label Collections. Show all posts

Friday, 13 February 2015

Understanding how hashing (HashMap and HashSet) work in Java

Background

Hashing is a very important concept in computer programming and a very popular concept for interview questions. Two important data structures related to hashing are HashMap and HashSet which is exactly what we are going to look at in this post.

Overview

If you are from a computer science background you would know HashMap is a data structure which stores key value pairs where as a HashSet stores unique data. In HashMap we have kind of buckets and each data added to a HashMap falls into one of the buckets depending on the hash value of it. Also you must have heard that adding and retrieving objects in HashMap happen in time complexity O(1).

But still there are open end question like -

What happens when two objects added to HashMap have same hash (code) value ? - a situation typically known as collision.
If above is handled how get and put work in hashmap?.... and so on.

We will address them now.

Understanding how HashMap works

You can visualize HashMap as follows-

So you have an Array and each array position is essentially a Linked List. As you know you don't have to specify size of the HashMap. It increases dynamically like ArrayList. The main data structure is essentially an array.

    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry[] table;

When you create a HashMap you either choose to provide initial capacity or when you don't a default value is used.

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

When table is created it is created with either this initial default capacity or the capacity you provide in the constructor.

Next important thing is when should out table we resized to meet the dynamically changing data size of HashMap. Answer depends on a threshold that is determined by load factor.

    /**

     * The load factor used when none specified in constructor.

     */

    static final float DEFAULT_LOAD_FACTOR = 0.75f;

HashMap provides a constructor to initialize load factor as well. So when your data size equals

threshold = table capacity * load factor

then the HashMap table is resized. Constructors are as follows -

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)(capacity * loadFactor);
        table = new Entry[capacity];
        init();
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
        table = new Entry[DEFAULT_INITIAL_CAPACITY];
        init();
    }

How is data with keys generating same hash code stored in HashMap?

As mentioned earlier each index has reference to object of type Entry which is a Linked List. It has a next pointer. So if an entry is added to hash map it's hash code is computed which determines the index at which it should be put. If that index has an entry then new entry is added to the start of the linked list and existing linked list is appended to next of it.

    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value);
        int hash = hash(key.hashCode());
        int i = indexFor(hash, table.length);
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        addEntry(hash, key, value, i);
        return null;
    }

    void addEntry(int hash, K key, V value, int bucketIndex) {
    Entry<K,V> e = table[bucketIndex];
        table[bucketIndex] = new Entry<K,V>(hash, key, value, e);
        if (size++ >= threshold)
            resize(2 * table.length);
    }

Also note how null is handled. Yes null is an acceptable key in HashMap.

So how is data retrieved if two keys generate same hash code?

Same hash code will make searched for both keys data land on same index in the table. From there the each Entry object is iterated over and it's key compared with the search key. Yes both key and value are stored in the Node/Entry object! On successful match corresponding value is returned.

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        int hash = hash(key.hashCode());
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
                return e.value;
        }
        return null;
    }

How data is stored in HashMap

First of all the Node array size is always 2^N. Following method guarantees it -

static final int tableSizeFor(int cap) {

    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

So lets say you provide initial capacity as 5

cap = 5

n = cap - 1 =  4 = 0 1 0 0
n |= n >>> 1;    0 1 0 0 | 0 0 1 0 = 0 1 1 0 = 6
n |= n >>> 2;    0 0 1 1 | 0 1 1 0 = 0 1 1 1 = 7
n |= n >>> 4;    0 0 0 0 | 0 1 1 1 = 0 1 1 1 = 7
n |= n >>> 8;    0 0 0 0 | 0 1 1 1 = 0 1 1 1 = 7
n |= n >>> 16;   0 0 0 0 | 0 1 1 1 = 0 1 1 1 = 7
return n + 1     7 + 1 = 8

So table size is 8 = 2^3

Now possible index values you can put your element in map are 0-7 since table size is 8. Now lets look at put method. It looks for bucket index as follows -

Node<K,V> p = tab[i = (n - 1) & hash];

where n is the array size. So n = 8. It is same as saying

Node p = tab[i = hash % n];

So all we need to see now is how

hash % n == (n - 1) & hash

Lets again take an example. Lets say hash of a value is 10.

hash = 10
hash % n = 10 % 8 = 2
(n - 1) & hash = 7 & 10 = 0 1 1 1 & 1 0 1 0 = 0 0 1 0 = 2

So it's a optimized modulo operation with bitwise & operator.

A good hash function

A hash function is a method that computes hash of a key where data is stored. It should obviously return value between 0 - (n-1) for an array of length n used to store the data.

A good has function will have take minimum computation time will evenly distribute keys in the array.

For array of size n and m elements inserted it's load factory would ideally be

α = m/n

A hash function can be thought of two parts -

Hash code Map (Key -> Integer)
Compression Map (Integer -> [0,N-1])

Simple hash function (Compression Map) for an array of size N would be

h(k) = k mod n where k is the key and n is the size of the array.

NOTE : In above function h(k) you need to take care that m is not a power of 2. If you do you are only using last m bits of the number to compute hash in that case which is not a good method.

So choose m close to n and m should be prime.

Another compression map can be -

h(k) =lowerbound ( k A mod 1) where k is the key, m is the size of the array and A is a constant between 0 and 1 i.e 1<A<0

NOTE : For String avoid adding ascii values of characters as it is not a good function. Multiple words may result in same hash and map to same bucket resulting in higher collision. Use polynomial function instead. So if your integers of ascii chars are c0, c1, c2 use polynomial like -

C0 + C1(X) + C2(X^2)

String class in Java uses following has method -

    /**
     * Returns a hash code for this string. The hash code for a
     * <code>String</code> object is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using <code>int</code> arithmetic, where <code>s[i]</code> is the
     * <i>i</i>th character of the string, <code>n</code> is the length of
     * the string, and <code>^</code> indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
    int h = hash;
        int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

Other hashing techniques are

Linear probing :

if (table is full error)
probe = h(k)
while(table(probe) is not empty)
probe = (probe + 1) mod m
table[prob] = k
double hashing :

if (table is full error)
probe = h1(k)
offset = h2(k)
while(table(probe) is not empty)
probe = (probe + offset) mod m
table[prob] = k

NOTE : In java if hashing is involved (lets say you are using HashMap or a HashSet) make sure you override equals() and hascode() method to suit your requirements. As much as is reasonably practical, the hashCode() method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer)

HashMap changes in Java8

The performance has been improved by using balanced trees instead of linked lists under specific circumstances. It has only been implemented in the classes

java.util.HashMap,
java.util.LinkedHashMap and
java.util.concurrent.ConcurrentHashMap.

This will improve the worst case performance from O(n) to O(log n).

Lastly lets see HashSet.

Understanding HashSet

Well if you are still guessing the data structure of HashSet then following would be a surprise -

    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

Yup HashSet stores data as keys of HashMap with dummy value. Constructor is equivalent to that of HashMap -

    /**
     * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
     * default initial capacity (16) and load factor (0.75).
     */
    public HashSet() {
    map = new HashMap<E,Object>();
    } 



    public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<E,Object>(initialCapacity, loadFactor);
    }



    public HashSet(int initialCapacity) {
    map = new HashMap<E,Object>(initialCapacity);
    }

Add and Remove methods are as follows -

    public boolean add(E e) {
    return map.put(e, PRESENT)==null;
    }



    public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
    }

Understanding LinkedHashMap

LinkedHashMap as we know stores the insertion order. However it also extends HashMap so it respects O(1) time complexity as well for insertion. So how does it really work.

LinkedHashMap maintains another type on Entry which holds pointer to previous as well as next Entry Node.

    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
            super(hash, key, value, next);
        }
    }

LinkedHashMap also stores head and tail of this double linked list and thats how it maintains the insertion order. So even though put follows hashing storage and retrieval order is maintained using doubly linked list.

    /**
     * The head (eldest) of the doubly linked list.
     */
    transient LinkedHashMap.Entry<K,V> head;
    /**
     * The tail (youngest) of the doubly linked list.
     */
    transient LinkedHashMap.Entry<K,V> tail;

So whenever you add a new key, value pair following happens -

    Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
        LinkedHashMap.Entry<K,V> p =
            new LinkedHashMap.Entry<K,V>(hash, key, value, e);
        linkNodeLast(p);
        return p;
    }
    // link at the end of list
    private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
        LinkedHashMap.Entry<K,V> last = tail;
        tail = p;
        if (last == null)
            head = p;
        else {
            p.before = last;
            last.after = p;
        }
    }

When iterating it iterates from head to tail thereby providing same order as insertion.

NOTE : before and after pointers are in addition to the next pointer which is inherited from HashMap.Node class. So next points to next node having same hash (collision scenario) thereby preserving O(1) lookups. Before and After pointers guarantee insertion lookup order.

Understanding TreeMap

TreeMap as you already know stores the data in sorted order. If the data it stores implements Comparable interface then it stores data in that natural order or you can pass a custom comparator to the TreeMap and it will use that to sort and store the data.

TreeMap maintains a binary search tree structure. It has a root, value less than root go to left where as value greater than root go to right and it's balanced too (RBT - A red–black tree is a kind of self-balancing binary search tree).

    /**
     * The comparator used to maintain order in this tree map, or
     * null if it uses the natural ordering of its keys.
     *
     * @serial
     */
    private final Comparator<? super K> comparator;
    private transient Entry<K,V> root;

Simple code snippet would be -

        Entry<K,V> e = new Entry<>(key, value, parent);
        if (cmp < 0)
            parent.left = e;
        else
            parent.right = e;

where cmp is the comparison value got either from comparators compare() method or Comparables compareTo() method.

Saturday, 12 July 2014

How to sort a Map on the values in Java?

Goal

In this post we will see how can we sort a Map based on its values. We will use Comparators.

Map overview in Java -

Code :

 /*
 * author : athakur
 */

public class HelloWorld {


    public static void main(String args[]) {
       
        Map<String,Integer> nameAgeMap = new HashMap<String,Integer>();
        AgeComparator ageComparator =  new AgeComparator(nameAgeMap);
        Map<String,Integer> sortedNameAgeMap = new TreeMap<String,Integer>(ageComparator);

        nameAgeMap.put("Name1",21);
        nameAgeMap.put("Name2",16);
        nameAgeMap.put("Name3",10);
        nameAgeMap.put("Name4",23);

        System.out.println("Original Map : "+ nameAgeMap);
       
        sortedNameAgeMap.putAll(nameAgeMap);

        System.out.println("Sorted Name Age Map : "+ sortedNameAgeMap);
       
    }

}

class AgeComparator implements Comparator<String> {
   
    Map<String,Integer> mapToBeCompared;
   
    public AgeComparator(Map<String,Integer> mapToBeCompared){
        this.mapToBeCompared = mapToBeCompared;
    }

    @Override
    public int compare(String o1, String o2) {
        Integer x = mapToBeCompared.get(o1);
        Integer y = mapToBeCompared.get(o2);
        if(x >= y) {
            return 1;
        }
        else {
            return -1;
        }
    }
   
}

Output :

The output is as expected

Original Map : {Name4=23, Name3=10, Name2=16, Name1=21}
Sorted Name Age Map : {Name3=10, Name2=16, Name1=21, Name4=23}

Note

Above comparator imposes orderings that are inconsistent with equals.
Do not return 0 from the comparator as it will merge the keys. Given that all keys in a Map are unique. Duplicate key entry over writes the old entry.
If you have your own class instead of Integer in the map then it makes sense for your class to implement comparable interface and use compareTo() directly in compare() method.
Collection APIs provide sort method that takes comparator as an argument. You can use that for the sorting criteria that suits your requirement.

To summarize if you want to sort objects based on natural order then use Comparable in Java and if you want to sort on some other attribute of object then use Comparator in Java.

Using Comparator on Entry

You can create a comparator on Entry which is datastructure that HashMap internally uses -

public class SortMapByValue
{
    public static boolean ASC = true;
    public static boolean DESC = false;

    public static void main(String[] args)
    {

        // Creating dummy unsorted map
        Map<String, Integer> unsortMap = new HashMap<String, Integer>();
        unsortMap.put("B", 55);
        unsortMap.put("A", 80);
        unsortMap.put("D", 20);
        unsortMap.put("C", 70);

        System.out.println("Before sorting......");
        printMap(unsortMap);

        System.out.println("After sorting ascending order......");
        Map<String, Integer> sortedMapAsc = sortByComparator(unsortMap, ASC);
        printMap(sortedMapAsc);


        System.out.println("After sorting descindeng order......");
        Map<String, Integer> sortedMapDesc = sortByComparator(unsortMap, DESC);
        printMap(sortedMapDesc);

    }

    private static Map<String, Integer> sortByComparator(Map<String, Integer> unsortMap, final boolean order)
    {

        List<Entry<String, Integer>> list = new LinkedList<Entry<String, Integer>>(unsortMap.entrySet());

        // Sorting the list based on values
        Collections.sort(list, new Comparator<Entry<String, Integer>>()
        {
            public int compare(Entry<String, Integer> o1,
                    Entry<String, Integer> o2)
            {
                if (order)
                {
                    return o1.getValue().compareTo(o2.getValue());
                }
                else
                {
                    return o2.getValue().compareTo(o1.getValue());

                }
            }
        });

        // Maintaining insertion order with the help of LinkedList
        Map<String, Integer> sortedMap = new LinkedHashMap<String, Integer>();
        for (Entry<String, Integer> entry : list)
        {
            sortedMap.put(entry.getKey(), entry.getValue());
        }

        return sortedMap;
    }

    public static void printMap(Map<String, Integer> map)
    {
        for (Entry<String, Integer> entry : map.entrySet())
        {
            System.out.println("Key : " + entry.getKey() + " Value : "+ entry.getValue());
        }
    }
}

This will sort your HashMap.

Important Links

Saturday, 18 May 2013

Hashtable example in Java.

In last post we saw difference between a HashMap and a Hashtable. Let us now see what Hashtable is and how do we use it.

If you see the source code implementation of Hashtable it is similar to that of map. Infact Hashtable implements Map interface.As of the Java 2 platform v1.2, this class was retrofitted to implement the Map interface, making it a member of the Java Collections Framework. For more information refer to Hashtable JavaDocs.

public class Hashtable<K,V>
extends Dictionary<K,V>
implements Map<K,V>, Cloneable, java.io.Serializable {...

Lets see how can we use Hashtable.

Code :

package testCodes;

import java.util.Enumeration;
import java.util.Hashtable;

public class HashtableDemo {
    
    public static void main(String args[])
    {
        Hashtable<String, String> hashTable = new Hashtable<String, String>();
        //Populating Hashtable
        hashTable.put("Name", "John");
        hashTable.put("Country", "USA");
        hashTable.put("Gender", "Male");
        //Printing whole table
        System.out.println("hashTable : " + hashTable);
        //printing individual values
        Enumeration<String> data = hashTable.keys();
        while(data.hasMoreElements())
        {
            String str = (String) data.nextElement();
            System.out.println(str + " : " + hashTable.get(str));
        }
    }    
}

Output :

Understanding the code :

Code is very much similar to HashMap usage but has slight variations. Note Hashtable is a class and not an Interface. put() and get() method remain the same as that of HashMap. What is different is the concept of Enumeration. As specified earlier when Hashtable was introduced there was no Collections and hence no iterators. What we had to use was Enumeration. Enumeration is similar to Iterator. We check whether it has more elements using hasMoreElements() function, if yes access the next element using nextElement() function. Note the use of generics remain the same.

Few very common differences between a HashMap and a Hashtable are -

Hashtable is synchronized where as HashMap is not.
Hashtable does not allow null keys or values whereas HashMap allows one null key and any number of null values.
keys() method of Hashtable return Enumeration where as keySet() method of HashTable returns a Set which can be iterated over as it implements Iterable interface - Enumeration does not.

Sunday, 12 May 2013

How to iterate over a Map in Java?

Just to revise basic of Map go through it demo code. We will use the same piece of code and iterate over it.Purpose of this post is to demonstrate how can we iterate over a Map. We know Map interface does not implement Collection interface and hence we cannot just say someMap.iterator() . To revise Iterators refer to the post on understanding Iterators.Lets see then how do we iterate over a Map.

Java Code :

package testCodes;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class MapDemo {

    public static void main(String args[]) {
        Map<String, String> persononalData = new HashMap<String, String>();
        // Populating Map with data
        persononalData.put("Name", "John");
        persononalData.put("Country", "USA");
        persononalData.put("Gender", "Male");
        persononalData.put("Age", "18");
        Iterator<Map.Entry<String, String>> mapIterator = persononalData
                .entrySet().iterator();
        while (mapIterator.hasNext()) {
            Map.Entry<String, String> keyValuPair = mapIterator.next();
            System.out.println("Key is " + keyValuPair.getKey());
            System.out.println("Value is " + keyValuPair.getValue());
        }
    }
}

output :

Understanding the code :

As Map is not a Collection we cannot use .iterator() function on it. What we do is we create a set of Entry objects and iterate over it. Internally key-value pair are stores in Entry objects in the Map. We then get iterator of this set and iterate over it. Entry object has key and values stores in it and we can extract them using .getKey() and .getValue() methods.

Smarter Way :

        for (Map.Entry<String, String> keyValuePairs : persononalData
                .entrySet()) {
            System.out.println("Key is " + keyValuePairs.getKey()
                    + "and value is " + keyValuePairs.getValue());
        }

This is a bit smarter way to do the same iteration. Internally it does the same thing i.e use a iterator and iterate over the Entry set.This may look a bit complex for beginners. So method mentioned above is recommended.

What if I only wish to iterate over keys or just values?

Yes there is a way to specifically iterate over keys and values in Map. The .keySet() function will return a set of keys where as .values() function will return Collection of values.Note once you have the Set you can get it's iterator and iterate over it's values.

Iterate over keys :

        for (String key : persononalData.keySet()) {
           System.out.println("Key is " + key);
       }

Output :

Key is Name
Key is Age
Key is Gender
Key is Country

Iterate over Values :

        for (String value : persononalData.values()) {
            System.out.println("Values is " + value);
        }

Output :

Values is John
Values is 18
Values is Male
Values is USA

Note : Keys in a Map are unique and hence .keySet() will return a set. Two different keys can have same value and hence values can't form a set(Values in a set are unique) and hence what you get from .values() is a Collection.

Simple Map example in Java.

We have learned enough theory on Collections. We also saw a working demo code for ArrayList. In last post we saw why Map interface does not implement Collection framework and in the coming posts we will see how to iterate over the data in Map. But for now lets see a demo code of Map.

Java Code demonstrating Map usage

Code :

package testCodes;

import java.util.HashMap;
import java.util.Map;

public class MapDemo {

    public static void main(String args[]) {
        Map<String, String> persononalData = new HashMap<String, String>();
        // Populating Map with data
        persononalData.put("Name", "John");
        persononalData.put("Country", "USA");
        persononalData.put("Gender", "Male");
        persononalData.put("Age", "18");
        // Printing Map
        System.out.println("Map is " + persononalData);
        // removing data from Map
        persononalData.remove("Gender");
        // Printing Map
        System.out.println("Map is " + persononalData);
        if (Integer.valueOf(persononalData.get("Age")) < 18) {
            System.out.println("You are a minor");
        } else {
            System.out.println("You are a Major");
        }
    }
}

Output :

Understanding the Code :

We create a Map named persononalData to store personal information of some person named John. Note the generics used. Both key-value are of type String. At a later point of time if you try to put another data type(say int) then that is not allowed. You must always decide on the data structure before you start writing you actual code.In this case I am going to store everything as String.

We then start populating data. We add Name, Country, Age and Gender.Note the function is put(key,value) unlike add(element) in Collections.We then print the Map. Note when we say System.out.println(someMap) internally toString() method is invoked on the corresponding object and is printed to standard output.

Next we remove an element using .remove(key) method.Again we print out the Map to see the difference. Check out the output screen shot provided above.

Note : All keys in the Map must be Unique i.e you cannot store two different values for the same key. Lets say you do someMap.put("Name","John") and then again you say someMap.put("Name","Sam") then Sam will overwrite John and when you extract data by get("Name") you will get Sam and not John.

Next we extract age of the person using .get("Age") method. Note this will return you a String. But for comparison purpose you need an integer and hence we convert String to an int using Integer.ValueOf(SomeString) method.In our case age is 18 and hence age<18 will evaluate to be false and else part will get executed. Hence the output shown.

Hope basic operations in Map are clear. In next post we will see how to iterate over Map to manipulate it's data.

Why Map is not a true Collection?

We saw what are Collections in Java. If you recall the diagram from that post Map does not form a part of Collection though it comes under Collection framework. In more technical terms Map interface does not implement Collection interface. Knowing this the question that naturally comes in mind is why so? Why can't Map be a part of Collection frame work? Let us see the explanation.

Reason for Map interface not extending Collection interface

If you look at the respective data structure you can easily guess why Map is not a part of Collection. Each Collection stores a single value where as a Map stores key-value pair. So methods in Collection interface are incompatible for Map interface.For example in Collection we have add(Object o). What would be such implementation in Map. It doesn't make sense to have such a method in Map. Instead we have a put(key,value) method in Map.
Same argument goes for addAll(), remove(), removeAll() methods. So the main reason is the difference in the way data is stored in Map and Collections.
Also if you recall Collection interface implemented Iterable interface i.e any interface with .iterator() method should return an iterator which must allow us to iterate over the values stored in the Collection. Now what would such method return for a Map? Key iterator or a Value iterator? This does not make sense either.

There are ways in which we can iterate over keys and values stores in a Map and that is how it is a part of Collection framework.Do not get confuse here. Let me repeat this - Map is a part of Collection framework but it does not implement Collection interface. We will see in next few posts how Map works internally(that's really an interesting topic) and also how we manipulate data in it.

Just to recall the Collection framework refer to the following diagram -

Saturday, 11 May 2013

Handling null in a Collection.

One of the general problem programmers land into is the NullPointerException. This post will mainly focus on how do we handle this null value. Specially in case of Collections what is the difference between and empty collection, a null Collection and a Collection containing a null value.

Before we see how to handle null value remember that member(instance) variables are assigned default values if they are not initialized where as for local variable you need to explicitly initialize them before using them.So when we define a reference type as an instance variable it is by default set to null.Complete set of default values assigned to instance variables are as follows -

Good Practice

Important point to note while comparing String literals is always put the literal to the L.H.S(Left hand side) of the == or equals() operator. For example -
        String typeOfOS;
       if("linux".equalsIgnoreCase(typeOfOS))
       {
           System.out.println("Very good choise of OS");
       }
Why so you might ask? Reason is simple if your variable typeOfOS is null it will throw a java.lang.NullPointerException.

Handling null in Collections

It is important you understand the difference between a reference being null, Collection being empty and Collection containing null as an element.

Reference being null

List<String> operatingSystems = null;
if(null == operatingSystems)
{
System.out.println("operatingSystems reference has not been initialized yet");
}

Explanation : No memory is allocated on heap in this case. This is just a reference that we have declared. We can check it for null as shown above.
Collection being empty

        List<String> operatingSystem = new ArrayList<String>();
        if(operatingSystem.isEmpty())
        {
            System.out.println("List is empty");
        }

Explanation : Here we create an object of List. Memory is allocated on the heap but there is no element in the List i.e the List is empty.You can check if Collection is empty or not by using .isEmpty() function.
Collections containing null as an element

Ground Work : We know List can have duplicate elements. This means a list can have multiple null values stored in it. Also in a Set duplicates are not allowed and hence only one null value is allowed. Never the less null is an acceptable value as an element in Collection. There are exceptions like EnumSet where null is not allowed but thats a rare usage.

        List<String> operatingSystem = new ArrayList<String>();
        operatingSystem.add(null);
        System.out.println("List is " + operatingSystem);
        if(operatingSystem.isEmpty())
        {
            System.out.println("List is empty");
        }
        else
        {
            System.out.println("List is not empty");
        }

Output :

Explanation : List is clearly not empty as we have added an element(null) to it. Whole point being null can be an element stores in a Collection.If you wish to remove null from the List you can simple say .remove(null).

Saturday, 4 May 2013

Why and when can we encounter ConcurrentModificationException in Java.

Background

If you are a beginner level Java developer and experimenting with Collections you may encounter this exception. Even I was taken with surprise when i first encountered such Exception.Let see why and when do we get such an Exception.
Before we proceed take a look at the Iterator Example code in the post on Understanding Iterators. We will refer to a lot of things from that post.

Now if you recall the Iterator interface it was like -

public interface Iterator<E>

{
    boolean hasNext();
    E next();
    void remove();
}

Iterator object has a remove function but generally beginners tend to forget that. See the following code when you will encounter ConcurrentModificationException.

Example Code

package testCodes;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class IteratorDemo {

    public static void main(String[] args) {

        List<String> nameList = new ArrayList<String>();
        nameList.add("John");
        nameList.add("Sam");
        nameList.add("kenny");

        nameList.add("Jessica");

        Iterator<String> namesIterator = nameList.iterator();
        System.out.println("Printing names");
        while (namesIterator.hasNext()) {
            String name = namesIterator.next();
            if (name.equals("Sam")) {
                nameList.remove(name);
            }
            System.out.println(name);
        }
    }
}

Output

Explanation

Why did we get this Exception. Reason is simple. When we Iterate using an Iterator, Iterator objects expects there should be no modification in the Collections which it points to. So how do we remove one might ask. For this reason we have .remove() method in Iterator interface. You can do something like below -

while (namesIterator.hasNext()) {
            String name = namesIterator.next();
            if (name.equals("Sam")) {
                namesIterator.remove();
            }
            System.out.println(name);
     }

Code will correctly compile and run. It's a very simple point but programmers tend to forget it some time.

So basically you should not modify the collection when it is being iterated upon. Another simple example that would throw this exception is

        Map<String, Object> myMap = new HashMap<String, Object>();
        myMap.put("Aniket", 1);
        myMap.put("Abhijit", 2);
        for(String key: myMap.keySet()) {
            myMap.remove(key);
        }

Solution as I mentioned above is to use remove() method of the iterator.Another option could be use concurrent collections. See following example -

        Map<String, Object> myMap = new ConcurrentHashMap<String, Object>();
        myMap.put("Aniket", 1);
        myMap.put("Abhijit", 2);
        for(String key: myMap.keySet()) {
            myMap.remove(key);
        }

What internally happens is as soon as data is updated in a collection the corresponding iterator is updated too which generally does not happen in normal collections.

Collections Example(ArrayList)

Before we dive further into Collections let me take an example to show how Collections actually work.Lets take an List example as it is the most simple to understand. You can then try similar operations on Set and Queue. Not the functions remain the same for all as all three implement Collection interface. Only their implementations will vary.

Code

package testCodes;

import java.util.ArrayList;
import java.util.List;

public class ListDemo {

    public static void main(String[] args) {

        List<String> nameList = new ArrayList<String>();
        nameList.add("John");
        nameList.add("Sam");
        System.out.println(nameList);
        nameList.add(0, "kenny");
        System.out.println("Size of list is " + nameList.size());
        System.out.println(nameList);
        nameList.remove("Sam");
        System.out.println(nameList);
        nameList.remove(0);
        System.out.println(nameList);
        System.out.println("Size of list is " + nameList.size());
    }
}

Output

Explanation

First we create a List name nameList and then populate it with some values.Not the use of polymorphism(Superclass reference i.e List to subclass object i.e ArrayList). You can add values in any Collection using .add() method.Observer the .add(0, "kenny") method. You can specify index as the 1st argument. Since i have put 0 "Kenny" is added to the front of the list(index 0). .remove() is also similar. Either you can specify an object to be removed as an argument or index at which object is to be removed. You can print the size of collection using .size() method. There are various other methods in Collection that we will learn in time but for now these should suffice. You can now play around with other Collections like Set and Queue.

.clear() is another simple method to try. It will simple remove all emenets from the Collection. After this if you do .size() you will get 0.

Tuesday, 30 April 2013

Collection in Java

What is Collection?

Collection if we go by name means group of some entities. From a bit more technical perspective it means an object that groups multiple elements into single unit.Collections are used to store, retrieve, manipulate and communicate aggregate data.

Collection framework in Java

A collection framework is a unified architecture for representing and manipulating collections.All collection frameworks contain the following -

Interfaces
Implementations
Algorithms

All of the collection framework is represented by the following picture -

Each of the above will be taken individually and explained but lets take one example to understand Collection frameworks.

First thing to note is that Collection interface is the top most entity in the entire Collection tree.Yes it is an interface and all other entities in Collection(List,Set,Queue) implement this Collection interface.Collection interface also implements an interface called Iterable which allows us to iterate over the elements stored in the collection but do not worry about it for now.We will explain it separately.
If you see above picture List is a Collection.Note again that List is still an interface and multiple implementations of it are possible. Example we can have ArrayList, LinkedList, Stack, Vector etc. Each concrete implementations just mentioned will have different Algorithms associated with them.

Note : Map interface and it's concrete implementations do not form a part of Collections i.e Map interface does not implement Collection yet it is a part of Java Collection framework. Also since Map interface does not extend Collection interface(which in turn implements Iterable interface) Map cannot be directly iterated over it's elements. There is another way of-course but we shall see it later.

Friday, 13 February 2015

Background

Overview

Understanding how HashMap works

How is data with keys generating same hash code stored in HashMap?

So how is data retrieved if two keys generate same hash code?

How data is stored in HashMap

A good hash function

HashMap changes in Java8

Understanding HashSet

Understanding LinkedHashMap

Understanding TreeMap

Related Links

Saturday, 12 July 2014

Goal

Code :

Output :

Note

Using Comparator on Entry

Important Links

Saturday, 18 May 2013

Code :

Output :

Understanding the code :

Related Links

Sunday, 12 May 2013

Java Code :

output :

Understanding the code :

Smarter Way :

What if I only wish to iterate over keys or just values?

Iterate over keys :

Output :

Iterate over Values :

Output :

Related Links

Java Code demonstrating Map usage

Code :

Output :

Understanding the Code :

Related Links

Reason for Map interface not extending Collection interface

Saturday, 11 May 2013

Good Practice

Handling null in Collections

Saturday, 4 May 2013

Background

Example Code

Output

Explanation

Related Links

Code

Output

Explanation

Tuesday, 30 April 2013

What is Collection?

Collection framework in Java

Collections summed up!