Sunday, 9 July 2017

How to install wine and run windows programs on your mac

Background

Sometimes it becomes necessary to install a Windows program on your Linux or Mac machine. Like I mentioned in a post sometime back, there may be some sites that work with IE only.
In this post I will show you how to install Wine on your Mac. Wine is a very handy piece of software that allows you to install and run Windows programs in a Windows-like environment.


Installing Wine on Mac

You need to have Homebrew installed on your Mac. If you don't, please refer to my earlier post on installing it.
Homebrew uses an extension called Homebrew Cask to install other programs. You can install the Cask extension by running the following command -
  • brew tap caskroom/cask


Wine needs the following dependencies to be installed already -
  • Java and 
  • XQuartz 
I am assuming you already have Java installed and set up on your machine. You can install XQuartz with the following command -
  • brew cask install xquartz


NOTE : You can similarly install Java if you don't already have it -
  • brew cask install java
Once the dependencies are in place you can install Wine with the following command -
  • brew install wine


Also install winetricks -
  • brew install winetricks

Use winetricks to set the environment to Windows 7 -
  • winetricks win7

Installing and running Windows program from wine

Go to the directory where you have downloaded your exe file and run -
  • wine installer.exe
where installer.exe is the name of your exe file.

You can find the installed files in this directory -
  • /Users/athakur/.wine/drive_c
You can then navigate to Program Files, find your installed program, and run it -


 Once in the program directory you can simply run it as -
  • wine ioexplorer.exe


And you are done :)

Related Links

How to check if a Singly Linked List is a Palindrome or not

Background

This is another classic data structure interview question that falls into the basic DS problems category. You probably already know how to find out if a String is a palindrome or not: you simply iterate over half of the String and check each character against the corresponding character in the reversed other half.

Time Complexity : O(N)
Space Complexity : O(1)

It will be as simple as -

    public static boolean isPalindrome(String str) {
        if(str == null) {
            return false;
        }
        for(int i = 0; i< str.length()/2; i++) {
            if(!(str.charAt(i) == str.charAt(str.length() - i - 1))) {
                return false;
            }
        }
        return true;
    }

Test :
        System.out.println(isPalindrome("ABCDCBA"));
        System.out.println(isPalindrome("ABBA"));
        System.out.println(isPalindrome("ABCD"));
Output:
true
true
false

It can have a variant such that instead of a String you have a Linked List. Now if you have a doubly linked list it becomes very easy. You start from the head and from the tail and keep comparing, incrementing the head pointer and decrementing the tail pointer in each iteration. Time complexity will still be O(N). A small sketch of this doubly linked list check is shown below.
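Just to illustrate the doubly linked list variant, here is a minimal sketch; the DoublyListNode class and its value/prev/next fields are hypothetical names used only for this example -

    class DoublyListNode {
        String value;
        DoublyListNode prev;
        DoublyListNode next;
    }

    public static boolean isPalindrome(DoublyListNode head, DoublyListNode tail) {
        // Compare values from both ends and move the pointers towards the middle
        while (head != tail && head.prev != tail) {
            if (!head.value.equals(tail.value)) {
                return false;
            }
            head = head.next;
            tail = tail.prev;
        }
        return true;
    }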

However the question at hand is of Singly Linked List.

How to check if a Singly Linked List is a Palindrome or not

 

Method 1 : Using a String

 

Iterate over the Linked List, construct a String out of it, and then check if that String is a palindrome. Time complexity is O(N) but space complexity is also O(N) since you are now creating a String.

Since the interviewer asked you about a Linked List, this is most definitely not what they want - they could have asked about a String palindrome directly if that was the case. But it never hurts to state what you are thinking and build upon your answer as you proceed.

 

Method 2 : Using a Stack

 

You can iterate over the Linked List and put its contents in a stack. Once that iteration is over we can iterate over the Linked List again and, in each iteration, compare the node's content with the content popped from the stack. If anything does not match it is not a palindrome.
This again has time complexity O(N) and space complexity O(N).

1) Traverse the given list from head to tail and push every visited node to stack.
2) Traverse the list again. For every visited node, pop a node from stack and compare data of popped node with currently visited node.
3) If all nodes matched, then return true, else false.

Code :

    // ListNode is the custom singly linked list node class (see the github repo below);
    // Stack is java.util.Stack
    public static boolean isPalindrome(ListNode<String> head) {
        boolean isPalindrome = true;

        Stack<String> stack = new Stack<>();
        ListNode<String> currentNode = head;

        // 1st pass - push every node's value onto the stack
        while (currentNode != null) {
            stack.push(currentNode.getValue());
            currentNode = currentNode.getNext();
        }

        // 2nd pass - compare each node's value with the values popped from the stack
        currentNode = head;
        while (currentNode != null) {
            if (!currentNode.getValue().equals(stack.pop())) {
                isPalindrome = false;
                break;
            }
            currentNode = currentNode.getNext();
        }
        return isPalindrome;
    }

I have also added it to my Data Structure github repo. Check isPalindrome() method in  https://github.com/aniket91/DataStructures/blob/master/src/com/osfg/questions/LinkedListPalindromeFinder.java 

 

Method 3 : Reversing the 2nd half of the Linked List

 

This is a better version and the one you should always aim for. It provides O(N) time complexity and O(1) space complexity -

1) Get the middle of the linked list.
2) Reverse the second half of the linked list.
3) Check if the first half and second half are identical.
4) Construct the original linked list by reversing the second half again and attaching it back to the first half

The 4th point is optional and depends on whether you need the original List back. A rough sketch of this approach is shown below.
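Here is a sketch of the reverse-the-second-half approach, assuming the same ListNode<String> type used in Method 2 with getValue()/getNext() and a setNext() mutator; treat it as an outline rather than the exact repo implementation -

    public static boolean isPalindrome2(ListNode<String> head) {
        if (head == null || head.getNext() == null) {
            return true;
        }
        // 1) Find the middle of the list using slow/fast pointers
        ListNode<String> slow = head;
        ListNode<String> fast = head;
        while (fast.getNext() != null && fast.getNext().getNext() != null) {
            slow = slow.getNext();
            fast = fast.getNext().getNext();
        }
        // 2) Reverse the second half (the part after the middle node)
        ListNode<String> prev = null;
        ListNode<String> current = slow.getNext();
        while (current != null) {
            ListNode<String> next = current.getNext();
            current.setNext(prev);
            prev = current;
            current = next;
        }
        // 3) Compare the first half with the reversed second half
        ListNode<String> first = head;
        ListNode<String> second = prev;
        boolean palindrome = true;
        while (second != null) {
            if (!first.getValue().equals(second.getValue())) {
                palindrome = false;
                break;
            }
            first = first.getNext();
            second = second.getNext();
        }
        // 4) (Optional) reverse the second half again and re-attach it to restore the list
        return palindrome;
    }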

I have added it to my Data Structure github repo. Check isPalindrome2() method in  https://github.com/aniket91/DataStructures/blob/master/src/com/osfg/questions/LinkedListPalindromeFinder.java  

 

Related Links


How to display or hide line numbers in vi or vim text editor

Background

Line numbers come in very handy when you are working with any text editor. However, if you use the vi or vim editor, line numbers are hidden by default. In this post we will see how we can turn them on.


How to display or hide line numbers in vi or vim text editor

This post assumes you are a vim user and are aware of the basic usage. For example, to save and quit in vim you -
  • press escape followed by :wq
or if you want to exit without saving
  • :q!
vi supports a lot of commands and options that can be set using the colon (:).

To show line numbers in vi, simply type one of the following after pressing escape -
  • :set number OR
  • :set nu
To hide them you can type -
  • :set nu! OR
  • :set nonumber





However, you need to do this every time you launch the vi editor. To make it the default, edit the file at ~/.vimrc and append the following line at the end -

  • set number
NOTE : You can create this file if it does not exist already.

Now you will always get line numbers when you launch vim editor.




Now that we have seen how to hide and display line numbers in the vim editor, let's see how we can jump to a particular line in vim -

How to jump to a particular line in vim

This is also fairly simple. Once you have launched vim you can simply move to any line number using the following command -
  • : linenumber
Eg.
  • :6


You can even jump directly to a line number as you open vim. To do that use the following command while opening vim -
  • vi +linenumber filename
Eg.
  •  vi +6 test.txt
This should open your file and move to the line number you have provided.


General Info

vim provides a lot of configurable options to set. To see them all, type the following command -
  • :set all

To see everything that you have set so far you can type the following command -
  • :set
For me it is as follows -



Related Links

Wednesday, 28 June 2017

How to access websites on your Mac that requires Internet Explorer

Background

There are certain websites that can be accessed from Internet Explorer only. This happens because of the website's compatibility with IE. But this will not work on your Mac laptop or Linux machine since you cannot run IE on it - at least not in the traditional way. You can always install software like Wine and then run your Windows application in that simulated environment. But there is a much simpler way.

How to access websites on your Mac that requires Internet Explorer

I will take the Safari browser on a Mac as an example.
  • Open Safari browser and open preferences from menu bar at the top.
  • Once opened go to "Advanced Tab"
  • In "Advanced Tab" select the "Show Develop menu in menu bar" check box.

  • Once done you should be able to see "Develop" menu in menubar on top. 
  • Under "Develop" menu you can select "User Agent" and then select the user agent you want. For eg - "Internet Explorer 7"

  • Once you select that your IE compatible page should load fine.

On other operating systems and browsers - Linux/Chrome/Firefox

The above approach was specific to Safari but the solution remains the same - you need to change the user agent. There are browser plugins available to do so -


Similarly, you can find such a plugin in the Chrome Web Store.

Manual way

    FIREFOX 4.0 :  In Firefox type in the URL address bar: about:config. A page will appear with a warning about the use of the config. Click on the button saying you will be careful. In the search bar type agent and look for the variable general.useragent.override. Double click on it and overwrite its value with one of the following (for the default leave the value EMPTY):

    IE6 - Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
    IE7 - Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0)
    IE8 - Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)


    CHROME :  Chrome has an about page to CHECK if you have changed your User Agent (about:) and other pages like about:labs, about:memory, about:hang, about:plugins and many others that, depending on your version, may or may not be available. But for the question at hand this option is not yet in any of the about pages I have found. To set it manually in Chrome you need to start Chrome with the user-agent option. For example google-chrome --user-agent="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" will open Chrome as if it were IE6. The IE User Agents are the ones from the Firefox option above.

    OPERA : In Opera type in the URL address bar: about:config. A list will appear with a search box at the top. Type user agent in the search box. The option for User Agent will appear below it. Click on it and, depending on the browser you want, you have several options that change depending on the version. For example:

    1 - Opera (this is the default user agent string used by Opera)
    2 - Mozilla (With the Opera String in it)
    3 - Internet Explorer (With the Opera String in it)
    4 - Mozilla (Without the Opera String in it. 100% Mozilla)
    5 - Internet Explorer (Without the Opera String in it. 100% IE)


    But these values could change so you would need to test each one to know what User Agent value it has.

Related Links

Saturday, 17 June 2017

Make a bootable flash drive from an ISO image on Ubuntu

Background

There are many times when you download an ISO file from the internet and want to install it on your machine - for e.g. a Windows or Ubuntu image. What you really want is to flash that ISO image onto a bootable drive like a USB drive or a CD and install it from there on your machine.

 Make a bootable flash drive from an ISO image on Ubuntu

You have following options -
  1. Etcher - is a free and open-source image burner with support for Windows, OS X and GNU/Linux.
  2. Easy2Boot - Flexible and configurable USB drive multiboot solution which also supports UEFI booting.
  3. LiveUSB Install - is free software for GNU/Linux and Windows. With LiveUSB Install you can effortlessly install various Linux distros.
  4. Multisystem - is an awesome tool created by LiveUSB.info that works similar to the Windows based MultiBootISOs USB creator.
  5. WinUSB - is a simple tool that enables you to create your own USB stick Windows installer from an ISO image or a real DVD.
  6. Unetbootin - UNetbootin allows you to create bootable Live USB drives for Ubuntu and other Linux distributions without burning a CD.
 Let's see some of the options -

Unetbootin

To install it, run the following command in your terminal -
  • sudo apt-get install unetbootin


Once installed you can open the app. Now you can either select one of the given Linux distros (those will get downloaded automatically) or provide it an ISO file from your local machine.



WinUsb

To install WinUSB run the following commands in your Linux terminal -
  • sudo add-apt-repository ppa:colingille/freshlight
  • sudo apt-get update
  • sudo apt-get install winusb
Once installed you can open up the app, select the ISO file and target USB, and start creating a bootable USB drive for Windows.



NOTE : If you are looking for rufus then it is for Windows only. You cannot use it on your Linux machine.

Counting Semaphore, CountDownLatch, CyclicBarrier - synchronization methods for concurrency

Background

With Java 5 a lot of concurrency mechanisms were introduced for synchronization.



In one of the previous posts we saw what Reentrant locks are and how they help us achieve concurrency. We also saw ExecutorService.
Along with ReentrantLock and ExecutorService, other concurrency elements were introduced in Java 5, like -
  • Counting Semaphore
  • CountDownLatch
  • CyclicBarrier
Today we will try to understand these. Not only does understanding them help us with multi-threading, they are also a popular Java interview topic. Let's see them one by one.

Counting Semaphore

A Semaphore maintains a number of permits for a resource and only that many threads can access the resource at a time. If the maximum number of permits is reached then threads will have to wait till some other thread owning a permit releases it.

As an example let's consider a simple semaphore with 1 permit, called a binary semaphore. It's similar to wait and notify on the same object.

    public static void main(String args[])
    {
        Semaphore binarySemaphore = new Semaphore(1);
        new Thread(() -> {
            try {
                binarySemaphore.acquire();
                System.out.println("Semaphore permit acquired by : " + Thread.currentThread().getName());
                
            } catch (Exception e) {
                e.printStackTrace();
            }
            finally {
                System.out.println("Semaphore permit getting released by : " + Thread.currentThread().getName());
                binarySemaphore.release();
            }
            
        }).start();
        new Thread(() -> {
            try {
                binarySemaphore.acquire();
                System.out.println("Semaphore permit acquired by : " + Thread.currentThread().getName());
                
            } catch (Exception e) {
                e.printStackTrace();
            }
            finally {
                System.out.println("Semaphore permit getting released by : " + Thread.currentThread().getName());
                binarySemaphore.release();
            }
            
        }).start();
    }
and the output would be -
Semaphore permit acquired by : Thread-0
Semaphore permit getting released by : Thread-0
Semaphore permit acquired by : Thread-1
Semaphore permit getting released by : Thread-1


As you can see from the code above, you acquire a permit using the acquire() method and release it using the release() method.


NOTES :
  1. You can also acquire permit using acquireUninterruptibly(). This is a blocking call and the thread cannot be interrupted.
  2. acquire() is also a blocking call, however it can be interrupted, unlike the acquireUninterruptibly() call.
  3. You can also use the tryAcquire() call, which will try to acquire the permit and, if one is available, will return immediately with true. If it is not available it will also return immediately with false. So this is a non-blocking call. A small sketch of these acquire variants follows below.
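Just to illustrate these acquire variants, here is a minimal sketch; tryAcquire(timeout, unit) is the timed flavor of tryAcquire() and the timeout value below is only illustrative -

import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class SemaphoreAcquireVariants {
    public static void main(String args[]) throws InterruptedException {
        Semaphore semaphore = new Semaphore(1);

        // Non-blocking attempt - returns immediately with true/false
        if (semaphore.tryAcquire()) {
            try {
                System.out.println("Got the permit without blocking");
            } finally {
                semaphore.release();
            }
        }

        // Timed attempt - waits up to 2 seconds for a permit
        if (semaphore.tryAcquire(2, TimeUnit.SECONDS)) {
            try {
                System.out.println("Got the permit within the timeout");
            } finally {
                semaphore.release();
            }
        }

        // Blocking attempt that ignores interruption
        semaphore.acquireUninterruptibly();
        try {
            System.out.println("Got the permit (uninterruptibly)");
        } finally {
            semaphore.release();
        }
    }
}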

CountDownLatch

This is another synchronization mechanism in which a resource is not accessed until a predefined number of threads complete their operations. So let's say there are 10 threads each making a bread slice. As soon as we are ready with 5 slices we can, let's say, pack them together for selling. In this case we can use a CountDownLatch. Initialize one with 5, and as soon as 5 threads acknowledge they have finished making slices we can start packing (probably in a new thread).

So a thread will wait for n other threads. Let's see an example -

    public static void main(String args[])
    {
        CountDownLatch countDownLatch = new CountDownLatch(2);
        new Thread(() -> {
            try {
                Thread.sleep(2000);
                System.out.println("Calling countdown by : " + Thread.currentThread().getName());
                countDownLatch.countDown();
            } catch (Exception e) {
                e.printStackTrace();
            }
           
        }).start();
        new Thread(() -> {
            try {
                Thread.sleep(2000);
                System.out.println("Calling countdown by : " + Thread.currentThread().getName());
                countDownLatch.countDown();
            } catch (Exception e) {
                e.printStackTrace();
            }
           
        }).start();
       
        try {
            System.out.println("Waiting for all other threads finish operation");
            countDownLatch.await();
            System.out.println("All other threads finish operation!");
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }


Output :
Waiting for all other threads finish operation
Calling countdown by : Thread-1
Calling countdown by : Thread-0
All other threads finish operation!
As you can see, the main thread calls await() on the CountDownLatch and waits for 2 threads to call countDown() on it. Here n is 2 but you can configure it in the constructor.

You need to use this when your use case is to wait for some other initial operations to finish before starting some other operation.

NOTE :
  1. CountDownLatch is not reusable. So once the count reaches 0, i.e. n threads have called countDown(), the latch is unusable.

CyclicBarrier

CyclicBarrier is yet another synchronization mechanism. In this, all n threads wait for each other to reach the barrier. Such waiting threads are called parties. The number of parties is set in the CyclicBarrier during its creation. Each party reaches the barrier and calls await(), which is a blocking call. Once all parties reach the barrier, i.e. all have called await(), all threads get unblocked and proceed with their next task.

A simple use case you can think of is a multiplayer game scenario in which the game does not start until all the players have joined. Here the players are the parties whereas the game start is the barrier.

Eg -

    public static void main(String args[])
    {
        CyclicBarrier cyclicBarrier = new CyclicBarrier(3);
        new Thread(() -> {
            try {
                Thread.sleep(2000);
                System.out.println("Player joining : " + Thread.currentThread().getName());
                cyclicBarrier.await();
                System.out.println("Game starting from : " + Thread.currentThread().getName());
            } catch (Exception e) {
                e.printStackTrace();
            }
           
        }).start();
        new Thread(() -> {
            try {
                Thread.sleep(2000);
                System.out.println("Player joining : " + Thread.currentThread().getName());
                cyclicBarrier.await();
                System.out.println("Game starting from : " + Thread.currentThread().getName());
            } catch (Exception e) {
                e.printStackTrace();
            }
           
        }).start();
        new Thread(() -> {
            try {
                Thread.sleep(2000);
                System.out.println("Player joining : " + Thread.currentThread().getName());
                cyclicBarrier.await();
                System.out.println("Game starting from : " + Thread.currentThread().getName());
            } catch (Exception e) {
                e.printStackTrace();
            }
           
        }).start();
       
    }

and the output is -
Player joining : Thread-0
Player joining : Thread-1
Player joining : Thread-2
Game starting from : Thread-2
Game starting from : Thread-0
Game starting from : Thread-1
As you can see all threads (3 in above case) will wait for each other to reach the barrier. Once they all reach and call await() they can all proceed to their further tasks.
NOTE :
  1. cyclicBarrier.reset() will put the barrier back in its initial state; any thread currently waiting at the barrier will terminate with java.util.concurrent.BrokenBarrierException. So a CyclicBarrier can be reused, unlike a CountDownLatch. (A small sketch of the related barrier-action feature follows below.)
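One more handy feature, not used in the example above, is the two-argument constructor CyclicBarrier(int parties, Runnable barrierAction) - the barrier action runs once when the last party arrives, before the waiting threads are released. A minimal sketch along the lines of the game example -

import java.util.concurrent.CyclicBarrier;

public class GameStartBarrier {
    public static void main(String args[]) {
        // The barrier action is run once, by the last thread to arrive at the barrier
        CyclicBarrier cyclicBarrier = new CyclicBarrier(3,
                () -> System.out.println("All players joined. Starting the game!"));

        for (int i = 0; i < 3; i++) {
            new Thread(() -> {
                try {
                    System.out.println("Player joining : " + Thread.currentThread().getName());
                    cyclicBarrier.await();
                    System.out.println("Playing as : " + Thread.currentThread().getName());
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }).start();
        }
    }
}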

Summary

Semaphore : Manages a fixed sized pool of resources.
CountDownLatch : One or more threads wait for a set of threads to finish operations.
CyclicBarrier : Set of threads wait for each other until they  reach a specific point.

Related Links

Tuesday, 23 May 2017

Print binary tree in Spiral order

Background

Sometime back we had seen how to traverse a binary tree and print it. We saw -
  • Pre-order
  • Post-order
  • In-order
  • Level order traversals
Binary Tree Traversal

In this post we will see how to print a tree in spiral order. Consider the following tree -




We need to print the tree in following order - 1, 2, 3, 4, 5, 6, 7.




Code

The following recursive approach will help achieve this. The idea is to keep a boolean toggle parameter to print nodes either from left to right or from right to left.


    public static int getHeight(BTreeNode root) {
        if (root == null) {
            return 0;
        } else {
            int leftHeight = getHeight(root.getLeft());
            int rightHeight = getHeight(root.getRight());
            return leftHeight > rightHeight ? leftHeight + 1 : rightHeight + 1;
        }
    }


    /**
     * Prints the tree in spiral order.
     *
     * @param root root of the binary tree.
     *            Worst case time complexity - O(N^2) for skewed trees.
     *            No extra space used.
     */
    public static void printSpiralRecurssive(BTreeNode root) {
        boolean leftToRight = false;
        int height = getHeight(root);
        for (int i = 1; i <= height; i++) {
            printLevelRecurssive(root, i, leftToRight);
            leftToRight = !leftToRight;
        }
    }

    public static void printLevelRecurssive(BTreeNode root, int level, boolean leftToRight) {
        if (root == null) {
            // guard against missing children in non-perfect trees
            return;
        }
        if (level == 1) {
            System.out.println(root.getData());
        } else {
            if (leftToRight) {
                printLevelRecurssive(root.getLeft(), level - 1, leftToRight);
                printLevelRecurssive(root.getRight(), level - 1, leftToRight);
            } else {
                printLevelRecurssive(root.getRight(), level - 1, leftToRight);
                printLevelRecurssive(root.getLeft(), level - 1, leftToRight);
            }
        }
    }

Logic : We first calculate the height of the tree, which is basically the number of levels. We then iterate from 1 to height (i.e. over all levels) and print each level from left to right or right to left based on the boolean toggle. We flip this value after each level. For each recursive call we start from the root and go down until we reach the level we want (the one after the previously printed level) and print its nodes.

The complete solution with an example is provided under my github repo of Data Structures.
In the same link there is an iterative solution as well that takes O(N) extra space to give the same result. Iterative solution -

private static Stack<BTreeNode> leftToRight = new Stack<>();
private static Stack<BTreeNode> rightToLeft = new Stack<>();
public static void printSpiralIterative(BTreeNode root) {

        if (root == null) {
            return;
        }
        rightToLeft.push(root);

        // Keep alternating between the two stacks - one per direction
        while (!rightToLeft.isEmpty() || !leftToRight.isEmpty()) {
            while (!rightToLeft.isEmpty()) {
                BTreeNode node = rightToLeft.pop();
                System.out.println(node.getData());
                // push the right child first so the left child is popped first for the next level
                if (node.getRight() != null) {
                    leftToRight.push(node.getRight());
                }
                if (node.getLeft() != null) {
                    leftToRight.push(node.getLeft());
                }
            }

            while (!leftToRight.isEmpty()) {
                BTreeNode node = leftToRight.pop();
                System.out.println(node.getData());
                // push the left child first so the right child is popped first for the next level
                if (node.getLeft() != null) {
                    rightToLeft.push(node.getLeft());
                }
                if (node.getRight() != null) {
                    rightToLeft.push(node.getRight());
                }
            }
        }

}


Let me know if you have any questions.

Related Links

Saturday, 20 May 2017

How ConcurrentHashMap Works Internally in Java

Background

In one of the previous posts we saw how HashMap works -
and how its time complexity of insertion and deletion is O(1) in the normal case. Though this is a great data structure to work with in terms of time complexity, it is not thread safe, which means you cannot use it directly in multi-threaded environments without taking additional precautions like synchronizing put/get on your own. Instead Java provides a thread safe implementation - ConcurrentHashMap. We can directly use it in multi-threaded environments for thread safety, e.g. in the case of parallel streams introduced in Java 8.

How ConcurrentHashMap Works Internally in Java

Before we see how it is implemented in Java let's give it some thought. What are the possible problems with a HashMap? Race conditions and invalid state. Let's say two writes happen at the same time. Since a write is not an atomic operation one value may overwrite the other and the Map may go into an inconsistent state. We could obviously add synchronization over reads/writes of a HashMap but it would be very inefficient and have a performance impact - it would behave like a single threaded application, certainly not the behavior we expect. To solve this issue Java provides ConcurrentHashMap, which has built-in thread safety. Let's see how -

We know how HashMap works. Internally it stores an array of Entry objects, where each Entry essentially has a key, a value and a pointer to the next Entry object (a linked list used in case of collision). You can think of each array element as a bucket and each Entry object as a data point containing a key (in case 2 keys have the same hash - collision), a value and a pointer to the next data element.

Working :
ConcurrentHashMap, as the name suggests, allows concurrent reads/writes to the Map. But there are limitations. ConcurrentHashMap maintains another data structure internally called segments. Each bucket of the hashmap is part of one of the segments. The number of segments is called the Concurrency-Level and it determines the number of threads that can write simultaneously. A segment gets locked when writing/updating/removing data. Think of segments as locks used to prevent concurrent writes to the same bucket of the hashmap from leading to inconsistency. So as long as writes to the ConcurrentHashMap are on different segments they can happen in parallel. Reads are completely lock free, i.e. no need to acquire a lock for reading; the last updated value is returned.


 Now lets go step by step -

Concurrency-Level, Segment array and initialization :
  • First, when you create a ConcurrentHashMap you can provide the concurrency level. This determines the size of the Segment array; the size of the segment array will always be equal to or more than the concurrency level. If it is not provided a default of 16 is used, and the number of segments is capped at -
    • static final int MAX_SEGMENTS = 1 << 16; // slightly conservative
  • Note that the size of the segment table will always be a power of 2. So if you give the concurrency level as 10 then the next power of 2 will be picked, i.e. 16, and a Segment array of size 16 will be created, which implies 16 threads can simultaneously operate on the map. (A small usage sketch follows after the Segment class below.)
static final class Segment<K,V> extends ReentrantLock implements Serializable {

    //The number of elements in this segment's region.
    transient volatile int count;
    //The per-segment table. 
    transient volatile HashEntry<K,V>[] table;
}
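To see the concurrency level from the caller's side, here is a small sketch using the public three-argument constructor (initialCapacity, loadFactor, concurrencyLevel); the capacity and load factor values are just illustrative -

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrencyLevelDemo {
    public static void main(String args[]) {
        // Ask for a concurrency level of 10 - internally (pre Java 8) this is
        // rounded up to the next power of 2, so 16 segments are created
        Map<String, Integer> map = new ConcurrentHashMap<>(64, 0.75f, 10);

        map.put("Aniket", 1);
        System.out.println(map.get("Aniket")); // prints 1
    }
}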

Putting element in ConcurrentHashMap :

  • For putting an element in the Map we first need to determine which segment the element should go to. For this we first get the hashcode of the key. Next we do a rehash of the existing hash to get a better bit distribution -
     /**
     * Applies a supplemental hash function to a given hashCode, which
     * defends against poor quality hash functions.  This is critical
     * because ConcurrentHashMap uses power-of-two length hash tables,
     * that otherwise encounter collisions for hashCodes that do not
     * differ in lower or upper bits.
     */
    private static int hash(int h) {
        // Spread bits to regularize both segment and index locations,
        // using variant of single-word Wang/Jenkins hash.
        h += (h <<  15) ^ 0xffffcd7d;
        h ^= (h >>> 10);
        h += (h <<   3);
        h ^= (h >>>  6);
        h += (h <<   2) + (h << 14);
        return h ^ (h >>> 16);
    }
  • Once the hash is calculated you can get the segment it belongs to and delegate the put to the segment's put method as follows -
    public V put(K key, V value) {
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key.hashCode());
        return segmentFor(hash).put(key, hash, value, false);
    }

    final Segment<K,V> segmentFor(int hash) {
        return segments[(hash >>> segmentShift) & segmentMask];
    }


We will see how the segment index is computed shortly with a proper example. Once put is delegated to the segment, the segment will add the entry to the appropriate bucket within it.

        V put(K key, int hash, V value, boolean onlyIfAbsent) {
            lock();
            try {
                int c = count;
                if (c++ > threshold) // ensure capacity
                    rehash();
                HashEntry<K,V>[] tab = table;
                int index = hash & (tab.length - 1);
                HashEntry<K,V> first = tab[index];
                HashEntry<K,V> e = first;
                while (e != null && (e.hash != hash || !key.equals(e.key)))
                    e = e.next;

                V oldValue;
                if (e != null) {
                    oldValue = e.value;
                    if (!onlyIfAbsent)
                        e.value = value;
                }
                else {
                    oldValue = null;
                    ++modCount;
                    tab[index] = new HashEntry<K,V>(key, hash, first, value);
                    count = c; // write-volatile
                }
                return oldValue;
            } finally {
                unlock();
            }
        }


Now this is a very interesting method. Let's understand what's happening here.

  • The first call is to lock(). Since it is a write/update operation on a bucket of the segment we need a lock. If you recollect, the Segment class extends ReentrantLock, so each segment is a lock and you can call lock() and unlock() directly in the Segment class.
  • Next it's like a normal HashMap. You find the index of the Entry table where your element's hash falls and add it there in the linked list.
  • You can see code similar to HashMap that updates the value if the key is the same, inserts into the array if there is no element in the table, and adds it to the linked list of the table if an element already exists.
  • Finally, once the operation is complete it calls unlock() so that other threads can continue updating.
  • Note the lock() is a blocking call.
  • You can also see a call to rehash if the threshold is reached. Like the Entry array, a Segment also has a threshold, and when it is reached the segment's Entry table is resized for performance. That's what rehash does.
NOTE : For getting the index in the Segment table the first N bits of the rehashed hash are used, whereas for getting the index in the Entry table the last N bits are used (see details in the example below).

Getting element from  ConcurrentHashMap : 

Get on ConcurrentHashMap is very simple - no locks involved. You simply read the data and return it -

        public V get(Object key) {
                int hash = hash(key.hashCode());
                return segmentFor(hash).get(key, hash);
        }

        V get(Object key, int hash) {
            if (count != 0) { // read-volatile
                HashEntry<K,V> e = getFirst(hash);
                while (e != null) {
                    if (e.hash == hash && key.equals(e.key)) {
                        V v = e.value;
                        if (v != null)
                            return v;
                        return readValueUnderLock(e); // recheck
                    }
                    e = e.next;
                }
            }
            return null;
        }


NOTE  : readValueUnderLock method is used as a backup in case a null (pre-initialized) value is ever seen in an unsynchronized access method.

Example

That was mostly code and some theory. Now let's take an actual example.

Let's say we have created a ConcurrentHashMap with a concurrency level of, let's say, 10. Based on this the Segment array will be created using the following code -

    private static void printSegmentDetails(int concurrencyLevel) {
        int sshift = 0;
        int segmentMask = 0;
        int segmentShift = 0;

        int ssize = 1;
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        segmentShift = 32 - sshift;
        segmentMask = ssize - 1;
        System.out.println("Segment array size :" + ssize);
        System.out.println("segmentShift : " + segmentShift);
        System.out.println("segmentMask : " + segmentMask);
    }


Output for 10 concurrency level:
Segment array size : 16
segmentShift : 28
segmentMask : 15

NOTE : As mentioned before the segment array is of size 2^n such that 2^n >= concurrency level. In this case 2^4 = 16.

Now that we have the segment table in place let's simulate a put. We will put a String "Aniket" as the key. We don't care about the value - just make sure it's not null.

  1. First we will calculate the hashcode of the key.
  2. Then rehash it for a better hash (as mentioned above).
  3. Then, based on the resulting hash, we will find which segment it belongs to.
Remember the Segment table size is 2^N (>= the concurrency level); we now want the first N bits of the hash to determine which segment this hash falls into, since N bits can represent values from 0 to 2^N - 1, which covers our segment array size. Also remember the code to get this index from above? -
  • int segmentIndex = (hash >>> segmentShift) & segmentMask
This essentially means logically right shifting the hash by segmentShift bits. Since an int is 32 bits and segmentShift = 32 - sshift, hash >>> segmentShift will essentially give you the first sshift bits (sshift is nothing but the N in 2^N we saw above). segmentMask is used to mask out those N bits post shift.

So in this case,
N  = sshift =  4
2^N = 16 -> Size of segment array
segmentShift = 32 - 4 = 28 (as we saw in output above)
segmentMask = 16 - 1 = 15

    public static void main(String args[]) {    
        String key = "Aniket";
        //hascode of key
        System.out.println(key.hashCode());
        //better hash
        System.out.println(hash(key.hashCode()));
        //better hash in binary
        System.out.println(Integer.toBinaryString(hash(key.hashCode())));
        //logical right shift by segmentShift
        System.out.println("Right shifter hash : " + Integer.toBinaryString(hash(key.hashCode()) >>> 28));
        // segment index as binary and of right shift and segmentMask
        System.out.println("Segment Index : " + Integer.toBinaryString((hash(key.hashCode()) >>> 28 ) & 15));
        // segment index as decimal
        System.out.println("Segment Index : " + ((hash(key.hashCode()) >>> 28 ) & 15));
    }


Output :
1965716254
1839402854
1101101101000110000111101100110
Right shifter hash : 110
Segment Index : 110
Segment Index : 6


NOTE : 1101101101000110000111101100110 is 31 bits because the leading (32nd) bit is 0 and Integer.toBinaryString does not print leading zeros. The same goes for all subsequent binary strings.

So your element with key "Aniket" will go in the Segment array at index 6. Inside the segment it's pretty simple to calculate the index in the Entry array.

  •  int entryArrayindex = hash & (tab.length - 1);
         int entryArrayindex = (hash(key.hashCode()) & (16 - 1));
         System.out.println("Entry array index : " + entryArrayindex);
         System.out.println("Entry array index in binary : " + Integer.toBinaryString(entryArrayindex));


Output :
Entry array index : 6
Entry array index in binary : 110


So finally Entry is inserted at index 6 of Entry table.

So to summarize: for getting the index in the Segment table the first N bits of the rehashed hash are used, whereas for getting the index in the Entry table the last N bits are used.


Related Links 

Left shift and right shift operators in Java

Left shift and right shift operators in Java

A lot of the time we use the >>, >>> or << operators in Java for bit operations. These operators essentially shift bits left or right depending on the operator. In this post we will see what exactly the difference is.
  • >> is arithmetic or signed shift right
  • >>> is logical or unsigned shift right
  • << is arithmetic or signed shift left
The signed left shift operator "<<" shifts a bit pattern to the left, and the signed right shift operator ">>" shifts a bit pattern to the right. The bit pattern is given by the left-hand operand, and the number of positions to shift by the right-hand operand.

The unsigned right shift operator ">>>" shifts a zero into the leftmost position, while the leftmost position after ">>" depends on sign extension.

NOTE : There is no logical left shift operator as it would be the same as the arithmetic left shift.

Example :

        System.out.println(Integer.toBinaryString(-121));
        // prints "11111111111111111111111110000111"
        System.out.println(Integer.toBinaryString(-121 >> 1));
        // prints "11111111111111111111111111000011"
        System.out.println(Integer.toBinaryString(-121 >>> 1));
        // prints "1111111111111111111111111000011"
        
        System.out.println(Integer.toBinaryString(121));
        // prints "1111001"
        System.out.println(Integer.toBinaryString(121 >> 1));
        // prints "111100"
        System.out.println(Integer.toBinaryString(121 >>> 1));
        // prints "111100"


As you can see, in the case of -121, since it is a negative number, the arithmetic or signed right shift fills the leftmost (sign) bit with 1, whereas in the case of 121 it fills it with 0.

The logical or unsigned shift does not care about the sign. It just fills the vacated leftmost bits with 0.


Related Links

Monday, 15 May 2017

How does database indexing work?

Background

Database indexing is a wide topic. Database indexing plays an important role in your query performance. But like everything this too has a trade-off. In this post we will see what database indexing is and how it works.


Clustered and Non clustered index

Before we get to how indexing actually works let's see the two types of indexes -
  1. Clustered index
  2. Non clustered index
Data in the tables of a database needs to be stored on a physical disk at the end of the day. The way data is stored matters since data lookup is based on it. The order in which data is stored on the physical disk is decided by an index known as the clustered index. Since data is physically stored only once, only one clustered index is possible for a DB table. Generally that's the primary key of the table. That's right - the primary key of your table is a clustered index by default and data is stored physically based on it.

NOTE : You can change this though. You can create a primary key that is not the clustered index but a non clustered one. However you then need to define a clustered index for your table since the physical storage order depends on it.

Non clustered indexes are normal indexes. They order the data based on the column we have created the non clustered index on. Note that since data is stored only once on the disk, and there is just one column (or group of columns) that can be used to order the stored data on disk, we cannot store the same data in the ordering specified by a non clustered index (that's the job of the clustered index). So new space is allocated for non clustered indexes. Each entry has the column on which the index is created as the key and a pointer to the actual data row (which is ordered by the clustered index - usually the primary key). So if a search is performed on this table based on the columns that have non clustered indexes, then this new data set is searched (which is faster since records are ordered with respect to this column) and then the pointer in it is used to access the actual data record with all columns.


Now that we have knowledge of clustered and non clustered indexes lets see how it actually works.

Why is indexing needed?

When data is stored on disk based storage devices, it is stored as blocks of data. These blocks are accessed in their entirety, making them the atomic disk access operation. Disk blocks are structured in much the same way as linked lists; both contain a section for data, a pointer to the location of the next node (or block), and both need not be stored contiguously.

Due to the fact that a number of records can only be sorted on one field, we can state that searching on a field that isn’t sorted requires a Linear Search which requires N/2 block accesses (on average), where N is the number of blocks that the table spans. If that field is a non-key field (i.e. doesn’t contain unique entries) then the entire table space must be searched at N block accesses.

Whereas with a sorted field a Binary Search may be used, which takes log2 N block accesses. For example, for a table spanning 1,000,000 blocks, a linear search needs about 500,000 block accesses on average while a binary search needs only about 20. Also, since the data is sorted, given a non-key field the rest of the table doesn't need to be searched for duplicate values once a higher value is found. Thus the performance increase is substantial.

What is indexing?

This is rather a silly question given we already saw clustered and non clustered indexes but lets give it a try.

Indexing is a way of sorting a number of records on multiple fields. Creating an index on a field in a table creates another data structure which holds the field value and a pointer to the record it relates to. This index structure is then sorted, allowing Binary Searches to be performed on it. This index is obviously the non clustered one. There is no need to create a separate data structure for clustered indexes since the original data is stored physically sorted based on it.

The downside to (non clustered) indexing is that these indexes require additional space on the disk. Since (with the MyISAM engine) the indexes are stored together in a single file, this file can quickly reach the size limits of the underlying file system if many fields within the same table are indexed.

Performance

Indexes don't come free. They have their own overhead. Each index creates a new data set ordered by the columns on which it is created. This takes extra space (though not as much as the original data set, since it holds just the indexed column and a pointer to the actual row data). Also, inserts are slower now since each insert has to update this new index data set as well. The same goes for deletion.

Data structures used in indexes

Hash index :
Think of this as a HashMap. The key here would be derived from the columns that are used to create the index (non clustered index to be precise). The value would be a pointer to the actual table row entry. They are good for equality lookups, like getting the rows of all customers whose age is 24. But what happens when we need a query like the set of customers with age greater than 24? A hash index does not help in this case. Hash indexes are just good for equality lookups.
Eg.



B-tree Indexes:
These are the most common types of indexes. A B-tree is a kind of self-balancing tree. It stores data in an ordered manner so that retrievals are fast. These are useful since they provide insertion, search and deletion in logarithmic time.
Eg.


Consider the above picture. If we need rows with data less than 100, all we need are the nodes to the left of 100.

These are just common ones. Others are R-tree indexes, bitmap indexes etc.

Related Links

Sunday, 14 May 2017

Difference between ClassNotFoundException vs NoClassDefFoundError in Java

Background

In the last post we saw how classloading works in Java.
In this post we will try to understand the difference between ClassNotFoundException and NoClassDefFoundError that generally bugs all Java developers.

If you have not gone through the above link I strongly suggest you do it right away. It will give you a very good understanding of the class loading mechanism, which we will be using shortly to understand these two situations.

Difference between ClassNotFoundException vs NoClassDefFoundError in Java

  • The first point to note is their types. ClassNotFoundException is a checked exception, so you will need to handle it - either catch it or declare it in the method signature. NoClassDefFoundError is an error. Errors are generally something you cannot recover from, though you can still catch and handle it.
  • Both are related to a class not being available at runtime. The difference is the cause of the non-availability, which we will see shortly.
  • ClassNotFoundException is thrown by the running application whereas NoClassDefFoundError is thrown by the Java runtime.
ClassNotFoundException
We have already seen an example of ClassNotFoundException in the last post when we tried to load our custom class with the Extension classloader, which is the parent of the Application classloader that actually loaded the class. We said a parent classloader does not have visibility of classes loaded by child classloaders, so it threw ClassNotFoundException. Generally ClassNotFoundException is thrown when you try to load a class dynamically using methods like -
  • Class.forName()
  • ClassLoader.loadClass()
  • ClassLoader.findSystemClass()
and the required class is not found in the classpath. This could be because your classpath is incorrectly configured. A famous example is when you connect to a database using Java: you load the driver class using Class.forName() [we do this explicit loading pre Java 6; from Java 6 this class loading happens automatically, all you have to do is make sure the driver jar is present in the classpath].
So for the above use case, if you don't have the driver class in your classpath then it will lead to ClassNotFoundException. Also, you must have noticed by now that you need to explicitly handle this exception since it is a checked exception.

To resolve this issue you need to check that the class is available in your classpath. A small sketch of how this typically shows up is given below.
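Here is a minimal sketch that triggers a ClassNotFoundException; the class name com.example.NonExistentDriver is hypothetical and is simply assumed to be missing from the classpath -

public class ClassNotFoundDemo {
    public static void main(String args[]) {
        try {
            // Dynamically load a class that is not on the classpath (hypothetical name)
            Class.forName("com.example.NonExistentDriver");
        } catch (ClassNotFoundException e) {
            // Checked exception - the compiler forces us to handle it
            System.out.println("Could not find class : " + e.getMessage());
        }
    }
}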

NoClassDefFoundError:
Now this, unlike ClassNotFoundException, is an Error, which is hard to recover from. This generally happens when a class is available at compile time but not available at runtime.

One example can be that a static initializer block of a class throws an error due to which the class does not load (though it was available at compile time and compilation went through fine). Now if such a class is referenced again at runtime it will throw NoClassDefFoundError. A small sketch of this scenario is shown below.
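To make that concrete, here is a minimal sketch; the Config class and its failing static block are made up for illustration. The first reference fails with ExceptionInInitializerError and the second reference to the same class then fails with NoClassDefFoundError since the class is already marked as failed -

public class NoClassDefFoundDemo {

    static class Config {
        static {
            // Simulate a failure during static initialization (illustrative only)
            if (true) {
                throw new RuntimeException("Failed to load configuration");
            }
        }
    }

    public static void main(String args[]) {
        try {
            new Config(); // 1st reference - throws ExceptionInInitializerError
        } catch (Throwable t) {
            System.out.println("First reference failed : " + t);
        }
        try {
            new Config(); // 2nd reference - throws NoClassDefFoundError
        } catch (Throwable t) {
            System.out.println("Second reference failed : " + t);
        }
    }
}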

To resolve this error you need to check your classpath. It is possible that in your configuration (say in Gradle files) you have added the jar to, let's say, the test configuration only and not to the runtime or compile time configuration. You also need to look out for any initialization errors in the logs.


Related Links

How classloader works in Java

Background

We know how Java works. We create Java classes, create instances of them, and they interact with each other. In this post we will see how classloaders work. We know javac is a compiler that converts human-understandable Java code into class files containing bytecode that the JVM (java) understands. Classloaders are responsible for loading these classes at runtime. This is one of the good interview questions asked to experienced Java developers. This should also help you understand the difference between NoClassDefFoundError and java.lang.ClassNotFoundException. So let's get to it.

Basic points

We will come to details of these but to begin with note down these points -
  1. Delegation - Each classloader first delegates loading of a class to its parent (going all the way up the hierarchy). If the parent is not able to load the class then the child tries to load it. If it cannot be loaded by any of the classloaders, a ClassNotFoundException is thrown.
  2. Visibility - Each classloader knows about the classes that its parents have loaded. However it does not work the other way around - parents will not know the classes loaded by their children. This brings us to the 3rd point.
  3. Uniqueness - Each class is loaded exactly once. Since each child delegates class loading to its parent and knows the classes its parents have loaded, it will try to load a class only when it is not loaded by its parents.
Now this is of course the default behavior of the classloaders that already exist. You can however write your own classloaders and break it (not recommended though). The parent-first delegation described above looks roughly like the sketch below.
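Just to illustrate the delegation principle, here is a simplified sketch of a custom classloader; real classloaders override findClass() and let java.lang.ClassLoader.loadClass() do the parent-first delegation. The loadClassBytesFromSomewhere() helper is hypothetical -

public class MyClassLoader extends ClassLoader {

    public MyClassLoader(ClassLoader parent) {
        super(parent); // parent-first delegation is handled by ClassLoader.loadClass()
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Called only AFTER the parent classloaders have failed to find the class
        byte[] classBytes = loadClassBytesFromSomewhere(name); // hypothetical lookup
        if (classBytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, classBytes, 0, classBytes.length);
    }

    // Hypothetical helper - in a real classloader this would read the .class
    // bytes from a custom location (disk, network, database, etc.)
    private byte[] loadClassBytesFromSomewhere(String name) {
        return null;
    }
}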

Classloading in Java

Java has 3 main classloaders that are used to load classes at runtime -
  1. Bootstrap ClassLoader (Also called Primordial classLoader)
  2. Extension ClassLoader
  3. Application  ClassLoader
In that order. So Bootstrap is the parent of Extension and Extension is the parent of the Application classloader. Each of these classloaders loads classes from a predefined location.


Above diagram says it all but let me reiterate -

  • Bootstrap ClassLoader is the topmost classloader. It does not have any parent. This classloader is a native implementation and is responsible for loading all standard JDK classes. It does this from the path <JRE>/lib/rt.jar. Since this is a native implementation it does not correspond to any ClassLoader class instance.
  • Extension ClassLoader is the direct child of the Bootstrap classloader. When this classloader tries to load a class it first delegates it to its parent - the Bootstrap ClassLoader. If the parent is unsuccessful then the Extension ClassLoader will try to load classes from the path <JRE>/lib/ext or from the path specified in the java.ext.dirs system property. In the JVM this is implemented by sun.misc.Launcher$ExtClassLoader.
  • Application classloader is the child of the Extension classloader. The execution sequence remains the same. When a class is loaded via this classloader it delegates to its parent Extension, which in turn delegates to its parent Bootstrap. If the parents are unsuccessful in loading the class then the Application classloader will try to load the class from the classpath - you can give it with the -classpath or -cp arguments or specify it in the manifest file of a jar. In the JVM this is implemented by sun.misc.Launcher$AppClassLoader.

If the Application classloader is not able to load the class then it throws ClassNotFoundException. This is the order in which classloaders execute and load classes when the JVM runs.


 Code Demo

Let's try to understand a few things with code now. The first thing we discussed is the Bootstrap classloader, how it's the topmost classloader with a native implementation and how it does not have any parent. You cannot refer to the Bootstrap classloader in Java - it will give null since it is a native implementation.

    public static void main(String args[]) throws InterruptedException, IOException {
        ClassLoader classLoader  = String.class.getClassLoader();
        System.out.println(classLoader==null?"Bootstrap classloader not available from Java":"Bootstrap classloader available from Java");
    }

and you should get -
Bootstrap classloader not available from Java

Now let's try to check our visibility principle - by that, parent classloaders should not be able to see classes loaded by their children. We are going to create a new class called HelloWorld and from it check which classloader loaded it (from our previous knowledge all classpath classes are loaded by the Application classloader), then see its parent (should be the Extension classloader) and finally try to load the same HelloWorld class using the parent classloader (should fail as the parent should not have visibility of classes loaded by the child) -

    public static void main(String args[]) throws InterruptedException, IOException {
        ClassLoader classLoader  = HelloWorld.class.getClassLoader();
        System.out.println("Current classloader : " + classLoader);
        System.out.println("Current classloaders parent : " + classLoader.getParent());
        try {
            Class.forName("HelloWorld", true, HelloWorld.class.getClassLoader().getParent());
        } catch (ClassNotFoundException e) {
            System.out.println("Class could not be loaded by the classloader");
            e.printStackTrace();
        }
    }


Output is -
Current classloader : sun.misc.Launcher$AppClassLoader@4554617c
Current classloaders parent : sun.misc.Launcher$ExtClassLoader@677327b6
Class could not be loaded by the classloader
java.lang.ClassNotFoundException: HelloWorld
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at HelloWorld.main(HelloWorld.java:93)

You can also see for yourself the way classes are loaded. Just use the option -verbose:class while running Java. Example in the screenshot below -


NOTE :
  • The JVM maintains a runtime pool in the permgen area where classes are loaded. Whenever a class is referenced, the default classloader finds the class in the classpath and loads it into this pool. This is not specific to user-defined classes or classes provided in the JDK - when a class is referenced it is loaded into memory.
  • A ClassLoader instance does not get GCed as it is referenced from JVM threads. In fact that is why, even if you have a Singleton class, it is possible to create two instances with two different classloaders.
  • That said, ClassLoader instances are the same as any other objects in the heap. The statement that they do not get GCed comes from the fact that ClassLoaders have references from JVM threads, which run till the Java process completes and the JVM shuts down. For e.g. the Bootstrap ClassLoader is a native implementation, meaning its code is embedded in the JVM, so its reference will always be alive. Hence we say they are not potential candidates for GC. Other than that GC treats them the same way.

Related Links
