Background
As we all know a heap is a complete binary tree that satisfies the heap property:
- Min heap: The value of children is greater than or equal to the parent node. This means the root node of a min heap stores the lowest value or minimum value data.
- Max heap: The value of children is smaller than or equal to the parent. This means that the root node of a max heap stores the highest value or the maximum value data.
Heaps are used to implement priority queues for various use cases. See the below diagrams on what constitutes a valid and invalid heap (Credit: GeeksForGeeks)
Valid min heaps
Invalid min heaps
Valid max heaps
Invalid max heaps
It is an
interesting read on how to insert in a heap, how to remove from the heap, how to heapify an array, how to use an array to represent a heap (The above diagrams show binary tree representation), how to use heaps for implementing priority queue etc.
The time complexity for various heap operations is as follows:
Working with heaps in Python
Now that we brushed upon heap data structure above let's get to the original purpose of this post, which s to understand how to work with heaps in Python.
In Java, you would usually use PriorityQueue implementation to work with heaps. In Python we use python’s inbuilt library named heapq.
Look at the following code:
import heapq
data = [3, 5, 9, 14, 4, 24, 2]
print(f"Original data: {data}")
heapq.heapify(data) # Create min heap
print(f"After heapify: {data}")
heapq.heappush(data, 1)
print(f"After pushing to heap: {data}")
heapq.heappop(data)
print(f"After poping from heap: {data}")
It Prints:
Original data: [3, 5, 9, 14, 4, 24, 2]
After heapify: [2, 4, 3, 14, 5, 24, 9]
After pushing to heap: [1, 2, 3, 4, 5, 24, 9, 14]|
After poping from heap: [2, 4, 3, 14, 5, 24, 9]
As you can see we initially has a simple list called data which we then converted into a min heap (default heap behavior) using heapify method (O(N) time complexity). We then pushed an element to the min heap (O(Log(N)) time complexity) and then poped one which remoes the element from the root - minimum element in this case of min heap (Also has time complexity O(Log(N)))
NOTE: If you want max heap implementation you can just negate the data and push it in the heap.
NOTE: Heap elements can be tuples. This is useful for assigning comparison values (such as task priorities) alongside the main record being tracked. 1st element in the tuple is use for the comparison/priority.
See below code of how we can add tuple in the heap:
import heapq
data = []
heapq.heappush(data, (3, "China"))
heapq.heappush(data, (1, "India"))
heapq.heappush(data, (2, "USA"))
print(data)
It prints:
[(1, 'India'), (3, 'China'), (2, 'USA')]
PriorityQueue class
We can also use PriorityQueue class implementation directly in Python. This internally uses heapq library but is different in following ways
- It's synchronized, so it supports concurrent processes.
- It's a class interface as opposed to the function-based interface of heapq.
You can see following example on how to use PriorityQueue class in python
from queue import PriorityQueue
pQueue = PriorityQueue()
pQueue.put((3, "China"))
pQueue.put((1, "India"))
pQueue.put((2, "USA"))
while not pQueue.empty():
print(pQueue.get())
Above also prints:
(1, 'India')
(2, 'USA')
(3, 'China')
Related Links