Background
As we all know a heap is a complete binary tree that satisfies the heap property:
- Min heap: The value of children is greater than or equal to the parent node. This means the root node of a min heap stores the lowest value or minimum value data.
- Max heap: The value of children is smaller than or equal to the parent. This means that the root node of a max heap stores the highest value or the maximum value data.
Heaps are used to implement priority queues for various use cases. See the below diagrams on what constitutes a valid and invalid heap (Credit: GeeksForGeeks)
Valid min heaps
Invalid min heaps
Valid max heaps
Invalid max heaps
It is an interesting read on how to insert in a heap, how to remove from the heap, how to heapify an array, how to use an array to represent a heap (The above diagrams show binary tree representation), how to use heaps for implementing priority queue etc.
The time complexity for various heap operations is as follows:
Working with heaps in Python
Now that we brushed upon heap data structure above let's get to the original purpose of this post, which s to understand how to work with heaps in Python.
In Java, you would usually use PriorityQueue implementation to work with heaps. In Python we use python’s inbuilt library named heapq.
Look at the following code:
import heapq data = [3, 5, 9, 14, 4, 24, 2] print(f"Original data: {data}") heapq.heapify(data) # Create min heap print(f"After heapify: {data}") heapq.heappush(data, 1) print(f"After pushing to heap: {data}") heapq.heappop(data) print(f"After poping from heap: {data}")
It Prints:
Original data: [3, 5, 9, 14, 4, 24, 2]
After heapify: [2, 4, 3, 14, 5, 24, 9]
After pushing to heap: [1, 2, 3, 4, 5, 24, 9, 14]|
After poping from heap: [2, 4, 3, 14, 5, 24, 9]
As you can see we initially has a simple list called data which we then converted into a min heap (default heap behavior) using heapify method (O(N) time complexity). We then pushed an element to the min heap (O(Log(N)) time complexity) and then poped one which remoes the element from the root - minimum element in this case of min heap (Also has time complexity O(Log(N)))
NOTE: If you want max heap implementation you can just negate the data and push it in the heap.
NOTE: Heap elements can be tuples. This is useful for assigning comparison values (such as task priorities) alongside the main record being tracked. 1st element in the tuple is use for the comparison/priority.
See below code of how we can add tuple in the heap:
import heapq data = [] heapq.heappush(data, (3, "China")) heapq.heappush(data, (1, "India")) heapq.heappush(data, (2, "USA")) print(data)
It prints:
[(1, 'India'), (3, 'China'), (2, 'USA')]