Heaps
The (binary) heap data structure is an array object that we can view as a nearly complete binary tree. Heapsort uses heap data structure to manage its information. In this article, we present one of the most popular applications of a heap: as an efficient priority queue. [1]
Usually, a priority queue has a contract with methods to insert an element into it, examine the head of the queue, and to remove the element at the head of the queue. Here’s our contract for the priority queue.
/** | |
* A data structure designed for holding elements prior to processing. | |
* | |
* @author ravindra | |
* | |
* @param <E> the type of elements in this {@link Queue} | |
*/ | |
public interface Queue<E> { | |
/** | |
* Adds an element into the queue. | |
* | |
* @param elt element to be inserted | |
*/ | |
void insert(E elt); | |
/** | |
* Inspects the element at the head of the queue without removing it. | |
* | |
* @return element at the head of the queue. | |
*/ | |
E examine(); | |
/** | |
* Inspects and removes the element at the head of the queue. | |
* | |
* @return element at the head of the queue. | |
*/ | |
E extract(); | |
/** | |
* Checks whether the queue is empty. The queue is empty if there are no | |
* elements in it. | |
* | |
* @return <code>true</code> if the queue is empty, <code>false</code> | |
* otherwise. | |
*/ | |
boolean isEmpty(); | |
/** | |
* This implementation returns an array containing all the elements of this | |
* {@link Queue}. The runtime type of the returned array is that of the | |
* specified type. | |
* | |
* @param <T> | |
* @param clazz The runtime type of the returned array | |
* @return an array containing all the elements of this {@link Queue} | |
*/ | |
public <T> T[] toArray(Class<T> clazz); | |
} |
Next, we have to implement this contract. Here’s how it looks.
public class PriorityQueue<E> implements Queue<E> { | |
private final Comparator<? super E> comparator; | |
private E[] a; | |
private int size = 0; | |
private static final int DEFAULT_INIT_CAP = 10; | |
private PriorityQueue(Comparator<? super E> comparator, int initialCapacity) { | |
this.comparator = comparator; | |
a = (E[]) new Object[initialCapacity + 1]; | |
a[0] = null; | |
} | |
public static <T> PriorityQueue<T> of(Comparator<? super T> comparator) { | |
return new PriorityQueue<>(comparator, DEFAULT_INIT_CAP); | |
} | |
public static <T extends Comparable<? super T>> PriorityQueue<T> of() { | |
return new PriorityQueue<>(Comparator.naturalOrder(), DEFAULT_INIT_CAP); | |
} | |
public static <T> PriorityQueue<T> of(Comparator<? super T> comparator, int initialCapacity) { | |
return new PriorityQueue<>(comparator, initialCapacity); | |
} | |
public static <T extends Comparable<? super T>> PriorityQueue<T> of(int initialCapacity) { | |
return new PriorityQueue<>(Comparator.naturalOrder(), initialCapacity); | |
} | |
private int parent(int i) { | |
return i / 2; | |
} | |
private int left(int i) { | |
return i * 2; | |
} | |
private int right(int i) { | |
return i * 2 + 1; | |
} | |
private void minHeapify(int i, int heapSize) { | |
if (heapSize > size) | |
throw new IllegalArgumentException("Heap size too large."); | |
final int l = left(i); | |
final int r = right(i); | |
int smallest = i; | |
if (l <= heapSize && comparator.compare(a[l], a[i]) < 0) | |
smallest = l; | |
if (r <= heapSize && comparator.compare(a[r], a[smallest]) < 0) | |
smallest = r; | |
if (smallest != i) { | |
final E tmp = a[i]; | |
a[i] = a[smallest]; | |
a[smallest] = tmp; | |
minHeapify(smallest, heapSize); | |
} | |
} | |
private void buildMinHeap() { | |
for (int i = size / 2; i > 0; i--) | |
minHeapify(i, size); | |
} | |
private void ensureCapacity(int mincap) { | |
final int oldcap = a.length; | |
if (mincap > oldcap) { | |
int newcap = Math.max(mincap, (oldcap * 3) / 2 + 1); | |
E[] oldarr = a; | |
a = (E[]) new Object[newcap]; // unchecked cast | |
a[0] = null; | |
System.arraycopy(oldarr, 1, a, 1, size); | |
} | |
} | |
@Override | |
public void insert(E elt) { | |
ensureCapacity(size + 2); | |
a[size + 1] = elt; | |
decreaseKey(size + 1); | |
size++; | |
} | |
private void decreaseKey(int i) { | |
while (i > 1 && comparator.compare(a[parent(i)], a[i]) > 0) { | |
final E tmp = a[parent(i)]; | |
a[parent(i)] = a[i]; | |
a[i] = tmp; | |
i = parent(i); | |
} | |
} | |
@Override | |
public E examine() { | |
if (this.size == 0) | |
throw new NoSuchElementException(); | |
return a[1]; | |
} | |
@Override | |
public E extract() { | |
if (this.size == 0) | |
throw new NoSuchElementException(); | |
final E head = a[1]; | |
a[1] = a[size]; | |
a[size] = null; | |
size--; | |
minHeapify(1, size); | |
return head; | |
} | |
@Override | |
public boolean isEmpty() { | |
return size == 0; | |
} | |
@Override | |
public String toString() { | |
final StringJoiner sj = new StringJoiner(", ", "[", "]"); | |
for (int i = 1; i <= size; i++) | |
sj.add(a[i].toString()); | |
return sj.toString(); | |
} | |
public <T> T[] toArray(Class<T> clazz) { | |
@SuppressWarnings("unchecked") | |
final T[] arr = (T[]) Array.newInstance(clazz, size); | |
System.arraycopy(a, 1, arr, 0, size); | |
return arr; | |
} | |
} |
Now, we have the priority queue under our belt, we can go ahead and use it to solve real world problems. Even though it is much easier to come up with a slightly contrived example of using the priority queue, this time around, I thought of coming up with a more real example which showcases the real capabilities of our data structure. The beauty of this approach is that it pounds on the data structure so much and allows us to find any vulnerabilities in our implementation. This exercise was drawn from the book [1].
Exercises
6.5-9
Give an Θ(n⋅log2k)−time algorithm to merge k sorted lists into one sorted list, where n is the total number of elements in all the input lists. (Hint: Use a min-heap for k−way merging.)
We can get the first element of each list and insert it into a priority queue. Then, you can remove the smallest element from the queue, and insert another element off the list it came from. Here’s how it looks.
class K_WayMerge { | |
private K_WayMerge() { | |
throw new AssertionError(); | |
} | |
public static void main(String[] args) { | |
final Integer[] sortedOne = { 2, 7, 16 }; | |
final Integer[] sortedTwo = { 5, 10, 20 }; | |
final Integer[] sortedThree = { 3, 6, 21 }; | |
final Integer[] sortedFour = { 4, 8, 9 }; | |
final Integer[] mergedData = kWayMerge(Integer.class, sortedOne, sortedTwo, sortedThree, sortedFour); | |
System.out.println(Arrays.toString(mergedData)); | |
} | |
public static <T> T[] kWayMerge(Comparator<? super T> cmp, Class<T> clazz, T[]... sortedLists) { | |
final Queue<Map.Entry<T, Iterator<T>>> queue = PriorityQueue.of(Map.Entry.comparingByKey(cmp), | |
sortedLists.length); | |
int n = 0; | |
for (T[] a : sortedLists) { | |
final Iterator<T> it = Arrays.asList(a).iterator(); | |
if (it.hasNext()) | |
queue.insert(new AbstractMap.SimpleEntry<>(it.next(), it)); | |
n += a.length; | |
} | |
@SuppressWarnings("unchecked") | |
final T[] sortedData = (T[]) Array.newInstance(clazz, n); | |
for (int count = 0; !queue.isEmpty(); count++) { | |
final Map.Entry<T, Iterator<T>> entry = queue.extract(); | |
sortedData[count] = entry.getKey(); | |
final Iterator<T> it = entry.getValue(); | |
if (it.hasNext()) | |
queue.insert(new AbstractMap.SimpleEntry<>(it.next(), it)); | |
} | |
return sortedData; | |
} | |
public static <T extends Comparable<? super T>> T[] kWayMerge(Class<T> clazz, T[]... sortedLists) { | |
return kWayMerge(Comparator.naturalOrder(), clazz, sortedLists); | |
} | |
} |
Analysis
The running time of extract is log2n. The running time of insert on an n−element heap is log2n.
For k−way merging, in each iteration, we remove the smallest element from the priority queue and insert another element off the list it came from. This takes log2k at each step of the iteration. Thus, total time it takes to merge the lists is, Θ(n⋅log2k).
Let n be the number of elements in all the lists, let k be the number of sorted lists and T(n) be the running time of the procedure. Thus,
T(n)=Θ(n⋅2⋅log2k)=Θ(n⋅log2k)
Runtime analysis of BUILD-MAX-HEAP algorithm
⌊log2n⌋∑h=0⌈n2h+1⌉⋅Θ(h)
rewriting the equation, we then obtain
Θ(n⋅⌊log2n⌋∑h=0h2h+1)<Θ(n4⋅∞∑h=0h2h−1)≡Θ(n4⋅∞∑h=0h⋅(12)h−1)
Now by plugging in x=12 we obtain,
Θ(n4⋅∞∑h=0h⋅xh−1)≡Θ(n4⋅ddx∞∑h=0xh)
Now, we know that
∞∑n=0xn=11−x
Thus, the above equation can be further simplified as follows.
Θ(n4⋅ddx∞∑h=0xh)≡Θ(n4⋅ddx11−x)
Finding the derivative using the chain rule yields:
ddx11−x≡ddx(1−x)−1=1(1−x)2
Plugging this in our original equation, we then obtain,
Θ(n4⋅ddx11−x)≡Θ(n4⋅1(1−x)2)
Finally, by plugging in x=12 we obtain,
Θ(n4⋅1(1−x)2)=Θ(n4⋅1(1−12)2)=Θ(n)
Comments
Post a Comment