Posts

Showing posts from 2021

Regular Expression Matching

Today, let’s focus on implementing a regular expression engine. Matching an input string against a regular expression pattern is a very common task in the real world. Production regular expression engines are far more complex and are often based on finite automata; our naive implementation, however, does not use finite automata.

Regular Expression Matching - Problem Statement

Given an input string $s$ and a pattern $p$, implement regular expression matching with support for '.' and '*', where '.' matches any single character and '*' matches zero or more of the preceding element. The matching should cover the entire input string (not partial).

Regular Expression Matching - Definition

For the input string $s$ and the pattern $p$, of lengths $m$ and $n$ respectively, let $t[i][j]$ be the result of evaluating the prefix $s_1 \ldots s_i$ against the prefix $p_1 \ldots p_j$, where $0 \le i \le m$ and $0 \le j \le n$.
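Based on that definition, here is a minimal bottom-up sketch in Java; the method name `isMatch` and the table layout are my own choices, so the post's eventual recurrence may differ in detail.

```java
public final class RegexMatcher {

    // Returns true if pattern p matches the entire string s, supporting
    // '.' (any single character) and '*' (zero or more of the preceding element).
    public static boolean isMatch(String s, String p) {
        int m = s.length(), n = p.length();
        // t[i][j] == true iff s[0..i) matches p[0..j)
        boolean[][] t = new boolean[m + 1][n + 1];
        t[0][0] = true; // empty pattern matches empty string

        // An empty string can still match patterns like "a*b*".
        for (int j = 2; j <= n; j++) {
            if (p.charAt(j - 1) == '*') t[0][j] = t[0][j - 2];
        }

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                char pc = p.charAt(j - 1);
                if (pc == '*' && j > 1) {
                    // Zero occurrences of the preceding element, or one more
                    // occurrence if that element matches s[i-1].
                    char prev = p.charAt(j - 2);
                    t[i][j] = t[i][j - 2]
                            || ((prev == '.' || prev == s.charAt(i - 1)) && t[i - 1][j]);
                } else if (pc != '*') {
                    t[i][j] = (pc == '.' || pc == s.charAt(i - 1)) && t[i - 1][j - 1];
                }
            }
        }
        return t[m][n];
    }

    public static void main(String[] args) {
        System.out.println(isMatch("aab", "c*a*b"));           // true
        System.out.println(isMatch("mississippi", "mis*is*p*.")); // false
    }
}
```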

Longest Monotonically Increasing Subsequence

Today, let’s take a look at another interesting exercise from the CLRS book [1].

Problem Statement

Exercise 15.4-6 Give an $O(n \log_2 n)$-time algorithm to find the longest monotonically increasing subsequence of a sequence of $n$ numbers. (Hint: Observe that the last element of a candidate subsequence of length $i$ is at least as large as the last element of a candidate subsequence of length $i - 1$. Maintain candidate subsequences by linking them through the input sequence.)

Longest Monotonically Increasing Subsequence - Approach

There are many ways to solve this problem; here we describe the approach we came up with. The book [1] suggests a different approach from the one presented here, but both have the same asymptotic time and space complexity. Our approach uses an auxiliary array $m$ to store intermediate results.
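As a sketch of one well-known $O(n \log_2 n)$ realization of the hint (maintaining, for each length, the smallest possible last element of a candidate subsequence), here is a Java version; this is illustrative and not necessarily the exact algorithm developed in the post.

```java
public final class LongestIncreasingSubsequence {

    // Returns the length of the longest strictly increasing
    // subsequence of a, in O(n log n) time.
    public static int lisLength(int[] a) {
        // tails[k] = smallest last element of any increasing
        // candidate subsequence of length k + 1 found so far.
        int[] tails = new int[a.length];
        int size = 0;
        for (int x : a) {
            // Binary search for the first tail >= x (lower bound).
            int lo = 0, hi = size;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (tails[mid] < x) lo = mid + 1; else hi = mid;
            }
            tails[lo] = x;          // extend or improve a candidate
            if (lo == size) size++; // a strictly longer candidate was found
        }
        return size;
    }

    public static void main(String[] args) {
        int[] a = {5, 2, 8, 6, 3, 6, 9, 7};
        System.out.println(lisLength(a)); // 4, e.g. 2, 3, 6, 9
    }
}
```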

Fractional Knapsack Problem

Introduction

In the fractional knapsack problem, the setup is the same as in the 0-1 knapsack problem, but the thief can take fractions of items, rather than having to make a binary (0, 1) choice for each item. You can think of an item in the 0-1 knapsack problem as being like a gold ingot and an item in the fractional knapsack problem as more like gold dust. [1]

We can solve the fractional knapsack problem by a greedy strategy. To solve the fractional problem, we first compute the value per pound, $v_i/w_i$, for each item. Obeying a greedy strategy, the thief begins by taking as much as possible of the item with the greatest value per pound. If the supply of that item is exhausted and he can still carry more, he takes as much as possible of the item with the next greatest value per pound, and so forth. [1]

Now let’s turn our attention to an exercise in the CLRS book [1].

Exercise 16.2-6 Show how to solve the fractional knapsack problem in $O(n)$ time. [1]

Sorting the items by value per pound gives an $O(n \lg n)$-time solution, so achieving $O(n)$ requires avoiding the sort.
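To make the greedy strategy concrete, here is a sketch in Java of the sort-based variant described above; note that it runs in $O(n \lg n)$, not the $O(n)$ the exercise asks for (that requires a linear-time weighted-median selection instead of the sort). The class and method names are my own.

```java
import java.util.Arrays;
import java.util.Comparator;

public final class FractionalKnapsack {

    // Returns the maximum value obtainable with capacity W when fractions
    // of items may be taken. Greedy by value per pound; the sort makes
    // this O(n log n) rather than the O(n) asked for in Exercise 16.2-6.
    public static double maxValue(double[] values, double[] weights, double W) {
        int n = values.length;
        Integer[] idx = new Integer[n];
        for (int i = 0; i < n; i++) idx[i] = i;
        // Highest value per pound first.
        Arrays.sort(idx, Comparator.comparingDouble(i -> -values[i] / weights[i]));

        double total = 0, remaining = W;
        for (int i : idx) {
            if (remaining <= 0) break;
            double take = Math.min(weights[i], remaining); // whole item or a fraction
            total += values[i] * (take / weights[i]);
            remaining -= take;
        }
        return total;
    }

    public static void main(String[] args) {
        double[] v = {60, 100, 120};
        double[] w = {10, 20, 30};
        System.out.println(maxValue(v, w, 50)); // 240.0
    }
}
```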

Weighted Interval Scheduling

Introduction

Weighted interval scheduling is a variant of the interval scheduling problem described in Chapter 16 of the CLRS book [1]. If you are not familiar with the interval scheduling problem, I encourage you to read about it in the book [1] before proceeding with the rest of this article. Recall that in the original problem we are given a set of $n$ activities, each with an associated start time $s_i$ and finish time $f_i$, and our objective is to find a maximum subset of non-overlapping activities. Moreover, the problem of finding a maximum subset of compatible activities exhibits the greedy-choice property. With this in mind, let’s move on to our new variant.

Problem Statement

Exercise 16.1-5 Consider a modification to the activity-selection problem in which each activity $a_i$ has, in addition to a start and finish time, a value $v_i$. The objective is no longer to maximize the number of activities scheduled, but instead to maximize the total value of the activities scheduled.
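Since the greedy-choice property no longer holds once values are attached, dynamic programming is the natural tool. Here is a hedged sketch in Java, not necessarily the post's solution: sort by finish time, and for each activity binary-search for the latest compatible predecessor. The names are my own.

```java
import java.util.Arrays;
import java.util.Comparator;

public final class WeightedIntervalScheduling {

    // activities[i] = {start, finish, value}
    public static int maxTotalValue(int[][] activities) {
        int n = activities.length;
        int[][] a = activities.clone();
        Arrays.sort(a, Comparator.comparingInt(x -> x[1])); // by finish time

        // best[i] = max total value using the first i activities (sorted order)
        int[] best = new int[n + 1];
        for (int i = 1; i <= n; i++) {
            int[] cur = a[i - 1];
            // p = number of activities finishing no later than cur's start,
            // i.e. the index just past the latest activity compatible with cur.
            int p = countFinishingBy(a, i - 1, cur[0]);
            int take = cur[2] + best[p];           // schedule cur
            best[i] = Math.max(best[i - 1], take); // or skip it
        }
        return best[n];
    }

    // Binary search: how many of a[0..len) have finish <= time?
    private static int countFinishingBy(int[][] a, int len, int time) {
        int lo = 0, hi = len;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid][1] <= time) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        int[][] acts = {{1, 4, 3}, {3, 5, 2}, {0, 6, 5}, {5, 7, 2}, {3, 9, 7}, {6, 10, 4}};
        System.out.println(maxTotalValue(acts)); // 9: value 5 + value 4
    }
}
```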

Correctness of Huffman's algorithm

First of all, I would like to give a bit of context and explain my motivation behind this article. Going through the book [1], I found the proof of correctness of Huffman's algorithm rather terse, which makes it hard for a reader to follow on their own. I then checked some other lecture slides and materials, and unfortunately found none of them helpful for understanding the proof easily either. The first part of the proof, Lemma 16.2, is in much better shape and has illustrations to assist readers, so I don't think it requires further explanation. But the last part, Lemma 16.3, lacks the required details, so I am adding them here to spare other readers the painstaking process I went through when first reading the proof.

Lemma 16.3 Let $C$ be a given alphabet with frequency $c.freq$ defined for each character $c \in C$. Let $x$ and $y$ be two characters in $C$ with minimum frequency. Let $C^\prime$ be the alphabet $C$ with the characters $x$ and $y$ removed and a new character $z$ added, so that $C^\prime = C - \{x, y\} \cup \{z\}$. Define $freq$ for $C^\prime$ as for $C$, except that $z.freq = x.freq + y.freq$. Let $T^\prime$ be any tree representing an optimal prefix code for $C^\prime$. Then the tree $T$, obtained from $T^\prime$ by replacing the leaf node for $z$ with an internal node having $x$ and $y$ as children, represents an optimal prefix code for $C$.
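To see what the lemma justifies, here is a minimal sketch of Huffman tree construction in Java using the standard library's PriorityQueue; the Node class and method names are my own, not from the original post. Each iteration merges the two minimum-frequency nodes $x$ and $y$ into a combined node, the $z$ of Lemma 16.3.

```java
import java.util.PriorityQueue;

public final class Huffman {

    static final class Node {
        final char symbol;   // meaningful only for leaves
        final int freq;
        final Node left, right;

        Node(char symbol, int freq) { this(symbol, freq, null, null); }

        Node(char symbol, int freq, Node left, Node right) {
            this.symbol = symbol; this.freq = freq;
            this.left = left; this.right = right;
        }
    }

    // Repeatedly merges the two minimum-frequency nodes, which is exactly
    // the greedy step whose optimality Lemmas 16.2 and 16.3 establish.
    static Node build(char[] symbols, int[] freqs) {
        PriorityQueue<Node> pq = new PriorityQueue<>((a, b) -> a.freq - b.freq);
        for (int i = 0; i < symbols.length; i++) pq.add(new Node(symbols[i], freqs[i]));
        while (pq.size() > 1) {
            Node x = pq.poll(), y = pq.poll();             // the x and y of the lemma
            pq.add(new Node('\0', x.freq + y.freq, x, y)); // the merged z
        }
        return pq.poll();
    }

    static void printCodes(Node t, String prefix) {
        if (t.left == null && t.right == null) {
            System.out.println(t.symbol + ": " + prefix);
            return;
        }
        printCodes(t.left, prefix + "0");
        printCodes(t.right, prefix + "1");
    }

    public static void main(String[] args) {
        // The frequencies from the running example in CLRS [1].
        char[] s = {'a', 'b', 'c', 'd', 'e', 'f'};
        int[] f = {45, 13, 12, 16, 9, 5};
        printCodes(build(s, f), "");
    }
}
```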

0-1 knapsack problem

Problem Statement

The 0-1 knapsack problem is the following. A thief robbing a store finds $n$ items. The $i$th item is worth $v_i$ dollars and weighs $w_i$ pounds, where $v_i$ and $w_i$ are integers. The thief wants to take as valuable a load as possible, but he can carry at most $W$ pounds in his knapsack, for some integer $W$. Which items should he take? We call this the 0-1 knapsack problem because for each item, the thief must either take it or leave it behind; he cannot take a fractional amount of an item or take an item more than once. [1]

Now let’s take an exercise from our CLRS book [1].

Exercise 16.2-2 Give a dynamic programming solution to the 0-1 knapsack problem that runs in $O(n \cdot W)$ time, where $n$ is the number of items and $W$ is the maximum weight of items that the thief can put in his knapsack. [1]

0-1 knapsack - Definition

Let $n$ be the number of items and $W$ be the capacity of the knapsack. Let $i$ be the current item being considered, and let $v_i$ and $w_i$ be its value and weight respectively.
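Here is a minimal sketch of the $O(n \cdot W)$ dynamic program in Java; the table name `best` and the method signature are my own, so the post's definition may differ in indexing.

```java
public final class ZeroOneKnapsack {

    // Returns the maximum value achievable with capacity W, where item i
    // has value values[i] and weight weights[i]. Runs in O(n * W) time.
    public static int maxValue(int[] values, int[] weights, int W) {
        int n = values.length;
        // best[i][w] = max value using items 1..i with capacity w
        int[][] best = new int[n + 1][W + 1];
        for (int i = 1; i <= n; i++) {
            for (int w = 0; w <= W; w++) {
                best[i][w] = best[i - 1][w]; // leave item i behind
                if (weights[i - 1] <= w) {   // or take it, if it fits
                    best[i][w] = Math.max(best[i][w],
                            best[i - 1][w - weights[i - 1]] + values[i - 1]);
                }
            }
        }
        return best[n][W];
    }

    public static void main(String[] args) {
        int[] v = {60, 100, 120};
        int[] w = {10, 20, 30};
        System.out.println(maxValue(v, w, 50)); // 220 (items 2 and 3)
    }
}
```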

Strassen's method

If you have seen matrices before, then you probably know how to multiply them. If $A = (a_{ij})$ and $B = (b_{ij})$ are square $n \times n$ matrices, then in the product $C = A \times B$, we define the entry $c_{ij}$, for $i, j = 1, 2, \ldots, n$, by $$c_{ij} = \sum_{k=1}^{n} a_{ik} \cdot b_{kj}$$ You might at first think that any matrix multiplication algorithm must take $\Omega(n^3)$ time, since the natural definition of matrix multiplication requires that many multiplications. You would be incorrect, however: we have a way to multiply matrices in less time than that. In this article, we shall see Strassen’s remarkable recursive algorithm for multiplying $n \times n$ matrices. It runs in $\Theta(n^{\log_2 7})$ time, which is asymptotically better than the straightforward approach. [1]

Strassen’s method

Strassen’s method makes the recursion tree slightly less bushy by performing only 7 recursive multiplications of sub-matrices instead of 8. Strassen’s method is not at all obvious, and it involves four steps.
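For reference, here are the seven sub-matrix products and the quadrant combinations in Strassen's original formulation; note that CLRS [1] presents the same method via intermediate sums $S_1, \ldots, S_{10}$ and products $P_1, \ldots, P_7$, which is equivalent. Here $A_{ij}$, $B_{ij}$, and $C_{ij}$ denote the four $n/2 \times n/2$ quadrants of $A$, $B$, and $C$.

$$
\begin{aligned}
M_1 &= (A_{11} + A_{22})(B_{11} + B_{22}) \\
M_2 &= (A_{21} + A_{22})\,B_{11} \\
M_3 &= A_{11}\,(B_{12} - B_{22}) \\
M_4 &= A_{22}\,(B_{21} - B_{11}) \\
M_5 &= (A_{11} + A_{12})\,B_{22} \\
M_6 &= (A_{21} - A_{11})(B_{11} + B_{12}) \\
M_7 &= (A_{12} - A_{22})(B_{21} + B_{22}) \\
C_{11} &= M_1 + M_4 - M_5 + M_7 \\
C_{12} &= M_3 + M_5 \\
C_{21} &= M_2 + M_4 \\
C_{22} &= M_1 - M_2 + M_3 + M_6
\end{aligned}
$$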

Heaps

The (binary) heap data structure is an array object that we can view as a nearly complete binary tree. Heapsort uses the heap data structure to manage its information. In this article, we present one of the most popular applications of a heap: as an efficient priority queue. [1] Usually, a priority queue has a contract with methods to insert an element, examine the element at the head of the queue, and remove the element at the head of the queue. Here’s our contract for the priority queue. Next, we have to implement this contract. Here’s how it looks. Now that we have the priority queue under our belt, we can go ahead and use it to solve real-world problems. Even though it would be much easier to come up with a slightly contrived example of using the priority queue, this time around I thought of coming up with a more realistic example which showcases the real capabilities of our data structure. The beauty of this approach is that it pounds on the data structure so much and allows us to find any vulnerabilities in it.
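The contract and implementation the excerpt refers to are not shown here, so the following is a hedged reconstruction in Java: a minimal min-priority-queue interface and an array-backed binary-heap implementation. The interface and method names are my guesses, not necessarily the post's.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal min-priority-queue contract: insert, examine the head, remove the head.
interface MinPriorityQueue<T extends Comparable<T>> {
    void insert(T element);
    T peek();    // examine the head without removing it
    T poll();    // remove and return the head
    boolean isEmpty();
}

// Array-backed binary min-heap: the parent of index i lives at (i - 1) / 2.
final class BinaryHeap<T extends Comparable<T>> implements MinPriorityQueue<T> {
    private final List<T> heap = new ArrayList<>();

    public void insert(T element) {
        heap.add(element);
        siftUp(heap.size() - 1);
    }

    public T peek() {
        if (isEmpty()) throw new IllegalStateException("empty queue");
        return heap.get(0);
    }

    public T poll() {
        T head = peek();
        T last = heap.remove(heap.size() - 1);
        if (!heap.isEmpty()) {
            heap.set(0, last);
            siftDown(0);
        }
        return head;
    }

    public boolean isEmpty() { return heap.isEmpty(); }

    // Restore the heap property upward after an insert: O(log n).
    private void siftUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (heap.get(i).compareTo(heap.get(parent)) >= 0) break;
            swap(i, parent);
            i = parent;
        }
    }

    // Restore the heap property downward after a removal: O(log n).
    private void siftDown(int i) {
        int n = heap.size();
        while (true) {
            int smallest = i, l = 2 * i + 1, r = 2 * i + 2;
            if (l < n && heap.get(l).compareTo(heap.get(smallest)) < 0) smallest = l;
            if (r < n && heap.get(r).compareTo(heap.get(smallest)) < 0) smallest = r;
            if (smallest == i) break;
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int i, int j) {
        T tmp = heap.get(i);
        heap.set(i, heap.get(j));
        heap.set(j, tmp);
    }
}
```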

Binary Search Tree - Iterative inorder tree walk

Today, I’ll walk you through an iterative algorithm for an inorder tree walk over a binary search tree. First, let me draw your attention to an exercise in the CLRS book [1].

Exercise 12.1-3 Give a nonrecursive algorithm that performs an inorder tree walk. (Hint: An easy solution uses a stack as an auxiliary data structure. A more complicated, but elegant, solution uses no stack but assumes that we can test two pointers for equality.)

The one which uses a stack is relatively simple. Here’s how it looks.

**Input:** $root$ node of the tree $T$

INORDER-TREEWALK-ITERATIVE($root$)
&nbsp;&nbsp;&nbsp;&nbsp;let $s$ be an empty stack
&nbsp;&nbsp;&nbsp;&nbsp;$current \gets root$
&nbsp;&nbsp;&nbsp;&nbsp;`while` $!s.isEmpty$ or $current \neq sentinel$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`if` $current \neq sentinel$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$s.push(current)$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$current \gets current.left$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`else`
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$current \gets s.pop()$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;visit $current$
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;$current \gets current.right$
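Here is a direct Java translation of the stack-based walk, a sketch assuming a simple TreeNode with key, left, and right fields; null plays the role of the sentinel here.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public final class IterativeInorder {

    static final class TreeNode {
        int key;
        TreeNode left, right;
        TreeNode(int key) { this.key = key; }
    }

    // Stack-based inorder walk; null plays the role of the sentinel.
    static void inorderTreeWalkIterative(TreeNode root) {
        Deque<TreeNode> s = new ArrayDeque<>();
        TreeNode current = root;
        while (!s.isEmpty() || current != null) {
            if (current != null) {
                s.push(current);                 // defer this node, go left first
                current = current.left;
            } else {
                current = s.pop();               // leftmost unvisited node
                System.out.println(current.key); // visit
                current = current.right;         // then walk its right subtree
            }
        }
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode(4);
        root.left = new TreeNode(2);
        root.right = new TreeNode(6);
        root.left.left = new TreeNode(1);
        root.left.right = new TreeNode(3);
        inorderTreeWalkIterative(root); // prints 1 2 3 4 6
    }
}
```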