Posts

Showing posts from February, 2020

Longest Palindromic Substring

A palindrome is a nonempty string over some alphabet that reads the same forward and backward. Examples of palindromes are all strings of length 1, civic, racecar, and aibohphobia (fear of palindromes). Given a string $s$, find the longest palindromic substring in $s$. You may assume that the maximum length of $s$ is $1000$. Example 1: Input: "babad" Output: "bab" Note: "aba" is also a valid answer. Example 2: Input: "cbbd" Output: "bb" Let $p[i][j]$ be a table that states whether the substring $s_i\ldots s_j$ is a palindrome or not. Note that any substring of $length = 1$ can be considered as a palindrome. The following equation describes  the recurrence. $p[i][j] = \begin{cases} true ,  & \text{if $l = 1$} \\[2ex] s_i == s_j,   & \text{if $l = 2$} \\[2ex] p[i+1][j-1]\; and \; s_i == s_j, & \text{if $len \gt 2$} \end{cases}$ Procedure LONGEST-PALINDROME-SUBSTRING takes a String as it’s input. It stores $p[

Optimal binary search trees

Image
Suppose that we are designing a program to translate text from English to French. For each occurrence of each English word in the text, we need to look up its French equivalent. We could perform these lookup operations by building a binary search tree with $n$ English words as keys and their French equivalents as satellite data. Because we will search the tree for each individual word in the text, we want the total time spent searching to be as low as possible. We could ensure an $Θ(log_2 \;n)$ search time per occurrence by using a red-black tree or any other balanced binary search tree. Words appear with different frequencies, however, and a frequently used word such as the may appear far from the root while a rarely used word appears near the root. Such an organization would slow down the translation, since the number of nodes visited when searching for a key in a binary search tree equals one plus the depth of the node containing the key. We want words that occur frequently in the t

Longest common subsequence

Image
Biological applications often need to compare the DNA of two (or more) different organisms. A strand of DNA consists of a string of molecules called bases, where the possible bases are adenine, guanine, cytosine, and thymine. Representing each of these bases by its initial letter, we can express a strand of DNA as a string over the finite set $\{A, C, G, T\}$. For example, the DNA of one organism may be $S1 = ACCGGTCGAGTGCGCGGAAGCCGGCCGAA$, and the DNA of another organism may be $S2 = GTCGTTCGGAATGCCGTTGCTCTGTAAA$. One reason to compare two strands of DNA is to determine how similar the two strands are, as some measure of how closely related the two organisms are. However we can, and do, define similarity in many different ways. The way which we use to measure the similarity of strands $S1$ and $S2$ is by finding a third strand $S3$ in which the bases in $S3$ appear in each of $S1$ and $S2$, these bases must appear in the same order, but not necessarily consecutively. The longer the st