Saturday, November 5, 2011

The Longest Palindrome Substring (Manacher's algorithm)

Problem: Given a string, find a longest palindrome substring.

Solution: We can use general suffix tree that stores the original string and its reverse, which is an O(N) algorithm. However, here we give a better one with less space overhead while still O(N) complexity. This algorithm is called Manacher's algorithm. If we check a string from left to right, we can leverage the palindrome check we did previously. This is from the symmetry of palindrome. The main idea is as follow:
  • Create an array called P[], P[i] stands for the longest palindrome centered at location i. Here i is not the index in the original string. For the original string, the locations we need to check for palindromes contains those characters in string along with the spaces between characters. So if we have a string of length l, we need to have a P[] with length 2*l+1.
  • Our goal is to fill in P[]. For a particular position, we check its left and right. If equals, we extend our check further. Otherwise, the longest palindrome centered at location is found.
  • However, we need to be smarter. Actually we can leverage previous computed P[i] when we calculate a P[x] where x>i
  • So here we add two pointers, p1 and p2, which point to the left and right of the current location i such that |i-p1| = |i-p2| and p2>i>p1. We know p1 refers to a palindrome t and i refers to a palindrome s. If the first character of t is strictly on the right of the first character of s, we know P[p2] = P[p1].
  • Otherwise, say if the first character of t is not strictly on the right of the first character of s, we have P[p2] >= r - p2. where r is the right bound of the palindrome that centered at i. We then need to check if the palindrome at p2 can be longer than p2. The good thing is that we only need to start the characters beyond the length of p2.
  • When the first character of t is strictly on the right of the first character of s, we don't need to move the current center (i). Only when the first character of t is not strictly on the right of the first character of s, we need to move the current center to p2.
  • The total cost is O(N).
The code is as follow:
    void manacher(const string &s)
    {
        int len = s.size();
        if(len == 0) return;
    
        int m[2*len+1];
        m[0] = 0;
        m[1] = 1;
        // "cur" is the current center
        // "r" is the right bound of the palindrome
        // that centered at current center
        int cur, r;
        r = 2;
        cur = 1;
    
        // iterate from 2 to 2*len+1
        for(int p2=2; p2<2*len+1; p2++)
        {
            int p1 = cur- (p2-cur);
            //if p1 is negative, we need to 
            //move "cur" forward
            // re-adjust cur based on p2
            while(p1 < 0)
            {
               cur++;
               r = m[cur] + cur;
               p1 = cur- (p2-cur);
    
            }
    
            // If the first character of t is 
            // strictly on the right of the 
            // first character of s
            //
            // Or here, from the symmetry, if
            // the palindrome centered at cur
            // cover the palindrome centered at
            // p1, we know
            if(m[p1] < r - p2)
                m[p2] = m[p1];
            //otherwise
            else
            {
               // we need to explore the length of
               // the palindrome centered at p2
               // if the palindrome centered at cur covers
               // p2, we can start at "k = r-p2"
               // otherwise, we start at "k=0"
               //reset "cur" 
               cur = p2;
               int k = r-p2;
               if(k<0) k = 0;
               while(1) 
               {
                  if((p2+k+1)&1)
                  {
                    if(p2+k+1 < 2*len+1 && p2-k-1 >=0 && s[(p2+k)/2] == s[(p2-k-2)/2])
                      k++;
                    else break;
                  }
                   else
                  {
                    if(p2+k+1 < 2*len+1 && p2-k-1 >=0)
                      k++;
                    else break;
                  }
    
               }
               // set the right boundary to be "p2+k"
               r = p2+k;
               m[p2] = k;
            }
    
    
        }
    
     
    }
    
    
    

    No comments:

    Post a Comment