**Problem:**Longest common subsequence is the subsequence that appear in multiple sequences (usually two) simultaneously. For example, the longest common sequence of "13486" and "23861' is "386". The diff program is based on this algorithm. Besides, this algorithm is frequently used in bioinformatic research.

**Solution:**Still we will use DP, but different from longest Increasing subsequence, the solution we have is O(n*m), where n and m are the length of two sequence. If we have more than two sequence, the time complexity will be O(n1*n2*n3*...). The following gives the algorithm for two sequences:

- look at the last elements A[
*n*], B[*m*] of two sequences, if A[*n*] == B[*m*], we know the longest common subsequence (LCS) between A[*n*] and B[*m*], LCS(A[1:*n*], B[1:*m*]) = LCS(A[1:*n-1*], B[1:*m-1*]) +1. - if A[
*n*] <> B[*m*], then LCS(A[1:*n*], B[1:*m*]) = MAX(LCS(A[1:*n-1*], B[1:*m*]), LCS(A[1:*n*], B[1:*m-1*])). - then we will have a matrix for LCS, after filling up the matrix, we will get the answer.

**More:**The solution to LCS problem can be easily applied to

**Edit Distance**problem. Given two string X

*i*and Y

*j*, and a number of operations such as copy(), delete(), insert(), etc. Find the optimal way to transform X

*i*to Y

*j*. Each operation has a cost and our goal is to minimize the total cost of the transformation. Here only gives the recursive relations:

- Define T(
*i*,*j*) as the total cost to transform X*i*to Y*j*(one direction). There are several recursive relations. - T(
*i*,*j*) = T(*i*-1,*j*-1) + cost(copy), if X[*i*] == Y[*j*] - T(
*i*,*j*) = T(*i*-1,*j*-1) + cost(replace), if X[*i*] <> Y[*j*] - T(
*i*,*j*) = T(*i*-1,*j*) + cost(delete) - T(
*i*,*j*) = T(*i*,*j*-1) + cost(insert) - find T(
*i*,*j*) as the minimum T(*i*,*j*) in step 2,3,4,5

## No comments:

## Post a Comment