`
leonzhx
  • 浏览: 793544 次
  • 性别: Icon_minigender_1
  • 来自: 上海
社区版块
存档分类
最新评论
阅读更多

1.  Problem Definition:

    a)  Input: Strings X = x1 ... xm, Y = y1 ... yn over some alphabet Sigma (like {A,C,G,T})

    b)  Penalty Pgap for inserting a gap, Pab for matching a & b [presumably Pab = 0 of a = b]

    c)  Feasible solutions: Alignments - i.e., insert gaps to equalize lengths of the string

    d)  Goal: Alignment with minimum possible total penalty

    e)  Example :



 

2.  A Dynamic Programming Approach

    --  Key step: Identify subproblems. As usual, will look at structure of an optimal solution for clues.

        [i.e., develop a recurrence + then reverse engineer the subproblems]

    --  Structure of optimal solution: Consider an optimal alignment of X , Y and its final position:



 

    --  Three cases :

       a)  Case 1: xm , yn matched

       b)  Case 2: xm matched with a gap

       c)  Case 3: yn matched with a gap

       [Pointless to have 2 gaps]

 

    --  Optimal substructure: Let X' = X - xm, Y' = Y - yn.

       a)  If case (1) holds, then induced alignment of X' & Y' is optimal.

       b)  If case (2) holds, then induced alignment of X' & Y is optimal.

       c)  If case (3) holds, then induced alignment of X & Y' is optimal.

 

3.  Optimal Substructure Proof

    --  Proof of Case 1, other cases are similar :

        By contradiction. Suppose induced alignment of X' , Y' has penalty P while some other one has penalty P* < P.

        ==> Appending xm , yn to the latter, get an alignment of X and Y with penalty P + Pxmyn < P + Pxmyn ==> Contradicts optimality of original alignment of X & Y .

 

4.  The Recurrence

    --  Notation: Let Xi = 1st i letters of X , Yj = 1st j letters of Y. Pi,j = penalty of optimal alignment of Xi & Yj 

    --  Recurrence: For all i = 1, ... , m and j = 1, ... , n:

         Pij = min{  (1) Pxi yj + Pi-1,j-1 , (2) Pgap + Pi-1,j (3) Pgap + Pi,j-1 }

    --  Base case : Pi,0 = P0,i = i * Pgap 

 

5.  The Algorithm

    --  A = 2-D array.

        A[i , 0] = A[0 , i] = i * Pgap for any i >= 0

        For i = 1 to m

            For j = 1 to n

                A[i , j ] = min { A[i-1, j-1] + Pxiyi , A[i-1, j] + Pgap , A[i, j-1] + Pgap }

    --  Running Time : O(mn)

 

6.  Reconstructing a Solution

    --  Trace back through filled-in table A, starting at A[m , n]

    --  When you reach subproblem A[i , j ]:

        --  If A[i , j ] filled using case (1), match xi & yj and go to A[i - 1 , j - 1]

        --  If A[i , j ] filled using case (2), match xi with a gap and go to A[i - 1 , j ]

        --  If A[i , j ] filled using case (3), match yj with a gap and go to A[i , j - 1]

        [If i = 0 or j = 0, match remaining substring with gaps]

    --  Running time is only O(m + n)

  • 大小: 11 KB
  • 大小: 5.5 KB
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics