4.4 QR迭代法 QR迭代法的基本思想是通过不断的正交相似变换,使得 A A A 趋向于一个上三角形式(或拟上三角形式).算法形式非常简洁,描述如下:
算法4.5. QR迭代法 (QRIteration) 1: Set A 1 = A A_{1} = A A 1 = A and k = 1 k = 1 k = 1 2: while not convergence do 3: [ Q k , R k ] = Q R ( A k ) % Q R [Q_k, R_k] = \mathsf{QR}(A_k) \quad \% \quad \mathsf{QR} [ Q k , R k ] = QR ( A k ) % QR 分解 4: compute A k + 1 = R k Q k A_{k + 1} = R_kQ_k A k + 1 = R k Q k 5: k = k + 1 k = k + 1 k = k + 1 6: end while
在该算法中, 我们有
A k + 1 = R k Q k = ( Q k ⊤ Q k ) R k Q k = Q k ⊤ ( Q k R k ) Q k = Q k ⊤ A k Q k . A _ {k + 1} = R _ {k} Q _ {k} = \left(Q _ {k} ^ {\top} Q _ {k}\right) R _ {k} Q _ {k} = Q _ {k} ^ {\top} \left(Q _ {k} R _ {k}\right) Q _ {k} = Q _ {k} ^ {\top} A _ {k} Q _ {k}. A k + 1 = R k Q k = ( Q k ⊤ Q k ) R k Q k = Q k ⊤ ( Q k R k ) Q k = Q k ⊤ A k Q k . 由这个递推关系可得
A k + 1 = Q k T A k Q k = Q k T Q k − 1 T A k − 1 Q k − 1 Q k = ⋯ = Q k T Q k − 1 T … Q 1 T A Q 1 … Q k − 1 Q k . A _ {k + 1} = Q _ {k} ^ {\mathsf {T}} A _ {k} Q _ {k} = Q _ {k} ^ {\mathsf {T}} Q _ {k - 1} ^ {\mathsf {T}} A _ {k - 1} Q _ {k - 1} Q _ {k} = \dots = Q _ {k} ^ {\mathsf {T}} Q _ {k - 1} ^ {\mathsf {T}} \dots Q _ {1} ^ {\mathsf {T}} A Q _ {1} \dots Q _ {k - 1} Q _ {k}. A k + 1 = Q k T A k Q k = Q k T Q k − 1 T A k − 1 Q k − 1 Q k = ⋯ = Q k T Q k − 1 T … Q 1 T A Q 1 … Q k − 1 Q k . 记 Q ~ k = Q 1 ⋯ Q k − 1 Q k = [ q ~ 1 ( k ) , q ~ 2 ( k ) , … , q ~ n ( k ) ] \tilde{Q}_k = Q_1\cdots Q_{k-1}Q_k = \left[\tilde{q}_1^{(k)}, \tilde{q}_2^{(k)}, \ldots, \tilde{q}_n^{(k)}\right] Q ~ k = Q 1 ⋯ Q k − 1 Q k = [ q ~ 1 ( k ) , q ~ 2 ( k ) , … , q ~ n ( k ) ] , 则
A k + 1 = Q ~ k T A Q ~ k , (4.1) A _ {k + 1} = \tilde {Q} _ {k} ^ {\mathsf {T}} A \tilde {Q} _ {k}, \tag {4.1} A k + 1 = Q ~ k T A Q ~ k , ( 4.1 ) 即 A k + 1 A_{k + 1} A k + 1 与 A A A 正交相似
4.4.1 QR迭代与幂迭代的关系 记 R ~ k = R k R k − 1 ⋯ R 1 \tilde{R}_k = R_k R_{k-1} \cdots R_1 R ~ k = R k R k − 1 ⋯ R 1 ,则有
Q ~ k R ~ k = Q ~ k − 1 ( Q k R k ) R ~ k − 1 = Q ~ k − 1 ( A k ) R ~ k − 1 = Q ~ k − 1 ( Q ~ k − 1 T A Q ~ k − 1 ) R ~ k − 1 = A Q ~ k − 1 R ~ k − 1 , \tilde {Q} _ {k} \tilde {R} _ {k} = \tilde {Q} _ {k - 1} (Q _ {k} R _ {k}) \tilde {R} _ {k - 1} = \tilde {Q} _ {k - 1} (A _ {k}) \tilde {R} _ {k - 1} = \tilde {Q} _ {k - 1} (\tilde {Q} _ {k - 1} ^ {\mathsf {T}} A \tilde {Q} _ {k - 1}) \tilde {R} _ {k - 1} = A \tilde {Q} _ {k - 1} \tilde {R} _ {k - 1}, Q ~ k R ~ k = Q ~ k − 1 ( Q k R k ) R ~ k − 1 = Q ~ k − 1 ( A k ) R ~ k − 1 = Q ~ k − 1 ( Q ~ k − 1 T A Q ~ k − 1 ) R ~ k − 1 = A Q ~ k − 1 R ~ k − 1 , 由此递推下去, 即可得
Q ~ k R ~ k = A k − 1 Q ~ 1 R ~ 1 = A k − 1 Q 1 R 1 = A k . (4.2) \tilde {Q} _ {k} \tilde {R} _ {k} = A ^ {k - 1} \tilde {Q} _ {1} \tilde {R} _ {1} = A ^ {k - 1} Q _ {1} R _ {1} = A ^ {k}. \tag {4.2} Q ~ k R ~ k = A k − 1 Q ~ 1 R ~ 1 = A k − 1 Q 1 R 1 = A k . ( 4.2 ) 故
Q ~ k R ~ k e 1 = A k e 1 , \tilde {Q} _ {k} \tilde {R} _ {k} e _ {1} = A ^ {k} e _ {1}, Q ~ k R ~ k e 1 = A k e 1 , 这说明QR迭代与幂迭代有关
假设 ∣ λ 1 ∣ > ∣ λ 2 ∣ ≥ ⋯ ≥ ∣ λ n ∣ |\lambda_1| > |\lambda_2| \geq \dots \geq |\lambda_n| ∣ λ 1 ∣ > ∣ λ 2 ∣ ≥ ⋯ ≥ ∣ λ n ∣ , 则当 k k k 充分大时, A k e 1 A^k e_1 A k e 1 收敛到 A A A 的模最大特征值 λ 1 \lambda_1 λ 1 所对应的特征向量, 故 Q ~ k \tilde{Q}_k Q ~ k 的第一列 q ~ 1 ( k ) \tilde{q}_1^{(k)} q ~ 1 ( k ) 也收敛到 λ 1 \lambda_1 λ 1 所对应的特征向量. 因此, 当 k k k 充分大时, A q ~ 1 ( k ) → λ 1 q ~ 1 ( k ) A\tilde{q}_1^{(k)} \to \lambda_1\tilde{q}_1^{(k)} A q ~ 1 ( k ) → λ 1 q ~ 1 ( k ) , 此时由 (4.1) 可知, A k + 1 A_{k+1} A k + 1 的第一列
A k + 1 ( : , 1 ) = Q ~ k T A q ~ 1 ( k ) → λ 1 Q ~ k T q ~ 1 ( k ) = λ 1 e 1 , A _ {k + 1} (:, 1) = \tilde {Q} _ {k} ^ {\mathsf {T}} A \tilde {q} _ {1} ^ {(k)} \rightarrow \lambda_ {1} \tilde {Q} _ {k} ^ {\mathsf {T}} \tilde {q} _ {1} ^ {(k)} = \lambda_ {1} e _ {1}, A k + 1 ( : , 1 ) = Q ~ k T A q ~ 1 ( k ) → λ 1 Q ~ k T q ~ 1 ( k ) = λ 1 e 1 , 即 A k + 1 A_{k + 1} A k + 1 的第一列的第一个元素收敛到 λ 1 \lambda_1 λ 1 ,而其它元素都趋向于0,收敛速度取决于 ∣ λ 2 / λ 1 ∣ \left|\lambda_2 / \lambda_1\right| ∣ λ 2 / λ 1 ∣ 的大小.
4.4.2 QR迭代与反迭代的关系 下面观察 Q ~ k \tilde{Q}_k Q ~ k 的最后一列.由(4.1)可知
A Q ~ k = Q ~ k A k + 1 = Q ~ k Q k + 1 R k + 1 = Q ~ k + 1 R k + 1 , A \tilde {Q} _ {k} = \tilde {Q} _ {k} A _ {k + 1} = \tilde {Q} _ {k} Q _ {k + 1} R _ {k + 1} = \tilde {Q} _ {k + 1} R _ {k + 1}, A Q ~ k = Q ~ k A k + 1 = Q ~ k Q k + 1 R k + 1 = Q ~ k + 1 R k + 1 , 所以有
Q ~ k + 1 = A Q ~ k R k + 1 − 1 . \tilde {Q} _ {k + 1} = A \tilde {Q} _ {k} R _ {k + 1} ^ {- 1}. Q ~ k + 1 = A Q ~ k R k + 1 − 1 . 由于 Q ~ k + 1 \tilde{Q}_{k + 1} Q ~ k + 1 和 Q ~ k \tilde{Q}_k Q ~ k 都是正交矩阵,上式两边转置后求逆,可得
Q ~ k + 1 = ( Q ~ k + 1 T ) − 1 = ( ( R k + 1 − 1 ) T Q ~ k T A T ) − 1 = ( A T ) − 1 Q ~ k R k + 1 T . \tilde {Q} _ {k + 1} = \left(\tilde {Q} _ {k + 1} ^ {\mathsf {T}}\right) ^ {- 1} = \left(\left(R _ {k + 1} ^ {- 1}\right) ^ {\mathsf {T}} \tilde {Q} _ {k} ^ {\mathsf {T}} A ^ {\mathsf {T}}\right) ^ {- 1} = \left(A ^ {\mathsf {T}}\right) ^ {- 1} \tilde {Q} _ {k} R _ {k + 1} ^ {\mathsf {T}}. Q ~ k + 1 = ( Q ~ k + 1 T ) − 1 = ( ( R k + 1 − 1 ) T Q ~ k T A T ) − 1 = ( A T ) − 1 Q ~ k R k + 1 T . 观察等式两边矩阵的最后一列,可得
q ~ n ( k + 1 ) = c 1 ( A T ) − 1 q ~ n ( k ) , \tilde {q} _ {n} ^ {(k + 1)} = c _ {1} \left(A ^ {\mathsf {T}}\right) ^ {- 1} \tilde {q} _ {n} ^ {(k)}, q ~ n ( k + 1 ) = c 1 ( A T ) − 1 q ~ n ( k ) , 其中 c 1 c_{1} c 1 为某个常数.依此类推,可知
q ~ n ( k + 1 ) = c ( A T ) − k q ~ n ( 1 ) , \tilde {q} _ {n} ^ {(k + 1)} = c \left(A ^ {\mathsf {T}}\right) ^ {- k} \tilde {q} _ {n} ^ {(1)}, q ~ n ( k + 1 ) = c ( A T ) − k q ~ n ( 1 ) , 其中 c c c 为某个常数. 假设 A A A 的特征值满足 ∣ λ 1 ∣ ≥ ∣ λ 2 ∣ ≥ ⋯ ≥ ∣ λ n − 1 ∣ > ∣ λ n ∣ > 0 |\lambda_1| \geq |\lambda_2| \geq \dots \geq |\lambda_{n-1}| > |\lambda_n| > 0 ∣ λ 1 ∣ ≥ ∣ λ 2 ∣ ≥ ⋯ ≥ ∣ λ n − 1 ∣ > ∣ λ n ∣ > 0 , 则 λ n − 1 \lambda_n^{-1} λ n − 1 是 ( A T ) − 1 \left(A^{\mathsf{T}}\right)^{-1} ( A T ) − 1 的模最大的特征值. 由幂迭代的收敛性可知, q ~ n ( k + 1 ) \tilde{q}_n^{(k+1)} q ~ n ( k + 1 ) 收敛到 ( A T ) − 1 \left(A^{\mathsf{T}}\right)^{-1} ( A T ) − 1 的模最大特征值 λ n − 1 \lambda_n^{-1} λ n − 1 所对应的特征向量, 即当 k k k 充分大时, 有
( A T ) − 1 q ~ n ( k + 1 ) → λ n − 1 q ~ n ( k + 1 ) . \left(A ^ {\mathsf {T}}\right) ^ {- 1} \tilde {q} _ {n} ^ {(k + 1)} \rightarrow \lambda_ {n} ^ {- 1} \tilde {q} _ {n} ^ {(k + 1)}. ( A T ) − 1 q ~ n ( k + 1 ) → λ n − 1 q ~ n ( k + 1 ) . 所以
A T q ~ n ( k + 1 ) → λ n q ~ n ( k + 1 ) . A ^ {\mathsf {T}} \tilde {q} _ {n} ^ {(k + 1)} \to \lambda_ {n} \tilde {q} _ {n} ^ {(k + 1)}. A T q ~ n ( k + 1 ) → λ n q ~ n ( k + 1 ) . 由(4.1)可知, A k + 1 T A_{k + 1}^{\mathsf{T}} A k + 1 T 的最后一列为
A k + 1 T ( : , n ) = Q ~ k T A T q ~ n ( k ) → λ n Q ~ k T q ~ n ( k ) = λ n e n , A _ {k + 1} ^ {\mathsf {T}} (:, n) = \tilde {Q} _ {k} ^ {\mathsf {T}} A ^ {\mathsf {T}} \tilde {q} _ {n} ^ {(k)} \rightarrow \lambda_ {n} \tilde {Q} _ {k} ^ {\mathsf {T}} \tilde {q} _ {n} ^ {(k)} = \lambda_ {n} e _ {n}, A k + 1 T ( : , n ) = Q ~ k T A T q ~ n ( k ) → λ n Q ~ k T q ~ n ( k ) = λ n e n , 即 A k + 1 A_{k + 1} A k + 1 的最后一行的最后一个元素收敛到 λ n \lambda_{n} λ n , 而其它元素都趋向于0, 收敛速度取决于 ∣ λ n / λ n − 1 ∣ \left|\lambda_{n} / \lambda_{n - 1}\right| ∣ λ n / λ n − 1 ∣ 的大小.
4.4.3 QR迭代与正交迭代的关系 下面的定理给出了QR迭代法与正交迭代法(取 Z 0 = I Z_{0} = I Z 0 = I )之间的关系.
定理4.2 设正交迭代法4.4和QR算法4.5中所涉及的QR分解都是唯一的。设 A k A_{k} A k 是由QR迭代法4.5生成的矩阵, Z k Z_{k} Z k 是由正交迭代法4.4(取 Z 0 = I Z_{0} = I Z 0 = I )生成的矩阵,则有
A k + 1 = Z k T A Z k . A _ {k + 1} = Z _ {k} ^ {\mathsf {T}} A Z _ {k}. A k + 1 = Z k T A Z k . (板书)
证明. 我们用归纳法证明该结论
当 k = 0 k = 0 k = 0 时, A 1 = A A_{1} = A A 1 = A , Z 0 = I Z_{0} = I Z 0 = I . 结论显然成立.
设 A k = Z k − 1 T A Z k − 1 A_{k} = Z_{k - 1}^{\mathsf{T}}AZ_{k - 1} A k = Z k − 1 T A Z k − 1 .由于 Z k − 1 Z_{k - 1} Z k − 1 是正交矩阵,我们有
Z k R ^ k = Y k = A Z k − 1 = ( Z k − 1 Z k − 1 T ) A Z k − 1 = Z k − 1 A k = ( Z k − 1 Q k ) R k , Z _ {k} \hat {R} _ {k} = Y _ {k} = A Z _ {k - 1} = (Z _ {k - 1} Z _ {k - 1} ^ {\mathsf {T}}) A Z _ {k - 1} = Z _ {k - 1} A _ {k} = (Z _ {k - 1} Q _ {k}) R _ {k}, Z k R ^ k = Y k = A Z k − 1 = ( Z k − 1 Z k − 1 T ) A Z k − 1 = Z k − 1 A k = ( Z k − 1 Q k ) R k , 即 Z k R ^ k Z_{k}\hat{R}_{k} Z k R ^ k 和 ( Z k − 1 Q k ) R k (Z_{k - 1}Q_k)R_k ( Z k − 1 Q k ) R k 都是 Y k Y_{k} Y k 的QR分解.由QR分解的唯一性可知, Z k = Z k − 1 Q k , R ^ k = R k Z_{k} = Z_{k - 1}Q_{k},\hat{R}_{k} = R_{k} Z k = Z k − 1 Q k , R ^ k = R k 所以
Z k ⊤ A Z k = ( Z k − 1 Q k ) ⊤ A ( Z k − 1 Q k ) = Q k ⊤ A k Q k = Q k ⊤ ( Q k R k ) Q k = R k Q k = A k + 1 , Z _ {k} ^ {\top} A Z _ {k} = \left(Z _ {k - 1} Q _ {k}\right) ^ {\top} A \left(Z _ {k - 1} Q _ {k}\right) = Q _ {k} ^ {\top} A _ {k} Q _ {k} = Q _ {k} ^ {\top} \left(Q _ {k} R _ {k}\right) Q _ {k} = R _ {k} Q _ {k} = A _ {k + 1}, Z k ⊤ A Z k = ( Z k − 1 Q k ) ⊤ A ( Z k − 1 Q k ) = Q k ⊤ A k Q k = Q k ⊤ ( Q k R k ) Q k = R k Q k = A k + 1 , 即 A k + 1 = Z k ⊤ A Z k A_{k + 1} = Z_k^\top A Z_k A k + 1 = Z k ⊤ A Z k 由归纳法可知,定理结论成立
4.4.4 QR迭代的收敛性 定理4.3 设 A = V Λ V − 1 ∈ R n × n A = V\Lambda V^{-1} \in \mathbb{R}^{n \times n} A = V Λ V − 1 ∈ R n × n , 其中 Λ = d i a g ( λ 1 , λ 2 , … , λ n ) \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_n) Λ = diag ( λ 1 , λ 2 , … , λ n ) , 且 ∣ λ 1 ∣ > ∣ λ 2 ∣ > ⋯ > ∣ λ n ∣ > 0 |\lambda_1| > |\lambda_2| > \dots > |\lambda_n| > 0 ∣ λ 1 ∣ > ∣ λ 2 ∣ > ⋯ > ∣ λ n ∣ > 0 . 若 V − 1 V^{-1} V − 1 的所有顺序主子矩阵都非奇异(即 V − 1 V^{-1} V − 1 存在 LU 分解),则 A k A_k A k 的对角线以下的元素均收敛到 0. (板书)
证明. 设 V = Q v R v V = Q_{v}R_{v} V = Q v R v 是 V V V 的QR分解, V − 1 = L v U v V^{-1} = L_{v}U_{v} V − 1 = L v U v 是 V − 1 V^{-1} V − 1 的LU分解, 其中 L v L_{v} L v 是单位下三角矩阵. 则
A k = V Λ k V − 1 = Q v R v Λ k L v U v = Q v R v ( Λ k L v Λ − k ) Λ k U v . A ^ {k} = V \Lambda^ {k} V ^ {- 1} = Q _ {v} R _ {v} \Lambda^ {k} L _ {v} U _ {v} = Q _ {v} R _ {v} \left(\Lambda^ {k} L _ {v} \Lambda^ {- k}\right) \Lambda^ {k} U _ {v}. A k = V Λ k V − 1 = Q v R v Λ k L v U v = Q v R v ( Λ k L v Λ − k ) Λ k U v . 注意到矩阵 Λ k L v Λ − k \Lambda^k L_v\Lambda^{-k} Λ k L v Λ − k 是一个下三角矩阵, 且其 ( i , j ) (i,j) ( i , j ) 位置上的元素为
( Λ k L v Λ − k ) ( i , j ) = { 0 , i < j , 1 , i = j , L v ( i , j ) λ i k / λ j k , i > j . \left(\Lambda^ {k} L _ {v} \Lambda^ {- k}\right) (i, j) = \left\{ \begin{array}{l l} 0, & i < j, \\ 1, & i = j, \\ L _ {v} (i, j) \lambda_ {i} ^ {k} / \lambda_ {j} ^ {k}, & i > j. \end{array} \right. ( Λ k L v Λ − k ) ( i , j ) = ⎩ ⎨ ⎧ 0 , 1 , L v ( i , j ) λ i k / λ j k , i < j , i = j , i > j . 由于当 i > j i > j i > j 时有 ∣ λ i / λ j ∣ < 1 \left|\lambda_i / \lambda_j\right| < 1 ∣ λ i / λ j ∣ < 1 , 故当 k k k 充分大时, λ i k / λ j k \lambda_i^k /\lambda_j^k λ i k / λ j k 趋向于0. 所以我们可以把 Λ k L v Λ − k \Lambda^k L_v\Lambda^{-k} Λ k L v Λ − k 写成
Λ k L v Λ − k = I + E k , \Lambda^ {k} L _ {v} \Lambda^ {- k} = I + E _ {k}, Λ k L v Λ − k = I + E k , 其中 E k E_{k} E k 满足 lim k → ∞ E k = 0. \lim_{k\to \infty}E_k = 0. lim k → ∞ E k = 0. 于是
A k = Q v R v ( I + E k ) Λ k U v = Q v ( I + R v E k R v − 1 ) R v Λ k U v . (4.3) A ^ {k} = Q _ {v} R _ {v} \left(I + E _ {k}\right) \Lambda^ {k} U _ {v} = Q _ {v} \left(I + R _ {v} E _ {k} R _ {v} ^ {- 1}\right) R _ {v} \Lambda^ {k} U _ {v}. \tag {4.3} A k = Q v R v ( I + E k ) Λ k U v = Q v ( I + R v E k R v − 1 ) R v Λ k U v . ( 4.3 ) 对矩阵 I + R v E k R v − 1 I + R_{v}E_{k}R_{v}^{-1} I + R v E k R v − 1 做QR分解: I + R v E k R v − 1 = Q E k R E k I + R_{v}E_{k}R_{v}^{-1} = Q_{E_{k}}R_{E_{k}} I + R v E k R v − 1 = Q E k R E k 由于 E k → 0 E_{k}\to 0 E k → 0 ,所以 Q E k → I , Q_{E_k}\rightarrow I, Q E k → I , R E k → I . R_{E_k}\rightarrow I. R E k → I . 将其代入 ( \ref e q : 2 ) (\ref{eq:2}) ( \ref e q : 2 ) 可得
A k = Q v Q E k R E k R v Λ k U v = Q v Q E k D k ( D k − 1 R E k R v Λ k U v ) , (4.4) A ^ {k} = Q _ {v} Q _ {E _ {k}} R _ {E _ {k}} R _ {v} \Lambda^ {k} U _ {v} = Q _ {v} Q _ {E _ {k}} D _ {k} \left(D _ {k} ^ {- 1} R _ {E _ {k}} R _ {v} \Lambda^ {k} U _ {v}\right), \tag {4.4} A k = Q v Q E k R E k R v Λ k U v = Q v Q E k D k ( D k − 1 R E k R v Λ k U v ) , ( 4.4 ) 其中 D k D_{k} D k 是对角矩阵, 其对角线元素的模均为 1, 它使得上三角矩阵 D k − 1 R E k R v Λ k U v D_{k}^{-1}R_{E_{k}}R_{v}\Lambda^{k}U_{v} D k − 1 R E k R v Λ k U v 的对角线元素均为正. 这样, (??) 就构成 A k A^{k} A k 的 QR 分解. 又由 (4.2) 可知 A k = Q ~ k R ~ k A^{k} = \tilde{Q}_{k}\tilde{R}_{k} A k = Q ~ k R ~ k , 根据 QR 分解的唯一性, 我们可得
Q ~ k = Q v Q E k D k , R ~ k = D k − 1 R E k R v Λ k U v . \tilde {Q} _ {k} = Q _ {v} Q _ {E _ {k}} D _ {k}, \qquad \tilde {R} _ {k} = D _ {k} ^ {- 1} R _ {E _ {k}} R _ {v} \Lambda^ {k} U _ {v}. Q ~ k = Q v Q E k D k , R ~ k = D k − 1 R E k R v Λ k U v . 所以由 (4.1) 可知
A k + 1 = Q ~ k T A Q ~ k = ( Q v Q E k D k ) ⊺ V Λ V − 1 Q v Q E k D k = D k ⊤ Q E k ⊤ Q v ⊤ Q v R v Λ R v − 1 Q v − 1 Q v Q E k D k = D k ⊤ Q E k ⊤ R v Λ R v − 1 Q E k D k . \begin{array}{l} A _ {k + 1} = \tilde {Q} _ {k} ^ {\mathsf {T}} A \tilde {Q} _ {k} \\ = \left(Q _ {v} Q _ {E _ {k}} D _ {k}\right) ^ {\intercal} V \Lambda V ^ {- 1} Q _ {v} Q _ {E _ {k}} D _ {k} \\ = D _ {k} ^ {\top} Q _ {E _ {k}} ^ {\top} Q _ {v} ^ {\top} Q _ {v} R _ {v} \Lambda R _ {v} ^ {- 1} Q _ {v} ^ {- 1} Q _ {v} Q _ {E _ {k}} D _ {k} \\ = D _ {k} ^ {\top} Q _ {E _ {k}} ^ {\top} R _ {v} \Lambda R _ {v} ^ {- 1} Q _ {E _ {k}} D _ {k}. \\ \end{array} A k + 1 = Q ~ k T A Q ~ k = ( Q v Q E k D k ) ⊺ V Λ V − 1 Q v Q E k D k = D k ⊤ Q E k ⊤ Q v ⊤ Q v R v Λ R v − 1 Q v − 1 Q v Q E k D k = D k ⊤ Q E k ⊤ R v Λ R v − 1 Q E k D k . 由于 Q E k → I Q_{E_k} \to I Q E k → I , 所以当 k → ∞ k \to \infty k → ∞ 时, A k + 1 A_{k+1} A k + 1 收敛到一个上三角矩阵. 收敛速度取决于
max 1 ≤ i < n ∣ λ i + 1 λ i ∣ \max _ {1 \leq i < n} \left| \frac {\lambda_ {i + 1}}{\lambda_ {i}} \right| 1 ≤ i < n max λ i λ i + 1 的大小. □需要指出的是,由于 D k D_{k} D k 的元素不一定收敛,故 A k + 1 A_{k + 1} A k + 1 对角线以上
(不含对角线)的元素不一定收敛, 但这不妨碍 A k + 1 A_{k+1} A k + 1 的对角线元素收敛到 A A A 的特征值(即 A k + 1 A_{k+1} A k + 1 的对角线元素是收敛的).
例4.3 QR迭代法演示.设
A = X [ 9 5 3 1 ] X − 1 , A = X \left[ \begin{array}{c c c c} 9 & & & \\ & 5 & & \\ & & 3 & \\ & & & 1 \end{array} \right] X ^ {- 1}, A = X 9 5 3 1 X − 1 , 其中 X X X 是由MATLAB随机生成的非奇异矩阵
在迭代过程中, 对于 A k A_{k} A k 的下三角部分中元素, 如果其绝对值小于某个阈值 tol \text{tol} tol , 则直接将其设为 0, 即
a i j ( k ) = 0 i f i > j a n d ∣ a i j ( k ) ∣ < t o l . a _ {i j} ^ {(k)} = 0 \quad \text {i f} \quad i > j \text {a n d} | a _ {i j} ^ {(k)} | < t o l. a ij ( k ) = 0 i f i > j a n d ∣ a ij ( k ) ∣ < t o l . 这里我们取 t o l = 10 − 6 max 1 ≤ i , j ≤ n { ∣ a i j ( k ) ∣ } . tol = 10^{-6}\max_{1\leq i,j\leq n}\{|a_{ij}^{(k)}|\} . t o l = 1 0 − 6 max 1 ≤ i , j ≤ n { ∣ a ij ( k ) ∣ } . (Eig_QR.m)
4.4.5 带位移的QR迭代法 为了加快 QR 迭代的收敛速度, 我们可以采用位移策略和反迭代思想.
算法4.6.带位移的QR迭代法(QRIterationwithshift)
1: Set A 1 = A A_{1} = A A 1 = A and k = 1 k = 1 k = 1 2: while not convergence do 3: Choose a shift σ k \sigma_{k} σ k 4: [ Q k , R k ] = Q R ( A k − σ k I ) % Q R [Q_k, R_k] = \mathsf{QR}(A_k - \sigma_k I) \quad \% \quad \mathsf{QR} [ Q k , R k ] = QR ( A k − σ k I ) % QR 分解 5: Compute A k + 1 = R k Q k + σ k I A_{k + 1} = R_kQ_k + \sigma_kI A k + 1 = R k Q k + σ k I 6: k = k + 1 k = k + 1 k = k + 1 7: end while
与不带位移的QR迭代一样,我们有
A k + 1 = R k Q k + σ k I = ( Q k T Q k ) R k Q k + σ k I = Q k T ( A k − σ k I ) Q k + σ k I = Q k T A k Q k A _ {k + 1} = R _ {k} Q _ {k} + \sigma_ {k} I = (Q _ {k} ^ {\mathsf {T}} Q _ {k}) R _ {k} Q _ {k} + \sigma_ {k} I = Q _ {k} ^ {\mathsf {T}} (A _ {k} - \sigma_ {k} I) Q _ {k} + \sigma_ {k} I = Q _ {k} ^ {\mathsf {T}} A _ {k} Q _ {k} A k + 1 = R k Q k + σ k I = ( Q k T Q k ) R k Q k + σ k I = Q k T ( A k − σ k I ) Q k + σ k I = Q k T A k Q k 所以, 带位移的 QR 算法中所得到的矩阵 A k A_{k} A k 仍然与 A 1 = A A_{1} = A A 1 = A 正交相似.
在带位移的QR迭代法中, 一个很重要的问题就是位移 σ k \sigma_{k} σ k 的选取. 在前面的分析中我们已经知道, A k + 1 ( n , n ) A_{k + 1}(n,n) A k + 1 ( n , n ) 将收敛到 A A A 的模最小的特征值, 且收敛速度取决于模最小特征值与模第二小特征值之间的比值. 显然, 若 σ k \sigma_{k} σ k 就是 A A A 的一个特征值, 则 A k − σ k I A_{k} - \sigma_{k}I A k − σ k I 的模最小特征值为0, 故QR算
法迭代一步就收敛.此时
A k + 1 = R k Q k + σ k I = [ A k + 1 ( n − 1 ) × ( n − 1 ) ∗ 0 σ k ] . A _ {k + 1} = R _ {k} Q _ {k} + \sigma_ {k} I = \left[ \begin{array}{c c} A _ {k + 1} ^ {(n - 1) \times (n - 1)} & * \\ 0 & \sigma_ {k} \end{array} \right]. A k + 1 = R k Q k + σ k I = [ A k + 1 ( n − 1 ) × ( n − 1 ) 0 ∗ σ k ] . 如果需要计算 A A A 的其它特征值, 则可对子矩阵 A k + 1 ( n − 1 ) × ( n − 1 ) A_{k+1}^{(n-1) \times (n-1)} A k + 1 ( n − 1 ) × ( n − 1 ) 使用带位移的 QR 迭代法.
通常, 如果 σ k \sigma_{k} σ k 与 A A A 的某个特征值非常接近, 则收敛速度通常会很快. 由于 A k ( n , n ) A_{k}(n,n) A k ( n , n ) 收敛到 A A A 的一个特征值, 所以在实际使用中, 一个比较直观的位移选择策略是 σ k = A k ( n , n ) \sigma_{k} = A_{k}(n,n) σ k = A k ( n , n ) . 事实上, 这样的位移选取方法通常会使得 QR 迭代有二次收敛速度.
例4.4带位移的QR迭代法演示.所有数据和设置与例4.3相同,在迭代过程中,取 σ k = A k ( n , n ) \sigma_{k} = A_{k}(n,n) σ k = A k ( n , n ) 如果 A k ( n , n ) A_{k}(n,n) A k ( n , n ) 已经收敛,则取 σ k = A k ( n − 1 , n − 1 ) \sigma_{k} = A_{k}(n - 1,n - 1) σ k = A k ( n − 1 , n − 1 ) (Eig_QR_shift.m)