5.1 Jacobi迭代法 该算法的基本思想是通过一系列的 Jacobi 旋转 J k J_{k} J k , 将 A A A 正交相似于一个对角矩阵, 即
A ( 0 ) = A , A ( k + 1 ) = J k A ( k ) J k T , k = 0 , 1 , … , A ^ {(0)} = A, \quad A ^ {(k + 1)} = J _ {k} A ^ {(k)} J _ {k} ^ {\mathsf {T}}, \quad k = 0, 1, \ldots , A ( 0 ) = A , A ( k + 1 ) = J k A ( k ) J k T , k = 0 , 1 , … , 且 A ( k ) A^{(k)} A ( k ) 收敛到一个对角矩阵,其中 J k J_{k} J k 为Jacobi旋转,通常选取 J k J_{k} J k 为Givens变换,即
J k = G ( i k , j k , θ k ) = [ 1 ⋱ 1 cos θ k sin θ k ⋱ − sin θ k cos θ k 1 ⋱ 1 ] J _ {k} = G \left(i _ {k}, j _ {k}, \theta_ {k}\right) = \left[ \begin{array}{c c c c c c c c c} 1 & & & & & & & & \\ & \ddots & & & & & & & \\ & & 1 & & & & & & \\ & & & \cos \theta_ {k} & & \sin \theta_ {k} & & & \\ & & & & \ddots & & & & \\ & & & - \sin \theta_ {k} & & \cos \theta_ {k} & & & \\ & & & & & & 1 & & \\ & & & & & & & \ddots & \\ & & & & & & & & 1 \end{array} \right] J k = G ( i k , j k , θ k ) = 1 ⋱ 1 cos θ k − sin θ k ⋱ sin θ k cos θ k 1 ⋱ 1 易知, 在 A ( k ) A^{(k)} A ( k ) 两边分别左乘 J k J_{k} J k 和右乘 J k T J_{k}^{\mathsf{T}} J k T 时, 只会修改 A ( k ) A^{(k)} A ( k ) 的第 i k i_{k} i k 和第 j k j_{k} j k 行, 以及第 i k i_{k} i k 和第 j k j_{k} j k 列.
由于 A ( k ) A^{(k)} A ( k ) 是对称矩阵, 由下面的引理可知, 通过选取适当的 θ k \theta_{k} θ k , 可以将 A ( k ) ( i k , j k ) A^{(k)}(i_k,j_k) A ( k ) ( i k , j k ) 和 A ( k ) ( j k , i k ) A^{(k)}(j_k,i_k) A ( k ) ( j k , i k ) 同时化为0.
引理5.1设 A ∈ R 2 × 2 A\in \mathbb{R}^{2\times 2} A ∈ R 2 × 2 是对称矩阵,则存在Givens变换 G ∈ R 2 × 2 G\in \mathbb{R}^{2\times 2} G ∈ R 2 × 2 ,使得 G A G ⊤ GAG^{\top} G A G ⊤ 为对角矩阵
(板书)
证明. 设
A = [ a b b c ] , G = [ cos θ sin θ − sin θ cos θ ] , A = \left[ \begin{array}{c c} a & b \\ b & c \end{array} \right], \quad G = \left[ \begin{array}{c c} \cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \end{array} \right], A = [ a b b c ] , G = [ cos θ − sin θ sin θ cos θ ] , 则
G A G T = [ cos θ sin θ − sin θ cos θ ] [ a b b c ] [ cos θ sin θ − sin θ cos θ ] T = [ a cos 2 θ + c sin 2 θ + b sin 2 θ 1 2 ( c − a ) sin 2 θ + b cos 2 θ 1 2 ( c − a ) sin 2 θ + b cos 2 θ a sin 2 θ + c cos 2 θ − b sin 2 θ ] \begin{array}{l} G A G ^ {\mathsf {T}} = \left[ \begin{array}{l l} \cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \end{array} \right] \left[ \begin{array}{l l} a & b \\ b & c \end{array} \right] \left[ \begin{array}{l l} \cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \end{array} \right] ^ {\mathsf {T}} \\ = \left[ \begin{array}{l l} a \cos^ {2} \theta + c \sin^ {2} \theta + b \sin 2 \theta & \frac {1}{2} (c - a) \sin 2 \theta + b \cos 2 \theta \\ \frac {1}{2} (c - a) \sin 2 \theta + b \cos 2 \theta & a \sin^ {2} \theta + c \cos^ {2} \theta - b \sin 2 \theta \end{array} \right] \\ \end{array} G A G T = [ cos θ − sin θ sin θ cos θ ] [ a b b c ] [ cos θ − sin θ sin θ cos θ ] T = [ a cos 2 θ + c sin 2 θ + b sin 2 θ 2 1 ( c − a ) sin 2 θ + b cos 2 θ 2 1 ( c − a ) sin 2 θ + b cos 2 θ a sin 2 θ + c cos 2 θ − b sin 2 θ ] 令 1 2 ( c − a ) sin 2 θ + b cos 2 θ = 0 \frac{1}{2} (c - a)\sin 2\theta +b\cos 2\theta = 0 2 1 ( c − a ) sin 2 θ + b cos 2 θ = 0 ,可得
a − c 2 b = cot 2 θ = 1 − tan 2 θ 2 tan θ . \frac {a - c}{2 b} = \cot 2 \theta = \frac {1 - \tan^ {2} \theta}{2 \tan \theta}. 2 b a − c = cot 2 θ = 2 tan θ 1 − tan 2 θ . 解得
tan θ = sign ( τ ) ∣ τ ∣ + 1 + τ 2 , τ = a − c 2 b . \tan \theta = \frac {\operatorname {s i g n} (\tau)}{| \tau | + \sqrt {1 + \tau^ {2}}}, \qquad \tau = \frac {a - c}{2 b}. tan θ = ∣ τ ∣ + 1 + τ 2 sign ( τ ) , τ = 2 b a − c . 故引理结论成立.
为了使得 A ( k ) A^{(k)} A ( k ) 收敛到一个对角矩阵, 其非对角线元素必须趋向于 0. 记 off ( A ) \operatorname{off}(A) off ( A ) 为所有非对角线元素的平方和, 即
off ( A ) = ∑ i ≠ j a i j 2 = ∥ A ∥ F 2 − ∑ i = 1 n a i i 2 , \operatorname {o f f} (A) = \sum_ {i \neq j} a _ {i j} ^ {2} = \| A \| _ {F} ^ {2} - \sum_ {i = 1} ^ {n} a _ {i i} ^ {2}, off ( A ) = i = j ∑ a ij 2 = ∥ A ∥ F 2 − i = 1 ∑ n a ii 2 , 我们的目标就是使得 off ( A ) \operatorname{off}(A) off ( A ) 尽快趋于 0.
引理5.2 设 A = [ a i j ] ∈ R n × n A = [a_{ij}] \in \mathbb{R}^{n \times n} A = [ a ij ] ∈ R n × n 是对称矩阵, A ^ = [ a ^ i j ] = J A J ⊤ , J = G ( i , j , θ ) \hat{A} = [\hat{a}_{ij}] = JAJ^{\top}, J = G(i,j,\theta) A ^ = [ a ^ ij ] = J A J ⊤ , J = G ( i , j , θ ) , 其中 θ \theta θ 的选取使得 a ^ i j = a ^ j i = 0 \hat{a}_{ij} = \hat{a}_{ji} = 0 a ^ ij = a ^ ji = 0 , 则
o f f ( A ^ ) = o f f ( A ) − 2 a i j 2 , i ≠ j . \mathrm {o f f} (\hat {A}) = \mathrm {o f f} (A) - 2 a _ {i j} ^ {2}, \quad i \neq j. off ( A ^ ) = off ( A ) − 2 a ij 2 , i = j . (板书)
证明. 记 A = [ a 1 , a 2 , … , a n ] A = [a_{1}, a_{2}, \ldots, a_{n}] A = [ a 1 , a 2 , … , a n ] . 令 A ~ = J A = [ a ~ i j ] n × n \tilde{A} = JA = [\tilde{a}_{ij}]_{n \times n} A ~ = J A = [ a ~ ij ] n × n . 由于 J J J 是正交阵, 故
∥ J a k ∥ 2 = ∥ a k ∥ 2 , k = 1 , 2 , … , n . \| J a _ {k} \| _ {2} = \| a _ {k} \| _ {2}, \quad k = 1, 2, \dots , n. ∥ J a k ∥ 2 = ∥ a k ∥ 2 , k = 1 , 2 , … , n . 又 J J J 左乘 a k a_{k} a k 时, 只影响其第 i i i 和第 j j j 个元素的值, 故由 ∥ J a i ∥ 2 = ∥ a i ∥ 2 \| Ja_i\|_2 = \| a_i\|_2 ∥ J a i ∥ 2 = ∥ a i ∥ 2 和 ∥ J a j ∥ 2 = ∥ a j ∥ 2 \| Ja_j\|_2 = \| a_j\|_2 ∥ J a j ∥ 2 = ∥ a j ∥ 2 可得
a ~ i i 2 + a ~ j i 2 = a i i 2 + a j i 2 , a ~ i j 2 + a ~ j j 2 = a i j 2 + a j j 2 . (5.1) \tilde {a} _ {i i} ^ {2} + \tilde {a} _ {j i} ^ {2} = a _ {i i} ^ {2} + a _ {j i} ^ {2}, \quad \tilde {a} _ {i j} ^ {2} + \tilde {a} _ {j j} ^ {2} = a _ {i j} ^ {2} + a _ {j j} ^ {2}. \tag {5.1} a ~ ii 2 + a ~ ji 2 = a ii 2 + a ji 2 , a ~ ij 2 + a ~ jj 2 = a ij 2 + a jj 2 . ( 5.1 ) 同理, 由 A ^ = A ~ J T \hat{A} = \tilde{A} J^{\mathsf{T}} A ^ = A ~ J T 可得
a ^ i i 2 + a ^ i j 2 = a ~ i i 2 + a ~ i j 2 , a ^ j i 2 + a ^ j j 2 = a ~ j i 2 + a ~ j j 2 . (5.2) \hat {a} _ {i i} ^ {2} + \hat {a} _ {i j} ^ {2} = \tilde {a} _ {i i} ^ {2} + \tilde {a} _ {i j} ^ {2}, \quad \hat {a} _ {j i} ^ {2} + \hat {a} _ {j j} ^ {2} = \tilde {a} _ {j i} ^ {2} + \tilde {a} _ {j j} ^ {2}. \tag {5.2} a ^ ii 2 + a ^ ij 2 = a ~ ii 2 + a ~ ij 2 , a ^ ji 2 + a ^ jj 2 = a ~ ji 2 + a ~ jj 2 . ( 5.2 ) 又 a ^ i j = a ^ j i = 0 \hat{a}_{ij} = \hat{a}_{ji} = 0 a ^ ij = a ^ ji = 0 ,故
a ^ i i 2 + a ^ j j 2 = a i i 2 + a j j 2 + a i j 2 + a j i 2 = a i i 2 + a j j 2 + 2 a i j 2 . \hat {a} _ {i i} ^ {2} + \hat {a} _ {j j} ^ {2} = a _ {i i} ^ {2} + a _ {j j} ^ {2} + a _ {i j} ^ {2} + a _ {j i} ^ {2} = a _ {i i} ^ {2} + a _ {j j} ^ {2} + 2 a _ {i j} ^ {2}. a ^ ii 2 + a ^ jj 2 = a ii 2 + a jj 2 + a ij 2 + a ji 2 = a ii 2 + a jj 2 + 2 a ij 2 . 由于 J A J T JAJ^{\mathsf{T}} J A J T 只影响 A A A 的第 i , j i,j i , j 行和第 i , j i,j i , j 列, 故对角线元素中只有 a i i a_{ii} a ii 和 a j j a_{jj} a jj 受影响. 所以
∑ k = 1 n a ^ k k 2 = ∑ k = 1 n a k k 2 + 2 a i j 2 , \sum_ {k = 1} ^ {n} \hat {a} _ {k k} ^ {2} = \sum_ {k = 1} ^ {n} a _ {k k} ^ {2} + 2 a _ {i j} ^ {2}, k = 1 ∑ n a ^ kk 2 = k = 1 ∑ n a kk 2 + 2 a ij 2 , 故
o f f ( A ^ ) = ∥ A ^ ∥ 2 2 − ∑ k = 1 n a ^ k k 2 = ∥ A ∥ 2 2 − ∑ k = 1 n a k k 2 − 2 a i j 2 = o f f ( A ) − 2 a i j 2 , \mathrm {o f f} (\hat {A}) = \| \hat {A} \| _ {2} ^ {2} - \sum_ {k = 1} ^ {n} \hat {a} _ {k k} ^ {2} = \| A \| _ {2} ^ {2} - \sum_ {k = 1} ^ {n} a _ {k k} ^ {2} - 2 a _ {i j} ^ {2} = \mathrm {o f f} (A) - 2 a _ {i j} ^ {2}, off ( A ^ ) = ∥ A ^ ∥ 2 2 − k = 1 ∑ n a ^ kk 2 = ∥ A ∥ 2 2 − k = 1 ∑ n a kk 2 − 2 a ij 2 = off ( A ) − 2 a ij 2 , 即引理结论成立.
由此可知, off ( A ( k ) ) \operatorname{off}(A^{(k)}) off ( A ( k ) ) 总是不断减小的. 下面给出 Jacobi 迭代算法
算法5.1.Jacobi迭代算法 1: Given a symmetric matrix A ∈ R n × n A \in \mathbb{R}^{n \times n} A ∈ R n × n 2: if eigenvectors are desired then 3: set J = I J = I J = I and f l a g = 1 flag = 1 f l a g = 1 4: end if 5: while not converge do 6: choose an index pair ( i , j ) (i,j) ( i , j ) such that a i j ≠ 0 a_{ij} \neq 0 a ij = 0 7: τ = ( a i i − a j j ) / ( 2 a i j ) \tau = (a_{ii} - a_{jj}) / (2a_{ij}) τ = ( a ii − a jj ) / ( 2 a ij ) 8: t = sign ( τ ) / ( ∣ τ ∣ + 1 + τ 2 ) t = \operatorname{sign}(\tau) / (|\tau| + \sqrt{1 + \tau^2}) t = sign ( τ ) / ( ∣ τ ∣ + 1 + τ 2 ) % \% % 计算 tan θ \tan \theta tan θ 9: c = 1 / 1 + t 2 , s = c t c = 1 / \sqrt{1 + t^2}, \quad s = ct c = 1/ 1 + t 2 , s = c t % 计算 cos θ \cos \theta cos θ 和 sin θ \sin \theta sin θ 10: A = G ( i , j , θ ) A G ( i , j , θ ) T A = G(i,j,\theta)AG(i,j,\theta)^{\mathsf{T}} A = G ( i , j , θ ) A G ( i , j , θ ) T 11: if f l a g = 1 flag = 1 f l a g = 1 then 12: J = G ( i , j , θ ) J J = G(i,j,\theta)J J = G ( i , j , θ ) J
13: end if 14: end while
该算法涉及到 a i j a_{ij} a ij 的选取问题, 一种直观的选取方法就是使得 a i j a_{ij} a ij 为所有非对角线元素中绝对值最大的一个, 这就是经典 Jacobi 迭代算法.
算法5.2.经典Jacobi迭代算法 1: Given a symmetric matrix A ∈ R n × n A \in \mathbb{R}^{n \times n} A ∈ R n × n 2: if eigenvectors are desired then 3: set J = I J = I J = I and f l a g = 1 flag = 1 f l a g = 1 4: end if 5: while off ( A ) > tol \operatorname{off}(A) > \operatorname{tol} off ( A ) > tol do 6: choose ( i , j ) (i,j) ( i , j ) such that ∣ a i j ∣ = max k ≠ l ∣ a k l ∣ |a_{ij}| = \max_{k\neq l}|a_{kl}| ∣ a ij ∣ = max k = l ∣ a k l ∣ % 选取绝对值最大的元素 7: τ = ( a i i − a j j ) / ( 2 a i j ) \tau = (a_{ii} - a_{jj}) / (2a_{ij}) τ = ( a ii − a jj ) / ( 2 a ij ) 8: t = s i g n ( τ ) / ( ∣ τ ∣ + 1 + τ 2 ) t = \mathrm{sign}(\tau) / (|\tau| + \sqrt{1 + \tau^2}) t = sign ( τ ) / ( ∣ τ ∣ + 1 + τ 2 ) 9: c = 1 / 1 + t 2 , s = c t c = 1 / \sqrt{1 + t^2}, \quad s = ct c = 1/ 1 + t 2 , s = c t % 计算 cos θ \cos \theta cos θ 和 sin θ \sin \theta sin θ 10: A = G ( i , j , θ ) A G ( i , j , θ ) T A = G(i,j,\theta)AG(i,j,\theta)^{\mathsf{T}} A = G ( i , j , θ ) A G ( i , j , θ ) T 11: if f l a g = 1 flag = 1 f l a g = 1 then 12: J = G ( i , j , θ ) J J = G(i,j,\theta)J J = G ( i , j , θ ) J 13: end if 14: end while
可以证明,经典Jacobi算法至少是线性收敛的
定理5.3 对于经典Jacobi算法5.2,有
o f f ( A ( k + 1 ) ) ≤ ( 1 − 1 N ) o f f ( A ( k ) ) , N = n ( n − 1 ) 2 . \mathrm {o f f} (A ^ {(k + 1)}) \leq \left(1 - \frac {1}{N}\right) \mathrm {o f f} (A ^ {(k)}), \quad N = \frac {n (n - 1)}{2}. off ( A ( k + 1 ) ) ≤ ( 1 − N 1 ) off ( A ( k ) ) , N = 2 n ( n − 1 ) . 故 k k k 步迭代后,有
off ( A ( k ) ) ≤ ( 1 − 1 N ) k off ( A ( 0 ) ) = ( 1 − 1 N ) k off ( A ) . \operatorname {o f f} (A ^ {(k)}) \leq \left(1 - \frac {1}{N}\right) ^ {k} \operatorname {o f f} (A ^ {(0)}) = \left(1 - \frac {1}{N}\right) ^ {k} \operatorname {o f f} (A). off ( A ( k ) ) ≤ ( 1 − N 1 ) k off ( A ( 0 ) ) = ( 1 − N 1 ) k off ( A ) . (板书)
证明. 由于在经典 Jacobi 算法 5.2 中, ∣ a i j ∣ = max k ≠ l ∣ a k l ∣ |a_{ij}| = \max_{k \neq l} |a_{kl}| ∣ a ij ∣ = max k = l ∣ a k l ∣ , 故 off ( A ( k ) ) ≤ n ( n + 1 ) ( a i j ( k ) ) 2 \operatorname{off}(A^{(k)}) \leq n(n + 1) \left(a_{ij}^{(k)}\right)^2 off ( A ( k ) ) ≤ n ( n + 1 ) ( a ij ( k ) ) 2 , 即
2 ( a i j ( k ) ) 2 ≥ 1 N o f f ( A ( k ) ) , N = n ( n − 1 ) 2 . 2 \left(a _ {i j} ^ {(k)}\right) ^ {2} \geq \frac {1}{N} \mathrm {o f f} (A ^ {(k)}), \quad N = \frac {n (n - 1)}{2}. 2 ( a ij ( k ) ) 2 ≥ N 1 off ( A ( k ) ) , N = 2 n ( n − 1 ) . 所以由引理5.2可知
off ( A ( k + 1 ) ) = off ( A ( k ) ) − ( a i j ( k ) ) 2 ≤ ( 1 − 1 N ) off ( A ( k ) ) . \operatorname {o f f} (A ^ {(k + 1)}) = \operatorname {o f f} (A ^ {(k)}) - \left(a _ {i j} ^ {(k)}\right) ^ {2} \leq \left(1 - \frac {1}{N}\right) \operatorname {o f f} (A ^ {(k)}). off ( A ( k + 1 ) ) = off ( A ( k ) ) − ( a ij ( k ) ) 2 ≤ ( 1 − N 1 ) off ( A ( k ) ) .
事实上, 经典 Jacobi 算法最终是 (渐进) 二次收敛的 [30, 100]
定理5.4经典Jacobi算法5.2是 N N N 步(渐进)二次收敛的,即对足够大的 k k k ,有
off ( A ( k + N ) ) = O ( off 2 ( A ( k ) ) ) . \operatorname {o f f} \left(A ^ {(k + N)}\right) = O \left(\operatorname {o f f} ^ {2} \left(A ^ {(k)}\right)\right). off ( A ( k + N ) ) = O ( off 2 ( A ( k ) ) ) . 由于在经典 Jacobi 算法中, 每一步都要寻找绝对值最大的非对角元, 比较费时, 因此实用性较差. 我们可以通过逐行扫描来选取 ( i , j ) (i,j) ( i , j ) , 这就是循环 Jacobi 迭代算法.
算法5.3. 循环Jacobi迭代算法(逐行扫描) 1: Given a symmetric matrix A ∈ R n × n A \in \mathbb{R}^{n \times n} A ∈ R n × n 2: if eigenvectors are desired then 3: set J = I J = I J = I and f l a g = 1 flag = 1 f l a g = 1 4: end if 5: while off ( A ) > t o l (A) > tol ( A ) > t o l do 6: for i = 1 i = 1 i = 1 to n − 1 n - 1 n − 1 do 7: for j = i + 1 j = i + 1 j = i + 1 to n n n do 8: if a i j ≠ 0 a_{ij} \neq 0 a ij = 0 then 9: τ = ( a i i − a j j ) / ( 2 a i j ) \tau = (a_{ii} - a_{jj}) / (2a_{ij}) τ = ( a ii − a jj ) / ( 2 a ij ) 10: t = s i g n ( τ ) / ( ∣ τ ∣ + 1 + τ 2 ) t = \mathrm{sign}(\tau) / (|\tau| + \sqrt{1 + \tau^2}) t = sign ( τ ) / ( ∣ τ ∣ + 1 + τ 2 ) 11: c = 1 / 1 + t 2 c = 1 / \sqrt{1 + t^2} c = 1/ 1 + t 2 12: s = c ⋅ t s = c \cdot t s = c ⋅ t 13: A = G ( i , j , θ ) T A G ( i , j , θ ) A = G(i, j, \theta)^{\mathrm{T}} AG(i, j, \theta) A = G ( i , j , θ ) T A G ( i , j , θ ) 14: if flag = 1 then 15: J = J ⋅ G ( i , j , θ ) J = J \cdot G(i, j, \theta) J = J ⋅ G ( i , j , θ ) 16: end if 17: end if 18: end for 19: end for 20: end while
循环 Jacobi 也具有 (渐进) 二次收敛性 [135, page 270].
Jacobi迭代法的优缺点 优点:能够达到很高的计算精度 (特别是小特征值); 同时非常适合并行计算.
缺点: 计算速度较慢; 矩阵稀疏性得不到充分的利用.