7.3_共轭梯度法

7.3 共轭梯度法

共轭梯度法 (Conjugate Gradient, CG) 是当前求解对称正定线性方程组的首选方法. CG 算法可以从优化角度得出, 也可以从代数角度推导. 我们这里就从代数角度来介绍 CG 算法.

7.3.1 算法基本过程

我们首先给出CG算法的一个性质,然后根据这个性质来推导CG算法.

定理7.5 设 AA 对称正定, 则

x(m)=argminxx(0)+KmxxA(7.9)x ^ {(m)} = \arg \min _ {x \in x ^ {(0)} + \mathcal {K} _ {m}} \| x - x _ {*} \| _ {A} \tag {7.9}

的充要条件是

x(m)x(0)+KmbAx(m)Km.(7.10)x ^ {(m)} \in x ^ {(0)} + \mathcal {K} _ {m} \quad \text {且} \quad b - A x ^ {(m)} \perp \mathcal {K} _ {m}. \tag {7.10}

(板书)

证明. 首先证明充分性. 设 x(m)x^{(m)} 满足 (7.10). 记 x~=x(m)x(0)\tilde{x} = x^{(m)} - x^{(0)} , 则 x~Km\tilde{x} \in \mathcal{K}_m

r0Ax~Km.r _ {0} - A \tilde {x} \perp \mathcal {K} _ {m}.

由正交投影的性质可知, x~\tilde{x} 是最佳逼近问题

minxKmxA1r0A\min _ {x \in \mathcal {K} _ {m}} \| x - A ^ {- 1} r _ {0} \| _ {A}

的解,即

x~=argminxKmxA1r(0)A=argminxKmxA1(bAx(0))A=argminxKm(x(0)+x)xA.\begin{array}{l} \tilde {x} = \arg \min _ {x \in \mathcal {K} _ {m}} \| x - A ^ {- 1} r ^ {(0)} \| _ {A} \\ = \arg \min _ {x \in \mathcal {K} _ {m}} \| x - A ^ {- 1} (b - A x ^ {(0)}) \| _ {A} \\ = \arg \min _ {x \in \mathcal {K} _ {m}} \| (x ^ {(0)} + x) - x _ {*} \| _ {A}. \\ \end{array}

所以, x(m)=x(0)+x~x^{(m)} = x^{(0)} + \tilde{x} 是最佳逼近问题

minxx(0)+KmxxA\min _ {x \in x ^ {(0)} + \mathcal {K} _ {m}} \| x - x _ {*} \| _ {A}

的解,即结论(7.9)成立

必要性只需利用正交投影的性质即可,见作业7.3

AA 对称正定时, Arnoldi 过程就转化为 Lanczos 过程, 且

AVm=Vm+1Tm+1,m=VmTm+βmvm+1emT,A V _ {m} = V _ {m + 1} T _ {m + 1, m} = V _ {m} T _ {m} + \beta_ {m} v _ {m + 1} e _ {m} ^ {\mathsf {T}},
VmTAVm=Tm,V _ {m} ^ {\mathsf {T}} A V _ {m} = T _ {m},

其中 Tm=tridiag(βi,αi+1,βi+1)T_{m} = \mathrm{tridiag}(\beta_{i},\alpha_{i + 1},\beta_{i + 1}) ,见(7.3).由定理7.5可知,此时我们需要在 x(0)+Kmx^{(0)} + \mathcal{K}_m 寻找最优近似解 x(m)x^{(m)} ,满足

bAx(m)Km.(7.11)b - A x ^ {(m)} \perp \mathcal {K} _ {m}. \tag {7.11}

根据这个性质, 我们就可以给出 CG 算法

x(m)=x(0)+Vmz(m)x^{(m)} = x^{(0)} + V_mz^{(m)} ,其中 z(m)Rmz^{(m)}\in \mathbb{R}^m .由(7.11)可知

0=VmT(bAx(m))=VmT(r0AVmz(m))=VmT(βv1)VmTAVmz(m)=βe1Tmz(m).0 = V _ {m} ^ {\mathsf {T}} (b - A x ^ {(m)}) = V _ {m} ^ {\mathsf {T}} (r _ {0} - A V _ {m} z ^ {(m)}) = V _ {m} ^ {\mathsf {T}} (\beta v _ {1}) - V _ {m} ^ {\mathsf {T}} A V _ {m} z ^ {(m)} = \beta e _ {1} - T _ {m} z ^ {(m)}.

因此,

z(m)=Tm1(βe1).z ^ {(m)} = T _ {m} ^ {- 1} (\beta e _ {1}).

于是可得

x(m)=x(0)+Vmz(m)=x(0)+VmTm1(βe1).x ^ {(m)} = x ^ {(0)} + V _ {m} z ^ {(m)} = x ^ {(0)} + V _ {m} T _ {m} ^ {- 1} (\beta e _ {1}).

如果 x(m)x^{(m)} 满足精度要求, 则计算结束. 否则令 mm+1m \gets m + 1 , 继续下一步迭代. 这就是 CG 方法的基本过程.

7.3.2 实用迭代格式

下面我们推导CG方法的具体实施过程.由于 AA 是对称正定的,因此我们可以借助三项递推公式来简化运算.

首先, 根据 TmT_{m} 对称正定性, 我们知道 TmT_{m} 存在 LDL\mathrm{LDL}^{\top} 分解, 即 Tm=LmDmLmT_{m} = L_{m}D_{m}L_{m}^{\top} . 于是

x(m)=x(0)+Vmz(m)=x(0)+VmTm1(βe1(m))=x(0)+(VmLmT)(βDm1Lm1e1(m)),x ^ {(m)} = x ^ {(0)} + V _ {m} z ^ {(m)} = x ^ {(0)} + V _ {m} T _ {m} ^ {- 1} (\beta e _ {1} ^ {(m)}) = x ^ {(0)} + (V _ {m} L _ {m} ^ {- T}) (\beta D _ {m} ^ {- 1} L _ {m} ^ {- 1} e _ {1} ^ {(m)}),

其中 e1(m)e_1^{(m)} 表示 mm 阶单位矩阵的第一列. 如果 x(m)x^{(m)} 满足精度要求, 则计算结束. 否则我们需要计算 x(m+1)x^{(m+1)} , 即

x(m+1)=x(0)+Vm+1Tm+11(βe1(m+1))=x(0)+(Vm+1Lm+1T)(βDm+11Lm+11e1(m+1)).x ^ {(m + 1)} = x ^ {(0)} + V _ {m + 1} T _ {m + 1} ^ {- 1} (\beta e _ {1} ^ {(m + 1)}) = x ^ {(0)} + (V _ {m + 1} L _ {m + 1} ^ {- T}) (\beta D _ {m + 1} ^ {- 1} L _ {m + 1} ^ {- 1} e _ {1} ^ {(m + 1)}).

这里假定 Tm+1T_{m + 1}LDLT\mathrm{LDL}^{\mathsf{T}} 分解为 Tm+1=Lm+1Dm+1Lm+1TT_{m + 1} = L_{m + 1}D_{m + 1}L_{m + 1}^{\mathsf{T}} . 下面就考虑如何利用递推方式来计算 x(m+1)x^{(m + 1)} . 记

P~mVmLmT=[p~1,p~2,,p~m]Rn×m,\tilde {P} _ {m} \triangleq V _ {m} L _ {m} ^ {- T} = \left[ \tilde {p} _ {1}, \tilde {p} _ {2}, \dots , \tilde {p} _ {m} \right] \in \mathbb {R} ^ {n \times m},
ymβDm1Lm1e1(m)=[η1,,ηm]Rm.y _ {m} \triangleq \beta D _ {m} ^ {- 1} L _ {m} ^ {- 1} e _ {1} ^ {(m)} = [ \eta_ {1}, \dots , \eta_ {m} ] ^ {\intercal} \in \mathbb {R} ^ {m}.

我们首先证明下面的结论

引理7.6 设 P~m\tilde{P}_mymy_m 由上面的式子所定义, 则下面的递推公式成立:

P~m+1Vm+1Lm+1T=[P~m,p~m+1],\tilde {P} _ {m + 1} \triangleq V _ {m + 1} L _ {m + 1} ^ {- T} = [ \tilde {P} _ {m}, \tilde {p} _ {m + 1} ],
ym+1βDm+11Lm+11e1(m+1)=[ymT,ηm+1]T,m=1,2,.y _ {m + 1} \triangleq \beta D _ {m + 1} ^ {- 1} L _ {m + 1} ^ {- 1} e _ {1} ^ {(m + 1)} = [ y _ {m} ^ {\mathsf {T}}, \eta_ {m + 1} ] ^ {\mathsf {T}}, \quad m = 1, 2, \ldots .

(板书)

证明. 设 TmT_{m}LDL\mathrm{LDL}^{\top} 分解为

Tm=LmDmLmT=[1l11lm11][d1d2dm][1l11lm11]T.T _ {m} = L _ {m} D _ {m} L _ {m} ^ {\mathsf {T}} = \left[ \begin{array}{c c c c} 1 & & & \\ l _ {1} & 1 & & \\ & \ddots & \ddots & \\ & & l _ {m - 1} & 1 \end{array} \right] \left[ \begin{array}{c c c c} d _ {1} & & & \\ & d _ {2} & & \\ & & \ddots & \\ & & & d _ {m} \end{array} \right] \left[ \begin{array}{c c c c} 1 & & & \\ l _ {1} & 1 & & \\ & \ddots & \ddots & \\ & & l _ {m - 1} & 1 \end{array} \right] ^ {\mathsf {T}}.

由待定系数法可知 d1=α1d_{1} = \alpha_{1}

li=βi/di,di+1=αi+1liβi,i=1,2,,m1.l _ {i} = \beta_ {i} / d _ {i}, \quad d _ {i + 1} = \alpha_ {i + 1} - l _ {i} \beta_ {i}, \quad i = 1, 2, \dots , m - 1.

γ=[0,,0,βm]TRm\gamma = [0,\dots ,0,\beta_{m}]^{\mathsf{T}}\in \mathbb{R}^{m} ,则 Tm+1T_{m + 1} 的LDL分解为

Tm+1=[TmγmγmTαm+1]=Lm+1Dm+1Lm+1T=[1l11lm11lm1][d1d2dmdm+1][1l11lm11lm1]T,\begin{array}{l} T _ {m + 1} = \left[ \begin{array}{c c} T _ {m} & \gamma_ {m} \\ \gamma_ {m} ^ {\mathsf {T}} & \alpha_ {m + 1} \end{array} \right] = L _ {m + 1} D _ {m + 1} L _ {m + 1} ^ {\mathsf {T}} \\ = \left[ \begin{array}{c c c c c} 1 & & & & \\ l _ {1} & 1 & & & \\ & \ddots & \ddots & & \\ & & l _ {m - 1} & 1 & \\ & & & l _ {m} & 1 \end{array} \right] \left[ \begin{array}{c c c c c} d _ {1} & & & & \\ & d _ {2} & & & \\ & & \ddots & & \\ & & & d _ {m} & \\ & & & & d _ {m + 1} \end{array} \right] \left[ \begin{array}{c c c c c} 1 & & & & \\ l _ {1} & 1 & & & \\ & \ddots & \ddots & & \\ & & l _ {m - 1} & 1 & \\ & & & l _ {m} & 1 \end{array} \right] ^ {\mathsf {T}}, \\ \end{array}

其中 lm=βm/dm,dm+1=αm+1lmβm.l_{m} = \beta_{m} / d_{m},d_{m + 1} = \alpha_{m + 1} - l_{m}\beta_{m}.γ~=[0,,0,lm]Rm\tilde{\gamma} = [0,\dots ,0,l_m]^\top \in \mathbb{R}^m ,则

Lm+1=[Lm0γ~T1]a n dLm+11=[Lm10γ~TLm11].L _ {m + 1} = \left[ \begin{array}{c c} L _ {m} & 0 \\ \tilde {\gamma} ^ {\mathsf {T}} & 1 \end{array} \right] \quad \text {a n d} \quad L _ {m + 1} ^ {- 1} = \left[ \begin{array}{c c} L _ {m} ^ {- 1} & 0 \\ - \tilde {\gamma} ^ {\mathsf {T}} L _ {m} ^ {- 1} & 1 \end{array} \right].

所以有

P~m+1=Vm+1Lm+1T=[Vm,vm+1][LmTLmTγ~01]=[VmLmT,VmLmTγ~+vm+1].\tilde {P} _ {m + 1} = V _ {m + 1} L _ {m + 1} ^ {- T} = [ V _ {m}, v _ {m + 1} ] \left[ \begin{array}{c c} L _ {m} ^ {- T} & - L _ {m} ^ {- T} \tilde {\gamma} \\ 0 & 1 \end{array} \right] = [ V _ {m} L _ {m} ^ {- T}, - V _ {m} L _ {m} ^ {- T} \tilde {\gamma} + v _ {m + 1} ].

VmLmTγ~=P~m[0,,0,lm]T=lmp~mV_{m}L_{m}^{-T}\tilde{\gamma} = \tilde{P}_{m}[0,\dots ,0,l_{m}]^{\mathsf{T}} = l_{m}\tilde{p}_{m} ,所以 P~m+1=[P~m,p~m+1]\tilde{P}_{m + 1} = \left[\tilde{P}_m,\tilde{p}_{m + 1}\right] ,其中

p~m+1=vm+1lmp~m.(7.12)\tilde {p} _ {m + 1} = v _ {m + 1} - l _ {m} \tilde {p} _ {m}. \tag {7.12}

另外,

ym+1=βDm+11Lm+11e1(m+1)=β[Dm100dm+11][Lm10γ~TLm11][e1(m)0]=[βDm1Lm1e1(m)βdm+11γ~Lm1e1(m)]=[ymηm+1],\begin{array}{l} y _ {m + 1} = \beta D _ {m + 1} ^ {- 1} L _ {m + 1} ^ {- 1} e _ {1} ^ {(m + 1)} = \beta \left[ \begin{array}{c c} D _ {m} ^ {- 1} & 0 \\ 0 & d _ {m + 1} ^ {- 1} \end{array} \right] \left[ \begin{array}{c c} L _ {m} ^ {- 1} & 0 \\ - \tilde {\gamma} ^ {\mathsf {T}} L _ {m} ^ {- 1} & 1 \end{array} \right] \left[ \begin{array}{c} e _ {1} ^ {(m)} \\ 0 \end{array} \right] \\ = \left[ \begin{array}{c} \beta D _ {m} ^ {- 1} L _ {m} ^ {- 1} e _ {1} ^ {(m)} \\ - \beta d _ {m + 1} ^ {- 1} \tilde {\gamma} L _ {m} ^ {- 1} e _ {1} ^ {(m)} \end{array} \right] \\ = \left[ \begin{array}{c} y _ {m} \\ \eta_ {m + 1} \end{array} \right], \\ \end{array}

其中 ηm+1βdm+11γ~Lm1e1(m)\eta_{m + 1}\triangleq -\beta d_{m + 1}^{-1}\tilde{\gamma} L_m^{-1}e_1^{(m)} .所以结论成立.

有了上面的性质, 我们就可以得到从 x(m)x^{(m)}x(m+1)x^{(m+1)} 的递推公式

x(m+1)=P~m+1ym+1=[P~m,p~m+1][ymηm+1]=x(m)+ηm+1p~m+1.x ^ {(m + 1)} = \tilde {P} _ {m + 1} y _ {m + 1} = [ \tilde {P} _ {m}, \tilde {p} _ {m + 1} ] \left[ \begin{array}{c} y _ {m} \\ \eta_ {m + 1} \end{array} \right] = x ^ {(m)} + \eta_ {m + 1} \tilde {p} _ {m + 1}.

为了判断近似解是否满足要求, 我们还需要计算残量. 直接计算可知, 残量满足下面的递推公式:

rm+1=bAx(m+1)=bA(x(m)+ηm+1p~m+1)=rmηm+1Ap~m+1.r _ {m + 1} = b - A x ^ {(m + 1)} = b - A \left(x ^ {(m)} + \eta_ {m + 1} \tilde {p} _ {m + 1}\right) = r _ {m} - \eta_ {m + 1} A \tilde {p} _ {m + 1}.

另一方面,我们有

rm=bAx(m)=bA(x(0)+Vmz(m))=r0AVmz(m)=βv1VmTmz(m)βmvm+1emTz(m)=βm(emTz(m))vm+1,\begin{array}{l} r _ {m} = b - A x ^ {(m)} = b - A \left(x ^ {(0)} + V _ {m} z ^ {(m)}\right) \\ = r _ {0} - A V _ {m} z ^ {(m)} \\ = \beta v _ {1} - V _ {m} T _ {m} z ^ {(m)} - \beta_ {m} v _ {m + 1} e _ {m} ^ {\mathsf {T}} z ^ {(m)} \\ = - \beta_ {m} (e _ {m} ^ {\mathsf {T}} z ^ {(m)}) v _ {m + 1}, \\ \end{array}

rmr_mvm+1v_{m + 1} 平行.记

rm=τmvm+1,m=0,1,2,,(7.13)r _ {m} = \tau_ {m} v _ {m + 1}, \quad m = 0, 1, 2, \dots , \tag {7.13}

其中

τ0=β=r02,τm=βm(emTz(m)),m=1,2,.\tau_ {0} = \beta = \| r _ {0} \| _ {2}, \quad \tau_ {m} = - \beta_ {m} \left(e _ {m} ^ {\mathsf {T}} z ^ {(m)}\right), \quad m = 1, 2, \dots .

引入变量

pmτm1p~m,m=1,2,.p _ {m} \triangleq \tau_ {m - 1} \tilde {p} _ {m}, \quad m = 1, 2, \dots .

(\refeq:1)(\ref{eq:1}) 和(7.12)可知, {pm}\{p_m\} 满足下面的递推公式:

pm+1=τmp~m+1=τm(vm+1lmp~m)=rm+μmpm,(7.14)p _ {m + 1} = \tau_ {m} \tilde {p} _ {m + 1} = \tau_ {m} \left(v _ {m + 1} - l _ {m} \tilde {p} _ {m}\right) = r _ {m} + \mu_ {m} p _ {m}, \tag {7.14}

其中 μm=lmτmτm1,m=1,2,\mu_{m} = -\frac{l_{m}\tau_{m}}{\tau_{m - 1}},m = 1,2,\dots

于是我们就得到递推公式 (m=1,2,)(m = 1,2,\ldots)

x(m+1)=x(m)+ηm+1p~m+1=x(m)+ξm+1pm+1,(7.15)x ^ {(m + 1)} = x ^ {(m)} + \eta_ {m + 1} \tilde {p} _ {m + 1} = x ^ {(m)} + \xi_ {m + 1} p _ {m + 1}, \tag {7.15}
rm+1=rmηm+1Ap~m+1=rmξm+1Apm+1,(7.16)r _ {m + 1} = r _ {m} - \eta_ {m + 1} A \tilde {p} _ {m + 1} = r _ {m} - \xi_ {m + 1} A p _ {m + 1}, \tag {7.16}

其中 ξm+1=ηm+1τm,m=1,2,.\xi_{m + 1} = \frac{\eta_{m + 1}}{\tau_m}, m = 1,2,\dots.

下面需要考虑系数 ξm+1\xi_{m + 1}μm\mu_{m} 的计算方法. 首先给出下面的性质

引理7.7 对于共轭梯度法,下面的结论成立:

(1) r1,r2,,rmr_1, r_2, \ldots, r_m 相互正交;
(2) p1,p2,,pmp_1, p_2, \ldots, p_m 相互 AA -共轭 (或 AA -正交), 即当 iji \neq j 时有 piApj=0p_i^\top A p_j = 0 .

(板书)

证明. (1) 由于 rkr_kvk+1v_{k+1} 平行, 所以结论成立.

(2) 只需证明 PmTAPmP_{m}^{\mathsf{T}} A P_{m} 是对角矩阵即可, 即证 P~mTAP~m\tilde{P}_{m}^{\mathsf{T}} A \tilde{P}_{m} 是对角矩阵. 通过直接计算可得

P~mTAP~m=(VmLmT)TAVmLmT=Lm1VmTAVmLmT=Lm1TmLmT=Lm1(LmDmLm)LmT=Dm.\begin{array}{l} \tilde {P} _ {m} ^ {\mathsf {T}} A \tilde {P} _ {m} = (V _ {m} L _ {m} ^ {- T}) ^ {\mathsf {T}} A V _ {m} L _ {m} ^ {- T} \\ = L _ {m} ^ {- 1} V _ {m} ^ {\mathsf {T}} A V _ {m} L _ {m} ^ {- T} \\ = L _ {m} ^ {- 1} T _ {m} L _ {m} ^ {- T} \\ = L _ {m} ^ {- 1} (L _ {m} D _ {m} L _ {m} ^ {\top}) L _ {m} ^ {- T} \\ = D _ {m}. \\ \end{array}

下面给出 ξm+1\xi_{m + 1}μm\mu_{m} 的计算公式. 在等式 (7.13) 两边同时左乘 pm+1TAp_{m + 1}^{\mathsf{T}}A 可得

pm+1Apm+1=pm+1Arm+μmpm+1Apm=rmApm+1.p _ {m + 1} ^ {\top} A p _ {m + 1} = p _ {m + 1} ^ {\top} A r _ {m} + \mu_ {m} p _ {m + 1} ^ {\top} A p _ {m} = r _ {m} ^ {\top} A p _ {m + 1}.

再用 rmTr_m^{\mathsf{T}} 左乘方程(7.15)可得

0=rmTrm+1=rmTrmξm+1rmTApm+1.0 = r _ {m} ^ {\mathsf {T}} r _ {m + 1} = r _ {m} ^ {\mathsf {T}} r _ {m} - \xi_ {m + 1} r _ {m} ^ {\mathsf {T}} A p _ {m + 1}.

于是

ξm+1=rmrmrmApm+1=rmrmpm+1Apm+1.(7.17)\xi_ {m + 1} = \frac {r _ {m} ^ {\top} r _ {m}}{r _ {m} ^ {\top} A p _ {m + 1}} = \frac {r _ {m} ^ {\top} r _ {m}}{p _ {m + 1} ^ {\top} A p _ {m + 1}}. \tag {7.17}

在等式 (7.13) 两边同时左乘 pmTAp_{m}^{\mathsf{T}} A 可得

0=pmApm+1=pmArm+μmpmApm.0 = p _ {m} ^ {\top} A p _ {m + 1} = p _ {m} ^ {\top} A r _ {m} + \mu_ {m} p _ {m} ^ {\top} A p _ {m}.

于是

μm=pmArmpmApm=rmApmpmApm.(7.18)\mu_ {m} = - \frac {p _ {m} ^ {\top} A r _ {m}}{p _ {m} ^ {\top} A p _ {m}} = - \frac {r _ {m} ^ {\top} A p _ {m}}{p _ {m} ^ {\top} A p _ {m}}. \tag {7.18}

为了进一步减少运算量, 我们还可以将上式进行简化. 用 rm+1Tr_{m+1}^{\mathsf{T}} 左乘方程 (7.15) 可得

rm+1Trm+1=rm+1Trmξm+1rm+1TApm+1=rmTrmpm+1TApm+1rm+1TApm+1.r _ {m + 1} ^ {\mathsf {T}} r _ {m + 1} = r _ {m + 1} ^ {\mathsf {T}} r _ {m} - \xi_ {m + 1} r _ {m + 1} ^ {\mathsf {T}} A p _ {m + 1} = - \frac {r _ {m} ^ {\mathsf {T}} r _ {m}}{p _ {m + 1} ^ {\mathsf {T}} A p _ {m + 1}} \cdot r _ {m + 1} ^ {\mathsf {T}} A p _ {m + 1}.

于是

rm+1TApm+1=rm+1Trm+1rmTrmpm+1TApm+1.r _ {m + 1} ^ {\mathsf {T}} A p _ {m + 1} = - \frac {r _ {m + 1} ^ {\mathsf {T}} r _ {m + 1}}{r _ {m} ^ {\mathsf {T}} r _ {m}} \cdot p _ {m + 1} ^ {\mathsf {T}} A p _ {m + 1}.

所以 rmTApm=rmTrmrm1Trm1pmTApm,r_m^{\mathsf{T}}Ap_m = -\frac{r_m^{\mathsf{T}}r_m}{r_{m - 1}^{\mathsf{T}}r_{m - 1}}\cdot p_m^{\mathsf{T}}Ap_m, 代入(7.17)可得

μm=rmApmpmApm=rmrmrm1rm1.(7.19)\mu_ {m} = - \frac {r _ {m} ^ {\top} A p _ {m}}{p _ {m} ^ {\top} A p _ {m}} = \frac {r _ {m} ^ {\top} r _ {m}}{r _ {m - 1} ^ {\top} r _ {m - 1}}. \tag {7.19}

综上, 由 (7.18), (7.13), (7.16) 和 (7.14), (7.15) 就构成了 CG 算法的迭代过程, 即

pm+1=rm+μmpm,其 中μm=rmrmrm1rm1,p _ {m + 1} = r _ {m} + \mu_ {m} p _ {m}, \quad \text {其 中} \quad \mu_ {m} = \frac {r _ {m} ^ {\top} r _ {m}}{r _ {m - 1} ^ {\top} r _ {m - 1}},
x(m+1)=x(m)+ξm+1pm+1,x ^ {(m + 1)} = x ^ {(m)} + \xi_ {m + 1} p _ {m + 1},
rm+1=rmξm+1Apm+1,其 中ξm+1=rmrmpm+1Apm+1.r _ {m + 1} = r _ {m} - \xi_ {m + 1} A p _ {m + 1}, \quad \text {其 中} \quad \xi_ {m + 1} = \frac {r _ {m} ^ {\top} r _ {m}}{p _ {m + 1} ^ {\top} A p _ {m + 1}}.

注意, 以上递推公式是从 m=1m = 1 开始的. 因此 m=0m = 0 时的计算公式需要另外推导

首先, 由 p~1\tilde{p}_1 的定义可知

p~1=P~1=V1L1T=v1.\tilde {p} _ {1} = \tilde {P} _ {1} = V _ {1} L _ {1} ^ {- T} = v _ {1}.

因此

p1=τ0p~1=βv1=r0.p _ {1} = \tau_ {0} \tilde {p} _ {1} = \beta v _ {1} = r _ {0}.

其次, 由 Lanczos 过程可知 T1=α1=v1Av1T_{1} = \alpha_{1} = v_{1}^{\top} A v_{1} . 注意到 β=r0r0\beta = r_{0}^{\top} r_{0} , 于是

x(1)=x(0)+V1T11(βe1)=x(0)+βv1Av1v1=x(0)+r0r0p1Ap1p1.x ^ {(1)} = x ^ {(0)} + V _ {1} T _ {1} ^ {- 1} (\beta e _ {1}) = x ^ {(0)} + \frac {\beta}{v _ {1} ^ {\top} A v _ {1}} v _ {1} = x ^ {(0)} + \frac {r _ {0} ^ {\top} r _ {0}}{p _ {1} ^ {\top} A p _ {1}} p _ {1}.

ξ1=r0r0p1Ap1\xi_{1} = \frac{r_{0}^{\top}r_{0}}{p_{1}^{\top}Ap_{1}} (注: 之前的 ξm+1\xi_{m+1} 计算公式 (7.16) 只对 m1m \geq 1 有定义), 则当 m=0m = 0 时关于 x(m+1)x^{(m+1)} 的递推公式仍然成立.

最后考虑残量. 易知

r1=bAx(1)=bAx(0)r0r0p1Ap1Ap1=r0ξ1Ap1,r _ {1} = b - A x ^ {(1)} = b - A x ^ {(0)} - \frac {r _ {0} ^ {\top} r _ {0}}{p _ {1} ^ {\top} A p _ {1}} A p _ {1} = r _ {0} - \xi_ {1} A p _ {1},

即当 m=0m = 0 时关于 rm+1r_{m + 1} 的递推公式也成立

于是,我们就可以给出下面的CG算法

算法7.7. 共轭梯度法(CG)

1: 选取初值 x(0)x^{(0)} , 停机准则 ε>0\varepsilon > 0 , 最大迭代步数 IterMax
2: r0=bAx(0)r_0 = b - Ax^{(0)}
3: β=r02\beta = \| r_0\| _2
4: if β<ε\beta < \varepsilon then
5: 停止迭代, 输出近似解 x(0)x^{(0)}
6: end if
7: for m=1m = 1 to IterMax do
8: ρ=rm1Trm1\rho = r_{m - 1}^{\mathsf{T}}r_{m - 1}
9: if m>1m > 1 then
10: μm1=ρ/ρ0\mu_{m - 1} = \rho /\rho_0
11: pm=rm1+μm1pm1p_{m} = r_{m - 1} + \mu_{m - 1}p_{m - 1}
12: else
13: pm=r0p_{m} = r_{0}
14: end if
15: qm=Apmq_{m} = Ap_{m}
16: ξm=ρ/(pmTqm)\xi_{m} = \rho /(p_{m}^{\mathsf{T}}q_{m})
17: x(m)=x(m1)+ξmpmx^{(m)} = x^{(m - 1)} + \xi_{m}p_{m}
18: rm=rm1ξmqmr_m = r_{m - 1} - \xi_mq_m
19: relres = ||rm||2/β
20: if relres <ε< \varepsilon then
21: 停止迭代, 输出近似解 x(m)x^{(m)}
22: end if
23: ρ0=ρ\rho_0 = \rho

24: end for
25: if relres <ε< \varepsilon then
26: 输出近似解 x(m)x^{(m)} 及相关信息
27: else
28: 输出算法失败信息
29: end if

CG算法的每个迭代步的主要运算为一个矩阵向量乘积和两个向量内积;