
3.3 Bellman optimality equation

The tool for analyzing optimal policies and optimal state values is the Bellman optimality equation (BOE). By solving this equation, we can obtain optimal policies and optimal state values. We next present the expression of the BOE and then analyze it in detail.

For every $s \in \mathcal{S}$, the elementwise expression of the BOE is

\begin{array}{rl} v(s) &= \max_{\pi(s) \in \Pi(s)} \sum_{a \in \mathcal{A}} \pi(a|s) \left( \sum_{r \in \mathcal{R}} p(r|s,a) r + \gamma \sum_{s' \in \mathcal{S}} p(s'|s,a) v(s') \right) \\ &= \max_{\pi(s) \in \Pi(s)} \sum_{a \in \mathcal{A}} \pi(a|s) q(s,a), \end{array} \tag{3.1}

where $v(s)$ and $v(s')$ are unknown variables to be solved and

q(s,a) \doteq \sum_{r \in \mathcal{R}} p(r|s,a) r + \gamma \sum_{s' \in \mathcal{S}} p(s'|s,a) v(s').

Here, $\pi(s)$ denotes a policy for state $s$, and $\Pi(s)$ is the set of all possible policies for $s$.

The BOE is an elegant and powerful tool for analyzing optimal policies. However, it may be nontrivial to understand. For example, the equation involves two unknown variables, $v(s)$ and $\pi(a|s)$, and it may be unclear to beginners how two unknowns can be solved from a single equation. Moreover, the BOE is actually a special Bellman equation, although this is not obvious since its expression is quite different from that of the Bellman equation. We also need to answer the following fundamental questions about the BOE.

\diamond Existence: Does this equation have a solution?
\diamond Uniqueness: Is the solution unique?
\diamond Algorithm: How can this equation be solved?
\diamond Optimality: How is the solution related to optimal policies?

Once we can answer these questions, we will clearly understand optimal state values and optimal policies.

3.3.1 Maximization of the right-hand side of the BOE

We next clarify how to solve the maximization problem on the right-hand side of the BOE in (3.1). At first glance, it may be confusing how the two unknown variables $v(s)$ and $\pi(a|s)$ can be solved from one equation. In fact, they can be solved one by one. This idea is illustrated by the following example.

Example 3.1. Consider two unknown variables $x, y \in \mathbb{R}$ that satisfy

x = \max_{y \in \mathbb{R}} (2x - 1 - y^2).

The first step is to solve $y$ on the right-hand side of the equation. Regardless of the value of $x$, we always have $\max_y (2x - 1 - y^2) = 2x - 1$, where the maximum is achieved when $y = 0$. The second step is to solve $x$. When $y = 0$, the equation becomes $x = 2x - 1$, which leads to $x = 1$. Therefore, $y = 0$ and $x = 1$ are the solutions of the equation.
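The two-step procedure in Example 3.1 can be checked numerically. The sketch below (an illustrative check, assuming NumPy) fixes the candidate solution $x = 1$, maximizes the right-hand side over a grid of $y$ values, and confirms that the maximum is attained at $y = 0$ with value equal to $x$:

```python
import numpy as np

# Step 1: for a fixed x, maximize the right-hand side 2x - 1 - y^2 over y.
ys = np.linspace(-2.0, 2.0, 4001)        # grid containing y = 0
x = 1.0                                  # candidate solution from Step 2
rhs_values = 2 * x - 1 - ys**2
y_star = ys[np.argmax(rhs_values)]       # maximizer of 2x - 1 - y^2
rhs_max = rhs_values.max()               # equals 2x - 1

# Step 2: with y fixed at y* = 0, x = 2x - 1 gives x = 1, so x = rhs_max.
assert abs(y_star) < 1e-9
assert abs(rhs_max - x) < 1e-9
```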

We now turn to the maximization problem on the right-hand side of the BOE. The BOE in (3.1) can be written concisely as

v(s) = \max_{\pi(s) \in \Pi(s)} \sum_{a \in \mathcal{A}} \pi(a|s) q(s,a), \quad s \in \mathcal{S}.

Inspired by Example 3.1, we can first solve the optimal $\pi$ on the right-hand side. How can this be done? The following example demonstrates the basic idea.

Example 3.2. Given $q_1, q_2, q_3 \in \mathbb{R}$, we would like to find the optimal values of $c_1, c_2, c_3$ to maximize

\sum_{i=1}^{3} c_i q_i = c_1 q_1 + c_2 q_2 + c_3 q_3,

where $c_1 + c_2 + c_3 = 1$ and $c_1, c_2, c_3 \geq 0$.

Without loss of generality, suppose that $q_3 \geq q_1, q_2$. Then, the optimal solution is $c_3^* = 1$ and $c_1^* = c_2^* = 0$. This is because

q_3 = (c_1 + c_2 + c_3) q_3 = c_1 q_3 + c_2 q_3 + c_3 q_3 \geq c_1 q_1 + c_2 q_2 + c_3 q_3

for any $c_1, c_2, c_3$ satisfying the constraints.

Inspired by the above example, since $\sum_a \pi(a|s) = 1$, we have

\sum_{a \in \mathcal{A}} \pi(a|s) q(s,a) \leq \sum_{a \in \mathcal{A}} \pi(a|s) \max_{a \in \mathcal{A}} q(s,a) = \max_{a \in \mathcal{A}} q(s,a),

where equality is achieved when

\pi(a|s) = \begin{cases} 1, & a = a^*, \\ 0, & a \neq a^*. \end{cases}

Here, $a^* = \arg\max_a q(s,a)$. In summary, the optimal policy $\pi(s)$ is the one that selects the action with the greatest value of $q(s,a)$.
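This greedy-policy conclusion can be checked numerically. The sketch below (illustrative q-values, assuming NumPy) compares the greedy distribution against many random valid distributions and confirms that none achieves a larger weighted sum than $\max_a q(s,a)$:

```python
import numpy as np

# q(s, a) for three actions at a fixed state s (values are made up).
q = np.array([1.0, 3.0, 2.0])

greedy = np.zeros_like(q)
greedy[np.argmax(q)] = 1.0               # pi(a|s) = 1 for a = a*, else 0

rng = np.random.default_rng(0)
for _ in range(1000):
    pi = rng.dirichlet(np.ones(3))       # a random valid probability vector
    assert pi @ q <= greedy @ q + 1e-12  # never beats the greedy policy

best = greedy @ q
assert best == q.max()                   # greedy attains max_a q(s, a)
```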

3.3.2 Matrix-vector form of the BOE

The BOE refers to a set of equations defined for all states. If we combine these equations, we can obtain a concise matrix-vector form, which will be extensively used in this chapter.

The matrix-vector form of the BOE is

v = \max_{\pi \in \Pi} (r_\pi + \gamma P_\pi v), \tag{3.2}

where $v \in \mathbb{R}^{|\mathcal{S}|}$ and $\max_\pi$ is performed in an elementwise manner. The structures of $r_\pi$ and $P_\pi$ are the same as those in the matrix-vector form of the normal Bellman equation:

[r_\pi]_s \doteq \sum_{a \in \mathcal{A}} \pi(a|s) \sum_{r \in \mathcal{R}} p(r|s,a) r, \qquad [P_\pi]_{s,s'} = p(s'|s) \doteq \sum_{a \in \mathcal{A}} \pi(a|s) p(s'|s,a).

Since the optimal value of $\pi$ is determined by $v$, the right-hand side of (3.2) is a function of $v$, denoted as

f(v) \doteq \max_{\pi \in \Pi} (r_\pi + \gamma P_\pi v).

Then, the BOE can be expressed in a concise form as

v = f(v). \tag{3.3}

In the remainder of this section, we show how to solve this nonlinear equation.
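Because the maximization over $\pi$ decomposes over states, $f(v)$ can be computed elementwise as $\max_a q(s,a)$. The sketch below (a toy two-state, two-action MDP with made-up transition and reward numbers, assuming NumPy) evaluates $f(v)$ in this way:

```python
import numpy as np

# Toy MDP: P[s, a, s'] = p(s'|s, a), r[s, a] = expected immediate reward.
gamma = 0.9
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.5, 0.5], [0.0, 1.0]]])
r = np.array([[0.0, 1.0],
              [2.0, 0.0]])

def f(v):
    """f(v) = max_pi (r_pi + gamma * P_pi v), computed as max_a q(s, a)."""
    q = r + gamma * np.einsum('sap,p->sa', P, v)  # q(s, a) for all s, a
    return q.max(axis=1)                          # elementwise max over actions

v = np.zeros(2)
v_new = f(v)
assert np.allclose(v_new, r.max(axis=1))  # with v = 0, f(v) = max_a r(s, a)
```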

3.3.3 Contraction mapping theorem

Since the BOE can be expressed as a nonlinear equation $v = f(v)$, we next introduce the contraction mapping theorem [6] to analyze it. The contraction mapping theorem, also known as the fixed-point theorem, is a powerful tool for analyzing general nonlinear equations. Readers who already know this theorem can skip this part. Otherwise, the reader is advised to become familiar with it since it is the key to analyzing the BOE.

Consider a function $f(x)$, where $x \in \mathbb{R}^d$ and $f: \mathbb{R}^d \to \mathbb{R}^d$. A point $x^*$ is called a fixed point if

f(x^*) = x^*.

The interpretation of the above equation is that the map of $x^*$ is itself. This is the reason why $x^*$ is called "fixed". The function $f$ is a contraction mapping (or contractive function) if there exists $\gamma \in (0,1)$ such that

\|f(x_1) - f(x_2)\| \leq \gamma \|x_1 - x_2\|

for any $x_1, x_2 \in \mathbb{R}^d$. In this book, $\|\cdot\|$ denotes a vector or matrix norm.

Example 3.3. We present three examples to demonstrate fixed points and contraction mappings.

\diamond $x = f(x) = 0.5x$, where $x \in \mathbb{R}$.

It is easy to verify that $x = 0$ is a fixed point since $0 = 0.5 \cdot 0$. Moreover, $f(x) = 0.5x$ is a contraction mapping because $\|0.5x_1 - 0.5x_2\| = 0.5\|x_1 - x_2\| \leq \gamma \|x_1 - x_2\|$ for any $\gamma \in [0.5, 1)$.

\diamond $x = f(x) = Ax$, where $x \in \mathbb{R}^n$, $A \in \mathbb{R}^{n \times n}$, and $\|A\| \leq \gamma < 1$.

It is easy to verify that $x = 0$ is a fixed point since $0 = A0$. To see the contraction property, note that $\|Ax_1 - Ax_2\| = \|A(x_1 - x_2)\| \leq \|A\| \|x_1 - x_2\| \leq \gamma \|x_1 - x_2\|$. Therefore, $f(x) = Ax$ is a contraction mapping.

\diamond $x = f(x) = 0.5\sin x$, where $x \in \mathbb{R}$.

It is easy to see that $x = 0$ is a fixed point since $0 = 0.5\sin 0$. Moreover, it follows from the mean value theorem [7, 8] that

\left| \frac{0.5\sin x_1 - 0.5\sin x_2}{x_1 - x_2} \right| = |0.5\cos x_3| \leq 0.5, \quad x_3 \in [x_1, x_2].

As a result, $|0.5\sin x_1 - 0.5\sin x_2| \leq 0.5|x_1 - x_2|$, and hence $f(x) = 0.5\sin x$ is a contraction mapping.

The relationship between a fixed point and the contraction property is characterized by the following classic theorem.

Theorem 3.1 (Contraction mapping theorem). For any equation of the form $x = f(x)$, where $x$ and $f(x)$ are real vectors, if $f$ is a contraction mapping, then the following properties hold.

\diamond Existence: There exists a fixed point $x^*$ satisfying $f(x^*) = x^*$.

\diamond Uniqueness: The fixed point $x^*$ is unique.

\diamond Algorithm: Consider the iterative process

x_{k+1} = f(x_k),

where $k = 0, 1, 2, \ldots$. Then, $x_k \to x^*$ as $k \to \infty$ for any initial guess $x_0$. Moreover, the convergence rate is exponentially fast.

The contraction mapping theorem not only tells whether the solution of a nonlinear equation exists but also suggests a numerical algorithm for solving the equation. The proof of the theorem is given in Box 3.1.

The following example demonstrates how to calculate the fixed points of some equations using the iterative algorithm suggested by the contraction mapping theorem.

Example 3.4. Let us revisit the examples given above: $x = 0.5x$, $x = Ax$, and $x = 0.5\sin x$. Since the right-hand sides of these three equations have been shown to be contraction mappings, it follows from the contraction mapping theorem that each equation has a unique fixed point, which can be easily verified to be $x^* = 0$. Moreover, the fixed points of the three equations can be iteratively solved by the following algorithms:

x_{k+1} = 0.5 x_k,
x_{k+1} = A x_k,
x_{k+1} = 0.5 \sin x_k,

given any initial guess $x_0$.
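The three iterative algorithms of Example 3.4 can be run directly. The sketch below (assuming NumPy; the specific matrix $A$ is an illustrative choice with norm less than one) checks that each iteration converges to the fixed point $x^* = 0$ from an arbitrary initial guess:

```python
import numpy as np

# An illustrative matrix with infinity norm (max row sum) below one.
A = np.array([[0.3, 0.2],
              [0.1, 0.4]])
assert np.abs(A).sum(axis=1).max() < 1.0

def iterate(f, x0, k=100):
    """Apply x_{k+1} = f(x_k) for k steps."""
    x = x0
    for _ in range(k):
        x = f(x)
    return x

# All three contractions drive the iterate to the fixed point 0.
assert abs(iterate(lambda x: 0.5 * x, 10.0)) < 1e-9
assert abs(iterate(lambda x: 0.5 * np.sin(x), 10.0)) < 1e-9
assert np.linalg.norm(iterate(lambda x: A @ x, np.array([10.0, -10.0]))) < 1e-9
```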

Box 3.1: Proof of the contraction mapping theorem

Part 1: We prove that the sequence $\{x_k\}_{k=1}^{\infty}$ with $x_k = f(x_{k-1})$ is convergent.

The proof relies on Cauchy sequences. A sequence $x_1, x_2, \ldots$ is called Cauchy if for any small $\varepsilon > 0$, there exists $N$ such that $\|x_m - x_n\| < \varepsilon$ for all $m, n > N$. The intuitive interpretation is that there exists a finite integer $N$ such that all the elements after $N$ are sufficiently close to each other. Cauchy sequences are important because every Cauchy sequence is guaranteed to converge to a limit. This convergence property will be used to prove the contraction mapping theorem. Note that we must have $\|x_m - x_n\| < \varepsilon$ for all $m, n > N$; having merely $x_{n+1} - x_n \to 0$ is insufficient to claim that the sequence is Cauchy. For example, $x_{n+1} - x_n \to 0$ holds for $x_n = \sqrt{n}$, but $x_n = \sqrt{n}$ apparently diverges.

We next show that $\{x_k = f(x_{k-1})\}_{k=1}^{\infty}$ is a Cauchy sequence and hence converges.

First, since $f$ is a contraction mapping, we have

\|x_{k+1} - x_k\| = \|f(x_k) - f(x_{k-1})\| \leq \gamma \|x_k - x_{k-1}\|.

Similarly, we have $\|x_k - x_{k-1}\| \leq \gamma \|x_{k-1} - x_{k-2}\|, \dots, \|x_2 - x_1\| \leq \gamma \|x_1 - x_0\|$. Thus, we have

\begin{array}{rl} \|x_{k+1} - x_k\| &\leq \gamma \|x_k - x_{k-1}\| \\ &\leq \gamma^2 \|x_{k-1} - x_{k-2}\| \\ &\;\;\vdots \\ &\leq \gamma^k \|x_1 - x_0\|. \end{array}

Since $\gamma < 1$, we know that $\|x_{k+1} - x_k\|$ converges to zero exponentially fast as $k \to \infty$ given any $x_1, x_0$. Notably, the convergence of $\{\|x_{k+1} - x_k\|\}$ is not sufficient to imply the convergence of $\{x_k\}$. Therefore, we need to further consider $\|x_m - x_n\|$ for any $m > n$. In particular,

\begin{array}{rl} \|x_m - x_n\| &= \|x_m - x_{m-1} + x_{m-1} - \dots - x_{n+1} + x_{n+1} - x_n\| \\ &\leq \|x_m - x_{m-1}\| + \dots + \|x_{n+1} - x_n\| \\ &\leq \gamma^{m-1} \|x_1 - x_0\| + \dots + \gamma^n \|x_1 - x_0\| \\ &= \gamma^n (\gamma^{m-1-n} + \dots + 1) \|x_1 - x_0\| \\ &\leq \gamma^n (1 + \dots + \gamma^{m-1-n} + \gamma^{m-n} + \gamma^{m-n+1} + \dots) \|x_1 - x_0\| \\ &= \frac{\gamma^n}{1-\gamma} \|x_1 - x_0\|. \end{array} \tag{3.4}

As a result, for any $\varepsilon$, we can always find $N$ such that $\|x_m - x_n\| < \varepsilon$ for all $m, n > N$. Therefore, this sequence is Cauchy and hence converges to a limit point denoted as $x^* = \lim_{k \to \infty} x_k$.

Part 2: We show that the limit $x^* = \lim_{k \to \infty} x_k$ is a fixed point. To see this, note that

\|f(x_k) - x_k\| = \|x_{k+1} - x_k\| \leq \gamma^k \|x_1 - x_0\|,

so $\|f(x_k) - x_k\|$ converges to zero exponentially fast. Hence, we have $f(x^*) = x^*$ at the limit.

Part 3: We show that the fixed point is unique. Suppose that there is another fixed point $x'$ satisfying $f(x') = x'$. Then,

\|x' - x^*\| = \|f(x') - f(x^*)\| \leq \gamma \|x' - x^*\|.

Since $\gamma < 1$, this inequality holds if and only if $\|x' - x^*\| = 0$. Therefore, $x' = x^*$.

Part 4: We show that $x_k$ converges to $x^*$ exponentially fast. Recall that $\|x_m - x_n\| \leq \frac{\gamma^n}{1-\gamma}\|x_1 - x_0\|$, as proven in (3.4). Since $m$ can be arbitrarily large, we have

\|x^* - x_n\| = \lim_{m \to \infty} \|x_m - x_n\| \leq \frac{\gamma^n}{1-\gamma} \|x_1 - x_0\|.

Since $\gamma < 1$, the error converges to zero exponentially fast as $n \to \infty$.
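The error bound from Part 4 can be observed on a concrete contraction. The sketch below (plain Python, using the scalar map $f(x) = 0.5x$ from Example 3.3 as an illustrative case) checks that $\|x^* - x_n\| \leq \frac{\gamma^n}{1-\gamma}\|x_1 - x_0\|$ at every iteration:

```python
# Scalar contraction f(x) = gamma * x with fixed point x* = 0.
gamma = 0.5
f = lambda x: gamma * x

x0 = 10.0
x1 = f(x0)
x_star = 0.0

x = x0
for n in range(20):
    # A priori bound: ||x* - x_n|| <= gamma^n / (1 - gamma) * ||x1 - x0||
    bound = gamma**n / (1 - gamma) * abs(x1 - x0)
    assert abs(x_star - x) <= bound + 1e-12
    x = f(x)
```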

3.3.4 Contraction property of the right-hand side of the BOE

We next show that $f(v)$ in the BOE in (3.3) is a contraction mapping. Thus, the contraction mapping theorem introduced in the previous subsection can be applied.

Theorem 3.2 (Contraction property of $f(v)$). The function $f(v)$ on the right-hand side of the BOE in (3.3) is a contraction mapping. In particular, for any $v_1, v_2 \in \mathbb{R}^{|\mathcal{S}|}$, it holds that

\|f(v_1) - f(v_2)\|_\infty \leq \gamma \|v_1 - v_2\|_\infty,

where $\gamma \in (0,1)$ is the discount rate, and $\|\cdot\|_\infty$ is the maximum norm, i.e., the maximum absolute value of the elements of a vector.

The proof of the theorem is given in Box 3.2. This theorem is important because we can use the powerful contraction mapping theorem to analyze the BOE.
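Theorem 3.2 can also be checked empirically. The sketch below (a randomly generated MDP whose sizes and construction are illustrative assumptions, assuming NumPy) evaluates $f$ elementwise as $\max_a q(s,a)$ and verifies the contraction inequality on random pairs of value vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
n_s, n_a, gamma = 4, 3, 0.9

# Random MDP: each p(.|s, a) normalized to a probability distribution.
P = rng.random((n_s, n_a, n_s))
P /= P.sum(axis=2, keepdims=True)
r = rng.random((n_s, n_a))

def f(v):
    """Elementwise form of f(v) = max_pi (r_pi + gamma * P_pi v)."""
    q = r + gamma * np.einsum('sap,p->sa', P, v)
    return q.max(axis=1)

for _ in range(100):
    v1 = rng.normal(size=n_s)
    v2 = rng.normal(size=n_s)
    lhs = np.abs(f(v1) - f(v2)).max()    # ||f(v1) - f(v2)||_inf
    rhs = gamma * np.abs(v1 - v2).max()  # gamma * ||v1 - v2||_inf
    assert lhs <= rhs + 1e-12            # contraction inequality holds
```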

Box 3.2: Proof of Theorem 3.2

Consider any two vectors $v_1, v_2 \in \mathbb{R}^{|\mathcal{S}|}$, and suppose that $\pi_1^* \doteq \arg\max_\pi (r_\pi + \gamma P_\pi v_1)$ and $\pi_2^* \doteq \arg\max_\pi (r_\pi + \gamma P_\pi v_2)$. Then,

f(v_1) = \max_\pi (r_\pi + \gamma P_\pi v_1) = r_{\pi_1^*} + \gamma P_{\pi_1^*} v_1 \geq r_{\pi_2^*} + \gamma P_{\pi_2^*} v_1,
f(v_2) = \max_\pi (r_\pi + \gamma P_\pi v_2) = r_{\pi_2^*} + \gamma P_{\pi_2^*} v_2 \geq r_{\pi_1^*} + \gamma P_{\pi_1^*} v_2,

where $\geq$ is an elementwise comparison. As a result,

\begin{array}{rl} f(v_1) - f(v_2) &= r_{\pi_1^*} + \gamma P_{\pi_1^*} v_1 - (r_{\pi_2^*} + \gamma P_{\pi_2^*} v_2) \\ &\leq r_{\pi_1^*} + \gamma P_{\pi_1^*} v_1 - (r_{\pi_1^*} + \gamma P_{\pi_1^*} v_2) \\ &= \gamma P_{\pi_1^*} (v_1 - v_2). \end{array}

Similarly, it can be shown that $f(v_2) - f(v_1) \leq \gamma P_{\pi_2^*} (v_2 - v_1)$. Therefore,

\gamma P_{\pi_2^*} (v_1 - v_2) \leq f(v_1) - f(v_2) \leq \gamma P_{\pi_1^*} (v_1 - v_2).

Define

z \doteq \max \left\{ \left| \gamma P_{\pi_2^*} (v_1 - v_2) \right|, \left| \gamma P_{\pi_1^*} (v_1 - v_2) \right| \right\} \in \mathbb{R}^{|\mathcal{S}|},

where $\max(\cdot)$, $|\cdot|$, and $\geq$ are all elementwise operators. By definition, $z \geq 0$. On the one hand, it is easy to see that

-z \leq \gamma P_{\pi_2^*} (v_1 - v_2) \leq f(v_1) - f(v_2) \leq \gamma P_{\pi_1^*} (v_1 - v_2) \leq z,

which implies

|f(v_1) - f(v_2)| \leq z.

It then follows that

\|f(v_1) - f(v_2)\|_\infty \leq \|z\|_\infty, \tag{3.5}

where $\|\cdot\|_\infty$ is the maximum norm.

On the other hand, suppose that $z_i$ is the $i$th entry of $z$, and $p_i^T$ and $q_i^T$ are the $i$th rows of $P_{\pi_1^*}$ and $P_{\pi_2^*}$, respectively. Then,

z_i = \max \{ \gamma |p_i^T (v_1 - v_2)|, \gamma |q_i^T (v_1 - v_2)| \}.

Since $p_i$ is a vector with all nonnegative elements and the sum of its elements equals one, it follows that

|p_i^T (v_1 - v_2)| \leq p_i^T |v_1 - v_2| \leq \|v_1 - v_2\|_\infty.

Similarly, we have $|q_i^T (v_1 - v_2)| \leq \|v_1 - v_2\|_\infty$. Therefore, $z_i \leq \gamma \|v_1 - v_2\|_\infty$ and hence

\|z\|_\infty = \max_i |z_i| \leq \gamma \|v_1 - v_2\|_\infty.

Substituting this inequality into (3.5) gives

\|f(v_1) - f(v_2)\|_\infty \leq \gamma \|v_1 - v_2\|_\infty,

which concludes the proof of the contraction property of $f(v)$.