Language learning | English » Fekete Imre - Numerical methods for the linear algebraic systems with M-matrices

Please log in to read this in our online viewer!

Fekete Imre - Numerical methods for the linear algebraic systems with M-matrices

Please log in to read this in our online viewer!


 2010 · 43 page(s)  (348 KB)    English    22    March 06 · 2011  
       
Comments

No comments yet. You can be the first!

Content extract

http://www.doksihu Numerical methods for the linear algebraic systems with M-matrices B.Sc Thesis by Imre Fekete Mathematics B.Sc, Mathematical Analyst Supervisor: István Faragó Associate Professor and Head of the Department of Applied Analysis and Computational Mathematics Eötvös Loránd University Budapest 2010 http://www.doksihu Acknowledgement I am heartily thankful to my supervisor, István Faragó, whose encouragement, guidance and support from the initial to the nal level enabled me to develop an understanding of the subject. http://www.doksihu Contents 1 Introduction 4 2 Mathematical Background 5 2.1 About the systems of linear algebraic equations 5 2.2 Numerical methods for the solution of the systems of linear algebraic equations 6 2.3 Connection between the condition numbers and the systems of linear algebraic equations . 10 2.31 When b is charged with error 10 2.32 When A is

charged with error 12 3 Nonnegative Matrices 14 3.1 Bounds for the spectral radius of a matrix 14 3.2 Spectral radius of nonnegative matrices 17 3.3 Reducible Matrices 22 4 M-matrices 24 4.1 Nonsingular M-matrices 24 4.2 Theorem 41 25 5 Iterative methods for the linear algebraic systems with Mmatrices 32 5.1 Basic iterative methods 5.11 Jacobi and Gauss-Seidel method 5.12 SOR method 5.2 Convergence 6 Summary 33 33 36 37 42 3 http://www.doksihu 1 Introduction The idea of solving large systems of linear equations by directly or iterative methods is certainly not new, dating back at least to Gauss1 . Very often problems in the biological, physical, economic and social sciences can be reduced to problems involving matrices which have some special structure. One of

the most common situations is where the matrix A in question has nonpositive o-diagonal and positive diagonal entries, that is, A is a nite matrix of the type   a11   −a21  A=  −a31  . . −a12 −a13 ··· a22 −a23 −a32 a33  ···   , ···   . . . . . (1.1) . where aij are nonnegative. Such matrix (1.1) often occur in the real life modelling Thus, our main aim is to get the hang of dierent methods for system of linear algebraic equations. This thesis consists of four main sections. The rst section includes the mathematical background on the system of linear algebraic equations In this section we show that the reliable numerical solution of some system of linear algebraic equations depends on the condition number of the system. We also show that the condition number is important for the realization of the method on computer. We get acquainted with nonnegative matrices. We consider all the reducible and irreducible

cases and we add bounds for the spectral radius of nonegative matrcies based on Perron-Frobenius theorem. We show why the spectral radius is important to solve these equations. We realize that these type of matrices play very important role to set up the theory of M -matrices. Afterwards we formulate a lot of properties of M -matrices and we give 50 equivalent conditions for A being a nonsingular M -matrix. These statements help us in the understanding the qualitative properties of the iterative methods. Finally we formulate dierent iterative methods for M -matrices. For the better understanding in each single section we give examples. 1 Johann Carl Friedrich Gauss (1777-1855) was a German mathematician and scientist who contributed signicantly to many elds. Gauss is ranked as one of history's most inuential mathematicians. He referred to mathematics as the 'Queen of Sciences' 4 http://www.doksihu 2 2.1 Mathematical Background About the systems of linear

algebraic equations A general system of k linear algebraic equations (SLAE) with n unknows can be written as a11 x1 + a12 x2 + . + a1n xn = b1 a21 x1 + a22 x2 + . + a2n xn = b2 . . (2.1) ak1 x1 + ak2 x2 + . + akn xn = bk , where x1 , x2 , . , xn are the unknows, the numbers a11 , a12 , , akn (the coecients of the system) and b1 , b2 , , bk (the right hand side) are given We call the SLAE homogeneous, when the right hand side equals the null vector, otherwise SLAE is inhomogeneous. We deal with the case k = n, because then the numbers of unknows equal to the numbers of equations. Let A = (aij ) ∈ Rn×n , and we seek the solution of the system of linear equations n X aij xj = bi , 1 ≤ i ≤ n, (2.2) j=1 which we can written in matrix notation as Ax = b, (2.2') where b is a given column vector. Theorem 2.1 An A ∈ Rn×n is nonsingular, (ie, invertible) if and only if detA 6= 0. Theorem 2.2 If A is nonsingular, there exists nonsingular P ∈ Rn×n (so

called permutation matrix) such that P Ax = P b and diag(P A) doesn't contain zero elements. The solution vector x exits and it is unique if and only if A is nonsingular, and this solution vector is given explicitly by x = A−1 b. (2.3) Hence, without loss of generality, we may assume that A is nonsingular and its diagonal entries are nonzero. 5 http://www.doksihu 2.2 Numerical methods for the solution of the systems of linear algebraic equations There are two general schemes for solving linear systems: Direct and Iterative Methods. The Direct Methods (DM) attempt to solve the problem by a nite sequence of operations, and in the absence of rounding errors, would deliver an exact solution. The most popular methods are the following: Gaussian elimination, LU factorization, Cholesky factorization. These will be discussed in the section 5. The other methods are the Iterative Methods (IM) which attempt to solve the problem by nding successive approximations to the solution

starting from an initial guess. Our aim is to show the convergence of the vector sequence x(m) to solution of the systems of linear algebraic equation. To this we dene the convergence of the sequence of vectors. The concept of vector norms, matrix norms, and the spectral radii of matrices play an important role in iterative numerical analysis. Just as it is convenient to compare two vectors in terms of their lenght, it will be similarly convenient to compare two matrices by norm. To begin with, let Vn (R) be the n-dimensional vector space over the eld of real numbers R of column vectors x. Denition 2.1 Let be Vn (R) given vector space with the norms: k·k and k| · |k. The two norms called equavilent, in notation: k·k ∼ = k| · |k, if there exist M ≥ m ≥ 0 such that mkxk ≤ k|x|k ≤ M kxk for all x ∈ Vn (R). Theorem 2.3 If x and y are vectors of Vn (R), then kxk > 0, unless x = 0; if α is a scalar, then kαxk = |α| · kxk; (2.4) kx + yk ≤ kxk + kyk. Denition

2.2 Let x be a (column) vector of Vn (R) Then the kxkp is dened as n X p kxkp = ( |xi | )1/p , 1 ≤ p < ∞ i=1 kxk∞ = max |xi |, when p = ∞. 1≤i≤n 6 (2.5) http://www.doksihu The case p = 2 is called the Euclidean2 norm. In throughout the k·kp (where p is arbitrary) denoted by k·k. With this denition, the following results are well known. Theorem 2.4 In nite-dimensional normed spaces the norms are equivalent Denition 2.3 The x(m) converges to x if and only if: lim kx(m) − xk = 0. m∞ Corollary 2.1 If we have an innite sequence x0 , x1 , x2 , of vectors of Vn (R), this sequence converges to a vector x of Vn (R) if lim xj (m) m∞ = xj , for all 1 ≤ j ≤ n, where x(m) and xj are the j th components of the vectors x(m) and x respectively. j Similarly, under the convergence of an innite series to a vector y of Vn (C), we mean that lim N X N ∞ (m) yi ∞ P m=0 y(m) of vectors of Vn (R) = yi , for all 1 ≤ i ≤ n. m=0 Denition 2.4 Let

A = (aij ) ∈ Rn×n with eigenvalues λi , 1 ≤ i ≤ n Then, ρ(A) ≡ max |λi | 1≤i≤n (2.6) is called the spectral radius of the matrix A. Denition 2.5 Let A : Rn Rn a linear mapping, ie, A(x1 + x2 ) = A(x1 ) + A(x2 ) A(λx1 ) = λ · A(x) for all x1 , x2 ∈ Rn and λ ∈ R. If A is linear and continuous(ie, A ∈ Lin(Rn , Rn )), then kAxk x∈Rn kxk sup x6=0 is nite. Introducing the notation kAk := sup x∈R x6=0 n kAxk ∈ R, kxk (2.7) 2 Euclid (360 a.C-295 aC) was a Greek mathematician and is often referred to as the 'Father of Geometry'. 7 http://www.doksihu we can prove the following: Theorem 2.5 The function k·k : Lin(Rn , Rn ) R denes a norm Remark 2.1 Every real k × n matrix yields a linear map from Rn to Rk Each such choice of norms gives rise to an operator norm and therefore yields a norm on the space of all k × n matrices. If k = n and one uses the same norm on the domain and the range, then the induced operator norm is a

sub-multiplicative matrix norm. Example 2.1 The operator norm corresponding to the p − norm3 for vectors is: kAxkp . x6=0 kxkp kAkp = max In the case of p = 1 and p = ∞, the norms can be dened as: kAk1 = max 1≤j≤n kAk∞ = max 1≤i≤m m X |aij | i=1 n X |aij |. j=1 For instance, for the matrix   1 2 3  A :=   4 5  6  , 6 8 9 we get kAk1 = 18 and kAk∞ = 25. Remark 2.2 In the special case of p = 2 (the Euclidean norm) and m = n (square matrices), the induced matrix norm is the spectral norm. kAk2 = p λmax (A∗ A), where A∗ denotes the conjugate transpose of A. Hence, for the symmetric matrices: kAk2 = ρ(A) In Example 21: kAk2 = 164701 3 In Matlab we can compute kAk p as norm(A, p), where A denotes the given matrix, p = {1, 2 and inf } the given norm. 8 http://www.doksihu Theorem 2.6 Let A, B ∈ Rn×n arbitrary matrices Then the following relations are true: kA · Bk ≤ kAk · kBk, (2.8) kA · xk ≤ kAk ·

kxk, (2.9) ρ(A) ≤ kAk, (2.10) n where k·k denotes any norm is R . Proof. The (28) and (29) statements are evident due to the denition We prove (2.10): if λ is any eigenvalue of A, and is x is an eigenvector associated with the eigenvalue λ, then Ax = λx. Thus, from Theorem 23 and (29), |λ| · kxk = kλxk = kAxk ≤ kAk · kxk, (2.11) from which we conclude that kAk ≥ |λ| for all eigenvalues of A, which proves (2.10) Remark 2.3 In addition to (29), we note that for any A ∈ Rn×n there exists a vector x ∈ Rn such that kAxk = kAk · kxk. Denition 2.6 An A ∈ Rn×n is convergent matrix, if lim Am = O. m∞ Theorem 2.7 The matrix A ∈ Rn×n is convergent if and only if ρ(A) < 1 Proof. It can be seen in [6] as Theorem 14 Corollary 2.2 If A ∈ Rn×n and kAk < 1, then A is convergent Proof. Using (210) of Theorem 26 and Theorem 27 it is evident 9 http://www.doksihu 2.3 Connection between the condition numbers and the systems of linear algebraic

equations Mainly we meet large SLAE. To solve (22') we need computers But, in practice, as the vector b also A matrix are given inaccurately, charged with errors: for example input errors. For this reason we will investigate that how serious error makes these ones in the solution vector. The result of our investigation is that we will recognise: the condition number determines that can we solve on a given computer a given system of linear algebraic equations with a given error is acceptable or not. Consider now (2.2') in that case, when A is n × n nonsingular matrix and b 6= 0 Applying the norm to both sides, we get 0 < kbk = kAxk ≤ kAk · kxk. (2.12) 2.31 When b is charged with error Let us estimate the error, when we consider b + δ b instead of b on the right side. Denoting by x + δ x the solution of the new perturbed system, we have b + δb = A(x + δx) = Ax + Aδx. (2.13) Since Ax = b, therefore (2.13) implies the relation δ x = A−1 δ b Applying again

the norm, we obtain the estimation kδ xk ≤ kA−1 k · kδ bk. (2.13') Hence, the relations (2.13') and (212) together imply: kA−1 k · kδ bk kA−1 k · kδ bk kδ bk kδ xk ≤ ≤ = kA−1 k · kAk · . kxk kxk kbk/kAk kbk (2.14) The (2.14) estimation is sharp, because in (214) the equality can be obtained (For example, the choice A = I is good.) Denition 2.7 The number cond(A) := kA−1 k·kAk is called condition number of matrix. Since the condition number depends on the choosen norm, sometimes we use the notation: condp (A) := kA−1 kp · kAkp . 10 http://www.doksihu Hence, for the relative error in (2.14) we have kδ xk kδ bk ≤ cond(A) · . kxk kbk (2.15) On the right side of (2.15) we can see the the relative error of b (for example this can be input error) and on the left side the relative error of solution. Thus cond(A) plays an important role, because if it is big, then the relative error in the input data may cause a signicant increase in

the error of the solution. Knowing condition number is important for solving these equations on computer. In general, it is too dicult to compute the condition number of a matrix A exactly because it is far too expensive for large matrices. Avoiding this we can give approximation algorithm for the condition number. Remark 2.4 The condition number cannot be less then one in the case of induced norm, because 1 = kIk = kAA−1 k ≤ kAk · kA−1 k = cond(A). Remark 2.5 Let A is nonsingular, ie, λ = 0 is not an eigenvalue of A Then, due to the relation Av = λv (where v ∈ Rn is the eigenvector), we got λ1 v = A−1 v, which means that λ1 is the eigenvalue of A−1 and therefore kA−1 k ≥ |λmin (A−1 )| = |λmin1 (A)| . Hence cond(A) ≥ Example 2.2  λmax (A) . λmin (A)  1 2 5 7   3 A=  1  6 4 4 5 6 5 16  11  , 9   2 For dierent p : cond1 (A) ≈ 57.7130, cond2 (A) ≈ 363474 and cond∞ (A) ≈ 54.4888 Applying the

estimation of cond(A)4 of Remark 22: cond(A) ≥ 158691 If A is symmetric matrix, then the estimation is sharp for cond2 (A). Remark 2.6 Let be kδkbbkk = 1 , where 1 is the machine epsilon, 1 signicance consists in it, that it is constituted absolutely, relative error bar at the input 4 In Matlab we can compute cond(A) as cond(A, p) where A denotes the given matrix, p = { 1, 2and inf } the given norm. 11 http://www.doksihu and at the four fundamental operations. cond(A) ≥ 1 . 1 Then (2.14) shows us that the relative error of solution can be quite huge: cond(A) kδ bk ≥ 1. kbk Such a system is called ill-conditioned. Example 2.3 Let Hn ∈ Rn×n the so called Hilbert5 matrix, dened as Hn = 1 i+j−1 !n i,j=1 For instence, for n = 4 it is dened as:   1 1/2 1/3 1/4   1/2 H4 =   1/3  1/4 1/3 1/4 1/4 1/5 1/5 1/6  1/5  . 1/6   1/7 The following table shows the behavior of the cond2 (Hn ) for some n: n cond2 (Hn ) 4

7 4 1.55 · 10 10 8 4.75 · 10 1.60 · 10 18 13 25 18 2.39 · 10 6.14 · 1018 One can see that cond2 (Hn ) increases very rapidly. Therefore, the usual solver of the SLAE cannot work eciently on the test-problems with Hn -matrices. The reliability of the solution under given computer architecture of some given SLAE is independent on the determinant and it depends mainly on the condition number. 2.32 When A is charged with error Let us consider the problem when A is charged with error: (A + δA) · (x + δx) = b (2.16) First, we need the condition under which A + δA is nonsingular. (We have assumed only the regularity of A.) 5 David Hilbert (1862-1943) was a German mathematician. He formulated the theory of Hilbert spaces, one of the foundations of functional analysis. 12 http://www.doksihu Lemma 2.1 (Perturbation lemma): Let S := I + R and kRk =: q < 1 Then 1 . S is nonsingular and kS −1 k ≤ 1−q Proof. It can be seen in [5] in section 233 As A nonsingular,

we can rewrite (2.16) as (I + A−1 δA) · (x + δx) = A−1 b. (2.16') Using Lemma 2.1 to (216') we assume: kA−1 δAk < 1 Then kδAk < kA1−1 k is a sucient condition for that I + A−1 δA be nonsingular, because using (2.8) of Theorem 2.6 : kA−1 δAk ≤ kA−1 k · kδAk < 1. (2.17) Obviously, we can write (A + δA) as A(I + A−1 δA). In case of (217) and with the help of Lemma 2.1, we can estimate k(A + δA)−1 k as follows k(A + δA)−1 k = k(I + A−1 δA)−1 · A−1 k ≤ k(I + A−1 δA)−1 k · kA−1 k ≤ ≤ kA−1 k . 1 − kA−1 k · kδAk (2.18) Hereupon we can estimate the distinction of (2.2') and (216) As Ax = b = = (A + δA) · (x + δ x) = b + (A + δA) · δ x, so δ x = −(A + δA)−1 δAx and using (2.18) we get: kδ xk ≤ k(A + δA)−1 k · kδAk · kxk ≤ kA−1 k · kδAk · kxk, 1 − kA−1 k · kδAk cond(A) kδAk kδ xk kAk ≤ . kxk 1 − cond(A) kδAk (2.19) kAk In this case the relative error of

result expressible with the help of condition number, and with the relative error of the data of A. 13 http://www.doksihu 3 Nonnegative Matrices 3.1 Bounds for the spectral radius of a matrix It is generally dicult to determine precisely the spectral radius of a given matrix. Nevertheless, upper bounds can easily be found from the following theorem Theorem 3.1 (Gerschgorin6 theorem) Let A = (aij ) be an arbitrary n × n matrix, and let Λi ≡ n X (3.1) |aij |, 1 ≤ i ≤ n. j=1 j6=i Then, all the eigenvalues λ of A lie in the union of the disks |z − aii | ≤ Λi 1 ≤ i ≤ n. Proof. Let λ be any eigenvalue of the matrix A, and let x be the corre- sponding normalized eigenvector. (We normalize the vector x so that its largest component in modulus is unity.) By denition, (λ − aii )xi = n X aij xj , 1 ≤ i ≤ n. (3.2) j=1 j6=i In particular, if |xr | = 1, then |λ − arr | ≤ n X |arj | · |xj | ≤ n X |arj | = Λr . j=1 j6=r j=1 j6=r

Thus, the eigenvalues λ lies in the disk |z − arr | ≤ Λr . But since λ was an arbitrary eigenvalue of A, it follows that all the eigenvalues of the matrix A lie in the union of the disks |z − aii | ≤ Λi , 1 ≤ i ≤ n, completing the proof. Since the disk |z − aii | ≤ Λi is a subset of the disk |z| ≤ |aii | + Λi , we have the immediate result of 6 Semyon Aranovich Gerschgorin (1901-1933) was a Russian mathematician. More information about his theorem in the book: Richard S Varga: Gerschgorin and His Circles 14 http://www.doksihu Corollary 3.1 If A = (aij ) is an arbitrary n × n matrix, and v ≡ max n X 1≤i≤n (3.3) |aij |, j=1 then ρ(A) ≤ v = kAk1 . But as A and AT have the same eigenvalues, the application of Corollary 3.1 to AT gives us Corollary 3.2 If A = (aij ) is an arbitrary n × n matrix, and n X v 0 ≡ max 1≤j≤n (3.4) |aij |, i=1 then ρ(A) ≤ v 0 = kAk∞ . To improve further on these results, we use now the fact that

similarity transformations on a matrix leave its eigenvalues invariant. Corollary 3.3 If A = (aij ) is an arbitrary n × n matrix, and x1 , x2 , · · · , xn are any n positive real numbers, let ( v ≡ max n P j=1 |aij |xj ) v ≡ max ; xi 1≤i≤n ( 0 1≤j≤n xj n X |aij | i=1 xi ) (3.5) Then, ρ(A) ≤ min(v, v 0 ). Proof. Let D = diag(x1 , x2 , · · · , xn ) and apply Corollary 31 and 32 to the matrix D−1 AD, whose spectral radius necessarily coincides with that of A. Example 3.1 Let   0 0 1/4 1/4   0 A=  1/4  1/4 0 1/4 1/4 0 1/4 0  1/4  , 0   0 if x1 = x2 = x3 = x4 = 1, then v = v 0 = 21 , which is the exact value of ρ(A). This shows that equality is possible in Corollary 3.1, 32 and 33 15 http://www.doksihu Remark 3.1 In an attempt to improve the Corollary 31, suppose that the row sums of the moduli of the entries of the matrix A were not equal to v of (3.3) Could we hope to conclude that ρ(A) < v

? The counterexample given by the matrix ! B= 1 1 0 3 , shows this to be false, since v of (3.3) is 3, but ρ(B) = 3 also This leads us to the following important denition. Denition 3.1 For n ≥ 2, an n × n matrix A is reducible if there exits an n × n permutation matrix P such that P AP T = A11 A12 0 A22 ! , where A11 is an r × r submatrix and A22 is an (n − r) × (n − r) submatrix, where 1 ≤ r ≤ n. If no such permutation matrix exits, then A is irreducible If A is a 1 × 1 matrix, then A is irreducible7 if its single entry is nonzero, and reducible otherwise. With the concept of irreducibility, we can sharpen the result of Theorem 3.1 as follows: Theorem 3.2 Let A = (aij ) be an irreducible n × n matrix, and assume that λ, an eigenvalue of A, is a boundary point of union of the disks |z − aii | ≤ Λi . Then, all the n circles |z − aii | = Λi pass through the point λ. Proof. It can be seen in [6] as Theorem 17 This sharpening of Theorem 3.1

immediately gives rise to the sharpened form of Corollary 3.3 of Theorem 31 Corollary 3.4 Let A = (aij ) be an irreducible n×n matrix, and x1 , x2 , · · · , xn be any n positive real numbers. If n P |aij |xj j=1 ≤v xi (3.6) 7 The term irreducible(unzerlegbar) was introduced by the German mathematician, Ferdinand Georg Frobenius (1849-1917) in 1912; it is also called unreduced and indecomposable in the literature. 16 http://www.doksihu for all 1 ≤ i ≤ n, with strict inequality for at least one i, then ρ(A) < v . Similarly, if n P |aij | xj i=1 ≤ v0 xi (3.7) for all 1 ≤ j ≤ n, with strict inequality for at least one j , then ρ(A) < v 0 . Example 3.2 The following matrix   1 2 0  A=  0 4  1   and x = (1, 1, 1) 1 0 3 is good example for Corollary 3.4, in this case v = 5, v 0 = 6 and ρ(A) = 4.4142 We get strict inequality for i = 1 in (36) and for j = 3 in (37) 3.2 Spectral radius of nonnegative matrices In

investigations of the rapidity of convergence of various iterative methods, the spectral radius ρ(A) of the corresponding iteration matrix A plays a major role. For the practical estimation of this constant ρ(A), we have thus far only upper bounds for ρ(A) by the extensions of Gerschgorin's Theorem 3.1 In this section, we shall look closely into the Perron-Frobenius theory of square matrices having nonnegative real numbers as entries. Theory of Perron-Frobenius plays an important role in the study of M -matrices. First, we dene an (partial) ordering for the matrices. Denition 3.2 Let A = (aij ) and B = (bij ) be two n × r matrices Then, A ≥ B(> B) if aij ≥ bij (> bij ) for all 1 ≤ i ≤ n, 1 ≤ j ≤ r. If O is the nullmatrix and A ≥ O(> O), we say that A is a non-negative (positive) matrix. Finally, if B = (bij ) is an arbitrary n × r matrix, then |B| denotes the matrix with |bij |. In developing the Perron-Frobenius theory, we shall rst establish a

series of lemmas on non-negative irredducible square matrices. Lemma 3.1 If A ≥ O is an irreducible n × n matrix, then (I + A)n−1 > O 17 http://www.doksihu Proof. It can be seen in [6] as Lemma 21 If A = (aij ) ≥ O is an irreducible n × n matrix and x ≥ 0 is any nonzero vector, and let ( rx = min n P j=1 aij xi ) xi , (3.8) where the minimum is taken over all i for which xi > 0. Clearly, rx is a nonnegative real number and is the suprerum of all numbers ψ ≥ 0 for which Ax ≥ ψ x. (3.9) We now consider the non-negative quantity r dened by r = sup {rx }. x≥0 x6=0 (3.10) As rx and rαx have the same value for any scalar α > 0 (due to (3.8)), we need consider only the set P of vectors x ≥ 0 with kx = 1k, and we correspondingly let Q be the set of all vectors y = (I + A)n−1 x, where x ∈ P . From Lemma 3.1, Q consists only of positive vectors Multiplying both sides of the inequality Ax ≥ rx x by (I + A)n−1 , we obtain Ay ≥ rx y, and we

conclude from (3.9) that ry ≥ rx Therefore, the quantity r of (310) can be dened equivalenty as r = sup {ry }. y∈Q (3.10') As P is a compact set of vectors, so is Q, and as rx is a continuous function on Q, there neccessarily exists a positive vector z for which Az ≥ rz, (3.11) and no vector w ≥ 0 exists for which Aw > rw. We shall call all non-negative nonzero vectors z satisfying (3.11) extremal vectors of the matrix A Lemma 3.2 If A ≥ 0 is an irreducible n × n matrix, the quantity r of (310) is positive. Moreover, each extremal vector z is a positive eigenvector of the matrix A with corresponding eigenvalue r, i.e, Az = rz, and z > 0 18 http://www.doksihu Proof. If ζ is the positive vector whose components are all unity, then since the matrix A is irreducible, no row of A can vanish, and consequently no component of Aζ can vanish. Thus, rζ > 0, proving that r > 0 For the second part of this lemma, let z be an extremal vector with Az − rz =

η , where η ≥ 0. If η 6= 0, then some component of η is positive; multiplying through by the matrix (I + A)n−1 , we have Aw − rw > 0 where w = (I +A)n−1 z > 0. It would then follow that rw > r, contradicting the denition of r in (3.10') Thus, Az = rz, and since w > 0 and w = (1 + r)n−1 z, then z > 0, completing the proof. Lemma 3.3 Let A = (aij ) ≥ 0 be an irreducible n×n matrix, and let B = (bij ) be an n × n matrix with |B| ≤ A. If β is any eigenvalue of B , then |β| ≤ r, where r is the positive constant of (3.10) Moreover, equality is valid, ie, β = reiΦ , if and only if |B| = A, and where B has the form B = eiΦ DAD−1 , and D is a diagonal matrix whose diagonal entries have modulus unity. Proof. It can be seen in [6] as Lemma 23 Setting B = A in Lemma 3.3 immediately gives us Corollary 3.5 If A ≥ 0 is an irreducible n × n matrix, then the positive eigenvalue r of Lemma 3.2 equals ρ(A) of A Lemma 3.4 If A ≥ 0 is an

irreducible n × n matrix, and B is any principal square submatrix of A, then ρ(B) < ρ(A). Proof. If B is any principal submatrix of A, then there is an n×n permutation matrix P such that B = A11 , where C≡ A11 0 0 0 ! ; P AP T ≡ A11 A12 A21 A22 ! . Here, A11 and A22 are, respectively, m × m and (n − m) × (n − m) principal square submatrices of P AP T , 1 ≤ m < n. Clearly, O ≤ C ≤ P AP T , and 19 http://www.doksihu ρ(C) = ρ(B) = ρ(A11 ), but as C = |C| 6= P AP T , the conclusion follows immediately from Lemma 3.3 and Corollary 35 We now collect above results into the following main theorem. Theorem 3.3 (Perron8 -Frobenius) Let A ≥ O be an irreducible n × n matrix Then, 1. A has a positive real eigenvalue equal to its spectral radius 2. ρ(A) there corrensponds an eigenvector x > 0 3. ρ(A) increases when any entry of A increases 4. ρ(A) is a simple eigenvalue of A Proof. Parts 1 and Parts 2 follow immediately from Lemma 32 and the

Corollary 3.5 to Lemma 33 To prove part 3, suppose we increase some entry of the matrix A, giving us a new irreducible matrix Ae, where Ae ≥ A and Ae 6= A. e > ρ(A). To prove that ρ(A) is a Applying Lemma 3.3, we conclude that ρ(A) simple eigenvalue of A, i.e, ρ(A) is a zero of multiplicity one of the characteristic 0 polynomial Φ(t) = det(tI − A), we make use of the fact that Φ (t) is the sum of the determinant of the principal (n − 1) × (n − 1) submatrices of tI − A. If Ai is any principal submatrix of A, then from Lemma 3.4, det(tI − Ai ) cannot vanish for any t ≥ ρ(A). From this it follows that det(ρ(A)I − Ai ) > 0, and thus 0 Φ (ρ(A)) > 0. Consequently, ρ(A) cannot be a zero of Φ(t) of multiplicity greater then one, and thus ρ(A) is a simple eigenvalue of A. Remark 3.2 This result of Theorem 33 incidentally shows us that if Ax = ρ(A)x, where x > 0 and kxk = 1, we cannor nd another eigenvector y 6= γ x (γ a scalar) of A with Ay =

ρ(A)y, so that the eigenvector x so normalized is uniquely determined. We now return to the problem of nding bounds for the spectral radius of a matrix. For non-negative matrices, the Perron-Frobenius theory gives us 8 Historically, the German mathematician Perron (1880-1975) proved in 1907 this theorem assuming that A > O . Later (1912), Frobenius extended most of Perron's results to the class of non-negative irreducible matrices. 20 http://www.doksihu a nontrivial lower-bound estimates of A. Coupled with Gerschgorin's Theorem 3.1, we thus obtain lemma for the spectral radius of a non-negative irreducible square matrix. Lemma 3.5 If A = (aij ) ≥ O is an irreducible n × n matrix, then either n X (3.12) aij = ρ(A) f or all 1 ≤ i ≤ n, j=1 or min 1≤i≤n n X ! aij n X < ρ(A) < max 1≤i≤n j=1 ! aij . (3.13) j=1 Proof. First, suppose that all the row sums of A are equal to σ If ζ is the vector with all components unity, then

obviously Aζ = σζ , and σ ≤ ρ(A) by Denition 2.2 But, Corollary 31 shows us that ρ(A) ≤ σ , and we conclude that ρ(A) = σ , which is the result of (3.12) If all the row sums of A are not equal, we can construct a non-negative irreducible n × n matrix B = (bij ) by decreasing certain positive entries of A so that for all 1 ≤ i ≤ n, n X n X bij = α = min 1≤i≤n j=1 ! aij , j=1 where O ≤ B ≤ A and B 6= A. As all the row sums of B are equal to α, we can apply the result of (3.12) to the matrix B , and thus ρ(B) = α Now, by the statment 3 of Theorem 3.3, we must have that ρ(B) < ρ(A), so that min 1≤i≤n n X ! aij < ρ(A). j=1 On the other hand, we can similarly construct a non-negative irreducible n × n matrix C = (cij ) by incerasing certain of the positive entries of A so that all the row sums of C are equal, where C ≥ A and C 6= A. It follows that ρ(A) < ρ(C) = max 1≤i≤n n X ! aij , j=1 and the combination of these

inequalities gives the desired result of (3.13) 21 http://www.doksihu Example 3.3 Provable, that  1  A(α) =   3 4 1 + 0.9α 0.8α 3 5 1 α   , α ≥ 0  is irreducible. For dierent α, we determine ρ(A[α)], lower and upper bounds for ρ[A(α)]. As A(α) is irreducible we know by the statment 3 of Theorem 33 that as α increases ρ[A(α)] is strictly increasing. α ρ[A(α)] Lower bound Upper bound 0 6 6 6 1 6.8878 6.8 7 2 7.7757 7.6 8 3 8.6643 8.4 9 5 10.4438 10 11 10.283 15.1617 14.2264 16.2830 For α = 0 the row sums of A(0) are equal, thus by the rst part of Lemma 3.5 we know ρ[A(0)], lower and upper bound are equal. 3.3 Reducible Matrices In the previous sections of this chapter we have investigated the PerronFrobenius theory of non-negative irreducible square matrices. We now consider extensions of these basic results which make no assumption of irreducibility. Let A be a reducible n × n matrix. By Denition 31, there exits

an n × n permutation matrix P1 such that P1 AP1T = A11 A12 0 A22 ! , where A11 is an r × r submatrix and A22 is an (n − r) × (n − r) submatrix, where 1 ≤ r ≤ n. We again ask if A11 and A22 are irreducible, and if not, we reduce them in the manner we initially reduced the matrix. Thus, there exists an n × n permutation matrix P such that  P AP T  R11 R12 ··· R1m   O  = .  .  O R22 ···  R2m   , .  .   Rmm O 22 . . ··· http://www.doksihu where each square submatrix Rjj , 1 ≤ j ≤ m (3.14), is either irreducible or a 1×1 null matrix. We shall say that (314) is the normal form9 of a reducible matrix A. Clearly, the eigenvalues of A are the eigenvalues of the square submatrices Rjj , 1 ≤ j ≤ m. With (314), we prove the following generalization of Theorem 3.3 Theorem 3.4 Let A ≥ O be an n × n matrix Then, 1. A has a non-negative real eigenvalue equal to its spectral radius Moreover, this

eigenvalue is positive unless A is reducible and the normal form of A is strictly upper triangular. 2. To ρ(A), there corrensponds an eigenvector x ≥ 0 3. ρ(A) does not decrease when any entry of A is increased Proof. If A is irreducible, the conclusions follow immediately from Theorem 3.3 If A is reducible, assume that A is in the normal form of (314) If any submatrix Rjj of (3.14) is irreducible, then Rjj has a positive eigenvalue equal to its spectral radius. Similarly, if Rjj is a 1 × 1 null matrix, its single eigenvalue is zero. Clearly, A then has a non-negative real eigenvalue equal to its spectral radius. If ρ(A) = 0, then each Rjj of (314) is a 1 × 1 null matrix, which proves that the matrix of (3.14) is strictly upper triangular The remaining statements of this theorem follow by applying a continuity argument to the result of Theorem 3.3 Using the notation of Denition 3.2, we include the following generalization of Lemma 3.3 Theorem 3.5 Let A and B be two n × n

matrices with O ≤ |B| ≤ A Then, ρ(B) ≤ ρ(A). Proof. If A is irreducible, then we know from Lemma 33 and its Corollary 3.5 that ρ(B) ≤ ρ(A) On the other hand, if A is reducible, we note that the property O ≤ |B| ≤ A is invariant under similarity transformations by permutation matrices. Putting A into its normal form (314), we now simply apply the argument above to the submatrices |Rjj (B)| and Rjj (A) of the matrices |B| and A, respectively. 9 This expression is introduced in 1959 by the Russian mathematician Ganthmakher. 23 http://www.doksihu 4 M-matrices Consider the matrix A in (1.1) Since A can be expressed in the form A = sI − B, s > 0, B ≥ O, (4.1) it should come no suprise that the theory of non-negative matrices plays a dominant role in the study of certain of these matrices. Matrices of the form (4.1) often occur in relation to systems of linear algebraic equations or eigenvalue problems in a wide variety of areas including nite diernce methods

for partial dierencial equations, input-output production and growth models in economics and Markov process in probality and statistics. We adopt here the traditional notation by letting Z n×n = {A = (aij ) ∈ Rn×n : aij ≤ 0, i 6= j}. Our aim is to give a systematic treatment of a certain subclass of matrices in Z n×n called M -matrices. Denition 4.1 Any matrix A of the form (41) for which s ≥ ρ(B) is called an M -matrix. In the following section we consider nonsingular M-matrices, that is, those of the form (4.1) for which s > ρ(B) Characterization theorems are given for A ∈ Z n×n and A ∈ Rn×n to be a nonsingular M -matrix and the symmetric and irreducible cases are considered. 4.1 Nonsingular M-matrices Before proceeding to the main characterization theorem for nonsingular M matrices, it will be convenient to have available the following lemma, which is the matrix version of the Neumann10 lemma for convergent series. Lemma 4.1 The non-negative matrix T ∈ Z

n×n is convergent; that is, ρ(T ) < 1, if and only if (I − T )−1 exists and (I − T )−1 = ∞ X T k ≥ 0. (4.2) k=0 10 János Neumann (1903-1957) was a Hungarian mathematician, who made major contributions to a vast range of elds. He is generally regarded as one of the greatest mathematicians in modern history. Anecdotes from his life in  PRHalmos: The legend of John von Neumann article. 24 http://www.doksihu Proof. If T is convergent then (42) follows from the identity (I − T )(I + T + T 2 · · · T k ) = I − T k+1 , k ≥ 0, by letting k approach innity. For the converse let T x = ρ(T )x for some x > 0. (Such an x exists by the Perron-Frobenius Theorem 3.3) Then ρ(T ) 6= 1 since (I − T )−1 exists and thus (I − T )x = [1 − ρ(T )]x implies that (I − T )−1 x = x 1 − ρ(T ) −1 Since x > 0 and (I − T ) > 0 ⇒ ρ(T ) < 1. . For practical purposes in characterizing nonsingular M -matrices, it is evident that we can

often begin by assuming that A ∈ Z n×n . However, many of the statements of these characterizations are equivalent without this assumption. We have attempted here to group together all such statements into certain categories. Moreover, certain other implications follow without assuming that A ∈ Z n×n and we point out such implications in the following inclusive theorem. All matrices and vectors considered in this theorem are real. 4.2 Theorem 4.1 Theorem 4.1 [2] Let A ∈ Rn×n Then for each xed letter ϑ representing one of the following conditions, conditions ϑi are equivalent for each i. Moreover, letting ϑ then represent any of the equivalent conditions ϑi , the following implication tree holds: 25 http://www.doksihu Finally, if A ∈ Z n×n then each of the following 50 conditions is equivalent to the statement:  A is a nonsingular M -matrix. Positivity of principal minors (A1 ) All of the principal minors of A are positive. (A2 ) Every real eigenvalue of

each principal submatrix of A is positive. (A3 ) For each x 6= 0 there exists a positive diagonal matrix D such that xT AD x > 0. (A4 ) For each x 6= 0 there exists a nonnegative diagonal matrix D such that xT AD x > 0. (A5 ) A does not reverse the sign of any vector; that is, if x 6= 0 and y = Ax, then for some subscript i, xi yi > 0. (A6 ) For each signature matrix S (here S is diagonal with diagonal entries ±1), there exists an x >> 0 such that SAS x > 0. (B7 ) The sum of all the k × k principal minors of A is positive for k = 1, · · · , n. (C8 ) A is nonsingular and all the principal minors of A are non-negative. (C9 ) A is nonsingular and every real eigenvalue of each principal submatrix of A is non-negative. (C10 ) A is nonsingular and A + D is nonsingular for each positive diagonal matrix D . (C11 ) A + D is nonsingular for each non-negative diagonal matrix D. (C12 ) A is nonsingular and for each x 6= 0 there exists a non-negative diagonal matrix D

such that xT D x 6= 0 and xT AD x ≥ 0. (C13 ) A is nonsingular and if x 6= 0 and y = Ax, then for some subscript i, xi 6= 0, and xi yi ≥ 0. (C14 ) A is nonsingular and for each signature matrix S there exists a vector x > 0 such that SAS x ≥ 0. 26 http://www.doksihu (D15 ) A + αI is nonsingular for each α ≥ 0. (D16 ) Every real eigenvalue of A is positive. (E17 ) All the leading principal minors of A are positive. (E18 ) There exist lower and upper triangular matrices L and U , respectively, with positive diagonals such that A = LU. (F19 ) There exists a permutation matrix P such that P AP T satises condition (E17 ) or (E18 ). Positive Stability (G20 ) A is positive stable; that is, the real part of each eigenvalue of A is positive. (G21 ) There exists a symmetric positive denite matrix W such that AW +W AT is positive denite. (G22 ) A + I is nonsingular and G = (A + I)−1 (A + I) is convergent, (G23 ) A + I is nonsingular and for G = (A + I)−1 (A + I)

there exists a positive denite matrix W such that W − GT W G is positive denite. (H24 ) There exists a positive diagonal matrix D such that AD + DAT is positive denite. (H25 ) There exists a positive diagonal matrix E such that for B = E −1 AE , the matrix B + BT 2 is positive denite. (H26 ) For each positive semidenite matrix Q, the matrix QA has a positive 27 http://www.doksihu diagonal element. Semipositivity and Diagonal Dominance (I27 ) A is semipositive; that is, there exists x >> 0 with Ax >> 0. (I28 ) There exists x > 0 with Ax >> 0. (I29 ) There exists a positive diagonal matrix D such that AD has all positive row sums. (J30 ) There exists x >> 0 with Ax > 0 and i X aij xj > 0, i = 1, . , n j=1 (K31 ) There exists a permutation matrix P such that P AP T satises (J30 ). (L32 ) There exists x >> 0 with y = Ax > 0 such that if yi0 = 0, then there exists a sequence of indices i1 , . , n with aij−1 ij 6= 0,

j = 1, . , r, and with yir 6= 0. g e = (a (L33 ) There exists x >> 0 with y = Ax > 0 such that the matrix A ij ) dened by ( g (a ij ) = 1 if aij 6= 0 or yi 6= 0, 0 otherwise is irreducible. (M34 ) There exists x >> 0 such that for each signature matrix S, SAS x >> 0. (M35 ) A has all positive diagonal elements and there exists a positive diagonal matrix D such that AD is strictly diagonally dominant; that is, aii di > X |aij |dj , i = 1, . , n j6=i (M36 ) A has all positive diagonal elements and there exists a positive diagonal matrix E such that E −1 AE is strictly diagonally dominant. (M37 ) A has all positive diagonal elements and there exists a positive diagonal matrix D such that AD is lower semistrictly diagonally dominant; that is, aii di ≥ X |aij |dj , i = 1, . , n, j6=i 28 http://www.doksihu and aii di > i−1 X |aij |dj , i = 2, . , n j=1 Inverse-Positivity and Splittings (N38 ) A is inverse-positive; that

is, A−1 exists and A−1 ≥ 0. (N39 ) A is monotone; that is, Ax ≥ 0 x ≥ 0 for all x ∈ Rn . (N40 ) There exist inverse-positive matrices B1 and B2 such that B1 ≤ A ≤ B2 . (N41 ) There exists an inverse-positive matrix B ≥ A such that I − B −1 A is convergent. (N42 ) There exists an inverse-positive matrix B ≥ A and A satises (I27 ), (I28 ) or (I29 ). (N43 ) There exists an inverse-positive matrix B ≥ A and a nonsingular M -matrix C , such that A = BC. (N44 ) There exists an inverse-positive matrix B and a nonsingular M -matrix C such that A = BC. (N45 ) A has a convergent regular splitting; that is, A has a representation A = M − N, M −1 ≥ O, N ≥ O, where M −1 N is convergent. (N46 ) A has a convergent weak regular splitting; that is, A has a representation A = M − N, M −1 ≥ O, M −1 N ≥ O, where M −1 N is convergent. (O47 ) Every weak regular splitting of A is convergent. (P48 ) Every regular splitting of A is convergent. Linear

Inequalities 29 http://www.doksihu (Q49 ) For each y ≥ 0 the set Sy = {x ≥ 0 : AT x ≤ y} is bounded, and A is nonsingular. (Q50 ) S0 that is; the inequalities AT x and x ≥ 0 have only the trivial solution x = 0, and A is nonsingular. Proof. We show now that each of the 50 conditions, (A1 ) − (Q50 ), can be used to characterize a nonsingular M -matrix A, beginning with the assumption that A ∈ Z n×n . Suppose rst that A is a nonsingular M -matrix Then in view of the implication tree in the statement of the theorem, we need only show that conditions (M ) and (N ) hold, for these conditions together imply each of the remaining conditions in the theorem for arbitrary A ∈ Rn×n . Now by Denition 4.1, A has the representation A = sI − B, B ≥ O with s ≥ ρ(B) Moreover, s > ρ(B), since A is nonsingular. Letting T = Bs , it follows that ρ(T ) < 1 so by Lemma 4.1, A−1 = (I − T )−1 s ≥ 0. Thus condition (N38 ) holds. In addition, A has all positive

diagonals since the inner product of the ith row of A with the ith column of A−1 is one for i = 1, , n Let x = A−1 e, where e = (1, . , 1)T Then for D = diag(x1 , x2 , , xn ), D is a positive diagonal matrix and ADe = Ax = e >> 0, and thus AD has all positive row sums. But since A ∈ Z n×n , this means that AD is strictly diagonally dominant and thus M35 holds. We show next that if A ∈ Z n×n satises any of the conditions (A) − (Q), then DA is a nonsingular M -matrix. Once again, in view of the implication tree, it suces to consider only conditions (D), (F ), (L) and (P ). Suppose condition (D16 ) holds for A and let A = sI − B, B ≥ O, s > 0. Suppose that s ≤ ρ(B). Then if B x = ρ(B)x, x 6= 0, Ax = [s − ρ(B)]x, so that s − ρ(B) would be a nonpositive real eigenvalue of A, contradicting (D16 ). Now suppose condition (F19 ) holds for A Thus suppose P AP T = LU where L is lower triangular with positive diagonals and U is upper triangular 30

http://www.doksihu with positive diagonals. We rst show that the o-diagonal elements of both L and U are nonpositive. Let L = (rij ), U = (sij ) so that rij = 0 for i < j and sij = 0 for i > j and rii > 0, sii > 0 for 1 ≤ i, j ≤ n. We shall prove the inequalities rij ≤ 0, sij ≤ 0 for i 6= j by induction on i + j . Let 0 A0 = P AP T = (aij ). If i + j = 3 the inequalities r21 ≤ 0 and s21 ≤ 0 follow from a12 = r11 s12 and 0 a21 = r21 s11 . Let i + j > 3, i 6= j , and suppose the inequalities rkl ≥ 0 and skl ≥ 0, k 6= l, are valid if k + l < i + j . Then if i < j in the realtion 0 aij = rii sij + X rik skj , k<i we have aij ≤ 0, X rik skj ≥ 0 since rik ≤ 0, skj ≤ 0 k<i according to i + k < i + j, k + j < i + j . Thus sij ≤ 0 Analogously, for i > j the inequality rij ≤ 0 can be proved. It is easy to see then that L−1 and U −1 exist and are nonnegative. Thus A−1 = (P T LU P )−1 = P T U −1 L−1 P ≥ O.

Now letting A = sI − B, s > 0, B ≥ O it follows that (I − T )−1 ≥ O, where T = Bs . Then ρ(B) < 1 by Lemma 41, and thus s > ρ(B) and A is a nonsingular M -matrix. Next, assume that condition (L33 ) holds for A We write A = sI − B, s > 0, B ≥ O and let T = Bs . Then since y = Ax > 0 for some x >> 0, it follows that T x < x. Now dene T̂ = (t̂ij ) by     tij t̂ij =     0 if tij 6= 0, if tij = 0 and yi 6= 0, otherwise. It follows then that T is irreducible since Ae dened in (L33 ) is irreducible. Moreover, for suciently small  > 0, T̂ x < x, so that ρ(T̂ ) < 1 by the Perron-Frobenius Theorem 3.3 Finally, since T ≤ T̂ it follows that ρ(T ) ≤ ρ(T̂ ) < 1, 31 http://www.doksihu by Theorem 3.5 Thus A is a nonsingular M -matrix Next, assume that condition (P48 ) holds for A. Then since A has a regular splitting of the form A = sI − B, s > 0, B ≥ O, it follows from (P48 ) that T = Bs is

convergent, so that s > ρ(B) and A is a nonsinguIar M -matrix. This completes the proof of Theorem 4.1 for A ∈ Z n×n 5 Iterative methods for the linear algebraic systems with M-matrices As we have seen in chapter two, there are two general schemes for solving SLAE of the form (2.2'): direct and iterative methods We mention two DM The following two general schemes applicable for M -matrices. We prove the Gaussian elimination and the LU factorization exist for M matrices. The LU factorization is a matrix decomposition which writes a matrix as the product of a lower triangular L matrix and an upper triangular U matrix. Theorem 5.1 If A ∈ Rn×n is an M -matrix, then the LU factorization exists and the Gaussian elimination is executable. Proof. To prove this theorem we use the fact there exist a unique LU factorization if and only if the pricipal minors of A are nonzero (It can be seen in [3]). This theorem is valid for M -matrices by the statement of (A1 ) of Theorem 4.1

The LU factors of an M -matrix are guaranteed to exist and can be stably computed without need for numerical pivoting, also have positive diagonal entries and non-positive o-diagonal entries. In this chapter we apply nonnegativity to the study of iterative methods for solving SLAE of the form (2.2') We shall study various iterative methods for approximating x. Such methods are usually ideally suited for problems involving large sparse matrices, much more so in most cases than direct methods such as Gaussian elimination. A typical iterative method involves the selection of an initial approximation x(0) to the solution x to (2.2') and the determination of a sequence x(0) , x(1) , . according to some specied algorithm which, if the method is properly chosen, will converge to the exact solution x of (2.2') Since 32 http://www.doksihu x is unknown, a typical method for terminating the iteration might be whenever |xi (k+1) |xi − xi | (k) (k+1) (5.1) <  i =

1, . , n, | where  is some pre-chosen small number, depending upon the precision of the computer being used. Essentially, the stopping criteria (51) for the iteration is that we will choose x(k+1) as the approximation to the solution x whenever the relative dierence between successive approximations x(k) and x(k+1) becomes i i suciently small for i = 1, . , n 5.1 Basic iterative methods 5.11 Jacobi and Gauss-Seidel method We begin this section by describing two iterative formulas for solving the linear system (2.2') Here we assume that the coecient matrix A = (aij ) for (2.2') is nonsingular and has all nonzero diagonal entries Assume that the kth approximating vector x(k) to x = A−1 b has been computed. Then the Jacobi11 method for computing x(k+1) is given by x (k+1) = i ! X 1 (k) bi − aij xj , i = 1, . , n aii (5.2) j6=i Now let D = diag(A) and −L and −U be the strictly lower and strictly upper triangular parts of A, respectively; that is, 

0   a21  L = − .  .  an1 0 . .  0 a12       , U = −     . ann−1  0 . . . 0 a1n  . .    .  an−1n  0 Then, clearly, (5.2) may be written in matrix form as x(k+1) = D−1 (L + U )x(k) + D−1 b, k = 0, 1, . (5.3) A closely related iteration may be derived from the following observation. If we assume that the computations of (5.2) are done sequentially for i = 1, , n, then 11 It was originally considered by the Prussian mathematician Carl Gustav Jacob Jacobi (1804-1851). 33 http://www.doksihu (k+1) at the time we are ready to compute x(k+1) the new components x(k+1) , . , xi−1 1 i are already available, and it would seem reasonable to use them instead of the old components; that is, we compute x (k+1) = i ! i−1 n X X 1 (k+1) (k) bi − aij xj − aij xj , i = 1, . , n aii j=1 j=i+1 (5.4) This is the Gauss-Seidel12 method. It is easy to see that (54) may be

written in the form Dx(k+1) = b + Lx(k+1) + U x(k) , so that the matrix form of the Gauss-Seidel method (5.4) is given by x(k+1) = (D − L)−1 U x(k) + (D − L)−1 b, k = 0, 1, . (5.5) Of course (5.2) and (54) should be used rather than their matrix formulations (5.3) and (55) in programming these methods We shall return to these two fundamental procedures later, but rst we consider the more general iterative formula x(k+1) = H x(k) + c, k = 0, 1, . (5.6) The matrix H is called the iteration matrix for (5.6) and it is easy to see that if we split A into A = M − N, M nonsingular, then for H = M −1 N, c = M −1 b and x = H x + c if and only if Ax = b. Clearly, the Jacobi method is based upon the choice M = D, N = L + U, while the Gauss-Seidel method is based upon the choice M = D − L, N = U. Then for the Jacobi method H = M −1 N = D−1 (L + U ) for the Gauss-Seidel method H = M −1 N = (D−L)−1 U . We next prove the basic convergence lemma for (5.6) 12

It is named after the German mathematicians Carl Friedrich Gauss and Philipp Ludwig von Seidel (1821-1896). In numerical linear algebra also known as the Liebmann method or the method of successive displacement. 34 http://www.doksihu Lemma 5.1 Let A = M − N is an arbitrary n × n matrix with A and M nonsingular. Then for H = M −1 N and c = M −1 b, the iterative method (56) converges to the solution x = A−1 b to (2.2') for each x(0) if and only if ρ(H) < 1 Proof. If we subtract x = H x + c from (5.6), we obtain the error equation x(k+1) − x(k) = H(x(k) − x) = . = H k+1 (x(0) − x) Hence the sequence x(0) , x(1) , . , converges to x for each x(0) if and only if lim H k = O, k∞ that is, if and only if ρ(H) < 1, by considering the Jordan form for H , which proves the lemma. In short we shall say that a given iterative method converges if the iteration (5.6) associated with that method converges to the solution to the given linear system for every x(0)

. Now we discuss the general topic of rates of convergence Denition 5.1 For H ∈ Cn×n assume that ρ(H) < 1 and let x = H x + c Then for ( α = sup lim kx (k) k∞ − xk : x 1 k (0) ) n ∈C (5.7) the number R∞ (H) = − ln(α) is called the asymptotic rate of convergence of the iteration (5.6) The supremum is taken in (5.7) so as to reect the worst possible rate of convergence for any x(0) . Clearly, the larger the R∞ (H), the smaller the α and thus the faster the convergence of the process. Also note that α is independent of the particular vector norm used in (5.7) In the next section, the GaussSeidel method is modied using a relaxation parameter ω and it is shown how, in certain instances, to choose ω in order to maximize the asymptotic rate of convergence for the resulting process. Remark 5.1 Let H ∈ Rn×n assume that ρ(H) < 1 Then, the asymptotic rate of convergence of (5.6) is R∞ (H) = − ln ρ(H). 35 http://www.doksihu 5.12 SOR method

Here we investigate in some detail a procedure that can sometimes be used to accelerate the convergence of the Gauss-Seidel method. We rst note that be the Gauss-Seidel method can be expressed in the following way. Let x(k+1) i (k+1) the ith component of x computed by the formula (5.4) and set ∆xi = xi (k+1) − xi . (k) Then for ω = 1, the Gauss-Seidel method can be restated as (k) x(k+1) = xi + ω∆xi , i = 1, . , n, ω > 0 i (5.8) It was discovered during the years of hand computation (probably by accident) that the convergence is often faster if we go beyond the Gauss-Seidel correction ∆xi . If ω > 1 we are overcorrecting while if ω < 1 we are undercorrecting As just indicated, if to ω = 1 we recover the Gauss-Seidel method (5.4) In general the method (5.8) is called the successive overrelaxation (SOR13 ) method Of course the problem here is to choose the relaxation parameter ω to so as to maximize the asymptotic rate of convergence of (5.8) In order to

write this procedure in matrix form, we replace x(k+1) in (5.8) by i the expression in (5.4) and rewrite (58) as x x (k+1) (k) = (1 − ω) i + i ! i−1 n X X ω (k+1) (k) bi − aij xj − aij xj . aii j=1 j=i+1 (5.9) Then (5.9) can be rearranged into the form aii xi (k+1) +ω i−1 X aij xj (k+1) = (1 − ω)aii xi (k) −ω j=1 n X aij xj (k) + ω bi . j=i+1 This relation of the new iterates x(k+1) to the old x(k) holds for i = 1, . , n, i i and by means of the decomposition A = D − L − U , we can written as Dx(k+1) − ωLx(k+1) = (1 − ω)Dx(k) + ωU x(k) + ω b or, under the assumption that aii 6= 0, i = 1, . , n, x(k+1) = Hω x(k) + ω(D − ωL)−1 b, k = 0, 1, . , (5.10) 13 This iterative method is also called the accelerated Liebmann method, the extrapolated Gauss-Seidel method and the method of systematic overrelation. 36 http://www.doksihu where Hω = (D − ωL)−1 [(1 − ω)D + ωU ]. (5.11) We rst prove a result that gives

the maximum range of values of ω > 0 for which the SOR iteration can possibly converge. Theorem 5.2 Let A an arbitrary n × n matrix have all nonzero diagonal elements. Then the SOR method (58) converges only if 0 < ω < 2. (5.12) Proof. Let Hω be given by (511) In order to establish (512) under the assumption that ρ(Hω ) < 1, it suces to prove that |ω − 1| ≤ ρ(Hω ). (5.13) Because L is strictly lower triangular, detD−1 = det(D − ωL)−1 . Thus detHω = det(D − ωL)−1 det[(1 − ω)D + ωU ] = = det[(1 − ω)I + ωD−1 U ] = det[(1 − ω)I] = (1 − ω)n , because D−1 U is strictly upper triangular. Then since detHω , is the product of its eigenvalues, (5.12) must hold and the theorem is proved It will be shown in following section that for certain important classes of matrices, (5.12) is also a sucient condition for convergence of the SOR method For other classes this method converges if and only if 0 < ω < c for some xed c >

0 which depends upon A. 5.2 Convergence In this section we investigate several important convergence criteria for the iterative methods for solving SLAE in the form (1.1) We now investigate certain types of general splittings, discussed rst in Theorem 4.1, in terms of characterizations of nonsingular M -matrices Denition 5.2 The splitting A = M − N with A and M nonsingular is called a regular splitting if M −1 ≥ O and N > O. It is called a weak regular splitting if M −1 ≥ O and M −1 N ≥ O. 37 http://www.doksihu Clearly, a regular splitting is a weak regular splitting. The next result relates the convergence of (5.4) to the inverse-positivity of A We shall call it the (weak) regular splitting theorem. Remark 5.2 If A ∈ Rn×n is an M -matrix, then it has been [weak] regular splitting by the statments (N45 ) and (N46 ) of Theorem 4.1 M =D N =L+U M =D−L N =U M = D − ωL N = (1 − ω)L + U (Jacobi method) (Gauss-Seidel method) (SOR method) We know by

Denition 5.2 that A = M − N and we receive the desired result if we do the extractions. Theorem 5.3 Let A = M − N be a weak regular splitting of A Then the following statements are equivalent: −1 (1) A ≥ O that is, A is inverse-positive −1 (2) A N ≥ O ρ(A−1 N ) (3) ρ(H) = 1+ρ(A−1 N ) so that ρ(H) < 1. Proof. It can be seen in [2] as Theorem 56 Corollary 5.1 Let A be inverse-positive and let A = M1 −N1 and A = M2 −N2 be two regular splittings of A where N2 ≤ N1 . Then for H 1 = M1−1 N1 and H 2 = M2−1 N2 , ρ(H 2 ) ≤ ρ(H 1 ) < 1 so that R∞ (K) ≥ R∞ (H). Proof. The proof follows from (3) of Theorem 53, together with the fact that α(1 + α)−1 is an increasing function of α for α ≥ 0, which proves the theorem. As indicated earlier, every [weak] regular splitting of a nonsingular M -matrix is convergent by the statments (N45 ) and (N46 ) of Theorem 4.1 Clearly for such matrices, the Jacobi and Gauss-Seidel methods dened by (5.3) and

(55) are based upon regular splittings. Moreover, if 0 < ω ≤ 1, then the SOR method dened by (5.10) is based upon a regular splitting and in this case the SOR 38 http://www.doksihu method is convergent by the Kahan14 -theorem. These concepts will now be extended to an important class of complex matrices. Let A ∈ Rn×n , have all nonzero diagonal elements and let A = D − L − U , where as usual D = diag(A) and where −L and −U represent the lower and the upper parts of A, respectively. We return to the case where A ∈ Rn×n and A is a nonsingular M -matrix. In the following analysis we may assume, without loss of generality, that D = diag(A) = I . In this case the Jacobi iteration matrix for A is given by J = L + U, while the SOR iteration matrix for A is given by Hω = (I − ωL)−1 [(1 − ω)I + ωU ]. Theorem 5.4 Let A = I − L − U ∈ Rn×n where L ≥ O and U ≥ O are strictly lower and upper triangular, respectively. Then for 0 < ω < 1, (1) ρ(J)

< 1 if and only if ρ(Hω ) < 1. (2) ρ(J) < 1 if and only if A is a nonsingular M -matrix, in which case ρ(Hω ) ≤ 1 − ω + ωρ(J). (3) if ρ(J) ≥ 1 then ρ(Hω ) ≥ 1 − ω + ωρ(J) ≥ 1. Proof. It can be seen in [2] as Theorem 521 Corollary 5.2 Let A be as in Theorem 54 Then (1) ρ(J) < 1 if and only if ρ(H1 ) < 1. (2) ρ(J) < 1 and ρ(H1 ) < 1 if and only if A is a nonsingular M -matrix, moreover, ρ(J) < 1 then ρ(H1 ) ≤ ρ(J). (3) if ρ(J) ≥ 1 then ρ(H1 ) ≥ ρ(J) ≥ 1. 14 William Morton Kahan (1933-) is a Canadian mathematician and computer scientist whose main area of contribution has been numerical analysis. 39 http://www.doksihu Example 5.1 We use the Jacobi and the Gauss-Seidel method to solve the linear system  18 −4 0   −3 A·x=  −5  18 −3 −2 −5 18 0 −7 3 x1          −6   x2   −2  , ·  = −9    x3   5

 4 x4 18 −8      and let the starting vector be p = (0, −1, 1, 1)T . Since the ρ(A) is 0700678 < 1, methods in this example will converge. A tolerance can be supplied to either the Jacobi or Gauss-Seidel method which will permit it to exit the loop if convergence has been achieved. We use dierent tolerances and a maximum of 1000 iterations. Tolerance Jacobi Gauss-Seidel 10−2 10−3 10−4 10−6 10−10 10−18 7 6 13 9 19 12 32 18 58 30 99 48 We next give a comparison theorem of the convergence rates of the SOR method for nonsingular M -matrices. Theorem 5.5 Let A be a nonsingular M -matrix and let 0 < ω1 ≤ ω2 ≤ 1 Then ρ(Hω2 ) ≤ ρ(Hω1 ) < 1, so that R∞ (Hω2 ) ≥ R∞ (Hω1 ). Proof. Let A = D − L − U Then Hω = Mω−1 Nω , where Mω = ω −1 D − L, Nω = (ω −1 − 1)D + U. But Mω−1 ≥ O and Nω ≥ O for 0 < ω ≤ 1 as before, and A = Mω − Nω is a regular splitting of A. Now since to

ω −1 is a decreasing function of ω for 0 < ω ≤ 1, it follows that if 0 < ω1 ≤ ω2 ≤ 1, then Nω2 ≤ Nω1 . The result then follows from Corollary 5.1, which proves Theorem 55 40 http://www.doksihu Now if A is an M -matrix, then ρ(J) < 1 by Theorem 5.4, and by Theorem 5.5, ρ(Hω ) is a nonincreasing function of ω in the range 0 < ω ≤ 1 Moreover, ρ(Hω ) < 1. By the continuity of ρ(Hω ) as a function of ω , it must follow that ρ(Hω ) < 1 for 0 < ω ≤ α with some α > 1. Theorem 5.6 If A is an M -matrix then ρ(Hω ) < 1 for all ω satisfying 0<ω< 2 . 1 + ρ(J) (5.14) Proof. The convergence follows from Theorem 54, whenever 0 < ω ≤ 1, so assume that ω > 1. Assume that D = diag(A) = I , as usual, and dene the matrix Tω = (I − ωL)−1 [ωU + (ω − 1)I]. Then clearly Tω ≥ O and |Hω | ≤ Tω . Let λ = ρ(Tω ). Then for some x > 0, we have T x = λx by Theorem 33, and so (ωU + ωλL)x = (λ +

1 − ω)x. Hence λ + 1 − ω ≤ ρ(ωU + ωλL) and if λ ≥ 1, then λ + 1 − ω ≤ ωλρ(J) since λL ≥ L. In this case then ω≥ 1+λ 2 ≥ . 1 + λρ(J) 1 + ρ(J) Hence if (5.14) hold then we must have λ < 1 Then from |Hω | ≤ Tω it follows that ρ(Hω ) ≤ λ < 1, and the theorem is proved. 41 http://www.doksihu 6 Summary In this thesis we have studied dierent numerical methods for solving SLAE with M -matrices. Each single method is important from a viewpoint of the solvability: knowing of the condition number for solving these equations on computer, the knowledge of bounds for spectral radius can guarantee that the method will be convergent, the theory of nonnegative matrices set up the theory of M -matrices and nally the dierent iterative methods for SLAE with M matrices help to accelerate the convergence. We got dierent special properties for M -matrices. As we have seen in part of DM, the proof of the existence and uniqueness of LU

factorization is simple and we can stably computed without need for numerical pivoting in this case. We can easily got the regular splittings of various iterative methods. In the end we considered an extension of Kahan's theorem. 42 http://www.doksihu References [1] Richard Bellman : Introduction to matrix analysis, Mcgraw-Hill, 1960. [2] Abraham Berman, Robert J. Plemmons : Nonnegative matrices in the mathematical sciences, Academic Press, 1979. [3] István Faragó: Alkalmazott Analízis I-II., ELTE kézirat [4] Pál Rózsa : Lineáris algebra és alkalmazásai, Tankönyvkiadó, 1974. [5] Gisbert Stoyan : Numerikus matematika mérnököknek és programozóknak, Typotex, 2007. [6] Richard S. Varga : Matrix Iterative Analysis, Prentice-Hall, 1962 43