1. Introduction
The study of systems of linear equations can be traced back to ancient China. The earliest Chinese mathematical work, the Book of Arithmetic, already contained some problems involving systems of linear equations, and The Nine Chapters on Arithmetic discussed the solution of such systems in its “Equations” chapter.
In the late summer of 1949, Harvard University professor Wassily Leontief decomposed the American economy into 500 sectors, such as the coal industry, the automobile industry, and the transportation system. For each sector, he wrote a linear equation describing how the output of that sector is allocated to the other sectors of the economy. Leontief then simplified the problem to a system of 42 equations in 42 unknowns. To solve these 42 equations, the Mark II computer needed 56 hours of computation to obtain the final answer. Leontief was awarded the Nobel Prize in Economics in 1973, and his work opened the door to a new era of mathematical modeling in economics. The 1949 work at Harvard marked the beginning of the use of computers to analyze large-scale mathematical models, and since then researchers in many other fields have applied computers to analyze such models. Because of the large amount of data involved, these models are usually linear, that is, they are described by a system of linear equations.
In addition, systems of linear equations have wide applications across many fields, and many mathematical and physical models ultimately reduce to solving a system of linear equations.
Consider the n-th order linear differential equation
\( {x^{(n)}}+{a_{1}}(t){x^{(n-1)}}+⋯+{a_{n-1}}(t)\dot{x}+{a_{n}}(t)x=f(t) \) (1)
where the coefficients \( {a_{i}}(t) \) and the right-hand side \( f(t) \) are continuous functions.
Set
\( {x_{1}}=x \) , \( {x_{2}}=\dot{x} \) , \( {x_{3}}=\ddot{x} \) , \( ⋯ \) , \( {x_{n}}={x^{(n-1)}} \) (2)
we have
\( \dot{{x_{1}}}={x_{2}} \) , \( \dot{{x_{2}}}={x_{3}} \) , \( ⋯ \) , \( \dot{{x_{n-1}}}={x_{n}} \) (3)
\( \dot{{x_{n}}}=-{a_{n}}(t){x_{1}}-{a_{n-1}}(t){x_{2}}-⋯-{a_{1}}(t){x_{n}}+f(t) \) (4)
and define
\( \widetilde{x}=\begin{pmatrix}{x_{1}} \\ {x_{2}} \\ ⋮ \\ {x_{n}}\end{pmatrix} \) , \( \dot{\widetilde{x}}=\begin{pmatrix}\dot{{x_{1}}} \\ \dot{{x_{2}}} \\ ⋮ \\ \dot{{x_{n}}}\end{pmatrix} \) (5)
then
\( \dot{\widetilde{x}}=\begin{pmatrix}0 & 1 & 0 & ⋯ & 0 \\ 0 & 0 & 1 & ⋯ & 0 \\ ⋮ & ⋮ & ⋮ & & ⋮ \\ 0 & 0 & 0 & ⋯ & 1 \\ -{a_{n}}(t) & -{a_{n-1}}(t) & -{a_{n-2}}(t) & ⋯ & -{a_{1}}(t)\end{pmatrix}\widetilde{x}+\begin{pmatrix}0 \\ 0 \\ ⋮ \\ 0 \\ f(t)\end{pmatrix} \) (6)
In this way the problem becomes one about a system of first-order linear differential equations, and the general theory of linear differential equation systems can then be applied.
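For instance, when n = 2 the equation \( \ddot{x}+{a_{1}}(t)\dot{x}+{a_{2}}(t)x=f(t) \) , with \( {x_{1}}=x \) and \( {x_{2}}=\dot{x} \) , becomes the first-order system
\( \dot{\widetilde{x}}=\begin{pmatrix}0 & 1 \\ -{a_{2}}(t) & -{a_{1}}(t)\end{pmatrix}\widetilde{x}+\begin{pmatrix}0 \\ f(t)\end{pmatrix} \) ,
which is exactly (6) written out for two components.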
When scientists and engineers study the flow in a network, they likewise arrive at a system of linear equations. The problem of network analysis is to determine the flow of each branch when partial information (such as the input and output of the network) is known. The basic assumption of network flow is that the total inflow of the network equals the total outflow and that flow is conserved at every node. The flow balance at each node can therefore be described by one equation, and the collection of nodes is represented by a system of linear equations.
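For instance (the numbers here are purely illustrative), if a node receives flows \( {x_{1}} \) and \( {x_{2}} \) on two incoming branches and discharges 30 units in total on its outgoing branches, conservation at that node gives \( {x_{1}}+{x_{2}}=30 \) ; writing one such balance equation for every node of the network produces a system of linear equations of the kind studied below.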
2. Three common methods
Studying how to solve linear equation systems is an important issue, and we have the following three common methods for solving the system of linear equations.
2.1. Gauss elimination method
Consider the initial system of linear equations
\( \begin{cases} \begin{array}{c} {a_{11}}{x_{1}}+{a_{12}}{x_{2}}+⋯+{a_{1n}}{x_{n}}={b_{1}} \\ {a_{21}}{x_{1}}+{a_{22}}{x_{2}}+⋯+{a_{2n}}{x_{n}}={b_{2}} \\ ⋮ \\ {a_{n1}}{x_{1}}+{a_{n2}}{x_{2}}+⋯+{a_{nn}}{x_{n}}={b_{n}} \end{array} \end{cases} \) , rewritten with \( a_{ij}^{(1)}={a_{ij}} \) and \( b_{i}^{(1)}={b_{i}} \) as \( \begin{cases} \begin{array}{c} a_{11}^{(1)}{x_{1}}+a_{12}^{(1)}{x_{2}}+⋯+a_{1n}^{(1)}{x_{n}}=b_{1}^{(1)} \\ a_{21}^{(1)}{x_{1}}+a_{22}^{(1)}{x_{2}}+⋯+a_{2n}^{(1)}{x_{n}}=b_{2}^{(1)} \\ ⋮ \\ a_{n1}^{(1)}{x_{1}}+a_{n2}^{(1)}{x_{2}}+⋯+a_{nn}^{(1)}{x_{n}}=b_{n}^{(1)} \end{array} \end{cases} \) (7)
The coefficient matrix is denoted A. This article only discusses the case where A is invertible.
If \( a_{11}^{(1)}≠0 \) , let \( {l_{i1}}=\frac{a_{i1}^{(1)}}{a_{11}^{(1)}} \) , \( i=2,3,⋯,n \) . Subtracting \( {l_{i1}} \) times the first equation from the i-th equation \( (i=2,3,⋯,n) \) gives the following system of equations
\( \begin{cases} \begin{array}{c} a_{11}^{(1)}{x_{1}}+a_{12}^{(1)}{x_{2}}+⋯+a_{1n}^{(1)}{x_{n}}=b_{1}^{(1)} \\ a_{22}^{(2)}{x_{2}}+⋯+a_{2n}^{(2)}{x_{n}}=b_{2}^{(2)} \\ ⋮ \\ a_{n2}^{(2)}{x_{2}}+⋯+a_{nn}^{(2)}{x_{n}}=b_{n}^{(2)} \end{array} \end{cases} \) (8)
where
\( a_{ij}^{(2)}=a_{ij}^{(1)}-{l_{i1}}a_{1j}^{(1)} \) , \( b_{i}^{(2)}=b_{i}^{(1)}-{l_{i1}}b_{1}^{(1)} \) \( (i,j=2,3,⋯,n) \) (9)
If \( a_{22}^{(2)}≠0 \) , let \( {l_{i2}}=\frac{a_{i2}^{(2)}}{a_{22}^{(2)}} \) , \( i=3,4,⋯,n \) , and subtract \( {l_{i2}} \) times the second equation from the i-th equation \( (i=3,4,⋯,n) \) . In general, at step k we have the system of equations
\( \begin{cases} \begin{array}{c} a_{11}^{(1)}{x_{1}}+a_{12}^{(1)}{x_{2}}+⋯+a_{1n}^{(1)}{x_{n}}=b_{1}^{(1)} \\ a_{22}^{(2)}{x_{2}}+⋯+a_{2n}^{(2)}{x_{n}}=b_{2}^{(2)} \\ ⋮ \\ a_{kk}^{(k)}{x_{k}}+⋯+a_{kn}^{(k)}{x_{n}}=b_{k}^{(k)} \\ ⋮ \\ a_{nk}^{(k)}{x_{k}}+⋯+a_{nn}^{(k)}{x_{n}}=b_{n}^{(k)} \end{array} \end{cases} \) (10)
where
\( a_{ij}^{(k)}=a_{ij}^{(k-1)}-{l_{i,k-1}}a_{k-1,j}^{(k-1)} \) , \( b_{i}^{(k)}=b_{i}^{(k-1)}-{l_{i,k-1}}b_{k-1}^{(k-1)} \) , \( {l_{i,k-1}}=\frac{a_{i,k-1}^{(k-1)}}{a_{k-1,k-1}^{(k-1)}}(i,j=k,k+1⋯n) \) (11)
If at each step \( a_{ii}^{(i)}≠0 \) , \( i=1,2,⋯,n \) , the system of equations is ultimately reduced to
\( \begin{cases} \begin{array}{c} a_{11}^{(1)}{x_{1}}+a_{12}^{(1)}{x_{2}}+⋯+a_{1n}^{(1)}{x_{n}}=b_{1}^{(1)} \\ a_{22}^{(2)}{x_{2}}+⋯+a_{2n}^{(2)}{x_{n}}=b_{2}^{(2)} \\ ⋮ \\ a_{nn}^{(n)}{x_{n}}=b_{n}^{(n)} \end{array} \end{cases} \) (12)
We can use back substitution
\( {x_{n}}=\frac{b_{n}^{(n)}}{a_{nn}^{(n)}} \) , \( {x_{k}}=(b_{k}^{(k)}-\sum _{j=k+1}^{n}a_{kj}^{(k)}{x_{j}})/a_{kk}^{(k)} \) , \( k=n-1,n-2,⋯,1 \) . (13)
As discussed above, this method requires \( a_{ii}^{(i)}≠0 \) at each step. It can be proven that a necessary and sufficient condition for all \( a_{ii}^{(i)}≠0 \) is that every leading principal minor of the coefficient matrix A is non-zero.
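As a quick illustration (only a sketch, assuming the matrix A and its order n are already defined, and using an arbitrary tolerance of 1e-12 in place of an exact zero test), the leading principal minors can be checked in MATLAB before elimination is attempted:
ok = true;
for k = 1:n
    if abs(det(A(1:k,1:k))) < 1e-12   % treat a tiny determinant as zero
        ok = false;
    end
end
If ok is true, elimination without pivoting can proceed as described above.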
A simple MATLAB implementation of the Gauss elimination method is as follows (A, b and n are assumed to be given):
for k = 1:(n-1)                      % forward elimination, step k
    m = A(k+1:n,k)/A(k,k);           % multipliers l_{ik}
    A(k+1:n,k+1:n) = A(k+1:n,k+1:n) - m*A(k,k+1:n);
    b(k+1:n) = b(k+1:n) - m*b(k);
    A(k+1:n,k) = zeros(n-k,1);       % entries below the pivot are eliminated
end
x = zeros(n,1);
x(n) = b(n)/A(n,n);                  % back substitution
for k = n-1:-1:1
    x(k) = (b(k) - A(k,k+1:n)*x(k+1:n))/A(k,k);
end
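For instance, a minimal run of this code on the first example of Section 4 (assuming the lines above are placed in a script after the following three assignments) is:
A = [4 2 1; 3 7 2; 1 -1 3];
b = [10; 3; 5];
n = 3;
% ...elimination and back substitution as above...
% x is then approximately [2.8529; -0.9118; 0.4118], matching Table 1.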
In addition, there are the following iterative methods. Consider a system of linear equations \( Ax=b \) , where \( A=D-L-U \) : D is the diagonal matrix formed by the main diagonal elements of A, \( -L \) is the strictly lower triangular part of A (with zero diagonal), and \( -U \) is the strictly upper triangular part of A. The following only considers the case where D is an invertible matrix.
2.2. Jacobi iterative method
\( Ax=b⟺(D-L-U)x=b⟺Dx=(L+U)x+b⟺x={D^{-1}}(L+U)x+{D^{-1}}b \) (14)
thus the Jacobi iteration method is
\( {x^{(k+1)}}={D^{-1}}(L+U){x^{(k)}}+{D^{-1}}b \) (15)
This shows why D needs to be invertible.
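Written componentwise, (15) is the familiar update
\( x_{i}^{(k+1)}=\frac{1}{{a_{ii}}}({b_{i}}-\sum _{j≠i}{a_{ij}}x_{j}^{(k)}) \) , \( i=1,2,⋯,n \) ,
which again makes it explicit that every diagonal entry \( {a_{ii}} \) must be non-zero.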
A MATLAB implementation of the Jacobi iterative method is as follows (A, b and the initial vector x0 are assumed to be given):
ep = 1e-6;                      % stopping tolerance
D = diag(diag(A));              % diagonal part of A
L = -tril(A,-1);                % strictly lower triangular part (negated)
U = -triu(A,1);                 % strictly upper triangular part (negated)
x = D\(L+U)*x0 + D\b;           % first Jacobi step
while norm(x-x0) >= ep
    x0 = x;
    x = D\(L+U)*x0 + D\b;       % x^{(k+1)} = D^{-1}(L+U)x^{(k)} + D^{-1}b
end
2.3. Gauss-Seidel iteration method (G-S iteration)
Similar to the Jacobi iteration method, except that
\( {x^{(k+1)}}={D^{-1}}(L+U){x^{(k)}}+{D^{-1}}b \) (16)
this step is transformed to
\( {x^{(k+1)}}={D^{-1}}(L{x^{(k+1)}}+U{x^{(k)}})+{D^{-1}}b \) (17)
and simplify it to
\( {x^{(k+1)}}={(D-L)^{-1}}U{x^{(k)}}+{(D-L)^{-1}}b \) (18)
Here \( {(D-L)^{-1}} \) appears; \( D-L \) is a lower triangular matrix whose main diagonal coincides with that of D, so it is invertible precisely when D is invertible.
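Written componentwise, (18) uses the components already updated within the current sweep:
\( x_{i}^{(k+1)}=\frac{1}{{a_{ii}}}({b_{i}}-\sum _{j \lt i}{a_{ij}}x_{j}^{(k+1)}-\sum _{j \gt i}{a_{ij}}x_{j}^{(k)}) \) , \( i=1,2,⋯,n \) .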
A MATLAB implementation of the Gauss-Seidel iterative method is as follows (A, b and x0 are assumed to be given):
ep = 1e-6;                      % stopping tolerance
D = diag(diag(A));
L = -tril(A,-1);
U = -triu(A,1);
x = (D-L)\U*x0 + (D-L)\b;       % first Gauss-Seidel step
while norm(x-x0) >= ep
    x0 = x;
    x = (D-L)\U*x0 + (D-L)\b;   % x^{(k+1)} = (D-L)^{-1}U x^{(k)} + (D-L)^{-1}b
end
3. Convergence
For the methods in 2.2 and 2.3, it is natural to ask when the sequence of iterates converges to the exact solution of the system, which leads to the following discussion of convergence.
3.1. Convergence of the iterative method
Consider transforming \( Ax=b \) into \( x=Gx+d \) (by splitting A and performing a simple transformation), so that the iterative formula is \( {x^{(k+1)}}=G{x^{(k)}}+d \) , where G is called the iteration matrix and its spectral radius is denoted \( ρ(G) \) .
Theorem 3.1.1: A necessary and sufficient condition for the iteration \( {x^{(k+1)}}=G{x^{(k)}}+d \) to converge is \( ρ(G) \lt 1 \) .
Proof: Let
\( {ε^{(k)}}={x^{(k)}}-{x^{*}} \) (19)
\( {x^{*}} \) be the solution of
\( Ax=b \) (20)
so that
\( {x^{(k+1)}}=G{x^{(k)}}+d \) converges \( ⟺{ε^{(k)}}→0 \) as \( k→∞ \) (21)
We then have
\( {ε^{(k+1)}}=G{ε^{(k)}}={G^{k}}{ε^{(1)}} \) (22)
thus
\( {ε^{(k)}}→0 \) as \( k→∞⟺{G^{k}}→0 \) as \( k→∞ \) (23)
It can be seen that the proof of Theorem 3.1.1 reduces to proving
\( {G^{k}}→0 \) as \( k→∞⟺ρ(G) \lt 1 \) (24)
This requires the notion of the subordinate (operator) norm of a matrix; the subordinate norm of a matrix A is
\( ||A||=\underset{x≠0}{max}{\frac{||Ax||}{||x||}}=\underset{||x||=1}{max}{||Ax||} \) (25)
where the norm of the vector x can be a p-norm \( {||x||_{p}} \) \( (p=1,2,⋯,∞) \) or any other vector norm. We list the following properties without proof:
(i) \( ||A||≥0 \) , and \( ||A||=0 \) if and only if A=0;
(ii) \( ||αA||=|α|∙||A|| \) for \( ∀α∈R \) ;
(iii) \( ||A+B||≤||A||+||B|| \) , \( ∀A,B∈{R^{n×n}} \) ;
(iv) \( ||Ax||≤||A||∙||x|| \) , \( ∀x∈{R^{n}} \) ;
(v) \( ||AB||≤||A||∙||B|| \) , \( ∀A,B∈{R^{n×n}} \) ;
(vi) \( {||G||_{∞}}=\underset{i}{max}{\sum _{j=1}^{n}|{g_{ij}}|} \) .
On the one hand, consider the “⟹” direction of (24). Suppose, for contradiction, that G has an eigenvalue \( λ \) with \( |λ|≥1 \) and
\( Gx=λx(x≠0) \) (26)
then
\( {G^{k}}x={λ^{k}}x \) (27)
\( ||{G^{k}}x||=||{λ^{k}}x||≥||x|| \) (28)
By the definition of the subordinate norm of G,
\( ||{G^{k}}||≥\frac{||{G^{k}}x||}{||x||}≥ \) 1(29)
This contradicts
\( {G^{k}}→0 \) as \( k→∞ \) (30)
On the other hand, the converse direction is harder. First, we can easily deduce that \( ρ(G)≤||G|| \) .
In fact, let \( λ \) be any eigenvalue of G and x an eigenvector belonging to \( λ \) . Combining property (iv)
\( ||Gx||≤||G||∙||x|| \) (31)
with
\( ||Gx||=||λx||=|λ|∙||x|| \) (32), we obtain
\( |λ|≤||G|| \) (33)
Therefore
\( ρ(G)≤||G|| \) (34)
Can \( ||G|| \) be brought arbitrarily close to \( ρ(G) \) ? If \( ρ(G) \) is enlarged by an arbitrarily small \( ε \) to \( ρ(G)+ε \) , we have the following theorem.
Theorem 3.1.2: For any \( ε \gt 0 \) , there exists a subordinate norm such that \( ||G||≤ρ(G)+ε \) .
Proof:Let
\( G=PJ{P^{-1}} \) (35)
where J is the Jordan canonical form of G,
\( J=\begin{pmatrix}{J_{1}} & & & \\ & {J_{2}} & & \\ & & ⋱ & \\ & & & {J_{s}}\end{pmatrix} \) , \( {J_{i}}={\begin{pmatrix}{λ_{i}} & 1 & & \\ & {λ_{i}} & ⋱ & \\ & & ⋱ & 1 \\ & & & {λ_{i}}\end{pmatrix}_{{n_{i}}×{n_{i}}}} \) , \( \sum _{i=1}^{s}{n_{i}}=n \) (36)
Let
\( D={\begin{pmatrix}1 & & & \\ & ε & & \\ & & ⋱ & \\ & & & {ε^{n-1}}\end{pmatrix}_{n×n}} \) (37)
Then
\( \widetilde{J}={D^{-1}}JD \) (38)
has the following form \( : \)
\( \widetilde{J}=\begin{pmatrix}{\widetilde{J}_{1}} & & & \\ & {\widetilde{J}_{2}} & & \\ & & ⋱ & \\ & & & {\widetilde{J}_{s}}\end{pmatrix} \) , \( {\widetilde{J}_{i}}={\begin{pmatrix}{λ_{i}} & ε & & \\ & {λ_{i}} & ⋱ & \\ & & ⋱ & ε \\ & & & {λ_{i}}\end{pmatrix}_{{n_{i}}×{n_{i}}}} \) (39)
Thus
\( {||\widetilde{J}||_{∞}}≤ρ(G)+ε \) (equality may hold)(40)
Now consider the relationship between G and \( \widetilde{J} \) . Let
\( Q=PD \) (41)
Then
\( G=Q\widetilde{J}{Q^{-1}} \) , \( \widetilde{J}={Q^{-1}}GQ \) (42)
\( {||\widetilde{J}||_{∞}}=\underset{{||y||_{∞}}=1}{max}{{||\widetilde{J}y||_{∞}}}=\underset{{||y||_{∞}}=1}{max}{{||{Q^{-1}}GQy||_{∞}}} \) (43)
Let
\( Qy=x \) , (44)
Then
\( {||\widetilde{J}||_{∞}}=\underset{{||{Q^{-1}}x||_{∞}}=1}{max}{||{Q^{-1}}Gx||_{∞}} \) (45)
This suggests defining the vector norm
\( ||x||={||{Q^{-1}}x||_{∞}} \) (46)
\( {||\widetilde{J}||_{∞}}=\underset{{||{Q^{-1}}x||_{∞}}=1}{max}{||{Q^{-1}}Gx||_{∞}}=\underset{||x||=1}{max}{||Gx||}=||G|| \) (47)
Then
\( ||G||≤ρ(G)+ε \) (48)
Now we can prove the “⟸” direction of (24), which completes Theorem 3.1.1. Since \( ρ(G) \lt 1 \) , we can always find an \( ε \gt 0 \) such that
\( ρ(G)+ε \lt 1 \) (49)
holds. By property(v),
\( ||{G^{k}}||≤{||G||^{k}}≤{(ρ(G)+ε)^{k}}→0 \) as \( k→∞ \) (50)
By property(i),
\( {||G^{k}}||0{G^{k}}0 \) (51)
This completes the proof.
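As a minimal numerical sketch of Theorem 3.1.1 (assuming a square matrix A with non-zero diagonal entries is already defined in the workspace), the spectral radii of the Jacobi and Gauss-Seidel iteration matrices can be inspected in MATLAB before iterating:
D = diag(diag(A));
L = -tril(A,-1);
U = -triu(A,1);
rho_jacobi = max(abs(eig(D\(L+U))));   % spectral radius of the Jacobi iteration matrix
rho_gs = max(abs(eig((D-L)\U)));       % spectral radius of the G-S iteration matrix
By Theorem 3.1.1, the corresponding iteration converges for every starting vector exactly when the computed value is less than 1.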
When the coefficient matrix has a special structure, more specific results are available.
3.2. Special coefficient matrices
3.2.1. The coefficient matrix is a strictly diagonally dominant matrix
Theorem 3.2.1: If A is a strictly diagonally dominant matrix, then both the G-S iteration method and the Jacobi iteration method for solving the linear system Ax=b converge.
Proof: First, we prove the convergence of the G-S iterative method.
By Theorem 3.1.1 and the form of the G-S iteration given above, it suffices to prove that the spectral radius of \( G={(D-L)^{-1}}U \) is less than 1. We argue by contradiction and assume that G has an eigenvalue \( λ \) with \( |λ|≥1 \) .
\( |λI-{(D-L)^{-1}}U|=|{(D-L)^{-1}}||λ(D-L)-U|=|{(D-L)^{-1}}||λD-λL-U| \) (52)
because
\( |{(D-L)^{-1}}|≠0 \) (since A is strictly diagonally dominant, the diagonal entries of D-L are non-zero)(53)
it suffices to consider \( |λD-λL-U| \) ; denote \( B=λD-λL-U \) (54)
\( \sum _{ \begin{array}{c} j=1 \\ j≠i \end{array} }^{n}|{b_{ij}}|=|λ|\sum _{j=1}^{i-1}|{a_{ij}}|+\sum _{j=i+1}^{n}|{a_{ij}}|≤|λ|\sum _{j=1}^{i-1}|{a_{ij}}|+|λ|\sum _{j=i+1}^{n}|{a_{ij}}|≤|λ|(\sum _{j=1}^{i-1}|{a_{ij}}|+\sum _{j=i+1}^{n}|{a_{ij}}|) \lt |λ||{a_{ii}}|=|{b_{ii}}| \) (55)
This indicates that B is also a strictly diagonally dominant matrix, and therefore B is non-singular,
\( |B|≠0 \) (56)
so \( |λI-G|≠0 \) , which contradicts the assumption that \( λ \) is an eigenvalue of G. Hence the spectral radius of G is less than 1, and by Theorem 3.1.1 the proof is complete.
Next, we prove the convergence of the Jacobi iteration method.
In this case \( G={D^{-1}}(L+U) \) , and the argument is similar to the convergence proof for the G-S iteration; just note that
\( |λI-{D^{-1}}(L+U)|=|{D^{-1}}||λD-L-U| \) (57)
\( B=λD-L-U \) (58)
\( \sum _{ \begin{array}{c} j=1 \\ j≠i \end{array} }^{n}|{b_{ij}}|=\sum _{j=1}^{i-1}|{a_{ij}}|+\sum _{j=i+1}^{n}|{a_{ij}}|≤|λ|(\sum _{j=1}^{i-1}|{a_{ij}}|+\sum _{j=i+1}^{n}|{a_{ij}}|) \lt |λ||{a_{ii}}|=|{b_{ii}}| \) (59)
so B is a strictly diagonally dominant matrix and therefore non-singular,
\( |B|≠0 \) (60)
so \( |λI-G|≠0 \) , contradicting the assumption that \( λ \) is an eigenvalue of G; hence the Jacobi iteration also converges.
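A short MATLAB check of the hypothesis of Theorem 3.2.1 (a sketch, assuming a square matrix A is already in the workspace) is:
rowsum = sum(abs(A),2) - abs(diag(A));   % sum of off-diagonal magnitudes in each row
isSDD = all(abs(diag(A)) > rowsum);      % true if A is strictly diagonally dominant
If isSDD is true, both the Jacobi and the G-S iterations are guaranteed to converge.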
3.2.2. The coefficient matrix is a positive definite real symmetric matrix
Theorem 3.2.2: If A is a real symmetric positive definite matrix and 2D-A is positive definite, then the Jacobi iteration method converges.
Proof: The iterative matrix
\( G={D^{-1}}(L+U)={D^{-1}}(D-A)=I-{D^{-1}}A \) , \( x≠0 \) , \( x∈{R^{n}} \) (61)
\( ||Gx||_{A}^{2}=(AGx,Gx)=(A(I-{D^{-1}}A)x,(I-{D^{-1}}A)x)=||x||_{A}^{2}-((2D-A){D^{-1}}Ax,{D^{-1}}Ax) \) (62)
Since A is invertible and \( x≠0 \) , we have \( {D^{-1}}Ax≠0 \) ; moreover, \( 2D-A \) is positive definite by assumption.
Therefore,
\( ((2D-A){D^{-1}}Ax,{D^{-1}}Ax) \gt 0 \) (63)
then
\( {||Gx||_{A}} \lt {||x||_{A}} \) holds(64)
Thus
\( {||G||_{A}}=\underset{x≠0}{max}{\frac{{||Gx||_{A}}}{{||x||_{A}}}} \lt 1 \) , \( ρ(G)≤{||G||_{A}} \lt 1 \) (65)
then the Jacobi iterative method converges.
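A corresponding numerical check of the hypotheses of Theorem 3.2.2 (again only a sketch for a matrix A assumed to be in the workspace) is:
D = diag(diag(A));
isSymmetric = isequal(A, A');
isPosDef = all(eig(A) > 0);          % A is positive definite
isCondOK = all(eig(2*D - A) > 0);    % 2D-A is positive definite
If all three flags are true, Theorem 3.2.2 guarantees that the Jacobi iteration converges.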
4. Examples
We give two examples of systems of linear equations Ax=b, run them in MATLAB, solve them with the three methods described above, and compare the results. For the iterative methods, an initial value x0 must be given.
A= \( \begin{pmatrix}4 & 2 & 1 \\ 3 & 7 & 2 \\ 1 & -1 & 3\end{pmatrix} \) , b= \( \begin{pmatrix}10 \\ 3 \\ 5\end{pmatrix} \) , \( {x_{0}}=\begin{pmatrix}0 \\ 0 \\ 0\end{pmatrix} \) (66)
Table 1. Operation results for the first example.
Method | Solution | Duration time (seconds) | Number of iterations
Gauss elimination method | [2.8529, -0.9118, 0.4118]' | 0.001969 | \
Gauss-Seidel iterative method | [2.8529, -0.9118, 0.4118]' | 0.000453 | 11
Jacobi iterative method | [2.8529, -0.9118, 0.4118]' | 0.001062 | 22
A= \( \begin{pmatrix}10 & 2 & 3 & 4 \\ 4 & 13 & 5 & 3 \\ 7 & 1 & 9 & 0 \\ 9 & 8 & 3 & 21\end{pmatrix} \) , b= \( \begin{pmatrix}2 \\ 7 \\ 5 \\ 3\end{pmatrix} \) , \( {x_{0}}=\begin{pmatrix}0 \\ 0 \\ 0 \\ 0\end{pmatrix} \) (67)
Table 2. Operation results for the second example.
Method | Solution (×10⁻¹) | Duration time (seconds) | Number of iterations
Gauss elimination method | [0.009, 3.553, 5.153, -0.665]' | 0.001742 | \
Gauss-Seidel iterative method | [0.009, 3.553, 5.153, -0.665]' | 0.000374 | 14
Jacobi iterative method | [0.009, 3.553, 5.153, -0.665]' | 0.000728 | 149
From the two tables, the Gauss-Seidel iteration has the shortest running time and the fewest iterations of the three methods. Therefore, when the Gauss-Seidel iteration is applicable, it should be the method of choice.
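The run times and iteration counts above depend on the machine and on the stopping tolerance; as only a rough sketch, one possible way to reproduce such a measurement for the Jacobi method on the first example is the following (the tic/toc timing and the counter k are our own illustrative additions, not necessarily the setup used to produce the tables):
A = [4 2 1; 3 7 2; 1 -1 3];
b = [10; 3; 5];
x0 = [0; 0; 0];
ep = 1e-6;
D = diag(diag(A)); L = -tril(A,-1); U = -triu(A,1);
tic
k = 1;                          % iteration counter
x = D\(L+U)*x0 + D\b;
while norm(x-x0) >= ep
    x0 = x;
    x = D\(L+U)*x0 + D\b;
    k = k + 1;
end
toc                             % elapsed time of the Jacobi loop
% x is approximately [2.8529; -0.9118; 0.4118]; k holds the iteration count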
5. Improvement of the iterative method
The Jacobi and G-S iterative methods require D to be invertible, so what can be done when D is singular? In that case we need to modify A slightly. Obviously, doing so changes the solution, so we must take the effect of a perturbation of A on the solution into account.
Let \( Δ \) be a matrix whose entries have very small absolute values, and let \( {x^{*}} \) be the exact solution of \( Ax=b \) . When \( Ax=b \) is perturbed to \( (A+Δ)x=b \) , the solution becomes \( {x^{*}}+δ \) , so \( (A+Δ)({x^{*}}+δ)=b \) . Expanding and simplifying, we obtain
\( δ=-{A^{-1}}∙Δ∙{x^{*}}-{A^{-1}}∙Δ∙δ. \) (68)
Therefore,
\( ||δ||≤||{A^{-1}}||∙||Δ||∙||{x^{*}}||+||{A^{-1}}||∙||Δ||∙||δ|| \) (69)
or
\( (1-||{A^{-1}}||∙||Δ||)||δ||≤||{A^{-1}}||∙||Δ||∙||{x^{*}}||. \) (70)
Because the entries of \( Δ \) are very small, \( ||{A^{-1}}||∙||Δ|| \lt 1 \) can be assumed to hold.
Thereby
\( \frac{||δ||}{||{x^{*}}||}≤\frac{||{A^{-1}}||∙||Δ||}{1-||{A^{-1}}||∙||Δ||}=\frac{||{A^{-1}}||∙||A||∙\frac{||Δ||}{||A||}}{1-||{A^{-1}}||∙||A||∙\frac{||Δ||}{||A||}} \) (71)
Using the condition number of A, \( cond(A)=||A||∙||{A^{-1}}|| \) , we further obtain
\( \frac{||δ||}{||{x^{*}}||}≤\frac{cond(A)∙\frac{||Δ||}{||A||}}{1-cond(A)∙\frac{||Δ||}{||A||}} \) . (72)
This formula shows that when \( cond(A) \) is small, a slight perturbation of the coefficient matrix A produces only a small relative error in the solution. Therefore, adding a very small non-zero number to each zero diagonal element of A makes D invertible, and the Jacobi and G-S methods can then be applied.
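As a minimal sketch of this fix (the perturbation size 1e-8 and the variable names are our own illustrative choices), the zero diagonal entries of A can be shifted before running the Jacobi or G-S code of Section 2, provided cond(A) is not large:
idx = (diag(A) == 0);                 % locate the zero diagonal entries
A = A + 1e-8*diag(double(idx));       % add a tiny non-zero number to each of them
disp(cond(A))                         % if cond(A) is small, the relative error of the solution stays small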
6. Conclusion
This article summarizes the formulas and the corresponding MATLAB code for Gauss elimination, Jacobi iteration, and G-S iteration for solving systems of linear equations. The convergence of the iterative methods for different types of coefficient matrices is studied: when the coefficient matrix is strictly diagonally dominant, both the Jacobi and the Gauss-Seidel iterations converge; when the coefficient matrix is a positive definite real symmetric matrix and 2D-A is positive definite, the Jacobi iteration converges. Meanwhile, two tables were obtained through simulation, and the results show that the Gauss-Seidel iteration has the shortest running time and the fewest iterations. Therefore, when the diagonal elements of the coefficient matrix are non-zero, the Gauss-Seidel iteration works best. When some diagonal elements are zero and the condition number of A is small, the iterative methods can still be used after perturbing the zero elements.