A new Newton-like method for solving nonlinear equations

Abstract

This paper presents an iterative scheme for solving nonlinear equations. We establish a new rational approximation model with linear numerator and denominator which generalizes the local linear model. We then apply the new approximation to nonlinear equations and propose an improved Newton’s method to solve them. The new method revises the Jacobian matrix by a rank one matrix at each iteration and retains the quadratic convergence property. The numerical performance and comparison show that the proposed method is efficient.

Background

We consider the system of nonlinear equations

$$\begin{aligned} F(x)=0, \end{aligned}$$
(1)

where \(F:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^m\) is a continuously differentiable function. All practical algorithms for solving (1) are iterative. Newton’s method is the most widely used method in applications (see Traub 1964; Ortega and Rheinboldt 1970; Dennis and Schnabel 1993; Kelley 2003; Petković et al. 2013a).

The linearization of Eq. (1) at an iteration point \(x_k\) is

$$\begin{aligned} F(x_k)+J(x_k)s=0, \end{aligned}$$
(2)

where \(s=x-x_k\) and \(J(x_k)\) is the Jacobian matrix of F(x) at \(x_k\). For notation purposes, let \(F_k=F(x_k)\) and \(J_k=J(x_k)\). If \(m=n\) and \(J(x_k)\) is nonsingular, then the linear approximation (2) gives the Newton–Raphson iteration

$$\begin{aligned} x_{k+1}=x_k-J^{-1}_k F_k. \end{aligned}$$
(3)
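As a point of reference for what follows, the iteration (3) can be sketched in a few lines of Python with NumPy. This is a minimal illustration rather than code from the paper: the function names `newton`, `F_demo` and `J_demo`, the tolerance, and the two-equation test system (the unit circle intersected with the line \(x_1=x_2\)) are all assumptions chosen for the example.

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, max_iter=50):
    """Newton-Raphson iteration (3) for a square system F(x) = 0."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            return x, k
        # Solve the linearization (2), J_k s = -F_k, instead of forming J_k^{-1}.
        s = np.linalg.solve(J(x), -Fx)
        x = x + s
    return x, max_iter

# Hypothetical test system (not from the paper): unit circle and the line x1 = x2.
def F_demo(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0, x[0] - x[1]])

def J_demo(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

root, iters = newton(F_demo, J_demo, [1.0, 0.5])
```

Solving the linear system at each step, rather than inverting \(J_k\) explicitly, is the usual design choice for both cost and numerical stability.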

In 1669, Newton first used the Newton iteration (2) to solve a cubic equation. In 1690, Raphson first employed the formula (3) to solve general cubic equations. Then Fourier (1890), Cauchy (1829), and Fine (1916) established convergence theorems of Newton’s method for different cases. Kantorovich (1948) established the convergence theorem now referred to as the Newton–Kantorovich theorem. This theorem is the main tool for proving the convergence of various Newton-type methods.

There are various Newton-type methods for solving nonlinear equations. Dembo et al. (1982) proposed an inexact Newton method, which solves the linear equation (2) only approximately. Another efficient approach is to approximate the Jacobian or the inverse of the Jacobian in some way. In this approach, the approximation of the Jacobian satisfies the secant equation

$$\begin{aligned} B_ks_{k-1}=F(x_k)-F(x_{k-1}), \end{aligned}$$
(4)

where \(B_k\) is an approximation for the Jacobian and \(s_{k-1} = x_k - x_{k-1}\). For this kind of method, the secant equation (4) plays a vital role; therefore a wide variety of methods that satisfy the secant equation have been designed (Dennis and Schnabel 1993; Kelley 2003). Qi and Sun (1993) extended Newton’s method for solving a nonlinear equation of several variables to a nonsmooth case by using the generalized Jacobian instead of the derivative. This extension includes the B-derivative version of Newton’s method as a special case. In order to improve the convergence order of Newton-type methods, many higher order approaches have been proposed in past years. In particular, much of the literature focuses on nonlinear scalar functions. Petković et al. (2013b) provide a survey of these methods, many of which are also presented in the book (Petković et al. 2013a). For the nonlinear vector function F(x) in (1), there are also many higher order methods. For instance, Grau-Sánchez et al. (2011), Noor and Waseem (2009), Homeier (2004), and Frontini and Sormani (2004) have proposed third order methods using one function value, two Jacobian matrices and two matrix inversions per iteration. In Darvishi and Barati (2007a), a third order method has been proposed with two function values, one Jacobian and one matrix inversion per iteration. Darvishi and Barati (2007b) and Sharma et al. (2013) developed fourth order methods. In pursuit of higher order algorithms, researchers have also proposed fifth and sixth order methods (Grau-Sánchez et al. 2011). In summary, these higher order methods need more function values, Jacobians or matrix inversions per iteration.

In this paper, we are interested in a Newton-type method with high computational efficiency for solving the system of nonlinear equations (1). Motivated by the approach in Sui et al. (2014), we provide a new rational model \(R:{\mathbb {R}}^n \rightarrow {\mathbb {R}}^m\). Although our approximation function is similar to the real valued function RALND studied in Sui et al. (2014), the proposed function is different from the RALND function. Based on this model, we linearize the nonlinear function F(x) and obtain a linear equation that is different from the first order Taylor polynomial. We then propose an improved Newton’s algorithm to solve the nonlinear equations (1). In the new algorithm, in order to reflect more curvature information of the nonlinear functions, the Jacobian matrix is updated by a rank one matrix in each iteration. The method is computationally efficient because it requires no additional function values, Jacobians or inverse Jacobians per iteration. Following the standard convergence analysis of Newton’s method, we prove that the algorithm is well defined and that the convergence rate is quadratic under some suitable conditions. Preliminary numerical results and comparisons are reported, showing the effectiveness of the algorithm.

This paper is organized as follows. We give a new rational approximation and the improved Newton’s method in the next section. In section “Convergence analysis”, the convergence analysis is discussed, and some numerical experiment results are reported in section “Numerical experiments”. The last section is a brief conclusion.

Rational approximation and improved Newton’s method

Based on the information of the last two points, Sui proposed a RALND function (Sui et al. 2014) \(r: {\mathbb {R}}^n \rightarrow {\mathbb {R}}\) with linear numerator and denominator that is defined by

$$\begin{aligned} r({x})=a_{0}+\frac{a^{\mathrm{T}}_k(x - x_{k})}{1+ b^{\mathrm{T}}_{k}(x -x_{k})}, \end{aligned}$$
(5)

where \(a_k, b_k \in {\mathbb {R}}^n\) are the undetermined coefficient vectors and \(x_{k}\in {\mathbb {R}}^n\) is the current point. Let

$$\begin{aligned} c_{0}=\nabla ^{\mathrm {T}} f(x_{k-1})(x_{k-1}-x_k), \quad c_{1}=\nabla ^{\mathrm {T}} f(x_k)(x_{k-1}-x_k). \end{aligned}$$

Under the following interpolation conditions

$$\begin{aligned} r(x_k)=f(x_k),\nabla r(x_k)=\nabla f(x_k), \quad \nabla r(x_{k-1})=\nabla f(x_{k-1}), \end{aligned}$$

we obtain the RALND function

$$\begin{aligned} r(x)=f(x_k)+\frac{\nabla ^{\mathrm {T}} f(x_k)(x-x_k)}{1+\frac{1}{c_{0}} \left(\sqrt{\frac{c_{0}}{c_{1}}}\nabla ^{\mathrm {T}} f(x_k)-\nabla ^{\mathrm {T}}f(x_{k-1})\right)(x-x_k)}, \end{aligned}$$
(6)

where \(x_k\in {\mathbb {R}}^n\), \(x_{k-1}\in {\mathbb {R}}^n\) are the current point and the preceding point. The RALND function has many good properties (Sui et al. 2014). For example, it is monotone along any direction and carries more curvature information of the nonlinear function F(x) than the linear approximation model. These properties may reduce the number of iterations when an iteration method constructed from RALND is used to solve (1). Although the RALND function possesses some nice properties, the function \(r:{\mathbb {R}}^{n}\rightarrow {\mathbb {R}}\) defined by (6) is a real valued function, so each component function has a different vector \(b_k\). This makes it more complex to use for nonlinear equations.

Next, we employ the RALND function with the same horizon vector \(b_k\) for all nonlinear functions \(F_i(x),i=1,\ldots ,m\) at \(x_k\), and approximate the nonlinear equations (1) by

$$\begin{aligned} F(x_{k}+s)\approx R(x_{k}+s)= F_{k}+\frac{J_{k} s}{1+b^{\mathrm {T}}_{k}s}=0. \end{aligned}$$
(7)

When \(b_k=0\), the rational function (7) reduces to the linear expansion (2). There is a close analogy between the rational function (7) and RALND (6), but the function (7) is different from (6). For the RALND function (6), each function \(F_i(x),i=1,\ldots ,m\) has a different vector \(b^{(k)}_i,i=1,\ldots ,m\) at the current iteration point \(x_k\), but the new approximation function (7) has the same vector \(b_k\) for all functions \(F_i(x),i=1,\ldots ,m\) at the same iteration point \(x_k\). This is the main difference between the two functions (7) and (6). Because of this difference, the function (7) is more suitable for nonlinear equations.

Similar to the linearization approach in (2), from approximate equations (7) we can obtain a new iterative formula

$$\begin{aligned} (J_{k} + F_{k} b^{\mathrm {T}}_{k}) s_k=-F_{k}. \end{aligned}$$
(8)

If the matrix \(J_k + F_k b^{T}_k\) is invertible, it follows that

$$\begin{aligned} x_{k+1}=x_k -(J_k + F_k b^{T}_k)^{-1}F_k. \end{aligned}$$
(9)

When \(b_k=0\), the iterative schemes (8) and (9) reduce to the linear equations (2) and the Newton–Raphson iteration (3), respectively.
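The passage from (7) to (8) is a one-line manipulation: multiplying \(F_k + J_k s/(1+b_k^{\mathrm{T}}s)=0\) by \(1+b_k^{\mathrm{T}}s\) gives \(F_k + F_k b_k^{\mathrm{T}} s + J_k s = 0\), which is (8). The sketch below, with a hypothetical helper `rational_step` and an arbitrarily chosen vector `b` used only for illustration, computes one step and checks both the reduction to the Newton step for \(b_k=0\) and that the computed step zeroes the rational model.

```python
import numpy as np

def rational_step(Fk, Jk, bk):
    """Solve (J_k + F_k b_k^T) s = -F_k, i.e. equation (8)."""
    A = Jk + np.outer(Fk, bk)  # rank one modification of the Jacobian
    return np.linalg.solve(A, -Fk)

# Illustrative data (assumed, not from the paper).
Fk = np.array([0.25, 0.5])
Jk = np.array([[2.0, 1.0], [1.0, -1.0]])
b = np.array([0.1, -0.2])  # arbitrary horizon vector for the demonstration

s_newton = rational_step(Fk, Jk, np.zeros(2))  # b = 0 recovers the Newton step
s_rational = rational_step(Fk, Jk, b)

# Residual of the rational model (7) at the computed step; should vanish.
model_residual = Fk + Jk @ s_rational / (1.0 + b @ s_rational)
```

Because the modification \(F_k b_k^{\mathrm{T}}\) has rank one, an implementation that already holds a factorization of \(J_k\) could in principle apply the Sherman–Morrison formula instead of refactorizing; the sketch does not do this.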

Moreover, Davidon proposed the conic model (Davidon 1980; Sorensen 1980) and many researchers have studied the conic model and collinear scaling algorithms (Ariyawansa 1990; Ariyawansa and Lau 1992; Deng and Li 1995; Gourgeon and Nocedal 1985). Near the current iteration point \(x_k\), the conic function \(c_k(x)\) is defined by

$$\begin{aligned} f(x)\approx c_k(x)= f(x_k)+ \frac{\nabla f(x_k)^{\mathrm {T}}s}{1+b^\mathrm {T}_k s}+\frac{s^\mathrm {T}B_k s}{2\left( 1+b^\mathrm {T}_k s\right) ^2}. \end{aligned}$$
(10)

In the conic model (10), the horizon vector \(b_k\) is a parameter. This parameter gives the conic model more freedom. Many researchers have given more attention to \(b_k\). As a result, some methods of choosing the horizon vector have been developed (Davidon 1980; Deng and Li 1995; Sheng 1995). Interestingly, the function (7) is the first two terms of conic model (10). In what follows we use these methods to determine the vector \(b_k\) in (7).

After a step from \(x_{k-1}\) to \(x_{k}\), we update \(b_{k-1}\) to \(b_{k}\) by requiring the following extra interpolation condition

$$\begin{aligned} R(x_{k-1})=F(x_{k-1}). \end{aligned}$$
(11)

This causes the search direction in (9) to depend on the Jacobian of the current point and the function values of the preceding point as well as the current point. In Newton’s method the search direction is determined by the Jacobian and function value of the current point. Compared with Newton’s method, more flexibility and more accurate approximation of the nonlinear function may be expected for the rational model (7).

From (11) we have

$$\begin{aligned} F_{k-1}=F_{k}-\frac{J_{k}s_{k-1}}{1-b^\mathrm {T}_{k}s_{k-1}}, \end{aligned}$$
(12)

where \(s_{k-1}=x_{k}-x_{k-1}\). Let

$$\begin{aligned} \beta _{k}= 1-b^\mathrm {T}_{k} s_{k-1}, \end{aligned}$$
(13)
$$\begin{aligned} y_{k-1}= F_{k}-F_{k-1}. \end{aligned}$$
(14)

Considering (12), we get

$$\begin{aligned} \beta _{k} y_{k-1} =J_{k}s_{k-1}, \end{aligned}$$

thus

$$\begin{aligned} \beta _{k}=\frac{y^\mathrm {T}_{k-1} J_{k}s_{k-1}}{y^\mathrm {T}_{k-1} y_{k-1}}. \end{aligned}$$
(15)
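The relation \(\beta _{k} y_{k-1} =J_{k}s_{k-1}\) is a set of several scalar equations in the single unknown \(\beta _k\), so in general it cannot hold exactly. One way to read (15), offered here as an interpretation rather than as a statement from the source, is as the least-squares solution of that relation:

$$\begin{aligned} \beta _k=\arg \min _{\beta \in {\mathbb {R}}}\Vert \beta y_{k-1}-J_k s_{k-1}\Vert ^2, \quad \text {with normal equation}\quad \left( y^\mathrm {T}_{k-1}y_{k-1}\right) \beta =y^\mathrm {T}_{k-1}J_k s_{k-1}. \end{aligned}$$

Equivalently, (15) follows by taking the inner product of both sides of \(\beta _{k} y_{k-1} =J_{k}s_{k-1}\) with \(y_{k-1}\), assuming \(y_{k-1}\ne 0\).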

Note that

$$\begin{aligned} b_k = \frac{(1-\beta _k)a_k}{a^\mathrm {T}_k s_{k-1}} \end{aligned}$$
(16)

for any \(a_k\in {\mathbb {R}}^n\) with \(a^\mathrm {T}_{k}s_{k-1}\ne 0\), will satisfy (13). Considering the special choice \(a_k = s_{k-1}\), we have

$$\begin{aligned} b_{k}=\frac{(1-\beta _{k})s_{k-1}}{s^\mathrm {T}_{k-1} s_{k-1}}=\frac{y^\mathrm {T}_{k-1}(y_{k-1} - J_{k}s_{k-1})s_{k-1}}{\left( y^\mathrm {T}_{k-1} y_{k-1}\right) \left( s^\mathrm {T}_{k-1} s_{k-1}\right) }. \end{aligned}$$
(17)

Analogously, we can consider another method (Sheng 1995) for constructing horizon vectors. Using (17) and (15), we see that

$$\begin{aligned} F_{k} b^\mathrm {T}_{k}= \frac{y^\mathrm {T}_{k-1}(y_{k-1} -J_{k}s_{k-1})}{y^\mathrm {T}_{k-1}y_{k-1}}\frac{F_{k}s^\mathrm {T}_{k-1}}{s^\mathrm {T}_{k-1} s_{k-1}}. \end{aligned}$$
(18)
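The quantities in (13) to (17) translate directly into code. The following sketch uses a hypothetical helper `horizon_vector` and sample values taken from one Newton step on an assumed two-equation system; it does not guard against \(y_{k-1}=0\) or \(s_{k-1}=0\).

```python
import numpy as np

def horizon_vector(Jk, Fk, F_prev, s_prev):
    """Return (beta_k, b_k) from equations (14), (15) and (17)."""
    y = Fk - F_prev                                 # (14)
    beta = (y @ (Jk @ s_prev)) / (y @ y)            # (15)
    b = (1.0 - beta) * s_prev / (s_prev @ s_prev)   # (17), choice a_k = s_{k-1}
    return beta, b

# Illustrative values (assumed): x_{k-1} = (1, 0.5), x_k = (0.75, 0.75) for
# F(x) = (x1^2 + x2^2 - 1, x1 - x2).
F_prev = np.array([0.25, 0.5])
Fk = np.array([0.125, 0.0])
Jk = np.array([[1.5, 1.5], [1.0, -1.0]])
s_prev = np.array([-0.25, 0.25])

beta, b = horizon_vector(Jk, Fk, F_prev, s_prev)
rank_one_term = np.outer(Fk, b)  # the matrix F_k b_k^T of (18)
```

By construction the returned pair satisfies (13), \(\beta _k = 1-b_k^{\mathrm{T}}s_{k-1}\), whatever the data.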

Next, we give the improved Newton’s method (INM) for systems of nonlinear equations.

Algorithm 1 Improved Newton’s method (INM)

There are two differences between Algorithm 1 and Newton’s method. First, INM uses the rank one technique to revise the Jacobian in every iteration. Second, INM utilises the function values of the previous iteration point.
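Since the listing of Algorithm 1 is not reproduced in this text, the following Python sketch is a reconstruction from equations (8), (9), (15) and (17), with \(b_0=0\) as used in the convergence proof. It should be read as an assumption-laden illustration, not as the authors’ exact algorithm: the name `inm`, the stopping rule, the fallback to \(b_k=0\) when a denominator vanishes, and the test system are all choices made for the example.

```python
import numpy as np

def inm(F, J, x0, tol=1e-6, max_iter=100):
    """Sketch of an improved Newton iteration built from (8), (9), (15), (17)."""
    x = np.asarray(x0, dtype=float)
    Fx, Jx = F(x), J(x)
    b = np.zeros_like(x)  # b_0 = 0, so the first step is a plain Newton step
    for k in range(max_iter):
        if np.linalg.norm(Fx) < tol:
            return x, k
        s = np.linalg.solve(Jx + np.outer(Fx, b), -Fx)  # (8)
        x_new = x + s                                   # (9)
        F_new, J_new = F(x_new), J(x_new)
        y = F_new - Fx                                  # (14)
        yy, ss = y @ y, s @ s
        if yy > 0.0 and ss > 0.0:
            beta = (y @ (J_new @ s)) / yy               # (15)
            b = (1.0 - beta) * s / ss                   # (17)
        else:
            b = np.zeros_like(x)  # assumed safeguard: fall back to Newton
        x, Fx, Jx = x_new, F_new, J_new
    return x, max_iter

# Hypothetical test system (not from the paper).
def F_demo(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0, x[0] - x[1]])

def J_demo(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

root, iters = inm(F_demo, J_demo, [1.0, 0.5])
```

Each pass through the loop uses one function value and one Jacobian, the same as Newton’s method; the only extra work is a few inner products and the rank one correction.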

For the one dimensional nonlinear equation \(f(x)=0\), where \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\) is continuously differentiable on \(D \subset {\mathbb { R}}\), the nonlinear function \(f(x)\) is approximated by

$$\begin{aligned} f(x)\approx f(x_k)+\frac{f'(x_k)(x-x_k)}{1+b_k(x-x_k)}. \end{aligned}$$

Then, we have

$$\begin{aligned} x_{k+1}=x_k - \frac{f(x_k)}{f'(x_k)+f(x_k)b_k}. \end{aligned}$$
(19)

We also use the interpolation method to determine the parameter \(b_k\) by

$$\begin{aligned} b_k=\frac{f(x_k)-f(x_{k-1})-f'(x_{k})(x_k-x_{k-1})}{(f(x_{k})-f(x_{k-1}))(x_k-x_{k-1})}. \end{aligned}$$
(20)

Then (20) together with (19) gives the following iteration scheme

$$\begin{aligned} x_{k+1}=x_k-\frac{f(x_k)}{\frac{f'(x_{k})f(x_{k-1})}{f(x_{k-1})-f(x_{k})}+\frac{f(x_{k})}{x_k-x_{k-1}}}. \end{aligned}$$
(21)

This is a new modified Newton formula.
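A minimal Python sketch of the scalar formula (21) follows. Because (21) needs two points, the sketch takes a plain Newton step first, which corresponds to \(b_0=0\); this starting rule, the name `inm_scalar`, the tolerance and the test function \(x^2-2\) are assumptions made for illustration. The code does not guard against \(f(x_k)=f(x_{k-1})\) or \(x_k=x_{k-1}\).

```python
def inm_scalar(f, df, x0, tol=1e-6, max_iter=100):
    """Iterate formula (21); the first step is an ordinary Newton step."""
    x_prev = x0
    f_prev = f(x_prev)
    if abs(f_prev) < tol:
        return x_prev, 0
    x = x_prev - f_prev / df(x_prev)  # Newton step, i.e. (19) with b_0 = 0
    for k in range(1, max_iter + 1):
        fx = f(x)
        if abs(fx) < tol:
            return x, k
        # Denominator of (21): f'(x_k) f(x_{k-1}) / (f(x_{k-1}) - f(x_k))
        #                      + f(x_k) / (x_k - x_{k-1})
        denom = df(x) * f_prev / (f_prev - fx) + fx / (x - x_prev)
        x_prev, f_prev, x = x, fx, x - fx / denom
    return x, max_iter

# Hypothetical example: the positive root of x^2 - 2.
root, iters = inm_scalar(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.5)
```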

Convergence analysis

In this section, we prove the local quadratic convergence of Algorithm 1 for systems of nonlinear equations. The techniques of the proof are similar to those used for Newton’s method for nonlinear equations. In the rest of this paper, we make the following assumptions:

Assumption 1

  (i) \(J(x^*)\) is nonsingular and there exists a constant \(\mu > 0\) such that \(\Vert J(x^*)^{-1}\Vert \le \mu\).

  (ii) The function F is continuously differentiable in the open convex set \(D\subset {\mathbb {R}}^n\), and there exists a constant \(\gamma >0\) such that for all \(x,y\in D\)

    $$\begin{aligned} \Vert J(x) - J(y)\Vert \le \gamma \Vert x-y\Vert . \end{aligned}$$

To prove the convergence theorem we need the following lemmas.

Lemma 1

Let \(F:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^m\) satisfy condition (ii) of Assumption 1. Then for any \(x, x+s\in D\),

$$\begin{aligned} \Vert F(x+s)-F(x)-J(x)s\Vert \le \frac{\gamma }{2}\Vert s\Vert ^2. \end{aligned}$$
(22)

Proof

Please refer to Lemma 4.1.12 in Dennis and Schnabel (1993). \(\square\)
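The bound (22) can be checked numerically on a small example. In the sketch below, the test function is an assumption chosen so that the constant is known exactly: for \(F(x)=(x_1^2+x_2^2-1,\ x_1-x_2)\) the difference \(J(x)-J(y)\) has the single nonzero row \(2(x-y)^{\mathrm{T}}\), so \(\gamma =2\) in the spectral norm, and the remainder \(F(x+s)-F(x)-J(x)s\) equals \((\Vert s\Vert ^2, 0)\), which attains the bound.

```python
import numpy as np

# Hypothetical test function (not from the paper) with Lipschitz constant gamma = 2.
def F_demo(x):
    return np.array([x[0] ** 2 + x[1] ** 2 - 1.0, x[0] - x[1]])

def J_demo(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

gamma = 2.0
rng = np.random.default_rng(0)
worst_violation = -np.inf
for _ in range(1000):
    x = rng.uniform(-1.0, 1.0, size=2)
    s = rng.uniform(-1.0, 1.0, size=2)
    lhs = np.linalg.norm(F_demo(x + s) - F_demo(x) - J_demo(x) @ s)
    rhs = 0.5 * gamma * np.linalg.norm(s) ** 2  # right-hand side of (22)
    worst_violation = max(worst_violation, lhs - rhs)
```

For this particular function the two sides of (22) coincide, so `worst_violation` differs from zero only by rounding error.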

Lemma 2

Let F and J satisfy the conditions of Lemma 1, and assume that \(J(x^*)^{-1}\) exists. Then there exist \(\varepsilon >0\) and \(0<m<M\), such that

$$\begin{aligned} m\Vert v-u\Vert \le \Vert F(v)-F(u)\Vert \le M\Vert v-u\Vert , \end{aligned}$$
(23)

for all \(v,u\in D\) for which \(\max \{\Vert v-x^*\Vert ,\Vert u-x^*\Vert \}\le \varepsilon\).

Proof

Please refer to Lemma 4.1.16 in Dennis and Schnabel (1993). \(\square\)

With the help of the preceding two lemmas we can prove the following convergence theorem. We denote the closed \(\varepsilon\)-neighborhood of \(x_*\) by \(N(x_*,\varepsilon )\), i.e.,

$$\begin{aligned} N(x_*,\varepsilon ) = \{x \in {\mathbb {R}}^n : \Vert x - x_*\Vert \le \varepsilon \}. \end{aligned}$$

Theorem 1

Let \(F:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) satisfy Assumption 1 and suppose that there exist \(x_*\in {\mathbb {R}}^n\), \(m>0\) and \(r > 0\), such that \(N(x_*,r)\subset D\), \(F(x_*)=0\). Then there exists \(\varepsilon >0\) such that for all \(x_0\in N(x_*,\varepsilon )\) the sequence \(\{x_2,x_3,\cdots \}\) generated by Algorithm 1 is well defined, converges to \(x_*\), and obeys

$$\begin{aligned} \Vert x_{k+1}-x_*\Vert \le \frac{\mu \gamma (m+\gamma )}{m}\Vert x_{k}-x_*\Vert ^2, \quad k=1,2,\ldots \end{aligned}$$
(24)

Proof

Since \(b_0=0\), we obtain the following inequality from the proof of Newton’s method (Dennis and Schnabel 1993),

$$\begin{aligned} \Vert x_1-x_*\Vert \le \frac{1}{2}\Vert x_0-x_*\Vert . \end{aligned}$$
(25)

Let

$$\begin{aligned} \varepsilon = \min \left\{ r, \frac{2m}{\mu \gamma (2m+\gamma )}\right\} . \end{aligned}$$

By a routine computation,

$$\begin{aligned} \left\| J(x_*)^{-1}\left[ \left( J_1+F_1b^T_1 \right) -J(x_*)\right] \right\| & \le \Vert J(x_*)^{-1}\Vert \left( \left\| J_1-J(x_*)\right\| +\left\| F_1b^T_1 \right\| \right) \\ & \le \mu \left( \left\| J_1 - J(x_*)\right\| +\left\| F_1 b^T_1\right\| \right) \\ & \le \mu \left( \gamma \left\| x_1-x_*\right\| +\left\| F_1 b^T_1 \right\| \right) . \end{aligned}$$

Considering the second term of the above expression, it follows from (22) and (23) that

$$\begin{aligned} \left\| F_1 b^T_1 \right\| & = \Vert F_{1}\Vert \Vert b_1\Vert \le \Vert F_{1}\Vert \left\| \frac{y^\mathrm {T}_0(F_1- F_0 -J_{1}s_0)}{y^\mathrm {T}_0 y_0}\right\| \frac{1}{\Vert s_0\Vert } \\ & \le \Vert F_{1}\Vert \frac{\Vert (F_1- F_0 - J_{1}s_0)\Vert }{\Vert y_0\Vert \Vert s_0\Vert } \le \Vert F_{1}\Vert \frac{\gamma \Vert s_0\Vert }{2\Vert y_0\Vert } \\ & \le \Vert F_{1}\Vert \frac{\gamma }{2m} = \frac{\gamma }{2m} \Vert F_{1}-F_*\Vert \\ & \le \frac{\gamma ^2}{2m} \Vert x_1-x_*\Vert . \end{aligned}$$

Then,

$$\begin{aligned} \left\| J(x_*)^{-1}\left[ \left( J_1+F_1b^T_1 \right) -J(x_*)\right] \right\| & \le \mu \gamma \left( 1 + \frac{\gamma }{2m}\right) \Vert x_1-x_*\Vert \nonumber \\ & \le \frac{\mu \gamma }{2} \left( 1 + \frac{\gamma }{2m}\right) \Vert x_0-x_*\Vert \nonumber \\ & \le \frac{\mu \gamma }{2}\left( 1 + \frac{\gamma }{2m}\right) \varepsilon \le \frac{1}{2}. \end{aligned}$$
(26)

Therefore, by the perturbation theorem, \(J_1+F_1b^T_1\) is nonsingular and

$$\begin{aligned} \left\| \left( J_1+F_1b^T_1\right) ^{-1}\right\| & \le \frac{ \left\| J(x_*)^{-1}\right\| }{1- \left\| J(x_*)^{-1}\left[ \left( J_1+F_1b^T_1 \right) -J(x_*)\right] \right\| } \nonumber \\ & \le 2 \Vert J(x_*)^{-1}\Vert \nonumber \\ & \le 2\mu . \end{aligned}$$
(27)

Thus \(x_2\) is well defined. From our method, we get

$$\begin{aligned} x_2 - x_* & = x_1 - x_* -\left( J_1 + F_1b^\mathrm {T}_1\right) ^{-1}F_1 \\ & = x_1 - x_* -\left( J_1 + F_1b^\mathrm {T}_1\right) ^{-1}(F_1-F_*) \\ & = \left( J_1 + F_1b^\mathrm {T}_1\right) ^{-1}\left[ F_* -F_1 - \left( J_1 + F_1b^\mathrm {T}_1\right) (x_* - x_1)\right] . \end{aligned}$$

Furthermore,

$$\begin{aligned} \Vert x_2 - x_*\Vert & \le \left\| \left( J_1 + F_1b^\mathrm {T}_1\right) ^{-1}\right\| \left\| F_* -F_1 - \left( J_1 + F_1b^\mathrm {T}_1\right) (x_* - x_1)\right\| \\ & \le 2\mu \left\| F_* -F_1 - \left( J_1 + F_1b^\mathrm {T}_1\right) (x_* - x_1)\right\| \\ & \le 2\mu \left( \frac{\gamma }{2}\Vert (x_* - x_1)\Vert ^2+\left\| F_1b^\mathrm {T}_1(x_* - x_1)\right\| \right) \\ & \le 2\mu \left( \frac{\gamma }{2}\Vert (x_* - x_1)\Vert ^2 + \frac{\gamma ^2}{2m}\Vert x_*-x_1\Vert ^2\right) \\ & = \frac{\mu \gamma (m+\gamma )}{m}\Vert x_* - x_1\Vert ^2. \end{aligned}$$

This proves (24). Taking (25) into consideration leads to

$$\begin{aligned} \Vert x_2 - x_*\Vert & \le \frac{\mu \gamma (m+\gamma )}{4m}\Vert x_0-x_*\Vert ^2 \\ & \le \frac{\mu \gamma (m+\gamma )}{4m}\varepsilon \Vert x_0-x_*\Vert \\ & < \frac{1}{2} \Vert x_0-x_*\Vert . \end{aligned}$$

Then \(x_2\in N(x_*,r)\), which completes the case \(k=1\). The proof of the induction step proceeds identically. \(\square\)

Numerical experiments

This section is devoted to the numerical results. First, we show the numerical comparison between Algorithm 1, Newton’s method and a third order Newton’s method for finding a root of a real function. This provides numerical evidence that Algorithm 1 is better than Newton’s method. Secondly, we demonstrate the performance of Algorithm 1 for solving systems of nonlinear equations. Algorithm 1 has been applied to some popular test problems and compared with Newton’s method and a third order method. All codes were written in Mathematica 10.0 and run on a PC with an Intel i7 3.6 GHz CPU, 4 GB memory and the 64-bit Windows 7 operating system.

Finding roots of real function

In this subsection we demonstrate the performance of our improved Newton’s method for finding the roots of real functions \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\). In other words, we show the efficiency of the new iteration formula (21) in finding a root of a nonlinear equation. Specifically, we chose ten particular nonlinear equations from the literature (Thukral 2016), which are listed in Table 1.

Table 1 Test equations and range of initial point

In our tests, the iteration stops when \(\Vert F(x_k)\Vert <10^{-6}\) or when the number of iterations exceeds 100. We solve these 10 problems by using the iteration formula (21), Newton’s method and the third order Newton’s method introduced in Darvishi and Barati (2007a). In our experiments, the initial point for each problem is randomly generated ten times in the range of the initial point, and the average numerical results are listed in Table 2, where

INM: denotes the iteration formula (21),

NM: denotes Newton’s method,

3NM: denotes the third order Newton’s method (Darvishi and Barati 2007a),

It: denotes the average number of iterations,

Re: denotes the average value of \(|f(x_k)|\) when the iteration stops,

Fa: denotes the number of failures in solving equations.

From Table 2, in terms of the number of iterations, the efficiency of the improved Newton formula (21) is better than Newton’s method, but not as good as the third order method.

Table 2 Numerical experiment results of INM, NM and 3NM

To compare the performance of the iteration formula (21), Newton’s method and the third order method (Darvishi and Barati 2007a), we use the performance profile introduced in Dolan and Moré (2002). We assume that there are \(n_s\) solvers and \(n_p\) test problems from the test set \(\mathcal {P}\) which is chosen from Table 1. The initial point is selected randomly from the range of the initial point. We use the iteration number as a measure of performance for the iteration formula (21), NM and 3NM. For each problem p and solver s, let

$$\begin{aligned} f_{p,s} = \hbox {iteration number required to solve problem } p\hbox { by solver }s. \end{aligned}$$

We employ the performance ratio

$$\begin{aligned} r_{p,s} := \frac{f_{p,s}}{\min \{f_{p,s}:s\in \mathcal {S}\}}, \end{aligned}$$

where \(\mathcal {S}\) is the set of the three solvers. We assume that a parameter \(r_{M} \ge r_{p,s}\) is chosen for all p and s, and \(r_{p,s} = r_{M}\) if and only if solver s does not solve problem p. In order to obtain an overall assessment for each solver, we define

$$\begin{aligned} \rho _s(\tau ) := \frac{1}{n_p}\mathrm {size}\{p \in \mathcal {P}:r_{p,s}\le \tau \}, \end{aligned}$$

which is called the performance profile of the number of iterations for solver s. Then, \(\rho _s(\tau )\) is the probability for solver \(s\in \mathcal {S}\) that a performance ratio \(r_{p,s}\) is within a factor \(\tau \in {\mathbb {R}}\) of the best possible ratio.
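The definitions of \(r_{p,s}\) and \(\rho _s(\tau )\) can be sketched as follows. The iteration counts in `costs` are made-up numbers for illustration only and are not results from the paper; the function name `performance_profile` is likewise hypothetical. A failed run is encoded as `np.inf`, which plays the role of \(r_M\) for any finite \(\tau\); the sketch assumes every problem is solved by at least one solver.

```python
import numpy as np

def performance_profile(costs, taus):
    """costs[p, s] = iterations of solver s on problem p (np.inf for a failure).

    Returns an array whose row i holds rho_s(taus[i]) for every solver s.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)  # best solver on each problem
    ratios = costs / best                    # performance ratios r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Made-up iteration counts; columns stand for three solvers, rows for problems.
costs = [
    [5.0, 7.0, 4.0],
    [6.0, 6.0, 5.0],
    [np.inf, 9.0, 6.0],  # the first solver fails on this problem
]
taus = [1.0, 1.5, 2.0]
profile = performance_profile(costs, taus)
```

Each column of `profile` is nondecreasing in \(\tau\), and the value at \(\tau =1\) is the fraction of problems on which that solver was (jointly) the fastest.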

Fig. 1 Performance profile of iteration numbers of INM, NM and 3NM

Figure 1 shows the performance profile of iteration numbers in the range of \(\tau \in [1,2]\) for the three solvers on 200 test problems, which were selected from Table 1 with random initial points. From this figure, we see that the numerical performance of solver INM is between 3NM and NM. In summary, from the viewpoint of iteration numbers, we conclude that

$$\begin{aligned} \text {3NM}> \text {INM} > \text {NM}, \end{aligned}$$

where “>” means “better performance”.

Solving system of nonlinear equations

In this subsection we show the numerical efficiency of Algorithm 1 for solving systems of nonlinear equations. Listed in Table 3 are the 12 multivariable test problems that were chosen from the test problem sets (Dennis and Schnabel 1993; Moré et al. 1981; Andrei 2008). The starting points for each problem are the standard starting points. The numerical results are listed in Table 4, where

INM: denotes Algorithm 1,

NM: denotes Newton’s method,

3NM: denotes the third order Newton method (Darvishi and Barati 2007a),

Dim: denotes the size of problem,

It: denotes the number of iterations,

Ti: denotes the value of the CPU time in seconds,

–: denotes that the number of iterations exceeded 100.

It is observed from Table 4 that in terms of the number of iterations and computation time, the efficiency of Algorithm 1 is better than Newton’s method for most of the testing problems, and the efficiency of Algorithm 1 is close to the third order convergence method 3NM (Darvishi and Barati 2007a).

Table 3 Test problems
Table 4 Numerical experiment results of INM, NM and 3NM

The above experiments were conducted with the standard initial points. We also test the three methods on the test problems (Table 3) at random starting points. In particular, starting points for each problem are randomly chosen 10 times from a box surrounding the standard starting points. In order to obtain an overall assessment for the three methods, we again use the number of iterations as a performance measure for Algorithm 1, Newton’s method and the third order method (Darvishi and Barati 2007a). The performance plot based on iteration number is presented in Fig. 2. From this figure, we can see that Algorithm 1 has the best performance for \(\tau > 1.3\). Again, from the viewpoint of large test problems with perturbed initial points, we conclude that Algorithm 1 performs better than Newton’s method and the third order method (Darvishi and Barati 2007a).

Fig. 2 Performance profile of INM, NM and 3NM

Conclusion

In this paper, we present an improved Newton’s method for systems of nonlinear equations that reuses information from the previous iteration. In the new method, the function value at the previous iteration point is used to correct the Newton direction. The proposed method also has the quadratic convergence property. From the numerical results obtained for a set of standard test problems, it appears that the scheme described, in which the Jacobian matrix is updated by a rank one matrix, may allow considerable savings in iteration number and computing time. Moreover, two kinds of numerical comparisons are presented in this paper. The first one is the numerical comparison between the new Newton formula, Newton’s method and a third order Newton method for finding roots of scalar functions. From this comparison we see that the proposed algorithm is efficient for one dimensional real functions. The second comparison is for multivariate vector equations. From this comparison we see that the numerical performance of the proposed algorithm is better in the multidimensional case than in the one-dimensional case. This is an interesting observation which may be helpful in other contexts.

References

  1. Andrei N (2008) An unconstrained optimization test functions collection. Adv Model Optim 10:147–161

  2. Ariyawansa KA (1990) Deriving collinear scaling algorithms as extensions of quasi-Newton methods and the local convergence of DFP- and BFGS-related collinear scaling algorithms. Math Program 49:23–48

  3. Ariyawansa KA, Lau DTM (1992) Local and Q-superlinear convergence of a class of collinear scaling algorithms that extends quasi-Newton methods with Broyden’s bounded class of updates. Optimization 23(4):323–339

  4. Cauchy AL (1829) Sur la détermination approximative des racines d’une équation algébrique ou transcendante. In: Leçons sur le calcul différentiel, Buré frères, Paris, pp 575–600

  5. Darvishi MT, Barati A (2007a) A third-order Newton-type method to solve systems of nonlinear equations. Appl Math Comput 187:630–635

  6. Darvishi MT, Barati A (2007b) A fourth-order method from quadrature formulae to solve systems of nonlinear equations. Appl Math Comput 188:257–261

  7. Davidon WC (1980) Conic approximations and collinear scalings for optimizers. SIAM J Numer Anal 17:268–281

  8. Dembo RS, Eisenstat SC, Steihaug T (1982) Inexact Newton methods. SIAM J Numer Anal 19(2):400–408

  9. Deng NY, Li ZF (1995) Some global convergence properties of a conic-variable metric algorithm for minimization with inexact line searches. Optim Methods Softw 5(1):105–122

  10. Dennis JE, Schnabel RB (1993) Numerical methods for unconstrained optimization and nonlinear equations. SIAM, Philadelphia

  11. Dolan ED, Moré JJ (2002) Benchmarking optimization software with performance profiles. Math Program 91:201–213

  12. Fine HB (1916) On Newton’s method of approximation. Proc Natl Acad Sci USA 2(9):546–552

  13. Fourier JBJ (1890) Question d’analyse algébrique. In: Oeuvres complètes (2), Gauthier-Villars, Paris, pp 243–253

  14. Frontini M, Sormani E (2004) Third-order methods from quadrature formulae for solving systems of nonlinear equations. Appl Math Comput 149:771–782

  15. Gourgeon H, Nocedal J (1985) A conic algorithm for optimization. SIAM J Sci Stat Comput 6(2):253–267

  16. Grau-Sánchez M, Grau A, Noguera M (2011) On the computational efficiency index and some iterative methods for solving systems of nonlinear equations. J Comput Appl Math 236:1259–1266

  17. Homeier HHH (2004) A modified Newton method with cubic convergence: the multivariable case. J Comput Appl Math 169:161–169

  18. Kantorovich LV (1948) On Newton’s method for functional equations. Dokl Akad Nauk SSSR 59:1237–1240

  19. Kelley CT (2003) Solving nonlinear equations with Newton’s method. SIAM, Philadelphia

  20. Moré JJ, Garbow BS, Hillstrom KE (1981) Testing unconstrained optimization software. ACM Trans Math Softw 7:17–41

  21. Noor MA, Waseem M (2009) Some iterative methods for solving a system of nonlinear equations. Comput Math Appl 57:101–106

  22. Ortega JM, Rheinboldt WC (1970) Iterative solution of nonlinear equations in several variables. Academic Press, New York

  23. Petković MS, Neta B, Petković LD, Džunić J (2013a) Multipoint methods for solving nonlinear equations. Elsevier, Amsterdam

  24. Petković MS, Neta B, Petković LD, Džunić J (2013b) Multipoint methods for solving nonlinear equations: a survey. Appl Math Comput 226:635–660

  25. Qi L, Sun J (1993) A nonsmooth version of Newton’s method. Math Program 58(1–3):353–367

  26. Sharma JR, Guha RK, Sharma R (2013) An efficient fourth order weighted-Newton method for systems of nonlinear equations. Numer Algorithms 62:307–323

  27. Sheng S (1995) Interpolation by conic model for unconstrained optimization. Computing 54:83–98

  28. Sorensen DC (1980) The q-superlinear convergence of a collinear scaling algorithm for unconstrained optimization. SIAM J Numer Anal 17(1):84–114

  29. Sui Y, Saheya, Chen, G (2014) An improvement for the rational approximation RALND at accumulated two-point information. Math Numer Sinica 36(1):51–64

  30. Thukral R (2016) New modification of Newton method with third-order convergence for solving nonlinear equations of type \(f(0)=0\). Am J Comput Appl Math 69(1):14–18

  31. Traub JF (1964) Iterative methods for the solution of equations. Prentice-Hall, Englewood Cliffs

Authors' contributions

BS and GC conceived and designed the study; YS organized the manuscript; BS and CW performed the numerical experiments. All authors read and approved the final manuscript.

Acknowledgements

This paper was partly supported by the Natural Science Foundation of Inner Mongolia (Award Number: 2014MS0119, 2014MS0102) and the China National Funds for Distinguished Young Scientists (Award Number: 11401326).

Competing interests

The authors declare that they have no competing interests.

Author information

Correspondence to Yun-kang Sui.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Keywords

  • Rational approximate function
  • Improved Newton’s method
  • Local convergence

Mathematics Subject Classification

  • 90C25
  • 90C30