1. Home
  2. Archives
  3. Vol 56 (2024) Issue 3
  4. Articles

Discretization and Associated Asymptotic Behavior for the Lax Equation with Skew-symmetry

Abstract

This paper presents an algorithm to construct a tridiagonal matrix factored by bidiagonal matrices with prescribed eigenvalues and specified matrix entries. The proposed algorithm addresses inverse eigenvalue problems (IEPs) constrained by LR decomposition. Using techniques from discrete soliton theory, we derive recurrence relations that connect matrix entries and eigenvalues. The algorithm systematically computes unknown entries in the matrix from given spectrum data and partial matrix information. Several examples, including cases with real, complex, and multiple eigenvalues, demonstrate the efficiency of the proposed algorithm. Additionally, we provide conditions under which the algorithm successfully solves the IEP.

Keywords

1 Introduction

Matrix operations and eigenvalue computations that are capable of efficiently handling large-scale numerical data are indispensable for recent developments in advanced information processing [1,2]. These techniques play a crucial role in analyzing physical phenomena in fields such as electrical engineering and quantum mechanics, as well as in the internal processes of search engines, which are integral to our daily lives.

A fundamental problem is often formulated in terms of the eigenvalues of a square matrix ∈ ℝ×, where each eigenvalue is a root of the characteristic equation of . The eigenvalue computations are typically performed through iterative methods that manipulate the matrix elements to ensure convergence toward the desired eigenvalues. A key technique in eigenvalue computations is matrix factorization. A well-known example is the QR factorization [3], where a nonsingular matrix ∈ ℝ× is factorized as = , using an orthogonal matrix and an upper triangular matrix . Other methods include the LR factorization, which expresses as the product of a lower triangular matrix and an upper triangular matrix , and the Cholesky factorization, in which, for symmetric matrices, is factorized as = T, where is an upper triangular matrix and T is its transpose [4].

Eigenvalue computation methods are chosen depending on the problem's formulation and intended application. Among the numerically stable algorithms, the QR method, based on the QR factorization, is widely employed. The QR method [5,6] iteratively applies the factorization of matrices () and () as two operations: () = () () and (+1) = () () . The equation () = () () represents the QR factorization of the matrix () , while the equation (+1) = () () updates the matrix by reversing the order of multiplication. This system of equations also leads to a similarity transformation (+1) = ( () ) T () () , which ensures that, for sufficiently large , the matrix () converges to an upper triangular form, with its diagonal entries corresponding to the eigenvalues of the initial (0) .

Various analyses of Lax dynamics continue to be actively pursued in the study of integrable systems [7-9]. The connections between numerical methods and integrable systems, particularly nonlinear dynamical systems that allow for exact solutions, have been established. For example, it was shown in [10-12] that one step of the QR method corresponds to a time evolution step of a Lax equation, which describes an integrable system. Furthermore, the Lax equation involving Hessenberg matrices has revealed the relationship between integrable system theory and numerical algorithms, concerning the eigenvalue computations based on LR factorization [13].

The accuracy and speed of matrix eigenvalue computations in numerical analysis depend heavily on the size and structure of the matrix and, thus, the introduction of shift parameters is often explored to improve performance. The shifted QR method, which incorporates a shift parameter, enables faster eigenvalue computations than the standard QR method, provided that the shift amount is chosen appropriately [14,15]. However, despite extensive research on the relationship between the QR method and integrable systems, little attention has been given to the connections between the shifted QR method and Lax equations. As the introduction of shift parameters breaks the direct correspondence with the governing equations of integrable systems, merely extending results from the standard QR method does not suffice to interpret the shifted QR method within the framework of integrable systems.

Therefore, this paper aims to reinterpret the shifted QR method from the perspective of discrete dynamical systems, treating the shift parameter as a discretization step arising from the integrable system theory. We uncover the integrable structure underlying the shifted QR method and propose a novel matrix

eigenvalue computation method utilizing skew-symmetric matrices based on this framework, distinct from the shifted QR method. We also analyze the asymptotic behavior of this proposed method.

The structure of this paper is as follows. In Section 2, we introduce a shift parameter () and demonstrate that, in the limit () ≔ −1/ () → 0, the shifted QR method corresponds to the Lax equation involving skew-symmetric matrices. In Section 3, we propose a novel discretization method based on matrix factorization, focusing on the skew-symmetric structure of the matrices in the Lax equation associated with the shifted QR method. We also analyze the asymptotic behavior of this new method. Finally, we present our conclusions in Section 4.

2 Connection between the Shifted QR Method and the Lax System with Skew-symmetry

In this section, we first explain the eigenvalue computation process using the shifted QR method. We then demonstrate the derivation of the Lax equation by taking the limit of the shift parameter in the shifted QR method.

The shifted QR method [6] is an eigenvalue computation technique that introduces a shift parameter to accelerate the convergence of the standard QR method. Given an initial matrix (0) = ∈ ℝ× , the shifted QR method performs the following two operations:

1. The QR factorization is performed on a matrix whose diagonal entries in () have been shifted by the parameter () as follows:

\[A^{(k)} - \mu^{(k)}I = Q^{(k)}R^{(k)}, \quad k = 0, 1, ...,\] where () and () are orthogonal and upper triangular matrices, respectively, (0) , (1) , … are arbitrary constants, and denotes the identity matrix.

2. After obtaining the matrices () and () from the shifted QR factorization, the matrix for the next iteration (+1) is computed by multiplying () and () , and subsequently adding the shift parameter () back to the diagonal entries. This process is expressed as:

\[A^{(k+1)} = R^{(k)}Q^{(k)} + \mu^{(k)}I, \quad k = 0, 1, \dots.\]

By repeating these steps for \(A^{(0)}\), \(A^{(1)}\), ..., the shifted QR method iteratively computes the eigenvalues of the initial matrix A. The transition from \(A^{(k)}\) to \(A^{(k+1)}\) in the shifted QR method can be interpreted as a similarity transformation \(A^{(k+1)} = (Q^{(k)})^{\mathsf{T}} A^{(k)} Q^{(k)}\) that preserves the eigenvalues of the matrices.

Considering a single step of the shifted QR method as a discrete-time evolution in a discrete system suggests the existence of an underlying continuous system, represented by a differential equation in continuous time t. To derive the continuous analogue of the shifted QR method, we first introduce two auxiliary matrices defined in terms of \(Q^{(k)}\), \(R^{(k)}\), and the shift parameter \(\mu^{(k)}\) as follows:

\[\tilde{Q}^{(k)} \coloneqq \mu^{(k)} \big( I - Q^{(k)} \big), \quad \tilde{R}^{(k)} \coloneqq R^{(k)} + \mu^{(k)} I,\] which imply that \(A^{(k)}\) can be decomposed as \(A^{(k)} = \tilde{Q}^{(k)} + \tilde{R}^{(k)} - \tilde{Q}^{(k)} \tilde{R}^{(k)} / \mu^{(k)}\). From the similarity transformation \(A^{(k+1)} = (Q^{(k)})^{\mathsf{T}} A^{(k)} Q^{(k)}\), by introducing a discrete parameter \(\delta^{(k)} \coloneqq -1/\mu^{(k)}\) to replace the shift parameter \(\mu^{(k)}\), we consider that the correspondence between the discrete-time index k and the continuous-time variable t is established as \(A^{(k)} = A(t)\) and \(A^{(k+1)} = A(t + \delta^{(k)})\). Then, as \(\delta^{(k)} \to 0\), the following continuous equation holds:

\[\frac{dA(t)}{dt} = \lim_{\delta^{(k)} \to 0} \left( A(t)\tilde{Q}^{(k)} - \tilde{Q}^{(k)}A(t + \delta^{(k)}) \right). \tag{1}\]

The auxiliary matrix \(\tilde{Q}^{(k)}\) is computed from the QR factorization of \(A^{(k)} - \mu^{(k)}I\). To investigate its behavior as \(\delta^{(k)} \to 0\), it is sufficient to analyze the computation of the Q-factor in the QR factorization. Although the Q-factor can also be computed using Householder matrices, we employ the Gram-Schmidt orthogonalization method. Let the j-th column vector of \(\hat{A} \coloneqq A^{(k)} - \mu^{(k)}I \in \mathbb{R}^{m \times m}\) be denoted by \(\hat{a}_j\). Then, the vectors \(b_1, b_2, \ldots, b_m\) determined by the Gram-Schmidt orthogonalization [2] are given as:

\[b_1 = \hat{a}_1, \quad b_j = \hat{a}_j - \sum_{\ell=1}^{j-1} \frac{\hat{a}_j \cdot b_\ell}{\|b_\ell\|^2} b_\ell, \quad j = 2, 3, ..., m,\] (2)

where \(\hat{a}_j \cdot b_\ell\) is the Euclidean inner product of \(\hat{a}_j\) and \(b_\ell\), and \(||b_\ell||\) denotes the Euclidean norm of \(b_\ell\).

Regarding the vectors:

\[c_{j,\ell} := \frac{\widehat{a}_j \cdot b_\ell}{\|b_\ell\|^2} b_\ell, \quad \ell = 1, 2, ..., j - 1, \quad j = 2, 3, ..., m,\] (3)

arising in the Gram-Schmidt process (2), we present the following lemma concerning the existence of these vectors as the limit \(\delta^{(k)} \to 0\), which corresponds to \(\mu^{(k)} \to -\infty\).

Lemma 1. Let \(c_{j,\ell}\) be the vectors in Eq. (3) generated by the Gram-Schmidt orthogonalization applied to the matrix \(\hat{A} = A^{(k)} - \mu^{(k)}I\). Then,

\[\lim_{\mu^{(k)} \to -\infty} \sum_{\ell=1}^{j-1} \mathbf{c}_{j,\ell} = (a_{1,j} + a_{j,1}) \mathbf{e}_1 + (a_{2,j} + a_{j,2}) \mathbf{e}_2 + \cdots + (a_{j-1,j} + a_{j,j-1}) \mathbf{e}_{j-1}, \quad j = 2, 3, ..., m,\] where \(a_{i,j}\) denotes the (i,j)-th entry in \(A^{(k)}\), and \(e_1, e_2, ..., e_{j-1}\) are unit vectors, with the i-th unit vector \(e_i\) having a 1 in the i-th element and 0 in all other elements.

Proof. The vector obtained from Eq. (2) applied to the 2nd column vector of \(\hat{A}\) is given by \(\boldsymbol{b}_2 = \hat{\boldsymbol{a}}_2 - \boldsymbol{c}_{2,1}\). Since \(\boldsymbol{b}_1 = \hat{\boldsymbol{a}}_1\), we obtain \(\boldsymbol{c}_{2,1} = (\hat{\boldsymbol{a}}_2 \cdot \hat{\boldsymbol{a}}_1)\hat{\boldsymbol{a}}_1/\|\hat{\boldsymbol{a}}_1\|^2\). Using the elements of \(\hat{\boldsymbol{a}}_1 = (a_{1,1} - \mu^{(k)}, a_{2,1}, \dots, a_{m,1})^{\mathsf{T}}\) and \(\hat{\boldsymbol{a}}_2 = (a_{1,2}, a_{2,2} - \mu^{(k)}, \dots, a_{m,2})^{\mathsf{T}}\), the 1st element of \(\boldsymbol{c}_{2,1}\) becomes:

\[(\boldsymbol{c}_{2,1})_1 = \frac{a_{1,2} + a_{2,1} \frac{a_{2,2} - \mu^{(k)}}{a_{1,1} - \mu^{(k)}} + \frac{\sum_{\ell=3}^m a_{\ell,1} a_{\ell,2}}{a_{1,1} - \mu^{(k)}}}{1 + \frac{\sum_{\ell=2}^m a_{\ell,1}^2}{\left(a_{1,1} - \mu^{(k)}\right)^2}}.\]

Taking \(\mu^{(k)} \to -\infty\), we obtain \(\lim_{\mu^{(k)} \to -\infty} (c_{2,1})_1 = a_{1,2} + a_{2,1}\). The 2nd, 3rd, ..., m-th elements of \(c_{2,1}\) can be expressed as:

\[\text{[rumus tidak dapat ditampilkan dengan baik — lihat PDF asli]}\]

Observing the right-hand side of Eq. (4), we obtain \(\lim_{\mu^{(k)}\to-\infty} (\boldsymbol{c}_{2,1})_i = 0\) for \(i=2,3,\ldots,m\). Thus, the limit of \(\boldsymbol{c}_{2,1}\) as \(\mu^{(k)}\to-\infty\) is given by \(\lim_{\mu^{(k)}\to-\infty} \boldsymbol{c}_{2,1} = (a_{1,2}+a_{2,1})\boldsymbol{e}_1\). Next, let us assume that, for \(j=2,3,\ldots,s\), the following holds:

\[\lim_{\mu^{(k)} \to -\infty} \sum_{\ell=1}^{j-1} c_{j,\ell} = (a_{1,j} + a_{j,1}) \boldsymbol{e}_1 + (a_{2,j} + a_{j,2}) \boldsymbol{e}_2 + \cdots + (a_{j-1,j} + a_{j,j-1}) \boldsymbol{e}_{j-1}.\] (5)

Using \(c_{s+1,\ell} = (\widehat{a}_{s+1} \cdot b_{\ell})b_{\ell}/\|b_{\ell}\|^2\) and \(b_1 = \widehat{a}_1\), we express \(c_{s+1,1}\) as:

\[c_{s+1,1} = \frac{a_{1,s+1} \left( a_{1,1} - \mu^{(k)} \right) + \left( a_{s+1,s+1} - \mu^{(k)} \right) a_{s+1,1} + \sum_{\ell \neq 1, \ell \neq s+1}^{m} a_{\ell,s+1} a_{\ell,1}}{\left( a_{1,1} - \mu^{(k)} \right)^2 + \sum_{\ell=2}^{m} a_{\ell,1}^2}\]

\[\times \begin{pmatrix} a_{1,1} - \mu^{(k)} \\ a_{2,1} \\ \vdots \\ a_{m,1} \end{pmatrix}, \tag{6}\] where the term \(\sum_{\ell \neq 1, \ell \neq s+1}^m a_{\ell,s+1} a_{\ell,1}\) represents the sum \(\sum_{\ell=1}^m a_{\ell,s+1} a_{\ell,1}\), excluding the cases \(\ell=1\) and \(\ell=s+1\). Taking the limit \(\mu^{(k)} \to -\infty\) on both sides of Eq. (6), we obtain \(\lim_{\mu^{(k)} \to -\infty} c_{s+1,1} = (a_{1,s+1} + a_{s+1,1}) e_1\). We now express the 2nd element of \(c_{s+1,2}\) is then:

\[(\boldsymbol{c}_{s+1,2})_{2} = \frac{a_{2,s+1} + \frac{a_{s+1,s+1} - \mu^{(k)}}{(\boldsymbol{b}_{2})_{2}} (\boldsymbol{b}_{2})_{s+1} + \frac{\sum_{\ell \neq 2, \ell \neq s+1}^{m} a_{\ell,s+1} (\boldsymbol{b}_{2})_{\ell}}{(\boldsymbol{b}_{2})_{2}}}{1 + \frac{\sum_{\ell \neq 2}^{m} (\boldsymbol{b}_{2})_{\ell}^{2}}{(\boldsymbol{b}_{2})_{2}^{2}}}, \quad (7)\] where \(\sum_{\ell\neq 2}^m (\boldsymbol{b}_2)_{\ell}^2\) denotes the sum \(\sum_{\ell=1}^m (\boldsymbol{b}_2)_{\ell}^2\) excluding the case \(\ell=2\). Thus, using the assumption (5) for j=2, and considering the limit of Eq. (7) as \(\mu^{(k)} \to -\infty\), we obtain \(\lim_{\mu^{(k)} \to -\infty} (\boldsymbol{c}_{s+1,2})_2 = a_{2,s+1} + a_{s+1,2}\). The remaining elements of \(\boldsymbol{c}_{s+1,2}\), including \((\boldsymbol{c}_{s+1,2})_1\) and \((\boldsymbol{c}_{s+1,2})_3\), \((\boldsymbol{c}_{s+1,2})_4\), ..., \((\boldsymbol{c}_{s+1,2})_m\), tend to 0 as \(\mu^{(k)} \to -\infty\), yielding \(\lim_{\mu^{(k)} \to -\infty} \boldsymbol{c}_{s+1,2} = (a_{2,s+1} + a_{s+1,2})\boldsymbol{e}_2\). The limits for \(\boldsymbol{c}_{s+1,3}\), \(\boldsymbol{c}_{s+1,4}\), ..., \(\boldsymbol{c}_{s+1,5}\) can be investigated in a similar manner, noting that the parameter \(\mu^{(k)}\) appears in the \(\ell\) -th element of \(\boldsymbol{b}_{\ell}\). Therefore, for \(\boldsymbol{c}_{s+1,3}, \boldsymbol{c}_{s+1,4}, \dots, \boldsymbol{c}_{s+1,s}\), we have \(\lim_{\mu^{(k)} \to -\infty} \boldsymbol{c}_{s+1,\ell} = (a_{\ell,s+1} + a_{s+1,\ell}) \boldsymbol{e}_{\ell}\).

By applying Lemma 1 to the auxiliary matrix \(\tilde{Q}^{(k)}\), we obtain the following theorem concerning the existence of its continuous analogue.

Theorem 1. Let \(A^{(k)} = (a_{i,j})_{1 \le i,j \le m}\) be a matrix at some step k and let \(Q^{(k)}\) denote the orthogonal matrix obtained from the shifted QR method. Then, the limit of the auxiliary matrix \(\tilde{Q}^{(k)} = \mu^{(k)} (I - Q^{(k)})\) exists. Furthermore, the matrix \(\tilde{Q}(t) \coloneqq \lim_{\mu^{(k)} \to -\infty} \tilde{Q}^{(k)}\) is a skew-symmetric matrix constructed from the strictly lower triangular part of \(A^{(k)}\).

Proof. By applying the Gram-Schmidt orthogonalization to the matrix \(\hat{A}^{(k)}\) = \(A^{(k)} - \mu^{(k)}I\), we obtain the orthogonal matrix \(Q^{(k)}\). Let the j-th column vector of \(\widetilde{Q}^{(k)} = \mu^{(k)} (I - Q^{(k)})\) be denoted by \(\widetilde{q}_j\). Since the Euclidean norm \(\|\widehat{a}_1\|\) of the 1st column vector of \(\hat{A}^{(k)}\) can be evaluated as \(\|\hat{a}_1\| = (a_{1,1} - a_{1,1})\)\(\mu^{(k)}\))\(\sqrt{1 + O((\mu^{(k)})^{-2})}\), as \(\mu^{(k)} \to -\infty\), the limit of the 1st element of \(\tilde{q}_1\) with \(\boldsymbol{b}_1 = \widehat{\boldsymbol{a}}_1\) is given as \(\lim_{\mu(k) \to -\infty} (\widetilde{\boldsymbol{q}}_1)_1 = 0\). Taking the limit \(\mu^{(k)} \to -\infty\) for the 2nd element \((\widetilde{q}_1)_2 = -\mu^{(k)} a_{2,1} / \|\widehat{a}_1\|\), we obtain \(\lim_{\mu^{(k)} \to -\infty} (\widetilde{q}_1)_2 = a_{2,1}\). Similarly, for the remaining elements \((\tilde{q}_1)_3, (\tilde{q}_1)_4, ..., (\tilde{q}_1)_m\), it follows that \(\lim_{\mu^{(k)}\to-\infty} (\widetilde{q}_1)_i = a_{i,1}, i = 3, 4, \dots, m\). Thus, as \(\mu^{(k)}\to-\infty\), the 1st column vector of \(\tilde{Q}^{(k)}\) converges to \((0, a_{2,1}, a_{3,1}, \dots, a_{m,1})^{\mathsf{T}}\). Using Lemma 1 for \(\boldsymbol{b}_2 =\)\(\hat{a}_2 - c_{2,1}\), we can evaluate the \(\|b_2\|\) as \(\|b_2\| = (b_2)_2 \sqrt{1 + O((\mu^{(k)})^{-2})}\), \(\mu^{(k)} \to -\infty\). From \((\widetilde{\boldsymbol{q}}_2)_2 = \mu^{(k)} - \mu^{(k)}(\boldsymbol{b}_2)_2/\|\boldsymbol{b}_2\|\), it follows that \(\lim_{\mu^{(k)} \to -\infty} (\widetilde{\boldsymbol{q}}_2)_2 = 0\). Considering the limits of \((\widetilde{\boldsymbol{q}}_2)_1 = -\mu^{(k)}(\boldsymbol{b}_2)_1/\|\boldsymbol{b}_2\|\)and \((\tilde{q}_2)_i = -\mu^{(k)}(\boldsymbol{b}_2)_i / \|\boldsymbol{b}_2\|\), for i = 3, 4, ..., m, as \(\mu^{(k)} \to -\infty\), we obtain \(\lim_{u^{(k)}\to-\infty} (\widetilde{q}_2)_1 = -a_{2,1}\) and \(\lim_{u^{(k)}\to-\infty} (\widetilde{q}_2)_i = a_{i,2}\), respectively. Thus, for \(\widetilde{\boldsymbol{q}}_2\), it holds that \(\lim_{\mu^{(k)}\to-\infty}\widetilde{\boldsymbol{q}}_2=\left(-a_{2,1},0,a_{3,2},a_{4,2},\dots,a_{m,2}\right)^{\mathsf{T}}\). The same approach can be applied to the 3rd, 4th, ..., m-th column vectors of \(\tilde{Q}^{(k)}\), yielding limits \(\lim_{\mu^{(k)} \to -\infty} \widetilde{\boldsymbol{q}}_j = \left(-a_{j,1}, \dots, -a_{j,j-1}, 0, a_{j+1,j}, \dots, a_{m,j}\right)^{\mathsf{T}}\), j =3, 4, ..., m.

Theorem 1 states that a continuous analogue of \(\tilde{Q}^{(k)}\) exists, denoted \(\tilde{Q}(t) = \lim_{\delta^{(k)} \to 0} \tilde{Q}^{(k)}\), and that \(\tilde{Q}(t)\) is a skew-symmetric matrix. From Eq. (1), the derivative of the matrix A(t) is given as the Lax equation:

\[\frac{dA(t)}{dt} = A(t)\tilde{Q}(t) - \tilde{Q}(t)A(t), \quad t \ge 0.\] (8)

3 Discretization of the Lax Equation with Embedded Skewsymmetry

In this section, we first propose a discretization of the Lax equation (8), which is a continuous analogue of the shifted QR method introduced in the previous section. It is based on a matrix factorization that employs skew-symmetric matrices instead of the standard QR factorization. Then, we show that the discretized approach leads to an eigenvalue-preserving transformation.

We introduce an alternative factorization that employs skew-symmetric matrices. We present the following lemma concerning the factorization for a given matrix = (,)1≤,≤ ∈ ℝ×.

Lemma 2. Let be a nonsingular matrix, where is a shift parameter. Then, the matrix can be uniquely factorized as the matrix product:

\[A - \mu I = (I + \delta N)\hat{R},\tag{9}\] where is a skew-symmetric matrix, ̂ is an upper triangular matrix, and parameter ≔ −1/.

Proof. Since the skew-symmetric matrix = (,)1≤,≤ satisfies the relation = − T , the matrix + can be represented as:

\[I + \delta N = \begin{pmatrix} 1 & -\delta n_{2,1} & \cdots & -\delta n_{m,1} \\ \delta n_{2,1} & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & -\delta n_{m,m-1} \\ \delta n_{m,1} & \cdots & \delta n_{m,m-1} & 1 \end{pmatrix}.\](10)

Now, we denote ̂ = (̂,)1≤,≤, where its entries ̂, satisfy ̂, = 0 for > . By observing the 1st column vector on both sides of the factorization (9), we obtain 1,1 − = 1̂,1 and ,1 = ,11̂,1, = 2, 3, … , . The equation 1,1 − = 1̂,1 uniquely determines 1̂,1 from given 1,1 and . The entries 2,1, 3,1, … , ,1 that appear in the 1st column of the skew-symmetric matrix are then uniquely determined, assuming that 1,1 − ≠ 0, from ,1 = ,11̂,1.

Here, we define the leading principal minor of order j of the matrix \(I + \delta N\) in Eq. (10) as the submatrix that is obtained by removing the rows and columns indexed by j + 1, j + 2, ..., m from \(I + \delta N\) as follows:

\[(I+\delta N)_{[1,j]\times[1,j]} \coloneqq \begin{pmatrix} 1 & -\delta n_{2,1} & \cdots & -\delta n_{j,1} \\ \delta n_{2,1} & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & -\delta n_{j,j-1} \\ \delta n_{j,1} & \cdots & \delta n_{i,j-1} & 1 \end{pmatrix}.\]

In the factorization \(A - \mu I = (I + \delta N)\hat{R}\), the system of equations determining the elements of the j-th column of \(\hat{R}\) is given by:

\[(I + \delta N)_{[1,j]\times[1,j]} \hat{\mathbf{r}}_{[1,j]\times j} = \mathbf{a}_{[1,j]\times j}, \quad j = 2, 3, \dots, m,\] (11)

where \(\hat{r}_{[1,j]\times j} := (\hat{r}_{1,j},\hat{r}_{2,j},...,\hat{r}_{j,j})^{\mathsf{T}}\) and \(\boldsymbol{a}_{[1,j]\times j} := (a_{1,j},a_{2,j},...,a_{j,j})^{\mathsf{T}}\). Since the eigenvalues of the skew-symmetric matrix N are either 0 or purely imaginary numbers, it is clear that \(I + \delta N\) is nonsingular, which implies that \(((I + \delta N)_{[1,j]\times[1,j]})^{-1}\) exists. Solving the system of Eq. (11) for \(\hat{r}_{[1,j]\times j}\) yields:

\[\hat{\mathbf{r}}_{[1,j]\times j} = \left( (I + \delta N)_{[1,j]\times[1,j]} \right)^{-1} \mathbf{a}_{[1,j]\times j}, \quad j = 2, 3, \dots, m, \tag{12}\] which uniquely determines the elements of the j-th column of \(\hat{R}\).

To determine the skew-symmetric matrix N, it is sufficient to calculate the entries below the diagonal as these fully specify the matrix. The elements \(n_{j+1,j}, n_{j+2,j}, \ldots, n_{m,j}\) of the j-th column of N can be determined using the following formula:

\[n_{i,j} = \frac{a_{i,j} - \delta \sum_{\ell=1}^{j-1} \hat{r}_{\ell,j} n_{i,\ell}}{\delta \hat{r}_{j,j}}, \quad i = j+1, j+2, \dots, m.\] (13)

Thus, the entries in N and \(\hat{R}\) in the factorization \(A - \mu I = (I + \delta N)\hat{R}\) are uniquely determined by recursively applying the Eq. (12) for the j-th column of \(\hat{R}\) and Eq. (13) for the j-th column of N, for j = 1, 2, ..., m.

Henceforth, we refer to the matrix factorization (9) introduced in Lemma 2 as the NR factorization. Based on the NR factorization, we propose the following matrix update method, consisting of two operations:

1. Given the matrix \(A^{(k)}\) and the shift parameter \(\mu^{(k)}\), perform the NR factorization (9) as follows:

\[A^{(k)} - \mu^{(k)} I = (I + \delta^{(k)} N^{(k)}) \hat{R}^{(k)}, \quad k = 0, 1, \dots,\] (14)

where \(N^{(k)}\) and \(\hat{R}^{(k)}\) are skew-symmetric and upper triangular matrices, respectively.

2. Update the matrix \(A^{(k)}\) to \(A^{(k+1)}\) by computing the matrix product of \(\widehat{R}^{(k)}\) and \(I + \delta^{(k)} N^{(k)}\), and adding back the shift parameter \(\mu^{(k)}\) to the diagonal entries as follows:

\[A^{(k+1)} = \hat{R}^{(k)} (I + \delta^{(k)} N^{(k)}) + \mu^{(k)} I, \quad k = 0, 1, \dots\] (15)

This method, called the shifted NR method, provides a structured approach for matrix updates using NR factorization with skew-symmetric matrices and shift parameters.

The matrix operation that updates \(A^{(k)}\) to \(A^{(k+1)}\), as defined by Eqs. (14) and (15), leads to the following similarity transformation:

\[A^{(k+1)} = (I + \delta^{(k)} N^{(k)})^{-1} A^{(k)} (I + \delta^{(k)} N^{(k)}), \quad k = 0, 1, ...,\] (16)

which preserves the eigenvalues of \(A^{(k)}\). As discussed earlier, where the shifted QR method is linked to the Lax equation (8), we can establish a similar correspondence between the discrete-time index k and continuous-time variable t. Specifically, we let \(A^{(k)} = A(t)\) and \(A^{(k+1)} = A(t + \delta^{(k)})\). Taking the limit as \(\delta^{(k)} \to 0\) in similarity transformation (16), we obtain the derivative of A(t):

\[\frac{dA(t)}{dt} = \lim_{\delta^{(k)} \to 0} \left( A(t) N^{(k)} - N^{(k)} A(t + \delta^{(k)}) \right). \tag{17}\]

We present the following lemma concerning the existence of a continuous analogue for the \(N^{(k)}\).

Lemma 3. Let \(N^{(k)}\) denote the skew-symmetric matrix obtained through the shifted NR method for a given matrix \(A^{(k)} = (a_{i,j})_{1 \le i,j \le m}\) at step k. Then, the limit \(N(t) := \lim_{\mu^{(k)} \to -\infty} N^{(k)}\) exists, and N(t) is a skew-symmetric matrix constructed from the strictly lower triangular entries in \(A^{(k)}\).

Proof. Considering the NR factorization (9), the elements of 1st column vector of the skew-symmetric matrix \(N^{(k)} = (n_{i,j}^{(k)})_{1 \le i,j \le m}\) are expressed as \(n_{i,1}^{(k)} = a_{i,1}/(1+\delta^{(k)}a_{1,1})\), i=2,3,...,m, which immediately leads to:

\[\lim_{S(k) \to 0} n_{i,1}^{(k)} = a_{i,1}, \quad i = 2, 3, \dots, m.\] (18)

Next, for the 2nd column vector of \(N^{(k)}\), using Eqs. (12) and (13), we obtain:

\[n_{i,2}^{(k)} = \frac{a_{i,2} - \delta^{(k)} n_{i,1}^{(k)} \left(a_{1,2} + n_{2,1}^{(k)}\right) - (\delta^{(k)})^2 n_{2,1}^{(k)} \left(n_{i,1}^{(k)} a_{2,2} - n_{2,1}^{(k)} a_{i,2}\right)}{1 + \delta^{(k)} a_{2,2} - (\delta^{(k)})^2 n_{2,1}^{(k)} a_{1,2}}.\]

Taking the limit as \(\delta^{(k)} \to 0\), and using Eq. (18), we have \(\lim_{\delta^{(k)} \to 0} n_{i,2}^{(k)} = a_{i,2}\), i = 3, 4, ..., m. Similarly, by calculating the 3rd, 4th, ..., (m-1)-th columns in \(\hat{R}^{(k)}\) and \(N^{(k)}\), and taking the limit \(\delta^{(k)} \to 0\), we see that the elements of the j-th column of \(N^{(k)}\) can be expressed as:

\[n_{i,j}^{(k)} = \frac{a_{i,j} + O(\delta^{(k)}) + O((\delta^{(k)})^2) + \dots + O((\delta^{(k)})^j)}{1 + O(\delta^{(k)}) + O((\delta^{(k)})^2) + \dots + O((\delta^{(k)})^j)},\]\[i = j + 1, j + 2, \dots, m, \quad j = 3, 4, \dots, m - 1.\]

Thus, by summarizing the results for the 1st through (m-1)-th columns in \(N^{(k)}\), we conclude that, as \(\delta^{(k)} \to 0\), the strictly lower triangular entries \(n^{(k)}_{i,j}\) in \(N^{(k)}\) converge to those in \(A^{(k)}\) as \(\lim_{\delta^{(k)} \to 0} n^{(k)}_{i,j} = a_{i,j}\) for i = j+1, j+2, ..., m and j = 1, 2, ..., m-1.

Lemma 3 guarantees the existence of the matrix \(N(t) = \lim_{\delta^{(k)} \to 0} N^{(k)}\), which can be regarded as a continuous analogue of \(N^{(k)}\). Furthermore, according to Lemma 3, by using Eq. (17) and \(\tilde{Q}(t) = N(t)\), we obtain the Lax equation (8). Therefore, the proposed shifted NR method (14)-(15) can be viewed as a discrete analogue of the Lax equation (8).

The following lemma provides properties related to the convergence of shifted NR factorization method using \(A^{(k)} - \mu^{(k)}I = \widehat{N}^{(k)}\widehat{R}^{(k)}\), where \(\widehat{N}^{(k)} := (I + \delta^{(k)}N^{(k)})\).

Lemma 4. Let () − () be factorizable for any into the matrix product given in Eq. (9). Then, by setting = (0) , the following relation holds:

\[\prod_{\ell=1}^{k} \left( A - \mu^{(\ell-1)} I \right) = \widehat{N}^{(0)} \widehat{N}^{(1)} \cdots \widehat{N}^{(k-1)} \widehat{R}^{(k-1)} \cdots \widehat{R}^{(1)} \widehat{R}^{(0)}, \quad k = 1, 2, \dots (19)\]

Furthermore, the similarity transformation from = (0) to () is expressed as:

\[A^{(k)} = (\widehat{N}^{(0)}\widehat{N}^{(1)} \cdots \widehat{N}^{(k-1)})^{-1} A(\widehat{N}^{(0)}\widehat{N}^{(1)} \cdots \widehat{N}^{(k-1)}), \quad k = 1, 2, \dots.\] (20)

Proof. Noting that − (0) = ̂(0)̂ (0) , it is clear that Eq. (19) holds for = 1. Assuming that Eq. (19) holds for = 1, 2, … , − 1, we obtain:

\[\begin{split} & \prod_{\ell=1}^{s} \left(A - \mu^{(\ell-1)}I\right) \\ & = \widehat{N}^{(0)} \big[ \widehat{R}^{(0)} \widehat{N}^{(0)} + \big(\mu^{(0)} - \mu^{(s-1)}\big)I \big] \widehat{N}^{(1)} \widehat{N}^{(2)} \cdots \widehat{N}^{(s-2)} \widehat{R}^{(s-2)} \cdots \widehat{R}^{(1)} \widehat{R}^{(0)}. \end{split}\]

From Eqs. (14) and (15), it follows that ̂(1)̂(1) + (1) = ̂(0)̂(0) + (0) , which gives:

\[\prod_{\ell=1}^s \left(A - \mu^{(\ell-1)}I\right)\]

\[= \widehat{N}^{(0)} \widehat{N}^{(1)} \left[ \widehat{R}^{(1)} \widehat{N}^{(1)} + \left( \mu^{(1)} - \mu^{(s-1)} \right) I \right] \widehat{N}^{(2)} \widehat{N}^{(3)} \cdots \widehat{N}^{(s-2)} \widehat{R}^{(s-2)} \cdots \widehat{R}^{(1)} \widehat{R}^{(0)}.\]

Proceeding in a similar manner and using the relation ̂(+1)̂(+1) + (+1) = ̂()̂() + () , we can continue transforming the expression iteratively. Therefore, by applying the relation ̂(−1)̂(−1) + (−1) = ̂(−2)̂(−2) + (−2) , we have Eq. (19) with = . Furthermore, from the similarity transformation between () and (+1) given by Eq. (16), we obtain () = (̂(−1) ) −1 (−1)̂(−1) , = 1, 2, …, which immediately leads to Eq. (20). ∎

Employing the results of Lemma 4, we derive the following theorem concerning the convergence of the shifted NR method based on the factorization (9), which utilizes skew-symmetric matrices.

Theorem 2. Let \(A = A^{(0)} \in \mathbb{R}^{m \times m}\) satisfy conditions i and ii as follows:

i. For the shift parameters \(\mu^{(k)}\) and the real eigenvalues \(\lambda_1, \lambda_2, ..., \lambda_m\) of matrix A, the following inequality holds:

\[\left|\lambda_1-\mu^{(k)}\right|>\left|\lambda_2-\mu^{(k)}\right|>\dots>\left|\lambda_m-\mu^{(k)}\right|>0,\quad k=0,1,\dots.\]

ii. Suppose the matrix A is diagonalizable, such that, using orthogonal matrix V, it holds that \(V^{-1}AV = \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_m)\). Then, \(V^{-1}\) admits an LU factorization, namely, \(V^{-1} = \mathcal{L}U\), where \(\mathcal{L}\) and \(\mathcal{U}\) are lower and upper triangular matrices, respectively.

Then, A in the shifted NR method (14)-(15) essentially converges to upper triangular matrix.

Proof. Let the matrix A be expressed as \(A = V\Lambda V^{-1}\), where \(\Lambda \coloneqq \operatorname{diag}(\lambda_1, \lambda_2, \dots, \lambda_m)\). Then, it follows that: \(\prod_{\ell=1}^k (A - \mu^{(\ell-1)}I) = V\Lambda_k V^{-1}\), where \(\Lambda_k \coloneqq \operatorname{diag}(\prod_{\ell=1}^k (\lambda_1 - \mu^{(\ell-1)}), \prod_{\ell=1}^k (\lambda_2 - \mu^{(\ell-1)}), \dots, \prod_{\ell=1}^k (\lambda_m - \mu^{(\ell-1)}))\). Assume that \(V^{-1}\) has an LU factorization as \(V^{-1} = \mathcal{LU}\). Thus, we can express the matrix product as follows:

\[\prod_{\ell=1}^{k} \left( A - \mu^{(\ell-1)} I \right) = V \left( \Lambda_k \mathcal{L} \Lambda_k^{-1} \right) \Lambda_k \mathcal{U}. \tag{21}\]

Since \(\mathcal{L}=(l_{i,j})_{1\leq i,j\leq m}\) is a lower triangular matrix, the matrix \(\Lambda_k\mathcal{L}\Lambda_k^{-1}\) is also lower triangular, with its (i,j)-th entry is \(l_{i,j}(\bar{\lambda}_i/\bar{\lambda}_j)\) for i>j, where \(\bar{\lambda}_i\coloneqq\prod_{\ell=1}^k(\lambda_i-\mu^{(\ell-1)}),\,i=1,2,...,m\). In the case where i>j, since \(\left|\lambda_i-\mu^{(\ell)}\right|<\left|\lambda_j-\mu^{(\ell)}\right|,\,\ell=0,1,...\), for the (i,j)-entries \(\left(\Lambda_k\mathcal{L}\Lambda_k^{-1}\right)_{i,j}\) in the matrix \(\Lambda_k\mathcal{L}\Lambda_k^{-1}\), it follows that:

\[\left|\left(\Lambda_k \mathcal{L} \Lambda_k^{-1}\right)_{i,j}\right| \leq \left|l_{i,j}\right| \left|\frac{\lambda_i - \mu^*}{\lambda_j - \mu^*}\right|, \quad i > j,\] where

\[\left|\frac{\lambda_i - \mu^*}{\lambda_j - \mu^*}\right| \coloneqq \max\left\{\left|\frac{\lambda_i - \mu^{(0)}}{\lambda_j - \mu^{(0)}}\right|, \left|\frac{\lambda_i - \mu^{(1)}}{\lambda_j - \mu^{(1)}}\right|, \dots, \left|\frac{\lambda_i - \mu^{(k-1)}}{\lambda_j - \mu^{(k-1)}}\right|\right\}.\]

Taking the limit as \(k \to \infty\), we find that the matrix \(\Lambda_k \mathcal{L} \Lambda_k^{-1}\) converges to the identity matrix I. Using Eqs. (19) and (21), we obtain \(V(\Lambda_k \mathcal{L} \Lambda_k^{-1}) \Lambda_k \mathcal{U} = \mathcal{N} \mathcal{R}\), where \(\mathcal{N} \coloneqq \widehat{N}^{(0)} \widehat{N}^{(1)} \cdots \widehat{N}^{(k-1)}\) and \(\mathcal{R} \coloneqq \widehat{R}^{(k-1)} \widehat{R}^{(k-2)} \cdots \widehat{R}^{(0)}\). From \(A = V\Lambda V^{-1}\) and \(V = \mathcal{N} \mathcal{R} \left[ (\Lambda_k \mathcal{L} \Lambda_k^{-1}) \Lambda_k \mathcal{U} \right]^{-1}\), the similarity transformation (20) becomes:

\[A^{(k)} = \mathcal{R} \mathcal{U}^{-1} \Lambda_k^{-1} (\Lambda_k \mathcal{L} \Lambda_k^{-1})^{-1} \Lambda (\Lambda_k \mathcal{L} \Lambda_k^{-1}) \Lambda_k \mathcal{U} \mathcal{R}^{-1}.\]

Considering \(\Lambda_k \mathcal{L} \Lambda_k^{-1} \to I\) as \(k \to \infty\), we have \(\lim_{k \to \infty} A^{(k)} = \mathcal{R} \mathcal{U}^{-1} \Lambda \mathcal{U} \mathcal{R}^{-1}\). Since both \(\mathcal{R} \mathcal{U}^{-1}\) and \(\mathcal{U} \mathcal{R}^{-1}\) are upper triangular, their product is also an upper triangular. Therefore, as each step from \(A^{(k)}\) to \(A^{(k+1)}\) in the shifted NR method is a similarity transformation from Lemma 4, the matrix \(A^{(k)}\) converges to an upper triangular form while preserving its eigenvalues.

4 Conclusion

In this study, we first demonstrated that, by appropriately relating the shift parameter \(\mu^{(k)}\) to the discrete parameter \(\delta^{(k)}\), the shifted QR method corresponds to the Lax equation with skew-symmetry as the limit \(\delta^{(k)} \to 0\). This finding not only situates the shifted QR method within the framework of integrable systems, but also establishes a correspondence between the discrete and continuous systems, mediated by the shift parameter.

Second, we proposed a discretization method based on NR factorization as an alternative to the shifted QR method for the Lax equation involving skew-symmetric matrices. This approach, termed the shifted NR method, provides an eigenvalue computation algorithm, which we proved converges to the matrix eigenvalues in the discrete time limit as \(k \to \infty\).

Since both the shifted QR and shifted NR methods correspond to the Lax equation as the limit \(\delta^{(k)} = -1/\mu^{(k)} \to 0\), other discretization methods corresponding to the Lax equations may also yield discrete dynamical systems that converge to eigenvalues. We anticipate further progress in this area through future research, particularly in identifying discretization that are suitable in terms of computational accuracy and efficiency.

Acknowledgments

We are grateful to the reviewers for their comprehensive assessment and helpful comments which enhanced the quality of this paper. This work was supported by the Grant-in-Aid for Early-Career Scientists No. 21K13844 from the Japan Society for the Promotion of Science.

Research Intelligence

Data from OpenAlex ↗

Metrics

0.00
FWCIfield-weighted
7th
Percentilevs same year + field
Article
Work type
Open Access

Semantic Profile AI-classified research signals

Biology 0.72
level 0
level 2
Agronomy 0.45
level 1

Institution Network

References

  1. Bai, Z., Demmel, J., Dongarra, J., Ruhe, A. & van der Vorst, H., Templates for the Solution of Algebraic Eigenvalue Problems, SIAM, 2000.
  2. Saad, Y., Numerical Methods for Large Eigenvalue Problems, Revised Edition, SIAM, 2011.
  3. Golub, G.H. & Van Loan, C.F., Matrix Computations, 4th ed., Johns Hopkins University Press, 2013.
  4. Watkins, D.S., Fundamentals of Matrix Computations, 2nd ed., John Wiley & Sons, Inc., 2002.
  5. Francis, J.G.F., The QR Transformation A Unitary Analogue to the LR Transformation - Part 1, The Computer Journal, 4(1), pp. 265-271, 1961.
  6. Watkins, D.S., Understanding the QR Algorithm, SIAM Review, 24(4), pp. 427-440, 1982.
  7. Grava, T., Gisonni, M., Gubbiotti, G. & Mazzuca, G., Discrete Integrable Systems and Random Lax Matrices, Journal of Statistical Physics, 190, 10, 2023.
  8. Chu, M.T., Lax Dynamics for Cartan Decomposition with Applications to Hamiltonian Simulation, IMA Journal of Numerical Analysis, 44(3), pp. 1406-1434, 2024.
  9. Pu, J. & Chen, Y., Lax Pairs Informed Neural Networks Solving Integrable Systems, Journal of Computational Physics, 510, 113090, 2024.
  10. Symes, W.W., The QR Algorithm and Scattering for the Finite Nonperiodic Toda Lattice, Physica D: Nonlinear Phenomena, 4(2), pp. 275-280, 1982.
  11. Watkins, D.S., Isospectral Flow, SIAM Review, 26(3), pp. 379-391, 1984.
  12. Chu, M.T., A Differential Equation Approach to the Singular Value Decomposition of Bidiagonal Matrices, Linear Algebra and its Applications, 80, pp. 71-79, 1986.
  13. Shinjo, M., Fukuda, A., Kondo, K., Yamamoto, Y., Ishiwata, E., Iwasaki, M. & Nakamura, Y., Discrete Hungry Integrable Systems - 40 Years from the Physica D Paper by W.W. Symes, Physica D: Nonlinear Phenomena, 439, 133422, 2022.
  14. Watkins, D.S., Shifting Strategies for the Parallel QR Algorithm, SIAM Journal of Scientific Computing, 15(4), pp. 953-958, 1994.
  15. Wang, T.L. & Gragg, W.B., Convergence of the Shifted QR Algorithm for Unitary Hessenberg Matrices, Mathematics of Computation, 71(240), pp. 1473-1496, 2002.