StarNEig Library  version v0.1-beta.1
A task-based library for solving nonsymmetric eigenvalue problems
Introduction

StarNEig library aims to provide a full suite of algorithms for solving non-symmetric (generalized) eigenvalue problems. The library is built on top of the StarPU runtime system and targets both shared memory and distributed memory machines. Some components of the library support GPUs.

The four main components of the library are:

  • Hessenberg(-triangular) reduction: A dense matrix (or a dense matrix pencil) is reduced to upper Hessenberg (or Hessenberg-triangular) form.
  • Schur reduction: A upper Hessenberg matrix (or a Hessenberg-triangular matrix pencil) is reduced to (generalized) Schur form. The (generalized) eigenvalues can be determined from the diagonal blocks.
  • Eigenvalue reordering: Reorders a user-selected set of (generalized) eigenvalues to the upper left corner of an updated (generalized) Schur form.
  • Eigenvectors: Computes (generalized) eigenvectors for a user-selected set of (generalized) eigenvalues.

The library has been developed as a part of the NLAFET project. The project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 671633. Support has also been received from eSSENCE, a collaborative e-Science programme funded by the Swedish Government via the Swedish Research Council (VR).

Current status

The library is currently in a beta state and only real arithmetic is supported. In addition, some interface functions are implemented as LAPACK and ScaLAPACK wrappers.

Current status with standard eigenvalue problems:

Component Shared memory Distributed memory Accelerators (GPUs)
Hessenberg reduction Complete ScaLAPACK wrapper Single GPU
Schur reduction Complete Experimental Experimental
Eigenvalue reordering Complete Complete Experimental
Eigenvectors Complete Waiting integration Not planned

Current status with generalized eigenvalue problems:

Component Shared memory Distributed memory Accelerators (GPUs)
Hessenberg-triangular reduction LAPACK wrapper / PlannedScaLAPACK wrapper Not planned
Generalized Schur reduction Complete Experimental Experimental
Generalized eigenvalue reordering Complete Complete Experimental
Generalized eigenvectors Complete Waiting integration Not planned

Known problems

  • Some older OpenMPI versions (pre summer 2017, e.g. <= 2.1.1) have a bug that might lead to a segmentation fault during a parallel AED.
  • OpenBLAS version 0.3.1 has a bug that might lead to an incorrect result.
  • OpenBLAS versions 0.3.3-0.3.5 and MKL 2018.3.222 might lead to poor scalability.
  • StarPU versions 1.2.4 - 1.2.6 and some StarPU 1.3 snapshots cause poor CUDA performance. The problem can be fixed by compiling StarPU with --disable-cuda-memcpy-peer. It is possible that newer version are also effected by this problem.
  • STARPU_LIMIT_CUDA_MEM environmental variable may fix some GPU related memory allocation problems. For example, if the GPU has 6 GB of memory, then setting STARPU_LIMIT_CUDA_MEM=5500 might help.
  • The library has an unsolved memory leak problem with OpenMPI. Only large problem sizes are effected. It is not known whether this problem is related to StarNEig, StarPU, OpenMPI or something else. A memory leak is sometimes accompanied by the following warning:
    mpool.c:38 UCX WARN object 0x2652000 was not returned to mpool ucp_requests
    The problem is known to occur with PMIx 2.2.1, UCX 1.5.0, OpenMPI 3.1.3, and StarPU 1.2.8.
  • If the GPU support is enabled, then the starneig_SEP_SM_Hessenberg() interface function cannot always handle problems that do not fit into GPU's memory. The cause of this problem is is not known.
  • The outputs of the starneig_GEP_SM_Schur() and starneig_GEP_DM_Schur() interface functions are not in the so-called standard format. It is possible that some diagonal entries in the right-hand side output matrix are negative. This will be fixed in the next version of the library.
  • The starneig_GEP_SM_Eigenvectors() interface function may scale the input matrices. This will be fixed in the next version of the library.

The standard eigenvalue problem

Given a square matrix $A$ of size $n \times n$, the standard eigenvalue problem (SEP) consists of finding eigenvalues $\lambda_i \in \mathbb C$ and associated eigenvectors $0 \neq v_i \in \mathbb C^{n}$ such that

\[ A v_i = \lambda_i v_i, \text{ for } i = 1, 2, \dots, n. \]

The eigenvalues are the $n$ (potentially complex) roots of the polynomial $\text{det}(A - \lambda I) = 0$ of degree $n$. There is often a full set of $n$ linearly independent eigenvectors, but if there are multiple eigenvalues (i.e., if $\lambda_{i} = \lambda_{j}$ for some $i \neq j$) then there might not be a full set of independent eigenvectors.

Reduction to Hessenberg form

The dense matrix $A$ is condensed to Hessenberg form by computing a Hessenberg decomposition

\[ A = Q_{1} H Q_{1}^{H}, \]

where $Q_{1}$ is unitary and $H$ is upper Hessenberg. This is done in order to greatly accelerate the subsequent computation of a Schur decomposition since when working on $H$ of size $n \times n$, the amount of work in each iteration of the QR algorithm is reduced from $\mathcal{O}(n^3)$ to $\mathcal{O}(n^2)$ flops.

Reduction to Schur form

Starting from the Hessenberg matrix $H$ we compute a Schur decomposition

\[ H = Q_{2} S Q_{2}^H, \]

where $Q_{2}$ is unitary and $S$ is upper triangular. The eigenvalues of $A$ can now be determined as they appear on the diagonal of $S$, i.e., $\lambda_i = s_{ii}$. For real matrices there is a similar decomposition known as the real Schur decomposition

\[ H = Q_{2} S Q_{2}^T, \]

where $Q_{2}$ is orthogonal and $S$ is upper quasi-triangular with $1\times 1$ and $2 \times 2$ blocks on the diagonal. The $1 \times 1$ blocks correspond to the real eigenvalues and each $2 \times 2$ block corresponds to a pair of complex conjugate eigenvalues.

Eigenvalue reordering and invariant subspaces

Given a subset consisting of $m \leq n$ of the eigenvalues, we can reorder the eigenvalues on the diagonal of the Schur form by constructing a unitary matrix $Q_{3}$ such that

\[ S = Q_{3} \begin{bmatrix} \hat S_{11} & \hat S_{12} \\ 0 & \hat S_{22} \end{bmatrix} Q_{3}^{H} \]

and the eigenvalues of the $m \times m$ block $\hat S_{11}$ are the selected eigenvalues. The first $m$ columns of $Q_{3}$ span an invariant subspace associated with the selected eigenvalues.

Computation of eigenvectors

Given a subset consisting of $m \leq n$ of the eigenvalues $\lambda_{i}$ for $i = 1, 2, \ldots, m$ and a Schur decomposition $A = Q S Q^{H}$, we can compute for each $\lambda_{i}$ an eigenvector $v_{i} \neq 0$ such that $A v_{i} = \lambda_{i} v_{i}$ by first computing an eigenvector $w_{i}$ of $S$ and then transform it back to the original basis by pre-multiplication with $Q$.

The generalized eigenvalue problem

Given a square matrix pencil $A - \lambda B$, where $A, B \in \mathbb{C}^{n \times n}$, the generalized eigenvalue problem (GEP) consists of finding generalized eigenvalues $\lambda_i \in \mathbb C$ and associated generalized eigenvectors $0 \neq v_i \in \mathbb C^{n}$ such that

\[ A v_{i} = \lambda_{i} B v_{i}, \text{ for } i = 1, 2, \dots, n. \]

The eigenvalues are the $n$ (potentially complex) roots of the polynomial $\text{det}(A - \lambda B) = 0$ of degree $n$. There is often a full set of $n$ linearly independent generalized eigenvectors, but if there are multiple eigenvalues (i.e., if $\lambda_{i} = \lambda_{j}$ for some $i \neq j$) then there might not be a full set of independent eigenvectors.

At least in principle, a GEP can be transformed into a SEP provided that $B$ is invertible, since

\[ A v = \lambda B v \Leftrightarrow (B^{-1} A) v = \lambda v. \]

However, in finite precision arithmetic this practice is not recommended.

Reduction to Hessenberg-triangular form

The dense matrix pair $(A, B)$ is condensed to Hessenberg-triangular form by computing a Hessenberg-triangular decomposition

\[ A = Q_{1} H Z_{1}^{H}, \quad B = Q_{1} Y Z_{1}^{H}, \]

where $Q_{1}, Z_{1}$ are unitary, $H$ is upper Hessenberg, and $Y$ is upper triangular. This is done in order to greatly accelerate the subsequent computation of a generalized Schur decomposition.

Reduction to generalized Schur form

Starting from the Hessenberg-triangular pencil $H - \lambda Y$ we compute a generalized Schur decomposition

\[ H = Q_{2} S Z_{2}^H, \quad Y = Q_{2} T Z_{2}^{H}, \]

where $Q_{2}, Z_{2}$ are unitary and $S, T$ are upper triangular. The eigenvalues of $A - \lambda B$ can now be determined from the diagonal element pairs $(s_{ii}, t_{ii})$, i.e., $\lambda_i = s_{ii} / t_{ii}$ (if $t_{ii} \neq 0$). If $s_{ii} \neq 0$ and $t_{ii} = 0$, then $\lambda_{i} = \infty$ is an infinite eigenvalue of the matrix pair $(S,T)$. (If both $s_{ii} = 0$ and $t_{ii} = 0$ for some $i$, then the pencil is singular and the eigenvalues are undetermined; all complex numbers are eigenvalues). For real matrix pairs there is a similar decomposition known as the real generalized Schur decomposition

\[ H = Q_{2} S Z_{2}^T, \quad Y = Q_{2} T Z_{2}^{T}, \]

where $Q_{2}, Z_{2}$ are orthogonal, $S$ is upper quasi-triangular with $1 \times 1$ and $2 \times 2$ blocks on the diagonal, and $T$ is upper triangular. The $1 \times 1$ blocks on the diagonal of $S - \lambda T$ correspond to the real generalized eigenvalues and each $2 \times 2$ block corresponds to a pair of complex conjugate generalized eigenvalues.

Eigenvalue reordering and deflating subspaces

Given a subset consisting of $m \leq n$ of the generalized eigenvalues, we can reorder the generalized eigenvalues on the diagonal of the generalized Schur form by constructing unitary matrices $Q_{3}$ and $Z_{3}$ such that

\[ S - \lambda T = Q_{3} \begin{bmatrix} \hat S_{11} - \lambda \hat T_{11} & \hat S_{12} - \lambda \hat T_{12} \\ 0 & \hat S_{22} - \lambda \hat T_{22} \end{bmatrix} Z_{3}^{H} \]

and the eigenvalues of the $m \times m$ block pencil $\hat S_{11} - \lambda \hat T_{11}$ are the selected generalized eigenvalues. The first $m$ columns of $Z_{3}$ spans a right deflating subspace associated with the selected generalized eigenvalues.

Computation of generalized eigenvectors

Given a subset consisting of $m \leq n$ of the eigenvalues $\lambda_{i}$ for $i = 1, 2, \ldots, m$ and a generalized Schur decomposition $A - \lambda B = Q (S - \lambda T) Z^{H}$, we can compute for each $\lambda_{i}$ a generalized eigenvector $v_{i} \neq 0$ such that $A v_{i} = \lambda_{i} B v_{i}$ by first computing a generalized eigenvector $w_{i}$ of $S - \lambda_{i} T$ and then transform it back to the original basis by pre-multiplication with $Z$.