StarNEig User's Guide  master branch
A task-based library for solving dense nonsymmetric eigenvalue problems
Known problems and changelog

Known problems

Known compatibility problems

BLAS

  • With some OpenBLAS versions, the OPENBLAS_NUM_THREADS environment variable must be set to 1 (export OPENBLAS_NUM_THREADS=1).
  • Some MKL versions can cause poor scalability. The problem appears to be related to Intel's OpenMP library. Setting the KMP_AFFINITY environment variable to disabled fixes the problem (export KMP_AFFINITY=disabled).
  • OpenBLAS version 0.3.1 has a bug that can cause an incorrect result.
  • OpenBLAS versions 0.3.3-0.3.5 can cause poor scalability.
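
The two environment variable workarounds above can be collected into a small wrapper script that is run before the application starts. The following is only a minimal sketch: the program name ./my_starneig_program is a hypothetical placeholder, and each setting is needed only with the affected OpenBLAS or MKL versions.

```shell
#!/bin/sh
# Workaround for some OpenBLAS versions: restrict OpenBLAS to one thread.
export OPENBLAS_NUM_THREADS=1

# Workaround for some MKL versions: disable the Intel OpenMP affinity interface.
export KMP_AFFINITY=disabled

# Print the settings so that they can be verified in the job log.
echo "OPENBLAS_NUM_THREADS=$OPENBLAS_NUM_THREADS KMP_AFFINITY=$KMP_AFFINITY"

# Hypothetical placeholder: replace with the actual program to run.
# ./my_starneig_program
```

Exporting the variables in a wrapper, rather than in an interactive shell, keeps the workaround attached to the job that needs it.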

MPI

  • Some older OpenMPI versions (<= 2.1.1) have a bug that can cause a segmentation fault during a parallel AED.
  • The library has an unresolved memory leak problem with OpenMPI. Only large problem sizes are affected. It is not known whether this problem is related to StarNEig, StarPU, OpenMPI, or something else. The problem is known to occur with PMIx 2.2.1, UCX 1.5.0, OpenMPI 3.1.3, and StarPU 1.2.8. The memory leak is sometimes accompanied by the following warning:
    mpool.c:38 UCX WARN object 0x2652000 was not returned to mpool ucp_requests
  • The test program can trigger the following bug in UCX 1.6.1: https://github.com/openucx/ucx/issues/4525

StarPU

  • For optimal CUDA performance, a StarPU version newer than 1.3.3 is recommended.
  • StarPU versions 1.2.4 - 1.2.8 and some StarPU 1.3 snapshots can cause poor CUDA performance. The problem can be fixed by compiling StarPU with --disable-cuda-memcpy-peer. It is possible that newer versions of StarPU are also affected by this problem.
  • The STARPU_MINIMUM_AVAILABLE_MEM and STARPU_TARGET_AVAILABLE_MEM environment variables can be used to fix some GPU-related memory allocation problems:
    STARPU_MINIMUM_AVAILABLE_MEM=10 STARPU_TARGET_AVAILABLE_MEM=15 ...
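
The same memory settings can also be exported once in a job script so that every subsequent command inherits them. This is a minimal sketch that reuses the values 10 and 15 from the example above. The ordering check is an assumption of this sketch, not something StarPU or StarNEig is stated to require, and ./my_starneig_program is a hypothetical placeholder.

```shell
#!/bin/sh
# Values taken from the example above.
export STARPU_MINIMUM_AVAILABLE_MEM=10
export STARPU_TARGET_AVAILABLE_MEM=15

# Assumption of this sketch: warn if the target is set below the minimum,
# since such a combination is most likely a typo.
if [ "$STARPU_TARGET_AVAILABLE_MEM" -lt "$STARPU_MINIMUM_AVAILABLE_MEM" ]; then
    echo "warning: STARPU_TARGET_AVAILABLE_MEM is below STARPU_MINIMUM_AVAILABLE_MEM" >&2
fi

# Hypothetical placeholder: replace with the actual program to run.
# ./my_starneig_program
```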

Changelog

Planned for v0.2.0-beta.1:

  • New experimental Hessenberg reduction code with distributed memory and multi-GPU support.
  • Fix a bug that may cause the code to hang in distributed memory Schur reduction.
  • Remove deprecated interface functions.
  • Improved performance models.
  • Disable OpenCL workers.
  • Rename the aed_shift_count parameter to shift_count and the default value STARNEIG_SCHUR_DEFAULT_AED_SHIFT_COUNT to STARNEIG_SCHUR_DEFAULT_SHIFT_COUNT.
  • Rename the STARPU_LIBRARIES_BASE and STARPU_LIBRARIES_MPI environment variables to STARPU_LIBRARIES and STARPU_MPI_LIBRARIES, respectively.
  • Updates to the documentation.
  • Add deb packages.
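
Build scripts that still set the old StarPU variable names could carry their values over to the new names along the following lines. This is only a sketch: the helper function migrate_starpu_vars is hypothetical and is not part of StarNEig, and it assumes that the new variables accept the same values as the old ones.

```shell
#!/bin/sh
# Hypothetical helper: copy values from the old variable names to the new
# ones, but only when the new names are not already set.
migrate_starpu_vars() {
    if [ -z "${STARPU_LIBRARIES:-}" ] && [ -n "${STARPU_LIBRARIES_BASE:-}" ]; then
        export STARPU_LIBRARIES="$STARPU_LIBRARIES_BASE"
    fi
    if [ -z "${STARPU_MPI_LIBRARIES:-}" ] && [ -n "${STARPU_LIBRARIES_MPI:-}" ]; then
        export STARPU_MPI_LIBRARIES="$STARPU_LIBRARIES_MPI"
    fi
}

migrate_starpu_vars
```

Values that are already set under the new names take precedence, so the helper can be called unconditionally.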

v0.1.0:

  • First stable release of the library.

v0.1-beta.6:

v0.1-beta.5:

  • Improve the performance of the Hessenberg reduction phase by limiting the number of submitted tasks. This should reduce the task scheduling overhead.
  • Allocate pinned memory by default when CUDA support is enabled. Add starneig_enable_pinning() and starneig_disable_pinning().

v0.1-beta.4:

  • Fix a problem where infinite eigenvalues were detected too late.
  • Add an option to choose between the norm stable deflation condition (STARNEIG_SCHUR_NORM_STABLE_THRESHOLD) and the LAPACK style deflation condition (STARNEIG_SCHUR_LAPACK_THRESHOLD).

v0.1-beta.3:

  • Re-implemented Hessenberg reduction.

v0.1-beta.2:

  • Fix an installation-related bug.
  • Fix an MPI-related compile error.
  • Remove unused code.

v0.1-beta.1:

  • First beta release of the library.