StarNEig Library
version v0.1-beta.1
A task-based library for solving nonsymmetric eigenvalue problems
The STARNEIG_HINT_DM initialization flag tells the library to configure itself for distributed memory computation. The flag is intended to be only a hint, and the library automatically reconfigures itself for the correct computation mode. A user may mix shared memory and distributed memory functions without reinitializing the library. The library is intended to be run in a hybrid configuration, in which each MPI rank is mapped to several CPU cores. Failing to do so leads to CPU core oversubscription. It is generally a good idea to map each MPI rank to a full node or to a NUMA island / CPU socket.
The library assumes that the MPI library is already initialized when the starneig_node_init() interface function is called with the STARNEIG_HINT_DM flag, or when the library reconfigures itself for distributed memory after a user has called a distributed memory interface function. The MPI library should be initialized in either the serialized mode (MPI_THREAD_SERIALIZED) or the multi-threaded mode (MPI_THREAD_MULTIPLE).
A user is allowed to change the library MPI communicator with the starneig_mpi_set_comm() interface function. This interface function should be called before the library is initialized.
Distributed matrices are represented using two opaque objects: a data distribution and a distributed matrix.
Each matrix is divided into rectangular blocks of uniform size; only the blocks in the last block row and the last block column may be smaller.
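The partitioning described above can be sketched with a few lines of integer arithmetic. This is an illustrative sketch only: the helper names `block_count` and `block_size` are hypothetical and are not part of the StarNEig interface.

```c
/* Number of blocks needed to cover n rows (or columns) when the
 * uniform block size is b. */
int block_count(int n, int b)
{
    return (n + b - 1) / b;
}

/* Size of block k (0-based) along one dimension: every block has size b
 * except possibly the last one, which holds whatever remains. */
int block_size(int n, int b, int k)
{
    int remaining = n - k * b;
    return remaining < b ? remaining : b;
}
```

For instance, a dimension of length 10 with block size 3 is covered by four blocks, the last of which has size 1.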
The blocks are indexed using a two-dimensional index space. A data distribution encapsulates an arbitrary mapping from this two-dimensional block index space to the one-dimensional MPI rank space:
In the above example, rank 0 owns the blocks (0,1), (1,2), (1,5), (1,6), (2,6), (3,0) and (3,5). Naturally, a data distribution can also describe a two-dimensional block cyclic distribution, which is very common with ScaLAPACK subroutines.
A data distribution can be created using one of the following interface functions:
For example,
would create a two-dimensional block cyclic distribution with 4 rows and 6 columns in the mesh. Alternatively, a user can create an equivalent data distribution using the starneig_distr_init_func() interface function:
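To make the equivalence concrete, the following sketch shows the kind of mapping function that describes a two-dimensional block cyclic distribution on a 4 x 6 mesh. The structure `mesh_arg`, the function name `block_cyclic_owner`, the callback shape (block row, block column, opaque argument) and the row-major numbering of ranks within the mesh are all assumptions made for illustration; the exact signature expected by starneig_distr_init_func() should be checked against the library reference.

```c
/* Hypothetical argument structure describing the process mesh. */
struct mesh_arg {
    int rows;   /* number of rows in the mesh */
    int cols;   /* number of columns in the mesh */
};

/* Maps the block index (i, j) to the rank that owns the block.
 * Ranks are assumed to be numbered row by row within the mesh. */
int block_cyclic_owner(int i, int j, void *arg)
{
    const struct mesh_arg *mesh = arg;
    return (i % mesh->rows) * mesh->cols + (j % mesh->cols);
}
```

With `struct mesh_arg mesh = { 4, 6 };`, block (1,2) maps to rank 8, and block (4,6) wraps around to rank 0.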
A data distribution is destroyed with the starneig_distr_destroy() interface function.
A distributed matrix is created using the starneig_distr_matrix_create() interface function. The function call will automatically allocate the required local resources. For example,
would create a double-precision real matrix that is distributed in a two-dimensional block cyclic fashion in blocks of the specified size. Or,
would create a double-precision real matrix with a default data distribution (NULL argument) and a default block size (-1, -1).
A user may access the locally owned blocks using the starneig_distr_matrix_get_blocks() interface function. A distributed matrix is destroyed using the starneig_distr_matrix_destroy() interface function. This will deallocate all local resources. See module Distributed Memory / Distributed matrices for further information.
An entire distributed matrix can be copied with the starneig_distr_matrix_copy() interface function:
This copies distributed matrix dB to a distributed matrix dA. A region (submatrix) of a distributed matrix can be copied to a second distributed matrix using the starneig_distr_matrix_copy_region() interface function.
A local matrix can be converted to a "single owner" distributed matrix with the starneig_distr_matrix_create_local() interface function:
This creates a wrapper object, i.e., the pointer A and the distributed matrix lA point to the same data on the owner node. The created distributed matrix is associated with a data distribution that indicates that the whole matrix is owned by the node owner. The used block size is .
Copying from a "single owner" distributed matrix to a distributed matrix performs a scatter operation and copying from a distributed matrix to a "single owner" distributed matrix performs a gather operation.
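The scatter and gather semantics can be illustrated without MPI by simulating the ranks inside a single process. The sketch below is not how the library performs the copy; it only shows which elements end up on which rank under a two-dimensional block cyclic distribution. The function names, the packed per-rank buffers, the column-major storage and the row-major rank numbering are assumptions made for illustration.

```c
#include <stddef.h>

/* Rank that owns block (bi, bj) in a rows x cols process mesh.
 * Row-major rank numbering is an assumption of this sketch. */
static int block_owner(int bi, int bj, int rows, int cols)
{
    return (bi % rows) * cols + (bj % cols);
}

/* Scatter: distribute the elements of the m x n column-major matrix src
 * (held by a single owner) into one packed buffer per rank. On return,
 * counts[r] holds the number of elements placed in parts[r]. */
void scatter(const double *src, int m, int n, int bm, int bn,
             int rows, int cols, double **parts, size_t *counts)
{
    for (int r = 0; r < rows * cols; r++)
        counts[r] = 0;
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            int r = block_owner(i / bm, j / bn, rows, cols);
            parts[r][counts[r]++] = src[(size_t)j * m + i];
        }
    }
}

/* Gather: the inverse operation. Elements are read back from the per-rank
 * buffers in the same traversal order and written to dst on the single
 * owner. cursor is scratch space with one entry per rank. */
void gather(double *dst, int m, int n, int bm, int bn,
            int rows, int cols, double **parts, size_t *cursor)
{
    for (int r = 0; r < rows * cols; r++)
        cursor[r] = 0;
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            int r = block_owner(i / bm, j / bn, rows, cols);
            dst[(size_t)j * m + i] = parts[r][cursor[r]++];
        }
    }
}
```

Scattering a matrix and then gathering it with the same block size and mesh reproduces the original matrix, which mirrors copying to a distributed matrix and back to a "single owner" distributed matrix.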
The library provides a ScaLAPACK compatibility layer:
A two-dimensional block cyclic data distribution can be converted to a BLACS context and vice versa using the starneig_distr_to_blacs_context() and starneig_blacs_context_to_distr() interface functions, respectively. Similarly, a distributed matrix that uses a two-dimensional block cyclic data distribution can be converted to a BLACS descriptor (and a local buffer) and vice versa using the starneig_distr_matrix_to_blacs_descr() and starneig_blacs_descr_to_distr_matrix() interface functions, respectively. The conversion is performed in-place and a user is allowed to mix StarNEig interface functions with ScaLAPACK style subroutines/functions without reconversion.
For example,
converts a distributed matrix dA to a BLACS descriptor descr_a and a local pointer local_a. The descriptor and the local array are then fed to a ScaLAPACK subroutine. A user must make sure that the lifetime of the distributed matrix dA is at least as long as the lifetime of the matching BLACS descriptor descr_a. See modules ScaLAPACK compatibility / BLACS helpers and ScaLAPACK compatibility / BLACS matrices for further information.
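When working with a local buffer in the ScaLAPACK style, it is often useful to know how many rows or columns of the global matrix a given process holds. The sketch below re-implements the counting rule of the ScaLAPACK tool function NUMROC in plain C for illustration. It is a standalone helper under the name `local_count`, which is hypothetical and not part of StarNEig or ScaLAPACK.

```c
/* Number of rows (or columns) of a block cyclically distributed dimension
 * that are stored on one process.
 *   n        global number of rows (or columns)
 *   nb       block size along this dimension
 *   iproc    coordinate of the process along this mesh dimension
 *   isrcproc coordinate of the process that owns the first block
 *   nprocs   number of processes along this mesh dimension */
int local_count(int n, int nb, int iproc, int isrcproc, int nprocs)
{
    /* distance of this process from the one owning the first block */
    int dist = (nprocs + iproc - isrcproc) % nprocs;
    int full_blocks = n / nb;

    /* every process gets at least this many complete rounds of blocks */
    int count = (full_blocks / nprocs) * nb;

    /* the leftover full blocks go to the first processes in order, and
     * the trailing partial block goes to the next process after them */
    int extra_blocks = full_blocks % nprocs;
    if (dist < extra_blocks)
        count += nb;
    else if (dist == extra_blocks)
        count += n % nb;
    return count;
}
```

For example, with 10 rows, a block size of 3 and two process rows, the first process row holds 6 rows and the second holds 4.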