StarNEig User's Guide  v0.1.8
A task-based library for solving dense nonsymmetric eigenvalue problems
Intra-node execution environment

Interface to configure the intra-node execution environment.

Functions

void starneig_node_init (int cores, int gpus, starneig_flag_t flags)
 Initializes the intra-node execution environment.
 
int starneig_node_initialized ()
 Checks whether the intra-node execution environment is initialized.
 
int starneig_node_get_cores ()
 Returns the number of cores (threads) per MPI rank.
 
void starneig_node_set_cores (int cores)
 Changes the number of CPU cores (threads) to use per MPI rank.
 
int starneig_node_get_gpus ()
 Returns the number of GPUs per MPI rank.
 
void starneig_node_set_gpus (int gpus)
 Changes the number of GPUs to use per MPI rank.
 
void starneig_node_finalize ()
 Deallocates resources associated with the intra-node configuration.
 

Library initialization flags

typedef unsigned starneig_flag_t
 Library initialization flag data type.
 
#define STARNEIG_USE_ALL   -1
 Use all resources.
 
#define STARNEIG_DEFAULT   0x0
 Default mode.
 
#define STARNEIG_HINT_SM   0x0
 Shared memory mode.
 
#define STARNEIG_HINT_DM   0x1
 Distributed memory mode.
 
#define STARNEIG_FXT_DISABLE   0x2
 No FxT traces mode.
 
#define STARNEIG_AWAKE_WORKERS   0x4
 Awake worker mode.
 
#define STARNEIG_AWAKE_MPI_WORKER   0x8
 Awake MPI worker mode.
 
#define STARNEIG_FAST_DM   (STARNEIG_HINT_DM | STARNEIG_AWAKE_WORKERS | STARNEIG_AWAKE_MPI_WORKER)
 Fast distributed memory mode.
 
#define STARNEIG_NO_VERBOSE   0x10
 No verbose mode.
 
#define STARNEIG_NO_MESSAGES   (STARNEIG_NO_VERBOSE | 0x20)
 No messages mode.
 

Pinned host memory

void starneig_node_enable_pinning ()
 Enables CUDA host memory pinning.
 
void starneig_node_disable_pinning ()
 Disables CUDA host memory pinning.
 

Detailed Description

Interface to configure the intra-node execution environment.

Macro Definition Documentation

◆ STARNEIG_USE_ALL

#define STARNEIG_USE_ALL   -1

Use all resources.

Tells StarNEig to use all available CPU cores / GPUs.

Examples
gep_dm_full_chain.c, gep_sm_eigenvectors.c, gep_sm_full_chain.c, sep_dm_full_chain.c, sep_sm_eigenvectors.c, and sep_sm_full_chain.c.

◆ STARNEIG_DEFAULT

#define STARNEIG_DEFAULT   0x0

Default mode.

By default, the library configures itself for shared memory mode.

◆ STARNEIG_HINT_SM

#define STARNEIG_HINT_SM   0x0

Shared memory mode.

Initializes the library for shared memory computation. The library will automatically reconfigure itself for distributed memory computation if necessary.

Examples
gep_sm_eigenvectors.c, gep_sm_full_chain.c, sep_sm_eigenvectors.c, and sep_sm_full_chain.c.

◆ STARNEIG_HINT_DM

#define STARNEIG_HINT_DM   0x1

Distributed memory mode.

Initializes the library for distributed memory computation. The library will automatically reconfigure itself for shared memory computation if necessary.

Examples
sep_dm_full_chain.c.

◆ STARNEIG_FXT_DISABLE

#define STARNEIG_FXT_DISABLE   0x2

No FxT traces mode.

Disables FxT traces.

Attention
This flag does not work reliably with all StarPU versions.

◆ STARNEIG_AWAKE_WORKERS

#define STARNEIG_AWAKE_WORKERS   0x4

Awake worker mode.

Keeps the StarPU worker threads awake between interface function calls. Improves the performance in certain situations but can interfere with other software.

Examples
gep_sm_full_chain.c.

◆ STARNEIG_AWAKE_MPI_WORKER

#define STARNEIG_AWAKE_MPI_WORKER   0x8

Awake MPI worker mode.

Keeps the StarPU-MPI communication thread awake between interface function calls. Improves the performance in certain situations but can interfere with other software.

◆ STARNEIG_FAST_DM

#define STARNEIG_FAST_DM   (STARNEIG_HINT_DM | STARNEIG_AWAKE_WORKERS | STARNEIG_AWAKE_MPI_WORKER)

Fast distributed memory mode.

Keeps the worker threads and StarPU-MPI communication thread awake between interface function calls. Improves the performance in certain situations but can interfere with other software.

Examples
gep_dm_full_chain.c.

◆ STARNEIG_NO_VERBOSE

#define STARNEIG_NO_VERBOSE   0x10

No verbose mode.

Disables all additional verbose messages.

◆ STARNEIG_NO_MESSAGES

#define STARNEIG_NO_MESSAGES   (STARNEIG_NO_VERBOSE | 0x20)

No messages mode.

Disables all messages (including verbose messages).
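
The flag values documented above are bit masks, so they can be combined with bitwise OR and inspected with bitwise AND. The following is a minimal sketch of that composition, not code taken from the library. The macro values are copied from this page and redefined locally only so that the snippet compiles on its own; real code would presumably take them from the StarNEig header. The helpers has_flag and describe_flags are hypothetical names introduced for illustration.

```c
#include <stdio.h>

/* Values mirrored from the macro list on this page. Assumption: real code
 * obtains these from the StarNEig header instead of redefining them. */
typedef unsigned starneig_flag_t;
#define STARNEIG_DEFAULT          0x0
#define STARNEIG_HINT_SM          0x0
#define STARNEIG_HINT_DM          0x1
#define STARNEIG_FXT_DISABLE      0x2
#define STARNEIG_AWAKE_WORKERS    0x4
#define STARNEIG_AWAKE_MPI_WORKER 0x8
#define STARNEIG_FAST_DM \
    (STARNEIG_HINT_DM | STARNEIG_AWAKE_WORKERS | STARNEIG_AWAKE_MPI_WORKER)
#define STARNEIG_NO_VERBOSE       0x10
#define STARNEIG_NO_MESSAGES      (STARNEIG_NO_VERBOSE | 0x20)

/* Hypothetical helper: non-zero if every bit of `mask` is set in `flags`.
 * STARNEIG_DEFAULT and STARNEIG_HINT_SM are both 0x0, so they cannot be
 * detected this way; shared memory is simply the absence of the
 * STARNEIG_HINT_DM bit. */
int has_flag(starneig_flag_t flags, starneig_flag_t mask)
{
    return (flags & mask) == mask;
}

/* Hypothetical helper: prints which documented modes a flag word requests. */
void describe_flags(starneig_flag_t flags)
{
    printf("memory hint:        %s\n",
        has_flag(flags, STARNEIG_HINT_DM) ? "distributed" : "shared");
    printf("workers kept awake: %s\n",
        has_flag(flags, STARNEIG_AWAKE_WORKERS) ? "yes" : "no");
    printf("MPI thread awake:   %s\n",
        has_flag(flags, STARNEIG_AWAKE_MPI_WORKER) ? "yes" : "no");
    printf("verbose messages:   %s\n",
        has_flag(flags, STARNEIG_NO_VERBOSE) ? "off" : "on");
}
```

Calling describe_flags(STARNEIG_FAST_DM | STARNEIG_NO_VERBOSE) would report a distributed memory hint, both kinds of threads kept awake, and verbose messages off. Note that STARNEIG_NO_MESSAGES already contains the STARNEIG_NO_VERBOSE bit, so the two never need to be combined explicitly.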

Function Documentation

◆ starneig_node_init()

void starneig_node_init (int cores, int gpus, starneig_flag_t flags)

Initializes the intra-node execution environment.

The interface function initializes StarPU (and cuBLAS) and pauses all worker threads. The cores argument specifies the total number of CPU cores to use. In distributed memory mode, one CPU core is automatically allocated for the StarPU-MPI communication thread. One or more CPU cores are automatically allocated for GPU devices.

Parameters
[in]  cores  The number of cores (threads) to use per MPI rank. Can be set to STARNEIG_USE_ALL, in which case the library uses all available cores.
[in]  gpus   The number of GPUs to use per MPI rank. Can be set to STARNEIG_USE_ALL, in which case the library uses all available GPUs.
[in]  flags  Initialization flags.
Examples
gep_dm_full_chain.c, gep_sm_eigenvectors.c, gep_sm_full_chain.c, sep_dm_full_chain.c, sep_sm_eigenvectors.c, and sep_sm_full_chain.c.
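
The description of starneig_node_init implies a simple budget: some of the requested cores are set aside before any CPU worker runs. The sketch below works through that arithmetic. It is not StarNEig code; the function name cpu_worker_cores and the cores_per_gpu parameter are hypothetical, and because the text only says that "one or more" cores are reserved for GPU devices, the per-GPU count is an assumption supplied by the caller.

```c
/* Hypothetical helper, not part of StarNEig: estimates how many of the
 * requested cores remain for CPU workers.
 *   cores          explicit core count passed to starneig_node_init
 *   gpus           explicit GPU count passed to starneig_node_init
 *   distributed    non-zero if the library runs in distributed memory mode
 *   cores_per_gpu  ASSUMED number of cores reserved per GPU (>= 1)
 * Returns 0 if the reservations consume every requested core. */
int cpu_worker_cores(int cores, int gpus, int distributed, int cores_per_gpu)
{
    int reserved = gpus * cores_per_gpu;  /* cores driving GPU devices */
    if (distributed)
        reserved += 1;                    /* StarPU-MPI communication thread */
    return cores > reserved ? cores - reserved : 0;
}
```

Under these assumptions, cpu_worker_cores(28, 2, 1, 1) evaluates to 25: two cores drive the GPUs and one runs the communication thread. The helper expects explicit counts and does not interpret STARNEIG_USE_ALL.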

◆ starneig_node_initialized()

int starneig_node_initialized ( )

Checks whether the intra-node execution environment is initialized.

Returns
Non-zero if the environment is initialized, 0 otherwise.

◆ starneig_node_get_cores()

int starneig_node_get_cores ( )

Returns the number of cores (threads) per MPI rank.

Returns
The number of cores (threads) per MPI rank.

◆ starneig_node_set_cores()

void starneig_node_set_cores ( int  cores)

Changes the number of CPU cores (threads) to use per MPI rank.

Parameters
cores  The number of CPU cores (threads) to use per MPI rank.

◆ starneig_node_get_gpus()

int starneig_node_get_gpus ( )

Returns the number of GPUs per MPI rank.

Returns
The number of GPUs per MPI rank.

◆ starneig_node_set_gpus()

void starneig_node_set_gpus ( int  gpus)

Changes the number of GPUs to use per MPI rank.

Parameters
gpus  The number of GPUs to use per MPI rank.

◆ starneig_node_enable_pinning()

void starneig_node_enable_pinning ( )

Enables CUDA host memory pinning.

Should be called before any memory allocations are made.

◆ starneig_node_disable_pinning()

void starneig_node_disable_pinning ( )

Disables CUDA host memory pinning.

Should be called before any memory allocations are made.