SLATE 2024.05.31
Software for Linear Algebra Targeting Exascale
OMP_NUM_THREADS
Sets the number of OpenMP threads per MPI rank.
CUDA_VISIBLE_DEVICES (for CUDA)
ROCR_VISIBLE_DEVICES (for HIP/ROCm)
Sets which GPUs are visible to the program. For example, to make only GPU 0 visible:
export CUDA_VISIBLE_DEVICES=0
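The same pattern carries over to HIP/ROCm systems and to more than one device. A brief sketch, assuming the comma-separated list of device indices that both runtimes accept (the indices shown are illustrative, not a recommendation):

```shell
# HIP/ROCm: make only GPU 0 visible.
export ROCR_VISIBLE_DEVICES=0

# CUDA: make GPUs 0 and 1 visible.
export CUDA_VISIBLE_DEVICES=0,1
```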
SLATE_GPU_AWARE_MPI
Setting to 1 enables use of GPU-aware MPI within SLATE. If the MPI library is not actually GPU-aware, this will cause segfaults.
Uses 4 MPI ranks, GPU-aware MPI enabled, with 10 OpenMP threads and 1 GPU per MPI rank. (Output abbreviated.)
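A launch matching that description could be set up with a small per-rank wrapper script. This is only a sketch, not the command from the SLATE documentation: the script name `rank_env.sh` and the application name `my_app` are hypothetical. It assumes the launcher exports a node-local rank index (Open MPI sets `OMPI_COMM_WORLD_LOCAL_RANK`, Slurm's `srun` sets `SLURM_LOCALID`), and that all 4 ranks run on one node with at least 4 GPUs.

```shell
#!/bin/sh
# rank_env.sh (hypothetical name): per-rank environment for a hybrid
# MPI + OpenMP + GPU run.
# Assumption: the launcher exports a node-local rank index; fall back to 0
# when neither variable is set (e.g. a single-process run).
local_rank="${OMPI_COMM_WORLD_LOCAL_RANK:-${SLURM_LOCALID:-0}}"

export OMP_NUM_THREADS=10                  # 10 OpenMP threads per MPI rank
export CUDA_VISIBLE_DEVICES="$local_rank"  # 1 GPU per rank; use ROCR_VISIBLE_DEVICES for HIP/ROCm
export SLATE_GPU_AWARE_MPI=1               # set only if the MPI library is GPU-aware

# Run the wrapped command, if one was given.
if [ "$#" -gt 0 ]; then
    exec "$@"
fi
```

With Open MPI, this might be started as `mpirun -np 4 ./rank_env.sh ./my_app`. Many job launchers can also assign one GPU per rank directly, in which case the `CUDA_VISIBLE_DEVICES` line is unnecessary.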