Ask Dr. Zhou for access to the NSF MRI GPU cluster (DeepBlizzard), then submit this form.

DeepBlizzard is more powerful than our AC GPU Cluster; however, it is not as convenient as our lab GPU computers or the AC GPU Cluster, because you cannot debug your program on DeepBlizzard.
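As a rough illustration (not part of the official cluster documentation), one common pattern is to keep a small "debug mode" in your script so you can shake out bugs on a lab GPU machine before running the full job on DeepBlizzard. The DEBUG_RUN environment variable and the specific settings below are hypothetical names chosen for this sketch:

```python
import os

# Hypothetical workflow: debug on a lab GPU machine or the AC cluster with a
# tiny configuration, then run the full job unattended on DeepBlizzard.
# DEBUG_RUN is an assumed variable name, not something DeepBlizzard defines.
DEBUG = os.environ.get("DEBUG_RUN", "0") == "1"

num_epochs = 1 if DEBUG else 100       # tiny run while debugging locally
subset_size = 256 if DEBUG else None   # small data subset while debugging

def main():
    print(f"debug={DEBUG}, epochs={num_epochs}, subset={subset_size}")
    # ... load data, build model, train ...

if __name__ == "__main__":
    main()
```

Run with DEBUG_RUN=1 on your lab machine while developing, and without it when you submit the job to DeepBlizzard.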

Download MRI_CLUSTER_HOW_TO.docx

Download MTU_MRI_User_Mannual.pdf

DeepBlizzard consists of one head node with an AMD EPYC 7343 3.2 GHz processor (128 MB cache) and 256 GB of RDIMM memory, plus a storage array with 120 TB of SAS HDD and 30 TB of SSD storage. The cluster includes 3 high-memory A30 GPU nodes (dual-CPU nodes with a total of 64 cores and a terabyte of RAM each), and each of these nodes contains 2 NVIDIA Ampere A30 GPU cards. An A30 GPU has 3584 CUDA cores, 224 Tensor cores, and 24 GB of HBM2 memory. DeepBlizzard also has another 2 A30 GPU nodes, bringing the total to 10 A30 GPUs.
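To confirm which GPUs your job actually landed on (A30 or A100), a short check like the following can help. This is a minimal sketch that assumes PyTorch with CUDA support is available in your environment; it is not taken from the cluster documentation:

```python
import torch

# Print the GPUs visible to this job so you can verify the node type
# (e.g., "NVIDIA A30" with ~24 GB vs. "NVIDIA A100" with ~80 GB).
if not torch.cuda.is_available():
    print("No CUDA devices visible to this job.")
else:
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, "
              f"{props.total_memory / 1024**3:.0f} GB memory, "
              f"{props.multi_processor_count} SMs")
```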

Furthermore, DeepBlizzard has 7 A100 GPU nodes, each containing 4 A100 GPUs, for a total of 28 A100 GPUs. An A100 GPU has 6912 CUDA cores, 432 Tensor cores, and 80 GB of HBM2e memory with a bandwidth of 1555 GB/sec. Within each GPU node, the GPUs are connected through third-generation NVLink bridges, yielding a total bandwidth of up to 112.5 GB/sec on the A30 nodes and up to 600 GB/sec on the A100 nodes. The GPUs communicate with the host CPUs over PCIe Gen 4, which delivers 64 GB/sec. Each GPU node is hosted by two CPUs, each with 32 physical cores (64 logical cores). Finally, to obtain the best results from these multiple GPUs, a high-performance 40-port HDR InfiniBand interconnect handles GPU-to-GPU communication across nodes.
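In practice, most frameworks reach NVLink within a node and the InfiniBand fabric across nodes through NCCL, so no special code is needed to use either path. The sketch below assumes PyTorch and a launcher such as torchrun that sets RANK, WORLD_SIZE, and LOCAL_RANK; it simply performs an all-reduce across all participating GPUs, with NCCL choosing the transport:

```python
import os
import torch
import torch.distributed as dist

# Minimal multi-GPU communication sketch (assumes PyTorch and that the job
# launcher, e.g. torchrun, sets RANK / WORLD_SIZE / LOCAL_RANK). The NCCL
# backend routes traffic over NVLink between GPUs in a node and over
# InfiniBand between nodes.

def main():
    dist.init_process_group(backend="nccl")   # reads env vars set by the launcher
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each rank contributes a tensor; all_reduce sums them across all GPUs.
    x = torch.ones(4, device="cuda") * dist.get_rank()
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    print(f"rank {dist.get_rank()}/{dist.get_world_size()}: {x.tolist()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

For example, `torchrun --nproc_per_node=4 script.py` would run this across the 4 A100 GPUs of a single node; the multi-node launch procedure is described in MRI_CLUSTER_HOW_TO.docx.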