Ask Dr. Zhou for access to this NSF MRI GPU cluster (DeepBlizzard). Then submit this form.

Dr. Bo Zhang from CS provides administration and technical support for DeepBlizzard. Dr. Zhou will help students obtain permission to use it.

DeepBlizzard is more powerful than our AC GPU Cluster, but it is not as convenient as our lab GPU computers or the AC GPU Cluster: you cannot debug your program interactively on DeepBlizzard, it has no GUI, and your program is placed in a queue on the server.
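Because everything runs non-interactively from the queue, scripts should write their output to files rather than rely on a display or console. Below is a minimal sketch of a batch-friendly Python script; the file names, the use of PyTorch and matplotlib, and the placeholder loss values are illustrative assumptions, not part of the cluster documentation (see MRI_CLUSTER_HOW_TO.docx for the actual submission procedure).

```python
# Minimal sketch of a batch-friendly job script for a queued, GUI-less cluster.
# Assumptions (not from the cluster docs): PyTorch and matplotlib are installed
# and the job runs with write access to its working directory.
import logging

import matplotlib
matplotlib.use("Agg")          # no display on the cluster: render plots to files only
import matplotlib.pyplot as plt
import torch

# Write progress to a log file instead of relying on an interactive console.
logging.basicConfig(filename="job.log", level=logging.INFO,
                    format="%(asctime)s %(message)s")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
logging.info("Running on %s", device)

# ... training loop would go here ...
losses = [1.0 / (step + 1) for step in range(100)]   # placeholder values

# Save figures to disk; plt.show() would fail or hang without a display.
plt.plot(losses)
plt.savefig("loss_curve.png")
logging.info("Done; wrote loss_curve.png")
```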

Download MRI_CLUSTER_HOW_TO.docx

Download MTU_MRI_User_Mannual.pdf

DeepBlizzard consists of one head node with an AMD EPYC 7343 3.2 GHz processor (128 MB cache) and 256 GB of RDIMM memory. DeepBlizzard also has a storage array with 120 TB of SAS HDD and 30 TB of SSD capacity. The cluster includes 3 high-memory A30 GPU nodes (dual-CPU nodes, each with 64 cores and 1 TB of RAM in total). Each A30 node contains 2 NVIDIA Ampere A30 GPU cards; an A30 GPU has 3584 CUDA cores, 224 Tensor cores, and 24 GB of HBM2 memory. Additionally, the DeepBlizzard cluster has another 2 A30 GPU nodes, bringing the total to 10 A30 GPUs.
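Once a job starts, you can confirm which GPU model and how much memory it was assigned (A30 with 24 GB vs. A100 with 80 GB). This is a minimal sketch assuming PyTorch is available on the node; it is not tied to any particular DeepBlizzard software module.

```python
# Minimal sketch: report the GPUs visible to this job (assumes PyTorch is installed).
import torch

if not torch.cuda.is_available():
    raise SystemExit("No GPU visible to this job")

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, "
          f"{props.total_memory / 1024**3:.1f} GB, "
          f"{props.multi_processor_count} SMs")
```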

Furthermore, the DeepBlizzard cluster has 7 A100 GPU nodes, each containing 4 A100 GPUs, for a total of 28 A100 GPUs. An A100 GPU has 6912 CUDA cores, 432 Tensor cores, and 80 GB of HBM2e memory with a bandwidth of 1555 GB/s. Within each GPU node, the GPUs are connected through third-generation NVLink bridges, yielding a total bandwidth of up to 112.5 GB/s on the A30 nodes and 600 GB/s on the A100 nodes. The GPUs communicate with the host CPUs over PCIe 4.0, which delivers 64 GB/s. Each GPU node is hosted by two CPUs, each with 32 physical cores (64 logical cores). Finally, to obtain the best performance from multiple GPUs, a high-performance 40-port HDR InfiniBand interconnect handles GPU-to-GPU communication between nodes.
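To see whether the GPUs assigned to a job can use the direct GPU-to-GPU path within a node, you can run a quick peer-access check from inside the job. This is a minimal sketch assuming PyTorch; it only reports what the driver exposes and does not measure the NVLink or InfiniBand bandwidths quoted above.

```python
# Minimal sketch: check peer-to-peer access between the GPUs visible to this job
# (assumes PyTorch is installed). Peer access indicates a direct NVLink/PCIe path.
import torch

n = torch.cuda.device_count()
for a in range(n):
    for b in range(n):
        if a != b:
            ok = torch.cuda.can_device_access_peer(a, b)
            print(f"GPU {a} -> GPU {b}: peer access "
                  f"{'enabled' if ok else 'unavailable'}")
```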