IBM and Nvidia are building a new DGX SuperPOD system based on IBM’s ESS 3200 storage server and Spectrum Scale. IBM has also updated its reference architecture for DGX PODs and released benchmark data for an ESS 3200 and DGX POD configuration meant for primary storage in the M2M connection.
A DGX POD is a reference architecture used mostly for AI processing. It contains up to nine Nvidia DGX-1 servers, twelve storage servers, and three networking switches, and it supports single- and multi-node AI model training using Nvidia AI software. A SuperPOD is a newer, larger POD architecture that starts at 20 Nvidia DGX A100 systems and scales to 140 of them.
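As a rough illustration of the sizing figures quoted above, the short Python sketch below checks whether a given number of systems falls within the stated POD and SuperPOD ranges. The constant and function names are hypothetical and chosen only for this example; they do not come from any IBM or Nvidia tool.

```python
# Sizing limits as quoted in the article; all names are illustrative only.
POD_MAX_DGX1_SERVERS = 9         # a DGX POD holds up to nine DGX-1 servers
SUPERPOD_MIN_DGX_A100 = 20       # a SuperPOD starts at 20 DGX A100 systems
SUPERPOD_MAX_DGX_A100 = 140      # and scales to 140 of them


def fits_dgx_pod(dgx1_servers: int) -> bool:
    """Return True if the DGX-1 server count fits within a single DGX POD."""
    return 1 <= dgx1_servers <= POD_MAX_DGX1_SERVERS


def fits_superpod(dgx_a100_systems: int) -> bool:
    """Return True if the DGX A100 count lies within the stated SuperPOD range."""
    return SUPERPOD_MIN_DGX_A100 <= dgx_a100_systems <= SUPERPOD_MAX_DGX_A100


if __name__ == "__main__":
    for count in (9, 20, 140, 141):
        print(count, fits_dgx_pod(count), fits_superpod(count))
```

The check treats the quoted figures as hard bounds, which is a simplifying assumption; real deployments are sized from the vendors' reference architecture documents.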
The ESS 3200 is considerably faster than the ESS 3000, and this edition of the reference architecture (RA) uses only the Storage Fabric, in keeping with Nvidia guidelines. IBM stated that the reference architecture uses Nvidia’s new GPUDirect scheme, which bypasses the CPU and DRAM to set up RDMA links between the ESS 3200 storage and the DGX A100 GPUs.