Groq is now one of only two companies whose commercially available AI accelerators can be used as part of a cloud service. Groq’s Tensor Streaming Processor (TSP) silicon is now available to accelerate customers’ AI workloads in the cloud and will become an addition to Nimbix’s cloud-based AI and deep learning platform. Nimbix now offers machine learning acceleration on Groq hardware as an on-demand service for a few “selected customers”.
Nimbix Chief Executive Officer Steve Herbert said that “the simplified processing architecture of Groq is one of a kind” and that it provides “unprecedented, deterministic performance to compute-intensive workloads.”
Groq’s TSP chip delivers 1,000 TOPS (one peta-operation per second), more than double the performance of today’s GPU-based systems. As part of Groq’s software-driven approach, control features have been removed from the silicon and handed to the compiler. The result is completely predictable, deterministic execution orchestrated by the compiler, which allows performance to be fully understood at compile time.
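The idea of compile-time determinism can be sketched with a toy scheduling model (this is a hypothetical illustration, not Groq's actual compiler): if every operation has a fixed, known cycle cost and there is no dynamic control logic, the compiler can assign exact start cycles and compute total latency before anything runs.

```python
# Toy model of compile-time-deterministic scheduling (illustrative only;
# op names and cycle costs are assumptions, not Groq specifications).
# With fixed per-op costs and no dynamic arbitration, the compiler knows
# the exact latency of a program before it executes.

CYCLE_COST = {"load": 4, "matmul": 16, "store": 4}  # assumed fixed costs

def compile_schedule(ops):
    """Assign each op a fixed start cycle; return the schedule and total latency."""
    schedule, cycle = [], 0
    for op in ops:
        schedule.append((cycle, op))
        cycle += CYCLE_COST[op]
    return schedule, cycle

program = ["load", "matmul", "store"]
schedule, latency = compile_schedule(program)
print(latency)  # 24 cycles, known entirely at "compile" time
```

In a real deterministic architecture the same principle applies at the hardware level: because the silicon has no caches or dynamic control to introduce variance, the statically computed schedule is also the observed runtime behavior.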
An important feature of the chip is that its performance advantage does not rely on batching, the common data-center practice of processing multiple data samples simultaneously. The advantage it offers over GPUs is 2.5x at large batch sizes, and closer to 17x at a batch size of 1.
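The batching point can be made concrete with a toy throughput model (all numbers here are illustrative assumptions, not Groq or GPU benchmarks): a batch-dependent accelerator only approaches its peak throughput at large batch sizes, while a design that does not rely on batching runs at full rate even for a single sample, so its relative advantage is largest at batch = 1.

```python
# Toy throughput model (hypothetical numbers, not vendor benchmarks):
# a batching accelerator ramps toward peak throughput as batch size grows,
# while a non-batching design runs near peak even at batch = 1.

def effective_throughput(peak_tops, batch, saturation_batch):
    """Throughput scales linearly with batch size until the pipeline saturates."""
    utilization = min(batch / saturation_batch, 1.0)
    return peak_tops * utilization

# Hypothetical batching GPU: 400 TOPS peak, needs a batch of ~64 to saturate.
gpu_b1  = effective_throughput(400, 1, 64)    # badly underutilized at batch = 1
gpu_b64 = effective_throughput(400, 64, 64)   # full 400 TOPS at large batch

# Hypothetical non-batching design: full 1,000 TOPS regardless of batch size.
tsp = effective_throughput(1000, 1, 1)

print(tsp / gpu_b64)  # 2.5  -- modest advantage at large batch
print(tsp / gpu_b1)   # 160.0 -- far larger advantage at batch = 1
```

The toy numbers are chosen only to show the shape of the effect: the gap between the two designs widens dramatically as batch size shrinks, which is why batch-independent performance matters for latency-sensitive inference.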