Customers are considering applications for AI inference and want to evaluate multiple inference accelerators. As we discussed last month, TOPS do NOT correlate with inference throughput and you should ...
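Since TOPS is a peak-compute figure while delivered throughput depends on how much of that peak a workload can actually sustain, a back-of-the-envelope sketch makes the gap concrete. The ~7.7 GFLOPs-per-inference figure for ResNet-50 is an approximation, and the chips and utilization numbers below are hypothetical:

```python
# Rough sketch: why peak TOPS alone does not predict inference throughput.
# The ~7.7 GFLOPs figure for one ResNet-50 inference is approximate; the
# chip specs and utilization fractions below are hypothetical.
RESNET50_FLOPS_PER_INFERENCE = 7.7e9

def achievable_ips(peak_tops, utilization, flops=RESNET50_FLOPS_PER_INFERENCE):
    """Inferences/second = sustained ops/second divided by ops per inference."""
    return peak_tops * 1e12 * utilization / flops

# A 400-TOPS chip that sustains only 20% utilization delivers exactly the
# same throughput as a 100-TOPS chip that sustains 80%:
big_chip_ips = achievable_ips(400, 0.20)
small_chip_ips = achievable_ips(100, 0.80)
```

The point of the sketch: the spec-sheet TOPS number is only one factor; the utilization a real model achieves on the architecture dominates the outcome.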
Much has been written about the computational complexity of inference acceleration: very large matrix multiplies for fully-connected layers and huge numbers of 3×3 convolutions across megapixel images ...
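To put rough numbers on that complexity, here is a minimal sketch counting multiply-accumulates (MACs) for a 3×3 convolution over a megapixel feature map and for a fully-connected layer. The megapixel layer shape is illustrative rather than taken from any particular network; the 2048→1000 fully-connected layer matches ResNet-50's classifier head:

```python
def conv2d_macs(h_out, w_out, c_in, c_out, k=3):
    """MACs for one conv layer: each output element needs c_in * k * k MACs."""
    return h_out * w_out * c_out * c_in * k * k

def fc_macs(n_in, n_out):
    """MACs for a fully-connected layer: one per weight."""
    return n_in * n_out

# A single 3x3 convolution over a ~1-megapixel (1000x1000) feature map with
# 64 input and 64 output channels costs roughly 37 billion MACs:
conv = conv2d_macs(1000, 1000, 64, 64)

# By contrast, ResNet-50's final fully-connected layer (2048 inputs,
# 1000 classes) is only about 2 million MACs:
fc = fc_macs(2048, 1000)
```

The arithmetic explains why convolution dominates the compute budget of image networks, and why accelerators are judged on how efficiently they keep their multipliers fed during those layers.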
Today AI chip startup Groq announced that their new processor has achieved 21,700 inferences per second (IPS) for ResNet-50 v2 inference. Groq’s level of inference performance exceeds that of other ...
Groq, the inventor of the Tensor Streaming Processor (TSP) architecture, today announced that its processor has achieved 21,700 inferences per second (IPS) for ResNet-50 v2 inference. Groq said that ...
Today Intel announced a deep learning performance record on image classification workloads. Intel achieved 7878 images per second on ResNet-50 with its latest generation of Intel Xeon ...
It’s important to understand that an inference accelerator is a completely new kind of chip, with many unknowns for the broader market. In our industry, there’s a learning curve for everything, from ...
There is much at stake in the world of datacenter inference, and while the market has not yet decided its winners, there are finally some new metrics in the mix to aid decision-making. Interpreting ...
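One reason the headline numbers demand careful interpretation is that throughput and latency pull in opposite directions. A sketch with hypothetical batch sizes and latencies shows how a chip can post a high images-per-second figure while each individual image still waits a long time:

```python
def ips(batch_size, batch_latency_s):
    """Throughput in inferences/second for one batch-processing pass."""
    return batch_size / batch_latency_s

# Hypothetical numbers: batching 128 images into one 50 ms pass yields high
# throughput, but every image in that batch sees 50 ms of latency...
large_batch_ips = ips(128, 0.050)   # high IPS, 50 ms per image

# ...while batch-1 at 1 ms yields lower throughput but far lower latency,
# which is often what real-time datacenter serving actually cares about.
single_ips = ips(1, 0.001)          # lower IPS, 1 ms per image
```

A throughput record measured at a large batch size and a throughput figure measured at batch 1 are therefore not directly comparable, which is worth checking before reading too much into any single headline number.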