An AI expert discusses the hardware and infrastructure needed to properly run and train AI models - 3-minute read

Powerful servers and processors are the workhorses of AI. Behind the scenes, Intel® Xeon® processors handle high-performance computing for a wide range of tasks, including AI applications, while the Habana® Gaudi® accelerator tackles deep learning workloads. Together, this infrastructure duo is helping train and deploy revolutionary AI models.

Transcript:

Monica Livingston:

When you're looking at how to infuse AI into your application, the hardware conversation happens later. Once you understand your workload, whether it's built in-house or developed externally, and you know what you want to run, then you start looking at infrastructure.

What do I run this on? AI is not necessarily a purpose-built AI box that you plug into the wall and it does your AI. AI needs to run on your general-purpose infrastructure.

I'm Monica Livingston and I lead the AI Center of Excellence at Intel. 

Being able to run your AI applications on general-purpose infrastructure is incredibly important because it reduces your cost for additional infrastructure.

At Intel, we spend a lot of time adding AI performance features to our Intel Xeon Scalable processors. For the types of AI that can't simply run on a processor, we offer accelerators: we have Intel discrete GPUs, and we have our Habana Gaudi product, an AI ASIC specializing in deep learning training and inference.

When you're thinking about Xeon and Gaudi, you would use Gaudi to train very large models. So you would have your Gaudi cluster and train a very large model, one with hundreds of billions of parameters. If your model is under 20 billion parameters, it can generally run inference on Xeon.
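
To make that sizing rule concrete, here is a minimal sketch that encodes it as a decision function. The function name and exact thresholds are illustrative assumptions drawn from the rule of thumb above, not Intel guidance:

```python
# Illustrative heuristic only: encodes the rule of thumb from the talk.
# pick_hardware and the exact cutoffs are assumptions for this sketch.
def pick_hardware(task: str, params_in_billions: float) -> str:
    if task == "train" and params_in_billions >= 100:
        # Training a very large model is a job for a Gaudi cluster.
        return "Gaudi cluster"
    if task == "inference" and params_in_billions < 20:
        # Models under ~20B parameters can generally serve inference on Xeon.
        return "Xeon servers"
    return "dedicated accelerator (Gaudi or discrete GPU)"

print(pick_hardware("inference", 7))   # -> Xeon servers
print(pick_hardware("train", 175))     # -> Gaudi cluster
```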

So after you have trained your model, to actually go and run it, you can run it on your Xeon boxes. This processor family has a number of AI optimizations in it. The AMX feature, Advanced Matrix Extensions, is the newest feature in this current generation of Xeon processors, and it enables us to run deep learning training and inference a lot faster on the CPUs.
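
As a rough illustration of what that looks like in practice, the sketch below runs CPU inference in bfloat16 with PyTorch; on AMX-capable Xeon processors, PyTorch's oneDNN backend can route this matrix math through AMX. The model and input shapes are placeholders, and whether AMX is actually used depends on your hardware and PyTorch build:

```python
# Minimal sketch: bfloat16 inference on a CPU with PyTorch.
# On AMX-capable Xeon CPUs, the oneDNN backend can dispatch bf16 matrix
# math to AMX; the model and input below are placeholders for this sketch.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.ReLU(),
    nn.Linear(4096, 10),
).eval()
x = torch.randn(32, 1024)

# Autocast runs the matmul-heavy layers in bfloat16 on the CPU.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.shape)  # torch.Size([32, 10])
```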

And again, that built-in acceleration would enable a customer or an enterprise to run these models on a CPU versus more expensive compute.

The future is that you'll have many generative AI models for different purposes within your company. And if all of them take millions to train, that return on investment doesn't look as favorable as it would with smaller, much more efficient models that can run on multipurpose architecture, so you're not having to stand up new infrastructure specifically for this.

Source: Business Insider
