- GPUs handle prefill operations by converting prompts into key-value caches
- SambaNova RDUs generate tokens at high throughput and low latency
- Intel Xeon 6 processors handle workload distribution and execute compiled code
Intel and SambaNova Systems have launched a joint hardware blueprint combining GPUs, SambaNova RDUs, and Intel Xeon 6 processors for large-scale inference workloads.
The system assigns GPUs to prefill operations, RDUs to decoding, and Xeon CPUs to execution and orchestration duties across agent-driven environments.
"Agentic AI is moving into production, and the successful pattern we're seeing is GPUs to start the job, Intel Xeon 6 to run it, and SambaNova RDUs to finish it fast," said Rodrigo Liang, CEO and co-founder of SambaNova Systems.
CPU is the execution and control layer
This design is scheduled to be available in the second half of 2026 for enterprises, cloud providers, and sovereign deployments.
The architecture places Intel Xeon 6 processors at the center of system control, where they manage workload distribution, execute code, and coordinate tool interactions.
This includes handling compilation, validating outputs, and maintaining communication between simultaneous processes.
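The host-CPU role described here, fanning out many concurrent agent tool calls and validating each result before it is used, can be sketched in a few lines. This is an illustrative sketch only: the names `handle_tool_call` and `validate` are hypothetical and are not part of any Intel or SambaNova API.

```python
from concurrent.futures import ThreadPoolExecutor

def handle_tool_call(call: dict) -> dict:
    # In a real system this might compile code, query a vector database,
    # or relay an inter-agent message; here we simply echo a result.
    return {"id": call["id"], "result": f"done:{call['op']}"}

def validate(result: dict) -> bool:
    # Output validation is one of the control duties assigned to the CPU.
    return result["result"].startswith("done:")

calls = [{"id": i, "op": op} for i, op in enumerate(["build", "retrieve", "msg"])]

# The thread pool stands in for the CPU dispatching simultaneous agent work.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(handle_tool_call, calls))

assert all(validate(r) for r in results)
print([r["result"] for r in results])
# prints ['done:build', 'done:retrieve', 'done:msg']
```

The point of the sketch is the shape of the workload, not the scale: many small, latency-sensitive coordination tasks rather than one large compute kernel.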
"When thousands of simultaneous coding agents are producing tool calls, retrieval requests, code builds, and encrypted inter-agent messages, the CPU is not a background component; it's the system's executive and action layer," said Harry Ault, CRO of SambaNova.
The statement frames the CPU as the primary layer responsible for system behavior rather than a supporting component.
According to SambaNova, Xeon 6 delivers more than 50% faster LLVM compilation times compared with Arm-based server CPUs.
It also delivers up to 70% faster vector database performance compared with other x86-based systems.
These figures relate to execution speed within coding and retrieval workflows. In this configuration, GPUs process the prefill stage by converting prompts into key-value caches.
SambaNova RDUs operate as the decoding layer, generating tokens at high throughput and low latency.
Xeon 6 processors function as both host CPUs and execution engines, managing system-level operations and running compiled workloads.
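The three-stage split can be made concrete with a minimal sketch: a prefill stage that turns a prompt into a key-value cache, a decode stage that appends to that cache token by token, and a host orchestrator that routes work between them. The class names `PrefillEngine`, `DecodeEngine`, `Orchestrator`, and `KVCache` are assumptions for illustration, not a real API, and the tensor math is replaced with placeholders.

```python
from dataclasses import dataclass

@dataclass
class KVCache:
    """Key-value entries produced during prefill, reused during decoding."""
    keys: list
    values: list

class PrefillEngine:
    """Stands in for the GPU stage: converts a prompt into a KV cache."""
    def prefill(self, prompt: str) -> KVCache:
        tokens = prompt.split()
        # Real engines compute attention keys/values per token; we fake it.
        return KVCache(keys=list(tokens), values=list(tokens))

class DecodeEngine:
    """Stands in for the RDU stage: generates tokens one at a time."""
    def decode(self, cache: KVCache, max_new_tokens: int) -> list:
        out = []
        for i in range(max_new_tokens):
            token = f"tok{i}"  # placeholder for a sampled token
            cache.keys.append(token)   # each new token extends the cache
            cache.values.append(token)
            out.append(token)
        return out

class Orchestrator:
    """Stands in for the Xeon host: routes each stage to its device."""
    def __init__(self):
        self.prefill_engine = PrefillEngine()
        self.decode_engine = DecodeEngine()

    def run(self, prompt: str, max_new_tokens: int = 4) -> list:
        cache = self.prefill_engine.prefill(prompt)            # GPU stage
        return self.decode_engine.decode(cache, max_new_tokens)  # RDU stage

print(Orchestrator().run("summarize this report"))
# prints ['tok0', 'tok1', 'tok2', 'tok3']
```

The design rationale is that prefill is compute-bound and parallel while decoding is sequential and latency-bound, so assigning them to different hardware, with a CPU coordinating the handoff of the KV cache, lets each stage run on the chip best suited to it.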
"Production inference is moving toward heterogeneous hardware; no single chip type is optimal for every stage of an agentic workflow," said Banghua Zhu, co-founder and CTO at RadixArk.
He added that combining RDUs with Xeon CPUs allows systems to maintain compatibility with existing software environments.
The system is designed to run inside existing air-cooled data centers without requiring new builds.
According to the companies, this allows inference workloads to scale without additional strain on water and energy resources.
As Nvidia and Groq continue to focus on improving inference throughput and latency, this announcement adds a layer of competition.
It offers an alternative approach that distributes workloads across multiple hardware layers rather than relying on a single processing model.

