- Skymizer claims big AI models no longer need hyperscale GPU infrastructure
- Older 28nm chips directly power large language models at surprisingly low wattage
- The HTX301 squeezes 384GB of memory into a single PCIe accelerator card
A Taiwanese company called Skymizer has unveiled a PCIe AI accelerator that challenges both AMD and Nvidia using surprisingly old technology.
The HTX301 card can run language models with up to 700 billion parameters on a single machine while consuming only 240 watts of power.
The card achieves this feat using older 28-nanometer chips and standard LPDDR4 and LPDDR5 memory instead of expensive HBM or GDDR options.
Old tech chip competes with modern AI accelerators
Skymizer claims its card delivers 30 tokens per second with just 0.5 TOPS at 100GB per second of bandwidth.
The HTX301 is built on Skymizer's HyperThought platform, which features next-generation LPU IP designed specifically for large language model workloads.
Each PCIe card contains six HTX301 chips working together, and the card offers up to 384GB of total memory capacity.
The design uses efficient compression techniques for both weights and KV cache, outperforming open source llama.cpp by 9 to 17.8 percent.
Its power consumption sits at less than half of what leading PCIe AI accelerators from AMD and Nvidia typically require.
The card supports agentic AI for coding, automation, and domain-specific workflows without needing hyperscale GPU clusters.
Running large language models in the cloud introduces privacy concerns and unpredictable costs that many organizations find unacceptable.
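As a rough sanity check of that figure: when decoding is memory-bandwidth-bound, throughput is approximately bandwidth divided by the bytes read per token. A minimal sketch, assuming a 7B-parameter model with 4-bit quantized weights (both illustrative assumptions, not published HTX301 specs):

```python
# Back-of-envelope estimate of bandwidth-bound decoding throughput.
# Assumes each generated token streams the full set of model weights
# from memory once; model size and quantization are assumptions.

BANDWIDTH_GBPS = 100    # claimed memory bandwidth, GB/s
PARAMS = 7e9            # e.g. a Llama2 7B class model (assumption)
BYTES_PER_PARAM = 0.5   # 4-bit quantized weights (assumption)

model_bytes = PARAMS * BYTES_PER_PARAM
tokens_per_sec = BANDWIDTH_GBPS * 1e9 / model_bytes
print(f"~{tokens_per_sec:.0f} tokens/s")  # ~29 tokens/s
```

Under those assumptions the estimate lands close to the claimed 30 tokens per second, which suggests the figure is consistent with a bandwidth-limited design rather than a compute-limited one.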
Upgrading on-premises infrastructure to support massive GPU accelerator platforms often requires expensive redesigns of data center power and cooling systems.
Skymizer's HTX301 offers enterprises a third option that fits into standard air-cooled servers without any infrastructure changes.
The company claims the era of needing hyperscale GPU clusters for ultra-large LLMs is over with its new technology.
The PCIe card form factor lets businesses scale AI inference on premises while maintaining data sovereignty and predictable infrastructure costs.
Skymizer HTX301 awaits real-world testing
Skymizer will preview the HTX301 at Computex this year, allowing independent verification of its performance numbers.
The specs of this chip look impressive on paper, but real-world testing will determine whether the card actually delivers 240 tokens per second on Llama2 7B workloads.
AMD recently launched its Instinct MI350P PCIe card with 144GB of HBM3E memory and up to 4,600 peak TFLOPS at MXFP4 precision, yet it consumes considerably more power than Skymizer's offering.
Nvidia's RTX PRO 6000 Blackwell consumes roughly 600 watts, more than double what Skymizer's card requires for similar inference tasks.
Should the HTX301 work as advertised, it could dramatically lower the barrier to entry for on-premises AI infrastructure.
Failure to deliver would place Skymizer among the many startups that could not back up their promises.
Via Wccftech

