IBM has rolled out its Power11 line of servers, the first new generation of IBM Power servers since 2021. The new servers offer an incremental performance boost over older Power servers. But perhaps more importantly, the Power11 boxes include features that will appeal to organizations looking to build and run AI workloads, including a Spyre Accelerator designed to boost AI inference workloads, as well as energy efficiency gains.
Like the Power10, Power11 chips are based on 16-core dies, with 15 cores active at any time. Unlike the previous generation of chips, the extra core can be activated in case one of the other 15 fails, which will make these already reliable machines even more reliable. In fact, IBM goes so far as to boast that Power11 will deliver "zero planned downtime for system maintenance."
On the RAM side of things, Power11 servers utilize OpenCAPI Memory Interface (OMI) channels and support differential DIMMs (D-DIMMs), which are exposed as DDR5 memory. IBM says Power11 servers get up to 55% better core performance compared to Power9 and up to 45% more capacity compared to Power10. Power11 boxes will also include the Spyre Accelerator, a system-on-a-chip, beginning in the fourth quarter.
The unveiled Power11 lineup consists of: the Power E1180, a full-rack system with up to 256 cores and 64TB of RAM; the Power E1150, a 4U server with up to 120 cores and 16TB of RAM; the Power S1124, a 4U server with up to 60 cores and 8TB of RAM; and the Power S1122, a 2U server with up to 60 cores and 4TB of RAM. IBM will also make the S1124 and S1122 available in its IBM Cloud via the IBM Power Virtual Server.
IBM first introduced Spyre on the System Z mainframe, and Power11 marks the start of its use on the Power side of the house. Officially, it's an ASIC that connects to the server bus via a PCIe card. It features 32 AI accelerator cores and 128GB of LPDDR5 memory, and delivers 300 TOPS (tera operations per second) of compute. IBM developed Spyre alongside its Telum II processor; both were introduced at the Hot Chips 2024 conference and featured heavily in the launch of IBM's new z17 mainframe this April.
Built for AI Inference, Not Training
After missing the AI model training craze of the past three years, IBM is happy to have a new server that fits squarely into the AI plans of big enterprises and scientific organizations, with energy efficiency a big selling point, as well.
That's because, for the past three years, system vendors have been chasing the market for training ever-bigger AI models with ever-bigger GPU clusters. As large language models (LLMs) approached the one-trillion-parameter mark, customers sought, and OEMs delivered, massive GPU-equipped compute clusters.
However, there was one server OEM that missed that AI training boat: IBM. Since the launch of Power10 in 2021, IBM has not sold a new system that integrates at the chip level with GPUs. The new Power11 chip, which is based on the Power10 design, also doesn't support GPU connections, which won't garner it much support among hyperscalers looking to train massive LLMs.
As luck would have it for IBM, AI hit the "scaling wall" in late 2024, putting a damper on the ever-growing size of LLMs and the ever-growing size of the GPU clusters used to train them. The emergence of a new class of reasoning models, such as DeepSeek, that can deliver the accuracy of traditional LLMs at a fraction of the size and training cost caught some in the AI community flat-footed. In this case, the shift in emphasis from AI training to AI inference, as well as the emergence of agentic AI, benefits IBM.
All of the Power11 models feature energy efficiency gains, which also plays in IBM's favor. For instance, the Power E1180 has 10% more rPerf per watt compared to the Power E1080, while the E1150 and S1124 offer 20% and 22% more rPerf per watt, respectively, over the E1050 and S1024. The Power S1122, meanwhile, offers up to 6.9X better performance per dollar for AI inferencing compared to the Power S1022.
The power consumption of AI is a rising concern. According to a 2024 Department of Energy report, data center load growth has tripled over the past decade and is projected to double or triple by 2028, with AI driving a large chunk of that growth.
Related Items:
AI Lessons Learned from DeepSeek's Meteoric Rise
What Are Reasoning Models and Why You Should Care
Nvidia Touts Next-Generation GPU Superchip and New Photonic Switches