The Intel Xeon E7-8800 v3 Review: The POWER8 Killer?
by Johan De Gelas on May 8, 2015 8:00 AM EST
Posted in: CPUs, enterprise CPUs, Enterprise, server, IT Computing, Intel, Xeon, Haswell, POWER8, POWER
The story behind the high-end Xeon E7 has been an uninterrupted triumphal march for the past 5 years: Intel's most expensive Xeon soundly beats Oracle servers, which cost an order of magnitude more, and offers much better performance per watt/dollar than the massive IBM POWER servers. Each time a new generation of quad/octal socket Xeons is born, Intel increases the core count, RAS features, and performance per core while charging more for the top SKUs. Each time that price increase is justified, as the total cost of a similar RISC server is a multiple of that of a Xeon E7 server. From the Intel side, this new generation based upon the Haswell core is no different: more cores (18 vs 15), better RAS, slightly more performance per core and ... higher prices.
However, before you close this browser tab, know that even this high-end market is getting (more) exciting. Yes, Intel is correct that the market momentum is still very much in its favor, and thus in favor of x86.
No less than 98% of the server shipments have been "Intel inside". No less than 92-94% of the four socket and higher servers contain Intel Xeons. From the revenue side, the RISC based systems are still good for slightly less than 20% of the $49 Billion (per year) server market*. Oracle still commands about 4% (+/- $2 Billion), but has been in a steady decline. IBM's POWER based servers are good for about 12-15% (including mainframes) or $6-7 Billion depending on who you ask (*).
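As a quick sanity check on the revenue figures above, the following minimal sketch converts the quoted shares of the $49 billion market into dollar amounts. The function and variable names are hypothetical and chosen only for illustration; the inputs are the percentages quoted in the text, not independent data.

```python
# Sanity check of the revenue shares quoted in the article.
# All names below are illustrative assumptions; the numbers come from the text.

MARKET_BILLION_USD = 49.0  # yearly server market size quoted in the article


def revenue_range(share_low_pct, share_high_pct, market=MARKET_BILLION_USD):
    """Return (low, high) revenue in billions of USD for a share range given in percent."""
    return (market * share_low_pct / 100.0, market * share_high_pct / 100.0)


oracle = revenue_range(4, 4)         # "about 4%"
ibm_power = revenue_range(12, 15)    # "about 12-15% (including mainframes)"
risc_ceiling = revenue_range(20, 20)  # "slightly less than 20%" is an upper bound

print(f"Oracle:       ~${oracle[0]:.2f}B")
print(f"IBM POWER:    ~${ibm_power[0]:.2f}B to ${ibm_power[1]:.2f}B")
print(f"RISC ceiling: <${risc_ceiling[0]:.2f}B")
```

The results land close to the rounded figures in the text: roughly $2 billion for Oracle and roughly $6-7 billion for IBM, both under the RISC ceiling of just below $10 billion.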
It is, however, not game over (yet?) for IBM. The big news of the past months is that IBM has sold its x86 server division to Lenovo. As a result, Big Blue finally throws its enormous weight behind the homegrown POWER chips. Instead of a confusing and half-hearted "we will sell you x86 and Itanium too" message, we now get the "time to switch over to OpenPOWER" message. IBM spent $1 billion to encourage ISVs to port x86 Linux applications to the Power Linux platform. IBM also opened up its hardware: since late 2013, the OpenPOWER Foundation has been growing quickly, with Wistron (ODM), Tyan and Google building hardware on top of the POWER SoCs. The OpenPOWER Foundation now has 113 members, and lots of OpenPOWER servers are being designed and built. Timothy Green of The Motley Fool believes OpenPOWER will threaten Intel's server hegemony in the largest server market, China.
But enough of that. This is AnandTech, and here we quantify claims instead of just rambling about changing markets. What has Intel cooked up and how does it stack up to the competition? Let's find out.
(*) Source: IDC Worldwide Quarterly Server Tracker, 2014Q1, May 2014, Vendor Revenue Share











68 Comments
Shadowmaster625 - Friday, May 08, 2015
This kind of provides more proof that Intel would do well to increase its SMT threads per core count.
nathanddrews - Friday, May 08, 2015
I'd like to see that alongside another GHz War.
mapesdhs - Friday, May 08, 2015
Good point, I wonder why they've left it at only 2/core for so long...
name99 - Friday, May 08, 2015
It's not easy to ramp up the number of threads. In particular, POWER8 uses something I've never seen any other CPU do: they have a second tier register file (basically an L2 for registers) and the system dynamically moves data between the two register files as appropriate. It's also much easier for POWER8 to decode 8 instructions per cycle (and to do the multiple branch predictions per cycle to make that happen). Intel could maybe do that if they reverted to a trace cache, but the target codes for this type of CPU are characterized by very large I-footprints and not much tight looping, so trace caches, loop caches, micro-op caches are not that much help. Intel might have to do something like a dual-ported I-cache, running two fetch streams into two independent sets of 4-wide decoders.
xdrol - Saturday, May 09, 2015
Another register file is just a drop in the ocean. The real problem is the increasing L1/2/.. cache pressure, which can only be mitigated by increasing cache size, which in turn will make your cache access slower, even when you use only one of the SMT threads. Also, you need to have enough unused execution capacity (pipeline ports) for another hardware thread to be useful; the 2 threads in Haswell can already saturate the 7 execution ports with quite high probability, so the extra thread can only run at the expense of the others, and due to the cache effects, it's probably faster to just get the 2 tasks executed sequentially (within the same thread). This question could be revisited if the processor has 14 execution ports, 2x issue, 2x cache, 2x everything, so it can have 4T/1C, but then it's not really different from 2 normal size cores with 4T..
iAPX - Friday, May 08, 2015
It's because this is (mainly) the same architecture that is used on desktop, laptops, and now even mobility! With this market share, I wouldn't be surprised if Intel decided to create a new architecture (x86-64 based) for future server chips, much more specialized, dropping AVX for cloud servers, having 4+ threads per core with a simpler decoder and a lot of integer and load/store units!
That might be complemented by a socketable Xeon Phi for floating-point compute intensive tasks and workstations, but that remains unclear, even though Intel announced it long ago! ;)
DanNeely - Friday, May 08, 2015
Intel's 94% market share is still only ~184k systems. That's tiny compared to the mainstream x86 market, and doesn't give a lot of (budgetary) room to make radical changes to the CPU vs just scaling shared designs to a larger layout.
theeldest - Friday, May 08, 2015
184k for 4S systems. The number of 2S systems *greatly* outnumbers the 184k.
Samus - Sunday, May 10, 2015
By 100 orders of magnitude, easily. 2S systems are everywhere these days, I picked up a Lenovo 2S Xeon system for $600 NEW (driveless, 4GB RAM) from CDW.
4S, on the other hand, is considerably more rare and starts at many thousands, even with 1 CPU included.
erple2 - Sunday, May 10, 2015
Well, maybe 2 orders of magnitude. 100 orders of magnitude would imply, based on the 184k 4S systems, more 2S systems than atoms in the universe. Ok, I made that up, I don't know how many atoms are in the universe, but 10^100 is a really big number. Well, 10^105, if we assume 184k 4S systems. I think you meant 2 orders of magnitude.
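For anyone who wants to check the arithmetic in this exchange, here is a minimal sketch. The 184k figure is the one quoted in the thread; the helper names are hypothetical, and the atom count is an assumption based on the commonly cited rough estimate of about 10^80, not something stated in the thread.

```python
# Orders-of-magnitude check for the 2S vs 4S discussion above.
# Helper names are illustrative; 184k is the 4S figure quoted in the thread.

FOUR_SOCKET_UNITS = 184_000

# Assumption (not from the thread): commonly cited rough estimate of the
# number of atoms in the observable universe, about 10^80.
ATOMS_EXPONENT_ESTIMATE = 80


def scale_by_orders(base, orders):
    """Multiply an integer count by 10**orders (exact integer arithmetic)."""
    return base * 10 ** orders


def power_of_ten(n):
    """Return the exponent e such that 10**e <= n < 10**(e + 1), for n >= 1."""
    return len(str(n)) - 1


two_orders = scale_by_orders(FOUR_SOCKET_UNITS, 2)
hundred_orders = scale_by_orders(FOUR_SOCKET_UNITS, 100)

print(f"2 orders of magnitude:   {two_orders:,} systems")
print(f"100 orders of magnitude: about 10^{power_of_ten(hundred_orders)} systems")
print("Exceeds the atom estimate:",
      power_of_ten(hundred_orders) > ATOMS_EXPONENT_ESTIMATE)
```

Two orders of magnitude gives 18.4 million 2S systems, while 100 orders of magnitude lands at roughly 10^105, matching the figure in the comment above.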