xplus93 wrote:
Of course I know how expensive it is. Anything labeled Xeon is expensive until you try to sell it used after decommissioning it. How can you say the die size or core architecture has any impact? Do you not remember Socket 775? That's not even a good comparison for various reasons, but not the ones you mentioned. Socketed would actually mean more possibilities for cooling. And I'm starting to see where your confusion comes in. I'm not saying to put a socket on a GPU card and just make it an expansion planar, although I already explained that. Yeah, having the memory separate is a bit pointless and only a minor brainstorming-only addition.
The reason used Xeons are (relatively) worthless is that, for their target audience, their value drops considerably after five years. The E5-2690 (v1 Sandy Bridge, 8C/16T, 2.9GHz) is about $150 on eBay. The Xeon Platinum 8180 (Skylake-SP, 28C/56T, 2.5GHz) is $10,000. That's roughly a 66:1 ratio. Assuming you put together 40 dual-socket servers (80 E5-2690 CPUs) against a single dual-socket Xeon Platinum 8180 server, you would probably be ahead computationally with the older E5-2690s.
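Just to put rough numbers on that, here's a back-of-envelope sketch using the prices and core counts quoted above. Treating performance as scaling purely with core count is obviously a simplification on my part (it ignores IPC, clocks, and power), so take it as illustration only:

```python
# Back-of-envelope: acquisition cost vs aggregate core count.
# Prices and core counts are the figures quoted above; the "perf ~ cores"
# assumption is mine and ignores IPC, clock, and power differences.

old = {"name": "E5-2690",       "price": 150,   "cores": 8}
new = {"name": "Platinum 8180", "price": 10000, "cores": 28}

n_old_cpus = 80   # 40 dual-socket servers
n_new_cpus = 2    # one dual-socket server

old_cost, old_cores = n_old_cpus * old["price"], n_old_cpus * old["cores"]
new_cost, new_cores = n_new_cpus * new["price"], n_new_cpus * new["cores"]

print(f"{old['name']}: ${old_cost:,} for {old_cores} cores "
      f"(~${old_cost / old_cores:.0f}/core)")
print(f"{new['name']}: ${new_cost:,} for {new_cores} cores "
      f"(~${new_cost / new_cores:.0f}/core)")
```

That works out to roughly $19/core for the old parts vs ~$357/core for the 8180, which is why the used gear looks like such a steal on paper.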
Great value? Nope. The data center world also cares about cooling (typically 1-1.5 watts of cooling for every watt of load) and rack space/density. Enthusiasts and smaller businesses are (relatively) insensitive to that, but their volume is a drop in the bucket (well, maybe 1/10 of the bucket). I remember someone on HardOCP bought a quad Xeon X7560 system (4x 8C/16T, 2.27GHz Beckton/Nehalem) thinking it was a great deal. People did the math for him, and with his usage model, buying a new single Xeon E5-2695 v4 or something similar would have let him break even on power in about a year.
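A very rough version of that math, to show how fast the power difference adds up. The TDPs are the published figures for those parts; the electricity rate, 24/7 duty cycle, and the 1.3 W of cooling per watt of load (from the 1-1.5 range above) are my own assumptions for illustration:

```python
# Rough yearly power + cooling cost difference, old quad-socket vs new single-socket.
# TDPs are published values; rate, duty cycle, and cooling overhead are assumed.

kwh_rate   = 0.12        # $/kWh, assumed
cooling    = 1.3         # extra watts of cooling per watt of IT load, assumed
hours_year = 24 * 365

quad_x7560_w    = 4 * 130  # four 130 W TDP sockets, ignoring the rest of the box
single_2695v4_w = 120      # one 120 W TDP socket

def yearly_cost(watts):
    return watts * (1 + cooling) / 1000 * hours_year * kwh_rate

delta = yearly_cost(quad_x7560_w) - yearly_cost(single_2695v4_w)
print(f"Power + cooling difference: about ${delta:,.0f} per year")
```

With those assumptions the old quad box burns roughly an extra $1,000 a year in power and cooling, which eats the "great deal" pretty quickly.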
The argument is that Socket 775/1366/115x were so small because those Xeons were designed around a desktop core. LGA-3647 is the first socket designed NOT to be only for CPUs, hence the massive increase in pin count; a lot of the pins are 'reserved'. The reason Epyc's socket is huge is the trace requirements for inter-CPU communication (that is the drawback of a non-monolithic die).
Socketed is worse for cooling. You lose whatever height the socket takes up from your heatsink. Not a big deal for cheap or mid-range GPUs; a bigger problem for high-end 200W+ designs. Even in the server world, 1U (~44mm high) servers for 200W Xeon Platinums are a minor problem that requires more customized heatsink designs (which adds cost).
xplus93 wrote:
PCI-E is certainly modular, but really, how close is it to the CPU and main memory? Compare that to the relationship between CPUs and FPUs. We haven't needed anything like that until now, when more and more people need specialized data processing. What I'm saying is that we're moving towards the need for modularity in that context. Intel certainly thinks so.
PCI-E from Sandy Bridge onwards is on the CPU die. It should have similar bandwidth (given enough lanes), but slightly worse latency. And judging from the recent Intel Optane demo, HPC needs memory capacity more than it needs low latency. Fitting your entire data set into Optane DIMMs (at ~10x the latency of DRAM) is far better than fitting half your data set into standard DRAM and paging the rest from PCIe SSDs.
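A toy average-access-time model shows why. The latency numbers here are ballpark assumptions on my part (DRAM ~100 ns, Optane at the "10x DRAM" figure above, NVMe SSD ~100 µs), not measurements from the demo:

```python
# Toy model: "everything in Optane" vs "half in DRAM, half paged from a PCIe SSD".
# Latencies are ballpark assumptions, not measured values.

dram_ns   = 100
optane_ns = 10 * dram_ns    # the "10x the latency of DRAM" figure
nvme_ns   = 100_000

# Case 1: entire working set fits in Optane
all_optane = optane_ns

# Case 2: half the working set in DRAM, the other half paged from SSD
half_and_page = 0.5 * dram_ns + 0.5 * nvme_ns

print(f"All-Optane average access:  {all_optane:,.0f} ns")
print(f"DRAM + SSD paging average:  {half_and_page:,.0f} ns")
```

Even with the 10x latency penalty, the all-Optane case comes out around 50x faster on average, because the SSD misses completely dominate once you start paging.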
Intel's strategy for LGA-3647 is a small hedge against GPGPU. But the bigger hedge is against FPGAs and potentially ASICs (thus the Altera acquisition). And that is divided into two main components (IMO):
The first is that the datacenter is quickly devolving from general purpose to (mostly) fixed functions. A simple example is Google/MS/Amazon dedicating entire data centers to search algorithms or AI. Why use complex general-purpose CPUs when you can use simpler FPGAs? The benefit of an FPGA is that if the algorithm changes, you can just reprogram it (unlike an ASIC).
The other is network processing: an FPGA NIC is several orders of magnitude faster than the CPU at a fraction of the power consumption. Network processing is a massive bottleneck for the Tier 1 cloud providers (e.g. Google/MS/Amazon/FB). It's also of huge importance to telcos, because Comcast and Verizon would like nothing more than to use hardware-based deep packet inspection to make your basic service seem slow but magically get improved with their premium packages. That cynicism aside, it should greatly speed up packet sorting and make everything cheaper in countries where ISPs don't control the government.