Today, at the 2017 Open Compute Project U.S. Summit, Microsoft unveiled some significant announcements around their hyperscale cloud hardware design, which they first announced in November as Project Olympus. With the explosion of growth in cloud computing, Microsoft is hoping to reduce the costs of their Azure expansion by creating universal platforms in collaboration with the Open Compute Project. Project Olympus is more than just a server standard though. It consists of a universal motherboard, power supplies, 1U and 2U server chassis, power distribution, and more. Microsoft isn’t the first company to want to go down this road, and it makes a lot of sense to cut costs by creating standards when you are buying equipment on the level of Azure.

The company made several big announcements today. The first came as something of a surprise, though it makes a lot of sense when you read between the lines: Microsoft is partnering with Qualcomm and Cavium to bring ARM-based servers to Azure. This is a significant shift for a company that has focused on x86 computing, and moving to a new ISA is never a small task, so Microsoft is clearly serious about this move.

Microsoft Distinguished Engineer Leendert van Doorn expanded on why the company is exploring this option in a blog post today. Clearly ARM has made some progress in the server world over the last few years, and Microsoft feels it's the right time to bring some of that capability to its own datacenters. One of the key takeaways is that Microsoft wants to shape the hardware capabilities to the workload, and with an open platform like ARM, this can make a lot of sense for certain workloads. He names search and indexing, storage, databases, big data, and machine learning as potential workloads, and in cloud computing each of these is significant in its own right.

Qualcomm Centriq 2400 Platform

Microsoft already has a version of Windows Server running on ARM, and both partners will be demonstrating this internal-use port: Qualcomm on its Centriq 2400 processor, with 48 cores built on Samsung's 10nm FinFET process, and Cavium on its second-generation ThunderX2 platform. Our own Johan De Gelas did a thorough investigation of the original ThunderX platform in June 2016, and it is certainly worth a read. The takeaways were that Cavium needed to do a lot of work on power management and had some significant performance bottlenecks, leaving it with inferior performance per watt compared to a Xeon D. Single-threaded performance, however, was better than advertised, with SPEC CPU2006 results coming in at roughly one third of the Xeon's, rather than the one fifth that was advertised. If Cavium has fixed some of those issues, especially power consumption, the new ThunderX2 might be a compelling solution for specific tasks.

Cavium ThunderX2 Platform

That is really the kicker though. The ARM platform, if properly executed, should be a good solution for specific tasks, particularly if Microsoft can work with the platform makers to shape hardware that fits those tasks while remaining more general purpose than an ASIC. At this time, however, it is unlikely to be a serious threat to Intel's near-monopoly in the datacenter. Intel has a pretty sizeable advantage in IPC, especially on single-threaded workloads, so x86 isn't going anywhere yet. What really matters is how well Qualcomm and Cavium can execute on their platforms, and where they price them, since Microsoft's end goal with this change is certainly, at least to some extent, to put pressure on Intel's pricing for datacenter equipment.

Back on the x86 side, Microsoft had some announcements as well. AMD will be collaborating with Microsoft to bring its Naples processor, the company's new server chip based on the "Zen" architecture, to Project Olympus. Although much of today's news has centered on the ARM announcement, this is arguably the bigger play. Ryzen has already shown it is very competitive with Core, and Naples could be very strong competition for Xeon. We'll have to wait for the launch to know for sure.

Microsoft didn’t abandon Intel either, and they announced close collaboration with Intel as well. This will be not only for Intel’s general purpose CPUs, but also for Intel’s FPGA accelerators and Nervana support. Microsoft already has FPGAs in Azure, so adding them to Project Olympus is a no-brainer.

Microsoft also announced a partnership with NVIDIA today, bringing the HGX-1 hyperscale GPU accelerator to Project Olympus. HGX-1 is targeted at AI cloud computing, which is certainly an area of tremendous growth. Each HGX-1 will be powered by eight NVIDIA Tesla P100 GPUs, each based on the GP100 chip with 3584 CUDA cores, along with a new switching design based on NVIDIA NVLink and PCIe that allows a CPU to connect dynamically to any number of GPUs. NVIDIA states the HGX-1 provides up to 100x faster deep learning performance than CPU-based servers.

This is a pretty substantial update for Project Olympus, and it looks to be an incredibly modular platform. Anyone reading this as Microsoft dropping Intel for ARM in Azure is misunderstanding the goal. Looking at the platform as a whole, it is abundantly clear that Microsoft wants a platform that can be tailored to any workload while still offering optimal performance and efficiency. Some tasks will run best on ARM, some on x86, while GPUs will be leveraged for performance gains where possible and FPGAs utilized for other tasks. At the scale of something like Azure, it makes sense to dedicate hardware to specific workloads: there will certainly be enough distinct workloads to make the initial effort worthwhile, which isn't always the case for small business, medium business, or even most enterprise workloads.

Source: Microsoft Azure Blog

Comments

  • Frenetic Pony - Thursday, March 9, 2017 - link

    This just seems like MS hedging their bets against Intel caving/getting too monopoly hungry. ARM processors are built to dominate mobile devices, which is why they do. x86 devices are, well, were and maybe are again with AMD's new work and Intel's supposed new architecture, built to dominate in performance per watt, as long as you don't mind that wattage being really high, and they do.

    Packing a bunch of mobile Procs into a SOC isn't going to change the fact that they're mobile processors. If you really need that much parallelism to begin with you just go use a GPU. I don't see anything coming of this, especially not with AMD's Naples claiming what it does while ARM still struggles to uh, do anything at all with the server market.
  • close - Thursday, March 9, 2017 - link

    Plenty of tasks need the parallelism but not the actual performance or power consumption a GPU might offer. Nothing wrong with having the option even if sometimes it looks like a solution waiting for a problem.
  • Meteor2 - Thursday, March 9, 2017 - link

    I think the article explains it well. At the scale of something like Azure, you pick the hardware appropriate for the software task; beefy CPU, wimpy CPU, GPU, FPGA or ASIC. The OCP standards make it easier, faster and cheaper to provision so.
  • ddriver - Thursday, March 9, 2017 - link

    My oh my, how have they been doing it for so long without M$ holding their hands...

    I am particularly encouraged by the absence of anything above 1U in this article. Hardware makers are gonna love how quickly hardware needs replacement after operating in those crammed, poorly ventilated racks. Cuz every day a device spends operational after it has run out of warranty is a waste.

    Good luck to all those that are gonna go and be efficient and saving with m$'s stellar open standards. I am gonna keep doing the thing that is actually efficient - go for custom, optimized solutions. It is actually cheaper to order custom enclosures from a metal shop, adapted to specific hardware and usage scenario, than to buy "one format fits all" hardware from those leeches.
  • ddriver - Thursday, March 9, 2017 - link

    And it goes without saying, buying standard mass produced hardware that would fit anywhere is much more efficient than buying m$'s "standard" produced hardware, which will not fit anywhere else.

    Last but not least, even if this turns out to actually be more efficient, which I doubt it will, there is 100% certainty that the savings will not be passed onto the consumer, but merely translate into even more profits. So I really don't see why 99% of the world would be excited or even remotely care about this. Like every corporate initiative ever undertaken, it will only benefit the top 1%, by increasing the amount of wealth they can leech from the population.
  • Murloc - Friday, March 10, 2017 - link

    if that's your way to measure new technologies, abandon this website.
  • ZeDestructor - Monday, March 13, 2017 - link

    For anything less than Google/Facebook/Amazon/Azure-scale, OCP simply doesn't make any sense. Once you do get that big, the savings get very real, very fast.

    For the record, a lot of companies, many of them bitter rivals, are platinum-level OCP members - Google, Microsoft and Facebook, for example.

    Source: http://www.opencompute.org/about/membership-organi...

    As for the more technical side, OCP is more "standard" than pretty much any standard rackmount hardware. You can even retrofit an OCP rack into an existing 19" rack (thank Microsoft for contributing that particular chassis + backplane design).

    Oh, and as for reliability, there are tons of 1U boxes out there that have been under sustained loads for 10+ years, so piss off with that argument.
  • BrokenCrayons - Thursday, March 9, 2017 - link

    Spot on. It's no different than any other situation where the best tool is selected for the job. I guess in this case, it's a little more like designing the best tool for the job.
  • ddriver - Thursday, March 9, 2017 - link

    No, they are, as they say, focused on optimizing cost, complexity and time. So they can lay off more people and make more profits. That's all there is to it. Nothing anyone other than corporate executives ought to be excited about, unless you are for example cheering at it in order to appear tech-savvy and therefore smart ;)
  • BrokenCrayons - Friday, March 10, 2017 - link

    It's terribly easy to draw you out isn't it?

    I don't make it a point to concern myself with a company's hiring or termination process when I purchase one of their products. Unless cost cutting measures the company takes adversely impact the operation of my own business, cost optimizations are also not a factor. Purchasing decisions for a profit-seeking company don't generally hinge on the ideology of another profit-seeking company that's offering a product or service.

    I know you're still stinging from being made to look foolish, but it might be a good idea to at least pick a position of strength when attempting to save face. Emotionally-driven idealism is almost invariably not a solid foundation because it's so easily disputed. I'll give you a small measure of acknowledgement for at least picking a different article's comments box and a different issue to make yourself feel better. It shows you're trying, but also that I'm getting deep under your skin. Though that was my intent and I realize your view of your own intellect is critical to your self-worth, I didn't consider it was as vital as it appears to be. I offer my apology for making you feel bad earlier.
