


Supercomputing Conference: A Glimpse Into The Future Of Mainstream Computing

Updated Jul 25, 2018, 05:39pm EDT

Supercomputers used to be a market unto themselves. In their day, giants such as IBM, Control Data Corporation (CDC), Evans & Sutherland (E&S), Silicon Graphics (SGI) and Cray Research ruled the supercomputing market. But with Moore’s Law now delivering little more than transistor count scaling, scale-out has taken over a conference series that once epitomized scale-up. Because the nature of supercomputing has changed so much over the past two decades, the market has broadened into the much larger high-performance computing (HPC) market. At this year’s SC16 conference, SGI made news by being purchased by Hewlett Packard Enterprise (HPE), and the Cray booth looked a bit forlorn. For the past few years, compute silicon vendors have increased their influence on the supercomputing market; this year at SC16, most of the attention was on NVIDIA, Intel and Xilinx. Cloud infrastructure manufacturers, such as Dell EMC, HPE, IBM, Huawei, QCT, Inspur and Sugon, and storage vendors, such as DDN, Dell EMC and HPE, were also in the spotlight.

The key trends this year are:

  • Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL) and variants
  • Commodity water-cooled servers and production infrastructure for exotic liquids

While each is described in detail below, it’s worth setting context to understand why they are becoming so important now.

Prior to 2000, innovation entered the computing market through HPC; HPC was the top of the technology waterfall. Two things happened to change this dynamic in the late 1990s:

  1. Microsoft opened access to its Windows graphical user interface (GUI) application programming interfaces (APIs) and enabled graphics chip vendors to accelerate the PC GUI. That sparked the rapid evolution of graphics technology that made NVIDIA and AMD (through its 2006 purchase of ATI) the dominant PC graphics intellectual property generators today. NVIDIA started its push into the HPC market a decade ago, and as of the November 2016 list it is the parallel compute accelerator of choice for over 60% of the accelerator-equipped Top500 supercomputer systems. The drivers for NVIDIA’s success are simple: NVIDIA understands software ecosystems, and Moore’s Law stopped providing most of the side benefits that drove traditional processors’ compute performance through the mid-2000s, such as increasing operating frequencies without increasing power consumption. Compute accelerators, like NVIDIA’s GPU architectures, can provide far greater parallel math throughput than CPUs (see the sketch after this list).
  2. Cloud computing changed the way datacenters are designed, connected, provisioned and managed. Scale-out is no longer just a cheap and slow way to approximate scale-up capabilities; cloud datacenters now depend on scale-out, microservices-based system architectures to reach previously unthinkable online service volumes and quality of service. Core cloud efficiency concepts, such as rapid provisioning and multitenant scheduling, are becoming mainstream in the HPC world. Former giants such as SGI, Cray and Bull are struggling to differentiate themselves from mainstream server vendors, such as Dell EMC, HPE and IBM, that are increasingly focused on the cloud market. That is difficult in a market where compute silicon is dominated by Intel and NVIDIA, and where customers value software flexibility as much as simply optimizing for the fastest compute possible.
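
To make the “greater parallel math throughput” point concrete, here is a minimal CUDA sketch (ours, not from any vendor at the show) that adds two large vectors by launching one GPU thread per element. The kernel name and sizes are illustrative; a CPU would work through the same loop on a handful of cores, while a GPU spreads it across thousands of threads.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One GPU thread handles one element; thousands of threads run concurrently,
// which is the basic advantage accelerators hold over a handful of CPU cores.
__global__ void vectorAdd(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;             // one million elements
    const size_t bytes = n * sizeof(float);

    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);      // unified memory keeps the sketch short
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    vectorAdd<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %.1f\n", c[0]);     // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Real HPC and deep learning codes replace this toy kernel with dense linear algebra, but the pattern is the same: many lightweight threads doing the same math on different data, which is what makes GPUs attractive for both simulation and AI training.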

The new trends that have been building for the past couple of years at the annual Supercomputing (SC) conference are:

Artificial Intelligence (AI), Machine Learning (ML), Deep Learning (DL) and other neural network variants are increasingly hyped by HPC vendors. Last year at SC15, AI was an undercurrent of discussion; this year, almost every silicon and systems vendor brought products and related signage. You can read a high-level overview of why HPC is adopting advanced pattern analytics and deep learning here and view a webinar here.

Many of the systems on display were based on NVIDIA’s Tesla P100 modules and add-in boards. Most of the rest were based on FPGA add-in boards using either Xilinx or Altera (a subsidiary of Intel) parts. A few aspirational systems sported Intel’s current “Knights Landing” generation of the Xeon Phi processor, but those were universally positioned for an upgrade to the “Knights Mill” generation in mid-2017.

Liquid cooling was everywhere, literally in every system manufacturer’s booth. As Moore’s Law pares back to simple transistor scaling, achieving higher levels of compute performance means packing more transistors into server racks, increasing what is called “compute density”. Increasing compute density means increasing both rack-level power consumption and rack-level heat dissipation; the two are tightly linked, because essentially every watt a rack draws must eventually be removed as heat. Custom water cooling designs are certainly not new, but industry-wide investment to commoditize water cooling is new.
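
As a rough, illustrative calculation (our numbers, not from the show floor): a rack of forty servers drawing 1 kW each must shed about 40 kW of heat, while conventional air-cooled facilities are typically engineered for something on the order of 10 to 20 kW per rack. Once compute density pushes past that range, liquid cooling stops being exotic and starts becoming a requirement.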


Companies such as Asetek offer mass customization of sealed-loop, single-loop and rack-level water cooling. Asetek showed cooling solutions for a few of its server customers, including Cray, Fujitsu and Penguin, and even showed an Intel co-branded water cooling solution for Xeon Phi. Lenovo, IBM, Bull, Inspur, Huawei and SGI showed tightly integrated, custom-designed solutions. While NVIDIA showed only air-cooled solutions in its booth, several of its customers, including IBM and Bull, showed water-cooled designs for NVIDIA’s P100 module.

While full-immersion production systems are still rare, 3M partnered with Allied Control, Gigabyte and PEZY to show functional demonstrations of 3M’s Novec engineered fluids in the exhibition hall. That was a big step up from the home-aquarium-style demonstrations shown in previous years.

Wrap-Up

The key point for both AI and water cooling is that cloud giants such as Google, AWS, Microsoft, Baidu, Alibaba and Tencent are starting to buy HPC gear for advanced analytics and pattern recognition while pioneering artificial intelligence and deep learning techniques. At the same time, traditional HPC customers are investing in multitenant, cloud-like clusters that are optimized for both simulations and pattern analytics.

These technologies will waterfall into mainstream enterprise and private cloud solutions over the next few years. That puts companies such as NVIDIA, Asetek and a possibly resurgent AMD (in both CPUs and GPUs) at the center of attention. We’ll also have to see how well Intel’s Knights Mill (introduction expected in mid-2017) can catch up to NVIDIA’s Pascal architecture for AI training applications.

-- The author and members of the TIRIAS Research staff do not hold equity positions in any of the companies mentioned. TIRIAS Research tracks and consults for companies throughout the electronics ecosystem from semiconductors to systems and sensors to the cloud.
