HPC@Intel: Platforms and Technology CCGSC September 10, 2006



Dr. David Scott

Petascale Product Line Architect

Legal Disclaimer
Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information.

Intel Xeon, Pentium 4, Itanium, Itanium 2, Prescott, Prestonia, Nocona, Jayhawk, Potomac, Tulsa, and Dempsey processors may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications before placing your product order.

Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's website at <http://www.intel.com>.

Intel, Itanium, Xeon and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

AGENDA

New Processors

New HPC focused platforms


Technologies for the future

Core-Duo Processors

Let's Take a Look Inside

Historical Driving Forces


[Figure: two log-scale charts plotted against year, 1970 to 2020. "Increased Performance via Increased Frequency": Frequency (MHz). "Shrinking Geometry": Feature Size (um).]

1946: 20 Numbers in Main Memory
1971: I4004 Processor, 2300 Transistors
2005: 65nm, 1B+ Transistors
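
The two transistor counts on this timeline imply an average doubling period. The Python sketch below works it out, assuming steady exponential growth between the two data points and taking "1B+" as exactly 1e9, so the result is an upper bound on the doubling time. The function name is illustrative and not from the slides.

```python
import math

def doubling_time_years(count_start, count_end, year_start, year_end):
    """Average doubling period, assuming steady exponential growth."""
    doublings = math.log2(count_end / count_start)
    return (year_end - year_start) / doublings

# Slide figures: 2300 transistors (1971) and 1B+ transistors (2005).
# 1e9 is an assumed stand-in for "1B+".
years = doubling_time_years(2300, 1e9, 1971, 2005)
print(f"Average doubling time: {years:.2f} years")
```

With these inputs the doubling time comes out a little under two years.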

The Challenges
[Figure: two log-scale charts. "Power Limitations": CPU Power (W). "Diminishing Voltage Scaling": Supply Voltage (V), with a "~30%" annotation. Horizontal axes: year (1990 to 2015) and process generation (0.7um, 0.5um, 0.35um, 0.25um, 0.18um, 0.13um, 90nm, 65nm, 45nm, 30nm).]

Power = Capacitance x Voltage^2 x Frequency
Also Power ~ Voltage^3, because the achievable frequency scales roughly in proportion to voltage.
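
The cubic relation can be checked numerically. The Python sketch below assumes the simple dynamic-power model from the slide, plus the further assumption that frequency is scaled in proportion to voltage. The function names and sample scale factors are illustrative only.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic power P = C * V^2 * f (units must be consistent)."""
    return capacitance * voltage ** 2 * frequency

def relative_power(scale):
    """Power relative to baseline when V and f are both scaled by `scale`."""
    baseline = dynamic_power(1.0, 1.0, 1.0)
    scaled = dynamic_power(1.0, scale, scale)
    return scaled / baseline

# Illustrative scale factors; not values read from the charts.
for scale in (1.0, 0.9, 0.8, 0.7):
    print(f"V and f at {scale:.0%}: power is {relative_power(scale):.1%} of baseline")
```

Under these assumptions, lowering voltage and frequency together by 20% cuts dynamic power to about 51% of baseline.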

Intel Core Microarchitecture


Low Power, High Performance, Scalable

Intel Wide Dynamic Execution
Intel Intelligent Power Capability
Intel Advanced Smart Cache
Intel Smart Memory Access
Intel Advanced Digital Media Boost

65nm
Woodcrest: Server Optimized
Conroe: Desktop Optimized
Merom: Mobile Optimized

*Graphics not representative of actual die photo or relative size

Intel Wide Dynamic Execution


EACH CORE
EFFICIENT 14 STAGE PIPELINE
DEEPER BUFFERS
4 WIDE DECODE TO EXECUTE
4 WIDE MICRO-OP EXECUTE
MICRO and MACRO FUSION
ENHANCED ALUs

CORE 1 and CORE 2 (same stages in each)
INSTRUCTION FETCH AND PRE-DECODE
INSTRUCTION QUEUE
DECODE
RENAME / ALLOC
RETIREMENT UNIT (REORDER BUFFER)
SCHEDULERS
EXECUTE

Intel Intelligent Power Capability

Process: 65nm, Strained Silicon, Low-K Dielectric, More Metal Layers

Coarse Grained: Aggressive Clock Gating, Enhanced Speed-Step

Ultra Fine Grained: Low VCC Arrays, Blocks Controlled Via Sleep Transistors

Transistor: Low Leakage Transistors, Sleep Transistors


Intel Performance Leadership for Life Sciences


Woodcrest single-thread relative performance compared to Opteron*
[Figure: chart of geomeans of relative performance, axis from 0.0 to 2.0; higher is better]

Intel outperforms AMD across all applications tested

Applications (system configurations in parentheses): Gaussian(3,4), GAMESS(3,4), Amber(3,4), GROMACS(2,4), NAMD(2,6), HMMER(1,4), BLAST(1,5), ClustalW-MPI(2,4)

(1) Woodcrest: Dual-Core Intel Xeon processor, 2-socket sys., 3.0GHz, 4MB L2 cache, 4GB Memory
(2) Woodcrest: Dual-Core Intel Xeon processor, 2-socket sys., 3.0GHz, 4MB L2 cache, 8GB Memory
(3) Woodcrest: Dual-Core Intel Xeon processor, 2-socket sys., 3.0GHz, 4MB L2 cache, 16GB Memory
(4) Dual-Core AMD* Opteron* processor 280, 2-socket sys., 2.4GHz, 1MB L2 cache, 16GB Memory
(5) Dual-Core AMD* Opteron* processor 285, 2-socket sys., 2.6GHz, 1MB L2 cache, 4GB Memory
(6) AMD* Opteron* processor 252, 2-socket sys., 2.6GHz, 1MB L2 cache, 16GB Memory

Application areas: Computational Chemistry, Bioinformatics

Source: Intel Internal Measurement
* Other brands and names may be claimed as the property of others.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference http://www.intel.com/performance/resources/benchmark_limitations.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.
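
The chart reports geometric means of relative performance. As a reminder of how such a figure is computed, here is a minimal Python sketch. The ratios in it are hypothetical placeholders, not measurements from this slide, and the function name is illustrative.

```python
import math

def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed via logarithms."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical per-workload speedups (NOT data from the slide).
example_ratios = [1.2, 1.5, 2.0]
print(f"Geomean of relative performance: {geometric_mean(example_ratios):.3f}")
```

The geometric mean is the usual choice for averaging ratios because it treats a 2x speedup and a 2x slowdown symmetrically.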

Core Microarchitecture Advances With Quad Core


Energy Efficient Performance
[Figure: DP Performance Per Watt Comparison with SPECint_rate at the Platform Level, scale from 1X to 4X. Source: Intel]

Irwindale: H1 05
Paxville DP: H2 05
Dempsey MV: H1 06
Woodcrest: H2 06
Clovertown: H1 07

Quad Core
Server: Clovertown
Desktop: Kentsfield

AGENDA

New Processors

New HPC focused platforms

Technologies for the future

Motivation
Caretta & Port Townsend:
Provide a higher memory BW / FLOP option than DP Xeon

Provide a less expensive option than DP Xeon

Atoka
High Density DP solution
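
The "memory BW / FLOP" goal can be made concrete with a back-of-the-envelope calculation. The Python sketch below assumes a 64-bit (8-byte) front-side bus and treats the FSB ratings quoted for these boards as millions of transfers per second. The core count, clock speed, and FLOPs per cycle in the last example are hypothetical, and the function names are illustrative. It considers the FSB peak only and ignores DRAM channel limits.

```python
def fsb_bandwidth_gbs(fsb_mts, bus_bytes=8):
    """Peak front-side-bus bandwidth in GB/s (assumes a 64-bit data bus)."""
    return fsb_mts * bus_bytes / 1000.0

def bytes_per_flop(bandwidth_gbs, cores, clock_ghz, flops_per_cycle):
    """Peak memory bandwidth divided by peak floating-point rate."""
    peak_gflops = cores * clock_ghz * flops_per_cycle
    return bandwidth_gbs / peak_gflops

# FSB ratings quoted for the three boards in this deck.
for fsb in (800, 1066, 1333):
    print(f"{fsb} FSB: {fsb_bandwidth_gbs(fsb):.1f} GB/s peak")

# Hypothetical node: 2 cores at 2.4 GHz, 4 FLOPs per cycle per core (assumed).
ratio = bytes_per_flop(fsb_bandwidth_gbs(1066), cores=2, clock_ghz=2.4, flops_per_cycle=4)
print(f"Bytes per FLOP: {ratio:.2f}")
```

Fewer cores behind the same bus raise the ratio, which is one way a single-socket board can offer more bandwidth per FLOP than a dual-socket one.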

Metrics
Performance
Core: we lead
Bus: close (depends on STREAM binaries etc) + 2x cache size

Performance / Watt
We lead

Performance / SqFt
We match

Performance / $
We lead

Caretta Features
HPC BOARD FEATURES
Single Intel Pentium-D processor (Presler, Smithfield)
Support for Pentium4 (CedarMill)
Chipset: Mukilteo + ICH7
4 DIMM (max 8GB) - DDR2 533/667 with U-ECC
800 MHz FSB
Integrated 2 port SATA2 with RAID 0/1
2xGbE (TekoaE + Tabor)
2xUSB2 external
Rear video & serial port
Internal headers: serial, 2xUSB2, I2C
Custom 5.95" x 13", 6 layer
Custom power connector
Client Management iAMT via TekoaE

[Board diagram labels: GbE, Video, ICH, Memory, MCH, CPU]

Port Townsend Features
HPC BOARD FEATURES
Single Intel PentiumD processor (Conroe, Kentsfield)
Chipset: Mukilteo2 + ICH7
4 DIMM (max 8GB) - DDR2 533/667 with U-ECC
1066 FSB
PCIex8 support for IB MemFree card & SFF GbE card
Integrated 2 port SATA2 with RAID 0/1
2xGbE (Tekoa + TekoaE)
2xUSB2 external (crash cart)
Rear video & serial port
Internal headers: serial (3pin), 2xUSB2, I2C
Custom 5.95" x 13", 6 layer
Custom power connector
Client Management iAMT via TekoaE

[Board diagram labels: GbE, ICH, PCI-E x8, Memory, MCH, CPU VRD]

AtokaV Features
HPC BOARD FEATURES
Dual Intel Xeon processor (WC, CTN)
Chipset: Greencreek + ESB2
8 FBD (max 32GB) - DDR2 533/667
1333 FSB
PCIex8 slot
Mellanox IB 4x DDR single port down
Integrated 2 port SATA2 with RAID 0/1
2xGbE (Gilgal)
2xUSB2 external (crash cart)
Rear video & serial port
Internal headers: serial (3pin), 1xUSB2, I2C
Custom 6.5" x 16.5"
Custom power connector
Client Management via IPMI module / GbE port
Support for 32Mbit flash & embedded Linux

[Board diagram labels: CPU VRD, CPU, MCH, Memory, ESB2, PCI-E x8, GbE, IB]

Pics

Port Townsend 1U side by side reference chassis

Pics
Port Townsend 4U Blade Can

Port Townsend AC Blade

AGENDA

New Processors

New HPC focused platforms


Technologies for the future

Today's Packaging Technology

Multi-Chip Package
Wire-Bonded Stacked Die

[Diagram labels: Flash, DRAM, CPU]

3D Stacking Research

Wafer Stacking
[Diagram labels: Metal lines on backside of thin wafer, Top Thin Wafer, Bonding Interface, Thru-Silicon Via, Bonding Structures, DRAM, CPU, Bottom Wafer]
Source: Intel

3D Stacking Research

Die Stacking
[Diagram labels: Analog, Flash, DRAM, DRAM, CPU, Pkg. Substrate, Metal Pad, Via, Die 1 through Die 7]
Source: Intel

Chip-to-Chip Signaling Challenge

The Opportunity of Silicon Photonics


Enormous ($ billions) CMOS infrastructure, process learning, and capacity
Draft continued investment in Moore's law

Potential to integrate multiple optical devices
Micromachining could provide smart packaging
Potential to converge computing & communications

To benefit from this, optical wafers must run alongside existing product.

Intel's Silicon Photonics Research

First Continuous Silicon Laser (Nature 2/17/05)

1GHz (Nature 04), 10 Gb/s (05)

First: Innovate to prove silicon is a viable optical material

Silicon Photonics
[Diagram labels: Filter, Laser, Modulator, CMOS Circuitry, Passive Alignment, Photodetector]

Silicon Photonics Future Vision


Data Center Fabrics

Chip-to-Chip Interconnects

Backplane and Display Interconnects

Chemical Analysis

Medical Lasers

Q & A
