Exadata Technical Deep Dive: Architecture and Internals



Kothanda (Kodi) Umamageswaran


Vice President, Exadata Development

Gurmeet Goindi
Exadata Product Management

Copyright 2016, Oracle and/or its affiliates. All rights reserved.


Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle's products remains at the sole discretion of Oracle.


The Exadata Database Machine Vision
Best Platform for the Oracle Database On Premises and in the Cloud

1. State-of-the-art enterprise-grade hardware, refreshed yearly (processors, flash, disks, network)
2. Sized, tuned and optimized exclusively for Oracle Database workloads (DW, Analytics, OLTP, Mixed)
3. High-powered intelligent storage servers capable of offloading database workloads
4. Smart database protocols and optimizations from servers to network to storage
5. One vendor responsible for all hardware, software and customer support

Callout: Exadata Unique Intellectual Property


Proven at Thousands of Critical Deployments since 2008
Half OLTP - Half Analytics - Many Mixed
4 OF THE TOP 5 BANKS, TELCOS, RETAILERS RUN EXADATA
Petabyte Warehouses
Online Financial Trading
Business Applications
  SAP, Oracle, Siebel, PSFT,
Massive DB Consolidation
Public SaaS Clouds
  Oracle Fusion Apps, Salesforce, SAS,


On-Premises                 Cloud at Customer                Public Cloud
Exadata Database Machine    Preview: Exadata Cloud Machine   Exadata Cloud Service
Customer Data Center        Customer Data Center             Oracle Cloud
Purchased                   Subscription                     Subscription
Customer Managed            Oracle Managed                   Oracle Managed

Exadata Database Machine X6-2
Scale-Out Database Servers
  2 socket x86 processors
  44 CPU cores
  256 GB - 1.5 TB DRAM
Compute Software
  Oracle Linux 6
  Oracle Database Enterprise Edition
  Oracle VM (optional)
  Oracle Database options (optional)
Fastest Internal Fabric
  40 Gb/s InfiniBand
  Ethernet external connectivity
Scale-Out Intelligent Storage
  High-Capacity Storage Server
    12.8 TB PCI Flash
    96 TB disk
    20 CPU cores
  Extreme Flash Storage Server
    25.6 TB PCI Flash
    20 CPU cores
Storage Server Software
  Smart Scan (SQL Offload)
  Smart Flash Cache
  Hybrid Columnar Compression
  I/O Resource Management

Exadata Database Machine X6-8
Scale-Out Database Servers
  8-socket x86 processors
  144 cores
  2-6 TB DRAM
Large SMP Processor Model
  Large warehouses
  Massive database consolidation
  Big In-Memory databases
Fastest Internal Fabric
  40 Gb/s InfiniBand
  Ethernet external connectivity
Scale-Out Intelligent Storage
  Same Networking, Storage and Software as X6-2
  High-Capacity Storage Server
  Extreme Flash Storage Server
Storage Server Software
  Smart Scan (SQL Offload)
  Smart Flash Cache
  Hybrid Columnar Compression
  I/O Resource Management

Elastic Configurations
Achieve any Level of Performance with Minimum Hardware

1. Start Small
     2 Database Servers
     3 Storage Servers
2. Incrementally Scale Servers: add DB or Storage Servers (up to a Full Rack)
3. Incrementally Add Racks to Continue Scaling (Multi-Rack)

Server types shown: Database Server, Extreme Flash Storage, High Capacity Storage
Enable Database CPU cores as needed with Capacity on Demand
Expand older Exadata machines with new X6-2 servers


Oracle Database Exadata Cloud Service
Full Oracle Database with all advanced options
  100% Compatible with on-premises databases
On fastest and most available database cloud platform
  Scale-Out Compute, Scale-Out Intelligent Storage, InfiniBand, PCIe Flash
  Complete isolation of tenants with no overprovisioning
All Benefits of Public Cloud
  Fast, elastic, web driven provisioning
  Oracle experts deploy and manage infrastructure
  Monthly or yearly subscription with online capacity bursting

Best of On-Premises with Best of Cloud

Preview: Oracle Public Cloud Services @ Customer

Same PaaS and IaaS hardware and software as Oracle Public Cloud
Managed by Oracle and delivered as a service in your datacenter behind your firewall
Same cost-effective subscription pricing model as Oracle Cloud
Helps conform to business and government security requirements
Connect via fast LAN to existing systems


Exadata X6 is Much Faster and Cheaper than All-Flash EMC
One High Capacity Exadata beats the fastest EMC XtremIO all-flash array in every performance metric
  12X more throughput
  2.5X more IOPS
  2X faster latency

[Chart: Analytic Scans (GB/sec): EMC 8 X-Brick XtremIO 24, 1 Rack HC Exadata 301 (12X)]
[Chart: OLTP Write IOPS: EMC 8 X-Brick XtremIO 2 M, 1 Rack HC Exadata 5.2 M (2.5X)]

EMC 8 X-Brick XtremIO: $7.8 M
Exadata X6-2 Full Rack: $1.1 M

EMC Performance does not scale higher - Exadata scales by adding racks


Preview: Exadata SL6
Linux on SPARC, Software in Silicon


Database Intelligence Extended into CPU Chip
SPARC M7 Software in Silicon

Traditional DB algorithms too complex for chips
Big Change: In-memory algorithms are much simpler
5 years ago Oracle initiated a revolutionary project
  Build fastest ever microprocessor
    Most processing cores (32)
    Most concurrent threads (256)
    Fastest Memory Bandwidth (160 GB/sec)
  Add In-Memory DB operations directly on chip


In-Memory Algorithms Natively Implemented in Silicon

SPARC M7 Software in Silicon
  SQL in Silicon: DB Acceleration
  Capacity in Silicon: Decompression Engines
  Silicon Secured Memory: Fine-Grained Memory Protection
Database Software Already Available

SQL in Silicon: Database In-Memory Acceleration Engines

SIMD Vectors instructions are fast, but were designed for graphics, not database
New SPARC M7 chip has 32 optimized database acceleration engines (DAX) built on chip
  Independently process streams of columns
    E.g. find all values that match 'California'
  Up to 170 Billion rows per second!
Like adding 32 additional specialized cores to chip
  Using less than 1% of chip space

[Diagram: SPARC M7 chip with Cores, Shared Cache, and DB Accel engines]


Capacity in Silicon: Decompression Engines

Compression is key to putting more data in-memory
Decompression is far more important for databases than compression
  Data is loaded once, queried many times
Bit pattern decompression in normal cores is slow
  64 CPU cores needed to decompress at full memory speed
SPARC M7 adds 32 optimized decompress engines
  Run bit-pattern decompress at memory speed
Doubles Memory Capacity
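
To illustrate why bit-pattern decompression costs CPU time on general-purpose cores, here is a minimal software sketch that unpacks fixed-width bit-packed values with shifts and masks. The function names and the fixed-width, least-significant-bit-first layout are assumptions made for this example; the slides do not describe the encoding the SPARC M7 engines actually handle.

```python
def pack_bits(values, width):
    """Pack non-negative integers into bytes, `width` bits each (hypothetical layout)."""
    acc = 0
    for i, v in enumerate(values):
        if v < 0 or v >= (1 << width):
            raise ValueError("value does not fit in the given width")
        acc |= v << (i * width)
    total_bits = len(values) * width
    return acc.to_bytes((total_bits + 7) // 8, "little")


def unpack_bits(data, width, count):
    """Recover `count` values of `width` bits each; one shift and mask per value."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]
```

Five 3-bit codes fit in two bytes instead of five or more, which is where the capacity gain comes from, but every value read back costs extra shift and mask work in software.
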


Silicon Secured Memory: Fine Grained Memory Protection

Database In-Memory places terabytes of data in memory
  More vulnerable to corruption by bugs/attacks than storage
SPARC M7 locks memory as it is allocated so only the owner can access it
  Hidden color bits added to pointers (key), and content (lock)
  Pointer color (key) must match content color or program is aborted
  Hardware support eliminates performance impact
Helps prevent access off end of structure, stale pointer access, malicious attacks, etc. plus improves developer productivity

[Diagram: Memory Pointers (keys) checked against Memory Content (locks); a mismatched access is stopped]
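
The key/lock idea can be modeled in software as follows. This is only a toy sketch: the class and method names are hypothetical, the choice of 15 usable colors is an assumption, and the real check happens in hardware rather than in code like this.

```python
class ColorMismatch(Exception):
    """Raised when a pointer's color (key) differs from the memory's color (lock)."""


class ColoredMemory:
    """Toy bump allocator that tags each cell with the color of its owner."""

    def __init__(self, size):
        self.data = [0] * size
        self.color = [0] * size      # color 0 means "not allocated"
        self.next_free = 0
        self.next_color = 1

    def allocate(self, n):
        start = self.next_free
        if start + n > len(self.data):
            raise MemoryError("out of simulated memory")
        color = self.next_color
        # Cycle through colors 1..15 so adjacent allocations always differ.
        self.next_color = self.next_color % 15 + 1
        for addr in range(start, start + n):
            self.color[addr] = color
        self.next_free += n
        return (start, color)        # the "pointer" carries its color

    def _check(self, addr, color):
        if not 0 <= addr < len(self.data) or self.color[addr] != color:
            raise ColorMismatch(f"address {addr} is not owned by color {color}")

    def load(self, ptr, offset):
        start, color = ptr
        self._check(start + offset, color)
        return self.data[start + offset]

    def store(self, ptr, offset, value):
        start, color = ptr
        self._check(start + offset, color)
        self.data[start + offset] = value

    def free(self, ptr, n):
        start, color = ptr
        for addr in range(start, start + n):
            self._check(addr, color)
            self.color[addr] = 0     # stale pointers no longer match
```

With two adjacent four-cell allocations, reading offset 4 through the first pointer lands in the second allocation and raises `ColorMismatch` (access off the end of a structure). Reading through a pointer after `free` fails the same way (stale pointer access).
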
Exadata SL6: Exadata with Ultra-fast SPARC Linux Servers
Identical to Exadata with x86 Database servers replaced by SPARC T7-2 servers
  Ultra-fast 32-core SPARC M7 Processors
  Two-socket T7-2 Servers
Same elastic configurations as Exadata X6-2
Storage servers identical to Exadata X6-2
Runs same Oracle Linux as Exadata X6-2
  Oracle Linux (UEK2) single domain configuration
Runs Oracle Database 12.1.0.2


Preview: Exadata SL6
World's Fastest and Most Secure Linux Database Machine

Massive Memory Bandwidth: 2.2x Intel x86
Fastest Database Processor: 1.9x Intel x86
Silicon Secured Memory: End to End Database Security

Exadata Smart System Software


Smart System Software Highlights

Smart Analytics
  Move queries to storage, not storage to queries
  Automatically offload and parallelize queries across all storage servers
  100X faster analytics
Smart Storage
  Hybrid Columnar Compression reduces space usage by 10X
  Database-aware Flash Caching gives speed of flash with capacity of disk
Smart OLTP
  Special InfiniBand protocol enables highest speed, lowest latency OLTP
  Ultra-fast transactions using DB optimized flash logging algorithms
  Fault-tolerant In-Memory DB by mirroring memory across servers
Smart Consolidation
  Workload prioritization from CPU to network to storage ensures QoS
  4X more Databases in same hardware


Smart System Software Introduced in 2015

Smart Analytics
  5X faster scans by converting data to Columnar format in Flash Cache
  3X faster JSON/XML by offloading to storage servers
Smart Consolidation
  Zero overhead VMs
  Snapshots for test/dev
  Set flash cache min size per DB to ensure QoS
  InfiniBand partitioning
  IPv6 for Ethernet
Smart OLTP
  3X faster OLTP messaging using direct DB to InfiniBand access
  Instant detection of node failure
  Sub-second capping of I/O latency by rerouting I/Os to faster storage
Smart Licensing
  Capacity-on-Demand reduces license cost by disabling unneeded cores
  Trusted Partitions limit license scope of specialized options


Preview: New Smart System Software

Smart Analytics
  Database In-Memory columnar format in storage server
  Aggregation in storage
  Set membership using new type of storage index
Smart Consolidation
  Hierarchical snapshots
  2X application connections*
  Automated VLAN creation*
  Add extra 10g Ethernet Card
  64GB DIMMs for 2X Memory
Smart OLTP
  Smart Fusion Block Transfer eliminates log writes when moving blocks between nodes*
  Automated rolling upgrade across full stack
  2X faster disk recovery
Smart Availability
  Short Range Stretch (Extended) clusters
  4X faster software updates*
  High redundancy Quorum disks on Quarter and Eighth racks*
  Storage Index preserved on rebalance*

*Already Released

Upcoming: In-Memory format in Columnar Flash Cache
In-Memory formats used in Smart Columnar Flash Cache
Enables vector processing on storage server during smart scans
  Multiple column values evaluated in single instruction
  Faster decompression speed than Hybrid Columnar Compression
  Enables dictionary lookup and avoids processing unnecessary rows
Smart Scan results sent back to database in In-Memory Columnar format
  Reduces Database node CPU utilization
In-memory performance seamlessly extended from DB node DRAM memory to 10x capacity flash in storage
Even bigger differentiation against all-flash arrays and other in-memory databases

[Diagram: In-Memory Columnar scans on the database node, In-Flash Columnar scans on the storage server]

Upcoming release of Exadata Software
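
The "dictionary lookup" point can be illustrated with a small sketch of a dictionary-encoded column. The function names and the encoding are assumptions chosen for clarity, not the actual In-Memory columnar format. The predicate is evaluated once per distinct value, after which rows are filtered by comparing small integer codes; if no dictionary entry matches, no rows are examined at all.

```python
def dictionary_encode(values):
    """Return (sorted distinct values, per-row integer codes)."""
    dictionary = sorted(set(values))
    code_of = {value: code for code, value in enumerate(dictionary)}
    codes = [code_of[value] for value in values]
    return dictionary, codes


def scan_equals(dictionary, codes, target):
    """Return the row numbers whose column value equals `target`."""
    # Evaluate the predicate against the (small) dictionary first.
    matching_codes = {code for code, value in enumerate(dictionary) if value == target}
    if not matching_codes:
        return []  # the value is absent, so the row codes are never touched
    # Only integer comparisons remain for the per-row work.
    return [row for row, code in enumerate(codes) if code in matching_codes]
```

In this toy form the per-row loop is scalar; the vector processing mentioned on the slide would evaluate several codes per instruction.
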
Upcoming: Storage Index Set Membership
Storage Index
  Currently contains up to 8 columns of min/max summary
  Created automatically and kept in memory
  Used to skip performing I/Os
What about queries with low cardinality columns?
  select name, address from travels
  where origin='Sierra Leone' and dest='CA'
  Traditional min/max not good enough
Database gathers stats and finds that column has less than 256 distinct values
  Database requests storage to compute bloom filter
  Storage will compute distinct values and create a bloom filter
  Smart Scans check value 'CA' against bloom filter and save performing I/O

Example rows:
ORIGIN        DEST  NAME   ADDRESS
Sierra Leone  AZ    Alice
Sierra Leone  UT    Bob
Sierra Leone  VT    John

[Diagram: First Scan: HASH(AZ), HASH(UT), HASH(VT) -> Create Bloom Filter -> Bloom Filter in Storage Index. Future Scans: HASH(CA) -> Lookup -> SAVE I/O]

Upcoming release of Exadata Software
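
The set-membership idea can be sketched as one small Bloom filter per storage region. Everything named here is hypothetical (the class, the filter size, the use of SHA-256 as the hash, and the notion of a "region" as a list of values); the slide does not specify how the storage server builds or sizes its filters.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, value):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = True

    def might_contain(self, value):
        return all(self.bits[pos] for pos in self._positions(value))


def build_region_filters(regions):
    """First scan: record the distinct column values seen in each region."""
    filters = []
    for values in regions:
        bloom = BloomFilter()
        for value in set(values):
            bloom.add(value)
        filters.append(bloom)
    return filters


def regions_to_read(filters, value):
    """Future scans: read only regions whose filter may contain the value."""
    return [i for i, bloom in enumerate(filters) if bloom.might_contain(value)]
```

For a query on `dest='CA'`, a region that holds only AZ, UT and VT would normally be skipped. Because a Bloom filter can return false positives but never false negatives, a skipped region is guaranteed not to contain the value, while a region that is read may still turn out to hold no matching rows.
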
Upcoming: Join and Aggregation Smart Scan

Extend In-Memory Aggregation technique into storage

Find Sales per country
SELECT /*+ VECTOR_TRANSFORM */ country_id, sum(amount_sold) amount_sold
FROM customers, sales
WHERE customers.cust_id = sales.cust_id
GROUP BY customers.country_id
ORDER BY customers.country_id;
Storage cells scanning sales fact table will return tuples
{ country_id, sum_amount_sold }

Join and Aggregation offloaded to the storage server

12.2 Database and 12.2 Exadata Storage Server Software
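
One way to picture the offload: each storage cell receives a compact customer-to-country mapping, joins and sums its own slice of the sales table, and returns only `{ country_id, sum_amount_sold }` pairs for the database to merge. The sketch below assumes that shape; the function names and data layout are illustrative and are not taken from the Exadata software.

```python
from collections import defaultdict


def cell_partial_aggregate(sales_rows, cust_to_country):
    """Work done on one storage cell: join each sales row, then pre-aggregate."""
    partial = defaultdict(int)
    for cust_id, amount_sold in sales_rows:
        country_id = cust_to_country.get(cust_id)
        if country_id is not None:          # inner join: unmatched rows are dropped
            partial[country_id] += amount_sold
    return dict(partial)


def merge_partials(partials):
    """Work done on the database node: combine per-cell sums, ordered by country."""
    total = defaultdict(int)
    for partial in partials:
        for country_id, amount in partial.items():
            total[country_id] += amount
    return dict(sorted(total.items()))
```

Each cell sends back at most one tuple per country instead of one per sales row, which is the data-volume reduction the slide describes.
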
Upcoming: Smart write bursts and temp IO in flash cache
Write throughput of four flash cards has become greater than the write throughput of 12 disks
When database write throughput exceeds the throughput of disks, smart flash cache intelligently caches writes
When queries write a lot of temp IO and it is bottlenecked on disk, smart flash cache intelligently caches temp IO
  Writes to flash for temp spill reduces elapsed time
  Reads from flash for temp reduces elapsed time further
Smart flash cache prioritizes OLTP data and does not remove hot OLTP lines from the cache
Smart flash wear management for large writes

[Diagram: Write Bursts and Temp IO in Flash Cache]

Upcoming release of Exadata Software

Upcoming: Smart Analytics Software Features

Compressed Index Fast Full Scan
Smart Scan VIEWs with LOBs, XML and JSON
  not just tables
AWR Enhancements
  Diff report for Exadata section
  Flash Cache Metrics
  More granular histograms
Up to 25% reduction in Storage Server CPU for SPARC SuperCluster during Smart Scans
  Reduces endianness conversion overhead

12.2 Database and 12.2 Exadata Storage Server Software

Upcoming: Snapshots
Hierarchical Snapshots
  Create snapshots of databases on previously created snapshots
  Use case example
    Development releases nightly build of the database
    Tester creates a snapshot for himself and finds a bug
    Tester creates a snapshot of his snapshot
    Tester provides the new copy back to development for analysis
  Syntax and technology remain unchanged
  Works with pluggable and non-pluggable databases
Sparse backup of snapshots
  RMAN backs up the modified blocks and not the unchanged blocks from parent

[Diagram: Nightly Master -> Test Snapshot -> Snapshot to Dev]

12.2 Database and 12.2 Exadata Storage Server Software
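
The snapshot-of-a-snapshot behavior can be modeled as a chain of sparse copy-on-write layers. The class below is a hypothetical illustration, not the Exadata or ASM implementation: each snapshot stores only the blocks written to it and falls back to its parent for everything else, and a sparse backup copies just those locally modified blocks.

```python
class SparseSnapshot:
    """Toy copy-on-write layer; unwritten blocks are read from the parent chain."""

    def __init__(self, parent=None, blocks=None):
        self.parent = parent
        self.blocks = dict(blocks or {})   # only blocks written in this layer

    def read(self, block_no):
        layer = self
        while layer is not None:
            if block_no in layer.blocks:
                return layer.blocks[block_no]
            layer = layer.parent
        raise KeyError(block_no)

    def write(self, block_no, data):
        self.blocks[block_no] = data       # never touches the parent

    def snapshot(self):
        """Create a child snapshot; the parent is assumed read-only afterwards."""
        return SparseSnapshot(parent=self)

    def sparse_backup(self):
        """Return only the blocks modified in this layer, not inherited ones."""
        return dict(self.blocks)
```

Following the use case on the slide: the nightly master is the root, the tester's snapshot is its child, and the copy handed back to development is a child of the tester's snapshot. Each layer's sparse backup contains only the blocks changed at that level.
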
Upcoming: Extended Distance Clusters
Two sites and a quorum site
InfiniBand connected for high performance
  100m optical cables in 2016 (best for fire cells)
Created using ASM Extended Diskgroups
  Nested failure groups
Compute nodes at each site read data local to that site
Data is written to all sites
Smart Scans scan across cells on both sites increasing throughput
  Row filtering, column projection, storage index, and flash cache provide extreme performance
Data Guard continues to be the recommended DR solution

[Diagram: two sites connected by InfiniBand, plus a Quorum Failure Group]

12.2 Database and 12.2 Exadata Storage Server Software

Smart Fusion Block Transfer
OLTP workloads can have hot blocks that are frequently updated (e.g. right-growing index)
  Log file must be written before transferring a hot block between instances so the block can be recovered
  Adds latency and reduces throughput
On Exadata, Oracle does not wait for the log write
  Exadata ensures the log write completes before changes to block on another instance commit, guaranteeing durability
  Wait for Log I/O during transfer of hot blocks is eliminated
Up to 40% throughput and 33% response time improvement in some heavily contended OLTP workloads
Available with 12.1.0.2 BP12

[Diagram: Prior Inter-Instance Block Transfer Protocol: 1. Issue log write, 2. Wait for log write completion, 3. Transfer block. Exadata Avoids I/O Wait]

Upcoming: Super Fast Software Updates

4x speed up in Storage Server Software Update
  Parallel firmware upgrades across components such as hard disks, flash, ILOM/BIOS, InfiniBand card
  Reduced reboots for Software updates
    Use kexec where possible
Manage a Cloud instead of managing a single rack
  Use single patchmgr utility to upgrade hundreds of racks
  Enable patchmgr to run from a non-Exadata system and run as low privileged user

Upcoming release of Exadata Software

Upcoming: Extreme Manageability
IPv6 + Virtual machine + VLAN deployments
Get graphs from ExaWatcher
Make DNS, NTP, and other IP address changes online
Seamless customer service with Automatic Service Requests sending diagnostic attachments
Manage Compute nodes using a RESTful service
  ExaCLI enabled for compute nodes in addition to storage cells
Much faster rebalance with improved flash cache hit ratio during rebalance
Secure Erase during hardware retirement

Upcoming release of Exadata Software

Exadata Advantages Increase Every Year
Transformational OLTP, Analytics, Consolidation
Cloud Without Compromise

Exadata Cloud Service
In-Memory Columnar in Flash
Smart Fusion Block Transfer
In-Memory Fault Tolerance
Direct-to-wire Protocol
JSON and XML offload
Instant failure detection
Network Resource Management
Multitenant Aware Resource Mgmt
3D V-NAND Flash
Prioritized File Recovery
Software-in-Silicon
IO Priorities
Data Mining Offload
Tiered Disk/Flash
Offload Decrypt on Scans
Database Aware Flash Cache
PCIe NVMe Flash
Storage Indexes
Columnar Compression
Unified InfiniBand
Smart Scan
InfiniBand Scale-Out
DB Processors in Storage
Scale-Out Storage
Scale-Out Servers

