Conf Micro 2006
Problem Statement
Excerpt From SOD Public Document
Studies indicate that approximately 80% of all CPI (Critical Program Information) is contained in software/firmware. A broader range of robust techniques or technologies that protect software, data, and firmware is essential and will have a broad impact on protecting CPI. Secure programmable logic devices and secure processors are needed.
In secure systems, particularly weapon systems, critical technology may be available to an enemy if access is acquired to the system software. In case of capture of a system intact, this information may be available to an adversary by reading the system memory.
Exploits
software patching/amputation, de-compilation, worm, virus
rootkit, system call tampering, kernel space eavesdrop
BIOS spoof/hijack, boot image virus
chip interconnect/bus snoop, eavesdrop, device spoof
power analysis, timing analysis, etc.
de-packaging, micro-probing, optical reverse engineering
Solution
application signing, access control
OS signing, virtualization
TCG/TPM (trusted platform module)
secure processor, memory encryption
self-timed circuit, obfuscated power footprint
secure packaging, private circuit
Architecture Overview
Processor Core
L1/L2 $
Memory Enc/Dec, Integrity Verification Engine
Trusted Secure Proc
Encrypted Memory
Micro 2003, PACT 2004, ASPLOS 2004, ISCA 2005, ISCA 2006.
Decryption
verification in superscalar processor
Decryption is faster than authentication
Great temptation to issue decrypted instructions/data before authentication
Disassociation of decryption and authentication
Confidentiality violations
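The gap between decryption and authentication can be illustrated with a minimal sketch. It assumes a pad-based (counter-mode style) scheme in which each clear block is the XOR of a cipher block and a pad, and a MAC computed over the cipher blocks. The pad generator below is a hash-based stand-in rather than a real block cipher, and all names (`pad`, `encrypt_line`, `decrypt_line`, `authenticate_line`), keys, and sizes are hypothetical. The point is only that decryption never consults the MAC, so plaintext derived from a tampered line is available before authentication reports a failure.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per cipher block (assumed size)


def pad(key: bytes, addr: int, counter: int) -> bytes:
    """Stand-in pad generator: hash of (key, address, counter).
    A real design would use a block cipher; this is only for illustration."""
    seed = key + addr.to_bytes(8, "little") + counter.to_bytes(8, "little")
    return hashlib.sha256(seed).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_line(enc_key, mac_key, addr, counter, clear_blocks):
    """Encrypt each block with its pad and compute a MAC over the cipher blocks."""
    cipher_blocks = [xor(blk, pad(enc_key, addr + i, counter))
                     for i, blk in enumerate(clear_blocks)]
    mac = hmac.new(mac_key, b"".join(cipher_blocks), hashlib.sha256).digest()
    return cipher_blocks, mac


def decrypt_line(enc_key, addr, counter, cipher_blocks):
    """Decryption needs only the pads; it does not depend on the MAC."""
    return [xor(blk, pad(enc_key, addr + i, counter))
            for i, blk in enumerate(cipher_blocks)]


def authenticate_line(mac_key, cipher_blocks, mac):
    """Recompute the MAC and compare it with the stored one (the '=?' step)."""
    expected = hmac.new(mac_key, b"".join(cipher_blocks), hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)


enc_key, mac_key = b"k" * 16, b"m" * 16            # hypothetical keys
clear = [bytes([i]) * BLOCK for i in range(4)]
cipher, mac = encrypt_line(enc_key, mac_key, 0x1000, 7, clear)

# An adversary flips one ciphertext bit in memory.
tampered = list(cipher)
tampered[0] = xor(tampered[0], b"\x01" + bytes(BLOCK - 1))

early_plaintext = decrypt_line(enc_key, 0x1000, 7, tampered)  # available first
verified = authenticate_line(mac_key, tampered, mac)          # fails, but later
```

In this sketch the flipped ciphertext bit shows up as the same flipped bit in `early_plaintext`, which is what a pipeline would consume if it issued decrypted data before authentication completed.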
[Figure: four Pads and four Cipher Blocks yield four Clear Blocks; a MAC is computed and compared (=?) against the stored MAC]
Authentication-then-fetch
[Figure: pipeline blocks: Issue Queue, Reservation Station, FU, LQ, SQ, iL1$, L2$, Front Side Bus Control]
[Figure: dataflow of TEST R1, R5; BEQ Addr2 (NO); R3<-R1+4; R2<-[R1]; R4<-[Add1]; R1<-[R3]; R4<-R4+R2; [Addr1]<-R4, shown with plaintext bit patterns, the fetch address (addr =), and list nodes of the form (Data, Next)]
Why? Fetches are not considered state changes, and a fetch is launched speculatively to improve performance.
[Figure labels: Secret; list nodes (Data, NULL); bit pattern 1 1 1 0 1 0 0 1; JMP Add1]
R4 <- 0
Add1:
R5 <- 0
TEST R1, R5
BEQ Addr2
R2 <- [R1]
R4 <- R4 + R2
R3 <- R1+4
R1 <- [R3]
JMP Add1
Add2:
[Figure: unrolled speculative iterations of the loop (TEST R1, R5; BEQ Addr2 (NO); R2<-[R1]; R3<-R1+4; R4<-R4+R2; R1<-[R3]), with steps labeled "Load Secret" and "Disclose Secret"]
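The fetch-address side channel suggested by the loop and the "Load Secret" / "Disclose Secret" labels can be sketched as a toy simulation. It assumes that a tampered Next field decrypts to a secret value (how the adversary arranges this is outside the sketch), that every fetch address is visible on the external bus, and that without fetch gating the integrity fault is raised only after the dependent fetch has been sent. `Memory`, `sum_list`, `run`, the addresses, and `SECRET` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass, field

NULL = 0
SECRET = 0xBEEF  # hypothetical secret value


class IntegrityFault(Exception):
    """Raised when authentication of a fetched line fails."""


@dataclass
class Memory:
    words: dict      # address -> decrypted word
    tampered: set    # addresses whose line fails authentication
    bus_trace: list = field(default_factory=list)  # addresses seen on the bus

    def fetch(self, addr):
        self.bus_trace.append(addr)  # the address is visible to a bus snooper
        return self.words.get(addr, 0), addr not in self.tampered


def sum_list(mem, head, authen_then_fetch):
    """Walk a list of (Data, Next) nodes, mirroring the loop on the slide."""
    r1, r4 = head, 0
    unverified = False                 # R1 derives from a line that fails authentication
    while r1 != NULL:                  # TEST R1, R5 ; BEQ Addr2
        if unverified and authen_then_fetch:
            # The fetch is held until authentication finishes, and it fails.
            raise IntegrityFault("detected before the dependent fetch")
        r2, _ = mem.fetch(r1)          # R2 <- [R1]   ("Disclose Secret" if R1 is the secret)
        if unverified:
            # Without gating, the fault arrives after the address left the chip.
            raise IntegrityFault("detected after the dependent fetch")
        r4 += r2                       # R4 <- R4 + R2
        r1, authentic = mem.fetch(r1 + 4)  # R3 <- R1+4 ; R1 <- [R3]  ("Load Secret")
        unverified = not authentic
    return r4


def run(authen_then_fetch):
    # Two nodes at 0x100 and 0x200; the last Next field is tampered.
    words = {0x100: 5, 0x104: 0x200, 0x200: 7, 0x204: SECRET}
    mem = Memory(words=words, tampered={0x204})
    try:
        sum_list(mem, 0x100, authen_then_fetch)
    except IntegrityFault:
        pass
    return mem.bus_trace


leaky_trace = run(authen_then_fetch=False)
gated_trace = run(authen_then_fetch=True)
```

In this model `leaky_trace` ends with the secret value used as a fetch address, while `gated_trace` stops at the tampered Next field. The fault is raised in both runs; the difference is only whether the dependent address reached the bus first.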
Simplified Implementation
Integrity Verification Logic
Verified Integrity of Line (Tag = 6)
Write Line
Read Line
Experiment Setup
Parameters            Value
L1 I/D Cache          DM, 16KB
L2 Cache              4way, unified, 256KB/1M
Memory Bus            200MHz, 8B wide
CPU Clock             1GHz
L1 Latency            1 cycle
L2 Latency
Decryption Latency
RUU
Results
[Bar chart: y-axis 0 to 1.2; x-axis benchmarks mcf, ammp, applu, mgrid, parser, twolf, swim, vpr, wupwise, bzip2, apsi, gzip, perl, gap, gcc, art, mesa, vortex, average]
Performance Ranking write > commit > fetch > commit+fetch > issue > commit + addr obfuscation
Results
[Bar chart: y-axis 0 to 1.2; x-axis benchmarks mcf, ammp, applu, mgrid, parser, perl, swim, twolf, vpr, wupwise, bzip2, gap, apsi, gcc, gzip, art, mesa, vortex, average]
Performance Ranking write > commit > fetch > commit+fetch > issue > commit + addr obfuscation
Results
[Bar chart: y-axis 0 to 0.6; x-axis benchmarks mcf, ammp, applu, mgrid, parser, perl, swim, twolf, vpr, wupwise, bzip2, gap, apsi, gcc, gzip, art, mesa, vortex, average; series: commit_over_issue, write_over_issue, commit+fetch_over_issue]
Significant Advantage of Write, Commit Over Issue
Commit + Fetch 5-10% Faster Than Issue
Results
[Bar chart: y-axis 0 to 0.35; x-axis benchmarks mcf, ammp, applu, mgrid, parser, perl, swim, twolf, wupwise, bzip2, gap, gcc, apsi, gzip, mesa, vortex, vpr, art, average; series: commit_over_issue, write_over_issue, commit+fetch_over_issue]
Significant Advantage of Write, Commit Over Issue
Commit + Fetch: Marginal Averaged Improvement; mgrid, vpr: 5-10%
Results
Hash Tree
[Bar chart: y-axis 0 to 1.2; x-axis benchmarks mcf, ammp, applu, mgrid, parser, perl, swim, twolf, vpr, wupwise, bzip2, gap, apsi, gcc, gzip, art, mesa, vortex, average; series: authen_then_commit, authen_then_fetch]
Results
[Bar chart: y-axis 0 to 0.4; x-axis benchmarks mcf, ammp, applu, mgrid, parser, perl, swim, twolf, vpr, wupwise, bzip2, gap, gcc, gzip, apsi, art, mesa, vortex, average; series: commit_over_issue, commit+fetch_over_issue]
Results
[Bar chart: y-axis 0 to 1.2; x-axis benchmarks mcf, ammp, applu, mgrid, parser, perl, swim, twolf, vpr, wupwise, bzip2, gap, apsi, gcc, gzip, art, mesa, vortex, average; series: authen_then_issue, authen_then_write, authen_then_commit, authen_then_commit_fetch]
Performance Ranking write > commit > fetch > commit+fetch > issue > commit + addr obfuscation
Results
[Bar chart: y-axis 0 to 0.6; x-axis benchmarks mcf, ammp, applu, mgrid, parser, perl, swim, twolf, vpr, wupwise, bzip2, gap, apsi, gcc, gzip, art, mesa, vortex, average; series: commit_over_issue, commit+fetch_over_issue]
Conclusions
An OOO processor pipeline requires special attention to when decrypted data or instructions can be used or issued.
To prevent memory fetch address side-channel exploits, authentication-then-issue and authentication-then-fetch are recommended.
Performance ranking: authen-then-write > authen-then-fetch+commit > authen-then-commit > authen-then-issue
Authentication-then-fetch+commit outperforms authentication-then-issue
Precise interrupt
Integrity verified architecture and memory states
Questions
Invariant Prologue
SP, -16(SP)
STQ Zero, 8(SP)
After Step 2
R1<-[addr]
R2<-[R1]
Runtime
R1<-[addr]    Load Secret
R2<-[R1]      Disclose Secret
Timing Analysis
Latency of new fetch address from the previous fetch
Authentication-then-issue
[Time line: external memory fetch, decryption, authentication, external memory fetch]
[Time line: decryption, authentication, Stall]
Authentication-then-fetch
[Time line: external memory fetch, decryption, decryption]
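The latency of a new fetch address relative to the previous fetch can be compared with a toy model. It rests on assumptions that are not stated on the slides: decryption and authentication both start when the line arrives from external memory, authentication takes longer than decryption, and computing the dependent address costs a fixed number of cycles. The function name and all cycle counts are hypothetical.

```python
def dependent_fetch_latency(policy, mem, dec, auth, exe):
    """Cycles from the start of one external fetch until a fetch that depends
    on its data can be sent to memory (toy model, see assumptions above)."""
    data_ready = mem + dec   # plaintext is available
    verified = mem + auth    # authentication result is available
    if policy == "authen_then_issue":
        # Nothing issues before verification, so the address is computed afterwards.
        return verified + exe
    if policy == "authen_then_fetch":
        # The address is computed speculatively; only the fetch waits (stall).
        return max(data_ready + exe, verified)
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical latencies in cycles, chosen only for illustration.
issue_latency = dependent_fetch_latency("authen_then_issue", mem=100, dec=10, auth=40, exe=5)
fetch_latency = dependent_fetch_latency("authen_then_fetch", mem=100, dec=10, auth=40, exe=5)
```

With these made-up numbers the model gives 145 cycles for authentication-then-issue and 140 for authentication-then-fetch. The gap is the address computation that overlaps with authentication; the model ignores independent instructions, which can also proceed during the stall under authentication-then-fetch.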