IHE Columbia Theory Seminar


Fully Homomorphic Encryption over the Integers


Many slides borrowed
from Craig

Marten van Dijk (1), Craig Gentry (2), Shai Halevi (2), Vinod Vaikuntanathan (2)
(1) MIT, (2) IBM Research
The Goal

I want to delegate processing of my data, without giving away access to it.
Application: Cloud Computing

I want to delegate processing of my data, without giving away access to it.

 Storing my files on the cloud


 Encrypt them to protect my information
 Later, I want to retrieve the files containing
“cloud” within 5 words of “computing”.
 Cloud should return only these (encrypted) files,
without knowing the key
Computing on Encrypted Data
 Separating processing from access via
encryption:
 I will encrypt my stuff before sending it to
the cloud
 They will apply their processing on the
encrypted data, send me back the
processed result
 I will decrypt the result and get my answer
Application: Private Google Search

I want to delegate processing of my data, without giving away access to it.

 Private Internet search


 Encrypt my query, send to Google
 Google cannot “see” my query, since it does not
know my key
 I still want to get the same results
 Results would be encrypted too
 Privacy combo: Encrypted query on encrypted data
An Analogy: Alice’s Jewelry Store
 Alice’s workers need to assemble raw
materials into jewelry
 But Alice is worried about theft
How can the workers process the raw
materials without having access to them?
An Analogy: Alice’s Jewelry Store
 Alice puts materials in locked glove box
 For which only she has the key
 Workers assemble jewelry in the box
 Alice unlocks box to get “results”
The Analogy
 Encrypt: putting things inside the box
  Anyone can do this (imagine a mail-drop)
  c_i ← Enc(m_i)
 Decrypt: taking things out of the box
  Only Alice can do it, requires the key
  m* ← Dec(c*)
 Process: assembling the jewelry
  Anyone can do it, computing on ciphertext
  c* ← Process(c_1,…,c_n)
 m* = Dec(c*) is “the ring”, made from “raw materials” m_i
Public-key Encryption
 Three procedures: KeyGen, Enc, Dec
  (sk,pk) ← KeyGen($)
  Generate random public/secret key-pair
  c ← Enc_pk(m)
  Encrypt a message with the public key
  m ← Dec_sk(c)
  Decrypt a ciphertext with the secret key
 E.g., RSA: c ← m^e mod N, m ← c^d mod N
  (N,e) public key, d secret key
Homomorphic Public-key Encryption
 Another procedure: Eval (for Evaluate)
  c* ← Eval(pk, f, c_1,…,c_t)
  f is the function; c_1,…,c_t are encryptions of the inputs m_1,…,m_t to f
  c* is an encryption of f(m_1,…,m_t), i.e., Dec(sk, c*) = f(m_1,…,m_t)
 No info about m_1,…,m_t, f(m_1,…,m_t) is leaked
 f(m_1,…,m_t) is the “ring” made from raw materials m_1,…,m_t inside the encryption box
Can we do it?
 As described so far, sure..
  (Π, c_1,…,c_n) = c* ← Eval_pk(Π, c_1,…,c_n)
  Dec_sk(c*) decrypts the individual c_i’s, applies Π (the workers do nothing, Alice assembles the jewelry by herself)
Of course, this is cheating:
 We want c* to remain small
  independent of the size of Π
  “Compact” homomorphic encryption
  This is the main challenge
 We may also want Π to remain secret
  Can be done with “generic tools” (Yao’s garbled circuits)


Previous Schemes
c ← Eval(pk, f, c_1,…,c_t), Dec(sk, c) = f(m_1,…,m_t)
 Only “somewhat homomorphic”
  Can only handle some functions f
 RSA works for MULT function (mod N)
  c_1 = m_1^e, c_2 = m_2^e, …, c_t = m_t^e
  c = c_1 × … × c_t = (m_1 × … × m_t)^e (mod N)
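The MULT property of RSA can be checked numerically. The following is a minimal sketch using textbook RSA with tiny illustrative primes (61 and 53, chosen here only for the example, not from the slides) and no padding, so it is in no way secure.

```python
# Toy textbook RSA (no padding). The primes are illustrative assumptions
# and far too small to offer any security.
P_PRIME, Q_PRIME = 61, 53
N = P_PRIME * Q_PRIME                            # public modulus
E = 17                                           # public exponent
D = pow(E, -1, (P_PRIME - 1) * (Q_PRIME - 1))    # secret exponent (Python 3.8+)


def rsa_enc(m: int) -> int:
    """c = m^e mod N."""
    return pow(m, E, N)


def rsa_dec(c: int) -> int:
    """m = c^d mod N."""
    return pow(c, D, N)


m1, m2 = 7, 11
# "Eval" for MULT: multiply the ciphertexts, never touching m1 or m2.
c_prod = (rsa_enc(m1) * rsa_enc(m2)) % N
product = rsa_dec(c_prod)    # equals (m1 * m2) mod N
```

Multiplying ciphertexts works because (m1^e)(m2^e) = (m1·m2)^e mod N; nothing comparable holds for addition, which is why RSA is only “somewhat” homomorphic.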


“Somewhat Homomorphic” Schemes

 RSA, ElGamal work for MULT mod N


 GoMi, Paillier work for XOR, ADD
 BGN05 works for quadratic formulas
Schemes with large ciphertext
 SYY99 works for shallow fan-in-2 circuits
 c* grows exponentially with the depth of f
 IsPe07 works for branching program
 c* grows with length of program
 AMGH08 for low-degree polynomials
 c* grows exponentially with degree
Connection with 2-party computation

 Can get “homomorphic encryption” from


certain protocols for 2-party secure
function evaluation
 E.g., Yao86
 But size of c*, complexity of decryption,
more than complexity of the function f
 Think of Alice assembling the ring herself
 These are solving a different problem
A Recent Breakthrough
 Gentry09: A bootstrapping technique
  Scheme E can handle its own decryption function ⇒ Scheme E* can handle any function
 Gentry also described a candidate “bootstrappable” scheme
 Based on ideal lattices
The Current Work
 A second “bootstrappable” scheme
 Very simple: using only modular arithmetic
 Security is based on the hardness of
finding “approximate-GCD”
Outline
1. Homomorphic symmetric encryption
 Very simple
2. Turning it into public-key encryption
 Result is “almost bootstrappable”
3. Making it bootstrappable
 Similar to Gentry’09
4. Security (as much as we have time)
5. Gentry’s bootstrapping technique (not today)
A homomorphic symmetric encryption
 Shared secret key: odd number p
 To encrypt a bit m:
  Choose at random small r, large q
  Output c = m + 2r + pq
  The “noise” m + 2r is much smaller than p
  Ciphertext is close to a multiple of p
  m = LSB of distance to nearest multiple of p
 To decrypt c:
  Output m = (c mod p) mod 2
  m = c – p·[c/p] mod 2
    = c – [c/p] mod 2 (since p is odd)
    = LSB(c) XOR LSB([c/p])
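The symmetric scheme fits in a few lines. This is a toy sketch with assumed bit sizes (64-bit p, 8-bit r, 128-bit q), picked only for readability; the function names are hypothetical. It reads “c mod p” as the signed distance to the nearest multiple of p and also checks the LSB(c) XOR LSB([c/p]) form of decryption.

```python
import random


def keygen(p_bits: int = 64) -> int:
    """Secret key: a random odd integer with exactly p_bits bits."""
    return random.getrandbits(p_bits) | (1 << (p_bits - 1)) | 1


def encrypt(p: int, m: int, r_bits: int = 8, q_bits: int = 128) -> int:
    """c = m + 2r + pq with small r and large q."""
    r = random.getrandbits(r_bits)
    q = random.getrandbits(q_bits)
    return m + 2 * r + p * q


def centered_mod(c: int, p: int) -> int:
    """Signed distance from c to the nearest multiple of p."""
    r = c % p
    return r - p if r > p // 2 else r


def decrypt(p: int, c: int) -> int:
    """m = (c mod p) mod 2."""
    return centered_mod(c, p) % 2


def decrypt_lsb(p: int, c: int) -> int:
    """m = LSB(c) XOR LSB([c/p]), where [c/p] is c/p rounded to nearest."""
    nearest = (c + p // 2) // p    # exact rounding because p is odd
    return (c & 1) ^ (nearest & 1)
```

Both decryption routines agree as long as the noise m + 2r stays below p/2, which holds with a wide margin for these sizes.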
Homomorphic Public-Key Encryption
 Secret key is an odd p as before
 Public key is many “encryptions of 0”
  x_i = [q_i·p + 2r_i]_{x_0} for i=1,2,…,t
 Enc_pk(m) = [subset-sum(x_i’s) + m]_{x_0}
 Dec_sk(c) = (c mod p) mod 2
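A sketch of the public-key variant, under assumed toy sizes (t = 16, 64-bit p, 128-bit q_i, 8-bit r_i). Taking x_0 to be the largest of the sampled values is an assumption of this sketch, and all function names are hypothetical.

```python
import random


def centered_mod(c: int, p: int) -> int:
    """Signed distance from c to the nearest multiple of p."""
    r = c % p
    return r - p if r > p // 2 else r


def keygen(t: int = 16, p_bits: int = 64, q_bits: int = 128, r_bits: int = 8):
    """Return (p, pk) where pk = [x_0, x_1, ..., x_t] are encryptions of 0."""
    p = random.getrandbits(p_bits) | (1 << (p_bits - 1)) | 1
    xs = [p * random.getrandbits(q_bits) + 2 * random.getrandbits(r_bits)
          for _ in range(t + 1)]
    xs.sort(reverse=True)                    # assumption: x_0 is the largest
    x0 = xs[0]
    return p, [x0] + [x % x0 for x in xs[1:]]


def encrypt(pk, m: int) -> int:
    """[subset-sum(x_i's) + m] reduced modulo x_0."""
    x0, rest = pk[0], pk[1:]
    subset = [x for x in rest if random.getrandbits(1)]
    return (sum(subset) + m) % x0


def decrypt(p: int, c: int) -> int:
    """(c mod p) mod 2, with c mod p taken around zero."""
    return centered_mod(c, p) % 2
```

Reducing modulo x_0 subtracts a small multiple of x_0, which is itself a near-multiple of p, so the result stays close to a multiple of p and the parity of the noise still equals m. The noise can become negative after this reduction, which is why the sketch uses the centered remainder.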
Why is this homomorphic?
 Basically because:
 If you add or multiply two near-multiples
of p, you get another near multiple of p…
Why is this homomorphic?
 c_1 = q_1·p + 2r_1 + m_1, c_2 = q_2·p + 2r_2 + m_2
  2r_i + m_i is the distance to the nearest multiple of p
 c_1 + c_2 = (q_1+q_2)·p + 2(r_1+r_2) + (m_1+m_2)
  2(r_1+r_2) + (m_1+m_2) still much smaller than p
  ⇒ c_1 + c_2 mod p = 2(r_1+r_2) + (m_1+m_2)
 c_1 × c_2 = (c_1·q_2 + q_1·c_2 − q_1·q_2·p)·p + 2(2r_1·r_2 + r_1·m_2 + m_1·r_2) + m_1·m_2
  2(2r_1·r_2 + …) still much smaller than p
  ⇒ c_1 × c_2 mod p = 2(2r_1·r_2 + …) + m_1·m_2
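The two identities above can be tried out directly: adding ciphertexts yields XOR of the bits and multiplying yields AND. A toy sketch, assuming a fixed odd 64-bit p, 8-bit r and 128-bit q (all illustrative values, not from the slides):

```python
import random

P = (1 << 63) | 12345        # hypothetical odd secret key


def enc(m: int) -> int:
    """c = q*P + 2r + m."""
    return P * random.getrandbits(128) + 2 * random.getrandbits(8) + m


def dec(c: int) -> int:
    """(c mod P) mod 2, with c mod P taken around zero."""
    r = c % P
    if r > P // 2:
        r -= P
    return r % 2


def add_then_decrypt(m1: int, m2: int) -> int:
    return dec(enc(m1) + enc(m2))    # noise 2(r1+r2) + (m1+m2)


def mul_then_decrypt(m1: int, m2: int) -> int:
    return dec(enc(m1) * enc(m2))    # noise (2r1+m1)(2r2+m2)
```

With these sizes the product noise is below 2^18, far less than P/2, so both operations decrypt correctly.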
Why is this homomorphic?
 c_1 = m_1 + 2r_1 + q_1·p, …, c_t = m_t + 2r_t + q_t·p
 Let f be a multivariate poly with integer coefficients (sequence of +’s and ×’s)
 Let c = Eval_pk(f, c_1,…,c_t) = f(c_1,…,c_t)
  f(c_1,…,c_t) = f(m_1+2r_1, …, m_t+2r_t) + qp
               = f(m_1,…,m_t) + 2r + qp
  Suppose this noise, f(m_1,…,m_t) + 2r, is much smaller than p
 Then (c mod p) mod 2 = f(m_1,…,m_t) mod 2
That’s what we want!


How homomorphic is this?
 Can keep adding and multiplying until the “noise term” grows larger than p/2
  Noise doubles on addition, squares on multiplication
  Multiplying d ciphertexts ⇒ noise of size ~2^(dn)
 We choose r ~ 2^n, p ~ 2^(n^2) (and q ~ 2^(n^5))
  Can compute polynomials of degree n before the noise grows too large
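The degree bound can be observed on a toy instance. The sketch below assumes r < 2^8 and a 64-bit p (so the ratio of bit-lengths is much smaller than in the slide's parameters) and multiplies d fresh encryptions of 1, tracking the noise. Names and sizes are illustrative assumptions.

```python
import random

N_BITS = 8                   # r < 2^N_BITS, so a fresh noise 2r+1 < 2^(N_BITS+1)
P_BITS = 64
P = random.getrandbits(P_BITS) | (1 << (P_BITS - 1)) | 1


def enc(m: int) -> int:
    return m + 2 * random.getrandbits(N_BITS) + P * random.getrandbits(5 * P_BITS)


def noise(c: int) -> int:
    """Signed distance from c to the nearest multiple of P."""
    r = c % P
    return r - P if r > P // 2 else r


def dec(c: int) -> int:
    return noise(c) % 2


def product_of_fresh_ones(d: int) -> int:
    """Multiply d fresh encryptions of 1 (a degree-d monomial)."""
    c = 1
    for _ in range(d):
        c *= enc(1)
    return c


# The product noise is below 2^(d*(N_BITS+1)); it stays under P/2 for sure
# as long as d*(N_BITS+1) <= P_BITS - 2.
max_safe_degree = (P_BITS - 2) // (N_BITS + 1)
```

Here max_safe_degree is 6; beyond that the noise may wrap around p and decryption is no longer guaranteed. In the slide's parameters the same calculation gives degree about n^2/n = n.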
Keeping it small
 The ciphertext’s bit-length doubles with every multiplication
  The original ciphertext already has n^6 bits
  After ~log n multiplications we get ~n^7 bits
 We can keep the bit-length at n^6 by adding more “encryptions of zero”
  |y_1| = n^6+1, |y_2| = n^6+2, …, |y_m| = 2n^6
  Whenever the ciphertext length grows, set c’ = c mod y_m mod y_{m-1} … mod y_1
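A sketch of this size-reduction ladder, under toy assumptions: BASE = 256 bits stands in for n^6, p has 64 bits and fresh noise has 8 bits. The helper names are hypothetical.

```python
import random

P = random.getrandbits(64) | (1 << 63) | 1   # hypothetical odd secret key
BASE = 256                                   # stands in for the n^6 bit-length
R_BITS = 8


def near_multiple(bits: int, m: int = 0) -> int:
    """Random q*P + 2r + m having exactly `bits` bits."""
    lo = (1 << (bits - 1)) // P + 1
    hi = (1 << bits) // P - 1
    return random.randint(lo, hi) * P + 2 * random.getrandbits(R_BITS) + m


# y_1, ..., y_m: encryptions of zero with bit-lengths BASE+1, ..., 2*BASE.
LADDER = [near_multiple(BASE + j) for j in range(1, BASE + 1)]


def shrink(c: int) -> int:
    """c' = c mod y_m mod y_{m-1} ... mod y_1."""
    for y in reversed(LADDER):
        c %= y
    return c


def dec(c: int) -> int:
    r = c % P
    if r > P // 2:
        r -= P
    return r % 2


c1, c2 = near_multiple(BASE, 1), near_multiple(BASE, 1)
c_big = c1 * c2              # roughly 2*BASE bits
c_small = shrink(c_big)      # back to about BASE bits
```

Each reduction step subtracts at most three copies of y_j, so the extra noise grows only linearly in the number of steps and the plaintext survives.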
Bootstrappable yet?
 Almost, but not quite:
 Decryption is m = LSB(c) XOR LSB([c/p])
  [c/p] is c/p, rounded to nearest integer
  Computing [c/p] takes degree O(n)
  But O(·) is more than one (maybe 7??)
  Integer c has ~n^5 bits
 Our scheme only supports degree ≤ n
 To get a bootstrappable scheme, use Gentry09 technique to “squash the decryption circuit”
How do we “simplify” decryption?
[Diagram: the old decryption algorithm Dec_E takes sk and c and outputs m]

 Idea: Add to public key another “hint” about sk


 Of course, hint should not break secrecy of encryption
 With hint, anyone can post-process the ciphertext,
leaving less work for DecE* to do
 This idea is used in server-aided cryptography.
How do we “simplify” decryption?
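[Diagram: Old approach: the old decryption algorithm Dec_E takes sk and c and outputs m. New approach: a Post-Process step takes c and the hint f(sk, r) about sk in the pub key and outputs the processed ciphertext c*; the new decryption algorithm Dec_E* takes sk* and c* and outputs m.]
Hint in pub key lets anyone post-process the ciphertext, leaving less work for Dec_E* to do.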
Squashing the decryption circuit
 Add to public key many real numbers
  d_1, d_2, …, d_t ∈ [0,2] (with “sufficient precision”)
  ∃ sparse set S for which Σ_{i∈S} d_i = 1/p mod 2
 Enc, Eval output ψ_i = c × d_i mod 2, i=1,…,t
  Together with c itself
 New secret key is bit-vector σ_1,…,σ_t
  σ_i = 1 if i∈S, σ_i = 0 otherwise
 New Dec(c) is c – [Σ_i σ_i·ψ_i] mod 2
  Can be computed with a “low-degree circuit” because S is sparse
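The squashed decryption can be sketched with exact rational arithmetic. The sizes below (t = 40, |S| = 5, 256 bits of precision) and the way the hint is generated are assumptions of this sketch; the ψ_i are kept at full precision here, whereas the slides only need “sufficient” precision.

```python
from fractions import Fraction
import random

P = random.getrandbits(64) | (1 << 63) | 1   # hypothetical odd secret key
T, S_SIZE = 40, 5
PREC = 256                                   # bits of precision, > bit-length of c


def make_hint():
    """Return (d, sigma): d_i in [0,2] with sum over S close to 1/P mod 2."""
    S = random.sample(range(T), S_SIZE)
    d = [Fraction(random.getrandbits(PREC + 1), 1 << PREC) for _ in range(T)]
    partial = sum(d[i] for i in S[1:])
    last = (Fraction(1, P) - partial) % 2
    d[S[0]] = Fraction(round(last * (1 << PREC)), 1 << PREC)  # round to PREC bits
    sigma = [1 if i in S else 0 for i in range(T)]
    return d, sigma


def enc(m: int) -> int:
    return m + 2 * random.getrandbits(8) + P * random.getrandbits(128)


def post_process(c: int, d):
    """Anyone can compute psi_i = c * d_i mod 2 from the public hint."""
    return [(c * di) % 2 for di in d]


def squashed_dec(c: int, psi, sigma) -> int:
    """m = c - [sum_i sigma_i * psi_i] mod 2."""
    s = sum(x for x, bit in zip(psi, sigma) if bit)
    return (c - round(s)) % 2
```

The sum of the selected ψ_i equals c/p modulo 2 up to a tiny error, so rounding it recovers the parity of [c/p] without ever dividing by p in the decryption step.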
A Different Way to Add Numbers
 Dec_E*(s,c) = LSB(c) XOR LSB([Σ_i σ_i·ψ_i]), with a_i = σ_i·ψ_i
 The a_i’s in binary representation:
a_{1,0}  a_{1,-1}  …  a_{1,-log t}
a_{2,0}  a_{2,-1}  …  a_{2,-log t}
a_{3,0}  a_{3,-1}  …  a_{3,-log t}
a_{4,0}  a_{4,-1}  …  a_{4,-log t}
a_{5,0}  a_{5,-1}  …  a_{5,-log t}
…
a_{t,0}  a_{t,-1}  …  a_{t,-log t}
 Our problem: t is large (e.g. n^6)
 Let b_0 be the binary rep of the Hamming weight of column 0:
b_{0,log t}  …  b_{0,1}  b_{0,0}
 Let b_{-1} be the binary rep of the Hamming weight of column -1:
b_{-1,log t}  …  b_{-1,1}  b_{-1,0}
…
 Let b_{-log t} be the binary rep of the Hamming weight of column -log t:
b_{-log t,log t}  …  b_{-log t,1}  b_{-log t,0}
 Only log t numbers with log t bits of precision. Easy to handle.
Computing Sparse Hamming Wgt.
a_{1,0}  a_{1,-1}  …  a_{1,-log t}
0  0  …  0
0  0  …  0
a_{4,0}  a_{4,-1}  …  a_{4,-log t}
0  0  …  0
…
a_{t,0}  a_{t,-1}  …  a_{t,-log t}
 Most rows are zero: a_i = σ_i·ψ_i and S is sparse
Computing Sparse Hamming Wgt.
 Binary representation of the Hamming weight of a = (a_1,…,a_t) ∈ {0,1}^t
  The i’th bit of HW(a) is e_{2^i}(a) mod 2
  e_k is elementary symmetric poly of degree k
  Sum of all products of k bits
 We know a priori that weight ≤ |S|
  Only need up to e_{2^[log |S|]}(a)
  Polynomials of degree up to |S|
 Set |S| ~ n, then E* is bootstrappable.
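The claim that bit i of the Hamming weight equals e_{2^i}(a) mod 2 can be checked on plaintext bits. The sketch below computes the elementary symmetric polynomials modulo 2 with the standard dynamic program; in the scheme the same additions and multiplications would be applied to ciphertexts. Function names are hypothetical.

```python
import random


def elementary_symmetric_mod2(bits, k_max: int):
    """Return [e_0, e_1, ..., e_{k_max}] of the bit vector, each mod 2."""
    e = [1] + [0] * k_max
    for a in bits:
        # e_k <- e_k + a * e_{k-1}; descending k so e_{k-1} is still the old value
        for k in range(k_max, 0, -1):
            e[k] ^= e[k - 1] & a
    return e


def hamming_weight_bits(bits, max_weight: int):
    """Binary representation (LSB first) of HW(bits), assuming HW <= max_weight."""
    n_bits = max_weight.bit_length()
    top = 1 << (n_bits - 1)              # highest power of two that is <= max_weight
    e = elementary_symmetric_mod2(bits, top)
    return [e[1 << i] for i in range(n_bits)]


def sparse_vector(t: int, weight: int):
    """A length-t bit vector with exactly `weight` ones at random positions."""
    ones = set(random.sample(range(t), weight))
    return [1 if i in ones else 0 for i in range(t)]
```

For example, `hamming_weight_bits(sparse_vector(50, 5), 8)` gives the bits of 5, least significant first. Only polynomials of degree up to the weight bound are evaluated, regardless of how long the vector is.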
Security
 The approximate-GCD problem:
  Input: integers w_0, w_1,…, w_t,
  Chosen as w_i = q_i·p + r_i for a secret odd p
  p ∈_$ [0,P], q_i ∈_$ [0,Q], r_i ∈_$ [0,R] (with R ≪ P ≪ Q)
  Task: find p
 Thm: If we can distinguish Enc(0)/Enc(1) for some p, then we can find that p
  Roughly: the LSB of r_i is a “hard core bit”
  ⇒ Scheme is secure if approx-GCD is hard
 Is approx-GCD really a hard problem?
Hard-core-bit theorem
A. The approximate-GCD problem:
 Input: w_i = q_i·p + r_i (i=0,…,t)
  p ∈_$ [0,P], q_i ∈_$ [0,Q], r_i ∈_$ [0,R’] (with R’ ≪ P ≪ Q)
 Task: find p
B. The cryptosystem
 Input: x_i = q_i·p + 2r_i (i=0,…,t), c = qp + 2r + m
  p ∈_$ [0,P], q_i ∈_$ [0,Q], r_i ∈_$ [0,R] (with R ≪ P ≪ Q)
 Task: distinguish m=0 from m=1
 Thm: Solution to B ⇒ solution to A
  small caveat: R’ smaller than R
Proof outline
 Input: w_i = q_i·p + r_i (i=1,…,t)
 Use the w_i’s to form a public key
  This is where we need R > R’
 Amplify the distinguishing advantage
  From any noticeable ε to almost 1
 Use reliable distinguisher to learn q_t
  Using the binary GCD procedure
 Finally p = round(w_t/q_t)
Use the w_i’s to form a public key
 We have w_i = q_i·p + r_i, need x_i = q_i’·p + 2r_i’
  Setting x_i = 2w_i yields wrong distribution
 Reorder w_i’s so w_0 is the largest one
  Check that w_0 is odd, else abort
  Also hope that q_0 is odd (else may fail to find p)
  w_0 odd, q_0 odd ⇒ r_0 is even
 x_0 = w_0 + 2ρ_0, x_i = (2w_i + 2ρ_i) mod w_0 for i>0
  The ρ_i’s are random < R
 Correctness:
1. r_i + ρ_i distributed almost identically to ρ_i
  Since R > R’ by a super-polynomial factor
2. 2q_i mod q_0 is random in [q_0]
Amplify the distinguishing advantage
 Given an integer z = qp + r, with r < R’:
Set c = [z + m + 2ρ + subset-sum(x_i’s)] mod x_0
  For random ρ < R, random bit m
 c is a random ciphertext wrt the x_i’s
  ρ > r_i’s, so ρ + r_i’s distributed like ρ
  (subset-sum(q_i’s) mod q_0) random in [q_0]
 c mod p mod 2 = r + m mod 2
  A guess for c mod p mod 2 ⇒ vote for r mod 2
 Choose many random c’s, take majority
  Noticeable advantage ⇒ Reliable r mod 2
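The voting step can be simulated in isolation. The sketch below abstracts away the ciphertext construction entirely: the weak distinguisher is modeled as a hypothetical function that guesses the parity r + m mod 2 correctly with probability 1/2 + ε, which is an assumption of this illustration and not part of the reduction itself.

```python
import random


def weak_parity_guess(true_parity: int, eps: float) -> int:
    """Hypothetical weak distinguisher: correct with probability 1/2 + eps."""
    return true_parity if random.random() < 0.5 + eps else 1 - true_parity


def reliable_parity(r: int, eps: float = 0.1, votes: int = 2001) -> int:
    """Majority vote for r mod 2 over many independently masked queries."""
    ones = 0
    for _ in range(votes):
        m = random.getrandbits(1)                      # fresh random mask bit
        guess = weak_parity_guess((r + m) % 2, eps)    # guess for (c mod p) mod 2
        ones += guess ^ m                              # unmask: a vote for r mod 2
    return 1 if 2 * ones > votes else 0
```

With ε = 0.1 and 2001 votes the expected margin is about nine standard deviations, so the majority is wrong only with negligible probability; a smaller ε needs on the order of 1/ε² votes.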
Use reliable distinguisher to learn q_t
 From z = qp + r, can get r mod 2
  Note: z = q + r mod 2 (since p is odd)
  So (q mod 2) = (r mod 2) XOR (z mod 2)
 Given z_1, z_2, both near multiples of p, run Binary-GCD:
  Get b_i := q_i mod 2, if z_1 < z_2 swap them
  If b_1 = b_2 = 1, set z_1 := z_1 − z_2, b_1 := b_1 − b_2
  At least one of the b_i’s must be zero now
  For any b_i = 0 set z_i := floor(z_i/2)
  new-q_i = old-q_i/2
  Repeat until one z_i is zero, output the other
 z = (2s)p + r ⇒ z/2 = sp + r/2
  ⇒ floor(z/2) = sp + floor(r/2)
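The Binary-GCD steps above can be sketched as follows. The reliable distinguisher is replaced by a stand-in oracle that computes q mod 2 using the secret p directly; this is only a simulation device, and the constant P and the function names are assumptions of the sketch.

```python
import math
import random

P = (1 << 63) | 0x1234567    # hypothetical secret odd p, used only by the oracle


def parity_oracle(z: int) -> int:
    """Stand-in for the reliable distinguisher: q mod 2 for z = q*P + r."""
    return ((z + P // 2) // P) % 2


def binary_gcd(z1: int, z2: int) -> int:
    """Binary GCD on near-multiples of P, driven only by the parity oracle."""
    while z1 != 0 and z2 != 0:
        if z1 < z2:
            z1, z2 = z2, z1
        b1, b2 = parity_oracle(z1), parity_oracle(z2)
        if b1 == 1 and b2 == 1:
            z1, b1 = z1 - z2, 0          # odd - odd multiple = even multiple
        if b1 == 0:
            z1 //= 2                     # halves the multiple, floors the noise
        if b2 == 0:
            z2 //= 2
    return z1 or z2


def odd_part(g: int) -> int:
    while g % 2 == 0:
        g //= 2
    return g


def nearest_multiple(z: int) -> int:
    """The q with z = q*P + (small noise); for checking only."""
    return (z + P // 2) // P
```

The output is a near-multiple of P whose multiplier is the odd part of GCD(q_1, q_2). Subtraction and halving never increase the noise magnitude, so the oracle keeps seeing valid near-multiples throughout.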
Use reliable distinguisher to learn q_t
 z_i = q_i·p + r_i, i=1,2, z’ := Binary-GCD(z_1, z_2)
  Then z’ = GCD*(q_1,q_2)·p + r’
  GCD* is the odd part of the GCD
  For random q_1, q_2, Pr[GCD(q_1,q_2)=1] ~ 0.6
 Try (say) z’ := Binary-GCD(w_t, w_{t-1})
  Hope that z’ = 1·p + r
  Else try again with Binary-GCD(z’, w_{t-2}), etc.
 Run Binary-GCD(w_t, z’)
  The b_2 bits spell out the bits of q_t
 Once you learn q_t then
  round(w_t/q_t) = p + round(r_t/q_t) = p
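A simplified sketch of the last step, assuming z’ = 1·p + r’ is already in hand. Rather than tracking both arguments of Binary-GCD, it peels w_t directly: the parity bits recorded along the way are the bits of q_t, least significant first. The oracle again uses the secret p only as a simulation stand-in, and all names and sizes are assumptions.

```python
import random

P = (1 << 63) | 0x1234567    # hypothetical secret odd p, used only by the oracle


def parity_oracle(z: int) -> int:
    """Stand-in for the reliable distinguisher: q mod 2 for z = q*P + r."""
    return ((z + P // 2) // P) % 2


def learn_q(w: int, z_prime: int) -> int:
    """Recover q from w = q*P + r, given z' = 1*P + r'."""
    q, shift = 0, 0
    while 2 * w > z_prime:       # stop once the multiple of P has reached zero
        b = parity_oracle(w)
        q |= b << shift
        shift += 1
        if b:
            w -= z_prime         # make the multiple of P even
        w //= 2                  # q -> floor(q/2), noise roughly halves
    return q


def recover_p(w: int, q: int) -> int:
    """round(w/q) = p + round(r/q) = p when r is much smaller than q."""
    return (w + q // 2) // q
```

Usage: with `w_t = q_t * P + r_t` and `z_prime = P + r_prime`, `recover_p(w_t, learn_q(w_t, z_prime))` returns P without the caller ever reading P directly.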
Hardness of Approximate-GCD
 Several lattice-based approaches for solving approximate-GCD
  Related to Simultaneous Diophantine Approximation (SDA)
 Studied in [Howgrave-Graham01]
  We considered some extensions of his attacks
 All run out of steam when |q_i| > |p|^2
  In our case |p| ~ n^2, |q_i| ~ n^5 ≫ |p|^2
Relation to SDA
 x_i = q_i·p + r_i (r_i ≪ p ≪ q_i), i = 0,1,2,…
  y_i = x_i/x_0 = (q_i·p + r_i)/(q_0·p + r_0)
      = (q_i + (r_i/p))/(q_0 + (r_0/p))
      = (q_i + s_i)/q_0, with s_i ~ r_i/p ≪ 1
 y_1, y_2, … is an instance of SDA
  q_0 is a denominator that approximates all y_i’s
 Use Lagarias’s algorithm to try and solve this SDA instance
  Find q_0, then p = round(x_0/q_0)
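A numeric check of the approximation, with exact rationals. The sizes (32-bit p, 64-bit q_i, noise below 2^8) are toy assumptions; the point is only that q_0·y_i lands within about r/p of the integer q_i.

```python
from fractions import Fraction
import random

p = random.getrandbits(32) | (1 << 31) | 1                   # hypothetical odd p
R = 1 << 8                                                   # noise bound
qs = [random.getrandbits(64) | (1 << 63) for _ in range(6)]  # q_0..q_5, 64 bits each
rs = [random.randrange(R) for _ in qs]
xs = [q * p + r for q, r in zip(qs, rs)]

# y_i = x_i / x_0, and the distance of q_0 * y_i from the integer q_i.
ys = [Fraction(x, xs[0]) for x in xs[1:]]
errors = [abs(qs[0] * y - q) for y, q in zip(ys, qs[1:])]

# Once q_0 is known, p = round(x_0 / q_0).
p_from_q0 = (xs[0] + qs[0] // 2) // qs[0]
```

Every entry of `errors` is below 2R/p here, so q_0 is a common denominator that approximates all y_i simultaneously, which is exactly the SDA structure.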
Lagarias’s SDA algorithm
 Consider the rows of this matrix B:

        R    x_1   x_2   …   x_t
            -x_0
  B =             -x_0
                         …
                             -x_0

  They span dim-(t+1) lattice
 <q_0,q_1,…,q_t>·B is short
  1st entry: q_0·R < Q·R
  ith entry (i>1): q_0(q_i·p + r_i) − q_i(q_0·p + r_0) = q_0·r_i − q_i·r_0
  Less than Q·R in absolute value
  ⇒ Total size less than Q·R·√t
  vs. size ~Q·P (or more) for the basis vectors
 Hopefully we will find it with a lattice-reduction algorithm (LLL or variants)
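The short vector can be verified by building B explicitly and multiplying. This sketch only checks the algebra; it does not run any lattice reduction. Sizes and names are illustrative assumptions.

```python
import random

p = random.getrandbits(32) | (1 << 31) | 1     # hypothetical odd p
R = 1 << 8                                     # noise bound
T = 5
qs = [random.getrandbits(64) | (1 << 63) for _ in range(T + 1)]
rs = [random.randrange(R) for _ in qs]
xs = [q * p + r for q, r in zip(qs, rs)]


def lagarias_basis(xs, R):
    """Rows: (R, x_1, ..., x_t), then -x_0 on the diagonal below."""
    t = len(xs) - 1
    B = [[R] + list(xs[1:])]
    for i in range(1, t + 1):
        row = [0] * (t + 1)
        row[i] = -xs[0]
        B.append(row)
    return B


def vec_times_matrix(v, B):
    return [sum(v[i] * B[i][j] for i in range(len(v))) for j in range(len(B[0]))]


B = lagarias_basis(xs, R)
short = vec_times_matrix(qs, B)    # the vector <q_0, ..., q_t> * B
```

The first entry of `short` is q_0·R and the rest are q_0·r_i − q_i·r_0, all of size about Q·R, whereas the basis rows themselves contain entries of size about Q·P.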
Will this algorithm succeed?
 Is <q_0,q_1,…,q_t>·B shortest in lattice?
  Is it shorter than √t·det(B)^(1/(t+1))? (Minkowski bound)
  det(B) is small-ish (due to R in the corner)
  Need ((QP)^t·R)^(1/(t+1)) > QR
  ⇔ t+1 > (log Q + log P − log R) / (log P − log R) ~ log Q / log P
 log Q = ω(log^2 P) ⇒ need t = ω(log P)
 Quality of LLL & co. degrades with t
  Only finds vectors of size ~2^(t/2)·shortest
  or 2^(εt)·shortest for any constant ε>0
  t = ω(log P) ⇒ 2^(εt)·QR > det(B)^(1/(t+1))
  Contemporary lattice reduction is not strong enough
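The threshold on t can be evaluated for the slide's parameter shape, log R = n, log P = n^2, log Q = n^5. The sketch works in log scale with exact integers; the function names are hypothetical.

```python
def min_dimension(log_q: int, log_p: int, log_r: int) -> int:
    """Smallest t with t+1 > (log Q + log P - log R) / (log P - log R)."""
    # Equivalent to t * (log P - log R) > log Q.
    return log_q // (log_p - log_r) + 1


def minkowski_condition(t: int, log_q: int, log_p: int, log_r: int) -> bool:
    """((QP)^t * R)^(1/(t+1)) > QR, compared in log scale without division."""
    return t * (log_q + log_p) + log_r > (t + 1) * (log_q + log_r)


def min_dimension_for(n: int) -> int:
    """Threshold for log R = n, log P = n^2, log Q = n^5."""
    return min_dimension(n ** 5, n ** 2, n)
```

For n = 10 this gives t = 1112, roughly log Q / log P = 1000, which is already a lattice dimension where the 2^(εt) loss of reduction algorithms matters.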
Why this algorithm fails
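[Figure: size (log scale) plotted against t, with log Q marked on the size axis and log Q/log P marked on the t axis. Curves: the solution we are seeking; auxiliary solutions (Minkowski’s bound), converging to ~ log Q + log P; what LLL can find, min(blue, purple) + εt. The blue line remains above the purple line.]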
Conclusions
 Fully Homomorphic Encryption is a very
powerful tool
 Gentry09 gives first feasibility result
 Showing that it can be done “in principle”
 We describe a “conceptually simpler”
scheme, using only modular arithmetic
 What about efficiency?
 Computation, ciphertext-expansion are
polynomial, but a rather large one…
 Improving efficiency is an open problem
Extra credit
 The hard-core-bit theorem
 Connection between approximate-GCD
and simultaneous Diophantine approx.
 Gentry’s technique for “squashing” the
decryption circuit
Thank you
