Cryptography - Stefan Dziembowski


Nov 21, 2013 (3 years and 11 months ago)


Cryptography

Lecture 12


Stefan Dziembowski

www.dziembowski.net

stefan@dziembowski.net

Plan

1. Introduction to multiparty cryptographic protocols.

2. Private Information Retrieval

Traditional scenario

Alice and Bob are attacked by an adversary.

Multiparty protocols

A group of players wants to perform some task together, even though they do not trust each other.

This is a vast area

Examples:

voting

coin-tossing

auctions

...

Today’s lecture: Private Information Retrieval

AOL search data scandal (2006)

#4417749:


clothes for age 60


60 single men


best retirement city


jarrett arnold


jack t. arnold


jaylene and jarrett arnold


gwinnett county yellow pages


rescue of older dogs


movies for dogs


sinus infection



Thelma Arnold, 62-year-old widow, Lilburn, Georgia

Observation

The owners of databases know a lot about the users!

This poses a risk to users’ privacy.

E.g. consider a database with stock prices…


Can we do something about it?



Yes, we can:

trust them that they will protect our secrecy (problematic!),

or

use cryptography!

How can crypto help?

Note: this problem has nothing to do with secure communication!

[Figure: user U and database D]

Our settings

[Figure: user U and database D connected by a secure link]

A new primitive: Private Information Retrieval (PIR)

Plan

1. Definition of PIR

2. An ideal PIR doesn’t exist

3. Construction of a computational PIR

4. Open problems

Literature:

B. Chor, E. Kushilevitz, O. Goldreich and M. Sudan, Private Information Retrieval, Journal of the ACM, 1998

E. Kushilevitz and R. Ostrovsky, Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997

Question

How to protect privacy of queries?

user U: wants to retrieve some data from D

database D: shouldn’t learn what U retrieved

Let’s make things simple!

database B: $B_1, B_2, \ldots, B_w$, where each $B_i \in \{0,1\}$

index $i \in \{1,\ldots,w\}$

the user should learn $B_i$ (he may also learn other $B_i$’s)

Trivial solution

$B_1, B_2, \ldots, B_w$

The database simply sends everything to the user!

Non-triviality

The previous solution has a drawback: the communication complexity is huge!

Therefore we introduce the following requirement:

“Non-triviality”: the number of bits communicated between U and D has to be smaller than $w$.

Private Information Retrieval (PIR)

input of the database D: $B_1, B_2, \ldots, B_w$

input of the user U: index $i \in \{1,\ldots,w\}$

U and D are polynomial-time randomized interactive algorithms.

correctness: at the end the user learns $B_i$

secrecy (of the user): the database does not learn $i$ (this property needs to be defined more formally!)

non-triviality: the total communication is $< w$

Note: secrecy of the database is not required

How to define secrecy of the user [1/2]?

The user holds $i$, the database holds $B$.

Def. $T(i,B)$: transcript of the conversation.

The user sends a query $Q(i)$, the database sends a reply $A(Q(i),B)$.

multi-round case: it is impossible to distinguish between $T(i,B)$ and $T(j,B)$, even if the adversary is malicious

How to define secrecy of the user [2/2]?

Secrecy of the user: for every $i,j \in \{1,\ldots,w\}$

single-round case: it is impossible to distinguish between $Q(i)$ and $Q(j)$

What does it mean?

For now say: the distributions of $Q(i)$ and $Q(j)$ are the same.

PIR doesn’t exist [1/4]

We now show that correctness, non-triviality and secrecy cannot be satisfied simultaneously.

Def: A transcript $T$ is possible for $(i,B)$ if $P(T(i,B) = T) > 0$.

Take some $T'$, and look where it is possible:

[Figure: grid of indices $i$ versus databases $B$; the cells $(i,B)$ for which $T'$ is possible are marked.]

PIR doesn’t exist [2/4]

Observation: secrecy → if $T'$ is possible for some $B$ and $i$, then it is possible for $B$ and all the other $i$’s.

[Figure: grid of indices $i$ versus databases $B$; for every database $B$ where $T'$ is possible, it is marked at every index $i$.]

PIR doesn’t exist [3/4]

non-triviality → length(transcript) < length(database) → # transcripts < # databases → there has to exist $T'$ that is possible for two databases $B^0$ and $B^1$.

[Figure: grid of indices $i$ versus databases $B$; $T'$ is marked at every index for both $B^0$ and $B^1$.]
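The counting step can be made explicit. The following is only a sketch, assuming that transcripts are bit strings of length at most $w-1$; the summation index $\ell$ is introduced here for illustration.

```latex
\#\{\text{transcripts}\}
  \;\le\; \sum_{\ell=0}^{w-1} 2^{\ell}
  \;=\; 2^{w}-1
  \;<\; 2^{w}
  \;=\; \#\{\text{databases } B \in \{0,1\}^{w}\}
```

Every database has at least one possible transcript, so by the pigeonhole principle some $T'$ must be possible for two different databases.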

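PIR doesn’t exist [4/4]

$B^0$ and $B^1$ differ on at least one index $i'$.

So, if $i'$ is the input of the user, then the user can see the same transcript $T'$ for both databases, yet would have to output both $B^0_{i'}$ and $B^1_{i'}$: correctness → contradiction.

[Figure: grid of indices $i$ versus databases $B$; $T'$ is marked for $B^0$ and $B^1$ in the column of index $i'$.]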

So PIR doesn’t exist!


How to bypass the impossibility result?


Two ideas:



limit the computing power of a cheating database

use a larger number of “independent” databases

Computationally-secure PIR

computational secrecy: for every $i,j \in \{1,\ldots,w\}$ it is impossible to distinguish efficiently between $T(i,B)$ and $T(j,B)$.

Formally: for every polynomial-time probabilistic algorithm $A$ the value

$|P(A(T(i,B)) = 0) - P(A(T(j,B)) = 0)|$

should be negligible.

Hardness assumptions?

[KO97] construct PIR based on the Quadratic Residuosity Assumption.


We describe it on the next slides.

Favourite cryptographers’ group

$p, q$: large random primes of about 1024 bits each (i.e. $p, q \approx 2^{1024}$), say

RSA group: $Z_n^*$, where $n = pq$


Chinese remainder theorem

Chinese remainder theorem (CRT): For $n = pq$ (where $p$ and $q$ are distinct primes) the function $\lambda : Z_n^* \to Z_p^* \times Z_q^*$ defined as

$\lambda(i) := (i \bmod p, i \bmod q)$

is an isomorphism.



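Example

$\lambda(i) := (i \bmod p, i \bmod q)$ for $n = 15 = 3 \cdot 5$. Rows are indexed by $i \bmod 3$, columns by $i \bmod 5$, and each cell holds the corresponding $i \in Z_{15}$:

            0    1    2    3    4
    0       0    6   12    3    9
    1      10    1    7   13    4
    2       5   11    2    8   14

$Z_{15}^*$ consists of the cells with both coordinates nonzero, i.e. it corresponds to $Z_3^* \times Z_5^*$: $\{1, 2, 4, 7, 8, 11, 13, 14\}$.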
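The example can be checked mechanically. The sketch below is illustrative only (the helper names `units` and `crt_map` are made up here); it verifies for $n = 15$ that $\lambda$ is a bijection onto $Z_3^* \times Z_5^*$ and that it respects multiplication.

```python
from math import gcd

def units(m):
    """Return Z_m^* as a sorted list."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def crt_map(i, p, q):
    """The map lambda(i) = (i mod p, i mod q)."""
    return (i % p, i % q)

p, q = 3, 5
n = p * q
images = {i: crt_map(i, p, q) for i in units(n)}

# lambda hits every pair in Z_p^* x Z_q^* exactly once.
target = {(a, b) for a in units(p) for b in units(q)}
is_bijection = (set(images.values()) == target
                and len(set(images.values())) == len(images))

# Multiplying in Z_n^* corresponds to multiplying coordinate-wise.
is_homomorphism = all(
    crt_map(a * b % n, p, q) == (images[a][0] * images[b][0] % p,
                                 images[a][1] * images[b][1] % q)
    for a in units(n) for b in units(n)
)

print(images)
print(is_bijection, is_homomorphism)
```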

Def. $x$ is a quadratic residue modulo $m$ if there exists $a \in Z_m^*$ such that $x = a^2 \bmod m$.

$QR(m)$ := the set of all quadratic residues modulo $m$.

$QNR(m) := Z_m^* \setminus QR(m)$

Quadratic Residues

    a:            1   2   3   4   5   6   7   8   9  10  11  12
    a^2 mod 13:   1   4   9   3  12  10  10  12   3   9   4   1

$Z_{13}^* = \{1, 2, \ldots, 12\}$

$QR(13) = \{1, 3, 4, 9, 10, 12\}$

Observation: every quadratic residue modulo 13 has exactly 2 square roots, and hence $|QR(13)| = |Z_{13}^*| / 2$.
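A brute-force sketch of the same table: the hypothetical helper `square_roots` below squares every element of $Z_m^*$ and groups the elements by their square, so both the set $QR(m)$ and the number of square roots can be read off.

```python
from math import gcd
from collections import defaultdict

def square_roots(m):
    """Map each quadratic residue x modulo m to the list of its square roots in Z_m^*."""
    roots = defaultdict(list)
    for a in range(1, m):
        if gcd(a, m) == 1:
            roots[a * a % m].append(a)
    return dict(roots)

roots_13 = square_roots(13)
qr_13 = sorted(roots_13)

print(qr_13)      # the elements of QR(13)
print(roots_13)   # each residue together with its square roots
```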

A Lemma about QRs modulo prime p

Lemma: For every odd prime $p$ we have $|QR(p)| = (p-1)/2$.

Remark: Let $g$ be a generator of $Z_p^*$. Then

$QR(p) = \{g^0, g^2, g^4, g^6, \ldots, g^{p-3}\}$.

$QNR(p) = \{g^1, g^3, g^5, g^7, \ldots, g^{p-2}\}$.
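The remark can be illustrated for $p = 13$. The choice $g = 2$ is an assumption made for this sketch (the code checks that it really generates $Z_{13}^*$); the even powers of $g$ are then compared with the squares computed by brute force.

```python
p, g = 13, 2  # assumption for illustration: g = 2 as a generator of Z_13^*

powers = [pow(g, k, p) for k in range(p - 1)]
generates_group = sorted(powers) == list(range(1, p))

even_powers = sorted(powers[0::2])   # g^0, g^2, ..., g^(p-3)
odd_powers = sorted(powers[1::2])    # g^1, g^3, ..., g^(p-2)
squares = sorted({a * a % p for a in range(1, p)})

print(generates_group)
print(even_powers, odd_powers)
```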





QRs modulo pq

    a in Z_15^*:   1   2   4   7   8  11  13  14
    a^2 mod 15:    1   4   1   4   4   1   4   1

(The remaining elements 0, 3, 5, 6, 9, 10, 12 of $Z_{15}$ are not in $Z_{15}^*$.)

$QR(15) = \{1, 4\}$

Observation: every quadratic residue modulo 15 has exactly 4 square roots, and hence $|QR(15)| = |Z_{15}^*| / 4$.
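A Lemma about QRs modulo pq

Fact: For $n = pq$ (with $p$ and $q$ distinct odd primes) we have $|QR(n)| = |Z_n^*| / 4$.

Proof:

$x \in QR(n)$

iff $x = a^2 \bmod n$, for some $a$

iff (by CRT) $x = a^2 \bmod p$ and $x = a^2 \bmod q$

iff $x \bmod p \in QR(p)$ and $x \bmod q \in QR(q)$

Since $|QR(p)| = (p-1)/2$ and $|QR(q)| = (q-1)/2$, this gives $|QR(n)| = (p-1)(q-1)/4 = |Z_n^*| / 4$.

[Figure: $Z_n^*$ drawn as a grid with coordinates mod $p$ and mod $q$; $QR(n)$ is the part where the mod-$p$ coordinate lies in $QR(p)$ and the mod-$q$ coordinate lies in $QR(q)$.]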

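QRs modulo pq: an example

$Z_{15}$ arranged by $(i \bmod 3, i \bmod 5)$:

            0    1    2    3    4
    0       0    6   12    3    9
    1      10    1    7   13    4
    2       5   11    2    8   14

$QR(3) = \{1\}$, since $1^2 \bmod 3 = 2^2 \bmod 3 = 1$.

$QR(5) = \{1, 4\}$, since $1^2 \bmod 5 = 4^2 \bmod 5 = 1$ and $2^2 \bmod 5 = 3^2 \bmod 5 = 4$.

Hence $QR(15)$ consists of the cells in row 1 and columns 1 and 4: $QR(15) = \{1, 4\}$.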
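The characterisation "residue modulo $pq$ iff residue modulo $p$ and modulo $q$" can be compared against direct squaring. This is a small sketch with a hypothetical helper `qr`, run on $n = 15$.

```python
from math import gcd

def qr(m):
    """QR(m): the set of squares of elements of Z_m^*."""
    return {a * a % m for a in range(1, m) if gcd(a, m) == 1}

p, q = 3, 5
n = p * q

# Directly: square every unit modulo n.
qr_n_direct = qr(n)

# Via CRT: keep the units whose reductions are residues modulo p and modulo q.
qr_n_via_crt = {x for x in range(1, n)
                if gcd(x, n) == 1 and x % p in qr(p) and x % q in qr(q)}

print(sorted(qr_n_direct), sorted(qr_n_via_crt))
```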

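Homomorphism of QR(pq)

$Q(n,a) = 0$ if $a \in QR(n)$, and $Q(n,a) = 1$ otherwise.

Homomorphism: for all $a, b \in Z_n^*$ that are each either a residue modulo both $p$ and $q$ or a non-residue modulo both (this subset is the group $Z_n^+$ defined on the QRA slide):

$Q(n, ab) = Q(n,a) \oplus Q(n,b)$

(exercise)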

Algorithmic questions about QR

Suppose $n = pq$.

Is it easy to test membership in $QR(n)$?

Fact: if one knows $p$ and $q$: yes!

What if one doesn’t know $p$ and $q$?
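One way to see why knowing $p$ and $q$ helps: modulo an odd prime, Euler's criterion decides residuosity with a single modular exponentiation, and by CRT the two answers can be combined. The slides do not say which test is meant, so the use of Euler's criterion and the function names below are assumptions of this sketch.

```python
def is_qr_mod_prime(a, p):
    """Euler's criterion: for an odd prime p, a unit a is a square iff a^((p-1)/2) = 1 mod p."""
    return pow(a, (p - 1) // 2, p) == 1

def is_qr_mod_pq(a, p, q):
    """a is in QR(pq) iff it is a residue modulo p and modulo q (by CRT)."""
    return is_qr_mod_prime(a % p, p) and is_qr_mod_prime(a % q, q)

# Non-units reduce to 0 modulo p or q, so the power is 0 and they are rejected.
residues_15 = [a for a in range(1, 15) if is_qr_mod_pq(a, 3, 5)]
print(residues_15)
```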

Quadratic Residuosity Assumption (QRA)

$n = pq$, where $p$ and $q$ are large primes.

$Z_n^+$ := all $a \in Z_n^*$ such that $a \bmod p \in QR(p)$ iff $a \bmod q \in QR(q)$.

[Figure: $Z_n^*$ split into four parts by $QR(p)$ / $QNR(p)$ and $QR(q)$ / $QNR(q)$; $Z_n^+$ is the union of $QR(n)$ and the part with both coordinates non-residues. The question is in which of the two a given $a \in Z_n^+$ lies.]

Note: $Z_n^+$ is a group!

Quadratic Residuosity Assumption (QRA): For a random $a \in Z_n^+$ it is computationally hard to determine if $a \in QR(n)$.

Formally: for every polynomial-time probabilistic algorithm $G$ the value

$|P(G(n,a) = Q(n,a)) - 0.5|$

(where $a \in Z_n^+$ is random) is negligible.

We are ready to construct PIR!

Our PIR will work in the group $Z_n^+$, where $n = pq$.

What’s so good about this group?

testing membership in $Z_n^+$ is easy,

testing membership in $QR(n)$ is hard for random elements of $Z_n^+$, unless one knows $p$ and $q$,

homomorphism of $Q$!
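Why membership in $Z_n^+$ is easy: $Z_n^+$ is exactly the set of elements with Jacobi symbol $+1$, and the Jacobi symbol can be computed without the factorization of $n$. The slides do not name this method, so treat the sketch below (a standard binary Jacobi-symbol routine, with the name `jacobi` chosen here) as one possible illustration.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, computed without factoring n."""
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using the rule for (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swap, flipping the sign if both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

# Elements of Z_15^+ : residues modulo both 3 and 5, or non-residues modulo both.
z15_plus = [a for a in range(1, 15) if jacobi(a, 15) == 1]
print(z15_plus)
```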


First (wrong) idea

The user, holding index $i$, sends $X_1, \ldots, X_w$ to the database, where $X_i$ is an NQR and every other $X_j$ is a QR.

For every $j = 1, \ldots, w$ the database sets

$Y_j = X_j^2$ if $B_j = 0$, and $Y_j = X_j$ otherwise.

Then $Y_j$ is a QR for every $j \ne i$, and $Y_i$ is a QR iff $B_i = 0$.

The database sets $M = Y_1 \cdot Y_2 \cdot \ldots \cdot Y_w$ and sends $M$ to the user.

$M$ is a QR iff $B_i = 0$.

The user checks if $M$ is a QR.
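The protocol above can be sketched end to end. Everything concrete here is an assumption for illustration: the toy primes 499 and 547 (far too small for any real security), the 0-based indices, the function names, and the use of Euler's criterion for the user's residuosity test.

```python
import random
from math import gcd

P, Q = 499, 547   # toy primes known only to the user (illustrative assumption)
N = P * Q         # the modulus is public

def is_qr(a):
    """User-side test using the factorization (Euler's criterion modulo P and Q)."""
    return pow(a, (P - 1) // 2, P) == 1 and pow(a, (Q - 1) // 2, Q) == 1

def random_unit():
    while True:
        r = random.randrange(1, N)
        if gcd(r, N) == 1:
            return r

def random_qr():
    return pow(random_unit(), 2, N)

def random_nqr():
    """Random element of Z_N^+ that is a non-residue modulo both P and Q."""
    while True:
        a = random_unit()
        if pow(a, (P - 1) // 2, P) != 1 and pow(a, (Q - 1) // 2, Q) != 1:
            return a

def user_query(i, w):
    """X_i is an NQR, every other X_j is a QR."""
    return [random_nqr() if j == i else random_qr() for j in range(w)]

def database_reply(bits, xs):
    """Y_j = X_j^2 if B_j = 0, X_j otherwise; the reply is the product M."""
    m = 1
    for b, x in zip(bits, xs):
        y = pow(x, 2, N) if b == 0 else x
        m = m * y % N
    return m

def user_decode(m):
    """M is a QR iff B_i = 0."""
    return 0 if is_qr(m) else 1

bits = [1, 0, 0, 1, 1, 0, 1, 0]
i = 3
print(user_decode(database_reply(bits, user_query(i, len(bits)))) == bits[i])
```

Note that the query consists of one group element per database bit, which is exactly the communication problem discussed next.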

Problems!

PIR from the previous slide:

correctness

security? To learn $i$ the database would need to distinguish NQR from QR. √

non-triviality? doesn’t hold!

communication:

user → database: $|B| \cdot |Z_n^*|$

database → user: $|Z_n^*|$

Call it: $(|B|, 1)$-PIR


How to fix it?

Idea
:

Given:


construct




Suppose that $|B| = v^2$ and present $B$ as a $v \times v$ matrix:

    B13  B14  B15  B16
    B9   B10  B11  B12
    B5   B6   B7   B8
    B1   B2   B3   B4

Consider each row as a separate database.

An improved idea

    B13  B14  B15  B16
    B9   B10  B11  B12
    B5   B6   B7   B8
    B1   B2   B3   B4

(a $v \times v$ matrix)

Execute $v$ $(v,1)$-PIRs in parallel.

Looks even worse:

communication:

user → database: $v^2 \cdot |Z_n^*|$

database → user: $v \cdot |Z_n^*|$

The method:

Let $j$ be the column where $B_i$ is.

In every “row” the user asks for the $j$th element.

So, instead of sending $v$ queries the user can send one!

Observe: in this way the user learns all the elements in the $j$th column!

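Putting things together

The user, holding index $i$, lets $j$ be the column where $B_i$ is and sends $X_1, \ldots, X_v$ to the database, where $X_j$ is an NQR and every other $X_l$ is a QR.

The database uses the same query $(X_1, \ldots, X_v)$ for each of the $v$ rows (here the same row is copied $v$ times). In the $k$th row, with entries $B_{k,1}, \ldots, B_{k,v}$, for every $l = 1, \ldots, v$ it sets

$Y_{k,l} = X_l^2$ if $B_{k,l} = 0$, and $Y_{k,l} = X_l$ otherwise,

and multiplies the elements in each row: $M_k = Y_{k,1} \cdot \ldots \cdot Y_{k,v}$.

The database sends $M_1, \ldots, M_v$ to the user.

$B_{k,j} = 0$ iff $M_k$ is a QR, so the user can read off the whole $j$th column. Only the $M_k$ of the row containing $B_i$ counts.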
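A compact sketch of the matrix version. As before, the toy primes, the 0-based row and column indices, and all function names are assumptions made for illustration; this is not the construction's reference implementation.

```python
import random
from math import gcd

P, Q = 499, 547   # toy primes known only to the user (illustrative assumption)
N = P * Q

def is_qr(a):
    """User-side residuosity test via Euler's criterion modulo P and Q."""
    return pow(a, (P - 1) // 2, P) == 1 and pow(a, (Q - 1) // 2, Q) == 1

def random_unit():
    while True:
        r = random.randrange(1, N)
        if gcd(r, N) == 1:
            return r

def random_nqr():
    """Random non-residue modulo both P and Q (an element of Z_N^+ outside QR(N))."""
    while True:
        a = random_unit()
        if pow(a, (P - 1) // 2, P) != 1 and pow(a, (Q - 1) // 2, Q) != 1:
            return a

def user_query(col, v):
    """One query of v elements: an NQR in the wanted column, QRs elsewhere."""
    return [random_nqr() if l == col else pow(random_unit(), 2, N) for l in range(v)]

def database_reply(matrix, xs):
    """Apply the same query to every row and return one product M_k per row."""
    reply = []
    for row in matrix:
        m = 1
        for b, x in zip(row, xs):
            m = m * (pow(x, 2, N) if b == 0 else x) % N
        reply.append(m)
    return reply

def user_decode(reply, row):
    """Only the product of the wanted row counts: it is a QR iff the bit is 0."""
    return 0 if is_qr(reply[row]) else 1

matrix = [[1, 0, 0, 1],
          [0, 0, 1, 1],
          [1, 1, 0, 0],
          [0, 1, 0, 1]]
row, col = 2, 1
reply = database_reply(matrix, user_query(col, 4))
print(user_decode(reply, row) == matrix[row][col])
```

For a database of $v^2$ bits the user sends $v$ group elements and receives $v$ group elements.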

So we are done!

PIR from the previous slide:

correctness

non-triviality: communication complexity = $2\sqrt{|B|} \cdot |Z_n^*|$

security? To learn $i$ the database would need to distinguish NQR from QR.

Formally: from any adversary that breaks our scheme we can construct an algorithm that breaks QRA.

[Figure: the algorithm that breaks QRA simulates the scheme for the adversary.]





Improvements

user U → database D: $(X_1, \ldots, X_v)$

database D → user U: $(M_1, \ldots, M_v)$

The user is interested just in one $M_i$.

Idea: apply PIR recursively!

Complexity of PIRs: overview of the results

Communication:

“recursive” PIR of [KO97]: for every $c > 0$: $O(|B|^c)$

[Cachin, Micali, Stadler, 1999]: poly-logarithmic in $|B|$

[Lipmaa, 2005]: $O(\log^2 |B|)$

For practical analysis see: [Sion, Carbunar] On the Computational Practicality of Private Information Retrieval.

their conclusion: It is the time-complexity that matters. In real life it is still more practical to transmit the entire database.



Extensions

Symmetric PIR (also protect privacy of the database). [Gertner, Ishai, Kushilevitz, Malkin, 1998]

Searching by key-words [Chor, Gilboa, Naor, 1997]

Public-key encryption with key-word search [Boneh, Di Crescenzo, Ostrovsky, Persiano]

Open problems:

Improve efficiency.

Construct new extensions.

What was the key property that we used?

homomorphism of QR

Holy grail: fully-homomorphic encryption


Fully-homomorphic encryption

Observe that we constructed a 1-bit probabilistic public-key encryption scheme:

$Enc(X)$ = random QR if $X = 0$, random NQR if $X = 1$

which is homomorphic with respect to xor:

$Enc(X \oplus Y) = Enc(X) \cdot Enc(Y)$


It would be really useful to have an encryption scheme homomorphic with respect to conjunction and negation simultaneously.