# Cryptography - Stefan Dziembowski

Nov 21, 2013

Cryptography

Lecture 12

Stefan Dziembowski

www.dziembowski.net

stefan@dziembowski.net

Plan

1. Introduction to multiparty cryptographic protocols.

2. Private Information Retrieval

Alice and Bob are attacked by an adversary.

Multiparty protocols

A group of players wants to perform some task together

even though they do not trust each other

This is a vast area

Examples:

voting

coin-tossing

auctions

....

Today’s lecture: Private Information Retrieval

AOL search data scandal (2006)

#4417749:

clothes for age 60

60 single men

best retirement city

jarrett arnold

jack t. arnold

jaylene and jarrett arnold

gwinnett county yellow pages

rescue of older dogs

movies for dogs

sinus infection

Thelma Arnold

62-year-old widow

Lilburn, Georgia

Observation

The owners of databases know a lot about the users!

This poses a risk to users’ privacy.

E.g. consider database with stock prices…

Can we do something about it?

Yes, we can:

trust them that they will protect our secrecy,

or

use cryptography!

problematic!

How can crypto help?

Note: this problem has nothing to do with secure communication!

Our setting: user U, database D

A new primitive:

Private Information Retrieval (PIR)

Plan

1. Definition of PIR

2. An ideal PIR doesn’t exist

3. Construction of a computational PIR

4. Open problems

Literature:

B. Chor, E. Kushilevitz, O. Goldreich and M. Sudan, Private Information Retrieval, Journal of the ACM, 1998

E. Kushilevitz and R. Ostrovsky, Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997

Question

How to protect privacy of queries?

user U: wants to retrieve some data from D

database D: shouldn’t learn what U retrieved

Let’s make things simple!

database $B$: $B_1, B_2, \ldots, B_w$, where each $B_i \in \{0,1\}$

index $i \in \{1,\ldots,w\}$

the user should learn $B_i$ (he may also learn other $B_i$’s)

Trivial solution

$B_1, B_2, \ldots, B_w$

The database simply sends everything to the user!

Non-triviality

The previous solution has a drawback:

the communication complexity is huge!

Therefore we introduce the following requirement:

“Non-triviality”: the number of bits communicated between U and D has to be smaller than $w$.

Private Information Retrieval (PIR)

database D, input: $B = (B_1, B_2, \ldots, B_w)$

user U, input: index $i \in \{1,\ldots,w\}$

U and D are polynomial-time randomized interactive algorithms.

correctness: at the end the user learns $B_i$

secrecy (of the user): the database does not learn $i$ (this property needs to be defined more formally!)

non-triviality: the total communication is $< w$

Note: secrecy of the database is not required

How to define secrecy of the user [1/2]?

The user holds $i$, the database holds $B$.

The user sends a query $Q(i)$ and the database replies with $A(Q(i),B)$.

Def. $T(i,B)$: transcript of the conversation.

multi-round case: it is impossible to distinguish between $T(i,B)$ and $T(j,B)$, even if the adversary is malicious

How to define secrecy of the user [2/2]?

Secrecy of the user: for every $i,j \in \{1,\ldots,w\}$

single-round case: it is impossible to distinguish between $Q(i)$ and $Q(j)$

What does it mean?

For now say: the distributions of $Q(i)$ and $Q(j)$ are the same

PIR doesn’t exist [1/4]

We now show that correctness, non-triviality and secrecy cannot be satisfied simultaneously.

Def: A transcript $T$ is possible for $(i,B)$ if $P(T(i,B) = T) > 0$.

Take some $T'$, and look where it is possible:

[Figure: grid of indices $i$ against databases $B$, with the cells where $T'$ is possible marked.]

PIR doesn’t exist [2/4]

Observation:

secrecy ⇒ if $T'$ is possible for some $B$ and $i$, then it is possible for $B$ and all the other $i$’s

[Figure: grid of indices $i$ against databases $B$, with the cells where $T'$ is possible marked.]

PIR doesn’t exist [3/4]

non-triviality ⇒ length(transcript) < length(database) ⇒ # transcripts < # databases ⇒ there has to exist $T'$ that is possible for two databases $B^0$ and $B^1$

[Figure: grid of indices $i$ against databases $B$, with the rows of $B^0$ and $B^1$ both marked with $T'$.]

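PIR doesn’t exist [4/4]

$B^0$ and $B^1$ differ on at least one index $i'$.

So, if $i'$ is the input of the user, then correctness fails: $T'$ is possible for both $(i',B^0)$ and $(i',B^1)$, but the user should learn different bits in the two cases.

[Figure: grid of indices $i$ against databases $B$, with the column $i'$ and the rows of $B^0$ and $B^1$ highlighted.]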

So PIR doesn’t exist!
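The counting step behind "# transcripts < # databases" can be written out as follows. This is a sketch under the assumption that a transcript is a bit string of length at most $w-1$ and that databases range over $\{0,1\}^w$; the symbol $\ell$ (transcript length) is introduced here for illustration.

```latex
\#\{\text{transcripts}\} \;\le\; \sum_{\ell=0}^{w-1} 2^{\ell} \;=\; 2^{w} - 1 \;<\; 2^{w} \;=\; \#\{\text{databases}\}
```

Since every database has at least one possible transcript, the pigeonhole principle gives two different databases that share a possible transcript $T'$.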

How to bypass the impossibility result?

Two ideas:

limit the computing power of a cheating database

use a larger number of “independent” databases

Computationally-secure PIR

computational-secrecy: for every $i,j \in \{1,\ldots,w\}$ it is impossible to distinguish efficiently between $T(i,B)$ and $T(j,B)$

Formally: for every polynomial-time probabilistic algorithm $A$ the value

$$\left| P(A(T(i,B)) = 0) - P(A(T(j,B)) = 0) \right|$$

should be negligible.
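The slides use "negligible" without restating it. The usual definition is given below as a reminder; it is an assumption that this is the notion intended here, and the security parameter $k$ (for example the bit length of the modulus) is a name introduced for illustration.

```latex
\varepsilon : \mathbb{N} \to \mathbb{R}_{\ge 0} \text{ is negligible if }
\forall c > 0 \;\; \exists k_0 \;\; \forall k > k_0 : \quad \varepsilon(k) < k^{-c}
```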

Hardness assumptions?

[KO97] construct PIR based on the Quadratic Residuosity Assumption (QRA).

We describe it on the next slides.

Favourite cryptographers’ group

$p, q$: large random primes (about 1024 bits each, i.e. $p, q \approx 2^{1024}$, say)

RSA group: $Z_n^*$, where $n = pq$

Chinese remainder theorem

Chinese remainder theorem (CRT):

For $n = pq$ (where $p$ and $q$ are distinct primes) the function $\lambda : Z_n^* \to Z_p^* \times Z_q^*$ defined as

$$\lambda(i) := (i \bmod p,\; i \bmod q)$$

is an isomorphism.

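Example: $n = 15$, $p = 3$, $q = 5$

[Figure: the elements $0,\ldots,14$ of $Z_{15}$ arranged in a grid with rows $i \bmod 3 \in \{0,1,2\}$ and columns $i \bmod 5 \in \{0,\ldots,4\}$; $Z_{15}^*$ is highlighted and corresponds to $Z_3^* \times Z_5^*$ under $\lambda(i) := (i \bmod p,\; i \bmod q)$.]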
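The CRT example for $n = 15$ can be checked mechanically. The following is a minimal sketch, not part of the original slides; the helper names `crt_map`, `units` and `grid` are chosen here for illustration. It verifies that $\lambda$ is a bijection from $Z_{15}^*$ onto $Z_3^* \times Z_5^*$ that respects multiplication, and prints the grid of the example.

```python
from math import gcd

def crt_map(i, p, q):
    """lambda(i) := (i mod p, i mod q)."""
    return (i % p, i % q)

def units(m):
    """Elements of Z_m^*: residues in 1..m-1 coprime to m."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

p, q = 3, 5
n = p * q

# Image of Z_n^* under lambda, and the full product Z_p^* x Z_q^*.
image = {crt_map(i, p, q) for i in units(n)}
product = {(a, b) for a in units(p) for b in units(q)}

# Bijection: the images are pairwise distinct and fill the whole product.
assert image == product and len(image) == len(units(n))

# Homomorphism: lambda(a*b) equals the componentwise product.
for a in units(n):
    for b in units(n):
        la, lb = crt_map(a, p, q), crt_map(b, p, q)
        assert crt_map(a * b % n, p, q) == (la[0] * lb[0] % p, la[1] * lb[1] % q)

# Grid of the example: row r, column c holds the unique i in Z_15
# with i mod 3 == r and i mod 5 == c.
grid = [[next(i for i in range(n) if crt_map(i, p, q) == (r, c))
         for c in range(q)]
        for r in range(p)]
for row in grid:
    print(row)
```

Row 0 and column 0 of the grid hold exactly the elements that are not in $Z_{15}^*$.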

Def. $x$ is a quadratic residue modulo $m$ if there exists $a \in Z_m^*$ such that $x = a^2 \bmod m$.

$QR(m)$ := the set of all quadratic residues modulo $m$.

$QNR(m) := Z_m^* \setminus QR(m)$

$a \in Z_{13}^*$: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

$a^2 \bmod 13$: 1, 4, 9, 3, 12, 10, 10, 12, 3, 9, 4, 1

$QR(13) = \{1, 3, 4, 9, 10, 12\}$

Observation: every element of $QR(13)$ has exactly 2 square roots, and hence $|QR(13)| = |Z_{13}^*| / 2$.

QRs modulo a prime $p$

Lemma: For every odd prime $p$ we have $|QR(p)| = (p-1)/2$.

Remark: Let $g$ be a generator of $Z_p^*$. Then

$QR(p) = \{g^0, g^2, g^4, g^6, \ldots, g^{p-3}\}$,

$QNR(p) = \{g^1, g^3, g^5, g^7, \ldots, g^{p-2}\}$.
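The lemma and the remark can be checked by brute force for $p = 13$. This is a small sketch for illustration; the helper names and the choice of generator $g = 2$ are assumptions made here (the code itself checks that 2 generates $Z_{13}^*$).

```python
from math import gcd

def units(m):
    """Elements of Z_m^*."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def quadratic_residues(m):
    """QR(m): the squares of the elements of Z_m^*."""
    return sorted({a * a % m for a in units(m)})

p = 13
qr = quadratic_residues(p)
print(qr)

# Lemma: |QR(p)| = (p - 1) / 2 for an odd prime p.
assert len(qr) == (p - 1) // 2

# Remark: QR(p) is the set of even powers of a generator g.
g = 2
# g really is a generator: its powers cover all of Z_p^*.
assert sorted(pow(g, e, p) for e in range(p - 1)) == units(p)

even_powers = sorted(pow(g, e, p) for e in range(0, p - 1, 2))  # g^0 .. g^(p-3)
odd_powers = sorted(pow(g, e, p) for e in range(1, p - 1, 2))   # g^1 .. g^(p-2)
assert even_powers == qr
assert odd_powers == [a for a in units(p) if a not in qr]
```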

QRs modulo $pq$

$a \in Z_{15}^*$: 1, 2, 4, 7, 8, 11, 13, 14

$a^2 \bmod 15$: 1, 4, 1, 4, 4, 1, 4, 1

$QR(15) = \{1, 4\}$

Observation: every element of $QR(15)$ has exactly 4 square roots, and hence $|QR(15)| = |Z_{15}^*| / 4$.
Fact: For $n = pq$ (with $p$ and $q$ distinct odd primes) we have $|QR(n)| = |Z_n^*| / 4$.

Proof:

$x \in QR(n)$

iff $x = a^2 \bmod n$, for some $a$

iff (by CRT) $x = a^2 \bmod p$ and $x = a^2 \bmod q$

iff $x \bmod p \in QR(p)$ and $x \bmod q \in QR(q)$
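The fact and its CRT characterisation can be confirmed by exhaustive search for a few small moduli. This is an illustrative sketch; the function names and the list of test primes are chosen here, not taken from the slides.

```python
from math import gcd

def units(m):
    """Elements of Z_m^*."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def quadratic_residues(m):
    """QR(m): the squares of the elements of Z_m^*."""
    return sorted({a * a % m for a in units(m)})

def qr_via_crt(p, q):
    """Elements x of Z_n^* with x mod p in QR(p) and x mod q in QR(q)."""
    n = p * q
    qr_p, qr_q = set(quadratic_residues(p)), set(quadratic_residues(q))
    return sorted(x for x in units(n) if x % p in qr_p and x % q in qr_q)

for p, q in [(3, 5), (3, 7), (5, 7), (7, 11)]:
    n = p * q
    qr_n = quadratic_residues(n)
    # Last line of the proof: membership can be decided modulo p and q.
    assert qr_n == qr_via_crt(p, q)
    # The fact itself: |QR(n)| = |Z_n^*| / 4.
    assert 4 * len(qr_n) == len(units(n))
    # Every quadratic residue has exactly 4 square roots in Z_n^*.
    for x in qr_n:
        assert sum(1 for a in units(n) if a * a % n == x) == 4

print(quadratic_residues(15))
```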

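[Figure: $Z_n^*$ drawn as a grid with one axis mod $p$ and the other mod $q$; $QR(n)$ is the region where the $QR(p)$ part and the $QR(q)$ part intersect.]

QRs modulo $pq$: an example

[Figure: $Z_{15}^*$ drawn as the grid $Z_3^* \times Z_5^*$ inside $Z_{15}$. $QR(3) = \{1\}$, since $1^2 = 2^2 = 1 \bmod 3$. $QR(5) = \{1, 4\}$, since $1^2 = 4^2 = 1 \bmod 5$ and $2^2 = 3^2 = 4 \bmod 5$. Hence $QR(15) = \{1, 4\}$.]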

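Homomorphism of $QR(pq)$

$Q(n,a) = 0$ if $a \in QR(n)$, and $1$ otherwise.

Homomorphism: for all $a,b \in Z_n^+$, i.e. for elements of $Z_n^*$ that are residues modulo both $p$ and $q$ or non-residues modulo both (this is the set used in the QRA):

$$Q(n, ab) = Q(n,a) \text{ xor } Q(n,b)$$

(exercise)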

Suppose $n = pq$. Is it easy to test membership in $QR(n)$?

Fact: if one knows $p$ and $q$, yes!

What if one doesn’t know $p$ and $q$?
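One standard way to test membership once $p$ and $q$ are known is Euler's criterion, applied separately modulo each prime. The slides do not say which method they have in mind, so the sketch below is an assumption; the function names are chosen here for illustration.

```python
from math import gcd

def is_qr_mod_prime(a, p):
    """Euler's criterion: for an odd prime p and a coprime to p,
    a is a square modulo p iff a^((p-1)/2) = 1 mod p."""
    return pow(a, (p - 1) // 2, p) == 1

def is_qr(a, p, q):
    """Membership in QR(pq) for a in Z_n^*, given the factors p and q.
    By the CRT, a is a square mod pq iff it is a square mod p and mod q."""
    return is_qr_mod_prime(a % p, p) and is_qr_mod_prime(a % q, q)

# Compare against brute-force squaring for two toy moduli.
for p, q in [(3, 5), (7, 11)]:
    n = p * q
    squares = {a * a % n for a in range(1, n) if gcd(a, n) == 1}
    for a in range(1, n):
        if gcd(a, n) == 1:
            assert is_qr(a, p, q) == (a in squares)
```

The test costs two modular exponentiations, so it stays efficient for primes of cryptographic size.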

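Quadratic Residuosity Assumption (QRA)

$n = pq$, where $p$ and $q$ are large primes

$Z_n^+$: all $a \in Z_n^*$ such that $a \bmod p \in QR(p)$ iff $a \bmod q \in QR(q)$

QRA: For a random $a \in Z_n^+$ it is computationally hard to determine if $a \in QR(n)$.

Formally: for every polynomial-time probabilistic algorithm $G$ the value

$$\left| P(G(a) = Q(n,a)) - 0.5 \right|$$

(where $a \in Z_n^+$ is random) is negligible.

Note: $Z_n^+$ is a group!

[Figure: $Z_n^*$ split into four regions by $QR(p)$/$QNR(p)$ and $QR(q)$/$QNR(q)$; $Z_n^+$ consists of the region $QR(p) \times QR(q) = QR(n)$ and the region $QNR(p) \times QNR(q)$, and the question is in which of the two a given $a \in Z_n^+$ lies.]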

We are ready to construct PIR!

Our PIR will work in the group $Z_n^+$, where $n = pq$.

Key properties:

testing membership in $Z_n^+$ is easy,

testing membership in $QR(n)$ is hard for random elements of $Z_n^+$, unless one knows $p$ and $q$,

homomorphism of $Q$!
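The slides do not spell out why membership in $Z_n^+$ is easy. A standard reason, given here as an assumption, is that $a \in Z_n^+$ exactly when the Jacobi symbol of $a$ modulo $n$ equals $+1$, and the Jacobi symbol can be computed without knowing $p$ and $q$. The sketch below uses the usual binary algorithm; `jacobi` and `in_zn_plus` are names chosen for illustration.

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, computed without factoring n."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2: (2/n) = -1 iff n = 3 or 5 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swapping flips the sign iff both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0  # 0 means gcd(a, n) > 1

def in_zn_plus(a, n):
    """Membership test for Z_n^+ that uses only the public modulus n."""
    return jacobi(a, n) == 1

# Cross-check against the definition, using the factors of a toy modulus.
p, q = 7, 11
n = p * q

def euler(a, r):
    return pow(a, (r - 1) // 2, r) == 1

for a in range(1, n):
    if gcd(a, n) == 1:
        assert in_zn_plus(a, n) == (euler(a, p) == euler(a, q))

zn_plus = [a for a in range(1, n) if in_zn_plus(a, n)]
# Z_n^+ is closed under multiplication, as the note "Z_n^+ is a group" states.
assert all(in_zn_plus(a * b % n, n) for a in zn_plus for b in zn_plus)
```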

First (wrong) idea

The user (input: $i$) sends $X_1, \ldots, X_w \in Z_n^+$, where $X_i$ is an NQR and every other $X_j$ is a QR.

The database (input: $B_1, \ldots, B_w$), for every $j = 1, \ldots, w$, sets

$$Y_j = \begin{cases} X_j^2 & \text{if } B_j = 0 \\ X_j & \text{otherwise} \end{cases}$$

Every $Y_j$ with $j \neq i$ is a QR, and $Y_i$ is a QR iff $B_i = 0$.

The database sets $M = Y_1 \cdot Y_2 \cdot \ldots \cdot Y_w$ and sends $M$ to the user.

$M$ is a QR iff $B_i = 0$, so the user checks if $M$ is a QR.
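The protocol of this slide can be simulated end to end with toy parameters. The sketch below makes several assumptions that are not in the slides: tiny primes (1009 and 1013, far too small for any security), 0-based indices, uniformly random choices of the $X_j$, and helper names invented for illustration.

```python
import random
from math import gcd

def euler(a, r):
    """Euler's criterion: is a a square modulo the odd prime r?"""
    return pow(a, (r - 1) // 2, r) == 1

def random_qr(n):
    """Random element of QR(n): the square of a random unit."""
    while True:
        a = random.randrange(1, n)
        if gcd(a, n) == 1:
            return a * a % n

def random_nqr(p, q):
    """Random element of Z_n^+ outside QR(n): a non-residue mod p and mod q."""
    n = p * q
    while True:
        a = random.randrange(1, n)
        if gcd(a, n) == 1 and not euler(a, p) and not euler(a, q):
            return a

def user_query(i, w, p, q):
    """X_i is an NQR, every other X_j is a QR (indices are 0-based here)."""
    n = p * q
    return [random_nqr(p, q) if j == i else random_qr(n) for j in range(w)]

def database_answer(B, X, n):
    """Y_j = X_j^2 if B_j = 0, else X_j; the answer is M = product of all Y_j."""
    M = 1
    for b, x in zip(B, X):
        y = x * x % n if b == 0 else x
        M = M * y % n
    return M

def user_decode(M, p, q):
    """M is a QR iff B_i = 0; the user knows p and q and can test this."""
    return 0 if euler(M, p) and euler(M, q) else 1

p, q = 1009, 1013          # toy primes, known only to the user
B = [1, 0, 1, 1, 0, 0, 1, 0]
for i in range(len(B)):
    X = user_query(i, len(B), p, q)
    M = database_answer(B, X, p * q)
    assert user_decode(M, p, q) == B[i]
```

The query contains one group element per database bit, which is why this version sends more than the database itself contains.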

Problems!

PIR from the previous slide:

correctness

security? To learn $i$ the database would need to distinguish NQR from QR. √

non-triviality? doesn’t hold!

communication (here $|Z_n^*|$ denotes the number of bits of one element of $Z_n^*$):

user → database: $|B| \cdot |Z_n^*|$

database → user: $|Z_n^*|$

Call it: $(|B|, 1)$-PIR

How to fix it?

Idea
:

Given:

construct

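Suppose that $|B| = v^2$ and present $B$ as a $v \times v$ matrix (here $v = 4$):

$$\begin{pmatrix} B_{13} & B_{14} & B_{15} & B_{16} \\ B_{9} & B_{10} & B_{11} & B_{12} \\ B_{5} & B_{6} & B_{7} & B_{8} \\ B_{1} & B_{2} & B_{3} & B_{4} \end{pmatrix}$$

consider each row as a separate database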

The method: execute $v$ $(v,1)$-PIRs in parallel

Looks even worse:

communication:

user → database: $v^2 \cdot |Z_n^*|$

database → user: $v \cdot |Z_n^*|$

An improved idea

Let $j$ be the column where $B_i$ is.

In every “row” the user asks for the $j$th element, so instead of sending $v$ queries the user can send one!

Observe: in this way the user learns all the elements in the $j$th column!

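Putting things together

Let $B_i$ lie in the $k$th row and the $j$th column of the $v \times v$ matrix, and write $B_{r,l}$ for the entry in row $r$ and column $l$.

The user sends one query $(X_1, \ldots, X_v)$, where $X_j$ is an NQR and all the other $X_l$ are QRs.

The database uses the same query for every row (the same row of $X$’s is copied $v$ times). For every row $r$ and every $l = 1, \ldots, v$ it sets

$$Y_{r,l} = \begin{cases} X_l^2 & \text{if } B_{r,l} = 0 \\ X_l & \text{otherwise} \end{cases}$$

It then multiplies the elements in each row, $M_r = Y_{r,1} \cdot \ldots \cdot Y_{r,v}$, and sends $(M_1, \ldots, M_v)$.

$B_i = 0$ iff $M_k$ is a QR: only $M_k$ counts for the user.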
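The matrix version can be simulated in the same way. Again this is only a sketch under assumptions not found in the slides: toy primes, 0-based row and column numbers, rows stored top to bottom in index order, and invented helper names. The function `retrieve` also returns the number of group elements exchanged, to make the $2\sqrt{|B|}$ count visible.

```python
import random
from math import gcd, isqrt

def euler(a, r):
    """Euler's criterion: is a a square modulo the odd prime r?"""
    return pow(a, (r - 1) // 2, r) == 1

def sample(p, q, residue):
    """Random element of Z_n^+: a QR if residue is True, otherwise an NQR."""
    n = p * q
    while True:
        a = random.randrange(1, n)
        if gcd(a, n) != 1:
            continue
        if residue:
            return a * a % n
        if not euler(a, p) and not euler(a, q):
            return a

def user_query(j, v, p, q):
    """One query of v elements: an NQR in column j, QRs elsewhere."""
    return [sample(p, q, residue=(l != j)) for l in range(v)]

def database_answer(rows, X, n):
    """Apply the same query to every row and return one product per row."""
    answers = []
    for row in rows:
        M = 1
        for b, x in zip(row, X):
            M = M * (x * x % n if b == 0 else x) % n
        answers.append(M)
    return answers

def retrieve(B, i, p, q):
    """Run the protocol for index i; return (bit, group elements exchanged)."""
    v = isqrt(len(B))
    assert v * v == len(B), "this sketch assumes |B| is a perfect square"
    rows = [B[r * v:(r + 1) * v] for r in range(v)]
    k, j = divmod(i, v)                 # row k and column j of B_i
    X = user_query(j, v, p, q)
    answers = database_answer(rows, X, p * q)
    M = answers[k]                      # only the k-th answer counts
    bit = 0 if euler(M, p) and euler(M, q) else 1
    return bit, len(X) + len(answers)

p, q = 1009, 1013
B = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1]
for i in range(len(B)):
    bit, elements = retrieve(B, i, p, q)
    assert bit == B[i]
    assert elements == 2 * isqrt(len(B))
```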

So we are done!

PIR from the previous slide:

correctness

non-triviality: communication complexity = $2\sqrt{|B|} \cdot |Z_n^*|$

security? To learn $i$ the database would need to distinguish NQR from QR.

Formally: from an adversary that breaks our scheme we can construct an algorithm that breaks QRA.

simulates:

Improvements

user U → database D: $(X_1, \ldots, X_v)$

database D → user U: $(M_1, \ldots, M_v)$

The user is interested in just one $M_k$.

Idea: apply PIR recursively!

Complexity of PIRs: overview of the results

Communication:

“recursive” PIR of [KO97]: for every $c > 0$: $O(|B|^c)$

[ , 1999]: poly-logarithmic in $|B|$

[Lipmaa, 2005]: $O(\log^2 |B|)$

For practical analysis see: [Sion, Carbunar] On the Computational Practicality of Private Information Retrieval.

their conclusion: It is the time-complexity that matters.

In real life: it is still more practical to transmit the entire database.

Extensions

Symmetric PIR (also protect privacy of the database). [Gertner, Ishai, Kushilevitz, Malkin, 1998]

Searching by key-words [Chor, Gilboa, Naor, 1997]

Public-key encryption with key-word search [Boneh, Di Crescenzo, Ostrovsky, Persiano]

Open problems:

Improve efficiency.

Construct new extensions.

What was the key property that we used?

homomorphism of QR

Holy grail: fully-homomorphic encryption

Fully-homomorphic encryption

Observe that we constructed a 1-bit probabilistic public-key encryption scheme:

$$Enc(X) = \begin{cases} \text{random QR} & \text{if } X = 0 \\ \text{random NQR} & \text{if } X = 1 \end{cases}$$

It is homomorphic with respect to xor: $Enc(X) \cdot Enc(Y)$ is an encryption of $X \text{ xor } Y$.
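The xor-homomorphic scheme can be sketched as follows. To let anyone encrypt without knowing $p$ and $q$, this sketch publishes a fixed non-residue `z` from $Z_n^+$ alongside $n$, in the style of the Goldwasser-Micali cryptosystem; the slides do not mention `z`, so it is an assumption made here, as are the toy primes and all function names.

```python
import random
from math import gcd

def euler(a, r):
    """Euler's criterion: is a a square modulo the odd prime r?"""
    return pow(a, (r - 1) // 2, r) == 1

def keygen(p, q):
    """Public key (n, z) with z a non-residue mod p and mod q; secret key (p, q)."""
    n = p * q
    while True:
        z = random.randrange(2, n)
        if gcd(z, n) == 1 and not euler(z, p) and not euler(z, q):
            return (n, z), (p, q)

def enc(bit, public):
    """Enc(0) is a random QR, Enc(1) is a random NQR in Z_n^+."""
    n, z = public
    while True:
        r = random.randrange(1, n)
        if gcd(r, n) == 1:
            return pow(z, bit, n) * r * r % n

def dec(c, secret):
    """Decrypt by testing quadratic residuosity with the factors of n."""
    p, q = secret
    return 0 if euler(c, p) and euler(c, q) else 1

public, secret = keygen(1009, 1013)   # toy primes, no real security
n = public[0]

for x in (0, 1):
    assert dec(enc(x, public), secret) == x
    for y in (0, 1):
        # Multiplying ciphertexts yields an encryption of x xor y.
        product = enc(x, public) * enc(y, public) % n
        assert dec(product, secret) == x ^ y
```

Multiplying by a fresh `enc(1, public)` flips the encrypted bit, so negation comes for free; what this scheme lacks is a way to compute a conjunction on ciphertexts.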

It would be really useful to have an encryption scheme homomorphic with respect to conjunction and negation simultaneously.