Cryptography
Lecture 12
Stefan Dziembowski
www.dziembowski.net
stefan@dziembowski.net
Plan
1. Introduction to multiparty cryptographic protocols.
2. Private Information Retrieval
Traditional scenario
Alice and Bob are attacked by an adversary.
Multiparty protocols
A group of players wants to perform some task together, even though they do not trust each other.
This is a vast area.
Examples:
• voting
• coin-tossing
• auctions
• ...
Today’s lecture:
• Private Information Retrieval
AOL search data scandal (2006)
Searches of user #4417749:
• clothes for age 60
• 60 single men
• best retirement city
• jarrett arnold
• jack t. arnold
• jaylene and jarrett arnold
• gwinnett county yellow pages
• rescue of older dogs
• movies for dogs
• sinus infection
Identified as: Thelma Arnold, a 62-year-old widow from Lilburn, Georgia
Observation
The owners of databases know a lot about the users!
This poses a risk to users’ privacy.
E.g. consider database with stock prices…
Can we do something about it?
Yes, we can:
• trust them that they will protect our secrecy (problematic!), or
• use cryptography!
How can crypto help?
Note: this problem has nothing to do with secure communication!
[Figure: user U and database D]
Our setting
[Figure: user U and database D, connected by a secure link]
A new primitive: Private Information Retrieval (PIR)
Plan
1. Definition of PIR
2. An ideal PIR doesn’t exist
3. Construction of a computational PIR
4. Open problems
Literature:
• B. Chor, E. Kushilevitz, O. Goldreich and M. Sudan, Private Information Retrieval, Journal of the ACM, 1998
• [KO97] E. Kushilevitz and R. Ostrovsky, Replication Is NOT Needed: SINGLE Database, Computationally-Private Information Retrieval, FOCS 1997
Question
How to protect privacy of queries?
• The user U wants to retrieve some data from D.
• The database D shouldn’t learn what U retrieved.
Let’s make things simple!
• database $B$: $B_1, B_2, \ldots, B_w$, where each $B_i \in \{0,1\}$
• index: $i \in \{1,\ldots,w\}$
• the user should learn $B_i$ (he may also learn other $B_i$’s)
Trivial solution
The database simply sends everything, $B_1, B_2, \ldots, B_w$, to the user!
Non-triviality
The previous solution has a drawback: the communication complexity is huge!
Therefore we introduce the following requirement:
“Non-triviality”: the number of bits communicated between U and D has to be smaller than $w$.
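The trivial solution and the non-triviality requirement can be illustrated with a minimal sketch. The function name `trivial_pir` and the bit-counting convention (one bit per database entry, nothing sent by the user) are assumptions made for this example only.

```python
import random

def trivial_pir(database_bits, i):
    """Trivial PIR: the database sends everything, the user picks B_i locally.

    The index i is 1-based, as on the slides.
    """
    transcript = list(database_bits)   # the only message: the whole database
    answer = transcript[i - 1]         # the user reads B_i without revealing i
    bits_communicated = len(transcript)
    return answer, bits_communicated

w = 16
B = [random.randint(0, 1) for _ in range(w)]
answer, cost = trivial_pir(B, 5)

# Correct and perfectly private, but the cost equals w, so "cost < w" fails.
print(answer == B[4], cost, cost < w)
```

The database learns nothing about the index, since the user sends nothing at all; the only requirement that fails is non-triviality.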
Private Information Retrieval (PIR)
The user and the database are polynomial-time randomized interactive algorithms.
• input of the database: $B_1, B_2, \ldots, B_w$
• input of the user: index $i \in \{1,\ldots,w\}$
Requirements:
• correctness: at the end the user learns $B_i$
• secrecy (of the user): the database does not learn $i$ (this property needs to be defined more formally!)
• non-triviality: the total communication is $< w$
Note: secrecy of the database is not required.
How to define secrecy of the user [1/2]?
[Figure: the user, holding $i$, sends a query $Q(i)$ to the database, holding $B$; the database answers with a reply $A(Q(i),B)$.]
Def. $T(i,B)$ – transcript of the conversation.
How to define secrecy of the user [2/2]?
Secrecy of the user: for every $i, j \in \{1,\ldots,w\}$
• single-round case: it is impossible to distinguish between $Q(i)$ and $Q(j)$
• multi-round case: it is impossible to distinguish between $T(i,B)$ and $T(j,B)$, even if the adversary is malicious
What does “impossible to distinguish” mean?
For now say: the distributions of $Q(i)$ and $Q(j)$ are the same.
PIR doesn’t exist [1/4]
We now show that correctness, non-triviality and secrecy cannot be satisfied simultaneously.
Def: A transcript $T$ is possible for $(i,B)$ if $P(T(i,B) = T) > 0$.
Take some $T’$, and look where it is possible:
[Figure: a table with one row per database $B$ and one column per index $i$; the cells $(i,B)$ for which $T’$ is possible are marked with $T’$.]
PIR doesn’t exist [2/4]
Observation: secrecy → if $T’$ is possible for some $B$ and $i$, then it is possible for $B$ and all the other $i$’s.
[Figure: the same table; every row that contains $T’$ is now filled with $T’$ in all columns.]
PIR doesn’t exist [3/4]
non-triviality → length(transcript) < length(database)
→ #transcripts < #databases
→ there has to exist $T’$ that is possible for two databases $B^0$ and $B^1$
[Figure: the same table; the rows of $B^0$ and $B^1$ are both filled with $T’$.]
PIR doesn’t exist [4/4]
$B^0$ and $B^1$ differ on at least one index $i’$.
So, if $i’$ is the input of the user then correctness → contradiction: after seeing $T’$ the user outputs the same bit for both databases, which is wrong for one of them.
[Figure: the same table, with column $i’$ marked in the rows of $B^0$ and $B^1$.]
So PIR doesn’t exist!
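The counting step of the argument can be spelled out as follows. This is a sketch, assuming that transcripts are bit strings of length at most $w-1$ and databases are bit strings of length exactly $w$.

```latex
\#\{\text{transcripts}\}
  \;\le\; \sum_{k=0}^{w-1} 2^k
  \;=\; 2^w - 1
  \;<\; 2^w
  \;=\; \#\{\text{databases}\}
```

By correctness every database has at least one possible transcript, so by the pigeonhole principle two different databases must share one.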
• How to bypass the impossibility result?
• Two ideas:
  – limit the computing power of a cheating database
  – use a larger number of “independent” databases
Computationally-secure PIR
computational secrecy: for every $i, j \in \{1,\ldots,w\}$ it is impossible to distinguish efficiently between $T(i,B)$ and $T(j,B)$.
Formally: for every polynomial-time probabilistic algorithm $A$ the value
$|P(A(T(i,B)) = 0) - P(A(T(j,B)) = 0)|$
should be negligible.
Hardness assumptions?
[KO97] – construct PIR based on the Quadratic Residuosity Assumption
We describe it on the next slides.
Favourite cryptographers’ group
$p, q$ – large random primes ($p \approx q \approx 2^{1024}$, say)
RSA group: $Z_n^*$, where $n = pq$
Chinese remainder theorem
Chinese remainder theorem (CRT): For $n = pq$ (where $p$ and $q$ are distinct primes) the function
$\lambda : Z_n^* \to Z_p^* \times Z_q^*$ defined as $\lambda(i) := (i \bmod p,\ i \bmod q)$
is an isomorphism.
Example
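[Figure: the elements $0, \ldots, 14$ of $Z_{15}$ arranged in a 3 × 5 table, with rows indexed by $i \bmod 3$ and columns by $i \bmod 5$; the elements of $Z_{15}^*$ are exactly those whose coordinates under $\lambda(i) := (i \bmod p,\ i \bmod q)$ lie in $Z_3^* \times Z_5^*$.]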
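The CRT statement can be checked by brute force for the example $n = 15$. This is a small sketch; the helper names `units` and `lam` are chosen for this example only.

```python
from math import gcd

p, q = 3, 5
n = p * q

def units(m):
    """Elements of Z_m^*: the numbers in 1..m-1 coprime to m."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def lam(i):
    """The CRT map lambda(i) = (i mod p, i mod q)."""
    return (i % p, i % q)

Zn = units(n)
image = {lam(i) for i in Zn}
target = {(a, b) for a in units(p) for b in units(q)}

# lambda is a bijection onto Z_p^* x Z_q^* ...
is_bijection = len(image) == len(Zn) and image == target
# ... and it respects multiplication, coordinate by coordinate.
is_multiplicative = all(
    lam(a * b % n) == (lam(a)[0] * lam(b)[0] % p, lam(a)[1] * lam(b)[1] % q)
    for a in Zn for b in Zn
)
print(Zn)
print(is_bijection, is_multiplicative)
```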
Quadratic Residues
Def. $x$ is a quadratic residue modulo $m$ if there exists $a \in Z_m^*$ such that $x = a^2 \bmod m$.
$QR(m)$ := the set of all quadratic residues modulo $m$.
$QNR(m) := Z_m^* \setminus QR(m)$
Example for $m = 13$, listing $a \in Z_{13}^*$ and $a^2 \bmod 13$:
a:    1  2  3  4   5   6   7   8  9  10  11  12
a^2:  1  4  9  3  12  10  10  12  3   9   4   1
$QR(13) = \{1, 3, 4, 9, 10, 12\}$
Observation: every quadratic residue modulo 13 has exactly 2 square roots, and hence $|QR(13)| = |Z_{13}^*| / 2$.
A Lemma about QRs modulo prime p
Lemma: For every prime $p > 2$ we have $|QR(p)| = (p-1)/2$.
Remark: Let $g$ be a generator of $Z_p^*$. Then
$QR(p) = \{g^0, g^2, g^4, g^6, \ldots, g^{p-3}\}$,
$QNR(p) = \{g^1, g^3, g^5, g^7, \ldots, g^{p-2}\}$.
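The lemma and the remark can be checked for $p = 13$ with a few lines of code. This is a sketch; the generator is found by brute force, which is only reasonable for tiny $p$.

```python
p = 13
units_p = list(range(1, p))

# QR(p): all squares of elements of Z_p^*.
qr = sorted({a * a % p for a in units_p})

def is_generator(g):
    """g generates Z_p^* iff its powers g^0..g^(p-2) are all distinct."""
    return len({pow(g, k, p) for k in range(p - 1)}) == p - 1

g = next(a for a in units_p if is_generator(a))

# Even powers of g should be exactly QR(p), odd powers exactly QNR(p).
even_powers = sorted(pow(g, k, p) for k in range(0, p - 1, 2))
odd_powers = sorted(pow(g, k, p) for k in range(1, p - 1, 2))

print(g, qr, even_powers == qr)
```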
QRs modulo pq
Example for $n = 15$, listing $a \in Z_{15}^*$ and $a^2 \bmod 15$:
a:    1  2  4  7  8  11  13  14
a^2:  1  4  1  4  4   1   4   1
$QR(15) = \{1, 4\}$
Observation: every quadratic residue modulo 15 has exactly 4 square roots, and hence $|QR(15)| = |Z_{15}^*| / 4$.
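A Lemma about QRs modulo pq
Fact: For $n = pq$ (with $p$, $q$ distinct odd primes) we have $|QR(n)| = |Z_n^*| / 4$.
Proof: $x \in QR(n)$
iff $x = a^2 \bmod n$, for some $a \in Z_n^*$
iff (by CRT) $x = a^2 \bmod p$ and $x = a^2 \bmod q$
iff $x \bmod p \in QR(p)$ and $x \bmod q \in QR(q)$.
So, under the CRT isomorphism, $QR(n)$ corresponds to $QR(p) \times QR(q)$, which has $\frac{p-1}{2} \cdot \frac{q-1}{2} = |Z_n^*| / 4$ elements.
[Figure: $Z_n^*$ drawn as a rectangle with one axis “mod p” and the other “mod q”; $QR(n)$ is the sub-rectangle $QR(p) \times QR(q)$.]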
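QRs modulo pq – an example
[Figure: $Z_{15}$ arranged in a 3 × 5 table by $(i \bmod 3,\ i \bmod 5)$, with $Z_{15}^*$ highlighted. $QR(3) = \{1\}$, since $1^2 \bmod 3 = 2^2 \bmod 3 = 1$. $QR(5) = \{1, 4\}$, since $1^2 \bmod 5 = 4^2 \bmod 5 = 1$ and $2^2 \bmod 5 = 3^2 \bmod 5 = 4$. Hence $QR(15) = \{1, 4\}$: the elements $i$ with $i \bmod 3 \in QR(3)$ and $i \bmod 5 \in QR(5)$.]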
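The example can be reproduced by brute force. The sketch below computes $QR(15)$ directly and via the CRT characterization, and lists the four square roots of each residue; the helper name `qr_set` is chosen for this example only.

```python
from math import gcd

p, q = 3, 5
n = p * q

def qr_set(m):
    """QR(m): squares of all elements of Z_m^* (brute force)."""
    return {a * a % m for a in range(1, m) if gcd(a, m) == 1}

units_n = [a for a in range(1, n) if gcd(a, n) == 1]

qr_n = qr_set(n)
# CRT characterization: x is in QR(n) iff it is a QR modulo p and modulo q.
via_crt = {x for x in units_n if x % p in qr_set(p) and x % q in qr_set(q)}
# Square roots of every residue.
roots = {x: [a for a in units_n if a * a % n == x] for x in sorted(qr_n)}

print(sorted(qr_n), via_crt == qr_n)
print(roots)
```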
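Homomorphism of QR(pq)
$Q(n,a) = 0$ if $a \in QR(n)$, and $1$ otherwise.
Homomorphism: for all $a, b \in Z_n^*$ that are residues modulo both $p$ and $q$, or non-residues modulo both (this is the set $Z_n^+$ defined below):
$Q(n, ab) = Q(n,a) \text{ xor } Q(n,b)$
(exercise)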
Algorithmic questions about QR
• Suppose $n = pq$.
• Is it easy to test membership in $QR(n)$?
• Fact: if one knows $p$ and $q$ – yes!
• What if one doesn’t know $p$ and $q$?
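Why the test is easy with $p$ and $q$: by the CRT argument it suffices to test residuosity modulo each prime, and for that one can use Euler's criterion. The sketch below uses Euler's criterion, which the slides do not mention explicitly; the function names and the toy primes are assumptions made for this example.

```python
from math import gcd

def is_qr_mod_prime(a, p):
    """Euler's criterion: for an odd prime p and a not divisible by p,
    a is a quadratic residue mod p iff a^((p-1)/2) = 1 mod p."""
    return pow(a, (p - 1) // 2, p) == 1

def is_qr_with_factors(a, p, q):
    """Test a in QR(pq) for a in Z_pq^*, given the distinct odd primes p, q."""
    return is_qr_mod_prime(a, p) and is_qr_mod_prime(a, q)

# Compare with brute force on a tiny modulus.
p, q = 7, 11
n = p * q
units = [a for a in range(1, n) if gcd(a, n) == 1]
brute_force = {a * a % n for a in units}
agrees = all(is_qr_with_factors(a, p, q) == (a in brute_force) for a in units)

# The same test stays fast for larger (still toy-sized) primes.
big_p, big_q = 1009, 1013
square = pow(123456, 2, big_p * big_q)
square_is_qr = is_qr_with_factors(square, big_p, big_q)

print(agrees, square_is_qr)
```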
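Quadratic Residuosity Assumption (QRA)
$n = pq$, where $p$ and $q$ are large primes.
$Z_n^+$ := all $a \in Z_n^*$ such that $a \bmod p \in QR(p)$ iff $a \bmod q \in QR(q)$.
[Figure: $Z_n^*$ drawn as a 2 × 2 grid with columns $QR(p)$, $QNR(p)$ and rows $QR(q)$, $QNR(q)$. $Z_n^+$ consists of the cell $QR(p) \times QR(q)$, which is $QR(n)$, and the cell $QNR(p) \times QNR(q)$. For a random $a \in Z_n^+$: which of the two cells is it in?]
Quadratic Residuosity Assumption (QRA): For a random $a \in Z_n^+$ it is computationally hard to determine if $a \in QR(n)$.
Formally: for every polynomial-time probabilistic algorithm $G$ the value
$|P(G(n,a) = Q(n,a)) - 0.5|$
(where $a \in Z_n^+$ is random) is negligible.
Note: $Z_n^+$ is a group!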
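The group property of $Z_n^+$ and the xor-homomorphism can be checked by brute force on a toy modulus. The sketch below makes two assumptions explicit: it uses the convention $Q = 0$ for residues and $Q = 1$ for non-residues, and it restricts the homomorphism check to $Z_n^+$. The last check shows that the identity does fail on the rest of $Z_n^*$.

```python
from math import gcd

p, q = 7, 11          # toy primes
n = p * q

def is_qr_mod_prime(a, r):
    """Euler's criterion modulo an odd prime r (a not divisible by r)."""
    return pow(a, (r - 1) // 2, r) == 1

def in_zn_plus(a):
    """a is in Z_n^+ iff it is a QR mod both primes or an NQR mod both."""
    return gcd(a, n) == 1 and is_qr_mod_prime(a, p) == is_qr_mod_prime(a, q)

def Q(a):
    """0 if a is a quadratic residue mod n, 1 otherwise (uses p and q)."""
    return 0 if is_qr_mod_prime(a, p) and is_qr_mod_prime(a, q) else 1

units = [a for a in range(1, n) if gcd(a, n) == 1]
zn_plus = [a for a in units if in_zn_plus(a)]

# Z_n^+ is closed under multiplication (it is a subgroup of Z_n^*).
closed = all(in_zn_plus(a * b % n) for a in zn_plus for b in zn_plus)
# On Z_n^+ the map Q turns multiplication into xor.
homomorphic = all(Q(a * b % n) == Q(a) ^ Q(b) for a in zn_plus for b in zn_plus)
# On all of Z_n^* the identity breaks.
fails_outside = any(Q(a * b % n) != Q(a) ^ Q(b) for a in units for b in units)

print(len(zn_plus), closed, homomorphic, fails_outside)
```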
We are ready to construct PIR!
Our PIR will work in the group $Z_n^+$, where $n = pq$.
What’s so good about this group?
• testing membership in $Z_n^+$ is easy (compute the Jacobi symbol),
• testing membership in $QR(n)$ is hard for random elements of $Z_n^+$, unless one knows $p$ and $q$,
• homomorphism of $Q$!
First (wrong) idea
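1. The user, holding the index $i$, sends $X_1, \ldots, X_w \in Z_n^+$ to the database, where $X_i$ is a random NQR and every other $X_j$ is a random QR:
   $X_1$ (QR), $X_2$ (QR), ..., $X_{i-1}$ (QR), $X_i$ (NQR), $X_{i+1}$ (QR), ..., $X_{w-1}$ (QR), $X_w$ (QR)
2. The database, holding $B_1, B_2, \ldots, B_w$, sets for every $j = 1, \ldots, w$:
   $Y_j = X_j^2$ if $B_j = 0$, and $Y_j = X_j$ otherwise.
   Every $Y_j$ with $j \ne i$ is a QR, and $Y_i$ is a QR iff $B_i = 0$.
3. The database sets $M = Y_1 \cdot Y_2 \cdot \ldots \cdot Y_w$ and sends $M$ to the user. $M$ is a QR iff $B_i = 0$.
4. The user checks if $M$ is a QR.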
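The protocol above can be sketched in code as follows. This is an illustration with toy primes, not a secure implementation: the function names, the rejection sampling of residues and non-residues, and the use of Euler's criterion for the final test are assumptions made for this example.

```python
import random
from math import gcd

def is_qr_mod_prime(a, r):
    """Euler's criterion modulo an odd prime r (a not divisible by r)."""
    return pow(a, (r - 1) // 2, r) == 1

def random_qr(n):
    """Random element of QR(n): the square of a random unit."""
    while True:
        a = random.randrange(1, n)
        if gcd(a, n) == 1:
            return a * a % n

def random_nqr_plus(n, p, q):
    """Random non-residue in Z_n^+: a non-residue modulo both p and q."""
    while True:
        a = random.randrange(1, n)
        if gcd(a, n) == 1 and not is_qr_mod_prime(a, p) and not is_qr_mod_prime(a, q):
            return a

def user_query(i, w, n, p, q):
    """X_1..X_w: an NQR at position i (1-based), QRs everywhere else."""
    return [random_nqr_plus(n, p, q) if j == i else random_qr(n)
            for j in range(1, w + 1)]

def database_reply(B, X, n):
    """M = product of Y_j, where Y_j = X_j^2 if B_j = 0 and Y_j = X_j otherwise."""
    M = 1
    for b, x in zip(B, X):
        y = x * x % n if b == 0 else x
        M = M * y % n
    return M

def user_decode(M, p, q):
    """B_i = 0 iff M is a quadratic residue mod n."""
    return 0 if is_qr_mod_prime(M, p) and is_qr_mod_prime(M, q) else 1

p, q = 1009, 1013            # toy primes, known only to the user
n = p * q
w = 8
B = [random.randint(0, 1) for _ in range(w)]

recovered = []
for i in range(1, w + 1):
    X = user_query(i, w, n, p, q)
    M = database_reply(B, X, n)
    recovered.append(user_decode(M, p, q))
print(recovered == B)
```

The query consists of $w$ group elements, so the communication is far larger than $w$ bits.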
Problems!
PIR from the previous slide:
• correctness √
• security? To learn $i$ the database would need to distinguish NQR from QR. √
• non-triviality? doesn’t hold!
communication:
  user → database: $|B| \cdot |Z_n^*|$
  database → user: $|Z_n^*|$
Call it: $(|B|, 1)$-PIR
How to fix it?
Idea: given the $(|B|, 1)$-PIR above, construct a PIR with less communication.
Suppose that $|B| = v^2$ and present $B$ as a $v \times v$ matrix:
B13 B14 B15 B16
B9  B10 B11 B12
B5  B6  B7  B8
B1  B2  B3  B4
Consider each row as a separate database.
An improved idea
[Figure: the same $v \times v$ matrix, with $B_i$ marked in column $j$.]
Let $j$ be the column where $B_i$ is.
Execute $v$ $(v,1)$-PIRs in parallel: in every “row” the user asks for the $j$th element.
Looks even worse:
communication:
  user → database: $v^2 \cdot |Z_n^*|$
  database → user: $v \cdot |Z_n^*|$
The method: in every row the user asks for the same element. So, instead of sending $v$ queries the user can send one!
Observe: in this way the user learns all the elements in the $j$th column!
Putting things together
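1. The user, holding $i$ (with $B_i$ in row $k$ and column $j$ of the matrix), sends $X_1, \ldots, X_v \in Z_n^+$, where $X_j$ is a random NQR and every other $X_l$ is a random QR.
2. The database uses the same query $(X_1, \ldots, X_v)$ for each of the $v$ rows (the same row of $X$’s is copied $v$ times). For the $k$th row with entries $B_{k,1}, \ldots, B_{k,v}$ and every $l = 1, \ldots, v$ it sets
   $Y_{k,l} = X_l^2$ if $B_{k,l} = 0$, and $Y_{k,l} = X_l$ otherwise.
3. The database multiplies the elements in each row, $M_k = Y_{k,1} \cdot \ldots \cdot Y_{k,v}$, and sends $(M_1, \ldots, M_v)$ to the user.
4. $B_{k,j} = 0$ iff $M_k$ is a QR. Among $M_1, \ldots, M_v$ only the $M_k$ of the row that contains $B_i$ counts.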
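The whole construction can be sketched as follows. Again this is an illustration with toy primes: the function names, the row-major layout of the matrix, and the 0-based column index inside the code are assumptions made for this example.

```python
import random
from math import gcd, isqrt

def is_qr_mod_prime(a, r):
    """Euler's criterion modulo an odd prime r (a not divisible by r)."""
    return pow(a, (r - 1) // 2, r) == 1

def sample(n, p, q, want_qr):
    """Rejection-sample an element of Z_n^+: a QR mod both primes if
    want_qr is True, an NQR mod both primes otherwise."""
    while True:
        a = random.randrange(1, n)
        if gcd(a, n) != 1:
            continue
        rp, rq = is_qr_mod_prime(a, p), is_qr_mod_prime(a, q)
        if rp == rq == want_qr:
            return a

def user_query(col, v, n, p, q):
    """One query of v elements: an NQR in column col (0-based), QRs elsewhere."""
    return [sample(n, p, q, want_qr=(l != col)) for l in range(v)]

def database_reply(rows, X, n):
    """Apply the same query to every row; return one product M_k per row."""
    reply = []
    for row in rows:
        M = 1
        for b, x in zip(row, X):
            M = M * (x * x % n if b == 0 else x) % n
        reply.append(M)
    return reply

def user_decode(M_k, p, q):
    """The bit in the queried column is 0 iff M_k is a QR mod n."""
    return 0 if is_qr_mod_prime(M_k, p) and is_qr_mod_prime(M_k, q) else 1

p, q = 1009, 1013                 # toy primes, known only to the user
n = p * q
w = 16
v = isqrt(w)                      # the database is a v x v matrix
B = [random.randint(0, 1) for _ in range(w)]
rows = [B[k * v:(k + 1) * v] for k in range(v)]

i = 11                            # 1-based index the user wants
k, col = divmod(i - 1, v)         # row and column of B_i
X = user_query(col, v, n, p, q)
M = database_reply(rows, X, n)
bit = user_decode(M[k], p, q)
elements_sent = len(X) + len(M)   # 2v group elements in total

print(bit == B[i - 1], elements_sent)
```

The count here is in group elements; in bits the cost is about $2v$ times the bit length of $n$, which drops below $w$ only once the database is large enough.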
So we are done!
PIR from the previous slide:
• correctness √
• non-triviality: communication complexity = $2\sqrt{|B|} \cdot |Z_n^*|$ √
• security? To learn $i$ the database would need to distinguish NQR from QR.
Formally: from any adversary that breaks our scheme we can construct an algorithm that breaks QRA. The algorithm simulates the scheme for the adversary.
Improvements
[Figure: the user U sends $(X_1, \ldots, X_v)$ to the database D, which replies with $(M_1, \ldots, M_v)$.]
The user is interested just in one $M_i$.
Idea: apply PIR recursively!
Complexity of PIRs – overview of the results
Communication:
• “recursive” PIR of [KO97]: for every $c > 0$: $O(|B|^c)$
• [Cachin, Micali, Stadler, 1999]: poly-logarithmic in $|B|$
• [Lipmaa, 2005]: $O(\log^2 |B|)$
For practical analysis see:
• [Sion, Carbunar] On the Computational Practicality of Private Information Retrieval.
Their conclusion: it is the time complexity that matters. In real life it is still more practical to transmit the entire database.
Extensions
• Symmetric PIR (also protects privacy of the database). [Gertner, Ishai, Kushilevitz, Malkin, 1998]
• Searching by keywords [Chor, Gilboa, Naor, 1997]
• Public-key encryption with keyword search [Boneh, Di Crescenzo, Ostrovsky, Persiano]
Open problems:
• Improve efficiency.
• Construct new extensions.
What was the key property that we used?
homomorphism of QR
Holy grail: fully-homomorphic encryption
Fully-homomorphic encryption
Observe that we constructed a 1-bit probabilistic public-key encryption scheme:
Enc(X) = a random QR if X = 0, a random NQR (from $Z_n^+$) if X = 1
which is homomorphic with respect to xor:
Enc(X xor Y) = Enc(X) • Enc(Y)
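A minimal sketch of such a 1-bit scheme follows; it is essentially the Goldwasser–Micali encryption scheme. The key layout is an assumption made for this example: the public key contains $n$ together with a fixed non-residue $z \in Z_n^+$, so that anyone can produce random non-residues without knowing $p$ and $q$. The primes are toy-sized.

```python
import random
from math import gcd

def is_qr_mod_prime(a, r):
    """Euler's criterion modulo an odd prime r (a not divisible by r)."""
    return pow(a, (r - 1) // 2, r) == 1

def keygen():
    """Return (public key, secret key) = ((n, z), (p, q)) with toy primes."""
    p, q = 1009, 1013
    n = p * q
    while True:
        z = random.randrange(1, n)
        # z must be a non-residue modulo both primes, i.e. an NQR in Z_n^+.
        if gcd(z, n) == 1 and not is_qr_mod_prime(z, p) and not is_qr_mod_prime(z, q):
            return (n, z), (p, q)

def enc(pk, bit):
    """Enc(0) = random QR, Enc(1) = random NQR in Z_n^+."""
    n, z = pk
    while True:
        r = random.randrange(1, n)
        if gcd(r, n) == 1:
            break
    c = r * r % n                 # a random quadratic residue
    return c * z % n if bit else c

def dec(sk, c):
    """Decrypt with the factors: 0 iff c is a quadratic residue mod n."""
    p, q = sk
    return 0 if is_qr_mod_prime(c, p) and is_qr_mod_prime(c, q) else 1

pk, sk = keygen()
n = pk[0]

# Multiplying ciphertexts xors the plaintexts.
xor_ok = all(
    dec(sk, enc(pk, x) * enc(pk, y) % n) == x ^ y
    for x in (0, 1) for y in (0, 1)
)
print(xor_ok)
```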
It would be really useful to have an encryption scheme homomorphic with respect to conjunction and negation simultaneously.