HYDRA: A Flexible PQC Processor



Chen-Mou Cheng

National Taiwan University

November 16, 2012

Acknowledgment


Joint work with Bo-Yin Yang (Academia Sinica) and Andy Wu

Post-quantum cryptography

Hash-based cryptography
Code-based cryptography
Lattice-based cryptography
Multivariate cryptography

Multivariate cryptography


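Composition of maps
Public quadratic polynomials
$F_1$ and $F_k$ are affine ($y = Ax + b$)

Step 1. Generation: $p \to F_1 \to F_2 \to \dots \to F_k \to c$
  Composing the maps into $E$ is easy; recovering them from $E$ is hard
Step 2. Encryption: $p \to E \to c$
Step 3. Decryption: $p \leftarrow D_1 \leftarrow D_2 \leftarrow \dots \leftarrow D_k \leftarrow c$
  Each $D_i$ is easy to obtain from the corresponding $F_i$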
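The three steps can be illustrated with a toy instance. The sketch below is not taken from the slides: the field GF(31), the 3x3 matrices, and the triangular quadratic middle map are all assumptions chosen so that every step is easy to follow. A triangular middle map is trivially insecure; it only shows how `encrypt` composes the $F_i$ and `decrypt` applies the $D_i$ in reverse order.

```python
P = 31  # toy field GF(31); the size is an assumption made for this sketch

def mat_vec(A, x):
    """Matrix-vector product over GF(P)."""
    return [sum(a * xi for a, xi in zip(row, x)) % P for row in A]

def mat_inv(A):
    """Invert a square matrix over GF(P) by Gauss-Jordan elimination."""
    n = len(A)
    M = [[v % P for v in row] + [int(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        inv = pow(M[col][col], -1, P)          # modular inverse (Python 3.8+)
        M[col] = [v * inv % P for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [(v - f * u) % P for v, u in zip(M[r], M[col])]
    return [row[n:] for row in M]

# Hypothetical secret maps: F1 and F3 are affine, F2 is a triangular quadratic map.
A1, b1 = [[1, 2, 3], [0, 1, 4], [5, 6, 0]], [7, 8, 9]
A3, b3 = [[2, 0, 1], [1, 3, 0], [0, 1, 1]], [3, 1, 4]
A1_inv, A3_inv = mat_inv(A1), mat_inv(A3)

def F1(x):
    return [(v + b) % P for v, b in zip(mat_vec(A1, x), b1)]

def F2(x):
    x1, x2, x3 = x
    return [x1, (x2 + x1 * x1) % P, (x3 + x1 * x2) % P]

def F3(x):
    return [(v + b) % P for v, b in zip(mat_vec(A3, x), b3)]

def D1(y):
    return mat_vec(A1_inv, [(v - b) % P for v, b in zip(y, b1)])

def D2(y):
    x1 = y[0]
    x2 = (y[1] - x1 * x1) % P      # undo the quadratic terms one variable at a time
    x3 = (y[2] - x1 * x2) % P
    return [x1, x2, x3]

def D3(y):
    return mat_vec(A3_inv, [(v - b) % P for v, b in zip(y, b3)])

def encrypt(p):
    return F3(F2(F1(p)))           # E = F3 o F2 o F1

def decrypt(c):
    return D1(D2(D3(c)))           # invert the maps in reverse order
```

In a real scheme only the expanded quadratic polynomials of `encrypt` would be published; here the composition is kept explicit for readability.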

Classification of multivariates

Big-field multivariates
  Matsumoto-Imai derivatives
  SFLASH, HFE
Small-field (or true) multivariates
  Unbalanced Oil-and-Vinegar derivatives
  Rainbow, TTS

Security of UOV


MQ: Multivariate quadratics direct attacks
  Gröbner bases: XL, $F_4$/$F_5$ families
EIP: Extended Isomorphism of Polynomials, a.k.a. rank or linear algebra attacks
  Low rank attack
  High rank attack
  Reconciliation attack




The HYDRA processor


A scalable, programmable crypto coprocessor
Accompanying toolchains and software libraries
API to raise abstraction level for developing security applications
Allowing aggressive experimentation with PKC, especially PQC

Slogans


Cheap PKC
  Hardware acceleration of core computation
  Customizable for multiple vertical markets, allowing cost sharing
Future-proof PKC
  Algorithm agility, allowing “BIOS upgrades”
  PQC to resist emerging quantum-computers’ attacks
Management-free PKC
  Lower total cost of ownership via PKC
  Identity-based crypto
  No more PKI!
“If we build them [cheaply], they will come”

Target cryptosystems

Scheme    Low Security ($2^{80}$)                 High Security ($2^{112}$, $2^{128}$)
ECC       NIST 2K160                              NIST 2K233 (112bit)
          NIST P192                               NIST P256, Curve25519
          GLS1271                                 Surface1271 (HEC)
Pairings  BN (Barreto-Naehrig) 161                BN256
          LD (Lopez-Dahab) $2^{271}$              LD $2^{1223}$, Beuchat $3^{509}$
NTRU      ees251ep7                               ees347ep2 (112bit)
          ($q = 2$ instead of $q = 3$)            ees397ep1 (128bit)
MQPKC     Rainbow ($q = 16$ or 31; 24, 20, 20)    Rainbow ($q$; 32, 32, 32)
          TTS ($q = 16$ or 31; 24, 20, 20)
          3HFE($7^{31}$)-p                        3HFE($7^{47}$)-p

ASIC prototyping of NTRU
ASIC prototyping of TTS
ASIC prototyping of $\mathbb{F}_p$ multiplications

The Hydra microarchitecture

[Block diagram: D$, Axpy engine, Decoder, I$, Memory bus, μC, DMA]

Axpy-style ISA for regular data movement between cache & datapath, i.e., $Y \leftarrow aX + Y$, where $|a| = w$, $|X| = lw$, $|Y| = lw$ or $(l + 1)w$
Wide & flexible vector datapath
DMA engine to (pre-)fetch and store data to fill up vector datapath as much as possible
General-purpose μC for complex I/O

Design ingredients


Review: NTRU cryptosystem

Core operation: Multiplication in $\mathbb{Z}[x]/(x^n - 1)$

Key generation
  Randomly choose $f$ and $g$ with small coefficients
  Find $f_p$, $f_q$ such that $f_p f = 1 \bmod p$ and $f_q f = 1 \bmod q$
  Public key: $h = p f_q g$
  Private key: $f$, $f_p$
Encryption
  Randomly generate $r$ with coefficients in $[-1, 1]$
  $c = rh + m$
Decryption
  $a = fc$, with coefficients in $[-q/2, q/2]$
  $m = a f_p$, with coefficients in $[-p/2, p/2]$
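The key generation, encryption, and decryption steps can be traced end to end with a toy instance. This is a minimal sketch under assumptions that do not come from the slides: $n = 7$, $p = 3$, $q = 32$, and fixed polynomials `f` and `g`. The helper names are hypothetical, the inverse modulo 2 is found by brute force (feasible only because $n$ is tiny), and it is then lifted to modulo 32 by Newton iteration.

```python
import random
from itertools import product

N, P, Q = 7, 3, 32  # toy parameters chosen for this sketch; far too small to be secure
ONE = [1] + [0] * (N - 1)

def mul(a, b, mod):
    """Product in Z_mod[x]/(x^N - 1), i.e. a cyclic convolution."""
    c = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % N] = (c[(i + j) % N] + ai * bj) % mod
    return c

def center(a, mod):
    """Map coefficients into the centered interval around zero."""
    return [(x + mod // 2) % mod - mod // 2 for x in a]

def inverse_small(f, mod):
    """Brute-force inverse of f for a tiny modulus (only viable for toy N)."""
    for cand in product(range(mod), repeat=N):
        if mul(f, cand, mod) == ONE:
            return list(cand)
    raise ValueError("f is not invertible")

def inverse_mod_q(f):
    """Lift the inverse modulo 2 to modulo Q = 2^5 by Newton iteration."""
    b = inverse_small(f, 2)
    for _ in range(3):                       # precision doubles: 2 -> 4 -> 16 -> 256
        t = [(-x) % Q for x in mul(f, b, Q)]
        t[0] = (t[0] + 2) % Q                # t = 2 - f*b
        b = mul(b, t, Q)
    return b

# Key generation with fixed small polynomials (lowest degree first).
f = [-1, 1, 1, 0, 0, 0, 0]
g = [0, 1, -1, 0, 1, 0, 0]
f_p = inverse_small(f, P)                    # f_p * f = 1 mod p
f_q = inverse_mod_q(f)                       # f_q * f = 1 mod q
h = [P * x % Q for x in mul(f_q, g, Q)]      # public key h = p * f_q * g

def encrypt(m, r):
    return [(x + y) % Q for x, y in zip(mul(r, h, Q), m)]   # c = r*h + m

def decrypt(c):
    a = center(mul(f, c, Q), Q)              # a = f*c, coefficients centered mod q
    return center(mul(a, f_p, P), P)         # m = a*f_p, coefficients centered mod p

rng = random.Random(0)
message = [rng.choice([-1, 0, 1]) for _ in range(N)]
blind = [rng.choice([-1, 0, 1]) for _ in range(N)]
recovered = decrypt(encrypt(message, blind))
```

With these particular `f` and `g`, every coefficient of $p\,r\,g + f\,m$ has absolute value at most 12, which stays inside the centered range modulo 32, so decryption recovers `message` exactly.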

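Multiplications in NTRU

$$
\begin{array}{cccccc}
       & a_4     & a_3     & a_2     & a_1     & a_0     \\
\times & b_4     & b_3     & b_2     & b_1     & b_0     \\
\hline
       & a_4 b_0 & a_3 b_0 & a_2 b_0 & a_1 b_0 & a_0 b_0 \\
       & a_3 b_1 & a_2 b_1 & a_1 b_1 & a_0 b_1 & a_4 b_1 \\
       & a_2 b_2 & a_1 b_2 & a_0 b_2 & a_4 b_2 & a_3 b_2 \\
       & a_1 b_3 & a_0 b_3 & a_4 b_3 & a_3 b_3 & a_2 b_3 \\
+      & a_0 b_4 & a_4 b_4 & a_3 b_4 & a_2 b_4 & a_1 b_4 \\
\hline
       & c_4     & c_3     & c_2     & c_1     & c_0
\end{array}
$$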
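The row-by-row structure of this schematic can be sketched in code: each row is the operand $a$ rotated by $j$ positions and scaled by $b_j$, and the rows are accumulated column-wise. The function names below are hypothetical, and coefficient lists are stored lowest degree first (the reverse of the left-to-right order in the schematic).

```python
def rotate(a, j):
    """Coefficients of x^j * a(x) in Z[x]/(x^n - 1), lowest degree first."""
    n = len(a)
    return [a[(i - j) % n] for i in range(n)]

def cyclic_mul(a, b):
    """Accumulate one rotated row per coefficient of b: c <- b_j * row_j + c."""
    c = [0] * len(a)
    for j, b_j in enumerate(b):
        row = rotate(a, j)
        c = [b_j * r + ci for r, ci in zip(row, c)]
    return c

def cyclic_mul_direct(a, b):
    """Reference formula: c_k is the sum of a_i * b_j over i + j = k (mod n)."""
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]

# (1 + x) * x^4 = x^4 + x^5 = 1 + x^4 in Z[x]/(x^5 - 1)
wrapped = cyclic_mul([1, 1, 0, 0, 0], [0, 0, 0, 0, 1])
```

Each pass of the loop in `cyclic_mul` has the shape of a scalar-times-vector-plus-vector update, which is why this multiplication maps naturally onto an Axpy-style instruction.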


NTRU ees397ep1

$p = 2$, $q = 307$, $n = 397$
Message $m$: 397 bits
Ciphertext $c$: $(\mathbb{Z}_{307})^{397}$, 397 × 9 bits
Public key $h$: $\mathbb{Z}_{307}[x]/(x^{397} - 1)$, 397 × 9 bits
Private key
  $f$: $\mathbb{Z}_{307}[x]/(x^{397} - 1)$, 397 × 9 bits
    Contains 74 nonzero elements
  $f_p$: $\mathbb{Z}_2[x]/(x^{397} - 1)$, 397 × 1 bits



Review: TTS cryptosystem

Message $z$: $\mathrm{GF}(31)^{40}$, 200 bits
Signature $w$: $\mathrm{GF}(31)^{64}$, 320 bits
Public key $P$: $\mathrm{GF}(31)^{40 \times 2080}$, 416 Kbits
  Bottleneck: Quadratic polynomial evaluation
Private key: 44244 bits
  Bottleneck: Linear maps and system solving
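The sizes on this slide follow from 5 bits per GF(31) element, and 2080 equals $64 \cdot 65 / 2$, the number of monomials $x_i x_j$ with $i \le j$ in 64 variables. The sketch below assumes that this is how the public-key columns are laid out (the slides do not say) and evaluates a random, non-TTS quadratic system only to show the shape of the public-key bottleneck.

```python
import random
from math import ceil, log2

Q, N_VARS, N_POLYS = 31, 64, 40
BITS = ceil(log2(Q))                         # 5 bits per GF(31) element

# Assumed column order: monomials x_i * x_j with i <= j.
MONOMIALS = [(i, j) for i in range(N_VARS) for j in range(i, N_VARS)]

def evaluate(public_key, w):
    """Evaluate every public quadratic polynomial at the point w over GF(Q)."""
    terms = [w[i] * w[j] % Q for i, j in MONOMIALS]
    return [sum(c * t for c, t in zip(row, terms)) % Q for row in public_key]

# Random coefficients stand in for a real public key in this sketch.
rng = random.Random(1)
public_key = [[rng.randrange(Q) for _ in MONOMIALS] for _ in range(N_POLYS)]
w = [rng.randrange(Q) for _ in range(N_VARS)]
z = evaluate(public_key, w)

sizes = {
    "message_bits": N_POLYS * BITS,
    "signature_bits": N_VARS * BITS,
    "public_key_bits": N_POLYS * len(MONOMIALS) * BITS,
}
```

Verifying a signature amounts to one call of `evaluate` followed by a comparison with the message, so its cost is dominated by the 40 × 2080 multiply-accumulate operations.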


Review: Elliptic curve pairing

Core operations are finite-field arithmetic
Bottleneck for prime fields: Modular multiplication
  Euclid’s division: $y = qn + r$, $0 \le r < n$
  Hensel’s division: $y + qn = p^k r$, $0 \le r < 2n$, $p$ prime
Montgomery method
  $x \mapsto p^k x \bmod n$: ring homomorphism if $(p, n) = 1$
  Precompute $p'$, $n'$ such that $p^k p' - n n' = 1$
  $q \leftarrow (y \bmod p^k)\, n'$
  $q' \leftarrow (q \bmod p^k)\, n$
  $r \leftarrow (y + q') / p^k$
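The three assignment steps of the Montgomery method on this slide can be checked numerically for a general prime power. The sketch below uses hypothetical function names and the arbitrary values $p = 3$, $k = 4$, $n = 70$; it relies on Python's built-in `pow(n, -1, m)` for the modular inverse (available from Python 3.8).

```python
def precompute(p, k, n):
    """Return (p_prime, n_prime) with p**k * p_prime - n * n_prime == 1."""
    pk = p ** k
    n_prime = (-pow(n, -1, pk)) % pk         # n' = -n^{-1} mod p^k
    p_prime = (1 + n * n_prime) // pk        # exact, since n * n' = -1 mod p^k
    return p_prime, n_prime

def hensel_reduce(y, p, k, n, n_prime):
    """Return r with y + q' = p^k * r, so that r = y * p^{-k} (mod n)."""
    pk = p ** k
    q = (y % pk) * n_prime
    q_prime = (q % pk) * n                   # a multiple of n that cancels y mod p^k
    assert (y + q_prime) % pk == 0
    return (y + q_prime) // pk

p, k, n = 3, 4, 70                           # arbitrary example values with gcd(p, n) = 1
p_prime, n_prime = precompute(p, k, n)
r = hensel_reduce(1234, p, k, n, n_prime)
```

For any input $0 \le y < n p^k$ the result satisfies $0 \le r < 2n$, matching the bound stated for Hensel's division; one conditional subtraction of $n$ would bring it into $[0, n)$.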


Montgomery method: More details

Problem: Given $A$, $B$, $M$, compute $AB \bmod M$
Idea: Work in an isomorphic ring
  $A \mapsto AR \bmod M$ and $B \mapsto BR \bmod M$
  Need a way to compute $ABR \bmod M$
Solution: $(x, y) \mapsto_M (xy)/R \bmod M$
  $T \leftarrow (AR \bmod M)(BR \bmod M)$
  Can add a multiple of $M$ since we work mod $M$
  $T + xM = 0 \bmod R$, therefore $x = -M^{-1} T \bmod R$
  $(AR, BR) \mapsto_M (T + (-M^{-1} T \bmod R) M)/R = ABR \bmod M$
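A minimal sketch of this Montgomery product with $R = 2^w$ follows. The modulus, word size, and operands are arbitrary values picked for illustration, the function names are hypothetical, and the multiplier that the slide calls $x$ is named `t` here to avoid clashing with the operand names.

```python
def mont_setup(M, w):
    """Return R = 2^w and -M^{-1} mod R for an odd modulus M < R."""
    R = 1 << w
    assert M % 2 == 1 and M < R
    return R, (-pow(M, -1, R)) % R           # modular inverse needs Python 3.8+

def mont_mul(x, y, M, R, neg_M_inv):
    """Return x*y/R mod M for operands x, y in [0, M)."""
    T = x * y
    t = (T * neg_M_inv) % R                  # chosen so that T + t*M = 0 mod R
    u = (T + t * M) // R                     # exact division; u < 2M
    return u - M if u >= M else u

M, w = 1000003, 20                           # example modulus and word size (assumed)
R, neg_M_inv = mont_setup(M, w)
A, B = 123456, 654321

A_bar, B_bar = A * R % M, B * R % M          # map into the Montgomery domain
AB_bar = mont_mul(A_bar, B_bar, M, R, neg_M_inv)   # equals A*B*R mod M
AB = mont_mul(AB_bar, 1, M, R, neg_M_inv)          # multiply by 1 to map back
```

Only the conversions into the Montgomery domain use a division by $M$; every product inside the domain needs just multiplications, an addition, and a shift by $w$ bits.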


Multi-precision Montgomery

$X = (x_{n-1} x_{n-2} \dots x_0)$, $x_i \in \{0, \dots, 2^w - 1\}$
$S \leftarrow 0$
for $i$ in $0 .. n - 1$
  $q_i \leftarrow (s_0 + a_i b_0)(-M^{-1}) \bmod 2^w$
  $S \leftarrow (S + a_i B + q_i M) / 2^w$
  [loop invariant: $S \in \{0, \dots, M + B - 1\}$]
[post condition: $2^{nw} S = AB + QM$]
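The word-serial loop can be sketched directly with Python integers standing in for multi-word registers. The function name, the word size $w = 8$, the word count $n = 4$, and the operands below are assumptions for illustration; the loop invariant from the slide is checked with an `assert` on every iteration.

```python
def mp_montgomery(A, B, M, w, n):
    """Return S with 2^(n*w) * S = A*B (mod M), scanning A one w-bit word at a time."""
    mask = (1 << w) - 1
    neg_m_inv = (-pow(M, -1, 1 << w)) % (1 << w)   # -M^{-1} mod 2^w, M must be odd
    b_0 = B & mask
    S = 0
    for i in range(n):
        a_i = (A >> (i * w)) & mask                # i-th word of A
        s_0 = S & mask                             # lowest word of S
        q_i = ((s_0 + a_i * b_0) * neg_m_inv) & mask
        S = (S + a_i * B + q_i * M) >> w           # exact: the low word is zero
        assert 0 <= S < M + B                      # loop invariant from the slide
    return S

w, n = 8, 4                                        # four 8-bit words (assumed sizes)
M = 0xF1234567                                     # odd example modulus below 2^(n*w)
A, B = 0x12345678, 0x0FEDCBA9
S = mp_montgomery(A, B, M, w, n)
```

The result lies in $[0, M + B)$ rather than $[0, M)$, so a final conditional subtraction is still needed when a canonical residue is required.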


The main Hydra ISA

Recall: $Y \leftarrow aX + Y$
  $|a| = w$, $|X| = lw$, $|Y| = lw$ or $(l + 1)w$
Type i (for pairing)
  $a \in \{0, \dots, 2^w - 1\}$, $X \in \{0, \dots, 2^{lw} - 1\}$, $Y \in \{0, \dots, 2^{(l+1)w} - 1\}$
  $\cdot$, $+$: the usual integer multiplication and addition
Type q (for TTS)
  $a \in \mathbb{F}_q$, $X \in \mathbb{F}_q^l$, $Y \in \mathbb{F}_q^l$, and $q \le 2^w$
  $\cdot$, $+$: scalar multiplication and vector addition in $l$-dimensional vector spaces over $\mathbb{F}_q$
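A reference model of the two instruction types may help make the operand conventions concrete. This is a sketch of the arithmetic only, not of the Hydra encoding: the function names are hypothetical, the type q model assumes a prime $q$ so that $\mathbb{F}_q$ is the integers modulo $q$, and the slides do not say how a carry out of the $(l + 1)$-word result is handled, so the type i model simply returns the full integer.

```python
def axpy_int(a, X, Y, w, l):
    """Type i: a is one w-bit word, X has l words, Y has l + 1 words (plain integers)."""
    assert 0 <= a < 1 << w
    assert 0 <= X < 1 << (l * w)
    assert 0 <= Y < 1 << ((l + 1) * w)
    return a * X + Y                     # may exceed l + 1 words by one carry bit

def axpy_fq(a, X, Y, q, w):
    """Type q: scalar a times vector X plus vector Y, element-wise modulo a prime q."""
    assert q <= 1 << w and len(X) == len(Y)
    assert 0 <= a < q
    return [(a * x + y) % q for x, y in zip(X, Y)]

int_result = axpy_int(3, 0x1234, 0x10, w=8, l=2)
vec_result = axpy_fq(2, [1, 30, 15], [5, 5, 5], q=31, w=5)
```

The type i form is the inner step of the multi-precision Montgomery loop (adding $a_i B$ and $q_i M$ to $S$), while the type q form covers the vector updates used in quadratic polynomial evaluation over GF(31).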


Type r Axpy instructions

$X \in \mathbb{Z}_q^l$, $Y \in \mathbb{Z}_q^l$ such that $q \le 2^w$
$a \in \mathbb{Z}_p^h$ such that $h \lceil \lg p \rceil \le 2^w$

[Figure: the cyclic multiplication schematic from the "Multiplications in NTRU" slide, repeated: partial-product rows $b_j \cdot (a_4 \dots a_0)$, each rotated by $j$ positions, summed column-wise into $c_4 \dots c_0$]

Next steps


Prototype implementation
  Bulk of the work goes here
  SystemC-based ISA simulator
Compiler construction
  Possibly based on LLVM

Questions or comments?

Thank you!