Oct 26, 2013

Mapping of scalable RDMA protocols to ASIC/FPGA platforms


Yosef Gavriel Tirat-Gefen, PhD

Castel Systems Inc., 3900 Bradwater Street, Fairfax, VA

yosefgavriel@computer.org, Phone: 703-426-0723


Extended Abstract

1. Introduction


This paper describes a scalable FPGA/ASIC implementation of the Remote DMA (RDMA) protocol
running on top of the Stream Control Transmission Protocol (SCTP). This work is of key importance for
high-performance computing applications of interest to the aerospace and defense communities. The
RDMA protocol will allow massive transfers of data between sites connected through a wide area network
(WAN). It will enable the transport of large amounts of classified data across the continental U.S. for
applications involving connections among secure supercomputing facilities. Other possible applications
include Terabyte/Petabyte Storage Area Networks, support for distributed simulation of new
weapons and aircraft, high-bandwidth inter-processor communication in military command and control
sites, and off-site processing of large amounts of sensitive data.


The RDMA and SCTP protocols address the future need for a substitute for TCP, the present backbone
of the Internet and of military communication networks. The current work describes the intellectual
property (IP) modules necessary for a scalable FPGA or ASIC implementation of the RDMA and SCTP
protocols at speeds as high as 80 Gbps, using off-the-shelf components for the PHY and MAC layers and
deploying new high-speed serial communication standards such as PCI-Express and HyperTransport.


2. Related Work


The Transmission Control Protocol (TCP) limits CPU efficiency and increases the overall latency of
communication between processes connected through a WAN. The advent of Gigabit Ethernet [1]
has led to the appearance of off-load processors, which are still unable to provide a zero-copying way of
exchanging large amounts of data. The use of ATM/SONET [3] networks across the U.S. has enabled the
diffusion of WANs supporting the R&D efforts of the aerospace and military industries. Design teams in
different parts of the U.S. would like to share large sets of data without geographic constraints.


The RDMA [2][5] protocol is an outgrowth of the InfiniBand standard [4]. The use of the RDMA protocol
with Direct Data Placement (DDP) [6] over reliable transports such as SCTP [7], along with new
high-speed serial communication standards such as PCI-Express [8] and HyperTransport [9] for data
rates as high as 80 Gbps, will provide a platform-independent set of standards that will shape the
high-performance computing installations to be used by the defense and aerospace communities.



3. Discussion



Figure 1 - Standard data-flow on TCP protocol
(diagram labels: Application A memory space, TCP buffer/stack memory space, layers L3 L2 L1, WAN/LAN,
layers L1 L2 L3, TCP buffer/stack memory space, Application B memory space)

TCP is the traditional way to connect processes or applications separated by a wide area
network (WAN) or a local area network (LAN). An application A (Figure 1) must copy the data to
be transferred to application B into its TCP buffer space.


This approach wastes CPU clock cycles. The use of off-load processors minimizes the need
for data copying on the sender side, but the receiver must still copy the data out of the network
buffer allocated to the off-load processor on the receiver end (Figure 2). Several data copies are
necessary due to the presence of cache memory (Figure 2).
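
The copy overhead described above can be illustrated with a small Python model. This is only a sketch
under stated assumptions: the function names and the sender_offload flag are hypothetical, the network
hop is treated as free, and the extra copies caused by cache memory are not modeled. It simply counts
how many bytes the host CPU has to move on each path.

```python
from typing import Tuple


def copy_into(dst: bytearray, src) -> int:
    """Copy src into the start of dst and return the number of bytes moved."""
    dst[:len(src)] = src
    return len(src)


def tcp_style_transfer(payload: bytes, sender_offload: bool = False) -> Tuple[bytes, int]:
    """Move payload from application A to application B and count CPU-copied bytes.

    Hypothetical model: sender_offload=True assumes the off-load processor reads
    application A's memory directly, so the sender-side copy disappears.
    """
    cpu_bytes_copied = 0

    if sender_offload:
        outgoing = payload
    else:
        # Sender: application A memory -> TCP buffer of A.
        tcp_buf_a = bytearray(len(payload))
        cpu_bytes_copied += copy_into(tcp_buf_a, payload)
        outgoing = bytes(tcp_buf_a)

    # Network hop: data arrives in the receiver's network buffer.
    # This step is not counted as a host CPU copy in this model.
    network_buf_b = bytearray(outgoing)

    # Receiver: network buffer -> application B memory (still required).
    app_mem_b = bytearray(len(payload))
    cpu_bytes_copied += copy_into(app_mem_b, network_buf_b)

    return bytes(app_mem_b), cpu_bytes_copied


if __name__ == "__main__":
    data = b"rdma" * 256  # 1024 bytes of sample data
    for offload in (False, True):
        _, copied = tcp_style_transfer(data, sender_offload=offload)
        print(f"sender_offload={offload}: CPU copied {copied} bytes")
```

In this model the plain TCP path moves every payload byte twice, while the off-load path still moves
it once on the receiver side, which is the residual cost that zero-copying aims to remove.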



The RDMA protocol is an attempt to minimize unnecessary data copies by allowing zero-copying
remote data transfers. Future RDMA NICs will make this possible by allowing direct access to the
memory space of applications sharing data across a WAN/LAN (Figure 3).
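
A minimal sketch of the direct-placement idea, assuming a simplified model in which each incoming
segment carries a tag naming a registered memory region and an offset within it. The class names, the
tag/offset fields and the receive method are hypothetical and do not reproduce the RDMA or DDP wire
formats. The point is only that the payload is written straight into application memory, with no
intermediate receive buffer.

```python
class RegisteredRegion:
    """Application memory that a simulated RDMA NIC may write into directly."""

    def __init__(self, size: int) -> None:
        self.memory = bytearray(size)
        # A memoryview lets the NIC model write in place, without a staging buffer.
        self._view = memoryview(self.memory)

    def place(self, offset: int, payload: bytes) -> None:
        end = offset + len(payload)
        if offset < 0 or end > len(self.memory):
            raise ValueError("placement outside the registered region")
        self._view[offset:end] = payload


class SimulatedRdmaNic:
    """Hypothetical NIC model: maps tags to registered application regions."""

    def __init__(self) -> None:
        self._regions = {}

    def register(self, tag: int, region: RegisteredRegion) -> None:
        self._regions[tag] = region

    def receive(self, tag: int, offset: int, payload: bytes) -> None:
        # The segment names its own destination, so arrival order does not matter.
        self._regions[tag].place(offset, payload)


if __name__ == "__main__":
    nic = SimulatedRdmaNic()
    region = RegisteredRegion(8)
    nic.register(7, region)
    nic.receive(7, 4, b"WXYZ")  # second half arrives first
    nic.receive(7, 0, b"abcd")
    print(bytes(region.memory))
```

Because every segment carries its own placement information, segments that arrive out of order still
land at the right address, which is what makes an intermediate reassembly buffer unnecessary in this
simplified picture.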



This work describes the VHDL implementation of IP cores supporting RDMA and the other necessary
standards (SCTP and DDP) to be used by future RDMA NICs at speeds from 10 Gbps to 80 Gbps. We
assume that technologies such as PCI-Express [8] and HyperTransport [9] will become prevalent in the
next ten years, so our work focuses on layer L3 and above.

Data traffic at rates above 10 Gbps is split across multiple 10 Gbps lanes, allowing low-cost
scalability (Figure 4).
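
The lane-splitting arithmetic can be sketched as follows. The 10 Gbps per-lane rate comes from the
text; everything else is an assumption made for illustration, including the function names, the
64-byte chunk size and the round-robin distribution policy, none of which are taken from the VHDL
design itself.

```python
import math
from typing import List

LANE_RATE_GBPS = 10  # per-lane rate stated in the text


def lanes_needed(target_gbps: float) -> int:
    """Number of 10 Gbps lanes required to carry target_gbps."""
    return max(1, math.ceil(target_gbps / LANE_RATE_GBPS))


def stripe(data: bytes, lanes: int, chunk: int = 64) -> List[List[bytes]]:
    """Distribute fixed-size chunks over the lanes in round-robin order.

    The chunk size and the round-robin policy are illustrative assumptions.
    """
    queues: List[List[bytes]] = [[] for _ in range(lanes)]
    for start in range(0, len(data), chunk):
        queues[(start // chunk) % lanes].append(data[start:start + chunk])
    return queues


def merge(queues: List[List[bytes]]) -> bytes:
    """Reassemble the original byte stream by reading the lanes in round-robin order."""
    out = bytearray()
    rounds = max((len(q) for q in queues), default=0)
    for r in range(rounds):
        for q in queues:
            if r < len(q):
                out += q[r]
    return bytes(out)


if __name__ == "__main__":
    n = lanes_needed(80)
    payload = bytes(range(256)) * 5
    queues = stripe(payload, n)
    print(n, [len(q) for q in queues], merge(queues) == payload)
```

Under these assumptions an 80 Gbps link maps onto eight lanes, and adding bandwidth only means
instantiating more copies of the same 10 Gbps lane logic.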


The final paper describes the main VHDL modules necessary for mapping the RDMA, DDP and SCTP
protocols to an ASIC or FPGA platform supported by the PCI-Express or HyperTransport standards.



Figure 2 - Zero-copying and TCP offloading processing
(diagram labels: TCP off-load Processor, TOE/NIC Card, Host Main Memory, Host CPU Cache Memory,
Host CPU, Network buffer, Receive Buffer, WAN/LAN)

Figure 3 - RDMA data-flow for WAN applications
(diagram labels: Host CPU A, Host CPU B, Host Memory, Application Memory Space, RDMA NIC Card, WAN)

Figure 4 - Scalable WAN-RDMA for bandwidths above 10 Gbps
(diagram labels: RDMA NIC Card for WAN, RDMA Engine, Tx Buffer, Rx Buffer, MAC, PHY, 10 Gbps links,
WAN, Host, > 10 Gbps)

Validation and modeling issues are addressed for the benefit of interested designers in the defense
and aerospace communities. Synthesis and mapping results for Xilinx FPGAs are highlighted. Issues
related to the scalability of the intellectual property (IP) cores to speeds above 20 Gbps are discussed.


4. References


[1] CUNNINGHAM, D. and LANE, W., “Gigabit Ethernet Networking”, Macmillan Technical Publishing,
1999.

[2] CULLEY, P. et al., “Marker PDU Aligned Framing for TCP Specification”, Draft Paper, RDMA
Consortium, www.rdmaconsortium.org, October 2002.

[3] DUTTON, H. and LENHARD, P., “Asynchronous Transfer Mode (ATM)”, Second Edition, Prentice Hall,
1995.

[4] InfiniBand Trade Association, “InfiniBand Architecture Specification”, Release 1.0, October 2000.

[5] RECIO, R., “An RDMA Protocol Specification”, Draft Paper, RDMA Consortium,
www.rdmaconsortium.org, October 2002.

[6] SHAH, H. et al., “Direct Data Placement over Reliable Transports”, Draft Paper, RDMA Consortium,
www.rdmaconsortium.org, October 2002.

[7] STEWART, R. et al., “Stream Control Transmission Protocol”, Request for Comments (RFC) 2960,
The Internet Society, October 2000.

[8] WILEN, A., SHADE, J., THORNBURG, R., “Introduction to PCI Express”, Intel Press, 2003.

[9] www.amd.com, Advanced Micro Devices website, December 2003.