A Hardware Based TCP/IP Processing Engine - Applied Research Lab

Oct 26, 2013 (3 years and 9 months ago)

Department of Computer Science and Engineering

Applied Research Laboratory

1

A Hardware Based TCP/IP
Processing Engine


David V. Schuehler

dvs1@arl.wustl.edu


Outline


Problem Statement and Motivation



Challenges



Description of Architecture



Traffic Analysis



Current Results and Future Work



Background


Transmission Control Protocol (TCP) provides a virtual
bit pipe between two end nodes


Byte streams generated at source are delivered to destination


Connection oriented protocol


Retransmission services


Flow control services



Internet Protocol (IP) provides message routing services


Datagram is the supported transmission unit


Unreliable


Connectionless



TCP is an important protocol


All interesting data on the Internet is transmitted via TCP


Problem Statement


Given an arbitrary network, design a solution
which provides access to TCP stream content at
various locations within the network

[Figure: example network of core routers and edge routers connecting end nodes A and B]



Objective


Reconstruct TCP data streams from
individual network packets


Operate at Internet backbone data rates


OC-48 (2.5 Gbps) and above


Support millions of simultaneous flows


Maintain per-flow context information


Provide enhanced flow management services


Contain a simple client interface
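
The first objective, reconstructing a byte stream from individual packets, can be illustrated in software. The following is a minimal sketch, not the hardware design described in these slides: it assumes one direction of one flow, holds early segments in a dictionary, and only drains buffered segments whose sequence number exactly matches the next expected byte. The class and method names are hypothetical.

```python
class StreamReassembler:
    """Delivers TCP payload bytes in order for one direction of one flow."""

    SEQ_MOD = 1 << 32  # TCP sequence numbers wrap at 2**32

    def __init__(self, initial_seq):
        self.next_seq = initial_seq % self.SEQ_MOD  # next byte expected
        self.pending = {}  # seq -> payload for segments that arrived early

    def _offset(self, seq):
        # Signed distance from next_seq, tolerant of 32-bit wraparound.
        diff = (seq - self.next_seq) % self.SEQ_MOD
        return diff - self.SEQ_MOD if diff >= self.SEQ_MOD // 2 else diff

    def receive(self, seq, payload):
        """Accept one segment; return the bytes that are now in order."""
        off = self._offset(seq)
        if off > 0:
            # Sequence gap: hold the segment until the gap is filled.
            self.pending[seq % self.SEQ_MOD] = payload
            return b""
        if off + len(payload) <= 0:
            return b""  # retransmission of data already delivered
        out = bytearray(payload[-off:])  # drop any already-delivered prefix
        self.next_seq = (self.next_seq + len(out)) % self.SEQ_MOD
        # Drain buffered segments that have become contiguous.
        while self.next_seq in self.pending:
            chunk = self.pending.pop(self.next_seq)
            out += chunk
            self.next_seq = (self.next_seq + len(chunk)) % self.SEQ_MOD
        return bytes(out)


r = StreamReassembler(1000)
early = r.receive(1005, b"world")   # arrives first, buffered
in_order = r.receive(1000, b"hello")  # fills the gap, releases both
```

A per-flow buffer like `pending` is exactly the kind of state that becomes expensive with millions of simultaneous flows, which is why the objectives pair stream reconstruction with per-flow context management.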





Target Platform


Design must support implementation using logic
and memory devices



Example platform


FPX card


Xilinx Virtex 2000E


2MB ZBT SRAM


1 GB SDRAM


PC100 (100 MHz)


Motivation


Most Internet traffic is TCP based


Network solutions will require access to TCP data


Virus detection and elimination


Viruses spread to machines worldwide


Consume computing resources


Reduce network throughput


Content filtering


50% of email traffic is spam


Corporate security


Content based routing


Extensible networking solutions


Stream reassembly is required


Processing packets separately provides insufficient
coverage of network content







Related Work


Software based approaches


tcpdump & httpdump


Ethereal


Internet Protocol Scanning Engine


Packet Scope & BLT (AT&T)


Cluster based online monitoring



Hardware based approaches


TCP reassembly & state tracking (Georgia Tech)



Related Technologies


Load balancing systems


Content (cookie) based request routing


Delayed binding technique


Limited to scanning start of flow


Intrusion Detection Systems


Perform stream reassembly and content scanning


Traffic Rates < 1Gbps


TCP offload engines


Move TCP protocol processing to NIC


Targeting Gigabit NIC market


Intel, NEC, Adaptec, Lucent, and others



Challenges


Matching individual packets to flows


96 bit exact match


High rate of insert & delete events


Operational environment requires high performance


Resequencing of out-of-order packets


Passive solution


Annotate sequence gap


Forward packet & store data for later delivery to monitor


Active solutions


Drop selected out-of-order packets


Buffer packet for later in-order transmission


Dealing with idle flows


Handling resource exhaustion


Drop packet


Ignore packet


Resource reclamation


Providing different levels of service


Selectable on a per-flow basis


Monitoring flows at core routers


Coordinating traffic amongst multiple nodes
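
The "96 bit exact match" in the first challenge can be made concrete. The slides do not say which fields make up the 96 bits; the sketch below assumes the usual TCP/IPv4 4-tuple (32-bit source address, 32-bit destination address, 16-bit source port, 16-bit destination port), which adds up to 96 bits. The function name `flow_key` is hypothetical.

```python
import socket
import struct


def flow_key(src_ip, dst_ip, src_port, dst_port):
    """Pack an IPv4/TCP 4-tuple into a 12-byte (96-bit) exact-match key.

    Assumed layout (not stated in the slides):
    source address, destination address, source port, destination port,
    all in network byte order.
    """
    return struct.pack(
        "!4s4sHH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        src_port,
        dst_port,
    )


key = flow_key("10.0.0.1", "10.0.0.2", 1234, 80)
key_bits = len(key) * 8
```

Because the key is matched exactly, the two directions of a connection produce different keys under this layout; monitoring bidirectional traffic would need either two records or a canonical ordering of the endpoints.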




Challenges (cont)


Processing packet fragments


Passive or active solution


Reassemble original IP frame or process fragments


Jumbo frames (9k packets)


Supporting large numbers of flows


Backbone links can carry millions of active flows


Maintaining per-flow context information


Larger per-flow records support more complex solutions


Providing enhanced flow manipulation features


Blocking and unblocking flows


Terminating flows and ensuring they are terminated


Support flow modification


Monitoring bidirectional traffic


Alter response traffic based on request traffic


Providing for advanced content manipulation


Altering previously processed data


Requires buffering


TCP slow start





TCP Processing System


TCP Protocol Processing Engine



State Store Manager Features

Simple interface

Supports multiple retrieval algorithms

512 MByte SDRAM module

64 bytes of state per flow

32 bytes used by TCP Processing Engine

32 bytes available for Application

8 million active flows supported
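
The flow capacity follows directly from the memory size and record size listed for the State Store Manager. The sketch below works through that arithmetic, assuming "MByte" means 2^20 bytes and that records are stored back to back. The ordering of the two 32-byte halves inside a record is also an assumption, and the helper names are hypothetical.

```python
SDRAM_BYTES = 512 * 1024 * 1024   # 512 MByte module (assuming 2**20 bytes per MByte)
RECORD_BYTES = 64                 # state per flow
ENGINE_BYTES = 32                 # used by the TCP Processing Engine
APP_BYTES = RECORD_BYTES - ENGINE_BYTES  # left for the application

# 512 MByte / 64 bytes = 8,388,608 records, the "8 million active flows".
MAX_FLOWS = SDRAM_BYTES // RECORD_BYTES


def record_address(flow_index):
    """Byte address of a flow record, assuming records are packed contiguously."""
    if not 0 <= flow_index < MAX_FLOWS:
        raise IndexError("flow index outside the state store")
    return flow_index * RECORD_BYTES


def split_record(record):
    """Split a 64-byte record into engine and application halves.

    Placing the engine half first is an assumption made for illustration.
    """
    if len(record) != RECORD_BYTES:
        raise ValueError("record must be exactly 64 bytes")
    return record[:ENGINE_BYTES], record[ENGINE_BYTES:]
```

With a power-of-two record size, the address computation reduces to a shift of the flow index, which is convenient when the index comes straight from a hash function.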


State Store Manager


Per-Flow State Store Record


Hash Implementation Tradeoffs


Unlimited hash entry chaining

Pro: Best option for fully monitoring all flows

Con: Poor worst case performance; excessive time required to perform lookup


No hash entry chaining

Pro: Easy to implement; fast

Con: Potential for incomplete monitoring of flows


Limited hash entry chaining

Pro: Bounded time to perform lookup

Con: Potential for incomplete monitoring of flows; excessive time required to perform lookup
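
The three options differ only in how long a bucket's chain may grow, so one small sketch can cover them. The code below is an illustration under stated assumptions, not the State Store Manager's implementation: `zlib.crc32` stands in for whatever hash function the hardware uses, and the class name and parameters are hypothetical. Setting `max_chain=1` gives the no-chaining case; a very large `max_chain` approximates unlimited chaining.

```python
import zlib


class BoundedChainTable:
    """Hash table whose buckets hold at most max_chain entries."""

    def __init__(self, num_buckets, max_chain):
        self.buckets = [[] for _ in range(num_buckets)]
        self.max_chain = max_chain

    def _bucket(self, key):
        # CRC-32 is only a stand-in hash for this sketch.
        return self.buckets[zlib.crc32(key) % len(self.buckets)]

    def lookup(self, key):
        """Return the stored state, visiting at most max_chain entries."""
        for stored_key, state in self._bucket(key):
            if stored_key == key:
                return state
        return None

    def insert(self, key, state):
        """Store state for key; return False if the chain is already full."""
        bucket = self._bucket(key)
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:
                bucket[i] = (key, state)
                return True
        if len(bucket) >= self.max_chain:
            return False  # this flow goes unmonitored
        bucket.append((key, state))
        return True


table = BoundedChainTable(num_buckets=1, max_chain=2)
results = [table.insert(k, n) for n, k in enumerate([b"flow-a", b"flow-b", b"flow-c"])]
```

The single-bucket table forces every key to collide, so the third insert is refused. That refusal is the "incomplete monitoring" cost, and the cap on chain length is what bounds the lookup time.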


Flow Classification Analysis


Analyze Internet backbone captures


Evaluate hashing functions


Detect traffic patterns



National Laboratory for Applied Network Research (NLANR)


Difficulty in retrieving data sets



Hash Table Analysis

Source                BWY          MEM          ADV          TXS          AIX          ANL          MEM          COS
                      (OC-3)       (OC-3)       (OC-3)       (OC-3)       (OC-12)      (OC-3)       (OC-3)       (OC-3)
Total Packets         2,975k       227,641      180,557      17,171       239,817      163,267      235,898      2,668k
TCP Packets           2,247k       13,232       179,166      1,858        80,307       117,806      23,649       2,445k
TCP Packets (%)       75%          6%           99%          11%          33%          72%          17%          92%
TCP Flows             27,846       466          14,365       180          2,710        6,501        970          109,997
Cache Hits            2,146k       10,608       142,031      1,280        77,562       92,880       21,265       2,140k
Collisions (New-Old)  286 / 725    1 / 6        65 / 10      0 / 0        15 / 35      78 / 40      2 / 4        1,713 / 1,752
Table Usage           17,401       261          4,840        51           2,695        2,521        579          50,313
Deepest Bucket        3            2            2            1            2            3            2            4


Consecutive Small Packets

TCP data packets (0 < data length < 64)

Source                  BWY       MEM       ADV       TXS       AIX       ANL       MEM       COS
                        (OC-3)    (OC-3)    (OC-3)    (OC-3)    (OC-12)   (OC-3)    (OC-3)    (OC-3)
Small Packets           266,239   1,342     6,326     597       37,563    7,521     6,142     116,711
Consecutive Small Pkts  23        2         17        3         12        10        15        46
Min time between SP     0         20us      1us       236us     0         1us       2us       0
Max time between SP     392us     1.8ms     9.4ms     56ms      3.5ms     3.6ms     4.5ms     492us
Avg time between SP     26us      538us     355us     32ms      349us     452us     260us     29us


Minimum Length Packet Processing

[Figure: chart with marked regions "FPX operational environment" and "Current technology"]


Average Length Packet Processing

[Figure: chart with marked regions "FPX operational environment" and "Current technology"]


Place & Route Results

Including Protocol Wrappers & Application

Number of BLOCKRAMs: 68 out of 160 (42%)

Number of SLICEs: 6579 out of 19200 (34%)

Minimum period: 14.910 ns

Maximum frequency: 67.069 MHz
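
The reported clock period can be related to the OC-48 target with a short calculation. The sketch below makes two assumptions not stated in the slides: a minimum-length TCP/IP packet is 40 bytes (20-byte IP header plus 20-byte TCP header, no options or payload), and the full nominal 2.5 Gbps carries packet bytes with no framing overhead. It says nothing about how many bytes the circuit consumes per clock cycle, so it gives a time budget, not a throughput result.

```python
LINK_RATE_BPS = 2.5e9      # nominal OC-48 rate from the Objective slide
MIN_PACKET_BYTES = 40      # assumed: IP header + TCP header, no payload
CLOCK_PERIOD_NS = 14.910   # minimum period from place and route


def packet_time_ns(n_bytes, rate_bps=LINK_RATE_BPS):
    """Time on the wire for n_bytes at rate_bps, in nanoseconds."""
    return n_bytes * 8 / rate_bps * 1e9


def cycles_per_packet(n_bytes, period_ns=CLOCK_PERIOD_NS):
    """Clock cycles that elapse while one packet of n_bytes arrives."""
    return packet_time_ns(n_bytes) / period_ns


# The maximum frequency is the reciprocal of the minimum period.
max_freq_mhz = 1e3 / CLOCK_PERIOD_NS

budget_ns = packet_time_ns(MIN_PACKET_BYTES)        # 128 ns per 40-byte packet
budget_cycles = cycles_per_packet(MIN_PACKET_BYTES)  # between 8 and 9 cycles
```

Under these assumptions a back-to-back stream of minimum-length packets leaves fewer than nine clock cycles per packet at the reported frequency, which is why minimum-length packets are treated as the stress case.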



TCP Processing Circuit Layout for
Xilinx Virtex 2000E


Future Work


Multi-node coordinated monitoring


Bi-directional flow monitoring


Packet reordering schemes


Resource exhaustion & resource reclamation


Packet classification & lookup algorithms


Selectable per-flow monitoring


Performance enhancement


Memory contention prevention


Flow modification


Extensible networking solution integration


Worst case traffic loads


Traffic analysis


Architecture for a Hardware Based,
TCP/IP Content Scanning System


David V. Schuehler

dvs1@arl.wustl.edu