Modular TCP - University of California, Berkeley



University of California, Berkeley

Fall 2000

Modular TCP

Yunfei Deng, Kenneth Cheung, Daniil Khidekel

Professor Jean Walrand


Communication Networks


Since TCP/IP protocols were first introduced, they have achieved great success in
computer networks. As the Internet evolves ever faster, however, a growing variety of
network conditions has emerged under which the current TCP/IP protocols perform
poorly. In this paper, we describe the new challenges faced by modern transport
protocols and applications. By introducing a new TCP-compatible transport protocol,
Modular TCP, we address these issues following the principle of Application Level
Framing (ALF). Modular TCP aims to be an application-controlled, connection-based,
flexible transport layer protocol. We present the Modular TCP design in detail,
considering the semantics of reliability and ordering and what needs to change in
TCP’s flow control and congestion control. We also discuss implementation issues for
this experimental protocol and our planned work. The chief contributions of this
project are the design of a modular transport protocol based on TCP and an attempt to
demonstrate its flexibility and performance gains in an experimental implementation.


Yunfei Deng, for EE228a

Kenneth Cheung, for CS262a

Daniil Khidekel, for CS262a


1. Introduction

Modular TCP is an extension of standard TCP [RFC 793, TCP] that we introduce to meet
the demanding requirements of applications in today’s Internet environments. The
Internet community has made enormous efforts to improve the performance and
functionality of the existing TCP/IP protocol suite [Floyd]. More often, researchers
design and experiment with entirely new transport protocols to keep up with the
development of communication networks [WebTP]. We take a more conservative approach
and build on existing TCP by introducing support for the Application Level Framing
mechanism and its related semantics. Modular TCP is designed to be
application-oriented with fine-granularity control, and it leverages existing
standards and ongoing research to ensure incremental deployment and high performance.

Modular TCP is potentially a major project, and this report introduces only its
initial stage. In the next section, the report illustrates how the development of the
Internet, from its history to recent changes, has affected network protocols. Section
3 describes the efforts of the Internet community to improve transport protocols and
the problems that must be considered when designing next-generation transport
protocols. Section 4 presents Modular TCP, from the conceptual design to the details
of connection semantics on the sender side and the receiver side. We survey the
acknowledgement schemes used for congestion control in transport protocols and
propose the use of Selective Acknowledgement and Explicit Congestion Notification for
the future. Some issues in our planned experimental implementation of Modular TCP are
discussed in Section 5. We conclude the report and present the plan for further work
on this project in Section 6.

2. Challenges in the New Internet Era

Although computer networks have a history of less than 30 years, their development
has sped up greatly. With the beginning of the 21st century we have entered a new
Internet era, in which network environments are entirely different from those of the
original networks. These changes have a great impact on the requirements placed on
the infrastructure of our networking protocols.

In this new Internet era, the number of network hosts and users has increased by more
than seven orders of magnitude. According to the MIDS Internet Growth report [MIDS],
the number of online users worldwide has reached 377 million, and the number of
Internet hosts with IP addresses has reached 36,739,000. Network traffic has
increased correspondingly. This growth rests on the wide deployment of high-speed
Internet backbones, the many kinds of Internet connections available to users, and
the fast-developing personal computer industry. The way networks are distributed
means that the conditions and usage of Internet devices vary widely. For example,
some people have high-speed broadband Internet access (cable modem/DSL) at speeds up
to 1 Mbps, while many are still stuck on slow phone modems at only 56 Kbps or even
33.6 Kbps. Wireless communication networks also bring people into a completely new
network environment. A new characteristic of wireless networks is that they have a
much larger loss ratio than ordinary wired networks. These differences in network
conditions call for transport protocols that adapt easily.

In this evolution of information, people have created and used all kinds of content
that did not exist in the early days of the Internet. The content delivered over the
Internet nowadays is multimedia, including audio, video, and even 3D virtual worlds.
Different types of media place different requirements on transport quality, such as
reliability, timing, ordering, and integrity. Users may also have different
preferences for the delivery of different content. Transport protocols need to adapt
to these needs as well.

All these variations in conditions and requirements call for transport protocols that
present a simple, uniform, and flexible network interface to application developers.
But the existing protocols, mainly TCP and UDP over IP, were developed in much
simpler environments, so they fail to meet these needs or meet them inefficiently.

In this section we present three example applications to illustrate the situation.
The first example is a simple FTP program. FTP running on top of TCP has to stick to
the stream data interface provided by TCP, even though it has much simpler semantics
and a fixed correlation between the pieces of transferred data. A good transport
protocol should be able to exploit FTP’s intrinsically weaker ordering constraints
when transferring files: for instance, out-of-order packets can be delivered directly
to FTP if FTP knows where to put them. The retransmissions of out-of-order packets
that are avoided in this way, together with possible optimization of kernel
operations, save file download time and also put less data into the network.
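
The idea of placing out-of-order data directly at its final location can be
illustrated with a minimal sketch. It assumes that each delivered block carries its
byte offset within the file; the names `place_block` and `assemble` are hypothetical
and are not part of FTP or of the Modular TCP design.

```python
import io


def place_block(target, offset, payload):
    """Write one received block at its final position in the file image."""
    target.seek(offset)
    target.write(payload)


def assemble(blocks, total_size):
    """Build the file from (offset, payload) blocks, in whatever order they arrive."""
    target = io.BytesIO(bytes(total_size))  # pre-sized, zero-filled file image
    for offset, payload in blocks:
        place_block(target, offset, payload)
    return target.getvalue()


# Blocks arrive out of order; each one is placed as soon as it arrives,
# so nothing has to wait in a reordering buffer.
arrived = [(6, b"world"), (0, b"hello ")]
file_image = assemble(arrived, total_size=11)
```

Because every block is self-locating, the application never needs the transport to
hold back data that arrived ahead of a missing block.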

The second example is a stock quote program, which reports the latest stock prices to
day-traders. Since day-traders want the latest quote, when packets are lost or
corrupted the stock quote program does not trigger retransmission of those packets
and can instead start transmitting new stock prices if they are available.

The third example is an online video-playing program. Users ask for smooth playback
with minimum delay and maximum resolution under the conditions of the current
connection. A lost packet in a frame may delay delivery of the entire frame to the
video player, so it is better for the video player to ignore the delayed frame and
continue with the data available. But because of the coding mechanism, the delayed
frame might still be useful for playing later frames, so the video player may keep it
in the buffer or drop it when it is no longer useful. This algorithm works only if
the transport protocol knows that the frames exist and delivers them to the player as
units. The sketch in [Fig. 1] shows the processing of video frames.

Fig. 1. Video frame processing (labels: Video player; Transport Protocol)

3. Transport Protocol Design

As described in the last section, the TCP/IP transport protocols of the early
Internet days cannot meet the requirements arising from the development of
communication networks and applications. The Internet community has kept improving
the designs and implementations of TCP/IP. These efforts include several TCP
improvements in flow control and congestion control [Stevens] and TCP for
transactions [RFC 1379, 1644, T/TCP]. But these TCP-based improvements still keep the
original TCP concept of reliable, in-order delivery of stream data. Applications that
need unreliable data transfer cannot use TCP as their transport protocol. Many of
them are built on top of UDP.

Applications can use UDP directly, but UDP provides few features, and it is
unreasonable, if not impossible, to expect application developers to implement all
the functions they need that UDP lacks. As a result, many protocols have been
designed on top of UDP, each providing features specific to certain kinds of
applications, such as RTP, RSTP, and SRDP, as shown in [Fig. 2]. While these
protocols solve some of the problems faced by application developers, they make
communication networks worse for, or unfair to, other TCP applications. The reason is
that most of these protocols did not consider cooperation between protocols or did
not implement congestion control in the transport layer. This is unfair to TCP
connections because, when congestion happens, TCP detects it and backs off, while
UDP-based protocols are unaware of the congestion, ignore it, or react to it late. As
more and more Internet applications are based on those protocols, the Internet is in
danger of becoming unstable.





Fig. 2. Transport protocol design space (labels: Congestion Control; Transport Layer;
RSTP etc.; transport layer protocol built on UDP)

Aware of this situation, researchers at Berkeley presented a new transport protocol,
WebTP, motivated by the increasing popularity of Web-based applications and by a
user-centric design principle. The WebTP project supports fine-grained,
application-specific control and congestion control, though it aims at broader goals
such as single/multi-user satisfaction optimization with QoS guarantees [WebTP].

Since most of these transport protocols were designed from scratch, their
implementation and adoption may make it hard for applications to utilize them fully.
TCP was created much earlier, but it has been greatly improved since then. TCP
implementations are widely supported and have been tuned for high performance. TCP
also has the desired features of flow control and congestion control. The natural way
to reuse TCP designs and implementations is to extend TCP to meet the requirements of
these new conditions. We designed Modular TCP based on this idea. The IETF summarized
the transport layer requirements raised by the new development of the Internet in
[IETF RUTS]. Modular TCP is designed to satisfy most of these requirements. Since the
improved TCP already satisfies some of them, Modular TCP is easier to design.

Overall, Modular TCP was designed mainly to add these features to TCP:

support for application level framing

visibility into network conditions

control over reliability

the ability to supersede previous application messages

transport handling at a 'frame' granularity (record marking)

message priority control

congestion control (extended to all types of ADU)

4. Modular TCP design


4.1 Overview of Modular TCP

Modular TCP supports the principle of Application Level Framing (ALF) [Clark90]. The
idea is that applications hand data to the transport layer protocol as Application
Data Units (ADUs). The fundamental characteristic of an ADU is that it can be
processed out of order with respect to other ADUs. This rule permits ADU boundaries
to take the place of packet boundaries for end-to-end error detection or correction
and for other manipulation operations such as encryption and presentation. ADUs also
let applications transfer different ADUs with different transport quality
requirements, such as reliability, ordering, and timing.

Modular TCP is designed to support four pure types of ADU flow, distinguished by
their requirements on reliability and ordering, plus a mixed mode. A Modular TCP
connection therefore permits ADU flows at five levels:

Reliable, in-order delivery

Reliable, unordered delivery

Unreliable, in-order delivery

Unreliable, unordered delivery

Mixture delivery

Pure requirements, as in the first four levels, are simple and easy to keep
consistent. Mixture delivery is designed for applications with multiple delivery
requirements within the same connection. It would be simpler for the application to
use multiple connections with different delivery requirements, but limited mixture
delivery is good for performance when an application asks for unreliable flows most
of the time and needs a reliable flow only for some special ADUs, such as ADUs
carrying important synchronization or timing information. The ordering constraints
for mixture delivery require additional semantics to be defined.

Modular TCP connections are established only when both the sender and the receiver
agree to use Modular TCP semantics. This is achieved by defining a Modular TCP permit
option in the TCP header. The proposed format is:

Modular TCP permit option: {[option kind = 200], [option length = 4], [ADU
y], [ADU max size]}.
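
A minimal sketch of how such a four-byte option could be packed and parsed follows.
The option kind (200) and length (4) come from the proposed format; everything else
is an assumption made for illustration. In particular, the two remaining fields are
taken to be one byte each, and the first of them is treated as an opaque value
(`adu_param`) because its name is truncated in the source. The function names are
hypothetical.

```python
import struct

MODULAR_TCP_PERMIT_KIND = 200   # from the proposed format
MODULAR_TCP_PERMIT_LEN = 4      # from the proposed format


def pack_permit_option(adu_param, adu_max_size):
    """Encode the option as kind, length, and two one-byte fields (assumed widths)."""
    for value in (adu_param, adu_max_size):
        if not 0 <= value <= 255:
            raise ValueError("field does not fit in one byte")
    return struct.pack("!BBBB", MODULAR_TCP_PERMIT_KIND,
                       MODULAR_TCP_PERMIT_LEN, adu_param, adu_max_size)


def parse_permit_option(raw):
    """Decode the option; reject anything that is not a well-formed permit option."""
    if len(raw) != MODULAR_TCP_PERMIT_LEN:
        raise ValueError("unexpected option length")
    kind, length, adu_param, adu_max_size = struct.unpack("!BBBB", raw)
    if kind != MODULAR_TCP_PERMIT_KIND or length != MODULAR_TCP_PERMIT_LEN:
        raise ValueError("not a Modular TCP permit option")
    return adu_param, adu_max_size
```

With only one byte available, an ADU maximum size would have to be expressed in some
coarse unit; the report does not say which, so the sketch leaves the value
uninterpreted.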

During the connection, both sender and receiver need to track all the ADU
information, including sequence numbers, sizes, reliability requirements, ordering
requirements, and other possible options such as priority. These data are stored in
an ADU header structure. It would be possible to define another TCP header option for
this purpose, just like the Modular TCP permit option. But we argue that such an
option is not flexible and would limit the option space available to other TCP
options, such as the SACK option that Modular TCP will use.
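
One way to picture the ADU header is the sketch below. The fields mirror the items
listed above (sequence, size, reliability, ordering, priority), but the field widths,
the flag bits, and the ten-byte wire layout are assumptions chosen for illustration;
the report does not specify an encoding.

```python
import struct
from dataclasses import dataclass

FLAG_RELIABLE = 0x01   # hypothetical flag bit: ADU must be retransmitted if lost
FLAG_ORDERED = 0x02    # hypothetical flag bit: ADU must be delivered in order
_HEADER = struct.Struct("!IIBB")  # sequence, size, flags, priority (assumed layout)


@dataclass(frozen=True)
class AduHeader:
    sequence: int      # ADU sequence number
    size: int          # ADU length in bytes
    reliable: bool
    ordered: bool
    priority: int = 0

    def pack(self):
        """Serialize the header into its assumed 10-byte network-order form."""
        flags = (FLAG_RELIABLE if self.reliable else 0) | \
                (FLAG_ORDERED if self.ordered else 0)
        return _HEADER.pack(self.sequence, self.size, flags, self.priority)

    @classmethod
    def unpack(cls, raw):
        """Parse a header from the first 10 bytes of raw."""
        sequence, size, flags, priority = _HEADER.unpack(raw[:_HEADER.size])
        return cls(sequence, size, bool(flags & FLAG_RELIABLE),
                   bool(flags & FLAG_ORDERED), priority)
```

Carrying these fields in front of the ADU payload, rather than in a TCP option, keeps
the 40-byte option space free for SACK.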

Supporting ALF requires many changes to TCP, but the basic connection flow is clear
and simple. It is shown in [Fig. 3] and discussed step by step in the following
subsections.

4.2 Sender of the connection

The sender side of a Modular TCP connection keeps most of TCP’s flow control and
congestion control. Sometimes the job is easier, since unreliable packets are not
required to be retransmitted.

On receiving an ADU from the application, the sender fragments the ADU if needed and
queues the packets in the sending queue. The sender still keeps a sliding window to
determine the flow of packets sent. Once a data packet has been sent, unreliable
packet data can be discarded since it will not be needed again, but reliable packet
data must remain queued for retransmission in case of loss or corruption. The sender
also needs to keep records of outstanding unreliable ADUs, which are compared against
acknowledgements for flow control and congestion control.
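
The sender-side bookkeeping described here can be sketched as follows. This is an
illustration under simplifying assumptions, not the report's implementation: the
sliding window is left out, sequence numbers count bytes, and the names `Segment`,
`fragment_adu`, and `Sender` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    seq: int          # sequence number of the first payload byte
    payload: bytes
    reliable: bool


def fragment_adu(data, start_seq, mss, reliable):
    """Split one ADU into segments of at most mss bytes."""
    return [Segment(start_seq + off, data[off:off + mss], reliable)
            for off in range(0, len(data), mss)]


@dataclass
class Sender:
    mss: int
    next_seq: int = 0
    retransmit_queue: dict = field(default_factory=dict)  # seq -> Segment (reliable only)
    outstanding: dict = field(default_factory=dict)       # seq -> (length, reliable)

    def send_adu(self, data, reliable):
        """Fragment an ADU and record what must be remembered after sending."""
        segments = fragment_adu(data, self.next_seq, self.mss, reliable)
        for seg in segments:
            # Every segment is tracked so acknowledgements can be matched to it.
            self.outstanding[seg.seq] = (len(seg.payload), seg.reliable)
            # Only reliable payloads are kept; unreliable data is not stored.
            if seg.reliable:
                self.retransmit_queue[seg.seq] = seg
        self.next_seq += len(data)
        return segments
```

Note that unreliable segments still appear in `outstanding`: their payload is gone,
but their sequence range is needed to interpret acknowledgements.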

On receiving an acknowledgement from the receiver, which should be a SACK, the sender
checks the acknowledged contiguous blocks of data and marks them. If the sender then
finds that the SACK indicates the receiver is waiting for a lost reliable packet, it
retransmits that reliable packet and adjusts the congestion control parameters. On a
timeout for a reliable packet, the sender also starts retransmission. In this case,
Fast Retransmit will help. Acknowledgements for unreliable packets are recorded so
that, once the whole unreliable ADU has been dropped at the receiver, the sender can
advance the cumulative sequence number past the lost unreliable packet.
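
A minimal sketch of the SACK check on the sender side, under stated assumptions:
SACK blocks are (left, right) byte ranges with an exclusive right edge as in RFC 2018,
`outstanding` maps each sent segment's sequence number to its length and reliability,
and a segment counts as lost once data beyond it has been selectively acknowledged.
The function name is hypothetical, and timers and congestion window updates are
omitted.

```python
def holes_to_retransmit(outstanding, cumulative_ack, sack_blocks):
    """Return sequence numbers of reliable segments the SACK shows to be missing.

    outstanding: dict mapping seq -> (length, reliable)
    cumulative_ack: next byte the receiver expects in order
    sack_blocks: list of (left, right) ranges, right edge exclusive
    """
    if not sack_blocks:
        return []
    highest_sacked = max(right for _, right in sack_blocks)

    def is_sacked(seq, length):
        return any(left <= seq and seq + length <= right
                   for left, right in sack_blocks)

    missing = []
    for seq in sorted(outstanding):
        length, reliable = outstanding[seq]
        if seq + length <= cumulative_ack:
            continue            # already cumulatively acknowledged
        if seq >= highest_sacked:
            continue            # nothing beyond it was received; no evidence of loss
        if is_sacked(seq, length):
            continue            # received out of order
        if reliable:
            missing.append(seq)  # a hole in reliable data: retransmit
        # holes in unreliable data are skipped, never retransmitted
    return missing
```

For example, with segments at 0, 4, 8, 12, and 16 (each 4 bytes, the one at 8
unreliable), a cumulative ACK of 4 and a SACK block (12, 16) select only the segment
at 4 for retransmission.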

4.3 Receiver of the connection

The receiver side of a Modular TCP connection is more complex than in the original
TCP, since it must deal with all the delivery requirements mentioned in the overview
and must also acknowledge received packets correctly, giving the data sender timely
information about flow status and congestion signals.

On receiving an incoming packet, the data receiver first performs the normal checksum
calculation on the packet passed up from IP. If the packet is not corrupted, it is
passed along to the received-data queue manager and acknowledged appropriately. If
the packet is corrupted, it is dropped, and an acknowledgement is still sent.

Fig. 3. Overview of Modular TCP (labels: Partial ADU; Lost packet; Corrupted packet;
Lost ACK; Packet queue; SACK list)

Modular TCP enables unreliable data flows, which normally would not need to be
acknowledged. In this case, however, acknowledgements are still needed to carry flow
status and congestion signals. Modular TCP therefore uses Selective Acknowledgements
(SACK) [RFC 2018, SACK] for all kinds of ADU flows. SACK is discussed in detail
later.

The queue manager receives the data passed to it and checks it against the queued
data blocks to find any complete ADU that can be reassembled. If a complete ADU is
available and either it is in order or its flow permits out-of-order delivery, the
queue manager dequeues the data and passes it to the application. The application
reads the data through the ordinary socket interface and processes the ADU in its own
way. It is also possible to design an asynchronous system call for reading ADUs. If
the available ADU is out of order and out-of-order delivery is not permitted, then it
has to be buffered in the ADU queue at the receiver. The queue manager also updates
the statistics and advertises the optimal window size in the acknowledgements. If an
unreliable packet is out of order or delayed, the queue manager must check the queue
to find its ADU, which can be discarded when buffer space is low. If its ADU has been
discarded, the packet has to be discarded as well.
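
The delivery rule of the queue manager can be sketched as below. The sketch assumes
one pure delivery mode per connection (ordered or unordered), ADUs numbered 0, 1, 2,
..., and fragments that do not overlap; buffer-pressure discarding, window
advertisement, and mixture delivery are left out. `AduBuffer` and `QueueManager` are
hypothetical names.

```python
class AduBuffer:
    """Collects the fragments of one ADU until it is complete."""

    def __init__(self, size):
        self.size = size
        self.fragments = {}          # offset within the ADU -> payload

    def add(self, offset, payload):
        self.fragments[offset] = payload

    def is_complete(self):
        return sum(len(p) for p in self.fragments.values()) == self.size

    def data(self):
        return b"".join(self.fragments[o] for o in sorted(self.fragments))


class QueueManager:
    def __init__(self, ordered):
        self.ordered = ordered
        self.next_id = 0             # next ADU id owed to the application (ordered mode)
        self.delivered = set()       # ids already handed over (unordered mode)
        self.pending = {}            # adu_id -> AduBuffer

    def on_packet(self, adu_id, adu_size, offset, payload):
        """Queue one fragment; return the (adu_id, data) pairs deliverable now."""
        if adu_id in self.delivered or (self.ordered and adu_id < self.next_id):
            return []                # late duplicate of a delivered ADU
        self.pending.setdefault(adu_id, AduBuffer(adu_size)).add(offset, payload)
        out = []
        if not self.ordered:
            if self.pending[adu_id].is_complete():
                out.append((adu_id, self.pending.pop(adu_id).data()))
                self.delivered.add(adu_id)
            return out
        # Ordered mode: release complete ADUs only in sequence.
        while self.next_id in self.pending and self.pending[self.next_id].is_complete():
            out.append((self.next_id, self.pending.pop(self.next_id).data()))
            self.next_id += 1
        return out
```

In ordered mode a complete ADU 1 stays buffered until ADU 0 is complete; in unordered
mode it is handed to the application at once.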

4.4 Flow control and Congestion control

Modular TCP aims to be compatible with the original TCP, and its main goal is to
extend flow control and congestion control to all types of service. Flow control and
congestion control in TCP therefore need to change to accommodate Modular TCP flows.

The flow control and congestion control management module basically stays the same as
in TCP, because we change the acknowledgement scheme so that the data receiver can
inform the data sender of any change in window size and of congestion signals in a
timely manner, regardless of ADU type. We employ the Selective Acknowledgement scheme
in Modular TCP so that the data sender can detect congestion in time and apply
congestion control to ease the network flow. The new ECN mechanism also looks
promising as a congestion indicator. Both are discussed in detail in the following
subsections.

4.4.1 SACK

Selective Acknowledgement was proposed for TCP by Mathis, Mahdavi, Floyd, and Romanow
[RFC 2018, SACK] to improve TCP’s original simple positive cumulative acknowledgement
scheme, in which received segments that are not at the left edge of the receive
window are not acknowledged. The SACK option enables the data receiver to report that
non-contiguous blocks of data have been received even though some blocks before them
were lost or corrupted. The data sender can then use this information to selectively
retransmit only the data needed.

SACK introduces a SACK-permitted option, used only in SYN packets, into the TCP
header for negotiation in the initialization phase. In a SACK-enabled connection, the
data receiver fills the SACK TCP option field with a list of blocks of contiguous
sequence space occupied by data that has been received and queued within the window.
The SACK TCP option takes 8*n+2 bytes for n blocks, so with TCP option space limited
to 40 bytes it can hold at most 4 blocks, and usually 3 when other options such as
Timestamps are also in use.
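
The size arithmetic can be checked with a short sketch. The option kind value 5 and
the block layout (two 32-bit edges per block) are those of RFC 2018; the helper names
are hypothetical, and the 12-byte figure used in the example is the Timestamps option
(10 bytes) plus two bytes of padding.

```python
import struct

SACK_KIND = 5            # option kind assigned to SACK in RFC 2018
TCP_OPTION_SPACE = 40    # maximum bytes of options in a TCP header


def sack_option_length(n_blocks):
    """Kind byte + length byte + two 32-bit edges per block."""
    return 8 * n_blocks + 2


def max_sack_blocks(other_option_bytes=0):
    """Largest n with 8*n + 2 fitting in the option space that is left."""
    return (TCP_OPTION_SPACE - other_option_bytes - 2) // 8


def encode_sack_option(blocks):
    """Encode (left_edge, right_edge) pairs as a SACK option in network byte order."""
    out = struct.pack("!BB", SACK_KIND, sack_option_length(len(blocks)))
    for left, right in blocks:
        out += struct.pack("!II", left, right)
    return out


alone = max_sack_blocks()                  # SACK is the only option
with_timestamps = max_sack_blocks(12)      # Timestamps option plus padding
```

Here `alone` evaluates to 4 and `with_timestamps` to 3, which is where the "usually 3
blocks" figure comes from.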

SACK is only advisory for normal TCP connections, considering that some
implementations do not support SACK options. Modular TCP is designed to make full use
of the SACK option, so it checks TCP connection requests and uses Modular TCP
semantics only when both the Modular TCP permit option and the SACK-permitted option
are present.

4.4.2 Explicit Congestion Notification

TCP’s existing congestion monitoring and control algorithms are based on the notion
that the network is a black box [Jacobson90]. TCP probes the network state by
gradually increasing the load on the network, enlarging the window of packets
outstanding in the network until the network becomes congested and a packet is lost,
as indicated by acknowledgements. This method is appropriate only for pure
best-effort data carried by TCP over connections with a low loss ratio and low
latency, since TCP congestion management algorithms employ Fast Retransmit and Fast
Recovery techniques to minimize the impact of losses from a throughput perspective.
But with the introduction of wireless networks, which have a high loss ratio, TCP
needs to adapt to these cases.
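
The probing behaviour can be illustrated with a toy model. It is a deliberately
simplified sketch, not a faithful TCP implementation: the window is counted in whole
segments and updated once per round trip, a loss is assumed whenever the window
exceeds a fixed path capacity, and the reaction to loss is to halve the window as in
Fast Recovery. The names `next_cwnd` and `probe` are hypothetical.

```python
def next_cwnd(cwnd, ssthresh, loss):
    """Advance the congestion window by one round trip; return (cwnd, ssthresh)."""
    if loss:
        ssthresh = max(cwnd // 2, 2)       # multiplicative decrease
        return ssthresh, ssthresh
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh   # slow start: double per RTT
    return cwnd + 1, ssthresh              # congestion avoidance: +1 per RTT


def probe(capacity, rtts, cwnd=1, ssthresh=64):
    """Trace the window over several round trips on a path of fixed capacity."""
    trace = []
    for _ in range(rtts):
        loss = cwnd > capacity             # toy loss model: overshoot drops a packet
        cwnd, ssthresh = next_cwnd(cwnd, ssthresh, loss)
        trace.append(cwnd)
    return trace


trace = probe(capacity=10, rtts=8, cwnd=1, ssthresh=8)
```

The trace is [2, 4, 8, 9, 10, 11, 5, 6]: the sender only learns the capacity by
overshooting it, which is exactly the black-box probing described above. On a lossy
wireless link, losses unrelated to congestion would trigger the same halving.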

TCP probes network bandwidth by gradually increasing the window size until it
experiences a dropped packet, which causes the queues at the bottleneck router to
build up. The bottleneck router then has to drop packets, which might belong to a
loss-sensitive or delay-sensitive connection such as Web browsing or an interactive
process. If the router can detect congestion in the flows it is processing and send
congestion warning signals to the senders of the participating connections, the
senders can decrease their congestion window sizes to avoid congestion before it
actually happens. This is the idea of ECN (Explicit Congestion Notification)
[RFC 2481, ECN].

ECN proposes an ECN field of two bits in the IP header, used to indicate incipient
congestion, so that routers can sometimes mark ECN-enabled packets rather than drop
them. ECN needs support from the transport protocols of the end-systems: negotiation
during connection setup to determine whether both ends are ECN-capable, an ECN-Echo
flag in the TCP header to inform the sender when a CE (Congestion Experienced) packet
has been received, and a Congestion Window Reduced (CWR) flag to inform the receiver
that the congestion window has been reduced. Modular TCP needs to support this useful
mechanism, though it has not been widely deployed and is implemented by only a few
TCP/IP stacks.
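
The interplay of the CE mark, the ECN-Echo flag, and the CWR flag can be sketched as
a small model. The flag names follow RFC 2481; the classes and the halving of the
window are illustrative assumptions, and the sketch omits connection-setup
negotiation as well as the rule that a real sender reacts at most once per round-trip
time.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    ect: bool = False   # ECN-Capable Transport bit (IP header)
    ce: bool = False    # Congestion Experienced bit (IP header)
    cwr: bool = False   # Congestion Window Reduced flag (TCP header)


def router_forward(pkt, congested):
    """Mark ECN-capable packets under congestion; drop the others."""
    if not congested:
        return pkt
    if pkt.ect:
        pkt.ce = True
        return pkt
    return None         # not ECN-capable: the only signal left is a drop


class EcnReceiver:
    def __init__(self):
        self.echo = False

    def on_data(self, pkt):
        """Return the ECN-Echo flag to set on the acknowledgement."""
        if pkt.cwr:
            self.echo = False   # sender confirmed it reduced its window
        if pkt.ce:
            self.echo = True    # keep echoing until CWR arrives
        return self.echo


class EcnSender:
    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.cwr_pending = False

    def on_ack(self, ecn_echo):
        if ecn_echo and not self.cwr_pending:
            self.cwnd = max(self.cwnd // 2, 1)   # react as if a packet was lost
            self.cwr_pending = True

    def next_packet(self):
        pkt = Packet(ect=True, cwr=self.cwr_pending)
        self.cwr_pending = False
        return pkt
```

The point of the exchange is that the sender reduces its window without any packet
having been dropped.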

5. Experimental Implementation

During the design of Modular TCP, we also looked at implementation issues. To be a
high-performance transport protocol, it has to be closely integrated with the lower
Internet Protocol and even the network layer, though it is best designed modularly.
According to the principle of Integrated Layer Processing introduced in [Clark90], it
is important to distinguish between the architecture of a protocol suite and the
engineering of a specific end-system. The key architectural principle should be
flexible decomposition: the deferral of engineering decisions to the implementation
and the avoidance of inessential constraints. Careful consideration of the
implementation at the design phase is therefore needed.


We decided that the experimental implementation of Modular TCP should be developed on
Linux. The Linux TCP/IP protocol suite is one of the existing TCP/IP implementations
with the most advanced features. For example, Linux has supported Selective
Acknowledgement (SACK) since kernel 2.1.19, and it also supports Explicit Congestion
Notification (ECN) in kernel 2.4.test-10. More importantly, Linux TCP/IP is open
source. A vast amount of Linux material is available online, and there are many
kernel developers, which should make the implementation easier. This is also good for
the adoption and evolution of the Modular TCP suite on the Internet.

The Linux TCP/IP implementation achieves high performance and throughput thanks to
its good quality, though this also means it is harder to change. It employs the
Integrated Layer Processing principle and also follows the recommendations of TCP
Control Block Interdependence [RFC 2140]. It offers applications the generic BSD
socket interface, which should be kept in Modular TCP. BSD sockets are higher-level
abstractions of INET sockets. The key common data structure is the socket buffer, or
sk_buff, which makes it possible to maintain strict protocol layering without wasting
time copying parameters and payloads back and forth. Modular TCP would add parameters
to the TCP control block to support ALF/ADU in TCP. We estimate that Modular TCP
needs to add about one third more code to the protocol stack.

6. Conclusion

In this project, we presented Modular TCP, an extension to TCP for the new Internet
environments. We analyzed the challenges and requirements posed by network
applications today and projected into the near future, and described what features
are demanded of transport layer protocols. Based on these requirements, we presented
the design of Modular TCP, with special subsections on congestion control, which is
the main feature we want to keep and to make work efficiently in all cases. We
discussed a congestion control solution based on Selective Acknowledgement, and
projected that Explicit Congestion Notification will be of great help though it is
still in research. We also discussed an experimental implementation on Linux.
Overall, we are excited about the introduction of Modular TCP and will continue to
work on this project. Future plans include simulations to study the correctness and
performance of the designed protocol and to settle several engineering decisions
using network simulation [ns]. The experimental implementation of Modular TCP on
Linux is also planned.


References

[MIDS] MIDS Internet

[Clark90] David D. Clark, David L. Tennenhouse, Architectural Considerations for a
New Generation of Protocols, Lab. for Computer Science, M.I.T.

[IETF RUTS] Requirements for Unicast Transport/Sessions bof, Dec. 1998.

[Floyd] Sally Floyd, website,

[Rizzo96] Luigi Rizzo, TCP re-transmission on very lossy networks, Jan 1996

[Stevens] W. Richard Stevens, TCP/IP Illustrated, I, II, III

[WebTP] Ye Xia, Hoi-Sheung Wilson So, Venkat Anantharam, Steven McCanne, David Tse,
Jean Walrand and Pravin Varaiya, The WebTP Architecture and Algorithms, Memorandum
No. UCB/ERL M00/53, Electronics Research Laboratory, University of California,
Berkeley, Jan. 15, 2000

[Jacobson90] V. Jacobson, Modified TCP Congestion Avoidance Algorithm, Message to
end2end-interest mailing list, April 1990

[ns] The Network

[RFC 793, TCP] ISI, Transmission Control Protocol, RFC 793, IETF, 1981

[RFC 2001] W. Stevens, TCP Slow Start, Congestion Avoidance, Fast Retransmit, and
Fast Recovery Algorithms, RFC 2001, IETF, January 1997

[RFC 2018, SACK] M. Mathis, J. Mahdavi, S. Floyd, A. Romanow, TCP Selective
Acknowledgment Options, RFC 2018, IETF, Oct. 1996.

[SACKTCP] Sally Floyd, Jamshid Mahdavi, Matt Mathis, Matthew Podolsky, and Allyn
Romanow, An Extension to the Selective Acknowledgement (SACK) Option for TCP,
Internet Draft, August 1999

[RFC 1379, T/TCP] Braden, R. T., Extending TCP for Transactions -- Concepts, RFC
1379, IETF, Nov 1992

[RFC 1644, T/TCP] Braden, R. T., TCP Extensions for Transactions Functional
Specification, RFC 1644, IETF, July 1994

[RFC 2481, ECN] Ramakrishnan, K.K., and Floyd, S., A Proposal to add Explicit
Congestion Notification (ECN) to IP, RFC 2481, January 1999

[RFC 2140] J. Touch, TCP Control Block Interdependence, RFC 2140, IETF, April 1997