Development of a TCP/IP stack in a real time embedded system

Stefan Jakobsson & Erik Dahlberg
August 21, 2007
Master’s Thesis in Computing Science, 2*20 credits
Supervisor at CS-UmU: Jan Erik Moström
Examiner: Per Lindström
Umeå University
Department of Computing Science
SE-901 87 UMEÅ
SWEDEN
Abstract
A communication protocol is a description of the rules computers must follow to
communicate with each other. The TCP/IP protocol suite refers to several separate
protocols that computers use to transfer data across networks. It is named after the
two most important protocols in the suite, Transmission Control Protocol (TCP) and
Internet Protocol (IP). TCP/IP is widely used because of its ability to switch packets
between networks of all shapes, sizes and varieties.
The aim of this Master's Thesis was to implement a TCP/IP stack in an embedded
system. The intended embedded system placed some requirements on the TCP/IP
stack. Only the most important protocols, and features in these protocols, should be
implemented. Memory usage should be kept low, and MISRA-C coding rules should be
employed.
This report presents how the TCP/IP stack is implemented as well as information
about the hardware and software involved. The report also gives an introduction to some
basic concepts regarding a TCP/IP stack and information about the system in
which the stack resides.
Contents
1 Introduction
1.1 Outline of this report
2 Problem Description
2.1 Problem Statement
2.2 Goals
3 VCS
3.1 VCS System
3.2 VCS Core
3.3 Rubus OS
3.3.1 The Red Kernel
3.3.2 The Blue Kernel
3.3.3 The Green Kernel
3.3.4 Rubus Visual Studio
4 Software Development For Safety Critical Systems
4.1 Introduction
4.2 System safety
4.3 Software failures: What can go wrong?
4.3.1 Therac-25
4.3.2 Ariane 5
4.4 What is the problem?
4.5 Language Selection
4.5.1 Java
4.5.2 Ada
4.5.3 C
4.6 Conclusion
4.7 Development of safety critical software
4.7.1 Correctness of software based systems
4.7.2 When is Software Ready for Production?
5 TCP/IP
5.1 Overview
5.2 Protocol
5.3 Requirements
5.3.1 Application layer
5.3.2 Transport layer
5.3.3 Network layer
5.3.4 Network interface layer
5.4 Socket
5.5 Network Layer Protocols
5.5.1 Internet Protocol (IP)
5.5.2 Addressing
5.5.3 Fragmentation
5.5.4 Header format
5.5.5 Address Resolution Protocol (ARP)
5.5.6 Internet Control Message Protocol (ICMP)
5.6 Transport Layer Protocols
5.6.1 Transmission Control Protocol (TCP)
5.6.2 User Datagram Protocol (UDP)
5.7 Application Layer Protocols
5.7.1 File Transfer Protocol (FTP)
6 Hardware
6.1 Phytec phyCore-XC167
6.2 Microcontroller Infineon XC167CI
6.3 Ethernet Controller Cirrus Logic CS8900A
7 System description
7.1 Overview
7.2 Telnet Server
7.2.1 Negotiation
7.2.2 Interface
7.3 FTP Server
7.3.1 Supported Commands
7.4 Socket
7.5 TCP
7.5.1 Data structures
7.5.2 Buffers
7.5.3 State machine
7.5.4 Establishing a connection
7.5.5 Sending data
7.5.6 Receiving data
7.5.7 Timers
7.5.8 Congestion control
7.6 UDP
7.7 IP layer
7.7.1 Upper level interface
7.7.2 Lower level interface
7.8 QueueLayer
7.8.1 ARP-module
7.9 HAL
8 Conclusions
8.1 Limitations
8.2 Future work
9 Acknowledgments
References
A Abbreviations
B Socket Application Programming Interface (API)
B.1 General
B.2 Usage
B.2.1 Creating a socket
B.2.2 Accepting a connection
B.2.3 Connect to a peer socket
B.2.4 Send and receive data
B.2.5 Close a socket
B.3 Code example
C Ethernet Hardware Abstraction Layer
C.1 General
C.2 Usage
C.2.1 Initialize Ethernet Controller
C.2.2 Interrupts
C.2.3 Identifying The Ethernet Controller
C.2.4 Link Status of Ethernet Controller
C.2.5 Send an Ethernet Frame
List of Figures
3.1 Example of a VCS System
3.2 System overview
3.3 Inside a VCS node
3.4 Rubus overview
3.5 Clock scheduler in the red kernel
3.6 Legal state transitions
3.7 Blue thread scheduling in Rubus
3.8 Rubus Visual Studio
5.1 An example of a network
5.2 IP header format
5.3 A small Local Area Network example
5.4 TCP header format
5.5 Example of sequence and acknowledgment numbers
5.6 Example of how a TCP connection is established
5.7 UDP header format
6.1 Block diagram of phyCore-XC167
6.2 Modification of phyCore-XC167
6.3 XC167CI Memory map
6.4 Overview of XC167CI’s on-chip components
7.1 The TCP/IP-stack is built in a layered structure
7.2 Data buffer organization for a TCP packet with N bytes of data
7.3 Overview of the send-algorithm
7.4 Receive procedure for the Application-Thread
7.5 Receive procedure for the Receive-Thread
7.6 Packet queues buffer organization
7.7 Application-Thread
7.8 Work process for the Receive-Thread
B.1 System structure
C.1 System structure
List of Tables
4.1 Definitions of controllability levels
5.1 Address classes in IPv4
5.2 A possible ARP table in node 192.168.0.17
5.3 ICMP message types
6.1 Memory organization in CS8900A
6.2 CS8900A I/O port mapping
Chapter 1
Introduction
Embedded systems are common in many areas today. These systems vary
in size, scope of use and complexity, and reside in everything from toasters to space
shuttles. An embedded system is a combination of computer hardware and software
that is manufactured to handle a specific task. Although an embedded system is usually
a single-purpose application, it is often integrated with other embedded systems which
together perform advanced functions.
Ordinary personal computers are designed to satisfy a variety of users and to run
many different kinds of applications. An embedded system can be specialized in terms of
hardware and software for its specific task. This limited scope of use makes it possible
to design these systems to perform exactly as efficiently as required, thereby keeping
costs down. One domain using embedded systems is the vehicle industry. A modern
car contains many embedded systems which handle functionality like Anti-lock Braking
Systems (ABS) and Cruise Control.
Another vehicle industry using embedded systems in its products is the defence
industry. One leading manufacturer of combat vehicles is BAE Systems Hägglunds
in Örnsköldsvik (Sweden), with over 1200 employees. This master's thesis was proposed
by Hägglunds and involved an embedded system in a combat vehicle. The task was to
extend this system with a TCP/IP stack, which is used to transmit data over a network.
1.1 Outline of this report
This thesis report describes the project carried out for BAE Systems Hägglunds. Chapter 7
explains how the work was done, describing the actual implementation details
of the project. The preceding chapters are meant to give an insight into the overall system,
which involves both hardware and software. There is also an in-depth study, in chapter
4, discussing topics concerning development of safety critical software.
This project is an extension of an existing system in a combat vehicle. The system is
called VCS (Vehicle Control System), and chapter 3 explains what components VCS
consists of and how they interact. After the in-depth study, chapter 5 explains the
responsibilities of a TCP/IP stack, which is a suite of communication protocols. A subset of
these protocols is explained in terms of how they work and cooperate. Readers
familiar with TCP/IP and its protocols may skip this chapter. The hardware used for
this project is presented in chapter 6.
After the description of the implementation in chapter 7, limitations of the solution
and future work are discussed. The list below gives a short description of the different
chapters, to give a quick overview of this report.
– Chapter 3
Describes VCS and VCS Core.
– Chapter 4
Introduction to software development in safety critical systems.
– Chapter 5
Information about TCP/IP.
– Chapter 6
Describes the hardware used for this project.
– Chapter 7
Presents and describes the result of our implementation.
– Chapter 8
Presents limitations of the current implementation and suggestions for future work.
Chapter 2
Problem Description
2.1 Problem Statement
When Hägglunds developed their new combat vehicle CV9030CH, a new distributed
computer based system called Vehicle Control System (VCS) was born. This electrical
system replaced the old systems built upon cables and relays. VCS consists of several
microcontrollers which use CAN (Controller Area Network) to communicate with each
other. CAN is a shared serial bus developed for connecting electronic control
units. VCS is responsible for controlling and monitoring vehicle functions such as fans
and engines. Each microcontroller is configured to perform a specific task. To ease
development of software for each microcontroller, all common functions were gathered
into a framework named VCS Core. This framework can be viewed as an
operating system, though it is built upon a small Real-Time Operating System called
Rubus OS.
The assignment of this master's thesis was to extend VCS Core with a TCP/IP stack.
In addition to implementing the stack, the project also included implementing drivers for
the Ethernet controller. Applications that use the network communication tools provided
by the TCP/IP stack, like FTP and Telnet, should also be implemented if time allowed.
Certain aspects concerning the project had to be considered. First, the TCP/IP stack
should use as little memory as possible, because VCS runs on small microcontrollers
with limited resources. Further, the implementation should follow a number of coding
guidelines. One guideline which had a big impact on the system's design was that no
dynamic memory allocation was allowed.
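A ban on dynamic memory allocation usually means that every buffer is reserved at
compile time and handed out from a fixed pool. The following is a minimal sketch of
such a pool, not code from VCS Core; the names pool_alloc and pool_free and the sizes
POOL_BUFFERS and POOL_BUF_SIZE are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define POOL_BUFFERS  4u
#define POOL_BUF_SIZE 128u

/* All storage is reserved at compile time; malloc() is never called. */
static uint8_t pool_storage[POOL_BUFFERS][POOL_BUF_SIZE];
static uint8_t pool_in_use[POOL_BUFFERS];

/* Returns a free buffer, or NULL when the pool is exhausted. */
uint8_t *pool_alloc(void)
{
    uint8_t *result = NULL;
    size_t i;

    for (i = 0u; (i < POOL_BUFFERS) && (result == NULL); i++) {
        if (pool_in_use[i] == 0u) {
            pool_in_use[i] = 1u;
            result = pool_storage[i];
        }
    }
    return result;
}

/* Returns a buffer to the pool; pointers outside the pool are ignored. */
void pool_free(const uint8_t *buf)
{
    size_t i;

    for (i = 0u; i < POOL_BUFFERS; i++) {
        if (buf == pool_storage[i]) {
            pool_in_use[i] = 0u;
        }
    }
}
```

With this approach the worst-case memory usage is known when the program is linked,
at the price of having to handle an exhausted pool explicitly wherever a buffer is
requested.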
2.2 Goals
The main purpose of this thesis was to investigate the possibility of a TCP/IP stack in
the VCS system. Is it realistic to have a TCP/IP stack in VCS Core, or will it consume
too much memory and CPU time? The goal was therefore to implement a TCP/IP stack
in the VCS Core framework that meets the system's demands. VCS had no support
for TCP/IP whatsoever, so everything had to be done from scratch. This made it hard
to estimate how much work and time was needed for this project. Therefore the
goal was divided into the following sub-goals:
– To construct a driver for the Ethernet controller CS8900 adapted for VCS Core.
– To be able to send and receive Ethernet frames.
– To be able to ping a VCS node.
– To be able to set up a TCP-connection.
– A working Telnet-server that allows remote login.
– A working FTP-server to download log-files.
Many opportunities arise if communication with a VCS node through Ethernet is
possible. Diagnostic information could be downloaded via FTP, and remote login would be
possible with Telnet. Also, many third-party components communicate through TCP/IP, and
such components could easily be integrated with a TCP/IP stack in VCS.
Chapter 3
VCS
3.1 VCS System
VCS (Vehicle Control System) is a new distributed computer based electrical
system that was developed for the combat vehicle CV9030CH. Together with
VCS, another system called VIS (Vehicle Information System) was developed as
a complement. VCS is based on microcontrollers with CAN communication,
called nodes. Each node runs the VCS Core software and has one
or several CAN interfaces. Normally a node is connected to a single CAN
network, but it can be configured to act as a gateway between different CAN networks.
An example of a VCS system installed in a vehicle is shown in Figure 3.1.
Figure 3.1: Example of a VCS System
There are several advantages of using software and buses instead of cables and relays
in an electrical system. Software makes the system more flexible and configurable. It makes
it easy to build vehicle simulators, test benches and virtual vehicles, and also to discover
and react to errors that can arise in components. Software also makes it possible to use
Built-In Tests (BIT) for error detection, which Hägglunds describes as a requirement from
customers. Another benefit of a software system is that it simplifies communication
with third-party systems such as a GPS.
The electrical system inside Hägglunds' combat vehicles can be divided into different
function domains, where each domain has special requirements. An overview of a typical
computer based distributed electrical system and how it can be divided into different
domains is shown in Figure 3.2.
Figure 3.2: System overview
As seen in Figure 3.2, the system has been divided into three function domains.
To meet the function requirements of each domain Hägglunds has, besides VCS,
developed three other core platforms named Vehicle Information System (VIS), Extended
Control System (ECS) and Diagnostic Information System (DIS). DIS is an external
system that is plugged into the CAN network when diagnostics are needed.
– Information domain
Man Machine Interface (MMI) manages and presents information to the user
through displays, alarms, diagnoses, videos etc. All of these functions are
handled by VIS. Typical information presented here is:
• Operation data (instrumentation, meters etc).
• Ongoing errors (symbols + information texts).
• Manuals.
• System tests and diagnosis.
• Maps.
Occasional disturbances that can arise in this domain will not affect the vehicle,
because the domain contains no functions that control the vehicle. VIS runs on PCs
with Linux, and VIS nodes communicate with each other through TCP/IP over Ethernet.
To be able to communicate with nodes in the utility domain, one VIS node must act
as a gateway.
– Utility domain
The functions here provide the normal vehicle functionality such as:
• Controlling of motors, valves, wipers, lighting etc.
• Reading of important signals like safety loops, emergency stop etc.
• Detection of errors, so-called built-in tests (BIT).
• Logging of operational data.
These functions are managed by VCS. Communication within this domain is
done exclusively through CAN buses.
– Real-Time domain
Although VCS can be used in this domain, the primary choice is ECS. ECS is a big
brother to VCS, and is based on the VxWorks OS. The functions in this domain have
higher real-time demands than those in the utility domain. They include
ballistic calculations, which require more computational power than the 40 MHz
microcontrollers can offer. Therefore ECS runs on PowerPC machines at
266 MHz.
A fourth, safety-critical, domain could also be added to the system view. This domain
would contain safety-critical operations like drive-by-wire systems for electrical steering
or braking. These operations are today traditionally done mechanically or
hydraulically, but this will probably change in the future. VCS could then be an
alternative to commercial solutions.
3.2 VCS Core
Inside a VCS node the software can be divided into layers according to Figure 3.3. At
the bottom is the electronic hardware, such as signals, clocks and communication
ports. Then comes the Hardware Abstraction Layer (HAL), which provides standardized
methods to access different hardware. Above the HAL comes VCS Core, which
contains a small Real-Time Operating System (RTOS) and a configuration part. The
main task of VCS Core is to provide common services and functions like I/O, monitoring,
logging and CAN protocols. The configuration part makes it easy to configure
nodes for different applications. VCS Core is written in a subset(1) of ANSI C and designed
to be platform and compiler independent. This is possible because all hardware-specific
software is located in the HAL. It is also possible to simulate one or more VCS nodes
on an ordinary PC; the real CAN drivers are then replaced by drivers that communicate
through shared memory.
Figure 3.3: Inside a VCS node
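The way a HAL lets a real driver be exchanged for a simulated one can be sketched
with a table of function pointers. This is only an illustration of the layering idea
and not the actual VCS Core HAL: the types can_frame_t and can_driver_t, the driver
can_sim_driver and the ring buffer that stands in for the shared memory are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CAN frame and driver interface; not the actual VCS Core HAL. */
typedef struct {
    uint16_t id;
    uint8_t  len;
    uint8_t  data[8];
} can_frame_t;

typedef struct {
    int (*send)(const can_frame_t *frame);   /* returns 0 on success */
    int (*receive)(can_frame_t *frame);      /* returns 0 when a frame was read */
} can_driver_t;

/* Simulated driver: a static ring buffer stands in for the shared memory. */
#define SIM_SLOTS 8u
static can_frame_t sim_ring[SIM_SLOTS];
static uint8_t sim_head;
static uint8_t sim_count;

static int sim_send(const can_frame_t *frame)
{
    int status = -1;

    if (sim_count < SIM_SLOTS) {
        sim_ring[(sim_head + sim_count) % SIM_SLOTS] = *frame;
        sim_count++;
        status = 0;
    }
    return status;
}

static int sim_receive(can_frame_t *frame)
{
    int status = -1;

    if (sim_count > 0u) {
        *frame = sim_ring[sim_head];
        sim_head = (uint8_t)((sim_head + 1u) % SIM_SLOTS);
        sim_count--;
        status = 0;
    }
    return status;
}

/* Code above the HAL only sees this table, so the driver can be swapped. */
const can_driver_t can_sim_driver = { sim_send, sim_receive };
```

Code in the upper layers would call can_sim_driver.send() and
can_sim_driver.receive() without knowing whether a CAN controller or a simulation
sits behind the table.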
(1) A subset specially developed for using C in safety critical systems, called MISRA C.
3.3 Rubus OS
The innermost part of VCS Core is the Real-Time operating system Rubus OS,
made by Arcticus Systems AB. As in all Real-Time operating systems, time plays an
essential role in Rubus. Typically a Real-Time operating system must react to external
devices within a very constrained time period. Application operations running in
Real-Time systems can be divided into two groups: hard Real-Time and soft Real-Time. In
hard Real-Time it is critical for the application operation to meet its deadline. In soft
Real-Time it is not critical if a deadline is missed, but it is still not desirable. Because
application operations have different demands, Rubus OS provides three categories of
run-time services:
– Green Run-Time Services
External event triggered execution (interrupts).
– Red Run-Time Services
Time triggered execution, used by applications whose deadlines are critical to the
operation (Hard Real-Time).
– Blue Run-Time Services
Internal event triggered execution, for applications whose deadlines are not critical
to the operation (Soft Real-Time).
Figure 3.4: Rubus overview
As seen in Figure 3.4, Rubus OS contains three kernels:
– Red Kernel
– Green Kernel
– Blue Kernel
Each kernel is responsible for its corresponding run-time service. The Red Kernel
manages the execution of time triggered red threads. Red threads are statically
scheduled before run-time. The Blue Kernel handles execution of internal event triggered
blue threads. Scheduling of blue threads is done during run-time. The Green Kernel manages
the execution of interrupt triggered green threads.
3.3.1 The Red Kernel
The Red Kernel contains services related to time driven execution. Its main task is to
dispatch threads according to the current schedule and the current time. The schedule
defines when threads are to be executed. When the dispatcher reaches the end of the
schedule it starts over from the beginning. This schedule can be illustrated as a clock.
In Figure 3.5, Thread A will execute in slot 0, and Threads B and C in slots 4 and 5. When
the clock has completed one rotation it will start over. Several schedules can co-exist in
the system, and the Red Kernel contains functionality for switching between them. The
schedules are statically defined before run-time.
Figure 3.5: Clock scheduler in the red kernel
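The clock-like schedule can be sketched as a constant table that a dispatcher walks
through, wrapping around at the end. This is a minimal illustration of time triggered
dispatching in general and not the Rubus implementation. The schedule length of eight
slots is an assumption (the text does not state the number of slots), and the names
red_dispatch, schedule, trace and trace_len are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCHEDULE_SLOTS 8u   /* assumed length; the slot count is not stated */

typedef void (*thread_fn_t)(void);

/* Trace of executed threads, only here so the example can be checked. */
char trace[32];
size_t trace_len;

static void record(char name)
{
    if (trace_len < (sizeof(trace) - 1u)) {
        trace[trace_len] = name;
        trace_len++;
    }
}

static void thread_a(void) { record('A'); }
static void thread_b(void) { record('B'); }
static void thread_c(void) { record('C'); }

/* Static schedule, fixed before run-time: A in slot 0, B in slot 4, C in slot 5. */
static const thread_fn_t schedule[SCHEDULE_SLOTS] = {
    thread_a, NULL, NULL, NULL, thread_b, thread_c, NULL, NULL
};

/* Called once per tick: dispatch the thread in the current slot, if any. */
void red_dispatch(uint32_t tick)
{
    const thread_fn_t fn = schedule[tick % SCHEDULE_SLOTS];

    if (fn != NULL) {
        fn();
    }
}
```

Calling red_dispatch() for consecutive ticks runs A, B and C in their slots and then
repeats the same pattern on the next rotation. Slots without a thread are the time
that is left over for other work.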
3.3.2 The Blue Kernel
The Blue Kernel contains management of blue threads and services for communication
between threads such as synchronization and message passing. Synchronization is done
with message queues, signals and mutexes according to the POSIX standard. The Blue
Kernel is a traditional event triggered kernel with a preemptive scheduling algorithm,
which means that each thread runs for at most a fixed amount of time. If the
thread is still running at the end of its given time it will be suspended and the scheduler
will select another thread to run [19]. The Blue Kernel provides different priorities for
blue threads and guarantees that the thread with the highest priority among the ready
threads will be executed first. Like everything else in Rubus, the stack is allocated
statically and its size cannot be altered at run-time. Therefore the
stack must be dimensioned for the maximum stack usage. A blue thread can be in
one of the following states:
– Dormant
The thread is terminated or not initialized.
– Ready
The thread is ready to execute.
– Blocked
The thread is blocked waiting for a signal.
– Running
The thread is running.
Legal state transitions can be seen in Figure 3.6.
Figure 3.6: Legal state transitions
There are 16 thread priority levels, from 0 to 15, where 15 is reserved for the Blue
Kernel Thread and 0 is used (but not reserved) by the Blue idle thread. Each priority
level has an internal thread FIFO list which is scheduled in a Round-Robin manner.
Within this list the threads are ordered by the time they arrived.
Figure 3.7: Blue thread scheduling in Rubus
Blue threads are scheduled during run-time and are executed when no red threads
are running, that is, either when no red threads are scheduled or when the red threads
do not use all the time that was given to them. When it is time for a blue thread to execute,
the thread with the highest priority among all blue threads that are ready will be selected.
If a blue thread is executing and a higher priority thread becomes ready, the running
thread will be preempted and moved to the Ready state.
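The selection rule described above, highest priority first and FIFO order within a
priority level, can be sketched as follows. This is an illustration under stated
assumptions and not the Rubus scheduler: the functions make_ready and pick_next, the
per-level capacity QUEUE_CAPACITY and the marker NO_THREAD are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PRIORITY_LEVELS 16u   /* priorities 0..15, as in the text */
#define QUEUE_CAPACITY  4u    /* hypothetical limit per priority level */
#define NO_THREAD       0xFFu

/* One FIFO of ready thread ids per priority level. */
typedef struct {
    uint8_t ids[QUEUE_CAPACITY];
    uint8_t head;
    uint8_t count;
} ready_queue_t;

static ready_queue_t ready[PRIORITY_LEVELS];

/* Appends a thread to the FIFO of its priority; returns 0 on success. */
int make_ready(uint8_t thread_id, uint8_t priority)
{
    int status = -1;

    if (priority < PRIORITY_LEVELS) {
        ready_queue_t *q = &ready[priority];
        if (q->count < QUEUE_CAPACITY) {
            q->ids[(q->head + q->count) % QUEUE_CAPACITY] = thread_id;
            q->count++;
            status = 0;
        }
    }
    return status;
}

/* Removes and returns the oldest ready thread of the highest non-empty level. */
uint8_t pick_next(void)
{
    uint8_t result = NO_THREAD;
    uint8_t level = (uint8_t)PRIORITY_LEVELS;

    while ((level > 0u) && (result == NO_THREAD)) {
        ready_queue_t *q;

        level--;
        q = &ready[level];
        if (q->count > 0u) {
            result = q->ids[q->head];
            q->head = (uint8_t)((q->head + 1u) % QUEUE_CAPACITY);
            q->count--;
        }
    }
    return result;
}
```

Round-Robin behaviour within a level follows if a thread whose time has run out is
put back with make_ready(), which places it behind the other ready threads of the
same priority.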
3.3.3 The Green Kernel
The Green Kernel handles interrupt processing using Green Threads. The dispatch unit
in the Green Kernel is the target processor's interrupt logic. Therefore the behavior is
target dependent. If an interrupt occurs, a Green Thread will preempt the execution of
both Red and Blue Threads.
3.3.4 Rubus Visual Studio
Arcticus Systems has developed a configuration tool, Rubus Visual Studio, to ease the
development of dependable Real-Time systems based on Rubus OS. All threads in VCS Core
are created before run-time, and Rubus Visual Studio provides an interface which makes
it easy to add and remove such threads.
Figure 3.8: Rubus Visual Studio
Chapter 4
Software Development For
Safety Critical Systems
4.1 Introduction
Failures are not desirable in any system. But the consequences of something going wrong
are not equal for all systems. If the software in a desktop computer crashes, a couple of
working hours may be lost, while a software failure inside an aircraft’s fly-by-wire system
could lead to lost lives. In other words, there are systems where software failures are
acceptable at some level and there are systems where failures are unacceptable [10].
Safety-critical systems are those systems where failures are unacceptable; a failure in
such a system could result in loss of life, significant property damage or damage to the
environment. Traditional areas where these systems can be found include:
– Military, e.g. weapon systems.
– Space programs.
– Industry, e.g. manufacturing control where toxic substances are involved, and robots.
– Transport, e.g. fly-by-wire systems in aircraft, aircraft control, and interlocking
systems for trains.
– Medical devices.
– Nuclear power plant control.
A failure in these areas can directly put human lives in danger. Many
safety critical systems are embedded systems, and many also have Real-Time demands. In
a Real-Time system a correct but delayed response can have equally serious consequences
as an incorrect one. Imagine a system controlling the air-bag in a car: if the air-bag is
triggered too late it will do more harm than good. This chapter will focus on the software
aspect of safety, and discuss desirable properties of a programming language. Safety
critical systems are often, but not necessarily, embedded systems or Real-Time systems.
Therefore the aspects of both embedded and Real-Time systems will be considered.
4.2 System safety
It is important to realize that computers are not unsafe by themselves. They rarely
explode, catch fire or cause any physical harm. Computers can only indirectly cause
accidents, and safety must therefore be evaluated in the context of the whole system.
System safety is often described in terms of hazards and risks. Leveson [12] defines
hazard and risk as:
A hazard is a state that can lead to an accident given certain environmental
conditions.
A risk is defined as a function of:
1. The likelihood of a hazard occurring.
2. The likelihood that the hazard will lead to an accident.
3. The worst possible potential loss associated with such an accident.
An example of a hazard would be brakes failing in a motor vehicle. It will not
necessarily cause an accident, but it is certainly a state that can lead to an accident. If
system safety were defined by accidents or catastrophic failures instead of hazards and
risks, most systems could be considered safe.
In this definition, software itself cannot directly cause accidents but only
contribute to hazards. However, because software can contribute to system hazards,
minimizing software flaws will reduce or prevent accidents.
It is also important to separate safety from reliability. Reliability is often measured
in up-time or availability of a system. A safe system can fail frequently provided that
it fails in a safe way. A reliable system may not fail often, but it makes no guarantees
about what happens when it fails. Systems can be safe but not reliable. A railroad
signalling system may be very unreliable but still safe if every failure ends with the
signal showing ’stop’. Similarly, a system can be reliable but unsafe. Reliability can also
be achieved at the cost of safety when errors are ignored and systems are allowed to
continue for the sake of availability. Place and Kang [17] point out that safety software
is not required to be perfect. The code may contain errors as long as the consequences
of the errors do not lead to hazards.
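The railroad signal example can be expressed as a small piece of fail-safe logic in
which ’stop’ is the default and ’proceed’ is shown only when everything is known to
be in order. This is a sketch for illustration only; the function signal_aspect and
its two inputs are hypothetical and do not describe any real signalling system.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { SIGNAL_STOP = 0, SIGNAL_PROCEED = 1 } signal_t;

/* Hypothetical inputs: whether the track ahead is reported clear and
 * whether the detection equipment itself passed its self-test. */
signal_t signal_aspect(bool track_clear, bool equipment_ok)
{
    signal_t aspect = SIGNAL_STOP;   /* safe default */

    /* 'proceed' is shown only when everything is known to be fine;
     * any failure or doubt leaves the signal at 'stop'. */
    if (equipment_ok && track_clear) {
        aspect = SIGNAL_PROCEED;
    }
    return aspect;
}
```

A system built this way may stop trains more often than necessary, which makes it
less available, but an equipment failure never results in a ’proceed’ signal.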
4.3 Software failures: What can go wrong?
Today, flaws in software products are common. Software patches and updates are
constantly released to correct previous problems. People have come to accept that software
fails once in a while. However, in safety critical systems there is no acceptance of failure.
Unfortunately, safety critical systems are not completely spared from software failures.
In this section a few examples are presented, in all of which the failure was caused by
software errors.
4.3.1 Therac-25
One of the best known software-related safety failures occurred in a radiation therapy
treatment device called Therac-25. It was developed by Atomic Energy of Canada
Limited (AECL) in the late 1970s, and in 1982 the cancer treatment machine was ready.
Therac-25 was, like its predecessors Therac-6 and Therac-20, controlled by a DEC PDP-11
computer. In Therac-6 and Therac-20, which relied on hardware safety features
and interlocks, the software played a minor part. However, the software in Therac-25
had more responsibility for maintaining system safety [11]. Due to software flaws
in Therac-25, six patients received an overdose of radiation. The accidents occurred
between 1985 and 1987, when Therac-25 was recalled. Three of the accidents had a
lethal outcome.
4.3.2 Ariane 5
Ariane 5 is a European launch system designed to deliver satellites and other payloads
into space. On June 4, 1996, after 10 years of development and a cost of €7 billion, Ariane
5 was finally ready for its first test flight (flight 501). The launch went well and Ariane
5 followed a normal trajectory for 37 seconds. But shortly after, it suddenly veered off
its flight path, broke up, and exploded. A data conversion from 64-bit floating point
to 16-bit integer had caused a software exception. The floating point number had a
value greater than what could be represented by a 16-bit signed integer. This led to an
Operand Error exception. Due to efficiency considerations the software handler (in Ada
code) for this trap was disabled, although other conversions of comparable variables in
the same place in the code were protected. This software flaw was located in the Inertial
Reference System (SRI), which provides flight data to the On-Board Computer (OBC).
The OBC is responsible for executing the flight program and controlling the nozzles of
the solid boosters. Because of the software exception, the SRI transmitted erroneous
flight data to the OBC. The flight data resulted in full nozzle deflection of the boosters,
and the self-destruction mechanism was eventually triggered when the aerodynamic
loads became too high. More details about the Ariane 501 flight failure can be found in
[4].
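The kind of protection that was missing for this conversion can be illustrated with
a range check before the narrowing. The sketch below is written in C for consistency
with the rest of this report and is not the Ada code of the flight software; the
function name double_to_int16 is hypothetical. In C, converting a floating point
value that does not fit in the target integer type is undefined behaviour, so the
check has to be made before the cast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Converts a 64-bit float to a 16-bit signed integer.
 * Returns false and leaves *out untouched when the value lies outside
 * [INT16_MIN, INT16_MAX]. */
bool double_to_int16(double value, int16_t *out)
{
    bool ok = false;

    /* NaN fails both comparisons, so it is rejected as well. */
    if ((value >= (double)INT16_MIN) && (value <= (double)INT16_MAX)) {
        *out = (int16_t)value;   /* fractional part is truncated */
        ok = true;
    }
    return ok;
}
```

The caller is forced to decide what should happen when the function returns false,
for example to saturate the value or to switch to a degraded mode, instead of letting
an unhandled exception propagate.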
4.4 What is the problem?
Although accidents have occurred because of software failures, software that controls
safety-critical systems is here to stay. Software provides a number of advantages
over hardware. It is flexible and easy to modify, and it can provide a better
interface for users. With software, it is easier to simulate and test the whole or parts
of a system. So-called Built-In Tests (BIT), where software examines the system
status, also become feasible. Software is cheaper to reproduce than hardware,
although maintaining it can be more expensive. As systems become responsible for
more and more complicated tasks, the software controlling them is getting more
complex. Because of this complexity, the number of human design errors has increased
dramatically, and it is hard, if not impossible, to ensure the correctness of the
software. Before software was used in safety-critical systems, such systems were often
controlled by non-programmable electronic and mechanical devices. Parnas [15] argues that
analog systems, such as mechanical and hydraulic systems, are made from components that,
within a broad operating range, have an infinite number of stable states and whose
behavior can be adequately described by continuous functions. When a system is
described by continuous functions, small changes in input always cause
correspondingly small changes in output. This is not the case in a discrete system, where
a single bit change can have a huge impact. The mathematics of continuous functions is
well understood, and this, combined with testing to ensure that components are within
their operating range, leads to reliable systems.
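The point about discrete systems can be made concrete with a minimal C sketch. The helper name flip_bit is hypothetical; it simply inverts one bit of a 16-bit value so that the effect of a "small" change can be observed.

```c
#include <stdint.h>

/* Hypothetical helper: invert a single bit (position 0..15) of a 16-bit
 * unsigned value and return the result. */
uint16_t flip_bit(uint16_t value, unsigned bit)
{
    return (uint16_t)(value ^ (1u << bit));
}
```

Flipping bit 0 of the value 100 changes it by 1, while flipping bit 15 changes it by 32768. Both are single-bit changes, yet the effect on the output differs by several orders of magnitude.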
4.5 Language Selection
There are plenty of programming languages to choose from today; each has its own
benefits and is suited to different tasks. Java, with its byte code, is platform
independent, while C is more suitable for low-level programming.
A programming language cannot by itself guarantee the correctness of the
software. However, it can steer the programmer in the right direction. The philosophy
of C is to trust the programmer, while other languages are more restrictive. A quote
from Powell [6] describes this very well:
    C treats you like a consenting adult. Pascal treats you like a naughty child.
    Ada treats you like a criminal.
Especially when developing safety-critical systems, the choice of programming
language should be carefully considered. Some characteristics that generally improve the
safety of a programming language are:
– Strong compile-time checking
– Strong run-time checking
– Support for encapsulation and abstraction
– Exception handling
Compile-time checking evaluates the code against the syntactic and static semantic
rules of the language, such as its type rules. The more comprehensive this check is,
the more reliable programs become, because many common programming mistakes are
discovered early. Consider the Pascal code below:
program Test;
var
  x: integer;
  y: real;
  myArray: array[1..10] of integer;
begin
  y := 3.14;
  x := y;            { Compile error: incompatible types }
  myArray[11] := 4;  { Compile error: range check error }
end.
The code will not pass a Pascal compiler, since two rules are broken. However, the
corresponding C code would compile. Many strongly typed languages, such as Java, Ada
and Pascal, offer better compile-time checking and are therefore considered
"safer" than a language like C. Another benefit of compile-time checking is that
it adds no overhead at run time.
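As a rough illustration of the claim that the C counterpart compiles, the sketch below mirrors the first offending Pascal statement. The function name truncating_assignment is hypothetical. The out-of-range array write is only shown in a comment, because executing it would be undefined behavior in C.

```c
/* C counterpart of the Pascal fragment: the implicit conversion from a
 * floating point value to an integer is accepted by the compiler
 * (possibly with a warning, depending on the options used). */
int truncating_assignment(void)
{
    int x;
    double y;

    y = 3.14;
    x = y;   /* accepted: the fractional part is silently discarded */

    /* int myArray[10];
     * myArray[10] = 4;
     * An out-of-range write like this also typically compiles, but its
     * behavior is undefined, so it is deliberately left commented out. */

    return x;
}
```

The function returns 3: the fractional part of y is lost without any diagnostic being required by the language.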
As mentioned earlier, many errors can be prevented by better compile-time
checking, but not all errors can be discovered before run time. The next step is to provide
run-time checking. A typical example of run-time checking is array range checking.
Consider the following statement:
array[i] = 11;
In C, which has no run-time checking, the value 11 is written to the memory location
designated by the array and the index i. No implicit check of the index i is made.
If the corresponding code were written in Ada, Pascal or Java, the index i would be
checked at run time. This forces the compiler to generate some extra
code and therefore affects both code size and execution time. Because of the performance
loss, some developers dislike run-time checking even though it provides increased safety.
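In C, the range check that other languages generate implicitly has to be written by hand. The following is a minimal sketch under that assumption; the function name checked_store and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store value at arr[i] only if i is a valid index
 * for an array of len elements. Returns false if the index is rejected. */
bool checked_store(int *arr, size_t len, size_t i, int value)
{
    if (i >= len) {
        return false;   /* out of range: nothing is written */
    }
    arr[i] = value;
    return true;
}
```

The extra comparison is exactly the run-time cost discussed above; in C the programmer decides where to pay it.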
The ability to handle run-time errors, as well as other unusual states that can arise
in software, is important for achieving safety. A good way for a programming
language to support this is to provide exception handling. Exceptions are preferred over
error codes for several reasons. Checking return codes from functions, which is the normal
way in C, tends to clutter the code and make it harder to follow. Compare the two code
fragments below, the first using return codes and the second using exceptions:
retValue = function1( x, y );
if ( retValue < 0 )
{
    switch ( retValue )
    {
        case...
        case...
    }
}
retValue = function2( z, v );
if ( retValue < 0 )
{
    switch ( retValue )
    {
        case...
        case...
    }
}

try {
    retValue = function1( x, y );
    retValue = function2( z, v );
} catch (...)
{
    /* Handle exceptions */
}
With exception handling, the error handling code can be separated from the rest of
the code so that it does not obscure the actual program logic. Checking return codes is
a manual process: the programmer needs to remember to perform the check every time the
function is used. It is all too easy to circumvent the error handling system with return
codes.¹ With exception handling, exceptions have to be handled, or ignored explicitly,
by the programmer. This makes it harder to forget (or skip) error handling.
¹ When did you last check the return value of the printf() function in C?
Properties that make a programming language safer can also be more subtle than
those described so far. Depending on how the grammatical rules of a
language are formed, ordinary typing errors can more or less easily slip into the code and
cause unexpected behavior. For example, C uses '=' for assignment and '==' for
comparison; this notation can easily result in the typing mistake shown in statement
(1):
if ( a = b ) ( 1 )
if ( a == b ) ( 2 )
In C, the condition in statement (1) is true whenever b ≠ 0, regardless of the previous
value of a, while the condition in (2) is true only if a equals b. In Java, statement (1)
with integer operands would not compile, since a boolean expression is expected.
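The difference between the two statements can be checked with a short C sketch. The function names assign_then_test and compare are hypothetical and exist only to wrap statements (1) and (2).

```c
#include <stdbool.h>

/* Statement (1): b is assigned to a, and the assigned value is then used
 * as the condition. The previous value of a plays no role. */
bool assign_then_test(int a, int b)
{
    if ((a = b)) {   /* extra parentheses mark the assignment as intentional */
        return true;
    }
    return false;
}

/* Statement (2): a genuine comparison of a and b. */
bool compare(int a, int b)
{
    return a == b;
}
```

For a = 1 and b = 5 the first function reports true although the values differ, and for a = 0 and b = 0 it reports false although they are equal, which is why many compilers warn about an unparenthesized assignment used as a condition.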
If a language has features that are incompletely or ambiguously defined, the
compiler may interpret the code differently from what the programmer expects.
This is mainly an issue when using different compilers, but a single compiler can also
behave differently depending on context.
4.5.1 Java
The Java programming language was initially designed for use in embedded systems
[18]. Sun Microsystems was not satisfied with C and C++ for developing software for
electronic devices. C lacked support for object-oriented programming, and C++
was considered far too complex. Sun also believed that neither of them could provide
enough reliability. Sun's guidelines when designing Java were therefore that it should be
simpler and more reliable than C++. Although Java was designed for embedded systems,
no electronic products using Java reached the market in the beginning. Instead, Java
was found to be a useful tool when the World Wide Web became widely used. Today,
Java has matured and is widely used in many kinds of systems, including embedded
devices such as mobile phones, where Java has grown strong. Java has several features
that are suitable for safety software development:
– Strong typing
– Exception handling
– Run-time checking of null pointers, array ranges etc.
Although Java is used in many embedded systems, it has not been used to the same
extent in real-time systems. Many safety-critical systems belong to the real-time
domain, where predictability of execution is important. The automatic memory
management in Java leads to unpredictable latencies in execution. There is also some
unpredictability in synchronization [2]. Beyond the requirements for real-time systems,
embedded systems also need features such as low-level programming and the ability to
write interrupt procedures. This is a weak area of Java (by design), since such features
compromise both portability and safety. To address the problems with Java, Sun
Microsystems created a Real-Time Specification for Java (RTSJ).
4.5.2 Ada
Not surprisingly, Ada is considered the most suitable language for safety-critical
systems. Ada is the result of the most extensive and most expensive language design
effort that has ever been made [18]. It was standardized by the American National
Standards Institute in 1983, but the first truly usable Ada compilers did not appear
until two years later. Ada was developed to meet the high demands of the U.S. Department
of Defense (DoD) and to be used in its safety-critical systems. Although Ada received
some criticism in the beginning, especially from Hoare [7], who thought it was too
complex, it became the primary choice for programming safety-critical systems.
4.5.3 C
C is not known as a reliable programming language. The language is suitable for
low-level programming and is therefore popular for programming embedded systems.
There are C compilers available for most architectures, so C programs are very
portable. Some safety flaws in C have already been discussed, but there are also a
number of drawbacks when using C in real-time systems. C has no language support for
concurrency (multithreading), a serious drawback for real-time systems. The
programmer needs to use an external application programming interface or a real-time
operating system (RTOS), which compromises portability.
4.6 Conclusion
Writing software for safety-critical systems is not an easy task. Whatever the programming
language, reliable or not, it is still up to the programmer to implement the software
correctly. Implementation is only one part of the software development process. To achieve
software safety, each step from design to testing must be rigorously defined. It is also
important to use trusted development tools when developing the software;
the compiler itself can be considered safety critical.
Computer systems are already widely used in our society today, and more and more
applications will rely on computers in the future. Software systems will also be trusted
to a greater extent. Modern cars are already filled with small computers, but still no
car contains drive-by-wire technology (replacement of mechanical operating devices by
electrical signals). The car industry is a bit behind the aircraft industry, where many
modern planes are controlled by fly-by-wire systems. It is just a matter of time
before such systems are introduced in cars as well.
4.7 Development of safety critical software
In many areas the use of embedded systems is increasing rapidly. The software in these
systems handles more and more advanced tasks, and thus becomes bigger and more
complex. In some areas, including the motor vehicle industry, these systems handle
tasks that are more or less safety critical. An example of a relatively complex
safety-critical system used in cars is Adaptive Cruise Control (ACC). This system is similar
to conventional cruise control, which holds a preset speed without participation of the
driver. The extra feature this system provides is the ability to adjust speed automatically
to maintain a distance to the vehicle ahead. This is achieved through a radar headway
sensor, a digital signal processor and a longitudinal controller. If the lead vehicle slows
down, or if another object is detected, the system sends a signal to the engine or braking
system to decelerate. Then, when the road is clear, the system re-accelerates the
vehicle back to the preset speed. This kind of system involves interaction between many
vehicle control systems and is also a relatively new technique. A software system of
this complexity has a potential for problems. Interesting questions about safety-critical
systems are:
1. How can confidence in software-based systems be increased?
2. When is software ready for production, and how shall this be determined?
One organization dealing with these issues is MISRA, short for The Motor Industry
Software Reliability Association. MISRA is a collaboration between vehicle
manufacturers, component suppliers and engineering consultancies which seeks to promote best
practice in developing safety-related electronic systems in road vehicles. In 1994 MISRA
published guidelines for the development of software for vehicle-based systems. These
guidelines are meant to "provide important advice to the automotive industry for the
creation and application of safe, reliable software within vehicles" [3]. The list below
shows what these guidelines include.
– guidance for creating contracts and specifications for software procurement.
– an introduction to issues of automotive software reliability.
– a basis for training requirements within the automotive industry.
– guidance for company quality procedures.
– guidance for management on resource requirements.
– a basis for assessment.
– a foundation for a standard.
The vehicle manufacturer is responsible for the safety of its product, so the
manufacturer decides which development process to use. The techniques described in
section 4.7.1 are approaches proposed by MISRA and give examples of
techniques that address question 1 above. The second question is discussed in section 4.7.2.
4.7.1 Correctness of software based systems
The MISRA guidelines give recommendations covering the whole software development
process. One important aspect of the development process is the hazards involved in the
system. These must be both understood and taken into consideration at an early stage. Doing
this makes it possible to take design actions that can reduce these risks. It is also
important to document the reasoning behind the design actions taken. Therefore the
first step in the process is to make a hazard analysis. For this analysis to be accurate,
a model, or approximation, of the system must be produced. Often the system can be
associated with many potential hazards which can lead to human injury, but most of
these hazards are limited to specific physical situations. The model should therefore
have a well-defined boundary between the system itself and the potential situations that
can lead to hazards. The boundaries between components within the system itself,
as well as the boundaries between the system and other subsystems, shall also be defined. A
model shall consist of:
– Components.
– The interconnections between the components.
– The boundaries.
The level of detail chosen for defining a component depends on what type of system
the model shall describe. Each component has some sort of interface to other
components in the system, which shall be included in the model. The interface shall describe
how components communicate with each other, and the format of this interface depends
on what type of component is being described. For a system in a car, most hazards
will be related to the movement of the whole vehicle. Therefore an important boundary
to include is the one between the vehicle and its environment. Examples of the
environment are the driver, passengers, the road, and other vehicles. Possible interactions
with the environment are:
– Inputs - All devices that a driver can use to control signals to the system.
– Outputs - All displays and warning systems for the driver.
– Physical properties - Physical materials used etc.
When the model is complete, a Preliminary Hazard Analysis (PHA) can be
performed. One technique used for the analysis is Failure Mode & Effects Analysis (FMEA),
which is often called a 'what if?' analysis. This analysis begins by identifying possible
hazards. Each boundary is considered in turn, and the hazards related to its environment
are considered systematically. FMEA considers both hazards related to a system
failure and hazards that arise when the system is working as intended. In complex systems,
there can be a chain of events that finally leads to a hazard. This means that other
systems, which can be affected by a fault of the system being developed, must also be
considered. Finding all possible hazards is no easy task, but a well-performed hazard
analysis will ultimately lead to a safer system.
Once the hazard analysis is complete, the system's Safety Integrity Level (SIL) must
be determined. This value, between 0 and 4, classifies the hazards according to their
severity and is needed to determine the development process. A system with a low SIL
does not have to be designed, documented and built as thoroughly as a system with
a high SIL. There are many techniques to determine the Safety Integrity Level for a
system, but most of them consider systems in static environments. The non-static
environment of vehicle systems is practically infinite. Aspects like the skill of the driver
and weather conditions must be considered. Therefore the MISRA Guidelines have adopted
the concept of controllability as a means of determining safety integrity levels for
systems which do not have a static environment [13].
Controllability measures the degree of loss of control that a fault causes. The greater
the loss of control, the more serious the failure. Table 4.1 shows how the different levels
of controllability are defined.
Controllability level   Definition

Uncontrollable          This relates to failures whose effects are not
                        controllable by the vehicle occupants, and which
                        are most likely to lead to extremely severe
                        outcomes. The outcome cannot be influenced by a
                        human response.

Difficult to control    This relates to failures whose effects are not
                        normally controllable by the vehicle occupants but
                        could, under favorable circumstances, be
                        influenced by a mature human response. They are
                        likely to lead to very severe outcomes.

Debilitating            This relates to failures whose effects are usually
                        controllable by a sensible human response and,
                        whilst there is a reduction in safety margin, can
                        usually be expected to lead to outcomes which
                        are at worst severe.

Distracting             This relates to failures which produce operational
                        limitations, but a normal human response will
                        limit the outcome to no worse than minor.

Nuisance only           This relates to failures where safety is not
                        normally considered to be affected, and where
                        customer satisfaction is the main consideration.

Table 4.1: Definitions of controllability levels
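How a controllability category could be turned into an integrity level can be sketched in C. The mapping used here, from SIL 0 for "Nuisance only" up to SIL 4 for "Uncontrollable", is an assumption made for illustration; the authoritative assignment should be taken from the MISRA guidelines [13]. The type and function names are hypothetical.

```c
/* Controllability categories from Table 4.1, from least to most severe. */
typedef enum {
    CTRL_NUISANCE_ONLY,
    CTRL_DISTRACTING,
    CTRL_DEBILITATING,
    CTRL_DIFFICULT_TO_CONTROL,
    CTRL_UNCONTROLLABLE
} controllability_t;

/* Assumed mapping from controllability category to Safety Integrity Level.
 * Returns -1 for a value outside the enumeration. */
int sil_from_controllability(controllability_t c)
{
    int sil;

    switch (c) {
    case CTRL_NUISANCE_ONLY:        sil = 0;  break;
    case CTRL_DISTRACTING:          sil = 1;  break;
    case CTRL_DEBILITATING:         sil = 2;  break;
    case CTRL_DIFFICULT_TO_CONTROL: sil = 3;  break;
    case CTRL_UNCONTROLLABLE:       sil = 4;  break;
    default:                        sil = -1; break;
    }
    return sil;
}
```

Keeping the mapping in one function means that a single place has to be reviewed if a project uses a different assignment.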
After these steps are completed and the design phase begins, the Detailed Safety
Analysis (DSA) is performed. The objectives of the DSA are to [9]:
1. Confirm the findings of the PSA (Preliminary Safety Analysis) or PHA (Preliminary
Hazard Analysis).
2. Identify any additional hazards that may have been introduced as a result of the
design used.
3. Identify the possible causes of each hazard.
4. Confirm the allocation of SILs (Controllability).
5. Predict the frequency with which a particular failure may occur.
6. Identify the degree to which the system can accommodate any fault.
Two techniques are commonly used for the DSA: Design Failure Mode & Effects
Analysis (Design FMEA) and Fault Tree Analysis (FTA). Design FMEA is a bottom-up
technique that starts at a fault and ends up at its resultant effect. Each fault is
classified using three parameters: severity, occurrence and detection. Each of these
parameters is given a score between 1 and 10 (from not critical to extremely critical).
The severity is the same as controllability, except that a different scale is used.
Occurrence is hard to estimate for systematic software faults, because there is no
reliable data available as there is for random faults in hardware. One option is to use
the measured Mean Time Between Failures for software from similar systems. The detection
parameter describes the system's ability to detect a fault and the degree to which the
risk associated with that fault can be reduced by the system.
Fault Tree Analysis is a top-down approach. It works through a hierarchy which
gradually refines the hazard through a series of sub-systems down to the failures of
individual components. It can be useful to use both techniques to get an independent
check and to provide confidence that nothing has been missed. This technique is also
useful for design engineers when communicating with those who write service and repair
documentation.
4.7.2 When is Software Ready for Production?
Vehicle manufacturers are constantly trying to improve their products to get an
edge on their competition. This leads to the creation of new applications in vehicles, with
increasing complexity. Most features in modern vehicles are controlled by some
software-based system, and in the future these systems will most certainly control more
areas and be much more complex. As the complexity of control systems increases,
so does the complexity of the interactions between the systems. To determine that the final
product meets the requirements and the safety analysis, there must be some guidelines that
manufacturers can follow.
In 1995 a joint venture of Ford, General Motors and Chrysler published the QS9000
[5] document set. The documents are an interpretation of ISO9001, which is an
international standard that gives requirements for an organization's Quality Management
System. QS9000 extends these requirements to be more specific about how they should
be met. It consists of a suite of documents, and two of them, Advanced Product Quality
Planning (APQP) and Production Part Approval Process (PPAP), will be discussed
here. Another document for developing safety-critical software is MISRA-C:2004.
These coding guidelines, published by MISRA, will also be mentioned in this section.
The Advanced Product Quality Planning document provides a structured approach
for the implementation of a Product Quality Plan, which supports the development
of a product that satisfies the customers. It is based on a vehicle design lifecycle and
defines what has to be completed at the end of each phase of the development to ensure
the quality of the product. It is impossible to measure the quality of the code directly;
therefore confidence in a product is achieved by examining how well the APQP is followed.
It is very important to have well-described criteria for when a stage is complete. These
criteria may include requirements documents, functional specifications, design
documents, checklists, the final code and so on. A company can evolve the APQP process for
each new project and strive for improvement in its products. When the quality plan
is complete, the PPAP process can be applied.
The PPAP document defines an approach to determine whether all customer engineering
design record and specification requirements are properly understood by the supplier [14].
A product must be approved by the PPAP process in order to be complete. The process
must prove that the product can be produced while meeting all requirements stated within
APQP.
MISRA has recently (March 2006) published a document with the title Software
Readiness for Production (SRfP). This document describes metrics that can be used for
tracking the progress of software projects towards the goal of production readiness. The
method is inspired by the ideas in Advanced Product Quality Planning and Production
Part Approval Process mentioned above. The purpose of the SRfP process is to
make these approaches more suitable within a software context.
The C programming language is widely used in embedded systems because of
features such as easy access to hardware, low memory requirements, and
efficient run-time performance. But the language also has its drawbacks, such as highly
limited run-time checking and a syntax that is prone to mistakes that are technically
legal. In 1998 MISRA published MISRA C to support the language requirements of the
1994 MISRA Guidelines. The document became widely used, and in 2004 MISRA
published a new version: Guidelines for the Use of the C Language in Critical Systems,
also called MISRA-C:2004. This set of rules, or guidelines, helps developers to write
safer and more portable code.
Chapter 5
TCP/IP
5.1 Overview
The TCP/IP protocol suite has become the standard for computer communications in
today's networked world, mainly because of its simplicity and power. TCP/IP stands for
'Transmission Control Protocol (TCP)/Internet Protocol (IP)', which are the
names of the two most important protocols in this suite. The protocol suite was
constructed to enable communication between hosts on different networks. Therefore one
important aspect is the creation of an abstraction for the communication mechanisms
provided by each type of network. TCP/IP hides the architecture of the physical network
from the developer of the network application. Like much other networking software,
TCP/IP is modeled in layers. This architecture has many advantages, such as ease of
implementation and testing and the ability to substitute alternative layer implementations.
Each layer communicates with those above and below through a well-defined interface, and
each layer has its own well-defined tasks. This chapter explains each layer in the
protocol suite, starting at the top and working its way down.
5.2 Protocol
    A protocol defines the format and the order of messages exchanged between
    two or more communicating entities, as well as the actions taken on the
    transmission and/or receipt of a message or other event.
    [Kurose & Ross, Computer Networking]
5.3 Requirements
In this section each layer is explained briefly, to give an overview of what the individual
layers contain.
5.3.1 Application layer
The application layer is provided by the program that uses TCP/IP for communication.
Examples of applications are Telnet and the File Transfer Protocol (FTP). The interface
between the application and transport layers is defined by a so-called socket, which
will be explained in section 5.4.
5.3.2 Transport layer
The transport layer is responsible for transferring data from one application to its remote
peer. Multiple applications can use the layer simultaneously. The
most used transport protocol is the Transmission Control Protocol (TCP), which provides
reliable data delivery. Another commonly used protocol is the User Datagram Protocol
(UDP), which provides unreliable data delivery.
5.3.3 Network layer
This layer works as an abstraction of the physical network architecture below it. The
Internet Protocol (IP) is the most important protocol in this layer. IP does not provide
reliability or error recovery; this is left to higher layers. Other network layer protocols
are ICMP, IGMP, ARP and RARP.
5.3.4 Network interface layer
The network interface layer works as the interface to the actual network hardware.
TCP/IP does not specify any protocol here, but can use almost any network interface
available, which illustrates the flexibility of the network layer.
5.4 Socket
As the network layer provides a host-to-host delivery service, it must be possible to
direct data to a certain process running on a host. This process-to-process service is
accomplished with ports and sockets. A socket can be seen as a door to the process. When
data is to be received or sent, a socket has to be used. As stated earlier, the socket
belongs to the transport layer and provides an interface to the application layer. This
section explains how the process-to-process service is accomplished.
A socket is composed of a port and an IP address. In programming, a port is a
"logical connection place", and is simply a 16-bit integer. On an ordinary computer,
there are therefore 65,535 available ports (port 0 is reserved). The port number is used
for identifying the correct application process. Ports are usually divided into two
types: well-known ports and ephemeral ports. The well-known ports range from 1 to 1023
and are used for certain server applications. For example, Telnet uses port 23 and the
File Transfer Protocol (FTP) uses port 21. The reason for well-known ports is to allow
clients to find servers without configuration information. Ephemeral port numbers range
from 1024
to 65,535. These ports are free to use by any application as long as the combination of
[transport protocol, IP address, port number] is unique. The IP address is used for the
host-to-host delivery service.
When a segment (a packet of data) arrives, the header is examined. The header
provides information about which protocol the packet belongs to as well as the destination
port number. With this information the correct socket can be found, and thereby the
correct process. The job of examining the header and delivering the segment to the correct
socket is called demultiplexing. The job of gathering data from a socket, encapsulating
it into a segment with header information and sending it to the network layer is called
multiplexing.
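Demultiplexing as described above amounts to a lookup keyed on [transport protocol, IP address, port number]. The sketch below assumes a simple linear socket table; the type socket_entry_t, its fields and the function demux are hypothetical names and are not taken from the stack described in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical socket table entry, identified by the triple
 * [transport protocol, IP address, port number]. */
typedef struct {
    uint8_t  protocol;    /* IP protocol number, e.g. 6 = TCP, 17 = UDP */
    uint32_t local_ip;    /* IPv4 address as a 32-bit value */
    uint16_t local_port;  /* port number */
    int      owner;       /* hypothetical identifier of the owning process */
} socket_entry_t;

/* Find the socket that an arriving segment belongs to.
 * Returns NULL if no socket matches, in which case the segment is dropped. */
const socket_entry_t *demux(const socket_entry_t *table, size_t count,
                            uint8_t protocol, uint32_t dst_ip,
                            uint16_t dst_port)
{
    size_t i;

    for (i = 0u; i < count; i++) {
        if ((table[i].protocol == protocol) &&
            (table[i].local_ip == dst_ip) &&
            (table[i].local_port == dst_port)) {
            return &table[i];
        }
    }
    return NULL;
}
```

Because the protocol is part of the key, a TCP socket and a UDP socket can share the same port number without conflict.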
5.5 Network Layer Protocols
5.5.1 Internet Protocol (IP)
The Internet Protocol (IP) was designed in 1981 to be used in interconnected systems of
packet-switched computer communication networks. IP provides a format for
transmitting blocks of data (datagrams) from sources to destinations. These sources and
destinations are called hosts and are identified by an IP address. A network usually
consists of a number of hosts connected to each other through a router. A host sends a
packet over a link; this link can be physical, like a cable, but also wireless. The boundary
between the link and the host is called an interface. A host is normally connected to a
single link, while a router is connected to several. A router's job is to receive a datagram
from one link and pass it on to another link. An IP address is bound to an interface,
which means that a router needs multiple IP addresses, one for each interface. Figure
5.1 shows an example of an IP network.
Figure 5.1: An example of a network
Unlike many other host-to-host protocols, IP has no mechanisms for reliability,
flow control, sequencing etc. [RFC 791]. All such functionality is left for higher-level
protocols to implement. The only error detection mechanism is a checksum over the IP
header. If the checksum verification fails, the datagram is discarded and is not
delivered to a higher-level protocol. The main functions of IP are addressing and
packet fragmentation.
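The header checksum is a 16-bit one's-complement sum over the header, as specified in RFC 791. The sketch below shows one common way to compute and verify it; the function names ip_checksum and ip_header_ok are hypothetical and this is not the implementation used in the stack described in this report.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the 16-bit one's-complement checksum over len bytes of header.
 * The bytes are treated as big-endian 16-bit words. When building a header,
 * the checksum field must be zero while the checksum is computed. */
uint16_t ip_checksum(const uint8_t *hdr, size_t len)
{
    uint32_t sum = 0u;
    size_t i;

    for (i = 0u; (i + 1u) < len; i += 2u) {
        sum += ((uint32_t)hdr[i] << 8) | (uint32_t)hdr[i + 1u];
    }
    if (i < len) {
        sum += (uint32_t)hdr[i] << 8;  /* odd trailing byte, padded with zero */
    }
    while ((sum >> 16) != 0u) {
        sum = (sum & 0xFFFFu) + (sum >> 16);  /* fold the carries back in */
    }
    return (uint16_t)(~sum & 0xFFFFu);
}

/* A received header is valid if the checksum computed over the whole
 * header, including the stored checksum field, is zero. */
int ip_header_ok(const uint8_t *hdr, size_t len)
{
    return ip_checksum(hdr, len) == 0u;
}
```

On reception, a datagram for which ip_header_ok returns 0 would be discarded, as described above.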
5.5.2 Addressing
Today there are two standards of IP addressing, version 4 (IPv4) and the newer version 6
(IPv6). IPv6 is meant to replace IPv4 in the future, but today most IP traffic
still uses IPv4 addressing. In IPv4, each address is 32 bits long, which means there
are 2^32 unique addresses, or about 4.3 billion. This may seem a lot, but in a world with
6 billion people the addresses are soon used up. The addresses are typically written
in dotted-decimal notation, like 192.168.0.3; the same address in binary notation
is:
11000000 10101000 00000000 00000011
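The relation between the dotted-decimal and the binary form can be sketched in C. The function names ipv4_from_octets and ipv4_to_binary are hypothetical helpers introduced only for this illustration.

```c
#include <stdint.h>

/* Combine the four decimal fields of a dotted-decimal address a.b.c.d
 * into one 32-bit value, most significant field first. */
uint32_t ipv4_from_octets(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Write the address as four groups of eight binary digits separated by
 * spaces: 32 digits + 3 spaces + terminating NUL = 36 characters. */
void ipv4_to_binary(uint32_t addr, char out[36])
{
    int pos = 0;
    int bit;

    for (bit = 31; bit >= 0; bit--) {
        out[pos++] = (((addr >> bit) & 1u) != 0u) ? '1' : '0';
        if ((bit % 8 == 0) && (bit != 0)) {
            out[pos++] = ' ';
        }
    }
    out[pos] = '\0';
}
```

For 192.168.0.3 the combined value is 0xC0A80003, and the second function reproduces the binary string shown above.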
Class  Prefix  Range                          Networks  Hosts
A      0       1.0.0.0 - 127.255.255.255      2^7       2^24
B      10      128.0.0.0 - 191.255.255.255    2^14      2^16
C      110     192.0.0.0 - 223.255.255.255    2^21      2^8
Table 5.1: Address classes in IPv4
Four classes of addresses were defined in the original IPv4, and an additional fifth
class was reserved for future use. The fourth class, dedicated to so-called multicast
addressing, is no longer considered a formal part of IPv4 addressing. The first bits
of the IP address determine which class it belongs to, see Table 5.1. Address 192.168.0.3
begins with 110 in binary notation and therefore belongs to class C.
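Determining the class from the leading bits, as in Table 5.1, can be sketched as follows. The function name ipv4_class is hypothetical, and only the three classes listed in the table are distinguished.

```c
#include <stdint.h>

/* Classify a 32-bit IPv4 address by its leading bits (Table 5.1).
 * Returns 'A', 'B' or 'C', or '?' for addresses outside those classes
 * (multicast and reserved ranges). */
char ipv4_class(uint32_t addr)
{
    if ((addr & 0x80000000u) == 0x00000000u) {
        return 'A';   /* prefix 0   */
    }
    if ((addr & 0xC0000000u) == 0x80000000u) {
        return 'B';   /* prefix 10  */
    }
    if ((addr & 0xE0000000u) == 0xC0000000u) {
        return 'C';   /* prefix 110 */
    }
    return '?';
}
```

The address 192.168.0.3 corresponds to the value 0xC0A80003, whose three leading bits are 110, so the function returns 'C'.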
This partitioning into classes, with the network and host portions aligned on byte
boundaries, turned out to be problematic. Consider an organization with 2000 hosts. A
class C network would not be sufficient; instead a class B network with room for 65,536
hosts would have to be used. This not only led to poor utilization of the given address
space but also to depletion of class B addresses. The solution to this problem was
Classless Interdomain Routing (CIDR), which was standardized in 1993 by the Internet
Engineering Task Force (IETF). Instead of using the first 8, 16 or 24 bits of the address
for the network portion, any number of bits can be used. In the example above, the
organization could use the first 21 bits for the network address and the remaining 11 bits
for hosts. This gives 2048 (2^11) possible host addresses within the organization's
network. Another benefit of CIDR is that it makes subnetting possible. By dividing the 11
rightmost bits, the organization above could create its own subnetworks within its network.
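The arithmetic behind the CIDR example can be sketched in C. The function names cidr_netmask and cidr_host_count are hypothetical, and the host count is the raw number of bit patterns, as in the text above.

```c
#include <stdint.h>

/* Network mask for a prefix of prefix_len bits (0..32): the prefix_len
 * leftmost bits are set, the remaining host bits are cleared. */
uint32_t cidr_netmask(unsigned prefix_len)
{
    if (prefix_len == 0u) {
        return 0u;   /* avoid shifting a 32-bit value by 32 bits */
    }
    return (uint32_t)(0xFFFFFFFFu << (32u - prefix_len));
}

/* Number of distinct addresses in the host portion: 2^(32 - prefix_len).
 * A 64-bit result is used so that a prefix length of 0 does not overflow. */
uint64_t cidr_host_count(unsigned prefix_len)
{
    return (uint64_t)1u << (32u - prefix_len);
}
```

For a 21-bit prefix the mask is 0xFFFFF800 (255.255.248.0) and the host count is 2048, which matches the example of the organization with 2000 hosts.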
At the beginning of the 1990s people realized that the 32-bit address space in IPv4
would not be enough. The Internet Engineering Task Force therefore began to develop
a successor to the IPv4 protocol. In the new standard, IPv6, the 32-bit address space has
been increased to 128 bits. With such an address space, every grain of sand on the planet
could have a unique IP address.
5.5.3 Fragmentation
Fragmentation and reassembly is the other main task of IPv4. It is necessary
because not all link-layer protocols can carry packets of the same size. Ethernet frames
cannot carry more than 1500 bytes [8] of data, while many wide-area links have a packet
limit of 576 bytes. The maximum packet size a link can carry is called the Maximum
Transmission Unit (MTU). A router can have different link interfaces, and that is why
fragmentation of packets is primarily done in routers. To spare routers the burden of also
reassembling packets, the designers of IPv4 decided to let the hosts take care of that.
Still, fragmentation reduces performance in a router and it is desirable to keep
fragmentation to a minimum. Because all link-layer protocols supported by IPv4 should
have an MTU of at least 576 bytes, fragmentation can be entirely eliminated if no IP
packet is larger than 576 bytes. This can be controlled by setting a Maximum Segment
Size (MSS) of 536 bytes (576 bytes minus 20 bytes of IP header and 20 bytes of TCP
header) in the TCP connection. This technique is often used in TCP data transfers; for
example, HTTP packets are often 512 - 536 bytes long. In IPv6 fragmentation in routers is completely
removed. If a router in IPv6 receives a packet that is too large to be forwarded over the
outgoing link, the router simply discards the packet and notifies the sender through an
ICMP packet. The sender can then reduce the packet size and retransmit.
5.5.4 Header format
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.2: IP header format
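As an illustration of the header layout in Figure 5.2 and of the MSS arithmetic in section 5.5.3, the following C sketch reads a few fields from a raw IPv4 header. It assumes the header is available as a byte array in network byte order; all function and macro names are hypothetical and not taken from the implemented stack.

```c
#include <stdint.h>

#define IP_HDR_MIN_LEN   20u      /* IP header without options           */
#define TCP_HDR_MIN_LEN  20u      /* TCP header without options          */
#define IP_FLAG_MF       0x2000u  /* "more fragments" bit, flags/offset  */
#define IP_OFFSET_MASK   0x1FFFu  /* 13-bit fragment offset              */

/* Read a 16-bit big-endian (network byte order) value. */
static uint16_t get_be16(const uint8_t *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | (uint16_t)p[1]);
}

/* Version field: upper four bits of the first byte. */
uint8_t ip_version(const uint8_t *hdr)
{
    return (uint8_t)(hdr[0] >> 4);
}

/* IHL counts 32-bit words, so the header length in bytes is IHL * 4. */
uint16_t ip_header_len(const uint8_t *hdr)
{
    return (uint16_t)((hdr[0] & 0x0Fu) * 4u);
}

uint16_t ip_total_len(const uint8_t *hdr)
{
    return get_be16(&hdr[2]);
}

/* A datagram is a fragment if MF is set or the offset is non-zero. */
int ip_is_fragment(const uint8_t *hdr)
{
    uint16_t word = get_be16(&hdr[6]);

    return ((word & IP_FLAG_MF) != 0u) || ((word & IP_OFFSET_MASK) != 0u);
}

/* Largest TCP payload that fits in one unfragmented packet,
 * assuming neither header carries options. */
uint16_t tcp_mss_for_mtu(uint16_t mtu)
{
    return (uint16_t)(mtu - IP_HDR_MIN_LEN - TCP_HDR_MIN_LEN);
}
```

For an MTU of 576 bytes the last function yields the MSS of 536 bytes mentioned in section 5.5.3; for Ethernet's 1500 bytes it yields 1460.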
5.5.5 Address Resolution Protocol (ARP)
Nodes in a LAN transmit their frames over a broadcast channel. This means that
every node connected to the LAN will receive the frame. But most of the time a node
just wants to communicate with one particular node in the LAN. To manage this, every
node must have a unique LAN address. This address is also known as the Medium Access
Control (MAC) address or physical address. These MAC addresses are bound to a
specific adapter when it is manufactured, and no two adapters have the same MAC address.
Most networks, including Ethernet networks, have 6 byte MAC addresses. To
transmit a frame from one host to another host in a LAN, a LAN address is needed.
Because every host has two sorts of addresses, one network address and one physical
address, there is a need to translate between them. This task is solved by the Address
Resolution Protocol (ARP).
Figure 5.3: A small Local Area Network example
If a host wants to send a packet to a computer on the LAN, it must know the MAC
address of that computer. Figure 5.3 shows a network of three computers connected
through a hub. Imagine that the computer with address 192.168.0.17 wants to send
a packet to node 192.168.0.11. It will then check its ARP table.
IP Address     MAC Address         TTL
192.168.0.11   ED-66-AB-90-75-B1   227
Table 5.2: A possible ARP table in node 192.168.0.17
Table 5.2 contains the translation of IP addresses to MAC addresses. This time the
address 192.168.0.11 has a valid translation in the table and an Ethernet frame can be
sent to MAC address ED-66-AB-90-75-B1. But what happens if node 192.168.0.17
wants to send a packet to 192.168.0.3? The host will check its ARP table, but will
not find a valid translation. At this moment an ARP request packet must be sent. This
packet is sent to the LAN broadcast address FF-FF-FF-FF-FF-FF and contains the IP
address of the target machine, which in this case is 192.168.0.3. All computers in the
LAN will receive this ARP request, and each of them compares its local IP address with
the target IP address in the ARP packet. The node with a match will respond with an
ARP reply packet, containing the desired address mapping, back to the sender. The
querying node (192.168.0.17) will then update its ARP table and send the IP datagram.
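The table lookup and update described above can be sketched in C as a small fixed-size table, which suits a system with limited memory. The table size, the structure layout and the function names are assumptions for this example; a TTL of zero is used here to mark an unused entry.

```c
#include <stdint.h>
#include <string.h>

#define ARP_TABLE_SIZE 4u   /* assumed size, chosen for illustration */
#define MAC_LEN        6u

typedef struct {
    uint32_t ip;            /* IPv4 address, host byte order   */
    uint8_t  mac[MAC_LEN];  /* matching physical address       */
    uint16_t ttl;           /* remaining lifetime; 0 = unused  */
} arp_entry_t;

static arp_entry_t arp_table[ARP_TABLE_SIZE];

/* Insert or refresh a mapping, e.g. when an ARP reply arrives.
 * Returns 0 on success, -1 if the table is full. */
int arp_update(uint32_t ip, const uint8_t mac[MAC_LEN], uint16_t ttl)
{
    arp_entry_t *slot = NULL;
    uint32_t i;

    for (i = 0u; i < ARP_TABLE_SIZE; i++) {
        if ((arp_table[i].ttl != 0u) && (arp_table[i].ip == ip)) {
            slot = &arp_table[i];       /* existing entry: refresh it */
            break;
        }
        if ((slot == NULL) && (arp_table[i].ttl == 0u)) {
            slot = &arp_table[i];       /* remember first free entry  */
        }
    }
    if (slot == NULL) {
        return -1;
    }
    slot->ip = ip;
    (void)memcpy(slot->mac, mac, MAC_LEN);
    slot->ttl = ttl;
    return 0;
}

/* Returns the MAC address for ip, or NULL if no valid translation
 * exists and an ARP request must be sent. */
const uint8_t *arp_lookup(uint32_t ip)
{
    uint32_t i;

    for (i = 0u; i < ARP_TABLE_SIZE; i++) {
        if ((arp_table[i].ttl != 0u) && (arp_table[i].ip == ip)) {
            return arp_table[i].mac;
        }
    }
    return NULL;
}
```

In the scenario above, a lookup of 192.168.0.11 returns the stored MAC address, while a lookup of 192.168.0.3 returns NULL until the ARP reply has been passed to arp_update.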
5.5.6 Internet Control Message Protocol (ICMP)
The Internet Control Message Protocol (ICMP) is often considered a part of IP
although it lies above IP, as ICMP packets are carried inside IP packets, just like
TCP and UDP packets. The purpose of ICMP is to provide feedback about problems in the
network communication. A typical example is the Destination network unreachable
message, which is sent by an IP router if it was unable to find a path to the
destination host. There are numerous types of messages that can be sent by ICMP, see
Table 5.3. The program ping may be familiar to the reader; it sends an ICMP message of
type 8 with code 0.
ICMP Type  Code  Description
0          0     Echo reply (pong)
3          0     Destination network unreachable
3          1     Destination host unreachable
3          2     Destination protocol unreachable
3          3     Destination port unreachable
3          6     Destination network unknown
3          7     Destination host unknown
4          0     Source quench (congestion control)
8          0     Echo request (ping)
9          0     Router advertisement
10         0     Router discovery
11         0     TTL expired
12         0     IP header bad
Table 5.3: ICMP message types
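As an example of the type and code fields in Table 5.3, the sketch below builds the eight-byte header of an echo request as sent by ping. The layout (type, code, checksum, identifier, sequence number) and the one's complement checksum follow the ICMP specification; the function names and the choice to omit any payload are assumptions made for this example.

```c
#include <stdint.h>
#include <stddef.h>

#define ICMP_ECHO_REQUEST 8u
#define ICMP_HDR_LEN      8u

/* Internet checksum: one's complement of the one's complement sum of
 * all 16-bit words; an odd trailing byte is padded with zero. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0u;
    size_t i;

    for (i = 0u; (i + 1u) < len; i += 2u) {
        sum += ((uint32_t)data[i] << 8) | (uint32_t)data[i + 1u];
    }
    if ((len & 1u) != 0u) {
        sum += (uint32_t)data[len - 1u] << 8;
    }
    while ((sum >> 16) != 0u) {
        sum = (sum & 0xFFFFu) + (sum >> 16);   /* fold carries back in */
    }
    return (uint16_t)(~sum & 0xFFFFu);
}

/* Fill buf with an echo request header (type 8, code 0) without payload.
 * buf must hold at least ICMP_HDR_LEN bytes. Returns the header length. */
size_t icmp_build_echo_request(uint8_t *buf, uint16_t id, uint16_t seq)
{
    uint16_t csum;

    buf[0] = ICMP_ECHO_REQUEST;           /* type                          */
    buf[1] = 0u;                          /* code                          */
    buf[2] = 0u;                          /* checksum, zero while summing  */
    buf[3] = 0u;
    buf[4] = (uint8_t)(id >> 8);
    buf[5] = (uint8_t)(id & 0xFFu);
    buf[6] = (uint8_t)(seq >> 8);
    buf[7] = (uint8_t)(seq & 0xFFu);
    csum = inet_checksum(buf, ICMP_HDR_LEN);
    buf[2] = (uint8_t)(csum >> 8);
    buf[3] = (uint8_t)(csum & 0xFFu);
    return ICMP_HDR_LEN;
}
```

A receiver can verify the message by running inet_checksum over the complete header including the checksum field; an undamaged message gives zero.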
5.6 Transport Layer Protocols
5.6.1 Transmission Control Protocol (TCP)
TCP is a connection-oriented protocol which provides features like flow control,
congestion control and reliability. It relies on many principles including error
detection, retransmissions, timers and header fields for sequence and acknowledgment
numbers. A TCP connection provides full-duplex data transfer, which means that two
connected processes can send data to each other simultaneously. Multicasting is not
possible with TCP; the connection is always point-to-point. Before data can be sent
between two TCP sockets, an initial "handshake" must be completed. How this is
accomplished will be explained later in this section. When the connection is
established the two processes can start transferring data to each other. TCP
guarantees that all data is delivered in order, without gaps and without errors. TCP
also tries to keep router buffers along the way from becoming congested. Yet another
feature that TCP supports is flow control, which protects against overflowing the
remote host's internal buffers. This section will explain the most important features
of this transport protocol. Figure 5.4 illustrates how the TCP header is composed.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |           |U|A|P|R|S|F|                               |
| Offset| Reserved  |R|C|S|S|Y|I|            Window             |
|       |           |G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             data                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.4: TCP header format
Sequence and acknowledgment numbers
Two of the most important fields of the TCP header are the sequence-number and
acknowledgment-number fields. They are used to provide TCP's reliable data transfer
service. To understand how this works, the meaning of these fields will first be
explained. TCP views data as an ordered stream of bytes. Each byte of the data stream
is assigned a sequence number. The sequence-number field of a segment holds the
byte-stream number of the first byte in the segment. The acknowledgment-number field
indicates which sequence number is expected next from the peer process. How these
numbers are used to provide reliable data transfers is illustrated in Figure 5.5.
Figure 5.5: Example of sequence and acknowledgment numbers
Figure 5.5 shows a simplified data transmission. The arrows represent how segments
are sent between the two processes. The sequence-number field of the segment is labeled
Seq and the acknowledgment-number field is labeled Ack. The label length indicates how
many bytes of data the segment contains.
1. Process A sends a segment with three bytes of data. The sequence number for
the first byte of this segment is zero (Seq 0). The acknowledgment field is also
zero, indicating that process A expects a segment with sequence number zero from
process B.
2. Process B answers with an acknowledgment field set to three, which states that the
next expected sequence number from process A is three. This also means that all
bytes up to and including sequence number two have been received.
3. Process A sends a second data segment containing seven bytes, starting at sequence
number three. The received segment from process B did not contain any data, thus the
next expected sequence number from process B is still zero (Ack 0).
4. Process B acknowledges the received segment, which informs process A that the next
sequence number to send is ten.
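The four steps above can be reproduced with a minimal C sketch that only tracks the two counters involved. Retransmission, out-of-order data and the real header layout are left out, and all type and function names are assumptions made for this example.

```c
#include <stdint.h>

typedef struct {
    uint32_t snd_nxt;   /* sequence number of the next byte to send     */
    uint32_t rcv_nxt;   /* next sequence number expected from the peer  */
} tcp_seq_state_t;

typedef struct {
    uint32_t seq;       /* Seq label in Figure 5.5       */
    uint32_t ack;       /* Ack label in Figure 5.5       */
    uint16_t len;       /* number of data bytes carried  */
} tcp_seg_t;

/* Build an outgoing segment carrying len data bytes. */
tcp_seg_t tcp_send(tcp_seq_state_t *s, uint16_t len)
{
    tcp_seg_t seg;

    seg.seq = s->snd_nxt;
    seg.ack = s->rcv_nxt;
    seg.len = len;
    s->snd_nxt += len;          /* unsigned arithmetic wraps modulo 2^32 */
    return seg;
}

/* Accept an in-order segment; anything else is ignored in this sketch. */
void tcp_receive(tcp_seq_state_t *s, const tcp_seg_t *seg)
{
    if (seg->seq == s->rcv_nxt) {
        s->rcv_nxt += seg->len;
    }
}
```

Starting both processes at sequence number zero and sending three and then seven bytes from A gives the acknowledgment numbers three and ten from B, as in the list above.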
Connection establishment
TCP requires an initial connection process before data can be sent or received. This
connection process is often called a "three way handshake", because the connection
establishment requires three segments to be passed between the two TCP sockets.
During the connection process the necessary state variables are initialized. Both sides
need to know at what Initial Sequence Number (ISN) the remote host's data stream
starts. Figure 5.6 illustrates what segments are sent to establish the connection.
Figure 5.6: Example of how a TCP connection is established.
In this example a client connects to a server. The client first sends something
called a SYN segment to the server. This segment is a connection request and is called
a SYN segment because the SYN flag in the header is set to one. The sequence number
field is set to the client's ISN, which is randomly selected by the client. If the
server accepts the connection request, it sends a reply called a SYN-ACK segment. This
segment contains the server's ISN placed in the sequence number field, and has its SYN
flag set to 1. It also acknowledges the client's connection request with the
acknowledgment number field set to client_isn + 1. A SYN segment does not contain any
data but counts as a segment of length 1, and this is the reason for setting the
acknowledgment field to client_isn + 1. The final step of the three way handshake is
for the client to acknowledge the server's ISN.
In addition to the initialization of the ISN, some other options can be agreed upon.
These options reside in the header field of the same name, and can only be present in
SYN segments. The most important and most used option is Maximum Segment
Size (MSS), which informs the remote TCP socket how much data a segment is allowed
to contain. For an embedded system with limited memory resources the MSS option
makes it possible to limit the size of the receive buffer used by TCP. MSS is also used
to avoid fragmentation within routers along the path between sender and receiver. When
this handshake is complete, the two processes can start sending data to each other.
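The sequence and acknowledgment numbers of the three handshake segments can be sketched in C as follows. Only the fields discussed in the text are modelled; the structure and function names are assumptions for this example, while the flag values are the bit positions of SYN and ACK in the TCP header.

```c
#include <stdint.h>

#define TCP_FLAG_SYN 0x02u
#define TCP_FLAG_ACK 0x10u

typedef struct {
    uint32_t seq;
    uint32_t ack;
    uint8_t  flags;
} tcp_ctl_seg_t;

/* Step 1: client -> server, SYN carrying the client's ISN. */
tcp_ctl_seg_t tcp_make_syn(uint32_t client_isn)
{
    tcp_ctl_seg_t s;

    s.seq = client_isn;
    s.ack = 0u;
    s.flags = (uint8_t)TCP_FLAG_SYN;
    return s;
}

/* Step 2: server -> client, SYN-ACK. The SYN counts as one sequence
 * number, so the acknowledgment is client_isn + 1. */
tcp_ctl_seg_t tcp_make_syn_ack(const tcp_ctl_seg_t *syn, uint32_t server_isn)
{
    tcp_ctl_seg_t s;

    s.seq = server_isn;
    s.ack = syn->seq + 1u;
    s.flags = (uint8_t)(TCP_FLAG_SYN | TCP_FLAG_ACK);
    return s;
}

/* Step 3: client -> server, ACK of the server's ISN. */
tcp_ctl_seg_t tcp_make_ack(const tcp_ctl_seg_t *syn_ack)
{
    tcp_ctl_seg_t s;

    s.seq = syn_ack->ack;           /* client_isn + 1 */
    s.ack = syn_ack->seq + 1u;      /* server_isn + 1 */
    s.flags = (uint8_t)TCP_FLAG_ACK;
    return s;
}
```

With a client ISN of 1000 and a server ISN of 5000, the SYN-ACK acknowledges 1001 and the final ACK acknowledges 5001.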
Congestion Control
To avoid network congestion TCP uses Congestion Control. When many sources are
sending data at a high rate, router buffers sometimes overflow and begin to drop
segments. Congestion Control is used to throttle the send rate when network congestion
is discovered.
Flow Control
TCP also uses a mechanism to avoid overflowing receive buffers. A TCP socket keeps
track of how much available buffer space the remote host has for receiving data. The
window field in the TCP header announces this information.
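How a sender might use the announced window can be sketched in C as below. The sketch assumes the sender tracks the oldest unacknowledged sequence number and the next sequence number to send; the function and parameter names are hypothetical.

```c
#include <stdint.h>

/* How many more bytes may be sent without overflowing the peer's buffer.
 * snd_una: oldest unacknowledged sequence number
 * snd_nxt: next sequence number to send
 * rcv_wnd: value of the peer's most recent window field */
uint32_t tcp_usable_window(uint32_t snd_una, uint32_t snd_nxt, uint16_t rcv_wnd)
{
    uint32_t in_flight = snd_nxt - snd_una;   /* correct across wrap-around */
    uint32_t usable = 0u;

    if (in_flight < (uint32_t)rcv_wnd) {
        usable = (uint32_t)rcv_wnd - in_flight;
    }
    return usable;
}
```

If 300 bytes are unacknowledged and the peer announces a window of 1000 bytes, another 700 bytes may be sent; once the unacknowledged data reaches the window size the sender has to wait.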
5.6.2 User Datagram Protocol (UDP)
UDP is defined in RFC 768, and is a very thin transport protocol. It does not provide
features like flow control, congestion control or error recovery. It is basically an
application interface to IP, and just handles multiplexing/demultiplexing (explained
in section 5.4) and some light error checking. There is no initial handshaking
procedure, as there is with TCP, and therefore UDP is said to be connectionless.
There are however some benefits with the UDP protocol,which are explained below:
– It has very little packet overhead, only eight bytes compared to twenty bytes for
TCP.
– No connection establishment delay.
– No connection state variables to keep updated.
– Finer application-level control over what data is sent, and when.
Some applications do not need reliable data transfer, since they tolerate some data
loss. Many multimedia applications use UDP because they do not work well with TCP's
congestion control due to the delays it causes. Figure 5.7 illustrates the format of the
UDP header. Source and destination port are used for multiplexing/demultiplexing.
The length field specifies the length of the UDP segment including the header, in bytes.
For UDP to calculate the checksum, a few fields in the IP header are used in addition to
the UDP segment.
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Length             |           Checksum            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5.7: UDP header format
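The header in Figure 5.7 and the checksum calculation can be sketched in C as below. The IP header fields that enter the checksum are the source address, the destination address, the protocol number (17 for UDP) and the UDP length, which together form the so-called pseudo-header. The function names are assumptions for this example, and addresses are passed as 32-bit values in host byte order.

```c
#include <stdint.h>
#include <stddef.h>

#define UDP_HDR_LEN   8u
#define IP_PROTO_UDP  17u

/* Add the 16-bit big-endian words of data to a running sum. */
static uint32_t sum_words(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0u; (i + 1u) < len; i += 2u) {
        sum += ((uint32_t)data[i] << 8) | (uint32_t)data[i + 1u];
    }
    if ((len & 1u) != 0u) {
        sum += (uint32_t)data[len - 1u] << 8;   /* pad odd byte with zero */
    }
    return sum;
}

/* Checksum over the pseudo-header plus the UDP segment. seg points at
 * the UDP header; seg_len includes the header. */
uint16_t udp_checksum(uint32_t src_ip, uint32_t dst_ip,
                      const uint8_t *seg, uint16_t seg_len)
{
    uint32_t sum = 0u;

    sum += (src_ip >> 16) + (src_ip & 0xFFFFu);
    sum += (dst_ip >> 16) + (dst_ip & 0xFFFFu);
    sum += IP_PROTO_UDP;
    sum += seg_len;
    sum = sum_words(sum, seg, seg_len);
    while ((sum >> 16) != 0u) {
        sum = (sum & 0xFFFFu) + (sum >> 16);    /* fold carries back in */
    }
    sum = ~sum & 0xFFFFu;
    /* A transmitted value of zero means "no checksum", so a computed
     * zero is sent as all ones instead. */
    return (uint16_t)((sum == 0u) ? 0xFFFFu : sum);
}

/* Write the eight-byte header; the checksum field is left as zero. */
void udp_write_header(uint8_t *seg, uint16_t src_port, uint16_t dst_port,
                      uint16_t seg_len)
{
    seg[0] = (uint8_t)(src_port >> 8);
    seg[1] = (uint8_t)(src_port & 0xFFu);
    seg[2] = (uint8_t)(dst_port >> 8);
    seg[3] = (uint8_t)(dst_port & 0xFFu);
    seg[4] = (uint8_t)(seg_len >> 8);
    seg[5] = (uint8_t)(seg_len & 0xFFu);
    seg[6] = 0u;
    seg[7] = 0u;
}
```

A sender would call udp_write_header, copy the data behind the header, and then store the result of udp_checksum in bytes 6 and 7 of the segment.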
5.7 Application Layer Protocols
5.7.1 File Transfer Protocol (FTP)
The File Transfer Protocol (FTP) is a commonly used protocol for exchanging files over
any network that supports the TCP/IP protocol. The protocol relies upon TCP, with a
server on one side and a client on the other. Two TCP connections are used in a normal
FTP session:
– Control Connection
Used for sending control information between the server and the client, information
such as user identification, password and commands like change remote directory,
get file, put file, etc.
– Data Connection
Used for transferring files and listings of the remote directory.
A user starts an FTP session by initiating a TCP connection to port 21 (standard
port) at a server. The client then sends user identification and password and the FTP
control connection is established. When a server receives a file transfer request over the
control connection, it initiates a TCP data connection to the client side. The file will be
transferred, either from server to client or from client to server, over the data
connection. The data connection will be closed as soon as the file transmission is
completed. Thus, the control connection remains open throughout the duration of the FTP
session while the data connection is created and closed for each file transfer. Commands
and replies that are sent over the control connection are text coded in seven-bit ASCII
format. Each command consists of three or four uppercase ASCII characters, some with
optional arguments. Some common commands are given below:
– USER username: Sends the user identification to the server.
– PASS password: Sends the user password to the server.
– PORT address and port: Tells the server where it should initiate the next data
connection.
– LIST: Requests the file list of the current remote directory.
– RETR filename: Initiates a file transfer from server to client.
– STOR filename: Initiates a file transfer from client to server.
Each command issued at the client side is followed by a reply from the server. The reply
messages consist of a three-digit number followed by an optional message. More details
about the File Transfer Protocol can be found in RFC 959.
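Two small pieces of the control connection handling can be sketched in C: formatting the PORT command and extracting the three-digit code from a reply. RFC 959 encodes the address and port of the PORT command as six comma-separated decimal bytes. The function names are assumptions for this example, and snprintf is used for brevity although a small embedded target may have to format the digits by hand.

```c
#include <stdint.h>
#include <stdio.h>

/* Format a PORT command for the given IPv4 address (host byte order)
 * and TCP port. Returns the number of characters written, or -1 if
 * buf is too small. */
int ftp_format_port(char *buf, size_t buf_len, uint32_t ip, uint16_t port)
{
    int n = snprintf(buf, buf_len, "PORT %u,%u,%u,%u,%u,%u\r\n",
                     (unsigned)((ip >> 24) & 0xFFu),
                     (unsigned)((ip >> 16) & 0xFFu),
                     (unsigned)((ip >> 8) & 0xFFu),
                     (unsigned)(ip & 0xFFu),
                     (unsigned)(port >> 8),
                     (unsigned)(port & 0xFFu));

    return ((n < 0) || ((size_t)n >= buf_len)) ? -1 : n;
}

/* Extract the three-digit code from a reply line such as
 * "230 User logged in". Returns -1 if the line does not start with
 * three digits. */
int ftp_reply_code(const char *line)
{
    int code = 0;
    int i;

    for (i = 0; i < 3; i++) {
        if ((line[i] < '0') || (line[i] > '9')) {
            return -1;
        }
        code = (code * 10) + (line[i] - '0');
    }
    return code;
}
```

For the address 192.168.0.17 and port 5001 the command becomes "PORT 192,168,0,17,19,137", since 5001 = 19 * 256 + 137.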
Chapter 6
Hardware
6.1 Phytec phyCore-XC167
The hardware used in this project was a rapid development kit from Phytec. It consisted
of a Single Board Computer (SBC) phyCore-XC167 together with a development board.
An SBC can be seen as a small computer and the size of phyCore-XC167 is just 60 x
53 mm. The most central part of phyCore-XC167 is the microcontroller XC167CI from
Infineon but it also has many other features like:
– 16-bit, multiplexed bus mode.
– 256 KB - 2 MB external Flash on-board.
– 256 KB - 1 MB external RAM on-board.
– 512 KB fast SRAM on-board.
– 20-40 MHz clock frequency.
– 16 MB address space.
– 2 CAN interfaces.
– RS232 transceiver for two serial interfaces.
– CS8900A 10Base-T Ethernet Controller.
These and other features are presented in [16].
As seen in Figure 6.1 the Ethernet controller is connected through a common bus with
the external Flash, RAM and SRAM. The access to these components is easy because they
are mapped into the microcontroller's address space. A memory access to the external
memory areas (see Figure 6.3) of the microcontroller will therefore use the external bus
and access a device there.
Figure 6.1: Block diagram of phyCore-XC167
The use of interrupts is preferable when constructing a driver for the Ethernet
controller. It is clearly a better and more elegant way to let the Ethernet controller
interrupt the host CPU than to use some kind of polling solution. To make it possible for
the Ethernet controller to interrupt, a jumper needs to be closed. This jumper connects
the IRQ output from the Ethernet controller to the pin P2.13 of the microcontroller. This
jumper is marked in Figure 6.2 and is the only modification of the hardware that must
be done.
Figure 6.2: Modification of phyCore-XC167
6.2 Microcontroller Infineon XC167CI
XC167CI is a full-featured single-chip microcontroller. The list below presents a short
summary of XC167CI's features.
– High Performance 16-bit CPU with 5-Stage Pipeline.
– 40 MHz CPU Clock.
– 16 Mbytes Total Linear Address Space for Code and Data.
– 2 Kbytes On-Chip Dual-Port RAM (DPRAM).
– 4 Kbytes On-Chip Data SRAM (DSRAM).
– 2 Kbytes On-Chip Program/Data SRAM (PSRAM).
– 128 Kbytes On-Chip Program Memory (Flash Memory).
– Up to 12 Mbytes External Address Space for Code and Data.
– On-Chip Bootstrap Loader.
The memory space of the XC167 is configured in a Von Neumann architecture, which
means that all internal and external resources, such as code memory, data memory,
registers, and I/O ports are organized within the same linear address space. This common
memory space comprises 16 Mbytes and is arranged as 256 segments of 64 Kbytes each,
where each segment consists of four data pages of 16 Kbytes each. How the memory
space is organized is presented in Figure 6.3.
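The segment and page arithmetic follows directly from the sizes given above: a 24-bit linear address selects one of 256 segments with its upper 8 bits and one of 1024 data pages with its upper 10 bits. The C sketch below illustrates this on a host machine; the function names are assumptions for this example and do not refer to any compiler intrinsic or register of the XC167.

```c
#include <stdint.h>

#define XC167_SEGMENT_BITS 16u   /* 64 Kbyte segments    */
#define XC167_PAGE_BITS    14u   /* 16 Kbyte data pages  */

/* Segment number (0..255) of a 24-bit linear address. */
uint8_t xc167_segment(uint32_t addr)
{
    return (uint8_t)((addr >> XC167_SEGMENT_BITS) & 0xFFu);
}

/* Data page number (0..1023) of a 24-bit linear address. */
uint16_t xc167_data_page(uint32_t addr)
{
    return (uint16_t)((addr >> XC167_PAGE_BITS) & 0x3FFu);
}

/* Offset of the address within its 16 Kbyte data page. */
uint16_t xc167_page_offset(uint32_t addr)
{
    return (uint16_t)(addr & 0x3FFFu);
}
```

The address 0x123456, for example, lies in segment 0x12 and in data page 72, which is the first of the four pages of that segment.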
Figure 6.3: XC167CI Memory map
The microcontroller has several on-chip memory modules. These are Dual-Port RAM
(DPRAM), Data SRAM (DSRAM), Program/Data SRAM (PSRAM) and Program
Memory (Flash Memory). The PSRAM is provided to store user code or data and
is accessed via the Program Management Unit (PMU), as Figure 6.4 illustrates. The PMU
is responsible for all code fetching and therefore also accesses the Flash Memory, which
stores code or constant data. The DSRAM is provided as storage for general user
data and is accessed via the Data Management Unit (DMU). The DPRAM is provided as
storage for user-defined variables, the system stack, and general purpose register banks.
If more memory is required than the memory provided on the chip, up to 12 Mbytes of
external RAM and/or ROM can be connected to the microcontroller.
Figure 6.4: Overview of XC167CI's on-chip components
6.3 Ethernet Controller Cirrus Logic CS8900A