Embedded MCU Debuggers



Embedded MCU Debuggers


Chris Hills

Edition 3.3
19 January 2006

Part 2 of the QuEST series




Third edition (3.3)
January 06 by
Eur Ing Chris Hills BSc(Hons), C. Eng., MIEE, MIEEE FRGS

Second edition
April 1999 by
Chris Hills
Presented at
Java, C & C++ Spring Conference
Oxford Union, Oxford UK, April 1999
For the Association of C and C++ Users, see

Copies of this paper (and subsequent versions) and the
PowerPoint slides will be available at or

This paper will be developed further.
Copyright Chris A Hills 1999, 2001, 2005, 2006
The right of Chris A Hills to be identified as the author of this work has
been asserted by him in accordance with the Copyright, Designs and
Patents Act 1988

Quality Embedded Software Techniques

QuEST is a series of papers based around the theme of Quality
embedded systems. Not for a specific industry or type of work but for
all embedded C. It is usually faster, more efficient and surprisingly a
lot more fun when things work well.

QuEST 0 Introduction & SCIL-Level
QuEST 1 Embedded C Traps and Pitfalls
QuEST 2 Embedded MCU Debuggers
QuEST 3 Advanced Embedded Software Testing

QA1 SCIL-Level
QA2 Tile Hill Style Guide
QA4 PC-Lint and DAC MISRA-C Compliance Matrix



Micro-controller Debuggers - Their Place In The Micro-controller
Application Development Process

1. Introduction

Embedded systems are different to most "normal" computer systems.
Usually they have no screen and keyboard; they have strange IO and
peripherals working at their own time and pace, not yours. More to
the point, these strange peripherals often control external
equipment, much of it in safety critical, medical and transport
systems. Some, in systems like rockets and cruise missiles, have to
be tested fully before the equipment is used… you can't just "run it
and see".

Embedded systems use strange processors specially made for the job.
Most often the software for the system is developed on a workstation
(more likely a PC these days) and "loaded" on to the target after it
has been compiled and linked. Debugging requires either the target
system to be simulated on the development platform or remote
debugging of the target from another machine…. This caused many
weird and wonderful schemes and tools to be developed over the
years. This paper will look at the main methods that have been
developed and those that have survived through to current usage.

Back in the good old days when microprocessors were new (the mid
1970's), microprocessor programs were developed in assembler and
"blown" or burnt into EPROMs such as 2708's and 2716's - remember
those? There was no possible way of knowing what the program was
actually doing, step by step, as it ran. Looking for bugs, ingenious
methods such as twiddling port pins and flashing LEDs were used to
tell the programmer where the program was, if not how it had actually
got there. With a lot of sweat and a lot of ingenuity, working
programs were produced.

Then a few had ROM-based monitor programs whereby an array of seven
segment LEDs and a hex keypad allowed assembler to be single
stepped and simple execution breakpoints to be set. In some
instances, a serial connection to a computer or terminal allowed more
flexibility and interactive debugging via archaic and cryptic commands.


Note this is on an 80 column mono screen. The only graphic visible
would have been the manufacturer's logo on the case.

There were also some very rudimentary software simulators. Again,
these were command line with strange command sequences on a
monochrome character based screen. It was hardly real time; more a
case of run to a point, stop, dump the registers and maybe a block of
memory, and see what state things were in. Rudimentary timing was
possible by knowing the number of cycles each instruction took. The
problem with simulators was they were usually hand made and not
exactly portable. We are talking of a time before PC's and the
omnipresence of MS Windows, or inexpensive UNIX workstations. There
was no common host platform or GUI environment.

An elite few, in the largest [richest] companies, were "blessed" with the
ultimate tool - the In-Circuit Emulator (ICE). However, such was the
initial cost (around $20,000 in the early 1970's), the limited features
and the subsequent unreliability of some of these early devices, that
they were often ditched in favour of the more reliable monitors or
logic analysers.

The one thing emulators did have in common with monitors (and later
the logic analyser), was their strong assembler-orientation. High-level
languages were for wimps, besides there was no direct and visible line
from the HLL to the binary. This made the engineers of the day very
nervous of compilers. Compilers, and their languages were also in
their infancy so this scepticism was often well founded. This axiom
that assembler is best because there is a direct one to one relationship
between the code and the binary still pervades the industry today.
Several parts of the industry mandate assembler in the mistaken belief
that it is safer than using a high level language.

1.1. Basic methods
When there was no test equipment one simply wrote the code,
compiled it, sometimes linked it (when there was only one source file
there was no linking to do; many old programs were a single asm file)
and burnt it into an EPROM. There was no white box testing of the
code, on or off the target. It was not possible to look inside the
system as it ran. The "Burn and Pray" brigade worked on the
assumption that if the correct outputs appeared during testing then
the code must work and therefore the program was "tested". As there
is a strong relationship between assembler and the resultant binary
image in ROM it is true to say that the developers knew reasonably
well what was in the PROM. At least they thought they did….

The use of MACRO and optimising assemblers changed the
relationship between what was written and what actually ended up in
the ROM, though the list and map files were a great help to see where
things actually were and most good assembler engineers knew how
their tools worked.

This was the time when a good EPROM programmer with editing
capabilities was very useful. I have one that had the usual nixie tube
display and the ability to drive a monitor (screen) for editing sessions.
It was expected that professionals would patch the hex!

Using C (or any other HLL) there is no direct relationship between the
HLL source and the bytes in the ROM in the way there was with
assembler. Besides, compilers for HLLs were relatively new, as were
the languages and their definitions. The other problem was that
compilers were not always very good at producing the best assembler
from the HLL.

Thus, the programmer could not say with any certainty that he knew
exactly what was in the ROM without an ICE or ROM-monitor. True,
the map files helped a lot, but this only goes so far and does not help
with code execution. The ICE and ROM monitor use other tools and
produce other files to ensure they do know where they are in the source.

I have seen an interesting case where an optimising compiler noted
that a value was written to a register and then to memory. The value
was then immediately read back from memory into the register and
compared with the value that was first written to the register. The
compiler realised that the write to and subsequent read from memory,
always being the same address, was "superfluous". It optimised them
out saving both time and code space. It speeded up the memory test
quite a bit. The memory test never failed but of course it never
touched the memory that it was “testing”! Hence the C keyword:
volatile.

In the second edition of this paper I wrote, “Surprisingly in 1999 the
practice of burn-and-pray still goes on. Though I hope that these
days the code is, at least, tested with a software simulator first. I hope
that the “Burn and Pray” technique will not continue into the 21st
Century.” Unfortunately, there has been no miracle and sudden
conversion to good engineering methods. The practice still goes on
but I am happy to report that in 2002 there is more of a tendency to
do things correctly. I am hoping that the amendments to the
Manslaughter Act for Corporate Manslaughter will expedite this
tendency! (See QuEST 1 Embedded C Traps and Pitfalls)

The use of LEDs and toggling port pins is like a rudimentary
printf("Hi! I am at line x\n"); it probably tells you where you are in a
program. This technique assumes that the software is functioning to
the point where the port pin is toggled. There is no guarantee that the
target is in the state it is supposed to be in. Using this technique to
test hardware is one thing but for software testing it is of dubious
value.

These techniques of port pins and LEDs are still in use today but
generally for post-production error reporting. An example of this is
the POST (Power On Self Test) beeps produced by a PC during boot up,
or status LEDs on an ICE. However this is part of a fully tested and
working system to show certain known (usually hardware) errors. In
the case of a PC it can indicate a missing keyboard, disk, CMOS
information etc.

Usually it requires working [initialisation] software. It is rarely, I hope
never, used as a main test method in SW development anymore. Using
7-segment LEDs some additional information can be obtained. These
techniques are hopefully now only used in programs to test new
hardware rather than to debug the software. They are also used as
part of some systems to give fault information to service engineers
when faults develop in the field. Again these are usually hardware
problems, i.e. something is physically broken.

The next stage was to have a terminal connected to a serial port on the
target, with printf statements to the serial port sending out the state
of the target. However, this requires hardware to support the debug
port and debug code in the application source. This form of debug is,
as mentioned, still built in for field service technicians. It is becoming
increasingly popular as FLASH memory permits easy field upgrades
of system software without the need to disturb the hardware; in
many cases it removes the need even to open the equipment.

1.2. ROM-Monitors

Engineers required more than toggled port pins and flashing LED's.
They wanted to see what was going on inside the MCU. The problem
with an embedded MCU is that not only is the CPU in the chip but, as
can be seen from the diagram, many of the peripherals are also inside
the chip. Many of the address, data and control lines never become
visible at the edge of the chip.

This means that, even with a logic analyser, it is not possible to have
any view of the inner workings of the part. In a single chip design
where no external memory or peripherals are used the part is
effectively a black box.

Thus ROM-monitors were developed to give this internal view. In some
cases they could only work with external memory; in others they were
part of the application. ROM monitors are a very simple and hopefully
very small piece of code that permitted a user to have some control
over the program under test whilst actually running on the target.
The monitor code is loaded into the target. There were two types of
monitor: a monitor that was part of the application and "compiled
in", and those that were more like an operating system, loaded to the
target with the application loaded on top of them.


When an application is loaded under control of the monitor, the
monitor runs it. In most cases single step or run to breakpoint was
possible. This has a couple of effects. Firstly the monitor code takes
up space, thus distorting the memory map. Even the smallest
ROM-monitor required several Kbytes of space. Secondly it takes
time to run the monitor code, therefore the program could never run
in real time.

The trouble with these early ROM monitors was they had few
commands and a very simple (and often cryptic) interface via a serial
link to a dumb terminal. They also required a working serial port on
the target. The command range was usually very limited: single-step
(but only on 3-byte opcodes in the case of the 8051), break points but
not watch-points, display registers or memory, read memory.

The user interfaces were very primitive, often limited to responses
printed to screen as a command (usually a letter) was entered.
Memory could be displayed as a snapshot, not a continually updating
window. The connection to the host was usually via a slow serial link.
Over the years monitor interfaces have improved.

The major drawback, apart from not being "real-time" is that ROM
monitors required the target to be largely correctly working and the
insertion of a ROM monitor changes the memory map. ROM monitors
also cost money over the cost of the actual software, as every board
that you wanted a monitor on had to have an additional serial port.


This means that the hardware used for testing with the monitor was
often not the same as the production item.

However ROM-monitors did permit vision inside the target and control
of the software on the target. They were often seen as the “Poor Man’s
ICE” as they were relatively inexpensive. Many programmers were able
to write their own monitors. However, as noted, there was a cost in
the hardware requirements, or development systems would have to be
different to the production boards, which of course brings its own
problems.
1.3. The CPU Simulator

More recently another form of debugger has arisen which fits, cost
and performance wise, between emulators and conventional monitors.
This is the CPU simulator, sometimes called the instruction set
simulator.

Initially simulators were often a product of the compiler
manufacturers. Some of the silicon vendors also produced
simulators. These simulators were often made available at a (highly)
subsidised cost before the real silicon became available in quantity, to
kick start development, and sales. It also created a market for 3rd
party tools, assuming the part started selling.

Simulators seek to provide a complete simulation of the target CPU via
pure software. The initial simulators traded functionality and interface
for speed. In those days target systems were not much slower than
the host development platforms. They usually worked in hex rather
than HLL. Thus, when presented with an executable object file, the
behaviour of the program can be observed and any errors identified.
Of course, being a software simulation, execution does not proceed in
real time and all IO signals must be generated by special routines
designed to mimic peripherals or other extra-CPU devices.

Simulators started to emerge when PC’s and workstations became
powerful enough (and cheap enough) to run simulation SW at a
reasonable speed. Though the term “reasonable speed” is rather
subjective and what was reasonable then is not now. The other
determining factor was a "standard" interface. CP/M was one but the
emergence of a single OS for the majority, i.e. MS DOS (and later MS
Windows) made it commercially sensible to produce debuggers at a
"reasonable" cost.

The interfaces were initially rather basic. Like the monitor, the early
simulators had basic and cryptic interfaces. Actually, monitors and
simulators often shared the same user interface. The commands were
also similar, though without the actual constraints of the target
hardware simulators were often more flexible than the monitors. They
could use all the resources of the host machine.

The monitor shown here is for the 68K [Clements]; it is from his book
on the 68K. Such simulators were a far cry from the modern GUI
based windowing simulators of today.

Whilst mentioning the fact that simulators were originally supplied by
silicon manufacturers: this is a good test for a new piece of silicon. If
the development tools available only come from the silicon
manufacturer, beware! This is for several reasons:

Firstly it could be that all the compiler, debugger and ICE vendors see
no market for the chip. Thus the MCU may only be short lived, in a
niche market or very different from other chips available. This is of
course not a problem if it is your niche market.

Secondly, if the tools are available, usually at low cost, from the silicon
manufacturer, what incentive is there for 3rd party tools companies to
get involved? This ties the developer to a highly controlled, single
source for tools which could disappear as soon as the silicon
manufacturer decides to stop production. This is because they want to
release the next line in the family, want all their users to “upgrade”
to it, and so stop supporting the older version of the development tools.

The third test is to look at whether the tools are all software based.
Hardware tools still cost more time and money to produce than
software tools. Copies of software tools can easily be “run off” and are
easy to modify, which is very useful if the silicon is not 100% stable.



Hardware based tools tend to appear after the new silicon is
reasonably stable. If there are no hardware tools available look at the
age of the part and find out which hardware tools companies will be
supporting the part. If it is only M-Mouse & Co beware.

For chips with a future there should be several mainstream tools
vendors with tools. The only exception to this is on special niche
markets. In this case the silicon companies tend to restrict the tools to
one or two companies for reasons of commercial security. The smart-
card and DECT markets fall into this category.

1.4. The First ICE

Even in the early years of embedded micro-controller development,
the ICE was the ultimate tool. Intel developed the first real ICE, the
MDS-800, for the 8080 in 1975.

However they were very expensive; in 1975 the MDS-800 was
$20,000. In those days, as can be seen, they were also bulky. The
picture is just the ICE, nothing else!

The MDS-800 has a screen, a keyboard and two 360-kilobyte, 8-inch
floppy disks (that’s 20cm for the youngsters). It was heavier than the
average PC these days, having an all-metal construction. The screen
was a monochrome character based type. Also there is no mouse.
Note it was expected in those days that the light pen would rule the
world by now, not that the ICE had a light pen.

In those days ICE in general were not always reliable and they were
intrusive in many situations. So they were “almost real time” (but a lot
more real time than anything else). Their buffers were small: 44
cycles. (Not 44 Kbytes!). This was partly due to the level of ICE
technology and to what now seems the ridiculously high cost of RAM.


The reliability, or the lack of it, often stemmed from the mechanical
adaptors used to attach the ICE to the CPU under test: fragile multi
pin connectors with large cables attached and, in the case shown
above, a substantial amount of electronics at the target end.

The other problem was (still is actually) the electrical loading. The ICE
itself can affect the electrical characteristics to the point where it can
actually cause glitches that are virtually impossible to find.

Further, ICE were also not able to emulate the faster micros. In the
early days a CPU that could be used for embedded work was the same
as the one that would be used in the ICE. These days we have 500MHz
to 1GHz machines debugging embedded systems that have also
increased in speed, from 4MHz to speeds reaching nearly 50MHz.

The two major differences between monitors and emulators are these.
When an ICE stops at a break point the whole board stops and one is
actually looking inside the MCU with the whole target suspended;
with a monitor, the target and the monitor program are still running
and the monitor does not actually look inside the MCU in the same
way. To see the registers on an ICE you just see them as the ICE holds
them; on a monitor, the monitor has to run code on the target to
retrieve the values. Break points on an ICE can usually be placed
anywhere, whereas on many monitors and simulators there are often
restrictions. For example, with the 8051 family software break points
can usually be placed only on a 3-byte opcode.

However most decent engineers could write their own monitors, and
many did, which meant that not only was a monitor available, the
engineer understood it and could modify it if required. In contrast the
early ICE was expensive black magic that was not always reliable.

The one thing that emulators had in common with monitors was their
strong assembler orientation. There was, in the 1970’s no HLL that
approached the current widespread use of C in the industry today.
Thus the number of people using any one language (or a version of it)
for embedded work was small. In any event, early HLL compilers
could not produce code that was compact enough to fit in many
embedded designs.

Due to the initial cost and subsequent unreliability of some of these
early ICE, they were often dropped in favour of ROM-monitors. This
may seem like a backward step now but engineers like to get on with
the task and not have to spend time wrestling with unreliable
equipment. Simulators were not available and the monitor was the
only other on-target option.



In the beginning logic analysers were also used as they had far better
triggering facilities than the ICE. They did at least permit a view on the
busses that was not intrusive.

One drawback of the old ICE was the slow serial link. When 9600
baud was the fastest rate the serial link could do, it could take a while
to download the program code. Added to which, it took some time to
generate the symbol table. Any change in the code, or a crash on the
ICE or the target, often required a long "reload" cycle.

Some manufacturers got round this by having high speed parallel
links between the ICE and the host. These were usually proprietary
and required special (expensive) parallel cards in the host.

The spectre of the slow serial link and the high cost of the ICE still
haunt the modern ICE. However most ICE now use high-speed serial,
USB and Ethernet, so the parallel connection has all but disappeared.

1.5. How does an ICE Work?

This section has been added after an exchange of emails I had with an
experienced test engineer who last used an ICE when the Intel MDS,
shown in the previous section, was new! It appears that not everyone
understands how ICE work or why they were comparatively expensive.
This section will, I hope, explain some of the limitations of an ICE and
why, as with all tools, understanding the philosophy behind them will
assist in your use of them.

In most instances the ICE replaces the target MCU on the board. It is
possible with some processors, e.g. the 8051 where the address and
data buses are visible, to leave the part on the board. However this is
not the same as BDM or JTAG debuggers, as will be seen later. In the
good old days this was usually a 40-pin DIL package. However, as
MCUs got larger and more complex, so did the connections to the
target.

Given that the ICE replaces the MCU it has to emulate the part and run
programs. This resulted in some "interesting" electronics replacing the
MCU part on the board. The functional replacement was done in
several ways. To start with, the ICE must contain something that
replaces the target MCU. This has traditionally been a "bond out" part.

Emulation memory resides in the ICE. In Hitex ICE it is high-speed dual
ported memory fitted as standard; other ICE offered it as an optional
extra. Dual ported means the ICE can access it at the same time as the
target program is executing in the same locations. This RAM used to
be very expensive. The reason for having it is so that the ICE can work
on the memory whilst the program is running. If the memory is not
dual ported it can only be accessed when program execution is
stopped, i.e. at a breakpoint.


Normally the emulation memory can be code or data memory, to
mimic the internal memory of the target MCU. However, any memory
that is external to the part that is used by the program is also mapped
to emulation memory.

Mapping is the term used to set up memory spaces in the emulation
memory to replace the actual target memory. In many ICE the
emulation memory also replaced memory mapped IO and peripherals.
This is usually done using a bond out part that actually contains the
peripherals. Bond out parts are only made in small numbers, are very
expensive and not usually released onto the general market. In some
ICE, 8051 for example, there are many memory configurations so the
user has to set them up. In other families the memory map is fixed
and the emulation memory is hard wired.


The code is run from the emulation memory. It is loaded to the ICE
(not blown into the EPROMs/Flash etc). Most ICE can load FROM the
on-target EPROMs to the emulation memory. This is usually done in
the final stages of test to confirm that the binary being tested is the
one on the target. Normally, though, the code is compiled and
downloaded to the ICE for debugging.

Data memory can also be mapped. This usually turns off the rd/wr
signals. Where this is a problem the memory is "shadowed" so that
the rd/wr lines work and the memory on the target is checked.



1.6. The Logic Analyser

The Logic analyser is not really a software-debugging tool but in the
early days the absence of good ICE pushed it to the fore. It is more of
a multi-channel digital scope.

The Logic analyser permitted a view of the data and address bus, and
other lines in real time and was not intrusive. Well it did have an effect
on the lines but this was minimal. It does not in itself change the
timing of anything neither does it change the memory map of the
source code.

Like the ICE, the Logic analysers were large, heavy and expensive (but
not as expensive as an ICE). They also took a lot of setting up.

One of the problems with the Logic Analyser was the way it connected
to the target: usually something like 30-50 flying leads. You only had
to miscount and be out by one to cause all sorts of "interesting"
results. Leads falling off the target were a common problem (also with
some ICE). However, the logic analyser connections always seemed to
be more reliable than the ICE connections. Maybe it was because the
logic analyser tended to be used by hardware people and the ICE by
software people….

However, unlike ICE, which tended to be for a single processor or
family, the logic analyser could be used on any bus system. Thus one
logic analyser would be used on several projects with different CPU
types, making the logic analyser more appealing to the managers and
accountants as well as the Engineer.

This was also their drawback. The logic analyser could not be used in
the same way, or as a substitute for an ICE, when the target was a
single chip MCU, i.e. all the memory was internal and there was no
bus visible outside the chip.



The logic analyser could put probes wherever the user wanted,
whereas the ICE connections were specific to a particular pin on a
specific MCU.

Another advantage was that the analyser could run faster than most
ICE which, in the early days, made it a lot more flexible and attractive
to management. Its lack of HLL support was not a problem as most
work was done in assembler and most engineers spoke hex anyway!

Logic Analysers have improved dramatically over the years to the
point where the logic analyser has almost become a reliable, less
expensive, version of what an ICE used to be. Some of the top end
logic analysers can disassemble the data on the bus and even do
high level language debug. Logic analysers are still used occasionally
for hardware debugging. However logic analysers are still very
expensive, to the point where they are in many cases more expensive
than an ICE.


Some logic analysers have colour graphical screens, are mouse driven
and have floppy disk drives to store data and set up information. The
one I have will also print screen dumps, trace information etc. to a
colour printer.

The Logic Analyser also has competition from many ICE these days.
Some ICE can trace not only in C source but also in a similar mode to
the logic analyser. This can mean that, often, all that is needed for
debugging is an ICE and a digital scope.

The logic analyser tends to be the tool of the hardware teams bringing
up a new board and is no longer in the software engineer's toolbox. In
some places the ICE has been seen in use rather than the logic
analyser for bringing up hardware.


2. Brave New World

So much for the past; things have moved on, and at an alarming rate.
Tools are now better in almost every way at the same time. They are
faster, cost a lot less (comparatively), have more useful features,
better interfaces, are more intuitive and are multimedia. Everything is
now Multi-Media. OK…. We have pretty colour screen shots in the
help files, more imaginative icons, and the tool will use one of those
annoying beeps (in stereo) at the appropriate time… what more do
you want?

The advances in embedded tools have been mainly due to the
semiconductor industry itself. The changes have been brought about
by the vast increase in the number of transistors that can be packed
on to a chip and the mass production techniques that have meant
prices of chips have plummeted. Many more functions are now packed
on to less expensive chips, including Debugger Friendly Features.
Basically, faster, better tools can be made for less cost. Also, features
that were not practically possible ten years ago are now on all but the
most basic tools.
The other more visible effect of this chip revolution is far more
powerful and inexpensive host systems. The ubiquitous PC with faster
processors, huge hard disks, gigabytes instead of kilobytes of
memory (megabytes came and went last year!), a multiple window GUI
rather than a single screen of text, and all of it running several
orders of magnitude faster. Thus the tools' user interfaces improved
dramatically and the host is capable of handling far more data from
the ICE itself. Another side effect of a powerful host system is that
there could be integration between development tools.

The practical integrated development environment (IDE) was born. It is
now possible to have the compiler system running (controlled by
menus and mice) at the same time as the debugger (simulator, ROM-
monitor or ICE) and “flick” between them. A cycle of edit, compile, link
and debug on an ICE is as fast as it takes to tell. Taking it to the
extreme, in some cases it is possible to "run" the diagrams in CASE
tools on the target hardware via an ICE or JTAG debugger.

Having said how wonderful the world is now, there are many
developers still using tools via command line control and make files
under DOS. I understand that there is also a CP/M group out there
somewhere as well!

The Unix developers will get annoyed here but the same is true for
Unix (and Linux). Most developers tend to use X and a graphical front
end rather than a terminal window for a lot more things. Though Unix
developers usually drop into a terminal faster than Windows
developers. They also tend to be far more comfortable with Make.

2.1. Bringing The ROM-Monitor Debugger Up To Date

With the simulator and emulator now able to debug the HLL (or at
least C) and a lot more user friendly, the ROM monitor started to look
very old-fashioned in its adherence to assembler support. People were
suggesting the death of the monitor. However, as software can be
"made" for the cost of duplicating the disk, monitors are now the
cornerstone of most 8-bit and 16-bit development kits, where they are
included "free". This gives students, hobby users and small companies
that are looking at using a micro for the first time some debug
capability.
Adapting the monitor to HLL support has proved remarkably simple
for tools makers using the standard debugging and symbol
information files. These debug files are commonly used by simulators
and ICE. In fact most monitors now work with simulator front ends
and all that was required was some comparatively simple changes to
give monitors a longer lease of life.

Similarly the ICE and Simulator producers were also able to produce a
target monitor, which can fool an emulator's or simulator's HLL
debugger into thinking that real emulator hardware is present (with
limitations). The flip side is that in some cases the simulator or
monitor user interface can be used to drive an ICE!

Therefore, the developer now has access to a subset of the sort of
facilities previously only enjoyed by emulator users. This means that
programs may be transmitted to the target for debugging and C
source lines single stepped; breakpoints set on code addresses, using
source lines or symbols.

The monitor user interface is now a fully windowed affair, with full
mouse control and pop-up menus. In some cases it is the same
interface as the simulator or ICE. The one shown here is from Keil.

For testing purposes, command files may be constructed to allow
repeatable test conditions to be applied to user program sections,
traditionally, the simulator's forte. Often the same tests can be run on
both the simulator and the monitor.

Some monitor-based HLL debuggers even manage on-the-fly program
variable retrieval to the host PC's screen, albeit with some loss of real
time operation.

However the ROM monitor is still not real time and still changes the
memory map (a commercial 8051 monitor is about 4-5 kbytes). Some
monitors now load into high memory (or low memory depending on
architecture) reducing the effect they have on the application under
test. The change in location means that the user code is compiled to
the same locations it will be in on the final target. This is an
improvement but still distorts the memory map in a less obvious way.
In some cases these monitors require some special hardware memory
mapping. The current use of monitors is in two main areas:

Firstly, very simple, low cost projects where it is unlikely there will be
hardware or time critical problems. Typically student projects or initial
feasibility studies with a processor that has not been used before and
there are no in house tools available. This assumes that there is
memory available for the monitor and the hardware has a spare serial
port. However it should be remembered that serial ports cost money
and this is hidden when looking at the "cheap" ROM monitor. The cost of
designing in an additional serial port, the parts and production can be
considerable. Either that or the development boards will be different to
the production boards.

Secondly, the initial stage of large undertakings where SW trials are
made on a fully proven board, usually provided by the CPU
manufacturer or other 3rd party supplier. Later in the project, when
the real target
becomes available the monitor is discarded in favour of a full in-circuit
emulator. Whilst the ICE can do the same job as the monitor in this
case it is cost effective on large teams to give several team members a
target board and a monitor. Also on a development with a new MCU
the ICE may not be available initially. The problem still remains that
break points may only be set in certain places e.g. on 3 byte opcodes
in the case of the 8051. This restricts the use of the monitor
somewhat. Currently almost all monitors have some sort of HLL
capability, reflecting the widespread use of C on embedded projects.


Now that C is used in most projects, the traditional assembler-only
monitor has reached the end of the road.

The increasing use of micro controllers with on chip FLASH or OTP
rules out the monitor debugger. This is because the monitor needs
the program to run from RAM. The break point (and single stepping) is
produced by putting in a jump to a monitor handling routine where the
stop is to occur. In OTP, EPROM, ROM and most FLASH this is not
possible.

There are some single chip parts that get round this by offering a
limited number of break point registers for special "compiled in"
monitors but in single chip designs the monitor is effectively dead for
all but initial development.

2.2. User Applications With Integral Debuggers

There are three types of integral debuggers. One is a recent (Spring
2002) incarnation of the traditional monitor for 8-bit single chip
designs that can be used where the program space is in FLASH
memory. This pairs a traditional monitor with one of the new breed of
graphical monitor/simulator interfaces. It offers a subset of the usual
features but it opens up a new option for designs that would
previously only have been able to run on an ICE if target debugging
was required.

The second form of monitor is again where the monitor's own object
files can be linked with the final application code to produce a system
with an integral debugger.

The third type is specific to an application or family of applications
such as alarm systems or washing machines where system tests,
software upload and system data can be run. Thus, armed only with a
portable and a company standard debugger a field engineer can
service, upgrade and test a system.

The inclusion of a standard commercial monitor gives permanent
debug capabilities. It also means that people using ASICs containing
MCU cores can test the system and ship the chip containing the code
with the monitor. This is often required where the code as tested must
be shipped. There must be sufficient ROM and RAM left over to
support the monitor and usually a spare serial port, of course. Despite
other interfaces (USB, CAN, Ethernet) being available serial remains the
most popular for test ports to a system.

Most manufacturers will only make the monitor source available at
extra cost, and may even require a royalty for each system shipped
which contains the monitor. This has generally ruled out an integral
commercial monitor for most applications and it rarely happens these
days, as an ICE would be less expensive, except where it is required to
have a monitor in the production version.

Also the increasing use of micro controllers with on chip FLASH or OTP
often rules out the classic monitor debugger. Therefore the traditional
monitor is not often found in shipped products.

The alternative is for the development team to write their own
diagnostic program that will take control of the system when it is
externally put into a debug or "Engineer" mode. This happens a lot
with alarm systems, washing machines, modern cars and the like that
need maintenance checks. This is not a true monitor and does not
usually offer the same type of general debugging facilities but can be
tailored specifically to the system.

2.3. Modern Simulators

Simulators have come a long way in a few years, largely due to the
growth in power of the PC and GUI operating systems. Simulators
seek to provide a complete simulation of the target CPU via pure
software. The program can be "run" and its behaviour observed up to
a point.

Of course, being a software simulation, execution does not proceed in
real time and all IO signals must be generated by special routines
designed to mimic peripherals or other extra-CPU devices. The other
problem is that simulators will simulate to the specification of the
part… which is more than some silicon does. This can lead to
differences between the simulated and the actual part. Further to this
there can be revisions of parts. The other thing to bear in mind is that
a simulator is software… and software is notorious for having
"interesting" behaviour of its own.

Simulators now offer high-level language debugging because they are
normally bundled, or more usually tightly coupled, with HLL compilers.
Most commercial simulators are able to simulate a large range of
CPUs in a family.



The simulator's role can span an entire project but its inability to cope
with real time events and signals means that, as the major debugging
tool, it is limited to programs with little IO. Also, if the program has to
interact with other devices, whether on chip peripherals or external
inputs, the complexities of accurately simulating them can be a
problem.

The real strength of the simulator is in being able to exercise the
whole or parts of a software system repeatedly with predefined
signals, usually using scripts. Unit test of a source code module or
function can be carried out under simulation (see QuEST 3 Advanced
Embedded Software Testing). Thus before committing a new function
to the whole system, a standard test on a simulator should rule out
serious crashes.

Due to the low cost of simulators, and the fact that they do not need
external hardware, they have become quite popular. Many compiler
and ICE vendors now supply them whilst the traditional suppliers, the
silicon manufacturers, have tended to fade from the scene. More often
the silicon vendors will work with a 3rd party tool vendor when
simulators, compiler and ICE are needed for new chips. Thus most
compiler suites now offer a simulator as an option. Most are now
tightly coupled with the compiler IDE.


The simulator can often easily be converted to run new versions of a
part or to interface to new peripherals. This makes development of
software for a new part possible even before the part has been put on
silicon. This enables silicon vendors to have software available and
create an interest for a new part. So if there is a simulator but no ICE
available for a part you are thinking of using, take care! Either the part
is very new (unstable?) or no one sees a big enough market for it to
warrant making tools for it. If it is a niche market silicon vendors
usually team up with a 3rd party tools vendor to produce the required
tools.


2.4. Modern In Circuit Emulators

The modern emulator, like all electronics, has got smaller and faster
with a lot more features for a lot less money. The ICE lost the built in
screen, keyboard and floppy disks making them smaller, less
expensive to manufacture and helped to increase their reliability. At
the same time emulators are faster in raw processing speed, which
means that they can support non-intrusive real time emulation at
speeds only dreamed about a few years ago. This is because
embedded targets, generally, have not increased in speed at the same
rate that host CPUs have.

We are still using, as a target, the 8MHz 8051 that was around when
the 80286-based PC running at 16MHz was king, but now we have the
2.5GHz (it was 1GHz when I first wrote this 6 months ago!) Pentium 4
used in the host for debugging. The 8051 family, which still makes up
a major part of the embedded market, still only tends to run up to
20MHz, although recently speeds have been increasing up to 30MHz.
However, internal clock doubling on some chips requires an ICE to
emulate at 40MHz to 60MHz. In the 16 and 32 bit markets speed has
increased but again nothing like as fast as the speed increase in hosts
and debugging tools.

At last the ICE has broken free of assembler and has HLL support, well
mainly C and C++ anyway!

Note, the problems in providing C++ (and OO) support are much
greater than providing C support. Also C is not a subset of C++. That
is a myth that has caused much trouble in recent years.

There are a few ICE that do support other languages such as PL/M,
Pascal, Ada and surprisingly Modula-2! (From the November 93
Embedded Systems Engineering magazine survey.) These non-C
languages are becoming less common as time rolls on. PL/M is an old
language that is fading away, whilst Pascal and Modula-2 never really
gained critical mass before C swept all before it. Ada will of course
continue whilst the US Government requires it; however it should have
enough users to stand on its own anyway at the moment. This is a
purely commercial comment of supply and demand and I am not
commenting on the technical suitability of any of the languages.

ICE can now display the HLL with full symbol and type information
using the various formats in use, such as ELF, DWARF, OMF, etc, which
are common to a particular group. There are a few proprietary
systems still in use but these are fading out. The compiler produces
these debugging files. The ICE or simulator (and latterly the monitor)
processes these to get the information in the form it needs. In the
8051 field OMF or extended OMF are used.

As can be seen from the screen shot, the modern ICE can give not only
the name of the variable but also its physical location, size and type.

When thinking of using a compiler suite with a debugger check for
compatibility and upgrades. Some inexpensive 8051 ICE still do not
implement the extended OMF. Fortunately it is an extended OMF and a
basic OMF tool should not crash if it gets an extended OMF file; it just
does not use the extensions. On the PowerPC front I have seen a case
where a compiler vendor upgraded all the compilers at a customer's
site and mentioned in passing that they could now do ELF2. However,
what did not become apparent until a few days later was that it no
longer supported ELF1. There was no ELF1/ELF2 switch: it was ELF2 or
nothing. However the debuggers were still ELF1 and could not read
the ELF2 files. The debugger vendor said they would "probably" have
the ELF2 version out in six months…

If anyone is still using PL/M let me know at chris@phaedsys.org I have some PL/M
tools going spare.

Whilst mentioning the symbol display above and the user interface etc,
another improvement in the modern ICE is that dipswitches are no
longer required. ICE these days are point and click….

Set up and configuration should all be via software (and FLASH RAM or
EEPROM). This includes target clock speeds as well as memory, CPU
type (within a family) and peripheral configuration. It is rare to find
jumpers on ICE except occasionally at the personality pod.

Because everything is now set up in software it is possible to save state
and set up information. This means that debugging can be resumed
after a break with certainty and speed. No more sitting around on
Monday mornings trying to recall exactly what the settings were when
you finally gave up late on Friday! As an aside most ICE can be driven
by scripts (see QuEST 3 Advanced Embedded Software Testing) which
means that scripts or macros can be used to speed up common tasks.

This increase in everything, coupled with an even faster drop in the
cost of technology, has unfortunately resulted in some manufacturers
using brute force, rather than better engineering, to crack some
problems. Initially trace was 44 cycles; now it can be 64K lines or
more of information. This, as we shall show later, is not always as
useful as it seems.

Another example of changing technology (that is linked to the trace
problem indirectly) is break points.

In the past the ICE would execute the line where the break point was
and then stop. This required the engineer to place the break point on
a suitable line before the point of interest. This was not always as
easy as it sounds. For example:-

char Func(unsigned char b, unsigned char y)
{
    unsigned char x[max_count];
    unsigned char count = 0;
    unsigned char a = 1;
    unsigned char c = 2;
    unsigned char z = 0;

    while (max_count > count)
    {
        x[count] = y/((a*b)/(b+c));
        c = a+(b/a);
        z = z + x[count];
        count++;
    }
    return (c + z);
}

To find the return value of z one needs to set a break point on the
return but not execute it.

Executing the return and then stopping will put one back into the
calling function. This would cause a loss of visibility of the local
variables in this function.

Stopping on an instruction before the return is going to be inside the
loop on the first iteration unless suitable triggers are provided.
Neither option is satisfactory but many ICE still expect you to work this
way.

Another common type of line is:

Var1 = var2 + get_result(var3);

This makes it impossible to check the return value. The only option
(other than putting a break point on a return in the function) is to be
able to drop into assembler. If this were the way the Func() in the last
example was called, it would be almost impossible to debug with the
older type of ICE.

As can be seen here, the lower window has a breakpoint on the HLL
function call that is (automatically) mirrored by a break point in the
assembler window above (which also has the C interleaved).

This permits the engineer to step through the assembler so the
mechanics (and the values) of the return value can be observed.


With older ICE this just was not possible because there would either be
NOP's or the break point would have been somewhere in the middle of
the calling function.

To stop at but not execute a line requires a lot of work to provide a
non-intrusive look ahead op-code decode logic. The side effect is a
very good trigger system for the users and it also has a beneficial
effect on the trace.

NOTE Some ICE, without the "look ahead" opcode decoder, still insert
NOP's into the assembler before each C code instruction to "emulate"
these features. This instruments the code and therefore it is not a true
emulation.

Now memory is comparatively inexpensive. Ok it goes up and down a
bit but compared to the 1970's and early 80's it is unbelievably
inexpensive. As mentioned previously many ICE have large trace
buffers and advertise the fact suggesting that more is better. Well,
depending how it is used it might be.

Some manufacturers took the time to look at what was actually
needed in a trace buffer and put the work into making them more
effective. It is far better to be able to trace 1k bytes at the right time
than 8k bytes "around the problem" that someone has to wade
through to find the fault.

To trace the "right" area requires a good set of triggers and holding
the right data in the buffer. The importance of the triggers and the
ability to filter cannot be overstated. Good triggers can ensure that
the trace only records the area that is needed and holds the correct
data.

The right data includes:

Executed addresses (not loaded and discarded ones)
Code labels and variable names
External signals, ports etc
Bus states, read/write/fetch and interrupt ack


Some ICE use a technique of only recording the program branch points
in the trace buffer, then dynamically recreating the source code when
the trace is examined. This can, when coupled with filtering triggers
for stopping and starting, and things like user-selective ignoring of
library calls, make a very powerful trace system without requiring a
large trace buffer. In fact this method (by rule of thumb on 8051 ICE)
has the effect of making the trace appear three times its physical size.
So a 2K trace using the branching technique will probably hold, on
average, as much as a 6K trace buffer. This also has the side effect
that less data is required to be transmitted from the ICE to the debug
software host (usually a PC).
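A rough sketch of how such branch-only reconstruction works follows.
The data layout and function name are invented for illustration, and
instructions are treated as one address wide purely to keep the
example short: the ICE stores only the taken branches, and the host
walks the code image sequentially between them.

```c
typedef struct { unsigned from; unsigned to; } branch_rec;

/* Rebuild the executed address stream from a branch-only trace.
 * Walk addresses sequentially from 'pc'; whenever the next recorded
 * taken branch fires at the current address, follow it instead of
 * falling through to the next address. */
unsigned reconstruct(unsigned pc, const branch_rec *trace, unsigned ntrace,
                     unsigned *out, unsigned max)
{
    unsigned n = 0, t = 0;
    while (n < max) {
        out[n++] = pc;
        if (t < ntrace && trace[t].from == pc)
            pc = trace[t++].to;   /* taken branch recorded here  */
        else
            pc = pc + 1;          /* straight-line fall-through  */
    }
    return n;
}
```

A single record {2, 10} is enough to recover the run 0, 1, 2, 10, 11,
which is why a small buffer of branch records can stand in for a much
larger raw trace.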

The trace should also be accessible “on the fly” whilst the system is
running and searchable within the ICE (not in a text editor later).

Of course most ICE these days can display the trace in a variety of
formats such as raw, assembler, high-level language lines, or, as
shown here, "signal", mimicking the display on a logic analyser and
making debugging with a storage scope much easier. NOTE this does
not replace the digital scope or logic analyser but makes it much
easier to use the two together without having to do mental mind flips
to use the two displays together.

Some ICE need the larger trace buffers because they record all the
assembly code, not just the C, and cannot be switched to ignore
library calls. In addition, they may have a line holding a NOP for every
line of C as mentioned above.

Some ICE with all these good features mentioned have also been
increasing their buffer sizes of late. This is for an entirely different
reason to the brute force reasons. Some ICE vendors are working with
high-end code analysis tools that permit the dynamic analysis of code
and animation of things like state-charts (Rhapsody from I-Logix is
shown here), whereby the animation is actually run on the target
hardware via an ICE.

To work with these tools does require a large trace buffer, but it is of
little use without all the other mentioned features such as triggers and
break-before-make breakpoints.

In many cases to produce an ICE one needs a bond-out or hooks chip
that is only available under license from the original silicon
manufacturer. Hooks are explained later. To get round this, modern
technology has produced the FPGA. The FPGA is often used in
inexpensive ICE to produce a system that on the surface appears to do
as much as an expensive ICE. That is until you actually try to use it
and discover that it does not quite match the real ICE in performance,
usually where you need it on the edges of the hard real time
performance. Having said that, "proper" ICE also use FPGAs and ASICs
as well, but not as a way of avoiding licensing the correct technology.

One “entry level ICE” that did not have a hooks licence was emulating a
part using a second processor and a monitor type system. Users were
often not made aware of this drawback. It was not until they started to
get stuck in to the debugging that they discovered the difference and
had to buy a proper ICE…. They now have the "cheap" ICE on the shelf
gathering dust. Not such a bargain. Remember you only get what you
pay for.

The new inexpensive technology has spawned many so-called
“Universal” ICE that claim to cover several MCU families. Some claim to
cover "all 8-bit" systems. If it was that simple to do properly the
major ICE manufacturers would be rushing to the FPGA and producing
multiple architecture ICE. It would be wonderful for production costs
and profit margins but it hasn’t happened with any of the reputable
vendors so far.

The phrase “Jack of all trades and master of none” comes to mind.
One only has to compare the Von Neumann architecture of the 68
series with the idiosyncrasies and Harvard architecture of the 8051 to see
that there will have to be some compromises or some dual systems
under the lid. If it were a dual system these ICE would cost more than
a single family ICE. However, “Universal ICE” always seems to cost so
much less than single-family systems. These inexpensive “Universal”
ICE usually require a lot of “add on’s”, “add in’s” and upgrades to
come close to matching the single family ICE. There is no such thing as
a free lunch.

What has happened is that the cost of the high-end universal ICE has
come down and, due to modern technology, they have got a lot
smaller. These ICE are modular in both hardware and firmware. So
whilst no system is "universal" they have a range of generic and
architecture specific modules that can be used in many combinations.
At one time these ICE were housed in 19-inch racks and the modules
were large PCB's; now they are considerably smaller, often a single
ASIC chip.

What is coming to the fore is the professional modular ICE. This is a
professional standard universal ICE. The inexpensive "universal" ICE
are what they always have been: minimal hardware and as much as
possible in software. The problem here is that they do "most" of what
you need reasonably well. The trouble is that when you get to the
point where you really need a proper ICE they tend to let you down.

I recently (late 2001) had to demonstrate a good ICE to a customer on
their equipment. (It was a Hitex MX51 for 8051) The ICE refused to
correctly boot the system. The customer pointed out that the Hitex
ICE was "no good" because they had a cheap ICE that worked
perfectly…. A couple of hours later the answer became clear. The
customer's system had a deep-seated problem in start-up that their
cheap ICE totally ignored. A couple of days later and the problem was
solved. It also had a knock-on effect on the problem I was called to
see in the first place. The cheap ICE would not have found that
problem.

One of the “it came and went” parts of the ICE history is the serial link
problem. At one time, a serial link ran at 1200, 2400, 4800 and (if
pushed) 9600 so people looked at parallel links from host to ICE.
Some vendors even went as far as putting the ICE into the PC! Others
insisted on putting parallel interfaces, most of them proprietary, on to
the systems.

Now serial links run well at 115200. The parallel link has gone,
replaced in high-end systems, with USB or an Ethernet link.

The other interesting problem we have found is that many (most?)
people debug on MS Windows platforms. From Win3* there has been
less user control over the hardware. In NT it was positively
discouraged giving rise to many problems as those with parallel port
dongles discovered. Despite the fact that the parallel port grew up to
become bi-directional it is still seen by windows as essentially a
printer (output) port. This has meant that it has become a lot less
suitable for controlling equipment. A pity.

The bottleneck was rarely the serial link anyway; it was the generation
of the symbols. By generating the symbols at compile time, before the
program is loaded into the ICE, a lot less data has to be passed up the
serial link. The other method is rather like interpreted BASIC: every
time there is a breakpoint the symbol information required has to be
processed and sent up from the ICE to the host. This slows down the
screen update.


As mentioned, the USB port will take over (or already has?) from serial
for the lower end (8 & 16 bit) and Ethernet for the higher end (16 to
128 bit).

2.5. New ICE Features

Emulators are now able to offer many new features, apart from HLL
support and vastly improved triggers and trace, that were not
dreamed of in the past. Two of the most important advances are code
coverage and performance analysis. Some simulators also offer these,
but not on the real hardware or in real time. In the case of code
coverage this is not too bad, though obviously it does put a query
over interrupt sequences from external sources; hopefully the internal
sources will be synchronised. However, you can never be certain of the
interrupts in anything other than real time. For performance analysis
it can only be a theoretical result.

There is a separate paper " Advanced Embedded Software Testing
QuEST 3" that looks at the implementation and uses of code coverage,
performance and regression testing with an ICE.

2.5.1. Code Coverage

Code coverage is now an important part of testing and validation,
particularly in safety critical environments. Actually it always was
important; now it is recognised as such. In simple terms this test
should result in evidence that during a certain test, all instructions
were executed and no malfunctions were observed. This gives a high
level of confidence that there are not any hidden bugs in un-executed
code waiting for the end user!

There are many software tools that now do code coverage but the
problem for embedded systems is that many will instrument the code
or not actually run on the target. Using an ICE you know the code has
been run on the hardware in real time. However it is not that
simple…. See [Maric].

Code coverage comes in several versions: statement coverage, branch
coverage and modified decision coverage. The basic version is:

Statement coverage. This simply says that the line has or has not been
run. In some cases it will also say if it has been partially covered; this
covers lines that contain more than one clause. This should be
possible on any reasonable ICE, as shown.

Branch coverage is a little deeper. This is not usually available as a
"tick box" option on an ICE; it is the prerogative of specialised tools. It
does coverage based on the number of paths through the code. In the
code shown below, statement coverage would show 100% coverage if
each line had been covered.

int a, b, c, d;

void f(void)
{
    if (a)
        b = 1;
    else
        b = 0;

    if (c)
        d = 1;
    else
        d = 0;
}

However, looking at the diagram below it can be seen that there are in
fact four paths through the code. These would be:

A = 0 and C = 0 Thus B = 0 and D = 0
A = 0 and C = 1 Thus B = 0 and D = 1
A = 1 and C = 0 Thus B = 1 and D = 0
A = 1 and C = 1 Thus B = 1 and D = 1

Statement coverage would be happy with
the two lines below:

A = 0 and C = 1 Thus B = 0 and D = 1
A = 1 and C = 0 Thus B = 1 and D = 0

Therefore all the lines of code would be covered but not all the paths.


So whilst an ICE doing statement coverage will give 100%, it will not
test branch coverage; at least not in its standard "off the shelf" form.
It is reasonably simple to make an ICE that has code coverage do
branch coverage, but first we should look at the final level of coverage.

Modified Decision Coverage. This takes branch coverage to the next
level. This looks at not only the paths but the values of the decisions.
In the previous example we had a simple true or false.

int a, b, c, d;

void f(void)
{
    if (a < 0)
        b = 0;
    else
        b = 1;

    if (b < 3 && c <= a)
        d = 0;
    else
        d = 1;
}

In this case both statement and branch coverage would give 100%
without ever testing "c <= a" when b is not less than 3. This is because
C stops evaluating as soon as it gets a false.

In this case it appears immaterial but it does highlight the point that
100% code coverage can mean different things to different people.
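The short-circuit behaviour can be demonstrated directly. This is an
illustrative sketch only; the helper functions and counters are invented
instrumentation for the example, not something a coverage tool would
insert.

```c
static int left_evals, right_evals;

/* Each operand of the && is wrapped in a function so we can count
 * how often it is actually evaluated. */
static int b_lt_3(int b)        { left_evals++;  return b < 3;  }
static int c_le_a(int c, int a) { right_evals++; return c <= a; }

/* Mirrors the second 'if' in the example above. */
int decide(int a, int b, int c)
{
    int d;
    if (b_lt_3(b) && c_le_a(c, a))
        d = 0;
    else
        d = 1;
    return d;
}
```

Calling decide(0, 5, 0) leaves right_evals at zero: because b < 3 is
false, C never evaluates c <= a at all, yet every statement and both
branches of the if have been exercised.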

As mentioned, most ICE only do statement coverage as standard but it
is possible to do both other versions of coverage with an ICE [QuEST 3]
and [Buchner/Tessy]. Most ICE have a script language that can be used
to program the tool. Usually an ICE can also input variables into the
program under test. Thus by careful analysis one can exercise the
code to cover all branches, and it only takes a little more work to do
decision coverage as well.

This falls under the remit of the Tessy article for the ESC II brochure
from F. Buchner at Hitex DE, and QuEST 3 on advanced testing. This is
a topic unto itself that merits close inspection and study if you do any
embedded testing.



So far we have only looked at code coverage; there is also data
coverage. It can be most illuminating to find that some data has or
has not been read or written to. Data coverage monitors which data
areas were accessed during a test and, crucially, allows the potentially
dangerous READ-before-WRITE un-initialised data bug to be identified.
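A minimal example of the read-before-write bug that data coverage
exposes (the function names are made up for illustration):

```c
/* BUGGY: 'total' is read by '+=' before anything has written to it,
 * so the result depends on whatever garbage happened to be in that
 * memory. Data coverage flags the READ of 'total' before any WRITE. */
int sum_buggy(const int *readings, int n)
{
    int total;                  /* never initialised */
    int i;
    for (i = 0; i < n; i++)
        total += readings[i];
    return total;
}

/* FIXED: one write before the first read. */
int sum_fixed(const int *readings, int n)
{
    int total = 0;
    int i;
    for (i = 0; i < n; i++)
        total += readings[i];
    return total;
}
```

The buggy version may well pass a quick bench test if the stack
happens to hold zero, which is exactly why this class of bug is so
dangerous in the field.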

To really validate embedded software it must be running in real-time
on the target hardware. This is only possible with an ICE. Code
coverage can be done on a simulator but it is not in the real target
environment, and a ROM monitor, unless shipped with the product,
changes the memory map (and is not real time anyway). The DTI's
TickIT guidelines, the MOD's DefStan 00-55 and the motor industry's
MISRA guidelines advise that coverage tests be performed prior to
release.
On a strictly commercial note, an ICE (which has many other uses) is
usually less expensive than the better code coverage and analysis
tools.

2.5.2. Performance Analysis

Performance analysis for many applications requires hard real-time
not the pseudo real-time of simulators. Simulators can show
performance in cycles and percentages and even milliseconds but this
pre-supposes that everything is synchronised.

I once completely tested and debugged a smart card application. A
smart card has five IO pins: power, ground, clock, IO and control.
Everything is synchronised to the clock pulse. Thus in a simulation
everything can happen on cycles in a known way.


Most embedded systems have external asynchronous I/O and internal
interrupts. This lack of synchronisation makes conclusive testing (and
timing) in a simulator less reliable.


Modern ICE usually all have profilers. This gives a reasonable level of
performance analysis; whilst better than a simulator, these usually do
not have a high level of accuracy or the flexibility for precise timing.

The analyser is usually based around the trace, triggers and
conditional filtering to permit timing of precise areas of code, either
inclusive or exclusive of other calls and interrupts. A modern ICE can
also give, switchable by the user, net function times: that is, the time
taken for the actual function to execute with and without the
subroutines and library calls it makes.

For example, you may wish to analyse a major function that contains a
complex set of if statements. Timing it would be pointless if called
functions with varying execution times were included; of course, you
would become proficient with the pocket calculator. By removing the
called functions, the function under test can be accurately timed.

The ability to time event pairs, coupled with triggers and conditional
guards, permits analysis with precise timing. It also means that long-term
tests can be run looking for those intermittent glitches that
traditionally take teams weeks to find. As mentioned, this is non-intrusive
timing, so this is reality, not simulation, and exactly what can be
expected in the field.

This means that the ICE is now in a position to do full system tests
rather than simply be used for debugging, especially if scripts and
macros (tool control language) are used to set up and run the tests
and measurements.



3. New methods & N-wire debuggers

One of the problems for the new ICE is the target package. The
40-pin DIL has all but gone, now only used on low-budget projects. Most
targets now use surface-mount chips with many legs (often over
200) on a much smaller pitch. This has resulted in a variety of
complex, expensive and delicate adapters. They are not only
mechanically delicate but need a great deal of care taken over
their electrical characteristics.

In an effort to standardise debugging techniques and get round the
need for multiple complex cables and pods, manufacturers devised
various schemes. Generally, these systems use a few pins on the
processor taken to a standard socket. Whilst, in most cases, they
provide information from inside the CPU, they do not all quite
provide all the real-time information a full ICE pod and bond-out chip
would. In some cases the services offered are quite restricted; however,
other methods do give full ICE capability. A range of methods is
described in the following pages. Many of these are specific to a
particular family, vendor or group of parts.

3.1. ICE-Connect (Hitex)

On systems like the 8031 (and C166), where the address and data bus
are available through ports in some configurations, Hitex has a system
called "ICE-Connect".

The ICE-Connect system is particularly useful in high-integrity systems
where code coverage and testing must be shown with the final code and
hardware as it will be used.


The system uses a simple 30- or 56-way header for 8- or 16-bit systems
that permits a full ICE to take control of the target in the usual way,
but instead of costly bond-out chips the actual target processor is used.
All that is required is for the target board to be tracked for the
ICE-Connect header, which makes it cost-effective even on production
boards. Any board may then have the inexpensive ICE-Connect header
fitted and the system tested. This means that new software may be tested
on real systems using code coverage and performance analysis, which
makes this system almost obligatory for some high-integrity systems.

On the down side, the connector has to be designed into the PCB from
scratch. Also, several lines (4 on the 8051) must be intercepted rather
than monitored. This usually means that the lines are linked on the
PCB and have to be cut if the ICE-Connect socket is fitted. When
debugging and testing are finished, the links can be put into the socket
to make the board function stand-alone as it did before the links
were cut.

This method is very cost-effective for low-volume production, which is
what the high-integrity systems tend to be. In high-volume production
the cost of a few pennies for the tracking and holes would mount up.

The advantage of this method is that you get full ICE control, with all
the code coverage and timing analysis, on the actual production board
using the processor, components and code that will be shipped to
the customer. This method is open to two families that are, for all
practical purposes, not single-chip designs.

3.2. Hooks

For some families, emulation of on-chip ROM-based programs is
possible. Traditionally this has been done using expensive "bond-out
chips". These are parts with the additional internal busses brought out
to the edge of the part. They are also expensive to produce.


In the 8051 world several chip manufacturers use the Hooks system. For
example, Atmel WM and Philips use it in their RX2 parts, and Siemens
use it in their C500 and C166 families, to control the execution of
MCUs and to gain information on the internal operation of the
controllers.

Each production chip has built-in logic to support the Enhanced Hooks
Emulation Concept, so no expensive bond-out chips are necessary for
emulation. This also ensures that emulation and production chips are
identical. Remember that when a part is changed, a new bond-out would
otherwise also be required.

The Hooks technology is a port-replacement method. Using the
Infineon C500 (8051 range) as an example: it requires embedded
logic in the C500, together with the Hooks interface unit (EH-IC), to
function similarly to a bond-out chip. This simplifies the design and
reduces the cost of an ICE system. ICE systems using an EH-IC and a
compatible C500 are able to emulate all operating modes of the
different versions of the C500 microcontrollers.

This includes emulation of ROM, ROM with code rollover and ROM-less
modes of operation. It is also able to operate in single step mode and
to read the SFRs after a break.

In this 8051 example, port 0, port 2 and some of the control lines of
the C500-based MCU are used by the Enhanced Hooks Emulation Concept to
control the operation of the device during emulation and to transfer
information about the program execution and data transfer between the
external emulation hardware and the MCU. The other advantage is that
the target MCU is exactly the same as the MCU in the ICE. There are two
types of Hooks emulation, Basic and Enhanced:

Standard Hooks:
Additional time multiplex on ports P0 and P2
Modified ALE and PSEN#

Enhanced Hooks:


No additional time slots but multiplexed port P2
Multiplexed and bidirectional EA#
Modified ALE and PSEN#

Philips use Basic Hooks, which has a maximum upper limit of about
25MHz. However, it has been found that most ICE will only work
reliably to 20-22MHz, despite the fact that the parts are designed to
run at 33MHz. Atmel and Infineon use Enhanced Hooks, which can be
ICEd to over 66MHz.

3.3. Once Mode

Intel developed their own system, which is not so commonly used.
Industry prefers "open" systems these days.

The ONCE ("on-circuit emulation") mode facilitates testing and
debugging of systems using the device without the device having to
be removed from the circuit. The ONCE mode is invoked by:

Pulling ALE low while the device is in reset and PSEN is high;

Holding ALE low as RST is deactivated.

While the device is in the ONCE mode, the Port 0 pins go into a float
state, and the other port pins and ALE and PSEN are weakly pulled
high. The oscillator circuit remains active. While the device is in this
mode, an emulator or test CPU can be used to drive the circuit.

Normal operation is restored after a normal reset is applied.



3.4. BDM (Motorola)

BDM, or Background Debug Mode, is the name Motorola use for their
generic on-chip debug system. It can also be used for programming
flash memory. There are several versions of BDM:

A 6-pin connector for HC12, Star8, Star12

A 10-pin connector for HC16, CPU32, PowerPC

Additional output signals are available on different devices or
architectures (ECLK, mode pins, trace port). The transportation layers


and the protocols used are NOT identical on different architectures.
The following basic debug features are in common:

Level 1:
Run control
Register and memory access during halted or running emulation

An additional trace port may be available, e.g. on the PowerPC.

Level 2:
Branch trace
Data trace
Ownership trace

BDM permits reading and writing of registers and memory, including
blocks of memory, and stopping and restarting from the PC (which may
hold the original or a modified value). BDM takes no resources from the
target; in normal running the target runs in real time. BDM only comes
into play when activated, whereupon it freezes the whole MCU. BDM
cannot replace the ICE but does give the developer a dynamically
loaded, but basic, ROM monitor that is always present. The power of
BDM, such as it is, comes from the software running on the PC host.

Whilst BDM is not all-powerful, it is free in the sense that it takes no
resources at all and is on the chip anyway. BDM adapters and host
software are comparatively inexpensive. As BDM cycle-steals, in some
cases it can be intrusive, but this is not usually a problem.

With BDM, you can run the application in real time and set breakpoints
if you run it from RAM, or if you have hardware breakpoint registers in
the chip (which not all have) that allow you to stop even when you
execute from ROM.

An in-circuit debugger is a little bit more hardware: in this case you
get run control (start/stop) and memory access via BDM, but in addition
you connect to the other external signals of the MCU to allow for
functionality like code breakpoints in (external) ROM, bus-cycle trace
and data breakpoints.

There is more to this; some vendors provide mappable memory at this
stage as well. With a real in-circuit emulator you have all of the above
functions, but in addition you have emulation memory that can overlay
or replace physical memory in your target hardware. Also, the emulator
should always run even if there are problems in your target hardware,
allowing you to track down those problems.


3.5. JTAG

Several North American and European electronics engineering companies
formed the Joint Test Action Group (JTAG). JTAG and the IEEE/ANSI
standard 1149.1 (an industry standard since 1990) are synonymous. JTAG
was originally designed as a Boundary Scan test system for CPUs, MCUs
and other parts.

Boundary Scan was developed in the mid-1980s to solve physical access
problems on PCBs caused by increasingly crowded assemblies due to novel
packaging technologies. This technique embeds test circuitry at chip
level to form a complete board-level test protocol. With Boundary Scan
you can access even the most complex assemblies for testing, debugging
and on-board device programming, and for diagnosing hardware problems.

Basic JTAG is a 4- or optionally 5-wire system (TDI, TDO, TCK, TMS and
optional TRST), principally used on CPUs by Intel for boundary scan in
testing. That is, it will provide repeated snapshots of the pins on the
edge of the CPU. There is also a Test Access Port (TAP) that permits a
limited number of instructions. As information is clocked in and out
serially, JTAG is not real time or a replacement for a full ICE
connection, as it will not show activity on the internal busses.

A sixteen-state FSM called the TAP Controller has to be implemented in
the target silicon. The TAP Controller understands the basic Boundary



Scan instructions and generates internal control signals used by the
test circuitry.

A Boundary Scan Instruction Register and decode logic in addition to
two mandatory Boundary Scan Data Registers, the Bypass Register and