Emulation of an Embedded System


T. Steckstor © RSP’99, International Workshop on Rapid System Prototyping

Performance Analysis of a RTOS by
Emulation of an Embedded System

June 17th, 1999

T. Steckstor, K. Weiß, W. Rosenstiel


Lehrstuhl für Technische Informatik

University of Tübingen

D-72076 Tübingen, Germany

e-mail: stecki@fzi.de


Outline


Introduction


Emulation environment: SPYDER-CORE-P1


Benchmark example: Actuator-Sensor-Interface (ASI) master unit


Embedded system performance analysis


Analysis results of different cache configurations and cache sizes


Conclusion


Introduction


Embedded systems in industrial automation

Application-specific hardware implemented in an FPGA

Application-specific software running on a microcontroller

The interaction between the hardware part and the software part imposes hard real-time requirements (reaction times of about 200µs)

Motivation from an embedded system designer's point of view

Sophisticated software task architecture (RTOS)

Novel microcontroller architecture with caches

Because fast reactions to external events are required, task switching and interrupt reaction times become a major performance bottleneck


Emulation


Embedded system with complex internal system behavior


Emulation runs very close to the final target system and gives a detailed internal view

Emulation makes it possible to find the best hw/sw partitioning early in the design process

Emulation gives answers to the following questions:

What is the optimum clock speed?

How much performance is consumed by the RTOS?

How large is the performance gain from the on-chip caches, and what can be done with this gain?

What is the effect of different cache sizes on the important RTOS task switching and interrupt reaction times?


Emulation environment: SPYDER-CORE-P1

Figure: block diagram of the CORE-P1 AT-ISA add-on board. An embedded PowerPC PPC403 microcontroller core (25..80MHz) sits on a 32 bit microcontroller bus together with DRAM (1-128MB), FLASH (8MB), Ethernet (10MBit, Intra/Internet), 2 serial ports, and a DPRAM (2KB) connecting to the AT-ISA bus. A driver leads to an 8 Bit I/O bus for peripheral devices. Extension headers I, II and III carry the FPGA architectures (Xilinx XC4000, Xilinx XC6000, Actel add-on) and an analog module.


Benchmark example: ASI master unit

ASI communication system

ASI real-time critical constant (220µs)

Figure: ASI bus with one ASI master, an ASI power supply and up to 32 ASI slaves, each slave with 4 inputs and 4 outputs (4I/4O).

Master call: 0 SB A4 A3 A2 A1 A0 I4 I3 I2 I1 I0 PB 1

Slave answer: 0 I3 I2 I1 I0 PB 1
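The two telegram layouts shown on this slide can be sketched in a few lines of Python. This is only an illustration of the bit order on the slide, not the FPGA implementation from the talk. The function names are hypothetical, and the parity rule (PB as even parity over the bits between the leading 0 and PB) is an assumption that the slide itself does not state.

```python
def even_parity(bits):
    """Return the bit that makes the number of ones in `bits` plus itself even."""
    return sum(bits) % 2


def to_bits(value, width):
    """Split an integer into `width` bits, most significant bit first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


def build_master_call(sb, address, info):
    """Assemble the 14-bit master call: 0 SB A4..A0 I4..I0 PB 1."""
    if sb not in (0, 1):
        raise ValueError("SB must be 0 or 1")
    if not 0 <= address < 32:
        raise ValueError("address must fit in 5 bits (A4..A0)")
    if not 0 <= info < 32:
        raise ValueError("info must fit in 5 bits (I4..I0)")
    payload = [sb] + to_bits(address, 5) + to_bits(info, 5)
    # Assumption: PB is even parity over SB, A4..A0 and I4..I0.
    return [0] + payload + [even_parity(payload)] + [1]


def build_slave_answer(info):
    """Assemble the 7-bit slave answer: 0 I3..I0 PB 1."""
    if not 0 <= info < 16:
        raise ValueError("info must fit in 4 bits (I3..I0)")
    payload = to_bits(info, 4)
    # Assumption: PB is even parity over I3..I0.
    return [0] + payload + [even_parity(payload)] + [1]


if __name__ == "__main__":
    print(build_master_call(sb=0, address=5, info=0b10101))
    print(build_slave_answer(info=0b1010))
```

Keeping the frames as plain bit lists makes the field boundaries easy to compare with the diagram; a real ASI-UART would shift these bits out serially.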


Benchmark example: Implementation



Figure: implementation of the ASI master on the SPYDER-CORE-P1 hardware (same CORE-P1 board block diagram as shown for the emulation environment).

Software: the VxWorks real-time operating system runs the ASI application software (int_service, control) alongside a C-server, an http-server and TCP/IP

ASI hardware (single channel): microcontroller register interface, tele_receive, tele_send and ASI-UART, connected from/to the analog module

Target chip: XC4005E, 166 CLBs, utilization: 85%


Embedded system performance analysis

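Figure: timeline of one ASI cycle (t in µs, axis from 0 to 200) on the PPC403GA/33MHz with all caches disabled, bounded by the ASI real-time critical constant (220µs). Labels: Int., I/O, RTOS. Segments: int_reaction, int_service, task change, semTake and control task, with durations of 60, 30, 80, 40 and 10µs.

Time used by the RTOS: 170µs (77%)

Time used by the application: 50µs (23%)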
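The percentages on this slide follow directly from the 220µs cycle. The short check below assumes, as the conclusion of this talk states, that the 170µs portion is the RTOS share and the 50µs portion is the application share; the variable and function names are illustrative only.

```python
CYCLE_US = 220   # ASI real-time critical constant from the slide
RTOS_US = 170    # assumed RTOS portion (matches the 77% in the conclusion)
APP_US = 50      # assumed application portion


def share_percent(part_us, total_us):
    """Return the share of `part_us` in `total_us`, rounded to whole percent."""
    return round(100 * part_us / total_us)


if __name__ == "__main__":
    # The two portions must add up to the full cycle.
    assert RTOS_US + APP_US == CYCLE_US
    print("RTOS share:", share_percent(RTOS_US, CYCLE_US), "%")
    print("application share:", share_percent(APP_US, CYCLE_US), "%")
```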



Embedded system performance analysis

Figure: real-time execution time (used) relative to the real-time critical constant (220µs), plotted over clock frequency (25, 33, 40 and 80MHz; vertical axis from 0.5 to 1.5). Curves: without caches, with I-2KB/D-1KB, with I-16KB/D-8KB. The optimal WP is marked at 1.0 and 33MHz.

Above 1.0 the system is under-sized

Below 1.0 the system is over-sized

Real-time critical constant is 220µs

Optimal working point is 33MHz

With I-2KB/D-1KB, the performance gain at the optimal WP is 40%

With 8 times larger caches, the performance gain at the optimal WP is 350%
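The under-sized / over-sized reading of the chart can be sketched as a ratio against the 220µs constant. The sketch below is not the measured curve: it assumes, purely for illustration, that execution time scales inversely with clock frequency and that the cache-less system needs exactly 220µs at 33MHz. Real measurements will deviate, for example because of memory wait states. All names are hypothetical.

```python
CRITICAL_US = 220.0   # real-time critical constant from the slide
WP_MHZ = 33.0         # optimal working point from the slide


def normalized_time(clock_mhz, time_at_wp_us=CRITICAL_US, wp_mhz=WP_MHZ):
    """Execution time relative to the critical constant.

    Assumption: execution time is inversely proportional to clock frequency.
    """
    scaled_us = time_at_wp_us * wp_mhz / clock_mhz
    return scaled_us / CRITICAL_US


def classify(ratio, tol=1e-9):
    """Apply the rule from the slide: above 1.0 under-sized, below 1.0 over-sized."""
    if ratio > 1.0 + tol:
        return "under-sized"
    if ratio < 1.0 - tol:
        return "over-sized"
    return "optimal working point"


if __name__ == "__main__":
    for mhz in (25, 33, 40, 80):
        ratio = normalized_time(mhz)
        print(f"{mhz}MHz: ratio {ratio:.2f}, {classify(ratio)}")
```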


Analysis results of different cache configurations

PPC403GA, 33MHz (WP)

                          without I-Cache   without I-Cache   with I-Cache      with I-Cache
                          without D-Cache   with D-Cache      without D-Cache   with D-Cache
task switching time       100% (87µs)       -1%               +46%              +60%
interrupt reaction time   100% (27µs)       -4%               +50%              +43%
dhrystones                100% (6211)       +10%              +187%             +455%

PPC403GCX, 33MHz (WP)

                          without I-Cache   without I-Cache   with I-Cache      with I-Cache
                          without D-Cache   with D-Cache      without D-Cache   with D-Cache
task switching time       100% (87µs)       +10%              +152%             +340%
interrupt reaction time   100% (27µs)       +12%              +205%             +377%
dhrystones                100% (6211)       +11%              +207%             +529%
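One way to read the percentage gains against the 100% baselines is as a speed-up factor. The sketch below rests on that assumption, which the slide does not confirm: a gain of +X% is taken to mean the operation runs (1 + X/100) times faster, so its time shrinks by the same factor. The function name is hypothetical and the resulting times are illustrative, not measured values.

```python
def time_after_gain(baseline_us, gain_percent):
    """Convert a performance gain into a time, assuming gain means speed-up.

    Assumption: +X% gain means performance (1/time) rises by X percent.
    """
    factor = 1.0 + gain_percent / 100.0
    if factor <= 0:
        raise ValueError("gain must be greater than -100%")
    return baseline_us / factor


if __name__ == "__main__":
    # Baselines from the slide: 87µs task switching, 27µs interrupt reaction.
    for label, baseline_us, gain in (
        ("task switching, both caches, +60%", 87.0, 60.0),
        ("task switching, both caches, +340%", 87.0, 340.0),
        ("interrupt reaction, both caches, +377%", 27.0, 377.0),
    ):
        print(f"{label}: {time_after_gain(baseline_us, gain):.1f}µs")
```

If the gains were instead meant as a reduction in time, a different formula would apply, so this conversion should be checked against the original paper before it is relied on.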


Conclusion


The optimal working point is at 33MHz

At the optimal working point, 77% of the total execution time (220µs) is consumed by the RTOS

At the optimal working point, small caches improve execution performance by 40%; larger caches provide an average gain of 350%

Such gains can only be used for system services that are not real-time dependent, e.g. network communication via the Internet

If the application runs under the control of an RTOS, the cache sizes should be in a range of about 8-16KByte to provide a significant performance gain


Demonstrator: Industrial shelf model