Unit-2: General Purpose Processors

- Basic Architecture
- Operation
- Pipelining
- Programmer's View
- Development Environment
- Application-Specific Instruction-Set Processors
- Microcontrollers
- Digital Signal Processors
-





















1. Describe why a general-purpose processor could cost less than a single-purpose processor.

Using general-purpose processors (GPPs) in the design of certain embedded systems offers several benefits, one of which is lower cost than when a single-purpose processor is used. The benefits are given below:

1. The cost of a GPP is low because the processor manufacturer spreads its NRE cost for the processor's design over a large number of units (bulk manufacture and sales).

2. Because of assured bulk sales, the manufacturer can venture to invest a large amount (large NRE cost) in:

   a. incorporating advanced architectural features in the design of the general-purpose processor,
   b. using leading-edge optimization technologies,
   c. using state-of-the-art IC technology,
   d. using hand-crafted VLSI layouts for critical components.

   As a result, using such GPPs in an embedded system improves design metrics like performance, size and power.

3. When such GPPs are used in the design of embedded systems, the designer's NRE cost will be low, because he does not spend any time on processor design; instead he needs to write only software. He can apply a compiler and/or an assembler, both of which are mature and low-cost design technologies. Further, his time-to-prototype and time-to-market will be short, so he benefits from these factors as well.

4. Flexibility will be great, and the designer can perform software rewrites in a straightforward manner.

……………………………………………………………………………………………………………………………………………………………………

2. Create a table listing the address spaces for the following address sizes.

    Address size | Address space
    (a)  8-bit   | 2^8  = 256
    (b) 16-bit   | 2^16 = 65,536
    (c) 24-bit   | 2^24 = 16,777,216
    (d) 32-bit   | 2^32 = 4,294,967,296
    (e) 64-bit   | 2^64 ≈ 1.844674407 × 10^19
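The powers of two in the table can be checked with a short sketch (plain Python, no assumptions beyond an n-bit address giving 2^n locations):

```python
# Address space (number of addressable locations) for an n-bit address is 2**n.
for bits in (8, 16, 24, 32, 64):
    space = 2 ** bits
    print(f"{bits:2d}-bit : 2^{bits} = {space:,}")
```

The `:,` format specifier inserts thousands separators, which makes the larger entries easier to verify by eye.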

…………………………………………………………………………………………………………………………………………………………………………..

3. Illustrate how program and data memory fetches can be overlapped in the Harvard architecture. Explain pipelined instruction execution in detail.

In the Harvard architecture, the program memory space is distinct from the data memory space; such an architecture requires two memory connections. It can perform an instruction fetch (from program memory) and a data memory fetch simultaneously, by adopting a pipelined instruction-execution approach, as shown below.

A typical instruction execution consists of the stages: fetch instruction, decode instruction, fetch operands, execute operation, and store results. By adopting a pipelined approach, which is possible in the Harvard architecture, the instruction throughput increases through overlapping of these stages. It is easy to see that if all the above stages were executed one after the other, the execution time of the instruction stream would be longer than when it is pipelined.
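Assuming, purely for illustration, five stages of one clock cycle each, the gain from overlapping the stages listed above can be sketched as:

```python
# Cycles to execute n instructions with `stages` single-cycle stages.
def sequential_cycles(n, stages=5):
    # Non-pipelined: each instruction runs all stages to completion first.
    return n * stages

def pipelined_cycles(n, stages=5):
    # Pipelined: after the pipeline fills, one instruction completes per cycle.
    return stages + (n - 1)

n = 100
print(sequential_cycles(n))  # 500
print(pipelined_cycles(n))   # 104
```

For long instruction streams the pipelined cycle count approaches one cycle per instruction, which is the throughput increase the text describes.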









4. Explain the basic architecture of a general-purpose processor.

5. Explain the datapath with the help of a neat diagram.

6. Explain the control unit with the help of a neat diagram.

7. Explain the memory with the help of a neat diagram. Compare Princeton and Harvard architectures.

A general-purpose processor consists of a datapath and a control unit, both tightly linked with memory. The figure [fig. a] illustrates the basic architecture of a general-purpose processor.








Datapath: The datapath consists of the circuitry for transforming data and for storing temporary data. It contains an ALU capable of transforming data through mathematical and logical operations. The ALU also generates status signals, often stored in a status register, indicating particular data conditions like zero, overflow, parity and carry. The datapath also contains registers to store temporary data. An internal data bus carries data within the datapath, while the external data bus carries data to and from the data memory. Processors are distinguished by their size, usually measured as the bit-width of the datapath components.

Control Unit: The control unit consists of circuitry for retrieving program instructions and for moving data to, from, and through the datapath according to those instructions. It consists of a program counter (PC), an instruction register (IR) and a controller, as shown in the figure above. The program counter holds the address in memory of the next program instruction to fetch; the instruction register holds the fetched instruction. The controller consists of a state register plus next-state control logic. It sequences through the states and generates the control signals necessary to read instructions into the IR and to control the flow of data in the datapath. The controller also determines the next value of the PC: for a non-branch instruction, it increments the PC; for a branch instruction, it looks at the datapath status signals and the IR to determine the appropriate next address.

For each instruction, the controller typically sequences through several stages, such as fetching the instruction from memory, decoding it, fetching the operands, executing the instruction in the datapath and storing the results. Each stage may consist of one or more clock cycles. A clock cycle is usually the longest time required for data to travel from one register to another.
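The fetch-decode-execute sequencing described above can be sketched as a toy interpreter for a hypothetical accumulator machine. The instruction names, operand encoding and memory layout are invented for illustration, not any real ISA:

```python
# Toy fetch-decode-execute loop for a hypothetical accumulator machine.
# Instructions: ("LOAD", addr), ("ADD", addr), ("STORE", addr), ("HALT",)
def run(program, data):
    pc, acc = 0, 0             # program counter and accumulator
    while True:
        op = program[pc]       # fetch: IR <- program memory[PC]
        pc += 1                # controller increments PC (non-branch case)
        if op[0] == "LOAD":    # decode, then execute in the "datapath"
            acc = data[op[1]]
        elif op[0] == "ADD":
            acc += data[op[1]]
        elif op[0] == "STORE":
            data[op[1]] = acc  # store the result back to data memory
        elif op[0] == "HALT":
            return data

data = {0: 2, 1: 3, 2: 0}
prog = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT",)]
print(run(prog, data)[2])  # 5
```

Note how the loop mirrors the controller's stages: fetch, PC update, decode, operand fetch, execute, store.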

Critical path: The path through the datapath or controller that results in the longest time (e.g., from a datapath register through the ALU and back to a datapath register).

Memory: Memory serves the processor's medium- and long-term information-storage requirements. We can classify the stored information as either program or data. Program information consists of the sequence of instructions that cause the processor to carry out the desired system functionality. Data information represents the values being input, output and transformed by the program. We can store program and data together or separately. In the Princeton architecture, data and program share the same memory [fig. b]. In the Harvard architecture, program memory space is distinct from data memory space [fig. c]. The Harvard architecture can perform program and data fetches simultaneously, resulting in improved performance.


8. Define cache memory. Explain it with a neat sketch.

To reduce the time needed to access (read or write) memory, a local copy of a portion of memory may be kept in a small but especially fast memory called cache. Cache memory often resides on-chip and often uses fast but expensive static RAM technology. It is based on the principle that if a processor accesses a particular memory location at a particular time, then the processor is likely to access that location and its immediate neighbors in the near future. Thus, when we access a location in memory, we copy that location and some number of its neighbors (called a block) into cache, and then access the copy of the location in cache. When we access another location, we first check a cache table to see if a copy of the location resides in the cache. If the copy does reside in the cache, we have a cache hit, and we can read or write that location very quickly. If the copy does not reside in the cache, we have a cache miss, and we must copy the location's block into cache, which takes a lot of time. Thus, for a cache to be effective in improving performance, the ratio of cache hits to misses must be very high, requiring intelligent caching schemes. Caches are used for both program memory (often called instruction-cache or I-cache) and data memory (often called D-cache).
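The importance of a high hit ratio can be quantified with the usual average-access-time formula; the timing figures below are illustrative assumptions, not values from the text:

```python
# Average access time = hit_ratio * t_hit + (1 - hit_ratio) * t_miss
def avg_access_time(hit_ratio, t_cache_ns, t_miss_ns):
    return hit_ratio * t_cache_ns + (1 - hit_ratio) * t_miss_ns

# Assumed figures: 2 ns for a cache hit, 40 ns on a miss (block copy from memory).
print(avg_access_time(0.95, 2, 40))  # 3.9
print(avg_access_time(0.50, 2, 40))  # 21.0
```

Even a modest drop in the hit ratio lets the slow miss penalty dominate, which is why intelligent caching schemes matter.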

…………………………………………………………………………………………………………………………………………………………………………..

9. Briefly explain superscalar architecture. Explain VLIW architecture. Compare both architectures.

The key to higher performance in microprocessors for a broad range of applications is the ability to exploit fine-grain, instruction-level parallelism. Some methods for exploiting fine-grain parallelism include:

- pipelining
- multiple processors
- superscalar implementation
- specifying multiple independent operations per instruction

Pipelining is now universally implemented in high-performance processors. Little more can be gained by improving the implementation of a single pipeline.

Using multiple processors improves performance for only a restricted set of applications.

Superscalar implementations can improve performance for all types of applications. Superscalar (super: beyond; scalar: one-dimensional) means the ability to fetch, issue to execution units, and complete more than one instruction at a time. Superscalar implementations are required when architectural compatibility must be preserved, and they will be used for entrenched architectures with legacy software, such as the x86 architecture that dominates the desktop computer market.

Specifying multiple operations per instruction creates a very-long-instruction-word architecture, or VLIW. A VLIW implementation has capabilities very similar to those of a superscalar processor, issuing and completing more than one operation at a time, with one important exception: the VLIW hardware is not responsible for discovering opportunities to execute multiple operations concurrently. For a VLIW implementation, the long instruction word already encodes the concurrent operations. This explicit encoding leads to dramatically reduced hardware complexity compared to a high-degree superscalar implementation of a RISC or CISC.

The big advantage of VLIW, then, is that a highly concurrent (parallel) implementation is much simpler and cheaper to build than equivalently concurrent RISC or CISC chips. VLIW is a simpler way to build a superscalar microprocessor.
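The difference in who finds the parallelism can be illustrated with a toy packer that groups independent operations into fixed-width long instruction words (roughly what a VLIW compiler does ahead of time, and what superscalar issue hardware must discover at run time). The register names and the two-wide issue width are assumptions for illustration:

```python
# Each operation is (dest, src1, src2). Two ops may share a word only if
# neither reads or writes a register the other writes.
def independent(a, b):
    return a[0] not in b[1:] and b[0] not in a[1:] and a[0] != b[0]

def pack(ops, width=2):
    # Greedily pack operations into `width`-wide long instruction words.
    words, word = [], []
    for op in ops:
        if len(word) < width and all(independent(op, w) for w in word):
            word.append(op)
        else:
            words.append(word)
            word = [op]
    if word:
        words.append(word)
    return words

ops = [("r1", "r2", "r3"), ("r4", "r5", "r6"),  # independent: issue together
       ("r7", "r1", "r4")]                      # reads r1 and r4: new word
print(len(pack(ops)))  # 2
```

This greedy packer only checks dependences within a word, a deliberate simplification; the point is that in VLIW the grouping is fixed in the instruction encoding, whereas superscalar hardware must recompute it every cycle.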

Programmer's view: When a programmer writes a program to carry out the desired functionality, he must, at the least, be aware of the processor's architectural abstraction. The level of abstraction depends upon the level of programming. Commonly, we have two levels of programming: assembly-language programming (with a processor-specific instruction set) and structured-language programming (processor-independent instructions). For structured programming, a compiler automatically translates those instructions to processor-specific instructions.

Instruction set: The processor's instructions mainly contain two fields: an opcode field and an operand field. The instructions fall into three categories: data-move instructions, arithmetic/logical instructions and branch instructions. [Details of these instructions are not being discussed here; as this topic was already discussed in previous classes, students should do some homework in this regard for the examination.]


Addressing Modes: The figure below gives a brief summary of the different addressing modes.

Those familiar with structured languages may note the following points:

- Direct addressing implements regular variables.
- Indirect addressing implements pointers.
- In inherent or implicit addressing, the particular register or memory location of the data is implicit in the opcode; for example, the data may reside in a register called the "accumulator".
- In indexed addressing, the direct or indirect operand must be added to a particular implicit register to obtain the actual operand address.
- Jump instructions may use relative addressing to reduce the number of bits needed to indicate the jump address. A relative address indicates how far to jump from the current address, rather than indicating the complete address.
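Three of the modes above can be mimicked with a small memory array; the memory contents and the index-register value are invented for illustration:

```python
# Simulate how direct, indirect and indexed addressing obtain the operand.
memory = [4, 20, 30, 40, 50, 60, 70, 80]   # memory[0] holds an address (4)
index_reg = 2

direct   = memory[3]              # operand field is the data's address     -> 40
indirect = memory[memory[0]]      # operand field holds the data's address  -> 50
indexed  = memory[3 + index_reg]  # direct address + index register         -> 60

print(direct, indirect, indexed)  # 40 50 60
```

The indirect line is exactly the pointer-dereference pattern of a structured language, which is why indirect addressing is said to implement pointers.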

Uses of ALP: While writing software in a structured language, a programmer may face situations where he is constrained to switch to assembly language for certain portions of the program. Those portions typically deal with low-level input-output operations with devices outside the processor, such as a display device. Such a device may require timed sequences of signals in order to receive data, and writing the program in assembly language is most convenient in such cases.

Driver routines: A driver routine is a program written specifically to communicate with, or drive, another device. These are written in assembly language.

Program and Data Memory Space: The programmer must be aware of the size of the available memory for program and data, and should not exceed these limits. He should be aware of the on-chip program and data memory capacity and should take care to fit the necessary program and data in on-chip memory, if possible.

Registers: Assembly-language programmers must know how many registers are available for general-purpose data storage. They must also be familiar with other registers that have special functions.

I/O: Programmers should be aware of the processor's input and output facilities, with which the processor communicates with other devices. One common I/O facility is parallel I/O, in which the programmer can read or write a port.

Interrupts: An interrupt causes the processor to suspend execution of the main program and jump to an interrupt service routine (ISR) that fulfills a special short-term processing need. In particular, the processor stores the current PC and sets it to the address of the ISR. After the ISR completes, the processor resumes execution of the main program by restoring the PC. The programmer should be aware of the types of interrupts supported by the processor and must write ISRs when necessary. The assembly-language programmer places each ISR at a specific address in program memory.

For example, we may need to record the occurrence of an event from a peripheral device, such as the pressing of a button. We record the event by setting a variable in memory when that event occurs, although the user's main program may not process that event until later. Rather than requiring the user to insert checks for the event throughout the main program, the programmer merely writes an interrupt service routine and associates it with an input pin connected to the button. The processor will then call the routine automatically when the button is pressed.
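As an analogy only, operating-system signals behave much like the button interrupt described above: a handler is registered, the event is recorded, and the main program processes it later. This sketch assumes a POSIX system (SIGUSR1 is not available on Windows):

```python
import signal

events = []

def isr(signum, frame):
    # ISR analogue: just record the event; the main program handles it later.
    events.append("button pressed")

# Associate the handler with a signal, much as an ISR is tied to an input pin.
signal.signal(signal.SIGUSR1, isr)

signal.raise_signal(signal.SIGUSR1)  # the "button press"
# ...main program resumes here and may check the recorded events at leisure.
print(events)  # ['button pressed']
```

The pattern of setting a flag in the handler and deferring the real work to the main loop is the same discipline recommended for real ISRs, which must be short.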


Operating System: [The need for an operating system and the services offered by OSs will be discussed exhaustively in subsequent units.]

……………………………………………………………………………………………………………………………………………………………………

Development Environment:

Design flow tools:

- Development processor: the processor on which we write and debug the program.
- Target processor: the processor to which we send our program; this program becomes a part of the embedded system.
- Integrated Development Environment (tools): an environment containing the tools needed to test the written program for the embedded system's functionality. It consists of editors, compilers/assemblers, linkers and simulators.

Testing and Debugging Tools:

Introduction: Generally, the testing and debugging phase of program development is a major part of the overall design process. This is especially true when the program being developed is ultimately to run in an embedded system. Some requirements are given as points below:

- The most common method of verifying the correctness of a program is to run the program with ample input data that checks the program's behavior (especially using boundary cases).
- Specifically, a program running in an embedded system (most often) needs to be real-time. [A distinguishing characteristic of a real-time system is that it must compute correct results within a predetermined time. On the other hand, a non-real-time system only needs to compute correct results.]



- A program running in an embedded system works in conjunction with many other components of that system, as well as interacting with the environment in which the embedded system is to function.

Hence, debugging a program running in an embedded system requires (1) having control over time, (2) having control over the environment, and (3) having the ability to trace or follow the execution of the program, in order to detect errors.

The available debugging tools (debuggers, emulators, etc.) enable us to execute and observe the behavior of our programs.


Debuggers: Debuggers help programmers evaluate and correct their programs. They run on the development processor and support stepwise program execution: executing one instruction and then stopping, then proceeding to the next instruction when instructed by the user. They permit execution up to user-specified breakpoints, which are instructions that, when encountered, cause the program to stop executing. Whenever the program stops, the user can examine the values of various memory and register locations. A source-level debugger enables step-by-step execution in the source program language, whether assembly language or a structured language. Good debugging capability is crucial, as today's programs can be quite complex and hard to write correctly. Since debuggers are programs that run on the development processor but execute code designed for the target processor, they must mimic, or simulate, the function of the target processor. These debuggers are also known as instruction-set simulators (ISS) or virtual machines (VM).
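The breakpoint behavior described above can be sketched as a toy simulator loop; the "instructions" here are plain Python callables, an assumption made purely for illustration:

```python
# Minimal instruction-set-simulator loop with breakpoints: execute one
# "instruction" per step until a breakpoint address is reached.
def run_until_breakpoint(program, pc, breakpoints, state):
    while pc < len(program):
        if pc in breakpoints:
            return pc, state   # stop here; the user may now inspect state
        program[pc](state)     # simulate executing one instruction
        pc += 1
    return pc, state

state = {"r0": 0}
program = [lambda s: s.update(r0=s["r0"] + 1)] * 5  # five "increment r0" ops
pc, state = run_until_breakpoint(program, 0, {3}, state)
print(pc, state["r0"])  # 3 3
```

Resuming is just calling the loop again from the stopped `pc`, which is essentially what a real debugger's "continue" command does.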

Emulators: Emulators support debugging of the program while it executes on the target processor. An emulator typically consists of a debugger coupled with a board connected to the development processor via a cable. The board consists of the target processor plus some support circuitry (often another processor). The board may have another cable with a device having the same pin configuration as the target processor, allowing one to plug the device into a real embedded system. Such an in-circuit emulator enables one to control and monitor the program's execution in the actual embedded-system circuit. In-circuit emulators are available for nearly any processor intended for embedded use, but they are quite expensive if they are to run at real speeds.


Device programmers: Device programmers download a binary machine program from the development processor's memory into the target processor's memory. Once the target processor has been programmed, the entire embedded system can be tested in its most realistic form (i.e., it can be executed in its environment and its behavior observed in a realistic way). [For example, a car equipped with the engine-management system can be taken out for a drive!]

…………………………………………………………………………………………………………………………………………………………………………………….


We see that programs intended for embedded systems can be tested in three ways: debugging using an ISS, emulation using an emulator, and field testing by downloading the program directly into the target processor. The difference between these three methods is as follows:

- The design cycle using a debugger based on an ISS running on the development computer is fast, but it is inaccurate, since it can interact with the rest of the system and the environment only to a limited degree.
- The design cycle using an emulator is a little longer, since code must be downloaded into the emulator hardware; however, the emulator hardware can interact with the rest of the system and hence allows more accurate testing.
- The design cycle using a programmer to download the program into the target processor is the longest of all, since the target processor must be removed from its system, put into the programmer, programmed, and returned to the system. However, this method enables the system to interact with its environment most freely and hence provides the highest execution accuracy, though it offers little debug control.



The availability of low-cost or high-quality development environments for a processor often heavily influences the choice of the processor.