# Data Representation and Architecture Modelling

Networks and Communications

24 Oct 2013



Revision

Binary system

1. Conversion
   1. Convert decimal to binary
   2. Convert binary to decimal and hexadecimal

2. Integer representation
   1. Unsigned notation
   2. Signed notation
   3. Excess notation
   4. Two's complement

3. Floating point representation

1. What decimal floating point number is represented by the following 32 bits (single precision format)?

   1 1000 0111 000 1010 0000 0000 0000 0000

2. What is the range of negative numbers in this representation?

3. Define negative overflow and underflow in this representation.

Solution

1. Method:

   The sign bit is 1, so the number is negative.

   Biased exponent = 1000 0111 = 128 + 4 + 2 + 1 = 135

   The real exponent = 135 - 127 = 8

   The normalized mantissa = 000 1010 0000 0000 0000 0000

   The real mantissa = 1.000101

   The final value represented = -(1.000101)_2 x 2^8 = -(100010100)_2 = -(256 + 16 + 4) = -276

2. Negative range: -(2 - 2^-23) x 2^127 to -2^-127
3. Negative overflow and underflow:

   Negative overflow: value less than -(2 - 2^-23) x 2^127.

   Negative underflow: -2^-127 < value < 0.
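The decoding above can be checked with Python's struct module, which reinterprets the 32-bit pattern as an IEEE 754 single-precision value (a quick sanity check, not part of the original exercise):

```python
import struct

# Sign (1 bit), biased exponent (8 bits), mantissa (23 bits) from the exercise.
bits = "1" + "10000111" + "00010100000000000000000"

# Pack the 32-bit pattern into 4 bytes and reinterpret as a big-endian float.
value = struct.unpack(">f", int(bits, 2).to_bytes(4, "big"))[0]
print(value)  # -276.0
```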

CPU

CPU registers:

PC, IR, AC, MAR, MBR

System bus

Data bus, Address bus, and control bus

Pipelining

Role of pipelining

Pipelining hazards (control hazards, data hazards, and
structural hazards)

What is the disadvantage of a pipeline with a very large number of stages?

Exercise

Suppose you have designed a processor implementation whose five
pipeline stages take the following amounts of time:

IF(instruction fetch)=20ns,

ID (instruction decode)=10ns,

EX (execution)=20ns,

MEM (memory operation)=35ns and

WB (write back)=10ns.

(a) What is the minimum clock period for which your processor
functions properly?

(b) What should be redesigned first to improve this processor's performance?

(c) Assume this processor is redesigned with 50 pipeline stages. Is
it true to say that the new processor is 10 times faster than the
previous design with 5 pipeline stages?

Solution

(a) The minimum clock period is the time of the longest
stage: stage MEM takes 35ns.

(b) The MEM should be redesigned to reduce the clock cycle.

(c) Probably not.

Longer pipelines can be faster due to higher clock rates, but it is unlikely that the clock rate becomes 10 times faster, because the stage delays cannot be divided evenly across 50 stages.

Furthermore, longer pipelines tend to make data and control hazards require longer stalls.

A higher clock-rate processor is also likely to be more power-hungry, roughly in proportion to the increase in clock speed.
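The reasoning in (a) and (b) can be sketched in a few lines of Python (stage names and times are those given in the exercise):

```python
# Stage latencies in ns, as given in the exercise.
stage_ns = {"IF": 20, "ID": 10, "EX": 20, "MEM": 35, "WB": 10}

# (a) The clock period must cover the slowest stage.
clock_ns = max(stage_ns.values())
# (b) That slowest stage is the candidate for redesign.
bottleneck = max(stage_ns, key=stage_ns.get)

print(clock_ns, bottleneck)  # 35 MEM
```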

Question 2

An instruction requires four stages to execute:

stage 1 (instruction fetch) requires 30 ns,

stage 2 (instruction decode) = 9 ns,

stage 3 (instruction execute) = 20 ns and

stage 4 (store results) = 10 ns.

An instruction must proceed through the stages in sequence.

1) What is the minimum asynchronous time for any single instruction to complete?

2) We want to set this up as a pipelined operation. How many stages should we have, and at what rate should we clock the pipeline?

Hints

1) The minimum time is the time it takes to execute all 4 stages of an instruction in sequence: 30 + 9 + 20 + 10 = 69 ns.

2) We have 4 natural stages given and no information on how we might further subdivide them, so we use 4 stages in our pipeline.

   Clock rate? Either use the longest stage (30 ns), or use a time that closely matches the shortest stage but divides integrally into the other stages. DISCUSS EACH CASE.
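Both clocking options from the hint can be compared with a short sketch (stage times are from the question; the 10 ns clock is one plausible reading of "closely matches the shortest stage"):

```python
import math

# Stage latencies in ns, as given in the question.
stage_ns = {"fetch": 30, "decode": 9, "execute": 20, "store": 10}

# 1) Minimum asynchronous (unpipelined) time: all stages in sequence.
total_ns = sum(stage_ns.values())    # 69 ns

# 2a) Clock at the longest stage: one instruction per 30 ns in steady state.
clock_long = max(stage_ns.values())  # 30 ns

# 2b) Clock at 10 ns (close to the shortest stage); slower stages then
#     occupy several whole cycles each.
cycles = {s: math.ceil(t / 10) for s, t in stage_ns.items()}
print(total_ns, clock_long, cycles)
```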

Question 3

The pipeline for these instructions runs with a 100
MHz clock with the following stages:

instruction fetch = 2 clocks,

instruction decode = 1 clock,

fetch operands = 1 clock,

execute = 2 clocks, and

store result = 1 clock.

HINTS FOR QUESTION 3

1) The longest stage takes two cycles, hence we complete one instruction every 2 cycles. What is the rate then?

2) The Operand Fetch unit must wait until the prior instruction stores its result before it can retrieve one of its operands (e.g. Operand Fetch for instruction 2 must wait until Store Result for instruction 1 completes). As a result, things begin backing up in the pipeline, and we produce one instruction output only every 4 cycles.

No dependencies: execute one instruction every 2 cycles. Clock rate?

With a dependency: from the table we still begin fetching instructions every two cycles. However, the operand fetch for instruction 2 must wait until Store Result for instruction 1 completes (a wait of another 2 cycles). Hence, the rate?
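Filling in the rates the hints ask for (a sketch; the MIPS figures follow directly from the 100 MHz clock):

```python
clock_mhz = 100  # 100 MHz clock -> 10 ns per cycle

# No dependencies: the longest stage takes 2 cycles, so one instruction
# completes every 2 cycles in steady state.
rate_no_dep = clock_mhz / 2  # 50 million instructions per second

# With the dependency, operand fetch waits 2 extra cycles for the prior
# store, so one instruction completes only every 4 cycles.
rate_dep = clock_mhz / 4     # 25 million instructions per second

print(rate_no_dep, rate_dep)  # 50.0 25.0
```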

Memories

CPU registers

Cache memory

Main memory (electronic memory)

Magnetic memory (hard drive)

Optical memory

Magnetic tape

Cache memory

Cache memory enhances computer performance
using:

Temporal locality principle

Spatial locality principle

Cache mapping

Associative Mapped Cache

Direct-Mapped Cache

Set-Associative Mapped Cache

Why is cache memory needed?

CPU slowed down by the main memory

When a program references a memory location, it is
likely to reference that same memory location again
soon.

A memory location that is near a recently referenced
location is more likely to be referenced than a memory
location that is far away.

Cache memory

Resides between the CPU and the main memory

Operates at a speed near to that of the CPU

Data is exchanged between CPU and main memory
through the cache memory

Cache memory uses locality principles to enhance computer performance.

Temporal locality principle

Spatial locality principle

Temporal locality principle

When a program references a memory location, it is
likely to reference that same memory location again
soon.

Cache memory keeps records of data recently being
used.

Spatial locality principle

A memory location that is near a recently referenced
location is more likely to be referenced than a memory
location that is far away.

Cache memory copies not only the recently referenced memory location but also its nearby locations.

Cache mapping

Commonly
used methods:

Associative Mapped Cache

Direct-Mapped Cache

Set-Associative Mapped Cache

Associative Mapped Cache

Any main memory blocks can be mapped into each
cache slot.

To keep track of which of the 2^27 possible blocks is in each slot, a 27-bit tag field is added to each slot.


A valid bit is needed to indicate whether or not the slot holds a line that belongs to the program being executed.

A dirty bit keeps track of whether or not a line has been modified while it is in the cache.


The mapping from main memory blocks to cache slots is performed by partitioning an address into fields.

For each slot, if the valid bit is 1, then the tag field of
the referenced address is compared with the tag field
of the slot.

Example: how an address is mapped to the cache. If the addressed word is in the cache, it will be found in word (14)₁₆ of a slot that has a tag of (501AF80)₁₆, which is made up of the 27 most significant bits of the address.

Any main memory block can be placed into any cache
slot.

Regardless of how irregular the data and program
references are, if a slot is available for the block, it can
be stored in the cache.


Considerable hardware overhead is needed for cache bookkeeping.

There must be a mechanism for searching the tag
memory in parallel.

Direct-Mapped Cache

Each cache slot corresponds to a specific set of main memory blocks.

In our example we have 2^27 memory blocks and 2^14 cache slots.

A total of 2^27 / 2^14 = 2^13 main memory blocks can be mapped onto each cache slot.

The 32-bit main memory address is partitioned into a 13-bit tag field, followed by a 14-bit slot field, followed by a 5-bit word field.

When a reference is made to a main memory address, the slot field identifies in which of the 2^14 slots the block will be found.

If the valid bit is 1, then the tag field of the referenced address is compared with the tag field of the slot.

Example: how an address is mapped to the cache. If the addressed word is in the cache, it will be found in word (14)₁₆ of slot (2F80)₁₆, which will have a tag of (1406)₁₆.
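The field extraction for this example can be written out in Python. The full 32-bit address is reassembled here from the slide's tag, slot and word values, so treat it as illustrative:

```python
# 32-bit address partitioned as 13-bit tag | 14-bit slot | 5-bit word.
address = 0xA035F014  # reassembled from tag (1406)16, slot (2F80)16, word (14)16

word = address & 0x1F           # lowest 5 bits select the word in a line
slot = (address >> 5) & 0x3FFF  # next 14 bits select the slot
tag = address >> 19             # top 13 bits are the tag

print(hex(tag), hex(slot), hex(word))  # 0x1406 0x2f80 0x14
```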

Direct mapping is simple and inexpensive.

The tag memory is much smaller than in an associative mapped cache.

There is no need for an associative search, since the slot field is used to direct the comparison to a single field.

Disadvantage: a given memory block has a fixed cache location. If a program repeatedly accesses two blocks that map to the same slot, the cache miss rate is very high.

Set-Associative Mapped Cache

Combines the simplicity of direct mapping with the flexibility of associative mapping.

For this example, two slots make up a set. Since there are 2^14 slots in the cache, there are 2^14 / 2 = 2^13 sets.


When an address is mapped to a set, the direct
mapping scheme is used, and then associative
mapping is used within a set.

The format for an address has 13 bits in the set field, which identifies the set in which the addressed word will be found, five bits for the word field, and a 14-bit tag field.
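A minimal sketch of the set-associative split (14-bit tag | 13-bit set | 5-bit word); the field widths come from the slide, but the 32-bit address itself is an assumption chosen for illustration:

```python
# 32-bit address partitioned as 14-bit tag | 13-bit set | 5-bit word.
address = 0xA035F014  # illustrative address (not given on the slide)

word = address & 0x1F                # lowest 5 bits select the word in a line
set_index = (address >> 5) & 0x1FFF  # next 13 bits select the set (direct mapping)
tag = address >> 18                  # top 14 bits are compared associatively within the set

print(hex(tag), hex(set_index), hex(word))  # 0x280d 0xf80 0x14
```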

Typical exam question

Explain the difference between direct mapped cache and associative mapped cache.

Explain how cache memory uses the temporal and spatial locality principles to enhance computer performance.

Web languages (HTML, XML, XHTML)

Differences between these languages

How does XHTML solve these problems?

Difference between HTML element selectors, CLASS selectors and ID selectors

HTML element selector:

h1 {
  background-color: green;
  color: red;
  font-weight: bold;
}

Class selector:

.section {
  color: red;
  font-weight: bold;
}

ID selector:

#section {
  color: red;
  font-weight: bold;
}

An ID selector applies styles to an element in the same way as a class.

The main difference between an ID selector and a class is that an ID can be
used only once on each page, whereas a class can be used many times.

Computer networks

TCP/IP model (internet model)

The role of each layer

Examples of protocols at each layer and their role.

TCP vs UDP

How are error and flow control achieved? Which layer is responsible for this?

Subnetting

Role of subnetting

Range of addresses in a subnet

Exercise

Given a host configuration with an IP address
192.158.15.33 and a subnet mask 255.255.255.248:

What is the number of possible hosts and range of host

Solution

192.168.10.32

0.0.0.1

192.168.10.39

The number if bits for the host is 3 and therefore the
number if hosts allowed in in this subnet is 2
3
-
2=6

The range of address is 192.168.10.33
-

192.168.10.38.
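These numbers can be double-checked with Python's standard ipaddress module, using the subnet from the solution:

```python
import ipaddress

# /29 network containing the host 192.168.10.33 (mask 255.255.255.248).
net = ipaddress.ip_network("192.168.10.33/255.255.255.248", strict=False)
hosts = list(net.hosts())  # usable host addresses (network/broadcast excluded)

print(net.network_address, net.broadcast_address)  # 192.168.10.32 192.168.10.39
print(len(hosts), hosts[0], hosts[-1])             # 6 192.168.10.33 192.168.10.38
```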

Exam

Duration 1:30 hours

3 questions: 30 minutes each

Time: May

Preparation:

Past exam papers

Revise all the questions given in the two assignments

Consult revision slides

Concentrate on the preparation list

Attempt the Mock exam on my website

Next week: mock exam

Fin

Good Luck