Chapter 2: IA-32 Processor Architecture

burgerrara · Software & Software Engineering

18 Nov 2013

Please go to http://www.wordwendang.com/en/, where you can download millions of Word documents.




This Word document was downloaded from the website http://www.wordwendang.com/en/. Please retain this link information when you reproduce, copy, or use it.



I. Objectives

- Understand the basic structure of a microcomputer
- Be familiar with the instruction execution cycle
- Understand how computers read from memory
- Understand how the operating system loads and executes programs
- Know the modes of operation and basic execution environment of the IA-32 processors
- Be familiar with the floating-point unit and the history of Intel processors
- Understand how memory is addressed in protected mode and real-address mode
- Know the basic components of a microcomputer
- Understand the different levels of input-output

II. Lecture Notes


1. General Concepts

Basic Microcomputer Design

[Figure: block diagram of a microcomputer. The CPU (registers, ALU, CU, clock), the memory storage unit, and I/O devices #1 and #2 are connected by the data bus, control bus, and address bus.]

  o CPU (Central Processor Unit)
      - Registers: storage locations
      - Control unit (CU): coordinates the sequence of execution steps
      - ALU (arithmetic logic unit): performs arithmetic and bitwise processing
      - Clock
          - Synchronizes all CPU and bus operations


          - A machine (clock) cycle measures the time of a single operation
            [Figure: clock waveform alternating between 1 and 0, with one cycle marked.]
          - Clock speed is measured in oscillations per second, or equivalently by the time used to produce one clock cycle. For example, a 1 GHz clock oscillates one billion times per second, so each clock cycle lasts one billionth of a second (1 nanosecond)
          - The clock is used to trigger events

  o Memory Storage Unit
      - Holds the data and instructions
  o Bus: a group of parallel wires that transfer data from one part of the computer to another
      - Data bus: transfers instructions and data between the CPU and memory
      - Control bus: uses binary signals to synchronize the actions of all devices
      - Address bus: holds the addresses of instructions and data when the currently executing instruction transfers data between the CPU and memory
  o I/O devices

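  o Instruction Execution Cycle

[Figure: instruction execution cycle, showing the program (instructions I-1 to I-4) and operands (op1, op2) in memory, the PC, the instruction register, the registers, the ALU, and the flags, with the steps fetch, decode, read, execute, and write (output).]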



Multi-Stage Pipeline

  o Pipelining makes it possible for the processor to execute instructions in parallel
  o Instruction execution is divided into discrete stages
  o More efficient use of cycles, greater throughput of instructions
  o Stages of instruction execution:
      - Fetch
      - Decode
      - Fetch operands
      - Execute
      - Store output



Six-stage pipeline executing two instructions (I-1 and I-2):

            Stages
  Cycles    S1    S2    S3    S4    S5    S6
    1       I-1
    2       I-2   I-1
    3             I-2   I-1
    4                   I-2   I-1
    5                         I-2   I-1
    6                               I-2   I-1
    7                                     I-2

For k stages and n instructions, the number of required cycles is: k + (n - 1)
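The cycle count above can be checked with a small simulation. This is only an illustrative sketch: the function names `pipeline_cycles` and `simulate_pipeline` are invented for this example, and it assumes an ideal pipeline in which every stage takes exactly one cycle and no instruction ever stalls.

```python
def pipeline_cycles(k: int, n: int) -> int:
    """Cycles needed to run n instructions through an ideal k-stage pipeline."""
    if not (k >= 1 and n >= 1):
        raise ValueError("k and n must both be at least 1")
    return k + (n - 1)


def simulate_pipeline(k: int, n: int) -> dict:
    """Build {cycle: {stage: instruction}} for an ideal k-stage pipeline.

    Instruction i (0-based) enters stage S1 in cycle i + 1 and advances
    one stage per cycle (assumption: no stalls, one cycle per stage).
    """
    schedule = {}
    for i in range(n):          # instruction index
        for s in range(k):      # stage index
            cycle = i + s + 1
            schedule.setdefault(cycle, {})[f"S{s + 1}"] = f"I-{i + 1}"
    return schedule


if __name__ == "__main__":
    table = simulate_pipeline(6, 2)
    for cycle in sorted(table):
        print(cycle, table[cycle])
    print("total cycles:", pipeline_cycles(6, 2))
```

With six stages and two instructions, the last occupied cycle in the simulated schedule agrees with k + (n - 1) = 7.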




Superscalar

A superscalar processor has multiple execution pipelines


[Figure: six-stage superscalar pipeline (S1 to S6) executing four instructions (I-1 to I-4) over 10 cycles; stage S4 is split into two pipelines, u and v.]

  - Stage S4 has left and right pipelines (u and v)
  - For k stages and n instructions, the number of required cycles is: k + n



Reading from Memory

Multiple machine cycles are required when reading from memory, because it responds much more slowly than the CPU. The steps are:

  o Address placed on the address bus
  o Read Line (RD) set low
  o CPU waits one cycle for memory to respond
  o Read Line (RD) goes to 1, indicating that the data is on the data bus



[Figure: timing diagram of a memory read over Cycles 1 to 4, showing the CLK, ADDR (address), RD, and DATA (data) signals.]
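The four steps of a memory read can be traced in a few lines of code. The sketch below is a toy model, not a description of real bus hardware: the function name `memory_read`, the dictionary standing in for memory, and the choice of one step per cycle are all assumptions made for illustration.

```python
def memory_read(memory: dict, address: int):
    """Trace a memory read as four cycles; return (data, trace).

    Each trace entry is (cycle number, snapshot of the bus signals).
    """
    bus = {"ADDR": None, "RD": 1, "DATA": None}
    trace = []

    # Cycle 1: the address is placed on the address bus.
    bus["ADDR"] = address
    trace.append((1, dict(bus)))

    # Cycle 2: the Read Line (RD) is set low.
    bus["RD"] = 0
    trace.append((2, dict(bus)))

    # Cycle 3: the CPU waits one cycle for memory to respond.
    trace.append((3, dict(bus)))

    # Cycle 4: RD goes to 1, indicating that the data is on the data bus.
    bus["RD"] = 1
    bus["DATA"] = memory[address]
    trace.append((4, dict(bus)))

    return bus["DATA"], trace


if __name__ == "__main__":
    value, steps = memory_read({0x1000: 0xAB}, 0x1000)
    for cycle, signals in steps:
        print(cycle, signals)
    print("value read:", hex(value))
```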



Cache Memory

  o High-speed, expensive static RAM both inside and outside the CPU
  o Level-1 cache: inside the CPU
  o Level-2 cache: outside the CPU
  o Cache hit: the data to be read is already in cache memory
  o Cache miss: the data to be read is not in cache memory
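Cache hits and misses can be counted with a toy cache placed in front of a slower backing store. This is a minimal sketch under stated assumptions: the class name `SimpleCache` is invented here, the cache holds whole values keyed by address, and it evicts the least recently used entry, which is one common policy and not something these notes specify.

```python
from collections import OrderedDict


class SimpleCache:
    """Toy cache in front of a slower backing store (assumed LRU eviction)."""

    def __init__(self, backing: dict, capacity: int = 4):
        self.backing = backing
        self.capacity = capacity
        self.lines = OrderedDict()   # address -> cached value
        self.hits = 0
        self.misses = 0

    def read(self, address: int):
        if address in self.lines:
            # Cache hit: the data is already in cache memory.
            self.hits += 1
            self.lines.move_to_end(address)
            return self.lines[address]
        # Cache miss: fetch from the backing store and keep a copy.
        self.misses += 1
        value = self.backing[address]
        self.lines[address] = value
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)   # evict least recently used
        return value


if __name__ == "__main__":
    ram = {addr: addr * 2 for addr in range(8)}
    cache = SimpleCache(ram, capacity=2)
    for addr in (0, 1, 0, 2, 1):
        cache.read(addr)
    print("hits:", cache.hits, "misses:", cache.misses)
```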



How a Program Runs

  o Load and execute process

[Figure: load and execute process. The user sends the program name to the operating system, which searches for the program in the current directory and the system path, gets the starting cluster from the directory entry, and loads and starts the program; the program returns to the operating system.]

  o Multitasking
      - The OS can run multiple programs at the same time
      - Multiple threads of execution within the same program
      - A scheduler utility assigns a given amount of CPU time to each running program
      - Rapid switching of tasks
          - Gives the illusion that all programs are running at once
          - The processor must support task switching


2. IA-32 Processor Architecture

IA-32 processor family: Intel386 to Pentium 4



Modes of Operation

  o Protected mode
      - Native mode (Windows, Linux)
      - Programs are given separate memory areas (segments)
  o Virtual-8086 mode
      - Hybrid of protected mode
      - Each program has its own 8086 computer
  o Real-address mode
      - Native MS-DOS
      - All Intel processors boot in real-address mode
  o System management mode (SMM)
      - Power management, system security, diagnostics



Basic Execution Environment

  o Addressable memory
      - Protected mode
          - 4 GB
          - 32-bit address: 0 to 2^32 - 1
      - Real-address and Virtual-8086 modes
          - 1 MB space
          - 20-bit address: 0 to 2^20 - 1
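The two address-space sizes follow directly from the address widths. The short sketch below works through the arithmetic; the helper names `address_space_bytes` and `highest_address` are made up for this example. Note that the highest valid address is one less than the size of the space.

```python
def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def highest_address(address_bits: int) -> int:
    """Largest address that fits in the given number of bits."""
    return 2 ** address_bits - 1


if __name__ == "__main__":
    for mode, bits in (("Protected mode", 32),
                       ("Real-address and Virtual-8086 modes", 20)):
        size = address_space_bytes(bits)
        top = highest_address(bits)
        print(f"{mode}: {bits}-bit address, {size} bytes, "
              f"addresses 0 to {top:X}h")
```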

  o General-Purpose Registers
      - Used for arithmetic and data movement

        32-bit general-purpose registers:  EAX  EBX  ECX  EDX  EBP  ESP  ESI  EDI
        16-bit segment registers:          CS   SS   DS   ES   FS   GS
        Other registers:                   EIP  EFLAGS

  o Accessing Parts of Registers
      - Use the 8-bit name, 16-bit name, or 32-bit name
      - Applies to EAX, EBX, ECX, and EDX


[Figure: EAX (32 bits) contains AX (16 bits) in its lower half; AX is divided into AH and AL (8 bits + 8 bits).]
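The relationship between EAX, AX, AH, and AL can be mimicked with bit masks and shifts. This sketch only models the layout in ordinary integers; the function name `split_eax` is invented for illustration and nothing here touches a real register.

```python
def split_eax(eax: int) -> dict:
    """Return the sub-register views of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF          # keep only 32 bits
    ax = eax & 0xFFFF          # lower 16 bits of EAX
    ah = (ax >> 8) & 0xFF      # upper 8 bits of AX
    al = ax & 0xFF             # lower 8 bits of AX
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}


if __name__ == "__main__":
    parts = split_eax(0x12345678)
    for name, value in parts.items():
        print(f"{name} = {value:X}h")
```

The same masks would apply to EBX, ECX, and EDX with their BX/BH/BL, CX/CH/CL, and DX/DH/DL names.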


  o Index and Base Registers
      - Some registers have only a 16-bit name for their lower half and cannot be divided further: ESI (SI), EDI (DI), EBP (BP), ESP (SP)






  o Some Specialized Register Uses
      - General-purpose
          - EAX: accumulator
          - ECX: loop counter
          - ESP: stack pointer
          - ESI, EDI: index registers
          - EBP: extended frame pointer (stack)
      - Segment
          - CS: code segment
          - DS: data segment
          - SS: stack segment
          - ES, FS, GS: additional segments
      - EIP: instruction pointer
      - EFLAGS
          - Status and control flags (control the CPU operations)
          - Each flag is a single binary bit

  o Status Flags
      - Reflect the outcomes of arithmetic and logical operations
      - Carry (CF)
          - Unsigned arithmetic out of range
      - Overflow (OF)
          - Signed arithmetic out of range




      - Sign (SF)
          - Result is negative
      - Zero (ZF)
          - Result is zero
      - Auxiliary Carry (AF)
          - Carry from bit 3 to bit 4
      - Parity (PF)
          - Sum of 1 bits is an even number
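To make the flag definitions concrete, the sketch below computes all six status flags for an 8-bit addition. It is a software model written for illustration, not processor code: the function name `add8_flags` is hypothetical, the flags are named with the Intel mnemonics AF (Auxiliary Carry) and PF (Parity), and only the 8-bit ADD case is covered.

```python
def add8_flags(a: int, b: int):
    """Add two 8-bit values; return (result, status flags)."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        # Unsigned result does not fit in 8 bits.
        "CF": int(total > 0xFF),
        # Signed overflow: both operands differ in sign from the result.
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        # Bit 7 set: the result is negative when read as signed.
        "SF": int((result & 0x80) != 0),
        "ZF": int(result == 0),
        # Carry out of bit 3 into bit 4.
        "AF": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),
        # Even number of 1 bits in the result byte.
        "PF": int(bin(result).count("1") % 2 == 0),
    }
    return result, flags


if __name__ == "__main__":
    for x, y in ((0xFF, 0x01), (0x7F, 0x01)):
        res, fl = add8_flags(x, y)
        print(f"{x:02X}h + {y:02X}h = {res:02X}h", fl)
```

Adding 1 to FFh shows unsigned arithmetic going out of range (CF set), while adding 1 to 7Fh shows signed arithmetic going out of range (OF set).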

  o Floating-Point, MMX, XMM Registers
      - Eight 80-bit floating-point data registers
          - ST(0), ST(1), . . . , ST(7)
          - Arranged in a stack
          - Used for all floating-point arithmetic
      - Eight 64-bit MMX registers
      - Eight 128-bit XMM registers for single-instruction multiple-data (SIMD) operations



Intel Microprocessor History

  o Early Intel Microprocessors
      - Intel 8080
          - 64K addressable RAM
          - 8-bit registers
          - CP/M operating system
          - S-100 bus architecture
          - 8-inch floppy disks!
      - Intel 8086/8088
          - IBM-PC used the 8088
          - 1 MB addressable RAM
          - 16-bit registers
          - 16-bit data bus (8-bit for 8088)
          - Separate floating-point unit (8087)
  o The IBM-AT
      - Intel 80286
      - 16 MB addressable RAM
      - Protected memory
      - Several times faster than 8086
      - Introduced IDE bus architecture
      - 80287 floating-point unit
  o Intel IA-32 Family
      - Intel386
          - 4 GB addressable RAM, 32-bit registers, paging (virtual memory)
      - Intel486
          - Instruction pipelining
      - Pentium
          - Superscalar, 32-bit address bus, 64-bit internal data path

  o Intel P6 Family
      - Pentium Pro


          - Advanced optimization techniques in microcode
      - Pentium II
          - MMX (multimedia) instruction set
      - Pentium III
          - SIMD (streaming extensions) instructions
      - Pentium 4 and Xeon
          - Intel NetBurst micro-architecture, tuned for multimedia



CISC and RISC

  o CISC (Complex Instruction Set)
      - Large instruction set
      - High-level operations
      - Requires microcode interpreter
      - Examples: Intel 80x86 family
  o RISC (Reduced Instruction Set)
      - Simple, atomic instructions
      - Small instruction set
      - Directly executed by hardware
      - Examples:
          - ARM (Advanced RISC Machines)
          - DEC Alpha (now Compaq)


3. IA-32 Memory Management

Real-Address Mode

  - 1 MB RAM maximum addressable
  - Application programs can access any area of memory
  - Single tasking
  - Supported by the MS-DOS operating system
  o Segmented Memory
      - Segmented memory addressing: the absolute (linear) address is formed by combining a 16-bit segment value with a 16-bit offset


[Figure: the 1 MB real-address memory map, marked at 64 KB boundaries from 00000 to F0000. The segment beginning at 80000 spans 8000:0000 to 8000:FFFF; the address 8000:0250 consists of the segment (seg) 8000 and the offset (ofs) 0250.]



Calculating Linear Addresses

  - Given a segment address, multiply it by 16 (append a hexadecimal zero), and add it to the offset
  - Example: convert 08F1:0100 to a linear address

        Adjusted segment value:   0 8 F 1 0
        Add the offset:             0 1 0 0
        Linear address:           0 9 0 1 0
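The same calculation can be written as a one-line function. This is a minimal sketch; the name `linear_address` is chosen for this example, and both inputs are assumed to be 16-bit values as in real-address mode.

```python
def linear_address(segment: int, offset: int) -> int:
    """Convert a real-address-mode segment:offset pair to a linear address."""
    segment &= 0xFFFF      # 16-bit segment value
    offset &= 0xFFFF       # 16-bit offset
    # Multiplying by 16 appends one hexadecimal zero to the segment value.
    return segment * 16 + offset


if __name__ == "__main__":
    for seg, ofs in ((0x08F1, 0x0100), (0x8000, 0x0250)):
        print(f"{seg:04X}:{ofs:04X} = {linear_address(seg, ofs):05X}h")
```

The first pair reproduces the worked example, 08F1:0100 giving 09010h.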



Protected Mode

  - 4 GB addressable RAM (00000000 to FFFFFFFF)
  - Each program is assigned a memory partition which is protected from other programs
  - Designed for multitasking
  - Supported by Linux and MS-Windows
  - Segment descriptor tables
  - Program structure
      - Code, data, and stack areas
      - CS, DS, SS segment descriptors
      - Global descriptor table (GDT)
  - MASM programs use the Microsoft flat memory model


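  o Flat Segment Model
      - Single global descriptor table (GDT)
      - All segments mapped to the entire 32-bit address space

[Figure: flat segment model. A segment descriptor in the Global Descriptor Table holds a base address (00000000), a limit (00040), and access bits (- - - -). The segment covers physical RAM from 00000000 to 00040000; the rest of the address space up to FFFFFFFF (4 GB) is not used.]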

  o Multi-Segment Model
      - Each program has a local descriptor table (LDT)
      - Holds a descriptor for each segment used by the program

[Figure: multi-segment model. Each entry in the Local Descriptor Table points to a segment in RAM, at addresses 3000, 8000, and 26000.]

        base        limit    access
        00026000    0010
        00008000    000A
        00003000    0002



Paging

  o Supported directly by the CPU
  o Divides each segment into 4096-byte blocks called pages
  o Sum of all programs can be larger than physical memory
  o Part of a running program is in memory, part is on disk
  o Virtual memory manager (VMM)
      - OS utility that manages the loading and unloading of pages
  o Page fault
      - Issued by the CPU when a page must be loaded from disk
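With 4096-byte pages, any linear address splits into a page number and an offset within that page. The sketch below shows that split and a toy page-fault counter. Everything named here (`split_address`, `ToyPager`) is hypothetical, and the pager is a deliberate simplification: it never evicts pages and only records which pages have been "loaded", whereas a real virtual memory manager does considerably more.

```python
PAGE_SIZE = 4096  # bytes per page, as stated in the notes


def split_address(linear: int):
    """Return (page number, offset within the page) for a linear address."""
    return divmod(linear, PAGE_SIZE)


class ToyPager:
    """Counts page faults; assumes pages stay resident once loaded."""

    def __init__(self):
        self.resident = set()   # page numbers currently in memory
        self.faults = 0

    def access(self, linear: int):
        page, offset = split_address(linear)
        if page not in self.resident:
            # Page fault: the OS would now load this page from disk.
            self.faults += 1
            self.resident.add(page)
        return page, offset


if __name__ == "__main__":
    pager = ToyPager()
    for addr in (0x1000, 0x1FFF, 0x2000, 0x26ABC):
        page, offset = pager.access(addr)
        print(f"{addr:08X}h -> page {page:X}h, offset {offset:03X}h")
    print("page faults:", pager.faults)
```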


4. Components of an IA-32 Microcomputer

Motherboard

  o CPU socket
  o External cache memory slots
  o Main memory slots
  o BIOS (Basic Input-Output System) chips
  o Sound synthesizer chip (optional)
  o Video controller chip (optional)
  o IDE, parallel, serial, USB, video, keyboard, joystick, network, and mouse connectors
  o PCI bus connectors (expansion cards)
  o Intel 8042 keyboard and mouse microcontroller



Video Output

  o Video controller
      - On motherboard, or on expansion card
      - AGP (accelerated graphics port technology)
  o Video memory (VRAM)
  o Video CRT display
      - Uses raster scanning
      - Horizontal retrace
      - Vertical retrace
  o Direct digital LCD monitors
      - No raster scanning required




Memory

  o ROM (Read-Only Memory)
      - Permanent, cannot be erased
  o EPROM (Erasable Programmable Read-Only Memory)
  o Dynamic RAM (DRAM; RAM = Random Access Memory)
      - Inexpensive; must be refreshed constantly
  o Static RAM (SRAM)
      - Expensive; used for cache memory; no refresh required
  o Video RAM (VRAM)
      - Dual ported; optimized for constant video refresh
  o CMOS (Complementary Metal-Oxide Semiconductor) RAM
      - System setup information



Input-Output Ports

  o USB (universal serial bus)
      - Intelligent high-speed connection to devices
      - Up to 12 megabits/second
      - USB hub connects multiple devices
      - Enumeration: computer queries devices
      - Supports hot connections
  o Parallel
      - Short cable, high speed
      - Common for printers
      - Bidirectional, parallel data transfer
      - Intel 8255 controller chip




  o Serial
      - RS-232 serial port
      - One bit at a time
      - Uses long cables and modems
      - 16550 UART (universal asynchronous receiver transmitter)
      - Programmable in assembly language


5. Input-Output System

Levels of Input-Output

  o Level 3: Call a library function (C++, Java)
      - Easy to do; abstracted from hardware; details hidden
      - Slowest performance
  o Level 2: Call an operating system function
      - Specific to one OS; device-independent
      - Medium performance
  o Level 1: Call a BIOS (basic input-output system) function
      - May produce different results on different systems
      - Knowledge of hardware required
      - Usually good performance
  o Level 0: Communicate directly with the hardware
      - May not be allowed by some operating systems
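The difference between Level 3 and Level 2 can be seen from a high-level language. The sketch below uses Python in place of the C++ and Java mentioned above, which is a substitution made for this example: `print` stands in for a library function, and `os.write` is a thin wrapper over the operating system's write call on a file descriptor. The helper names `write_level3` and `write_level2` are invented here. Levels 1 and 0 are not shown, since BIOS calls and direct hardware access are generally not available to ordinary programs on a protected-mode OS.

```python
import io
import os
import sys


def write_level3(text: str, stream=sys.stdout) -> None:
    """Level 3: call a library function; formatting and buffering are hidden."""
    print(text, file=stream)


def write_level2(data: bytes, fd: int = 1) -> int:
    """Level 2: call an OS function directly; returns the bytes written.

    File descriptor 1 is standard output on the usual platforms.
    """
    return os.write(fd, data)


if __name__ == "__main__":
    write_level3("hello from the library level")
    sys.stdout.flush()   # flush the library buffer before the raw write
    write_level2(b"hello from the OS level\n")
```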



ASM Programming Levels

  - ASM programs can perform input-output at each of the following levels:

[Figure: an ASM program can call an OS function (Level 2), call a BIOS function (Level 1), or access the hardware directly (Level 0).]

