Memory Management
Managing Memory … The Simplest Case
[Diagram: physical memory from address 0 to 0xFFF…, holding the O/S and a single user program.]
* Early PCs and Mainframes
* Embedded Systems
One user program at a time.
The logical address space is
the same as the physical
address space.
But in Modern Computer Systems
Modern memory managers subdivide memory to
accommodate multiple processes.
Memory needs to be allocated efficiently to pack
as many processes into memory as possible.
When required, a process should be able to have
exclusive use of a block of memory, or to permit
sharing of the memory by multiple processes.
Primary memory is abstracted so that a program
perceives that the memory allocated to it is a large
array of contiguously addressed bytes (but it usually isn’t).
Four Major Concerns of a Memory Manager
* Relocation
* Protection
* Sharing
* Physical Organization
Relocation
The programmer does not know where the program will
be placed in memory when it is executed. While the
program is executing, it may be swapped to disk and
returned to main memory at a different location
(relocated). Memory references in the code must be
translated to actual physical memory addresses.
[Diagram: the process's logical address space is mapped onto the physical address space.]
Protection
With multiple processes in memory and running
simultaneously, the system must protect one
process from referencing memory locations in
another process. This is more of a hardware
responsibility than it is an operating system
responsibility.
Linking and Loading a Program
Source → Compiler → Relocatable Object Module
Relocatable Object Modules + Library Modules → Linker → Load Module
Each relocatable object module has its own:
Program Segment
Data Segment
Stack Segment
In the load module, all references to data or functions are resolved…
Loader: Load Module → Process Image in Main Memory

Absolute Loader
The load module (program and data, with all addresses starting at 0) is copied into main memory unchanged, so the process image also begins at address 0: no changes in addresses.
Static Binding
All addresses in the code segment of the load module are relative to 0.
The loader adds the offset to every address as the code is loaded.
For example, with an offset of 1000, "Jump 400" in the load module becomes "Jump 1400" in the process image in main memory, which is loaded starting at address 1000.
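The following is a minimal sketch, in C, of what a relocating loader does under static binding. The module layout, the field names (code, reloc_table), and the function name are assumptions made for illustration, not a real loader format: each relocation-table entry names a word of code that holds an address, and the loader adds the load offset to it exactly once, at load time.

#include <stddef.h>
#include <stdint.h>

/* Hypothetical load-module image: code words plus a table listing
   which words hold addresses that must be relocated. */
struct load_module {
    uint32_t *code;          /* code words, addresses relative to 0     */
    size_t    code_len;      /* number of code words                    */
    size_t   *reloc_table;   /* indices of words that contain addresses */
    size_t    reloc_count;
};

/* Static binding: as the module is loaded at 'offset', add the offset
   to every address-bearing word.  "Jump 400" becomes "Jump 1400" when
   offset == 1000.  After loading, no further translation is needed. */
void load_with_static_binding(struct load_module *m, uint32_t offset)
{
    for (size_t i = 0; i < m->reloc_count; i++)
        m->code[m->reloc_table[i]] += offset;
}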
Dynamic Run-time Binding
All addresses in the code segment are relative to 0, and they stay that way in the process image: with the process loaded at offset 1000, "Jump 400" remains "Jump 400" in memory.
Addresses are maintained in relative format, and address translation takes place on the fly at run time.
The hardware support: a Base Register and a Limit Register.
For an instruction such as "jump 400", the relative address 400 is fed to an adder along with the base register (here 1000), producing the absolute address 1400. A comparator checks the relative address against the limit register; if it is out of bounds, an interrupt is raised: segment error!
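Below is a small sketch of the base/limit check performed on every memory reference under dynamic run-time binding. The register values and the translate() function are illustrative assumptions; in real hardware this happens on every reference in the addressing hardware, and an out-of-range reference raises an interrupt rather than exiting the program.

#include <stdio.h>
#include <stdlib.h>

static unsigned int base_register  = 1000;   /* where the process is loaded     */
static unsigned int limit_register = 2000;   /* assumed size of the segment     */

/* Translate a relative address to an absolute address, checking the limit. */
unsigned int translate(unsigned int relative_address)
{
    if (relative_address >= limit_register) {
        /* in hardware this would raise an interrupt: segment error! */
        fprintf(stderr, "segment error at relative address %u\n", relative_address);
        exit(EXIT_FAILURE);
    }
    return base_register + relative_address;   /* absolute address */
}

int main(void)
{
    printf("jump 400 -> jump %u\n", translate(400));   /* prints 1400 */
    translate(2500);                                   /* past the limit: traps */
    return 0;
}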
A Code Example

. . .
static int gVar;
. . .
int function_a(int arg)
{
    . . .
    gVar = 7;
    libFunction(gVar);
    . . .
}

Static variables such as gVar are stored in the data segment. The generated code will be stored in the code segment. libFunction( ) is defined in an external module; at compile time we don't know the address of its entry point.
The compiler produces a relocatable object module (relative addresses):

Code Segment
0000  . . .
0008  entry function_a
      . . .
0220  load r1, 7
0224  store r1, 0036
0228  push 0036
0232  call libFunction
      . . .
0400  External Reference Table
      . . .
0404  "libFunction"  ????
      . . .
0500  External Definition Table
      . . .
0540  "function_a"  0008
      . . .
0600  Symbol Table
      . . .
0799  End of Code Segment

Data Segment
      . . .
0036  [space for gVar]
      . . .
0049  End of Data Segment
The linker combines the object module with the other modules, including the one defining libFunction, and resolves the external references. Each object file contains an external definition table indicating the relative entry point of its functions, so the call to libFunction can be resolved to address 2334.

Load Module (relative addresses):

Code Segment
0000  (other modules)
      . . .
1008  entry function_a
      . . .
1220  load r1, 7
1224  store r1, 0136
1228  push 0136
1232  call 2334
      . . .
1399  (end of function_a)
      . . .
      (other modules)
2334  entry libFunction
      . . .
2999  (end of code segment)

Data Segment
      . . .
0136  [space for gVar]
      . . .
1000  (end of data segment)
After static binding of the load module at offset 4000, the process image in main memory uses real addresses:

4000  (other modules)
      . . .
5008  entry function_a
      . . .
5220  load r1, 7
5224  store r1, 7136
5228  push 7136
5232  call 6334
      . . .
5399  (end of function_a)
      . . .
      (other modules)
6334  entry libFunction
      . . .
6999  (end of code segment)
      . . .
7136  [space for gVar]
      . . .
8000  (end of data segment)
Sharing Memory
* Multiple processes (fork) running the same executable
* Shared memory
Physical Organization
The flow of information between the various
"levels" of memory.
Computer memory consists of a large array of words
or bytes, each with its own address.
Registers built into the CPU are typically accessible in
one clock cycle. Most CPUs can decode an instruction
and perform one or more simple register operations
in one clock cycle. The same is not true of memory
operations, which can take many clock cycles.
The memory hierarchy, from fast but expensive at the top to cheap but slow at the bottom:
* Registers (accessible in 1 machine cycle)
* Cache
* RAM
* Disk
* Optical, Tape, etc.
Memory Allocation
Before an address space can be bound to physical
addresses, the memory manager must allocate
the space in real memory to which the address space
will be mapped. There are a number of schemes for
doing memory allocation.
Fixed Partitioning
Equal-size fixed partitions:
* any process whose size is less than or equal to the partition size can be loaded into an available partition
* if all partitions are full, the operating system can swap a process out of a partition
* a program may not fit in a partition; the programmer must then design the program with overlays
Fixed Partitioning
Main memory use is inefficient. Any program, no matter
how small, occupies an entire partition. This is called
internal fragmentation.
But . . . It's easy to implement.
Placement Algorithm with Fixed Size Partitions
Equal-size partitions: because all partitions are of equal size, it does
not matter which partition is used. Placement is trivial.
An example is OS/360 MFT, where the operator fixed the
partition sizes at system start-up.
Two options:
* Separate Input Queues
* Single Input Queue
Fixed Partitions with Different Sizes: Multiple Input Queues
[Memory map: O/S from 0 to 100K, Partition 1 from 100K to 200K, Partition 2 from 200K to 400K, Partition 3 from 400K to 700K, Partition 4 from 700K to 800K, each partition with its own input queue.]
Jobs are put into the queue for the smallest partition big enough to hold them.
Disadvantage? Memory can go unused, even though there are jobs waiting to run that would fit.
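A sketch of that queueing rule, assuming the partition sizes implied by the memory map above (100K, 200K, 300K, 100K); the function name choose_queue is made up for illustration.

#include <stddef.h>

/* Partition sizes (in KB) taken from the memory map above. */
static const size_t partition_size[] = { 100, 200, 300, 100 };
#define NPART (sizeof partition_size / sizeof partition_size[0])

/* Return the index of the smallest partition big enough to hold the job,
   or -1 if the job fits in no partition. */
int choose_queue(size_t job_kb)
{
    int best = -1;
    for (size_t i = 0; i < NPART; i++) {
        if (partition_size[i] >= job_kb &&
            (best < 0 || partition_size[i] < partition_size[best]))
            best = (int)i;
    }
    return best;
}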
Single Input Queue
[Memory map: the same O/S and Partitions 1-4 as above, now fed by a single input queue.]
When a partition becomes free, pick the first job on the queue that fits.
Disadvantage? Small jobs can be put into much larger partitions than they need, wasting memory space.
Single Input Queue
[Memory map: the same O/S and Partitions 1-4 as above.]
Alternative solution: scan the whole queue and find the job that best fits the free partition.
Disadvantage? Discriminates against small jobs. Starvation.
CPU Utilization
From a probabilistic point of view ….
Suppose that a process spends a fraction p of its time
waiting for I/O to complete. With n processes in memory
at once, the probability that all n processes are waiting
for I/O (in which case the CPU is idle) is p^n.
CPU utilization is therefore given by the formula

CPU utilization = 1 - p^n

Consider the case where processes spend 80% of their time
waiting for I/O (not unusual in an interactive end-user system
where most time is spent waiting for keystrokes). Notice that
it requires at least 10 processes to be in memory to achieve
approximately 90% CPU utilization.
Predicting Performance
Suppose you have a computer that has 32MB of memory and that
the operating system uses 16MB. If user programs average 4MB,
we can then hold 4 jobs in memory at once. With an 80% average
I/O wait:

CPU utilization = 1 - 0.8^4 = approx 60%

Adding 16MB of memory allows us to have 8 jobs in memory at once, so

CPU utilization = 1 - 0.8^8 = approx 83%

Adding a second 16MB would only increase CPU utilization to 93%.
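A quick way to recompute these figures: the numbers are from the slide, and the small program below just evaluates 1 - p^n (link with -lm).

#include <math.h>
#include <stdio.h>

/* CPU utilization = 1 - p^n, where p is the fraction of time a process
   spends waiting for I/O and n is the number of processes in memory. */
double cpu_utilization(double p, int n)
{
    return 1.0 - pow(p, n);
}

int main(void)
{
    /* 80% I/O wait with 4, 8, and 12 jobs in memory */
    printf("n =  4: %.1f%%\n", 100.0 * cpu_utilization(0.8,  4));  /* approx 60% */
    printf("n =  8: %.1f%%\n", 100.0 * cpu_utilization(0.8,  8));  /* approx 83% */
    printf("n = 12: %.1f%%\n", 100.0 * cpu_utilization(0.8, 12));  /* approx 93% */
    return 0;
}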
Dynamic Partitioning
Partitions are of variable length and number.
A process is allocated exactly as much memory as it requires.
Eventually you get holes in the memory.
This is called external fragmentation.
You must use compaction to shift processes so they
are contiguous and all free memory is in one block.
For Example …
[Figure: a sequence of memory snapshots. The O/S occupies 8M, initially leaving 56M free. Process 1 (20M), Process 2 (14M), and further processes (Process 3, Process 4, Process 5) are loaded, swapped out, and replaced in turn; each swap leaves a hole, and the free memory ends up scattered in several small blocks.]
Fragmentation!
Periodically the O/S could do memory compaction, like
disk compaction: copy all of the blocks of code for loaded
processes into contiguous memory locations, thus opening
larger unused blocks of free memory.
The problem is that this is expensive!
A related question: How much memory do you
allocate to a process when it is created or
swapped in?
In most modern computer languages data can be
created dynamically.
The Heap
This may come as a surprise….
Dynamic memory allocation with malloc, or new, does not really
cause system memory to be dynamically allocated to the process.
In most implementations, the linker anticipates the use of
dynamic memory and reserves space to honor such requests. The
linker reserves space for both the process's run-time stack and
its heap. Thus a malloc( ) call returns an address within the
existing address space reserved for the process.
Only when this space is used up does a system call to the kernel
take place to get more memory. The address space may have to
be rebound, a very expensive process.
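One way to see this on a Unix-like system is to watch the program break while calling malloc repeatedly: the break usually moves only occasionally, when the space already reserved for the heap has been used up. This sketch assumes a malloc that grows the main heap with brk/sbrk for small requests (as glibc does); the exact behaviour varies by implementation.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    void *last_break = sbrk(0);          /* current end of the data segment */
    printf("initial break: %p\n", last_break);

    for (int i = 0; i < 10000; i++) {
        void *p = malloc(64);            /* small allocation from the heap  */
        if (p == NULL)
            return 1;
        void *now = sbrk(0);
        if (now != last_break) {         /* the break moved: malloc had to   */
            printf("break moved to %p after %d mallocs\n", now, i + 1);
            last_break = now;            /* ask the kernel for more memory   */
        }
    }
    return 0;
}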
Managing Dynamically
Allocated Memory
When managing memory dynamically, the operating
system must keep track of the free and used blocks
of memory.
Common methods used are bitmaps and linked lists.
Linked List Allocation
Memory is divided up into some number of fixed-size allocation units.
The memory manager keeps a list of entries, each recording whether the region holds a Process (P) or is a Hole (H), where it starts, and its length in allocation units. The list is kept in order, sorted by address (X marks the end of the list):

P 0 5 → H 5 3 → P 8 6 → P 14 4 → H 18 2 → P 20 6 → P 26 3 → H 29 3 X
Linked List Allocation
When a process ends (for example, the one recorded as P 20 6), just merge its node
with the hole next to it (if one exists). We want contiguous blocks!
Linked List Allocation
After the merge the list becomes:

P 0 5 → H 5 3 → P 8 6 → P 14 4 → H 18 8 → P 26 3 → H 29 3 X

When blocks are managed this way, there are several algorithms that
the O/S can use to find blocks for a new process, or one being
swapped in from disk.
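A sketch of the list node and the merge-on-termination step described above; the struct region type, its field names, and the free_region function are illustrative, not taken from any particular kernel.

#include <stdbool.h>
#include <stddef.h>

/* One entry in the memory list: a process region (P) or a hole (H),
   its starting allocation unit, and its length in allocation units. */
struct region {
    bool           is_hole;   /* true = hole (H), false = process (P) */
    size_t         start;
    size_t         length;
    struct region *next;      /* list kept sorted by address          */
};

/* When a process terminates, turn its region into a hole and merge it
   with an adjacent hole if one exists, so free memory stays in
   contiguous blocks (e.g. H 18 2 followed by P 20 6 becomes H 18 8). */
void free_region(struct region *prev, struct region *r)
{
    r->is_hole = true;

    if (r->next && r->next->is_hole) {       /* merge with the hole after */
        struct region *n = r->next;
        r->length += n->length;
        r->next = n->next;
        /* n would be returned to the node pool here */
    }
    if (prev && prev->is_hole) {             /* merge with the hole before */
        prev->length += r->length;
        prev->next = r->next;
        /* r would be returned to the node pool here */
    }
}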
Dynamic Partitioning Placement Algorithms
Best-fit algorithm
Search the entire list and choose the smallest block that will hold
the request. This algorithm is the worst performer overall: since
the smallest possible block is found for a process, it tends to leave
lots of tiny holes that are not useful.
Dynamic Partitioning Placement Algorithms
Worst-fit: a variation of best fit
This scheme is like best fit, but when looking for a new block it
picks the largest block of unallocated memory. The idea is that
external fragmentation will then result in bigger holes, so it is
more likely that another block will fit.
Dynamic Partitioning Placement Algorithms
First-fit algorithm
Finds the first block in the list that will fit. May end up with many
processes loaded in the front end of memory that must be searched
over when trying to find a free block.
Dynamic Partitioning Placement Algorithms
Next-fit: a variation of first fit
This scheme is like first fit, but when looking for a new block, it
begins its search where it left off the last time. This algorithm
actually performs slightly worse than first fit.
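A sketch of two of these placement policies, first-fit and best-fit, searching the kind of list shown earlier; the node type repeats the earlier sketch so the fragment stands on its own.

#include <stdbool.h>
#include <stddef.h>

struct region {                /* same node as in the earlier sketch */
    bool           is_hole;
    size_t         start;
    size_t         length;
    struct region *next;
};

/* First-fit: return the first hole large enough for the request. */
struct region *first_fit(struct region *list, size_t units)
{
    for (struct region *r = list; r; r = r->next)
        if (r->is_hole && r->length >= units)
            return r;
    return NULL;
}

/* Best-fit: return the smallest hole that still satisfies the request.
   Tends to leave many tiny, useless holes behind. */
struct region *best_fit(struct region *list, size_t units)
{
    struct region *best = NULL;
    for (struct region *r = list; r; r = r->next)
        if (r->is_hole && r->length >= units &&
            (best == NULL || r->length < best->length))
            best = r;
    return best;
}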
Swapping
Used primarily in timeshared systems with single-threaded
processes.
Optimizes system performance by removing a process from
memory when its thread is blocked.
When a process is moved to the ready state, the process
manager notifies the memory manager so that the address
space can be swapped in again when space is available.
Requires relocation hardware.
Swapping can also be used when the memory requirements of
the processes running on the system exceed available memory.
System Costs to do Swapping
If a process requires S units of primary storage, and a disk block
holds D units of primary storage, then ceiling(S/D) disk writes are
required to swap the address space to disk. The same number of
reads are required to swap the address space back into primary storage.
For example, if a process is using 1000 bytes of memory, and
disk blocks are 256 bytes, then 4 disk writes are required.
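The ceiling can be computed with integer arithmetic; the tiny function below just reproduces the example (1000 bytes, 256-byte blocks, 4 writes).

#include <stdio.h>

/* Number of disk blocks of D units needed to hold S units: ceiling(S/D). */
unsigned long blocks_needed(unsigned long s, unsigned long d)
{
    return (s + d - 1) / d;
}

int main(void)
{
    printf("%lu disk writes\n", blocks_needed(1000, 256));  /* prints 4 */
    return 0;
}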
Suppose that a process requiring S units of primary storage is
blocked for T units of time. The resource wasted because the
process stays in memory is S x T.
What criteria would you use to determine whether or not to
swap the process out of primary storage?
How big is S? If it is small, then the amount of storage made
available for other processes to use is minimized, and another
process may not fit. Swapping would be wasteful if there is not
a process that would fit in the storage made available.
If T is small, then the process will begin competing for primary
storage too quickly to make the swap effective.
If T < R, the process will begin requesting memory before it is
even completely swapped out (R is the time required to swap).
For swapping to be effective, T must be considerably larger
than 2R for every process that the memory manager chooses
to swap out, and S must be large enough for other processes
to execute.
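These criteria can be summarized in a small predicate; the safety margin used for "considerably larger than 2R" is an assumed value, not something given on the slide.

#include <stdbool.h>
#include <stddef.h>

/* Rough criterion from the discussion above: swapping a process out is only
   worthwhile if it will stay blocked for much longer than a swap out plus a
   swap back in (T considerably larger than 2R), and if the storage it frees
   (S) is large enough for some waiting process to use. */
bool worth_swapping(double T, double R, size_t S, size_t smallest_waiting_process)
{
    const double margin = 4.0;   /* assumed meaning of "considerably larger" */
    return T > margin * 2.0 * R && S >= smallest_waiting_process;
}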
S is known. Can T be predicted?
When a process is blocked on a slow I/O device, the memory
manager can estimate a lower bound.
What about when a process is blocked by a semaphore
operation?
Example Test Questions
A memory manager for a variable-sized region strategy has a
free list of memory blocks of the following sizes:
600, 400, 1000, 2200, 1600, 2500, 1050

Which block will be selected to honor a request for 1603 bytes
using a best-fit policy?
2200

Which block will be selected to honor a request for 949 bytes
using a best-fit policy?
1000

Which block will be selected to honor a request for 1603 bytes
using a worst-fit policy?
2500

If you were designing an operating system, and had to
determine the best way to sort the free block list, how
would you sort it for each of the following policies, and why?
Best-fit: smallest to largest free block
Worst-fit: largest to smallest free block