
Dec 14, 2013






Linux vs. Windows NT Memory Management





Contents

1. Introduction


2. Linux Memory Management

2.1 Address Generation in x86




2.2 How Linux Does This




2.3 Page Allocation




2.4 Page Replacement Algorithm


2.5 Kernel Memory Allocation




2.5.1 Slab Layout


3. Windows NT Memory Management



3.1 Reserved vs. Committed Memory



3.2 Page Frame Database



3.3 Page Replacement


References










1. Introduction



Linux and Windows NT (WNT) share a common approach to memory management: virtual
memory with paging. The purpose of this paper is to explain memory management in
Linux and to give a brief introduction to WNT memory management for comparison.
Research in memory management has had a great impact on hardware design, so memory
management cannot be explained without discussing the support that the microprocessor
and other hardware provide to the operating system.

Although memory management in Linux is platform independent, most of the platforms
it runs on share a common architecture of page tables with a varying number of paging
levels. Linux memory management was designed with the 64-bit Alpha processor in mind,
but it accommodates other platforms easily with slight modifications.


This discussion is therefore somewhat hardware dependent, although the basic idea is
the same on every platform. The hardware I have chosen is the Intel Pentium/x86. The
information is valid for any Intel general-purpose processor from the 80386 to the
latest Pentium.

2. Linux Memory Management


Like SVR4 and Solaris, Linux uses two separate memory management schemes: virtual
memory management for user processes and kernel memory management for the kernel's
own use. Linux divides the 4GB address space into two parts. Addresses from 0 up to
3GB (ending at 0xBFFFFFFF) are used for user processes, and addresses from 3GB up to
4GB (ending at 0xFFFFFFFF) are for the kernel. This arrangement is shown in Figure 2.1.



Figure 2.1 Linux Address Space (3GB to 4GB: kernel space; 0 to 3GB: user space)



In user space, a demand-paged virtual memory scheme is used. To understand this
concept fully, let us consider the address generation mechanism of the x86.


2.1 Address Generation in x86


The Intel x86 supports both segmentation and paging. The maximum segment size is 4GB,
which is the complete linear address space of the processor. Smaller segments are
created by specifying the limit field in the descriptor of the segment.





Figure 2.2 Segmentation and Paging in x86 (the selector of a logical address indexes the global descriptor table to find a segment descriptor; the segment base plus the offset gives a linear address; the directory, table, and offset fields of the linear address then select a page directory entry, a page table entry, and a byte within the physical page)

To locate a byte in a particular segment, a logical address must be provided. A
logical address consists of a segment selector and an offset. A selector is a unique
identifier for a segment. Among other things, it provides an offset into a descriptor
table to a data structure called a segment descriptor. A segment descriptor provides
the base address of the segment, along with the access rights and limit of the segment.

The base address is added to the offset from the logical address to generate a
linear address.
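
The translation from a logical to a linear address can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the selector is treated as a plain table index (a real x86 selector also carries table-indicator and privilege bits), and the names SegmentDescriptor, to_linear, and gdt are hypothetical, chosen for this example.

```python
from dataclasses import dataclass

@dataclass
class SegmentDescriptor:
    base: int    # linear address at which the segment starts
    limit: int   # highest valid offset within the segment

def to_linear(descriptor_table, selector_index, offset):
    """Translate (selector, offset) into a 32-bit linear address."""
    desc = descriptor_table[selector_index]
    if offset > desc.limit:
        # The hardware would raise a protection fault here.
        raise ValueError("offset exceeds segment limit")
    return (desc.base + offset) & 0xFFFFFFFF

# Hypothetical descriptor table: entry 1 is a flat 4GB segment,
# entry 2 is a small 64KB segment starting at 1MB.
gdt = {
    1: SegmentDescriptor(base=0x00000000, limit=0xFFFFFFFF),
    2: SegmentDescriptor(base=0x00100000, limit=0xFFFF),
}
```

With a flat segment (base 0, limit 4GB) the linear address equals the offset, which is the situation Linux arranges for itself in section 2.2.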


If paging is not used, the linear address space of the processor is mapped
directly onto the physical address space. If paging is used, the 32-bit linear
address is treated as follows:




Figure 2.3 Linear Address (bits 31-22: page directory index; bits 21-12: page table index; bits 11-0: offset)


Here the leftmost (most significant) 10 bits select a second-level page table from
the first-level page table, called the page directory. The next 10 bits select a page
from the second-level page table, and the last 12 bits are the address of the byte
within the 4KB page.
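
As a small illustration, the three fields can be extracted with shifts and masks. The function name split_linear is my own; the field widths (10, 10, and 12 bits, with the page directory index in bits 31-22) are those of two-level x86 paging with 4KB pages.

```python
def split_linear(addr):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    dir_index = (addr >> 22) & 0x3FF    # bits 31-22: entry in the page directory
    table_index = (addr >> 12) & 0x3FF  # bits 21-12: entry in the page table
    offset = addr & 0xFFF               # bits 11-0: byte within the 4KB page
    return dir_index, table_index, offset
```

For example, the address 0xC0001234 splits into directory index 768, table index 1, and offset 0x234.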


2.2 How Linux Does This



As noted above, segments can be any size up to 4GB. Linux uses only two sizes. All
the segments in user space, for all processes, are 3GB, and the segments in kernel
space are 1GB, starting at 3GB. This means that Linux uses a kind of flat memory model
in which all the segments in user space share the same address space. How, then, is
memory protected in this multitasking environment? Protection at the page level is
used for this purpose. In a sense, Linux uses a pure paging mechanism for virtual
memory management. Now let us consider the platform-independent paging scheme of
Linux.



Linux makes use of a three-level page table structure consisting of the following
types of tables:

Page Directory : The top-level node, known as the PAGE GLOBAL DIRECTORY or “pgd”.

Page Middle Directory : The middle-level node, known as the PAGE MIDDLE DIRECTORY or
“pmd”.

Page Table : The bottom-level node, which holds the actual PTEs (page table entries)
describing pages.



Since the x86 supports only two-level paging, the code that traverses the “middle
level” of the page tables does nothing on the x86 architecture: it gets preprocessed
and compiled down to essentially nothing via platform-specific #ifdefs. This allows
other code to be written as though all machines had three-level page tables.
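
The idea of a "folded" middle level can be sketched as follows. This is a run-time analogy in Python, not the kernel's compile-time mechanism: page tables are modelled as nested dictionaries, a middle level of width 0 bits always yields index 0, and the names make_walker, walk_x86, and walk_3level are hypothetical. The second layout (three 10-bit levels, 8KB pages) is simply an assumed example of a machine with three real levels.

```python
def make_walker(pgd_bits, pmd_bits, pte_bits, page_bits):
    """Build a three-level page-table walk for the given field widths."""
    def walk(pgd, addr):
        offset = addr & ((1 << page_bits) - 1)
        pte_i = (addr >> page_bits) & ((1 << pte_bits) - 1)
        pmd_i = (addr >> (page_bits + pte_bits)) & ((1 << pmd_bits) - 1)
        pgd_i = (addr >> (page_bits + pte_bits + pmd_bits)) & ((1 << pgd_bits) - 1)
        pmd = pgd[pgd_i]            # top level (pgd)
        page_table = pmd[pmd_i]     # middle level (pmd); index is always 0 when folded
        frame = page_table[pte_i]   # bottom level: the PTE gives the page frame number
        return (frame << page_bits) | offset
    return walk

# x86-like layout: the middle level has zero bits, so it is a one-entry table.
walk_x86 = make_walker(pgd_bits=10, pmd_bits=0, pte_bits=10, page_bits=12)

# Assumed layout with three real levels and 8KB pages.
walk_3level = make_walker(pgd_bits=10, pmd_bits=10, pte_bits=10, page_bits=13)
```

The same walk code serves both layouts; a missing dictionary key (KeyError) plays the role of a page fault in this model.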


2.3 Page Allocation


The part of memory management that manages physical memory and handles the allocation
of pages is called the zone allocator. Different ranges of physical pages may have
different properties for the kernel's purposes. For example, DMA may only work for
physical addresses below 16MB. The zone allocator handles such differences by dividing
memory into a number of zones and treating each zone as a unit for allocation
purposes. Within each zone, the buddy system is used to manage physical pages. Pages
are always allocated in blocks of 2^n pages aligned on a 2^n-page boundary.
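
A minimal sketch of a buddy allocator for a single zone is given below. It is a textbook-style model, not the Linux implementation: blocks are identified by their starting page frame number, free blocks of each order are kept in Python sets, and the names BuddyZone, order_for, alloc, and release are my own.

```python
def order_for(pages):
    """Smallest n such that 2**n >= pages."""
    order = 0
    while (1 << order) < pages:
        order += 1
    return order

class BuddyZone:
    def __init__(self, max_order):
        self.max_order = max_order
        # free[n] holds the starting frame numbers of free blocks of 2**n pages.
        self.free = [set() for _ in range(max_order + 1)]
        self.free[max_order].add(0)   # initially one block covers the whole zone

    def alloc(self, pages):
        """Allocate a block of at least `pages` pages; return (start, order)."""
        order = order_for(pages)
        for o in range(order, self.max_order + 1):
            if self.free[o]:
                start = min(self.free[o])
                self.free[o].remove(start)
                # Split the block until it has the requested order,
                # returning each unused upper half to its free list.
                while o > order:
                    o -= 1
                    self.free[o].add(start + (1 << o))
                return start, order
        raise MemoryError("no free block large enough")

    def release(self, start, order):
        """Free a block and merge it with its buddy as long as the buddy is free."""
        while order < self.max_order:
            buddy = start ^ (1 << order)   # the buddy differs only in bit `order`
            if buddy not in self.free[order]:
                break
            self.free[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free[order].add(start)
```

Because a block of order n always starts at a multiple of 2^n, the buddy's address is found by flipping a single bit, which is what makes splitting and merging cheap.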

2.4 Page Replacement Algorithm


The major component of the page replacement mechanism is a clock algorithm.
The clock algorithm is used because it provides an approximation of LRU replacement
and is cheaper to implement. In addition, most common general-purpose CPUs provide
hardware support for the clock algorithm in the form of a reference bit in the PTE,
which the hardware sets when the page is accessed.


The simplest clock scheme, which uses only one bit, is known as the “second chance”
algorithm, because it gives a page a second chance to stay in memory for one more
sweep cycle.



Linux uses a simple second-chance (one-bit clock) algorithm, but with several
elaborations and complications.
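
The basic second-chance scheme, without any of the Linux elaborations, can be sketched as below. This is the textbook one-bit clock, in which a newly loaded page starts with its reference bit set; the class name ClockReplacer and its methods are hypothetical.

```python
class ClockReplacer:
    def __init__(self, nframes):
        self.frames = [None] * nframes       # page held in each frame
        self.referenced = [False] * nframes  # one reference bit per frame
        self.hand = 0                        # position of the clock hand

    def access(self, page):
        """Touch `page`; return the page evicted to make room, or None."""
        if page in self.frames:
            # Hit: the hardware would set the reference bit.
            self.referenced[self.frames.index(page)] = True
            return None
        # Fault: sweep until a frame with a clear reference bit is found.
        while self.referenced[self.hand]:
            self.referenced[self.hand] = False   # give the page a second chance
            self.hand = (self.hand + 1) % len(self.frames)
        victim = self.frames[self.hand]
        self.frames[self.hand] = page
        self.referenced[self.hand] = True
        self.hand = (self.hand + 1) % len(self.frames)
        return victim
```

With three frames and the access sequence 1, 2, 3, 4, 2, 5, page 1 is evicted for page 4, and page 3 (rather than the recently touched page 2) is evicted for page 5.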


2.5 Kernel Memory Allocation


The buddy-system-based zone allocator discussed above is a simple and relatively
fast allocator, but it is a poor allocator in many respects. Because it can only
manage block sizes that are powers of two, using it directly requires rounding each
requested block size up to a power of two, which can incur a large cost in
internal fragmentation.
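
The cost of this rounding is easy to work out. The helper names below are my own, and sizes are counted in pages purely for illustration.

```python
def round_up_pow2(pages):
    """Smallest power of two that is >= pages (pages must be at least 1)."""
    size = 1
    while size < pages:
        size *= 2
    return size

def internal_fragmentation(pages):
    """Return (block size, wasted pages, wasted fraction) for a request."""
    block = round_up_pow2(pages)
    wasted = block - pages
    return block, wasted, wasted / block
```

A request for 5 pages is served from an 8-page block, so 3 pages (37.5% of the block) are wasted; a request that is already a power of two wastes nothing.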


Linux therefore uses a second memory allocator for the kernel's own use, called the
slab allocator. The basic idea behind the slab allocator is “object caching”, a
technique for dealing with objects that are frequently allocated and freed. In the
kernel, small objects, such as mutexes for synchronization, are created and destroyed
very frequently. In many cases, however, the cost of initializing and destroying an
object exceeds the cost of allocating and freeing memory for it. So the idea is to
preserve the invariant portion of an object's initial state (its constructed state)
between uses, so that it does not have to be destroyed and recreated every time the
object is used. This is achieved by caching the objects in small buffers.


The slab allocator uses the zone allocator to obtain large chunks of memory and
carves them into smaller pieces as needed.


A slab consists of one or more pages of virtually contiguous memory carved up
into equal-size chunks, with a reference count of how many of those chunks have been
allocated.


2.5.1 Slab Layout


The contents of each slab are managed by a kmem_slab structure that maintains
the slab's linkage in the cache, its reference count, and its list of free buffers. In
turn, each buffer in the slab is managed by a kmem_bufctl structure that holds the
freelist linkage, the buffer address, and a back pointer to the controlling slab. This
arrangement is shown in Figure 2.4.
















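Figure 2.4 Slab Layout (a kmem_slab structure linked to a chain of kmem_bufctl structures, each pointing to one buffer in the slab; the tail of the slab is unused)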
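
Object caching and the slab bookkeeping can be sketched as follows. This is a simplified model under stated assumptions, not the kernel's code: Python objects stand in for buffers, the class Slab plays the role of kmem_slab, and a dictionary (owner) stands in for the back pointer that kmem_bufctl keeps to the controlling slab. The names ObjectCache, alloc, and free are hypothetical.

```python
class Slab:
    """One slab: a set of equal-size buffers plus a reference count."""
    def __init__(self, nbufs, ctor):
        # Every object is constructed once, when the slab is created.
        self.free = [ctor() for _ in range(nbufs)]
        self.refcount = 0   # number of buffers currently allocated

class ObjectCache:
    def __init__(self, ctor, bufs_per_slab=4):
        self.ctor = ctor
        self.bufs_per_slab = bufs_per_slab
        self.slabs = []
        self.owner = {}   # id(object) -> controlling slab (the "back pointer")

    def alloc(self):
        """Return an object that is already in its constructed state."""
        slab = next((s for s in self.slabs if s.free), None)
        if slab is None:
            # No free buffer anywhere: grow the cache by one slab.
            slab = Slab(self.bufs_per_slab, self.ctor)
            self.slabs.append(slab)
        obj = slab.free.pop()
        slab.refcount += 1
        self.owner[id(obj)] = slab
        return obj

    def free(self, obj):
        """Return an object to its slab without destroying it."""
        slab = self.owner.pop(id(obj))
        slab.free.append(obj)
        slab.refcount -= 1
```

The constructor runs only when a slab is created, so an object that is freed and then allocated again is handed back as it was, without being rebuilt. The caller is expected to return objects in their constructed state.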


3. Windows NT Memory Management



Windows NT provides a page-based memory management scheme that allows applications
to realize a 32-bit linear address space of 4GB. Like Linux, WNT divides the address
space into two parts, but in WNT the parts are equal, 2GB each. This is shown in
Figure 3.1. As in Linux, the upper part of the address space is reserved for the
system and the lower part is for user processes. Also like Linux, WNT did not choose
a segmented memory architecture but implemented a pure demand-paged virtual memory
system. The same discussion of how addresses are generated on the x86 architecture
can also be applied to WNT.

Figure 3.1 Windows NT's Address Space (2GB to 4GB: reserved for use by the system; 0 to 2GB: available for use by applications)

As noted, the integrity of a process's address space is preserved at the page level.
This is achieved in two ways. First, each process has its own page directory, so
it cannot access the address space of any other process. Second, the access-rights
bits of the PTE can be used to protect individual pages from being accidentally
corrupted by the process itself.


3.1 Reserved vs. Committed Memory


In Windows NT, a distinction exists between memory and address space.
Although each process has a 4GB address space, rarely if ever will it realize anywhere
near that amount of physical memory. Consequently, the virtual-memory manager (VMM)
must keep track of the used and unused addresses of a process, independent of the
pages of memory it is actually using. In practice this amounts to having one structure
for representing all of the physical memory in the system and one structure for
representing each process's address space.

As part of the process object (the overhead associated with every process in
Windows NT), the VMM stores a structure called the virtual address descriptor (VAD)
tree to represent the address space of a process. As address space gets used by a
process, the VMM updates the VAD tree to reflect which addresses are used and which
are not.
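
The separation between address space and memory can be illustrated with a small model. It is a sketch under several assumptions: a sorted list of ranges stands in for the VAD tree, a set of page numbers stands in for committed storage, and the names AddressSpace, reserve, and commit are my own (they are not the Windows NT API, although the real system does distinguish reserving a range from committing it).

```python
PAGE = 4096

class AddressSpace:
    """Tracks which addresses a process has reserved and which are committed."""
    def __init__(self):
        self.vads = []          # reserved (start, end) ranges, end exclusive
        self.committed = set()  # page numbers that are backed by storage

    def reserve(self, start, size):
        """Mark a range of addresses as used, without backing it with memory."""
        end = start + size
        if any(s < end and start < e for s, e in self.vads):
            raise ValueError("range overlaps an existing reservation")
        self.vads.append((start, end))
        self.vads.sort()

    def commit(self, start, size):
        """Back part of a reserved range with storage, page by page."""
        end = start + size
        if not any(s <= start and end <= e for s, e in self.vads):
            raise ValueError("range is not reserved")
        self.committed.update(range(start // PAGE, (end + PAGE - 1) // PAGE))
```

A process can thus reserve a large range cheaply and commit only the pages it actually needs.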


3.2 The Page-Frame Database

The virtual-memory manager uses a private data structure for maintaining the
status of every physical page of memory in the system. The structure is called the
page-frame database. The database contains an entry for every page in the system, as
well as a status for each page. The status of each page falls into one of the
following categories:

Valid : A page in use by an active process in the system. Its PTE is marked
as valid.

Standby : A page that has been removed from a process's working set but has not been
modified. Its PTE is marked as invalid and in transition.

Modified : A page that has been written to but not yet written to disk. Its PTE is
marked as invalid and in transition.

Free : A page with no corresponding PTE and available for use. It must first be zeroed
before being used unless it is used as a read-only page.

Zeroed : A free page that has already been zeroed and is immediately available for use
by any process.

Bad : A page that has generated a hardware error and cannot be used by any process in
the system.

Most of the status types are common to most paged operating systems, but the
two transitional page status types (Modified and Standby) are unique to Windows NT. If
a process addresses a location in one of these pages, a page fault is still generated,
but very little work is required of the VMM. Transitional pages are marked as invalid,
but they are still resident in memory, and their location is still valid in the PTE.
The VMM merely has to change the status of the page to valid in both the PTE and the
page-frame database, and let the process continue.


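Figure 3.2 Process Page Table and Page Frame Database (PTEs in the process page table refer to page-frame database entries whose status is Valid, Free, Modified, or Standby)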

3.3 Page Replacement


In Windows NT, the component responsible for making page replacement
decisions is called the working-set manager. When a process starts, the VMM assigns it
a default working set that indicates the minimum number of pages necessary for the
process to operate efficiently. The working-set manager periodically tests this quota
by stealing Valid pages of memory from a process. If the process continues to execute
without generating a page fault for a stolen page, the working set is reduced by one,
and the page is made available to the system.

The act of stealing a page from a process actually occurs in two stages. First, the
working-set manager changes the PTE for the page to indicate an invalid page in
transition. Second, the working-set manager updates the page-frame database entry
for the physical page, marking it as either Modified or Standby, depending on whether
the page is dirty or not.
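
The two stages, together with the cheap fault on a transitional page described in section 3.2, can be modelled in a few lines. This is only a sketch: the page-frame database is a dictionary from frame number to status, and the names State, PTE, steal_page, and handle_fault are hypothetical.

```python
from enum import Enum

class State(Enum):
    VALID = 1
    MODIFIED = 2
    STANDBY = 3
    FREE = 4

class PTE:
    def __init__(self, frame):
        self.frame = frame        # physical page frame the PTE refers to
        self.valid = True
        self.transition = False
        self.dirty = False        # set when the page has been written to

def steal_page(pte, pfn_db):
    """Remove a page from a working set in two stages."""
    # Stage 1: the PTE becomes invalid but in transition (frame number kept).
    pte.valid = False
    pte.transition = True
    # Stage 2: the page-frame database entry records where the page now lives.
    pfn_db[pte.frame] = State.MODIFIED if pte.dirty else State.STANDBY

def handle_fault(pte, pfn_db):
    """Resolve a page fault; return "soft" when no disk access is needed."""
    if pte.transition and pfn_db[pte.frame] in (State.MODIFIED, State.STANDBY):
        # The page is still resident: just mark it valid again in both places.
        pte.valid = True
        pte.transition = False
        pfn_db[pte.frame] = State.VALID
        return "soft"
    return "hard"   # the page would have to be brought in from disk (not modelled)
```

If the process touches a stolen page before the frame is reused, the fault is resolved by flipping the status back to valid; if it never does, the page stays out of the working set.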


























References

UNIX Systems for Modern Architectures; Curt Schimmel, Addison-Wesley

Linux Memory Management Documentation; http://www.linux-mm.org/docs.shtml

The GNU/Linux 2.2 Virtual Memory System, Part I; Paul Wilson

Operating Systems, Fourth Edition; William Stallings, Prentice Hall

Linux MM: Design of a Zone Based Memory Allocator; Rik van Riel, July 1998

Memory Management in Microsoft Windows; MSDN Library, Microsoft