Linux Memory Management




Advanced Operating System (CS5423)

Linux Memory Management Report

934361  Chou, Chih-Hung (周志鴻)
936703  Huang, Pei-Chi (黃珮琪)

1 Introduction


The memory management subsystem is one of the most important parts of the operating system. Since the early days of computing, there has been a need for more memory than exists physically in a system. Strategies have been developed to overcome this limitation, and the most successful of these is virtual memory. Virtual memory makes the system appear to have more memory than it actually has by sharing it between competing processes as they need it.

We'll discuss the details of the code and algorithms, and describe how it all works together to implement the memory-management policy.

2 x86 Segmentation & Paging

2.1 Intel x86 Segmentation



The x86 architecture supports segmentation in addition to paging: segmentation translates logical addresses into linear addresses, and paging then maps linear addresses to physical addresses.


(1) A segment ID is implicitly or explicitly associated with a segment selector register

    - CS: code segment; default for instruction fetch accesses
    - DS: data segment; default for non-stack data accesses
    - SS: stack segment; default for stack operations (push/pop/call/ret)

(2) Segment selector registers are 16 bits

    - Selector (bits 3~15): segment descriptor index
    - TI (bit 2): indicates GDT or LDT
    - RPL/CPL (bits 0,1): request/current privilege level

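As an illustration of the selector layout above, a minimal user-space sketch (not kernel code) that unpacks the three fields from a 16-bit selector value:

	#include <stdio.h>
	#include <stdint.h>

	/* Decode a 16-bit x86 segment selector:
	 * bits 15..3 = descriptor index, bit 2 = TI (0: GDT, 1: LDT),
	 * bits 1..0  = RPL (requested privilege level). */
	static void decode_selector(uint16_t sel)
	{
		unsigned index = sel >> 3;        /* descriptor table index */
		unsigned ti    = (sel >> 2) & 1;  /* table indicator        */
		unsigned rpl   = sel & 3;         /* privilege level        */

		printf("index=%u, table=%s, rpl=%u\n", index, ti ? "LDT" : "GDT", rpl);
	}

	int main(void)
	{
		decode_selector(0x73);   /* 0x73 is e.g. the Linux user-mode code segment selector */
		return 0;
	}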

More attributes of a segment are defined in the segment descriptor

    - Segment descriptors are 64 bits (8 bytes) in size
    - A global or local descriptor table can define up to 8192 entries
    - Entry 0 always represents an invalid segment


[Figure: logical-to-linear address translation. A logical address is a 16-bit selector (segment number in bits 15~3, table indicator TI in bit 2, request privilege level RPL in bits 1~0) plus a 32-bit offset; the selector indexes the GDT (TI = 0, via GDTR) or LDT (TI = 1, via LDTR), and the segment base found there is added to the offset to form the linear address.]

(3) Segment descriptors contain

    - the 32-bit base linear address of the segment
    - the 20-bit size of the segment, in bytes or pages
    - G (granularity): specifies whether the size is in bytes or pages
    - S (system): distinguishes system segments from ordinary code/data segments

(4) Selector registers have an "invisible" extension that holds the descriptor data

    - Validity checking is done when the selector register is loaded
    - Main memory need not be accessed again to refer to segment attributes


2.2 Intel x86 Paging




(1) Pages are 4KB in size by default

(2) In x86, linear addresses are partitioned as

    - Page directory index: bits 31 ~ 22
    - Page table index: bits 21 ~ 12
    - Offset into page: bits 11 ~ 0

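To make this partitioning concrete, a small user-space sketch (illustrative only) that extracts the two indices and the offset from a 32-bit linear address:

	#include <stdio.h>
	#include <stdint.h>

	/* Split a 32-bit x86 linear address into its paging components:
	 * bits 31..22 index the page directory, bits 21..12 index the
	 * page table, and bits 11..0 are the offset within the 4KB page. */
	static void split_linear(uint32_t laddr)
	{
		uint32_t pd_index = (laddr >> 22) & 0x3FF;   /* 10 bits */
		uint32_t pt_index = (laddr >> 12) & 0x3FF;   /* 10 bits */
		uint32_t offset   = laddr & 0xFFF;           /* 12 bits */

		printf("PDE %u, PTE %u, offset 0x%03x\n", pd_index, pt_index, offset);
	}

	int main(void)
	{
		split_linear(0xC0101234);   /* a hypothetical kernel-space address */
		return 0;
	}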
[Figure: two-level x86 address translation. The cr3 register points to the page directory; the linear address's page directory index selects a PDE, the page table index selects a PTE, and the offset selects the byte within the page frame, yielding the physical address.]


Page directory/table entries contain

    - P: is the page present
    - A: has the page been accessed
    - D: has the page been written (dirty)
    - R/W: is the page writable or read-only
    - U: user-mode page (U = 0 means access at privilege level 3 is forbidden)
    - S: page size; pages may be 4KB or 4MB (page directory entries only)
    - PCD: disable hardware caching
    - PWT: use write-through instead of write-back caching

    (PCD and PWT both default to 0, but can be set as required)


[Figure: page directory/table entry layout. Bits 31~12 hold the page address; the low control bits include D (dirty flag: set if the page has been written), A (access flag), U/S (user/supervisor), R/W (set if the page may be written to), P (present flag: page is resident in memory and not swapped out), and PS (page size, page directory entries only).]


3 Preview of Linux Memory Management

3.1 Segmentation in Linux


A. Linux makes only minimal use of segmentation.

B. All kernel and user segments overlap the same 4GB linear (virtual) address space.

C. Segments are used mainly as privilege identifiers

    - Kernel: RPL = 0
    - User: RPL = 3

D. The GDT is as shown.





3.2 Paging in Linux



(1) Linux uses three-level paging

    - Page global directory (PGD)
    - Page middle directory (PMD) - an additional level compared to x86 paging
    - Page table entry (PTE)

(2) When applied to the x86 architecture

    - The size of the PMD is defined as 1
    - PMD translation is an identity mapping
    - Output of PMD translation = input of PMD translation = output of PGD translation

(3) In 32-bit systems the PMD layer is therefore effectively null.

(4) The PGD and PTE indices are 10 bits each

(5) There are 1024 entries in each table

(6) Therefore, the PGD and page tables are 4KB in size and 4KB aligned

(7) pgd_t, pmd_t and pte_t are 32-bit data types (unsigned long) for entries

(8) The functions and macros for creating and manipulating entries are defined in include/asm-i386/: page.h, pgtable.h, pgtable-2level.h, pgtable-3level.h, pgalloc.h, pgalloc-2level.h, pgalloc-3level.h, mm.h

(9) cr3

    - register that stores the address of the page global directory, the starting point of the linear-to-physical translation
    - every context switch loads cr3 with the value for the incoming process

[Figure: Linux three-level translation. cr3 points to the page global directory; a linear address is translated through the PGD, the PMD, and the page table to a page, producing the physical address.]

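To illustrate how the generic three-level walk collapses on two-level x86, a simplified sketch follows (not actual kernel code; the pgd_offset()/pmd_offset()/pte_offset() helper names follow the kernel headers listed above, though their exact forms vary between kernel versions):

	#include <linux/mm.h>
	#include <asm/pgtable.h>

	/* Walk a process's page tables down to the page mapped at 'addr'.
	 * On 32-bit x86 the PMD step is an identity mapping, so only the
	 * PGD and the page table are real memory accesses. */
	static struct page *walk_to_page(struct mm_struct *mm, unsigned long addr)
	{
		pgd_t *pgd = pgd_offset(mm, addr);   /* index the page global directory */
		pmd_t *pmd;
		pte_t *pte;

		if (pgd_none(*pgd))
			return NULL;

		pmd = pmd_offset(pgd, addr);         /* folded into the PGD entry on x86-32 */
		if (pmd_none(*pmd))
			return NULL;

		pte = pte_offset(pmd, addr);         /* index the page table */
		if (!pte_present(*pte))
			return NULL;

		return pte_page(*pte);               /* descriptor of the mapped page frame */
	}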
3.3 Linux Memory Mapping (x86)

While the kernel is loaded physically at 1MB, it is mapped virtually at 0xC0100000 (3GB + 1MB). The offset is defined by the macro PAGE_OFFSET, which is 0xC0000000 on the x86 architecture. Linux also reserves the highest 128MB of the address space for memory-mapped I/O and non-contiguous memory area allocation (vmalloc).

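Because of this fixed offset, converting between kernel virtual and physical addresses in the directly mapped region is a simple subtraction or addition. A user-space sketch of the __pa()/__va()-style macros from include/asm-i386/page.h (values assume the default PAGE_OFFSET):

	#include <stdio.h>

	#define PAGE_OFFSET 0xC0000000UL
	#define __pa(x) ((unsigned long)(x) - PAGE_OFFSET)            /* virtual -> physical */
	#define __va(x) ((void *)((unsigned long)(x) + PAGE_OFFSET))  /* physical -> virtual */

	int main(void)
	{
		unsigned long kernel_text = 0xC0100000UL;   /* virtual address of the kernel image */

		printf("virt 0x%08lx -> phys 0x%08lx\n", kernel_text, __pa(kernel_text));
		printf("phys 0x00100000 -> virt %p\n", __va(0x00100000UL));
		return 0;
	}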


3.4 Linux Kernel Image (x86)

(1) Page 0: stores BIOS functions and data

(2) Pages 0xA0 ~ 0xFF: legacy video device reservation

(3) Kernel code and data start at page 0x100 (1 MB) and are delimited by:

[Figure: physical layout of the kernel image - BIOS data, video RAM mapping, then from 1MB the kernel code (starting at _text), initialized data (from _etext), and uninitialized data/BSS (from _edata to _end, around 2MB). The 4GB linear address space is split into user space (0 ~ 3GB, TASK_SIZE) and kernel space (3GB ~ 4GB, starting at PAGE_OFFSET); physical memory up to 896MB (HIGH_MEM) is mapped directly, and the top of kernel space is reserved for memory-mapped I/O and non-contiguous memory area allocation.]

    - _text: start of code
    - _etext: end of code / start of initialized data
    - _edata: end of initialized data / start of uninitialized data
    - _end: end of the kernel

(4) /boot/System.map defines the exact address boundaries of the kernel image



3.5 Linux Memory Management Overview

[Figure: overview of the Linux memory management components - the buddy system and slab allocator manage physical memory (pages and zones, including NUMA nodes), while process address space management, non-contiguous memory management, and the page fault handler manage the linear address space for user and kernel space.]


4 Physical Memory Descriptors

4.1 Page Descriptor



(1) The kernel keeps track of the current status of each page frame

(2) struct page is the descriptor of each page frame

(3) All the page frame descriptors in the system are kept in an array called mem_map if there is only a single node

(4) There are many predefined page flags for identifying the status of a page; these flags are defined in /include/linux/page-flags.h


/include/linux/mm.h

struct page {
	unsigned long flags;           /* status of the page frame */
	atomic_t count;                /* usage count */
	struct list_head list;
	unsigned long index;
	struct list_head lru;          /* page-out (LRU) list */
	union {
		struct pte_chain *chain;   /* PTE chain used for swapping (reverse map) */
		pte_addr_t direct;
	} pte;
	unsigned long private;
	void *virtual;                 /* kernel virtual address (NULL if not mapped) */
};

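Since mem_map is a flat array indexed by page frame number on single-node systems, converting between a page descriptor and its frame number is simple pointer arithmetic. A sketch of the idea (in the UMA case the kernel's pfn_to_page()/page_to_pfn() macros reduce to essentially this; the helper names here are made up):

	extern struct page *mem_map;

	static inline struct page *my_pfn_to_page(unsigned long pfn)
	{
		return &mem_map[pfn];                      /* descriptor of frame number pfn */
	}

	static inline unsigned long my_page_to_pfn(struct page *page)
	{
		return (unsigned long)(page - mem_map);    /* frame number of this descriptor */
	}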

4.2 NUMA & Memory Zones

(1) NUMA (Non-Uniform Memory Access)

    - Access times to different parts of memory may vary
    - Physical memory is partitioned into several nodes

(2) Memory zones

    - Nodes are subdivided into zones
    - ZONE_DMA: < 16MB
    - ZONE_NORMAL: 16MB ~ 896MB
    - ZONE_HIGHMEM: > 896MB



4.3 Zone Descriptor


/include/linux/mmzone.h

struct zone {
	unsigned long free_pages;                        /* number of free pages in the zone */
	unsigned long pages_min, pages_low, pages_high;  /* watermarks used for page reclaim */

	spinlock_t lru_lock;
	struct list_head active_list;                    /* page-out (LRU) lists */
	struct list_head inactive_list;
	unsigned long nr_active;
	unsigned long nr_inactive;

	struct free_area free_area[MAX_ORDER];           /* free lists used by the buddy system */

	struct pglist_data *zone_pgdat;                  /* owning node */
	struct page *zone_mem_map;                       /* array of page descriptors for this zone */
	unsigned long zone_start_pfn;
	char *name;
};



4.4 Node Descriptor

/include/linux/mmzone.h

typedef struct pglist_data {
	struct zone node_zones[MAX_NR_ZONES];            /* zone descriptors of this node */
	struct zonelist node_zonelists[MAX_NR_ZONES];    /* zone fallback lists */
	int nr_zones;
	struct page *node_mem_map;                       /* array of page descriptors */
	unsigned long *valid_addr_bitmap;
	struct bootmem_data *bdata;
	unsigned long node_start_pfn;                    /* first page frame number of the node */
	struct pglist_data *pgdat_next;                  /* next node in the node list */
} pg_data_t;


4.5 Relationship between Node, Zone & Page Descriptors

[Figure: each node descriptor points to its zone descriptors, which in turn point to their arrays of page descriptors. The zone layout is architecture dependent; normal zones hold regular, permanently mapped pages, while high-memory pages may not be permanently mapped into the kernel address space.]


5 Memory Allocation & De-allocation


Linux uses the Buddy System as its most fundamental memory management component; it handles memory allocation and de-allocation at the page frame level. The Buddy System was designed as a good compromise between efficient allocation/freeing and avoidance of physical memory fragmentation.

Besides the Buddy System, there are two other malloc-like memory allocators for kernel memory allocation and de-allocation, named kmalloc and vmalloc. Both of them are built on top of the Buddy System and deal with different types of memory allocations.


5.1 Buddy System


(1) Low-level page allocation

(2) Deals with external fragmentation

(3) Efficient strategy for allocating groups of contiguous page frames

(4) Allocates memory from one of a set of disjoint zones

(5) Buddy algorithm

    - Allocation
        I. Allocates 2^n pages
        II. If the block of pages found is larger than requested, it is broken down until a block of the right size remains

    - De-allocation
        I. Recombines pages into larger blocks of free pages whenever it can
        II. Whenever a block of pages is freed, the adjacent or "buddy" block of the same size is checked to see if it is free


(6) Free memory within each zone is mapped by one of MAX_ORDER (= 11) free_area structures

	struct free_area {
		struct list_head free_list;   /* first pages of the free blocks of this order */
		unsigned long *map;           /* buddy-state bitmap for this order */
	};

    - The free_list field points to a list of struct page, where each entry is the first free page of a free block
    - The map field is a bitmap identifying the states of buddies within the entire zone
        I. 0 => both buddies free, or both buddies allocated
        II. 1 => exactly one buddy free and one buddy allocated

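A key property of the buddy scheme is that a block's buddy can be found with a single XOR of the page frame number, which is what makes merging on free so cheap. A user-space sketch (illustrative, not the kernel's actual helpers):

	#include <stdio.h>

	/* For a free block starting at page frame 'pfn' with size 2^order pages,
	 * its buddy starts at pfn XOR 2^order, and the two merge into a block
	 * starting at the lower of the two addresses. */
	static unsigned long buddy_pfn(unsigned long pfn, unsigned order)
	{
		return pfn ^ (1UL << order);
	}

	static unsigned long merged_pfn(unsigned long pfn, unsigned order)
	{
		return pfn & ~(1UL << order);
	}

	int main(void)
	{
		unsigned long pfn = 12;   /* hypothetical free block of 2^2 = 4 pages */
		unsigned order = 2;

		printf("buddy of pfn %lu (order %u) is %lu; they merge at %lu\n",
		       pfn, order, buddy_pfn(pfn, order), merged_pfn(pfn, order));
		return 0;
	}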
(7) Example


(8) Buddy System allocation: alloc_pages() (/mm/page_alloc.c)

    - calls __alloc_pages(gfp_mask, order, zonelist)




(9) Buddy System de-allocation: free_pages() (/mm/page_alloc.c)

    - calls __free_pages_ok(page, order)
    - If possible, it merges the freed region with its buddy, then merges that larger region with its buddy, and so on

[Figure: alloc_pages() calls __alloc_pages(), whose parameters are gfp_mask (get pages from the DMA, NORMAL or HIGHMEM region), order (2^order pages) and zonelist (from which node); it takes pages from the free lists via buffered_rmqueue() and, when memory is low, wakes kswapd (wakeup_kswapd) or calls try_to_free_pages().]

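As a usage illustration (a hedged sketch of kernel code: alloc_pages(), page_address() and __free_pages() are real interfaces of this kernel generation, but the wrapper function itself is hypothetical):

	#include <linux/mm.h>
	#include <linux/gfp.h>
	#include <linux/string.h>
	#include <linux/errno.h>

	/* Grab four contiguous page frames (order 2) and release them again. */
	static int demo_buddy_alloc(void)
	{
		struct page *pages;
		void *addr;

		pages = alloc_pages(GFP_KERNEL, 2);   /* 2^2 = 4 contiguous pages */
		if (!pages)
			return -ENOMEM;

		addr = page_address(pages);           /* kernel virtual address of the block */
		memset(addr, 0, 4 * PAGE_SIZE);

		__free_pages(pages, 2);               /* give the block back to the buddy system */
		return 0;
	}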

5.2 Slab Allocator


I. Slab Characteristics

(1) The slab allocator divides different objects into groups called caches

(2) Reduces "internal fragmentation" by allocating "objects" rather than whole pages

(3) Built on top of the Buddy System

(4) Each cache is a "store" of objects of the same type

(5) All objects allocated from a given slab cache have the same size, commonly 2^N bytes



II. Cache Descriptor

/mm/slab.c

struct kmem_cache_s {
	struct array_cache *array[NR_CPUS];   /* per-CPU slab cache of freed objects */
	struct kmem_list3 lists;              /* slabs_full, slabs_partial, slabs_free lists */

	unsigned int objsize;                 /* size of each object */
	unsigned int flags;

	unsigned int gfporder;                /* parameters passed to the buddy system */
	unsigned int gfpflags;                /* when the cache has to grow */

	void (*ctor)(void *, kmem_cache_t *, unsigned long);   /* object constructor */
	void (*dtor)(void *, kmem_cache_t *, unsigned long);   /* object destructor */

	const char *name;
	struct list_head next;                /* next cache descriptor */
};


III. Slab Descriptor

/mm/slab.c

struct slab {
	struct list_head list;
	unsigned long colouroff;
	void *s_mem;              /* pointer to the first object in the slab */
	unsigned int inuse;       /* number of allocated objects */
	kmem_bufctl_t free;       /* index of the first free object */
};




IV. Relationship

[Figure: relationship between the cache descriptor, its slab lists, and the slab descriptors; each slab keeps a free-object index array alongside its objects.]

V. Slab Algorithm

VI. Type of Slab


(1) General caches: created by kmem_cache_init()

    - cache_cache: the cache of "cache descriptors"
    - cache_sizes:

	struct cache_sizes {
		size_t cs_size;
		kmem_cache_t *cs_cachep;
		kmem_cache_t *cs_dmacachep;
	};

    - malloc_sizes: array of predefined cache_sizes (32B ~ 128KB)

(2) Specific caches: created by kmem_cache_create()

    - Used by the kernel for frequently allocated object types
    - Ex: kmem_cache_t *vm_area_cachep;
          kmem_cache_t *mm_cachep;

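As a usage illustration of a specific cache (a hedged sketch: kmem_cache_create(), kmem_cache_alloc(), kmem_cache_free() and kmem_cache_destroy() are the slab interfaces of this kernel generation, while the object type and cache name here are made up):

	#include <linux/slab.h>
	#include <linux/list.h>
	#include <linux/errno.h>

	/* Hypothetical object type, for illustration only. */
	struct my_object {
		int id;
		struct list_head link;
	};

	static kmem_cache_t *my_cachep;

	static int demo_slab(void)
	{
		struct my_object *obj;

		/* Create a cache of my_object instances (no constructor/destructor). */
		my_cachep = kmem_cache_create("my_object_cache", sizeof(struct my_object),
		                              0, 0, NULL, NULL);
		if (!my_cachep)
			return -ENOMEM;

		obj = kmem_cache_alloc(my_cachep, GFP_KERNEL);   /* take an object from the cache */
		if (obj)
			kmem_cache_free(my_cachep, obj);             /* return it to the cache */

		kmem_cache_destroy(my_cachep);
		return 0;
	}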

VII. Per-CPU Cache

(1) Each cache contains a small array of freed objects for each CPU

(2) Purpose

    - Reduce the number of linked-list operations
    - Reduce spinlock contention


	struct array_cache {
		unsigned int avail;        /* number of objects currently available in the array */
		unsigned int limit;        /* maximum number of objects the array may hold */
		unsigned int batchcount;   /* objects transferred in/out of the array at a time */
	};


VIII. Flowchart

[Figure: allocation/free flow between the slab allocator and the buddy system. kmalloc()/kmem_cache_alloc() first try the per-CPU array (ac_data()/ac_entry()); on a miss, cache_refill() takes objects from the cache's partial and free slab lists, and if the cache is empty cache_grow()/kmem_getpages() obtain new pages from the buddy system via __alloc_pages(). kfree()/kmem_cache_free() return objects to the per-CPU array; a periodic timer runs cache_reap()/drain_array(), which destroys unused slabs (slab_destroy()) and returns their pages to the buddy system via __free_pages_ok().]


5.3 Non-contiguous Memory Area Management


For infrequent memory requests, it sometimes makes sense to allocate non-contiguous memory areas

(1) Works similarly to paging

(2) Uses the reserved addresses above PAGE_OFFSET to map non-contiguous memory areas

(3) To allocate and release non-contiguous memory, use vmalloc and vfree, respectively

(4) Ex: modules, I/O driver buffers


I. Linear Address Interval

[Figure: kernel linear address interval used for non-contiguous allocations. Above the direct mapping of physical memory (which ends at HIGH_MEM) lies a safety interval of VMALLOC_OFFSET; the region beyond it, covered by the kernel page table entries, holds the non-contiguous memory areas and memory-mapped I/O addresses, each area separated from the next by a safety interval.]

II. VM Descriptor


struct vm_struct {
	void *addr;                  /* start linear address of the area */
	unsigned long size;          /* memory size */
	unsigned long flags;
	struct page **pages;         /* array of page descriptors */
	unsigned int nr_pages;       /* number of pages */
	unsigned long phys_addr;     /* physical address (used for I/O remapping) */
	struct vm_struct *next;      /* next descriptor in the VM list */
};


[Figure: vmalloc() flow (see IV below) - get a free address interval, allocate the page descriptor array from the slab allocator, allocate pages one by one from the buddy system, and install the corresponding page directory/table entries in the kernel page table (init_mm).]


III. Search Free VM Area

Function get_vm_area(size, flags): first-fit algorithm over the vmlist

IV. vmalloc (/mm/vmalloc.c)


void *__vmalloc(unsigned long size, int gfp_mask, pgprot_t prot)
{
	struct vm_struct *area;
	struct page **pages;
	...
	area = get_vm_area(size, VM_ALLOC);              /* find a free interval in the vmalloc region */
	...
	area->pages = pages = kmalloc(array_size, gfp_flags);   /* page descriptor array */
	...
	for (i = 0; i < area->nr_pages; i++) {
		area->pages[i] = alloc_page(gfp_mask);       /* one frame at a time from the buddy system */
		...
	}

	if (map_vm_area(area, prot, &pages))             /* install PTEs in the kernel page table */
		goto fail;

	return area->addr;
}



[Figure: searching the vmlist between VMALLOC_START and VMALLOC_END for an unallocated gap large enough to hold the interval addr ~ addr+size.]


V. vfree (/mm/vmalloc.c)





void __vunmap(void *addr, int deallocate_pages)
{
	struct vm_struct *area;

	area = remove_vm_area(addr);             /* free the kernel virtual memory space */

	if (deallocate_pages) {
		int i;
		for (i = 0; i < area->nr_pages; i++) {
			__free_page(area->pages[i]);     /* return each frame to the buddy system */
		}
		kfree(area->pages);                  /* free the page descriptor array */
	}

	kfree(area);                             /* free the vm_struct itself */
	return;
}

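As a usage illustration (a hedged sketch: vmalloc() and vfree() are the real interfaces, while the buffer size and purpose are made up), allocating a large, virtually contiguous buffer:

	#include <linux/vmalloc.h>
	#include <linux/string.h>
	#include <linux/errno.h>

	/* Allocate 1MB that is contiguous in kernel virtual address space
	 * but may be scattered across physical page frames. */
	static int demo_vmalloc(void)
	{
		void *buf = vmalloc(1024 * 1024);
		if (!buf)
			return -ENOMEM;

		memset(buf, 0, 1024 * 1024);
		/* ... use the buffer, e.g. as a module or driver work area ... */

		vfree(buf);    /* unmap the area and return its pages to the buddy system */
		return 0;
	}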



6 Process Address Space


The address space of a process consists of all the logical addresses that the process is allowed to use

    - Each process address space is separate
    - The kernel allocates logical addresses to a process in intervals called memory regions

When does a process get new memory regions?

    - Creating a new process: fork()
    - Loading an entirely new program: execve()
    - Memory-mapping a file: mmap()
    - Growing its stack
    - ... etc.

To the kernel, user-mode requests for memory are

    - Considered non-urgent
    - Considered untrustworthy

As a result, the kernel tries to defer allocation of dynamic memory to processes



6.1 Memory Region Descriptor

(1) Linux represents a memory region with a vm_area_struct (VMA)

(2) Memory regions never overlap

(3) The kernel tries to merge contiguous regions

(4) All regions are maintained on a simple list in ascending address order

(5) If the list of regions gets large, it is also managed as a red-black tree for efficiency


vm_area_struct (/include/linux/mm.h)

struct vm_area_struct {
	struct mm_struct *vm_mm;               /* the address space we belong to */
	unsigned long vm_start;                /* starting address of the region */
	unsigned long vm_end;                  /* ending address of the region */

	struct vm_area_struct *vm_next;        /* next vm_area_struct in the list */
	pgprot_t vm_page_prot;                 /* access permission of the pages */
	unsigned long vm_flags;                /* flags of the VMA */
	struct rb_node vm_rb;                  /* red-black tree node */

	struct vm_operations_struct *vm_ops;   /* function pointers invoked when the VMA is
	                                          added or removed, and when a file is mapped to it */
};

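To show how a given address is matched against these regions, a simplified sketch follows (this is not the kernel's find_vma(), which does the same job using the red-black tree and the mmap_cache field; the helper name here is made up):

	#include <linux/mm.h>

	/* Walk the ascending, non-overlapping VMA list and return the first
	 * region whose end lies above 'addr'. */
	static struct vm_area_struct *my_find_vma(struct mm_struct *mm, unsigned long addr)
	{
		struct vm_area_struct *vma;

		for (vma = mm->mmap; vma; vma = vma->vm_next) {
			if (addr < vma->vm_end)
				return vma;   /* caller still checks addr >= vma->vm_start */
		}
		return NULL;
	}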



6.2 Process Memory Descriptor

(1) All information related to the process address space is contained in the memory descriptor (mm_struct), referenced by the mm field of the process descriptor

    I. Page tables
    II. Number of allocated pages
    III. ... etc.

(2) The mm_struct is reached through the task_struct field named mm


mm_struct (/include/linux/sched.h)

struct mm_struct {
	struct vm_area_struct *mmap;           /* VMA list */
	struct rb_root mm_rb;                  /* red-black tree of memory regions */
	struct vm_area_struct *mmap_cache;     /* last accessed VMA */

	pgd_t *pgd;                            /* page global directory */
	atomic_t mm_users;                     /* number of lightweight processes using this mm */
	atomic_t mm_count;                     /* references to this mm_struct */

	struct list_head mmlist;               /* list of all mm_structs */
	unsigned long start_code, end_code, start_data, end_data;
	unsigned long start_brk, brk, start_stack;    /* heap and stack addresses */
	unsigned long rss, total_vm, locked_vm;
};


6.3 Address Space of a Process

[Figure: a process's mm_struct and its memory regions (vm_area_structs) cover page-aligned intervals of the user address space between 0 and 3GB.]

6.4 do_mmap (/mm/mmap.c)

To allocate a logical address interval, the kernel uses do_mmap()

    - Checks for errors and limits
    - Tries to find an unmapped logical address interval in the memory region list
    - Allocates a vm_area_struct for the new interval
    - Updates bookkeeping and inserts it into the list (merging if possible)


6.5 do_munmap (/mm/mmap.c)

To release a logical address interval, the kernel uses do_munmap()

    - Locates the memory region that overlaps, since it may have been merged
    - Removes the memory region, splitting if necessary
    - Updates bookkeeping

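From user space, these kernel paths are reached through the mmap()/munmap() system calls. A small runnable example that creates and destroys an anonymous memory region:

	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	int main(void)
	{
		size_t len = 4096 * 4;   /* four pages */

		/* do_mmap() builds a new vm_area_struct for this anonymous region */
		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		strcpy(p, "hello");      /* first touch triggers demand paging (section 7) */
		printf("%s at %p\n", p, (void *)p);

		munmap(p, len);          /* do_munmap() removes the region again */
		return 0;
	}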

7 Page Fault Handler

(1) When a process requests more memory from the kernel, it only gets additional logical address space, not physical memory

(2) When the process then tries to access its new logical address space, a page fault occurs to tell the kernel that the memory is actually needed (i.e., demand paging)

(3) Architecture-dependent part

    - Gets information about:
        I. What is the problem?
        II. Where is the offending address?
    - Handles the "abnormal" cases and vmalloc faults
    - do_page_fault() (/arch/i386/mm/fault.c)

(4) Architecture-independent part

    - Handles the "normal" cases for user space
        I. Does COW, demand paging, etc.
    - handle_mm_fault() (/mm/memory.c)



7.1 Page Fault Handling

(1) do_page_fault()

    - This routine is a dispatcher which is invoked when a page fault occurs
        I. Find out the faulting address (read it from register cr2)
        II. Find out the reason for the fault and call the right routine
            - protection fault: do_wp_page()
            - page-not-present fault: do_no_page()

[Figure: do_page_fault() separates the architecture-specific cases (error conditions, vmalloc faults) from the architecture-independent handling.]


27

7.2 handle_mm_fault (/mm/memory.c)

(1) Demand paging

    - Anonymous pages
    - Device/file-backed pages

(2) Swapping

(3) Copy On Write (COW)

    - How is a page recognized as COW?
        I. The PTE is write-protected
        II. The VMA is writable




[Figure: handle_mm_fault() calls pte_alloc_map() to allocate the page table if necessary, then handle_pte_fault(); if the page table entry is not present, the fault is resolved by demand paging - do_no_page(), or do_anonymous_page() when !vma->vm_ops->nopage (e.g. expanding the stack) - or by do_swap_page() (swap in); if the PTE is present but write-protected, do_wp_page() performs copy-on-write.]

(1) do_no_page()

    1. Used to page in a page
    2. Determines the source of the page
    3. Executable file? Swap area? Or anonymous page?

    Page-in from an executable file

    A. When an executable file is opened, Linux attaches the file operations for the file. The file operations include an mmap() function
    B. For UFS, this mmap() function points to generic_file_mmap()
    C. After opening the file, Linux maps segments by do_mmap()
    D. do_mmap() will actually call the mmap() file operation to attach the vm_operations for the VMA
    E. The vm operations contain a nopage() which is called when a page fault occurs (see filemap_nopage())
    F. do_mmap()
        i.   constructs a VMA
        ii.  inserts the vma->vm_ops by calling file->f_op->mmap()
        iii. brings in pages immediately if VM_LOCKED is specified
    G. filemap_nopage()
        i.   looks for the page in the page cache first
        ii.  reads in the page on a cache miss, with read-ahead if necessary

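The dispatch logic summarized above can be sketched as follows (a simplified illustration of the decision structure only; the real handle_pte_fault() in mm/memory.c passes many more arguments to the helpers):

	#include <linux/mm.h>
	#include <asm/pgtable.h>

	enum fault_action { DO_NO_PAGE, DO_SWAP_PAGE, DO_WP_PAGE, MINOR_FAULT };

	/* Classify a faulting PTE the way handle_pte_fault() does. */
	static enum fault_action classify_pte_fault(pte_t entry, int write_access)
	{
		if (!pte_present(entry)) {
			if (pte_none(entry))
				return DO_NO_PAGE;    /* never mapped: demand paging (file or anonymous) */
			return DO_SWAP_PAGE;      /* was swapped out: swap the page back in */
		}

		if (write_access && !pte_write(entry))
			return DO_WP_PAGE;        /* write to a write-protected PTE: copy-on-write */

		return MINOR_FAULT;           /* otherwise just update the accessed/dirty bits */
	}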

7.3 Swap Area


(1) Swapping in (do_swap_page())

    I.   Determines whether the VMA that contains the faulting page has a swapin() vm_operation
    II.  Calls swap_in() if the VMA does not have a swapin() vm operation
    III. swap_in()
         1. The default swap-in operation: allocate a free page and read the contents of the page from the swap area
         2. De-allocate the swap page in the swap area

(2) Swap Area Organization

    [Figure: a swap area is described by a swap_info entry that references its lockmap and swap map (bitmaps over the area's page slots); the first page of the area ends with the SWAP-SPACE signature.]

(3) Swap Area Management (swap_on())

    I.   Enables a swap area
    II.  Allocates pages for the lockmap and swapmap
    III. Inserts the swap area into the swap list



7.4 Copying Virtual Address Spaces

(1) When Linux creates a new process with fork(), it calls copy_mm() to copy the address space from the parent to the child

(2) The child can share the same address space with its parent if CLONE_VM is specified

(3) copy_mm()

    - new_page_tables()
        I. allocates a page for the PTEs and copies the kernel-part PTEs
    - dup_mmap()
        I. duplicates the mmap (the list of memory regions)



8 References

I.    Linux Cross-Reference, http://lxr.linux.no
II.   O'Reilly, Understanding the Linux Kernel, 2nd Edition
III.  Understanding the Linux Virtual Memory Manager
IV.   Jun-Chiao Wang, VM Architecture
V.    I-Jui Sung, The Page Fault Handler in Linux: A Dynamic View for MM
VI.   http://www.tldp.org/LDP/tlk/tlk.html
VII.  http://www.cs.purdue.edu/homes/li/cs690Z/Outline/vmm.pdf
VIII. http://www.uni-tuebingen.de/zdv/projekte/linux/books/khg
IX.   http://home.earthlink.net/~jknapka/linux-mm/vmoutline.html


Appendix

A. Reverse Map

[Figure: reverse mapping (RMAP) - a page frame's PTE chain links back to the PTEs of every VMA (A, B, C) that maps it, avoiding a search of all virtual address spaces when the page must be unmapped.]




B. __get_free_pages()

__get_free_pages(int priority, unsigned long order, int dma)

    A. Gets contiguous free pages from physical memory
    B. priority: GFP_BUFFER, GFP_IO, GFP_KERNEL, GFP_USER, GFP_ATOMIC, GFP_NFS
    C. order: allocation size (2^order pages)
    D. dma: type of memory to allocate from


C. Slab

kmalloc(size_t size, int priority)

    A. implemented by the slab allocator algorithm
    B. allocates memory from physical memory
        I. in fixed sizes: 32 bytes, 64 bytes, 128 bytes, ...
    C. size: the memory size to allocate
    D. priority: the same as the priority in __get_free_pages()

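A usage sketch of the kmalloc()/kfree() pair (hedged: shown with the gfp-style flag GFP_KERNEL rather than the older "priority" naming above, and both the wrapper function and the buffer size are made up):

	#include <linux/slab.h>
	#include <linux/string.h>
	#include <linux/errno.h>

	/* Allocate a small kernel buffer from the slab allocator's general
	 * caches and release it again. */
	static int demo_kmalloc(void)
	{
		char *buf = kmalloc(200, GFP_KERNEL);   /* rounded up to the 256-byte cache */
		if (!buf)
			return -ENOMEM;

		memset(buf, 0, 200);
		/* ... use the buffer ... */

		kfree(buf);                             /* return the object to its cache */
		return 0;
	}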






[Figure: internal data structures of the original kmalloc implementation - a size_descriptor table with per-size (32, 64, 128, ..., 4080, 8176 bytes) lists of page and block descriptors spanning one or two pages. On allocation the code first looks for a cached bucket page (found_cache_page / found_it); otherwise it obtains and initializes a new page via __get_free_pages(), or returns NULL when no free page is available (no_free_page / no_bucket_page).]


kfree(void *__ptr)

    A. frees the memory allocated by kmalloc
    B. __ptr: the start address of the memory to be freed




D. Non-contiguous Memory Allocation

vmalloc(unsigned long size)

    A. allocates virtually contiguous (but possibly physically scattered) memory
    B. size: the allocation size



[Figure: kfree() flow in the original implementation - find the address of the page and block, free the block, and release the page via free_page() once the whole page can be freed - together with the vmlist: a linked list of descriptors, each holding addr, size and next fields, describing the allocated areas of the kernel's virtual memory.]



vfree(void *addr)

    A. frees the memory allocated by vmalloc
    B. addr: the start address of the memory



[Figure: vmalloc() flow - round up and check the requested size, kmalloc() an area descriptor, search the vmlist for a memory hole, add the new area to the vmlist, and update the kernel page tables.]