Memory Management in Windows CE and RTLinux


Simon Josefsson (simjo598)

Mårten Thurén (marth852)


Linköpings Universitet





Abstract


This article examines the memory management aspects of the real-time operating systems Real Time Linux and Windows CE. It includes information about the backgrounds of the operating systems and about memory management techniques like paging and swapping.




Index

1 Introduction
2 Background
2.1 Windows CE
2.2 RTLinux
3 Memory Management
3.1 Windows CE
3.1.1 CE 1.0 - CE 5.0
3.1.2 CE 6.0
3.2 RTLinux
3.2.1 The Memory allocator
3.2.2 Communication between Linux and RTCore
3.2.3 PSDD for memory protection
3.2.4 Memory management by the Linux kernel
4 Conclusions
5 Reference

1 Introduction

Our purpose with this report is to compare memory management in the two real-time operating systems Microsoft Windows CE and RTLinux. We will first take a look at each operating system by itself and then compare the key features that set them apart, along with our opinions of their pros and cons.

Some brief background information is given for each operating system, to give an understanding of the needs that justified its creation.



2 Background

2.1 Windows CE

Microsoft Windows CE is a slimmed-down 32-bit real-time variant of Microsoft Windows aimed at embedded systems such as smartphones and handheld computers. In recent years implementations have expanded to include cars, game systems and more.

The first version was released in late 1996 and the basic architecture has been overhauled many times. The current version, 6.0, released in November 2006, is able to run on several platforms such as Intel x86, MIPS, ARM/StrongARM, and Hitachi SuperH.

Windows CE implements a subset of the Win32 API, which is probably one of its best selling points considering the low threshold for programmers used to developing software in the Windows environment.

The kernel itself can run in under a megabyte of memory, but Windows CE is perhaps better known through its offspring Pocket PC and Windows Mobile, where different modules and features are added to suit the needs of a specific platform.

There is no real explanation as to what the abbreviation CE stands for. Microsoft says it reflects design features such as "Compact," "Connectable," "Compatible," "Companion," and "Efficient" [1], but "Consumer Electronics" and "Compact Electronics" are also used regularly. Others less friendly to Microsoft may refer to it tongue-in-cheek as "wince" [2].



2.2 RTLinux

RTLinux is an extension of Linux originally created by V. Yodaiken as a research project at the New Mexico Institute of Mining and Technology [4].

The OS was an attempt to implement real-time features in a general-purpose operating system [5]. Since the original Linux kernel was kept mainly intact, RTLinux could benefit from the rapid development of Linux while satisfying the need for real-time services.

This is achieved by using the virtual machine idea: the Linux kernel is used as a passive entity, controlled by the RTCore kernel in a real-time way [6].

Like normal Linux, it soon became too popular to stay just an academic project, and the company FSMLabs was founded [5].

Since then, RTLinux has been used in cell phones, jet engines, power grids, weapon testing systems and unmanned aerial vehicles, to pick a few applications [7].

3 Memory Management

In this section we will look at memory management under Windows CE and RTLinux.


3.1 Windows CE

The basic architecture of the operating system has seen major changes over the years, with version 6.0 possibly being the biggest overhaul yet [3].

It may seem unnecessary to discuss previous versions, but since most devices still run CE 5.0 or lower we thought that it was relevant.



3.1.1 CE 1.0 - CE 5.0

(Note: this section is written for CE 5.0 but most (if not all) points apply to earlier versions.)


Windows CE 5.0 uses 32-bit virtual memory, in which each virtual address is mapped to a corresponding physical location. This could be system RAM, ROM, registers or peripheral devices.

The Memory Management Unit (MMU) is responsible for allocating virtual memory for all new processes and threads, thus making sure everything is in order before actually loading them into real memory for execution. If a problem is detected during the allocation, for instance if available memory is too low, the MMU generates an "out of memory" error and the process demanding the resource can respond accordingly. No damage has been done to the actual memory, so the system won't be corrupted.

Windows CE accesses real physical addresses, such as ROM, during the boot cycle before the MMU is initialized, but once this is done the MMU takes complete control of all memory handling.

Memory can be mapped both statically and dynamically. Most of the memory in kernelspace (more about that later) is mapped statically; this way programmers always know which address corresponds to a certain device or register. OEMs (Original Equipment Manufacturers) have the ability to specify these mappings for their own machines.

Userspace, on the other hand, is mapped dynamically; this way code and data can be moved about with the flow of the program, which makes the system more flexible.


The 4 GB virtual memory map is split evenly into a 2 GB kernelspace block and a 2 GB userspace block.

Kernelspace is off-limits for programs without permission and houses critical OS components. Many services are still run in userspace, however, as Microsoft took a microkernel approach to CE.

Programs running in userspace must make use of system calls if they want to communicate with resources in kernelspace.

Userspace is where the applications run, in the form of processes with one or more threads.

It is divided into 64 slots of 32 MB each, where the first 32 slots are reserved for processes and slots 33-64 are shared memory (see figure 1). Windows CE 5.0 has a limit of 32 processes, but in effect it is really around 27 because system processes like GWES (Graphics, Windowing, and Events Subsystem), FileSys and others have to run by default.


Figure 1. CE 5.0 Memory Map

Slot 0 is an alias for the currently running process. A context switch will map the scheduled process to this slot.
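
The slot layout can be illustrated with a little address arithmetic. The following is only a sketch: it assumes that the 64 slots are laid out back to back from address 0 and are numbered from 0, and the helper names (slot_base, slot_of, to_slot0_alias) are made up for this example and are not part of any Windows CE API.

```python
SLOT_SIZE = 32 * 1024 * 1024             # 32 MB per slot, as stated in the text
SLOT_COUNT = 64                          # 64 slots fill the 2 GB userspace
USERSPACE_SIZE = SLOT_SIZE * SLOT_COUNT  # 2 GB

def slot_base(slot):
    """Return the first virtual address of a slot (0-based index, assumed)."""
    if not 0 <= slot < SLOT_COUNT:
        raise ValueError("slot outside the 2 GB userspace")
    return slot * SLOT_SIZE

def slot_of(address):
    """Return the index of the slot that a userspace address falls in."""
    if not 0 <= address < USERSPACE_SIZE:
        raise ValueError("address outside the 2 GB userspace")
    return address // SLOT_SIZE

def to_slot0_alias(address):
    """Return the same offset inside slot 0, where the running process is aliased."""
    return address % SLOT_SIZE

print(hex(USERSPACE_SIZE))   # 0x80000000
print(hex(slot_base(3)))     # 0x6000000
```

Under these assumptions an address in a process slot and its slot 0 alias differ only in the upper bits that select the slot, which is why a context switch can remap a process without changing the offsets it uses.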





3.1.2 CE 6.0

Core features, along with memory management, have been rewritten in Windows CE 6.0.

Most notable is the elimination of the limit of 32 processes of 32 MB each, which was plenty 10 years ago but became more of a problem as time went on.

The 2 GB userspace is now dedicated to a single process, and the system can map over 32000 individual processes, which should be plenty (at least for another decade...).

This is very similar to how the virtual memory map is set up in Windows XP.

Along with the bump in simultaneous processes, this move also significantly increases security. Now that every process has its own virtual memory table, a process can no longer access the memory of other processes directly but has to explicitly use ReadProcessMemory() and WriteProcessMemory().





Figure 2. CE 6.0 Memory Map





3.2 RTLinux

As mentioned, RTLinux uses two kernels, the RTCore kernel and the Linux kernel.

The Linux kernel is made passive by disabling its hardware interrupts. These interrupts are instead received by RTCore, which uses its own scheduler and emulates hardware interrupts for Linux, so that real-time functionality is achieved.

Before version 2.2, no memory management whatsoever was done by RTCore; everything was handled by Linux. The reason for this was the difficulty of implementing a deterministic memory allocator that satisfied real-time demands [5].


3.2.1 The Memory allocator

In version 2.2, a real-time memory allocator was included, usable by both real-time and non-RT threads.

The allocator is fast, but it is not deterministic, so one cannot depend totally on its operation time. In practice, it is very good at fulfilling its task in time, but in case it fails there is no way out; it does not abort. To achieve 100% reliability, the user would have to pre-allocate memory and manage memory usage themselves.

The memory allocator uses four different pools of memory. These pools consist of blocks of data, which in turn consist of a certain number of bytes. The default values for these pools are 15 blocks of 30000 bytes each; these are just default values, however, and can be changed.

When the user needs memory, two things can occur:

1. The amount of memory requested is greater than half of the block size. In this case the whole block is used and marked as such; the part of the block that is not actually used is wasted.

2. The amount of memory requested is less than half of the block size. Now, if say 50 bytes out of a block of 10000 bytes are used, we do not want to throw away 99.5% of the block. So the 50 bytes are used, and the rest of the block remains usable by other pieces of 50 bytes.

If many allocations are made simultaneously, each allocation will use its own block; however, these blocks will be used in parallel if more allocations of the same size are made [5].
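
The two cases can be sketched as a small simulation of a single pool. This is an illustration of the policy described above and not the RTCore implementation: the class name Pool, the bookkeeping, and the treatment of a request of exactly half a block (handled here like a small request) are assumptions.

```python
class Pool:
    """One pool of equally sized blocks (defaults taken from the text)."""

    def __init__(self, num_blocks=15, block_size=30000):
        self.block_size = block_size
        # Per block: None if untouched, else [chunk_size, chunks_in_use].
        self.blocks = [None] * num_blocks

    def _claim_free_block(self, chunk_size):
        """Mark the first untouched block as carved into chunk_size pieces."""
        for i, state in enumerate(self.blocks):
            if state is None:
                self.blocks[i] = [chunk_size, 1]
                return i
        return None

    def alloc(self, size):
        """Return (block index, byte offset), or None if nothing fits."""
        if size <= 0 or size > self.block_size:
            return None
        if size * 2 > self.block_size:
            # Case 1: more than half a block, so the whole block is taken.
            i = self._claim_free_block(self.block_size)
            return None if i is None else (i, 0)
        # Case 2: reuse a block that already holds pieces of this size.
        capacity = self.block_size // size
        for i, state in enumerate(self.blocks):
            if state is not None and state[0] == size and state[1] < capacity:
                offset = state[1] * size
                state[1] += 1
                return (i, offset)
        # No such block yet: start a new one for this piece size.
        i = self._claim_free_block(size)
        return None if i is None else (i, 0)

pool = Pool(num_blocks=2, block_size=10000)
print(pool.alloc(50))     # (0, 0)
print(pool.alloc(50))     # (0, 50)
print(pool.alloc(6000))   # (1, 0)
print(pool.alloc(6000))   # None
```

The last call fails because both blocks are in use, which mirrors the point that a caller needing full reliability has to pre-allocate.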


3.2.2 Communication between Linux and RTCore

Since the two parts of a complete RTLinux system (real-time and non-RT) run in different address spaces, ways of communicating between the two are needed, so that a thread in the Linux kernel can communicate with a thread in the RTCore kernel.

There are four main ways of doing this: printf() and rtl_printf() via a stdout device, FIFOs, and shared memory.

The first three of these are used for serial data access, but if one needs access to bigger amounts of data from both userspace and the real-time code, one should use shared memory.

The functions used for this are shm_open() and mmap().

First, a file descriptor for the shared memory is created with shm_open(); the shared memory can then be accessed through the address returned by calling mmap() on the file descriptor.
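
The descriptor-then-map sequence can be tried out on an ordinary system with the following sketch. It is only an analogy: Python's standard library has no public shm_open() wrapper, so a temporary file's descriptor stands in for the shared memory object, and the size and the stored value are arbitrary choices for the example.

```python
import mmap
import os
import struct
import tempfile

SIZE = 4096  # size of the shared region, chosen for illustration

# Stand-in for shm_open(): obtain a file descriptor for the shared object.
fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, SIZE)            # give the object its size
    writer = mmap.mmap(fd, SIZE)      # mapping used by the producing side
    reader = mmap.mmap(fd, SIZE)      # second mapping of the same object
    struct.pack_into("<I", writer, 0, 1234)         # write through one mapping
    value = struct.unpack_from("<I", reader, 0)[0]  # read through the other
    writer.close()
    reader.close()
finally:
    os.close(fd)
    os.remove(path)

print(value)   # 1234
```

Both mappings refer to the same underlying object, so data written through one is visible through the other without any copying, which is what makes shared memory suitable for larger amounts of data.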


Since the shared memory space can be used in pretty much any way the programmer prefers, the programmer is also responsible for making sure the program still runs within bounded time. For example, the programmer could let a real-time thread wait for a userspace thread writing to the shared memory space, but this could take any amount of time depending on the load on the CPU. If the userspace thread gets a lower priority, the delay will spread to the real-time thread as well [5].


3.2.3 PSDD for memory protection

Since there is no "default" real memory protection stopping a real-time thread from writing in the memory space used by the Linux kernel, the Process Space Development Domain (PSDD) was created.

PSDD lets a real-time thread run in the context of Linux, that is, it gives the memory management responsibility to Linux while the thread is still scheduled by RTCore.

A typical way to do this is to let a process in real-time space create threads in userspace; shared memory is then automatically available, since they execute in the same address space.

There are some limitations though:

- The thread can only make system calls to RTCore.
- The stack does not automatically grow when it is full, as it would in userspace. If one tries to push to a full stack, one will cause a segmentation fault.
- Dynamic memory allocation and memory remapping can only be done before any threads are created [5].



3.2.4 Memory management by the Linux kernel

Since a lot of the memory management in RTLinux is done not by RTCore but by the Linux kernel, we will briefly cover the memory management model used by Linux.

To be able to use memory at all we need addresses. There are three main types of addresses:

- Physical address: this is what the actual hardware uses, what is sent on the address bus.
- Logical address or virtual address: this is what the applications use when they want to directly address a part of the memory.
- Linear address: an address type whose space always ranges from 0 to the highest address.


3.2.4.1 Pages and page tables

Both the virtual and the physical address spaces are divided into parts of memory called pages. All pages on one system have the same size, although the page size may vary between systems.

A virtual address consists of two parts, an offset and a virtual page frame number.

Page tables are then used to map the virtual pages to physical ones. Each page table entry contains the following information:

- Validity flag: is the page table entry valid?
- Physical page address
- Access information: what rights one has to the page (reading, writing etc.)
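
A lookup through such a table can be sketched as follows. The page size of 4096 bytes, the single "writable" bit standing in for the access information, and the names PageTableEntry and translate are assumptions made for this illustration; the real Linux structures are more involved.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size; the text notes it varies between systems

@dataclass
class PageTableEntry:
    valid: bool      # validity flag
    frame: int       # physical page frame number
    writable: bool   # access information, reduced to a single write bit

def translate(page_table, vaddr, write=False):
    """Translate a virtual address to a physical one, or raise on a fault."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)   # virtual page frame number, offset
    entry = page_table.get(vpn)
    if entry is None or not entry.valid:
        raise LookupError(f"page fault on virtual page {vpn}")
    if write and not entry.writable:
        raise PermissionError(f"virtual page {vpn} is read-only")
    return entry.frame * PAGE_SIZE + offset

table = {0: PageTableEntry(True, 5, False), 1: PageTableEntry(True, 2, True)}
print(hex(translate(table, 0x1004)))   # 0x2004
```

Only the page number is translated; the offset is carried over unchanged, which is why pages and page frames must have the same size.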


3.2.4.2 Demand paging

If a virtual address whose validity flag is 0 is accessed, two things may occur:

1. If the address is invalid, the process trying to access the address will be terminated by the operating system.

2. If the address is valid but the page is not in physical memory, the page will be loaded into memory from the hard drive.
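
The two outcomes can be sketched as a small page fault handler. The Process class, the access function and the list of free frames are hypothetical names for this example, and the sketch assumes a free frame is always available (what happens otherwise is the topic of swapping).

```python
class Process:
    def __init__(self, mapped_pages):
        self.mapped_pages = set(mapped_pages)  # pages the process may legally use
        self.resident = {}                     # virtual page -> frame (flag is 1)
        self.alive = True

def access(proc, vpn, free_frames):
    """Access a virtual page and report what the system had to do."""
    if vpn in proc.resident:
        return "hit"                  # validity flag is 1, no fault
    if vpn not in proc.mapped_pages:
        proc.alive = False            # case 1: invalid address
        return "terminated"
    # Case 2: valid address, page not in memory yet; load it into a free frame.
    proc.resident[vpn] = free_frames.pop()
    return "loaded"

proc = Process(mapped_pages=[0, 1, 2])
frames = [7, 8, 9]
print(access(proc, 1, frames))   # loaded
print(access(proc, 1, frames))   # hit
print(access(proc, 5, frames))   # terminated
```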


3.2.4.3 Swapping

Two techniques are used to decide what to do if the memory is full and a new page needs to be loaded.

1. If the system load is not that great, the system will look for pages of files that have not been changed since they were loaded; such a page can be removed and loaded again when needed. This is of course not true if the page has been changed. Such a page is called a dirty page and cannot be removed without keeping a record of it.

2. If the system load is high, the technique used is to keep track of the usage of pages. A page that has been recently used is "young" and vice versa; the system then chooses an "old" page and swaps it out of memory temporarily.
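
The choice between the two techniques can be sketched as a victim selection function. The Page record, the integer age and the fallback to the oldest page when no clean page exists under low load are assumptions for this example, not details taken from the Linux sources.

```python
from dataclasses import dataclass

@dataclass
class Page:
    name: str
    age: int      # larger means less recently used ("older")
    dirty: bool   # changed since it was loaded

def pick_victim(pages, high_load):
    """Return (page, action) for the page that should leave memory."""
    if not high_load:
        # Technique 1: an unchanged page can simply be dropped and reloaded later.
        for page in pages:
            if not page.dirty:
                return (page, "discard")
    # Technique 2 (also used here as a fallback): swap out the oldest page.
    oldest = max(pages, key=lambda p: p.age)
    return (oldest, "swap out")

pages = [Page("a", 3, True), Page("b", 1, False), Page("c", 7, True)]
print(pick_victim(pages, high_load=False)[0].name)   # b
print(pick_victim(pages, high_load=True)[0].name)    # c
```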


3.2.4.4 Allocation

In the Linux system, groups of 2^n pages are called blocks. Free blocks are listed in the free_area vector: at index 0, blocks of size 1 are listed, at index 1, blocks of size 2, at index 2, blocks of size 4 and so on.

The algorithm used to allocate and deallocate blocks is called the buddy algorithm.

When an allocation system call occurs, the system tries to find a free block of the same size as the one requested. If there is no such block, the next size is searched and so on, until one is found or the last block size is reached.

If a block bigger than the requested one is found, the block is broken down into smaller pieces until a block of the requested size is acquired.


3.2.4.5 Deallocation

Whenever a block is deallocated, the free_area vector is searched for a free block of the same size adjacent to the deallocated one (its buddy). If one is found, they are merged into a bigger block. This is done recursively as the new block is merged [8].
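
Allocation and deallocation with the buddy algorithm can be sketched together. This is a simplified model and not the kernel code: it assumes that index k of free_area holds free blocks of 2^k pages, that blocks are identified by their first page number, and that the largest block is 16 pages; the class and method names are made up for the example.

```python
MAX_ORDER = 4  # largest block is 2**4 = 16 pages, chosen for the example

class BuddyAllocator:
    def __init__(self):
        # free_area[k] holds the first page number of each free 2**k page block.
        self.free_area = [set() for _ in range(MAX_ORDER + 1)]
        self.free_area[MAX_ORDER].add(0)

    def alloc(self, order):
        """Allocate a block of 2**order pages; return its first page or None."""
        for k in range(order, MAX_ORDER + 1):
            if self.free_area[k]:
                start = min(self.free_area[k])
                self.free_area[k].remove(start)
                # Split until the block has the requested size; the upper
                # half of every split goes back on the free list.
                while k > order:
                    k -= 1
                    self.free_area[k].add(start + (1 << k))
                return start
        return None

    def free(self, start, order):
        """Return a block and merge it with its buddy as long as possible."""
        while order < MAX_ORDER:
            buddy = start ^ (1 << order)   # the adjacent block of equal size
            if buddy not in self.free_area[order]:
                break
            self.free_area[order].remove(buddy)
            start = min(start, buddy)
            order += 1
        self.free_area[order].add(start)

heap = BuddyAllocator()
a = heap.alloc(0)   # one page; splits the 16-page block repeatedly
b = heap.alloc(1)   # two pages
print(a, b)         # 0 2
heap.free(a, 0)
heap.free(b, 1)
print(heap.free_area[MAX_ORDER])   # {0}
```

After both blocks are returned, the repeated merging restores the single 16-page block, which is the recursive behaviour described above.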

4 Conclusions

The contrast between the two operating systems is biggest when comparing RTLinux with version 5.0 and lower of Windows CE.

Since all processes in CE 5.0 are allocated in the same virtual memory, one can theoretically directly overwrite data used by other processes, as long as the currently running process has a high enough permission level.

This is not possible in CE 6.0, so the specified message passing functions must be used.

RTLinux is different from both CE 5 and CE 6: the possibility to write in another task's address space depends on the tasks. Two normal tasks in RTCore can write in each other's address spaces, but two tasks run in user mode cannot. However, PSDD allows real-time tasks to be run so that the Linux kernel does all the memory handling while RTCore does the scheduling. This ensures memory security.

We would recommend anyone thinking of constructing a real-time system with Windows CE to choose 6.0 over any previous version. Apart from the security issues, it is no longer hampered by the 32-process limit, and moving to a more monolithic kernel makes it faster by eliminating many system calls.

But in the case of CE 6.0 vs RTLinux we cannot generally say that one is better than the other. It all depends on the specific application of a given system combined with hardware support.




5 Reference

[1] http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q166915
[2] http://en.wikipedia.org/wiki/Windows_CE
[3] http://www.microsoft.com/windows/embedded/eval/wince/default.mspx
[4] http://en.wikipedia.org/wiki/RTLinux
[5] http://www.fsmlabs.com/images/stories/pdf/literature/rtl_book.pdf
[6] http://www-md.e-technik.uni-rostock.de/ma/gol/rtsys/articulos/rtlinux.pdf
[7] http://www.fsmlabs.com/case-studies.html
[8] http://www.linuxhq.com/guides/TLK/mm/memory.html