.NET Compact Framework 2.0 Optimizing For Performance


1

.NET Compact Framework 2.0

Optimizing For Performance

Roman Batoukov

FUN403

Development Lead

.NET Compact Framework

Microsoft Corporation

2

.NET Compact Framework

Visual Studio


Windows CE

Low-level operating system-specific functionality


Threads


Memory


Networking


File I/O






CLR


Type system


Loader


JIT Compiler

Execution Engine provides typesafe runtime for managed code


Garbage collector


Debugger





FX

Rich class libraries to make your life easy!



GUI: Forms


GUI: Drawing (2D & 3D)


Collections


IO, Networking, Crypto


Native interop


Web services


Data & Xml


Globalization

3

.NET Compact Framework

How are we different?

Memory constraints

Storage


Flash/ROM

Physical Memory

Virtual Memory


32MB per process

Design

28% of the surface area in 8% of the size of the full .NET Framework

Portable JIT Compiler

Fast code generation, less optimized

May pitch JIT-compiled code

No NGen, install time or persisted code

Interpreted virtual calls (no v-tables)

Sparse loading of metadata

4

Measuring Performance

Overview

Micro-benchmarks versus Scenarios

Benchmarking tips

Use Environment.TickCount to measure

Measure times greater than 1 second

Start from known state

Ensure nothing else is running

Measure multiple times, take average

Run each test in own AppDomain/Process

Log results at the end

Understand JIT-time versus run-time cost
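
The benchmarking tips above can be sketched as a small timing helper. This is only an illustration under stated assumptions: the names MicroBenchmark, BenchmarkAction, MeasureOnce and MeasureAverage are hypothetical and not part of the .NET Compact Framework; only Environment.TickCount comes from the slide. The helper makes one warm-up call so that JIT-time cost is kept out of the run-time figure, runs for at least the requested interval, and averages several runs.

```csharp
using System;

// Hypothetical delegate type for the code under test (not a framework type).
public delegate void BenchmarkAction();

// Hypothetical helper illustrating the tips: warm up, measure a long
// interval with Environment.TickCount, repeat, and average.
public static class MicroBenchmark
{
    // Calls 'action' repeatedly for at least 'minMilliseconds' and returns
    // the number of completed calls per second for this single run.
    public static double MeasureOnce(BenchmarkAction action, int minMilliseconds)
    {
        if (minMilliseconds < 1)
        {
            throw new ArgumentOutOfRangeException("minMilliseconds");
        }

        action(); // warm-up call: JIT compilation happens here, outside the timed loop

        long calls = 0;
        int start = Environment.TickCount;
        int elapsed;
        do
        {
            action();
            calls++;
            // Unchecked subtraction stays correct if TickCount wraps around.
            elapsed = unchecked(Environment.TickCount - start);
        } while (elapsed < minMilliseconds);

        return calls * 1000.0 / elapsed;
    }

    // Repeats the measurement 'runs' times and returns the average rate.
    public static double MeasureAverage(BenchmarkAction action, int minMilliseconds, int runs)
    {
        if (runs < 1)
        {
            throw new ArgumentOutOfRangeException("runs");
        }

        double sum = 0;
        for (int i = 0; i < runs; i++)
        {
            sum += MeasureOnce(action, minMilliseconds);
        }
        return sum / runs;
    }
}
```

A call such as MicroBenchmark.MeasureAverage(delegate { DoWork(); }, 1000, 5), where DoWork stands for the code being measured, follows the "greater than 1 second" and "measure multiple times" tips. Results are best logged after all runs finish so that logging does not disturb the timings.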

5

.NET Compact Framework

.NET Compact Framework Performance v1 -> v2

(Pocket PC 2003, XScale 400MHz)

Versions measured: 1.0, 1.0 SP3, V2 Beta1, V2 current

Bigger is better

Method Calls (Calls/sec): 3.7M, 7.1M, 8.1M

Virtual Calls (Calls/sec): 2.4M, 2.7M, 5.6M

Simple P/Invoke (Calls/sec): 733K, 1.7M, 1.8M

Primes (to 1500) (iterations/sec): 562, 832, 855

GC Small (8 bytes) (Bytes/sec): 1M, 7M, 7.5M

GC Array (100 ints) (Bytes/sec): 25M, 43M, 115M

Smaller is better

XML Text Reader 200KB (seconds): 1.7, 1.2, 0.72, 0.69

DataSet (static data), 4 tables, 1000 records (seconds): 13.1, 6.6, 7.3, 4.0

DataSet (ReadXml), 3 tables, 100 records (seconds): 12.3, 6.5, 5.2, 3.9

6

Measuring Performance

Performance counters

<My App>.stat (formerly mscoree.stat)

http://msdn.microsoft.com/library/enus/dnnetcomp/html/netcfperf.asp

Registry

HKLM\SOFTWARE\Microsoft\.NETCompactFramework\PerfMonitor

Counters (DWORD) = 1

What does .stat tell you?

Working set and performance statistics

More counters added in v2

Generics usage

COM interop usage

Number of boxed valuetypes

Threading and timers

GUI objects

Network activity (socket bytes sent/received)

7

.stat

counter | total | last datum | n | mean | min | max
Total Program Run Time (ms) | 55937 | - | - | - | - | -
App Domains Created | 18 | - | - | - | - | -
App Domains Unloaded | 18 | - | - | - | - | -
Assemblies Loaded | 323 | - | - | - | - | -
Classes Loaded | 18852 | - | - | - | - | -
Methods Loaded | 37353 | - | - | - | - | -
Closed Types Loaded | 730 | - | - | - | - | -
Closed Types Loaded per Definition | 730 | 8 | 385 | 1 | 1 | 8
Open Types Loaded | 78 | - | - | - | - | -
Closed Methods Loaded | 46 | - | - | - | - | -
Closed Methods Loaded per Definition | 46 | 1 | 40 | 1 | 1 | 2
Open Methods Loaded | 0 | - | - | - | - | -
Threads in Thread Pool | - | 0 | 6 | 1 | 0 | 3
Pending Timers | - | 0 | 93 | 0 | 0 | 1
Scheduled Timers | 46 | - | - | - | - | -
Timers Delayed by Thread Pool Limit | 0 | - | - | - | - | -
Work Items Queued | 46 | - | - | - | - | -
Uncontested Monitor.Enter Calls | 57240 | - | - | - | - | -
Contested Monitor.Enter Calls | 0 | - | - | - | - | -
Peak Bytes Allocated (native + managed) | 4024363 | - | - | - | - | -
Managed Objects Allocated | 1015100 | - | - | - | - | -
Managed Bytes Allocated | 37291444 | 28 | 1015100 | 36 | 8 | 55588
Managed String Objects Allocated | 112108 | - | - | - | - | -
Bytes of String Objects Allocated | 4596658 | - | - | - | - | -
Garbage Collections (GC) | 33 | - | - | - | - | -
Bytes Collected By GC | 25573036 | 41592 | 33 | 774940 | 41592 | 1096328
Managed Bytes In Use After GC | - | 23528 | 33 | 259414 | 23176 | 924612
Total Bytes In Use After GC | - | 3091342 | 33 | 2954574 | 1833928 | 3988607
GC Compactions | 17 | - | - | - | - | -
Code Pitchings | 6 | - | - | - | - | -
Calls to GC.Collect | 0 | - | - | - | - | -
GC Latency Time (ms) | 279 | 16 | 33 | 8 | 0 | 31
Pinned Objects | 156 | - | - | - | - | -
Objects Moved by Compactor | 73760 | - | - | - | - | -
Objects Not Moved by Compactor | 11811 | - | - | - | - | -
Objects Finalized | 6383 | - | - | - | - | -
Boxed Value Types | 350829 | - | - | - | - | -
Process Heap | - | 1626 | 430814 | 511970 | 952 | 962130
Short Term Heap | - | 0 | 178228 | 718 | 0 | 21532
JIT Heap | - | 0 | 88135 | 357796 | 0 | 651663
App Domain Heap | - | 0 | 741720 | 647240 | 0 | 833370
GC Heap | - | 0 | 376 | 855105 | 0 | 2097152
Native Bytes Jitted | 7202214 | 152 | 26910 | 267 | 80 | 5448
Methods Jitted | 26910 | - | - | - | - | -
Bytes Pitched | 1673873 | 0 | 7047 | 237 | 0 | 5448

Peak Bytes Allocated (native + managed)

JIT Heap

App Domain Heap

GC Heap

GC Latency Time (ms)

Garbage Collections (GC)

Managed String Objects Allocated

Boxed Value Types

8

.NET Compact Framework architecture (diagram); recoverable box labels by layer:

FX (grouped as Globalization, GUI, Net, I/O, Crypto): System.Globalization, System.Cryptography, System.IO.Ports, Microsoft.Win32.Registry, System.IO.File, System.Data, System.Xml, System, mscorlib, Microsoft.VisualBasic, System.Reflection, System.WebServices, System.Net.Http*, System.Net.Sockets, DirectX.DirectD3DM, Windows.Forms, System.Drawing

CLR: JIT Compiler & GC, Debugger, Class Loader, Assembly Cache, Native Interop, App Domain Loader, Process Loader, Memory and Threading, File Mapping, Cert/Security Verification

Windows CE: SSL, Sockets, NTLM, GDI/GWES, Common Controls, Registry, File I/O, Encodings, Sorting, Crypto API, Calendar Data, Culture Data, Casing, D3DM

Host and tooling: Redist, Host, Visual Studio Debug Engine, ICorDbg, Managed Loader, MSI Setup (ActiveSync), Per Device CAB Install (SMS, etc)

9

Common Language Runtime

Execution engine

Call path

Managed calls are more expensive than native

Instance call: ~2-3X the cost of a native function call

Virtual call: ~1.4X the cost of a managed instance call

Platform invoke: ~5X the cost of a managed instance call (*marshaling an int parameter)

Properties are calls

JIT compilers

All platforms have the same optimizing JIT compiler architecture in v2

Optimizations

Method inlining for simple methods

Variable enregistration

String interning

10

Common Language Runtime

Call path (sample)

// Version 1: Volume is a virtual property
public class Shape
{
    protected int m_volume;

    public virtual int Volume
    {
        get { return m_volume; }
    }
}

public class Cube : Shape
{
    public Cube(int vol)
    {
        m_volume = vol;
    }
}

// Version 2: Volume is a non-virtual property
public class Shape
{
    protected int m_volume;

    public int Volume
    {
        get { return m_volume; }
    }
}

public class Cube : Shape
{
    public Cube(int vol)
    {
        m_volume = vol;
    }
}

11

Common Language Runtime

Call path (sample)

public class MyCollection
{
    private const int m_capacity = 10000;

    private Shape[] storage = new Shape[m_capacity];

    // Bubble sort by Volume; the property is read twice per comparison
    public void Sort()
    {
        Shape tmp;
        for (int i = 0; i < m_capacity - 1; i++) {
            for (int j = 0; j < m_capacity - 1 - i; j++)
                if (storage[j+1].Volume < storage[j].Volume) {
                    tmp = storage[j];
                    storage[j] = storage[j+1];
                    storage[j+1] = tmp;
                }
        }
    }
}

IL for the property access: callvirt instance int32 Shape::get_Volume()

12

Common Language Runtime


Call path (sample)

Measured time with each version of Shape (virtual versus non-virtual Volume):

Virtual Volume property: 57 sec

Non-virtual Volume property: 39 sec

No virtual call overhead

Inlined (no call overhead at all)

~ Equal to accessing field

13

Common Language Runtime

‘The Memory Bill’

Shared by all .NET applications running

.NET Compact Framework CLR DLLs

.NET assemblies (memory mapped)

Dynamic, per process memory costs

Objects allocated

Thread stacks

Number of classes and methods

Runtime representation of metadata

JIT compiled code

Unmanaged allocations (not under control of the CLR)

Operating System

Native DLLs called by application via P/Invoke

14

Common Language Runtime

Memory heaps

Five memory heaps to reduce fragmentation

App-domain: CLR dynamic representation of metadata for the assembly loader

JIT: JIT compiled code buffers

Garbage Collector (GC): Application and Framework object allocations

Short-term: CLR temporary/short lived allocation heap

Process: Other CLR allocations

15

Going Into The Background

Yahtzee game

Application goes into background or low on memory

16

Real World Measurements

Yahtzee game

Where | What is the memory? | Peak | ‘On Minimize’
Shared, RO Demand Paged (Code, .NET Assemblies) | Mscoree.dll, mscoree2_0.dll, netcfagl2_0.dll | 1MB | 500KB
Shared, RO Demand Paged (Code, .NET Assemblies) | Mscorlib, Yahtzee.exe, system, System.Drawing, System.Windows.Forms | 1.7MB | 1MB
Process Memory (CLR Heaps) | JIT Heap | 220KB | 30KB
Process Memory (CLR Heaps) | Process Heap | 47KB | 11KB
Process Memory (CLR Heaps) | App Domain Heap | 177KB | 177KB
Process Memory (Application) | GC Heap, object allocations | 1MB | 64KB
Process Memory | Total | 1.5MB | 282KB
Total | | 4.2MB | 1.7MB

17

Common Language Runtime


Garbage Collector (GC)

Managed allocations are FAST

7.5MB per sec (allocating 8 byte objects)

GC manages its own heap

Allocates 64KB blocks, 1MB cache

Uses VirtualAlloc to enable release of virtual and physical memory back to the system

Compacts heap when fragmentation occurs

18

Common Language Runtime

Garbage Collector

What triggers a GC?

Memory allocation failure

1M of GC objects allocated (v2)

Application going to background

GC.Collect() (Avoid “helping” the GC!)

In general, if you don’t allocate objects, GC won’t occur

Beware of side-effects of calls that may allocate objects

What happens at GC time?

Freezes all threads at a safe point

Finds all live objects and marks them

An object is live if it is reachable from a root location

Unmarked objects are freed, or added to the finalizer queue if they have a finalizer

Finalizers are run on a separate thread

GC pools are compacted if required

Free memory is returned to the operating system

19

Common Language Runtime

Garbage Collector

GC Latency per collection

20

Common Language Runtime

Garbage Collector

Allocation rate

21

Common Language Runtime

Where garbage comes from?

Unnecessary string allocations

Strings are immutable

String manipulations (Concat(), etc.) cause copies

Use StringBuilder
http://weblogs.asp.net/ricom/archive/2003/12/02/40778.aspx

// Slow: every += allocates a new string and copies the old contents
String result = "";
for (int i = 0; i < 10000; i++) {
    result += ".NET Compact Framework";
    result += " Rocks!";
}

// Fast: StringBuilder appends into a growable buffer
StringBuilder result = new StringBuilder();
for (int i = 0; i < 10000; i++) {
    result.Append(".NET Compact Framework");
    result.Append(" Rocks!");
}

22

.stat

counter | total | last datum | n | mean | min | max
Total Program Run Time (ms) | 11843 | - | - | - | - | -
App Domains Created | 1 | - | - | - | - | -
App Domains Unloaded | 1 | - | - | - | - | -
Assemblies Loaded | 2 | - | - | - | - | -
Classes Loaded | 175 | - | - | - | - | -
Methods Loaded | 198 | - | - | - | - | -
Closed Types Loaded | 0 | - | - | - | - | -
Closed Types Loaded per Definition | 0 | 0 | 0 | 0 | 0 | 0
Open Types Loaded | 0 | - | - | - | - | -
Closed Methods Loaded | 0 | - | - | - | - | -
Closed Methods Loaded per Definition | 0 | 0 | 0 | 0 | 0 | 0
Open Methods Loaded | 0 | - | - | - | - | -
Threads in Thread Pool | - | 0 | 2 | 0 | 0 | 1
Pending Timers | - | 0 | 2 | 0 | 0 | 1
Scheduled Timers | 1 | - | - | - | - | -
Timers Delayed by Thread Pool Limit | 0 | - | - | - | - | -
Work Items Queued | 1 | - | - | - | - | -
Uncontested Monitor.Enter Calls | 2 | - | - | - | - | -
Contested Monitor.Enter Calls | 0 | - | - | - | - | -
Peak Bytes Allocated (native + managed) | 3326004 | - | - | - | - | -
Managed Objects Allocated | 60266 | - | - | - | - | -
Managed Bytes Allocated | 5801679432 | 28 | 60266 | 96267 | 8 | 580020
Managed String Objects Allocated | 20041 | - | - | - | - | -
Bytes of String Objects Allocated | 5800480578 | - | - | - | - | -
Garbage Collections (GC) | 4912 | - | - | - | - | -
Bytes Collected By GC | 5918699036 | 1160076 | 4912 | 1204946 | 597824 | 1572512
Managed Bytes In Use After GC | - | 580752 | 4912 | 381831 | 8364 | 580752
Total Bytes In Use After GC | - | 1810560 | 4912 | 1611885 | 1097856 | 1810560
GC Compactions | 0 | - | - | - | - | -
Code Pitchings | 0 | - | - | - | - | -
Calls to GC.Collect | 0 | - | - | - | - | -
GC Latency Time (ms) | 686 | 0 | 4912 | 0 | 0 | 16
Pinned Objects | 0 | - | - | - | - | -
Objects Moved by Compactor | 0 | - | - | - | - | -
Objects Not Moved by Compactor | 0 | - | - | - | - | -
Objects Finalized | 1 | - | - | - | - | -
Boxed Value Types | 3 | - | - | - | - | -
Process Heap | - | 278 | 235 | 2352 | 68 | 8733
Short Term Heap | - | 0 | 278 | 986 | 0 | 10424
JIT Heap | - | 0 | 360 | 12103 | 0 | 24444
App Domain Heap | - | 0 | 1341 | 46799 | 0 | 64562
GC Heap | - | 0 | 35524 | 2095727 | 0 | 3276800
Native Bytes Jitted | 22427 | 140 | 98 | 228 | 68 | 1367
Methods Jitted | 98 | - | - | - | - | -
Bytes Pitched | 0 | 0 | 0 | 0 | 0 | 0
Methods Pitched | 0 | - | - | - | - | -
Method Pitch Latency Time (ms) | 0 | 0 | 0 | 0 | 0 | 0
Exceptions Thrown | 0 | - | - | - | - | -
Platform Invoke Calls | 0 | - | - | - | - | -

Managed String Objects Allocated: 20040

Garbage Collections (GC): 4912

Bytes of String Objects Allocated: 5,800,480,574

Bytes Collected By GC: 5,918,699,036

GC latency: 107128 ms

String result = "";
for (int i = 0; i < 10000; i++) {
    result += ".NET Compact Framework";
    result += " Rocks!";
}

Run time 173 sec

23


.stat

Managed String Objects Allocated: 56

Bytes of String Objects Allocated: 2097718

Garbage Collections (GC): 2

Bytes Collected By GC: 1081620

GC Latency: 21 ms

StringBuilder result = new StringBuilder();
for (int i = 0; i < 10000; i++) {
    result.Append(".NET Compact Framework");
    result.Append(" Rocks!");
}

Run time 0.1 sec

24

Common Language Runtime

Where garbage comes from?

Unnecessary boxing

Value types are allocated on the stack (fast to allocate)

Boxing causes a heap allocation and a copy

Use strongly typed arrays and collections (the non-generic Framework collections are NOT strongly typed)

// Simplified shape of Hashtable: keys and values are stored as Object,
// so value types are boxed on the way in
class Hashtable {
    struct bucket {
        Object key;
        Object val;
    }

    bucket[] buckets;

    public Object this[Object key] { get; set; }
}

25

Common Language Runtime

Sample Code: Value Types and boxing

public struct AccountId {
    public int m_number;
    public override int GetHashCode() { return m_number; }
}

public struct AccountData {
    private int m_balance;
    public int Balance {
        get { return m_balance; }
        set { m_balance = value; }
    }
}

// Untyped version: keys and values pass through Object, so structs are boxed
public class Accounts {
    public const int num_accounts = 10000;
    Object[] accounts = new Object[num_accounts];
    public Object this[Object id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}

// Strongly typed version: no boxing
public class Accounts {
    public const int num_accounts = 10000;
    AccountData[] accounts = new AccountData[num_accounts];
    public AccountData this[AccountId id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}

26

Common Language Runtime

Sample Code: Value Types and boxing


// Fill the accounts
Accounts ac = new Accounts(); int i;
for (i = 0; i < Accounts.num_accounts; i++) {
    AccountData rec = new AccountData();
    rec.Balance = 100;
    AccountId id; id.m_number = i;
    ac[id] = rec;
}

// Count how many read-modify-write iterations complete in one second
long iterations = 0;
long start = Environment.TickCount;
do {
    for (i = 0; i < Accounts.num_accounts; i++) {
        AccountId id; id.m_number = i;
        AccountData rec = (AccountData)ac[id];
        rec.Balance -= 10;
        ac[id] = rec;
    }
    iterations += i;
} while (Environment.TickCount - start < 1000);

27

Common Language Runtime

Sample Code: Value Types and boxing

Untyped Accounts (Object[] storage, Object indexer)

0.15M iter/sec

Boxed value types 4138460

Garbage Collections (GC) 4

Bytes Collected By GC 4138460

GC Latency Time 132 ms

Strongly typed Accounts (AccountData[] storage, AccountId indexer)

2.5M iter/sec

Boxed value types 2

Garbage Collections (GC) 0

Bytes Collected By GC 0

GC Latency Time 0 ms


28

Common Language Runtime

Sample Code: Generics

public class Accounts<U, V>
{
    public const int num_accounts = 10000;

    private U[] accounts = new U[num_accounts];

    public U this[V id] {
        get { return accounts[id.GetHashCode()]; }
        set { accounts[id.GetHashCode()] = value; }
    }
}

Accounts<AccountData, AccountId> ac = new Accounts<AccountData, AccountId>();
int i;
for (i = 0; i < Accounts<AccountData, AccountId>.num_accounts; i++) {
    AccountData rec = new AccountData(); rec.Balance = 100;
    AccountId id; id.m_number = i;
    ac[id] = rec;
}

long iterations = 0; long start = Environment.TickCount;
do {
    for (i = 0; i < Accounts<AccountData, AccountId>.num_accounts; i++) {
        AccountId id; id.m_number = i;
        AccountData rec = (AccountData)ac[id];
        rec.Balance -= 10;
        ac[id] = rec;
    }
    iterations += i;
} while (Environment.TickCount - start < 1000);



29

Common Language Runtime

Sample Code: Generics

Results for the three Accounts implementations:

Untyped: 0.15M iter/sec

Strongly typed: 2.5M iter/sec

Generic: 2.5M iter/sec

30


.stat

Boxed value types: 2

Garbage Collections (GC): 0

Bytes Collected By GC: 0

GC Latency Time: 0 ms

Closed Types Loaded: 1

Closed Types per definition: mean=1, max=1

31

Common Language Runtime

Generics

Strong typing without code duplication

Fully specialized implementation in .NET Compact Framework v2

Pros

Always strongly typed

No unnecessary boxing and type casts

Specialized code is more efficient than shared

Cons

Internal execution engine data structures and JIT-compiled code aren’t shared

List<int>, List<string>, List<MyType>

http://blogs.msdn.com/romanbat/archive/2005/01/06/348114.aspx

32

Common Language Runtime

Finalization and Dispose

Cost of finalizers

Non-deterministic cleanup

Extends lifetime of object

In general, rely on GC for automatic memory cleanup

The exceptions to the rule…

If your object contains an unmanaged resource that the GC is unaware of, you need to implement a finalizer

Also implement the Dispose pattern to release the unmanaged resource in a deterministic manner

Dispose method should suppress finalization (FxCop rule)

If the object you are using implements Dispose, call it when you are done with the object

‘Objects Finalized’ performance counter
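
The finalizer and Dispose guidance above can be sketched as follows. This is a minimal illustration, assuming a hypothetical NativeBuffer class that owns a block of unmanaged memory obtained through Marshal.AllocHGlobal; the class name and its IsDisposed property are invented for the example, and API availability on a given .NET Compact Framework version should be checked before relying on it.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around an unmanaged memory block the GC knows nothing about.
public class NativeBuffer : IDisposable
{
    private IntPtr m_memory;
    private bool m_disposed;

    public NativeBuffer(int size)
    {
        m_memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed
    {
        get { return m_disposed; }
    }

    // Deterministic cleanup: callers invoke this when done with the object.
    public void Dispose()
    {
        Dispose(true);
        // The resource is already released, so the finalizer has nothing
        // left to do; suppressing it avoids the extended object lifetime.
        GC.SuppressFinalize(this);
    }

    // Safety net: runs on the finalizer thread only if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (m_disposed)
        {
            return; // safe to call more than once
        }

        // When 'disposing' is true, other managed IDisposable members could
        // be disposed here; this sketch has none.

        if (m_memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(m_memory);
            m_memory = IntPtr.Zero;
        }
        m_disposed = true;
    }
}
```

Wrapping the object in a using statement, for example using (NativeBuffer buffer = new NativeBuffer(1024)) { ... }, calls Dispose even when an exception is thrown, which should keep such objects out of the ‘Objects Finalized’ count.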

33

Common Language Runtime

Exceptions

Exceptions are cheap…until you throw

Throw exceptions in exceptional circumstances

Do not use exceptions for normal flow control

Use performance counters to track the number of exceptions thrown

Replace “On Error/Goto” with “Try/Catch/Finally” in Microsoft Visual Basic .NET
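
One way to keep exceptions out of normal flow control is a "try" style method that reports failure through its return value. The sketch below is an assumption-laden illustration: SafeParse and TryParseNonNegativeInt are hypothetical names, and the method handles only plain decimal digits. It shows validating input that is expected to be malformed some of the time without throwing.

```csharp
using System;

// Hypothetical helper: parses user input without using exceptions
// to signal the (expected) case of malformed text.
public static class SafeParse
{
    // Returns true and sets 'value' when 'text' consists of 1 to 9 decimal
    // digits; returns false otherwise. Nine digits always fit in an int,
    // so no overflow check is needed.
    public static bool TryParseNonNegativeInt(string text, out int value)
    {
        value = 0;
        if (text == null || text.Length == 0 || text.Length > 9)
        {
            return false;
        }

        int result = 0;
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c < '0' || c > '9')
            {
                return false; // bad input is reported, not thrown
            }
            result = result * 10 + (c - '0');
        }

        value = result;
        return true;
    }
}
```

A caller branches on the result, as in if (SafeParse.TryParseNonNegativeInt(input, out quantity)) { ... } else { ... }, and keeps throw for conditions that are genuinely exceptional.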

34

Common Language Runtime

Reflection

Reflection can be expensive

Reflection performance cost

Type comparisons (for example: typeof() )

Member access (for example: Type.InvokeMember())

Think ~10-100x slower

Working set cost

Type and Member enumerations (for example: Assembly.GetTypes(), Type.GetMethods())

Runtime data structures

Think ~100 bytes per loaded type, ~80 bytes per loaded method

Be aware of APIs that use reflection as a side effect

Override Object.ToString()

Override GetHashCode() and Equals() (for value types)
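
The advice to override ToString(), GetHashCode() and Equals() on value types can be sketched like this. Point2D is a hypothetical struct invented for the example; the point is that explicit overrides compare and hash the fields directly instead of relying on the inherited ValueType implementations, which may fall back to reflection.

```csharp
using System;

// Hypothetical value type with explicit overrides, so the inherited
// (potentially reflection-based) implementations are never used.
public struct Point2D
{
    public int X;
    public int Y;

    public Point2D(int x, int y)
    {
        X = x;
        Y = y;
    }

    public override bool Equals(object obj)
    {
        if (!(obj is Point2D))
        {
            return false;
        }
        Point2D other = (Point2D)obj;
        return X == other.X && Y == other.Y;
    }

    public override int GetHashCode()
    {
        // Simple field-based hash; overflow is intentional and harmless here.
        return unchecked((X * 397) ^ Y);
    }

    public override string ToString()
    {
        return "(" + X.ToString() + ", " + Y.ToString() + ")";
    }
}
```

Note that calling Equals(object) with a struct argument still boxes that argument; a strongly typed overload such as a hypothetical bool Equals(Point2D other) would avoid that as well.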

35

Common Language Runtime

Building a Cost Model for Managed Math

Math performance

32 bit integers: Similar to native math

64 bit integers: ~5-10X cost of native math

Floating point: Similar to native math

ARM processors do not have FPU

36

.NET Compact Framework architecture diagram (FX, CLR, Windows CE), repeated from the earlier overview slide

37

Base Class Library

Collections

Pre-size collection classes appropriately

Default capacity is small (for example 4 for ArrayList)

Resizing creates unnecessary copies

Avoid unnecessary boxing and type casts: use generic collections

Full support for all generic collections in the .NET Compact Framework v2!
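
Both collection tips can be shown in a few lines. This is a sketch only: PreSizedCollections and BuildSequence are hypothetical names. It uses the List<T> constructor that takes an initial capacity, so the backing array is allocated once, and a generic element type, so the int values are stored without boxing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example class; only List<T> itself is a framework type.
public static class PreSizedCollections
{
    // Builds the sequence 0, 1, ..., count - 1.
    public static List<int> BuildSequence(int count)
    {
        // Capacity is reserved up front, so Add never has to grow
        // the backing array and copy the existing elements.
        List<int> values = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            values.Add(i); // List<int> stores the int directly: no boxing
        }
        return values;
    }
}
```

With an untyped ArrayList created without a capacity, the same loop would box every element and reallocate the backing array several times as it grows.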

38

Windows Forms

Best Practices

Load and cache Forms in the background

Populate data separate from Form.Show()

Pre-populate data, or

Load data async to Form.Show()

Use BeginUpdate/EndUpdate when it is available

e.g. ListView, TreeView

Use SuspendLayout/ResumeLayout when repositioning controls

Keep event handling code tight

Process bigger operations asynchronously

Blocking in event handlers will affect UI responsiveness

Form load performance

Reduce the number of method calls during initialization

39

Graphics And Games

Best Practices

Compose to off-screen buffers to minimize direct to screen blitting

Approximately 50% faster

Avoid transparent blitting in areas that require performance

Approximately 1/3 the speed of normal blitting

Consider using pre-rendered images vs using System.Drawing rendering primitives

Need to measure on a case-by-case basis

40

XML

Best Practices for Managing Large XML Data Files

Use XmlTextReader/XmlTextWriter

Smaller memory footprint than using XmlDocument

XmlTextReader is a pull model parser which only reads a “window” of the data

XmlDocument builds a generic, untyped object model using a tree

Type stored as string

OK to use with smaller documents (64K XML: ~0.25s)

Optimize the structure of the XML document

Use elements to group (allows use of Skip() in XmlReader)

Use attributes to reduce size: processing attribute-centric documents is faster

Keep it short! (attribute and element names)

Avoid gratuitous use of white space

Use XmlReader/XmlWriter factory classes to create an optimized reader or writer

Applying proper XmlReaderSettings can improve performance
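
The reader-side tips above can be combined in one short sketch. The document layout (an orders root with item elements and a history group) and the names OrderScanner and CountItems are assumptions made up for the example. It creates the reader through the XmlReader.Create factory with XmlReaderSettings that drop whitespace, comments and processing instructions, and uses Skip() to jump over a grouped subtree that is not needed.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical scanner for an assumed <orders> document layout.
public static class OrderScanner
{
    // Counts <item> elements, skipping every <history> subtree entirely.
    public static int CountItems(TextReader input)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;
        settings.IgnoreComments = true;
        settings.IgnoreProcessingInstructions = true;

        int count = 0;
        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent(); // position on the root element
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "history")
                {
                    // Skip() already advances to the node after the subtree,
                    // so Read() must not be called again in this branch.
                    reader.Skip();
                }
                else
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                    {
                        count++;
                    }
                    reader.Read();
                }
            }
        }
        return count;
    }
}
```

Because the history group is a single element, one Skip() call passes over all of its children; a flat document with no grouping elements would force the reader to visit every node.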

41

Data

Data access options (diagram); recoverable labels:

Business logic and presentation: GUI controls, business logic

In-memory data copy: custom data structures (arrays, collections), DataSet, XmlDocument

Serialization: custom binary, XmlSerializer, IXmlSerializable

File system: binary or text file, XML file, SQL Server Mobile

Transports: ActiveSync, HTTP, Sockets, Replication or RDA

Data access components: Data Adapters, Data Readers, Web services

Remote system: SQL DB, other DB, MSMQ, other data sources

42

Data

Data access options (diagram), local storage path only; recoverable labels:

Business logic and presentation: GUI controls, business logic

In-memory data copy: custom data structures (arrays, collections), DataSet, XmlDocument

Serialization: custom binary, XmlSerializer, IXmlSerializable

File system: binary or text file, XML file, SQL Server Mobile

SQL Server Mobile access: SqlCEResultSet, DataReader

43

Data

Data access options diagram (local and remote paths), repeated from the earlier Data slide

44

Web Services

Where is the bottleneck?

Are you network bound or CPU bound?

Use perf counters: socket bytes sent / received. Do you come close to the network capacity?

If you are network bound, work on reducing the size of the message

Create a “canned” message, send over HTTP. Compare performance with the web service.

If you are CPU bound, optimize the serialization scheme for speed

http://blogs.msdn.com/mikezintel/archive/2005/03/30/403941.aspx

45

Moving Forward

More tools

Live Remote Performance Counters

Under construction

Allocation profiler (CLR profiler)

Call profiler

Working set improvements

More speed

46

Summary

Make performance a requirement and measure

Understand the APIs

Avoid unnecessary object allocation and copies due to

String manipulations

Boxing

Not pre-sized collections

Understand data access performance bottlenecks

47

Community Resources

At PDC

ILL03 Intelligent Data Synchronization in a Semi-Connected Environment

ILL04 Write Once, Display Anywhere: UI for Windows Mobile Devices

TLN316 Windows Mobile: New Emulation Technology for Building Mobile Applications with Visual Studio 2005

After PDC

MSDN dev center:
http://msdn.microsoft.com/mobility/

.NET Compact Framework Team Blog:
http://blogs.msdn.com/netcfteam/

.NET Compact Framework Performance FAQ:
http://blogs.msdn.com/netcfteam/archive/2005/05/04/414820.aspx


48

Your Feedback

is Important!

Please Fill Out a Survey

49

© 2005 Microsoft Corporation. All rights reserved.

This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.