CLR Fundamentals talk notes

feastcanadian, Software & software engineering

14 Dec 2013



Goals: provide a safe, managed execution environment that is easier for developers to use
than unmanaged code, supports multiple languages, and allows interoperation between
them. Reduce errors, facilitate library reuse (cf. MFC which you couldn’t use from VB6).
Metadata extensibility e.g. MTS/COM+ attributes.

CLI (the standard) vs. CLR (the implementation).

Microsoft CLRs: desktop, Silverlight, WP7

Other implementations: Mono, Rotor.

C# and VB are mostly thin layers over the underlying CLR. Therefore, understanding the CLR
is essential to understanding C# and VB. You cannot use these languages effectively without
understanding the platform they run on.

Also, many issues and questions that arise turn out to be runtime or framework questions,
not language questions.

Similarly, design guidelines are generally not specific to C# or VB but
are about designing for the CLR.

Therefore, developers need to be conscious of the difference between language
issues/idioms, framework issues/idioms and fundamental issues/idioms.

Three core dimensions: a runtime environment (including services like loading DLLs,
memory management, etc.); common type system (enabling language interop); and
standard core libraries.


Predecessor: COM. Unmanaged environment with weak and non-extensible metadata
support and poor abstraction of runtime services.

1.0 and 1.1



Note 3.0 and 3.5 are not revs of the CLR, only of the higher level frameworks.

Plan of attack

Two main sections: how the CLR represents programs (static aspects), and how the CLR runs
programs (dynamic aspects)

Static aspects of the CLR


Fundamental concepts: types and members.

Every piece of data has a type. It’s not like C where you have a bag of bits (possibly even
uninitialized ones!) and can interpret it however you like: instead, the runtime ensures that
items are of the correct type.

The CLR defines a number of types: 8- to 64-bit integral types in both signed and unsigned
variants, 32 and 64 bit floating point types, native int (IntPtr/UIntPtr), char, string, object.

Other types are defined by libraries (e.g. Decimal in the BCL).

Reference (managed reference) and pointer (raw pointer) types.

The CLR type system is object-oriented and unified (all data are convertible to object).

But not all types inherit from object e.g. reference/pointer types, interfaces.

What kinds of type are there?

Reference types


Value types


Interface types

A bit more complicated: managed references and pointers

Types are frozen

A compiler produces a type in an assembly. That’s it. Game over.

Extension methods are syntactic sugar that give the illusion of extending a type, but don’t.

Partial classes (and methods) are a compiler artefact not a CLR feature. You cannot use
partial to extend a compiled type: you will just end up creating a new type that shadows it!

What is the difference between reference and value types?

Reference types have identity. There can be several references to the same instance.

Value types have only value. When you pass a value, the bits are copied: you get a new copy.

Objects of reference types are not accessed directly, but through a reference (that is, a
variable of reference type stores a reference rather than the object itself). A variable of
value type contains the very bits of the object itself.
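The copy-versus-share distinction above can be sketched in a few lines of C#. This is only an illustration: PointRef, PointVal and CopyDemo are hypothetical names invented for the example, not types from the talk.

```csharp
using System;

// Hypothetical types for illustration only.
class PointRef { public int X; }   // reference type
struct PointVal { public int X; }  // value type

static class CopyDemo
{
    public static int MutateCopyOfReference()
    {
        var a = new PointRef { X = 1 };
        var b = a;   // copies the reference: a and b name the same instance
        b.X = 42;
        return a.X;  // the change is visible through a
    }

    public static int MutateCopyOfValue()
    {
        var a = new PointVal { X = 1 };
        var b = a;   // copies the bits: b is an independent value
        b.X = 42;
        return a.X;  // a is unchanged
    }

    static void Main()
    {
        Console.WriteLine(MutateCopyOfReference()); // 42
        Console.WriteLine(MutateCopyOfValue());     // 1
    }
}
```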

Corollary: value types must have a zero value (for array allocations), or rather, they
always have a zero value, and custom value types must deal with this.

Corollary: value types can’t have nondefault field initialisers or otherwise do stuff in their
default constructor.

When should you use a value type? When you have a small object that represents a value.

Idiom: value types should be immutable.

(Because when you change a Money value from $1000 to $2000, you’re not changing the
value of $1000. You’re specifying a different value.)


Equality semantics: reference types usually have identity equality semantics, value types
usually have value equality semantics.

The curious case of String: value semantics but implemented as reference type. Why? (1)
Efficiency in passing strings. (2) Values must be fixed size.

Value type GetHashCode default implementation sucks: override it!


Object is a reference type (so are interfaces). What happens when you store a value type in
an object reference? Boxing.

Unboxing. Note C# ‘cast object to value type’ results in a CLR unbox, not a cast: hence
uglinesses like (long)(int)obj.
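A small sketch of why the double cast is needed. UnboxDemo and its method names are made up for this example; the only behaviour relied on is that unboxing to the wrong value type throws InvalidCastException.

```csharp
using System;

static class UnboxDemo
{
    // Unbox to the exact boxed type first, then apply a numeric conversion.
    public static long UnboxThenConvert(object boxedInt)
    {
        return (long)(int)boxedInt;
    }

    // Casting a boxed int straight to long is an unbox to the wrong type,
    // not a numeric conversion, so it fails at run time.
    public static bool DirectCastThrows(object boxedInt)
    {
        try
        {
            long unused = (long)boxedInt;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        object obj = 42; // boxes an int
        Console.WriteLine(UnboxThenConvert(obj));  // 42
        Console.WriteLine(DirectCastThrows(obj));  // True
    }
}
```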

What happens if you modify a boxed instance?

If you modify it through the boxed value (e.g. reflection), then it works. But if you unbox it
then the unbox gets you a copy, so you won’t modify the original!

It’s confusing: just another argument for immutability.
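A sketch of the two cases. The notes mention reflection as the way to reach the boxed value; this example swaps in an interface call instead, which also operates on the box in place. Counter, ICounter and BoxMutationDemo are hypothetical names.

```csharp
using System;

interface ICounter { void Increment(); }

// A deliberately mutable struct (hypothetical, for illustration only).
struct Counter : ICounter
{
    public int Value;
    public void Increment() { Value++; }
}

static class BoxMutationDemo
{
    public static int IncrementViaUnbox()
    {
        object box = new Counter();
        ((Counter)box).Increment();   // unboxing yields a temporary copy; the box is untouched
        return ((Counter)box).Value;
    }

    public static int IncrementViaInterface()
    {
        object box = new Counter();
        ((ICounter)box).Increment();  // the call runs against the boxed instance itself
        return ((Counter)box).Value;
    }

    static void Main()
    {
        Console.WriteLine(IncrementViaUnbox());     // 0
        Console.WriteLine(IncrementViaInterface()); // 1
    }
}
```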

Nullable value types

Given special handling by the CLR e.g. lifted operations,

Actually value types: they never really equal null (hence okay to call HasValue on a ‘null’
nullable)

…except when boxed. Boxing and unboxing give special handling to nullables
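The nullable behaviour described above, as a short sketch. NullableDemo and its method names are invented for the example.

```csharp
using System;

static class NullableDemo
{
    // A 'null' nullable is still a value, so calling a member on it is safe.
    public static bool NullHasValue()
    {
        int? none = null;
        return none.HasValue;
    }

    // Boxing a nullable with no value produces a genuine null reference.
    public static bool BoxedNullIsNullReference()
    {
        int? none = null;
        object boxed = none;
        return boxed == null;
    }

    // Boxing a nullable that has a value boxes the underlying value.
    public static Type BoxedType()
    {
        int? five = 5;
        object boxed = five;
        return boxed.GetType();
    }

    // A boxed int can be unboxed directly into a nullable int.
    public static int? UnboxToNullable()
    {
        object boxed = 5;
        return (int?)boxed;
    }

    static void Main()
    {
        Console.WriteLine(NullHasValue());             // False
        Console.WriteLine(BoxedNullIsNullReference()); // True
        Console.WriteLine(BoxedType());                // System.Int32
        Console.WriteLine(UnboxToNullable());          // 5
    }
}
```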

Generic types

Open generic types

Closed constructed types


How generics get JITted: one version generated for all reference types, one for each value
type (works because all references are the same size)

But different type parameters still result in different closed constructed types, even though
they share generated code, e.g. the type initialiser will run once for each closed type
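A sketch showing that each closed constructed type runs its own type initialiser, even when the type arguments are reference types that share JITted code. Holder, InitLog and GenericInitDemo are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

// Records which closed types have been initialised (illustration only).
static class InitLog
{
    public static readonly List<string> Entries = new List<string>();
}

class Holder<T>
{
    // Explicit static constructor: runs once per closed type, on first use.
    static Holder() { InitLog.Entries.Add(typeof(T).Name); }

    public static void Touch() { }
}

static class GenericInitDemo
{
    public static string Run()
    {
        InitLog.Entries.Clear();
        Holder<string>.Touch();
        Holder<object>.Touch();  // may share machine code with Holder<string>, but is a distinct type
        Holder<int>.Touch();
        Holder<string>.Touch();  // already initialised: no new entry
        return string.Join(",", InitLog.Entries);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // String,Object,Int32
    }
}
```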


Type initialisers

precise (static ctor) vs. beforefieldinit (field initialisers)

The difference between const and static readonly

Type restrictions on const: integral, floating point, decimal, bool, string, enum. Pretty
random really. Consequences for default parameters.


Properties

Are just a pair of methods: get and set

Occupy no per-instance space

Relation to C# auto properties


Events

Are just a pair of methods: add and remove

C# and VB have always offered auto events

But you can also implement events manually with your own backing store

Example: WinForms Control uses a single backing store for all events
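A minimal sketch of a manually implemented event with its own backing store. It only imitates the idea of one shared store keyed per event; it is not how WinForms Control is actually written, and Widget, ClickKey and EventDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

class Widget
{
    // One key object per event; all events share a single dictionary.
    private static readonly object ClickKey = new object();
    private readonly Dictionary<object, Delegate> handlers =
        new Dictionary<object, Delegate>();

    // The event is just an add method and a remove method.
    public event EventHandler Click
    {
        add { AddHandler(ClickKey, value); }
        remove { RemoveHandler(ClickKey, value); }
    }

    private void AddHandler(object key, Delegate handler)
    {
        Delegate existing;
        handlers.TryGetValue(key, out existing); // existing stays null if absent
        handlers[key] = Delegate.Combine(existing, handler);
    }

    private void RemoveHandler(object key, Delegate handler)
    {
        Delegate existing;
        if (handlers.TryGetValue(key, out existing))
            handlers[key] = Delegate.Remove(existing, handler);
    }

    public void RaiseClick()
    {
        Delegate d;
        if (handlers.TryGetValue(ClickKey, out d) && d != null)
            ((EventHandler)d)(this, EventArgs.Empty);
    }
}

static class EventDemo
{
    public static int CountClicks()
    {
        var widget = new Widget();
        int clicks = 0;
        EventHandler handler = (sender, e) => clicks++;
        widget.Click += handler;  // calls the add accessor
        widget.RaiseClick();
        widget.Click -= handler;  // calls the remove accessor
        widget.RaiseClick();      // no subscribers left
        return clicks;
    }

    static void Main()
    {
        Console.WriteLine(CountClicks()); // 1
    }
}
```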


Interface implementations are a separate part of the method table (though will typically
point to the same methods as the ‘main’ bit)

C# makes it look like it’s enough to have a member with the right signature, but that’s
because it performs compiler magic

Hence how the C# compiler can accept an inherited member as implementing an interface

Hence explicit interface implementation, where the method is not emitted into the
“methods introduced by this type” bit of the method table, but only into the “methods
introduced by this interface” bit of the method table
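Both cases above, sketched in C#. IGreeter, Base, Inherited, Hidden and InterfaceDemo are hypothetical names; the reflection lookup is only there to show that the explicit implementation is not a public member called Greet.

```csharp
using System;

interface IGreeter { string Greet(); }

class Base
{
    public string Greet() { return "from Base"; }
}

// No Greet member is declared here: the compiler maps IGreeter.Greet to the inherited one.
class Inherited : Base, IGreeter { }

// Explicit implementation: reachable only through the interface.
class Hidden : IGreeter
{
    string IGreeter.Greet() { return "explicit"; }
}

static class InterfaceDemo
{
    public static string ViaInheritedMember()
    {
        IGreeter g = new Inherited();
        return g.Greet();
    }

    public static string ViaExplicitImplementation()
    {
        IGreeter g = new Hidden();
        return g.Greet();   // new Hidden().Greet() would not compile
    }

    public static bool HiddenHasPublicGreet()
    {
        return typeof(Hidden).GetMethod("Greet") != null;
    }

    static void Main()
    {
        Console.WriteLine(ViaInheritedMember());        // from Base
        Console.WriteLine(ViaExplicitImplementation()); // explicit
        Console.WriteLine(HiddenHasPublicGreet());      // False
    }
}
```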


Delegates

Typically compiler generated classes

Inherit from MulticastDelegate

Target and method

Generated strongly typed Invoke, BeginInvoke, EndInvoke methods

Syntactic sugar for invocation: action(arg) -> action.Invoke(arg)

Combine, Get


hence C#’s += syntax

Anonymous methods: closures capture local variables. Locals that the compiler detects are
subject to capture are hoisted to members of a secret nested class, and the method is given
a local of that class instead. This is essential to understanding what happens when you
modify the captured variable, and hence C#’s weird behaviour if you capture a loop variable.
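A sketch of the loop-variable behaviour using a for loop, where a single variable i is shared by every closure. ClosureDemo and its method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

static class ClosureDemo
{
    // Every lambda captures the same hoisted variable i.
    public static string CaptureLoopVariable()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
            funcs.Add(() => i);
        return Evaluate(funcs);
    }

    // Copying into a local declared inside the loop body gives each lambda its own variable.
    public static string CaptureCopy()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;
            funcs.Add(() => copy);
        }
        return Evaluate(funcs);
    }

    private static string Evaluate(List<Func<int>> funcs)
    {
        var results = new List<string>();
        foreach (var f in funcs)
            results.Add(f().ToString());
        return string.Join(",", results);
    }

    static void Main()
    {
        Console.WriteLine(CaptureLoopVariable()); // 3,3,3
        Console.WriteLine(CaptureCopy());         // 0,1,2
    }
}
```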


Exceptions

throw vs. throw ex

CLS exceptions

Unhandled exceptions
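On throw vs. throw ex: a sketch of the difference in the recorded stack trace. RethrowDemo and its method names are hypothetical, and NoInlining is applied as a precaution so the JIT keeps each frame visible in the trace.

```csharp
using System;
using System.Runtime.CompilerServices;

static class RethrowDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void DeepestFrame()
    {
        throw new InvalidOperationException("boom");
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void RethrowPreserving()
    {
        try { DeepestFrame(); }
        catch (InvalidOperationException) { throw; }      // keeps the original trace
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void RethrowResetting()
    {
        try { DeepestFrame(); }
        catch (InvalidOperationException ex) { throw ex; } // trace restarts here
    }

    static string TraceOf(Action action)
    {
        try { action(); return ""; }
        catch (Exception ex) { return ex.StackTrace; }
    }

    public static bool PreservedTraceShowsOrigin()
    {
        return TraceOf(RethrowPreserving).Contains("DeepestFrame");
    }

    public static bool ResetTraceShowsOrigin()
    {
        return TraceOf(RethrowResetting).Contains("DeepestFrame");
    }

    static void Main()
    {
        Console.WriteLine(PreservedTraceShowsOrigin()); // True
        Console.WriteLine(ResetTraceShowsOrigin());     // False
    }
}
```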

Fundamental types

Sometimes adding strings together is the right thing to do: chained + operators and the
Concat method

Core interfaces: IConvertible, IFormattable, IComparable

System.Object fundamental type: Equals and GetHashCode

The Common Type System

And the Common Language Specification

Dynamic (runtime) aspects of the CLR

Managed code

Type checking

Null reference checking

How programs run

The Virtual Execution System

MSIL: an abstract byte code targeting an imaginary stack-based virtual machine

Not interpreted, but compiled on the fly into machine code

When a type is loaded, it basically has stubs for all its methods. As each method is compiled,
the method table entry is updated from the stub to point to the actual machine code

The JIT compiler works one method at a time.

Hence it is okay to do naughty things (e.g. call into missing assemblies) in a method that is
never called. But it is NOT okay to do the same things in a code branch that is never called
inside a method that is called. Trick: move code with dubious dependencies into separate
methods that do not expose troublesome types.

Different JITters may behave differently, e.g. the x64 JIT compiler does more aggressive
tail call optimisation than the x86 one, so x86 programs may be subject to stack overflows
that do not show up on x64

How programs start

the Win32 stub

Assembly loading

Probing path

Strong names


Binding redirects

Explicit assembly loading

load contexts, assembly identity

Intercepting assembly loading via AppDomain

Implicit loads happen at JIT time: if a method is never JITted, the assemblies it references
don’t get loaded.

Modules (kinda theoretical these days though)

Shadow copy: avoid locking assemblies while loaded

Application domains: unit of isolation in .NET

AppDomain lives inside a process. A process may contain many AppDomains

Assemblies are loaded into an AppDomain, not a process

The CLR creates an AppDomain for your code (it creates a couple of others but we don’t
need to worry about them). Some environments create multiple AppDomains e.g. IE creates
an AppDomain per site, IIS creates an AppDomain per vroot.

You can create additional AppDomains.

Only AppDomains can be unloaded. Assemblies can’t, except by unloading the AppDomain
they’re loaded into.

You can’t talk directly to objects in other AppDomains. By default objects are serialised
when passed to other AppDomains, so you end up with two independent copies

MarshalByRefObjects can be accessed using remoting. This allows you to modify an object
in another AppDomain (if that AppDomain provides you with a remoting channel to do it, of
course: security!). This is generally bad. Use WCF to send messages between AppDomains
instead, as if they were separate processes. Better still, just use separate processes.

Garbage collection


Ephemeral segment

check Hewardt

Concurrent vs server garbage collection

.NET 4.0 background GC

Large object heap

Mark and sweep



Most allocations are extremely fast: can be more efficient than manual memory
management


Finalizers

What are they for? Unmanaged resources, not managed resources. Don’t touch managed
resources in a finalizer: they may already have been finalized!

How finalizers get executed: the finalizer thread and timeouts (2s per object, 40s total).
When you new a type that has a finalizer, the instance is placed on the finalization list.
Objects that are eligible for reclamation but are on the list are not reclaimed immediately
but put on the freachable queue (and are considered roots). After the GC, the finalizer
thread goes through the list of objects to be finalized and runs their finalizers. So when the
GC next runs, these objects should no longer be on the list and are therefore no longer
reachable and can therefore be reclaimed.

Critical finalizer objects (CFOs): Finalize gets JITted on type load (rather than later when
memory might be low); CFOs are finalized after all other objects (so other objects can use
CFOs they depend on, e.g. FileStream holding a SafeHandle); finalizer will run even if
AppDomain is rudely unloaded or thread rudely aborted.

Most common CFO: SafeHandle (and derived classes)


Dispose and Finalize; the Dispose pattern; GC.SuppressFinalize
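A minimal sketch of the Dispose pattern, assuming a class that owns a block of unmanaged memory directly. NativeBuffer and DisposeDemo are hypothetical names; in real code a SafeHandle would usually own the resource, making the finalizer here unnecessary.

```csharp
using System;
using System.Runtime.InteropServices;

class NativeBuffer : IDisposable
{
    private IntPtr memory;
    private bool disposed;

    public NativeBuffer(int bytes)
    {
        memory = Marshal.AllocHGlobal(bytes); // unmanaged allocation
    }

    public bool IsDisposed { get { return disposed; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // cleanup is done, so skip finalization
    }

    ~NativeBuffer()
    {
        Dispose(false); // finalizer path: touch unmanaged state only
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        // When disposing is true it would also be safe to dispose managed
        // members here; this class has none.
        Marshal.FreeHGlobal(memory);
        memory = IntPtr.Zero;
        disposed = true;
    }
}

static class DisposeDemo
{
    public static bool DisposeTwiceIsSafe()
    {
        var buffer = new NativeBuffer(16);
        buffer.Dispose();
        buffer.Dispose(); // second call is a no-op
        return buffer.IsDisposed;
    }

    static void Main()
    {
        Console.WriteLine(DisposeTwiceIsSafe()); // True
    }
}
```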

Weak references