CLR Fundamentals talk notes

feastcanadian, 14 Dec 2013



Basics



Goals: provide a safe, managed execution environment that is easier for developers to use
than unmanaged code, and supports multiple languages and allows interoperation between
them. Reduce errors, facilitate library reuse (cf. MFC which you couldn’t use from VB6).
Metadata extensibility e.g. MTS/COM+ attributes.



CLI (the standard) vs. CLR (the implementation).

Microsoft CLRs: desktop, Silverlight, WP7 (ARM).

Other implementations: Mono, Rotor.



C# and VB are mostly thin layers over the underlying CLR. Therefore, understanding the CLR
is essential to understanding C# and VB. You cannot use these languages effectively without
understanding the platform they run on.



Also, many issues and questions that arise turn out to be runtime or framework questions,
not language questions. Similarly, design guidelines are generally not specific to C# or VB
but are about designing for the CLR.



Therefore, developers need to be conscious of the difference between language
issues/idioms, framework issues/idioms and fundamental issues/idioms.



Three core dimensions: virtual execution system (including services like loading DLLs,
memory management, etc.); common type system (enabling language interop); and
standard core libraries.

History



Predecessor: COM. Unmanaged environment with weak and non-extensible metadata
support and poor abstraction of runtime services.



1.0 and 1.1



2.0



4.0



Note 3.0 and 3.5 are not revs of the CLR, only of the higher level frameworks.

Plan of attack



Two main sections: how the CLR represents programs (static aspects), and how the
CLR executes programs (dynamic aspects)


Static aspects of the CLR


Types



Fundamental concepts: types and members.



Every piece of data has a type. It’s not like C where you have a bag of bits (possibly even
uninitialized ones!) and can interpret it however you like: instead, the runtime ensures that
items are of the correct type.



The CLR defines a number of types: 8- to 64-bit integral types in both signed and unsigned
variants, 32- and 64-bit floating point types, native int (IntPtr/UIntPtr), char, string, object.



Other types are defined by libraries (e.g. Decimal in the BCL).



Reference (managed reference) and pointer (raw pointer) types.



The CLR type system is object-oriented and unified (all data are convertible to object).
Note not all types inherit from object e.g. reference/pointer types, interfaces.

What kinds of type are there?



Reference types: classes

Value types: structs



Interface types



A bit more complicated: managed references and pointers

Types are frozen



A compiler produces a type in an assembly. That’s it. Game over.



Extension methods are syntactic sugar that give the illusion of extending a type, but don’t.



Partial classes (and methods) are a compiler artefact not a CLR feature. You cannot use
partial to extend a compiled type: you will just end up creating a new type that shadows it!
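
To make the extension-method point concrete, here is a minimal C# sketch. StringExtensions and
Shout are hypothetical names invented for illustration; the sketch only shows that the "instance"
call and the static call are the same thing, and that System.String gains no new member.

```csharp
using System;

// Hypothetical extension method, for illustration only.
public static class StringExtensions
{
    public static string Shout(this string s)
    {
        return s.ToUpperInvariant() + "!";
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string viaSugar = "hello".Shout();                   // looks like an instance call
        string viaStatic = StringExtensions.Shout("hello");  // the equivalent static call
        Console.WriteLine(viaSugar == viaStatic);            // True

        // System.String itself is unchanged: reflection finds no Shout method on it.
        Console.WriteLine(typeof(string).GetMethod("Shout") == null); // True
    }
}
```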

What is the difference between reference and value types?



Reference types have identity. There can be several references to the same instance.



Value types have only value. When you pass a value, the bits are copied: you get a new
instance.



Objects of reference types are not accessed directly, but through a reference (that is, a
variable of reference type stores a reference rather than the object itself). A variable of
value type contains the very bits of the object itself.



Corollary: value types must have a zero value (for array allocations), or rather, they do have
a zero value, and custom value types must deal with this.



Corollary: value types can’t have nondefault field initialisers or otherwise do stuff in their
default constructor.



When should you use a value type? When you have a small object that represents a value.



Idiom: value types should be immutable. (Because when you change a Money value from
$1000 to $2000, you’re not changing the value of $1000. You’re specifying a different value.)



Equality semantics: reference types usually have identity equality semantics, value types
usually have value equality semantics.



The curious case of String: value semantics but implemented as reference type. Why? (1)
Efficiency in passing strings. (2) Values must be fixed size.



Value type GetHashCode default implementation sucks: override it!
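
The points in this section can be sketched in a few lines of C#. Account and Money are hypothetical
types invented for illustration (Money echoes the $1000/$2000 example above); this is one way to
write an immutable value type with its own Equals and GetHashCode, not the only one.

```csharp
using System;

// Hypothetical types, for illustration only.
public sealed class Account             // reference type: has identity
{
    public decimal Balance;
}

public struct Money : IEquatable<Money> // value type: immutable, compared by value
{
    private readonly decimal amount;

    public Money(decimal amount) { this.amount = amount; }

    public decimal Amount { get { return amount; } }

    // "Changing" a Money yields a different value; the original is untouched.
    public Money Add(decimal delta) { return new Money(amount + delta); }

    public bool Equals(Money other) { return amount == other.amount; }
    public override bool Equals(object obj) { return obj is Money && Equals((Money)obj); }
    public override int GetHashCode() { return amount.GetHashCode(); }
}

public static class ValueVsReferenceDemo
{
    public static void Main()
    {
        Account a = new Account { Balance = 1000m };
        Account b = a;                 // copies the reference: same instance
        b.Balance = 2000m;
        Console.WriteLine(a.Balance);  // 2000

        Money m = new Money(1000m);
        Money n = m.Add(1000m);        // a new value; m is still 1000
        Console.WriteLine(m.Amount);   // 1000
        Console.WriteLine(n.Equals(new Money(2000m))); // True

        Money[] zeros = new Money[3];  // elements are the all-zero value; no constructor runs
        Console.WriteLine(zeros[0].Amount); // 0
    }
}
```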

Boxing



Object is a reference type (so are interfaces). What happens when you store a value type in
an object reference? Boxing.



Unboxing. Note C# ‘cast object to value type’ results in a CLR unbox, not a cast: hence
uglinesses like (long)(int)obj.



What happens if you modify a boxed instance? If you modify it through the boxed value
(e.g. reflection, dynamic), then it works. But if you unbox it then the unbox gets you a copy,
so you won’t modify the original! It’s confusing: just another argument for immutability.
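
A short C# sketch of the boxing behaviour described above. IBump and Counter are hypothetical
names; the sketch assumes an interface call as the way to reach the boxed instance (the notes
mention reflection and dynamic, which behave the same way for this purpose).

```csharp
using System;

public interface IBump { void Bump(); }

public struct Counter : IBump    // deliberately mutable, to show the problem
{
    public int Value;
    public void Bump() { Value++; }
}

public static class BoxingDemo
{
    public static void Main()
    {
        object boxedInt = 42;               // box: copies the int to the heap
        long viaInt = (long)(int)boxedInt;  // unbox to the exact type, then convert
        Console.WriteLine(viaInt);          // 42

        try
        {
            long direct = (long)boxedInt;   // an unbox to long, not a conversion
            Console.WriteLine(direct);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }

        object boxed = new Counter();
        ((IBump)boxed).Bump();              // operates on the box itself
        Console.WriteLine(((Counter)boxed).Value); // 1

        Counter copy = (Counter)boxed;      // unboxing copies the bits out
        copy.Bump();                        // changes only the copy
        Console.WriteLine(((Counter)boxed).Value); // still 1
    }
}
```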

Nullable value types



Given special handling by the CLR e.g. lifted operations, GetType()



Actually value types: they never really equal null (hence okay to call HasValue on a ‘null’
instance)…



…except when boxed. Boxing and unboxing give special handling to nullables
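
A small C# sketch of the nullable behaviour listed above: a "null" nullable is still a struct, and
boxing either produces a null reference or a box of the underlying type.

```csharp
using System;

public static class NullableDemo
{
    public static void Main()
    {
        int? none = null;
        Console.WriteLine(none.HasValue);        // False: a struct, so no NullReferenceException

        object boxedNone = none;                 // boxing an empty nullable gives a null reference
        Console.WriteLine(boxedNone == null);    // True

        int? some = 5;
        object boxedSome = some;                 // boxes the underlying int, not a Nullable<int>
        Console.WriteLine(boxedSome.GetType());  // System.Int32
        Console.WriteLine(some.GetType());       // System.Int32 (GetType boxes first)

        int? roundTrip = (int?)boxedSome;        // a boxed int unboxes to int? ...
        int? fromNull = (int?)boxedNone;         // ... and a null reference to an empty int?
        Console.WriteLine(roundTrip.Value);      // 5
        Console.WriteLine(fromNull.HasValue);    // False
    }
}
```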

Generic types



Open generic types



Closed constructed types



Generic variance



How generics get JITted: one version generated for all reference types, one for each value
type (works because all references are the same size)



But different type parameters still result in different closed constructed types, even though
they share generated code, e.g. the type initialiser will run once for each closed type
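
The "one closed type per type argument" point can be observed directly. In this C# sketch, Cache<T>
and Log are hypothetical types; the static constructor records each closed type it runs for. Code
sharing between reference-type instantiations is a JIT detail that this sketch does not try to observe.

```csharp
using System;
using System.Collections.Generic;

public static class Log
{
    public static readonly List<string> Entries = new List<string>();
}

public class Cache<T>
{
    public static int Count;   // one copy per closed constructed type

    // Static constructor: runs once for each closed type.
    static Cache()
    {
        Log.Entries.Add(typeof(T).Name);
    }
}

public static class GenericsDemo
{
    public static void Main()
    {
        Cache<string>.Count = 1;
        Cache<object>.Count = 2;   // both reference types, yet separate statics
        Cache<int>.Count = 3;

        Console.WriteLine(Cache<string>.Count);   // 1
        Console.WriteLine(Log.Entries.Count);     // 3
        Console.WriteLine(typeof(Cache<string>) == typeof(Cache<object>)); // False
        Console.WriteLine(typeof(Cache<>).IsGenericTypeDefinition);        // True (open type)
    }
}
```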

Members



Type initialisers



precise (static ctor) vs. beforefieldinit (field initialisers)



The difference between const and static readonly



Type restrictions on const: integral, floating point, decimal, bool, string, enum. Pretty
random really. Consequences for default parameters.

Properties



Are just a pair of methods: get and set



Occupy no per-instance space



Relation to C# auto properties
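
Reflection makes the "pair of methods" view visible. In this C# sketch, Person is a hypothetical
type; the property shows up as get_Name and set_Name, and the only per-instance storage is the
backing field that the C# compiler generates for the auto property.

```csharp
using System;
using System.Reflection;

public class Person   // hypothetical type, for illustration only
{
    public string Name { get; set; }   // C# auto property
}

public static class PropertyDemo
{
    public static void Main()
    {
        PropertyInfo prop = typeof(Person).GetProperty("Name");
        Console.WriteLine(prop.GetGetMethod().Name);   // get_Name
        Console.WriteLine(prop.GetSetMethod().Name);   // set_Name

        // The property itself stores nothing; the compiler-generated field does.
        FieldInfo[] fields = typeof(Person).GetFields(BindingFlags.Instance | BindingFlags.NonPublic);
        Console.WriteLine(fields.Length);              // 1
    }
}
```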

Events



Are just a pair of methods: add and remove



C# and VB have always offered auto events



But you can also implement events manually with your own backing store



Example: WinForms Control uses a single backing store for all events
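
A minimal C# sketch of manually implemented events sharing one backing store. Widget and its members
are hypothetical; the sketch is in the spirit of the WinForms approach mentioned above but does not
reproduce the actual Control implementation, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public class Widget   // hypothetical control-like type
{
    private static readonly object ClickKey = new object();
    private static readonly object CloseKey = new object();

    // One dictionary for all events; unsubscribed events cost no per-instance field.
    private readonly Dictionary<object, Delegate> handlers = new Dictionary<object, Delegate>();

    public event EventHandler Click
    {
        add { AddHandler(ClickKey, value); }
        remove { RemoveHandler(ClickKey, value); }
    }

    public event EventHandler Close
    {
        add { AddHandler(CloseKey, value); }
        remove { RemoveHandler(CloseKey, value); }
    }

    public int SubscribedEventCount { get { return handlers.Count; } }

    public void RaiseClick()
    {
        Delegate d;
        if (handlers.TryGetValue(ClickKey, out d)) ((EventHandler)d)(this, EventArgs.Empty);
    }

    private void AddHandler(object key, Delegate handler)
    {
        Delegate existing;
        handlers.TryGetValue(key, out existing);          // existing stays null if absent
        handlers[key] = Delegate.Combine(existing, handler);
    }

    private void RemoveHandler(object key, Delegate handler)
    {
        Delegate existing;
        if (!handlers.TryGetValue(key, out existing)) return;
        Delegate remaining = Delegate.Remove(existing, handler);
        if (remaining == null) handlers.Remove(key); else handlers[key] = remaining;
    }
}

public static class EventDemo
{
    public static void Main()
    {
        Widget w = new Widget();
        int clicks = 0;
        EventHandler onClick = delegate { clicks++; };

        w.Click += onClick;      // calls the add accessor
        w.RaiseClick();
        w.Click -= onClick;      // calls the remove accessor
        w.RaiseClick();

        Console.WriteLine(clicks);                  // 1
        Console.WriteLine(w.SubscribedEventCount);  // 0
    }
}
```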

Interfaces



Interface implementations are a separate part of the method table (though will typically
point to the same methods as the ‘main’ bit)



C# makes it look like it’s enough to have a member with the right signature, but that’s
because it performs compiler magic



Hence how the C# compiler can accept an inherited member as implementing an interface
member



Hence explicit interface implementation, where the method is not emitted into the
“methods introduced by this type” bit of the method table, but only into the “methods
introduced by this interface” bit of the method table
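
Both cases above can be seen from C#. In this sketch, IGreeter, Base, Derived and Hidden are
hypothetical types: Derived satisfies the interface with an inherited member that knows nothing about
it, and Hidden implements the member explicitly so that it is reachable only through the interface.

```csharp
using System;

public interface IGreeter { string Greet(); }

public class Base
{
    public string Greet() { return "base"; }   // knows nothing about IGreeter
}

// The inherited Base.Greet satisfies IGreeter; the compiler wires up the interface mapping.
public class Derived : Base, IGreeter { }

public class Hidden : IGreeter
{
    // Explicit implementation: reachable only through the interface.
    string IGreeter.Greet() { return "explicit"; }
}

public static class InterfaceDemo
{
    public static void Main()
    {
        IGreeter d = new Derived();
        Console.WriteLine(d.Greet());   // base

        IGreeter h = new Hidden();
        Console.WriteLine(h.Greet());   // explicit
        Console.WriteLine(typeof(Hidden).GetMethod("Greet") == null); // True: no public Greet
    }
}
```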

Delegates



Typically compiler generated classes



Inherit from MulticastDelegate



Target and method



Generated strong-typed Invoke, BeginInvoke, EndInvoke methods



Syntactic sugar for invocation: action(arg) -> action.Invoke(arg)



Combine, GetInvocationList



Immutable: hence C#’s += syntax



Anonymous methods: closures capture local variables. Local variables that the compiler
detects are subject to capture are hoisted to members of a secret nested class, and the
method is given a local of that class instead. This is essential to understanding what happens
when you modify the captured variable, and hence C#’s weird behaviour if you capture a
loop variable.
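
The capture behaviour and delegate immutability described above can be sketched in C#. ClosureDemo
and its method names are hypothetical. The sketch uses a for loop, whose loop variable is a single
variable shared by every iteration.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureDemo
{
    public static List<Func<int>> CaptureShared()
    {
        List<Func<int>> funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            funcs.Add(() => i);   // every lambda captures the same hoisted i
        }
        return funcs;
    }

    public static List<Func<int>> CaptureCopies()
    {
        List<Func<int>> funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            int copy = i;         // a fresh variable per iteration
            funcs.Add(() => copy);
        }
        return funcs;
    }

    public static void Main()
    {
        foreach (Func<int> f in CaptureShared()) Console.Write(f() + " ");  // 3 3 3
        Console.WriteLine();
        foreach (Func<int> f in CaptureCopies()) Console.Write(f() + " ");  // 0 1 2
        Console.WriteLine();

        // Delegates are immutable: += builds a new delegate and reassigns the variable.
        Action a = () => { };
        Action before = a;
        a += () => { };
        Console.WriteLine(object.ReferenceEquals(a, before));  // False
        Console.WriteLine(a.GetInvocationList().Length);       // 2
        Console.WriteLine(before.GetInvocationList().Length);  // 1
    }
}
```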

Exceptions



throw vs. throw ex



Non-CLS exceptions



Unhandled exceptions

Fundamental types



Sometimes adding strings together is the right thing to do: chained + operators and the
Concat method



Core interfaces: IConvertible, IFormattable, IComparable



System.Object fundamental type: Equals and GetHashCode

The Common Type System



And the Common Language Specification






Dynamic (runtime) aspects of the CLR


Managed code



Type checking



Null reference checking

How programs run



The Virtual Execution System



MSIL: an abstract byte code targeting an imaginary stack-based virtual machine



MSIL is not interpreted, but compiled on the fly into machine code



When a type is loaded, it basically has stubs for all its methods. As each method is compiled,
the method table entry is updated from the stub to point to the actual machine code



The JIT compiler works one method at a time.



Hence it is okay to do naughty things (e.g. call into missing assemblies) in a method that is
never called. But it is NOT okay to do the same things in a code branch that is never called
inside a method that is called. Trick: move code with dubious dependencies into separate
methods that do not expose troublesome types.



Different JITters may behave differently, e.g. the x64 JIT compiler does more aggressive tail
call optimisation than the x86 one, so x86 programs may be subject to stack overflows that
do not show up on x64

How programs start: the Win32 stub

Assembly loading



Probing path



Strong names: versioning



Binding redirects



Explicit assembly loading: load contexts, assembly identity



Intercepting assembly loading via AppDomain.AssemblyResolve



Implicit loads happen at JIT time: if a method is never JITted, the assemblies it references
don’t get loaded.



Modules (kinda theoretical these days though)



Shadow copy: avoid locking assemblies while loaded

Application domains



The unit of isolation in .NET



AppDomain lives inside a process. A process may contain many AppDomains



Assemblies are loaded into an AppDomain, not a process



The CLR creates an AppDomain for your code (it creates a couple of others but we don’t
need to worry about them). Some environments create multiple AppDomains e.g. IE creates
an AppDomain per site, IIS creates an AppDomain per vroot.



You can create additional AppDomains.



Only AppDomains can be unloaded. Assemblies can’t, except by unloading the AppDomain
they’re loaded into.



You can’t talk directly to objects in other AppDomains. By default objects are serialised
when passed to other AppDomains, so you end up with two independent copies



MarshalByRefObjects can be accessed using remoting. This allows you to modify an object
in another AppDomain (if that AppDomain provides you with a remoting channel to do it of
course: security!). This is generally bad. Use WCF to send messages between AppDomains
instead, as if they were separate processes. Better still, just use separate processes.

Garbage collection



Generational



Ephemeral segment (check Hewardt)



Concurrent vs server garbage collection



.NET 4.0 background GC



Large object heap



Mark and sweep



Roots



Compacting



Most allocations are extremely fast: can be more efficient than manual memory
management

Finalizers



What are they for? Unmanaged resources, not managed resources. Don’t touch managed
resources in a finalizer: they may already have been finalized!



How finalizers get executed: the finalizer thread and timeouts (2s per object, 40s total).
When you new a type that has a finalizer, the instance is placed on the finalization list.
Objects that are eligible for reclamation but are on the list are not reclaimed immediately
but put on the freachable queue (and are considered roots). After the GC, the finalizer
thread goes through the list of objects to be finalized and runs their finalizers. So when the
GC next runs, these objects should no longer be on the list and are therefore no longer
reachable and can therefore be reclaimed.



CriticalFinalizerObject: Finalize gets JITted on type load (not later when memory might be
low); CFOs are finalized after all other objects (so other objects can use CFOs they depend
on, e.g. FileStream holding a SafeHandle); finalizer will run even if AppDomain is rudely
unloaded or thread rudely aborted.



Most common CFO: SafeHandle (and derived classes)



Resurrection



Dispose and Finalize; the Dispose pattern; GC.SuppressFinalize



Weak references
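
The Dispose pattern and GC.SuppressFinalize mentioned above can be sketched as follows. NativeBuffer
is a hypothetical wrapper around unmanaged memory, written out by hand to show the finalizer path;
in real code a SafeHandle-derived class would usually own the raw resource instead.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a block of unmanaged memory.
public class NativeBuffer : IDisposable
{
    private IntPtr memory;
    private bool disposed;

    public NativeBuffer(int size)
    {
        memory = Marshal.AllocHGlobal(size);
    }

    public bool IsDisposed { get { return disposed; } }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);   // cleanup is done, so skip finalization
    }

    // Finalizer: a safety net that runs only if Dispose was never called.
    ~NativeBuffer()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        if (disposing)
        {
            // Managed resources may be touched only here, never on the finalizer path.
        }
        Marshal.FreeHGlobal(memory);  // unmanaged cleanup happens on both paths
        memory = IntPtr.Zero;
        disposed = true;
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        NativeBuffer buffer = new NativeBuffer(64);
        using (buffer)
        {
            Console.WriteLine(buffer.IsDisposed);  // False
        }
        Console.WriteLine(buffer.IsDisposed);      // True
        buffer.Dispose();                          // a second call is a harmless no-op
    }
}
```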