Flexible Dynamic Linking for .NET


Page 1 © Imperial College London

Flexible Dynamic Linking for .NET

Susan Eisenbach

Alex Buckley


Agenda

• Introduction to dynamic linking

• Flexibility v. safety at link-time

• Developer-centric flexibility

• Design issues

• Conclusion


Dynamic linking

Turns ldfld MemberDescriptor, generated at compile-time, into ldfld 0x100405 in the run-time environment, by using assembly and class definitions.
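As a rough illustration of that step, the sketch below models a symbolic field reference being resolved to a concrete offset from loaded class definitions. The names `MemberDescriptor` fields, `LOADED_CLASSES`, `resolve_field`, the `MyLib`/`Point` class and its offsets are all hypothetical; a real CLR computes layouts from metadata rather than from a dictionary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemberDescriptor:
    """Symbolic field reference as a compiler might emit it (hypothetical model)."""
    assembly: str
    cls: str
    field: str


# Hypothetical run-time environment: loaded class definitions with field offsets.
LOADED_CLASSES = {
    ("MyLib", "Point"): {"x": 0x8, "y": 0xC},
}


def resolve_field(ref: MemberDescriptor) -> int:
    """Turn a symbolic reference into an offset using the loaded definitions."""
    layout = LOADED_CLASSES.get((ref.assembly, ref.cls))
    if layout is None:
        raise LookupError(f"class [{ref.assembly}]{ref.cls} not found")
    if ref.field not in layout:
        raise LookupError(f"field {ref.field} not in [{ref.assembly}]{ref.cls}")
    return layout[ref.field]


offset = resolve_field(MemberDescriptor("MyLib", "Point", "y"))
print(f"ldfld {offset:#x}")
```

The point of the model is only that the symbolic name survives until run time, so the environment, not the compiler, decides what it finally denotes.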


Dynamic linking

• Is taken as given in modern execution environments

• Saves space by sharing code between programs

• Enables binding policies such as:

– User/system-wide upgrades (v1.0 → v2.0)

– Servicing policy (v1.1.9 → v1.2)

– Unification policy (use version corresponding to the CLR)

– Local and remote probing (check GAC, then URL)

• Supports bytecode verification on the user’s machine


[Diagram: a call instruction linked across two assemblies]

call void [mscorlib]System.Console::WriteLine(string)

Assembly LinkerCalc (v1.2.3.4, PublicKey=12345, Culture=UK)

– Assembly metadata: import mscorlib = MSCorLib, v1.1.1322, PK=9999, Culture=US

– Class metadata: export A,B,C,D

– IL code

Assembly MSCorLib (v1.1.1322, PublicKey=99999, Culture=US)

– Assembly metadata

– Class metadata: export System.Console

– IL code: namespace System; class Console { void WriteLine(string s){…} }


Linking is constrained by compiler decisions

Source code: Console.WriteLine(“Hi”);

Known to the compiler: call void [mscorlib]System.Console::WriteLine(string)

Available at runtime: call void [monolib]System.Console::WriteLine(string)


Platform- or vendor-specific assemblies might be available at runtime only

• Generic ODBC v. SQLServer ODBC

• Microsoft FTP v. third-party SecureFTP

A concrete example:

• Imperial’s LTSA model-checker can use non-redistributable NASA algorithms

• How to avoid compile-time dependencies on NASA code?

• Separate compilation still checks dependencies

• Have to use error-prone reflection


How can I bind to the runtime environment when the compiler forces its choices on me?

• Source code: new DBLib(). Compile-time classes: DBLib (OK). Run-time classes: SQLSvrLib (??)

• Source code: new DBLib(). Compile-time classes: (None) (??). Run-time classes: DBLib, SQLSvrLib (OK)

or,

Instead of the database library, how can I target a database library?


Initial idea: make bytecode more flexible

Assembly type variable

Compile-time: call void [X]System.Console::WriteLine(…)

Link-time: call void [mscorlib]System.Console::WriteLine(…)

Class type variable

Compile-time: call void [mscorlib]Y::WriteLine(…)

Link-time: call void [mscorlib]System.Console::WriteLine(…)
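The two rewrites above can be modelled as a textual substitution over member references. This is only a sketch: the regular expression, the `substitute` function and the idea of passing bindings as dictionaries are assumptions for illustration, and a real linker would operate on metadata tokens rather than instruction text.

```python
import re

# Matches IL-style member references such as "[X]System.Console::WriteLine".
MEMBER_REF = re.compile(r"\[(?P<asm>[^\]]+)\](?P<cls>[\w.]+)::(?P<member>\w+)")


def substitute(instruction: str, asm_vars: dict, class_vars: dict) -> str:
    """Bind assembly and class type variables in one instruction at link time."""
    def bind(match):
        # Names without a binding are left exactly as the compiler emitted them.
        asm = asm_vars.get(match["asm"], match["asm"])
        cls = class_vars.get(match["cls"], match["cls"])
        return f"[{asm}]{cls}::{match['member']}"

    return MEMBER_REF.sub(bind, instruction)


# Assembly type variable X bound to mscorlib.
print(substitute("call void [X]System.Console::WriteLine(string)",
                 {"X": "mscorlib"}, {}))
# Class type variable Y bound to System.Console.
print(substitute("call void [mscorlib]Y::WriteLine(string)",
                 {}, {"Y": "System.Console"}))
```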


Flexible bytecode

• Enables late binding between program & environment

• Use “generic” classes without naming specific assembly

– May have many versions of an assembly: Development/Testing/Production/Archive

– Want to develop binary components off-site, using stub assemblies, then execute on-site, using full assemblies

– Simplified command-line compilation (fewer /r: args)

– Augments Fusion re-versioning with renaming

• Use specific assembly without mentioning a class

– E.g. programmer uses List interface but implementation picked later


Type-safety

• The substitute for an assembly or class should provide members used by the programmer, e.g.

– Assembly X provides interface List

– Interface List provides ‘void compare(List l);’

– Class Y is a subclass of/implements List

So:

• Collect constraints from bytecode

• Search the GAC for suitable assemblies at run-time

• OK?
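A minimal sketch of that proposal, assuming a toy model in which an assembly is a mapping from class names to sets of member signatures. The `GAC` contents, `collect_constraints` and `suitable_assemblies` are hypothetical names, and subtype constraints are deliberately left out.

```python
# Hypothetical stand-in for the GAC: assembly -> class -> member signatures.
GAC = {
    "CollectionsA": {"List": {"void compare(List)"}},
    "CollectionsB": {"List": {"void compare(List)", "int size()"}},
    "Strings": {"Text": {"int length()"}},
}


def collect_constraints(member_refs):
    """Group the member signatures that the bytecode uses by class name."""
    constraints = {}
    for cls, signature in member_refs:
        constraints.setdefault(cls, set()).add(signature)
    return constraints


def suitable_assemblies(constraints, gac):
    """Return assemblies that provide every required class and member."""
    return sorted(
        name
        for name, classes in gac.items()
        if all(required <= classes.get(cls, set())
               for cls, required in constraints.items())
    )


refs = [("List", "void compare(List)"), ("List", "int size()")]
print(suitable_assemblies(collect_constraints(refs), GAC))
```

Note that the search only checks names and signatures, so it cannot tell whether a matching assembly behaves as the programmer expects.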


Problems with constraints

• Subtype constraints require data-flow analysis

• Substitution may be over-constrained by unreachable code

• Constraints say nothing about behaviour

• Debugging is impractical if unknown components are chosen at run-time


Semantic substitutions

Policy: Programmer has to know valid assemblies and classes

• Custom attributes declare possible substitutions

[LinkAssembly(Assembly1, Assembly2)]

[LinkAssembly(Assembly1, Assembly3)]

[LinkClass(Class1, Class2)]

[LinkClass(Class1, Class3)]

• Any assembly or class can be independently substituted

→ All types are type variables

• Still need member constraints, but only to avoid resolution errors, not guide substitutions

[LinkMember(Assembly1, Class1, B m(D))]

[LinkMember(Assembly1, Class1, B f)]


Attribute scoping

• Many classes will not require flexible resolution

• Minimise impact by choosing the right scope

[assembly: LinkClass(…)]
[module: LinkClass(…)]
[LinkClass(…)]
class App {
    void m() { … }  // Search class, module, assembly
    [LinkClass(…)]
    void n() { … }  // Use this method’s LinkClass
}

• Can rebind an assembly/class name across scopes

• Only the most local scope is used
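The scoping rule can be sketched as a lookup that walks from the most local scope outwards and stops at the first match. The scope names, the `nearest_link_class` function and the example class names other than DBLib and SQLSvrLib are hypothetical.

```python
# Scope levels, most local first.
SCOPE_ORDER = ("method", "class", "module", "assembly")


def nearest_link_class(name, scopes):
    """Find the substitution for `name` in the most local scope that declares one.

    `scopes` maps a scope level to {original class: substitute class}.
    """
    for level in SCOPE_ORDER:
        table = scopes.get(level, {})
        if name in table:
            return level, table[name]
    return None, name  # no attribute in scope: keep the compiled-in name


# m() has no attribute of its own, so the class-level one wins.
scopes_for_m = {"class": {"DBLib": "SQLSvrLib"}, "assembly": {"DBLib": "OdbcLib"}}
# n() rebinds DBLib at method level, which hides the outer scopes.
scopes_for_n = dict(scopes_for_m, method={"DBLib": "TestLib"})

print(nearest_link_class("DBLib", scopes_for_m))
print(nearest_link_class("DBLib", scopes_for_n))
```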


Substitution interfaces

• Group substitutions by platform/vendor/maturity:

[LinkAssembly(A1,ABC,"win32")]
[LinkAssembly(A2,DEF,"win32")]
[LinkAssembly(A1,GHI,"win64")]
[LinkAssembly(A2,JKL,"win64")]

• On a Win32 machine, only the A1→ABC and A2→DEF substitutions will be possible

• Once A1 or A2 has been substituted, we should stay within the “win32” interface
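Viewed as data, the four attributes form a table keyed by interface name. The sketch below, with the hypothetical helpers `substitutions_for` and `substitute`, shows how selecting one interface fixes the substitutions for every assembly in it.

```python
# The four [LinkAssembly] attributes as (original, substitute, interface).
LINK_ASSEMBLY = [
    ("A1", "ABC", "win32"),
    ("A2", "DEF", "win32"),
    ("A1", "GHI", "win64"),
    ("A2", "JKL", "win64"),
]


def substitutions_for(interface, attributes=LINK_ASSEMBLY):
    """Collect the substitutions that belong to one interface."""
    return {orig: subst for orig, subst, iface in attributes if iface == interface}


def substitute(assembly, interface, attributes=LINK_ASSEMBLY):
    """Substitute `assembly` while staying inside the chosen interface."""
    table = substitutions_for(interface, attributes)
    if assembly not in table:
        raise LookupError(f"no substitution for {assembly} in {interface!r}")
    return table[assembly]


print(substitutions_for("win32"))
print(substitute("A2", "win64"))
```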


Interface policies

[LinkAssembly(A1,ABC,"win32",LOCAL_INTERFACE)]

Demand only [LinkAssembly]+[LinkClass] from “win32”

[LinkAssembly(A1,ABC,"win32",LOCAL_INTERFACE_PREFERRED)]

Try “win32” attributes first, but allow others on failure

[LinkAssembly(A1,ABC,"win32",LOCAL_INTERFACE_EAGER)]

Eagerly check that all “win32” attributes will succeed

[LinkAssembly(A1,ABC,"win32",ANY_INTERFACE)]

No restrictions on later attributes
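One possible reading of the four policies is sketched below as a function that returns the substitutes to try for a later reference. The `Policy` enum, the `candidates` function and the `available` set are hypothetical, and treating "will succeed" as "the substitute assembly is present" is an assumption made only for this model.

```python
from enum import Enum


class Policy(Enum):
    LOCAL_INTERFACE = 1
    LOCAL_INTERFACE_PREFERRED = 2
    LOCAL_INTERFACE_EAGER = 3
    ANY_INTERFACE = 4


ATTRS = [("A1", "ABC", "win32"), ("A2", "DEF", "win32"),
         ("A1", "GHI", "win64"), ("A2", "JKL", "win64")]


def candidates(assembly, attributes, local, policy, available=frozenset()):
    """Return substitutes to try, in order, for a later reference to `assembly`.

    `local` is the interface of the attribute that carried `policy`;
    `available` is a hypothetical set of assemblies present on the machine.
    """
    matching = [(s, i) for o, s, i in attributes if o == assembly]
    inside = [s for s, i in matching if i == local]
    outside = [s for s, i in matching if i != local]
    if policy is Policy.LOCAL_INTERFACE:
        return inside
    if policy is Policy.LOCAL_INTERFACE_PREFERRED:
        return inside + outside
    if policy is Policy.LOCAL_INTERFACE_EAGER:
        # Assumption: "succeed" means every substitute in the interface exists.
        missing = [s for _, s, i in attributes if i == local and s not in available]
        if missing:
            raise LookupError(f"{local!r} cannot be satisfied: missing {missing}")
        return inside
    return [s for s, _ in matching]  # ANY_INTERFACE: declaration order


print(candidates("A2", ATTRS, "win64", Policy.LOCAL_INTERFACE_PREFERRED))
```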


Preparing bytecode for flexible linking

[Diagram: Hello.cs → Compiler → Hello.exe → ILDASM → Hello.il → Infer member constraints → Hello.il → ILASM]

Hello.cs

[assembly: LinkClass]
[LinkClass]
class App {
    void m() { … }
    [LinkClass] void n() { … }
}

Hello.exe: Assembly metadata, Class metadata, IL code

Hello.il (output of ILDASM)

.assembly Hello {
.custom instance LinkClass
.class App extends … {
.custom instance LinkClass

Hello.il (after inferring member constraints)

.assembly Hello {
.custom instance LinkClass
.custom instance LinkMember
.class App extends … {
.custom instance LinkClass
.custom instance LinkMember

• Avoid source code: compilers are hard to change, and there are many of them

• Metadata is backward-compatible

• Must be able to compile in some default environment


Just-In-Time substitution

• CLI-compliant linking is very flexible

• If verification happens, its timing is not specified

• Timing of resolution is very loose

– As early as install time, as late as execution time

• But actually, the CLR is lazy

• Resolves when an expression is JIT-compiled

• Verification happens at resolution

• We extend resolution to handle [Link*()] attributes


Modifying the SSCLI

[Diagram; box labels: Verifier/JIT compiler, x86 code, CEEInfo::findClass/Field/Method, Resolution cache, Standard resolution, FDL resolution, Attribute collection, Constraint verification, Fusion, Assembly/class loading, Filesystem]


Attribute collection

• A LinkContext encapsulates a single resolution attempt, e.g. call [A]C::m …

= A fully-qualified member reference needing resolution

+ The nearest [LinkAssembly] and [LinkClass] attributes in scope

+ Set of constraints applying to these attributes

• JIT-compiling a method creates a MasterLinkContext

– Finding custom attributes is easy with Metadata Importers

• LinkContexts in a method share a MasterLinkContext

• Need a LinkContext for caller’s and callee’s scope


Nested LinkContexts

To resolve call [A]C::m(D,E,F)

• (MasterLinkContext is already created)

• Create LinkContext for this instruction

• Use LinkContext to choose substitutions for A and C

• Load (substituted version of) [A]C, and find m

[A]C::m expects to execute in an environment where its own custom attributes are obeyed

• Create nested LinkContext for method m in [A]C

• Resolve D,E,F under original + nested LinkContexts
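The nesting can be pictured as a chain of contexts. In the sketch below the `LinkContext` class is a hypothetical simplification that holds only substitutions, and consulting the callee's (nested) context before the caller's is an assumption; the slides say only that both contexts apply.

```python
class LinkContext:
    """Toy model: the substitutions in scope for one resolution attempt."""

    def __init__(self, substitutions, parent=None):
        self.substitutions = dict(substitutions)
        self.parent = parent  # the caller's context, if this one is nested

    def nested(self, substitutions):
        """Create a context for a callee that still sees this context."""
        return LinkContext(substitutions, parent=self)

    def choose(self, name):
        """Pick a substitute for `name`, innermost context first (assumption)."""
        ctx = self
        while ctx is not None:
            if name in ctx.substitutions:
                return ctx.substitutions[name]
            ctx = ctx.parent
        return name  # nothing declared: keep the original name


# Context for the instruction: call [A]C::m(D,E,F)
caller = LinkContext({"A": "A2", "C": "C2", "D": "D_caller"})
# m declares its own attribute for D, so a nested context is created.
callee = caller.nested({"D": "D_callee"})

print([callee.choose(t) for t in ("D", "E", "F")])
```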


Issue: Flexible fields

class A {
    [LinkAssembly(...)] [LinkClass(B, C)]
    private B f = new B();

    public A() { .. }
}

In A’s constructor (.ctor):

newobj instance [..]B
stfld class [..]B [..]A::f

• newobj could be making a B object destined for any field

• Only at stfld do we find that it is destined for a flexlinked field

• Don’t want to rely on bytecode being in a precise order

• Don’t want to look-ahead in the JIT-compiler

Rely on an IDE to move attributes to the constructor (where f is initialised):

class A {
    private B f = new B();

    [LinkAssembly(...)] [LinkClass(B, C)]
    public A() { .. }
}


Issue: Static resolution of flexible fields

• C# 1.x compiler resolves fields statically

• Has the effect of hiding members that should be substituted

• Java 1.3 did the same; changed in 1.4

// Compile-time env
class A { String f; }
class B extends A {}

[LinkClass(B,C)]
new B().f;

ldfld […]A::f // Does not match the LinkClass(B,…)
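The mismatch can be reproduced in a small model. Here `CLASSES`, `declaring_class`, `emit_ldfld` and `substituted` are hypothetical names; the `resolve_statically` flag switches between naming the declaring class (A) and naming the class written in the source (B).

```python
# Compile-time environment: A declares f, B inherits it.
CLASSES = {
    "A": {"base": None, "fields": {"f"}},
    "B": {"base": "A", "fields": set()},
}

LINK_CLASS = {"B": "C"}  # models [LinkClass(B,C)]


def declaring_class(cls, field):
    """Walk up the hierarchy to the class that actually declares `field`."""
    while cls is not None:
        if field in CLASSES[cls]["fields"]:
            return cls
        cls = CLASSES[cls]["base"]
    raise LookupError(field)


def emit_ldfld(static_type, field, resolve_statically):
    """Return the (class, field) pair a compiler would name in ldfld."""
    owner = declaring_class(static_type, field) if resolve_statically else static_type
    return owner, field


def substituted(ref):
    """Apply the LinkClass substitution to the class named in the reference."""
    owner, field = ref
    return LINK_CLASS.get(owner, owner), field


print(substituted(emit_ldfld("B", "f", resolve_statically=True)))
print(substituted(emit_ldfld("B", "f", resolve_statically=False)))
```

With static resolution the reference names A, so the substitution declared for B never applies.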


Conclusion

• CLI is a good home for flexible dynamic linking

– Different runtimes and frameworks (WinFx, .NETCF, OpenCF, Mono, Portable.NET) have different API implementations → more choices for the programmer

– Resolution guided by rich metadata

– Easy to represent FDL-related facts

– Flexibly-linked bytecode is still verifiable (type-safe)

– Tiny amounts of code in the right place are very effective

• Future work

– Modify compiler for independent compilation

– (Overcome static resolution problem)

– Implement extended resolution mechanism with aspects


Thank you


Resolution cache

• Resolving a member reference gives a metadata token

• Which token gets cached depends on the method called first:

class A {
    [LinkAssembly("mscorlib","msphone",…)]
    [LinkClass("System.Console","Speech.Output",…)]
    void m1() { System.Console.WriteLine(…); }

    // No attributes apply to m2
    void m2() { System.Console.WriteLine(…); }
}

• Caching flexible members → non-FDL code will use them

• Not caching flexible members → repetitious FDL resolution

• Generate new member refs for flexible members → complex
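The first two trade-offs can be demonstrated with a toy resolver. The `Resolver` class below is hypothetical and keys its cache on the reference name, whereas the real cache holds metadata tokens.

```python
class Resolver:
    """Toy resolution cache shared by all methods of a class."""

    def __init__(self, cache_flexible):
        self.cache = {}
        self.cache_flexible = cache_flexible
        self.resolutions = 0  # counts full (uncached) resolutions

    def resolve(self, ref, substitutions):
        if ref in self.cache:
            return self.cache[ref]
        self.resolutions += 1
        target = substitutions.get(ref, ref)
        flexible = ref in substitutions
        if self.cache_flexible or not flexible:
            self.cache[ref] = target
        return target


M1 = {"System.Console": "Speech.Output"}  # attributes in scope for m1
M2 = {}                                   # no attributes apply to m2

# Caching flexible members: m2 wrongly inherits m1's substitution.
caching = Resolver(cache_flexible=True)
caching.resolve("System.Console", M1)
print(caching.resolve("System.Console", M2))

# Not caching them: m2 is correct, but m1 is resolved again on every use.
not_caching = Resolver(cache_flexible=False)
not_caching.resolve("System.Console", M1)
not_caching.resolve("System.Console", M1)
print(not_caching.resolutions, not_caching.resolve("System.Console", M2))
```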


True runtime discovery?

• Rather than specifying substitutions through attributes, make bytecode more abstract with variable types

– [fdl05a_X]Class

– [Assembly]fdl05c_X

– [fdl05a_X]fdl05c_X

• Gather constraints on variable types, as we did for classes named by [LinkClass()]

– Assembly A is variable

– Variable assembly A has class C

– [A]C has field f with signature t

– [A]C has method m with signature t

– ilasm never checks existence of referenced classes, so TypeRefs are implicitly variable


Representing variable types in metadata

.assembly extern FDRAttributes { version 0.0.0.0 }

.assembly extern fdl05a_X {
    .custom instance void [FDRAttributes]VariableTypeAttribute::.ctor()
}

.assembly HelloWorld {
    .custom instance void [FDRAttributes]VariableAsmHasClassAttribute::.ctor(string,string)
        = ( 01 00 08 … ) // ...fdl05a_X.fdl05c_X
    .custom instance void [FDRAttributes]VariableClassHasMethodAttribute::.ctor(string,string,…)
        = ( 01 00 08 … ) // ...fdl05a_X.fdl05c_X.WriteLine…
}

.class C {
    .method void Main(string[] args) managed {
        call void [fdl05a_X]fdl05c_X::WriteLine(…)
    }
}


Implementing FDL

JIT-compiler

CEEInfo::findClass/Field/Method

Add LinkContext to thread’s stack

Add reference’s info to LinkContext

Find lowest scope level with the appropriate [LinkAssembly()]

Choose [LinkAssembly()] directives w.r.t. interface policy

Get class substitutions and constraints relating to assembly

For each GAC assembly

– For each class substitution

– Check existence of requested member

– Signature and constraint verification of found member

Find exact substitution

Caching


1) JIT-compile call [X]C::…

2) Ask ClassLoader of current assembly if [X]C is in current module (No)

3) Does current assembly’s metadata have a TypeRef for C? (Yes)

4) TypeRef points to AssemblyRef, which indicates a type variable

5) User input to substitute assembly type variable to a GAC assembly

6) Substituted assembly’s ClassLoader recognises class type variables (‘fdlC…’) and checks map


Assembly binding can only use the names in IL

Compilation environment: class A {B f;} class B {C g;}

Execution environment: class A {D f;} class D {C g;}

new A().f.g compiles to new A.f[A,B].g[B,C]

new A.f[A,B].g[B,C] executes as ResolutionError
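The failure can be reproduced in a toy model. The sketch assumes that f[A,B] means "field f declared in class A with type B", which is one reading of the notation above; `compile_chain`, `resolve_chain` and the environment dictionaries are hypothetical.

```python
class ResolutionError(Exception):
    pass


# Each environment maps class -> {field name: field type}.
COMPILE_ENV = {"A": {"f": "B"}, "B": {"g": "C"}}
EXEC_ENV = {"A": {"f": "D"}, "D": {"g": "C"}}


def compile_chain(root, fields, env):
    """Compile root.f.g... into references that name owner and field type."""
    refs, cls = [], root
    for name in fields:
        field_type = env[cls][name]
        refs.append((name, cls, field_type))  # e.g. f[A,B]
        cls = field_type
    return refs


def resolve_chain(refs, env):
    """Resolve each reference using only the names baked into the bytecode."""
    for name, owner, field_type in refs:
        if owner not in env:
            raise ResolutionError(f"class {owner} not found")
        if env[owner].get(name) != field_type:
            raise ResolutionError(f"{owner}.{name} does not have type {field_type}")


refs = compile_chain("A", ["f", "g"], COMPILE_ENV)
print(refs)
try:
    resolve_chain(refs, EXEC_ENV)
except ResolutionError as err:
    print("ResolutionError:", err)
```

The source expression new A().f.g would compile in either environment, but the compiled references carry B by name and so cannot bind to the execution environment.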