The Dagstuhl Middle Model: An Overview

28
The Dagstuhl Middle Model: An Overview Timothy C. Lethbridge SITE, University. of Ottawa [email protected]

description

The Dagstuhl Middle Model: An Overview. Timothy C. Lethbridge SITE, University. of Ottawa [email protected]. Motivation. Information interchange is necessary among reverse engineering tools So researchers/companies can Use each others’ tools Interpret each other’s data - PowerPoint PPT Presentation

Transcript of The Dagstuhl Middle Model: An Overview

Page 1: The Dagstuhl Middle Model: An Overview

The Dagstuhl Middle Model:An Overview

Timothy C. Lethbridge

SITE, University. of Ottawa

[email protected]

Page 2: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 2

MotivationInformation interchange is necessary among

reverse engineering tools• So researchers/companies can

Use each others’ tools Interpret each other’s data

Therefore we need one or more metamodels• (better if we can integrate and have just one metamodel)

We will use the term metamodel and schema interchangeably

Page 3: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 3

Various metamodels that have been used

• CDIF (obsolete now)• UML metamodel and the MOF• TA / RSF schemas

In the Canadian research community• Famix• Others

Typical problems• Too proprietary• Too complex• Don’t meet all reverse-engineering requirements

Page 4: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 4

Requirements not met by the UML metamodel

Effective handling of source-code level information such as:• Full procedure call hierarchy• Full details of variables and access information• Macros and other source code constructs

The UML metamodel focuses on things representable by UML• i.e. designed for forward-engineering

Page 5: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 5

Main types of metamodels for reverse engineering

Low-level (Abstract Syntax Graph)• Specific to a programming language• Can represent full details of code, allowing almost unlimited types of analysis

Mid-level• Relationships among program entities• More compact than low-level

Architectural• E.g. components, pipes and filters

Database

Page 6: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 6

The Dagstuhl Middle Model (DMM)Probably should from now on be called the

Dagstuhl Middle Metamodel

Initiated at the Dagstuhl Seminar on reverse engineering tool interoperability• Held Jan 22-26, 2001• Involved about 45 people• Ratified GXL 1.0

A syntactic representation for schemas• Started work on the four types of schemas

Page 7: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 7

DMM background continuedRepresents program and source elements and their

relationships• In a language-independent way• Suitable for typical OO and procedural languages

C, C++, Java, Fortran, Cobol, etc.• Extensions needed for languages with special features

E.g. Logic languages, scripting languages

Key initial contributors and inputs • University of Ottawa (TA++)• Sander Tichelaar; Berne (Famix)• Erhard Plödereder; Stuttgart (Bauhaus)• others

Web site: http://titan.cnds.unibe.ch:8080/Exchange/2

Page 8: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 8

Design issuesOn the next few slides we will look at some of the

issues considered when designing DMM

Most of these issues were resolved in the direction of providing• Simpler models• That capture adequate information• Enabling the reverse engineer to navigate objects and relationships in the system Once they have found information of interest they

may look at code or more detailed models

Page 9: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 9

Design issue 1Should you be able to regenerate the original

code given a middle model?

Decision: No• That would yield models of too great complexity for our target audience

• But this does not preclude a parser that generates a low level model, coupled to a tool that ‘lifts’ the data to a middle model

Page 10: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 10

Design issue 2Should a model represent original code or pre-

processed code?• Options:

Original code• More useful for browsing tools• What the reverse engineer really thinks about and needs to

modify• Can model pre-processor directives + macros

Pre-processed• Much easier to parse

Decision: Represent original code

Page 11: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 11

Design issue 3

How language-independent should the metamodel be?

Decision:• It should be able to abstractly represent all common relationships among elements• Even though in different languages …

… the elements may have different names … the elements may have subtly different

semantics

Page 12: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 12

Examples of language-independence-promoting abstractions

• The notion of type of a variable is preserved But different types of pointers and storage can be

abstracted out• Procedures, functions and methods can all be treated similarly• Classes, records and structures can be treated similarly• Etc.

Page 13: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 13

Design issue 4Should all references be resolved?

E.g. exactly what procedure is called by what procedure call

Resolved references• Not possible in the general case due to function pointers,

dynamic binding• Even in the non-OO case, requires full understanding of

language-dependent linkage rules• Often dependent on makefile

In practice, software engineers can understand most aspects of code without resolved references

Page 14: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 14

Design issue 5

What level of detail needs to be captured?• Assignments, control constructs?

No – that is for low-level ASG metamodels• Local variables?

Not normally at the middle level• Exact position of all references? (e.g. calls)

Existence of references in some object may be all that is needed

• Enough detail for full data flow analysis? No; only enough for limited DFA

Page 15: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 15

Design issue 6Do we separate representation of source entities from

representation of conceptual entities?

Source elements are chunks of source code• Needed for tools to point reverse engineers to where

conceptual entities are mapped to code• Also useful for representing source constructs such as

macros, conditional compilation, etc.

Such a separation has proved useful, but may not always be needed

Page 16: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 16

How to model DMM?

We follow the OMG convention of modeling DMM using a class diagram

Actually we use the subset of class diagrams called MOF (Meta Object Format)

Page 17: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 17

Top level of the hierarchy

ModelRelationship

SourceObject ModelObject

BehaviouralElement

ModelElement

StructuralElement

Relationship

*

SourceModelRelationship

SourceRelationship

*

Page 18: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 18

Modelelementhierarchy ModelRelationship {abstract}

Package

FormalParameter

position

ExecutableValue

SourceObject

{abstract}

ModelObject {abstract}

name

Routine

0..1 defines

BehaviouralElement {abstract}

Field

Variable

Value

ModelElement {abstract}

EnumeratedType

EnumerationLiteral

Type

StructuralElement {abstract}

Method

StructuredType

Class

1..*

*

0..1

0..1 declares*

visibility

isConstructor

isDestructor

isAbstract

isDynamicallyBound

isOverideable

inheritsFrom

**

isSubpackageOf

*

isOfType

hasValue

0..1isDefinedInTermsOf

CollectionType

size

*

0..1*

*

0..1

0..1

isSubclassable

isEnumerationLiteralOf

isFieldOf

isMethodOf

invokes

* *accesses

* *

isParameterOf

imports

*

*

0..1

*

Page 19: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 19

Relationshiphierarchy

Sets

IsPartOfSignatureOf

IsActualParameterOf

ModelElement

Invokes

IsFieldOf

Field

StructuredType

IsMethodOf

Method

Class

IsEnumerationLiteralOf

EnumerationLiteral

EnumeratedType

isPartOf Invokes

BehavioralElement

BehavioralElement

Accesses

BehaviouralElement

StructuralElement

IsOfType

StructuralElement

Type

Imports

IsReturnValueOf

Type

BehaviouralElement

IsParameterOf

FormalParameter

BehaviouralElement

TakesAddressOfComponent

TakesAddressOfObjectUsesObjectSetsObject

SetsComponent UsesComponent

TakesAddressOfUses

SourceRelationship

Class

LogicalPackage

ModelRelationship

Relationship

IsDefinedInTermsOf

Type

Type

Includes

ModelElement

SourceFile

Contains

SourceUnit

SourcePart

ModelElement

SourceObject

SourceObject

SourceFile

SourceModelRelationship

Defines Declares

SourceObject

ModelObject

IsExpansionOf

MacroDefinition

MacroExpansion

Describes

SourceObject

Comment

this class hierarchy

shows domain and

range of each

association class,

rather than link

attributes.

inheritsFrom

Class

Class

access

Page 20: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 20

Source objecthierarchy

SourceUnit

name

Resolvable

name

Definition

SourceFile

path

*

******

Declaration Reference

SourcePart

startLine

startChar

endLine

endChar

SourceObject

*

contains

includes

MacroDefinition

name

MacroArgument

name

ModelObject* *

MacroExpansion

isExpansionOf

*

Comment

*

describes

Page 21: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 21

Issues not handled by DMM that may concern reverse engineers

Multi-part references to members• E.g. in the expression a.b().c

How

Function pointers• Reverse engineers we have worked for really have wanted to understand what can call what via function pointers

Computed and aliased references

Templates, à la C++

Page 22: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 22

Syntactic carrier neutralityDMM can be represented using:• GXL: See http://www.gupro.de/GXL/• TA• XMI• RDF• Any other format capable of representing instances of classes and their associations

A tool can use whichever it likes, provided there are appropriate syntax translators

Page 23: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 23

DMM Development Status

General website• http://scgwiki.iam.unibe.ch:8080/Exchange/2.

University of Ottawa• Have C++ parser that generates DMM in GXL• URL: http://www.site.uottawa.ca:4322/dmm

Others are also experimenting with model

Future work• Many details to be decided based on initial experiences

Page 24: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 24

Topics for discussionHow useful is DMM as it stands? Is anything clearly

inadequate? What changes, variants or extensions are needed?

What other middle level metamodels are there and can we merge the ideas from these to create a common standard

Will this standard be adopted

What extensions might be useful?

Can DMM be reasonably integrated with other types of metamodels? (E.g. database, architectural, and ASG metamodels, with common classes at the intersection)

Page 25: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 25

(End)

Page 26: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 26

TheUMLMeta-model

Component

0..10..10..1

ownedElement

Comment

*

Link

Structure

1..*1..*1..*

EnumerationLiteral

ProgrammingLanguageTypeEnumerationPrimitive

******

1..*

*

Object

Parameter

defaultValuekind

InstanceNamespace

specialization*

parent

*child

GeneralizableElement

isRootisLeafisAbstract

Package

Model

SubsystemDataTypeInterfacedeploymmentLocation

* resident

*

AssociationClass

Class Node

owner

{ordered}

participant

*

specification

*

AssociationEnd

isNavigableaggregationmultiplicity

connection

2..*2..*2..*

Association

Method

body

Operation

isAbstract

0..1

qualifier *

Attribute

initialValue

{ordered}*

specification

******

* type

Generalization

discriminator

Feature

visibility

BehaviouralFeatureStructuralFeature

multiplicity

Classifier

ModelElement

name

Element

Relationship

*

*

type

importedElement

*

*

Page 27: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 27

A legacymid-levelmodel:TA++

* resolvedTo 0..1

potentiallyImports*

*

ImportExistence

InclusionExistence

0..1

typ0..10..10..1 type0..10..10..1

Resolvable

Declaration

TypeReference

Reference

1..*0..1

*

SourceFile

path

* ******

SourceElementName

string

StructuredTypeDef

DatumDef

EnumeratedTypeDef

DataUseExistence

Definition

SourcePart

startCharendChar

SourcePackage

*

SpecificSourceElement

***

0..10..1 * RoutineDef

EnumerationLiteralDef

FieldDef

SourceUnit

*

potentiallyIncludes *

*

*

potentiallyUsesType

*potentiallyCalls

* RoutineCallExistence

TypeUseExistence

TypeDef

ClassDef

ReferenceExistence

******

contains

...

**

potentiallyRefersTo

parameters

methods

makesReferenceTo

*

*

Page 28: The Dagstuhl Middle Model: An Overview

ateM 2003 - Victoria Timothy C. Lethbridge 28

Another input model: part of Famix

Access

InheritanceDefinition

FunctionMethodPackageClassInvocation

Object

Association Model Entity BehavioralEntity