Region-Based Memory Management

Post on 03-Jan-2016


Regions represent areas of memory
Objects are allocated “in” a given region
Various deallocation options
Various safety (no access of freed objects) options

Region r = newregion();
for (i = 0; i < 10; i++) {
    int *x = ralloc(r, (i + 1) * sizeof(int));
    work(i, x);
}
deleteregion(r);

Policy choices

Deallocation:
  garbage collection (GC)
  per-object free (per-object)
  region deletion (all-at-once)

Safety:
  none (none)
  reachability (GC)
  per-region reference counting (RC)
  statically checked (static)

Some Existing Region Systems

System       Deallocation                 Safety
arenas       all-at-once                  none
apache       all-at-once                  none
zones        per-object                   none
Stoutamire   GC                           GC
vmalloc      all-at-once or per-object    none
TT           all-at-once                  static
CWM          all-at-once                  static
C@/RC        all-at-once                  RC

Why Regions ?

Per-region allocation/deallocation policies: zones (D.T. Ross, 1967), vmalloc (K. Vo, 1996)
Performance: arenas (D. Hanson, 1990)
Locality benefits: Stoutamire (1997)
Expressiveness: apache, arenas, C@/RC
Target for compiler-inferred memory management: Tofte & Talpin (1994); Crary, Walker, Morrisett (1999)

Why Regions ? (more reasons)

Statically guaranteed memory safety: CWM (1999)
Target for garbage collection: Wang & Appel (2001)

Region Performance: Allocation and Deallocation

Applies to all-at-once deallocation only
Basic strategy:
  allocate a big block of memory
  individual allocation is:
    pointer increment
    overflow test
  deallocation frees the list of big blocks
  all operations are fast

[Figure: a region, labelled with its alloc point and wastage]
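The basic strategy above can be sketched in C as a bump-pointer allocator. This is only a minimal illustration under stated assumptions, not the implementation used by any of the systems named in this deck: the `struct block` layout, the `BLOCK_SIZE` value, and the rounding of every request to maximum alignment are all choices made for the sketch. Only the names `Region`, `newregion`, `ralloc` and `deleteregion` come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096            /* assumed size of a "big block" */

/* Hypothetical layout: a region is a linked list of big blocks. */
struct block {
    struct block *next;            /* older block in the same region */
    size_t used;                   /* the alloc point, in bytes */
    size_t capacity;               /* usable bytes in data[] */
    max_align_t data[];            /* payload, aligned for any object */
};

typedef struct region { struct block *head; } *Region;

Region newregion(void) {
    Region r = malloc(sizeof *r);
    if (r != NULL) r->head = NULL;
    return r;
}

void *ralloc(Region r, size_t size) {
    const size_t align = _Alignof(max_align_t);
    size = (size + align - 1) / align * align;   /* keep later allocs aligned */

    struct block *b = r->head;
    if (b == NULL || b->capacity - b->used < size) {   /* overflow test */
        /* Start a new block; the unused tail of the old one is wastage. */
        size_t cap = size > BLOCK_SIZE ? size : BLOCK_SIZE;
        b = malloc(sizeof *b + cap);
        if (b == NULL) return NULL;
        b->next = r->head;
        b->used = 0;
        b->capacity = cap;
        r->head = b;
    }
    void *p = (char *)b->data + b->used;
    b->used += size;                                   /* pointer increment */
    return p;
}

void deleteregion(Region r) {
    /* All-at-once deallocation: free the list of big blocks. */
    struct block *b = r->head;
    while (b != NULL) {
        struct block *next = b->next;
        free(b);
        b = next;
    }
    free(r);
}
```

On the common path `ralloc` does one comparison and one addition, which is why allocation is fast; `deleteregion` does work proportional to the number of blocks rather than the number of objects.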

Region Performance: Locality

Regions can express locality:
  Sequential allocs in a region can share a cache line
  Allocs in different regions are less likely to pollute the cache for each other

Example: moss is 24% faster when frequently accessed, small objects are placed in a different region than infrequently accessed, large objects

Locality: moss

1-region version: small & large objects in 1 region
2-region version: small & large objects in 2 regions
45% fewer cycles lost to r/w stalls in the 2-region version

[Chart: moss - stalls (megacycles), 1-region vs 2-region]

[Chart: moss - time (s), 1-region vs 2-region]

Region Expressiveness

Adds some structure to memory management

Few regions:
  easier to keep track of
  delay freeing to a convenient "group" time (e.g., end of an iteration, closing a device, etc.)

No need to write "free this data structure" functions

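Region Static Checking: Region Type Systems

Basic idea: name regions in types
A simple region type system:

$\tau = \mathrm{int} \mid \mathrm{region}\ @\ \rho \mid \langle\tau_1, \ldots, \tau_n\rangle\ @\ \rho \mid \tau \to \tau' \mid \forall\rho.\tau$

$\rho$: region variables

Example: $\forall\rho.(\langle\mathrm{int}, \mathrm{int}\rangle\ @\ \rho \to \mathrm{int})$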

Region Static Checking: Tofte & Talpin

Regions follow stack discipline
letregion $\rho$ in e:
  allocate a region named $\rho$
  evaluate e (can use $\rho$)
  delete region $\rho$
safe if:
  the (region) type of e does not use $\rho$
  $\rho$ is not free in the letregion's environment
deallocation of regions is required...
problem: pure stack discipline is too restrictive ("leaks")
  Aiken, Fähndrich, Levien: allocate late, deallocate early
  Tofte & Talpin: other optimisations

Region Static Checking: Capabilities

Crary, Walker, Morrisett: capabilities available at each program point:
  $\rho^1$: read objects in $\rho$, allocate in $\rho$, freergn
    guarantee: no other regions alias $\rho$ (so freergn is safe)
  $\rho^+$: read objects in $\rho$, allocate in $\rho$
  $\rho^1 < \rho^+$ (capability "subtyping")
capabilities threaded through the program:
  newrgn adds $\rho^1$ to the current capabilities
  freergn removes $\rho^1$ from the current capabilities
  function calls can temporarily "lose" capabilities (but recoverable on return)
no capabilities allowed at exit: deallocation of regions is required

CWM Example

Static Checking Limitations

Some types are not expressible: list of regions
Ease of programming is unknown
No clear bounds on memory usage

Region Dynamic Checking: RC

Features of RC:
  region-based allocation: newregion/deleteregion/ralloc
  safety via reference-counting (RC):
    RC(region r) = number of references to objects in r from outside r
    deleteregion(r) fails if RC(r) > 0
  type annotations to describe program's region structure

Example

struct list { int i; struct list @next; } *a, *b;

Region r = newregion();       /* RC(r) = 0 */
b = rcons(r, 77, null);       /* RC(r) = 1: b points into r */
a = rcons(r, 23, b);          /* RC(r) = 2: a and b point into r */
b->next = a;                  /* RC(r) = 2: a pointer within r is not counted */
a = b = null;                 /* RC(r) = 0 */
deleteregion(r);              /* succeeds, since RC(r) = 0 */

Region advantages (over regular RC):

good for cyclic structures

space cost of RCs is negligible

RC: Type annotations

User-view:
  int *traditional x: "traditional" C pointer (not to region)
  struct list { int i; struct list *sameregion next; }: pointer within same region

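Abstract view (ignoring issues with null):
  region type system (like for static systems) with addition of existential types: $\tau = \ldots \mid \exists\rho.\tau$
  and runtime checks
  anylist $= \lambda\rho.\langle\mathrm{int}, \exists\rho_1.\mathrm{anylist}[\rho_1]\rangle\ @\ \rho$
  list $= \lambda\rho.\langle\mathrm{int}, \mathrm{list}[\rho]\rangle\ @\ \rho$
  runtime check that two region variables are identical: chk $\rho_1 = \rho_2$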
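The runtime check that two region variables are identical can be pictured as a guard on stores to a sameregion field. The C sketch below is an assumption-laden illustration, not RC's actual mechanism: it assumes each object carries a hypothetical `home` pointer to its region, lets null pass the check (the slide sets null aside), and reports a failed check by returning false, since the slides do not say how a real system reacts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct region { int id; };       /* placeholder region descriptor */

struct node {
    struct region *home;         /* assumed header: region owning this node */
    int i;
    struct node *next;           /* conceptually "struct node *sameregion next" */
};

/* Hypothetical helper: the region a pointer refers into, or NULL for null. */
static struct region *regionof(const struct node *p) {
    return p != NULL ? p->home : NULL;
}

/* Store val into obj->next only if the check "region of val = region of obj"
   holds. Returns false and leaves obj unchanged when the check fails. */
bool store_sameregion(struct node *obj, struct node *val) {
    if (val != NULL && regionof(val) != regionof(obj))
        return false;
    obj->next = val;
    return true;
}
```

Because a sameregion pointer can never cross a region boundary, a store that passes this check needs no reference-count update at all.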

RC: Implementation

Compiles to C
Most RC updates for local variables are avoided
Assignments to fields and globals produce obvious RC updates (cost: 16-23 instructions)
Deleting a region is expensive (scan)

RC: Experiments

Machine: 333 MHz UltraSparc I, Solaris 2.7
Benchmarks: 8 medium to large C programs
Regions vs malloc/free
C compiler: gcc 2.95
Measurements with UltraSparc internal counters

The Benchmarks

Eight C programs:
  cfrac: factorise large integers
  gröbner: find the Gröbner basis of a set of polynomials
  mudlle: byte-code compiler
  lcc: the lcc compiler
  tile: partitions text files based on word frequency
  moss: software plagiarism detector
  rc: RC compiler
  apache: apache web server

Results: Ease of Use (from old implementation)

Size of substantive changes:
  cfrac: 18 of 4203 lines
  gröbner: 111 of 3219 lines
  mudlle: 22 of 5078 lines
  lcc: 349 of 12430 lines
  tile: 10 of 926 lines
  moss: 4 of 2675 lines

Types of changes:
  extra copying
  clear unused references
  work around prototype limitations

Results: Execution Time

[Chart: execution time (s), malloc vs RC]

Results: Safety overhead

[Chart: RC overhead (% execution time)]

Dynamic Checking Limitations

Runtime overhead: 0-20%
Must clear dangling references
Small number of objects/region is bad:
  RC more painful
  space & time overhead