Chromed Metal

Safe and Fast C++

Andrei Alexandrescu
andrei@metalanguage.com

Copyright © 2005 Andrei Alexandrescu

Agenda

- Modularity and speed: a fundamental tension
- Example: memory allocation
  - Policies
  - Eager computation
  - Segregate functionality
  - Costless refinements
- Based on “Composing High-Performance Memory Allocators” by Berger et al.: www.heaplayers.org

Modularity: good

- Developing systems from small parts is good
  - Best known way to manage complexity
- Abstraction is good
  - Modularity and abstraction go hand in hand
- Separate development is good
- Separate testing is good
- Confinement of bugs is good

Speed: good

- Getting work done is good (?)
- Libraries that don’t exact penalties are good
- Lossless growth is good
- Compounded inefficiency: abstraction’s worst enemy

Modularity and Speed

- Fundamental tension:
  - Modularity asks for separation, hiding, abstraction, and uniform interfaces
  - Speed asks for coalescing, transparency, specialization, and non-uniformity
- How to resolve the tension?

Two Approaches

- Defer compilation/optimization
  - Develop subsystems separately, have the runtime optimize when it sees them all
  - Various JIT approaches
- Expedite computation/exposure
  - Develop subsystems separately, have the compiler see them all early
  - Various macro and compilation systems

Example: Memory Allocation

- Memory allocation:
  - Very hard to modularize/componentize
  - Highly competitive:
    - General-purpose allocators: 100 cycles/alloc
    - Specialized allocators: < 12 cycles/alloc
- Templates:
  - Compute things early
  - Expose modular code early

Idea #1: mixins/policies

- Create uncommitted, “for adoption” derived classes

    template <class Base>
    struct Heap : public Base {
        void* Alloc(size_t);
        void Dealloc(void*);
    };

- Exposes modular code early
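A minimal sketch of how such a layer is adopted, assuming a malloc-backed bottom class like the one on the "Top Class" slide. CountingHeap and LiveAfterDemo are hypothetical names invented for this illustration, not part of the talk:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Bottom of the stack: forwards to malloc/free.
struct MallocHeap {
    void* Alloc(std::size_t s) { return std::malloc(s); }
    void Dealloc(void* p) { std::free(p); }
};

// Hypothetical layer: counts live blocks and forwards everything else
// to whatever Base it is instantiated with.
template <class Base>
struct CountingHeap : public Base {
    void* Alloc(std::size_t s) {
        void* p = Base::Alloc(s);
        if (p) ++live_;
        return p;
    }
    void Dealloc(void* p) {
        if (p) {
            assert(live_ > 0);
            --live_;
        }
        Base::Dealloc(p);
    }
    std::size_t Live() const { return live_; }

private:
    std::size_t live_ = 0;
};

// Allocates two blocks, frees one, and reports how many are still live.
inline std::size_t LiveAfterDemo() {
    CountingHeap<MallocHeap> h;
    void* a = h.Alloc(16);
    void* b = h.Alloc(32);
    h.Dealloc(a);
    std::size_t live = h.Live();
    h.Dealloc(b);
    return live;
}
```

Because Base is a template parameter, the compiler sees both layers at once and can inline the forwarding calls.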

Top Class

- Can’t defer forever, so without further ado…

    struct MallocHeap {
        void* Alloc(size_t s) { return malloc(s); }
        void Dealloc(void* p) { free(p); }
    };

Idea #2: Eager Computation

- Avoid redundant and runtime computation safely!

    class TopHeap {
        void* AllocImpl(size_t) { ... }
        void Dealloc(void*) { ... }
        // Round the size up to a multiple of AlignBytes once, at the entry point.
        friend void* Alloc(TopHeap& h, size_t s) {
            return h.AllocImpl(
                (s + AlignBytes - 1) & ~(AlignBytes - 1));
        }
        friend void Dealloc(TopHeap& h, void* p) {
            return h.Dealloc(p);
        }
    };
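A small sketch of the rounding step, assuming AlignBytes is a power of two (the mask trick relies on it). The value 8, the lastRequest field, and the helper SizeSeenByImpl are assumptions made for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Assumed alignment for this sketch; must be a power of two.
constexpr std::size_t AlignBytes = 8;
static_assert(AlignBytes != 0 && (AlignBytes & (AlignBytes - 1)) == 0,
              "AlignBytes must be a power of two");

// Rounds s up to the next multiple of AlignBytes.
constexpr std::size_t RoundUp(std::size_t s) {
    return (s + AlignBytes - 1) & ~(AlignBytes - 1);
}

// For constant arguments the rounding is done by the compiler.
static_assert(RoundUp(13) == 16, "13 rounds up to 16");

// Simplified top heap that remembers the size its implementation saw.
struct TopHeap {
    std::size_t lastRequest = 0;
    void* AllocImpl(std::size_t s) {
        lastRequest = s;
        return std::malloc(s);
    }
    void Dealloc(void* p) { std::free(p); }
};

// Entry points: the size is rounded exactly once, here.
inline void* Alloc(TopHeap& h, std::size_t s) {
    return h.AllocImpl(RoundUp(s));
}
inline void Dealloc(TopHeap& h, void* p) { h.Dealloc(p); }

// Reports the size that reached AllocImpl for a given request.
inline std::size_t SizeSeenByImpl(std::size_t request) {
    TopHeap h;
    void* p = Alloc(h, request);
    Dealloc(h, p);
    return h.lastRequest;
}
```

Layers below the entry point can then assume an already rounded size instead of rounding again.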

Idea #3: Segregate Representation

    template <class Base>
    class SzHeap : public Base {
        void* Alloc(size_t s) {
            // Reserve a size_t header in front of the block handed to the user.
            size_t* pS = static_cast<size_t*>(
                Base::AllocImpl(s + sizeof(size_t)));
            return *pS = s, pS + 1;
        }
        void Dealloc(void* p) {
            Base::Dealloc(static_cast<size_t*>(p) - 1);
        }
        size_t SizeOf(void* p) {
            return (static_cast<size_t*>(p))[-1];
        }
    };

Free Lists

- Unbeatable specialized allocation method
  - Put deallocated blocks in a freelist
  - Consult the freelist when allocating
- Disadvantage: fixed size, no coalescing, no reallocation
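The two steps can be sketched as a standalone fixed-size pool. FixedFreeList, FreshBlocks, and FreshBlocksAfterReuse are hypothetical names for this illustration, and malloc stands in for whatever allocator supplies fresh blocks:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical fixed-size pool: every block is BlockSize bytes.
template <std::size_t BlockSize>
class FixedFreeList {
    static_assert(BlockSize >= sizeof(void*),
                  "a free block must be able to hold the link pointer");
    struct Node { Node* next; };
    Node* head_ = nullptr;
    std::size_t fresh_ = 0;  // blocks obtained from malloc so far

public:
    FixedFreeList() = default;
    FixedFreeList(const FixedFreeList&) = delete;
    FixedFreeList& operator=(const FixedFreeList&) = delete;

    // Returns all parked blocks to the system.
    ~FixedFreeList() {
        while (head_) {
            Node* n = head_;
            head_ = n->next;
            std::free(n);
        }
    }

    // Consult the free list first; fall back to malloc when it is empty.
    void* Alloc() {
        if (head_) {
            Node* n = head_;
            head_ = n->next;
            return n;
        }
        void* p = std::malloc(BlockSize);
        if (p) ++fresh_;
        return p;
    }

    // Park the block on the free list; the link lives inside the block.
    void Dealloc(void* p) {
        if (!p) return;
        Node* n = static_cast<Node*>(p);
        n->next = head_;
        head_ = n;
    }

    std::size_t FreshBlocks() const { return fresh_; }
};

// Allocates, frees, and allocates again; the second call reuses the block.
inline std::size_t FreshBlocksAfterReuse() {
    FixedFreeList<64> pool;
    void* a = pool.Alloc();
    pool.Dealloc(a);
    void* b = pool.Alloc();
    assert(a == b);
    pool.Dealloc(b);
    return pool.FreshBlocks();
}
```

Both operations are a couple of pointer moves, which is where the speed comes from; the price is that every block has the same size.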

Free Lists Layer

    template <size_t S, class Base>
    class FLHeap : public Base {
        void* Alloc(size_t s) {
            if (s != S || !list_) {
                return Base::AllocImpl(s);
            }
            void* p = list_;
            list_ = list_->next_;
            return p;
        }
        ...

(continued)

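        ...
        void Dealloc(void* p) {
            // this-> is required because SizeOf lives in the dependent base.
            if (this->SizeOf(p) != S) return Base::Dealloc(p);
            List* pL = static_cast<List*>(p);
            pL->next_ = list_;
            list_ = pL;
        }
        ~FLHeap() { ... }
    private:
        struct List { List* next_; };
        List* list_;  // head of the free list
    };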

Remarks

- There is no source-level coupling between the way the size is maintained and computed, and FLHeap
  - Combinatorial advantage
- There is coupling at the object code level
  - (+) Optimization
  - (-) Separate linking, dynamic loading…

Building a Layered Allocator

    typedef FLHeap<64, FLHeap<32,
        SzHeap<MallocHeap> > >
        MyHeap;

- Modular
- Easy to understand
- Easy to change
- Efficient
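A self-contained sketch of this stack, under a few assumptions made for the illustration: every layer exposes public AllocImpl/Dealloc, MallocHeap carries a calls counter so the effect is observable, and the free-list destructor returns parked blocks to the layer below. MallocCallsForDemo is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct MallocHeap {
    std::size_t calls = 0;  // instrumentation added for this example
    void* AllocImpl(std::size_t s) { ++calls; return std::malloc(s); }
    void Dealloc(void* p) { std::free(p); }
};

// Stores the requested size in a header in front of each block.
template <class Base>
struct SzHeap : Base {
    void* AllocImpl(std::size_t s) {
        std::size_t* pS = static_cast<std::size_t*>(
            Base::AllocImpl(s + sizeof(std::size_t)));
        if (!pS) return nullptr;
        *pS = s;
        return pS + 1;
    }
    void Dealloc(void* p) {
        if (p) Base::Dealloc(static_cast<std::size_t*>(p) - 1);
    }
    std::size_t SizeOf(void* p) const {
        return static_cast<std::size_t*>(p)[-1];
    }
};

// Keeps freed blocks of exactly S bytes on a free list.
template <std::size_t S, class Base>
struct FLHeap : Base {
    static_assert(S >= sizeof(void*), "block must hold a link pointer");
    FLHeap() = default;
    FLHeap(const FLHeap&) = delete;
    FLHeap& operator=(const FLHeap&) = delete;
    ~FLHeap() {
        while (list_) {  // hand parked blocks back to the layer below
            List* n = list_;
            list_ = n->next_;
            Base::Dealloc(n);
        }
    }
    void* AllocImpl(std::size_t s) {
        if (s != S || !list_) return Base::AllocImpl(s);
        void* p = list_;
        list_ = list_->next_;
        return p;
    }
    void Dealloc(void* p) {
        if (!p) return;
        if (this->SizeOf(p) != S) return Base::Dealloc(p);
        List* pL = static_cast<List*>(p);
        pL->next_ = list_;
        list_ = pL;
    }

private:
    struct List { List* next_; };
    List* list_ = nullptr;
};

typedef FLHeap<64, FLHeap<32, SzHeap<MallocHeap> > > MyHeap;

// Counts how many requests actually reach malloc.
inline std::size_t MallocCallsForDemo() {
    MyHeap h;
    void* a = h.AllocImpl(64);   // reaches malloc
    h.Dealloc(a);                // parked on the 64-byte free list
    void* b = h.AllocImpl(64);   // served from the free list
    assert(a == b);
    void* c = h.AllocImpl(32);   // reaches malloc
    void* d = h.AllocImpl(100);  // reaches malloc; no free list for this size
    h.Dealloc(b);
    h.Dealloc(c);
    h.Dealloc(d);
    return h.calls;
}
```

Changing the design means editing the typedef: adding, removing, or reordering layers does not touch the layer code.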

Idea #4: Costless Refinements

    template <class Heap>
    struct CanResize { enum { value = 0 }; };

    template <class Heap>
    bool Resize(Heap&, void*, size_t&) {
        return false;
    }

- Refined implementations will “hide” the default and specialize CanResize
- Can test for resizing capability at compile time or runtime
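A sketch of one refinement and both kinds of test. SlackHeap, PlainHeap, TryGrow, and the reading of the size_t argument as the requested new size are assumptions made for this illustration:

```cpp
#include <cassert>
#include <cstddef>

// Defaults: a heap cannot resize unless it says otherwise.
template <class Heap>
struct CanResize { enum { value = 0 }; };

template <class Heap>
bool Resize(Heap&, void*, std::size_t&) { return false; }

// Hypothetical heap with no resizing support; it gets the defaults for free.
struct PlainHeap {};

// Hypothetical heap whose blocks can grow in place up to a fixed capacity.
struct SlackHeap {
    static constexpr std::size_t Capacity = 128;
};

// The refinement: a non-template overload hides the default for SlackHeap...
inline bool Resize(SlackHeap&, void*, std::size_t& s) {
    return s <= SlackHeap::Capacity;
}

// ...and the trait is specialized to advertise the capability.
template <>
struct CanResize<SlackHeap> { enum { value = 1 }; };

// Compile-time test of the capability.
static_assert(CanResize<SlackHeap>::value == 1, "SlackHeap can resize");
static_assert(CanResize<PlainHeap>::value == 0, "PlainHeap cannot resize");

// Run-time use: the condition is a constant, so for heaps that cannot
// resize the compiler can discard the call entirely.
template <class Heap>
bool TryGrow(Heap& h, void* p, std::size_t newSize) {
    if (!CanResize<Heap>::value) return false;
    return Resize(h, p, newSize);
}
```

Heaps that never mention resizing pay nothing for the feature, which is the sense in which the refinement is costless.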

Range Allocators

    template <size_t S1, size_t S2, class Base>
    class RHeap : public Base {
        void* Alloc(size_t s) {
            static_assert(S1 < S2);
            if (s >= S1 && s < S2) s = S2;
            return Base::AllocImpl(s);
        }
        ...
    };

- Improved speed at the cost of slack memory
- User-controlled tradeoff
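A sketch of the rounding in isolation. RecordingHeap is a hypothetical bottom layer that only records the size it receives and allocates nothing; the range [33, 64) and the helper SizeForwarded are likewise assumptions:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bottom layer: records the requested size, allocates nothing.
struct RecordingHeap {
    std::size_t last = 0;
    void* AllocImpl(std::size_t s) {
        last = s;
        return nullptr;
    }
};

// Requests in [S1, S2) are rounded up to S2 before being forwarded.
template <std::size_t S1, std::size_t S2, class Base>
struct RHeap : Base {
    static_assert(S1 < S2, "range must be non-empty");
    void* AllocImpl(std::size_t s) {
        if (s >= S1 && s < S2) s = S2;
        return Base::AllocImpl(s);
    }
};

// Reports the size that the layer below sees for a given request.
inline std::size_t SizeForwarded(std::size_t request) {
    RHeap<33, 64, RecordingHeap> h;
    h.AllocImpl(request);
    return h.last;
}
```

With a 64-byte free list underneath, every request from 33 to 64 bytes would then be served from that list; the slack is the difference between the request and 64.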

Idea #2 again: Eager computation

    template <size_t S1, size_t S2, size_t S3, class B>
    void* RHeap<S1, S2, RHeap<S2, S3, B> >::Alloc(size_t s) {
        static_assert(S1 < S2 && S2 < S3);
        if (s >= S1 && s < S3) {
            s = s < S2 ? S2 : S3;
        }
        return Base::AllocImpl(s);
    }
    ...
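A sketch of the same idea written as a class template partial specialization. It makes one assumption about the slide: the specialization derives from B directly, so the two adjacent ranges are handled by a single combined test. RecordingHeap, TwoRanges, and SizeForwardedByTwoRanges are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bottom layer: records the requested size, allocates nothing.
struct RecordingHeap {
    std::size_t last = 0;
    void* AllocImpl(std::size_t s) {
        last = s;
        return nullptr;
    }
};

// Primary template: a single range [S1, S2) rounded up to S2.
template <std::size_t S1, std::size_t S2, class Base>
struct RHeap : Base {
    static_assert(S1 < S2, "range must be non-empty");
    void* AllocImpl(std::size_t s) {
        if (s >= S1 && s < S2) s = S2;
        return Base::AllocImpl(s);
    }
};

// Partial specialization chosen when two adjacent ranges are stacked.
// Assumption of this sketch: it inherits from B, not from the inner RHeap.
template <std::size_t S1, std::size_t S2, std::size_t S3, class B>
struct RHeap<S1, S2, RHeap<S2, S3, B> > : B {
    static_assert(S1 < S2 && S2 < S3, "ranges must be increasing");
    void* AllocImpl(std::size_t s) {
        if (s >= S1 && s < S3) {
            s = s < S2 ? S2 : S3;
        }
        return B::AllocImpl(s);
    }
};

typedef RHeap<16, 32, RHeap<32, 64, RecordingHeap> > TwoRanges;

// Reports the size that reaches the bottom layer for a given request.
inline std::size_t SizeForwardedByTwoRanges(std::size_t request) {
    TwoRanges h;
    h.AllocImpl(request);
    return h.last;
}
```

The compiler selects the specialization from the shape of the type alone, so user code keeps writing the nested form.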

Further Building Blocks

- Profiling and debug heaps
- MT heaps
  - Locked
  - Lock-free
- Region-based
  - Alloc bumps a pointer
  - Dealloc doesn’t do a thing
  - Destructor deallocates everything
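The region-based block can be sketched as follows, assuming a single fixed-capacity chunk obtained from malloc. RegionHeap, Used, and UsedAfterDemo are hypothetical names for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical region heap backed by one fixed-size chunk.
class RegionHeap {
    char* begin_;
    char* cur_;
    char* end_;

    // Keeps every block aligned for any fundamental type.
    static std::size_t RoundUp(std::size_t s) {
        const std::size_t a = alignof(std::max_align_t);
        return (s + a - 1) & ~(a - 1);
    }

public:
    explicit RegionHeap(std::size_t capacity)
        : begin_(static_cast<char*>(std::malloc(capacity))),
          cur_(begin_),
          end_(begin_ ? begin_ + capacity : nullptr) {}
    RegionHeap(const RegionHeap&) = delete;
    RegionHeap& operator=(const RegionHeap&) = delete;

    // The destructor deallocates everything at once.
    ~RegionHeap() { std::free(begin_); }

    // Alloc bumps a pointer; returns nullptr when the region is exhausted.
    void* Alloc(std::size_t s) {
        const std::size_t room = static_cast<std::size_t>(end_ - cur_);
        if (s > room) return nullptr;
        s = RoundUp(s);
        if (s > room) return nullptr;
        void* p = cur_;
        cur_ += s;
        return p;
    }

    // Dealloc does nothing.
    void Dealloc(void*) {}

    std::size_t Used() const {
        return static_cast<std::size_t>(cur_ - begin_);
    }
};

// Allocates two small blocks and reports the bytes consumed.
inline std::size_t UsedAfterDemo() {
    RegionHeap r(1024);
    char* a = static_cast<char*>(r.Alloc(10));
    char* b = static_cast<char*>(r.Alloc(10));
    assert(a && b && b > a);
    r.Dealloc(a);  // no effect; memory is reclaimed only by the destructor
    return r.Used();
}
```

The exact byte count depends on the platform's maximum alignment, so the sketch does not state one.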

Performance

- 1%-8% faster than gcc’s ObStack
- 2%-3% slower than the Kingsley allocator
- Between 2% faster and 20% slower than Lea’s allocator
  - Lea: monolithic general-purpose allocator
  - Optimized for 7 years
- Memory consumption similar, within 5%

Conclusions

- Modularity and efficiency are at odds
- Templates offer black-box source, white-box compilation
- A few idioms for efficient, safe code:
  - Policies
  - Eager computation
  - Segregate functionality
  - Costless refinements

Bibliography

Emery Berger et al., “Composing High-Performance Memory Allocators”, PLDI 2001

Yours Truly and Emery Berger, “Policy-Based Memory Allocation”, CUJ Dec 2005