Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Page 1: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Oct 26, 2008 DynAMOS -- KOCSEA '08

Runtime Mutation of Commodity Operating System Kernels
or
Please, No More Rebooting!

Kyung Dong Ryu <[email protected]>
IBM T.J. Watson Research Center

Kristis Makris <[email protected]>
Arizona State University

Page 2: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

IBM Research Worldwide

2700 Researchers in eight labs around the world

Page 3: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Culture of Innovation: External Recognition

5 Nobel Laureates
5 National Medals of Science
8 National Medals of Technology
6 Turing Awards
59 Members in National Academy of Engineering
21 Members in National Academy of Sciences
10 Inductees in National Inventors Hall of Fame
More than 300 Professional Society Fellows
• AAAS • ACM • ACS • APS • AVS • ECS • IEEE • IOP • OSA

[Figure labels: Scanning Tunneling Microscope; Electron Tunneling Effect; High Temperature Superconductivity; Nuclear Magnetic Resonance Techniques (basis for MRI today); High Performance Computing (first woman recipient in the history of this prestigious ACM award); DRAM; SiGe; Silicon-on-Insulator; Copper Chip Technology]

Page 4: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

IBM Patent Leadership – 2006

[Bar chart: patent count per company]

IBM          3,621
Samsung      2,451
Canon        2,366
Matsushita   2,228
HPQ          2,110
Intel        1,959
Sony         1,771
Hitachi      1,731
Toshiba      1,671
Micron       1,610

Page 5: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Overview

Motivation
Dynamic Kernel Updates Categorization
System Architecture
Adaptive Function Cloning
Synchronized Updates
Applications
Conclusion

Page 6: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Motivation

Dynamic kernel updates are essential

Existing updating methods are inadequate

Two approaches
– Build adaptable OS
   Specially crafted (K42, VINO, Synthetix)
   Require OS and application restructuring
– Dynamic code instrumentation
   No kernel source modification (KernInst, GILK)
   Basic block code interposition
   Currently limited
   – No procedure replacement
   – No autonomous kernel adaptability
   – No safe, complete subsystem update guarantees

Page 7: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Dynamic Updates Categorization (1)

Updating variable values
– Update an entry in system call table
– Update owner (uid) of an inode
   Needs synchronized update
– Count number of system calls of a process
   Needs state tracking

Updating datatypes
– Add new fields in Linux PCB for process checkpointing
   Update all functions that use the old datatype, or
   Maintain new fields in separate data structure
   – Does not need state transfer

Page 8: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Dynamic Updates Categorization (2)

Updating single function
– Correct a defect

Updating kernel threads
– Update memory paging subsystem
   Needs update during infinite loop

Updating function groups
– Update pipefs subsystem
   Needs synchronized update

Page 9: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Our Approach

DynAMOS
– Prototype for i386 Linux 2.2-2.6

Dynamic code instrumentation
– No kernel source modification or reboot
– Procedure replacement

Adaptive updates
– Concurrent execution of multiple versions
– State tracking
– Autonomous kernel adaptability

Safe updates of complete subsystems
– Quiescence detection
– Update synchronization (non-quiescent subsystems)
– Datatype updates
– State transfer

Page 10: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

DynAMOS System Architecture

[Diagram labels: update source; gcc; ld; object file; kernel source; make; vmlinux; insert module; new function images; original function images; unmodified kernel in memory]

Page 11: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

DynAMOS System Architecture

Step shown: load DynAMOS

[Diagram labels: DynAMOS kernel module; new function images; original function images; unmodified kernel in memory]

Page 12: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

DynAMOS System Architecture

Step shown: initiate update

[Diagram labels: Update tool; /dev/dynamos; DynAMOS kernel module; version manager; new function images; original function images; unmodified kernel in memory]

Page 13: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

DynAMOS System Architecture

Step shown: prepare update

[Diagram labels: Update tool; /dev/dynamos; DynAMOS kernel module; version manager; disassembler; image relocation; copy; new function images; cloned new function images; original function images; unmodified kernel in memory]

Page 14: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

DynAMOS System Architecture

[Diagram labels: Update tool; /dev/dynamos; DynAMOS kernel module; version manager; new function images; cloned new function images; original function images; unmodified kernel in memory]

Page 15: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

DynAMOS System Architecture

Step shown: activate update redirection

[Diagram labels: Update tool; /dev/dynamos; DynAMOS kernel module; version manager; new function images; cloned new function images; original function images; unmodified kernel in memory]

Page 16: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Execution Flow Redirection: step 1

[Diagram labels: caller (... call schedule ...); schedule]

Apply Linger-Longer scheduler
– Unobtrusive fine-grain cycle stealing
– Implemented in schedule_LL as a scheduling policy

Page 17: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Execution Flow Redirection: step 2

[Diagram labels: caller (... call schedule ...); schedule; trampoline (jmp *); redirection handler]

Trampoline installation
– Disable processor interrupts
– Flush I-cache

Indirect jump
– Don’t modify page permissions
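As a rough illustration of the indirect jump used for the trampoline, the sketch below builds the i386 encoding of `jmp *addr` in an ordinary user-space buffer: opcode bytes FF 25 followed by the 32-bit address of the memory slot that holds the jump target, six bytes in total. The helper name `encode_indirect_jmp` and the buffer handling are assumptions made for this example. The slides do not show DynAMOS's installation code, so disabling interrupts and flushing the I-cache are not modeled.

```c
#include <stdint.h>

#define TRAMPOLINE_LEN 6   /* FF 25 + 4-byte absolute address */

/* Hypothetical helper: writes the bytes of `jmp *[slot_addr]` into buf.
 * slot_addr is the 32-bit address of a slot holding the real target
 * (for example, the address of a redirection handler). */
void encode_indirect_jmp(uint8_t buf[TRAMPOLINE_LEN], uint32_t slot_addr)
{
    buf[0] = 0xFF;                              /* opcode group 5        */
    buf[1] = 0x25;                              /* ModRM: jmp, [disp32]  */
    /* i386 is little-endian: least significant byte first. */
    buf[2] = (uint8_t)(slot_addr & 0xFF);
    buf[3] = (uint8_t)((slot_addr >> 8) & 0xFF);
    buf[4] = (uint8_t)((slot_addr >> 16) & 0xFF);
    buf[5] = (uint8_t)((slot_addr >> 24) & 0xFF);
}
```

Because an indirect jump reads its target from data memory, retargeting it is a data write to the slot rather than a second code patch. Whether that is what the "Don’t modify page permissions" bullet refers to is an assumption.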

Page 18: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Execution Flow Redirection: step 2

[Diagram labels: caller (... call schedule ...); schedule; trampoline; redirection handler; adaptation handler (call, ret)]

Redirection handler
– preserve state
– perform bookkeeping
– execute adaptation handler
– restore state

Bookkeeping
– Maintain use counters

User-defined adaptation handler
– Execute if available
– Select active version of function

Page 19: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Execution Flow Redirection: step 3

[Diagram labels: caller (... call schedule ...); schedule; trampoline; redirection handler (jmp *); adaptation handler; schedule_clone; schedule_LL_clone]

Redirection handler: jump to active function (schedule_clone or schedule_LL_clone)

Page 20: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Execution Flow Redirection: step 4

[Diagram labels: caller (... call schedule ...); schedule; trampoline; redirection handler (jmp *); adaptation handler; schedule_clone; schedule_LL_clone]

Redirection handler: jump to active function
schedule_clone, schedule_LL_clone: jump back

Page 21: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Execution Flow Redirection: step 5

[Diagram labels: caller (... call schedule ...); schedule; trampoline; redirection handler; adaptation handler; schedule_clone; schedule_LL_clone]

Redirection handler: jump to active function
schedule_clone, schedule_LL_clone: jump back

Redirection handler, after the jump back
– preserve state
– perform bookkeeping
– restore state
– ret (return to caller)
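The five redirection steps can be approximated in plain C with a function-pointer table. This is only a user-space sketch under stated assumptions: the real mechanism uses jumps and leaves the stack untouched, whereas this version uses ordinary calls. The names `struct redirect`, `redirect_call`, `original_version`, `updated_version` and `select_updated` are hypothetical and do not come from DynAMOS.

```c
#include <stddef.h>

#define MAX_VERSIONS 4            /* hypothetical cap for this sketch */

typedef int (*func_t)(int);

struct redirect {
    func_t versions[MAX_VERSIONS];    /* cloned function images          */
    size_t active;                    /* index of the active version     */
    long   use_count;                 /* callers currently inside        */
    void (*adapt)(struct redirect *); /* optional adaptation handler     */
};

/* Stand-in for the redirection handler: bookkeeping, adaptation,
 * dispatch to the active version, bookkeeping again on the way back. */
int redirect_call(struct redirect *r, int arg)
{
    int ret;

    r->use_count++;                   /* bookkeeping on entry  */
    if (r->adapt != NULL)
        r->adapt(r);                  /* may change r->active  */
    ret = r->versions[r->active](arg);
    r->use_count--;                   /* bookkeeping on exit   */
    return ret;
}

/* Two hypothetical versions of the same function. */
int original_version(int x) { return x + 1; }
int updated_version(int x)  { return x + 2; }

/* Hypothetical adaptation handler: always select the updated version. */
void select_updated(struct redirect *r) { r->active = 1; }
```

With no adaptation handler installed, calls reach version 0. Once `select_updated` is installed, the next call is dispatched to version 1, and no caller needs to change.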

Page 22: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Adaptive Function Cloning Benefits

No processor state saved on stack
– Function arguments accessed directly

Autonomous kernel determination of update timeliness
– Using adaptation handler

Function-level updates
– Basic blocks can be bypassed (no control-flow graph needed)
– Function modifications developed in original source language

Page 23: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Function Relocation Issues

Replace ret (1-byte) with jmp * (6-byte) back to handler
– Adjust inbound (jmp) and outbound (call) relative offsets

Safely detect
– Backward branches: jmp to code overwritten by trampoline
– Outbound branches: jmp to code outside function image
– Indirect outbound branches: jmp * from indirection table
– Data-in-code

Need user verification
– Multiple entry-points: e.g. produced by Intel C Compiler
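To make the offset adjustment concrete, here is a small sketch of the arithmetic for one case: a rel32 branch whose target lies outside the function image, copied to a new address without changing its length. The function names are hypothetical. Inbound branches, whose offsets shift because replacing a 1-byte ret with a 6-byte jmp * changes the layout, are not covered.

```c
#include <stdint.h>

/* Absolute target of a rel32 branch: the displacement is relative
 * to the address of the instruction that follows the branch. */
uint32_t branch_target(uint32_t insn_addr, uint32_t insn_len, int32_t disp)
{
    return insn_addr + insn_len + (uint32_t)disp;
}

/* New displacement for a branch copied from old_addr to new_addr.
 * The target is outside the moved image, so it stays where it was;
 * the displacement must absorb the distance the instruction moved. */
int32_t adjust_outbound_rel32(int32_t old_disp,
                              uint32_t old_addr, uint32_t new_addr)
{
    return (int32_t)((uint32_t)old_disp + old_addr - new_addr);
}
```

For example, a 5-byte call at 0x1000 with displacement 0x100 targets 0x1105. After the instruction is copied to 0x2000, the adjusted displacement is -0xF00, which still resolves to 0x1105.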

Page 24: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Overhead

Small memory footprint (42k)

Indirect addressing (jmp *) hurts branch prediction
– Can use direct addressing (jmp)
– Overhead not correlated to path length
– Mostly 1-8%

Page 25: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Quiescence Detection

Needed to
– Atomically update function groups
   e.g. Count number of processes using a filesystem
– Safely reverse updates

Implemented by
– Usage counters
   On entry and exit
– Stack walk-through
   For non-returning calls (do_exit in Linux; no ret instruction)
   Examine stack and program counter of all processes
   Default kernel compilation (works without frame pointers)
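The usage-counter half of quiescence detection can be sketched as follows. This is a single-threaded illustration with hypothetical names (`struct tracked_func`, `func_enter`, `func_leave`, `group_is_quiescent`). A kernel implementation would need atomic counters, and the stack walk-through for non-returning calls is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

struct tracked_func {
    const char *name;
    long use_count;     /* incremented on entry, decremented on exit */
};

void func_enter(struct tracked_func *f) { f->use_count++; }
void func_leave(struct tracked_func *f) { f->use_count--; }

/* A function group is quiescent when no function in it is in use. */
bool group_is_quiescent(const struct tracked_func *group, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (group[i].use_count != 0)
            return false;
    }
    return true;
}
```

An updater would apply a group update only while `group_is_quiescent` returns true. A function that never returns, such as do_exit, would hold its counter above zero forever, so counters alone cannot cover it. The slides handle that case with the stack walk-through.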

Page 26: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Non-quiescent Subsystems

Adaptively enlarge pipefs 4k copy buffer during large data transfers

pipe_read()
{
    ...
    acquire Sem
    while (buffer_empty) {
        ...
        release Sem
L1:     sleep                   (wait for new data in buffer)
        acquire Sem
    }
    read from data buffer
    release Sem
    return
}

pipe_write()
{
    ...
    acquire Sem
    while (buffer_full) {
        ...
        release Sem
L2:     sleep                   (wait for more room in buffer)
        acquire Sem
    }
    write in data buffer
    release Sem
    return
}

reader and writer are synchronized with each other

Page 27: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Non-quiescent Subsystems

[Same pipe_read() and pipe_write() listing as on Page 26, with code regions marked by the legend: quiescent; non-quiescent, sleeping (at L1 and L2)]

subsystem may never quiesce
cannot update atomically

Page 28: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Synchronized update of pipefs

Phase 1

pipe_read() {
    acquire Sem
    while (4k_buffer_empty) {
        release Sem
L1:     sleep
        acquire Sem
    }
    read data from 4k_buffer
    release Sem
    return
}

pipe_read_v3() {
    acquire Sem
    while (1mb_buffer_empty) {
        release Sem
L1:     sleep
        acquire Sem
    }
    read data from 1mb_buffer
    release Sem
    return
}

Page 29: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Synchronized update of pipefs

Phase 2

[pipe_read() and pipe_read_v3() are unchanged from Page 28]

Semantically equivalent version at source code level
Wait for pipe_read to become inactive

pipe_read_v2() {
    acquire Sem
    while (4k_buffer_empty) {
        release Sem
L1:     sleep
        acquire Sem
        if (must_update) {
            phase = 3
            STATE TRANSFER
            goto new
        }
    }
    read data from 4k_buffer
    release Sem
    return
new:
}

Page 30: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Synchronized update of pipefs

Phase 2

[pipe_read() and pipe_read_v3() are unchanged from Page 28]

Inline updated version

pipe_read_v2() {
    acquire Sem
    while (4k_buffer_empty) {
        release Sem
L1:     sleep
        acquire Sem
        if (must_update) {
            phase = 3
            STATE TRANSFER
            goto new
        }
    }
    read data from 4k_buffer
    release Sem
    return

    while (1mb_buffer_empty) {
        release Sem
        sleep
        acquire Sem
new:
    }
    read data from 1mb_buffer
    release Sem
    return
}

Page 31: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Synchronized update of pipefs

Phase 3

[pipe_read(), pipe_read_v2() and pipe_read_v3() are unchanged from Page 30]

Page 32: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Synchronized update of pipefs

Phase 3

[pipe_read_v2() and pipe_read_v3() are unchanged from Page 30]

pipe_read_adaptation_handler() {
    if (phase == 3)
        activate pipe_read_v3
    else
        activate pipe_read_v2
    if (this process read more than 64k)
        must_update = 1
}

Sleep in original version, awake in new version
Multi-phase approach
Adaptive update
30-90% improvement in Linux 2.6
3.2% overhead when not adapting
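The adaptation handler pseudocode for pipefs translates fairly directly into C. The sketch below is one possible rendering under stated assumptions: `struct pipe_update_state`, the `enum pipe_version` values and the `bytes_read_by_process` parameter are hypothetical stand-ins for whatever bookkeeping DynAMOS keeps, and "activate" is modeled as returning the chosen version.

```c
#include <stddef.h>

enum pipe_version { PIPE_READ_V2, PIPE_READ_V3 };

struct pipe_update_state {
    int phase;        /* set to 3 by pipe_read_v2 when it transfers state */
    int must_update;  /* asks sleeping v2 readers to switch when they wake */
};

#define LARGE_TRANSFER_BYTES (64 * 1024)   /* the 64k threshold */

/* Chooses which version new callers enter, and requests the switch
 * once this process has read more than the threshold. */
enum pipe_version
pipe_read_adaptation_handler(struct pipe_update_state *s,
                             size_t bytes_read_by_process)
{
    enum pipe_version active =
        (s->phase == 3) ? PIPE_READ_V3 : PIPE_READ_V2;

    if (bytes_read_by_process > LARGE_TRANSFER_BYTES)
        s->must_update = 1;
    return active;
}
```

The handler itself never changes `phase`. In the pseudocode that happens inside pipe_read_v2, after a sleeping reader wakes up and sees `must_update` set.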

Page 33: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Adaptive Memory Paging For Efficient Gang Scheduling

Kernel thread update (kswapd), Linux 2.2
– Infinite loop
– Awakened by other subsystems
– Goes back to sleep
   e.g. calls interruptible_sleep_on in Linux

To update
– Activate interruptible_sleep_on_v2
   Save state, exit
   Start new version of kernel thread, restore state

Page 34: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Kernel-Assisted Process Checkpointing

Datatype update for EPCKPT in Linux 2.4
– Compact datatypes in commodity kernel. No extra room
   struct task_struct: semaphores, pipes, memory mapped files
   struct file: checkpoint filename

Shadow data structures
– Instantiation (do_fork, sys_open): map memory address of original variable to shadow using hash table
– Removal (do_exit, fput): free shadow too
– Already instantiated variables
   Shadow missing: idempotent use of new fields
– Update only functions that use new fields
   No state transfer needed
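A minimal user-space sketch of the shadow data structure idea: a hash table keyed by the address of the original object, holding the new fields that do not fit in the original datatype. All names (`struct shadow`, `shadow_attach`, `shadow_lookup`, `shadow_release`), the bucket count and the single `ckpt_filename` field are assumptions for illustration, not the EPCKPT or DynAMOS implementation.

```c
#include <stdint.h>
#include <stdlib.h>

#define SHADOW_BUCKETS 64

struct shadow {
    const void *orig;          /* address of the original object */
    char ckpt_filename[64];    /* example of a "new field"       */
    struct shadow *next;       /* chain within one bucket        */
};

static struct shadow *buckets[SHADOW_BUCKETS];

static size_t bucket_of(const void *orig)
{
    return (size_t)(((uintptr_t)orig >> 4) % SHADOW_BUCKETS);
}

/* Returns the shadow for orig, or NULL if none was ever attached
 * (for example, an object created before the update was applied). */
struct shadow *shadow_lookup(const void *orig)
{
    struct shadow *s;
    for (s = buckets[bucket_of(orig)]; s != NULL; s = s->next)
        if (s->orig == orig)
            return s;
    return NULL;
}

/* Called where the original object is instantiated. */
struct shadow *shadow_attach(const void *orig)
{
    struct shadow *s = shadow_lookup(orig);
    if (s != NULL)
        return s;
    s = calloc(1, sizeof *s);
    if (s == NULL)
        return NULL;
    s->orig = orig;
    s->next = buckets[bucket_of(orig)];
    buckets[bucket_of(orig)] = s;
    return s;
}

/* Called where the original object is freed: free the shadow too. */
void shadow_release(const void *orig)
{
    struct shadow **pp = &buckets[bucket_of(orig)];
    while (*pp != NULL) {
        if ((*pp)->orig == orig) {
            struct shadow *dead = *pp;
            *pp = dead->next;
            free(dead);
            return;
        }
        pp = &(*pp)->next;
    }
}
```

Updated functions would call `shadow_lookup` and treat a NULL result as "new fields not set". That is one way to read the "Shadow missing: idempotent use of new fields" bullet for objects that already existed when the update was applied.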

Page 35: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Related Work

K42
– Specially designed with hot-swappable capabilities
– Guarantees quiescence

Ginseng
– User-level software updates; requires recompilation

KernInst, GILK, Detours, ATOM, EEL
– Do not facilitate adaptive execution
– Do not safely replace complete subsystems

Page 36: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

On-going and Future Work

Automatically produce updates given a patch
– Apply MOSIX, Superpages: parallel applications
– Apply Nooks: OS reliability
– Upgrade Linux kernel

Multiprocessor support
– Safely install trampoline: freeze other processors using single-byte trap instruction (ud2)

Kernel module port
– FreeBSD, OpenSolaris

Page 37: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Conclusion

Dynamic Kernel Updates
– Dynamic code instrumentation
– Commodity operating system (prototype for i386 Linux 2.2-2.6)

Adaptive function cloning
– Concurrent execution of multiple function versions

Safe updates of non-quiescent subsystems
– Scheduler, kernel threads, synchronized updates

Datatype updates

Demonstrated updates
– Synchronized pipefs adaptation, process checkpointing, adaptive memory paging for efficient gang-scheduling, unobtrusive fine-grain cycle stealing, public security fixes

Small memory footprint (42k), 1-8% overhead

Page 38: Runtime Mutation of Commodity Operating System Kernels or Please, No More Rebooting!

Questions?