Introduction to Real-Time Operating Systems
Date posted: 20-Oct-2014
Category: Technology
Objectives
- Understanding real-time operating systems
  - Types of real-time operating systems
  - Requirements for a real-time operating system
  - Differences between a general-purpose operating system (GPOS) and a real-time operating system (RTOS)
Objectives
- Converting the Linux kernel to support real-time operation
  - Patching the Linux kernel
  - Major changes in the patched kernel
- Hands-on labs
  - Converting the Linux kernel to support real time
  - Coding a real-time application (audio feedback removal)
Introduction
- Operating system (OS)
  - Abstraction layer over the raw hardware
  - Manages the hardware
  - Multiplexes the resources using scheduling policies
- Types of operating systems
  - General-purpose OS (GPOS)
    - Optimized for throughput and fairness
  - Real-time OS (RTOS)
    - Optimized for handling tasks within timing limits
Types of RTOS
- Hard RTOS
  - Strict timing requirements
  - Optimized for predictability and timing limits
  - Does not share critical resources such as the CPU
  - Missing a single deadline causes system failure
- Examples of hard real-time applications
  - Defense applications
  - Fuel ignition controller for an automobile engine
  - Life-supporting medical equipment
  - Industrial control systems
Types of RTOS
- Soft RTOS
  - Relatively tolerant timing deadlines
  - The tolerance is defined as part of the policy
  - Occasionally missing a deadline may not cause system failure
- Examples
  - Audio processing
  - Real-time video processing
  - Network interface subsystem
GPOS and RTOS comparison
- Scheduling
- Locking mechanism
- Hardware and software support
Scheduling
- Makes multitasking feasible
- Shares resources between tasks
- Follows a scheduling policy
- Task priority
  - A number assigned to a task according to its importance
  - A higher-priority task is preferred over a lower-priority task
Scheduling
- Preemption
  - Shift of resource control from a low-priority task to a high-priority task
  - Responsibility of the kernel
- Preemption limitations
  - A code segment cannot be preempted while it holds a lock that was acquired with interrupts disabled
  - A process can be preempted only after a specified timer expiration
  - Tradeoff between coarse-grained and fine-grained timers
Scheduling
[Figure: preemption process]
Scheduling
- Yielding
  - The process itself gives up the resources
  - Responsibility of the process
  - The program designer has to take care of yielding
- Disadvantages
  - Available resources cannot be used efficiently
  - Real-time systems cannot rely on processes voluntarily giving up the CPU
- Priority inversion
  - A low-priority task might cause a high-priority task to wait for execution
Kernel Locks
- Locks
  - Prevent a critical code section from being accessed by two active threads simultaneously
  - Especially important on multicore machines
- Effect of locks on real-time response
  - Unpredictable delays for waiting threads
  - Might cause priority inversion
  - Some types of lock disable preemption
Summary of comparison
- RTOS
  - Unfair scheduling
  - Scheduling based on priority
  - Kernel is preemptive, either completely or to the maximum possible degree
  - Priority inversion is a major issue
  - Predictable behavior
- GPOS
  - Fair scheduling
  - Scheduling can be adjusted dynamically for optimized throughput
  - Kernel is non-preemptive or has long non-preemptive code sections
  - Priority inversion usually remains unnoticed
  - No predictability guarantees
Limitations for RT applications
- System management interrupts
- DMA bus mastering
- On-demand CPU frequency scaling
- VGA text console
- Page faults
- Context switching
Limitations for RT applications
- System Management Mode (SMM)
  - A system management interrupt (SMI) causes the system to enter SMM
  - Debugs the hardware
  - Protects the system
    - Shuts down the system if the CPU temperature exceeds a threshold
  - Emulates hardware
    - Presents a USB keyboard/mouse as a PS/2 keyboard/mouse
  - The CPU jumps to a hardwired memory location to service the SMI
  - The SMI has the highest priority
Limitations for RT applications
- System Management Mode (SMM)
  - The OS has no way to preempt the SMI handler
  - Causes unacceptable delays for RT applications
  - Delays range from tens to hundreds of microseconds
- DMA bus mastering
  - Caused by devices using DMA
    - SATA/PATA/SCSI devices, network adapters, video cards, etc.
  - Device drivers can increase latency
  - Common issue for all RTOSes
Limitations for RT applications
- On-demand CPU frequency scaling
  - The CPU is put into a low-power state after a period of inactivity
  - Can cause unpredictable or longer delays
Limitations for RT applications
- Page faults
  - Occur when the requested data is not in main memory or is not yet mapped in the process's page tables
  - Types of page faults
    - Major (hard) page faults
    - Minor (soft) page faults
- Major page faults
  - Occur when the requested data has to be fetched from disk
  - Can cause very large latencies
  - RT applications should be written to minimize major page faults
Limitations for RT applications
- Page faults
  - Minor page faults
    - Occur when the requested data resides in main memory but is not yet mapped in the process's page tables
    - Do not involve an I/O operation to fetch data from disk
    - Have a negligible impact on RT performance
  - Tips to avoid page faults
    - Use mlockall() to load the program's entire address space into memory and lock it so that it cannot be swapped out
Limitations for RT applications
- Page faults
  - Tips to avoid page faults
    - Create all threads at application startup, because creating threads at run time can cause latencies
    - Avoid dynamic allocation and freeing of memory
    - Minimize the use of system calls that are known to generate page faults
Limitations for RT applications
- Context switching
  - Flushes the pipeline and branch prediction counters
  - Can invalidate caches
  - Changes the entries in the instruction and data TLBs
  - Hence context switching might cause unacceptable behavior for some real-time applications
Conversion of GPOS to RTOS
- RT support in the stock Linux kernel
  - The stock Linux kernel supports soft RT response
  - Two RT scheduling policies in the stock Linux kernel
    - SCHED_FIFO
    - SCHED_RR
  - Non-RT scheduling policy
    - SCHED_NORMAL
  - RT scheduling uses static priorities
  - sched_setscheduler() can be used to manage scheduling policies
Conversion of GPOS to RTOS
- RT support in the stock Linux kernel
  - Scheduling policies are defined in /include/linux/sched.h

      /* Scheduling policies */
      #define SCHED_NORMAL 0
      #define SCHED_FIFO 1
      #define SCHED_RR 2
Conversion of GPOS to RTOS
- SCHED_FIFO
  - First-in-first-out scheduling policy
  - Preempts all SCHED_NORMAL tasks
  - A SCHED_FIFO task cannot be preempted by tasks of equal or lower priority; only a higher-priority RT task can preempt it
  - The task itself can yield the resources
  - Scheduling is not based on time slots
  - Two SCHED_FIFO processes of the same priority are scheduled first come, first served
Conversion of GPOS to RTOS
- SCHED_RR
  - Same as SCHED_FIFO but involves time slots
  - Processes of the same priority are scheduled in round-robin fashion
    - They share resources in allocated time slots
  - A lower-priority task cannot preempt a higher-priority task
  - A SCHED_RR process preempts a SCHED_FIFO process only if it has a higher priority
Conversion of GPOS to RTOS
- RT patch for the stock Linux kernel
  - CONFIG_PREEMPT_RT
  - Adds hard RT support to the kernel
  - Managed by Ingo Molnar
  - The patch is under constant development
  - A patch matching the version of the chosen kernel must be selected
Conversion of GPOS to RTOS
- Applying the patch
  - Download the real-time patch
      http://www.kernel.org/pub/linux/kernel/projects/rt/
  - Download the kernel
      http://www.kernel.org
  - The kernel and the patch must be the same version
  - Commands in these slides are for version 2.6.33.7
Conversion of GPOS to RTOS
- Applying the patch
  - Place both the kernel and the patch in the same directory
  - Unpack the kernel

      $ tar -jxf linux-2.6.33.7.tar.bz2

  - Change the directory

      $ cd linux-2.6.33.7
Conversion of GPOS to RTOS
- Applying the patch
  - Dry-run the patch

      $ bzcat ../patch-2.6.33.7-rt29.bz2 | patch --dry-run -p1

  - For a correct patch, the output should look like this

      patching file Documentation/hwlat_detector.txt
      patching file Documentation/trace/histograms.txt
      patching file MAINTAINERS
      patching file Makefile
      patching file arch/Kconfig
      patching file arch/alpha/include/asm/rwsem.h
      patching file arch/alpha/kernel/time.c
      patching file arch/arm/boot/compressed/Makefile
      ...
Conversion of GPOS to RTOS
- Applying the patch
  - In case of an error, check the versions of the downloaded kernel and patch
  - Apply the patch if the dry run succeeds

      $ bzcat ../patch-2.6.33.7-rt29.bz2 | patch -p1

  - For an uncompressed patch, the following commands can be used from the unpacked kernel directory

      $ patch --dry-run -p1 < /path to uncompressed patch/patch-2.6.33.7-rt29
      $ patch -p1 < /path to uncompressed patch/patch-2.6.33.7-rt29
Conversion of GPOS to RTOS
- Configuration of the Linux kernel
  - Open the configuration dialog

      $ make menuconfig

  - The xconfig or gconfig targets can also be used
  - Apply the following changes
    - Activate High Resolution Timer
    - Enable Complete Preemption (Real-Time)
    - Apply power management settings according to the hardware
Conversion of GPOS to RTOS
- Configuration of the Linux kernel
  - Activate High Resolution Timer
Conversion of GPOS to RTOS
- Configuration of the Linux kernel
  - Enable Complete Preemption (Real-Time)
Conversion of GPOS to RTOS
- Configuration of the Linux kernel
  - Power management configuration
    - Some power management options can raise SMIs
    - Disabling some options can cause system damage
    - Read every option carefully
    - Disable optional or unnecessary options
    - Disable CPU frequency scaling
    - Disable USB mouse/keyboard support in the BIOS
    - Use a PS/2 keyboard/mouse
    - Disable TCO timers
Conversion of GPOS to RTOS
- Configuration of the Linux kernel
  - Disable CPU frequency scaling (Power Management and ACPI Options -> CPU Frequency scaling)
Conversion of GPOS to RTOS
- Configuration of the Linux kernel
  - Disable TCO timers (Device Drivers -> Watchdog Timer Support)
Conversion of GPOS to RTOS
- Building the Linux kernel
  - Make the kernel

      $ make

  - Make the modules

      $ make modules

  - Install the modules (as root)

      # make modules_install

  - Install the kernel (as root)

      # make install
Real-Time patch effect on kernel
- Locking mechanism
- Priority inheritance
- Interrupt handlers replaced with kernel threads
- High resolution timers
Locking mechanism
- Locks
  - Prevent shared resources from being accessed by two active processes simultaneously
  - Simultaneous access to unprotected shared resources causes races and Heisenbugs
- Types of locks
  - Spinlocks
  - Semaphores
  - Readers/writer locks
Locking mechanism
- Spinlocks
  - Waiting threads keep trying until the lock is acquired
  - Waiting threads do not sleep
  - Local interrupts are disabled while a spinlock is held, to avoid deadlocks
  - Used in interrupt handlers
  - The lock should be held only for a short period of time
    - Interrupts are disabled
    - Waiting threads do not sleep and consume CPU resources
Locking mechanism
- Spinlocks
  - Acquiring a spinlock: raw_spin_lock() (/include/linux/spinlock_api_smp.h)

      preempt_disable();
      spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
      LOCK_CONTENDED(lock, do_raw_spin_trylock, do_raw_spin_lock);

  - Releasing a spinlock: raw_spin_unlock() (/include/linux/spinlock_api_smp.h)

      spin_release(&lock->dep_map, 1, _RET_IP_);
      do_raw_spin_unlock(lock);
      preempt_enable();
Locking mechanism
- Spinlocks
  - Advantages of spinlocks
    - Low locking overhead
    - No time is spent putting waiting threads to sleep and waking them up
  - Disadvantages of spinlocks
    - Interrupts are disabled
    - Waiting threads consume CPU time
    - The lock has to be released as early as possible
    - A thread holding a spinlock cannot sleep
Locking mechanism
- Semaphores
  - Sleeping locks
  - Waiting threads sleep in a wait queue
  - A waiting task has to be woken up before it can acquire the lock
- Advantages
  - A waiting task frees the CPU for other workloads
  - The lock can be held for a long time
- Disadvantages
  - Overhead of putting tasks to sleep and waking them up
  - A thread holding a spinlock cannot acquire a semaphore
Locking mechanism
- Spinlocks for RT applications: a potential problem
  - Unpredictable in nature
    - The time a waiting thread needs to acquire the lock depends on the thread that currently holds it
    - The OS cannot force the lock holder to release the lock
  - Non-preemptive sections of the Linux kernel
    - Spinlocks disable local interrupts
Locking mechanism
- Solution for RT behavior
  - Replace spinlocks with mutexes where possible
- Advantages
  - A mutex is a binary semaphore
  - Deterministic in nature
  - A thread holding a mutex can be preempted by a higher-priority task
Locking mechanism
- Spinlocks are unavoidable at a low level
  - Example: the implementation of the mutex itself
- Spinlock usage in the RT kernel is only a fraction of that in the stock kernel
- Reduced usage lowers the probability that two different threads contend for the same spinlock
Locking mechanism
- Real-time implementation of spinlocks
  - rt_spin_lock()

      rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock);
      spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);

  - rt_spin_lock_fastlock()

      if (likely(rt_mutex_cmpxchg(lock, NULL, current)))
          rt_mutex_deadlock_account_lock(lock, current);
      else
          slowfn(lock);
Locking mechanism
- Real-time implementation of spinlocks
  - rt_spin_unlock()

      spin_release(&lock->dep_map, 1, _RET_IP_);
      rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock);

  - rt_spin_lock_fastunlock()

      if (likely(rt_mutex_cmpxchg(lock, current, NULL)))
          rt_mutex_deadlock_account_unlock(current);
      else
          slowfn(lock);
Priority inheritance
- Priority inversion
  - [Figure: tasks L and H require the same lock, but M requires another lock]
  - Source: http://book.opensourceproject.org.cn/embedded/oreillyembed/opensource/0596009836/id-i_0596009836_chp_10_sect_4.html
Priority inheritance
- Priority inversion
  - Priority inversion may not be noticeable in a GPOS
  - High-priority tasks may be starved of resources in a GPOS because of priority inversion
  - A high-priority task should be executed as early as possible
Priority inversion
- Solution to priority inversion
  - Priority inheritance (PI)
  - [Figure: priority inversion compared with priority inheritance]
  - Source: http://www.linuxjournal.com/article/9361?page=0,3
Priority inversion
- Priority inversion with non-mutual-exclusion locks
  - PI works fine for mutual exclusion locks
  - Problematic for other types of locks
    - Example: readers/writer locks
    - PI can cause unacceptable response times
  - Solution
    - Only one task at a time can read
    - Use read-copy-update if necessary
Priority inversion
- Read-copy-update
  - Technique for sharing read/write resources
  - Low-cost reading
    - Reading is carried out from a local copy of the data
    - Does not involve conventional locking for reading
  - High-cost writing
    - Updates a global pointer to the new data
    - Keeps a copy of the old data until all reading processes have finished
  - Suitable for situations with few writes and many reads
Threaded interrupt handler
- Interrupt handler
  - Function that runs whenever a particular interrupt occurs
  - Two parts
    - Top half
    - Bottom half
  - Divided into two parts to reduce the time spent with interrupts disabled
Threaded interrupt handler
- Top half
  - Performs essential services for the hardware
  - Works with interrupts disabled
  - Gathers raw data
  - Performs critical, time-sensitive work
- Bottom half
  - Performs time-consuming tasks
  - Processes raw data
Threaded interrupt handler
- Interrupt handler in GPOS
Threaded interrupt handler
- Interrupt handler in GPOS
  - The kernel identifies the interrupt
  - Calls the interrupt handler
    - The CPU jumps to a particular location
    - Executes the top half
    - Raises a softirq and returns to the pre-interrupt position
  - This behavior is not acceptable in an RTOS
    - Causes unpredictability and long latencies
    - Local interrupts are disabled for a longer time
Threaded interrupt handler
- Interrupt handler in RTOS
  - The interrupt handler runs as a kernel thread
  - Source: Building Embedded Linux Systems, ISBN-10 0596529686, chapter 14, section "Interrupts as threads"
Threaded interrupt handler
- Interrupt handler in RTOS
  - The top half identifies the interrupt
  - Sends an acknowledgement
  - Initializes the kernel thread
  - The rest of the work is performed in the kernel thread
- A few interrupts are still handled conventionally
- Threaded interrupt handlers are also available in the stock Linux kernel
Threaded interrupt handler
- Advantages of converting the handler to a kernel thread
  - The interrupt handler becomes preemptible
  - Reduces the time for which interrupts are blocked
  - Reduces the probability of priority inversion
  - Handler scheduling can be controlled by assigning the thread a priority
  - Reduces unpredictable latencies
Threaded interrupt handler
- Kernel code snippet
  - request_threaded_irq() registers a threaded interrupt handler

      int request_threaded_irq(unsigned int irq, irq_handler_t handler,
                               irq_handler_t thread_fn, unsigned long irqflags,
                               const char *devname, void *dev_id);
Threaded interrupt handler
- request_threaded_irq() code snippet
  - Initializes pointers and calls __setup_irq()

      action->handler = handler;
      action->thread_fn = thread_fn;
      action->flags = irqflags;
      action->name = devname;
      action->dev_id = dev_id;
      chip_bus_lock(irq, desc);
      retval = __setup_irq(irq, desc, action);
      chip_bus_sync_unlock(irq, desc);
Threaded interrupt handler
- __setup_irq() code snippet

      /* Preempt-RT setup for forced threading */
      preempt_hardirq_setup(new);

- preempt_hardirq_setup() code snippet

      new->flags |= IRQF_ONESHOT;
      new->thread_fn = new->handler;
      new->handler = irq_default_primary_handler;
Threaded interrupt handler
- irq_default_primary_handler() code snippet
  - This function wakes the interrupt handler thread

      static irqreturn_t irq_default_primary_handler(int irq, void *dev_id)
      {
          return IRQ_WAKE_THREAD;
      }
High resolution timers
- A timer triggers an event at a specified time
- The shorter the tick of the timer clock, the higher the timer resolution
- Jiffy: the software ticking unit
  - Timer granularity is the reciprocal of the jiffy rate (ticks per second)
  - Increasing the jiffy rate
    - Improves the resolution
    - Increases the timer overhead
  - Timer overhead increases because the jiffy counter has to be updated more frequently
High resolution timers
- Improving the timer resolution
  - Shift the ticking source from the jiffy to the hardware clock
- Advantages
  - Improves the granularity of the clock
  - Reduces the timer overhead of updating a software entity
  - Frees the resources that handle the jiffy
High resolution timers
- Timer wheel
  - The timer is modeled as a wheel and delays as buckets in the wheel
  - The first set of buckets covers delays of up to 256 units
    - If an event is to be triggered after 10 units, it is placed in the 10th bucket
  - The next level of buckets is used for delays greater than 256 units
    - Each bucket in the second level represents a 256-unit delay
  - An event is placed in the next layer of buckets if its delay is greater than 65536 (256 × 256) units
High resolution timers
- Issues with the timer wheel implementation in an RTOS
  - Unpredictable behavior
  - Rehashing is required for each shift from the second level to the first level
  - Rehashing time is a function of the number of events in the wheel
  - Delays increase as the number of events increases (rehashing takes O(n), where n is the number of events in the wheel)
High resolution timers
- High resolution (HR) timers for the real-time kernel
  - Introduced by Thomas Gleixner
  - Timers are divided into two types
    - Action timers
    - Timeout timers
- Action timers
  - Measure elapsed time using timestamps
- Timeout timers
  - Trigger an event after a specific time
High resolution timers
- Action timers in the timer wheel
  - Rehashing is required for each event entered in the wheel
  - Rehashing takes O(n) time
  - Inserting or removing an entry takes O(1) time
  - The rehashing delay is not constant, so it is unpredictable and might cause high latency
High resolution timers
- Timeout timers in the wheel
  - Do not require rehashing
  - Operations take O(1) time
  - Efficient and predictable
- HR timers mainly solve the action timer problems
High resolution timers
- HR timers
  - Implement timers in red-black trees
  - Do not involve a hash data structure
- Advantages
  - Adding or removing a node takes O(log n)
  - Nodes in the tree are already sorted
  - Performance improves from O(n) to O(log n), where n is usually a large value
Examples of RT application
- A time bound is the basic characteristic of an RT application
- Examples
  - Control systems in industry
  - Position and speed control systems for synthetic aperture radars (SARs)
  - Operating system for an ebook reader, e.g. Kindle
- Audio feedback cancellation is selected for detailed discussion
Audio feedback cancellation
- The output signal is fed back to the input and amplified again
- The amplified signal includes both the input and the feedback, which results in noise and suppresses the original signal
- Feedback is usually undesirable and has to be eliminated while it is still building up
Audio feedback cancellation
- Audio feedback block diagram representation
- Condition for audio feedback
  - $|H(f)\,H_f(f)| \geq 1$
  - $\angle\big(H(f)\,H_f(f)\big) = n \cdot 360^\circ$, where $n$ is an integer
Audio feedback cancellation
- Factors affecting feedback
  - Distance
  - Amplification
  - Non-constant gain of the amplifier
  - Feed-forward path
  - Feedback path
  - Transfer function of the surrounding environment
Audio feedback cancellation
- Audio feedback causes the system to oscillate at a single frequency
- Feedback starts with a number of frequencies
- The frequency with the highest magnitude grows fastest
  - This frequency consumes more power
  - It causes the other feedback frequencies to die out
Audio feedback cancellation
[Figure: frequency spectrum for a voice sample]
Audio feedback cancellation
[Figure: frequency spectrum for a voice sample with feedback]
Audio feedback cancellation
- Methods of audio feedback cancellation
  - Two major categories
    - Manual feedback cancellation
    - Automatic feedback cancellation
  - Manual feedback cancellation methods are useful in static environments
  - In dynamic environments, automatic feedback cancellation methods are preferred
Audio feedback cancellation
- Manual feedback cancellation
  - Move the microphone
    - Changes the phase of the signal fed back to the input
    - Reduces the magnitude of the signal if the microphone is moved away
  - Tune the sound source (musical instruments) according to the environment
  - Attenuators for the feedback frequencies can be used if those frequencies are known
Audio feedback cancellation
- Automatic feedback cancellation
  - Two major categories of automatic feedback cancellation
    - Automatic equalization
    - Frequency shifting
Audio feedback cancellation
- Automatic equalization
  - The signal spectrum is observed all the time
  - Adaptive filters are used to suppress the frequency buildup
  - If feedback causes a particular frequency to increase, the adaptive filters adapt themselves to suppress that frequency
  - The filter response is critical
Audio feedback cancellation
- Frequency shifting
  - Shifts the input signal by a small frequency
  - Breaks the feedback path for a particular frequency
  - Very effective method for removing feedback
  - Disturbs the harmonic relationship between the signal components
  - The frequency shift should be selected with care
Audio feedback cancellation
- Frequency shifting
  - Source: http://www.alango.com/contents/products/technologies/afr/papers/alango_afc.pdf
Audio feedback cancellation
- Implemented feedback removal technique
  - Human voice is highly uncorrelated when sampled after a reasonable time interval
  - The feedback signal is a single tone present in the voice signal
  - A high correlation between two non-overlapping segments taken from the voice sample indicates the presence of feedback
Audio feedback cancellation
- Implemented feedback removal technique
  - If feedback is detected, remove the problematic frequency component from the signal
- Block diagram of the feedback cancellation algorithm
Audio feedback cancellation
- Removing the feedback frequency
  - Adaptive filters are used to remove the feedback
  - Filter properties
    - Adaptive
    - Quick to adapt to a new frequency
    - Very high Q, so as to remove a single frequency
Audio feedback cancellation
- Filter implementation
  - Implementing such a filter is very difficult
    - A very large number of taps is needed to get the required response
    - A low-Q filter completely distorts the signal
  - Frequency domain filter
- Procedure for frequency rectification
  - Convert the signal to the frequency domain
  - Detect the frequency that is to be removed
  - Remove that frequency
  - Convert the signal back to the time domain
Audio feedback cancellation
- Block diagram of the frequency domain filter
- Removing a frequency component can be seen as multiplying the signal by a specific vector
Audio feedback cancellation
- Filter implementation
  - The signal is first converted to the frequency domain using the FFT
  - The required frequency is removed by zeroing the appropriate elements of the FFT output
  - The signal is converted back to the time domain using the IFFT
Audio feedback cancellation
- Results of frequency domain filtering
  - [Figure: MATLAB plots for the test audio signal]
Audio feedback cancellation
- Parallelization of the filter
  - The major workload in the filter is the FFT and IFFT
  - Parallelizing the filter means implementing the FFT and IFFT in parallel
    - In the sample code, the FFT and IFFT are carried out using FFTW, an open-source library that supports parallel implementations of the FFT and IFFT
  - Parallelization improves the throughput of the system
Audio feedback cancellation
Result of parallelization (data gathered on an Intel Xeon 2 GHz quad-core system)

Increasing the number of threads may lower the throughput because of thread synchronization overhead; this might be overcome by increasing the workload.

Threads    Execution time* (ms)
1          1294.873
2          836.96
3          645.57
4          578.07
5          510.23
6          487.7
7          462.05
8          682.008

*Execution time is for a 100-second workload
Labs
- Conversion of the stock kernel to a real-time kernel
  - Procedure for patching and installing the kernel
- Audio feedback cancellation
  - Algorithm and C implementation of audio feedback removal