RGF M.S. Thesis Presentation 12/01
Reduced Development Costs in the Operating System

Outline
Premise of thesis
Results obtained
Sample project
What is a single address space
What is a Single Address Space Operating System (SASOS)
What is Sombrero
  – Features and services
  – Services used in project
What is a Multiple Address Space Operating System (MASOS)
  – Overview
  – Services used in project
Compare Sombrero services to Windows 2000 services
Why costs are reduced under Sombrero
Metric selection criteria
Test environment
  – Execution
  – Metrics
Results
  – Metrics
  – Other factors
Conclusions
Questions

Premise
We believe that the Sombrero operating system can lead to significant cost savings over the software lifecycle
  – A result of Sombrero’s architecture is a simplified development environment
    • Powerful system abstractions are transparent to the developer
    • Allows the programmer to concentrate on the application, not the system

Results

                  Halstead         Halstead
                  Program Effort   Bug Prediction   McCabe     LOC   Basic COCOMO
Windows
  Phase 1              2,637,658             6.36       85     852           2.03
  Phase 2              4,287,457             8.80      121    1348           3.28
  Phase 3              4,420,421             8.98      127    1420           3.46
  Phase 4              6,204,183            11.26      145    1612           3.96
  DSM                  4,987,458             9.73      132    1481           3.62
Sombrero
  Phase 1              1,025,712             3.39       67     639           1.50
  Phase 2              1,084,757             3.52       65     647           1.52
  Phase 3              1,135,836             3.63       68     719           1.70
  Phase 4              1,135,836             3.63       68     719           1.70
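
The Bug Prediction and Basic COCOMO columns look like values derived from the Effort and LOC columns. The sketch below is an assumption about how they could be computed, not something the slides state. It uses the common Halstead estimate B = E^(2/3) / 3000 and the organic-mode Basic COCOMO formula, effort in person-months = 2.4 * KLOC^1.05. The function names are illustrative.

```python
def halstead_bugs(effort: float) -> float:
    """Halstead delivered-bug estimate, B = E^(2/3) / 3000 (assumed variant)."""
    return effort ** (2.0 / 3.0) / 3000.0


def basic_cocomo_effort(loc: int, a: float = 2.4, b: float = 1.05) -> float:
    """Basic COCOMO effort in person-months.

    a and b default to the organic-mode constants (an assumption; the
    slides do not say which mode was used).
    """
    return a * (loc / 1000.0) ** b


if __name__ == "__main__":
    # (Halstead effort, LOC) pairs copied from the Phase 1 rows of the table.
    rows = {
        "Windows Phase 1": (2_637_658, 852),
        "Sombrero Phase 1": (1_025_712, 639),
    }
    for name, (effort, loc) in rows.items():
        bugs = halstead_bugs(effort)
        months = basic_cocomo_effort(loc)
        print(f"{name}: bug prediction {bugs:.2f}, COCOMO {months:.2f}")
```

Under these assumptions the Phase 1 rows come out within about 0.01 of the tabulated values, which suggests, but does not prove, that these are the formulas the metrics tool applied.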

Sample Project
Why a simple DBS for a project?
  – Needed to embody system services to demonstrate the differences between the MASOS and SASOS APIs.
  – A DBS is something the audience can identify with.
Why four phases?
  – Allows us to measure how the application features affect complexity.

4 Phases of Functionality
Phase 1 – Simple DBS class
  – Single threaded
  – Linked directly to client, no IPC required
  – Requires exclusive access to dataset
    • No sharing
[Diagram: Driver Application; Database Class with Open(), Close(), Insert(), Search(), Delete(), Replace(); Local Data On Disk]
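
A minimal sketch of what a Phase 1 style class could look like. Only the six method names come from the slide. The key/value record model, the JSON file format, and the return values are assumptions made for illustration, and the thesis code itself was presumably not written in Python.

```python
import json
import os
import tempfile


class Database:
    """Single-threaded store that assumes exclusive access to its data file."""

    def __init__(self):
        self._path = None
        self._records = {}

    def open(self, path):
        # Inside a method body, open() below still refers to the builtin.
        self._path = path
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self._records = json.load(f)
        else:
            self._records = {}

    def close(self):
        # Persist the whole dataset on close (hypothetical policy).
        with open(self._path, "w", encoding="utf-8") as f:
            json.dump(self._records, f)
        self._path = None

    def insert(self, key, value):
        if key in self._records:
            return False
        self._records[key] = value
        return True

    def search(self, key):
        return self._records.get(key)

    def delete(self, key):
        return self._records.pop(key, None) is not None

    def replace(self, key, value):
        if key not in self._records:
            return False
        self._records[key] = value
        return True


if __name__ == "__main__":
    # Driver application: the class is linked in directly, so no IPC is involved.
    path = os.path.join(tempfile.mkdtemp(), "dataset.json")
    db = Database()
    db.open(path)
    db.insert("k1", "v1")
    db.replace("k1", "v2")
    print(db.search("k1"))
    db.close()
```

Because the driver calls the class directly, every later phase in the Windows 2000 version has to add code around this interface, which is what the phase-by-phase metrics measure.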

4 Phases of Functionality (Windows 2000 View)
Phase 2 – Multiple clients
  – Client-server paradigm
    • Server supports multiple clients
  – Single threaded
  – Allows multiple clients to access the same dataset
    • Execution is serialized
[Diagram: three clients (database-1, database-2, database-3) and one server holding database-1, database-2 and database-3. Client and server each expose Open(), Close(), Insert(), Search(), Delete(), Replace(); database operation requests and operation results travel over TCP/IP.]

4 Phases of Functionality (Windows 2000 View)
Phase 3 – Multithreading
  – Multiple clients
  – Multi-threaded servers
    • Locking algorithm
[Diagram: Database Server. A loop construct waits for a DBS request from a client, then execs a worker thread for the DBS method (Close(), Insert(), Search(), Delete(), Replace()), which acts on the local data (Dataset).]
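
The dispatcher loop in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions: an in-process queue stands in for the TCP/IP connection, a single lock stands in for the locking algorithm, and the names DatabaseServer and call are hypothetical.

```python
import queue
import threading


class DatabaseServer:
    def __init__(self):
        self.dataset = {}
        self._lock = threading.Lock()   # protects the dataset (critical section)
        self.requests = queue.Queue()   # stand-in for the network listener
        self._workers = []

    def _handle(self, op, key, value, reply):
        # Runs in a worker thread; the lock serializes access to the dataset.
        with self._lock:
            if op == "insert":
                self.dataset[key] = value
                result = True
            elif op == "search":
                result = self.dataset.get(key)
            elif op == "delete":
                result = self.dataset.pop(key, None) is not None
            else:
                result = None
        reply.put(result)

    def serve(self):
        # Loop construct: wait for a request, then start a worker thread for it.
        while True:
            request = self.requests.get()
            if request is None:          # shutdown sentinel
                break
            worker = threading.Thread(target=self._handle, args=request)
            worker.start()
            self._workers.append(worker)
        for worker in self._workers:
            worker.join()


def call(server, op, key, value=None):
    """Client side: submit one request and block until its result arrives."""
    reply = queue.Queue(maxsize=1)
    server.requests.put((op, key, value, reply))
    return reply.get()


if __name__ == "__main__":
    server = DatabaseServer()
    listener = threading.Thread(target=server.serve)
    listener.start()
    call(server, "insert", "k1", "v1")
    print(call(server, "search", "k1"))
    server.requests.put(None)
    listener.join()
```

Even in this reduced form the server needs a dispatcher, worker management, and explicit locking, all of which is code the application developer has to write and maintain.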

4 Phases of Functionality (Windows 2000 View)
Phase 4 – Distribution and Load Balancing
  – Read operations round robin
  – Data must stay consistent across multiple servers
    • Write operations can be load balanced with DSM
[Diagram, reads: clients for database-1, database-4 and database-5; servers holding (database-2, database-5), (database-1, database-2, database-4), (database-1, database-2, database-4) and (database-1, database-2, database-3). Search requests are sent round-robin to available servers.]
[Diagram, writes: clients for database-1 and database-4; servers holding (database-1, database-2, database-4), (database-1, database-2, database-4) and (database-1, database-2, database-3). Client always sends write requests to the same server. Each database-1 or database-4 write operation is followed by duplicate DB-1 or DB-4 writes.]
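
A rough sketch of the Phase 4 routing policy: reads rotate across the replicas, while writes go to one fixed server, which duplicates them to its peers. The Replica, Client and make_cluster names are hypothetical, and plain method calls stand in for the network messages.

```python
import itertools


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []
        self.reads = 0

    def search(self, key):
        self.reads += 1
        return self.data.get(key)

    def write(self, key, value, duplicate=True):
        self.data[key] = value
        if duplicate:
            # Forward a duplicate write to every peer; peers do not re-forward.
            for peer in self.peers:
                peer.write(key, value, duplicate=False)


class Client:
    def __init__(self, replicas):
        self._primary = replicas[0]                 # writes always go here
        self._readers = itertools.cycle(replicas)   # reads go round-robin

    def search(self, key):
        return next(self._readers).search(key)

    def insert(self, key, value):
        self._primary.write(key, value)


def make_cluster(names):
    replicas = [Replica(n) for n in names]
    for r in replicas:
        r.peers = [p for p in replicas if p is not r]
    return replicas


if __name__ == "__main__":
    cluster = make_cluster(["server-1", "server-2", "server-3"])
    client = Client(cluster)
    client.insert("k1", "v1")
    for _ in range(3):
        print(client.search("k1"))
    print([(r.name, r.reads) for r in cluster])
```

The sketch ignores failures and concurrent writers; handling those is part of what makes the Windows 2000 Phase 4 code the largest in the table.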

What is a Single Address Space
Virtual addresses can be permanently and uniquely bound to all code and data objects
  – VA can serve as unique names
  – VA space can serve as the only namespace
The virtual address namespace spans all levels of the storage hierarchy on every node
  – All physical storage can be viewed as a hierarchy of caches for the contents of virtual addresses

What is a Single Address Space Operating System (SASOS)
Address translations remain the same for all programs
Threads are free to travel throughout the VA space with no changes in the environment in which they are running, except for protection context
Network-wide communication requires no prior or additional setup
Internal pointers and pointers into other objects remain the same across all levels of storage and all programs
Persistence can be obtained without the use of a separate file system
Protection is provided by restricting what a computation is allowed to access rather than what it is allowed to address
  – Managing IPC is reduced to managing protection

What is Sombrero
SASOS design that includes special-purpose protection hardware
  – Only one privilege mode
System architecture can be flat and modular: OS services, environment servers and user programs are accessible to each other via ordinary procedure calls
OOD and OOP can be directly supported: an object class can be implemented directly as a protection domain and a server as an instantiation of the class

What is Sombrero (Cont.)
The common VA space enables simple, scalable per-object copy set replication and consistency management
  – Pointers have the same meaning to all threads of execution on the network.
Object granularity can be independent of page granularity
Clean separation of OS support for protection implementation; user definition of protection policy is simple
  – The OS provides services by granting permissions to access specific interfaces

Additional Sombrero Features
Persistence
Thread and Data Migration
Data Replication
Passive Servers
Load Balancing

Persistence
The virtual address namespace spans all levels of the storage hierarchy on every node
  – All physical storage can be viewed as a hierarchy of caches for the contents of virtual addresses
  – Lowest layer of hierarchy is non-volatile
Persistence without use of a separate file system
  – Traditional file systems are still available but not required
Instantiations of objects (code, thread state and data) can persist without a thread of execution.
  – All code and data created by allocate and “new” persist for the life of the parent object
  – Objects created at the command line with “new” persist until explicitly deleted. NOTE: This is key for creating passive servers
Persistent code and data have the same VA on all nodes
  – H/W protection mechanism provides access control

Thread / Data Migration
Thread Migration: Threads are free to travel throughout the network because thread context (stack, virtual addresses and pointers) remains the same.
  – Threads can be scheduled to run anywhere on the network; network nodes appear like CPUs in a multiprocessor.
Data Migration: Data can be transparently replicated throughout the network for performance or availability.
  – Data can migrate to a thread of execution for performance
  – The location of data is transparent to the user

Data Replication
Data Replication
  – Datasets can be transparently replicated for availability or performance

Passive Server
Important Sombrero feature
A passive server is an object (instantiation of a class) that persists but without any threads of execution
  – All passive server code and data persist until the object is destroyed
  – Passive servers are accessed by client threads of execution via a proxy class
    • Looks like a standard pointer to an object
    • Services supplied to clients via local subroutine call
  – Passive server threads are supplied by the client when the client accesses the server’s methods
    • Passive server is animated by the client’s thread of execution
    • Client thread migrates to passive server
Passive server objects can be created using the Sombrero “new()” service at a command line
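
The passive-server idea can be approximated inside a single process: the server object owns state and code but starts no thread, and each client thread executes the server’s methods itself. The sketch below is only an analogy. It does not use any Sombrero API, it does not model persistence, protection domains or migration between nodes, and the class and function names are hypothetical.

```python
import threading


class PassiveCounterServer:
    """Holds code and data but never creates a thread of its own."""

    def __init__(self):
        self._lock = threading.Lock()
        self._total = 0
        self.callers = set()

    def add(self, amount):
        # Executed by whichever client thread calls in. The lock is the only
        # concurrency code the server needs: it protects the critical section.
        with self._lock:
            self._total += amount
            self.callers.add(threading.current_thread().name)
            return self._total

    def total(self):
        with self._lock:
            return self._total


def client(server, repeats):
    # An ordinary method call; no message is composed or interpreted.
    for _ in range(repeats):
        server.add(1)


if __name__ == "__main__":
    server = PassiveCounterServer()
    threads = [
        threading.Thread(target=client, args=(server, 1000), name=f"client-{i}")
        for i in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(server.total(), sorted(server.callers))
```

The server is written like a single-threaded class, yet several client threads run inside it at once, which is why the slides say that protecting critical sections is the only multithreading code required.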

Sombrero Versus SASOSs That Use Stock RISC Processors
Sombrero H/W based SASOS does not have legacy features of stock CPU based SASOS
  – Object grained protection mechanisms
    • Native support for OOD and OOP
  – Improved program development environment
    • Sombrero applications do not have to manage capabilities; protection is provided by hardware
Greater performance
Better support for distributed applications

Sombrero Services Used In Sample Application
Create persistent object
  – New form of “new” accessible from the command line
    • Required to create passive server
Allocate
  – Allocates virtual memory
  – Similar to malloc with file open semantics
  – Can define replication and duplication for object

Simplification of Complex Services
Persistence, single VA, and thread & data migration allow additional abstractions that simplify the development environment.
  – Passive servers are written as single threaded applications but multithread by default.
    • Locking semantics must be included to protect critical sections.
Load balancing, distribution and fault tolerance
  – The Sombrero scheduler can migrate threads or data transparently. Thus execution can be distributed across all nodes as needed.
  – Data and code distribution can be forced by passing a parameter to new() or allocate().
  – Sombrero manages scheduling and data consistency.

What is a MASOS
Multiple Address Space Operating System
  – Process oriented
  – Multiple domains
  – Multiple namespaces

Basic WIN32 API Services
File Operations
  – Required to save dataset status
  – Required for mapping datasets into VM
Memory Mapping
  – Required to access dataset as memory
IPC
  – TCP/IP is used to pass messages between client and server
Threads / Synchronization
  – Standard thread and synchronization primitives were used

Required Additional Windows Services
Mapped Memory Management
  – Manage VM when mapped data is larger than the address space
  – Swizzle pointers
Messaging
  – Set up client / server communications
  – Compose / interpret messages
Data Distribution
  – Duplicate messages to multiple servers to keep data consistent
Load balancing
  – Send DBS requests round-robin to set of duplicated servers
Threads
  – Create a thread dispatcher in the server to create worker threads

Mapped Memory Management
Mapped Memory Management
  – Dataset is larger than VM
  – Multiple namespaces for the same data
[Diagram: a disk based data file (over 32 bits in size) whose addresses in the data’s namespace run from 0x000000000000 to 0x1FFFFFFFFFFF, and a 32 bit virtual address space whose addresses in VM’s namespace run from 0x00000000 to 0xFFFFFFFF. A 196KB (0x111000) view of the file is mapped into VM. Other addresses labeled in the figure: 0x100000FFF, 0x5000, 0x4000, 0x115000, 0x1000, 0x116000.]
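
Translating between the two namespaces is the pointer swizzling that the Windows 2000 version has to perform. The sketch below shows the arithmetic for a single mapped view. The MappedView class and the example addresses in the demo are hypothetical and are not taken from the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedView:
    file_offset: int   # where the view starts in the data file's namespace
    vm_base: int       # where the view starts in the process's VM namespace
    length: int        # size of the view in bytes

    def to_vm(self, data_addr: int) -> int:
        """Swizzle: data-file address -> address usable by the process."""
        if not (self.file_offset <= data_addr < self.file_offset + self.length):
            raise LookupError("address is outside the mapped view; remap required")
        return self.vm_base + (data_addr - self.file_offset)

    def to_data(self, vm_addr: int) -> int:
        """Unswizzle: VM address -> address in the data file's namespace."""
        if not (self.vm_base <= vm_addr < self.vm_base + self.length):
            raise LookupError("VM address does not belong to this view")
        return self.file_offset + (vm_addr - self.vm_base)


if __name__ == "__main__":
    # Hypothetical view: a window that starts beyond the 4 GB mark of the file.
    view = MappedView(file_offset=0x1_0000_0000, vm_base=0x5000, length=0x111000)
    vm_addr = view.to_vm(0x1_0000_0010)
    print(hex(vm_addr), hex(view.to_data(vm_addr)))
```

Every pointer stored in the dataset has to pass through a translation like this, and a miss means unmapping one view and mapping another. A single persistent address space removes both steps.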

Message Passing
Mechanism to move data from one namespace to another
Must agree on protocol/formats
Sender must compose messages
Reader must interpret messages
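
To give a feel for the compose and interpret steps, here is a small sketch of a binary request format. The layout (a one-byte opcode plus two length fields, followed by key and value bytes) and the opcode numbers are assumptions invented for this example, not the protocol used in the thesis.

```python
import struct

# Header: opcode (1 byte), key length (2 bytes), value length (4 bytes),
# all in network byte order with no padding.
_HEADER = struct.Struct("!BHI")
OPCODES = {"insert": 1, "search": 2, "delete": 3, "replace": 4}
_NAMES = {code: name for name, code in OPCODES.items()}


def compose(op: str, key: bytes, value: bytes = b"") -> bytes:
    """Sender side: flatten a request into bytes both sides agree on."""
    return _HEADER.pack(OPCODES[op], len(key), len(value)) + key + value


def interpret(message: bytes):
    """Reader side: recover (operation, key, value) from the bytes."""
    opcode, key_len, value_len = _HEADER.unpack_from(message, 0)
    start = _HEADER.size
    key = message[start:start + key_len]
    value = message[start + key_len:start + key_len + value_len]
    if len(key) != key_len or len(value) != value_len:
        raise ValueError("truncated message")
    return _NAMES[opcode], key, value


if __name__ == "__main__":
    wire = compose("insert", b"k1", b"v1")
    print(interpret(wire))
```

Both ends must be kept in step whenever a method or a field changes, which is one reason message passing adds to the measured complexity.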

Windows 2000 Services Mapped to Sombrero

Windows 2000                                Sombrero
Mapped Memory / Memory Mgt / Swizzling      Single Persistent VA
IPC / Message passing                       Passive server*
Multithreaded Server                        Passive Server**
Data replication / distribution             New() / Allocate()
Load balancing                              Sombrero scheduler

*No message passing required.
**No explicit code required to enable multithreading. Only code required is to protect critical sections.

Why Costs are Reduced Under Sombrero
No message passing
  – Passive server supplies services via local subroutine call
Single persistent name space
  – No translations between namespaces
    • No pointer swizzling
    • No memory mapped data
  – Data is always available at the same address
  – Native support for OOD and OOP
Complex services now transparent
  – Distribution
  – Replication
  – Fault tolerance

Metrics
Different ways to measure S/W development costs
  – LOC
    • Size
  – Halstead
    • Vocabulary
  – McCabe
    • Complexity
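
For readers unfamiliar with these measures, the sketch below computes them from their textbook definitions. It assumes the standard Halstead formulas (vocabulary n = n1 + n2, volume V = N log2 n, difficulty D = (n1 / 2)(N2 / n2), effort E = D * V) and McCabe’s cyclomatic complexity M = E - N + 2P on a control-flow graph. The metrics tool used for the thesis may count operators and operands differently.

```python
import math


def halstead(n1: int, n2: int, N1: int, N2: int) -> dict:
    """Halstead measures from operator/operand counts.

    n1, n2: distinct operators and operands; N1, N2: total occurrences.
    """
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2.0) * (N2 / n2)
    effort = difficulty * volume
    return {
        "vocabulary": vocabulary,
        "length": length,
        "volume": volume,
        "difficulty": difficulty,
        "effort": effort,
    }


def cyclomatic_complexity(edges: int, nodes: int, components: int = 1) -> int:
    """McCabe complexity of a control-flow graph: M = E - N + 2P."""
    return edges - nodes + 2 * components


if __name__ == "__main__":
    # Made-up counts, chosen only to exercise the formulas.
    print(halstead(n1=4, n2=4, N1=8, N2=8))
    print(cyclomatic_complexity(edges=9, nodes=8))
```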

Test Environment
How the Windows 2000 application was tested.
How metrics were captured for each stage of development.

Phase 1: Single Client, Single Thread of Execution
Client and server are the same application
Single thread of execution
Single dataset
[Diagram: PC-1 runs the Driver Application with the Phase-1 Database class (Database(), Insert(), Search(), Delete(), Replace(), ~Database()); Database A is on local storage.]

Phase 2: Multiple Clients, Single Server, Single Threaded
Client and server processes
Server supports
  – Multiple clients
  – Multiple datasets
[Diagram: PC-1 and PC-2; database clients for DataSet-A (two), DataSet-B and DataSet-C; one Database Server holding DataSet-A, DataSet-B and DataSet-C. The Driver Application uses the Phase-1 DatabaseClient class and the Database Server uses the Phase-1 DatabaseServer class, each with Database(), Insert(), Search(), Delete(), Replace(), ~Database(); the two sides communicate over TCP/IP.]

Phase 3: Multithreaded server
Add multithreading to the DBS server
[Diagram: Database Server. A loop construct waits for a DBS request from a client, then execs a worker thread for the DBS method (Close(), Insert(), Search(), Delete(), Replace()), which acts on the local data (Dataset). PC-1 and PC-2; database clients for DataSet-A (two), DataSet-B and DataSet-C; one Database Server holding DataSet-A, DataSet-B and DataSet-C.]

Phase 4: Load balancing and Data Distribution
Add load balancing and data distribution to DBS server
[Diagram: PC-1 and PC-2; database clients for DataSet-A (two) and DataSet-B (two); two Database Servers, each holding DataSet-A and DataSet-B.]

Metrics
For each development phase
  – Create an additional file that is the concatenation of all project files (except driver application).
Use Power Software to:
  – Generate McCabe’s and LOC measurements on the project files (less driver application).
  – Generate Halstead’s vocabulary and bug prediction on the concatenated file.
Total the individual (per method) McCabe’s metrics in Excel
*I ported one phase of development to Unix to gather metrics using Motorola’s metric tool from McCabe Software. McCabe provides metrics per project, not just per file or method. McCabe’s tool returned results similar to those obtained using the method described above.
[Diagram: Multiple Project Files; Single Project File; Metrics Tool; Metrics]
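
The file preparation described above could be scripted along these lines. This is a sketch only: the file extensions, the driver file name, the output file name and the per-method numbers in the demo are all hypothetical, and the summation stands in for the spreadsheet step.

```python
import tempfile
from pathlib import Path


def concatenate_sources(project_dir, output_file,
                        exclude=("driver.cpp",), suffixes=(".cpp", ".h")):
    """Join all project source files except the driver into one file.

    Give output_file a suffix outside `suffixes` (or another directory) so
    that a rerun does not pick up the previous output.
    """
    project_dir = Path(project_dir)
    output_file = Path(output_file)
    parts = []
    for path in sorted(project_dir.iterdir()):
        if path.is_file() and path.suffix in suffixes and path.name not in exclude:
            parts.append(path.read_text(encoding="utf-8"))
    output_file.write_text("\n".join(parts), encoding="utf-8")
    return output_file


def total_mccabe(per_method: dict) -> int:
    """Sum per-method cyclomatic complexities into one project figure."""
    return sum(per_method.values())


if __name__ == "__main__":
    project = Path(tempfile.mkdtemp())
    (project / "database.cpp").write_text("// database code\n", encoding="utf-8")
    (project / "database.h").write_text("// database header\n", encoding="utf-8")
    (project / "driver.cpp").write_text("// driver, excluded\n", encoding="utf-8")
    combined = concatenate_sources(project, project / "all_sources.txt")
    print(combined.read_text(encoding="utf-8"))
    print(total_mccabe({"Insert": 3, "Search": 2, "Delete": 2}))
```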

Other factors
Trends in modern computing
  – Software is getting more complex
    • Distributed
    • Load balanced
    • High availability
Psychology
  – Developers are influenced by perceived complexity
Sample application did not include a pre-processor to simulate persistence in the MASOS
  – That was available in Sombrero
  – Would tilt metrics more toward Sombrero

Conclusions
It is likely that for many applications Sombrero can reduce complexity
  – Reduced complexity leads to simpler applications, thus lower costs.
  – Computationally intensive applications like a single threaded math or graphics package would probably not benefit from Sombrero.
  – Many other large applications (e.g., a DBS) could benefit from the Sombrero Operating System.