Major Application Areas in Cyberspace Joel Crichlow, Ph.D.

Areas
◦ Distributed File Systems
◦ Distributed Database Systems
◦ Distributed Computation Systems
◦ Distributed Real-Time Systems
◦ Distributed Multimedia Systems
◦ Distributed Operating Systems

Distributed File Systems

Structure
◦ Client-Server
◦ Peer-to-Peer

Issues
◦ Unit of Access
  ◦ File
  ◦ Page/Block
  ◦ Record
  ◦ Word/Byte

Distributed File Systems

Issues
◦ Division of Labor
  ◦ Clients maintain own file system
  ◦ Server maintains a global file system
  ◦ All file commands are channeled to the server
  ◦ Use mounting to combine local file systems with global file system

Distributed File Systems

Client maintains file system

Maps local textual names onto global FIDs

[Figure: client-side file map and server-side page map]

Client (Users 1 to 3), file map:
Filename   FID
Friends    100179
Foes       428761

Server, page map entry for FID 100179:
Page   Block
0      4
1      7
2      3
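The two-step mapping on this slide can be sketched in a few lines of Python. This is only an illustration: the dictionary names and the `resolve` function are hypothetical, and the page-to-block values assume the figure's page map reads page 0 → block 4, page 1 → block 7, page 2 → block 3.

```python
# Hypothetical tables mirroring the figure; all names are illustrative only.
client_file_map = {"Friends": 100179, "Foes": 428761}  # client side: filename -> FID
server_page_maps = {100179: {0: 4, 1: 7, 2: 3}}        # server side: FID -> {page: block} (assumed reading)


def resolve(filename, page):
    """Translate (local filename, page number) into (FID, physical block)."""
    fid = client_file_map[filename]    # step 1: the client maps the textual name to a global FID
    page_map = server_page_maps[fid]   # step 2: the server finds the page map entry for that FID
    return fid, page_map[page]         # step 3: the page map yields the block number


fid, block = resolve("Friends", 1)
print(fid, block)
```

The client never needs to know block numbers, and the server never needs to know the user's textual names; only the FID crosses the boundary.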

Distributed File Systems

Clients maintain own file system
◦ Global file naming is done at the server level
◦ If the file server provides automatic backup and recovery facilities, then files can be classified as recoverable, robust or ordinary
◦ The unit of access available to the client will determine how much data is stored at the server for mapping the client's logical request onto the physical address

Distributed File Systems

Use mounting to combine local file systems with global file system

Server 1 has a directory ‘play’; server 2 has a directory ‘work’. Client 1 places ‘play’ and ‘work’ at the same level; client 2 places ‘work’ in a sub-directory of ‘play’

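[Figure: directory trees at the servers and as mounted by each client]
Server 1: play/ {playhard, playeasy}
Server 2: work/ {workhard, workeasy}
Client 1: work/ {workhard, workeasy} and play/ {playhard, playeasy} at the same level
Client 2: play/ {playhard, playeasy, work/ {workhard, workeasy}}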
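A minimal sketch of how the two clients arrive at different name spaces from the same server directories. The `server_exports` table, the `build_namespace` function and the mount-point paths are hypothetical names chosen for illustration; real mounting is done by the operating system, not by application code like this.

```python
# Directories offered by the two servers, as in the example above.
server_exports = {
    "play": ["playhard", "playeasy"],  # held by server 1
    "work": ["workhard", "workeasy"],  # held by server 2
}


def build_namespace(mounts):
    """mounts: list of (local mount point, server directory); returns the client-visible paths."""
    paths = []
    for mount_point, export in mounts:
        for name in server_exports[export]:
            paths.append(f"{mount_point}/{name}")
    return sorted(paths)


# Client 1 mounts both directories at the same level.
client1 = build_namespace([("/play", "play"), ("/work", "work")])
# Client 2 mounts 'work' inside 'play'.
client2 = build_namespace([("/play", "play"), ("/play/work", "work")])
print(client1)
print(client2)
```

The same file therefore has different path names on different clients (`/work/workhard` on client 1, `/play/work/workhard` on client 2), which is the price of letting each client choose its own mount points.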

Google File System (GFS)

The successor to GFS is called Colossus.
Two of the key issues addressed by the designers were:
(a) The frequency of component failures.
(b) The management of very large data sets.

Google File System (GFS)

GFS runs on thousands of storage machines built from inexpensive commodity parts, and it is accessed by an equivalent number of client machines.
Failure is viewed as the norm rather than the exception.
The system must constantly monitor itself to detect, tolerate and recover from failure.

Google File System (GFS)

The system supports millions of files of any size, but multi-GB files are common.
Many of the accesses to these files are large streaming reads, in which a single operation reads 1 MB or more.

Google File System (GFS)

There are many large sequential writes that append multiple KB to MB of data to files.
Multiple clients can append atomically to the same file concurrently.
There are also small reads of a few KB at any offset and small writes to arbitrary positions in a file.

Google File System (GFS)

The GFS architecture comprises a single master, multiple chunkservers and multiple clients.
Files are divided into fixed-size blocks called chunks, each 64 MB (current size).
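Because chunks have a fixed size, a byte offset within a file can be translated into a chunk index by simple integer arithmetic. The sketch below assumes the 64 MB chunk size stated above; the function names `locate` and `chunks_for_range` are hypothetical and are not part of any GFS interface.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size quoted on the slide


def locate(offset):
    """Return (chunk index, offset within that chunk) for a byte offset in a file."""
    return divmod(offset, CHUNK_SIZE)


def chunks_for_range(offset, length):
    """Return the indices of every chunk touched by reading `length` bytes starting at `offset`."""
    first = offset // CHUNK_SIZE
    last = (offset + length - 1) // CHUNK_SIZE
    return list(range(first, last + 1))


MB = 1024 * 1024
print(locate(200 * MB))                     # byte 200 MB falls in chunk 3, 8 MB into it
print(chunks_for_range(60 * MB, 10 * MB))   # a 10 MB read starting at 60 MB spans chunks 0 and 1
```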

Google File System (GFS)

The master keeps itself informed of the current state of the system by periodically exchanging HeartBeat messages with each chunkserver.

The GFS client provides the interface for applications to use the file system.
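One common way to turn periodic heartbeats into failure detection is a timeout: a server that has not been heard from for longer than some limit is presumed failed. The sketch below illustrates that general idea only; it is not the GFS protocol, and the `Master` class, its methods and the timeout value are all assumptions made for the example.

```python
class Master:
    """Toy master that tracks when each chunkserver was last heard from."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence after which a server is presumed failed
        self.last_seen = {}     # chunkserver id -> time of its most recent heartbeat

    def record_heartbeat(self, server_id, now):
        self.last_seen[server_id] = now

    def failed_servers(self, now):
        """Return the ids of servers whose last heartbeat is older than the timeout."""
        return sorted(s for s, t in self.last_seen.items() if now - t > self.timeout)


master = Master(timeout=10)
master.record_heartbeat("cs1", now=0)
master.record_heartbeat("cs2", now=8)
print(master.failed_servers(now=15))  # cs1 has been silent for 15 s, cs2 for only 7 s
```

Timestamps are passed in explicitly so the behaviour is deterministic; a real monitor would read a clock and would also have to re-replicate the chunks held by a failed server.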

Distributed Database Systems

Distribution Problem and Pattern
◦ Volume and Activity
◦ Number of Participating Hosts
◦ Storage Facilities
◦ Communication Load
◦ Replication and Partitioning

Distributed Database Systems

Queries and Updates

Phases

Query phases
◦ Copy identification
◦ Query decomposition
◦ Response composition

Update phases
◦ Copy identification
◦ Pessimistic/Optimistic approach

Distributed Database Systems

Queries

What are the names of suppliers in NY who supply screws at a unit price of less than $1.00?

Supplier relation (Site 1)
S#    Name   City
100   JOHN   POS
200   DOE    NY

Parts relation (Site 2)
P#     Pname   Quant.
1011   Bolt    400
1123   Nut     400
1246   Screw   600
1300   Nail    500

Unit price relation (Site 3)
S#    P#     Price
100   1011   $0.50
100   1300   $1.50
200   1123   $0.60
200   1246   $0.70

Dictionary
Relat.   Locat.   #Tups   T-size
Sup.     Site 1   800     10
Part     Site 2   1500    10
Price    Site 3   10000   3

[Figure: Sites 1, 2 and 3 hold the Supplier, Parts and Unit price relations respectively; one site is marked as the place where the query is made]
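The example query can be worked through as a decomposition into one sub-query per site, followed by composition of the response. The sketch below uses the tuples from the three relations on this slide; the function names are hypothetical, and in a real system each function would run at a different site, with only the small intermediate results crossing the network.

```python
# Tuples taken from the slide's relations.
supplier = [(100, "JOHN", "POS"), (200, "DOE", "NY")]              # Site 1: (S#, Name, City)
parts = [(1011, "Bolt", 400), (1123, "Nut", 400),
         (1246, "Screw", 600), (1300, "Nail", 500)]                # Site 2: (P#, Pname, Quant.)
unit_price = [(100, 1011, 0.50), (100, 1300, 1.50),
              (200, 1123, 0.60), (200, 1246, 0.70)]                # Site 3: (S#, P#, Price)


def ny_suppliers():
    """Sub-query for Site 1: supplier numbers and names in NY."""
    return {s: name for s, name, city in supplier if city == "NY"}


def screw_part_numbers():
    """Sub-query for Site 2: part numbers of screws."""
    return {p for p, pname, _ in parts if pname == "Screw"}


def cheap_offers(supplier_ids, part_ids, limit):
    """Sub-query for Site 3: suppliers offering one of the parts below the price limit."""
    return {s for s, p, price in unit_price
            if s in supplier_ids and p in part_ids and price < limit}


def answer():
    """Response composition: combine the partial results into supplier names."""
    names = ny_suppliers()
    matching = cheap_offers(set(names), screw_part_numbers(), 1.00)
    return sorted(names[s] for s in matching)


print(answer())
```

Shipping the two small selections (one supplier number, one part number) to Site 3 is one possible strategy; it is attractive here because the dictionary shows the Unit price relation has by far the most tuples, so moving it would be the costly choice.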

Distributed Database Systems

Updates
◦ Integrity
◦ Concurrency
◦ Replication

Big Data management

◦ Handles very large amounts of data distributed over many servers
◦ Highly available service with no single point of failure
◦ Key-value store
◦ Different levels of consistency
◦ Automatic replication of data to multiple nodes

Google BigTable

Google's NoSQL distributed data management system.
BigTable is a sparse map or (key, value) store distributed over multiple servers.
It is designed to scale to clusters of thousands of commodity servers storing petabytes of data.

Google BigTable

The data or values stored in BigTable are treated as uninterpreted strings. The BigTable key is three-dimensional: it contains a row key, a column key and a timestamp. Therefore the mapping takes the form:

(row key, column key, timestamp) → value

[Figure: a table whose columns are grouped into column families (Col Family 0, 1 and 2), with rows along one axis and timestamps as a third dimension]
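The three-part key can be illustrated with an in-memory sparse map. This is a toy model, not the BigTable API: the `SparseTable` class, its methods and the `family:column` naming used for column keys in the example are assumptions made for illustration.

```python
class SparseTable:
    """Toy (row key, column key, timestamp) -> value map; only written cells take space."""

    def __init__(self):
        self.cells = {}  # (row key, column key) -> {timestamp: value}

    def put(self, row, column, timestamp, value):
        self.cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self.cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]
        eligible = [t for t in versions if t <= timestamp]
        return versions[max(eligible)] if eligible else None


table = SparseTable()
table.put("row1", "fam0:col0", 1, b"old")   # values are uninterpreted byte strings
table.put("row1", "fam0:col0", 3, b"new")
print(table.get("row1", "fam0:col0"))       # latest version
print(table.get("row1", "fam0:col0", 2))    # version visible at timestamp 2
print(table.get("row1", "fam1:col1"))       # never written, so nothing is stored
```

Keeping every timestamped version under the same (row, column) pair is what makes the map three-dimensional, and storing nothing for unwritten cells is what makes it sparse.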

Distributed Computation Systems

Networked computers cooperate in the execution of a computationally intensive program
◦ The Network Platform
◦ Algorithm Design and Implementation
◦ Languages, Standards and Tools

Distributed Computation Systems

The Network Platform
◦ Cluster Computing
◦ The Internet
◦ The Lambdagrid

Algorithm Design and Implementation
◦ Control parallelism
◦ Data parallelism
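The difference between the two styles of parallelism can be shown with a small sketch. Data parallelism applies the same operation to different partitions of the data; control parallelism runs different operations at the same time. The example below uses threads on one machine as a stand-in for networked computers, and every function name in it is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    """The single operation that every worker applies to its own slice of the data."""
    return sum(x * x for x in chunk)


def data_parallel_sum_of_squares(values, workers=4):
    """Data parallelism: split the data, run the same function on each part, combine."""
    size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))


def control_parallel_stats(values):
    """Control parallelism: two different tasks run concurrently on the same data."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        lowest = pool.submit(min, values)
        highest = pool.submit(max, values)
        return lowest.result(), highest.result()


print(data_parallel_sum_of_squares(list(range(10))))
print(control_parallel_stats([3, 1, 2]))
```

On a real cluster the slices or tasks would be shipped to other hosts with a tool such as PVM or MPI, and the cost of moving data would weigh on how finely the work is partitioned.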

Distributed Computation Systems

Languages, Standards and Tools
◦ PVM
◦ MPI
◦ DCE
◦ CORBA
◦ Globus Toolkit

Distributed Computation

Tasks (T) interact with each other in a PVM running context. PVM uses network protocols (N) for communication among the computers

Distributed applications use MIDDLEWARE tools to interoperate over a network of heterogeneous computers

[Figure: tasks (T) running on PVM instances, which communicate through network protocols (N)]

[Figure: two hosts connected by a network, each layered as distributed applications over MIDDLEWARE over host OS and network service]

XSEDE

The Extreme Science and Engineering Discovery Environment (XSEDE) tightly integrates supercomputing resources, storage and scientific instruments across geographically dispersed major research centers.
The interconnection network includes a backbone of hubs with an interhub transmission capacity of 40 Gbps.
Border routers linked to the hubs serve as the interfaces between the grid and the sites.
Each site has up to 10 Gbps of dedicated transmission capacity.

XSEDE

The XSEDE interconnection network is hierarchical.

Distributed Real-Time Systems

◦ Environment
◦ Geographic Range
◦ Communication Traffic
◦ Computer Processing

Distributed Real-Time Systems

Computer Processing

Distributed real-time processing may be hierarchical, involving a low-level network of sensors feeding data to data-aggregation nodes, which in turn feed high-level servers

[Figure: three tiers, from bottom to top: network of sensors, data aggregation network, servers]
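The hierarchy described on this slide can be sketched as two small functions: an aggregation node reduces raw sensor readings to a compact summary, and a server combines the summaries it receives. The function names, the choice of summary fields and the sample temperatures are assumptions for illustration only.

```python
from statistics import mean


def aggregate(readings):
    """Aggregation node: reduce raw sensor readings to a small summary record."""
    return {"count": len(readings), "mean": mean(readings), "max": max(readings)}


def server_view(summaries):
    """High-level server: combine summaries without ever seeing the raw readings."""
    total = sum(s["count"] for s in summaries)
    overall_mean = sum(s["mean"] * s["count"] for s in summaries) / total
    return {"count": total, "mean": overall_mean, "max": max(s["max"] for s in summaries)}


zone_a = aggregate([20.0, 22.0])  # readings from the sensors under one aggregation node
zone_b = aggregate([30.0])        # readings from the sensors under another
print(server_view([zone_a, zone_b]))
```

Sending summaries instead of raw readings is what keeps the communication traffic toward the servers manageable as the number of sensors grows.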

Distributed Multimedia Systems

The Signals
◦ Stereo quality audio CD would require up to 1.411 Mbps.
◦ Video: flash discrete images at a rate of 50 or more images per second
◦ The images in video can be represented as a sequence of frames (a frame is a rectangular grid of pixels)
◦ Twenty-four bits per pixel with a frame of 1024 × 768 pixels is illustrative of present high-resolution technology
◦ Transmitting at 25 frames per second would require transmission capacity in excess of 400 Mbps
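The figures quoted above follow from simple multiplication, assuming uncompressed signals. The video parameters come from the slide; the audio parameters (44.1 kHz sampling, 16 bits per sample, 2 channels) are the standard audio CD values and are not stated on the slide. The function names are illustrative.

```python
def video_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


def audio_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM audio bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


video_bps = video_bit_rate(1024, 768, 24, 25)  # frame size, colour depth and rate from the slide
audio_bps = audio_bit_rate(44_100, 16, 2)      # standard CD audio parameters (assumed)

print(f"video: {video_bps / 1e6:.1f} Mbps")    # about 471.9 Mbps, i.e. in excess of 400 Mbps
print(f"audio: {audio_bps / 1e6:.3f} Mbps")    # about 1.411 Mbps
```

Rates of this size are the reason compression is essential before video is sent over a network.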

Peer-to-Peer Multimedia Systems

Media On Demand (MOD)
◦ Video on Demand (VOD) or On Demand (OD)

Distributed Multimedia

Media On Demand (MOD)

MOD server maintains a digital repository of videos which home users, via communication networks, can access and view immediately

[Figure: MOD server connected through a network to several homes]

Distributed Multimedia

Media On Demand (MOD)

Massive storage must be arranged as a hierarchical structure

[Figure: client viewers reach the servers over a network; the storage hierarchy behind each server runs from RAM through magnetic disk and optical disk to magnetic tape]

Distributed Operating Systems

◦ Network Operating System
◦ Distributed Operating System
◦ Issues
◦ Threads

Network Operating System

Network operating system of agents and different local operating systems

[Figure: agents connected by a network, each agent sitting on top of a local OS]

Distributed Operating Systems

Homogeneous network-wide operating system

Issues
◦ Fundamental OS problems
◦ Data integrity
◦ Fail-Soft operation
◦ Security
◦ Performance
◦ Scalability

Threads

Windows NT family

The Windows NT family comprises a series of operating system releases that support distributed system applications

The NT architecture comprises a number of layers
◦ Hardware Abstraction Layer (HAL)
◦ Kernel
◦ Executive
◦ Subsystems

Windows NT family Architecture

Mach microkernel

Mach is a distributed operating system project whose kernel has been used in several Unix-like operating systems and in the Mac OS X operating system

Conclusion

We looked at:

Distributed File Systems

Distributed Database Systems

Distributed Computation Systems

Distributed Real-Time Systems

Distributed Multimedia Systems

Distributed Operating Systems