Savio Chau Five Classic Components of a Computer Current Topic: Input and Output Control Datapath...
Date posted: 21-Dec-2015
Savio Chau
Five Classic Components of a Computer
• Current Topic: Input and Output
[Figure: the five components (Control and Datapath, which form the Processor, plus Memory, Input, and Output) drawn for two computers, with a Network and Peripheral Devices attached]
What You Will Learn in This Set of Lectures
• I/O System Overview
• I/O System Design Considerations
• I/O System Design Parameters
– Connectivity
– Protocol
– Access Control
– Performance
– Expandability
– Failure Handling
– Operating System Support
• I/O Implementation Example
Who Cares About I/O?
• CPU Performance: 60% per year
• I/O system performance limited by mechanical delays (disk I/O): < 10% per year (I/Os per sec or MB per sec)
• Amdahl's Law: system speed-up is limited by the slowest part!
– 10% I/O & 10x CPU => 5x Performance (lose 50%)
– 10% I/O & 100x CPU => 10x Performance (lose 90%)
• I/O bottleneck:
– Diminishing fraction of time in CPU
– Diminishing value of faster CPUs
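The Amdahl's Law figures on this slide can be reproduced with a few lines of Python. This is a minimal sketch; the function name `amdahl_speedup` and its parameters are illustrative and not part of the lecture. It assumes the I/O time stays fixed while only the CPU portion gets faster.

```python
def amdahl_speedup(io_fraction: float, cpu_speedup: float) -> float:
    """Overall speed-up when only the CPU share of the run time gets faster."""
    cpu_fraction = 1.0 - io_fraction
    # The I/O share is unchanged; the CPU share shrinks by cpu_speedup.
    return 1.0 / (io_fraction + cpu_fraction / cpu_speedup)


for factor in (10, 100):
    overall = amdahl_speedup(0.10, factor)
    print(f"10% I/O, {factor}x CPU -> {overall:.2f}x overall")
```

The exact values are about 5.3x and 9.2x, which the slide rounds to 5x and 10x.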
I/O System Architecture Overview
[Figure: layered view of an I/O system. Software: User Application, Operating System (entered through a system call), Device Driver. Hardware: I/O Controller (reached over the Memory or I/O Bus) and I/O Devices (reached over the Media). Levels are labeled System Interface, Logical, and Physical.]
• Protocol can be defined at all levels
How is the I/O System Related to the Processor?
[Figure: pipelined processor datapath (PC, IF/ID, ID/EX, EX/Mem, and Mem/Wr registers, register file, execution unit, main control and control signals) whose memory hierarchy connects to an I/O Controller and an I/O Device]
A Classification of I/O According to the Targets of I/O Operation
• Processor to Memory: very low latency, very high throughput, very low protocol overhead
• Processor to Peripheral: latency, throughput, and protocol overhead vary according to the I/O devices
• Processor to Processors
– Tightly Coupled: all processors share a physical memory. Low latency, high throughput, low overhead protocol, coherence problem
– Loosely Coupled: each processor has its own physical memory. Medium latency, medium throughput, high protocol overhead, scalable
• Processor to Network: high latency, low throughput, high protocol overhead, very scalable
I/O System Example
[Figure: a Processor with Cache and Main Memory on a Memory - I/O Bus, together with I/O Controllers (Graphics, two Disks), a Network Interface Controller (Network), and an IEEE 1394 Bus Interface Controller. A second Processor with Cache is shown, and the IEEE 1394 bus leads to other processors or peripherals.]
I/O Devices Examples
Device             Behavior        Partner    Data Rate (KB/sec)
Keyboard           Input           Human      0.01
Mouse              Input           Human      0.02
Line Printer       Output          Human      1.00
Floppy disk        Storage         Machine    50.00
Laser Printer      Output          Human      100.00
Optical Disk       Storage         Machine    500.00
Magnetic Disk      Storage         Machine    5,000.00
Network-LAN        Input/Output    Machine    20 – 1,000.00
Graphics Display   Output          Human      30,000.00
• See Backup Slides for More Detailed Information about Some of the I/O Devices
I/O System Design Process
• Establish Requirements: Understanding What You Need
• Select the I/O System That Has the Required Capability: Understand What the I/O System being Considered Can Do
• Integration: Understand How Everything Fits Together
• Implementation
[Figure: candidate devices (Device A to Device D) and candidate buses (Bus A, Bus B, Bus C) start as open questions; after selection the devices are fixed and the bus choice (Bus B?) and integration details remain open]
I/O System Design Example: Establish Requirements
• Design an I/O architecture for a spacecraft that has the following equipment
– Flight Computer (CDH)
– Flight Computer (ACS)
– Flight Computer (Payload)
– Star Tracker (x2)
– Telecom Subsystem (x2)
– Inertial Measurement Unit (x2)
– Power Control Unit (x2)
– Thruster Control Unit (x2)
– Wide Angle Camera
– High Resolution Camera
– Radar Sounder
– Altimeter
• Per-equipment requirements annotated in the figure:
– Data Rate: 5 Kbps, 1 transaction/sec, Latency < 10 ms
– Data Rate: 8 Mbps, 1000 samples/sec, Latency < 0.1 ms
– Data Rate: 10 Kbps, 1000 samples/sec, Latency < 0.1 ms
– Data Rate: 400 bps, 2 commands/sec, Latency < 0.5 sec
– Data Rate < 100 bps, 10 commands/sec, Latency < 0.1 ms
– Data Rate: 20 Mbps, 2 frames/sec, Latency < 0.5 sec
– Data Rate: 20 Mbps, 2 frames/sec, Latency < 0.5 sec
– Data Rate: 1 Mbps, 1 transaction/sec, Latency < 1 sec
– Data Rate: 5 Kbps, 100 samples/sec, Latency < 0.01 sec
• System Constraints (Prioritized):
1. Total power consumption of the avionics system < 100 W.
2. The I/O system power consumption should be less than 35% of the avionics system.
3. Each subsystem has to meet the latency and throughput requirements.
4. System reliability should exceed 12 years (i.e., requires fault tolerance).
5. The system design should be scalable and distributed.
6. Maximum distance between subsystems is 5 meters. Average distance is 3 m.
7. Minimize the cable mass.
I/O System Design Example: Candidate I/O Interface
Metrics | IEEE 1394 (Cable version) | IEEE 1393 | Fiber Channel | I2C | UART (Direct Interface) | Ethernet (IEEE 802.3)
Raw Bandwidth | 100, 200, 400 Mbps | 200 to 1000 Mbps | 1 Gbps | 100, 400 Kbps | 115 Kbps to 10 Mbps | 10, 100 Mbps
Latency | 125 µs max | 196 bits x N nodes | 196 bits x N (loop) | Nondeterministic | < 100 ns | Nondeterministic
Topology | Tree | Ring | Loop, Star, Switch network | Multi-Drop | Star | Multi-Drop
Signal Level Protocol | Async | Async | Async | Async | Async | Async
Cable Type | Electrical (Twisted pair) | Optical Fiber | Optical Fiber, Electrical (Twisted pair) | Electrical (Single end) | Electrical (Twisted pair) | Electrical (Coaxial)
Power (Note 1) | 1 W/node | 8 W/node | 8 W/node | 5 mW/node | 35 mW/node | 150 mW/node
Multi-master | Yes | Yes | Yes | Yes | No | Yes
Max. # Nodes | 64 | 127 | 127 for Loop | 128 | N/A | 248
Max Bus Length (Note 1) | 72 m (4.5 m/hop) | 10 km (100 m/hop) | Fiber: 10 km, Electrical: 30 m | Approx. 40 m (load < 400 pF) | Approx. 10 m | 500 m
Protocol Overhead | 8% for 278 byte data | 3 bytes per 53-byte frame | 25% for 2168 byte data (Note 2) | 1 byte address + Ack bit / byte | 1 start + 1 stop bits/byte (25%) | 64 bytes / msg (msg < 1500 B)
I/O System Design Example: Selecting an I/O Interface
• There are 17 nodes in the system and the power allocation of the I/O system is 35 W. This eliminates the Fiber Channel and the IEEE 1393
• The latency requirement eliminates the I2C and Ethernet
• The total bandwidth requirement of the system is 56 Mbps. This eliminates the UART
• The system reliability requirement eliminates the IEEE 1394 bus because tree topology is not very fault tolerant
• All interface options, except the UART, are buses and thus meet the scalability requirement. All bus options here support distributed processing.
• The distance requirement rules out a parallel bus
• All interface options, except the UART, are serial buses and thus meet the cable mass requirement
PROBLEM: WE DON’T HAVE AN OPTION THAT CAN MEET ALL REQUIREMENTS!
Resolution: Since power consumption and latency are technology dependent and difficult to improve, the next best option is to improve system reliability using fault tolerance design techniques. Therefore, the IEEE 1394 is the best choice in this case, but it needs to be enhanced with fault tolerance design techniques: use dual redundant buses.
Check: Since redundant buses have to be used, the number of interfaces of the IEEE 1394 bus is doubled. The power consumption will be 17 x 1 W x 2 = 34 W. This is OK since it is still within the 35 W power constraint.
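The power check can be repeated for every candidate with a short script. This is only a sketch: the per-node power values are taken from the candidate-interface table, while the names `POWER_PER_NODE_W`, `bus_power`, and `within_budget` are illustrative assumptions, as is modeling redundancy as a simple multiplier on the number of interfaces.

```python
# Per-node interface power from the candidate table, in watts.
POWER_PER_NODE_W = {
    "IEEE 1394": 1.0,
    "IEEE 1393": 8.0,
    "Fiber Channel": 8.0,
    "I2C": 0.005,
    "UART": 0.035,
    "Ethernet": 0.150,
}
NODES = 17                 # subsystems in the spacecraft example
IO_POWER_BUDGET_W = 35.0   # 35% of the 100 W avionics budget


def bus_power(interface: str, redundancy: int = 1) -> float:
    """Total interface power; redundancy=2 models dual redundant buses."""
    return NODES * POWER_PER_NODE_W[interface] * redundancy


def within_budget(interface: str, redundancy: int = 1) -> bool:
    return bus_power(interface, redundancy) <= IO_POWER_BUDGET_W


for name in POWER_PER_NODE_W:
    print(f"{name:14s} single: {bus_power(name):7.2f} W  ok={within_budget(name)}")
print("IEEE 1394 dual:", bus_power("IEEE 1394", 2), "W  ok =", within_budget("IEEE 1394", 2))
```

The script only covers the power constraint; latency, bandwidth, and reliability still have to be checked as on the slide.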
Establish Requirements: Understanding What You Need
• Application and Environments of the I/O System
– Home Computing
– Industrial Control
– Network
– Aerospace
• Capability Required
– Number of I/O Devices
– Data Rate of the I/O Devices
– Required Throughput: How much data needs to be transferred?
– Maximum Latency: How much delay can the I/O devices tolerate?
– Future Expansion
• Constraints
– Cost Constraints: How much money do you have?
– Power Constraints: Do you have enough power?
– Electrical Interface Constraints Imposed by the I/O devices
– Mechanical Interface Constraints Imposed by the I/O devices
– Logical Interface Constraints: Protocol Imposed by the I/O devices
Select I/O System with Required Capability: Understand the I/O System Being Considered
• Performance: How much data can be handled by the I/O system being considered
– Throughput: function of Bit Rate, Bus Width, Block Size, Protocol Overhead
– Latency or Response Time
– Impact on Processor Performance
• Expandability: How many devices can it handle
– Bus Length: Parallel Buses Are Shorter, Serial Buses Are Longer
– Drive Capability: Bus Loading, Transmission Line Effect
– Multi-Level Buses: Bridge Between Buses
• Access Control: How to arbitrate I/O requests among nodes
– Master-Slave: One Master Controls All Transactions
• Passive Slaves, Active Slaves (interrupt)
– Multi-Master: Arbitration Required Among Masters (processors, controllers)
• Failure Handling: What can the I/O system do in case of failures?
– Reliability vs. Availability
– Fault Tolerant: Fault Detection, Fault Isolation, Fault Recovery
Integration: Understand How Everything Fits Together
• Physical Interface with the I/O Devices
– Electrical Interface
– Mechanical Interface
• Topology
– Star, Multi-Drop, Ring, Tree, etc.
• Protocol: Rules of Communication with the I/O Devices
– Signal Level Protocol
• Synchronization: Synchronous (Clocked), Asynchronous (handshake)
– Packet / Message Level Protocol
– Addressing Capability: Directed, Broadcast, Multi-Cast
– Transaction Types: Split, Unified
• Operating System Support
– Software Device Driver
– Method of Addressing the Devices: I/O Address, Memory Mapped I/O
– Processor & I/O Devices Interaction: Interrupt, Polling, DMA, I/O Processor
– Resource Management: Sharing of I/O Devices
– Protection: Ensure No Conflicts among I/O Devices
Implementation
• If Your I/O System Requirement Can be Met by Standard Interfaces
– It is easy! Just purchase commercial off-the-shelf (COTS) components, software, and test equipment and then integrate them
• If Your I/O System Requirement Needs Custom Design, You Have to
– Specify the protocol and timing of the signals at the interface
– Design the logic required to implement the specification
– Realize the logic design in hardware
– Write the software driver to drive the hardware
Key I/O Design Parameters to be Discussed
• Connectivity
• Protocol
• Access Control
• Performance
• Expandability
• Failure Handling
• Operating System Support
Typical I/O System Layers and Key Parameters
• Physical: Protocol, Connectivity, Access Control, Performance, Expandability, Failure Handling
• Logical: Protocol, Failure Handling
• System Interface: Operating System Support, Failure Handling
Connecting I/O to Processor: Direct Interface
• Ad Hoc
– No definite number of signals, protocol, electrical interface etc.
• Standards
– RS232: Serial interface. Signals include Request-to-Send, Clear-to-Send, TxData, RxData
– UART (Universal Asynchronous Receiver Transmitter): Serial interface protocol, usually used with the RS232
– IEEE 1284: Parallel interface, commonly used for printer port on PCs
[Figure: Input and Output devices wired directly to the Processor (Control, Datapath) and Memory]
Connecting I/O to Processor: Buses
• A Bus is
– a shared communication link
– a single set of wires used to connect multiple subsystems
• A bus is also a fundamental tool for composing large, complex systems
– a systematic means of abstraction
[Figure: Processor (Control, Datapath), Memory, and Input and Output devices all attached to a shared bus]
Types of Buses
• Processor-Memory Bus (design specific)
– Used for Processor-to-Memory I/O
– Usually parallel, short, high speed, and on the processor board
– Matches the processor and memory interfaces to maximize bandwidth
– Optimized for cache block transfers
• I/O Bus (industry standard)
– Used for Processor-to-Peripheral, loosely coupled Processor-to-Processor, and Processor-to-Network I/Os
– Usually serial, lengthy, slower, and implemented by cables, but flexible
– Needs to match a wide range of I/O devices
– Connects to the processor-memory bus or backplane bus through bridges
• Backplane Bus (standard or proprietary)
– Used for Processor-to-Peripheral, tightly coupled Processor-to-Processor I/Os, and Processor-to-Network I/Os
– Backplane: an interconnection structure within the chassis
– Allows processors, memory, and I/O devices to coexist
– Usually parallel; speed is between that of the Processor-Memory Bus and the I/O Bus
– Cost advantage: one bus for all components
• See Backup Slides for Bus Surveys
A Computer System with One Bus: Backplane Bus
• A single bus (the backplane bus) is used for:
– Processor to memory communication
– Communication between I/O devices and memory
• Advantages: Simple and low cost
• Disadvantages: slow, and the bus can become a major bottleneck
• Example: IBM PC-AT
A Two-Level Bus System
• I/O buses tap into the processor-memory bus via bus adaptors:
– Processor-memory bus: mainly for processor-memory traffic
– I/O buses: provide expansion slots for I/O devices
• Example: Apple Macintosh-II
– NuBus: Processor, memory, and a few selected I/O devices
– SCSI Bus: the rest of the I/O devices
A Three-Level Bus System
• A small number of backplane buses tap into the processor-memory bus
– Processor-memory bus is used for processor-memory traffic
– I/O buses are connected to the backplane bus
• Advantage: loading on the processor bus is greatly reduced
• Example: See PCI Bus Example
The General Organization of a Bus
• Data Lines Carry Information Between the Source and the Destination:
– Data and Addresses
– Complex Commands
• Control Lines:
– Signal Requests and Acknowledgments
[Figure: a bus drawn as a set of Control Lines and a set of Data Lines]
Typical Bus Operation and Interface Control
• I/O Operation Consists of
– Check if Device is Available (e.g., check busy signal)
– Send Operation Parameters (e.g., send read/write signals, address)
– Data Transfer (e.g., read or write to Data, Control, Status registers)
– Termination (e.g., send or receive acknowledge signal)
• Methods are:
– Programmed I/O
– Interrupt-Driven
– Direct Memory Access (DMA)
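The four steps of an I/O operation map naturally onto programmed I/O. The sketch below is a software model only: `FakeDevice` and all of its methods are hypothetical stand-ins for device registers, invented here to make the sequence runnable, and do not correspond to any real controller.

```python
class FakeDevice:
    """Hypothetical device model: busy flag, command register, data registers."""

    def __init__(self, registers, busy_polls=0):
        self.registers = dict(registers)
        self._busy_polls = busy_polls   # how many polls report "busy"
        self._command = None
        self.acknowledged = False

    def is_busy(self):
        if self._busy_polls > 0:
            self._busy_polls -= 1
            return True
        return False

    def command(self, operation, address):
        self._command = (operation, address)

    def read_data(self):
        return self.registers[self._command[1]]

    def acknowledge(self):
        self.acknowledged = True


def programmed_read(device, address, max_polls=100):
    # 1. Check if the device is available by polling its busy signal.
    for _ in range(max_polls):
        if not device.is_busy():
            break
    else:
        raise TimeoutError("device stayed busy")
    # 2. Send operation parameters (read command and address).
    device.command("read", address)
    # 3. Data transfer: the processor itself reads the data register.
    value = device.read_data()
    # 4. Termination: acknowledge the transfer.
    device.acknowledge()
    return value


dev = FakeDevice({0x04: 0x5A}, busy_polls=3)
print(hex(programmed_read(dev, 0x04)))
```

The polling loop is where programmed I/O spends processor time; interrupt-driven I/O and DMA exist to remove that loop and the per-word copy, respectively.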
How to Specify a Bus
Parameter | Consideration
Data Bus Width | A wider bus has higher performance but higher cost
Address Bus Width | A wider bus has a larger address space but higher cost
Block Size | A smaller block has higher percent overhead but lower latency. A larger block has lower percent overhead but higher latency.
Topology | Multi-drop topology is simpler but not very scalable. Point-to-point topology is more scalable but more complicated.
Access Control | Single master is simpler. Multiple masters are more complicated (arbitration is required) but support distributed processing
Protocol | A synchronous bus is simpler but its timing is tighter. An asynchronous bus has more relaxed timing but is more complicated
Transaction Type | Unified transaction is simpler but has lower throughput. Pipelined transaction has higher throughput but is more complicated.
Examples of Bus Topologies
• Multi-Drop Bus
– One medium is shared by many devices
– If the medium is a cable, each device needs a coupler to “tap” into the bus
– Need to consider short protection, electrical isolation, and termination
[Figure: Device 1, Device 2, ..., Device N tapped onto shared Data and Command / Address lines through bus couplers, with short protection resistors, isolation (transformer, optical, etc.), and termination resistors]
• Point-to-Point Buses
– One medium between each pair of devices
– Many topologies are possible (e.g., ring, tree, star etc.)
– Short protection, electrical isolation, and termination are less critical
[Figure: point-to-point topologies: Ring (e.g. Token Ring), Tree (e.g. IEEE 1394), Star (e.g. Fiber Channel)]
Examples of Bus Transaction Types
• Unified Transaction
– Request (address and read/write commands) is followed immediately by response (data)
• Split Transaction
– Request is not followed immediately by response. Other requests can be issued in-between
[Timing diagrams: R/W Address (Requesting Node) carries Rd Addr 1 and Rd Addr 2; Data (Responding Node) carries Data1 and Data2. In the unified case each read address is followed directly by its data; in the split case both read addresses are issued before the data returns.]
Examples of Bus Protocols
• Synchronous Bus:
– Includes a clock in the control lines
– A fixed protocol for communication that is relative to the clock
– Advantage: involves very little logic and can run very fast
– Disadvantages:
• Every device on the bus must run at the same clock rate
• To avoid clock skew, synchronous buses cannot be long if they are fast
• Asynchronous Bus:
– It is not clocked
– It can accommodate a wide range of devices (fast and slow)
– It can be lengthened without worrying about clock skew
– It requires a handshaking protocol which can significantly reduce the effective bandwidth
• Some more details in the Protocol discussion
Key Features of PCI Bus
• 32-bit or 64-bit bus running at 33 MHz or 66 MHz, synchronized to host processor clock
• Block oriented data transfer
• Reconfigure bus nodes upon system startup or configuration changes (Plug-and-Play)
• Multi-master, but only one master has bus arbitration capability
• Sub-buses include
– Address and Data Bus (Multiplexed)
– Command and Byte Enable Bus
– Interface Control Signals
– Arbitration Signals
– Error Signals
• Reflected wave signal switching
• Device select and negative acknowledgment
• More details in the Protocol discussion
Serial Bus Example: IEEE 1394 (Firewire)
[Figure: two backplane systems, each with CPU, memory, and I/O modules on a backplane bus alongside an IEEE 1394 Bus (backplane environment), joined through bridges to an IEEE 1394 Bus (cable environment) whose nodes are linked port to port]
Note: IEEE 1394 Bus is a serial bus in both backplane and cable environments
More details in the Protocol discussion
Key Features of the IEEE 1394 Bus
• A digital interface - there is no need to convert digital data into analog and tolerate a loss of data integrity
• Physically small - the thin serial cable can replace larger and more expensive interfaces
• Adopts a tree topology in cable environment and multi-drop topology in backplane environment
• Easy to use - no need for terminators, device IDs, or elaborate setup
• Hot pluggable - users can add or remove 1394 devices with the bus active
• Inexpensive - priced for consumer products
• Scalable architecture - may mix 100, 200, and 400 Mbps devices on a bus
• Flexible topology - support of daisy chaining and branching for true peer-to-peer communication
• Fast - even multimedia data can be guaranteed its bandwidth for just-in-time delivery
• Non-proprietary
• Mixed asynchronous and isochronous traffic
• More details in the Protocol discussion
Advantages of Buses
• Versatility:
– New devices can be added easily
– Peripherals can be moved between computer systems that use the same bus standard
• Low Cost:
– A single set of wires is shared in multiple ways
– Easy to maintain
• Manage complexity by partitioning the design
Disadvantages of Buses
• It creates a communication bottleneck
– The bandwidth of that bus can limit the maximum I/O throughput
• The maximum bus speed is largely limited by:
– The length of the bus
– The number of devices on the bus
– The need to support a range of devices with:
• Widely varying latencies
• Widely varying data transfer rates
• A single point of failure: one bus failure (e.g., short to ground) can fail the entire system
Signal Level Protocol: Typical Synchronous Protocol
• Wait signal is optional: Slave can use this signal to indicate when it is prepared for data transfer
• Actual transfer goes at bus rate
[Timing diagram: Clock (master), Valid (master), R/W Address (master) carrying Cmd+Addr, Data (master/slave) carrying Data1 and Data2, and Wait (slave)]
Signal Level Protocol: Typical Asynchronous Protocol (Handshaking)
• t0: Master has obtained control and asserts address, direction, and data. It waits a specified amount of time for slaves to decode the target
• t1: Master asserts request line
• t2: Slave asserts ack, indicating data received
• t3: Master releases req
• t4: Slave releases ack
[Timing diagram, Write Transaction: Address, Data, and Rd / Wr driven by the master, Req (master), and Ack (slave), with time marks t0 to t5. The master asserts address and data, and the next address follows once the handshake completes.]
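The write handshake can be traced step by step in software. This is a minimal sketch in which the bus is modeled as a Python dictionary of signal values and the slave as a plain memory dictionary; the function `handshake_write` and the signal names are illustrative assumptions, and real hardware would of course react to signal edges rather than run sequential code.

```python
def handshake_write(address, data, memory):
    """Trace one asynchronous write; returns (event log, final bus state)."""
    bus = {"address": None, "data": None, "write": False, "req": False, "ack": False}
    events = []

    # t0: master drives address, direction, and data, then lets slaves decode.
    bus.update(address=address, data=data, write=True)
    events.append("t0: master asserts address, direction, data")

    # t1: master asserts the request line.
    bus["req"] = True
    events.append("t1: master asserts req")

    # t2: slave sees req, latches the data, and asserts ack.
    if bus["req"] and bus["write"]:
        memory[bus["address"]] = bus["data"]
        bus["ack"] = True
    events.append("t2: slave asserts ack (data received)")

    # t3: master sees ack and releases req.
    if bus["ack"]:
        bus["req"] = False
    events.append("t3: master releases req")

    # t4: slave sees req released and releases ack; the bus is free again.
    if not bus["req"]:
        bus["ack"] = False
    events.append("t4: slave releases ack")
    return events, bus


memory = {}
log, final_bus = handshake_write(0x10, 0xAB, memory)
print("\n".join(log))
```

Each step depends on the signal set in the previous step, which is why the handshake needs no shared clock but costs several signal transitions per transfer.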
Signal Level Protocol: Typical Asynchronous Protocol (Handshaking)
• t0: Master has obtained control and asserts address and direction. It waits a specified amount of time for slaves to decode the target
• t1: Master asserts request line
• t2: Slave asserts ack, indicating ready to transmit data
• t3: Master releases req, data received
• t4: Slave releases ack
[Timing diagram, Read Transaction: Address and Rd / Wr driven by the master, Data driven by the slave, Req (master), and Ack (slave), with time marks t0 to t4. The master asserts the address, and the next address follows once the handshake completes.]
Asynchronous Protocol for Multiplexed Bus
• Three Control Lines
– ReadReq: Indicates a Read Request for Memory
• Address is Put on the Data Lines at the Same Time
– DataRdy: Indicates the Data Word is Now Ready on the Data Lines
• Data is put on the Data Lines at the Same Time
– Ack: Acknowledge the ReadReq or the DataRdy of the Other Party
[Timing diagram, Read Example: ReadReq (master), Data lines (address from the master, then data from the slave), Ack (slave, then master), and DataRdy (slave)]
State Diagram for Asynchronous Multiplexed Bus
• I/O Device (master)
– Start: put address on data lines; assert ReadReq. Wait for Ack
– 2: on Ack, release data lines; deassert ReadReq. Wait for DataRdy
– 5: on DataRdy, read memory data from data lines; assert Ack. Wait for DataRdy to be deasserted
– 7: deassert Ack
• Memory (slave)
– Idle: wait for ReadReq
– 1: on ReadReq, record address on data lines and assert Ack. Wait for ReadReq to be deasserted
– 3, 4: drop Ack; put memory data on data lines; assert DataRdy. Wait for Ack
– 6: on Ack, release data lines and DataRdy
[Figure markers: Address Phase Done, Data Phase Done]
Example of Signal Level Protocol: PCI Bus Protocol (Read)
[Timing diagram over clock cycles 1 to 9. Signals: CLK, FRAME#, AD (Addr, Data 1, Data 2, Data 3), C/BE# (Bus Cmd, then Byte Enables), IRDY#, TRDY#, DEVSEL#, GNT#. The diagram marks the wait states and the data transfers.]
Annotations:
• Initiator starts transaction by asserting FRAME#, driving address onto AD bus and command onto C/BE bus
• Target latches and decodes address and command
• Turn-around cycle: initiator stops driving AD bus
• Initiator stops driving command and starts driving byte enables
• Target keeps TRDY# deasserted to enforce turn-around cycle
• Target device asserts DEVSEL#
• Target begins to drive data back to initiator
• Initiator deasserts FRAME#, indicating that it is ready to complete last data phase
• Initiator deasserts IRDY#, returning bus to idle state
• Target deasserts TRDY# and DEVSEL#
Packet Level Protocol
• A packet is the unit of information exchange in an I/O system
• The packet level protocol specifies the rules of communication through the contents of the packets
• General format of a packet– Header Fields
• Destination address
• Command
• Data length
• Source address (optional)
• Other auxiliary information
– Data Field
– Error Checking Code
• Network usually requires multi-level headers
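The general packet format can be illustrated with Python's `struct` and `zlib` modules. This is a sketch under stated assumptions: the field widths (16-bit addresses, 8-bit command, 16-bit data length), the big-endian layout, and the use of CRC-32 as the error checking code are all choices made here for illustration and are not taken from any particular bus standard.

```python
import struct
import zlib

# Hypothetical header layout: destination, source, command, data length.
HEADER = struct.Struct(">HHBH")
CHECK = struct.Struct(">I")   # CRC-32 used as the error checking code


def build_packet(dest: int, src: int, command: int, data: bytes) -> bytes:
    header = HEADER.pack(dest, src, command, len(data))
    crc = zlib.crc32(header + data)
    return header + data + CHECK.pack(crc)


def parse_packet(packet: bytes):
    body, trailer = packet[:-CHECK.size], packet[-CHECK.size:]
    (crc,) = CHECK.unpack(trailer)
    if zlib.crc32(body) != crc:
        raise ValueError("error check failed: packet corrupted")
    dest, src, command, length = HEADER.unpack(body[:HEADER.size])
    data = body[HEADER.size:HEADER.size + length]
    return dest, src, command, data


pkt = build_packet(dest=0x0012, src=0x0001, command=0x02, data=b"hello")
print(parse_packet(pkt))
```

The receiver recomputes the check over header and data, so a corrupted destination address is caught as reliably as corrupted payload.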
Example of Packet Level Protocol: IEEE 1394 Protocol
• Physical Layer: Data-Strobe Encoding
[Figure: Data Line and Strobe Line waveforms; Data xor Strobe is used for the clock]
• Cycle Structure
[Figure: consecutive cycles (#m-1, #m, #m+1) with a nominal cycle period of 125 µs. Each cycle begins at a cycle synch with a cycle start packet (cycle start data = x, y; the cycle start delay is carried as data). Isochronous Packets (Ch J, Ch K, Ch L, Ch N) follow, separated by isochronous (short) gaps. Asynchronous Packets (Packet A, Packet B) are each followed by an ack (short) gap and an Acknowledge Packet, and are separated by subaction (long) gaps.]
IEEE 1394 Packet Examples
• Asynchronous Packets
– Read Request Packet fields: Destination_ID, tl, rt, tcode, pri, Source_ID, Destination_offset, Data_length, Extended_tcode, Header_CRC
– Read Response Packet fields: Destination_ID, tl, rt, tcode, pri, Source_ID, rcode, Reserved, Data_length, Extended_tcode, Header_CRC, Data Block ... Last Data Block (zero padded if necessary), Data_CRC
• Isochronous Packets (always multicast)
– Fields: Data_length, tag, Channel, tcode, sy, Header_CRC, Data Block ... Last Data Block (zero padded if necessary), Data_CRC
• Acknowledge Packet
– Fields: Ack_code, Ack_parity
Note: Broadcast and multicast packets do not require acknowledgement or response. Therefore, they usually do not have a source address
TCP/IP Protocol Stack and Packet Headers
• TCP/IP has 5 levels of protocol: Application, Transport, Internet, Network Interface, Physical
[Figure: the Source Node and the Destination Node each run the five-level stack and are connected through the Network]
• Packet layout: Frame Header, IP Header, TCP Header, Data
• IP header fields: VERS, H. Len, Service Type, Total Length, Identification, Flags, Fragment Offset, Time to Live, Type, Header Checksum, Source IP Address, Destination IP Address, IP Options, Padding
• TCP header fields: Source Port, Destination Port, Sequence Number, Acknowledgment Number, H. Len, Unused, Code Bits, Window, Checksum, Urgent Pointer, Options
Obtaining Access to the Bus
• One of the most important issues in bus design:
– Since the bus is a shared resource, how does a device reserve the bus when it wishes to use it?
• Chaos is avoided by a master-slave arrangement:
– Only the bus master can control access to the bus: It initiates and controls all bus requests
– A slave responds to read and write requests
• The simplest system:
– Processor is the only bus master
– All bus requests must be controlled by the processor
– Major drawback: the processor is involved in every transaction
[Figure: a Bus Master and Bus Slaves on a bus. The master issues command & address, data can go either way, and one slave is selected while the others are not.]
Bus Transaction in a Single Master Bus
• A bus transaction consists of two parts:
– Issuing the command (and address): request
– Transferring the data: action
• Master is the one who starts the bus transaction by:
– Issuing the command (and address)
• Slave is the one who responds to the address by:
– Sending data to the master if the master asks for data
– Receiving data from the master if the master wants to send data
[Figure: a Bus Master and Bus Slaves on a bus. The master issues command & address, data can go either way, and one slave is selected while the others are not.]
Multiple Master Bus
• More than one device has the capability to become bus master and initiate bus transactions
• The target device will respond whether it is master-capable or just a slave
• Advantage: the workload can be shared among bus masters
• Disadvantage: need to determine who has the right to use the bus (arbitration)
Most modern buses are multi-master
[Figure: two Bus Masters and a Bus Slave on a bus with Arbitration between the masters. The winning master issues command & address, and data can go either way.]
Arbitration in Multi-Master Bus
• Bus arbitration scheme:
– A bus master wanting to use the bus asserts the bus request
– A bus master cannot use the bus until its request is granted
– A bus master must signal to the arbiter after it finishes using the bus
• Bus arbitration schemes usually try to balance two factors:
– Bus priority: the highest priority device should be serviced first
– Fairness: Even the lowest priority device should never be completely locked out from the bus
• Bus arbitration schemes can be divided into four broad classes:
– Daisy chain arbitration: single device with all request lines.
– Centralized, parallel arbitration: see next-next slide
– Distributed arbitration by self-selection: each device wanting the bus places a code indicating its identity on the bus.
– Distributed arbitration by collision detection: Ethernet uses this.
Daisy Chain Bus Arbitration Scheme
• Advantage: simple
• Disadvantages:
– Cannot assure fairness: A low-priority device may be locked out indefinitely
– The use of the daisy chain grant signal also limits the bus speed
[Figure: the Bus Arbiter passes Grant through Device 1 (highest priority), Device 2, ..., Device N (lowest priority) in a chain; the Request and Release lines are shared wired-OR lines back to the arbiter]
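The fairness problem of the daisy chain is easy to see in a small model. In this sketch the chain is a list of booleans ordered from the device nearest the arbiter (highest priority) to the farthest; the function name `daisy_chain_grant` and this list representation are illustrative assumptions.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that keeps the grant, or None.

    requests[i] is True if device i wants the bus; index 0 is the device
    closest to the arbiter and therefore has the highest priority.
    """
    for position, wants_bus in enumerate(requests):
        if wants_bus:
            # A requesting device absorbs the grant instead of passing it on.
            return position
    return None  # nobody requested; the grant falls off the end of the chain


# Device 0 requests on every round, so device 2 is locked out indefinitely.
wins = [0, 0, 0]
for _ in range(5):
    winner = daisy_chain_grant([True, False, True])
    wins[winner] += 1
print(wins)
```

The loop also hints at the speed limit: the grant has to ripple through every device ahead of the winner before the transaction can start.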
Centralized Parallel Arbitration
• Used in essentially all processor-memory buses and high-speed I/O buses
• Disadvantage: Number of wires increases with devices
[Figure: Device A, Device B, ..., Device N each have their own Req and Grant lines (Req A / Grant A, Req B / Grant B, ..., Req N / Grant N) to the Bus Arbiter, and share the Data Bus and Control Bus]
Increasing Transaction Rate on Multimaster Bus
• Overlapped arbitration
– perform arbitration for next transaction during current transaction
• Bus parking
– master can hold onto the bus and perform multiple transactions as long as no other master makes a request
• Overlapped address / data phases (previous slide)
– requires one of the above techniques
• Split-phase (or packet switched) bus
– completely separate address and data phases
– arbitrate separately for each
– address phase yields a tag which is matched with the data phase
• “All of the above” in most modern memory buses
I/O System Performance
• I/O System performance depends on many aspects of the system (“limited by weakest link in the chain”):
– The CPU speed
– The bandwidth and latency of underlying interconnection (buses)
– The speed of the I/O controller
– The speed of the I/O device
– The speed of the I/O software (Operating System)
– The efficiency of the software’s use of the I/O devices
– The speed of the memory system:
• Internal and external caches
• Main Memory
• Two common performance metrics:
– Throughput: I/O bandwidth
– Response time: Latency
Performance Analysis Using Queueing Model Example: Disk I/O
• Disk Access Time = (Seek time + Rotational Latency + Transfer time + Controller Time) + Queueing Delay
– Seek time = time to move the head to the target track
– Rotational latency = time to wait for the data to come under the read head
• Estimating Queue Length:
– Utilization: U = Request Rate / Service Rate
– Mean Queue Length = U / (1 - U)
– As Request Rate -> Service Rate, Mean Queue Length -> Infinity
[Figure: the Processor issues requests at the Request Rate into Queues, each served at the Service Rate by a Disk Controller and Disk]
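The queue length estimate is quick to tabulate. This sketch implements only the slide's formula, Mean Queue Length = U / (1 - U); the function name and the example rates (requests per second against a service rate of 100 per second) are made up for illustration.

```python
def mean_queue_length(request_rate: float, service_rate: float) -> float:
    """Mean queue length U / (1 - U), with U = request rate / service rate."""
    if request_rate >= service_rate:
        raise ValueError("utilization >= 1: the queue grows without bound")
    utilization = request_rate / service_rate
    return utilization / (1.0 - utilization)


for rate in (50, 80, 90, 99):
    print(f"U = {rate / 100:.2f} -> mean queue length {mean_queue_length(rate, 100):.1f}")
```

The queue length roughly doubles between 80% and 90% utilization and grows about tenfold again by 99%, which is the sense in which it tends to infinity as the request rate approaches the service rate.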
Disk Access Time Example
• 512 byte sector, rotation at 5400 RPM, advertised seek time is 12 ms, transfer rate is 4 MB/sec, controller overhead is 1 ms, queue idle so no queueing delay
• Disk Access Time = Seek time + Rotational Latency + Transfer time + Controller Time + Queueing Delay
= 12 ms + 0.5 / 5400 RPM + 0.5 KB / 4 MB/s + 1 ms + 0
= 12 ms + 0.5 / 90 RPS + 0.125 / 1024 s + 1 ms + 0
= 12 ms + 5.5 ms + 0.1 ms + 1 ms + 0 ms
= 18.6 ms
– Note: rotation latency is 0.5 rotation on the average
• If real seeks are 1/3 of advertised seeks, then the access time is 10.6 ms, with rotation delay accounting for about 50% of the time!
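The same arithmetic in Python, as a sketch with an illustrative function name and parameters. It takes 1 MB as 1024 KB, as the slide does, and it does not round the intermediate terms, so the totals come out slightly above the slide's 18.6 ms and 10.6 ms.

```python
def disk_access_time_ms(seek_ms, rpm, sector_kb, transfer_mb_per_s,
                        controller_ms, queueing_ms=0.0):
    """Seek + average rotational latency + transfer + controller + queueing."""
    rotations_per_s = rpm / 60.0
    rotational_ms = 0.5 / rotations_per_s * 1000.0   # half a rotation on average
    transfer_ms = sector_kb / (transfer_mb_per_s * 1024.0) * 1000.0
    return seek_ms + rotational_ms + transfer_ms + controller_ms + queueing_ms


advertised = disk_access_time_ms(12.0, 5400, 0.5, 4.0, 1.0)
realistic = disk_access_time_ms(12.0 / 3, 5400, 0.5, 4.0, 1.0)
print(f"advertised seek: {advertised:.2f} ms")
print(f"1/3 seek:        {realistic:.2f} ms")
```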
Simple Producer-Server Model
• Throughput:
– The number of tasks completed by the server in unit time
– In order to get the highest possible throughput:
• The server should never be idle
• The queue should never be empty
• Response time:
– Begins when a task is placed in the queue
– Ends when it is completed by the server
– In order to minimize the response time:
• The queue should be empty
• The server should be idle
[Figure: Producer -> Queue -> Server]
Performance Enhancement
• In general throughput can be improved by:
– Throwing more hardware at the problem
– Reducing load-related overhead
• Response time is much harder to reduce:
– Function of technology
[Figure: one Producer feeding two Queue and Server pairs in parallel]
I/O Performance Enhancement Example: Increasing Bus Throughput
• Separate versus multiplexed address and data lines:
– Address and data can be transmitted in one bus cycle if separate address and data lines are available
– Cost: (a) more bus lines, (b) increased complexity
• Data bus width:
– By increasing the width of the data bus, transfers of multiple words require fewer bus cycles
– Example: SPARCstation 20’s memory bus is 128 bits wide
– Cost: more bus lines
• Block transfers:
– Allow the bus to transfer multiple words in back-to-back bus cycles
– Only one address needs to be sent at the beginning
– The bus is not released until the last word is transferred
– Cost: Increased complexity and slower response time
• Pipelined Bus
– Initiate next address phase during current data phase
– Cost: Increased complexity in bus control logic
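The benefit of block transfers can be put in numbers with a simple cycle count. This sketch assumes a multiplexed bus where an address takes one bus cycle and each data word takes one bus cycle; those cycle counts and the function names are assumptions for illustration, and arbitration and wait states are ignored.

```python
def cycles_single_word(words: int, addr_cycles: int = 1, data_cycles: int = 1) -> int:
    """Every word is its own transaction: address, then data."""
    return words * (addr_cycles + data_cycles)


def cycles_block(words: int, addr_cycles: int = 1, data_cycles: int = 1) -> int:
    """One address at the beginning, then back-to-back data cycles."""
    return addr_cycles + words * data_cycles


for n in (1, 4, 8):
    print(f"{n} words: single-word {cycles_single_word(n)} cycles, block {cycles_block(n)} cycles")
```

The count also shows the stated cost: during an 8-word block the bus is held for 9 cycles, so another master waits longer than it would between single-word transfers.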
Expandability
• Depends on Many Factors
– Bus Length: Constrained by bit rate and cross-talk
– Bus Driver Capability: Constrained by how much current each node can source or sink
– Topology
• The number of devices in star or point-to-point configuration must be determined ahead of time
• Multi-drop buses are more expandable. Devices can be added any time, but the shared bus media will eventually become a bottleneck
• Point-to-point buses are much more scalable
– Built-In Expandability: Some buses support expansion by using repeaters and bridges
– Bus Bandwidth: The higher the bandwidth of the I/O system, the more nodes it can support
– Processor Performance: A faster processor can handle more I/O operations and thus more I/O devices
Resilience in the face of failure
• Two terms that are often confused:
  – Reliability: Is anything broken? There are two views:
    • Is the system broken (e.g., your computer crashed)?
    • Is a component (i.e., an I/O device) broken (e.g., the printer is not working)?
  – Availability: Is the system available to the user?
• System reliability can be improved by:
  – Component reliability
    • Can only be improved by building more reliable components using better quality control or more advanced technology
  – Fault tolerant design
    • Adding fault detection logic and redundant components
  – Building with fewer components
    • This contradicts fault tolerant design. Careful trade-off is required.
  – Better environmental conditions
• Availability can be improved by:
  – Having good repair personnel
  – Having sufficient spare components
Basic Ideas of Fault Tolerance Design
• Fault Detection:
  Hardware techniques
  – Using duplicate-and-compare
  – Using coding techniques
  Software techniques
  – Watchdog timer
• Fault Isolation: to identify the location of the faulty component
  – Usually done in software
• Fault Recovery
  – Replace the faulty component with a backup component
  – Sometimes fault recovery can be done by masking the fault with error correction code or voting
Error Checking Codes: Parity and Checksum
• Parity: Add a check bit to make the total number of 1’s even (odd) for even (odd) parity. For checking with even parity, XORing all bits, including the check bit, gives 0 if there is no error.
  – Very simple
  – Cannot detect an even number of error bits
• Checksum: Compute a check symbol by adding logically (i.e., XORing) the corresponding bits of all data bytes in a block. For checking, XORing all data bytes and the check symbol gives 0 if there is no error.
  – Simple
  – Cannot detect an even number of error bits in the same bit position
Parity example: Even parity, 10001011
Checking: Syndrome = 1 ⊕ 0 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ 1 ⊕ 1 = 0
Checksum example:
Data bytes:    11101011    Checking:   11101011
               00110101                00110101
Check symbol:  11011110                11011110
                           Syndrome =  00000000
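Both checks reduce to XOR operations, as the following sketch shows. It assumes the last bit of the parity example is the check bit and uses the two data bytes of the checksum example; the function names are hypothetical. It also demonstrates the blind spot of the checksum: two errors in the same bit position cancel out.

```python
from functools import reduce

def even_parity_bit(data_bits: str) -> int:
    """Check bit that makes the total number of 1s even."""
    return data_bits.count("1") % 2

def parity_syndrome(codeword: str) -> int:
    """XOR of all bits; 0 means no error detected (even parity)."""
    return reduce(lambda a, b: a ^ b, (int(c) for c in codeword))

def xor_checksum(block) -> int:
    """Check symbol: bitwise XOR of all bytes in the block."""
    return reduce(lambda a, b: a ^ b, block, 0)

# Parity: 7 data bits plus one check bit.
codeword = "1000101" + str(even_parity_bit("1000101"))
print(codeword, "syndrome:", parity_syndrome(codeword))

# Checksum over two data bytes.
block = [0b11101011, 0b00110101]
check = xor_checksum(block)
print(f"check symbol: {check:08b}")

# A single flipped bit is detected (non-zero syndrome).
one_error = xor_checksum([block[0] ^ 0b00000100, block[1], check])
# Two flipped bits in the same position cancel and go undetected.
two_errors = xor_checksum([block[0] ^ 0b00000100, block[1] ^ 0b00000100, check])
print(f"one error: {one_error:08b}, two errors in same position: {two_errors:08b}")
```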
Error Checking Code: Cyclic Redundancy Check
Capable of detecting error bursts of up to n bits with an n-bit check symbol
Let's Use the Following Definitions:
M - The original frame to be transmitted, before adding the check symbol. It is k bits long.
F - The resulting check symbol to be added to M. It is n bits long.
T - The concatenation of M and F. This is the resulting frame that will be transmitted. It is k+n bits long.
P - The pre-defined CRC polynomial. A pattern of n+1 bits. For the CRC to be effective, P should be a prime number.
The main idea behind the CRC algorithm is to find a value of F such that the remainder of T/P is zero. The process to create the CRC is as follows:
1. Get the raw frame M and left shift it by n bits (i.e., M' = M followed by n zeros)
2. Shift M' into a linear feedback shift register (LFSR) constructed according to P
3. After all the bits of M' are shifted in, the remainder in the LFSR is the check symbol F
4. Append F to M. The result is the frame T to transmit
CRC check process: Upon receiving T, verify that the remainder of T/P is still zero.
1. Receive the frame T
2. Divide T by P by shifting all the bits of T into the LFSR
3. Check the remainder in the LFSR. There is an error in the frame if it is not zero.
CRC Example
CRC Creation: Let's assume the check symbol F is 5 bits in length (n=5).
M = 1010001101 (k=10)
M' = 101000110100000 (M shifted left by 5 bits)
P = 110101 (n+1=6), i.e., P(x) = X^5 + X^4 + X^2 + 1
F can be computed by shifting M' into a linear feedback shift register (LFSR). The feedback connections of the LFSR correspond to the bits of P.
F = 01110 = content of the LFSR after all bits of M' are shifted in
Then the transmitted frame will be:
T = 101000110101110
Check: Shift all bits of T into the LFSR and check the remainder
Without error: Remainder = 00000
If an error is introduced during transmission: T' = 101000110100110
With error: Remainder = 01000
[Figure: LFSR built from five flip-flops (F/F) and three XOR gates, with taps labeled X5 to X0 matching P = 1 1 0 1 0 1. Generate: input 1010001101 00000 gives 1010001101 01110. Check: input 1010001101 01110 gives 1010001101 00000.]
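The shift-register procedure is equivalent to modulo-2 long division, which can be sketched in a few lines. This is an illustrative software model, not the hardware LFSR itself; the function name `crc_remainder` and the use of bit strings are assumptions made for readability.

```python
def crc_remainder(bits: str, poly: str) -> str:
    """Remainder of bits / poly using modulo-2 (XOR) long division."""
    n = len(poly) - 1            # number of check bits
    p = int(poly, 2)
    reg = 0                      # plays the role of the LFSR contents
    for b in bits:
        reg = (reg << 1) | int(b)     # shift the next bit in
        if (reg >> n) & 1:            # top bit set: subtract (XOR) P
            reg ^= p
    return format(reg, f"0{n}b")

M = "1010001101"
P = "110101"
n = len(P) - 1

F = crc_remainder(M + "0" * n, P)    # steps 1 to 3: shift M left by n, divide
T = M + F                            # step 4: append the check symbol
print("F =", F)
print("T =", T)
print("check of T:", crc_remainder(T, P))

T_err = "101000110100110"            # frame with one corrupted bit
print("check of corrupted frame:", crc_remainder(T_err, P))
```

A zero remainder at the receiver means no error was detected; any non-zero remainder flags the frame as corrupted.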
Processor Giving Commands to I/O Devices
• Two methods are used to address the device:
  – Special I/O instructions
  – Memory-mapped I/O
• Special I/O instructions specify:
  – Both the device number and the command word
    • Device number: the processor communicates this via a set of wires normally included as part of the I/O bus
    • Command word: this is usually sent on the bus’s data lines
• Memory-mapped I/O:
  – Portions of the address space are assigned to I/O devices
  – Reads and writes to those addresses are interpreted as commands to the I/O devices
  – User programs are prevented from issuing I/O operations directly
    • The I/O address space is protected by the address translation
I/O Device Communicating to the Processor
• The Operating System needs to know when:
  – The I/O device has completed an operation
  – The I/O operation has encountered an error
• This can be accomplished in two different ways:
  – Polling:
    • The I/O device puts information in a status register
    • The Operating System periodically checks the status register
  – I/O Interrupt:
    • Whenever an I/O device needs attention from the processor, it interrupts the processor from what it is currently doing.
Polling: Programmed I/O
• Advantage: – Simple: the processor is totally in control and does all the work
• Disadvantage:– Polling can consume a lot of CPU time
– Processor cannot respond to I/O events in real time unless polling the I/O devices at very high speed
[Figure: the Processor polls Devices #1 to #4 in turn over the bus (Poll #1, Poll #2, Poll #3, Poll #4, Poll #1, ...). Devices #1 and #4 have data; Devices #2 and #3 have no data.]
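One round of polling can be sketched as a loop over device status registers. The `Device` class and `poll_round` function are hypothetical stand-ins for real status and data registers; the point is that the processor spends a check on every device, including those with nothing to report.

```python
class Device:
    """Toy device with a status flag and a one-word data register."""
    def __init__(self, name, data=None):
        self.name = name
        self.data = data
        self.status_ready = data is not None

    def read_data(self):
        # Reading the data register also clears the ready status.
        value, self.data, self.status_ready = self.data, None, False
        return value

def poll_round(devices):
    """Check every device once; return (served items, number of status checks)."""
    served, checks = [], 0
    for dev in devices:
        checks += 1                      # the processor pays for every check
        if dev.status_ready:
            served.append((dev.name, dev.read_data()))
    return served, checks

devices = [Device("Device #1", "a"), Device("Device #2"),
           Device("Device #3"), Device("Device #4", "d")]
first_round = poll_round(devices)
second_round = poll_round(devices)       # nothing left: all checks are wasted
print(first_round)
print(second_round)
```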
I/O Interrupt
• An I/O interrupt is just like an exception except:
  – An I/O interrupt is asynchronous
  – Further information needs to be conveyed
• An I/O interrupt is asynchronous with respect to instruction execution:
  – An I/O interrupt is not associated with any instruction
  – An I/O interrupt does not prevent any instruction from completing
    • You can pick your own convenient point to take an interrupt
• An I/O interrupt is more complicated than an exception:
  – It needs to convey the identity of the device generating the interrupt
  – Interrupt requests can have different urgencies:
    • Interrupt requests need to be prioritized
Interrupt Driven Data Transfer
• Advantage:
  – User program progress is only halted during actual transfer
  – "Immediate" response to I/O requests (most real-time systems use the interrupt-driven approach)
• Disadvantage, special hardware is needed to:
  – Cause an interrupt (I/O device)
  – Detect an interrupt (processor)
  – Save the proper states to resume after the interrupt (processor)
[Figure: the Processor and Devices #1 to #3. The devices normally have no data; when a device has data it interrupts the processor and the data is served.]
Hardware Implementation of I/O Interrupt
[Figure: I/O Devices 1 and 2 raise Interrupt Sources 1 to 4, which pass through double synchronizers (D flip-flops clocked by Clk) inside the I/O Controller. The controller also contains the Interrupt Mask Reg, the Interrupt Priority Logic, and decoders, and connects to the Processor through the Data Bus (to/from Processor) and the Address Bus (from Processor). Example shown: Interrupt Register = 1000 means source 1, and the address for Device 1 is selected.]
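The roles of the mask register and the priority logic can be sketched in software. The encoding follows the figure (a 4-bit interrupt register where 1000 means source 1); treating source 1 as the highest priority and a mask bit of 1 as "enabled" are assumptions made for this example, and the function name is hypothetical.

```python
def select_interrupt(pending, mask, n_sources=4):
    """Return the highest-priority enabled source number, or None.

    pending and mask are n_sources-bit values; the leftmost bit is source 1.
    """
    active = pending & mask              # masked-off sources are ignored
    for source in range(1, n_sources + 1):
        bit = 1 << (n_sources - source)  # source 1 is the most significant bit
        if active & bit:
            return source                # assumed priority: lowest number wins
    return None

print(select_interrupt(0b1000, 0b1111))  # only source 1 pending
print(select_interrupt(0b1010, 0b0111))  # sources 1 and 3 pending, source 1 masked
print(select_interrupt(0b0000, 0b1111))  # nothing pending
```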
Programmer’s View of Interrupt
• Interrupt Target Address Options:
– General: Branch to a Common Address for all Interrupts; Software then Decodes the Cause and Figures Out What to do.
– Specific: Automatically Branch to Different Addresses Based on Interrupt Type and/or Level (Known as Vectored Interrupt)
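The two options differ in where the decoding happens. The sketch below models both in software; the cause codes, handler names, and table are all hypothetical. In real hardware the vectored lookup is done by the processor itself rather than by a Python dictionary.

```python
def disk_handler():
    return "disk serviced"

def keyboard_handler():
    return "keyboard serviced"

# General: every interrupt enters here and software decodes the cause.
def common_entry(cause):
    if cause == 0:
        return disk_handler()
    if cause == 1:
        return keyboard_handler()
    raise ValueError(f"unknown interrupt cause {cause}")

# Specific (vectored): the interrupt type indexes a table of entry points,
# so control reaches the right handler without a software decode step.
VECTOR_TABLE = {0: disk_handler, 1: keyboard_handler}

def vectored_entry(vector):
    return VECTOR_TABLE[vector]()

print(common_entry(1))
print(vectored_entry(1))
```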
Issues in the Use of Interrupts
• High Priority is Assigned to Devices that Require Rapid Response. Otherwise, They Lose Data (e.g., Hard Disk Controllers).
• Low Priority Given to Slow Devices (e.g., Keyboard and Mouse)
• Higher Priority Interrupts can Interrupt Lower Priority Interrupts, But all Interrupt Routines Should Disable All Interrupts While They Are Saving State; Otherwise the System May Fail
Disadvantages of Interrupt Driven I/O
• Large Overheads are Associated with Interrupt-Driven I/O
• For Each Item Transferred Within a Block of Data
  – Identify Interrupt
  – Save State of Process
  – Load State of Service Routine
  – Access Address and Count In Memory
  – Restore State of Process
• Results in
  – Inefficient use of CPU
  – Slow Response to I/O
• Overhead can be Reduced by
  – Including a “block transfer” Instruction in Architecture
  – General Interrupt Used only at End of Block Transfer
Delegating I/O Responsibility from CPU: DMA
• Direct Memory Access (DMA):
  – External to the CPU
  – Acts as a Bus Master
  – After Initial Set Up by the CPU, the DMA Controller Can Transfer Blocks of Data to or From Memory Without CPU Intervention
[Figure labels: Set up DMA parameters; DMA request; DMA Ack; Data transfer directly to memory without processor’s help]
Advantages of DMA
• Improves Response Time for Devices with Large Amounts of Data Transfers (Disks, Drums)
• Reduces CPU Overhead
• Data Transferred Directly to Memory
• Memory Addresses / Counts Computed in DMA Controller
• Cycle Stealing or Simultaneous Access (Multimodule Memory)
  – Cycle Stealing: The DMA Controller has Priority Over the CPU to Use the Processor-Memory Bus. The CPU must Wait During the DMA Controller’s Infrequent Accesses to Memory
Delegating I/O Responsibility from CPU: IOP
• I/O Processor (IOP)
[Figure: the CPU and the IOP share the main memory bus to Mem; the IOP connects to devices D1, D2, ..., Dn over the I/O bus]
• CPU - IOP Interactions
(1) CPU issues instruction to IOP
    I/O Instruction Format: OP | Device | Address (OP; target device; where cmnds are)
(2) IOP looks in memory for commands
    Command Format in Memory: OP | Addr | Cnt | Other (what to do; where to put data; how much; special requests, e.g., next Instr addr)
(3) IOP executes commands by transferring I/O device data to/from memory directly. IOP steals memory cycles.
(4) IOP interrupts CPU when done
Responsibilities of the Operating System
• The operating system acts as the interface between:
  – The I/O hardware and the program that requests I/O
• Three characteristics of the I/O systems in OS:
  – The I/O system is shared by multiple programs using the processor
  – I/O systems often use interrupts (externally generated exceptions) to communicate information about I/O operations.
    • Interrupts must be handled by the OS because they cause a transfer to supervisor mode
  – The low-level control of an I/O device is complex:
    • Managing a set of concurrent events
    • The requirements for correct device control are very detailed
Operating System Requirements
• Provide abstraction for accessing devices:
  – Supply routines that handle low-level device operation
• Handle the interrupts generated by I/O devices
• Provide protection to shared I/O resources
  – Guarantee that a user’s program can only access the portions of an I/O device to which the user has rights
• Provide equitable access to the shared I/O resources
  – All user programs must have equal access to the I/O resources
• Schedule accesses in order to enhance system throughput
OS and I/O System Communication Requirements
• The Operating System must be able to prevent:
  – The user program from communicating with the I/O device directly
• If user programs could perform I/O directly:
  – Protection to the shared I/O resources could not be provided
• Three types of communication are required:
  – The OS must be able to give commands to the I/O devices
  – The I/O device must be able to notify the OS when the I/O device has completed an operation or has encountered an error
  – Data must be transferred between memory and an I/O device
Specification of the Interface Signals
Design an I/O controller that reads a 32-bit word from an I/O device under the command of the processor. The protocol and timing are as follows:
[Figure: block diagram. Between Processor and I/O Controller: Proc Data Bus, Proc Addr Bus, Write Enable, Read Enable. From I/O Controller to I/O Device: Read Request. From I/O Device to I/O Controller: I/O Data Bus, I/O Data Ready.]
Timing (four bus operations in sequence):
Operation | Write Command | Read Status | Read Status | Read Data
Proc Address Bus | 00050000 | 00050001 | 00050001 | 00050002
Proc Data Bus | 00000001 (go-read) | 00000000 | 10000000 | Valid data
Other signals in the timing diagram: Write Enable and Read Enable (processor to controller), Controller Read Request (controller to device), I/O Data Bus carrying valid data and I/O Data Ready (device to controller).
Logic Design in RTL
RTL of I/O Controller:
Clock 1 (Decoding):
Wait_Proc1:
  If proc_addr_bus = 0x00050001 & read_enable = 1
    Then proc_data_bus ← STATUS_REG
    Goto Wait_Proc1
  If proc_addr_bus = 0x00050002 & read_enable = 1
    Then proc_data_bus ← DATA_REG
    Goto Wait_Proc1
  If proc_addr_bus = 0x00050000 & write_enable = 1
    Then COMMAND_REG ← proc_data_bus
  If COMMAND_REG != 0x00000001
    Then Goto Wait_Proc1
    Else read_request ← 1
Clock 2 (Get I/O data):
Wait_Dev:
  If io_data_ready = 0
    Then goto Wait_Dev
    Else DATA_REG ← io_data_bus
         STATUS_REG<28> ← 1
         read_request ← 0
  If proc_addr_bus = 0x00050001 & read_enable = 1
    Then proc_data_bus ← STATUS_REG
Clock 3 (Proc get data):
Wait_Proc2:
  If proc_addr_bus = 0x00050002 & read_enable = 1
    Then proc_data_bus ← DATA_REG
    Else goto Wait_Proc2
  If proc_addr_bus = 0x00050001 & read_enable = 1
    Then proc_data_bus ← STATUS_REG
Clock 4: Goto Wait_Proc1
Realization of the Design in Hardware
I/O Controller Data Path and Control:
[Figure: data path with a Decoder on Proc_addr, the Command Reg, Status Reg, and Data Reg, a mux (inputs 0 and 1), and the Control Logic. Processor-side signals: Proc_addr, Proc_data, Read_Enable, Write_Enable. Device-side signals: IO_data, io_data_ready, Read_request. Internal control signals: CRWrite, SRWrite, SRRead, DRWrite, DRRead, GoRead, DataReady.]
Control state diagram: four states with transitions labeled GoRead, Io_data_ready, and Read Data Reg. The outputs of the states, in the order they appear in the figure, are:
– CRWrite = 1; DataReady = 0; SRWrite = 0; SRRead = 1; DRRead = 1; DRWrite = 0; If GoRead, ReadRequest = 1, else ReadRequest = 0
– CRWrite = 0; SRWrite = 0; SRRead = 1; DRRead = 0; If io_data_ready, DRWrite = 1, else DRWrite = 0; If io_data_ready, DataReady = 1, else DataReady = 0; If io_data_ready, ReadRequest = 0, else ReadRequest = 1
– CRWrite = 0; SRWrite = 0; SRRead = 1; DRWrite = 0; If Read Data Reg, DRRead = 1, else DRRead = 0; DataReady = 0; ReadRequest = 0
– CRWrite = 0; SRWrite = 0; SRRead = 1; DRRead = 1; DRWrite = 0; DataReady = 0; ReadRequest = 0
Writing the Software Driver for the Processor
MIPS Device Driver for the I/O Controller:
# Assuming the I/O Controller is memory mapped
# Assuming Command Register address (0x00050000) is in $s0
# Assuming the GoRead command (0x00000001) is in $t0
# Assuming Status Register address (0x00050001) is in $s1
# When Status Register = 0x10000000, the data in the Data Register is ready
# Assuming Data Register address (0x00050002) is in $s2
# The read data will be stored in $s3

      sw   $t0, 0($s0)       # Proc writes GoRead to Command Reg
      lui  $t3, 0x1000       # $t3 = 0x10000000 (the data-ready status value)
Wait: lw   $t1, 0($s1)       # Proc checks Status Reg
      bne  $t1, $t3, Wait    # Wait if I/O data not ready
      lw   $s3, 0($s2)       # Proc reads Data Reg
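The handshake between the polling driver and the controller can be exercised with a small behavioral model. This is a sketch, not the author's design: the register addresses and the 0x10000000 ready value follow the driver comments, the state names follow the RTL, and everything else (the class, the `device_cycle` method, clearing the status after the data is read, and the device latency) is an assumption made so the example is self-contained.

```python
CMD_ADDR, STATUS_ADDR, DATA_ADDR = 0x00050000, 0x00050001, 0x00050002
GO_READ = 0x00000001
DATA_READY = 0x10000000

class IOController:
    """Behavioral model of the controller's three waiting states."""
    def __init__(self):
        self.command = self.status = self.data = 0
        self.read_request = 0
        self.state = "Wait_Proc1"

    def write(self, addr, value):
        """Processor write cycle (write_enable = 1)."""
        if self.state == "Wait_Proc1" and addr == CMD_ADDR:
            self.command = value
            if value == GO_READ:
                self.read_request = 1        # ask the device for a word
                self.state = "Wait_Dev"

    def device_cycle(self, io_data_ready, io_data_bus):
        """One clock on the device side."""
        if self.state == "Wait_Dev" and io_data_ready:
            self.data = io_data_bus
            self.status = DATA_READY
            self.read_request = 0
            self.state = "Wait_Proc2"

    def read(self, addr):
        """Processor read cycle (read_enable = 1)."""
        if addr == STATUS_ADDR:
            return self.status
        if addr == DATA_ADDR:
            value = self.data
            if self.state == "Wait_Proc2":
                self.status = 0              # assumption: ready flag clears on read
                self.state = "Wait_Proc1"
            return value
        return 0

def driver_read_word(ctrl, device_latency, device_word):
    """Mirror of the polling driver: write GoRead, poll status, read data."""
    ctrl.write(CMD_ADDR, GO_READ)
    polls = 0
    while ctrl.read(STATUS_ADDR) != DATA_READY:
        polls += 1
        # The (hypothetical) device delivers its word after device_latency polls.
        ctrl.device_cycle(polls >= device_latency, device_word)
    return ctrl.read(DATA_ADDR), polls

ctrl = IOController()
word, polls = driver_read_word(ctrl, device_latency=3, device_word=0xCAFEBABE)
print(hex(word), "after", polls, "status polls")
```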
I/O System Summary
• Three Types of Buses
  – Processor-Memory Buses
  – I/O Buses
  – Backplane Buses
• Bus Arbitration Schemes
  – Daisy Chain Arbitration: It Cannot Assure Fairness
  – Centralized Parallel Arbitration: Requires a Central Arbiter
• I/O Device Notifying the Operating System
  – Polling: It Can Waste a Lot of Processor Time
  – I/O Interrupt: Similar to Exception Except it is Asynchronous
• Delegating I/O Responsibility from the CPU
  – Direct Memory Access (DMA)
Memory Requirements of a Display Monitor
• Character graphics (Black and White)
  – 25 x 80 characters, 8 x 6 pixels/character, 1 bit/pixel, 30 frames/sec
  – Video RAM requirement = 96 kbits, I/O bus bandwidth = 2.88 Mbits/sec
• Bitmapped (pixel) graphics
  – Different graphics resolutions: N = 640x480, 800x600, 1024x768, or 1280x1024 pixels, at 30 frames/sec
  – Black and white: 1*N bits per screen, I/O bus bandwidth = 30*N bits/sec
  – Sixteen intensity gray scale: 4*N bits, I/O bus bandwidth = 120*N bits/sec
  – 256 color display (RGB): 3*8*N bits, I/O bus bandwidth = 720*N bits/sec
  – True color (24 bits RGB): 3*24*N bits, I/O bus bandwidth = 2160*N bits/sec
Example: What are the video memory and I/O bus bandwidth requirements to play a movie on a monitor with resolution 1280 x 1024, true color, 30 frames/sec?
Video RAM = 1280 x 1024 pixels/frame x 9 bytes/pixel = 11.8 Mbytes
I/O Bus Bandwidth = 1280 x 1024 pixels/frame x 72 bits/pixel x 30 frames/sec = 2.83 Gbits/sec
This can be handled by a 66 MHz, 64-bit PCI bus (bandwidth = 4.2 Gbits/sec)
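The arithmetic in this slide can be checked with a short sketch. It uses the slide's own figure of 72 bits per pixel for the true-color example and treats the character display as 80 x 25 cells of 8 x 6 pixels; the function name and the peak-rate formula for the PCI bus (clock rate times bus width) are assumptions made for the example.

```python
def video_requirements(width, height, bits_per_pixel, frames_per_sec):
    """Return (video RAM in bits, I/O bus bandwidth in bits per second)."""
    pixels = width * height
    ram_bits = pixels * bits_per_pixel
    bandwidth_bps = ram_bits * frames_per_sec   # every frame is sent in full
    return ram_bits, bandwidth_bps

# Character graphics: 80 x 25 characters, 8 x 6 pixels each, 1 bit/pixel.
char_ram, char_bw = video_requirements(80 * 8, 25 * 6, 1, 30)

# Movie example: 1280 x 1024, 72 bits/pixel (as in the slide), 30 frames/sec.
movie_ram, movie_bw = video_requirements(1280, 1024, 72, 30)

# Peak rate of a 66 MHz, 64-bit bus, assuming one transfer per clock.
pci_bps = 66_000_000 * 64

print(f"character display: {char_ram / 1e3:.0f} kbits, {char_bw / 1e6:.2f} Mbits/sec")
print(f"movie: {movie_ram / 8 / 1e6:.1f} Mbytes, {movie_bw / 1e9:.2f} Gbits/sec")
print(f"fits on the bus: {movie_bw <= pci_bps}")
```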
I/O Device Example: Magnetic Disks
Characteristics | IBM 3090 | IBM UltraStar | Integral 1820
Disk diameter (inches) | 10.88 | 3.50 | 1.80
Formatted data capacity (MB) | 22,700 | 4,300 | 21
MTTF (hours) | 50,000 | 1,000,000 | 100,000
Number of arms/box | 12 | 1 | 1
Rotation speed (RPM) | 3,600 | 7,200 | 3,800
Transfer rate (MB/sec) | 4.2 | 9-12 | 1.9
Power/box (watts) | 2,900 | 13 | 2
MB/watt | 8 | 102 | 10.5
Volume (cubic feet) | 97 | 0.13 | 0.02
MB/cubic feet | 234 | 33,000 | 1,050
Organization of a Hard Magnetic Disk
• Typical numbers (depending on the disk size):
  – 500 to 2,000 tracks per surface
  – 32 to 128 sectors per track
    • A sector is the smallest unit that can be read or written
• Traditionally all tracks have the same number of sectors:
  – Constant bit density: record more sectors on the outer tracks
  – Recently relaxed: constant bit size, speed varies with track location
[Figure: disk with labels Platters, Track, Sector]
Magnetic Disk Characteristics
• Cylinder: all the tracks under the heads at a given point on all surfaces
• Reading or writing data is a three-stage process:
  – Seek time: position the arm over the proper track
  – Rotational latency: wait for the desired sector to rotate under the read/write head
  – Transfer time: transfer a block of bits (sector) under the read/write head
[Figure: disk with labels Sector, Track, Cylinder, Head, Platter]
• Average seek time as reported by the industry:
  – Typically in the range of 8 ms to 12 ms
  – (Sum of the time for all possible seeks) / (total # of possible seeks)
• Due to locality of disk reference, the actual average seek time may:
  – Only be 25% to 33% of the advertised number
Typical Numbers of a Magnetic Disk
• Rotational Latency:
  – Most disks rotate at 3,600 to 7,200 RPM
  – Approximately 16 ms to 8 ms per revolution
  – An average latency to the desired information is halfway around the disk: 8 ms at 3600 RPM, 4 ms at 7200 RPM
[Figure: disk with labels Sector, Track, Cylinder, Head, Platter]
• Transfer Time is a function of:
  – Transfer size (usually a sector): 1 KB / sector
  – Rotation speed: 3600 RPM to 7200 RPM
  – Recording density: bits per inch on a track
  – Diameter: typical diameter ranges from 2.5 to 5.25 in
  – Typical values: 2 to 12 MB per second
  – Indicate What Type of Information is on the Data Lines
• A Bus Transaction Includes Two Parts:
  – Sending the Address and Control
  – Receiving or Sending the Data
Memory Bus Survey (1993)
Characteristics | MBus | Summit | Challenge | XDBus
Originator | Sun | HP | SGI | Sun
Clock Rate (MHz) | 40 | 60 | 48 | 66
Address lines | 36 | 48 | 40 | muxed
Data lines | 64 | 128 | 256 | 144 (parity)
Data Sizes (bits) | 256 | 512 | 1024 | 512
Clocks/transfer | 4 | 5 | 4?
Peak (MB/s) | 320(80) | 960 | 1200 | 1056
Master | Multi | Multi | Multi | Multi
Arbitration | Central | Central | Central | Central
Slots | 16 | 9 | 10
Busses/system | 1 | 1 | 1 | 2
Length | 13 inches | 12? inches | 17 inches
Backplane Bus Survey
Characteristics | SBus | TurboChannel | MicroChannel | PCI
Originator | Sun | DEC | IBM | Intel
Clock Rate (MHz) | 16-25 | 12.5-25 | async | 33
Addressing | Virtual | Physical | Physical | Physical
Data Sizes (bits) | 8,16,32 | 8,16,24,32 | 8,16,24,32,64 | 8,16,24,32,64
Master | Multi | Single | Multi | Multi
Arbitration | Central | Central | Central | Central
32 bit read (MB/s) | 33 | 25 | 20 | 33
Peak (MB/s) | 89 | 84 | 75 | 111 (222)
Max Power (W) | 16 | 26 | 13 | 25
Serial I/O Bus Survey
Metrics | IEEE 1394 (cable) | SFODB (IEEE 1393) | Fibre Channel | I2C
Effective Bandwidth | 100, 200, 400 Mbps | 200 Mbps, 1 Gbps | 1 Gbps | 100, 400 Kbps
Power | 1 W/node | 5 W/node | 2 W/node | 5 mW/node
Multi-master | Yes | Yes | Yes | Yes
Fault Tolerance | CRC for error detection | Redundant ring with cross-strap bypass links | Redundant ring with bypass links, CRC for error detection | No redundancy in topology. Error detection by Ack bit
Max. # Nodes | 64 nodes | 127 nodes | Loop: 127; Switched: no limit | Addressable up to 128 nodes or up to max loading of 400 pF
Max Bus Length | 72 m (4.5 m/hop) | 10 km, 100 m node spacing | Fiber: 10 km; Electrical: 30 m | up to max loading of 400 pF
Protocol Overhead | 8% overhead with 278 byte payload data | 10% overhead with 53 byte payload | 25% overhead with 2168 byte payload | 1 byte for address & R/W + 1 Ack bit per data byte
Standard Buses
Characteristic | VME Bus | NuBus | FutureBus | IPI | SCSI | PCI
Bus Type | Backplane | Backplane | Backplane | I/O | I/O | Backplane
Bus width (signals) | 128 | 96 | 96 | 16 | 8 | 48
Address / data multiplexed? | Not Multiplexed | Multiplexed | Multiplexed | N/A | Multiplexed | Multiplexed
Data width (primary) | 16-32 bits | 32 bits | 32 bits | 16 bits | 8 bits | 32 bits
Number of bus masters | Multiple | Multiple | Multiple | Single | Multiple | Multiple
Arbitration | Multiple daisy chain | Distributed self-selection | Distributed self-selection | N/A | Self-selection | Any
Clocking | Async | Sync | Async | Async | Either | Sync
Bandwidth, 150ns memory, single word | 12.9 MB/s | 13.2 MB/s | 15.5 MB/s | 25.0 MB/s | 5.0 MB/s or 1.5 MB/s | 16.5 MB/s
Bandwidth, 150ns memory, multi word (infinite length) | 13.6 MB/s | 26.4 MB/s | 20.8 MB/s | 25.0 MB/s | 5.0 MB/s or 1.5 MB/s | 132 MB/s
Maximum number of devices | 21 | 16 | 20 | 8 | 7 | 10
Maximum bus length | 0.5 meters | 0.5 meters | 0.5 meters | 50 meters | 25 meters | 0.5 meters
Standard name | IEEE 1014 | IEEE 1196 | IEEE 896.1 | ANSI X3.129 | ANSI X3.131 | PCI