CS 61C: Great Ideas in Computer Architecture (Machine Structures)
Lecture 24: More I/O: DMA, Disks, Networking
Instructors: Krste Asanović & Randy H. Katz
http://inst.eecs.berkeley.edu/~cs61c/fa17
11/21/17 Fall 2017 -- Lecture #24
Review: Address Translation and Protection
• Every instruction and data access needs address translation and protection checks
• Good VM design should be fast (~one cycle) and space efficient
[Figure: a virtual address (Virtual Page No. (VPN), offset) goes through Address Translation to give a physical address (Physical Page No. (PPN), offset). A Protection Check takes Kernel/User Mode and Read/Write as inputs. Remaining labels: Miss, Exception?]
Review: Hierarchical Page Tables
[Figure: a 32-bit virtual address is split into p1 (bits 31-22, 10-bit L1 index), p2 (bits 21-12, 10-bit L2 index), and offset (bits 11-0). A processor register holds the root of the current page table. p1 indexes the Level 1 page table, p2 indexes a Level 2 page table, and the result selects a data page in physical memory. Page size annotations: 10b → 1024 × 4096 B, 12b → 4096 B. Legend: page in primary memory, page in secondary memory, PTE of a nonexistent page.]
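The address split in the figure can be sketched in a few lines. This is a minimal illustration, assuming 32-bit virtual addresses with the 10/10/12 bit layout shown on the slide; the function name `split_virtual_address` is hypothetical.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (p1, p2, offset)."""
    offset = va & 0xFFF        # bits 11..0: byte within a 4096 B page
    p2 = (va >> 12) & 0x3FF    # bits 21..12: 10-bit index into a Level 2 page table
    p1 = (va >> 22) & 0x3FF    # bits 31..22: 10-bit index into the Level 1 page table
    return p1, p2, offset


# Example: address 0x00403ABC falls in L1 entry 1, L2 entry 3, byte 0xABC.
p1, p2, offset = split_virtual_address(0x00403ABC)
```

A page-table walk would use p1 to pick a Level 1 entry, follow it to a Level 2 table, and use p2 there to find the physical page number.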
Review: Translation Lookaside Buffers (TLB)
• Address translation is very expensive!
  – In a two-level page table, each reference becomes three memory accesses
• Solution: Cache some translations in TLB
  – TLB hit ⇒ Single-Cycle Translation
  – TLB miss ⇒ Page-Table Walk to refill
[Figure: the VPN of the virtual address (VPN, offset) is compared against the tag of each TLB entry (V, D, tag, PPN). On a hit, the entry's PPN is combined with the offset to form the physical address (PPN, offset).]
(VPN = virtual page number)
(PPN = physical page number)
VM-related Events in Pipeline
• Handling a TLB miss needs a hw or sw mechanism to refill TLB
  – Usually done in hardware
• Handling a page fault (e.g., page is on disk) needs precise trap so software handler can easily resume after retrieving page
• Protection violation may abort process
[Figure: pipeline with PC, Inst TLB, Inst. Cache, Decode (D), Execute (E), Memory (M) with Data TLB and Data Cache, and Writeback (W). Both the instruction TLB and the data TLB are annotated: TLB miss? Page Fault? Protection violation?]
Address Translation: Putting it all Together
[Figure: flowchart. Virtual Address → TLB Lookup. On a hit → Protection Check. On a miss → Page Table Walk: if the page is ∈ memory → Update TLB; if the page is ∉ memory → Page Fault (OS loads page). Protection Check: if permitted → Physical Address (to cache); if denied → Protection Fault. Legend: hardware, hardware or software, software. Callout: SEGFAULT. Where?]
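The flowchart can be mimicked in software. The sketch below is a simplification and not how any real MMU is built: it assumes 4096 B pages, uses a single Python dict in place of the two-level page table, and treats a missing dict entry as "page not in memory". The names `translate`, `PageFault`, and `ProtectionFault` are hypothetical.

```python
class PageFault(Exception):
    """Raised when the page is not in memory (the OS would load it)."""


class ProtectionFault(Exception):
    """Raised when the access is denied by the protection check."""


PAGE_BITS = 12  # 4096 B pages


def translate(va, is_write, tlb, page_table):
    """Return the physical address for va.

    tlb and page_table both map VPN -> (PPN, writable).
    """
    vpn = va >> PAGE_BITS
    offset = va & ((1 << PAGE_BITS) - 1)
    if vpn not in tlb:                   # TLB miss: walk the page table
        if vpn not in page_table:
            raise PageFault(vpn)         # page is not in memory
        tlb[vpn] = page_table[vpn]       # update TLB
    ppn, writable = tlb[vpn]
    if is_write and not writable:        # protection check
        raise ProtectionFault(vpn)
    return (ppn << PAGE_BITS) | offset   # physical address (to cache)
```

A second access to the same page hits in `tlb` and skips the walk, which is the whole point of caching translations.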
Review: I/O
• “Memory mapped I/O”: Device control/data registers mapped to CPU address space
• CPU synchronizes with I/O device:
  – Polling
  – Interrupts
• “Programmed I/O”:
  – CPU execs lw/sw instructions for all data movement to/from devices
  – CPU spends time doing two things:
    1. Getting data from device to main memory
    2. Using data to compute
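Programmed I/O with polling can be illustrated with a simulated device. This is only a sketch: `FakeDevice` is a hypothetical stand-in for memory-mapped status and data registers, and the list `memory` stands in for main memory. The example assumes the device has at least `count` words available, otherwise the polling loop never ends.

```python
class FakeDevice:
    """Hypothetical device with a status register and a data register."""

    def __init__(self, incoming):
        self._incoming = list(incoming)

    def read_status(self):
        # 1 means the data register holds a fresh word, 0 means not ready.
        return 1 if self._incoming else 0

    def read_data(self):
        return self._incoming.pop(0)


def programmed_input(device, count):
    """CPU-driven transfer: poll the status register, then copy one word."""
    memory = []
    while len(memory) < count:
        while device.read_status() == 0:
            pass                             # busy-wait: CPU does no useful work
        memory.append(device.read_data())    # plays the role of the lw/sw pair
    return memory


words = programmed_input(FakeDevice([10, 20, 30]), 3)
```

Every word passes through the CPU, which is the cost that motivates the alternative discussed later in the lecture.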
Reality Check!
• “Memory mapped I/O”: Device control/data registers mapped to CPU address space
• CPU synchronizes with I/O device:
  – Polling
  – Interrupts
• “Programmed I/O”: DMA
  – CPU execs lw/sw instructions for all data movement to/from devices
  – CPU spends time doing 2 things:
    1. Getting data from device to main memory
    2. Using data to compute
Outline
• Direct Memory Access
• Review: Disks
• Networking
• Storage Attachment Evolution
• Rack Scale Memory
• And in Conclusion…
What’s Wrong with Programmed I/O?
• Not ideal because…
  1. CPU has to execute all transfers, could be doing other work
  2. Device speeds don’t align well with CPU speeds
  3. Energy cost of using beefy general-purpose CPU where simpler hardware would suffice
• Until now CPU has sole control of main memory
• 5% of CPU cycles on Google servers spent in memcpy() and memmove() library routines!*
* Kanev et al., “Profiling a warehouse-scale computer,” ISCA 2015 (June 2015), Portland, OR.
PIO vs. DMA
[Figure: programmed I/O (PIO) compared with DMA]
Direct Memory Access (DMA)
• Allows I/O devices to directly read/write main memory
• New hardware: the DMA engine
• DMA engine contains registers written by CPU:
  – Memory address to place data
  – # of bytes
  – I/O device #, direction of transfer
  – Unit of transfer, amount to transfer per burst
Operation of a DMA Transfer
[Figure from Section 5.1.4, Direct Memory Access, in Modern Operating Systems by Andrew S. Tanenbaum and Herbert Bos, 2014]
DMA: Incoming Data
1. Receive interrupt from device
2. CPU takes interrupt, begins transfer
  – Instructs DMA engine/device to place data @ certain address
3. Device/DMA engine handle the transfer
  – CPU is free to execute other things
4. Upon completion, device/DMA engine interrupt the CPU again
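The incoming-data steps can be sketched with a toy DMA engine. Everything here is an illustrative assumption rather than a real hardware interface: the class `DMAEngine`, its attribute names, and the `on_complete` callback that stands in for the completion interrupt. The copy also happens synchronously, whereas a real engine runs alongside the CPU.

```python
class DMAEngine:
    """Toy DMA engine: the CPU writes the registers, then calls start()."""

    def __init__(self, memory):
        self.memory = memory      # main memory, modeled as a bytearray
        self.address = 0          # register: memory address to place data
        self.num_bytes = 0        # register: number of bytes to transfer
        self.on_complete = None   # stands in for the completion interrupt

    def start(self, device_data):
        # The engine, not the CPU, copies the bytes into main memory.
        data = bytes(device_data[:self.num_bytes])
        self.memory[self.address:self.address + len(data)] = data
        if self.on_complete is not None:
            self.on_complete(len(data))


memory = bytearray(16)
completed = []
engine = DMAEngine(memory)
engine.address, engine.num_bytes = 4, 3   # step 2: CPU programs the engine
engine.on_complete = completed.append     # handler for the step 4 interrupt
engine.start(b"abc")                      # step 3: engine moves the data
```

The CPU's involvement is limited to writing a few registers and handling one completion event, regardless of how many bytes move.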
DMA: Outgoing Data
1. CPU decides to initiate transfer, confirms that external device is ready
2. CPU begins transfer
  – Instructs DMA engine/device that data is available @ certain address
3. Device/DMA engine handle the transfer
  – CPU is free to execute other things
4. Device/DMA engine interrupt the CPU again to signal completion
DMA: Some New Problems
• Where in the memory hierarchy do we plug in the DMA engine? Two extremes:
  – Between L1$ and CPU:
    • Pro: Free coherency
    • Con: Trash the CPU’s working set with transferred data
  – Between last-level cache and main memory:
    • Pro: Don’t mess with caches
    • Con: Need to explicitly manage coherency
Computer Memory Hierarchy: One of our “Great Ideas”
Storage-Centric View of the Memory Hierarchy
[Figure: storage hierarchy. Levels shown: RAM; File System Cache; Storage Controller Cache; Embedded Disk Controller Cache; Solid State Disk; Magnetic Disk Storage; Disk Array; Optical Disk Jukebox; Magnetic or Optical Tape Library; Shelved Magnetic or Optical Tape. Tiers: On-Line, Near-Line, Off-Line. Overlay label: Rack Scale Memory/Storage.]
Previous Slide!
[Figure labels: DRAM, Flash, Disk]
Disk Device Performance (1/2)
• Disk Access Time = Seek Time + Rotation Time + Transfer Time + Controller Overhead
  – Seek Time = time to position the head assembly at the proper cylinder
  – Rotation Time = time for the disk to rotate to the point where the first sectors of the block to access reach the head
  – Transfer Time = time taken by the sectors of the block and any gaps between them to rotate past the head
[Figure: disk drive with labels Platter, Arm, Actuator, Head, Sector, Inner Track, Outer Track, Controller, Spindle]
Disk Device Performance (2/2)
• Average values to plug into the formula:
• Rotation Time: Average distance of sector from head?
  – 1/2 time of a rotation
    • 7200 revolutions per minute ⇒ 120 rev/sec
    • 1 revolution = 1/120 sec ⇒ 8.33 milliseconds
    • 1/2 rotation (revolution) ⇒ 4.17 ms
• Seek time: Average no. tracks to move arm?
  – Number of tracks/3 (see CS 186 for the math)
  – Then, seek time = number of tracks moved × time to move across one track
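The two averages above translate directly into arithmetic. A minimal sketch follows; the function names `average_rotation_ms` and `average_seek_ms` are hypothetical, and the 3000-track disk in the usage line is an invented example, not a figure from the lecture.

```python
def average_rotation_ms(rpm):
    """Average rotational latency: half of one revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def average_seek_ms(num_tracks, ms_per_track):
    """Average seek: about one third of the tracks, times the per-track time."""
    return (num_tracks / 3) * ms_per_track


rotation = average_rotation_ms(7200)    # roughly 4.17 ms, as on the slide
seek = average_seek_ms(3000, 0.001)     # hypothetical disk: 1000 tracks at 1 µs each
```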
But wait!
• Performance estimates are different in practice
• Modern disks have on-disk caches, which are hidden from the outside world
  – Generally, what limits real performance is the on-disk cache access time
Flash Memory / SSD Technology
• NMOS transistor with an additional conductor between gate and source/drain which “traps” electrons. The presence/absence is a 1 or 0
• Memory cells can withstand a limited number of program-erase cycles. Controllers use a technique called wear leveling to distribute writes as evenly as possible across all the flash blocks in the SSD.
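The idea behind wear leveling can be shown with a toy policy: always write to the block that has been erased the fewest times. This is only an illustration of the goal stated above; real SSD controllers use far more elaborate schemes, and the name `pick_block` is hypothetical.

```python
def pick_block(erase_counts):
    """Return the index of the least-worn block (ties go to the lowest index)."""
    return min(range(len(erase_counts)), key=erase_counts.__getitem__)


erase_counts = [0, 0, 0, 0]          # four flash blocks, all fresh
for _ in range(8):                   # eight writes, each needing a block
    block = pick_block(erase_counts)
    erase_counts[block] += 1         # one program-erase cycle wears this block
```

After the loop every block has absorbed the same number of cycles, instead of one block taking all eight.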
Administrivia (1/2)
• Project 3.2 (Performance Contest) has been released
  – Up to 5 extra credit points for the highest speedups
• Final exam:
  – 14 December, 7-10 PM @ TBA
  – Contact head TA (Steven Ho) about conflicts if you haven’t been contacted yet
  – Review lectures and book with eye on the important concepts of the course
• Review session Fri Dec 8, 5-8 PM @ TBA
• Electronic course evaluations starting this week! See https://course-evaluations.berkeley.edu
Administrivia (2/2)
• HW6 party tonight (Nov 21) in the Woz from 5-8 PM
• No discussions or labs this week!
  – Labs resume Monday after Thanksgiving
• Lab 11 due in any lab before December 1
• Lab 13 due in any OH before December 8
• Homework 6 due tomorrow night
• Project 4 to be released Friday latest
• Homework 7 to be released Monday after break
Winners of the Project 3 Performance Competition!
Happy Birthday RAID: 11/20/87
• RAID (Redundant Arrays of Inexpensive Disks) covered in next Tuesday’s lecture!
Happy Birthday Internet! 11/21/69
CS 61C in the News: Supercomputer in a File Drawer
“The Raspberry Pi modules let developers figure out how to write this software and get it to work reliably without having a dedicated testbed of the same size, which would cost a quarter billion dollars and use 25 megawatts of electricity.”
Gary Grider, leader of the High Performance Computing Division
Peer Instruction Question
• We have the following disk:
  – 15000 cylinders, 1 ms to cross 1000 cylinders
  – 15000 RPM = 4 ms per rotation
  – Want to copy 1 MB, transfer rate of 1000 MB/s
  – 1 ms controller processing time
• What is the access time using our model?
Disk Access Time = Seek Time + Rotation Time + Transfer Time + Controller Processing Time
A: 10.5 ms
B: 9 ms
C: 8.5 ms
D: 11.5 ms
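One way to work the question is to plug the averages from the disk performance slides into the formula. The sketch below assumes that reading of "our model" (average seek of one third of the cylinders, average rotation of half a revolution); the function name `disk_access_time_ms` is hypothetical.

```python
def disk_access_time_ms(seek_ms, rotation_ms, transfer_ms, controller_ms):
    """Sum the four components of the access-time model."""
    return seek_ms + rotation_ms + transfer_ms + controller_ms


seek_ms = (15000 / 3) / 1000 * 1.0    # average 5000 cylinders at 1 ms per 1000
rotation_ms = 4.0 / 2                 # half of a 4 ms rotation
transfer_ms = 1 / 1000 * 1000         # 1 MB at 1000 MB/s, converted to ms
controller_ms = 1.0

total_ms = disk_access_time_ms(seek_ms, rotation_ms, transfer_ms, controller_ms)
```

Under these assumptions the components are 5 ms, 2 ms, 1 ms, and 1 ms, for a total of 9 ms (choice B).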
Networks: Talking to the Outside World
• Originally sharing I/O devices between computers
  – E.g., printers
• Then communicating between computers
  – E.g., file transfer protocol
• Then communicating between people
  – E.g., e-mail
• Then communicating between networks of computers
  – E.g., file sharing, www, …
The Internet (1962)
www.computerhistory.org/internet_history
www.greatachievements.org/?id=3736
en.wikipedia.org/wiki/Internet_Protocol_Suite
• History
  – 1963: J.C.R. Licklider (“Lick”), while at DoD’s ARPA, writes a memo describing desire to connect the computers at various research universities: Stanford, Berkeley, UCLA, ...
  – 1969: ARPA deploys 4 “nodes” @ UCLA, SRI, Utah, & UCSB
  – 1973: Robert Kahn & Vint Cerf invent TCP, now part of the Internet Protocol Suite
• Internet growth rates
  – Exponential since start!
Vint Cerf: “Revolutions like this don't come along very often”
The World Wide Web (1989)
en.wikipedia.org/wiki/History_of_the_World_Wide_Web
• “System of interlinked hypertext documents on the Internet”
• History
  – 1945: Vannevar Bush describes hypertext system called “memex” in article
  – 1989: Sir Tim Berners-Lee proposed and implemented the first successful communication between a Hypertext Transfer Protocol (HTTP) client and server using the internet.
  – ~2000: Dot-com entrepreneurs rushed in, 2001 bubble burst
• Today: Access anywhere!
[Photos: Tim Berners-Lee; world’s first web server in 1990]
Shared vs. Switch-Based Networks
• Shared vs. Switched:
  – Shared: 1 at a time (CSMA/CD)
  – Switched: pairs (“point-to-point” connections) communicate at same time
• Aggregate bandwidth (BW) in switched network is many times that of shared:
  – Point-to-point faster since no arbitration, simpler interface
[Figure: a shared network with three nodes, and a crossbar switch connecting four nodes]
What Makes Networks Work?
• Links connecting switches and/or routers to each other and to computers or devices
• Ability to name the components and to route packets of information (messages) from a source to a destination
• Layering, redundancy, protocols, and encapsulation as means of abstraction (61C big idea)
[Figure: a computer’s network interface connected to a network of switches]
Software Protocol to Send and Receive
• SW send steps
  1: Application copies data to OS buffer
  2: OS calculates checksum, starts timer
  3: OS sends data to network interface HW and says start
• SW receive steps
  3: OS copies data from network interface HW to OS buffer
  2: OS calculates checksum; if OK, send ACK; if not, delete message (sender resends when timer expires)
  1: If OK, OS copies data to user address space, & signals application to continue
[Figure: packet format. Header (Dest Net ID, Src Net ID, Len, ACK INFO), Payload (CMD/Address/Data), Trailer (Checksum)]
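The send and receive steps can be sketched as two small functions. This is a simplified illustration: the checksum is a toy byte sum rather than any real protocol's, the packet is a Python dict rather than a wire format, and timers and retransmission are left out. The names `checksum`, `send`, and `receive` are hypothetical.

```python
def checksum(payload):
    """Toy checksum: sum of the bytes modulo 2**16 (not a real protocol's)."""
    return sum(payload) % 65536


def send(data):
    """Sender side: build a packet with header (len), payload, and trailer."""
    return {"len": len(data), "payload": bytes(data), "checksum": checksum(data)}


def receive(packet):
    """Receiver side: return (ack, data); a bad packet is dropped."""
    payload = packet["payload"]
    if len(payload) == packet["len"] and checksum(payload) == packet["checksum"]:
        return True, payload    # send ACK, copy data to the application
    return False, None          # no ACK: sender resends when its timer expires


packet = send(b"hello")
ack, data = receive(packet)
```

Flipping any payload byte in `packet` before calling `receive` changes the computed checksum, so the receiver withholds the ACK.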
Networks are like Ogres
https://www.youtube.com/watch?v=_bMcXVe8zIs
Protocols for Networks of Networks?
What does it take to send packets across the globe?
• Bits on wire or air
• Packets on wire or air
• Deliver packets within a single physical network
• Deliver packets across multiple networks
• Ensure the destination received the data
• Create data at the sender and make use of the data at the receiver
Protocol for Networks of Networks?
Lots to do and at multiple levels!
Use abstraction to cope with complexity of communication
• Networks are like onions (“ogres” struck out)
• Hierarchy of layers:
  – Application (chat client, game, etc.)
  – Transport (TCP, UDP)
  – Network (IP)
  – Data Link Layer (Ethernet)
  – Physical Link (copper, wireless, etc.)
Protocol Family Concept
• Protocol: packet structure and control commands to manage communication
• Protocol families (suites): a set of cooperating protocols that implement the network stack
• Key to protocol families is that communication occurs logically at the same level of the protocol, called peer-to-peer…
  …but is implemented via services at the next lower level
• Encapsulation: carry higher level information within lower level “envelope”
Inspiration …
• CEO A writes letter to CEO B
  – Folds letter and hands it to assistant
• Assistant:
  – Puts letter in envelope with CEO B’s full name
  – Takes to FedEx
• FedEx Office
  – Puts letter in larger envelope
  – Puts name and street address on FedEx envelope
  – Puts package on FedEx delivery truck
• FedEx delivers to other company
[Letter shown on slide: “Dear Bill, Your days are numbered. --Steve”]
The Path of the Letter
[Figure: on each side, CEO, Aide, and FedEx form a stack. Items exchanged between peers: Letter (semantic content), Envelope (identity), FedEx Envelope (FE) (location).]
• “Peers” on each side understand the same things
• No one else needs to
• Lowest level has most packaging
Protocol Family Concept
[Figure: at each level the two peers exchange a Message logically. The actual transfer goes down through the lower level, which wraps the message in a header (H) and trailer (T), and the lowest level carries it over the physical link.]
• Each lower level of stack “encapsulates” information from layer above by adding header and trailer
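Encapsulation can be sketched as repeated wrapping of a byte string. The header and trailer values below are made-up markers chosen so the nesting is visible, and the function names `encapsulate` and `decapsulate` are hypothetical.

```python
def encapsulate(message, layers):
    """Wrap message in each layer's header and trailer, highest layer first."""
    for header, trailer in layers:
        message = header + message + trailer
    return message


def decapsulate(packet, layers):
    """Strip the wrappers in reverse order, lowest layer first."""
    for header, trailer in reversed(layers):
        if not (packet.startswith(header) and packet.endswith(trailer)):
            raise ValueError("packet does not match the expected layer format")
        packet = packet[len(header):len(packet) - len(trailer)]
    return packet


# Two illustrative layers, each with its own header and trailer.
layers = [(b"[H1|", b"|T1]"), (b"[H2|", b"|T2]")]
wire = encapsulate(b"Message", layers)   # b"[H2|[H1|Message|T1]|T2]"
original = decapsulate(wire, layers)
```

Each layer only inspects its own header and trailer, which matches the point that peers at the same level understand the same things and no one else needs to.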
Most Popular Protocol for Network of Networks
• Transmission Control Protocol/Internet Protocol (TCP/IP)
• This protocol family is the basis of the Internet, a WAN (wide area network) protocol
  – IP makes best effort to deliver
  – Packets can be lost, corrupted
  – TCP guarantees delivery
  – TCP/IP so popular it is used even when communicating locally: even across homogeneous LAN (local area network)
TCP/IP Packet, Ethernet Packet, Protocols
• Application sends message
• TCP breaks into 64 KiB segments, adds 20 B header
• IP adds 20 B header, sends to network
• If Ethernet, broken into 1500 B packets with headers, trailers
[Figure: nesting of packets. The message becomes TCP data behind a TCP header; that becomes IP data behind an IP header; that is carried behind an Ethernet header (EH).]
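The sizes on this slide allow a rough packet count for a given message. The sketch below takes the slide's numbers at face value and makes simplifying assumptions that real stacks do not: each TCP segment carries up to 64 KiB of message data, each segment gets exactly one 20 B TCP header and one 20 B IP header, and the result is cut into 1500 B Ethernet payloads without further per-packet headers. The function name `count_packets` is hypothetical.

```python
import math

TCP_SEGMENT_DATA = 64 * 1024   # slide: TCP breaks the message into 64 KiB segments
TCP_HEADER = 20                # slide: TCP adds a 20 B header
IP_HEADER = 20                 # slide: IP adds a 20 B header
ETHERNET_PAYLOAD = 1500        # slide: Ethernet carries 1500 B packets


def count_packets(message_bytes):
    """Return (tcp_segments, ethernet_packets) under the assumptions above."""
    segments = math.ceil(message_bytes / TCP_SEGMENT_DATA)
    packets = 0
    remaining = message_bytes
    for _ in range(segments):
        data = min(remaining, TCP_SEGMENT_DATA)
        remaining -= data
        ip_packet = data + TCP_HEADER + IP_HEADER
        packets += math.ceil(ip_packet / ETHERNET_PAYLOAD)
    return segments, packets


segments, packets = count_packets(100_000)
```

For a 100,000 B message this gives 2 segments (65,536 B and 34,464 B of data) and 44 + 24 = 68 Ethernet packets under these assumptions.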
Storage Attachment Evolution
[Figure: Direct Attachment and Network Server Attachment. Labels: Host; OS; Disk Interface (DI); Allocation Table; Disk, Cylinder, Track, Sector; LAN; Host (×3); Network File Server; OS; Network Interface (NI) (×3); File Name, Offset, Length]
Storage Attachment Evolution
[Figure: labels: Disk Storage Subsystem; Work Station; Main Frame (×2); Channel Interface; OS; LUN, Offset, Length; LUN to PHY; LAN; Host (×3); Network File Server; OS; Network Interface (NI) (×3); Network-attached Storage (NAS); File Name, Offset, Length; Disk, Cylinder, Track, Sector]
Storage Attachment Evolution
Storage Area Networks (SAN)
[Figure: labels: Network Attached; Channel Attached; Optical Disk Storage Subsystem; SAN; Main Frame; Disk Storage Subsystem; Tape Storage Subsystem; Channel Interface (CI); LAN; Host (×3); Network Interface (NI) (×3); File Name, Offset, Length; File Server (×3); LUN, Offset, Length; Gateway; WAN; Gateway; LAN; SAN; Main Frame; FS; DSS; Remote SAN; PHY Device, Cyl, Trk, Sector]
Storage Class Memory aka Rack-Scale Memory
• High bandwidth and low latency through a simplified interface based on memory semantics (i.e., ld/st), scalable from tens to several hundred GB/sec of bandwidth, with sub-100 nanosecond load-to-use memory latency
• Supports scalable memory pools and resources for real-time analytics and in-memory applications (i.e., map-reduce)
• Highly software compatible with no required changes to the operating system
• Scales from simple, low cost connectivity to highly capable, rack scale interconnect
Storage Class Memory aka Rack-Scale Memory
[Figure labels: Cheaper than DRAM; More expensive than disk; Non-volatile and faster than disk]
Storage Class Memory aka Rack Scale Memory
• System memory is flat or shrinking
  – Memory bandwidth per core continues to decrease
  – Memory capacity per core is generally flat
  – Memory is changing on a different cadence compared to the CPU
• Data is growing
  – Data that requires real-time analysis is growing exponentially
  – The value of the analysis decreases if it takes too long to provide insights
• Industry needs an open architecture to solve the problems
  – Memory tiers will become increasingly important
  – Rack-scale composability requires a high bandwidth, low latency fabric
  – Must seamlessly plug into existing ecosystems without requiring OS changes
Remote Direct Memory Access
[Figure: conventional networking compared with cut-through memory access over network]
“And, in Conclusion…”
• I/O gives computers their 5 senses
• I/O speed range is 100-million to one
• DMA to avoid wasting CPU time on data transfers
• Disks for persistent storage, being replaced by flash and emerging “storage class memory”
• Networks: computer-to-computer I/O
  – Protocol suites allow networking of heterogeneous components. Great Idea: Layers and Abstraction
  – Emerging class: Rack-scale/Storage-class Memory accessible over RDMA or other network interconnect