BigBen @ PSC

Transcript of BigBen @ PSC

  • BigBen @ PSC

  • BigBen Features
      • Compute Nodes
          • 2068 nodes running the Catamount (QK) microkernel
          • SeaStar interconnect in a 3-D torus configuration
          • No external connectivity (no TCP)
          • All inter-node communication is over Portals
          • Applications use MPI, which is based on Portals
      • Service & I/O (SIO) Nodes
          • 22 nodes running SUSE Linux
          • Also on the SeaStar interconnect
          • SIO nodes can have PCI-X hardware installed, defining unique roles for each
          • 2 SIO nodes are externally connected to ETF with 10GigE cards (currently)
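
As a rough illustration of what a 3-D torus configuration means for node adjacency, the sketch below lists the six nearest neighbors of a node, wrapping around at the edges. The function name and the 4 x 4 x 4 dimensions are hypothetical; the slides do not give BigBen's actual torus dimensions or node numbering.

```python
def torus_neighbors(coord, dims):
    """Return the six nearest neighbors of `coord` in a 3-D torus of size `dims`.

    Each axis wraps around, so a node on one face is adjacent to the node
    on the opposite face. `dims` here is an illustrative assumption, not
    BigBen's real topology.
    """
    x, y, z = coord
    nx, ny, nz = dims
    return [
        ((x - 1) % nx, y, z), ((x + 1) % nx, y, z),
        (x, (y - 1) % ny, z), (x, (y + 1) % ny, z),
        (x, y, (z - 1) % nz), (x, y, (z + 1) % nz),
    ]


if __name__ == "__main__":
    # A corner node still has six neighbors because every axis wraps.
    for neighbor in torus_neighbors((0, 0, 0), (4, 4, 4)):
        print(neighbor)
```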

  • Portals Direct I/O (PDIO) Details
      • Portals-to-TCP routing
          • PDIO daemons aggregate hundreds of Portals data streams into a configurable number of outgoing TCP streams
          • Heterogeneous Portals (both QK + Linux nodes)
      • Explicit parallelism
          • Configurable # of Portals receivers (on SIO nodes)
              • Distributed across multiple 10GigE-connected Service & I/O (SIO) nodes
          • Corresponding # of TCP streams (to the WAN)
              • One per PDIO daemon
          • A parallel TCP receiver in the Goodhue booth
              • Supports a variable/dynamic number of connections
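
The fan-in described above (many Portals writers, a small configurable number of daemons, one TCP stream per daemon) can be sketched as follows. This is only an illustration under stated assumptions: the round-robin assignment, the function names, and the in-memory byte strings are hypothetical, and the slides do not say how PDIO actually maps writers to daemons.

```python
from collections import defaultdict


def assign_writers(writer_ranks, num_daemons):
    """Map each writer rank to a daemon index, round-robin (assumed policy)."""
    assignment = defaultdict(list)
    for i, rank in enumerate(writer_ranks):
        assignment[i % num_daemons].append(rank)
    return dict(assignment)


def aggregate(messages, assignment):
    """Merge per-writer messages into one outgoing byte stream per daemon.

    `messages` maps a writer rank to the list of byte chunks it produced.
    Each daemon concatenates the chunks of the writers assigned to it,
    standing in for the single TCP stream that daemon would send.
    """
    streams = {}
    for daemon, ranks in assignment.items():
        streams[daemon] = b"".join(
            chunk for rank in ranks for chunk in messages.get(rank, [])
        )
    return streams


if __name__ == "__main__":
    assignment = assign_writers(range(8), 3)
    messages = {rank: [f"r{rank};".encode()] for rank in range(8)}
    for daemon, stream in aggregate(messages, assignment).items():
        print(daemon, stream)
```

With 8 writers and 3 daemons, the assumed round-robin policy gives the daemons 3, 3, and 2 writers respectively, so the number of outgoing streams stays fixed no matter how many writers the job uses.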

  • Portals Direct I/O (PDIO) Details
      • Utilizing the ETF network
          • 10GigE end-to-end
          • Benchmarked >1Gbps in testing
      • Inherent flow-control feedback to application
          • Aggregation protocol allows TCP transmission or even remote file system performance to throttle the data streams coming out of the application (!)
      • Variable message sizes and file metadata supported
      • Multi-threaded ring buffer in the PDIO daemon
          • Allows the Portals receiver, TCP sender, and computation to proceed asynchronously
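
The ring buffer and its flow-control effect can be illustrated with a small bounded buffer shared by two threads. This is a minimal sketch, not the PDIO daemon's code: the `RingBuffer` class, the `None` end-of-stream sentinel, and `run_demo` are hypothetical names, with the main thread standing in for the Portals receiver and a worker thread standing in for the TCP sender.

```python
import threading


class RingBuffer:
    """Fixed-size ring buffer; put() blocks when full, get() blocks when empty."""

    def __init__(self, slots):
        self._buf = [None] * slots
        self._head = 0   # index of the oldest unread slot
        self._count = 0  # number of occupied slots
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            # A full buffer stalls the producer: this is the backpressure
            # that lets a slow sender throttle the incoming data.
            while self._count == len(self._buf):
                self._cond.wait()
            tail = (self._head + self._count) % len(self._buf)
            self._buf[tail] = item
            self._count += 1
            self._cond.notify_all()

    def get(self):
        with self._cond:
            while self._count == 0:
                self._cond.wait()
            item = self._buf[self._head]
            self._buf[self._head] = None
            self._head = (self._head + 1) % len(self._buf)
            self._count -= 1
            self._cond.notify_all()
            return item


def run_demo(num_messages=20, slots=4):
    """Push messages through a small ring while a second thread drains it."""
    ring = RingBuffer(slots)
    sent = []

    def sender():
        # Stand-in for the TCP sender thread.
        while True:
            msg = ring.get()
            if msg is None:  # sentinel: no more data
                break
            sent.append(msg)

    worker = threading.Thread(target=sender)
    worker.start()
    for i in range(num_messages):  # stand-in for the Portals receiver
        ring.put(f"msg-{i}".encode())
    ring.put(None)
    worker.join()
    return sent


if __name__ == "__main__":
    print(len(run_demo()))
```

Because `put()` blocks once all slots are occupied, a slow consumer naturally slows the producer, which is the kind of feedback the slide attributes to the aggregation protocol.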

  • Portals Direct I/O (PDIO) Config
      • User-configurable/tunable parameters:
          • Network targets
              • Can be different for each job
          • Number of streams
              • Can be tuned for optimal host/network utilization
          • TCP network buffer size
              • Can be tuned for maximum throughput over the WAN
          • Ring buffer size/length
              • Controls total memory utilization of PDIO daemons
          • Number of Portals writers
              • Can be any subset of the running application's processes
          • Remote filename(s)
              • File metadata are propagated through the full chain, per write
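
To make the tunables above concrete, here is a hedged sketch of how they might be grouped and sanity-checked. Every field name, the `PdioConfig` class, and the example values are hypothetical; the slides do not show PDIO's real configuration syntax. The memory estimate assumes one ring buffer per daemon and one daemon per TCP stream, and the buffer-size helper uses the standard bandwidth-delay product as a starting point for WAN tuning.

```python
from dataclasses import dataclass


def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: a common starting point for
    sizing TCP buffers on a long, fast path."""
    return bandwidth_bits_per_s * rtt_ms // 8000


@dataclass
class PdioConfig:
    # All field names below are illustrative assumptions.
    targets: list           # (host, port) pairs for the remote receivers
    num_streams: int        # outgoing TCP streams, one per daemon
    tcp_buffer_bytes: int   # per-stream TCP buffer size
    ring_slots: int         # ring buffer length, in slots
    ring_slot_bytes: int    # size of one ring buffer slot
    writer_ranks: list      # subset of application processes that write
    remote_filename: str    # file name propagated with each write

    def validate(self):
        if self.num_streams < 1:
            raise ValueError("need at least one stream")
        if not self.targets:
            raise ValueError("need at least one network target")
        if not self.writer_ranks:
            raise ValueError("need at least one Portals writer")

    def ring_memory_bytes(self):
        """Total ring buffer memory, assuming one ring per daemon."""
        return self.num_streams * self.ring_slots * self.ring_slot_bytes


if __name__ == "__main__":
    cfg = PdioConfig(
        targets=[("receiver.example.org", 5000)],  # placeholder host
        num_streams=6,
        tcp_buffer_bytes=bdp_bytes(10_000_000_000, 50),
        ring_slots=64,
        ring_slot_bytes=1 << 20,
        writer_ranks=list(range(0, 512, 4)),
        remote_filename="frame.dat",
    )
    cfg.validate()
    print(cfg.tcp_buffer_bytes, cfg.ring_memory_bytes())
```

For a 10 Gbit/s path with a 50 ms round trip, the bandwidth-delay product is 62,500,000 bytes; splitting a transfer across several streams lowers the buffer each stream needs, which is one reason the stream count and buffer size are tuned together.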

  • HPC resource and renderer waiting
      • [Diagram, repeated on the slides that follow: compute nodes and I/O nodes at PSC, the ETF network, and steering at iGRID]

  • Launch PPM job, PDIO daemons, and iGRID receivers
      • [Diagram adds six pdiod and three recv processes]

  • Aggregate data via Portals

  • Route traffic to ETF net

  • Recv data @ iGRID

  • Render real-time data

  • Send steering data back to active job
      • [Diagram adds steering input]

  • Dynamically update rendering