A National Science Foundation Industry/University Cooperative Research Center
Visual Analytics Sandbox: A big data platform for processing network traffic
Raju Gottumukkala, Ph.D.
Director of Research, Informatics Research Institute
Site Director, NSF Center for Visual and Decision Informatics
Assistant Professor, College of Engineering
University of Louisiana at Lafayette
2017 Internet2 Global Summit (04/26/2017)
Funded by NSF award No. 1429526
Motivation
• The cyber environment is becoming increasingly complex
• Existing tools do not support interactive analysis of dynamic graphs
• Graphs are complex data stores
Figures: IP flow graph [© CVDI]; Web graph [© GW3BI]
Application of TVG in cyber-security
• Extraction of Traffic Dispersion Graph (TDG) from IP flows
• Graph structural properties indicate abnormal traffic patterns (malware)
Figure: A TDG for Soribada P2P
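The TDG extraction step can be illustrated with a minimal sketch. It assumes flow records have already been reduced to (source IP, destination IP) pairs; the function names, the sample addresses, and the choice of structural properties are illustrative assumptions, not the tooling used in the sandbox.

```python
from collections import defaultdict

def build_tdg(flows):
    """Build a directed TDG: one edge per distinct (src, dst) pair seen in the flows."""
    out_edges = defaultdict(set)
    in_edges = defaultdict(set)
    for src, dst in flows:
        if src == dst:
            continue  # ignore self-loops
        out_edges[src].add(dst)
        in_edges[dst].add(src)
    return out_edges, in_edges

def tdg_properties(out_edges, in_edges):
    """Compute a few simple structural properties of the TDG."""
    nodes = set(out_edges) | set(in_edges)
    edge_count = sum(len(dsts) for dsts in out_edges.values())
    # Nodes that both initiate and receive flows
    both = [n for n in nodes if out_edges.get(n) and in_edges.get(n)]
    max_out = max((len(out_edges.get(n, ())) for n in nodes), default=0)
    return {
        "nodes": len(nodes),
        "edges": edge_count,
        "avg_degree": 2 * edge_count / len(nodes) if nodes else 0.0,
        "in_and_out_fraction": len(both) / len(nodes) if nodes else 0.0,
        "max_out_degree": max_out,
    }

# Hypothetical flow records (the repeated pair collapses into one edge)
SAMPLE_FLOWS = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.3"),
    ("10.0.0.2", "10.0.0.3"),
    ("10.0.0.3", "10.0.0.1"),
    ("10.0.0.1", "10.0.0.2"),
]

if __name__ == "__main__":
    print(tdg_properties(*build_tdg(SAMPLE_FLOWS)))
```

Properties such as these could be tracked per time window and compared against a baseline. Which properties actually separate malware traffic from normal traffic is not specified on the slide.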
What is Visual Analytics Sandbox?
• A unique analytics environment for processing high-volume, high-velocity data streams
– IoT, IP flow graphs, click streams, social media, etc.
• Experimental infrastructure to develop next-generation decision support tools
Architecture diagram (component labels): Visual Analytics; Virtualization; Resource Manager; Network Infrastructure; Processing; Memory; SSD Storage; Disk Storage; Stream Processing; Batch Processing; Data Management; Graph Storage; Pre-Processing Libraries; Machine Learning; Graph Processing; Statistical Packages; Visualization Algorithms; Layout Algorithms; Graph Reduction; HCI Tools (touch/3D); Predictive Analytics; Interactive Analytics; Anomaly Detection
Hardware
• 112 TFlops of computing
– 22 Intel Xeon E5 processors (308 cores)
– 12 NVIDIA P100 GPUs
• Storage
– 1.8 TB RAM
– 25.6 TB NVMe SSD (25X faster than HDD, 5X faster than SSD)
– 20 TB HDD
• Networking
– 2 × 10G Ethernet cards per node
Node diagram (labels): Information Management Nodes; Stream Processing Nodes; Analytics Node; Visualization Node
Benchmarking Graph Operations
Type of Temporal Characteristic | Graph Operations
Temporal network topology | Degree, connectivity, density
Reachability analysis | Paths, walks, trails
Detecting outliers | Node or edge clustering
Node neighborhoods | Persistent patterns & motifs
Figure captions:
– Execution times for retrieving cumulative edge weights of a node (Neo4j vs MySQL)
– Figure 9. Execution times for retrieving shortest path between two nodes (Neo4j with timestamp on node and caching vs no caching)
– Execution times for retrieving shortest path between two selected nodes (Neo4j schemes)
– Execution times for retrieving weighted shortest path between two nodes (Neo4j with timestamp on node and caching vs no caching)
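The two benchmarked query types (cumulative edge weights of a node, and shortest path between two nodes, with and without caching) can be illustrated with a small in-memory sketch. This is not the Neo4j or MySQL setup behind the figures: the edge list, the undirected treatment of edges, and the use of `functools.lru_cache` as the cache are all assumptions made for illustration.

```python
import time
from collections import defaultdict, deque
from functools import lru_cache

# Hypothetical temporal edge list: (source, target, weight, timestamp)
EDGES = (
    ("a", "b", 3, 1),
    ("a", "b", 2, 2),
    ("b", "c", 1, 2),
    ("a", "c", 5, 3),
    ("c", "d", 4, 3),
)

def cumulative_edge_weight(node, t_start, t_end):
    """Sum the weights of all edges touching `node` within [t_start, t_end]."""
    return sum(w for s, t, w, ts in EDGES
               if t_start <= ts <= t_end and node in (s, t))

def shortest_path(src, dst, t_start, t_end):
    """Unweighted shortest path (BFS) over edges active in [t_start, t_end]."""
    adj = defaultdict(set)
    for s, t, _, ts in EDGES:
        if t_start <= ts <= t_end:
            adj[s].add(t)  # edges are treated as undirected (assumption)
            adj[t].add(s)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return tuple(reversed(path))
        for nxt in sorted(adj[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None  # dst is unreachable in this time window

# Cached variant: repeated identical queries are answered from memory
cached_shortest_path = lru_cache(maxsize=None)(shortest_path)

def timed(fn, *args, repeats=1000):
    """Run `fn(*args)` `repeats` times; return the last result and elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn(*args)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    print(cumulative_edge_weight("a", 1, 3))
    print(timed(shortest_path, "a", "d", 1, 3))
    print(timed(cached_shortest_path, "a", "d", 1, 3))
```

Timings from a toy graph like this say nothing about database performance. The sketch only shows the shape of the queries and where a cache would sit.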
Touch interactions
(a) Node history graphs with timestamp browsing
(b) Graph filtering based on edge weight with multitouch controller (lower-bound highlighted yellow).
(c) Control widget (around finger) and zoom window with detached source.
(d) Local neighborhood of a selected node arranged and rendered in detached window (upper-right).
(e) Progress indicator circle provides feedback to users while analytics jobs are processed.
(f) Shortest paths between nodes are emphasized in red, and nodes along the paths are highlighted on the graph.
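Two of the interactions listed above, edge-weight filtering (b) and local-neighborhood extraction (d), correspond to simple graph operations behind the touch interface. The sketch below shows one way they might be computed. The dictionary-based graph representation, the function names, and the sample data are assumptions, not the sandbox's actual implementation.

```python
def filter_by_weight(edges, lower_bound):
    """Keep only edges whose weight is at least `lower_bound`."""
    return {pair: w for pair, w in edges.items() if w >= lower_bound}

def local_neighborhood(edges, node, hops=1):
    """Return the nodes within `hops` of `node`, treating edges as undirected."""
    frontier = {node}
    seen = {node}
    for _ in range(hops):
        nxt = set()
        for u, v in edges:
            if u in frontier and v not in seen:
                nxt.add(v)
            if v in frontier and u not in seen:
                nxt.add(u)
        seen |= nxt
        frontier = nxt
    return seen

# Hypothetical weighted graph: {(u, v): weight}
SAMPLE_EDGES = {("a", "b"): 5, ("b", "c"): 1, ("c", "d"): 7, ("a", "d"): 3}

if __name__ == "__main__":
    strong = filter_by_weight(SAMPLE_EDGES, lower_bound=3)
    print(sorted(strong))
    print(sorted(local_neighborhood(strong, "a", hops=1)))
```

In an interactive setting, the lower bound would come from the multitouch controller and the selected node from a tap. The result would then be handed to a layout algorithm for the detached window.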
Connectivity
• Connected to LONI Network and Science DMZ
• Connected to Internet 2.0 through GENI switch
• 40 Gbps connections for faster data transfer
Network diagram (labels): GENI Switch; VA Sandbox; Internet 2.0; Internet 1.0; link speeds shown: 10 Gbps, 40 Gbps, 1 Gbps
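To put the link speeds in perspective, the following sketch estimates ideal transfer times at 1, 10, and 40 Gbps. The 1 TB dataset size is a hypothetical figure chosen for illustration. The calculation assumes the full line rate is available and ignores protocol overhead, so real transfers would be slower.

```python
def transfer_seconds(size_bytes, link_gbps):
    """Ideal transfer time at full line rate; ignores protocol overhead and contention."""
    bits = size_bytes * 8
    return bits / (link_gbps * 1e9)

DATASET_BYTES = 1e12  # hypothetical 1 TB dataset (decimal terabyte)

if __name__ == "__main__":
    for gbps in (1, 10, 40):
        minutes = transfer_seconds(DATASET_BYTES, gbps) / 60
        print(f"{gbps:>2} Gbps: {minutes:.1f} min")
    # prints:
    #  1 Gbps: 133.3 min
    # 10 Gbps: 13.3 min
    # 40 Gbps: 3.3 min
```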
Expectations & Timeline
• How can you use this system?
– Virtual big data environment for stream processing, graph analytics
– Leverage stream processing, graph mining & visualization tools that are part of the sandbox
• Plan for deployment
– Software developed in a pilot environment for (a) intelligent levee surveillance, (b) event detection in social media and (c) influenza forecasting
– Looking for collaboration on one cybersecurity use case for network analysis (Aug 2017)
– System will be made available for users on Internet2 (Spring 2018)
Thank You
Contact:[email protected]:www.ucs.louisiana.edu/~nrg0821
Real Time Event Detection
Analytics Engine
[© CVDI]