KNOWLEDGE ARCHITECTURE AND BIG DATA - …files.meetup.com/19117935/NASA - Knowledge...

Post on 31-May-2020

© 2015 IHS. ALL RIGHTS RESERVED.

KNOWLEDGE ARCHITECTURE AND BIG DATA

How to Apply Knowledge Architecture to Big Data

David Meza, Chief Knowledge Architect, NASA Johnson Space Center

Federal Reserve, June 15, 2016

AGENDA

•  Knowledge Architecture
•  NASA Data Strategy
•  Cognitive Computing

“The most important contribution management needs to make in the 21st Century is to increase the productivity of knowledge work and the knowledge worker.”
PETER F. DRUCKER, 1999

To convert data to knowledge, a convergence of Knowledge Management, Information Architecture, and Data Science is necessary.

Knowledge Management

Data Science

Information Architecture

Knowledge Architecture

•  The people, processes, and technology of designing, implementing, and applying the intellectual infrastructure of organizations.
•  What is an intellectual infrastructure?
•  The set of activities to create, capture, organize, analyze, visualize, present, and utilize the information part of the information age.
•  Information + Contexts = Knowledge
•  Information Architecture + Knowledge Management + Data Science = Knowledge Architecture
•  KM without applications is empty (Strategy Only)
•  Applications without KA are blind (IT based KM)
•  Data Science transforms your data to knowledge

Knowledge Management

"Knowledge management is the process of capturing, distributing, and effectively using knowledge."

This definition has the virtue of being simple, stark, and to the point. A few years later, the Gartner Group created a second definition of KM, which is perhaps the most frequently cited one (Duhon, 1998):

"Knowledge management is a discipline that promotes an integrated approach to identifying, capturing, evaluating, retrieving, and sharing all of an enterprise's information assets. These assets may include databases, documents, policies, procedures, and previously un-captured expertise and experience in individual workers."

Information Architecture

The intent is to achieve a variety of capabilities to enable the Agency to efficiently acquire or generate, find and access, use and reuse, share and exchange, manage and govern, and store and retire our data.

Data Science

Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics, similar to Knowledge Discovery in Databases (KDD).

The Knowledge Discovery in Databases (KDD) process is commonly defined with the stages:
(1) Selection
(2) Pre-processing
(3) Transformation
(4) Data Mining
(5) Interpretation/Evaluation
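
As a rough illustration of how the five KDD stages chain together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sample records, the field names (`sensor`, `temp_c`), the z-score transformation, and the outlier threshold are hypothetical and do not come from any NASA system or from the talk itself.

```python
from statistics import mean, pstdev

# Hypothetical raw records; field names and values are illustrative only.
RAW = [
    {"sensor": "A", "temp_c": 20.0},
    {"sensor": "A", "temp_c": 21.0},
    {"sensor": "A", "temp_c": None},   # missing measurement
    {"sensor": "A", "temp_c": 19.0},
    {"sensor": "A", "temp_c": 20.0},
    {"sensor": "A", "temp_c": 40.0},
    {"sensor": "B", "temp_c": 5.0},
]


def select(records, sensor):
    """(1) Selection: keep only the records relevant to the question."""
    return [r for r in records if r["sensor"] == sensor]


def preprocess(records):
    """(2) Pre-processing: drop records with missing measurements."""
    return [r for r in records if r["temp_c"] is not None]


def transform(records):
    """(3) Transformation: express each temperature as a z-score."""
    temps = [r["temp_c"] for r in records]
    if not temps:
        return []
    mu, sigma = mean(temps), pstdev(temps)
    if sigma == 0:
        return [0.0 for _ in temps]
    return [(t - mu) / sigma for t in temps]


def mine(z_scores, threshold=1.5):
    """(4) Data mining: return indices of values far from the mean."""
    return [i for i, z in enumerate(z_scores) if abs(z) > threshold]


def interpret(records, outlier_indices):
    """(5) Interpretation/Evaluation: phrase the findings for a reader."""
    return [
        f"sensor {records[i]['sensor']}: {records[i]['temp_c']} C looks anomalous"
        for i in outlier_indices
    ]


def run_kdd(records, sensor):
    """Run the five stages in order for one sensor."""
    selected = select(records, sensor)
    cleaned = preprocess(selected)
    z_scores = transform(cleaned)
    outliers = mine(z_scores)
    return interpret(cleaned, outliers)


if __name__ == "__main__":
    for finding in run_kdd(RAW, "A"):
        print(finding)
```

Each stage is a separate function so that any one of them can be swapped out (for example, a different mining technique) without touching the others, which is the main point of describing KDD as a staged process.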

Data Strategy

Key Recommendations:
•  Data Management
•  Unified Data Lifecycle
•  Data Governance
•  Data Analytics Lab
•  Data Fellows Program
•  Data Stewards

Data Strategy Framework

Challenge: Lack of an explicit data management framework, fragmented data lifecycle, and lack of data integration
Example: No Agency-wide architecture and standards for information interoperability. Much of the data NASA produces is inaccessible or human-readable only, with no method to draw in, parse, organize, or make use of this data.
Opportunity: Improved architecture, standards, and accessibility permitting quicker and more effective collection, digitization, and discovery; increased focus on mission-specific data needs and type-specific approaches
Recommendation: 1. Data Management; 2. Unified Data Lifecycle; 3. Data Governance Program

Challenge: Need for new emerging data analytics technologies and capabilities to address mission-specific challenges
Example: Many of NASA’s current data systems are significantly outdated and cannot scale to meet demand.
Opportunity: Experimenting with new algorithms, applications, and techniques
Recommendation: 4. Data Analytics Lab

Challenge: Data expertise gap
Example: Data scientists are in low supply and high demand, and NASA will need to compete with industry to attract the best & brightest.
Opportunity: Collaborative partnerships to build internal capacity and expertise and utilize external talent, tools, and information
Recommendation: 5. Data Fellows Program

Challenge: Need to effectively address culture and policy issues alongside technology
Example: In many cases, individuals are not motivated to share data for collaborative use with others.
Opportunity: Increased cross-agency and cross-stakeholder ownership and approach to data management and data analytics challenges
Recommendation: 6. Data Stewards

KNOWLEDGE ARCHITECTURE – ANALYTICS FRAMEWORK

IT & Intellectual Infrastructure

Security, Data Quality, Workflow Management, Data Management, Resource Management

Data Products:
•  Predictions
•  Models
•  Visualizations
•  Decision Analysis
•  Wiki

Sources:
•  Sensor
•  Experimental
•  Computed (modeling & simulation)

Forms:
•  Digital
•  Text
•  Visual

Organization:
•  Structured
•  Semi-Structured
•  Unstructured

Functions:
•  Governance
•  Taxonomy
•  Ontology
•  Comm. Plan
•  Operations Management
•  Security
•  Master Data Management
•  Content Management
•  Metadata
•  Data Quality

Tools & Environments:
•  Large scale storage
•  RDBMS
•  Parallel RDBMS
•  NOSQL
•  Hadoop

Organization:
•  Structured
•  Semi-Structured
•  Unstructured

Tools & Environments:
•  Computation & data access
•  Data Mining
•  Text Mining
•  Optimization
•  Net Algorithm
•  New Algorithm
•  Visualization

Access Pattern:
•  Structured
•  Semi-Structured
•  Unstructured
•  Predictable
•  Unpredictable

Data Acquisition & Creation

Data Management

Data Warehousing

Data Analytics, BI (Knowledge Extraction)

Knowledge Presentation and Visualization

Source User

“We have an opportunity for everyone in the world to have access to all the world’s information. This has never before been possible. Why is ubiquitous information so profound? It is a tremendous equalizer. Information is power.”
ERIC SCHMIDT (FORMER CEO OF GOOGLE)

30% of total R&D spend is wasted duplicating research and work previously done.
Source: National Board of Patents and Registration (PRH), WIPO, IFA

54% of decisions are made with incomplete, inconsistent, and inadequate information.
Source: InfoCentric Research

46%: Workers can’t find the information they need almost half the time.
Source: IDC

Knowledge Architecture: The Next Phase

Push versus Pull

WHAT COULD YOU ACCOMPLISH IF YOU COULD:

•  Empower faster and more informed decision-making
•  Leverage lessons of the past to minimize waste, rework, re-invention, and redundancy
•  Reduce the learning curve for new employees
•  Enhance and extend existing content and document management systems

JSC Knowledge Architecture Services:

§  Analytics
§  Web Platform for Analysis and Visualization
§  NOSQL - Neo4j and MongoDB
§  Visualization Services - Business Intelligence
§  Repository Specific Search
§  Wiki Farm
§  Code Sharing and Project Collaboration
§  Training

Contact Information

David Meza – david.meza-1@nasa.gov

Twitter - @davidmeza1

Linkedin - https://www.linkedin.com/pub/david-meza/16/543/50b

Github – davidmeza1

Blog – davidmeza1.github.io

QUESTIONS?