THE FEASIBILITY AND UTILITY OF IMPLEMENTING TEMPORAL DATA CUBES
TO SUPPORT PROJECTION OR “FORECAST” MODELS AND LAND CHANGE TRENDS
A Report of the National Geospatial Advisory Committee Landsat Advisory Group
April 2018
NGAC Data Cube Feasibility for Forecasting Paper April 2018
National Geospatial Advisory Committee (www.fgdc.gov/ngac)
THE FEASIBILITY AND UTILITY OF IMPLEMENTING TEMPORAL DATA CUBES TO SUPPORT PROJECTION OR “FORECAST” MODELS AND LAND CHANGE TRENDS

Executive Summary

In August of 2016, the U.S. Geological Survey (USGS) requested that the Landsat Advisory Group (LAG), a subcommittee of the National Geospatial Advisory Committee, study the feasibility and utility of implementing temporal data cubes to support projection or ‘forecast’ models of land change trends. This study was a follow-on to two previous LAG study papers on “Product Improvement” and “Cloud computing” that had both been published in 2013. The study was proposed to help address whether a deeper market demand for forecasting land change would develop. Several questions were also posed based on the presumptive use of a data cube with Landsat derived information, as a measure, and time, as a dimension, which this report discusses.

Background

The joint National Aeronautics and Space Administration (NASA) / United States Geological Survey (USGS) Landsat program provides the longest continuous and openly available space-based record of Earth's land in existence. Landsat missions have acquired moderate resolution multispectral data for over 40 years. The European Space Agency (ESA) has been gathering Earth observation data for a long time and initiated systematic archiving and analysis of data from other agencies’ satellites in the early 1980s. It began its own Earth observations with Europe’s first Earth Resources Satellite (ERS). The ESA Earth Observing Sentinel satellites over nearly the past four years have added to the amount, the complexity, and the relevance of readily accessible remotely sensed data. Having a facile, agile, and reliable way for all to interact, directly or indirectly, with these already vast but also growing collections has both national and international interest.

These collections pose the “Big Data” technology challenge to previous data architectures and tools to manipulate or to interrogate priceless observations from a wide range of sensors. Higher spatial, spectral, and temporal resolution of the collection compounds the challenge as well as the opportunities to better understand our Earth. Improved approaches to the management, preparation, distribution and analysis will relieve some of the data-to-information-to-knowledge progression stress. Algorithms for statistical analysis of increasingly larger samples (and perhaps significantly varying) “Big Data,” used under different conditions to address different issues and perspectives, must be wisely selected and used to avoid erroneous statistical inference or inadequate conclusion.

The Federal Geographic Data Committee (FGDC) requested, for the 2016 program, that the Landsat Advisory Group provide advice on “the feasibility and utility of implementing temporal data cubes to support projection or ‘forecast’ models of land change trends” and noted that this work was “intended as a follow on topic to the LAG study papers on Product Improvement and Cloud computing published in 2013.” Five questions were posed:
• In addition to Landsat, what other data sources (to include EO, SAR, and LIDAR) are optimally suited for leveraging (e.g., co-registered) to support data cube implementations for land change analysis and forecast modeling?
• What kinds of Landsat time-series products would have the broadest community use or most impactful contribution in specific areas?
• Which organizations with expertise in forecast modeling are best postured to evaluate and demonstrate the forecast potential from a Landsat-based temporal data cube?
• How far back in time into the Landsat archive should the staging of ‘analysis ready data’ be considered? E.g., early data collections such as multi-spectral scanner (MSS) data are less equipped (in terms of metadata) to support rigorous geometric and radiometric calibration compared to later collections.
• How could efficient synergy be realized among government and commercial roles for data cube development, and operations (processing, storage, distribution) to satisfy broad community needs?
The NGAC Paper, dated 11 December 2013, on “Product Improvement: Advise USGS on potential means of modifying the current products to make them more useful to commercial information providers and value-added analysts”¹ made the general recommendation that “USGS further improve Landsat products to both enhance the scientific value of the imagery, but also to provide additional value to the commercial and government organizations wishing to extract the maximum value from the imagery.” Seven points expanded that summary recommendation for USGS:
• Clearly define what USGS will produce and avoid competition with commercial work.
• Refine geometric accuracy and radiometric measurements to enable better change detection.
• Improve L1G product geometric accuracy and co-registration.
• Define a standard surface reflectance product.
• Consolidate scientific research and publish best practices for a range of products.
• Provide certification/validation facilities for products not produced by USGS.
• Simplify access to the L1T product.
The second NGAC Paper of the same date, entitled “Cloud Computing: Potential New Approaches to Data Management and Distribution”¹ endorsed the use of cloud computing and suggested how USGS / Earth Resources Observation and Science (EROS) should leverage that technology by:
• Supporting third-party cloud providers by providing bulk data download;
• Co-locating data and on-demand processing for only the desired information;
• Transmitting the required processing model to the cloud so massive data could be handled by multiple CPUs;
• Downloading subsets of L1T products;
• Giving attention to use of open software standards to avoid tying any services to proprietary software; and
• Streamlining security.
Introduction

Explaining interest in the spatio-temporal data cube

The options for storing and accessing relevant data offer a range of functionality but are somewhat limited with massive data and specific requirements to satisfy particular business cases. In many cases, a data warehouse can adequately support information processing as a stable platform for consolidated
1 Two of the 2013 NGAC Key Documents found at https://www.fgdc.gov/ngac/key-documents
and transactional data. However, of increasing interest, online analytical processing (OLAP) more adequately allows for multi-faceted consumption of data to meet varied needs. The data cube provides not only a storage structure but also the “staging” space for analysis of the information. The OLAP cube is a multi-dimensional database, which has drawn increasing attention over the past several years for earth observation collection. A marketing promotion for an Earthserver Project workshop on data cubes described the daylong workshop focus in the following way. “The data cube concept promises to tackle some of the challenges that come along with large volumes of environmental and geospatial data. Data cubes offer a more on-demand and analysis-ready access to n-dimensional data, which can be accessed along any axis, allowing for efficient trim or slice operations. The data cube concept makes large volumes of environmental and geospatial data more manageable and thus, increases the general uptake of Big Earth data.”²

Examining a notional architecture

The Committee on Earth Observation Satellites (CEOS), of which the US is a member country, began an Open Datacube initiative in 2016. Brian Killough (NASA) and Robert Woodcock (CSIRO of Australia) have been principal advocates for the initiative. When the initiative launched, use of the data cube, with the dimensions of space, time, and data type, was already a proven concept by Geoscience Australia and the Australian Space Agency and within development for their Landsat data archive. The objective was to have 20 countries operationally involved by 2022.³ The pace, however, is exceeding the July 2017 plan. In July 2017, three countries (Australia, Colombia, and Switzerland) had operational capability. Four others were under development and twenty-one countries were under review. During a teleconferenced discussion with Dr. Killough and the Task Team 2 in mid-October 2017, he commented that 29 countries were already under review. In March 2018, during a briefing at the CEOS 7th Working Group for Capacity Building and Data Democracy Annual Meeting in Brazil, it was mentioned that at least 40 countries have entered into some level of discussion although the objective does remain 20. The speaker noted that Australia, Colombia, and Switzerland are still doing well. The United Kingdom, Uganda, Vietnam, Taiwan, Georgia, and Moldova are making progress. There are African regional data cubes in Ghana, Kenya, Senegal, Sierra Leone, and Tanzania. Therefore, the notion of the data cube is gaining interest and support. The global nature of the interest, however, adds to the complexity of how the US plans to expand its efforts with the Landsat collections.

Finding 1: Internationally the utility of the data cube for organizing Landsat data over time and location has growing acknowledgement to support time series analysis.³

Colombia has found value in examining land change since 2000 and in understanding the trends for forest mapping and management. The main objectives of the Swiss Data Cube (SDC) are to support the Swiss government for environmental monitoring.⁴ The Vietnam Data Cube is intended to create broad applications for socio-economic sustainable development goals for Vietnam as well as other countries in the region.⁵

Analysis Ready Data (ARD)⁶ feed the formation of a data cube. Landsat 8 Operational Land Imager (OLI) / Thermal Infrared Sensor (TIRS) Tier 1 and 2, Landsat 7 Enhanced Thematic Mapper Plus (ETM+) Tier 1,
2 https://themes.jrc.ec.europa.eu/news/view/158675/earthserver-workshop-data-cubes-for-big-earth-data-19th-20th-october-2017-frascati-rm-italy
3 Killough, Brian, Open Data Cube Background and Vision, https://www.opendatacube.org/events, July 7th, 2017
4 http://www.swissdatacube.org/
5 https://vnsc.org.vn/en/news-events/news/internal-news/introduction-of-satellite-data-sharing-system-vietnam-data-cube/
6 https://landsat.usgs.gov/ard
and Landsat 4-5 Thematic Mapper (TM) Tier 1 comprise the contiguous US, Alaska, and Hawai’i ARD, which is available from the EROS Center, using EarthExplorer to download. Starting mid-March 2018, two new Landsat science products, Surface Temperature and Dynamic Surface Water Extent, will begin to be integrated.

Dr. Robert Woodcock, who has worked for almost two decades in the field of visualization, spatial information systems and analytics and its application to Earth Science with a focus on ensuring research
Figure 1. The architectural concepts of the Australian Geoscience Data Cube
innovation leads to business innovation, has been reinforcing the above-mentioned work with the Open Datacube initiative using his extensive experience. He, with some colleagues, prepared the diagram seen in Figure 1,⁷ which describes a notional architecture employing the data cube. The four layers, from bottom to top, are as follows:
• Data Acquisition and Inflow - Observations are collected and pre-processed to an ‘analysis ready’ level by various custodians;
• Datacube Infrastructure - analysis ready data are indexed into the AGDCv2 including ingestion into multi-dimensional datasets, with a suite of tools for task execution, discovery, visualization and so on;
• Data and Application Platform - Platforms and environments that allow routine generation of products, and exploration of new products in a ‘virtual laboratory’ environment; and
• UI and Application Layer - A diverse set of applications is enabled by the underlying infrastructure.
Finding 2: The recommendations from the aforementioned LAG papers can be aligned with this notional architecture “to both enhance the scientific value of the imagery, but also to provide additional value to the commercial and government organizations wishing to extract the maximum value from the imagery” and to offer “potential new approaches to data management and distribution.”

EROS Center and data cubes:

In November 2016, USGS/EROS provided the LAG team with a briefing on the Land Change Monitoring, Assessment, and Projection (LCMAP) initiative “to harness the Landsat record in order to provide state-of-the-art land change capabilities needed by scientists, resource managers, and decision makers.” As explained during the presentation, managing the resultant land-change products required addressing the issue “that the Landsat archive, currently organized as path rows, is not sufficiently efficient for time series studies. Moving to a grid-based data cube approach with API’s that condition and serve data per user specification will reduce data preparation time.” The data structure to be used was identified as an OLAP cube. The data content itself is the Analysis Ready Data (ARD) in the diagram above. The tiling scheme is modeled upon the Web Enabled Landsat Data (WELD) and will use the Albers Equal Area Conic projection and the World Geodetic System 84 datum. ARD are standardized, well-characterized radiometric and geometric products. Dr. Tom Loveland characterized the ARD as Landsat data processed to a level that enables direct use in applications.
• It will support geospatial, multi-spectral, and multi-temporal manipulations for the purposes of data reduction, analysis, and interpretation.
• It offers consistent radiometric processing scaled both to top-of-atmosphere (TOA) reflectance and surface reflectance.
• It is designed for consistent geometry including spatial coverage and cartographic projection – e.g., pixels align through time, <12 m RMSE.
• It provides metadata on data provenance, geographic extent, and data quality.
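The alignment property in the list above (pixels that line up through time on a fixed tile grid) can be illustrated with a small sketch. It assumes a 30 m pixel and the 5000 x 5000 pixel tile size mentioned later in this report; the grid origin, the `TileGrid` class, and its `locate` method are hypothetical names and values chosen for illustration and are not taken from the USGS ARD specification.

```python
import math
from dataclasses import dataclass

PIXEL_SIZE_M = 30.0   # assumed Landsat pixel size in metres
TILE_SIZE_PX = 5000   # tile width and height in pixels, per this report


@dataclass(frozen=True)
class TileGrid:
    """Fixed grid in a projected coordinate system (hypothetical origin)."""
    origin_x: float               # easting of the grid's upper-left corner
    origin_y: float               # northing of the grid's upper-left corner
    pixel_size: float = PIXEL_SIZE_M
    tile_size: int = TILE_SIZE_PX

    def locate(self, x: float, y: float) -> tuple[int, int, int, int]:
        """Return (tile_h, tile_v, col, row) for a projected coordinate."""
        col_global = math.floor((x - self.origin_x) / self.pixel_size)
        row_global = math.floor((self.origin_y - y) / self.pixel_size)
        if col_global < 0 or row_global < 0:
            raise ValueError("coordinate lies outside the grid")
        tile_h, col = divmod(col_global, self.tile_size)
        tile_v, row = divmod(row_global, self.tile_size)
        return tile_h, tile_v, col, row


# Illustrative origin only; a real grid would use its published origin.
grid = TileGrid(origin_x=-2_000_000.0, origin_y=3_000_000.0)

# Two acquisitions that report slightly different coordinates inside the
# same 30 m cell resolve to the same tile and pixel index.
first = grid.locate(-1_849_985.0, 2_849_985.0)
second = grid.locate(-1_849_990.0, 2_849_990.0)
print(first, second)
```

Because the index depends only on the grid definition and not on the acquisition date, every scene resampled to the grid places a given ground location in the same tile and pixel, which is what allows a time series to be read by stacking tiles.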
7 Adam Lewis, Simon Oliver, Leo Lymburner, Ben Evans, Lesley Wyborn, Norman Mueller, Gregory Raevksi, Jeremy Hooke, Rob Woodcock, Joshua Sixsmith, Wenjun Wu, Peter Tan, Fuqin Li, Brian Killough, Stuart Minchin, Dale Roberts, Damien Ayers, Biswajit Bala, Lan-Wei Wang, The Australian Geoscience Data Cube - Foundations and lessons learned, https://www.sciencedirect.com/science/article/pii/S0034425717301086
In simple words, ARD are intended to provide some pre-processed products that alleviate some work burden on the part of the users. Therein lies both its benefit for most consumers and concern for some other Landsat users that will be addressed later.

Questions Posed by USGS

In addition to Landsat, what other data sources (to include EO, SAR, and LIDAR) are optimally suited for leveraging (e.g., co-registered) to support data cube implementations for land change analysis and forecast modeling?

Among the efforts considered by the LCMAP team already has been to increase time series density by adding Sentinel-2.⁸ In the CEOS initiative, Colombia and Switzerland are studying how to incorporate both Sentinel-1 (SAR) and Sentinel-2 (multi-spectral). The Vietnam prototype includes both Sentinel-2 and ALOS data. Progress was discussed on 6 March 2018, when the Vietnam National Space Center organized a workshop “Introduction of satellite data sharing system Vietnam Data Cube” in Hanoi.

One should not assume that all other data sources could, would, or should be housed by USGS/EROS. The data cube design must allow additional dimensions or layers to the cube. It will often be necessary for another government, academic or commercial organization to incorporate their own, sometimes proprietary, data set to improve the results or to prepare tailored analysis. Thus, in Figure 1, one could consider analysis ready data to be multiple data sets that have been readied by some pre-processing to enter into the data cube structuring. Here, as seen in Figure 2,⁹ layers of different data source products and extensions of more locations or times can be adaptively incorporated to address either some specific or generic issue. The graphic may obscure the reality that prospective “layering” demands consideration of some standardizing structure and functional guidelines.
Figure 2. Graphic of Conceptual Data Cube
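As a rough illustration of the conceptual cube, the sketch below holds a small synthetic cube in a NumPy array with the axes (time, layer, row, column) and shows the trim and slice operations mentioned earlier, a per-pixel “drill” through time, and the addition of a co-registered layer from another source. The axis order, layer names, array sizes, and random values are assumptions made for the example; they do not describe any USGS or CEOS implementation.

```python
import numpy as np

# Synthetic cube with axes (time, layer, row, col); all values are placeholders.
times = ["2016-01", "2016-07", "2017-01", "2017-07"]
layers = ["landsat_red", "landsat_nir"]
rng = np.random.default_rng(0)
cube = rng.random((len(times), len(layers), 6, 8))

# Trim: keep a sub-range along one or more axes; the number of axes is unchanged.
trimmed = cube[1:3, :, 2:5, 0:4]

# Slice: fix a single coordinate on an axis; that axis is dropped.
red_2017_01 = cube[times.index("2017-01"), layers.index("landsat_red")]

# Drill: the full time series of one layer at one pixel.
nir_series = cube[:, layers.index("landsat_nir"), 3, 5]

# Layering: append a co-registered layer from another (hypothetical) source.
# This only works because the new layer shares the time, row, and column axes.
sar = rng.random((len(times), 1, 6, 8))
extended = np.concatenate([cube, sar], axis=1)
extended_layers = layers + ["sar_backscatter"]

print(trimmed.shape, red_2017_01.shape, nir_series.shape, extended.shape)
```

The concatenation step is where the standardizing structure noted above matters: a new source can be layered in only after it has been resampled to the same grid and time steps as the existing cube.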
The notion that a variety of possible sources of data would accompany the ARD within the framework of the USGS LCMAP initiative was characterized in the graphic of Figure 3 provided by Dr. Loveland.
8 Dwyer, John, “USGS Analysis Ready Data” presentation to the Landsat Science Team on January 14, 2016 and recent release indicating interest: https://landsat.usgs.gov/february-17-2018-us-landsat-ard-special-issue-call-manuscripts
9 Adapted from https://www.slideshare.net/algum/data-cubes-7923771/5
Figure 3. LCMAP Conceptual Framework and Flow¹⁰
What kinds of Landsat time-series products would have the broadest community use or most impactful contribution in specific areas?

The Analysis-ready data (ARD) preparation, in general, rests upon foundational technology that can benefit nearly all users of Landsat data, not just a few specific applications. For example, ensuring that all data are consistently calibrated and carry appropriate quality-assurance metadata is of benefit to everyone, regardless of whether they are using data in one of the existing UTM grids or a new country-specific grid. The U.S. Landsat ARD tiling system is a modified version of the WELD structure. Three tile grid extents are defined for CONUS, Alaska, and Hawaii. The grid origins are defined in relation to the WGS datum but adjusted to align with the National Land Cover Database. They are country specific.

In addition, the development of a US-specific ARD-based data cube in an Albers Equal Area Conic mapping projection is well-aligned with the mission of the USGS serving its US customers, as is preprocessing other geographically-coincident data sets to be available in that same projection. That approach both enables and facilitates the development of a range of US-specific higher-level data products and services. However, as data cubes become ubiquitous, what works well for the US may be quite awkward
10 Loveland, Thomas, An LCMAP Overview: Land Change Monitoring, Assessment, and Projection, a Discussion with the Landsat Advisory Group and AmericaView Members, November 16, 2016
for other countries. When the Open Geospatial Consortium (OGC®) first began its Discrete Global Grid System (DGGS)¹¹ working group, it sought to establish a specification to address collating spatial data from multiple places and sources and overcoming the challenges of working with different reference or grid systems.

ARD present the Big Data challenge. As explained by USGS/EROS, ARD are the foundation of LCMAP, providing standardized, well-characterized radiometric and geometric products (Level-1 Collection 1), the atmospheric correction and geo-physical surface reflectance and surface brightness retrievals (Level-2), and hierarchical metadata to include pixel-level attributes. Also available would be “open source” code to establish the processing and metadata standards, to accommodate scalable architectures, and to deploy into public or private clouds. The latter are all points consistent with the recommendations from the Cloud Computing paper. Task team members endorsed this openness but also recommend the distribution of verification procedures to confirm that the methods and workflows have been replicated properly for any non-USGS production that incorporates other sources and initiates tailored analysis. Those procedures might mirror what USGS itself uses. At this time, the ARD’s “open source” code is accessible through https://github.com/USGS-EROS. The download of a tile still involves 5000 x 5000 pixels per collection event, and any partitioning down to some smaller geographic footprint for a more local area occurs in the chosen environment of the user. Improvements to the lengthy and space-demanding download and processing tasks are needed.

Finding 3: Non-USGS processing of data using the open-source code and algorithms available from USGS could necessitate that USGS also release procedures documentation and some verification test data sets.

The task team responsible for this report cautions that the USGS should ensure that its various efforts relating to Landsat data processing and distribution are well aligned with each other and clearly articulated to the user community. In particular, the relationship between the Collection 1 reprocessing effort, existing Surface Reflectance processing and distribution efforts, and the Analysis-Ready Data effort may need to be clarified. Fundamental improvements to processes, like sensor calibration, should be applied equally to processing and delivery of both UTM and data cube data. Similarly, both TOA and Surface Reflectance data are of value in all product forms and should be made available in a consistent manner. Keeping all these efforts aligned may minimize duplication of effort, but more importantly, it will avoid user confusion, which could otherwise lead to erroneous use of data by end users.¹²

Recently brought to the attention of the Task Team has been the voice of those who worry about the impact of “normalizing” the reflectance product across all the collections. They agree that pre-processing the Landsat data into this “normalized” state for time-series analysis of multiple collections over a large area brings great efficiencies by reducing the processing burden on many or even most of the users. What the concerned group questions is what error is introduced in that pre-processing that might affect analysis of smaller footprints and more restricted time sequences. Importantly, they are not claiming that significant errors might result. Rather, they are concerned that whatever analysis may have been completed before moving ahead with ARD has not been quantified for them. They endorse that the Level 1T products will remain available and will want to do more study on
11 http://www.opengeospatial.org/projects/groups/dggsswg
12 Steven J. Covington, Principal Systems Engineer for the USGS Land Remote Sensing Program, commented: Current thinking has Collection 2 encoded with Cloud Optimized GeoTIFF (COG) to enable efficient extraction of user-defined areas smaller than the planned storage granule (a WRS Scene)
the algorithms that have been used to create ARD so they can reliably assess the error, if any or negligible, introduced into the data. A recommendation would be that USGS EROS Center release any study analysis completed on the error impact of the preprocessing or initiate such a study. (It is recognized that if there is concern that cannot be resolved, one can reverse the process that produced the TOA product and have the historical radiance product.)

Which organizations with expertise in forecast modeling are best postured to evaluate and demonstrate the forecast potential from a Landsat-based temporal data cube?

Much has been written about problems of forecasting with any Big Data, including all the imagery and geospatial collections, with the foremost challenge being the lack of personnel skilled for this task. The tiling scheme chosen for ARD and applied to the Landsat images over the US should assure alignment of tiles so that “drilling” through several images over the same geography provides the same footprint for subsequent time series analysis that could lead to forecasting future conditions based upon past information. It is trusted that rigorous testing has been done to assure the layered footprints over time are positioned within some defined degree of positional accuracy.

One objective of the ARD effort could be to improve use of “big geodata.” Some of the research and analysis work with large quantities of geospatial data has discussed the frustrating insufficiency of traditional statistical techniques, or the challenging selection of the most appropriate statistical technique, to obtain reliable and consistent forecasts from large quantities of data. In the initial releases of the Landsat ARD and the temporal data cube, it would be wise to consider the use of academic research centers to assess how much the new structure actually facilitates analysis and to encourage universities to revise classroom modules that prepare the future analysts and information managers. Will ARD enable better forecasts with Big Data using a variety of novel techniques? Not only can academic organizations be excellent partners with the government using these vast stores of data, but several private companies will also be eager to use the ARD and build versions of the data cube tailored to support processing that delivers the answers needed by their customers.

How far back in time into the Landsat archive should the staging of ‘analysis ready data’ be considered? E.g., early data collections such as multi-spectral scanner (MSS) data are less equipped (in terms of metadata) to support rigorous geometric and radiometric calibration compared to later collections.

The decision to include the MSS data has been strongly recommended within USGS at the EROS Center. Addressing the question may be a moot point, given its value in the long term of continuous Earth imaging and observation and its inclusion being strongly recommended by some members of the previous Landsat Science Team. However, this task team strongly recommends that prioritizing development work should be carefully scrutinized within USGS. Is global ARD without MSS of greater value, to a growing international community of users, than US ARD with MSS? In addition, following some of the concern about forecasts from massive data stores, the issue of signal to noise (not easily mitigated by the seriously diminished amount of metadata for MSS data) should also be evaluated.

How could efficient synergy be realized among government and commercial roles for data cube development, and operations (processing, storage, distribution) to satisfy broad community needs?

Caution was urged by team members about how much of the production workload should be assumed by USGS. The analysis of the ARD, as ingested into the data cube infrastructure, should not be solely
dependent on the computing infrastructure of the USGS, which is unlikely to have ready access to some of the latest technology advancements, given the budgeting processes. Many of the organizations very interested in the promise of LCMAP might need attention more focused on specific areas that the EROS Center had not planned to address immediately. Those specific areas might have larger or smaller footprints or they might be outside the US. It is not clear that the government is prepared for such flexible response for building such specific data cubes, nor has any good justification been provided for why the government should assume that role of production. Members of the Task Team highly recommended more consideration of the private public partnership concept in the end-to-end process from Landsat level 1 products to ARD to user-tailored data cube. The Task Team agreed that USGS, as the Landsat source experts, should be responsible for Landsat ARD quality and consistency, although they would likely benefit from commercial support for the processing and distribution infrastructure.

Finding 4: The commercial sector is ready to provide data cube tailoring assistance, given its increasing experience with global geospatial data. It is also prepared to provision infrastructure to assist in the production of ARD.

As the needed tools and techniques mature, the team similarly recommends that USGS should not undertake to scale this country-specific effort globally themselves. There is no one peerless global projection coordinate system. Given specific needs, any spatial multi-dimensioned data cube can be quite parochial, and each country or region that wants a data cube would likely select their own tiling grid to minimize distortion in their region and maximize interoperability with other existing regional data sets. The USGS should focus on opening up its tools and the necessary input data sets so that third parties in the private sector can offer a service of building these data cubes for global customers in accordance with USGS best practices. Such a scenario might also involve other countries producing their own ARD, and if from Landsat, that could require the USGS to release image data (perhaps Level 0), DEM, GCP data, and all other necessary inputs in addition to the code that USGS uses to create the US ARD product. In this way, the USGS could focus on developing expertise and on building operational systems for the US, without straying into building operational systems for the world.

The concern about USGS producing either ARD or data cubes for the global customer relates back to the earlier description of both a mapping projection and a grid system that do not apply well globally. That raised the question about the priorities of the USGS production plans and how and why the private sector can step forward. The CEOS initiative is not without questions for similar challenges. Even if global stakeholders agree that an Open Datacube vision has promise, will they make their contributions to mitigate the risk that the concept cannot be scaled with limited resources? Given the adoption of the concept and the development of national data cubes under the CEOS initiative, having excellent transformation algorithms for the projections would allow necessary flexibility. The tiling scheme, however, could be far more challenging, if and when adjacent countries build national data cubes and select differing schemes. The role of CEOS in establishing or instantiating standards and specifications, like those in the DGGS mentioned above, should not be underestimated.

Previously this paper mentioned standards with respect to the open software standards needed to avail any requester of the software, who might require the algorithms used by USGS in preparing the ARD at any point in the anticipated improvements over time.

Finding 5: The data cube implementation involves a broad scope of standards issues.

• In February 2018, after an informal discussion of the topic, an OGC group prepared an OGC
discussion paper: “In response to a recent discussion (via the OGC email lists) regarding perceptions
about data cubes and DGGS, it was suggested that we begin a more formal discussion on this topic within the OGC Technical Committee. This information document aims to initiate a discussion of the broader definition of a data cube and the complementary role that DGGS technologies play.”¹³ The following future actions were identified. “… the Authors recommend some targeted actions with which we should proceed in collaboration with the community of the DGGS specification and domain groups. These actions mainly focus on investigating the efficiency on querying and exploring large multi-dimensional arrays while using the DGGS technologies on Datacubes. These activities will be exercised under specific ongoing big data research international projects.”
• Also in February, the Open Geospatial Consortium announced that it was seeking public comment on the Web Coverage Service (WCS) 2.1 Candidate Standard. The qualifier statement for the announcement read “Updated WCS 2.1 Standard will simplify access to spatio-temporal ‘big data cubes’”.¹⁴ The release also offers more explanation. “By supporting the more general data cube model of CIS 1.1, the WCS 2.1 standard will simplify access to spatio-temporal ‘big data cubes’, with an operation spectrum ranging from simple sub-setting in space and time up to complex spatio-temporal analytics through Web Coverage Processing Service (WCPS).” WCPS offers a protocol-independent language for the extraction, processing, and analysis of multi-dimensional coverages representing sensor, image, or statistics data, such as might be enveloped within a data cube.
• In 2017, Dr. Peter Baumann, Professor of Computer Science, Jacobs University Bremen, published a positively provocative paper within the community of interest, named the Data Cube Manifesto,¹⁵ in which he commented, “Recently, the term data cube is receiving increasing attention as it has the potential of greatly simplifying “Big Earth Data” services for users by providing massive spatio-temporal data in an analysis-ready way. However, there is considerable confusion about the data and service model of such data cubes.” That statement was followed by his six principles of data cube service, concluding with the sixth being “Data cubes shall support a language allowing clients to submit simple as well as composite extraction, processing, filtering, and fusion tasks in an ad-hoc fashion … The OGC data cube standards, CIS and WCS/WCPS, are embraced by open-source and proprietary implementers, coming with compliance tests enabling interoperability down to the level of single pixels. Availability of data cube standards and tools is heralding a new era of service quality and, ultimately, better data insights.”
The Task Team recommends that OGC be encouraged to continue work on the standards that support the agile, reliable, and consistent use of a data cube approach. This would help address this section’s question about the efficient synergy between public and private sector use to meet customer/client requirements.
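To make the “simple sub-setting in space and time” quoted above more concrete, the following sketch assembles a key-value GetCoverage request in the style of WCS 2.0.1, which is used here because the 2.1 text was still a candidate standard. The endpoint, coverage identifier, and axis labels (`Lat`, `Long`, `ansi`) are hypothetical; real services advertise their own coverage names and axis labels, and the helper `build_getcoverage_url` is an illustrative function, not part of any OGC or USGS library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_getcoverage_url(endpoint, coverage_id, subsets, fmt="image/tiff"):
    """Build a WCS 2.0.1-style KVP GetCoverage URL with one subset per axis."""
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", coverage_id),
        ("format", fmt),
    ]
    # Each axis gets its own repeated "subset" key of the form axis(low,high).
    for axis, (low, high) in subsets.items():
        params.append(("subset", f"{axis}({low},{high})"))
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint, coverage name, and axis labels.
url = build_getcoverage_url(
    "https://example.org/wcs",
    "landsat_surface_reflectance",
    {
        "Lat": (40, 41),
        "Long": (-105, -104),
        "ansi": ('"2017-01-01"', '"2017-12-31"'),  # time bounds are quoted strings
    },
)
print(url)

# Decoding the query string recovers the three subset expressions.
query = parse_qs(urlsplit(url).query)
print(query["subset"])
```

A request of this shape asks the server to trim the coverage in latitude, longitude, and time before anything is transferred, which is the kind of server-side partitioning that would avoid downloading whole tiles.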
Another quite relevant point that has emerged during the months of discussion on this LAG task assignment has been the question of local or cloud storage and/or processing. Assumptions that nations want all their data downloaded to their own servers, rather than preferring the value-added solutions provided by the cloud service providers, are not necessarily reinforced by the
13 Purss, M., Peterson, P., Strobl, P., and Sabeur, Z., Discussion Paper: A DGGS Perspective on Datacubes, 18-006, 14 February 2018 (Permission to use: The OGC Working Group that developed the paper approved release for use by the NGAC membership. 26 March 2018.)
14 http://www.opengeospatial.org/pressroom/pressreleases/2738
15 Baumann, Peter, The Datacube Manifesto, http://earthserver.eu/tech/datacube-manifesto Research supported by EC contract 654367
emerging evidence.¹⁶ In some countries, the tools to work with the massive data are either not available or the skill levels are currently inadequate. The private sector working closely with the national government’s imagery could dramatically simplify data use when solutions rather than data are the desired outcome. Cloud computing, in addition to or in lieu of cloud storage, may be the tailored approach. Even when data are what may be needed, the response could be a tailored data cube provision where the national data, like ARD, are layered with other data sources and refined to a particular footprint, consistent with the preceding discussion in this study.

One question raised was how the private sector might collaborate to help with tiling the additional sources to match that of ARD as the layers of the data cube are incorporated. During the CEOS briefing in Brazil in March 2018,¹⁷ the topic of cooperation with the private sector under some grant agreements was included as a needed facilitator of the global effort. Partnerships with Google, Amazon, and others were seen as enabling the “scalable solution.”

Major Recommendations
This report makes specific recommendations with respect to the U.S. Landsat Analysis Ready Data (ARD) and its potential for being incorporated in a variety of data cubes, as a direct-use data set in monitoring and assessing landscape change.

1. Task team members endorse the openness of the EROS Center commitment to provide the source data and to publish, as “open source,” the software and algorithms used to produce ARD. The Team recommends that the USGS publish verification procedures to confirm that the methods and workflows have been replicated properly for any non-USGS processing. These procedures would likely reflect the very processes that USGS has used in preparing ARD. The verification task itself would not be the responsibility of the EROS Center but rather of any other entity using the software and algorithms.
2. Studiesmayalreadyexistthatcharacterizehow“normalizing”reflectanceacrosssensorsandyears
mightaffectvalues.TheTaskTeamrecommendsthatUSGSEROSCenterreleaseanyerror/differencestudyandanalysisbetweenthereflectancevaluesoftraditionalscenepixelsandtheARDunitpixels,whichmayhavealreadybeencompleted,todetermineanyradiometricchangesresultingfrompreprocessingtocreatetheARD.Offeringaccesstothosestudiescouldbebeneficialtosomeresearchers.Ifsuchananalysishasnotbeencompleted,theTaskTeamrecommendsthatonebeinitiated.
3. The Task Team expects processing techniques, algorithms, and associated tools to improve over time. Reprocessing the entire dataset, rather than limiting new approaches to only data acquired after the development of improvements, would meet the “analysis-ready” objective of reducing the data processing load on data users. The Team believes that complete revision of the entire ARD could follow a MODIS approach. The Team was advised by USGS that such processing of so much data could take up to ten months, so a reasonable schedule for updates will need to be established. The Task Team recommends that, when improved processing approaches are ready, the reprocessing should apply to the entire dataset in use and that users should not themselves be required to apply compatibility adjustments to any ARD received prior to the change.
4. The Task Team does agree that MSS should be incorporated into ARD to optimize use of the entire forty-five years of collection history. However, the Team recommends that the prioritization of development work be carefully scrutinized, with consideration given to whether globally extending ARD may be more important than spending available time incorporating the MSS collection. In general, it is recommended that USGS assess all needs and wants and establish criteria to prioritize Landsat work, including enhancements to the ARD initiative.
5. The Task Team recommends that USGS not undertake to scale the U.S. ARD coverage effort globally by itself, as the private sector is better prepared with needed tools, mature techniques, and, particularly, scalable infrastructure.
¹⁶ The CEOS Data Cube, Three-Year Work Plan 2016-2018 http://ceos.org/document_management/Ad_Hoc_Teams/SDCG_for_GFOI/Meetings/SDCG-10/Cube%203-Year%20Work%20Plan%20-%20v1.0.pdf.
¹⁷ Holloway, Kim “Open Data Cube Initiative” Agenda item #8, CEOS 7th Working Group for Capacity Building and Data Democracy Annual Meeting, INPE Jose dos Campos, Brazil, 6-8th March 2018
This report also makes recommendations about geospatial data cubes, as they become more globally employed to manage and exchange information for a variety of applications.
1. The Task Team recommends that USGS representation, as a Strategic Member, to the Open Geospatial Consortium should advocate for and participate in more discussion about data cube standards within the OGC Technical Committee.
2. The Task Team recommends that preparing data cubes for specific uses should not be an objective of the government, which should be cautious about proceeding even with production of some generic forms of a data cube. The tailored data cubes should not be a federal government production responsibility.
Additional recommendations are made with reference to this report.
1. The Task Team recommends that a subsequent request be made to a future LAG Team to evaluate progress on the findings and recommendations of this paper and to update as needed.
2. The USGS has only fledgling experience with ARD, having first released it to the community of users at the end of October 2017. At this point, ARD users have not had extensive experience, and there is certainly not much evidence of the resulting data cubes. It would be helpful for USGS to survey, on some routine basis, those who request the ARD, gathering information for a subsequent report. Among the factors to be surveyed would be whether users are transitioning to ARD or still requesting the previous distribution formats. The Pecora Conference in mid-November 2017 provided an initial opportunity for groups of Landsat users to discuss their early reactions to the release of ARD. Since that time, use has increased, but not all users are fully comfortable knowing how to use the data to its best advantage. Similarly, ongoing information exchanges between the public and private sectors may provide more insight into defining the interdependencies needed to make data cubes the most effective way to advance use of imagery and expansion of GIS technology.
Acronym List for this Paper
ALOS      Advanced Land Observing Satellite, a Japanese Earth-observation satellite developed by JAXA (Japan Aerospace Exploration Agency)
ARD       Analysis Ready Data
CEOS      Committee on Earth Observation Satellites
COG       Cloud Optimized GeoTIFF
CSIRO     Commonwealth Scientific and Industrial Research Organisation, an independent agency of the Australian Federal Government responsible for scientific research in Australia
DGGS      Discrete Global Grid System
EO        Electro-optical; systems that operate in the optical portion of the electromagnetic spectrum
EROS      Earth Resources Observation and Science, a USGS Center near Sioux Falls, SD
ERS       Earth Resources Satellite, the first two remote sensing satellites launched by ESA
ESA       European Space Agency
ETM+      Enhanced Thematic Mapper Plus, a sensor onboard the Landsat 7 satellite
FGDC      Federal Geographic Data Committee
GeoTIFF   Georeferenced Tagged Image File Format, a public domain metadata standard which allows georeferencing information to be embedded within a TIFF file
L1        Level-1 Landsat products with the best available processing level for each particular scene
L1G       Level-1 Landsat radiometrically calibrated with systematic geometric corrections using spacecraft ephemeris
L1T       Level-1 Landsat radiometrically calibrated and orthorectified using ground control points and digital elevation model data to correct for relief displacement
LAG       Landsat Advisory Group
LCMAP     Land Change Monitoring, Assessment, and Projection, a USGS initiative implemented at EROS
LIDAR (Lidar, LiDAR)  Light Detection and Ranging, a remote sensing and surveying method that measures distance to a target by illuminating the target with pulsed laser light and measuring the reflected pulses with a sensor
MSS       Multi-spectral scanner, line scanning devices observing the Earth perpendicular to the orbital track on the first five Landsats
NASA      National Aeronautics and Space Administration
OGC®      Open Geospatial Consortium
OLAP      Online analytical processing, use of data organized multi-dimensionally to allow comparisons from different perspectives
OLI       Operational Land Imager, a pushbroom scanner on Landsat 8 that uses a four-mirror telescope with fixed mirrors
SAR       Synthetic-aperture radar, a technique for producing fine resolution images from an intrinsically resolution-limited radar system
SDC       Swiss Data Cube
TIRS      Thermal Infrared Sensor, a system on Landsat that measures land surface temperature in two thermal bands
TOA       Top-of-atmosphere
USGS      U.S. Geological Survey
UTM       Universal Transverse Mercator, a coordinate system which divides the Earth into 60 zones, each 6° of longitude in width
WCPS      Web Coverage Processing Service, a protocol-independent language for the extraction, processing, and analysis of multi-dimensional coverages representing sensor, image, or statistics data
WELD      Web Enabled Landsat Data
WRS       The Worldwide Reference System, a global notation system for Landsat data
Acknowledgements
This paper was approved by the NGAC Landsat Advisory Group (LAG) on March 20, 2018 and adopted by the NGAC as a whole on April 3, 2018. The LAG team developing this paper included Roberta Lenczowski, Roberta E. Lenczowski Consulting (Team Lead); Frank Avila, National Geospatial-Intelligence Agency; Peter Becker, ESRI; Steven Brumby, Descartes Labs; Rebecca Moore, Google Inc.; and Tony Willardson, Western States Water Council. Matthew Hancher (Google, Inc.) and Sara Larsen (Western States Water Council) also contributed to this paper.