1/23/2018
Big Data in Water: Opportunities and Challenges for Machine Learning
Vipin Kumar
Department of Computer Science and Engineering
University of Minnesota
www.cs.umn.edu/~kumar
2018 Water Resources Assembly and Research Symposium: Headwaters Lecture
Water: A Grand Societal Challenge of the 21st Century
Shrinking Lake Mead
Droughts in Southern California
Harmful Algal Bloom in Lake Erie
Floods due to Hurricane Harvey
Big Data in Water
• Satellite Imagery
• Weather/Climate Models
• Hydrological Models
• IOT for Water
Golden Age of Data Science
• Hugely successful in commercial applications
Case Study: Monitoring Global Surface Water Dynamics
Impact of Climate Change: Cedo Caka Lake in Tibet, 1984 and 2011
Impact of Human Actions: Aral Sea in 1989 and 2014
Early Warning Systems: Great Flood of Mississippi River, 1993
Quantifying water stocks and flow
Global projections of water risks (red)
Integrating with hydrological models
Satellite Big Data
A vegetation index measures the surface “greenness”, a proxy for total biomass
This vegetation time series captures temporal dynamics around the site of the China National Convention Center
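The slide does not say which vegetation index is plotted. As an illustration only, one widely used index is the Normalized Difference Vegetation Index (NDVI), computed from near-infrared and red reflectance. The function name and the reflectance values below are hypothetical.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    total = nir + red
    if total == 0:
        return 0.0  # convention chosen here for a pixel with no signal
    return (nir - red) / total


# Hypothetical (NIR, red) reflectance pairs for one grid cell on four dates.
observations = [(0.30, 0.20), (0.45, 0.10), (0.50, 0.08), (0.32, 0.18)]
series = [ndvi(nir, red) for nir, red in observations]
```

Higher values indicate denser green vegetation, so a drop in the series at a construction site would show up as a sustained decrease.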
Data       | Type          | Coverage | Spatial Resolution | Temporal Resolution | Spectral Resolution (bands) | Duration       | Availability
LANDSAT    | Multispectral | Global   | 30 m               | 16 days             | 7                           | 1972 - present | Public
Hyperion   | Hyperspectral | Regional | 30 m               | 16 days             | 220                         | 2001 - present | Public
Sentinel-1 | Radar         | Global   | 5 m                | 12 days             | -                           | 2014 - present | Public
Sentinel-2 | Multispectral | Global   | 10 m               | 6 days              | 13                          | 2015 - present | Public
Quickbird  | Multispectral | Global   | 2.16 m             | 2 to 12 days        | 4                           | 2001 - 2014    | Private
MODIS covers ~ 5 billion locations globally at 250m resolution daily since Feb 2000.
[Figure: data cube with axes Longitude, Latitude, and Time; each element is a grid cell]
• SWBD: SRTM Water Body Dataset (Feb 2000)
• Google-JRC water body product (1984 – 2015)
Challenges for Traditional Big Data Methods
• Challenge 1: Heterogeneity in space and time
- Water and land bodies look different in different regions of the world
- Same water body can look different at different time‐instances
Great Bitter Lake, Egypt; Lake Tana, Ethiopia; Lake Abbe, Africa
Mar Chiquita Lake, Argentina in 2000 (left) and 2012 (right)
• Challenge 2: Data Quality
– Clouds, shadows, atmospheric disturbances
• Incorrect labels
• Missing data – no labels
Poyang Lake, China (Pink color shows missing data)
Method Innovations for Monitoring Water
• Ensemble Learning Methods for Handling Heterogeneity in Data [1, 2]
- Learn an ensemble of classifiers to distinguish between different pairs of positive and negative modes
- Positive Modes (Water): P1, P2, P3
- Negative Modes (Land): N1, N2, N3
• Using Physics Guided Labeling to Handle Poor Data Quality [3, 4]
- Use elevation information to constrain labels to be physically consistent
- Elevation A > B > C > D
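The pairwise-ensemble idea can be sketched as follows. This is a minimal illustration, not the method of the cited papers: the nearest-centroid base classifier, the max-min rule for combining pairwise scores, and the two-feature toy data are all assumptions made for this example.

```python
from itertools import product


def centroid(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_pairwise(pos_modes, neg_modes):
    """One nearest-centroid classifier per (positive mode, negative mode) pair."""
    pos_c = [centroid(m) for m in pos_modes]
    neg_c = [centroid(m) for m in neg_modes]
    return {(i, j): (pos_c[i], neg_c[j])
            for i, j in product(range(len(pos_c)), range(len(neg_c)))}


def pair_score(x, model):
    """Positive when x is closer to the water centroid than to the land centroid."""
    c_pos, c_neg = model
    return sq_dist(x, c_neg) - sq_dist(x, c_pos)


def predict_water(x, ensemble, n_pos, n_neg):
    """Water if some positive mode beats every negative mode (max over i of min over j)."""
    best = max(min(pair_score(x, ensemble[(i, j)]) for j in range(n_neg))
               for i in range(n_pos))
    return best > 0


# Hypothetical two-feature samples: two water modes (P1, P2), two land modes (N1, N2).
pos_modes = [[(0.10, 0.10), (0.12, 0.08)], [(0.40, 0.20), (0.42, 0.22)]]
neg_modes = [[(0.10, 0.80), (0.12, 0.78)], [(0.70, 0.60), (0.68, 0.62)]]
ensemble = train_pairwise(pos_modes, neg_modes)
looks_like_water = predict_water((0.38, 0.25), ensemble, len(pos_modes), len(neg_modes))
looks_like_land = predict_water((0.65, 0.65), ensemble, len(pos_modes), len(neg_modes))
```

Training one classifier per pair keeps each decision boundary simple, which is the appeal when water and land each come in several visually distinct modes.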
[1] Karpatne et al. SDM 2015
[2] Karpatne et al. ICDM 2015
[3] Khandelwal et al. ICDM 2015
[4] Mithal et al. (PhD Dissertation)
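The elevation constraint (A > B > C > D) can be illustrated with a small label-correction routine. This is a simplified sketch under stated assumptions, not the algorithm of the cited work: it assumes a single water body in which water fills cells from the lowest elevation upward, and it picks the water level that disagrees with the fewest observed labels. The function name and sample values are hypothetical.

```python
def elevation_consistent_labels(elevations, noisy_labels):
    """Return labels (True = water) that respect the elevation ordering.

    noisy_labels entries are True (water), False (land), or None (missing).
    """
    order = sorted(range(len(elevations)), key=lambda i: elevations[i])
    # Cost of labeling the k lowest cells water and the rest land is the
    # number of observed labels that disagree. Start with k = 0 (all land).
    cost = sum(1 for i in order if noisy_labels[i] is True)
    best_k, best_cost = 0, cost
    for k, i in enumerate(order, start=1):
        if noisy_labels[i] is True:
            cost -= 1  # this cell now agrees with its observed label
        elif noisy_labels[i] is False:
            cost += 1  # this cell now disagrees with its observed label
        # Strict comparison: ties are resolved in favor of the lower water level.
        if cost < best_cost:
            best_k, best_cost = k, cost
    water = set(order[:best_k])
    return [i in water for i in range(len(elevations))]


# Hypothetical cells: one mislabeled as land (elevation 12), one missing (elevation 20).
elevations = [10, 12, 15, 18, 20, 22, 30]
observed = [True, False, True, True, None, False, False]
corrected = elevation_consistent_labels(elevations, observed)
```

The corrected labels flip the low-lying "land" cell to water and fill in the missing cell, because a cell cannot be dry while higher cells in the same water body are wet.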
A Global Surface Water Monitoring System http://z.umn.edu/monitoringwater
• Maps the dynamics of all major surface water bodies (surface area > 2.5 km²) shown as blue dots
Key Highlights:
• Detects melting of glacial lakes
• Maps changes in river morphology
• Identifies reservoir constructions
• Finds relationships between surface water and precipitation/groundwater
Showing Surface Water Dynamics
Don Martin Dam, Mexico
Surface area of water around Don Martin Dam across time
Annual Landsat Time‐lapse of this region (Courtesy: Google Earth Engine)
Regions of Change in South America
Red Dots (Water Gain): Regions of size > 2.5 km² that have changed from land to water in the last 15 years
Green Dots (Water Loss): Regions of size > 2.5 km² that have changed from water to land in the last 15 years
Example time series of a Water Gain region
Example time series of a Water Loss region
Examples of Change: Shrinking Water Bodies
Aggregate dynamics of all green dots shown on left
(Green dots show regions changing from water to land in last 15 years)
Annual Time‐lapse of an example green dot
September 2013
November 2015
November 2015
Examples of Change: Melting Glacial Lakes in Tibet
Water Gain regions (red dots) show melting of lakes
Red polygons show regions changing from land to water
Aggregate dynamics of all red regions in Tibet
Examples of Change: River Meandering
(Adjacent occurrence of Water Gain (red) and Water Loss (green) regions all along the river indicates the displacement of water from the green dots to the red dots)
Zoomed‐in View
Example time series of a Water Gain region
Example time series of a Water Loss region
Time‐lapse of region 1
Time‐lapse of region 2
Examples of Change: Shrinking Island
Examples of Change: Dam Construction
Construction of Chubetsu Dam, Japan
Construction of a dam characterized by a sudden and persistent increase in surface area
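The "sudden and persistent increase" signature can be sketched as a simple step detector on a surface-area time series. This is an illustrative assumption, not the detection method used for the results in this talk: the function name, the `min_jump` threshold, and the sample series are hypothetical.

```python
def detect_persistent_increase(area, min_jump, min_len=3):
    """Return the index where a sudden, persistent increase begins, or None.

    Step t qualifies when every value from t onward exceeds every value
    before t by at least min_jump. That forces a jump of at least min_jump
    between t - 1 and t (sudden) and no return to earlier levels (persistent).
    At least min_len observations are required on each side of t.
    """
    n = len(area)
    for t in range(min_len, n - min_len + 1):
        before, after = area[:t], area[t:]
        if min(after) - max(before) >= min_jump:
            return t
    return None


# Hypothetical annual surface areas in km².
reservoir_filled = [0.4, 0.5, 0.4, 0.5, 6.1, 6.3, 6.0, 6.2]
transient_flood = [0.4, 0.5, 0.4, 6.0, 0.5, 0.4, 0.5, 0.4]

dam_step = detect_persistent_increase(reservoir_filled, min_jump=2.5)
flood_step = detect_persistent_increase(transient_flood, min_jump=2.5)
```

The persistence requirement is what separates a new reservoir from a flood: the flood series jumps just as sharply but falls back, so no step is reported for it.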
Global Reservoir and Dam (GRanD) Database
Global Reservoir and Dam (GRanD) Database:
• A data curation initiative by Global Water System Project (GWSP)
• Finds 61 dams constructed after 2001
UMN Approach:
• Finds 701 dams constructed after 2001
Dams reported by GRanD since 2001: 35
Comparison of Dam Detections with GRanD
• Dams only reported by GRanD: 5
• Dams reported by both UMN and GRanD: 30
• Dams only reported by UMN: 671
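The three counts are consistent with the totals quoted earlier (35 dams reported by GRanD since 2001, 701 found by the UMN approach). A short set-based check makes the relationship explicit. The dam identifiers below are placeholders invented for this sketch; only the counts come from the slides.

```python
# Placeholder identifiers, sized to match the counts on the slide.
both = {f"shared-{k}" for k in range(30)}
grand_only = {f"grand-{k}" for k in range(5)}
umn_only = {f"umn-{k}" for k in range(671)}

grand = grand_only | both  # dams reported by GRanD since 2001
umn = umn_only | both      # dams reported by the UMN approach

overlap = grand & umn
share_of_grand_recovered = len(overlap) / len(grand)
```

Under these counts, the UMN approach recovers 30 of the 35 GRanD dams and reports 671 that GRanD does not list.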
Relationship between Ground Water and Surface Water Area Dynamics
• GRACE land data:
  – Obtained from http://grace.jpl.nasa.gov
  – Available at 1° spatial resolution, monthly since 2002
  – Preprocessing:
    • Average of GFZ, CSR, and JPL versions computed
    • Prescribed grid scaling factors applied
• Surface Water Area Dynamics:
  – Number of MODIS water pixels counted for every 1° grid cell every month (to match resolutions with GRACE)
  – Preprocessing:
    • Grid cells with fewer than 50 MODIS water pixels ignored
    • Data spatially smoothed using a 3° × 3° window
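The two surface-water preprocessing steps can be sketched for one month of data. This is a minimal illustration under assumptions the slides do not settle: masking is applied before smoothing, masked cells are skipped inside the window, edge cells use a truncated window, and longitude wrap-around is not handled. The function names and the toy grid are hypothetical.

```python
def mask_sparse_cells(counts, min_pixels=50):
    """Replace cells with fewer than min_pixels water pixels by None."""
    return [[c if c >= min_pixels else None for c in row] for row in counts]


def smooth_3x3(grid):
    """Mean over each cell's 3 x 3 neighbourhood, skipping masked (None) cells."""
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None:
                continue  # ignored cells stay ignored
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))
                      if grid[rr][cc] is not None]
            out[r][c] = sum(window) / len(window)
    return out


# Hypothetical MODIS water-pixel counts on a 3 x 3 block of 1° cells.
counts = [[10, 60, 90],
          [120, 30, 150],
          [60, 90, 180]]
masked = mask_sparse_cells(counts)
smoothed = smooth_3x3(masked)
```

Repeating this per month yields, for every retained 1° cell, a surface-water time series on the same grid and cadence as GRACE.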
Correlations with GRACE
• Most regions show strong positive correlations b/w surface water dynamics and GRACE measurements
GRACE: Gravity Recovery and Climate Experiment
• Measures changes in total water mass (surface + groundwater) at ~100 km resolution
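The slides do not say which correlation measure is used. Assuming the Pearson coefficient, the per-cell comparison of a surface-area series with a GRACE series could be computed as below. The function name and both sample series are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical monthly values for one 1° grid cell.
surface_area = [10, 12, 15, 13, 9, 8]
grace_anomaly = [1.0, 1.5, 2.2, 1.8, 0.7, 0.4]
r = pearson(surface_area, grace_anomaly)
```

A value near +1 means surface water and total water mass rise and fall together; a negative value flags cells where they move in opposite directions.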
Examples of Positive Correlations (1)
Correlation: 0.902
Blue: Surface area time series
Red: GRACE data
Negative Correlations in Indus Basin: Over‐consumption of groundwater?
• Increase in area of surface water due to rice/paddy farming and widening of the Indus River
• GRACE shows a decrease due to depletion of groundwater for agriculture
Negative Correlations in Bangladesh and Thailand
Can we produce daily surface water extent maps at high spatial resolution?
• Challenge: daily sensors are coarse, and high-resolution sensors revisit infrequently
- MODIS (500m resolution, daily)
- LANDSAT (30m, every 16 days), Sentinel‐2 (10m, every 5‐10 days)
• Solution: ORBIT ‐ Ordering Based Information Transfer across space and time
Kajakai Reservoir, Afghanistan
Extent at coarse resolution (500m)
Extent at high resolution (30m) created using our approach
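The slides name ORBIT and show that it takes a fine-resolution DEM as input, but they do not spell out the algorithm. One plausible reading of "ordering" is the elevation ordering of fine cells: within a coarse pixel, water occupies the lowest cells first. The sketch below illustrates only that idea. It assumes a water fraction is available for each coarse pixel, and the function name and values are hypothetical; it should not be taken as the ORBIT implementation.

```python
def downscale_by_elevation(water_fraction, fine_elevations):
    """Mark the lowest-lying fine cells as water until the coarse fraction is met.

    water_fraction: share of the coarse pixel covered by water (0 to 1).
    fine_elevations: DEM elevation of each fine cell inside the coarse pixel.
    """
    if not 0.0 <= water_fraction <= 1.0:
        raise ValueError("water_fraction must lie in [0, 1]")
    n = len(fine_elevations)
    n_water = round(water_fraction * n)
    order = sorted(range(n), key=lambda i: fine_elevations[i])
    water = set(order[:n_water])
    return [i in water for i in range(n)]


# Hypothetical coarse pixel that is half water, with six fine cells (elevations in m).
fine_mask = downscale_by_elevation(0.5, [372, 365, 380, 361, 369, 377])
```

Because the DEM is static, the same ordering can be reused every day; only the coarse daily observation changes, which is what makes a daily fine-resolution product feasible.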
Quantifying water stocks and flow
Global projections of water risks (red)
Daily surface water mapping at 30m: Lake Mead, USA
Background: LANDSAT 7 image of Dec 13, 2000
Surface Extent at 500m created from MODIS data on Dec 13, 2000
Surface Extent at 30m created from MODIS 500m data on Dec 13, 2000 by ORBIT approach using USGS 30m DEM data
MODIS 500m pixel grid in cyan color
Daily surface water mapping at 10m: Richland Chambers Reservoir, USA
Background: Sentinel-2 image of Apr 02, 2016
Surface Extent at 500m created from MODIS data on Apr 02, 2016
Surface Extent at 10m created from MODIS 500m data on Apr 02, 2016 by ORBIT approach using USGS 10m DEM data
Other Applications of Big Data in Water
Hydrological Models for Streamflow
Digital Twin of Anacostia Watershed
Modeling Lake Water Quality
Collaboration: USGS
Cover Crop Mapping
Leakage Detection using smart meters
Collaboration: Northeastern University
Land-Water Interaction
Hybrid Physics-Data Models
IOT for Water
Collaboration: D.C. Water
Collaboration: University of Minnesota
Team Members
University of Minnesota: Arindam Banerjee, Snigdhansu Chatterjee, Michael Steinbach, Jeff Peterson, David Mulla
Northeastern University: Auroop Ganguly, Ed Beighley
University of Wisconsin: Paul Hansen, Hilary Dugan
USGS: Jordan Read
UCLA: Dennis Lettenmaier
University of Maryland: Charon Birkett
Ankush Khandelwal, Anuj Karpatne, Xiaowei Jia