181203Geodetector & Its Applications in Natural & Social Sciences

(C) [email protected]

2018.12.03

Geodetector & Its Applications in Natural & Social Sciences

Wang Jinfeng

State Key Laboratory of Resources & Environmental Information System
Institute of Geographical Sciences & Natural Resources Research
Chinese Academy of Sciences
Beijing 100101, PRC

[email protected]
www.geodetector.cn

Sharp tools make good work
-- The Analects of Confucius

1. Background

2. Spatial Stratified Heterogeneity: Geodetector

3. Properties of Geodetector

4. Software & Examples

5. Q/A

6. Flexible Geodetector

7. Summary

1. Background

Spatial Stratified Heterogeneity (SSH)

[Diagram labels: Geo; Graphy; Strata; Heterogeneity]

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.

Spatial Stratified Heterogeneity (SSH) is Prevalent in Nature

Values within strata are more similar than values between strata.

Ecological Zones

Landuse Types

In east China, 36% land supports 96% human population;

In west China, 64% land supports 4% human population.

SSH: Spatial Confounding

Plague ~ Dryness (±b): Xu et al. 2011. PNAS

Flood in Canada (fs, fa): Buttle et al. 2016. CWRJ

[Figure panels: China; North China; South China; axis from Wet to Dry; label: Challenge]

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.

SSH: Bias of Sample

Annual Temperature in China in 1900?

Climate zones (SSH)

Meteorological Stations in China in 1900

Wang JF, Xu CD, Hu MG, Li QX, Yan ZW, Zhao P, Jones P. 2014. A new estimate of the China temperature anomaly series and uncertainty assessment in 1900-2006. Journal of Geophysical Research 119(1): 1-9.

SSH: A Spatial Pattern

[Figure panels: World ecological zones; China NDVI pattern; US land dissection pattern; China population pattern (94% / 6%)]

SSH: Spatial Attribution

[Figure panels: World ecological zones (humidity province; latitudinal regions; altitudinal belts); China NDVI pattern]

q = 0.53***: 53% of NDVI variation is attributed to climate zones in China.

SSH-based Spatial Interpolation

Rodents' Density Interpolation

q = 0.323***; Moran's I = 0.24

Spatial autocorrelation (SAC) is too weak for a SAC-based interpolation (Kriging, IDW, Spline, etc.). When the population is SSH, Sandwich interpolation works.

[Figure: semivariogram against distance (m)]

SSH is prevalent in nature:

• SSH confounding
• SSH sample bias
• SSH pattern
• SSH attribution
• SSH interpolation

Statistics for Spatial Stratified Heterogeneity: Measure & Attribute; Inference; Decision Making

2. Spatial Stratified Heterogeneity: Geodetector

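Spatial Stratified Heterogeneity (SSH): Measure
(Pattern: within strata is more similar than between strata)

$$q = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2}$$

Here the population Y of N units is partitioned into strata h = 1, ..., L; stratum h has N_h units and variance \sigma_h^2, and \sigma^2 is the variance of the whole population.

[Figure: sample of Y partitioned into strata h = 1, 2, 3, 4, giving SSH(Y)]

$$F = \frac{N - L}{L - 1} \cdot \frac{q}{1 - q} \sim F(L - 1,\; N - L;\; \lambda)$$

q ∈ [0, 1]: Y has 100q% degree of SSH; q = 0 if Y has no SSH; q = 1 if Y is fully SSH.

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.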
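The measure above can be sketched in a few lines of Python. This is a minimal illustration, not the Geodetector software: the function names `q_statistic` and `f_value` and the toy data are hypothetical, and the F transform is written here from the definition in the cited paper, so it should be checked against that paper before use.

```python
from collections import defaultdict


def q_statistic(y, strata):
    """Return q = 1 - SSW / SST for values y grouped by stratum labels."""
    if len(y) != len(strata):
        raise ValueError("y and strata must have the same length")
    n = len(y)
    mean_all = sum(y) / n
    sst = sum((v - mean_all) ** 2 for v in y)  # equals N * sigma^2
    if sst == 0:
        raise ValueError("y is constant, so q is undefined")

    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)

    ssw = 0.0
    for values in groups.values():
        mean_h = sum(values) / len(values)
        ssw += sum((v - mean_h) ** 2 for v in values)  # N_h * sigma_h^2
    return 1.0 - ssw / sst


def f_value(q, n, n_strata):
    """F = (N - L) / (L - 1) * q / (1 - q); infinite when q equals 1."""
    if q >= 1.0:
        return float("inf")
    return (n - n_strata) / (n_strata - 1) * q / (1.0 - q)


# Toy data (made up for illustration): three zones with distinct levels.
y = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.3, 8.7]
zones = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
q = q_statistic(y, zones)
print(round(q, 3), round(f_value(q, len(y), 3), 1))
```

Testing the F value against the non-central F distribution also needs the non-centrality parameter λ, which this sketch does not compute.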

[Figure: binary (0/1) grid patterns after Goodchild (1986), each with N_W = 32 white cells and N_B = 32 black cells, ȳ = 0.5, σ² = 0.25. Moran's I for the three patterns: I = -1.000; I = +0.393; I = +0.857. q-statistics for strata defined by Line 1 and Line 2: q = 0; q1 = 1***; q1 = 0.16*, q2 = 0.27**, q1&2 = 0.67***; q2 = 0.]

Goodchild M. 1986. CATMOG, GeoBooks, Norwich.

Moran P. 1950. Notes on continuous stochastic phenomena. Biometrika 37: 17-23.

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.
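The contrast between Moran's I and q can be reproduced on small binary grids. The sketch below assumes an 8 × 8 grid (32 white and 32 black cells), rook (edge-sharing) neighbours with binary weights, and two strata formed by cutting the grid into left and right halves; the grid size, the neighbour rule, and the function names are assumptions, not details stated on the slide. Under these assumptions the chessboard gives I = -1 with q = 0, and the two-halves pattern gives I = 6/7 ≈ 0.857 with q = 1.

```python
def morans_i(grid):
    """Moran's I with rook (edge-sharing) neighbours and binary weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    cross = 0.0  # sum of z_i * z_j over ordered neighbour pairs
    weight_total = 0  # number of ordered neighbour pairs
    for i in range(rows):
        for j in range(cols):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                a, b = i + di, j + dj
                if 0 <= a < rows and 0 <= b < cols:
                    cross += (grid[i][j] - mean) * (grid[a][b] - mean)
                    weight_total += 1
    total = sum((v - mean) ** 2 for row in grid for v in row)
    return (n / weight_total) * (cross / total)


def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def q_by_columns(grid, split):
    """q-statistic with two strata: columns < split and columns >= split."""
    left = [v for row in grid for v in row[:split]]
    right = [v for row in grid for v in row[split:]]
    ssw = sum_of_squares(left) + sum_of_squares(right)
    return 1.0 - ssw / sum_of_squares(left + right)


chessboard = [[(i + j) % 2 for j in range(8)] for i in range(8)]
halves = [[0] * 4 + [1] * 4 for _ in range(8)]

print(morans_i(chessboard), q_by_columns(chessboard, 4))
print(morans_i(halves), q_by_columns(halves, 4))
```

The two statistics answer different questions: Moran's I reacts to the arrangement of neighbouring cells, while q reacts only to how well the chosen strata separate the values.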

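Spatial Stratified Heterogeneity (SSH): Attribute
(Axiom: If X causes Y, then their maps would be coupled)

$$q = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2}$$

Here the strata h = 1, ..., L are the strata of X; N_h and \sigma_h^2 are the number of units and the variance of Y in stratum h, and N and \sigma^2 are those of the whole population.

[Figure: sample of Y partitioned by the strata of X, h = 1, 2, 3, 4, giving SSH(X)]

$$F \sim F(L - 1,\; N - L;\; \lambda)$$

q ∈ [0, 1]: 100q% of Y is explained by X; q = 0 if Y is independent of X; q = 1 if Y fully depends on X.

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.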

q(NDVI | climate zones) = 0.53***: climate zones explain 53% of NDVI variation in China.

q(Y | #X): if X controls Y, they will be spatially coupled.

3. Properties of Geodetector

(1) Correlation Causation

Histogram (# in 0D); Pearson Corr. (pairs in 0D); Pearson Corr. (order in 1D); Geodetector q (coupled in 2D)

[Figure panels: y against x; linear and nonlinear relationships]

I would rather discover one causal law than be King of Persia
-- Democritus (460-370 B.C.)

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.

(2) Global Local SSH

Confounding + Overfitting + SSH

Global models: Y = a + bX

Local models: Y = a(s) + b(s)X

Figure 5: GWR showing the coefficients from the output. Positive values indicate a positive association, while negative values indicate a negative association. http://blogs.oregonstate.edu/geog566spatialstatistics/2017/06/11/beetle-outbreaks-wildfire-overlapping-time-space/

SSH: Geodetector; modeling in strata; interpret in strata (Xu et al. 2011, PNAS)

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.

(3) Interaction: Y X1 ∩ X2

[Figure: map of Y overlaid with the strata of X1 (X11, X12, X13) and of X2 (X21, X22, X23); the overlay X1 ∩ X2 forms new strata X11∩X21, X11∩X22, X11∩X23, X12∩X22, X13∩X21, X13∩X22, X13∩X23]
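The overlay idea can be sketched as follows: the strata of X1 ∩ X2 are simply the distinct pairs of (X1 stratum, X2 stratum), and q is computed on those pairs. The function names and the toy data are hypothetical; the data are built so that neither factor alone separates Y while their overlay separates it completely.

```python
from collections import defaultdict


def q_statistic(y, strata):
    """Return q = 1 - SSW / SST for values y grouped by stratum labels."""
    mean_all = sum(y) / len(y)
    sst = sum((v - mean_all) ** 2 for v in y)
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)
    ssw = 0.0
    for values in groups.values():
        mean_h = sum(values) / len(values)
        ssw += sum((v - mean_h) ** 2 for v in values)
    return 1.0 - ssw / sst


def interaction_q(y, x1, x2):
    """Return (q for X1, q for X2, q for the overlay X1 ∩ X2)."""
    overlay = list(zip(x1, x2))  # each distinct pair is one overlay stratum
    return q_statistic(y, x1), q_statistic(y, x2), q_statistic(y, overlay)


# Toy data (made up): Y is high only where X1 and X2 strata differ.
x1 = [0, 0, 1, 1, 0, 0, 1, 1]
x2 = [0, 1, 0, 1, 0, 1, 0, 1]
y = [0, 1, 1, 0, 0, 1, 1, 0]

q1, q2, q12 = interaction_q(y, x1, x2)
print(q1, q2, q12)
```

No functional form, such as a product term, is specified anywhere in this calculation; the interaction shows up only through the overlay strata.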

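Name | Mechanism; Equation | Test | Conclusion

Correlation (G883; E107)
Mechanism; Equation: Linear: $r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}}$
Test: $F \sim F(k-1,\; n-k)$

Regression (& MLM) (G62; G84; G259,893; G140,258)
Mechanism; Equation: Linear: $y = a + bx$; $b = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$; $R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} = \frac{b^2 \sum_i (x_i - \bar{x})^2}{\sum_i (y_i - \bar{y})^2} = \frac{\left[\sum_i (x_i - \bar{x})(y_i - \bar{y})\right]^2}{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}$
Test: $F \sim F(k-1,\; n-k)$
Conclusion: X explains linearly 100R²% of Y

ANOVA (G140; C333; C337)
Mechanism; Equation: Means in strata: $Y_{ij} = a_i + e_{ij}$; assume $E\,e_{ij} = 0$, $V\,e_{ij} = \text{const.}$
Test: $H_0: \mu_1 = \dots = \mu_k$; $F \in [0, +\infty)$; $F \sim F(k-1;\; n-k)$
Conclusion: Significance of the difference between means in strata

Classification
Mechanism; Equation: Min SSW + Max SSB: K-means
Test: $F \in [0, +\infty)$; $F \sim F(k-1;\; n-k)$; Kappa
Conclusion: Classes

Geodetector q-statistic
Mechanism; Equation: Min SSW + Max SSB; $q = 1 - \frac{SSW}{SST} = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2}$
Test: $H_0: \sigma_h^2 = \sigma^2$; $q \in [0, 1]$; $F \sim F(L-1;\; N-L;\; \lambda)$
Conclusion: 1. SSH degree of Y; 2. X explains 100q% of Y

Mean & Var: ANOVA F against the q-statistic for five example patterns
ANOVA: F = ∞; F = 9894; N/A; N/A; F = 0
q-statistic: q = 1; q = 0.8; q = 0.7; q = 0.6; q = 0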

Properties of Geodetector

(1) Finds coupling between y and x, with no assumption of linearity or nonlinearity.

(2) Avoids confounding between x and other factors: first stratify the population by the other factors, then apply Geodetector (y|x).

(3) Finds general interaction between x1 and x2, with no assumption of a product form as in econometrics.

(4) The explanatory variable x can be categorical and the response variable y is numerical; a numerical x should be stratified.

(5) When the sample is small, Geodetector outperforms regression, because stratification creates homogeneity, so a sample becomes more representative within each stratum.

(6) Immune to collinearity between covariates x1 and x2.

(7) Stratification is flexible: spatial, time, attribute, spectrum.
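Property (1) can be illustrated with a toy example in which y depends on x through a symmetric curve. The sketch below is only an illustration under assumed data (y = x² on seven points) and an assumed three-way stratification of x; the function names are hypothetical. Pearson's r is zero because the relationship is not linear, while q on the stratified x is clearly positive.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def q_statistic(y, strata):
    """Return q = 1 - SSW / SST for values y grouped by stratum labels."""
    mean_all = sum(y) / len(y)
    sst = sum((v - mean_all) ** 2 for v in y)
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)
    ssw = 0.0
    for values in groups.values():
        mean_h = sum(values) / len(values)
        ssw += sum((v - mean_h) ** 2 for v in values)
    return 1.0 - ssw / sst


x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]  # symmetric, nonlinear dependence on x
# Assumed stratification of the numerical x into three classes.
strata = ["low" if v < -1 else "high" if v > 1 else "mid" for v in x]

r = pearson_r(x, y)
q = q_statistic(y, strata)
print(r, round(q, 3))
```

With so few points the exact q depends heavily on where the class breaks are placed, so the number itself should not be over-interpreted.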

4. Software & Examples

Software: www.geodetector.cn

Workflow: prepare data (y, X); run software; output.

Examples:
• Pollution: Soil Antibiotic in a Town
• Medicine: NTD in a County
• Nature: Dissection in USA
• Social: Urbanization in China

Determinants of Land Dissection Density in US

Y: land dissection density. X: lithology; glaciation; elevation; slope; precipitation; permeability.

Luo W, Jasiewicz J, Stepinski T, et al. 2016. Spatial association between dissection density and environmental factors over the entire conterminous United States. Geophys. Res. Lett. 43: 692-700.

Driving Forces of Urbanization in China

| Area | Fixed invest | Dist from center city | Farmer income | Indus struct | Popula density | Crop per cpt | GDP per cpt | Transportation | Elevation | Precipitation |
|---|---|---|---|---|---|---|---|---|---|---|
| 106 Highway | 0.30 | 0.34 | 0.58 | 0.18 | 0.10 | 0.45 | 0.53 | 0.13 | 0.00 | 0.11 |
| North Border | 0.06 | 0.03 | 0.13 | 0.05 | 0.01 | 0.23 | 0.02 | 0.21 | 0.22 | 0.18 |
| Eastern Coast | 0.16 | 0.17 | 0.26 | 0.10 | 0.28 | 0.09 | 0.36 | 0.08 | 0.05 | 0.01 |
| Yangzi River | 0.27 | 0.19 | 0.39 | 0.33 | 0.29 | 0.21 | 0.53 | 0.13 | 0.19 | 0.07 |
| Belt & Road | 0.51 | 0.28 | 0.41 | 0.22 | 0.04 | 0.22 | 0.29 | 0.03 | 0.01 | 0.17 |

Liu YS, Yang R. 2012. Spatial characteristics and formation mechanism of county urbanization in China. Acta Geographica Sinica 67(8): 1011-1020.
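Comparing factor ranks between regions amounts to sorting each row of the urbanization table. The sketch below copies the values from the slide and assumes that they are q values and that the order of the factor labels matches the order of the value columns; both are assumptions, since the extracted slide does not state them. The names `FACTORS`, `Q_BY_AREA`, and `rank_factors` are hypothetical.

```python
FACTORS = [
    "Fixed invest", "Dist from center city", "Farmer income", "Indus struct",
    "Popula density", "Crop per cpt", "GDP per cpt", "Transportation",
    "Elevation", "Precipitation",
]

# Values copied from the slide; the column order is an assumption.
Q_BY_AREA = {
    "106 Highway": [0.30, 0.34, 0.58, 0.18, 0.10, 0.45, 0.53, 0.13, 0.00, 0.11],
    "North Border": [0.06, 0.03, 0.13, 0.05, 0.01, 0.23, 0.02, 0.21, 0.22, 0.18],
    "Eastern Coast": [0.16, 0.17, 0.26, 0.10, 0.28, 0.09, 0.36, 0.08, 0.05, 0.01],
    "Yangzi River": [0.27, 0.19, 0.39, 0.33, 0.29, 0.21, 0.53, 0.13, 0.19, 0.07],
    "Belt & Road": [0.51, 0.28, 0.41, 0.22, 0.04, 0.22, 0.29, 0.03, 0.01, 0.17],
}


def rank_factors(values, names=FACTORS):
    """Return (factor, value) pairs sorted from largest to smallest value."""
    return sorted(zip(names, values), key=lambda pair: pair[1], reverse=True)


top_factor = {area: rank_factors(vals)[0][0] for area, vals in Q_BY_AREA.items()}
for area, factor in top_factor.items():
    print(f"{area}: {factor}")
```

Under the assumed column order, the leading factor differs between areas, which is the kind of regional contrast the table is meant to show.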

Determinants of Soil Antibiotic in a Town

Vegetable planting models (0.28) > chicken dung quantity (0.20) > elevation (0.18) > planting age (0.09) > chicken dung types (0.06) > area of greenhouse (0.02).

Li XW, et al. 2013. Influence of planting patterns on Fluoroquinolone residues in the soil of an intensive vegetable cultivation area in north China. Science of the Total Environment 458-460: 63-69.

Determinants of NTD (neural tube defects) in a County

X: lithology types; topography; soil types.

• Primary environment (watershed, lithozone, and soil) strongly controls the NTD.
• Basic nutrition (food) is more important than artificial environment (fertilizer) in controlling the spatial pattern of NTDs.
• Ancient materials released from faults and then spreading along slopes dramatically increase the risk of NTDs.

Wang JF, Li XH, Christakos G, et al. 2010. Geographical detectors-based health risk assessment and its application in the neural tube defects study of the Heshun Region, China. International Journal of Geographical Information Science 24(1): 107-127.

5. Q/A

How to make stratification (MS)

1. If x is categorical, use its categories as the strata.
2. If x is numerical:
(1) Stratify according to the established conventions of your discipline;
(2) Try different stratifications {h} and use the one with the larger q and a good explanation, just as OLS regression tries different {b} and uses the one that maximizes R2;
(3) Divide x into 7 equal strata, following human perception, as in ArcGIS.
3. Multiple {xi} may have different strata {hi}, like {xi} ~ {bi} in regression.

When the software reports an error or an abnormal result, please check:
A numerical X should be stratified, and > 2 sample units are required in each stratum.

q value and p value
When measuring the SSH of a variable y: 100q% SSH degree, at the p significance level.
When attributing y to x: x explains 100q% of y; there is no need to report the p value.

Should Σ qi = 1?
No, because of nonlinear coupling between y and x, and interaction between the x.

Direction of q?
There is no direction for nonlinear coupling; for linear coupling, the direction is found within strata.

Big sample, say 1 million pixels of a remote sensing image
Resample; 100 sample units in a stratum is fine. R-Geodetector software.
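Option 2(2) above, trying several stratifications and keeping the one with the largest q, can be sketched as below. This is a minimal illustration, not the algorithm used by the Geodetector software: it only tries equal-interval breaks, the function names are hypothetical, and the default `min_units=3` reads the "> 2 sample units" rule literally, which is an assumption.

```python
from collections import Counter, defaultdict


def q_statistic(y, strata):
    """Return q = 1 - SSW / SST for values y grouped by stratum labels."""
    mean_all = sum(y) / len(y)
    sst = sum((v - mean_all) ** 2 for v in y)
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)
    ssw = 0.0
    for values in groups.values():
        mean_h = sum(values) / len(values)
        ssw += sum((v - mean_h) ** 2 for v in values)
    return 1.0 - ssw / sst


def equal_interval_labels(x, k):
    """Assign each value of x to one of k equal-width classes (0 .. k-1)."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in x]


def best_equal_interval(x, y, candidates=range(2, 8), min_units=3):
    """Try each class count; keep the first one that reaches the largest q."""
    best_k, best_q = None, float("-inf")
    for k in candidates:
        labels = equal_interval_labels(x, k)
        if min(Counter(labels).values()) < min_units:
            continue  # skip stratifications with too few units in a stratum
        q = q_statistic(y, labels)
        if q > best_q:
            best_k, best_q = k, q
    return best_k, best_q


# Toy data (made up): y jumps at x = 10 and x = 20.
x = list(range(30))
y = [0.0 if v < 10 else 5.0 if v < 20 else 10.0 for v in x]

best_k, best_q = best_equal_interval(x, y)
print(best_k, best_q)
```

Because finer stratifications can raise q simply by adding strata, the slide's second condition, a good explanation, still has to be judged by the analyst.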

6. Flexible Geodetector

1. Measure SSH: q(y|#), where # stands for strata

2. Find SSH: q(y|#)

3. Rank factors {xk}: qk(y|#xk), for each k

4. Compare factors' ranks {xk} between regions {z}: qzk(y|#xk)

5. Track how factors' ranks {xk} evolve in time {t}: qtk(y|#xk)

6. Compare factors' ranks {xk} between PCAs {s}: qsk(y|#xk)

7. Product interaction: q(y|x1x2); general interaction: q(y|#x1 ∩ #x2)

8. To get a higher q value:

(1) First, fit the linear regression y ~ x + e; then apply Geodetector (e|x);

(2) Super q algorithm: try different strata and keep the one with the maximum q.

9. dy/dx: use Geodetector to find SSH; then regress y ~ x within strata.

10. If y is categorical and x is numerical, run the software as (x|y).

11. Visualize Geodetector (please see the next slides)
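Item 9, regressing y on x within strata once SSH has been found, can be sketched with ordinary least squares fitted per stratum. The data below are made up so that the slope is +1 in one stratum and -1 in the other while the pooled slope is 0; the stratum names and function names are hypothetical.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def slopes_by_stratum(x, y, strata):
    """Fit y ~ x separately in each stratum and return {stratum: slope}."""
    xs, ys = defaultdict(list), defaultdict(list)
    for a, b, s in zip(x, y, strata):
        xs[s].append(a)
        ys[s].append(b)
    return {s: ols_slope(xs[s], ys[s]) for s in xs}


# Toy data (made up): opposite responses of y to x in two strata.
x = [0, 1, 2, 3, 0, 1, 2, 3]
y = [0, 1, 2, 3, 3, 2, 1, 0]
strata = ["north"] * 4 + ["south"] * 4

pooled = ols_slope(x, y)
local = slopes_by_stratum(x, y, strata)
print(pooled, local)
```

A global model fitted to the pooled data would report no effect here, which is the confounding that stratifying first is meant to avoid.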

Visualization of Geodetector q values

Fig. 7. Determinant power of factors influencing hazard index for health risk (Wang HF, et al 2018. EP)

Figure 5: Power of each determinant in ascending sequence (Zhang KS, et al 2017. JTLU)

Figure 6. The power of determinant for the investment in residential buildings (IRB), immigrant (Yang SF, et al 2017. Sustainability)

Visualization of Geodetector's interaction between two variables

Song YZ, et al 2018. RS; Ren Y, et al 2016. EP

7. Summary

Modern Paradigm in Geographic Studies

[Diagram labels: Population; Data; Modelling; Sampling; Monitor; Remote sensing; Spatial Statistics; Mathematics; Patterns; Attribution; Prediction]

Spatial Statistics

| Statistics \ Population | Spatial Autocorrelation (1950 ~ 1995) | Spatial Heterogeneity: Local (1992 ~ 2000) | Spatial Heterogeneity: Stratified (2010-) |
|---|---|---|---|
| Measure | Moran I; Semivariogram | Gi; LISA; SatScan | Geodetector q |
| Attribution | Spatial regress; BHM | GWR | Geodetector q |
| Prediction | Kriging; IDW; Spline | N/A | Sandwich; MSN; Bshade; SPA |

Geodetector

$$q = 1 - \frac{\sum_{h=1}^{L} N_h \sigma_h^2}{N \sigma^2}$$

[Figure: sample partitioned into strata h = 1, 2, 3, 4]

Wang JF, Zhang TL, Fu BJ. 2016. A measure of spatial stratified heterogeneity. Ecological Indicators 67: 250-256.

Spatial stratified heterogeneity (SSH) is prevalent in nature
• Good news: understand the nature; linearization
• Bad news: cause sample bias; global model confounded

Properties of Geodetector q
• Y is numerical, X is categorical
• Y = q(X): general coupling, no assumption of linearity
• Y = q(X1 ∩ X2): general interaction, not limited to a product
• The stratification is flexible: spatial, temporal, attribute, spectrum, etc.

Applications of Geodetector q

| Aim | Algorithms | Interpretation |
|---|---|---|
| #Y | q(Y \| #), q ∈ [0, 1]; max q(Y \| #) | q = 0, no SSH; q = 1, perfect SSH; q: degree of SSH |
| X | q(Y \| #X); max q(Y \| #X) | If X determines Y, their spatial distributions tend to be consistent; q ∈ [0, 1]: 100q% of the variance of Y is explained by the strata of X |
| X1,2 | q(Y \| X1 ∩ X2) | Interaction between X1 and X2 impacting Y |

Thanks !

www.geodetector.cn