Transcript of “An Analysis of Approaches to Presence-Only Data” (hastie/TALKS/hastieSDM.pdf)
An Analysis of Approaches to Presence-Only Data
William Fithian and Trevor Hastie
Department of Statistics, Stanford University
July 30, 2012
Species Distribution Modeling
Question: where may a given species be found?
Motivations:
• Plan wildlife management actions
• Monitor endangered or invasive species
• Scientific understanding
• etc.
What geographic features predict greater abundance?
Presence-Absence / Count Data
Scientists visit a patch of land
Record whether any specimens were encountered / how many
Relatively high-quality data
Expensive; difficult for rare or elusive species
Presence-Only Data
Motorist spies a koala
Calls museum excitedly
Museum records the location
Lower-quality data
But more of it exists
Increasingly popular object of study with the advent of geographic information systems
Overview
Proliferation of methods for studying presence-only data
Recent papers have pointed out close connections among them:
• Warton and Shepherd (2010)
• Aarts et al. (2011)
Goals here:
1 Interpret these connections
2 Explore their implications
3 Extend the results
Outline
1 Inhomogeneous Poisson Process Model / Maxent
2 Logistic Regression
3 Pooling Different Kinds of Data
Notation
n1 presence observations, n0 background observations
Geographic coordinates zi ∈ D ⊆ R², i = 1, …, n0 + n1
Features xi = x(zi), measured via GIS
yi = 1 for presence, yi = 0 for background
Outline
1 Inhomogeneous Poisson Process Model / Maxent
2 Logistic Regression
3 Pooling Different Kinds of Data
Inhomogeneous Poisson Process
Intensity function λ(z) : D → [0, ∞)
Λ(A) = ∫_A λ(z) dz
Assume Λ(D) < ∞; normalized density pλ(z) = λ(z)/Λ(D)
Definition 1: choose a Poisson number of points, then simple random sample
n1 ∼ Poisson(Λ(D))
zi | yi = 1 ∼ i.i.d. pλ
Definition 2: continuous limit of the discrete Poisson model
N(A) = #{i : zi ∈ A, yi = 1} ∼ Poisson(Λ(A))
A ∩ B = ∅ ⇒ N(A) ⊥⊥ N(B)
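As an illustrative aside (not from the talk): Definition 1 suggests a direct way to simulate an IPP, and the standard thinning construction makes the link between the two definitions concrete. A minimal Python sketch, assuming a rectangular domain and a log-linear intensity in the first coordinate; the function name `simulate_ipp` is ours:

```python
import numpy as np

def simulate_ipp(lam, lam_max, domain=(0.0, 1.0, 0.0, 1.0), rng=None):
    """Simulate an inhomogeneous Poisson process on a rectangle by thinning.

    Draw a homogeneous Poisson process with rate lam_max, then keep each
    candidate point z independently with probability lam(z) / lam_max.
    """
    rng = np.random.default_rng(rng)
    x0, x1, y0, y1 = domain
    area = (x1 - x0) * (y1 - y0)
    n = rng.poisson(lam_max * area)                  # candidate count
    pts = np.column_stack([rng.uniform(x0, x1, n),
                           rng.uniform(y0, y1, n)])  # uniform candidates
    keep = rng.uniform(size=n) < lam(pts) / lam_max  # thinning step
    return pts[keep]

# Example: lambda(z) = exp(alpha + beta * first coordinate), so
# Lambda(D) = e^alpha (e^beta - 1)/beta on the unit square (about 174 here).
alpha, beta = 4.0, 2.0
lam = lambda z: np.exp(alpha + beta * z[:, 0])
pts = simulate_ipp(lam, lam_max=np.exp(alpha + beta), rng=0)
```

The retained count is Poisson with mean Λ(D), matching Definition 2 applied to A = D.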
Presence-Only Data as IPP
Warton & Shepherd (2010) propose a log-linear IPP for presence-only data:
λ(z) = e^{α + β′x(z)}
pλ(z) = e^{β′x(z)} / ∫_D e^{β′x(z)} dz
β determines pλ
α determines Λ(D)
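A small numeric sketch (ours, not the authors') of the last two bullets: on a discretized domain, pλ is a softmax of β′x(z), so changing α rescales λ but leaves pλ untouched.

```python
import numpy as np

# Discretize D into grid cells; p_lambda is a softmax of beta'x(z),
# so it does not depend on alpha (alpha only scales Lambda(D)).
rng = np.random.default_rng(1)
x = rng.normal(size=500)            # one feature per grid cell (hypothetical)

def p_lambda(alpha, beta, x):
    lam = np.exp(alpha + beta * x)  # intensity at each cell
    return lam / lam.sum()          # normalize: alpha cancels here

p1 = p_lambda(alpha=0.0, beta=1.5, x=x)
p2 = p_lambda(alpha=7.0, beta=1.5, x=x)
# p1 and p2 are identical: beta determines p_lambda; alpha does not.
```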
Identifiability and Observer Bias
The occurrence process is of scientific interest
Presence-only data reflect the rate of sightings
Observation process is a thinned occurrence process:
λobs(z) = λocc(z) s(z) = e^{α̃ + β̃′x(z)} e^{γ + δ′x(z)}
Options:
1 Assume s is constant (optimistic)
2 Assume s and λocc depend on different features
Either way, α̃ is unidentifiable (α = γ + α̃)
Maximum Likelihood for IPP
Log-likelihood:
ℓ(α, β) = Σ_{yi=1} (α + β′xi) − ∫_D e^{α + β′x(z)} dz
Score equation for α:
n1 = ∫_D e^{α + β′x(z)} dz = Λ(D)
Implication: α̂ is not of scientific interest unless n1 is
Maximum Likelihood for IPP
Plug in α̂(β) (partially maximize ℓ):
ℓ*(β) = Σ_{yi=1} β′xi − n1 log(∫_D e^{β′x(z)} dz) = Σ_{yi=1} log pλ(zi)
Score equations for β:
(1/n1) Σ_{yi=1} xi = ∫_D e^{β′x(z)} x(z) dz / ∫_D e^{β′x(z)} dz = E_{pλ} x(z)
Interpretation:
1 Choose β̂ to match the means of the features x(z)
2 Choose α̂ so that Λ(D) = n1
1. Estimate density. 2. Multiply by n1.
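The profile likelihood and its score equation can be checked numerically. In this sketch (assuming a single feature and a discretized domain; all names are ours), we maximize ℓ*(β) and confirm that the presence-sample mean of x matches the pλ-weighted mean of x at β̂:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Grid cells stand in for D; presence points are drawn from the tilted
# density p_lambda with true slope beta = 1.
rng = np.random.default_rng(2)
x_grid = rng.normal(size=2000)                 # feature value in each cell
p = np.exp(1.0 * x_grid); p /= p.sum()
x_pres = rng.choice(x_grid, size=500, p=p)     # presence features ~ p_lambda

def neg_profile_ll(beta):
    # l*(beta) = sum_pres beta*x_i - n1 * log( sum_grid e^{beta x} )
    return -(beta * x_pres.sum()
             - len(x_pres) * np.log(np.exp(beta * x_grid).sum()))

beta_hat = minimize_scalar(neg_profile_ll, bounds=(-5, 5), method="bounded").x
w = np.exp(beta_hat * x_grid); w /= w.sum()
# Score equation at beta_hat: x_pres.mean() == (w * x_grid).sum()
# (the presence mean of x equals the p_lambda-weighted mean of x).
```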
Numerical Approximation of IPP Likelihood
In practice, the integrals can’t be evaluated analytically
Replace them by a numerical approximation:
ℓ(α, β) ≈ Σ_{yi=1} (α + β′xi) − (|D|/n0) Σ_{yi=0} e^{α + β′xi}
Same interpretation of the score equations
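The background-sum approximation of the integral term can be sanity-checked on a domain where the integral has a closed form. A sketch (ours), assuming D = [0, 1], x(z) = z, and uniformly drawn background points:

```python
import numpy as np

# Approximate int_D e^{alpha + beta x(z)} dz by (|D|/n0) * sum_bg e^{alpha + beta x_i}.
# On D = [0,1] with x(z) = z the exact value is e^alpha (e^beta - 1)/beta.
alpha, beta = 1.0, 2.0
exact = np.exp(alpha) * (np.exp(beta) - 1) / beta

rng = np.random.default_rng(3)
z_bg = rng.uniform(0, 1, size=200_000)                 # n0 background points
approx = np.exp(alpha + beta * z_bg).sum() / len(z_bg)  # |D| = 1
# Monte Carlo error shrinks like 1/sqrt(n0); here approx is within ~1% of exact.
```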
Maxent / Conditional IPP
Phillips et al. (2004, 2006, 2008)
Nonparametric density for presence samples: zi | yi = 1 ∼ i.i.d. p(z)
Maximize H(p) = −∫ p(z) log p(z) dz subject to
(1/n1) Σ_{yi=1} x(zi) = E_p x(z)
Authors show the solution has a parametric form:
zi ∼ i.i.d. e^{β′x(z)} / ∫ e^{β′x(u)} du
Aarts et al. (2011): same slopes β̂ as the IPP
Equivalence Under Penalization
The Maxent software uses a large basis expansion and an ℓ1 penalty on β
If the IPP and Maxent use
• the same data (including background)
• the same basis expansion
• the same penalty on β
• α unpenalized in the IPP
then β̂IPP = β̂Maxent
Can replace β′x(z) with fθ(z)
Same p̂(z); the IPP also computes λ̂(z) = n1 p̂(z)
Outline
1 Inhomogeneous Poisson Process Model / Maxent
2 Logistic Regression
3 Pooling Different Kinds of Data
“Naive” Logistic Regression
Treat xi as fixed:
yi | xi ∼ Bernoulli( e^{η + β′xi} / (1 + e^{η + β′xi}) )
Flexible modeling framework: GAM, MARS, boosting, LASSO, etc.
Case-Control Sampling
Back to the IPP model
Condition on zi:
P(y = 1 | z) = P(y = 1) P(z | y = 1) / [ P(y = 0) P(z | y = 0) + P(y = 1) P(z | y = 1) ]
= (n1 e^{α + β′x(z)} / Λ(D)) / (n0 + n1 e^{α + β′x(z)} / Λ(D))
= e^{η + β′x(z)} / (1 + e^{η + β′x(z)})
“Case-control” sampling design
Logistic regression likelihood = conditional IPP likelihood
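The algebra above can be verified numerically. In this sketch (ours), η = α + log(n1 / (n0 Λ(D))) is the offset implied by dividing numerator and denominator by n0; the slide leaves η implicit, and all numeric values here are arbitrary:

```python
import numpy as np

# Check: the case-control conditional probability collapses to a logistic
# function of beta'x(z) with intercept eta = alpha + log(n1 / (n0 * Lambda_D)).
rng = np.random.default_rng(4)
n1, n0, alpha, beta, Lambda_D = 300, 5000, -2.0, 1.3, 40.0
x = rng.normal(size=1000)

lhs = (n1 * np.exp(alpha + beta * x) / Lambda_D) / (
    n0 + n1 * np.exp(alpha + beta * x) / Lambda_D)

eta = alpha + np.log(n1 / (n0 * Lambda_D))
rhs = np.exp(eta + beta * x) / (1 + np.exp(eta + beta * x))
# lhs and rhs agree elementwise.
```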
Logistic Regression vs IPP
Both estimate the same β, but yield different β̂
Warton & Shepherd (2010) show β̂LR → β̂IPP as n0 → ∞ with n1 fixed
Misspecified case: not true if n0, n1 → ∞ together (the limit depends on lim n1/n0)
Logistic Regression vs IPP
Fixed presence sample, n1 = 1000. True λ quadratic in x.
[Figure: “Logistic Regression Estimates (n1 = 1000)” — β̂ plotted against n0, for n0 from 100 to 10^6.]
Weighted Logistic Regression
Don’t really need n0 → ∞
Weight the sample to reflect undersampling of background points:
wi = W if yi = 0, wi = 1 if yi = 1
As W → ∞, β̂WLR → β̂IPP
Weighted logistic regression = numerical IPP = numerical Maxent
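A minimal sketch of the weighted fit (one feature, discretized domain, a hand-rolled scipy optimizer rather than any particular SDM package; all names are ours): as W grows, the weighted logistic slope stabilizes near the IPP slope.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(5)
x_bg = rng.normal(size=2000)                 # background features
p = np.exp(1.0 * x_bg); p /= p.sum()         # true slope beta = 1
x_pres = rng.choice(x_bg, size=400, p=p)     # presence features ~ p_lambda

x_all = np.concatenate([x_pres, x_bg])
y = np.concatenate([np.ones(len(x_pres)), np.zeros(len(x_bg))])

def wlr_slope(W):
    """Slope of a logistic regression with weight W on background points."""
    w = np.where(y == 1, 1.0, W)
    def nll(theta):
        eta = theta[0] + theta[1] * x_all
        # numerically stable log(1 + e^eta), plus the analytic gradient
        log1pexp = np.maximum(eta, 0) + np.log1p(np.exp(-np.abs(eta)))
        resid = w * (expit(eta) - y)
        return (-(w * (y * eta - log1pexp)).sum(),
                np.array([resid.sum(), (resid * x_all).sum()]))
    return minimize(nll, x0=np.zeros(2), jac=True, method="BFGS").x[1]

slopes = [wlr_slope(W) for W in (1.0, 100.0, 1000.0)]
# The slope changes little between W = 100 and W = 1000: it has converged
# to (roughly) the IPP maximum-likelihood slope.
```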
Weighted vs Unweighted Logistic Regression
Weighted LR converges faster to the large-n0 limit.
[Figure: “Weighted and Unweighted Estimates for Logistic Regression” — β̂ plotted against n0, for n0 from 100 to 10^6, with one curve per method.]
Outline
1 Inhomogeneous Poisson Process Model / Maxent
2 Logistic Regression
3 Pooling Different Kinds of Data
Presence-Absence and Count Data
Implied likelihood for count / presence-absence data:
N | x ∼ Poisson(A e^{α̃ − ε + β̃′x})
Can pool data from multiple studies
Example: Correcting for Bias
Assume: multiple species, same bias
λocc,j(z) = e^{α̃j + β̃j′x(z)}
λobs,j(z) = e^{α̃j + γj + (β̃j + δ)′x(z)}
The model is identifiable given
1 Presence-only data for all species (to estimate βj)
2 Presence-absence / count data for at least one species (to estimate δ)
Conclusions
IPP, MaxEnt, and logistic regression are all motivated by the same underlying model
All estimate the same β (α is uninteresting)
β̂ for the IPP / MaxEnt can be fit by weighted logistic regression / GAM / boosted trees / MARS / group LASSO / ...
boosted.ipp <- gbm(y ~ ., distribution = "bernoulli",
                   data = banksia, weights = 1000^(1 - y))
Can combine presence-only, presence-absence, and other data