Erin Childs (Pomona College), Andrew Calderon (Heritage University), Evan Goldman (Bard College,...

  • Slide 1
  • Erin Childs (Pomona College), Andrew Calderon (Heritage University), Evan Goldman (Bard College, Boston University), Molly O’Neill (Lehigh University), Clay Showalter (Evergreen University), with the help of Olivia Poblacion (Oregon State University)
  • Slide 2
  • Acknowledgements: Dr. Dietterich, CS Professor; Dr. Wong, CS Professor; Steven Highland, Geosciences PhD Candidate; Jorge Ramirez, Math Professor; Dan Sheldon, CS Post-doc; Julia Jones, Geosciences Professor; Rebecca Hutchinson, CS Post-doc; Javier Illan, PhD, Post-doc
  • Slide 3
  • Studying Climate Change: Lepidoptera. Why are Lepidoptera a good indicator of climate change? Past studies on Lepidoptera: Woiwod 1996, "Detecting the effects of climate change on Lepidoptera"; Dewar and Watt 1992, "Predicted changes in the synchrony of larval emergence and budburst under climatic warming"
  • Slide 4
  • Research Questions: 1) How is vegetation related to moth species distribution and composition? 2) How does climate affect moth phenology?
  • Slide 5
  • Study Site: H.J. Andrews Experimental Forest http://andrewsforest.oregonstate.edu/about.cfm?topnav=2
  • Slide 6
  • Slide 7
  • Vegetation Surveying: Methods. GPS coordinates; walked out 30m and 100m radius in all directions; presence/absence of 71 species of known host plants
  • Slide 8
  • Slide 9
  • Moth Trapping: Methods. 9 sites selected; equipment used; moth preservation
  • Slide 10
  • Methods: Moth Identification
  • Slide 11
  • Moth Trapping Results: Semiothisa signaria, Pero occidentalis
  • Slide 12
  • Overview: Is vegetation a good predictor of moth species presence/absence? Develop software tools for exploring/analyzing data; run generalized boosted regression models (GBMs) for each moth species; create GIS layers for the predicted locations of each moth species
  • Slide 13
  • Software Tasks for Data Exploration: Format data; compare the similarities and differences between sites, moths and vegetation; discover correlations between vegetation and moth species; calculate marginal probabilities of plant occurrences; visualize results
  • Slide 14
  • Measuring Similarity: Hamming Distance. Hamming distance is the number of covariates that differ between sample sets. A smaller number means the sets are more similar.
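A minimal sketch of the Hamming distance described on this slide, assuming each site is stored as an equal-length list of 0/1 presence/absence values. The three site vectors are hypothetical and are not taken from the survey data.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical presence/absence records for five plant species at three sites.
site_a = [1, 0, 1, 1, 0]
site_b = [1, 1, 1, 0, 0]
site_c = [0, 1, 0, 0, 1]

d_ab = hamming_distance(site_a, site_b)  # differs in two positions
d_ac = hamming_distance(site_a, site_c)  # differs in every position
```

Under this measure site_a is more similar to site_b than to site_c, because fewer covariates differ.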
  • Slide 15
  • Slide 16
  • Marginal Probabilities: Using the vegetation data collected at 20 sites, generate marginal probabilities for plant occurrences. If huckleberry (VAHU) is found at a site, what is the probability of finding thimbleberry (RUPA) but not licorice root (!LIGR) at that site?
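The huckleberry question on this slide can be answered by counting sites. The sketch below is one way to do that, assuming each site is a dictionary from species code to 0/1. The function name `conditional_probability` and the five site records are hypothetical, not the project's actual tool or data.

```python
def conditional_probability(sites, given, present=(), absent=()):
    """Estimate P(all `present` found and no `absent` found | `given` found)."""
    conditioned = [s for s in sites if s[given]]
    if not conditioned:
        return None  # the conditioning species never occurs
    hits = [
        s for s in conditioned
        if all(s[p] for p in present) and not any(s[a] for a in absent)
    ]
    return len(hits) / len(conditioned)


# Hypothetical survey records (1 = present, 0 = absent).
sites = [
    {"VAHU": 1, "RUPA": 1, "LIGR": 0},
    {"VAHU": 1, "RUPA": 1, "LIGR": 1},
    {"VAHU": 1, "RUPA": 0, "LIGR": 0},
    {"VAHU": 0, "RUPA": 1, "LIGR": 0},
    {"VAHU": 1, "RUPA": 1, "LIGR": 0},
]

# P(RUPA and not LIGR | VAHU): 2 of the 4 VAHU sites qualify.
p = conditional_probability(sites, "VAHU", present=["RUPA"], absent=["LIGR"])
```

With only 20 surveyed sites, estimates of this kind rest on very few observations, so they are best read as exploratory.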
  • Slide 17
  • Canonical Correlation Analysis (CCA): Canonical correlation analysis aims at highlighting correlations between two data sets. It gives us a way of making sense of cross-covariance matrices and allows ecologists to relate the abundance of species to environmental variables. Using CCA we analyzed our vegetation data and moth data.
  • Slide 18
  • X-correlation: highlights any correlations among only moth species (422x422). Y-correlation: highlights any correlations among only plant species (71x71). Cross-correlation: highlights any correlations between both data sets (71x422).
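The sketch below shows how a plants-by-moths cross-correlation matrix of the kind listed on this slide could be built from per-site presence/absence columns. It uses plain Pearson correlation and is only the input that CCA starts from, not CCA itself. The species columns are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns_a, columns_b):
    """Rows index columns_a, columns index columns_b."""
    return [[pearson(a, b) for b in columns_b] for a in columns_a]


# Hypothetical presence/absence at six sites.
plants = {"VAHU": [1, 1, 0, 0, 1, 0], "RUPA": [0, 1, 1, 0, 1, 0]}
moths = {
    "moth_1": [1, 1, 0, 0, 1, 0],
    "moth_2": [0, 0, 1, 1, 0, 1],
    "moth_3": [1, 0, 1, 0, 1, 0],
}

# 2 plants x 3 moths, analogous to the 71x422 cross-correlation matrix.
cross = correlation_matrix(list(plants.values()), list(moths.values()))
```

Passing the same set of columns for both arguments gives the within-group matrices (moth by moth, or plant by plant).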
  • Slide 19
  • Generalized Boosted Regression Models (GBMs): Regression analysis allows us to explore the relationships between individual moth presence/absence (dependent variable) and various characteristics of each site (independent variables). The goal is to minimize the loss function, which represents the loss associated with an estimate being different from the true value. A basis function is an element of a set of vectors that, in linear combination, can represent every vector in a given vector space; every function can be represented as a linear combination of basis functions. Boosting is the process of iteratively adding basis functions in a greedy fashion so that each additional basis function further reduces the selected loss function. The model is run several times with different values for the tuning parameters to determine the best values.
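To make the greedy addition of basis functions concrete, here is a small boosting sketch. It is not the GBM software used in the project. It assumes a single hypothetical predictor (elevation), decision stumps as basis functions, and squared-error loss for simplicity, whereas a presence/absence model would more typically use a Bernoulli loss. `n_rounds` and `learning_rate` stand in for the tuning parameters.

```python
def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    # Skip the largest value: it would leave the right-hand side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Greedily add stumps, each fit to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps, losses = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        # Store the leaf values already scaled by the learning rate.
        stump = (threshold, learning_rate * left_mean, learning_rate * right_mean)
        stumps.append(stump)
        preds = [p + (stump[1] if x <= threshold else stump[2])
                 for x, p in zip(xs, preds)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, stumps, losses


def predict(x, base, stumps):
    """Sum the base value and every stump's contribution at x."""
    return base + sum(left if x <= t else right for t, left, right in stumps)


# Hypothetical data: elevation (m) and presence/absence of one moth species.
elevation = [400, 550, 700, 850, 1000, 1150, 1300, 1450]
presence = [0, 0, 0, 1, 1, 1, 1, 0]

base, stumps, losses = boost(elevation, presence)
```

The `losses` list records the training loss after each round; it never increases, which is the "each additional basis function further reduces the loss" behaviour described on the slide.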
  • Slide 20
  • Validating the GBM: All available regressors are used in the model, meaning that the choice of independent variables is not supported by theory. The standard approach to validating models is to split the data into a training and a test data set. The model is fit on the training data, then used to make predictions on the test data. This checks that the model generalizes and is not overfit.
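A minimal sketch of the train/test split described on this slide, using only the standard library. The 25% test fraction, the fixed seed, and the integer stand-ins for site records are assumptions for illustration.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and hold out a fraction for testing."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical stand-ins for 20 site records.
records = list(range(20))
train, test = train_test_split(records)
```

The model would be fit on `train` only, and its predictions compared with the observed values in `test`.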
  • Slide 21
  • Running the Model: Ran the model for individual moth species using all 256 trap sites at HJA, with moth trapping data collected from 2004 to 2008. Did not include vegetation data, since we only collected it at 20 sites. The GBM lays a grid over the Andrews forest and calculates the predicted probability of the moth species being present for each grid cell.
  • Slide 22
  • Visualizing GBM Results
  • Slide 23
  • Slide 24
  • Thermal Climate of the H.J. Andrews Experimental Forest: PRISM-estimated mean monthly maximum and minimum temperature maps showing topographic effects of radiation and sky view factors. Provided by Jonathan W. Smith.
  • Slide 25
  • Daily temperatures at climate stations; mean monthly temperatures at climate stations; mean monthly temperatures at trap sites; correlation between climate stations and trap sites; daily temperatures at trap sites; degree day curve for trap sites
  • Slide 26
  • Slide 27
  • Degree Day Curve: Use a linear regression model to interpolate the degree for a given trap site for specific days of a year. Parameterize temperature so that it can later be included in the temporal model. Produce degree day curves for any trap site.
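The slides do not say how degree days were accumulated. The sketch below assumes the simple averaging method: each day contributes the amount by which the mean of its minimum and maximum temperature exceeds a base temperature, and the curve is the running total. The base temperature of 10 degrees and the four days of readings are hypothetical.

```python
def degree_day_curve(daily_min, daily_max, base_temp=10.0):
    """Cumulative degree days from daily min/max temperatures (averaging method)."""
    total = 0.0
    curve = []
    for t_min, t_max in zip(daily_min, daily_max):
        # Days whose mean temperature is below the base contribute nothing.
        total += max(0.0, (t_min + t_max) / 2 - base_temp)
        curve.append(total)
    return curve


# Hypothetical daily readings at one trap site (degrees C).
daily_min = [4.0, 6.0, 8.0, 10.0]
daily_max = [12.0, 16.0, 20.0, 22.0]

curve = degree_day_curve(daily_min, daily_max)  # daily means: 8, 11, 14, 16
```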
  • Slide 28
  • Find Coefficients: Each Trap_ID will have two sets of coefficients (Maximum and Minimum). Multi-Linear Regression Analysis
  • Slide 29
  • Predicting Daily Temp: Linear Interpolation. Fill gaps in the daily temperature data. In goes the trap_ID, start_date and end_date; out comes the min and max for the given day(s).
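A small sketch of filling gaps by linear interpolation, assuming the daily series is a list with `None` marking missing days. The function name `fill_gaps` and the readings are hypothetical; the lookup by trap_ID and date range mentioned on the slide is left out.

```python
def fill_gaps(values):
    """Linearly interpolate None entries that lie between two known readings."""
    filled = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled  # leading or trailing gaps stay None


# Hypothetical daily maximum temperatures with missing days.
daily_max = [10.0, None, None, 16.0, 15.0, None, 11.0]
filled = fill_gaps(daily_max)
```

Gaps at the very start or end of a series have no neighbour on one side, so this sketch leaves them unfilled rather than extrapolating.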
  • Slide 30
  • Temporal Distribution of Moths
  • Slide 31
  • The Problem: Year-round distribution of moths; limited observation points; unseen, unmeasurable data; catching probabilities; total moth population
  • Slide 32
  • Example: Flight times. Consider 3 trapping times (t1, t2, t3) and 4 associated intervals (I0, I1, I2, I3), and moths with flight times as follows
  • Slide 33
  • Example: Distribution. This gives us a distribution table (rows I0 to I3, columns I0 to I3): I0: 0 0 0 0; I1: 0 0 0 0; I2: 0 0 0 0; I3: 0 0 0 0
  • Slide 34
  • Example: Distribution. This gives us a distribution table (rows I0 to I3, columns I0 to I3): I0: 0 0 0 0; I1: 0 1 0 0; I2: 0 0 0 0; I3: 0 0 0 0
  • Slide 35
  • Example: Distribution. This gives us a distribution table (rows I0 to I3, columns I0 to I3): I0: 0 1 0 0; I1: 0 1 0 0; I2: 0 0 0 0; I3: 0 0 0 0
  • Slide 36
  • Example: Distribution. This gives us a distribution table (rows I0 to I3, columns I0 to I3): I0: 0 1 0 0; I1: 0 1 0 1; I2: 0 0 0 0; I3: 0 0 0 0
  • Slide 37
  • Example: Distribution. This gives us a distribution table (rows I0 to I3, columns I0 to I3): I0: 0 1 1 0; I1: 0 1 0 1; I2: 0 0 0 0; I3: 0 0 0 0
  • Slide 38
  • Example: Distribution. This gives us a distribution table (rows I0 to I3, columns I0 to I3): I0: 1 2 4 1; I1: 0 2 3 3; I2: 0 0 1 2; I3: 0 0 0 1
  • Slide 39
  • Example cont. This gives us a distribution table and flight counts. Table (rows I0 to I3, columns I0 to I3): I0: 1 2 4 1; I1: 0 2 3 3; I2: 0 0 1 2; I3: 0 0 0 1. Flight counts: f1 = 7
  • Slide 40
  • Example cont. This gives us a distribution table and flight counts. Table (rows I0 to I3, columns I0 to I3): I0: 1 2 4 1; I1: 0 2 3 3; I2: 0 0 1 2; I3: 0 0 0 1. Flight counts: f1 = 7, f2 = 11
  • Slide 41
  • Example cont. This gives us a distribution table and flight counts. Table (rows I0 to I3, columns I0 to I3): I0: 1 2 4 1; I1: 0 2 3 3; I2: 0 0 1 2; I3: 0 0 0 1. Flight counts: f1 = 7, f2 = 11, f3 = 6
  • Slide 42
  • Example: Flight Counts. When trapping moths, all we see is flight counts. Given flight counts, we want to predict moth distribution. f1 = 7, f2 = 11, f3 = 6
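The flight counts in this example can be reproduced from the distribution table. The sketch assumes that rows are emergence intervals and columns are death intervals, and that a moth is in flight at trapping time t_i if it emerged in an interval before t_i and dies in an interval after it. That reading is an interpretation of the slides, but it does give the counts 7, 11 and 6 shown above.

```python
def flight_counts(table):
    """Moths alive at each trapping time, from an emergence-by-death count table.

    table[j][k] is the number of moths emerging in interval j and dying in
    interval k; trapping time t_i separates interval i-1 from interval i.
    """
    n = len(table)
    counts = []
    for i in range(1, n):
        alive = sum(table[j][k] for j in range(i) for k in range(i, n))
        counts.append(alive)
    return counts


# Final distribution table from the example (rows and columns I0 to I3).
table = [
    [1, 2, 4, 1],
    [0, 2, 3, 3],
    [0, 0, 1, 2],
    [0, 0, 0, 1],
]

counts = flight_counts(table)  # f1, f2, f3
```

Going from the table to the counts is a simple sum. The inference problem posed on the slide runs the other way: many different tables produce the same three counts.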
  • Slide 43
  • Maximum Likelihood Model: Maximize Prob(Data | Parameters). Data = moth trapping: moths trapped $f = (f_1, f_2, \dots, f_T)$; times trapped $t = (t_1, t_2, \dots, t_T)$
  • Slide 44
  • Maximum Likelihood Model: Parameters = probability distribution of emergence time and life span. Emergence and life span are assumed to be Gaussian with parameters $\mu_E, \sigma_E, \mu_S, \sigma_S$: Emergence ~ $N(\mu_E, \sigma_E)$, Life Span ~ $N(\mu_S, \sigma_S)$
  • Slide 45
  • Moth Distribution: Use distributions to calculate p(j,k), the probability of a moth emerging in interval j and dying in interval k (timeline figure labels: $t_j$, $t_{j+1}$, $I_j$, r, s, d, $t_k$, $t_{k+1}$, $I_k$)
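The transcript does not preserve how p(j,k) was computed. The sketch below is one possible calculation under stated assumptions: emergence time and life span are independent Gaussians, death time is emergence plus life span, and the integral over emergence time is approximated with a midpoint rule. The trapping days and distribution parameters are hypothetical.

```python
from statistics import NormalDist


def interval_probability(j, k, trap_times, emergence, life_span, steps=2000):
    """Approximate P(emerge in interval j and die in interval k).

    Interval 0 is everything before the first trapping time and the last
    interval is everything after the final one.
    """
    edges = [float("-inf")] + list(trap_times) + [float("inf")]
    # Clip infinite bounds to 8 standard deviations around mean emergence.
    lo = max(edges[j], emergence.mean - 8 * emergence.stdev)
    hi = min(edges[j + 1], emergence.mean + 8 * emergence.stdev)
    if hi <= lo:
        return 0.0
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        r = lo + (i + 0.5) * width  # candidate emergence time
        # Probability that death (r + life span) falls inside interval k.
        p_death = life_span.cdf(edges[k + 1] - r) - life_span.cdf(edges[k] - r)
        total += emergence.pdf(r) * p_death * width
    return total


# Hypothetical setup: three trapping days (day of year) and Gaussian parameters.
trap_times = [140, 150, 160]
emergence = NormalDist(mu=150, sigma=10)
life_span = NormalDist(mu=7, sigma=2)

n = len(trap_times) + 1
prob_table = [
    [interval_probability(j, k, trap_times, emergence, life_span) for k in range(n)]
    for j in range(n)
]
total_mass = sum(sum(row) for row in prob_table)
```

Because a Gaussian life span can be negative, cells with k < j receive a very small but nonzero probability in this sketch. A truncated or strictly positive life-span distribution would remove that artifact.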
  • Slide 46
  • Calculating Probabilities
  • Slide 47
  • Probability Table. Emergence Interval: I0, I1, ..., IT. Row I0: P(0,0), P(0,1), ..., P(0,T); row I1: P(1,0), P(1,1), ..., P(1,T); ...; last row: P(T,1), P(T,2), ..., P(T,T)
  • Slide 48
  • Multinomial Distribution: All moths fall into one of the probability squares. Moths have a multinomial distribution. Approximate this with a multivariate Gaussian (or normal).
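For reference, the standard moment-matching form of this approximation is sketched below. The symbols $N$ (total number of moths), $p_i$ (probability of table cell $i$) and $n_i$ (count in cell $i$) are notation introduced here and are not taken from the slides.

```latex
% Cell counts follow a multinomial distribution:
%   (n_1, \dots, n_m) \sim \mathrm{Multinomial}(N; p_1, \dots, p_m)
\begin{align*}
  \mathbb{E}[n_i] &= N p_i, \\
  \operatorname{Var}(n_i) &= N p_i (1 - p_i), \\
  \operatorname{Cov}(n_i, n_j) &= -N p_i p_j \quad (i \neq j), \\
  (n_1, \dots, n_m) &\approx \mathcal{N}\!\left(N p,\; N\left(\operatorname{diag}(p) - p p^{\top}\right)\right).
\end{align*}
```

The approximation is usually considered reasonable when every expected cell count $N p_i$ is not too small.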
  • Slide 49
  • Approximation Error: What is the error associated with this approximation? approximated as m!=s(m) Error of
  • Slide 50
  • Likelihood: $\theta = \{\mu_E, \sigma_E, \mu_S, \sigma_S\}$
  • Slide 51
  • Likelihood surface (figure: Log Loss plotted over two model parameters)
  • Slide 52
  • Results: Semiothisa signaria, Trap 38B, 2005
  • Slide 53
  • Results: $R^2 = 0.23$ p