Spatial capture recapture for categorically marked … et al 2019 SCR...tity and unmarked SCR and...


  • Spatial capture–recapture for categorically marked populations with an application to genetic capture–recapture

    BEN C. AUGUSTINE,1,† J. ANDREW ROYLE,2 SEAN M. MURPHY,3 RICHARD B. CHANDLER,4 JOHN J. COX,3 AND MARCELLA J. KELLY5

    1Atkinson Center for a Sustainable Future and Department of Natural Resources, Cornell University, Ithaca, New York 14843 USA

    2Patuxent Wildlife Research Center, U.S. Geological Survey, Laurel, Maryland 20708 USA

    3Department of Forestry, University of Kentucky, Lexington, Kentucky 40546 USA

    4Department of Forestry and Natural Resources, University of Georgia, Athens, Georgia 30602 USA

    5Department of Fish and Wildlife Conservation, Virginia Tech, Blacksburg, Virginia 24061 USA

    Citation: Augustine, B. C., J. A. Royle, S. M. Murphy, R. B. Chandler, J. J. Cox, and M. J. Kelly. 2019. Spatial capture–recapture for categorically marked populations with an application to genetic capture–recapture. Ecosphere 10(4):e02627. 10.1002/ecs2.2627

    Abstract. Recently introduced unmarked spatial capture–recapture (SCR), spatial mark–resight (SMR), and 2-flank spatial partial identity models (SPIMs) extend the domain of SCR to populations or observation systems that do not always allow for individual identity to be determined with certainty. For example, some species do not have natural marks that can reliably produce individual identities from photographs, and some methods of observation produce partial identity samples as is the case with remote cameras that sometimes produce single-flank photographs. Unmarked SCR, SMR, and SPIM share the feature that they probabilistically resolve the uncertainty in individual identity using the spatial location where samples were collected. Spatial location is informative of individual identity in spatially structured populations because a sample is more likely to have been produced by an individual living near the trap where it was recorded than an individual living further away from the trap. Further, the level of information about individual identity that a spatial location contains is related to two key ecological concepts, population density and home range size, which we quantify using a proposed Identity Diversity Index (IDI). We show that latent and partial identity SCR models produce imprecise and biased density estimates in many high IDI scenarios when data are sparse. We then extend the unmarked SCR model to incorporate categorical, partially identifying covariates, which reduce the level of uncertainty in individual identity, increasing the reliability and precision of density estimates, and allowing reliable density estimation in scenarios with higher IDI values and with more sparse data. We illustrate the performance of this “categorical SPIM” via simulations and by applying it to a black bear data set using microsatellite loci as categorical covariates, where we reproduce the full data set estimates with only slightly less precision using fewer loci than necessary for confident individual identification. We then discuss how the categorical SPIM can be applied to other wildlife sampling scenarios such as remote camera surveys, where natural or researcher-applied partial marks can be observed in photographs. Finally, we discuss how the categorical SPIM can be added to SMR, 2-flank SPIM, or other latent identity SCR models.

    Key words: genetic mark–recapture; Identity Diversity Index; microsatellites; partial identity; spatial capture–recapture; spatial mark–resight; unmarked spatial capture–recapture.

    Received 28 July 2018; revised 13 December 2018; accepted 9 January 2019; final version received 31 January 2019. Corresponding Editor: Lucas N. Joppa.

    Copyright: © 2019 The Authors. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

    † E-mail: [email protected]

    ❖ www.esajournals.org 1 April 2019 ❖ Volume 10(4) ❖ Article e02627


  • INTRODUCTION

    Animal population density is a fundamental quantity in wildlife ecology, and therefore, estimating population density is a primary challenge for ecologists (Laake et al. 1993, Efford 2004). Mark–recapture and spatial capture–recapture (SCR) methods are among the most reliable methods for estimating population abundance and density; however, they generally require that individual identities of captured animals are determined with certainty (e.g., marks are recorded correctly and not lost; Otis et al. 1978). Recently, several classes of SCR models that utilize latent or partially latent individual identities have been introduced, extending the utility of SCR models to populations that are unmarked (unmarked SCR, alternatively known as a spatial counts model; Chandler and Royle 2013), populations for which only a subset of individuals are marked (spatial mark–resight [SMR]; Sollmann et al. 2013), populations for which individuals are identified only in a subset of the years in a multiyear study (spatially explicit Integrated Population Models; Chandler and Clark 2014), and populations for which some or all samples carry only partial identifications (spatial partial identity models [SPIMs]; Augustine et al. 2018). Similar to recently developed classical partial identity models (Bonner and Holmberg 2013, McClintock et al. 2013), this class of SCR models with latent individual identities shares the feature that the true capture histories for some or all observed individuals are latent and must be probabilistically reconstructed using Markov chain Monte Carlo (MCMC; or possibly marginalized out of the likelihood if computationally feasible). Unlike the classical partial identity models, however, these SCR models with latent individual identities use the spatial locations where samples were collected, together with a spatially explicit model of sample deposition, to probabilistically associate latent or partial identity samples, thereby reducing uncertainty in individual identity. This reduced uncertainty in individual identity is then propagated to a reduced uncertainty in other model parameter estimates, including population size and density. Therefore, the spatial location where a sample was collected constitutes a continuous partial identity and unmarked SCR and SMR can be considered special cases of a SPIM, along with the “2-flank” SPIM that probabilistically associates left-flank and right-flank photographs for species with bilateral identification (Augustine et al. 2018).

    The key feature of a SPIM is that the magnitude of uncertainty in individual identity and the model used to resolve it stem directly from two key aspects of population ecology: population density and home range size. In a SPIM, there will be more uncertainty about sample identity when more individuals are exposed to capture at the same traps, which happens when animals occur at higher densities, and when their home ranges are larger. Within the context of an SCR model, this is equivalent to scenarios in which there are higher densities of individual activity centers within the state space and larger detection function spatial scale parameters, for example, $\sigma$ for the common half-normal detection function model. Note, however, that the key feature that both population density and home range size determine, which is directly responsible for the magnitude of uncertainty in individual identity, is the magnitude of home range overlap, or more conceptually, the magnitude of overlap in individual utilization distributions (e.g., as quantified by Fieberg and Kochanny 2005), which increases independently by increasing either population density or home range size.

    We propose a metric to quantify the degree of home range overlap for a given population density and $\sigma$, and thus, the expected magnitude of uncertainty in individual identity. The Simpson’s Diversity Index can be applied to the spatially explicit, individual detection probabilities and averaged over many points on the landscape and realizations of the SCR process model (configurations of activity centers) to produce an Identity Diversity Index (IDI), which is conceptually similar to a metric of utilization distribution overlap for all individuals in the population. The IDI quantifies the expected diversity in individual identity of collected samples in one location, averaged over the landscape, and is described in detail in Appendix S1. Fig. 1 provides a visualization of how the magnitude of uncertainty in individual identity conceptually scales with population density and home range size as quantified by the SCR $\sigma$ parameter. This relationship between the magnitude of uncertainty in individual identity and the spatial features of animal populations is what sets a SPIM apart from the classical partial identity models (Bonner and Holmberg 2013, McClintock et al. 2013, Knapp et al. 2009), where the magnitude of uncertainty in individual identity scales with population abundance alone.
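    The verbal description of the IDI can be sketched numerically. The following is only an illustration under stated assumptions, not the definition given in Appendix S1: it assumes a square landscape, a half-normal detection kernel, and the 1 − Σp² form of Simpson's index applied to each individual's share of the expected detections at a point. The function name `identity_diversity_index` and its arguments (`side`, `n_points`, `n_reps`) are hypothetical.

```python
import numpy as np

def identity_diversity_index(density, sigma, side=15.0,
                             n_points=200, n_reps=50, seed=0):
    """Monte Carlo sketch of an identity diversity index on a square landscape.

    Assumption: Simpson's index is taken as 1 - sum(p^2), where p is each
    individual's share of the expected detections at a landscape point.
    """
    rng = np.random.default_rng(seed)
    n_ind = max(1, int(round(density * side ** 2)))
    per_rep = []
    for _ in range(n_reps):
        s = rng.uniform(0.0, side, size=(n_ind, 2))      # one configuration of activity centers
        x = rng.uniform(0.0, side, size=(n_points, 2))   # landscape points to evaluate
        d2 = ((s[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)  # n_ind x n_points
        kernel = np.exp(-d2 / (2.0 * sigma ** 2))        # half-normal; lambda_0 cancels in the shares
        total = kernel.sum(axis=0)
        ok = total > 0                                   # drop points where every rate underflows to 0
        share = kernel[:, ok] / total[ok]
        simpson = 1.0 - (share ** 2).sum(axis=0)         # diversity of identity at each point
        per_rep.append(simpson.mean())                   # average over the landscape
    return float(np.mean(per_rep))                       # average over realizations

# Compare a low-density, small-sigma setting with a high-density, large-sigma one.
low = identity_diversity_index(density=0.17, sigma=0.5)
high = identity_diversity_index(density=0.35, sigma=1.0)
print(low, high)
```

    Under these assumptions the index is 0 when a single individual accounts for all expected detections at every point and approaches 1 as many individuals contribute equally. Values from this sketch need not match those read from Fig. 1.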

    Another key feature of the currently available SPIMs is that partial identity information can reduce the uncertainty in individual identity through three mechanisms: (1) by adding deterministic identity associations, (2) by adding deterministic identity exclusions, and (3) by improving probabilistic identity associations. To our knowledge, the mechanics of these models have not previously been framed in this manner, which we believe is important for understanding how they use partial identity information to improve density estimation, and for extending the types of partial identity information used by future SPIMs. Here, we define a deterministic identity association as a connection between samples from the same individual that also implies that the samples are excluded from being connected with samples from other individuals. This is distinguished from a deterministic identity exclusion, which can only prevent certain samples from being combined together. A probabilistic identity association occurs when two samples have a positive posterior probability of belonging to the same individual, and as this probability increases, the probability they belong to another individual necessarily decreases. Probabilistic identity associations can be improved with partial identity information, effectively converging to deterministic identity associations as partial identity information increases, as we later demonstrate using microsatellite loci.

    All SPIMs use the spatial location where samples were collected to improve probabilistic identity associations. The unmarked SCR model (Chandler and Royle 2013) and the model of Chandler and Clark (2014) that relates occupancy data to a latent SCR model use spatial information alone to inform individual identity. Typical SCR, on the other hand, uses all possible deterministic identity associations. Spatial mark–resight (Sollmann et al. 2013) and the “2-flank” SPIM (Augustine et al. 2018) represent two intermediate cases that both utilize some deterministic identity associations and exclusions. Spatial mark–resight makes deterministic identity associations between the samples of marked and identifiable individuals, which simultaneously excludes them from being connected to samples from other individuals. Deterministic identity exclusions are then made between the samples of unidentifiable individuals whose mark status can be observed (e.g., an unmarked sample cannot belong to a marked individual; Royle et al. 2013). The 2-flank SPIM makes deterministic identity associations across the same flank of the same individual, enforcing exclusions with nonmatching samples from the same flank. Finally, deterministic identity exclusions arise in the 2-flank SPIM from the fact that an individual can only have one left and one right flank.

    Augustine et al. (2018) demonstrated that further deterministic identity exclusions are possible in SPIMs by using individual sex to split a data set into two population subgroups whose identities could not logically match, reducing the uncertainty in individual identity and thus in abundance and density estimates. Splitting the population into identity subgroups of increasingly smaller size is conceptually similar to applying unmarked SCR to separate populations, each with increasingly lower population densities, moving the population under study to more favorable regions of the IDI along the density axis (Fig. 1). However, rather than splitting data sets into increasingly smaller subsets, it is desirable to have a model that incorporates these categorical identity exclusions, allows for imperfect observation of the category levels, and allows parameters to be shared across identity subgroups. Further, when all categories are combined into a single analysis, the distribution of individuals across the category levels provides further information that can improve probabilistic identity associations. For example, if a population is 75% female, it is more likely that two nearby male samples came from a single individual than if the population is 75% male. Finally, the proportion of the population in each identity subgroup may be of ecological interest, such as individual sex or age class. Therefore, we are introducing a new class of SCR model, the “categorical SPIM”, which uses partially identifying categorical covariates both to add deterministic identity exclusions and to reduce the uncertainty in probabilistic identity associations.

    Fig. 1. The Identity Diversity Index, quantifying the magnitude of uncertainty in individual identity, as a function of population density (D) and the spatial capture–recapture spatial scale parameter ($\sigma$). Increasing D and/or $\sigma$ increases the magnitude of uncertainty in individual identity and, with N fixed, decreases the expected precision of N and D estimates.

    Partially identifying categorical covariates exist in many types of invasive and noninvasive wildlife sampling; for example, in studies using remote cameras, features such as sex, age class, and color morph may be observable in at least some photographs and similar categorical features can be extracted from bioacoustic sensors for some species (Reby et al. 1999, McIntosh et al. 2015). In more invasive wildlife sampling involving live capture, many more features are measurable and researchers may apply categorical marks whose combinations do not provide full identities (e.g., colored collars or ear tags). Alternatively, categorical marks may be fully identifying (Lewis et al. 2015) but imperfectly observed, or fully identifying categorical marks may be partially lost over time, similar to the problem of complete tag loss typically resolved by double tagging (Cowen and Schwarz 2006). Perhaps the most informative source of categorical identity covariates in wildlife sampling, however, comes from microsatellite genotypes.

    Microsatellite loci are not typically thought of as categorical identity covariates because studies have traditionally aimed to ensure that unique multilocus genotypes correspond to unique individuals with high probability through the use of many highly variable loci. This determination is typically made using P(ID) and/or P(sib) criteria, both of which estimate the probability that any two randomly selected individuals (or full siblings) in a population would share the same multilocus genotype, given the observed allele diversities and frequencies (Waits et al. 2001). The possibility that multiple individuals have the same multilocus genotype in a capture–recapture data set has been referred to as the “shadow effect” (Mills et al. 2000) and is considered a type of low-frequency error in assigning individual identities that introduces minimal bias into parameter estimates in capture–recapture studies if P(ID) or P(sib) criteria are strictly enforced. Using the categorical SPIM to model genotype data is especially appealing when genotypes are not variable enough to be considered unique because it does not make deterministic connections between samples with matching genotypes; thus, in the categorical SPIM we propose, multiple individuals in the population may have the same genotype, shifting the “shadow effect” from a source of bias to an additional source of uncertainty. Further, microsatellite data are ideal for investigating the performance of the categorical SPIM across the full range of uncertainty in individual identity, from unmarked SCR to SCR, due to the near perfect individual information content in the full genotype. Therefore, we will use genotypes as the main application to demonstrate how categorical identity and spatial information combine to determine the magnitude of uncertainty in individual identity and the resulting density estimates, but other types of categorical covariates could be interchanged without loss of generality.

    Here, we generalize the unmarked SCR model to develop the “categorical SPIM.” We show via simulation that in scenarios with more sparse data than previously considered and/or scenarios with larger values of $\sigma$ and higher densities, the unmarked SCR density estimator is biased and very imprecise, and the parameters are frequently not identifiable, demonstrating the importance of population density and home range size to the application of latent and partial identity SCR models. We then show that adding categorical identity covariates removes this bias and parameter non-identifiability and increases precision, allowing for reliable density estimation across a wider range of values of density and $\sigma$ for a given capture process scenario. We also demonstrate that the uncertainty in the posterior for $n^{cap}$, the latent number of individuals captured during a survey, correlates well with the uncertainty in the posterior of N, suggesting it is a good single metric to quantify the observed magnitude of uncertainty in individual identity for a given data set. Finally, we apply the categorical SPIM to a previously published black bear data set in which we demonstrate how well the proposed model can reproduce the original density estimate using fewer loci than originally genotyped. Using this data set, we demonstrate that all uncertainty in individual identity can be removed with enough categorical identity covariates, producing equivalent estimates to an SCR model where all identities are known with certainty.

    METHODS—DATA AND MODEL DESCRIPTION

    Methods: unmarked SCR foundation

    First, we introduce the version of the unmarked SCR model that we will expand to allow categorical identity covariates. The unmarked SCR model is a typical hierarchical SCR model except that information about individual identity is not retained during the observation process. Formal inference is achieved by relating the spatial pattern of observed counts or detections at each of the J traps to the latent structure of the SCR process model. For the process model, we assume the N individuals in the population have activity centers that are distributed uniformly across a two-dimensional state space $\mathcal{S}$ of arbitrary size (A) and shape, that is, $s_i \sim \text{Uniform}(\mathcal{S})$, i = 1, ..., N (see Borchers and Efford 2008, Reich and Gardner 2014, Royle et al. 2016 for alternative specifications). The activity centers are organized in the N × 2 matrix $\mathbf{S}$.

    For the observation model, we introduce the N × J fully latent capture history $Y^{true}$, recording the number of detections or counts for each individual at each trap summed across the K occasions. The locations of the J traps are stored in the J × 2 matrix X. We assume that the number of counts or detections for each individual at each trap is a decreasing function of distance between the activity centers and traps. If using a count model, we assume the latent counts are Poisson: $y^{true}_{ij} \sim \text{Pois}(K\lambda(s_i, x_j))$, where

    $$\lambda(s_i, x_j) = \lambda_0 \exp\left(-\frac{\lVert s_i - x_j \rVert^2}{2\sigma^2}\right),$$

    $x_j$ is the location of trap j, $\lambda_0$ is the expected number of counts for a trap located at distance 0 from an activity center, and $\sigma$ is the spatial scale parameter determining how quickly the expected counts decline with distance from the activity center. We also consider an alternative Bernoulli observation model for which $y^{true}_{ij} \sim \text{Bin}(p(s_i, x_j), K)$, where $p(s_i, x_j) = 1 - \exp(-\lambda(s_i, x_j))$.

    During the observation process, the true, latent capture history, $Y^{true}$, is disaggregated into the observed capture history, $Y^{obs}$, discarding information about individual identity by storing one observation per row in $Y^{obs}$ (i.e., no samples are deterministically connected to the same individual). More specifically, $Y^{obs}$ is the $n^{obs}$ × J matrix with entries 1 if sample m was recorded in trap j and 0 otherwise. Note that if we assume a Bernoulli observation model, each detection event will constitute a single observation, while if we assume a Poisson observation model, counts are disaggregated into observations of single counts, because counts from the same individuals cannot be deterministically connected without certain and unique identities. To visualize this, below is an example of a true and a disaggregated observed data set where N = 2 and J = 3:

    $$Y^{true} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix}, \qquad Y^{obs} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
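    The Poisson observation model and the disaggregation step can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `simulate_counts` and `disaggregate`, the 3 × 3 trap grid, and the parameter values are assumptions made for the example.

```python
import numpy as np

def simulate_counts(s, traps, lam0, sigma, K, rng):
    """Draw latent N x J counts from the half-normal Poisson model, summed over K occasions."""
    d2 = ((s[:, None, :] - traps[None, :, :]) ** 2).sum(axis=2)  # squared distances, N x J
    lam = lam0 * np.exp(-d2 / (2.0 * sigma ** 2))                # expected counts per occasion
    return rng.poisson(K * lam)

def disaggregate(y_true):
    """Turn an N x J count matrix into an n_obs x J matrix with one sample per row."""
    y_true = np.asarray(y_true, dtype=int)
    ind, trap = np.nonzero(y_true)                 # nonzero cells in row-major order
    trap_of_sample = np.repeat(trap, y_true[ind, trap])  # one entry per single count
    y_obs = np.zeros((trap_of_sample.size, y_true.shape[1]), dtype=int)
    y_obs[np.arange(trap_of_sample.size), trap_of_sample] = 1
    return y_obs                                   # identity (the row index of y_true) is discarded

# The worked example from the text: N = 2 individuals, J = 3 traps.
y_obs = disaggregate(np.array([[2, 0, 0], [0, 1, 1]]))

# A small simulated data set on a hypothetical 3 x 3 trap grid.
rng = np.random.default_rng(0)
traps = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
s = rng.uniform(-1.0, 3.0, size=(5, 2))            # uniform activity centers in a square state space
counts = simulate_counts(s, traps, lam0=0.25, sigma=0.5, K=5, rng=rng)
sim_obs = disaggregate(counts)
```

    Each row of the disaggregated matrix holds exactly one 1, so the number of rows equals the total count in the latent matrix.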

    Methods: categorical SPIM

    We propose a class-structured version of the unmarked SCR model in which class membership is determined by each individual’s full categorical identity (e.g., full genotype). Here, we define a full categorical identity to be an individual’s set of true values for $n^{cat}$ categorical covariates, where $n^{cat}$ is the maximum number of categorical covariates considered, and multiple individuals in the population can share the same full categorical identity. We will modify the unmarked SCR model such that single or multiple categorical covariates, potentially partially or even fully unobserved, are recorded with each observed sample at each trap. Further, continuous covariates could be discretized into categories if it is safe to assume there is no measurement error. The linked density and categorical covariate models (joint process model) are fully latent and we use the $n^{obs}$ trap-referenced, observed categorical covariates to make inference about this latent structure. The observed data then consist of two linked data structures: $Y^{obs}$, an $n^{obs}$ × J capture history indicating the trap at which each sample was recorded, and $G^{obs}$, an $n^{obs}$ × $n^{cat}$ identity history recording the observed categorical covariate(s) of each sample with category level enumerated sequentially as described below, or recorded as a 0 if not observed.

    For the joint process model, we assume that each individual has a full categorical identity associated with its activity center. Following Wright et al. (2009), we assume that all possible category levels for each categorical covariate are known, with the number of categories for each covariate l being $n^{levels}_l$, l = 1, ..., $n^{cat}$. Next, we introduce the population category level probabilities for covariate l, $\gamma_l$, of length $n^{levels}_l$ and corresponding to the enumerated category levels (1, ..., $n^{levels}_l$) for covariate l. Then, we introduce the N × $n^{cat}$ matrix $G^{true}$, where $g^{true}_i$ is the full categorical identity of the individual with activity center $s_i$. Finally, we assume the categorical identity of each individual for each covariate is distributed following the covariate-specific category level probabilities according to $g^{true}_{il} \sim \text{Categorical}(\gamma_l)$, implying that category levels are independent across covariates (e.g., linkage equilibrium in the genetic context) and individuals. Using the example true and observed capture histories above, potential true and observed structures for the categorical identities assuming 3 categorical identity covariates with 4 levels each are as follows:

    $$G^{true} = \begin{bmatrix} 1 & 4 & 3 \\ 4 & 2 & 3 \end{bmatrix}, \qquad G^{obs} = \begin{bmatrix} 1 & 4 & 3 \\ 1 & 4 & 3 \\ 4 & 0 & 3 \\ 0 & 0 & 3 \end{bmatrix}$$
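    One way to read these matrices is through a pairwise compatibility check, sketched below. The sketch assumes only what the text states, namely that 0 codes an unobserved covariate; the helper name `compatible` is hypothetical and is not part of the authors' MCMC algorithm.

```python
import numpy as np

def compatible(g_a, g_b):
    """True if two observed categorical identities could belong to one individual.

    A covariate coded 0 is unobserved and therefore cannot rule out a match.
    """
    g_a, g_b = np.asarray(g_a), np.asarray(g_b)
    both_observed = (g_a != 0) & (g_b != 0)
    return bool(np.all(g_a[both_observed] == g_b[both_observed]))

# The observed identity history from the example.
g_obs = np.array([[1, 4, 3],
                  [1, 4, 3],
                  [4, 0, 3],
                  [0, 0, 3]])
n = len(g_obs)
can_match = np.array([[compatible(g_obs[a], g_obs[b]) for b in range(n)]
                      for a in range(n)])
print(can_match.astype(int))
```

    Pairwise compatibility is not transitive: the fourth sample is compatible with both the first and the third, yet the first and third conflict on the first covariate. A full allocation would therefore need to check each sample against the identity implied by all samples already assigned to an individual.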

    In this case, the first two observed samples could possibly have come from the same individual as could the third and fourth sample; however, the third sample could not have come from the same individual as the first two samples. The fourth sample, with two unobserved categories, could possibly belong to the same individual as any of the first three samples, although only the third sample is a correct match. In this example, potential values of $\gamma_l$ for any of the 3 categorical covariates could be (0.3, 0.1, 0.2, 0.4), corresponding to category levels (1, 2, 3, 4), with the only requirement on these category level probabilities being that they sum to 1 for each identity covariate. Also, note that each covariate may have a different number of category levels.

    The observation process is the same as unmarked SCR except that a categorical identity, potentially partially or fully latent, is associated with each trap-referenced observation. The missing data process could be a simple binomial model for identification success, perhaps with covariate-specific identification probabilities; however, if we assume that covariate observation probabilities do not vary by individual or by category level, the likelihood for the missing data process does not change when updating latent identities or latent categorical covariate values and can be ignored in the MCMC algorithm. Therefore, we make the assumption that covariate identification probabilities do not vary by individual or category level, but this assumption would be easy to relax.

    The unmarked SCR model and categorical SPIM use a process similar to data augmentation to estimate population abundance and density (Royle et al. 2007) and to model the uncertainty in individual identity by providing latent structure to allow for different configurations of the observed samples across the individuals in the population (Chandler and Royle 2013, Augustine et al. 2018). Unlike typical data augmentation, the number of captured individuals, $n^{cap}$, and their capture histories are unknown in unmarked SCR, so rather than augmenting the observed capture history, we augment a possibly true capture history constructed by assigning individual identities to the observed samples in $Y^{obs}$ based on the spatial proximity of samples and the compatibility of their observed categorical identities.

    We then augment $G^{true}$ to size M × $n^{cat}$, which is initialized using the minimally implied categorical identity of the samples each individual is initialized with, and the remaining elements of $G^{true}$ that are not determined by these samples are simulated from the category level probabilities. Similar to typical data augmentation, M is chosen by the analyst to be much larger than N and we use z, a latent indicator vector of length M, to indicate which individuals are in the population; however, unlike typical data augmentation, this vector is fully, rather than partially, latent. We assume $z_i \sim \text{Bernoulli}(\phi)$, inducing the relationship $N \sim \text{Binomial}(M, \phi)$. Then, population abundance is $N = \sum_{i=1}^{M} z_i$ and population density, D, is N/A. See Appendix S2 for a full description of the MCMC algorithm.
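    The initialization of the augmented identity matrix and the derived quantities N and D can be sketched as below. This is an illustration under assumptions, not the algorithm of Appendix S2: the function `init_g_true`, the initial allocation `sample_id`, and the values of M, A, and $\phi$ are hypothetical, and forcing z to 1 for individuals holding samples is an assumption of this sketch.

```python
import numpy as np

def init_g_true(g_obs, sample_id, M, gammas, rng):
    """Build an M x n_cat identity matrix from allocated samples, then fill the gaps."""
    g_true = np.zeros((M, g_obs.shape[1]), dtype=int)
    for g, i in zip(g_obs, sample_id):
        seen = g != 0
        # An observed value must not contradict what this individual already carries.
        clash = seen & (g_true[i] != 0) & (g_true[i] != g)
        if clash.any():
            raise ValueError(f"samples allocated to individual {i} conflict")
        g_true[i, seen] = g[seen]                  # minimally implied identity
    for l, gamma in enumerate(gammas):
        missing = g_true[:, l] == 0
        levels = np.arange(1, len(gamma) + 1)
        # Undetermined elements are simulated from the category level probabilities.
        g_true[missing, l] = rng.choice(levels, size=missing.sum(), p=gamma)
    return g_true

rng = np.random.default_rng(1)
M, A, phi = 10, 225.0, 0.5                         # hypothetical values for illustration
gammas = [np.array([0.3, 0.1, 0.2, 0.4])] * 3      # category level probabilities from the example
g_obs = np.array([[1, 4, 3], [1, 4, 3], [4, 0, 3], [0, 0, 3]])
sample_id = np.array([0, 0, 1, 1])                 # one possible initial allocation of samples
g_true = init_g_true(g_obs, sample_id, M, gammas, rng)

z = rng.binomial(1, phi, size=M)                   # z_i ~ Bernoulli(phi)
z[np.unique(sample_id)] = 1                        # assumption: individuals with samples are in the population
N = int(z.sum())                                   # abundance
D = N / A                                          # density
```

    With this allocation, the first augmented individual carries identity (1, 4, 3) and the second carries (4, ?, 3), where the unobserved second covariate is drawn from its category level probabilities.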

    Note that $n^{cap}$, the total number of individuals captured, typically denoted by n and an observed statistic in capture–recapture models, is a derived, random variable in unmarked SCR with a posterior distribution that quantifies the magnitude of uncertainty in individual identity. Specifically, the posterior sample for $n^{cap}$ on each MCMC iteration is calculated following $n^{cap} = \sum_{i=1}^{M} I\left(\sum_{j=1}^{J} Y^{true}_{ij} > 0\right)$, where $I(\cdot)$ is the indicator function; this is simply the number of individuals currently allocated at least one sample. As more categorical identity information is added, the posterior distribution of $n^{cap}$ should converge to the single, true value. Finally, we introduce the derived vector, ID, of size $n^{obs}$, that records the latent individual, 1, ..., M, each sample is assigned to. This vector is updated on each MCMC iteration, producing a posterior for true identity for each sample, which can be post-processed to obtain pairwise posterior probabilities that any two samples originated from the same individual. The posterior distribution of the true covariate values of samples with missing values can also be recorded.
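    Both derived quantities can be computed with a few lines of post-processing. The sketch below assumes the ID vector has been saved as an array with one row per MCMC iteration; the function names and the toy draws are hypothetical and do not come from the authors' software.

```python
import numpy as np

def n_cap(y_true_aug):
    """Number of augmented individuals currently allocated at least one sample."""
    return int((np.asarray(y_true_aug).sum(axis=1) > 0).sum())

def pairwise_same_individual(id_draws):
    """Posterior probability that each pair of samples shares an individual.

    id_draws is an n_iter x n_obs array of latent individual labels. Labels are
    compared within an iteration only, so relabeling between iterations is harmless.
    """
    id_draws = np.asarray(id_draws)
    same = id_draws[:, :, None] == id_draws[:, None, :]   # n_iter x n_obs x n_obs
    return same.mean(axis=0)

# n_cap for one iteration: M = 3 augmented individuals, the third holds no samples.
n_cap_now = n_cap([[2, 0, 0], [0, 1, 1], [0, 0, 0]])

# Four toy MCMC draws of the ID vector for four samples.
id_draws = np.array([[1, 1, 2, 2],
                     [1, 1, 2, 3],
                     [1, 1, 2, 2],
                     [1, 1, 2, 1]])
p_same = pairwise_same_individual(id_draws)
n_cap_draws = [len(np.unique(row)) for row in id_draws]   # equivalent n_cap per iteration
```

    In the toy draws, the first two samples are always assigned together, while the third and fourth share an individual in half of the iterations.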

    SIMULATIONS

    We conducted two simulation studies (A and B) to explore the performance of the categorical SPIM.

    Simulation A specifications

    First, we conducted a simulation study to demonstrate the utility of introducing an increasing number of categorical identity covariates for improving density estimation over the baseline case of unmarked SCR. Further, we sought to demonstrate that the effectiveness of adding categorical identity covariates for improving density estimation depends on the two axes of the Identity Diversity Index (IDI): population density and σ. Here, the number of identity categories is defined to be the total number of unique categories implied by the ncat covariates with nlevels_l levels each. We start with sampling scenarios more challenging for unmarked SCR than considered by Chandler and Royle (2013) by considering scenarios with more sparse detection data achieved by using a smaller trapping array, a lower λ0, and scenarios with higher D given N. Specifically, our trapping array consisted of 81 traps in a 9 × 9 grid with unit spacing, buffered by 3 units to define the state space of 225 square units, and we chose densities D ∈ {0.17, 0.35}, corresponding to N ∈ {39, 78}. We considered that populations were sampled for K = 5 occasions for all scenarios.

    We conducted simulations across 4 scenarios with a 2 × 2 factorial design using low and high values of σ and D. Scenarios A1 and A2 were the low σ scenarios with σ = 0.5, and scenarios A3 and A4 doubled σ to 1. To account for compensation in the detection function parameters (Efford and Mowat 2014) and maintain similar levels of data sparsity with the larger σ, we lowered λ0 from 0.25 to 0.061 to approximately match the expected number of captures for each individual to that of the scenarios with σ = 0.5 (E[caps] ≈ 1.65, achieved by trial and error). On this unit spacing grid, with σ = 0.5, the majority of an individual's captures fell within a 4-trap area, whereas with σ = 1 the majority of an individual's captures fell within a 16-trap area. Scenarios A1 and A3 were the high abundance scenarios with N = 78, and scenarios A2 and A4 were the low abundance scenarios with N = 39. The approximate Identity Diversity Indices (interpolated from Fig. 1) for scenarios A1-A4 were 0.38, 0.23, 0.76, and 0.58. Within each scenario, we explored 9 subscenarios with differing numbers of identity categories, with 1 identity category corresponding to unmarked SCR. Following the unmarked SCR subscenario, we sequentially added identity covariates with 2 category levels each, leading to the number of unique identity categories increasing exponentially with base 2 (2, 4, 8, 16, 32, 64, 128, 256). As noted previously, these categorical identities are not unique; they may be represented multiple times by different individuals in a population, but with decreasing frequency as the number of unique identity categories increases.

    Further, scenario A2 was modified to better disentangle the effects of increasing D from increasing N. Increasing D by increasing N simultaneously increases uncertainty in individual identity and reduces data sparsity, which have opposite effects on estimator precision. Therefore, to better explore the effect of increasing D on the uncertainty in individual identity, D must be increased by constraining a fixed N into a smaller state space. The state space can be reduced in two ways: the number of traps can be reduced, keeping the same state space buffer, or the number of traps can be fixed while reducing the state space buffer. In the first case, data sparsity is increased since a lower proportion of individuals will be located on the interior of the trapping array and we suspect the absolute number of traps is important for unmarked SCR density estimation. In the second case, data sparsity is decreased because a larger proportion of individuals live on the interior of the trapping array. In order to retain the same number of traps, we chose to increase the density of scenario A2 by reducing the state space buffer from 3 to 1 units, thus constraining the N individuals into a smaller state space area (Scenario A2b). The reduction in the state space area increased D from 0.17 to 0.32 and raised the approximate IDI value from 0.23 to 0.37. The factorial layout for simulation scenario A can be found in Table 1.
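    The arithmetic behind the Simulation A design can be checked with a short sketch. It uses only numbers stated in the text (abundances of 39 and 78, a state space of 225 square units, and up to eight 2-level covariates); the state space area of scenario A2b is not stated, so the last line only back-calculates an approximate area from the reported density of 0.32.

```python
# Densities implied by the stated abundances and the 225-unit state space
area = 225.0
densities = {N: round(N / area, 2) for N in (39, 78)}
print(densities)

# Number of unique identity categories implied by k two-level covariates;
# k = 0 corresponds to unmarked SCR (a single identity category)
n_categories = [2 ** k for k in range(0, 9)]
print(n_categories)

# Approximate state space area implied by the reported A2b density
# (a back-calculation from rounded values, not a value given in the text)
implied_area_a2b = 39 / 0.32
print(round(implied_area_a2b, 1))
```

    The nine entries of `n_categories` correspond to the 9 subscenarios explored within each scenario.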

    Simulation B specifications

    We conducted a second simulation study to demonstrate that the categorical SPIM can accommodate partially observed categorical identities (missing identity covariate values) and provide a proof of concept for using partial genotypes that are the result of failed DNA amplification, rather than as part of the study design as would be the case in the first set of simulations if identity categories were genotype loci. We used the parameter values from scenario A3 above, but introduced imperfect detection to the observed genotypes. We simulated data sets with 7 categorical identity covariates, each with 5 equally common category levels, and the category value for each categorical covariate was then observed with probability 0.5, leading to the average categorical identity being observed at 3.5 of the categorical identity covariates. With 5 equally probable levels per identity covariate, these identity covariates are more informative than the 2-level covariates used in simulation scenario A and closer in information content to a multilocus genotype. While each genotype locus usually has more than 5 levels, they are usually not equally distributed, so limiting the number of equal probability levels per identity covariate to 5 likely represents a fairer comparison to genotypes in practice. We fit the categorical SPIM to these data sets, assuming all partial categorical identities were usable (Scenario B1) or 75% of the partial categorical identities were usable, as might be the case when using partial genotypes if a subset was deemed to be unreliable due to the likelihood of containing genotyping errors (Scenario B2). We then fit the null SCR model to the perfectly observed data for comparison (Scenario B3).
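    The partial observation process described above can be sketched as follows. The number of samples, the random seed, and the use of 0 to code a missing value are assumptions for illustration, and each row is treated as one sample's true identity without modeling that samples from the same individual share an identity.

```python
import numpy as np

rng = np.random.default_rng(7)

n_samples = 100   # number of samples (hypothetical)
n_cov = 7         # categorical identity covariates
n_levels = 5      # equally common levels per covariate
p_obs = 0.5       # probability that each covariate value is observed
MISSING = 0       # code used here for an unobserved value (an assumption)

# True categorical identities: levels 1..5 drawn with equal probability
G_true = rng.integers(1, n_levels + 1, size=(n_samples, n_cov))

# Each covariate value is observed independently with probability p_obs
observed = rng.random((n_samples, n_cov)) < p_obs
G_obs = np.where(observed, G_true, MISSING)

expected_observed = n_cov * p_obs             # 3.5 covariates on average
mean_observed = observed.sum(axis=1).mean()   # simulated average
print(expected_observed, mean_observed)
```

    The expected number of observed covariates per sample is 7 × 0.5 = 3.5, matching the figure given in the text.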

    Table 1. Factorial design for the simulation scenarios A for two levels of abundance (N) and detection function spatial scale parameter (σ).

    N     σ = 0.5                         σ = 1.0
    78    A1: D = 0.35                    A3: D = 0.35
    39    A2: D = 0.17; A2b: D = 0.32     A4: D = 0.17

    Note: Subscenarios A2 and A2b vary individual density (D) for a fixed abundance by varying the state space extent.

    Simulation MCMC specifications

    For simulation scenario A, we simulated and fit our model to 144 data sets within each subscenario, and for simulation scenario B, we simulated and fit our model to 128 data sets (differing due to cluster computing availability). Within each subscenario of simulation scenario A, we ran 3 chains for 100,000 iterations for the unmarked SCR estimator. We calculated the Gelman-Rubin statistic, Rc (Gelman and Rubin 1992), and only computed inferential quantities (listed below) for sets of chains that indicated convergence, which we define as those with Rc < 1.1 for parameter N. If fewer than 95% of the sets of chains indicated convergence, we ran 3 chains for 100,000 iterations for the next scenario with a categorical identity covariate (or additional covariate), until >95% indicated convergence. After that, we reverted to a single chain to save computation time for the remaining subscenarios. For simulation scenario B, there were no convergence problems, so we ran a single chain for 60,000 iterations. We calculated point estimates using the posterior mode and interval estimates using the highest posterior density (HPD) interval. We were interested in the frequentist bias and coverage of the categorical SPIM estimator, the accuracy of the estimator depicted visually by the variance and right skew of the sampling distribution and quantitatively by the mean squared error and coefficient of variation (CV: 100 × posterior sd/posterior mode), and the precision, quantified by the mean 95% HPD interval width. Also of interest was the use of the precision of ncap, also quantified by the mean 95% HPD interval width, as a metric of uncertainty in the individual identity of observed samples that can predict the uncertainty in N.
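    The convergence screen described above can be sketched with the basic form of the Gelman-Rubin statistic. This is an assumption about the general technique; the exact variant and implementation used for the simulations may differ (for example, in applying a degrees-of-freedom correction), and the chains below are synthetic.

```python
import numpy as np

def gelman_rubin(chains):
    """Basic potential scale reduction factor for one parameter.

    chains: (m, n) array with m chains of n post-burn-in draws each.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)    # between-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return float(np.sqrt(var_hat / W))

rng = np.random.default_rng(0)

# Three synthetic chains for N that sample the same distribution
mixed = rng.normal(50, 5, size=(3, 2000))

# Three synthetic chains stuck in different regions of parameter space
stuck = mixed + np.array([[0.0], [20.0], [40.0]])

for name, chains in (("mixed", mixed), ("stuck", stuck)):
    r = gelman_rubin(chains)
    print(name, round(r, 2), "converged" if r < 1.1 else "not converged")
```

    Under the 1.1 threshold used in the text, the well-mixed chains would be retained and the chains stuck at different values would be discarded.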

    Simulation A results

    No unmarked SCR scenarios led to reliable convergence, with the proportion of simulated data sets for which the Gelman-Rubin statistic, Rc, indicated convergence ranging from 0.52 to 0.90 (Appendix S3: Table S1). The percentage of simulated data sets for which Rc indicated convergence increased as more categorical identity covariates were added, with >95% convergence achieved in all scenarios with 2 categorical identity covariates. Inspection of the MCMC chains (not shown) indicated that the main cause of lack of convergence was data realizations for which σ was not identifiable. The unmarked SCR abundance estimator was right-skewed with high variance (Figs. 2 and 3), except in Scenario A2 where both σ was small and abundance was low. The unmarked SCR estimator had a large mean 95% credible interval (CI) width relative to abundance in all scenarios. The unmarked SCR abundance estimates were generally positively biased (Appendix S3: Table S1), although the positive bias in Scenario A4 was only 5%. The only scenario where the unmarked SCR estimator was negatively biased was A3; however, because the A3 scenario with 1 categorical identity covariate was positively biased, we attribute the observed negative bias in the A3 unmarked SCR scenario to the 44% of simulations that were discarded due to lack of convergence. Data sets that did not lead to convergence likely disproportionately correspond to those producing N estimates in the upper half of the sampling distribution because they disproportionately had more capture events (data not shown).

    Fig. 2. Histograms of the posterior modes and mean 95% credible interval widths of abundance (N) estimates plotted against the number of identity categories in scenarios A1, A2, and A2b with baseline detection rate λ0 = 0.25 and detection function spatial scale parameter σ = 0.5. Population density, D, varies across scenarios, and A2b is the scenario that increases D while N remains fixed. Note the number of identity categories increases exponentially as 2-level identity covariates are added sequentially. Also, note that the histograms depict the median posterior mode, while the mean posterior mode is used to calculate the expected values and bias in Appendix S3: Table S1. The red line depicts the simulated value of N.

    Fig. 3. Histograms of the posterior modes and mean 95% credible interval width of abundance (N) estimates plotted against the number of identity categories in scenarios A3 and A4 with baseline detection rate λ0 = 0.061 and detection function spatial scale parameter σ = 1. Population density, D, varies across scenarios. Note the number of identity categories increases exponentially as 2-level identity covariates are added sequentially. Also, note that the histograms depict the median posterior mode, while the mean posterior mode is used to calculate the expected values and bias in Appendix S3: Table S1. The red line depicts the simulated value of N.

    Adding and increasing the number of identity categories (from 2 to 256 using up to eight 2-level covariates) reduced bias and increased precision in all scenarios, but with diminishing returns as more identity categories were added. The reduction in mean 95% CI width for ncap by the introduction of identity categories was closely related to the reduction in mean 95% CI width for abundance; however, the relationship was not linear and varied by scenario (Fig. 4). More identity categories were required to reach maximum precision when abundance was higher and σ larger. The largest improvement in precision and abundance with the addition of identity categories was seen in the low abundance, high density, low σ scenario (A2b), where the majority of uncertainty in abundance was removed with the addition of one 2-level categorical covariate. Note that the precision of estimates in scenario A2b converged to a lower value than scenario A2 because the same N = 39 individuals were constrained to be located within 1 unit of the trapping array, rather than 3 units, decreasing data sparsity. Positive bias in N̂ was

    complete genotypes. In Scenario B1, where all partial categorical identities were used, the interval estimate was 95% as precise as the complete data analysis, and in Scenario B2, where 25% of the partial categorical identities were unusable, the interval estimate was 80% as precise as the complete data analysis. In both scenarios, an average of 3.5 categorical covariates provided enough information that the uncertainty in ncap was small (mean credible interval widths of 1.8 and 2.4 relative to mean ncap of 18.5 and 17.0 in B1 and B2, respectively).

    APPLICATION: CENTRAL APPALACHIAN BLACK BEARS

    We applied the categorical SPIM to a hair trapping data set that used 7 microsatellite loci for individual identification of American black bears (Ursus americanus) in the Central Appalachians, USA. This data set comes from a study conducted along the Kentucky-Virginia, USA, border across 2 study areas during 2012 and 2013 to estimate the population density and abundance of a recently reintroduced population that was in the process of recolonizing vacant range (Murphy et al. 2016). We chose to use the data set from the larger study area in 2013 because our model should perform better on the larger trapping array and more samples were collected in 2013 than in 2012 at this site. The specifics of the data collection methods are described by Murphy et al. (2016); of particular relevance is that eighty-one hair traps were deployed across the 215-km² study area with an average trap spacing of 1.6 km, and all traps were checked weekly for 8 consecutive weeks, with a week constituting a capture occasion. Similar to most bear hair trapping studies, hair samples were subsampled for genotyping because of the prohibitive costs of genotyping thousands of samples, such that at most 1 hair sample per trap per occasion produced an individual identity. The capture and subsampling processes resulted in 95 samples from 45 females and 87 samples from 37 males, determined using the P(sib) criterion. The spatial distribution of traps and individually identified hair sample observations are depicted in Fig. 5. The microsatellites used were G10H, G10L, G10M, MU23, G10J, G10B, and G10P, which had 19, 22, 19, 17, 12, 15, and 10 genotypes, respectively, for females and 21, 18, 18, 22, 14, 13, and 10 for males. Despite the large number of genotypes at each locus, the majority of individuals shared just 2–4 genotypes at each locus, making them less informative than if the locus-specific genotypes were equally distributed as they were in our simulation studies.

    Fig. 5. Mean male and female capture locations with spatial recaptures for the Kentucky black bear data set. Male 28 is highlighted in yellow because it had a large spatial recapture and was not detected at several traps in between the two traps where it was detected, requiring more loci for the categorical spatial partial identity models to link these two samples together.

    The goal of this analysis was to fit the categorical SPIM using from 1 to 7 loci, added in the order listed above, and to compare the estimates to the null SCR estimate that does not allow for any uncertainty in individual identity. Further, we also consider a scenario adding partial genotype samples (2 for females, 4 for males) into the analysis that were originally discarded. For all genotype scenarios, the trapping array was buffered in the X and Y dimension by 3 km for females and 6 km for males, leading to state space sizes of 1042.5 km² and 1473.4 km² for females and males, respectively. For each sex-specific, 1-7 loci data set, we ran 32 Markov chains for 250,000 iterations each, thinned by 50, and discarded the first 25,000 iterations as burn-in, leaving 1.4 million samples from the posterior. This large number of posterior samples was likely far more than necessary; however, it allowed us to explore the behavior of the MCMC chains as the last uncertainty in ncap was removed by adding genotype information (see Appendix S4). Because the hair sample subsampling process allowed for at most 1 sample per individual/trap/occasion, we used a Bernoulli observation model. The metrics of comparison were the point estimates (posterior modes), posterior standard deviations, and coefficients of variation (100 × posterior sd/posterior mode) for abundance as well as the posterior distributions of ncap. Note, however, that the analyses adding the partial genotypes should not be expected to reproduce the null SCR point estimates, standard deviations, and number of individuals captured because they included additional data not used by the null SCR estimate.
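    The summary metrics named above (posterior mode, HPD interval, and coefficient of variation) can be computed from posterior draws as sketched below. This is one common way to compute them, stated as an assumption rather than the implementation used for the analyses; the draws are synthetic and the function names are illustrative.

```python
import numpy as np

def posterior_mode_discrete(samples):
    """Most frequent value among draws of an integer-valued parameter."""
    values, counts = np.unique(samples, return_counts=True)
    return values[np.argmax(counts)]

def hpd_interval(samples, prob=0.95):
    """Shortest interval containing at least `prob` of the draws."""
    x = np.sort(np.asarray(samples))
    n = len(x)
    # Number of draws the interval must contain; the small offset guards
    # against floating-point error in prob * n before rounding up
    k = int(np.ceil(prob * n - 1e-9))
    widths = x[k - 1:] - x[: n - k + 1]   # width of every window of k draws
    i = int(np.argmin(widths))
    return x[i], x[i + k - 1]

def cv_percent(samples):
    """Coefficient of variation: 100 x posterior sd / posterior mode."""
    return 100.0 * np.std(samples, ddof=1) / posterior_mode_discrete(samples)

# Synthetic draws standing in for a posterior sample of abundance N
rng = np.random.default_rng(3)
draws = rng.poisson(180, size=5000)

mode = posterior_mode_discrete(draws)
lo, hi = hpd_interval(draws)
print(mode, (lo, hi), round(cv_percent(draws), 1))
```

    For a right-skewed posterior, the HPD interval is shorter than an equal-tailed interval with the same probability content, which is one reason to prefer it for abundance.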

    Application: results

    The categorical SPIM estimates (Table 3) for both sexes generally demonstrated the same patterns seen in the simulations. Abundance estimates with few identity categories were positively biased (relative to the SCR estimate) because the estimates of σ were negatively biased and/or the estimates of ncap were positively biased. The magnitude of bias was larger for males, either partially or fully the result of a larger σ, estimated to be 2 times larger than that of females. As more loci were added, the categorical SPIM abundance estimates and their posterior standard deviations converged toward those of the SCR model, the posterior modes of ncap converged toward the true number captured, and the posterior variance of ncap converged toward 0 (Fig. 6). The coefficient of variation for all categorical SPIM estimates was lower than 20%, except for the 1 locus male estimate.

    The 1 locus female estimate was only 20% less precise than the full SCR estimate, as judged by the CV, with a positive bias (relative to the complete data estimate) of 8.6%. The 3 loci female estimate was substantially better: 7.6% less precise than the full SCR estimate and positively biased by only 2.9%. The 4 loci female estimate was effectively equivalent to the full SCR estimate, with 3% less precision and minimal bias. The 5–7 loci female estimates were negligibly improved. Adding 2 partial genotype samples reduced the posterior standard deviation by 2.7% and coefficient of variation by 1.9%. One partial genotype sample was consistent with two different individuals in the full genotype data set, matching one with posterior probability of 0.452 and the other with posterior probability 0.544, leaving just a 0.004 posterior probability that this sample was from a new individual. The higher posterior match probability of 0.544 corresponded to the complete genotype individual that was captured at the same trap as the partial genotype sample. The other partial genotype matched an individual with 2 captures in the full genotype data set with posterior probability 0.997, leaving a probability of 0.003 that this was a new individual, and adding a high probability spatial recapture for this individual. These genotypes can be found in Table 4.
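    The deterministic part of these matches can be sketched with the genotypes listed in Table 4: two samples are compatible if they agree at every locus observed in both. The function name and the use of None for a missing locus are assumptions for illustration. The posterior match probabilities quoted above come from the full model, which also uses spatial information, and cannot be reproduced by this check alone.

```python
MISSING = None  # code used here for a locus that failed to amplify

def compatible(g1, g2):
    """True if two genotypes agree at every locus observed in both."""
    return all(a is MISSING or b is MISSING or a == b
               for a, b in zip(g1, g2))

# Locus-level genotype codes for selected samples from Table 4
samples = {
    1: (6, 7, 6, 7, 4, 7, 7),
    3: (9, 12, 1, 7, 7, 2, 7),
    5: (None, None, None, 7, None, None, None),  # partial genotype
    6: (7, 8, 1, 1, 2, 8, 2),
    8: (None, None, 1, 1, 2, None, None),        # partial genotype
}

for partial in (5, 8):
    matches = [s for s in (1, 3, 6) if compatible(samples[partial], samples[s])]
    print(partial, matches)
```

    Sample 5 is compatible with both full genotypes that carry code 7 at the fourth locus, which is why its identity remains uncertain, whereas sample 8 is compatible with only one full genotype.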

    Table 3. Sex-specific estimates of the detection function baseline encounter rate, λ0, the detection function spatial scale parameter, σ, and abundance from regular spatial capture–recapture (SCR, Full), categorical spatial partial identity models (SPIM) using 1–7 loci (1L–7L), and categorical SPIM with 7 loci, adding partial identity samples not included in the SCR estimate (7L+).

    Genotype data   λ̂0      σ̂      N̂       n̂cap    N̂ SE    N̂ CV

    Females
    Full            0.156   0.92   182.9   45      28.7    15.7
    1L              0.204   0.79   198.7   44.66   37.4    18.8
    2L              0.175   0.82   210.0   46.53   36.9    17.6
    3L              0.161   0.90   188.4   45.06   31.8    16.9
    4L              0.157   0.92   183.4   45.00   29.6    16.1
    5L              0.156   0.92   183.0   45.00   28.8    15.8
    6L              0.156   0.93   182.5   45.00   28.9    15.8
    7L              0.157   0.93   182.2   45.00   28.9    15.9
    7L+             0.163   0.92   180.2   45.00   28.1    15.6

    Males
    Full            0.066   2.04   117.5   37.00   19.0    16.2
    1L              0.113   0.74   436.9   52.55   165.0   37.8
    2L              0.081   1.58   159.6   40.38   28.8    18.0
    3L              0.079   1.73   137.7   38.13   23.2    16.9
    4L              0.073   1.85   127.2   37.89   22.3    17.5
    5L              0.067   2.02   118.8   37.00   19.9    16.7
    6L              0.066   2.04   118.1   37.00   19.3    16.3
    7L              0.066   2.04   117.8   37.00   19.1    16.2
    7L+             0.064   2.03   130.0   40.00   20.7    15.9

    Notes: Point estimates are the posterior mode, uncertainty is quantified by the posterior standard deviation (N̂ SE) and the coefficient of variation (N̂ CV). For the categorical SPIMs, n̂cap is the estimated number of captured individuals. Female estimates are listed first, followed by males.
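    The relative bias and relative precision figures quoted for the female estimates can be recovered, approximately, from the rounded Table 3 values. The helper name below is illustrative, and percentages computed from rounded table entries may differ slightly from those computed from unrounded estimates.

```python
def percent_change(value, reference):
    """Percentage difference of `value` relative to `reference`."""
    return 100.0 * (value - reference) / reference

# Female rows of Table 3: abundance point estimate (N) and CV
full = {"N": 182.9, "cv": 15.7}
one_locus = {"N": 198.7, "cv": 18.8}
three_loci = {"N": 188.4, "cv": 16.9}

bias_1L = percent_change(one_locus["N"], full["N"])    # bias relative to full SCR
cv_1L = percent_change(one_locus["cv"], full["cv"])    # loss of precision, 1 locus
cv_3L = percent_change(three_loci["cv"], full["cv"])   # loss of precision, 3 loci

print(round(bias_1L, 1), round(cv_1L, 1), round(cv_3L, 1))
```

    Here bias is measured relative to the full SCR estimate rather than to a known true abundance, following the usage in the text.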


    The 1 locus male estimate was too biased (relative to the full SCR analysis) and imprecise to be of use, whereas the 2 and 3 loci male estimates had reasonable precision but perhaps too much positive bias to be useful. The 4 loci male estimate was 7.4% less precise than the full SCR estimate, with a positive bias of 7.6%. The 5 loci estimate was only negligibly less precise than the full SCR estimate, and adding the 6th and 7th loci improved precision negligibly. Adding 4 partial genotype samples modestly increased the abundance estimate, increased the posterior standard deviation (due to the larger point estimate), and decreased the coefficient of variation by 1.9%. The posterior probability that three of the partial genotype samples each came from separate individuals not represented in the full genotype data set was 1, while the fourth partial genotype sample matched with 8 other samples from 1 individual, each with posterior probability 1.

    DISCUSSION

    Fig. 6. Posterior distributions for the number of unique individuals captured (ncap) for the Kentucky black bear female and male data sets using 1, 2, and 3 loci. True values from full data sets are marked in red (45F, 37M).

    We developed a spatial capture–recapture model for categorically marked populations that uses any number of partially identifying categorical covariates to reduce the uncertainty in the individual identity of latent identity samples via three mechanisms. First, any samples that are inconsistent at any observed covariates are deterministically excluded from matching. Second, as the number of identity categories created by covariates increases and as the category level probabilities for each covariate become more equal, it is increasingly unlikely that more than one individual locally, and in the population, will have the same full categorical identity. Third, the spatial location of the latent identity samples and the estimated detection function scale parameter, σ, spatially restrict which samples matching at all observed covariates could have been produced by the same individual. Thus, the categorical SPIM reduces uncertainty in the individual identity of latent identity samples by providing deterministic identity exclusions and reducing the uncertainty in probabilistic identity associations using both spatial and categorical covariate information. The categorical SPIM simulation and MCMC functions are maintained in the SPIM R package (Augustine 2018) and can also be found in Supplement S1.
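    The second mechanism can be made concrete with a small calculation: the probability that two individuals share the same full categorical identity. The sketch assumes that covariates are independent of each other and that individuals draw their values independently from the category level probabilities; the function name and the example probabilities are illustrative.

```python
import numpy as np

def prob_shared_identity(level_probs):
    """Probability that two individuals match at every covariate.

    level_probs: one sequence of category level probabilities per covariate,
    each summing to 1. Assumes independence across covariates and individuals.
    """
    p = 1.0
    for gamma in level_probs:
        gamma = np.asarray(gamma, dtype=float)
        p *= float(np.sum(gamma ** 2))   # chance of a match at this covariate
    return p

equal_8 = prob_shared_identity([[0.5, 0.5]] * 8)     # eight equal 2-level covariates
unequal_8 = prob_shared_identity([[0.9, 0.1]] * 8)   # same number, unequal levels
equal_7x5 = prob_shared_identity([[0.2] * 5] * 7)    # seven equal 5-level covariates

print(equal_8, unequal_8, equal_7x5)
```

    With eight equally probable 2-level covariates the probability is 1/256, the reciprocal of the number of identity categories, and it rises sharply when the level probabilities are unequal.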

    A specific case of categorically marked populations that is of practical importance is that in which individuals are marked by multilocus genotypes. In this case, each locus of a genotype is a single categorical covariate and the categorical SPIM provides an incremental model for genotype uniqueness as the information about individual identity in the genotypes increases. Thus, the categorical SPIM is an alternative to using the P(ID) and P(sib) criteria currently used that allows for uncertainty in individual identity as might be the case when fewer loci are amplified than necessary to meet probability of identity criteria, which might occur in populations with very low genetic diversity (McCarthy et al. 2009). The categorical SPIM also introduces the possibility of using fewer loci than necessary to meet probability of identity criteria by design, trading some certainty in individual identity for lower genotyping costs. Genotyping costs do not increase linearly with the number of loci and Puckett (2017) found that variability in the number of loci used in microsatellite studies explains very little variability in total project costs. However, there may be some cost savings if using fewer loci allows for the use of fewer multiplex panels, which explain a moderate amount of variability in total project costs (Puckett 2017).

    Simulation scenarios A1–4 (Figs. 2 and 3; Appendix S3: Table S1) show the importance of population abundance, density given abundance, and σ for the accuracy and precision of the unmarked SCR and categorical SPIM estimators. Unmarked SCR estimates from populations with lower density given abundance and smaller σ showed less bias (Appendix S3: Table S1), and unmarked SCR estimates were more precise for populations with higher abundances, lower density given abundance, and smaller σ. More categorical identity groups were necessary to maximize precision when abundance was higher and σ larger; however, for the scenario that raised D without raising N, the majority of precision gains came from the addition of the first 4 identity categories. This scenario raising D without raising N demonstrates the importance of disentangling the relationship between N and D when assessing the performance of unmarked SCR and SCR models with latent or partial individual identities, more generally. Further, it suggests that categorical identity covariates are more effective in populations where uncertainty in individual identity is due to a large D relative to N, rather than a large σ. We propose that the uncertainty in ncap, the number of individuals actually captured, is a good measure of the overall magnitude of uncertainty in individual identity for the unmarked SCR and categorical SPIM estimators. Figure 4 demonstrates that the precision of N increases as the precision of ncap increases through the introduction of categorical identity covariates, though with diminishing returns (with precision quantified by the mean 95% credible interval width). The rate at which precision in N increases with increasing precision in ncap depends on the parameter values of each scenario (λ0, σ, D, and N given D), with greater precision gains in N to be had using categorical identity covariates in scenarios where there is greater uncertainty in individual identity in the unmarked SCR estimator.

    Table 4. Two partial genotype black bear samples and the full 7-loci genotype samples they had a positive posterior probability of matching with.

    Sample no.   Genotype (loci 1–7)       X        Y
    1            6  7   6  7  4  7  7      330461   4090636
    2            6  7   6  7  4  7  7      329667   4088966
    3            9  12  1  7  7  2  7      331260   4090333
    4            9  12  1  7  7  2  7      330620   4089362
    5            .  .   .  7  .  .  .      320620   4089362
    6            7  8   1  1  2  8  2      333045   4091015
    7            7  8   1  1  2  8  2      333045   4091015
    8            .  .   1  1  2  .  .      331455   4091851

    Notes: The first five samples correspond to one partial genotype (sample 5) and its matches (samples 1–4) and the final three samples correspond to a second partial genotype (sample 8) and its matches (samples 6 and 7). Each locus-level genotype is uniquely numbered; for example, the value 6 at locus 1 corresponds to a genotype with 158 repeats for allele 1 and 160 repeats for allele 2. X and Y are the coordinates where the samples were collected and missing locus-level genotypes are indicated with dots. The posterior probabilities that full genotype sample pairs 1–2, 3–4, and 6–7 each came from the same individual were 1. The posterior probabilities that partial genotype sample 5 came from the same individual as samples 1 and 2, the same individual as samples 3 and 4, or a new individual were 0.452, 0.544, and 0.004, respectively. The posterior probability that partial genotype sample 8 came from the same individual as samples 6 and 7 was 0.997, with a 0.003 probability this sample came from a new individual.

    The IDI correlated negatively with the precision and accuracy of ncap estimates. For the scenarios that increased the IDI holding N fixed (A1 vs. A3, A2 vs. A4, A2 vs. A2b), the IDI also correlated negatively with the accuracy and precision of N estimates. In scenarios that increased D by increasing N (A1 vs. A2, A3 vs. A4), the increase in uncertainty in ncap was outweighed by the decrease in data sparsity and N estimates were more precise and accurate. Our exploration of IDI values here is very limited and a larger simulation study is necessary to determine how well this index correlates with estimator performance and to what degree scenarios with differing population density and σ values that produce the same index value share the same estimator performance. Our results suggest that scenarios with differing D and σ values that produce the same IDI values will not necessarily produce the same precision in ncap. We speculate that this is related to how the latent identity samples interact with different densities of activity centers in the SCR process model. Specifically, when D is large, there are necessarily more nearby individuals that a latent identity sample can be allocated to, which increases the uncertainty in individual identity. Finally, note that some of the largest values of the IDI may represent ecologically implausible or even impossible scenarios since home range size generally varies inversely with density (Efford et al. 2016).

    We demonstrated that the unmarked SCR abundance estimator can be biased (Appendix S3: Table S1) when data are sparse and that this bias is magnified when density is higher for a fixed abundance and/or σ is larger. Further, none of the unmarked SCR scenarios we considered led to convergence rates >90%, with only 52% of the MCMC chains converging for one scenario. Inspection of the MCMC chains indicated that σ was not identifiable for many realizations of the data for which convergence was not indicated. These patterns should also be seen in other SCR models with latent individual identities. For example, these patterns should be seen in SMR when using few marked individuals and/or when detection data for the marked individuals are sparse, as can occur when using natural marks and/or surveying low-density populations. In fact, even though our simulation specifications were more challenging than those of Chandler and Royle (2013), many studies in practice use fewer than the 81 traps that we considered, and an expected 1.65 captures per individual (including the individuals captured 0 times) is likely optimistic for many sampling scenarios. The addition of marked individuals should allow for more reliable estimation for the unmarked population component because the marked individuals provide more information about the detection function parameters and reduce the number of samples with latent or partial individual identifications. Further, the use of individual-linked telemetry data and/or a marking process capture history in conjunction with generalized SMR (Whittington et al. 2018) improves the estimation of model parameters and, thus, the reliability of density estimates. Still, until some practical guidelines can be established, relating estimator performance to abundance, density given abundance, and metrics of data sparsity (e.g., λ, K, number/spacing/extent of traps), we recommend researchers conduct simulations with unmarked SCR or SMR parameter values appropriate to their study design to determine if their study designs are sufficient to produce reliable estimates using these models. If not, the categorical SPIM offers a second route to remove bias and increase precision via reducing the uncertainty in individual identity, with the first route being the reduction of uncertainty in σ using telemetry data (Sollmann et al. 2013) and/or informative priors (Chandler and Royle 2013, Ramsey et al. 2015).

    Simulation scenarios B1–3 (Table 2) demonstrate a proof of concept for using data sets where some or most of the full categorical identities are partial (missing values for the identity covariates), such as partial genotypes. With a 0.5 probability of successful amplification at each locus, no data sets produced enough full genotype samples that would have been usable in a model that required certain individual identification, but the categorical SPIM produced an estimate that was 95% as precise as the estimate where all loci were amplified with probability 1. Even assuming only 75% of the samples were usable, the estimate was 80% as precise. Therefore, the categorical SPIM provides a way to use partial identity information, such as partial genotype samples, that is currently being discarded. We caution, however, that if partial genotype samples are more likely to have genotyping errors, the categorical SPIM needs to be extended to accommodate those errors, or perhaps a subset of the partial genotype samples could be deemed reliable through consultation with a wildlife geneticist. For example, allelic dropout could be ruled out by discarding any loci that were homozygous or using appropriate tests (e.g., available in MicroChecker, Van Oosterhout et al. 2004), or partial genotypes could perhaps be deemed reliable if they repeatedly produced the same partial genotype after a sufficient number of amplification attempts in a multi-tubes approach. Alternatively, adapting the allelic dropout model of Wright et al. (2009) to the categorical SPIM would be straightforward, and Wang (2017) presents a model for false alleles that could be adopted if sufficiently general. Both of these error processes would require the modeling of replicate amplification attempts rather than the consensus genotypes used here. We do note, however, that the density estimate precision that is possible for any set of partial genotypes will depend on the number of loci available, the underlying genetic diversity, and the percentage of genotype information remaining in the data set (in addition to the typical SCR model parameters). We expect qualitatively similar results to the simulations in Scenario B in typical data sets, but the expected results for a given population should be determined via simulation using the known or best-guess numbers and frequencies for the loci-level genotypes. Better results will be obtained for the same level of missing genotype information in populations with higher genetic diversity.
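
    To make the dependence on the number of loci, genetic diversity, and missing genotype information concrete, the following minimal Python sketch computes the probability that samples from two different individuals cannot be distinguished by their partial genotypes. It is an illustration under stated assumptions, not part of the categorical SPIM itself: genotypes are assumed independent among individuals and among loci, each locus is assumed to amplify independently with the same probability p_amp in every sample, and genotyping errors are ignored. The function names, the 8 loci, and the genotype frequencies are hypothetical.

```python
from math import prod


def locus_match_prob(freqs):
    """Probability that two independent individuals share a genotype at one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("genotype frequencies at a locus must sum to 1")
    return sum(f * f for f in freqs)


def indistinguishable_prob(loci_freqs, p_amp=1.0):
    """Probability that samples from two different individuals cannot be told apart.

    A locus separates the two samples only if it amplified in both samples
    (probability p_amp**2) and the two genotypes differ at that locus.
    """
    return prod(1.0 - p_amp**2 * (1.0 - locus_match_prob(f)) for f in loci_freqs)


def full_genotype_prob(n_loci, p_amp):
    """Probability that a single sample amplifies at every locus."""
    return p_amp ** n_loci


# Hypothetical populations (assumed values, not taken from the paper):
# 8 loci with 2 equally common genotypes each, or 10 equally common genotypes each.
low_diversity = [[0.5, 0.5]] * 8
high_diversity = [[0.1] * 10] * 8

for label, loci in (("low diversity", low_diversity), ("high diversity", high_diversity)):
    for p_amp in (1.0, 0.5):
        print(label, "p_amp =", p_amp, indistinguishable_prob(loci, p_amp))

print("P(full 8-locus genotype | p_amp = 0.5) =", full_genotype_prob(8, 0.5))
```

    Under these assumptions, the low-diversity population leaves a pair of individuals indistinguishable with probability 0.5^8 (about 0.004) when every locus amplifies and about 0.34 when p_amp = 0.5, while the high-diversity population gives a lower value in both cases (about 0.13 when p_amp = 0.5). These pairwise values do not translate directly into estimator precision, which also depends on the spatial information, so a full simulation for the population of interest remains necessary.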

    The bear genotype analysis demonstrates that the categorical SPIM estimator performed similarly on a real-world data set as it did for simulated data; however, the positive bias in the male estimates with few loci was larger than seen in the simulated data sets. We suspect individual heterogeneity in detection function parameters, particularly σ, may have been present in the male bear data set. If so, this could have led to poorer performance with few loci/identity categories, and the requirement of more loci/identity categories to remove bias and increase precision than if there were no individual heterogeneity. The distribution of observed spatial recaptures in Fig. 5 does seem to suggest individual heterogeneity in σ for males, with one particular individual having a very long-distance spatial recapture and many individuals having no spatial recaptures. The samples for the individual with a long-distance spatial recapture were rarely combined into one individual until 3 loci were used; as more loci were added, ncap = 37 (the correct number of captured males) became increasingly probable, at which point the estimate of σ converged upwards to the full SCR estimate. This behavior is consistent with the simulations where σ is large, but is more pronounced in this data set, which could be explained by individual heterogeneity in the detection function parameters. A second factor that tends to split the samples from this potentially large-σ individual apart is that there were several traps between the two traps where this individual was captured, and the categorical SPIM found it unlikely that this individual would not have been captured at these traps closer to its estimated activity center until enough categorical covariate information was available to make it even more unlikely that two individuals in the population had the same multilocus genotype. The second longest spatial recapture in the male data set spans a gap with no traps and required fewer loci to reliably link its samples together.

    The posterior distributions of ncap in Fig. 6 demonstrate what we believe is a source of the positive bias in the categorical SPIM estimator. With the addition of just 2 loci for females and 1 locus for males, all incorrect identity associations were ruled out by the genotype information and the spatial distribution of the samples. However, a 1- or 2-locus genotype is not sufficient to guarantee the local uniqueness of a genotype, leading to a situation in which samples cannot be erroneously combined into fewer individuals than produced them, but they can be erroneously split apart into more individuals than produced them. Thus, in these scenarios, ncap can never take a value lower than the true value, but rather must always be equal to or larger than the true value. We believe the identity exclusions are removing the lower tail of the posterior distribution of ncap that would be present in the unmarked SCR estimator and introducing some positive bias, which can be removed by adding more categorical identity information and reducing the upper tail of the posterior of ncap. In the simulations, this occurs with the addition of just a few identity categories; however, individual heterogeneity in detection function parameters, as argued above, may require more categorical identity information to remove the positive bias in ncap and thus N.

    This black bear genotype analysis also demonstrated the use of genotypes that are partial as a consequence of DNA amplification failure, with two caveats. First, there were very few usable partial genotypes because of the DNA amplification protocol used, in which samples at the same trap/occasion were subsequently genotyped until a full genotype was obtained. This process led to the partial genotypes matching the complete genotype individual at a particular trap/occasion with high probability because bears usually leave multiple hair samples in a hair snare, violating the Bernoulli observation process. Second, we assumed the partial genotype samples did not contain any genotyping errors. Three of the 6 partial genotype samples used matched other individuals in the population with high probability, but 3 partial genotypes had posterior probabilities of 1 that they were new individuals. These may have indeed been new individuals, or perhaps they did not match any other individuals because the partial genotypes were corrupted. Including partial genotypes in this manner needs to be done with caution and in consultation with a wildlife geneticist, or the categorical SPIM could be extended to accommodate genotyping errors (Wright et al. 2009). If partial genotypes, or even a subsample of the partial genotypes, can be deemed reliable, including them in the analysis can increase the precision of abundance and density estimates, especially if high probability spatial recaptures can be added, as was the case in the female bear data set.
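
    The exclusion logic that partial genotypes supply can be sketched in a few lines of Python. This is a simplified illustration rather than the implementation used in our analysis: it assumes no genotyping errors, codes a locus that failed to amplify as a missing value, and uses hypothetical three-locus genotypes and individual labels. Two samples are excluded from sharing an identity only when some locus is observed in both and the observed genotypes differ.

```python
MISSING = None  # marker for a locus that failed to amplify


def compatible(a, b):
    """True if two samples could have been produced by the same individual.

    Loci that are missing in either sample carry no information, so only
    loci observed in both samples can exclude a shared identity.
    """
    if len(a) != len(b):
        raise ValueError("samples must be scored at the same set of loci")
    return all(x is MISSING or y is MISSING or x == y for x, y in zip(a, b))


def candidate_matches(partial, known):
    """Labels of known individuals from which the partial sample cannot be excluded."""
    return [label for label, genotype in known.items() if compatible(partial, genotype)]


# Hypothetical complete genotypes; each locus is coded as a pair of alleles.
known = {
    "bear_1": ((1, 2), (3, 3), (2, 4)),
    "bear_2": ((1, 2), (1, 3), (2, 4)),
    "bear_3": ((2, 2), (3, 3), (1, 4)),
}

partial_a = ((1, 2), MISSING, (2, 4))    # second locus failed to amplify
partial_b = (MISSING, (2, 2), MISSING)   # only the second locus amplified

print(candidate_matches(partial_a, known))  # ['bear_1', 'bear_2']
print(candidate_matches(partial_b, known))  # []
```

    In this sketch, the first partial sample remains consistent with two known individuals, so the spatial location of the sample would have to do the remaining work, while the second is inconsistent with every known genotype. A sample of the second kind can only be assigned to a new individual, which is correct if the genotype is error-free and misleading if the observed locus was corrupted.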

    We now discuss four assumptions of particular importance that we make for the categorical SPIM. The first, and perhaps most consequential, is that there is no individual heterogeneity in detection function parameters, or similarly, that there is no transience in the activity centers during the time of the survey (Royle et al. 2016). Like unmarked SCR and spatial mark–resight (SMR), the categorical SPIM uses the detection function likelihood to determine how likely it is that each latent identity sample came from each individual. As demonstrated in Appendix S1, of the detection function parameters, σ largely determines the degree to which samples from different individuals overlap in space. If there is individual heterogeneity in detection function parameters, especially σ, and the SCR model does not include this individual heterogeneity, an average λ0 and/or σ will be estimated, except this average will be a biased estimate since the detected samples will disproportionately belong to the more detectable individuals. Still, the latent identity samples from the most detectable individuals will tend to be incorrectly split across two or more latent individuals because their true spatial distribution of samples will be deemed unlikely by the model based on the averaged λ0 and/or σ values. Therefore, ncap will tend to be biased high, introducing positive bias into N. The black bear example suggests that individual heterogeneity in detection function parameters (if it was present) can be overcome with increasing identity category information to correctly reproduce the true ncap, but adding individual heterogeneity to the categorical SPIM detection model (as well as unmarked SCR and SMR) would be required to obtain appropriate abundance estimates. This should be investigated in the future, although we expect the introduction of individual heterogeneity in detection function parameters to drastically increase the uncertainty in individual identity and, thus, reduce the utility of the categorical SPIM, unmarked SCR, and SMR for density estimation. A better strategy would be the use of covariate-specific detection function parameters for specific covariates such as sex, if the subsets of the population that have different detection function parameters can be at least partially identified (Royle 2009, Sollmann et al. 2011).

    The second assumption of note is that all possible category levels are known, which may not be the case for genotypes. There are two ways in which genotypes can be identified and enumerated for the categorical SPIM: identifying all observed genotypes at each locus, or identifying all implied genotypes at each locus based on the observed alleles at each locus. The latter method is certainly more thorough, but given sufficiently large data sets, any unobserved genotypes will occur in the population very rarely. Regardless, there will always be the possibility that very low probability genotypes exist in the population that were not observed in the data set, and not accounting for these genotypes will introduce negative bias in abundance estimates (Wright et al. 2009), though the magnitude of bias will be small if the majority of genotypes with non-negligible frequencies are identified. Wright et al. (2009) raise the possibility of using independent reference genotypes from the same population to improve abundance estimates. This information could aid in identifying all possible genotypes for each locus and could also be used to aid the estimation of genotype frequencies, which would be especially helpful for sparse data sets. The latter use would require that the reference genotypes were representative of the population subject to capture but did not contain any of the same individuals, which would violate independence (Wright et al. 2009).
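
    The two ways of enumerating category levels at a locus can be illustrated with a short Python sketch. It assumes diploid genotypes recorded as unordered allele pairs; the function names and allele sizes are hypothetical and serve only to show how the implied set relates to the observed set.

```python
from itertools import combinations_with_replacement


def observed_genotypes(samples):
    """Unique genotypes actually recorded at one locus (allele order ignored)."""
    return sorted({tuple(sorted(genotype)) for genotype in samples})


def implied_genotypes(samples):
    """Every genotype that can be formed from the alleles observed at one locus."""
    alleles = sorted({allele for genotype in samples for allele in genotype})
    return list(combinations_with_replacement(alleles, 2))


# Hypothetical allele sizes scored at a single microsatellite locus.
locus_samples = [(152, 156), (156, 152), (152, 152), (156, 160)]

obs = observed_genotypes(locus_samples)
imp = implied_genotypes(locus_samples)

print(len(obs), obs)                  # genotypes seen in the data
print(len(imp), imp)                  # genotypes implied by the observed alleles
print(sorted(set(imp) - set(obs)))    # implied genotypes that were never observed
```

    With a observed alleles at a locus, the implied set contains a(a + 1)/2 genotypes and always includes the observed set. Neither enumeration can recover a genotype that involves an allele absent from the data, which is the residual risk described above.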

    The third assumption of note is that the full categorical identities are independent among individuals. When using genotypes, this assumption will be violated if genetic structure exists at the home range level as a result of relatedness, natural or anthropogenic impediments to movement, or other factors. Due to the importance of the spatial proximity of samples in the categorical SPIM, the combination of relatedness and philopatry (e.g., female black bears) can lead to a spatial correlation in genotypes across the landscape (Moyer et al. 2006). This could introduce negative bias if sufficiently strong because latent identity samples of nearby related individuals will be erroneously combined into one individual too often when using data sets with few loci. We suspect spatial correlation in genotypes at the home range level will be weak in most spatially structured, non-isolated populations, and we expect the categorical SPIM to be robust to this effect, although this might not be the case for species such as canids (Canis sp.) that travel in packs of highly related individuals. As the spatial uniformity of activity centers can be regarded as a weak prior for the distributions of individuals across the landscape (Royle et al. 2013), the spatial uniformity of category levels across the landscape is likely a similarly weak prior with the posterior able to deviate substantially from spatial uniformity, given sufficient data. Regardless, the positive bias from individual heterogeneity in detection function parameters will likely outweigh any negative bias from the non-independence of genotypes, but this effect should be further investigated. The robustness of the categorical SPIM to assumption violations in typical genetic data sets can be further established by investigating how well it can reproduce estimates from full data sets, as we did in the black bear application, across many species and study areas.

    The fourth assumption we will discuss is that the categorical covariates are independent of one another. When using genotypes, this is equivalent to linkage equilibrium, which can be tested for (Rousset and Raymond 1995), and results from a linkage disequilibrium test for the black bear data set used in our application did not detect nonrandom association of alleles (Murphy et al. 2016). When present, linkage disequilibrium causes pseudoreplication in genetic data sets (Selkoe and Toonen 2006), which we expect to introduce negative bias into the categorical SPIM estimator because there will be less variability than expected in the full categorical identities. Non-independence of the categorical covariates may occur for other types of data; for example, when using natural marks, body size and sex may not be independent. In this case, a composite covariate that combined body size and sex could be constructed, or the body size distribution could be estimated independently for each sex. In this latter case, body size would be informative about any missing sex covariates.

    We added categorical covariates to the unmarked SCR model, but categorical identity covariates can also be added to SMR, 2-flank SPIM, and likely other types of specialized SPIMs that may not have been developed yet. Combining the categorical SPIM with SMR is especially appealing because it would allow all of the features currently available to improve density estimates over unmarked SCR to be combined into a single model that can accommodate a subset of marked individuals, individual-linked telemetry data, a marking process capture history, and categorical identity covariates. Further, this development will be required to address researcher-deployed categorical marks, such as colored ear tags and collars, because only the “marked” subset of individuals are categorically marked and the number of deployed marks of each type may be known or perhaps the maximum number may be known if there is tag loss, requiring a more constrained MCMC algorithm for updating the latent individual identities. Another possible extension would be to allow continuous identity covariates, such as body size, perhaps measured with error, in a more general “covariate SPIM.” One continuous covariate of particular ecological relevance that is informative of identity is the time a sample was recorded, with samples collected closer together in time at the same location being more likely to have been produced by the same individual. However, making use of the time of sample deposition would require a model for animal movement and is probably less informative than spatial location, unless σ is large relative to the trapping array and movement speed relatively slow.

    While our application and much of the discussion focuses on the use of microsatellite loci as categorical identity covariates, other observation systems provide categorical identity covariates that can be used by the categorical SPIM. Remote cameras, for example, sometimes provide individual sex, age class, color morph, natural markings, and/or other morphological features (Villafañe-Trujillo et al. 2018). Some species such as mustelids (Royle et al. 2011, Sirén et al. 2016, Villafañe-Trujillo et al. 2018) and Andean bears (Tremarctos ornatus; Molina et al. 2017) have markings on the chest, head, and neck that can be used to classify unique individuals. We suspect categorical and/or continuous covariates can be extracted from these types of observations that would allow this identifying information to be used without having to assign a unique identity with certainty, which could be erroneous if two individuals shared the same markings, or to include the photographs that were not distinct enough to provide a unique identity. A second observation method with potentially abundant partially identifying information is bioacoustic monitoring, where covariates such as note duration, bandwidth, and frequency can be extracted from the spectrograms of individual calls (Reby et al. 1999, McIntosh et al. 2015, Clink et al. 2017), which could improve density estimation for a large number of species that are difficult to detect by methods other than bioacoustics. Finally, in addition to the possible ways we envision the categorical SPIM may be used outlined in this paper, it may allow for better density estimation for currently used observation systems we are not aware of, or may spur the adoption of new ways of categorically marking species for which unique marks are not currently available once researchers are aware that density can be estimated using the information contained in categorical marks. Unique marks will always be the most informative, but the categorical SPIM offers a middle ground between uniquely marked and unmarked populations.

    ACKNOWLEDGMENTS

    We thank the editor, associate editor, two anonymous reviewers, Lissette Waits, and Dana Morin for suggestions that led to the improvement of this paper. B.A. would also like to thank Chris Satter, Chris Rowe, and Robert Alonso for discussions that helped solidify the ideas developed in this paper. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

    LITERATURE CITED

    Augustine, B. C. 2018. SPIM R package. https://github.com/benaug/SPIM

    Augustine, B. C., J. A. Royle, M. J. Kelly, C. B. Satter, R. S. Alonso, E. E. Boydston, and K. R. Crooks. 2018. Spatial capture-recapture with partial identity: an application to camera traps. Annals of Applied Statistics 12:67–95.

    Bonner, S. J., and J. Holmberg. 2013. Mark-recapture with multiple, non-invasive marks. Biometrics 69:766–775.

    Borchers, D. L., and M. Efford. 2008. Spatially explicit maximum likelihood methods for capture–recapture studies. Biometrics 64:377–385.

    Chandler, R. B., and J. D. Clark. 2014. Spatially explicit integrated population models. Methods in Ecology and Evolution 5:1351–1360.

    Chandler, R. B., and J. A. Royle. 2013. Spatially explicit models for inference about density in unmarked or partially marked populations. Annals of Applied Statistics 7:936–954.

    Clink, D. J., H. Bernard, M. C. Crofoot, and A. J. Marshall. 2017. Investigating individual vocal signatures and small-scale patterns of geographic variation in female Bornean gibbon (Hylobates muelleri) great calls. International Journal of Primatology 38:656–671.

    Cowen, L., and C. J. Schwarz. 2006. The Jolly-Seber model with tag loss. Biometrics 62:699–705.

    Efford, M. 2004. Density estimation in live-trapping studies. Oikos 106:598–610.


    Efford, M., D. K. Dawson, Y. Jhala, and Q. Qureshi. 2016. Density-dependent home-range size revealed by spatially explicit capture–recapture. Ecography 39:676–688.

    Efford, M., and G. Mowat. 2014. Compensatory heterogeneity in spatially explicit capture–recapture data. Ecology 95:1341–1348.

    Fieberg, J., and C. O. Kochanny. 2005. Quantifying home-range overlap: the importance of the utilization distribution. Journal of Wildlife Management 69:1346–1359.

    Gelman, A., and D. B. Rubin. 1992. Inference from iterative simulation using multiple sequences. Statistical Science 7:457–472.

    Knapp, S. M., B. A. Craig, and L. P. Waits. 2009. Incorporating genotyping error into non-invasive DNA-based mark–recapture population estimates. Journal of Wildlife Management 73:598–604.

    Laake, J., K. Burnham, and D. Anderson. 1993. Distance sampling: estimating abundance of biological populations. Chapman & Hall, London, UK.

    Lewis, J. S., K. A. Logan, M. W. Alldredge, L. L. Bailey, S. VandeWoude, and K. R. Crooks. 2015. The effects of urbanization on population density, occupancy, and detection probability of wild felids. Ecological Applications 25:1880–1895.

    McCarthy, T. M., L. P. Waits, and B. Mijiddorj. 2009. Status of the Gobi bear in Mongolia as determined by noninvasive genetic methods. Ursus 20:30–38.

    McClintock, B. T., P. B. Conn, R. S. Alonso, and K. R. Crooks. 2013. Integrated modeling of bilateral photo-identification data in mark-recapture analyses. Ecology 94:1464–1471.

    McIntosh, B., K. M. Dudzinski, and E. Mercado III. 2015. Do dolphins’ whistles reveal their age and sex. Animal Behavior and Cognition 2:313–333.

    Mills, L. S., J. J. Citta, K. P. Lair, M. K. Schwartz, and D. A. Tallmon. 2000. Estimating animal abundance using noninvasive DNA sampling: promise and pitfalls. Ecological Applications 10:283–294.

    Molina, S., A. K. Fuller, D. J. Morin, and J. A. Royle. 2017. Use of spatial capture–recapture to estimate density of Andean bears in northern Ecuador. Ursus 28:117–126.

    Moyer, M. A., J. W. McCown, T. H. Eason, and M. K. Oli. 2006. Does genetic relatedness influence space use pattern? A test on Florida black bears. Journal of Mammalogy 87:255–261.

    Murphy, S. M., J. J. Cox, B. C. Augustine, J. T. Hast, J. M. Guthrie, J. Wright, J. McDermott, S. C. Maehr, and J. H. Plaxico. 2016. Characterizing recolonization by a reintroduced bear population using genetic spatial capture–recapture. Journal of Wildlife Management 80:1390–1407.

    Otis, D. L., K. P. Burnham, G. C. White, and D. R. Anderson. 1978. Statistical inference from capture data on closed animal populations. Wildlife Monographs 62:3–135.

    Puckett, E. E. 2017. Variability in total project and per sample genotyping costs under varying study designs including with microsatellites or SNPs to answer conservation genetic questions. Conservation Genetics Resources 9:289–304.

    Ramsey, D. S., P. A. Caley, and A. Robley. 2015. Estimating population density from presence–absence data using a spatially explicit model. Journal of Wildlife Management 79:491–499.

    Reby, D., B. Cargnelutti, J. Joachim, and S. Aulagnier. 1999. Spectral acoustic structure of barking in roe deer (Capreolus capreolus). Sex-, age- and individual-related variations. Comptes Rendus de l’Académie des Sciences-Series III-Sciences de la Vie 322:271–279.

    Reich, B. J., and B. Gardner. 2014. A spatial capture-recapture model for territorial species. Environmetrics 25:630–637.

    Rousset, F., and M. Raymond. 1995. Testing heterozygote excess and deficiency. Genetics 140:1413–1419.

    Royle, J. A. 2009. Analysis of capture–recapture models with individual covariates using data augmentation. Biometrics 65:267–274.

    Royle, J. A., R. B. Chandler, R. Sollmann, and B. Gardner. 2013. Spatial capture-recapture. Academic Press, Cambridge, Massachusetts, USA.

    Royle, J. A., R. M. Dorazio, and W. A. Link. 2007. Analysis of multinomial models with unknown index using data augmentation. Journal of Computational and Graphical Statistics 16:67–85.

    Royle, J. A., A. K. Fuller, and C. Sutherland. 2016. Spatial capture–recapture models allowing Markovian transience or dispersal. Population Ecology 58:53–62.

    Royle, J. A., A. J. Magoun, B. Gardner, P. Valkenburg, and R. E. Lowell. 2011. Density estimation in a wolverine population using spatial capture–recapture models. Journal of Wildlife Management 75:604–611.

    Selkoe, K. A., and R. J. Toonen. 2006. Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecology Letters 9:615–629.

    Sirén, A. P., P. J. Pekins, P. L. Abdu, and M. J. Ducey. 2016. Identification and density estimation of American martens (Martes americana) using a novel camera-trap method. Diversity 8:3.


    Sollmann, R., M. M. Furtado, B. Gardner, H. Hofer, A. T. Jácomo, N. M. Tôrres, and L. Silveira. 2011. Improving density estimates for elusive carnivores: accounting for sex-specific detection and movements using spatial capture–recapture models for jaguars in central Brazil. Biological Conservation 144:1017–1024.

    Sollmann, R., B. Gardner, A. W. Parsons, J. J. Stocking, B. T. McClintock, T. R. Simons, K. H. Pollock, and A. F. O’Connell. 2013. A spatial mark–resight model augmented with telemetry data. Ecology 94:553–559.

    Van Oosterhout, C., W. F. Hutchinson, D. P. Wills, and P. Shipley. 2004. Micro-checker: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Resources 4:535–538.

    Villafañe-Trujillo, Á. J., C. A. López-González, and J. M. Kolowski. 2018. Throat patch variation in tayra (Eira barbara) and the potential for individual identification in the field. Diversity 10:7.

    Waits, L. P., G. Luikart, and P. Taberlet. 2001. Estimating the probability of identity among genotypes in natural populations: cautions and guidelines. Molecular Ecology 10:249–256.

    Wang, J. 2017. Estimating genotyping errors from genotype and reconstructed pedigree data. Methods in Ecology and Evolution 9:109–120.

    Whittington, J., M. Hebblewhite, and R. B. Chandler. 2018. Generalized spatial mark-resight models with an application to grizzly bears. Journal of Applied Ecology 55:157–168.

    Wright, J. A., R. J. Barker, M. R. Schofield, A. C. Frantz, A. E. Byrom, and D. M. Gleeson. 2009. Incorporating genotype uncertainty into mark–recapture-type models for estimating abundance using DNA samples. Biometrics 65:833–840.

    SUPPORTING INFORMATION

    Additional Supporting Information may be found online at: http://onlinelibrary.wiley.com/doi/10.1002/ecs2.2627/full
