Combinatorial insights into distributions of wealth, size, and abundance

Post on 13-Feb-2016



Combinatorial insights into distributions of wealth, size, and abundance

Ken Locey

Rank-abundance curve (RAC)

[Figure: abundance (y-axis) vs. rank in abundance (x-axis)]

Species abundance distribution (SAD)

[Figure: frequency distribution; frequency (y-axis) vs. abundance class (x-axis)]

Ranked curve (RC)

[Figure: abundance/wealth/size (y-axis) vs. rank in abundance, wealth, or size (x-axis)]

Distributions of wealth, size, abundance

[Figure: frequency distribution; frequency (y-axis) vs. abundance, wealth, or size class (x-axis)]

[Figure: Wheat Production (tons)]

[Map: Poverty in Rural America, 2008. Legend, percent in poverty: 54 – 25.1, 25 – 20.1, 20 – 14.1, 14 – 12.1, 12 – 10.1, 10 – 3.1]

Distributions used to predict variation in wealth, size, & abundance

1. Pareto (80-20 rule)
2. Log-normal
3. Log-series
4. Geometric series
5. Dirichlet
6. Negative binomial
7. Zipf
8. Zipf-Mandelbrot

Predicting, modeling, & explaining the Species abundance distribution (SAD)

[Figure: rank-abundance curve (abundance vs. rank in abundance) and frequency distribution (frequency vs. abundance class)]

[Figure: abundance (log scale, 10^0 to 10^4) vs. rank in abundance. Legend: Observed; Resource partitioning; Demographic stochasticity]

[Figure: abundance (log scale, 10^0 to 10^4) vs. rank in abundance; N = 1,700, S = 17]

How many forms of the SAD for a given N and S?

[Figure: abundance (log scale, 10^0 to 10^4) vs. rank in abundance]

Integer Partitioning

Integer partition: a positive integer expressed as a sum of positive integers, where the order of the summands does not matter

e.g. 6 = 3+2+1 = 1+2+3 = 2+1+3 (all the same partition)

Written in non-increasing (lexical) order, e.g. 3+2+1

Rank-abundance curves are integer partitions

Rank-abundance curve:
N = total abundance
S = species richness
S unlabeled abundances that sum to N

Integer partition:
N = positive integer
S = number of parts
S unordered positive integers that sum to N
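To make the correspondence concrete, here is a minimal Python sketch. The species names and counts are hypothetical, and `to_rank_abundance` is an illustrative helper, not a function from the talk's repositories. Dropping the species labels and sorting from most to least abundant turns a community into a partition of N with S parts.

```python
def to_rank_abundance(abundances):
    """Drop species labels and sort abundances from largest to smallest."""
    return tuple(sorted(abundances.values(), reverse=True))

# Hypothetical community: species label -> number of individuals.
community = {"sp_a": 3, "sp_b": 120, "sp_c": 41, "sp_d": 3}

rac = to_rank_abundance(community)  # the rank-abundance curve
N = sum(rac)                        # total abundance = the integer being partitioned
S = len(rac)                        # species richness = the number of parts

print(rac, N, S)
```

Any relabeling of the species gives the same tuple, which is why the curve depends only on the unordered abundances.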

Combinatorial Explosion

N       S      Shapes of the SAD
1000    10     > 886 trillion
1000    100    > 302 trillion trillion
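Counts like these can be computed exactly without listing any partitions. The sketch below uses the standard recurrence p(n, k) = p(n-1, k-1) + p(n-k, k) for the number of partitions of n into exactly k parts; the function name is illustrative and this is not necessarily how the figures in the talk were obtained.

```python
def count_partitions(n, s):
    """Number of partitions of n into exactly s positive parts."""
    if s < 1 or s > n:
        return 0
    # table[k][m] = number of partitions of m into exactly k parts
    table = [[0] * (n + 1) for _ in range(s + 1)]
    table[0][0] = 1  # the empty partition
    for k in range(1, s + 1):
        for m in range(k, n + 1):
            # Either the smallest part is 1 (remove it), or every part is
            # at least 2 (subtract 1 from each of the k parts).
            table[k][m] = table[k - 1][m - 1] + table[k][m - k]
    return table[s][n]

print(count_partitions(5, 3))      # the two partitions 3+1+1 and 2+2+1
print(count_partitions(1000, 10))  # number of SAD shapes for N = 1000, S = 10
```

Python integers have arbitrary precision, so the large counts are exact rather than floating-point approximations.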

Random integer partitions

Goal: Random partitions for N = 5, S = 3:

5
4+1
3+2
3+1+1
2+2+1
2+1+1+1
1+1+1+1+1

Nijenhuis and Wilf (1978) Combinatorial Algorithms for Computers and Calculators. Academic Press, New York.
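A naive way to reach this goal is rejection sampling: draw a partition of N uniformly at random and keep it only if it has S parts. The sketch below is an illustration for tiny N only. It enumerates every partition and uses `random.choice` as a stand-in for a proper uniform sampler such as the one in Nijenhuis and Wilf; the function names are hypothetical.

```python
import random

def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def rejection_sample(n, s, rng):
    """Draw uniform partitions of n until one has exactly s parts."""
    everything = list(partitions(n))  # only feasible for small n
    tries = 0
    while True:
        tries += 1
        candidate = rng.choice(everything)
        if len(candidate) == s:
            return candidate, tries

rng = random.Random(0)
sample, tries = rejection_sample(5, 3, rng)
print(sample, "accepted after", tries, "draw(s)")
```

For N = 5 and S = 3, two of the seven partitions qualify, so most draws are wasted even in this toy case.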

SAD feasible sets are dominated by hollow curves

[Figure: frequency vs. log2(abundance)]

The SAD feasible set

[Figure: ln(abundance) vs. rank in abundance; N=1000, S=40]

Question: Can we explain variation in abundance based on how N and S constrain observable variation?

Dataset                                Communities
Christmas Bird Count                   129
North American Breeding Bird Survey    1586
Gentry’s Forest Transect               182
Forest Inventory & Analysis            7359
Mammal Community Database              42
Indoor Fungal Communities              124
Terrestrial metagenomes                92
Aquatic metagenomes                    48
TOTAL                                  9562

The center of the feasible set

[Figure: ln(abundance) vs. rank in abundance; N=1000, S=40]

[Figure: observed abundance vs. abundance at the center of the feasible set, log axes (10^0 to 10^2); R^2 per site; R^2 = 1.0]

[Figure: Breeding Bird Survey (1,583 sites): observed abundance vs. abundance at the center of the feasible set, log axes (10^0 to 10^2); R^2 per site; R^2 = 0.93]

[Figure: observed abundance vs. abundance at center of the feasible set]

Public code and data repository

https://github.com/weecology/feasiblesets

[Figure: observed home runs vs. center of the feasible set; values shown on the panels: 0.93, 0.88, 0.91, 0.91, 0.94, 0.93 (http://mlb.mlb.com)]

Combinatorics is only one way to examine feasible sets

Other (more common) ways:
Mathematical optimization
Linear programming

Dataset                                Total sites    Analyzable sites
Christmas Bird Count                   1992           129 (6.5%)
North American Breeding Bird Survey    2769           1586 (57%)
Gentry’s Forest Transect               222            182 (82%)
Forest Inventory & Analysis            10356          7359 (71%)
Mammal Community Database              103            42 (41%)
Indoor Fungal Communities              128            124 (97%)
Terrestrial metagenomes                128            92 (72%)
Aquatic metagenomes                    252            48 (19%)
TOTAL                                  15950          9562 (60%)

Efficient algorithms for generating random integer partitions with restricted numbers of parts

Random integer partitions

Goal: Random partitions for N = 5, S = 3:

5
4+1
3+2
3+1+1
2+2+1
2+1+1+1
1+1+1+1+1

Nijenhuis and Wilf (1978) Combinatorial Algorithms for Computers and Calculators. Academic Press, New York.

Combinatorial Explosion

N       S             SAD shapes
1000    10            > 886 trillion
1000    1,...,1000    > 2.4 × 10^31

Probability of generating a random partition of 1000 having 10 parts: about 4 × 10^-17 (886 trillion out of 2.4 × 10^31)
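The probability quoted above is a ratio of two partition counts, so it can be checked exactly. The sketch below is one way to do that, with illustrative function names. It counts partitions with a coin-change style recurrence and uses the fact that partitions of n with exactly s parts correspond, by conjugation, to partitions of n - s whose parts are all at most s.

```python
from fractions import Fraction

def count_with_parts_up_to(total, largest):
    """Number of partitions of `total` using only parts of size 1..largest."""
    ways = [1] + [0] * total  # ways[0] = 1: the empty partition
    for part in range(1, largest + 1):
        for m in range(part, total + 1):
            ways[m] += ways[m - part]
    return ways[total]

def acceptance_probability(n, s):
    """Chance that a uniform random partition of n has exactly s parts."""
    if s < 1 or s > n:
        return Fraction(0)
    all_partitions = count_with_parts_up_to(n, n)
    with_s_parts = count_with_parts_up_to(n - s, s)
    return Fraction(with_s_parts, all_partitions)

print(acceptance_probability(5, 3))             # exact fraction for the toy case
print(float(acceptance_probability(1000, 10)))  # rejection-sampling success rate
```

A rejection sampler needs, on average, the reciprocal of this probability in draws per accepted partition, which is why a direct method is needed for realistic N and S.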

Task: Generate random partitions of N=9 having S=4 parts

3+2 = 5

4+3+2 = 9

Conjugate of 4+3+2: 3+3+2+1 = 9
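The conjugation step in this example can be sketched in a few lines of Python. `conjugate` and `ferrers` are illustrative helpers, not functions from the talk's code. Conjugating a partition transposes its Ferrers diagram, so a partition whose largest part is 4 becomes a partition with 4 parts.

```python
def conjugate(partition):
    """Transpose the Ferrers diagram: part i of the result counts parts >= i."""
    if not partition:
        return ()
    return tuple(sum(1 for p in partition if p >= i)
                 for i in range(1, max(partition) + 1))

def ferrers(partition):
    """Draw a partition as rows of stars, one row per part."""
    return "\n".join("*" * p for p in partition)

start = (4, 3, 2)  # 4 placed in front of 3+2
flipped = conjugate(start)

print(ferrers(start))
print()
print(ferrers(flipped))
```

Conjugation is its own inverse and preserves the sum, so it maps partitions of 9 with largest part 4 one-to-one onto partitions of 9 with 4 parts.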

A recipe for random partitions of N with S parts

1. Generate a random partition of N - S with S or less as the largest part
2. Append S to the front
3. Conjugate the partition
4. Let cool & serve with garnish
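The recipe can be sketched end to end as follows. Step 1 here uses a simple count-table sampler chosen for illustration; it is an assumption of this sketch and is not one of the talk's own algorithms. All function names are hypothetical.

```python
import random

def build_table(total, largest):
    """table[k][m] = number of partitions of m whose parts are all <= k."""
    table = [[0] * (total + 1) for _ in range(largest + 1)]
    table[0][0] = 1  # the empty partition
    for k in range(1, largest + 1):
        for m in range(total + 1):
            table[k][m] = table[k - 1][m] + (table[k][m - k] if m >= k else 0)
    return table

def random_bounded_partition(total, largest, rng):
    """Step 1: uniform random partition of `total` with every part <= `largest`."""
    table = build_table(total, largest)
    parts, remaining, cap = [], total, min(largest, total)
    while remaining > 0:
        pick = rng.randrange(table[cap][remaining])
        for part in range(cap, 0, -1):
            # Number of partitions of `remaining` whose largest part is `part`.
            weight = table[part][remaining - part]
            if pick < weight:
                break
            pick -= weight
        parts.append(part)
        remaining -= part
        cap = min(part, remaining)
    return tuple(parts)

def conjugate(partition):
    """Step 3: transpose the Ferrers diagram."""
    if not partition:
        return ()
    return tuple(sum(1 for p in partition if p >= i)
                 for i in range(1, max(partition) + 1))

def random_partition_with_parts(n, s, rng=random):
    """Uniform random partition of n with exactly s parts."""
    if not 1 <= s <= n:
        raise ValueError("need 1 <= s <= n")
    smaller = random_bounded_partition(n - s, s, rng)  # step 1
    return conjugate((s,) + smaller)                   # steps 2 and 3

rng = random.Random(42)
print(random_partition_with_parts(9, 4, rng))
print(random_partition_with_parts(1000, 10, rng))
```

Because steps 2 and 3 form a one-to-one mapping, a uniform draw in step 1 yields a uniform draw over partitions of N with S parts, and no candidate is ever rejected.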

5
4+1
3+2
3+1+1
2+2+1
2+1+1+1
1+1+1+1+1

Generate a random partition of N-S with S or less as the largest part

Divide & Conquer

Multiplicity

Top down

Bottom up

Un(bias)

[Figure: density vs. skewness of partitions in a random sample]

Speed

[Figure: Sage/algorithm vs. number of parts (S); panels: N = 50, N = 100, N = 150, N = 200]

Old Apples: probability of generating a partition for N = 1000 & S = 10: about 4 × 10^-17

New Oranges: Seconds to generate a partition for N = 1000 & S = 10: 0.07

Integer partitions: S positive integers that sum to N without respect to order

What if a distribution has zeros?
• subplots with 0 individuals
• people with 0 income
• publications with 0 citations

[Figure: frequency vs. abundance class (0, 1, 2, 3, 4, 5)]

Intraspecific spatial abundance distribution (SSAD)
N = abundance of a species
S = number of subplots
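One way to handle zeros, sketched here as an assumption rather than as the talk's own method, is to shift every part by one. S non-negative unordered integers that sum to N correspond one-to-one to partitions of N + S into exactly S positive parts, so any routine for partitions with exactly S parts can be reused. The function names are illustrative.

```python
def allow_zeros(partition):
    """Partition of N+S with S positive parts -> S counts (zeros allowed) summing to N."""
    return tuple(p - 1 for p in partition)

def forbid_zeros(counts):
    """Inverse map: add one individual to every subplot."""
    return tuple(c + 1 for c in counts)

def pad_with_zeros(partition, s):
    """Equivalent view: a partition of N with at most s parts, padded to length s."""
    if len(partition) > s:
        raise ValueError("partition has more than s parts")
    return tuple(partition) + (0,) * (s - len(partition))

# N = 5 individuals of one species across S = 4 subplots:
ssad = allow_zeros((3, 3, 2, 1))  # start from a partition of N + S = 9 with 4 parts
print(ssad, sum(ssad))
print(pad_with_zeros((2, 2, 1), 4))
```

Both views give the same vector here: subtracting one from each part of 3+3+2+1 and padding 2+2+1 with a zero.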


Public code repository

https://github.com/klocey/partitions

PeerJ Preprint

https://peerj.com/preprints/78/

Locey KJ, McGlinn DJ. (2013) Efficient algorithms for sampling feasible sets of macroecological patterns. PeerJ PrePrints 1:e78v1

Future Directions in Combinatorial Feasible Sets

Future Directions: metrics of Evenness, diversity, & inequality

[Figure: frequency distribution]

[Figure: percentile in feasible set vs. Gini’s coefficient of inequality]
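As a rough illustration of this idea, the sketch below computes the Gini coefficient of an observed distribution and its percentile within the feasible set of all partitions with the same N and S. Defining the percentile as the share of feasible-set members with an equal or lower Gini coefficient is an assumption of this sketch, and the full enumeration is only practical for small N.

```python
def gini(values):
    """Gini coefficient: 0 for perfect equality, approaching 1 for extreme inequality."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def partitions_exact(n, s, max_part=None):
    """Yield every partition of n into exactly s positive parts (non-increasing)."""
    if max_part is None:
        max_part = n
    if s == 0:
        if n == 0:
            yield ()
        return
    low = max(1, -(-n // s))            # first part is at least ceil(n / s)
    high = min(max_part, n - (s - 1))   # leave at least 1 for each other part
    for first in range(high, low - 1, -1):
        for rest in partitions_exact(n - first, s - 1, first):
            yield (first,) + rest

def percentile_in_feasible_set(observed, metric=gini):
    """Percent of the feasible set whose metric is <= the observed value."""
    n, s = sum(observed), len(observed)
    scores = [metric(p) for p in partitions_exact(n, s)]
    target = metric(observed)
    return 100 * sum(score <= target for score in scores) / len(scores)

observed = (6, 1, 1, 1)  # hypothetical distribution with N = 9, S = 4
print(gini(observed), percentile_in_feasible_set(observed))
```

For larger N and S the same percentile could be estimated from a random sample of the feasible set instead of the full enumeration.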

Future Directions: New combinatorial feasible sets

Integer composition: all ordered ways that S positive integers can sum to N

6 = 3+2+1 = 1+2+3 = 3+1+2 (three different compositions of 6, but one partition)
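A short sketch of how compositions differ from partitions, using illustrative function names. The number of compositions of N into S positive parts is the binomial coefficient C(N-1, S-1), and collapsing compositions by sorting recovers the partitions.

```python
from math import comb

def count_compositions(n, s):
    """Number of ordered ways to write n as a sum of s positive integers."""
    return comb(n - 1, s - 1)

def compositions(n, s):
    """Yield every composition of n into s positive parts, as tuples."""
    if s < 1:
        return
    if s == 1:
        if n >= 1:
            yield (n,)
        return
    for first in range(1, n - s + 2):  # leave at least 1 for each later part
        for rest in compositions(n - first, s - 1):
            yield (first,) + rest

ordered = list(compositions(6, 3))
unordered = {tuple(sorted(c, reverse=True)) for c in ordered}

print(len(ordered), count_compositions(6, 3))  # ordered sums
print(sorted(unordered, reverse=True))         # the partitions they collapse to
```

The feasible set of compositions is therefore larger than the feasible set of partitions for the same N and S, because each partition appears once for every distinct ordering of its parts.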

[Figure: log abundance vs. rank]

Pragmatic: explanations & predictions using few inputs

Mathematical: combinatorics can be used to characterize and understand observable variation in nature

System specific: patterns attributed to specific processes are constrained by general variables. What drives the values of the variables?

Policy, management, & philosophy: Would you want to know if the most costly, likely, preferred outcome was 95% similar to 95% of all others? Why?