Curso de Geoestadistica en Ingles



  • MATH RECALL / NOTATION PROGRAMS / FACT SHEETS

    Content

    Double Bend Sign
    - For your information only
    - Not to be remembered!

    Math Recall
    - Sum sign
    - Integral ⇔ Area
    - Derivative ⇔ Slope
    - Minima / Maxima

    Notation
    - Statistics
    - Geostatistics

    Fact Sheets

    MATH RECALL - Conventions

    Computation order
    - Priority 1: power
    - Priority 2: multiplication, division
    - Priority 3: addition, subtraction
    - Left to right
    - Parentheses have priority, inside before outside
    - Examples:
      - 1 + 2 × 3 = 1 + (2 × 3) = 1 + 6 = 7
      - 1 × 4/2 = 4/2 = 2
      - (1 + 2) × 3 = 3 × 3 = 9
      - 1 + (2 × 3) = 1 + 6 = 7
      - ((1 + 3) + (2 × 3)) = (4 + 6) = 10

    Summation sign: $\sum_{i=1}^{4} a_i = a_1 + a_2 + a_3 + a_4$

    Product sign: $\prod_{i=1}^{4} a_i = a_1 \times a_2 \times a_3 \times a_4$

    Factorial: n!, e.g. 4! = 4 × 3 × 2 × 1

    MATH RECALL - Conventions

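    More about Σ

    $\sum_{i=1}^{2} \lambda_i = \lambda_1 + \lambda_2$

    $\sum_{j=1}^{3} \lambda_j = \lambda_1 + \lambda_2 + \lambda_3$

    $\sum_{i=1}^{2} \sum_{j=1}^{3} \gamma_{ij} = \gamma_{11} + \gamma_{12} + \gamma_{13} + \gamma_{21} + \gamma_{22} + \gamma_{23}$

    $\sum_{i=1}^{2} \sum_{j=1}^{3} \lambda_i \lambda_j \gamma_{ij} = \lambda_1 \lambda_1 \gamma_{11} + \lambda_1 \lambda_2 \gamma_{12} + \lambda_1 \lambda_3 \gamma_{13} + \lambda_2 \lambda_1 \gamma_{21} + \lambda_2 \lambda_2 \gamma_{22} + \lambda_2 \lambda_3 \gamma_{23}$

    [Figure f.150: grid of the index pairs (i, j): first row (1,1), (1,2), (1,3); second row (2,1), (2,2), (2,3)]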
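    The sum, product, factorial and double-sum conventions can be checked numerically. The sketch below is only an illustration: the values in `a`, `lam` and `gamma` are hypothetical and do not come from the course.

```python
from math import factorial, prod

# Hypothetical values a_1 ... a_4 (not from the course).
a = [2, 3, 5, 7]
total = sum(a)                 # a_1 + a_2 + a_3 + a_4
product = prod(a)              # a_1 x a_2 x a_3 x a_4
four_factorial = factorial(4)  # 4 x 3 x 2 x 1

# Hypothetical weights lambda_1..lambda_3 and values gamma_ij (i = 1..2, j = 1..3).
lam = [0.5, 0.3, 0.2]
gamma = [[0.0, 1.0, 2.0],
         [1.0, 0.0, 1.5]]

# Double sum over i = 1..2 and j = 1..3 of gamma_ij.
double_sum = sum(gamma[i][j] for i in range(2) for j in range(3))

# Double sum over i = 1..2 and j = 1..3 of lambda_i * lambda_j * gamma_ij.
weighted_sum = sum(lam[i] * lam[j] * gamma[i][j]
                   for i in range(2) for j in range(3))

print(total, product, four_factorial, double_sum, weighted_sum)
```

    Python lists are indexed from 0, so `gamma[0][2]` plays the role of γ_13.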

    MATH RECALL: Exercise 1

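    Let 3 Au values: z_1 = 1, z_2 = 2, z_3 = 3 g/t

    Compute:
    - $z_1 + z_2 + z_3$
    - $(z_1 + z_2) \times z_3$
    - $z_1 + z_2 \times z_3$
    - $\sum_{i=1}^{3} z_i$
    - $\sum_{i=1}^{3} 2 \times z_i$
    - $\frac{1}{2} \sum_{i=1}^{3} (z_i - 2)^2$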

  • CALCULUS - Definitions

    Variable: x
    - An expression, the value of which is unknown or subject to change.

    Domain of definition of x
    - Set of values that can be taken by x.
    - [a, b], i.e. a ≤ x ≤ b; ]a, b], i.e. a < x ≤ b
    - [a, b[, i.e. a ≤ x < b; ]a, b[, i.e. a < x < b
      where a and b are two constants.

    Function
    - An ordered set of pairs (x, y) such that for each x, there is one and only one y. Usual notation is:

    y = f(x)

    - Defined by:
      - domain of definition of x
      - the condition that must be satisfied by x and y

    CALCULUS - Functions

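    Parabola: y = x², -2 ≤ x ≤ 2

    [Figure f. 66: plot of y = f(x) = x² against x, for x from -3 to 3]

    Normal Distribution:

    $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right], \quad -\infty < x < +\infty$

    - μ: Mean
    - σ²: Variance

    [Figure f. 1a: plot of the normal density f(x) against x]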

    CALCULUS - Integral

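    Graphical representation

    [Figure f. 68aa: four panels showing the area under y = f(x) between a and b, approximated by trapezoids of width dx. One trapezoid: $\frac{f(x_1+dx)+f(x_1)}{2}\,dx$. Two trapezoids: $\sum_{i=1}^{2} \frac{f(x_i+dx)+f(x_i)}{2}\,dx$. Three trapezoids: $\sum_{i=1}^{3} \frac{f(x_i+dx)+f(x_i)}{2}\,dx$. As dx → 0, the sum becomes $\int_a^b f(x)\,dx$.]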

    CALCULUS - Integral

    Definition
    The integral of the function f(x) on the interval [a, b] is the limit of the sum

    $\sum_{i=1}^{n} \frac{f(x_i + dx) + f(x_i)}{2}\,dx$

    when dx → 0. Note that n → +∞ in [a, b].

    Notation

    $\int_a^b f(x)\,dx = \lim_{dx \to 0,\ n \to \infty} \sum_{i=1}^{n} \frac{f(x_i + dx) + f(x_i)}{2}\,dx$

    Graphical representation

    [Figure f. 68: the area under y = f(x) between a and b is $\int_a^b f(x)\,dx$]
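    The limit in the definition can be watched numerically. The sketch below sums trapezoids of width dx for the parabola y = x² on [-2, 2]; the function name `trapezoid_integral` and the choice of n are illustrative assumptions, not part of the course.

```python
def trapezoid_integral(f, a, b, n):
    """Sum of n trapezoids of width dx = (b - a) / n under f on [a, b]."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x_i = a + i * dx
        # Area of one trapezoid: average of the two heights times the width.
        total += (f(x_i + dx) + f(x_i)) / 2 * dx
    return total


def parabola(x):
    return x * x


coarse = trapezoid_integral(parabola, -2.0, 2.0, 4)     # dx = 1
fine = trapezoid_integral(parabola, -2.0, 2.0, 4000)    # dx = 0.001
exact = 16 / 3                                          # integral of x^2 on [-2, 2]

print(coarse, fine, exact)
```

    With 4 trapezoids the sum overestimates the area; as dx shrinks, the sum approaches the exact value 16/3.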

  • CALCULUS - Derivative

    Definition
    The derivative y′ = f′(x) is, when it exists, the limit of the ratio of the increment of y to the increment of x, when the increment of x tends towards 0.

    $f'(x) = \lim_{dx \to 0} \frac{dy}{dx} = \lim_{dx \to 0} \frac{f(x + dx) - f(x)}{dx}$

    The function f(x) is differentiable at x = x0 if f′(x0) exists.

    Graphical representation

    [Figure f. 67: two panels showing the curve y = f(x) and the line through the points (x0, f(x0)) and (x0+dx, f(x0+dx)), first for a large dx, then for a smaller dx]

    f′(x0) = tg(α), i.e. the slope of the tangent at x0, where α is the angle between the tangent and the x axis.

    MINIMA / MAXIMA: 1 variable

    [Figure: curve y = f(x) with horizontal tangents at x1 and x2, where f′(x1) = 0 and f′(x2) = 0]

    Example: y = x²

    [Figure f.123: parabola y = f(x) = x² for x from -3 to 3, with f′(0) = 0]

    $f'(x) = \lim_{dx \to 0} \frac{f(x+dx) - f(x)}{dx} = \lim_{dx \to 0} \frac{(x+dx)^2 - x^2}{dx} = \lim_{dx \to 0} \frac{x^2 + 2x\,dx + (dx)^2 - x^2}{dx} = \lim_{dx \to 0} (2x + dx) = 2x + 0 = 2x$

    f′(x) = 2x = 0 iff x = 0

    Slope of tangent = 0 ⇒ Minimum or Maximum (local or global).
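    The derivation for y = x² can be mirrored with a finite increment dx. In this sketch, the helper name `difference_quotient` and the chosen values of dx are assumptions made for illustration.

```python
def difference_quotient(f, x, dx):
    """Ratio (f(x + dx) - f(x)) / dx for a finite increment dx."""
    return (f(x + dx) - f(x)) / dx


def parabola(x):
    return x * x


# For y = x^2 the quotient equals 2x + dx, so it tends to 2x as dx shrinks.
slopes_at_1 = [difference_quotient(parabola, 1.0, dx) for dx in (1.0, 0.1, 0.001)]

# At x = 0 the slope tends to 0: this is where the parabola has its minimum.
slope_at_0 = difference_quotient(parabola, 0.0, 1e-6)

print(slopes_at_1, slope_at_0)
```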

    MINIMA / MAXIMA: Several variables

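    [Figure: surface z = f(x, y) with the sections f(x, b) and f(a, y) through the point (a, b). Along f(x, b), dz/dx = f_x(x, b) = 0 at x = a; along f(a, y), dz/dy = f_y(a, y) = 0 at y = b.]

    Example: f(x, y) = x² + y²

    [Figure f.124: surface z = f(x, y) = x² + y², with its minimum at x = 0, y = 0]

    $f_x(x, y) = \lim_{dx \to 0} \frac{f(x+dx, y) - f(x, y)}{dx} = \lim_{dx \to 0} \frac{(x+dx)^2 + y^2 - (x^2 + y^2)}{dx} = \lim_{dx \to 0} \frac{x^2 + 2x\,dx + dx^2 - x^2}{dx} = \lim_{dx \to 0} (2x + dx) = 2x$

    f_x(x, y) = 2x = 0 iff x = 0

    f_y(x, y) = 2y = 0 iff y = 0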

    MINIMA / MAXIMA: Under Constraint

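    [Figure f.125: surface z = f(x, y) = x² + y², showing the free minimum and the constrained minimum]

    Free Minimum

    Min f(x, y) = x² + y²

    $\frac{\partial f(x, y)}{\partial x} = 2x = 0$

    $\frac{\partial f(x, y)}{\partial y} = 2y = 0$

    ⇒ (x, y) = (0, 0)

    Constrained Minimum

    Min f(x, y) = x² + y² such that: g(x, y) = x - 1 = 0

    $\min h(x, y, \lambda) = f(x, y) - \lambda\, g(x, y) = x^2 + y^2 - \lambda x + \lambda$

    $\frac{\partial h(x, y, \lambda)}{\partial x} = 2x - \lambda = 0$

    $\frac{\partial h(x, y, \lambda)}{\partial y} = 2y = 0$

    $\frac{\partial h(x, y, \lambda)}{\partial \lambda} = -(x - 1) = 0$

    ⇒ (x, y, λ) = (1, 0, 2)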
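    The free and constrained minima of f(x, y) = x² + y² can be cross-checked by brute force. The sketch below is not the Lagrange method itself: it searches a grid (an assumption made for simplicity) and then recovers λ from the condition 2x - λ = 0, which corresponds to h = f - λg.

```python
def f(x, y):
    return x ** 2 + y ** 2


# Hypothetical search grid: -2.0, -1.9, ..., 2.0 in both directions.
grid = [i / 10 for i in range(-20, 21)]

# Free minimum: smallest f over the whole grid, stored as (value, x, y).
free_min = min((f(x, y), x, y) for x in grid for y in grid)

# Constrained minimum: g(x, y) = x - 1 = 0 forces x = 1, only y is free.
constrained_min = min((f(1.0, y), 1.0, y) for y in grid)

# Lagrange multiplier from the stationarity condition 2x - lambda = 0.
lagrange_multiplier = 2 * constrained_min[1]

print(free_min, constrained_min, lagrange_multiplier)
```

    The grid search agrees with setting the derivatives to zero: the free minimum is at (0, 0) and the constrained minimum at (1, 0), with λ = 2.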

  • CALCULUS: Summary

    Variable: x

    Function: y = f(x)

    Integral:

    $\int_a^b f(x)\,dx = \lim_{dx \to 0,\ n \to \infty} \sum_{i=1}^{n} \frac{f(x_i + dx) + f(x_i)}{2}\,dx$

    Derivative:

    $f'(x) = \lim_{dx \to 0} \frac{dy}{dx} = \lim_{dx \to 0} \frac{f(x + dx) - f(x)}{dx}$

    f′(x0) = tg(α)

    [Figure f.68b: curve y = f(x) showing the area $\int_a^b f(x)\,dx$ between a and b, the tangent at x0, and a point x1 where f′(x1) = 0]

    Minimum/Maximum if derivative is zero.

    FACT SHEET

    - Integral = area
    - Derivative = slope
    - Integrals and derivatives are linear operators.
    - Minimum/Maximum when derivative is zero.

  • NOTATION (1)

    Summation sign: $\sum_{i=1}^{4} a_i = a_1 + a_2 + a_3 + a_4$

    Random Variable (RV)
    - X, Y: random variables
    - Z(x): random variable Z at location x
    - x, y, z(x): realizations of X, Y, Z(x)
    - P, Prob: probability

    Random Function
    - Z(x): random function
    - {Z(x), x ∈ D}: set of RVs Z(x), x within the deposit D (or geological domain)

    Descriptive Statistics:
    - Roman letters
    - m, m_X, x̄: mean, average
    - s², s²_X: variance
    - cv_X: coefficient of variation
    - x_i: ith quantile; x_50: median
    - Cov(X, Y): covariance
    - Cor(X, Y): correlation coefficient

    NOTATION (2)

    Model Parameters:
    - Greek letters
    - μ, μ_X, E(X): mean, average, expectation
    - σ², σ²_X, Var(X): variance
    - σ_{X,Y}: covariance
    - ρ_{X,Y}: correlation coefficient

    Distributions
    - pdf: probability density function (probability distribution)
    - cdf: cumulative distribution function
    - f_X, f_XY: univariate, bivariate pdfs
    - F_X, F_XY: univariate, bivariate cdfs
    - X ~ N(μ_X, σ²_X): normal distribution, mean μ_X, variance σ²_X

    Variogram
    - γ(h): variogram for distance h
    - γ̄(V, V): average variogram value within block V
    - γ̄(V, i) = γ̄(i, V): average variogram value between sample i and block V

    Block variance:
    - Var(Z_V), D²(V|∞), (V = Volume)
    - Var(Z_B), D²(B|∞), (B = Block)

    NOTATION (3)

    Miscellaneous:
    - Greek letters
    - AFC: affine correction
    - ILC: indirect lognormal correction
    - IK: indicator kriging
    - OK: ordinary kriging
    - QQ: quantile/quantile plot
    - RV: random variable
    - RF: random function
    - SMU: selective mining unit
    - AVE, Ave: average
    - VAR, Var: variance
    - RLVAR, RlVar: relative variance
    - STDV, StDv: standard deviation
    - RLSTDV, RlStDv: relative standard deviation
    - CV: coefficient of variation
    - RLP: relative pairwise (variogram)

  • FACT SHEET (1)

    Recall Calculus

    Integral = area Derivative = slope

    Minimum/Maximum when derivative is zero.

    Univariate Statistics (Description)

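    $m_X = \frac{1}{N} \sum x_i$

    $s_X^2 = \frac{1}{N} \sum (x_i - m_X)^2 = \frac{1}{N} \sum x_i^2 - m_X^2$

    $m_{(aX+bY+c)} = a\,m_X + b\,m_Y + c$

    $s^2_{aX+b} = a^2 s_X^2$

    Univariate Statistics (Model)

    $E(X) = \mu_X$

    $\mathrm{Var}(X) = \sigma_X^2 = E[(X - \mu_X)^2] = E(X^2) - \mu_X^2$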

    FACT SHEET (2)

    Univariate Statistics (Model)

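    $E[aX + bY + c] = a\,E[X] + b\,E[Y] + c$

    $\mathrm{Var}[aX + b] = a^2\,\mathrm{Var}[X]$

    Bivariate Statistics (Description)

    $s_{X,Y} = \frac{1}{N} \sum [x_i - m_X][y_i - m_Y] = \frac{1}{N} \sum [x_i y_i] - m_X m_Y$

    $\mathrm{Var}(aX + bY + c) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$

    Bivariate Statistics (Model)

    $\mathrm{Cov}[X, Y] = \sigma_{X,Y} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$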

    FACT SHEET (3)

    Variogram, Covariance Function

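    $\gamma(h) = \frac{1}{2N_h} \sum [z(x_i) - z(x_i + h)]^2$

    $\gamma(h) = \frac{1}{2} E\left[(Z(x) - Z(x + h))^2\right]$

    $C(h) = E[Z(x)\,Z(x + h)] - \mu_{Z(x)}\,\mu_{Z(x+h)}$

    $\gamma(h) = C(0) - C(h)$

    Block Variance

    $\mathrm{Var}[Z_V(x)] = \mathrm{Var}[Z(x)] - \bar{\gamma}(V, V)$

    Estimation Variance

    $\mathrm{Var}(Er) = 2 \sum_{i=1}^{N} \lambda_i \bar{\gamma}(V, i) - \sum_{i=1}^{N} \sum_{j=1}^{N} \lambda_i \lambda_j \gamma_{ij} - \bar{\gamma}(V, V)$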

    FACT SHEET (4)

    Ordinary Kriging

    $\sigma^2_{OK} = \sum_{i=1}^{N} \lambda_i \bar{\gamma}(V, i) + \mu - \bar{\gamma}(V, V)$

    Indicator Kriging

    No new formulas

    Cross-Validation/Reconciliation

    No new formulas

  • UNIVARIATE STATISTICS

    Content

    Concepts, notation:
    - Random variable, random function
    - Probability
    - Stationarity

    Univariate description:
    - Maps: values, contours, ...
    - Graphs: histogram, boxplot, ...
    - Statistics: mean, variance, ...

    Univariate model:
    - Probability distribution
    - Parameters: mean (expectation), variance, ...
    - Distributions: normal, lognormal, ...

    Applications:
    - EDA envelope
    - Geology model
    - Declustering
    - Compositing

    BASIC STATISTICS: Random Variable

    Cast of a die:

    P (X = i) = 1/6, i = 1, . . . , 6,

    - X is the random variable (RV): the value shown on the face after casting the die.

    Random variable (RV):
    - Value cannot be predicted, but
    - Possible results are outcomes or realizations
    - Outcomes occur according to a probability law

    Many geological attributes (Cu grade, lithotype, etc.) have random characteristics:
    - Are not known, unless sampled
    - Possible values are generally known
    - Some values are more frequent than others

    Attributes can be interpreted as realizations of RVs:
    - Discrete: lithotype, ...
    - Continuous: Au, Cu, density, ...

    We will try to infer relevant characteristics of these RVs.

    BASIC STATISTICS: Probability

    Cast of a die:
    - X: RV result from casting a die
    - The observed relative frequency of getting 2 is:

    $p(2) = \frac{\text{Number of 2s obtained}}{\text{Number of casts}}$

    - p(2) tends towards the probability of having 2 when the number of casts is large (infinite):

    p(2) = Prob(X = 2), if the number of casts is infinite

    Probability of an event: its relative frequency when the number of trials is large.

    Properties:
    - 0 ≤ probability ≤ 1
    - Sum of probabilities of all disjoint outcomes is 1

    Cast of a die, X:

    P(X = i) = 1/6, i = 1, ..., 6

    P(X = 1) + ... + P(X = 6) = 1

    Z: Cu value at a given location (%):

    P(Z = 4.4673829) = 0

    0 ≤ P(Z ∈ [0.1, 1.5]) ≤ 1

    P(Z ≤ 100) = 1
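    The statement that the relative frequency tends towards the probability can be illustrated by simulating casts of a die. This is only a sketch: the function name `relative_frequency`, the seed and the numbers of casts are arbitrary assumptions.

```python
import random


def relative_frequency(face, n_casts, seed=0):
    """Observed relative frequency of `face` in n_casts simulated casts."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(n_casts) if rng.randint(1, 6) == face)
    return hits / n_casts


few_casts = relative_frequency(2, 60)
many_casts = relative_frequency(2, 200_000)

print(few_casts, many_casts, 1 / 6)
```

    With few casts the relative frequency can be far from 1/6; with many casts it settles close to it.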

  • BASIC STATISTICS: Notation

    z(x) is the grade at location x and is interpreted as a realization of a random variable (RV) Z(x).

    Attention!
    - x is a location here
    - Sometimes X is used for a random variable
    - X(x) is RV X located at x ...

    UPPER CASE: Random Variables (RV) Z(x)
    lower case: Realizations z(x)

    Random Variables
    - Point grades at locations:
      - x: Z(x)
      - x0: Z(x0), or Z0
      - xi: Z(xi), or Zi
    - Block/Volume grades at location x:
      - Z_B(x), or Z_V(x)

    Realizations
    - z(x), z(x0), z0, z(xi), zi, z_B(x), z_V(x)

    Important:
    - Do not mix different supports:
      - e.g. sample and block grades
      - They have different statistical characteristics.
    - Do not mix apples and oranges.

    BASIC STATISTICS: Stationarity

    Roughly means that the characteristics of the RV Z(x) do not depend on the location x.

    Concept intuitively used by geologists when:
    - Defining geological domains
    - Computing statistics

    Important:
    - Split the deposit into geological domains that have different mineralization characteristics.
    - Do not mix apples and oranges.

    Geology first!

    Mineralization within a geology domain is usually considered +/- stationary, i.e. with similar characteristics.

  • UNIVARIATE DESCRIPTION: Maps

    List of values and coordinates

    East     North    Au (g/t)
    134.2    253.2    8.5
    628.2    431.7    3.2
    330.5    320.3    6.2
    333.8    382.8    8.1
    ...      ...      ...

    Maps

    [Figure f. 175: four maps of the same samples: Locations & Values, Symbols, Contours, Colour Scale. Posted Au values: 0.8, 1.5, 1.2, 8.5, 7.5, 13, 6.2, 8.0, 3.2, 3.9, 8.1 g/t.]

    UNIVARIATE DESCRIPTION: Histogram

    A graph showing the (relative) frequencies of occurrence of values within classes of equal amplitudes.

    [Figure f. 3a: histogram of Au grades; frequency (%) from 0 to 30 against Au (g/t) from 0 to 16; the class P(2. < Au < 4.) is highlighted]

    Remarks:
    - Class interval: 2.0 g/t
    - Class interval includes the upper bound, but not the lower bound
    - 20% of Au values are in the ]2.0, 4.0] g/t class
    - Total area defined by classes is 100% or 1.
    - Different scales are sometimes needed to observe low and high values.

    UNIVARIATE DESCRIPTION: Examples

    [Figure f. 62: two histograms of BH Au (g/t), with frequency on the vertical axis. HISTOGRAM - Logarithmic Scale: Au from 0.1 to 100 g/t, frequency from 0 to 0.200. HISTOGRAM - Arithmetic Scale: Au from 0 to 20 g/t, frequency from 0 to 0.300.]

    Statistics listed on both histograms:
    - Number of data: 99688
    - Number trimmed: 2295
    - Mean: 2.170
    - Std. dev.: 4.949
    - Coef. of var.: 2.281
    - Maximum: 330.000
    - Minimum: 0.100

    UNIV. DESCRIPTION: Cumulative Histogram

    A graph showing the (relative) cumulative frequencies of occurrence of values below a cutoff.

    [Figure f. 4a: cumulative histogram of Au grades, drawn together with the histogram; cumulative frequency (%) from 0 to 100 against Au (g/t) from 0 to 16; 22% is read at 4 g/t]

    Remarks
    - 22% of Au values are ≤ 4 g/t, or
    - Prop(Au ≤ 4) = 0.22,
    - Cumulative frequency of the last class is 100% or 1

  • UNIV. DESC.: Location vs. Dispersion Statistics

    [Figure f. 3f: histogram of Au grades; frequency (%) from 0 to 30 against Au (g/t) from 0 to 16, with the location and the dispersion of the distribution marked]

    Location Statistics
    - Value around which grades are distributed
    - Ex.: mean grade

    Dispersion (Spread) Statistics
    - Statistics that indicate the spread of values
    - Ex.: variance

    UNIV. DESCRIPTION: Location Statistics

    Let N grade values: [z(x_i), i = 1, ..., N], where the x_i stand for the geographical coordinates.

    Location statistics:
    - Mean:

    $m_Z = \frac{z(x_1) + z(x_2) + ... + z(x_N)}{N} = \frac{1}{N} \sum_{i=1}^{N} z(x_i)$

    - Mode: most likely value
    - Quantile or percentile: z_q, such that Prop[z(x_i) ≤ z_q] = q%
    - Median: z_50, or 50th percentile

    UNIV. DESCRIPTION: Dispersion Statistics

    Dispersion statistics:
    - Variance:

    $s_Z^2 = \frac{1}{N} \sum_{i=1}^{N} [z(x_i) - m_Z]^2 \ge 0$

    - Standard deviation: $s_Z = +\sqrt{s_Z^2}$
    - Coefficient of variation: coef. var. = s_Z / m_Z
    - Interquartile range: z_75 - z_25

    An easy way to compute the variance (Exercise 2):

    $s_Z^2 = \left( \frac{1}{N} \sum_{i=1}^{N} z_i^2 \right) - m_Z^2$

    s²_Z = Mean of Squares - Square of Mean
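    The two ways of computing the variance can be compared on a small data set. In this sketch the three grades are hypothetical, and the function names are chosen for illustration; the variance divides by N, as in the formulas of this course.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def variance(values):
    """Definition: mean of the squared deviations from the mean (divides by N)."""
    m = mean(values)
    return sum((z - m) ** 2 for z in values) / len(values)


def variance_shortcut(values):
    """Mean of Squares - Square of Mean."""
    return mean([z * z for z in values]) - mean(values) ** 2


grades = [1.0, 2.0, 3.0]  # hypothetical Au grades in g/t
m_z = mean(grades)
s2_z = variance(grades)
s_z = sqrt(s2_z)
cv_z = s_z / m_z

print(m_z, s2_z, variance_shortcut(grades), s_z, cv_z)
```

    Both routes give the same variance. The shortcut needs a single pass over the data, but it subtracts two large numbers when grades are large relative to their spread, so the definition is the safer choice in floating point.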

    UNIV. DESCRIPTION: Statistics

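    [Figure f. 3e: histogram of Au grades; frequency (%) from 0 to 30 against Au (g/t) from 0 to 16, with the mode, the mean and the variance marked. The quantiles Au25 and Au75 each cut off 25% of the values, and the interquartile range is Au75 - Au25.]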

  • UNIV. DESCRIPTION: Coefficient of Variation

    What is it?
    - Another measure of spread

    $\text{Coef.Var.} = \frac{\text{Standard Deviation}}{\text{Mean}}$

    Why?
    - Unit-less

    Example (Unit = oz/t)
    - Mean grade = 1 oz/t ≈ 30 g/t
    - Variance = 1.00 (oz/t)²
    - Coef.Var. = 1.00

    Unit    Grade    Variance    Coef.Var.
    oz/t    1.00     1.00        1.00
    dwt     20.0     400.        1.00
    mopt    1000.    10^6        1.00
    g/t     30.0     900.        1.00

    Rule of thumb
    - Coef.Var. < 1.5 ⇒ No problems
    - Coef.Var. > 3.0 ⇒ Problems

    UNIV. DESC.: Weighted Mean and Variance

    Let N (Grade, Weight) values: [z_i, w_i, i = 1, ..., N], with $\sum_{i=1}^{N} w_i = 1$

    Weighted mean

    $m_Z = \sum_{i=1}^{N} w_i z_i$

    Weighted variance

    $s_Z^2 = \sum_{i=1}^{N} \left[ w_i (z_i - m_Z)^2 \right] = \left( \sum_{i=1}^{N} w_i z_i^2 \right) - m_Z^2$

    ⇒ Example: Declustered mean and variance (to be seen later)
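    A minimal sketch of the weighted mean and variance follows. The grades and weights are hypothetical; the weights are assumed to sum to 1, as required above.

```python
def weighted_mean(values, weights):
    return sum(w * z for z, w in zip(values, weights))


def weighted_variance(values, weights):
    """Weighted mean of the squared deviations from the weighted mean."""
    m = weighted_mean(values, weights)
    return sum(w * (z - m) ** 2 for z, w in zip(values, weights))


grades = [1.0, 2.0, 3.0]   # hypothetical grades z_i in g/t
weights = [0.5, 0.3, 0.2]  # hypothetical weights w_i, summing to 1

m_w = weighted_mean(grades, weights)
s2_w = weighted_variance(grades, weights)

# Same variance through the second form: sum of w_i * z_i^2 minus m^2.
s2_w_shortcut = weighted_mean([z * z for z in grades], weights) - m_w ** 2

print(m_w, s2_w, s2_w_shortcut)
```

    With equal weights 1/N, both functions reduce to the ordinary mean and variance.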

    UNIV. DESC.: Declustered Histogram

    [Figure: Au Naive Histogram and Au Declustered Histogram; frequency from 0 to 0.120 against Au (g/t) from 0 to 10]

    Statistic      Naive     Declustered
    Nb. of data    4296      4296
    Mean           2.059     1.763
    Std. dev.      2.179     1.984
    Coef. var.     1.058     1.125
    Maximum        16.000    16.000
    Minimum        0.000     0.000

    Note the 14% drop in grade due to declustering.

    UNIV. DESCRIPTION: Various Means

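    Some software lists several means when computing statistics.

    Arithmetic mean:

    $m_{aZ} = \frac{1}{N} \sum_{i=1}^{N} z(x_i)$

    Geometric mean:

    $m_{gZ} = \sqrt[N]{\prod_{i=1}^{N} z(x_i)}$

    Harmonic mean:

    $m_{hZ} = \left( \frac{1}{N} \sum_{i=1}^{N} \frac{1}{z(x_i)} \right)^{-1}$

    Relation: $m_{hZ} \le m_{gZ} \le m_{aZ}$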
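    The three means and their ordering can be checked with Python's standard `statistics` module. The grades below are hypothetical and must be strictly positive for the geometric and harmonic means.

```python
from statistics import geometric_mean, harmonic_mean, mean

grades = [1.0, 2.0, 4.0]  # hypothetical positive grades in g/t

m_a = mean(grades)            # arithmetic mean
m_g = geometric_mean(grades)  # N-th root of the product
m_h = harmonic_mean(grades)   # inverse of the mean of the inverses

print(m_h, m_g, m_a)
```

    For these values the harmonic mean is the smallest and the arithmetic mean the largest, in line with the relation above.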

  • UNIVARIATE DESCRIPTION: Exercise 3

    Let 11 Au values:

    0.1  0.2  0.7  0.8  0.9  1.2  2.0  2.4  3.5  5.7  18.0

    Draw the histogram, choosing a class interval so that the shape of the distribution is reproduced.

    Compute the:
    - mean, median;
    - variance, standard deviation;
    - coefficient of variation.

    Comment on your results.

    HINTS for variance:
    - Variance = Mean of Squares - Square of Mean, i.e.:

    $s_Z^2 = \left( \frac{1}{N} \sum_{i=1}^{N} z_i^2 \right) - m_Z^2$

    - Compute the means of z_i and z_i²
    - Then compute the variance

    UNIV. DESCRIPTION: Properties of m

    Demonstration given in Exercise 4

    Let two RVs X and Y, sampled at N locations. We know:
    - The means: m_X and m_Y
    - Three constants: a, b, c

    Properties of the mean:
    - m_(X+Y) = m_X + m_Y
    - m_(aX) = a m_X
    - m_(X+a) = m_X + a

    More generally
    - m_(aX+bY+c) = a m_X + b m_Y + c

    ⇒ The mean (average) is a linear operator.

    UNIV. DESCRIPTION: Properties of s2

    Demonstration given in Exercise 4

    Let two RVs X and Y, sampled at N locations. We know:
    - The means: m_X and m_Y
    - The variances: s²_X and s²_Y
    - Two constants: a, b

    Properties of the variance:
    - s²_(X+Y) = ? (see bivariate statistics)
    - s²_(aX) = a² s²_X
    - s²_(X+a) = s²_X

    More generally
    - s²_(aX+b) = a² s²_X

    ⇒ The variance is not a linear operator.

  • UNIV. DESC.: Boxplot

    A graph summarizing the essential statistics of a distribution.

    [Figure f. 61: boxplot of Au grades, Au (g/t) axis from 0 to 16, marking the minimum, lower quartile, median, mean, upper quartile, maximum and outliers]

    Multiple boxplots can be displayed on the same page.

    Great display!

    UNIV. DESC.: Boxplots

    [Figure f. 63: AU BOXPLOTS for domains DOM-03 to DOM-07, logarithmic Au axis from 0.1 to 1000.0]

    Statistic         DOM-03     DOM-04     DOM-05     DOM-06     DOM-07
    Number of data    4859       28050      20902      13793      19117
    Mean              4.7667     1.2245     0.7509     6.1199     10.6783
    Std. Dev.         21.9575    6.3134     3.7238     36.3533    45.2865
    Coef. of Var.     4.6065     5.1559     4.9593     5.9402     4.241
    Maximum           680.0      370.0      398.0      972.0      1000.0
    Upper quartile    2.9        0.9        0.5        4.5        6.9
    Median            1.0        0.23       0.11       1.2        2.25
    Lower quartile    0.3        0.1        0.1        0.2        0.5
    Minimum           0.01       0.01       0.01       0.01       0.01

    UNIV. DESC.: Piecharts

    [Figure f. 64: AU PIECHARTS, three pie charts of the domain proportions]

    Domain    By number of samples    By sample weight    By sample weight x grade
    DOM-03    6%                      2%                  6%
    DOM-04    32%                     29%                 17%
    DOM-05    24%                     56%                 21%
    DOM-06    16%                     6%                  18%
    DOM-07    22%                     7%                  38%

    Total = 86721 samples.

    Useful to show the relative importance of the geological domains.

    UNIV. DESC.: Orientation Plot

    [Figure fig192: Orientations of Consecutive Pairs in Same Hole. Circular plot with azimuths marked every 10° (N, E, S, W) and dips from -10 to -80. Symbols show the number of pairs per orientation: < 10 pairs, 10 to 100 pairs, 100 to 1,000 pairs, 1,000 to 10,000 pairs, > 10,000 pairs.]

  • UNIV. DESC.: Standardized Variable

    Standardized random variables are often used in statistics. For example:
    - Affine correction (e.g. sample to block grade distribution).

    Standardizing consists in subtracting the mean from the variable, and then dividing by the standard deviation.

    Standardizing: $X \rightarrow \frac{X - m_X}{s_X}$

    [Figure f.176: the original distribution of X (mean m_X), the centred distribution of X - m_X (mean 0.0), and the standardized distribution of (X - m_X)/s_X (mean 0.0, standard deviation 1.0)]

    Mean and variance of a standardized random variable (Exercise 5):
    - Mean: 0
    - Variance: 1

    UNIV. DESC.: Standardized Variable

    Example

    Let 2 Au values (X): 1 g/t, 9 g/t

    The mean, variance, and std. deviation are:

    $m_X = 5$

    $s_X^2 = \left( \frac{1}{2} \sum_{i=1}^{2} x_i^2 \right) - m_X^2 = (1 + 81)/2 - 25 = 16$

    $s_X = 4$

    The standardized values (Y) are:

    (1 - 5)/4 = -1, (9 - 5)/4 = 1

    The standardized variable mean, variance, and std. dev. are:

    $m_Y = 0$

    $s_Y^2 = \left( \frac{1}{2} \sum_{i=1}^{2} y_i^2 \right) - m_Y^2 = (1 + 1)/2 - 0 = 1$

    $s_Y = 1$
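    The two-value example can be reproduced in a few lines. The function name `standardize` is an assumption for illustration; the variance divides by N, as in the example.

```python
from math import sqrt


def standardize(values):
    """Subtract the mean, then divide by the standard deviation (variance over N)."""
    n = len(values)
    m = sum(values) / n
    s = sqrt(sum(v * v for v in values) / n - m * m)
    return [(v - m) / s for v in values]


x = [1.0, 9.0]  # the two Au values of the example, in g/t
y = standardize(x)

m_y = sum(y) / len(y)
s2_y = sum(v * v for v in y) / len(y) - m_y ** 2

print(y, m_y, s2_y)
```

    The standardized values are -1 and 1, with mean 0 and variance 1.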

    UNIVARIATE STATISTICS: Summary 1/3

    Random Variable (RV) Z; Realization z.

    z(x), the grade at location x, is interpreted as a realization of the RV Z(x).

    Stationarity roughly means that the characteristics of the RV Z(x) do not depend on the location x.

    Univariate Description:
    - histogram, boxplot, piechart;
    - mean, variance, standard deviation;
    - coefficient of variation;
    - quantiles, median.

    Properties
    - Mean:
      - m_(aX+bY+c) = a m_X + b m_Y + c
    - Variance:
      - Mean of Squares - Square of Mean
      - s²_(aX+b) = a² s²_X

    FACT SHEET

    Univariate Statistics (Description)

    $m_X = \frac{1}{N} \sum x_i$

    $s_X^2 = \frac{1}{N} \sum (x_i - m_X)^2 = \frac{1}{N} \sum x_i^2 - m_X^2$

    $m_{(aX+bY+c)} = a\,m_X + b\,m_Y + c$

    $s^2_{aX+b} = a^2 s_X^2$

  • UNIVARIATE MODEL: Introduction

    Models are needed to go beyond description. For example:
    - To build estimators (prediction)
    - To simplify problems (e.g. using a normal/lognormal distribution)
    - To solve problems (e.g. conditions needed to build an unbiased estimator)

    Examples of models:
    - Assumption that the grade at location x is a realization of a random variable at that location
      - z(x) is a realization of the RV Z(x)
    - Assumption of stationarity

    UNIV. MODEL: Probability Density Function (PDF)

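    Description: histogram

    [Figure f. 3b: histogram of Au grades; frequency (%) from 0 to 30 against Au (g/t) from 0 to 16; the class P(2. < Au < 4.) is highlighted]

    P(2 < Au ≤ 4) = 20%

    Model: probability density function (pdf)
    - Recall: integral

    [Figure f. 5a: pdf f_Z(z), from 0 to 0.3, against z from 0 to 16; the area between z = 2 and z = 4 is highlighted]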

    P(2

  • UNIV. MODEL: Parameters

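    Let Z(x) be a RV. The histogram has Nc classes. f(z) is the pdf of Z(x).
    - Recall: weighted mean and variance

    [Figure f. 165: Description: histogram (frq) with class values z_1, z_2, z_3, ... and frequencies f_1, f_2, ..., where $\sum f_i = 1$. Model: pdf f(z).]

    Mean: m_Z (description), μ_Z (model)
    - Description: $m_Z = \sum_{i=1}^{N_c} f_i z_i$
    - Model: $\mu_Z = \int_{-\infty}^{+\infty} z f(z)\,dz$

    Quantile: z_q (description), z_q (model)
    - Description: $P[z_i \le z_q] = q = \sum_{i,\ z_i \le z_q} f_i$
    - Model: $\int_{-\infty}^{z_q} f(z)\,dz = q$

    Variance: s²_Z (description), σ²_Z (model)
    - Description: $s_Z^2 = \sum_{i=1}^{N_c} f_i (z_i - m_Z)^2$
    - Model: $\sigma_Z^2 = \int_{-\infty}^{+\infty} (z - \mu_Z)^2 f(z)\,dz$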

    UNIV. MODEL: Parameters

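    [Figure f. 165: Description: histogram (frq) with class values z_1, z_2, z_3, ... and frequencies f_1, f_2, ..., where $\sum f_i = 1$. Model: pdf f(z).]

    Std. Dev.: s_Z (description), σ_Z (model)
    - Description: $s_Z = \sqrt{s_Z^2}$
    - Model: $\sigma_Z = \sqrt{\sigma_Z^2}$

    Coef. Var.: cv_Z (description), cv_Z (model)
    - Description: s_Z / m_Z
    - Model: σ_Z / μ_Z

    Remarks
    - Greek instead of Roman letters are used for model parameters: e.g. μ and σ instead of m and s.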

    UNIV. MODEL: Properties of μ and σ²

    Properties of μ and σ²
    - Are similar to the properties of m and s².
    - Demonstration is given in Exercise 6.

    Let 2 RVs, X and Y, and 3 constants, a, b, and c.

    Properties of the mean:
    - μ_(aX+bY+c) = a μ_X + b μ_Y + c

    ⇒ The mean is a linear operator

    Properties of the variance:
    - σ²_(aX+b) = a² σ²_X
    - σ²_(X+Y) = ?

    UNIV. MODEL: Notation

    Mean: μ(·) = E(·)
    - E(·) stands for Expectation
    - Ex.: μ_(X+Y) = E(X + Y)

    Variance: σ²(·) = VAR(·)
    - Ex.: σ²_(aX) = VAR(aX)

    Standard Deviation: σ(·) = STDV(·)

  • UNIV. MODEL: Expectations - Moments

    Suppose:
    - RV Z with density f;
    - A function g(Z).

    Then:

    $E[g(Z)] = \int_{-\infty}^{+\infty} g(z) f(z)\,dz$

    where E[·] stands for Expectation.

    Examples:

    g(Z)                 E[g(Z)]
    1                    1
    Z                    μ_Z (Moment of order 1)
    (Z - μ_Z)²           σ²_Z (Moment of order 2)
    (Z - μ_Z)³           Moment of order 3
    (Z - μ_Z)⁴           Moment of order 4
    (Z - μ_Z)³ / σ³      Skewness
    (Z - μ_Z)⁴ / σ⁴      Kurtosis (peakedness)

    Some software lists the skewness and kurtosis when computing statistics.

    UNIV. MODEL: Skewness - Kurtosis

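    Skewness: $\frac{E(Z - m_Z)^3}{s_Z^3}$
    - = 0: symmetry
    - < 0
    - > 0

    [Figure: three distributions with skewness = 0, < 0 and > 0; the positions of the mode, the median z50 and the mean m_Z are marked on the asymmetric ones]

    Kurtosis: $\frac{E(Z - m_Z)^4}{s_Z^4}$
    - = 3 (Normal)
    - > 3
    - < 3

    [Figure: three distributions with kurtosis = 3, > 3 and < 3; s²_Z is the same for all]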

  • UNIV. MODEL: Normal Distribution

    Let a RV X with mean μ_X and variance σ²_X.

    X is normally distributed if its pdf g_X(x) is:

    $g_X(x) = \frac{1}{\sqrt{2\pi\sigma_X^2}} \exp\left[ -\frac{(x - \mu_X)^2}{2\sigma_X^2} \right]$

    [Figure f. 1: normal density g_X(x) against x; μ_X: mean, σ²_X: variance]

    The notation is:

    $X \sim N(\mu_X, \sigma_X^2)$

    A normal distribution is fully defined by its mean, μ_X, and variance, σ²_X.

    UNIV. MODEL: Normal Distribution

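    Important probabilities:

    $P(\mu_X - \sigma_X \le X \le \mu_X + \sigma_X) = 68\%$

    $P(\mu_X - 2\sigma_X \le X \le \mu_X + 2\sigma_X) = 95\%$

    [Figure f. 2: normal density g_X(x) with the 68% and 95% intervals; 34% of the area lies on each side of μ_X within one σ_X, and a further 13.5% on each side between one and two σ_X]

    ⇒ Used for estimation confidence intervals. Cf. Section Estimation Variance.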
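    The 68% and 95% figures can be recovered from the error function, since for a normal RV the probability of falling within k standard deviations of the mean is erf(k/√2). The helper name `prob_within` is an assumption for illustration.

```python
from math import erf, sqrt


def prob_within(k):
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normally distributed X."""
    return erf(k / sqrt(2))


p_one_sigma = prob_within(1)
p_two_sigma = prob_within(2)

print(round(100 * p_one_sigma, 1), round(100 * p_two_sigma, 1))
```

    The result does not depend on the mean or the variance, which is why the same percentages apply to any normal distribution.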

    U. MODEL: Normal Cumulative Distribution

    $X \sim N(\mu_X, \sigma_X^2)$

    The cdf of X, G_X(x), is such that:

    $G_X(b) = P(X \le b)$

    [Figure: cdf G_X(x), rising from 0 through 0.5 to 1, with the value at x = b marked]

    P(X

  • UNIV. MODEL: Lognormal Distribution

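    If X is lognormally distributed, then ln(X) is normally distributed.

    $X \sim LN(\mu, \sigma^2) \Leftrightarrow \ln(X) \sim N(\alpha, \beta^2)$

    [Figure f 42a: lognormal density f_X(x) against x, with the median x50, and normal density g_ln(X)(ln(x)) against ln(x), with α = ln(x50)]

    X lognormal distribution parameters:

    $\mu = e^{\alpha + \beta^2/2}$

    $\sigma^2 = \mu^2 \left( e^{\beta^2} - 1 \right)$

    $x_{50} = e^{\alpha}$

    Y = ln(X) normal distribution parameters:

    $\alpha = \ln(\mu) - \beta^2/2$

    $\beta^2 = \ln\left( 1 + \frac{\sigma^2}{\mu^2} \right)$

    $y_{50} = \alpha = \ln(x_{50}) \ne \ln(\mu)$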
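    The two sets of lognormal relations are inverses of each other, which a round trip can confirm. In this sketch the symbols follow a common convention that is assumed here: `mu` and `sigma2` are the mean and variance of X, `alpha` and `beta2` the mean and variance of ln(X). The starting values are hypothetical.

```python
from math import exp, log


def lognormal_moments(alpha, beta2):
    """Mean and variance of X, given the mean and variance of ln(X)."""
    mu = exp(alpha + beta2 / 2)
    sigma2 = mu ** 2 * (exp(beta2) - 1)
    return mu, sigma2


def log_parameters(mu, sigma2):
    """Mean and variance of ln(X), given the mean and variance of X."""
    beta2 = log(1 + sigma2 / mu ** 2)
    alpha = log(mu) - beta2 / 2
    return alpha, beta2


alpha, beta2 = 0.0, 1.0                     # hypothetical parameters of ln(X)
mu, sigma2 = lognormal_moments(alpha, beta2)
alpha_back, beta2_back = log_parameters(mu, sigma2)
median = exp(alpha)                         # x50 = exp(alpha)

print(mu, sigma2, median, alpha_back, beta2_back)
```

    The median exp(α) is smaller than the mean, so ln(x50) differs from the logarithm of the mean.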

    UNIV. MODEL: Probability Plots

    [Figure f. 65: two probability plots of grade against the probability of exceeding grade (from 99.99% to 0.01%). PROBABILITY PLOT - DISTRIBUTION IS NOT NORMAL: arithmetic grade axis from 0 to 100. LOG PROBABILITY PLOT - DISTRIBUTION IS APPROX. LOGNORMAL: logarithmic grade axis from 0.1 to 1000.]

    UNIV. MODEL: Poisson Distribution

    Occurs for a discrete variable randomly distributed within a volume. Ex.: number of Au grains within a 50 g pulp sample.

    [Figure f. 142: Poisson distributions, Prob(X = k) from 0 to 0.6 against k, for λ = 0.5, 0.75, 1, 2, 3, 4, 5, 6 and 7 (Salamd, Ingamells, 1974)]

    Parameter (λ = mean = variance):

    $\mathrm{Prob}(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$

    ⇒ Sample size matters!
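    The Poisson formula shows why sample size matters: the expected number of grains λ grows with the sample mass, and the chance of catching no grain at all is e^(-λ). In the sketch below the function name `poisson_pmf` and the two values of λ are illustrative assumptions.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """Prob(X = k) for a Poisson variable with mean (and variance) lam."""
    return exp(-lam) * lam ** k / factorial(k)


# Hypothetical expected grain counts for a small and a ten times larger sample.
p_no_grain_small = poisson_pmf(0, 0.5)
p_no_grain_large = poisson_pmf(0, 5.0)

# The probabilities sum to 1 and the mean equals lam (sum truncated at k = 59).
total_prob = sum(poisson_pmf(k, 5.0) for k in range(60))
mean_count = sum(k * poisson_pmf(k, 5.0) for k in range(60))

print(p_no_grain_small, p_no_grain_large, total_prob, mean_count)
```

    With λ = 0.5 about 61% of the samples contain no grain; with λ = 5 fewer than 1% do.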

  • UNIV. STATISTICS: Summary 2/3

    Random Variable (X)

    Probability: P(X < a)

    Probability density function (pdf), cdf
    - Expectation or mean: E(X) = μ_X
    - Variance: σ²_X
    - Quantiles: x_25, x_50, x_75
    - Coef. variation: σ_X / μ_X
    - Skewness, kurtosis

    Normal distribution
    - Bell shape
    - Fully defined by its mean and variance
    - Notation: X ~ N(μ_X, σ²_X)
    - P(μ_X - σ_X < X < μ_X + σ_X) = 68%
    - Straight line on probability paper

    Lognormal distribution
    - Straight line on log-probability paper

    Poisson distribution

    FACT SHEET

    Univariate Statistics (Model)

    $E(X) = \mu_X$

    $\mathrm{Var}(X) = \sigma_X^2 = E[(X - \mu_X)^2] = E(X^2) - \mu_X^2$

    $E[aX + bY + c] = a\,E[X] + b\,E[Y] + c$

    $\mathrm{Var}[aX + b] = a^2\,\mathrm{Var}[X]$

  • Geology Model

    A good geology model is most important.

    The geology model should tell something about the mineralization, the structures, etc.

    A geology model consists of several geology domains. Each domain has its own statistical characteristics that do not depend on location (homogeneity, stationarity).

    Boundaries between geological domains can be hard, soft, or in-between.

    Statistics are to be computed per geology domain and within the EDA envelope.

    Useful statistics to check differences between domains:
    - Multiple boxplots, histograms
    - Multiple cumulative distributions
    - Contact plots
    - Variograms (different directions)

    [Figure lascris_012: PARADISE INC., Deposit: Paradise, SECTION: 9200 N, GEOLOGY + DDH SAMPLES. Cross-section with horizontal coordinates from 21400 to 22400 and elevations from -400 to 200. Geology legend: Overburden, Ox. Saprolite, Sul. Saprolite, Saprock, Crb Lch Brk, Crb Stbl Brk, Dyke, Wst. DDH Au legend with classes at 0.0, 0.5, 2.0 and 4 g/t.]

    EDA: Purpose

    Data familiarization

    Detecting possible errors

    Identifying/confirming different mineralizations

    Answering questions such as:

    - Ordinary or indicator kriging ?

    - What trimming values ?

    - Mean and variance ?

    Providing information for:

    - Model validation

    - Reconciliation


  • EDA Envelope (1/4)

    An EDA envelope is a 3D envelope within which statistics are computed.

    Which statistics?
    - Declustered mean, variance.
    - Choice of trimming values.
    - Choice of indicator cut-off grades.
    - Variogram.
    - Resource block model validation.

    Why an EDA envelope?

    - To restrict statistics to where it matters:
      - Fairly tight around sample locations
      - No extensive waste areas
      - Well covered with samples

    - To reduce the impact of fringes when declustering and computing statistics.

    - To make sure that comparisons made during validation (e.g. sample vs. kriged average grades) correspond to the same material (i.e. material within the EDA envelope for both samples and kriged estimates).

    EDA Envelope (2/4)

    [Figure f.185: project area and EDA envelope; legend: very low grades, "other" grades]

    How to define an EDA envelope?
    - Don't be too precise.
    - Fairly tight around the reasonably well sampled zone.
    - Around material that matters. A significant waste zone well below cut-off can be ignored.
    - Digitize on a series of benches, then wireframe and create a 0/1 indicator grid.
    - Generally, geology can be ignored when defining the EDA envelope.

    EDA Envelope (3/4)

    [Figure pdi-10: Paradise, 10 m composite Au values with the EDA envelope outline. Plan view, RL = 302.5 +/- 2.5 (5 ft bench, toe = 300); Easting 431600 to 432800, Northing 807800 to 809000. Legend: 10 m cmp Au (g/t), classes at 0.2, 1.0 and 2.0]

    EDA Envelope (4/4)

    Some remarks:
    - The EDA envelope is used to compute statistics.
    - All data within and outside the envelope can be used at the estimation step.
    - There can be estimates outside the EDA envelope.
    - A geology model has to be considered in addition to the envelope.

  • DECLUSTERING: Introduction

    Clusters of samples are common in the mining industry.

    [Figure f.116: map of sample locations showing clusters]

    Potential problem:
    - Clusters are often located within high grade zones. If not accounted for, they can cause a serious overestimation of the average grade and variability.

    Solution:
    - Declustering.
    - The objective is to reduce the weight of each clustered datum, more or less proportionally to the cluster sampling density.

    DECLUSTERING: Methods

    Cell Declustering

    - Superimpose a grid of cells on the data.
    - Cell size is roughly the average sample spacing, ignoring clusters. There is then on average 1 datum per cell, except where samples are clustered. The cells can be rectangular.
    - The declustered weight of a given sample is $1/N_c$, where $N_c$ is the number of samples located in the corresponding cell.

    [Figure f.184b: six samples (1, 2, 8, 6, 2 and 1 g/t Au) in five cells; the 8 and 6 g/t samples share one cell]

    2 per cell ==> w = 0.50 per sample

    1 per cell ==> w = 1.0 per sample

    Average

    Naive: $\dfrac{1 + 2 + 8 + 6 + 1 + 2}{6} = 3.33$ g/t Au

    Declustered: $\dfrac{1 + 2 + \frac{8 + 6}{2} + 1 + 2}{5} = 2.60$ g/t Au
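The cell declustering arithmetic above can be sketched in a few lines. This is a minimal illustration, not a production routine: the cell labels (`"c1"` to `"c5"`) and the function name `cell_declustered_mean` are hypothetical, and the samples are assumed to have already been assigned to cells.

```python
from collections import Counter

# Hypothetical encoding of the slide example: (cell id, Au grade in g/t).
# The 8 and 6 g/t samples fall in the same cell ("c3").
samples = [("c1", 1.0), ("c2", 2.0), ("c3", 8.0), ("c3", 6.0), ("c4", 1.0), ("c5", 2.0)]

def cell_declustered_mean(samples):
    """Weight each sample by 1/Nc, Nc being the number of samples in its cell."""
    counts = Counter(cell for cell, _ in samples)
    weights = [1.0 / counts[cell] for cell, _ in samples]
    total = sum(weights)  # equals the number of occupied cells
    return sum(w * z for w, (_, z) in zip(weights, samples)) / total

naive_mean = sum(z for _, z in samples) / len(samples)
declustered_mean = cell_declustered_mean(samples)
print(f"naive = {naive_mean:.2f} g/t, declustered = {declustered_mean:.2f} g/t")
```

Dividing by the sum of the weights rescales them so that they add up to 1, which is the convention used for the declustered statistics.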


    DECLUSTERING: Methods

    Polygonal declustering

    - In 2D, the declustered weights are proportional to the areas of the polygons of influence of the corresponding data.
    - In 3D, the same principle is applied on a bench basis.

    [Figure f.184C: polygons of influence of the six samples]

    Sample (g/t Au):    1    2    8    6    2    1
    Polygon area:      37   28   17   21   47   44

    Average

    Naive: $\dfrac{1 + 2 + 8 + 6 + 1 + 2}{6} = 3.33$ g/t Au

    Declustered: $\dfrac{37 \times 1 + 28 \times 2 + 17 \times 8 + 21 \times 6 + 47 \times 2 + 44 \times 1}{37 + 28 + 17 + 21 + 47 + 44} = 2.54$ g/t Au
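The polygonal example reduces to an area-weighted average. The sketch below reproduces that arithmetic, assuming the polygon areas have already been measured; it does not build the polygons themselves.

```python
# Sample grades (g/t Au) and areas of their polygons of influence, as read from the slide.
grades = [1.0, 2.0, 8.0, 6.0, 2.0, 1.0]
areas = [37.0, 28.0, 17.0, 21.0, 47.0, 44.0]

# Rescale the areas so that the declustered weights sum to 1.
total_area = sum(areas)
weights = [a / total_area for a in areas]

naive_mean = sum(grades) / len(grades)
declustered_mean = sum(w * z for w, z in zip(weights, grades))
print(f"naive = {naive_mean:.2f} g/t, declustered = {declustered_mean:.2f} g/t")
```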


    DECLUSTERING: Methods

    Kriging

    - Kriging is a good declustering tool (see later).
    - A regular grid of cells is superimposed on the data. The size of the cells does not matter too much.
    - The cells are kriged using the samples. The sample kriging weights are kept in memory.
    - The declustered weight of a given sample is the sum of the corresponding kriging weights kept in memory.
    - Note:
      - Declustering depends very much on the variogram model (and nugget value).

  • DECLUSTERING: Methods

    Nearest Neighbour Model

    - Data are used to estimate a regular cell/block model. The closest datum is used to estimate each block.
    - The resulting distribution is the distribution of estimated blocks.
    - The shape of the resulting distribution is very similar to the shape of the polygonally declustered distribution.
    - Advantage:
      - Can be done with commercial software.
    - Disadvantage:
      - Does not attach declustered weights to samples.
    - Notes:
      - Usual AMEC procedure.
      - NN model block size should be small, otherwise many samples may not be considered.

    DECLUSTERING: Methods

    Automatic cell declustering

    - Based on the assumption that clusters are always in high grade zones. The naive average is therefore overestimated.
    - Several cell sizes are automatically used for declustering.
    - The selected cell size is the one that gives the lowest average.

    [Figure f.177: declustered mean versus cell size (from tiny to huge), with the optimum cell size at the minimum of the curve]

    Note: tiny or huge cell size ⇒ no declustering.

    - Nice in theory, but often inconclusive in practice. Not recommended.

    DECLUSTERING: Methods

    Note 1:
    - Some declustered weights can be very large due to:
      - huge polygonal area (on the fringe)
      - special sample location (start/end of hole)
    - The solution consists in:
      - setting a maximum value when declustering
      - declustering within an EDA envelope

    Note 2:
    - Polygonal declustering can be incomplete when the declustering radius is smaller than some polygons.
    - The solution consists in:
      - Checking maps of declustered areas
      - Increasing the declustering radius and using the EDA envelope to control the fringes
      - Checking the declustering weights and trimming them if necessary.

    Useful displays
    - Histogram of weights
    - Maps of weight values
    - Maps of declustered areas

    DECLUSTERING: Statistics

    Consider $N$ Au values $[z(x_i),\ i = 1, \dots, N]$ and $[w_i,\ i = 1, \dots, N]$ rescaled declustered weights such that:

    $\sum_{i=1}^{N} w_i = 1$

    Declustered statistics are:
    - Mean:

    $m_Z = \sum_{i=1}^{N} w_i\, z(x_i)$

    - Variance:

    $s^2_Z = \sum_{i=1}^{N} w_i \left[z(x_i) - m_Z\right]^2 = \sum_{i=1}^{N} w_i\, z^2(x_i) - m^2_Z$

    - Standard deviation: $s_Z = \sqrt{s^2_Z}$.

    - Median: $z_{50}$ such that the sum of the declustered weights of the values less than $z_{50}$ is 0.5.

    Note: if pairs of values are available, the covariance (see bivariate statistics) can also be declustered.
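A minimal sketch of the declustered statistics follows. The function name `declustered_stats` is hypothetical, and the median uses one simple convention (the first sorted value at which the cumulated weight reaches 0.5); other interpolation conventions exist. The example data are the grades and polygon areas of the polygonal declustering slide.

```python
import math

def declustered_stats(values, weights):
    """Weighted mean, variance, standard deviation and median of the values."""
    total = sum(weights)
    w = [wi / total for wi in weights]          # rescale so the weights sum to 1
    mean = sum(wi * zi for wi, zi in zip(w, values))
    var = sum(wi * (zi - mean) ** 2 for wi, zi in zip(w, values))
    # Median: first sorted value at which the cumulated weight reaches 0.5.
    cumulated = 0.0
    median = None
    for zi, wi in sorted(zip(values, w)):
        cumulated += wi
        if cumulated >= 0.5:
            median = zi
            break
    return mean, var, math.sqrt(var), median

grades = [1.0, 2.0, 8.0, 6.0, 2.0, 1.0]          # g/t Au
areas = [37.0, 28.0, 17.0, 21.0, 47.0, 44.0]     # unscaled declustering weights
mean, var, std, median = declustered_stats(grades, areas)
print(f"mean = {mean:.2f}, variance = {var:.2f}, std. dev. = {std:.2f}, median = {median}")
```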


  • DECLUSTERING: Example (1/3)

    [Figures: Au naive histogram and Au declustered histogram (frequency versus Au, g/t)]

                     Naive    Declustered
    Nb. of data       4296           4296
    Mean             2.059          1.763
    Std. dev.        2.179          1.984
    Coef. var.       1.058          1.125
    Maximum         16.000         16.000
    Minimum          0.000          0.000

    -14% change in grade.

    DECLUSTERING: Example (2/3)

    [Figure: histogram of declustered weights (frequency versus declustered weight, log scale from 0.01 to 100)]

    Nb. of data 4296; mean 3.109; std. dev. 4.164; coef. var. 1.339; maximum 100.000; minimum 0.037

    Very few excessive weights. Keep as is, or trim to 40.

    DECLUSTERING: Example (3/3)

    [Figure: declustering polygons with geology, Elev: 202.5; Easting 75600 to 76400, Northing 94400 to 95400; polygon scale from 1 to 5200]

    Incomplete declustering in the northern portion of the map and in the elongated domain in the SW.

    DECLUSTERING: Exercise 7

    Consider the following sampling situation:

    [Figure f.118: map of sample locations, with north arrow and 100 m scale bar]

    What would be a reasonable cell declustering size?

  • COMPOSITING 1/3

    Support size (point, 2 m sample, block, etc.) is important.

    - Different support sizes ⇒ different variabilities.
    - Blocks are less variable than samples.

    In theory, samples must be representative of the population. 5 m samples are not representative of a 1 m sample population.

    (Most) estimation algorithms do not account for sample size, e.g. they do not distinguish between a 10 m and a 1 m sample.

    Solution: composite samples so that the resulting composite lengths are identical.

    [Figure f.178: original samples and regular composites across a "hard" geological boundary between Rock A and Rock B]

    COMPOSITING 2/3

    Compositing may be required if:
    - Sample lengths differ widely, e.g. an average length of 1.5 m with many 50 cm long samples centered on high grade veins.

    Before compositing:
    - Histogram of sample lengths.
    - Histograms of sample grades per interval of lengths.
    - If necessary, trim or cut very high grades (outliers) to avoid smearing them over much longer lengths (more on outliers in Section Bivariate Statistics).

    Composite length should be such that:
    - Enough variability is retained when estimating.
    - No geological boundary is crossed.
    - The block size is not exceeded:
      - 5 m benches: 2 or 3 m composites OK; 5 m is the maximum.

    If possible, composite only what is needed, i.e. leave samples untouched if their lengths are within the specified Min/Max limits.

    COMPOSITING 3/3

    Impact of compositing:
    - Original samples are lost;
    - Grade variability is reduced;
    - Number of samples is reduced;
    - Geological contacts can be smeared out.

    If an original sample length is very long, compositing will split it into many regular smaller lengths.
    - OK if the original grade is very low.
    - Problem if the original grade is very high, because the location of the high grade is unknown.

    After compositing:
    - Check for the smallest composites.
      - If necessary, discard some of them.
    - Check pre/post length-weighted histograms.
      - Means should be the same.
      - Variance should decrease after compositing.

    Visual check: display drill holes with the composited grade histogram on one side and the original grade histogram on the other side.

  • OTHER EDA TOOLS: Checking Trends

    [Figure f197a: 2.5 ft composites, declustered, all domains. Grade profiles by coordinate axes: bh_met1_tr (g/t, 0 to 3.987) versus Easting (39300 to 42370 m), Northing (77820 to 80120 m) and Elevation (-320 to 425 m). Number of data by coordinate axes: number of data (0 to 2662) versus the same Easting, Northing and Elevation intervals]

    OTHER EDA TOOLS: Grade profiles

    [Figure: comparison of Au, Ag, Cu, Zn, C and S for DDH161. Down-hole profiles versus depth (0 to 220): Gold + Silver (in g/t), Copper + Zinc (in %), Carbon + Sulphur (in %), each on a log scale from 0.01 to 100]

    OTHER EDA TOOLS: Boxplots

    [Figure f.63: Au boxplots per domain, log scale from 0.1 to 1000]

    AU BOXPLOTS

                       DOM-03    DOM-04    DOM-05    DOM-06    DOM-07
    Number of data       4859     28050     20902     13793     19117
    Mean               4.7667    1.2245    0.7509    6.1199   10.6783
    Std. Dev.         21.9575    6.3134    3.7238   36.3533   45.2865
    Coef. of Var.      4.6065    5.1559    4.9593    5.9402     4.241
    Maximum             680.0     370.0     398.0     972.0    1000.0
    Upper quartile        2.9       0.9       0.5       4.5       6.9
    Median                1.0      0.23      0.11       1.2      2.25
    Lower quartile        0.3       0.1       0.1       0.2       0.5
    Minimum              0.01      0.01      0.01      0.01      0.01

    OTHER EDA TOOLS: Multiple Probability Plots

    Different Mineralizations

    [Figure pdi_12: Probability Plot - Domain 04 - 07 - Au Dwt. Five panels of grade (log scale, 0.01 to 900) versus probability of exceeding grade (99.99% to 0.01%)]

  • UNIV. STATISTICS: Summary 3/3

    Geology model
    - Geology must be relevant to mineralization.

    EDA envelope

    Clusters:
    - Overestimation possible
    - Various declustering techniques:
      - Cell, polygonal, kriging, NN
    - Declustered mean, variance, and covariance.

    Compositing

    Other EDA tools:
    - Assay above cut-off statistics
    - Checking spatial anomalies
    - Checking trends
    - Checking grade profiles

    FACT SHEET

    No New Formulas


  • BIVARIATE STATISTICS

    Content

    Bivariate description:
    - Graphs: scattergrams, ...
    - Statistics: covariance, coefficient of correlation, ...

    Bivariate model:
    - Intuitive introduction

    Marginal and conditional statistics

    Applications:
    - Regression
    - Trimming, cutting outliers
    - Checking pairs of values
    - Checking geological boundaries

    BIVARIATE DESCRIPTION

    Scattergram and Bivariate Histogram

    List of pairs of values:

    X (Zn)   4.61   6.07   4.60   7.89   ...
    Y (Pb)   3.2    4.9    3.9    5.3    ...

    Scattergram:

    [Figure f.89: scattergram of y versus x]

    Bivariate histogram:

    [Figure f.90: bivariate histogram, i.e. number of pairs falling in each (x, y) cell]

    BIV. DESCRIPTION: Examples

    [Figure: Past VS Current Cu Sample Values, scattergram of cu_new (%) versus cu_pre (%), arithmetic scale]

  • BIV. DESCRIPTION: Joint Statistics

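    X and Y considered jointly:

    Covariance:

    $\mathrm{Cov}(X, Y) = \dfrac{1}{N} \sum_{i=1}^{N} [x_i - m_X][y_i - m_Y] = \left(\dfrac{1}{N} \sum_{i=1}^{N} x_i y_i\right) - m_X m_Y$

    Cov. = Mean of Products - Product of Means

    Coefficient of correlation:

    $\mathrm{Cor}(X, Y) = \dfrac{\mathrm{Cov}(X, Y)}{s_X s_Y}, \qquad -1 \le \mathrm{Cor}(X, Y) \le 1$

    Notes:

    - $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$

    - $\mathrm{Cor}(X, X) = 1$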

    BIV. DESCRIPTION: Examples of Correlation

    Various coefficients of correlation (Davis, 1973):

    [Figure f.33: six scattergrams of y versus x, labelled Cor = 0.98, Cor = 0.54, Cor = 0.16, Cor = 0.80, Cor = undefined and Cor = 0]

    The covariance and coefficient of correlation are measures of linear correlation.

    BIV. DESCRIPTION: Induced Correlation

    Pebble example (Davis, 1973):

    [Figure f.95: pebble axes A, B and C. Case 1: axis "A" > axis "B" > axis "C". Case 2: axes "A", "B", "C" picked randomly. (Davis, 1976)]

    Two elements A and B such that A + B = 100%.

    [Figure f.96: B versus A, both axes up to 100]

    Similar but less obvious results with more than two elements that add to a constant.

    BIV. DESCRIPTION: Induced Correlation

    X versus $X^2$:

    [Figure cor_induced.eps: histogram of VAR 1 (V1 ~ N(0, 1)), histogram of VAR 2 (V2 = V1 ** 2), and v1/v2 scattergram]

                      V1       V2
    No. of data     1000     1000
    Mean           -0.04     0.99
    Std. dev.       1.00     1.44
    Max.            4.00    16.07
    Min.           -3.34     0.00

    v1/v2 scattergram: correlation -0.06, rank correlation -0.05

  • BIV. DESCRIPTION: Exercise 8

    Consider 11 pairs of Au-S values:

    X (Au)   0.1    0.2    0.7    0.8    0.9    1.2    2.0    2.4    3.5    5.7    18.0
    Y (S)    1.01   2.54   0.60   1.72   2.63   1.82   3.25   2.34   3.68   4.89   2.24

    Draw the scattergram.

    Compute the:
    - covariance;
    - coefficient of correlation.

    Comment on your results.

    HINTS

    - The stats of X ($m_X$, $s^2_X$, $s_X$) have already been computed.

    - For the covariance:
      - Cov = Mean of Products - Product of Means

        $\mathrm{Cov}(X, Y) = \left(\dfrac{1}{N} \sum_{i=1}^{N} x_i y_i\right) - m_X m_Y$

      - Compute the mean of $x_i y_i$, then $m_X$ and $m_Y$.

    - For the correlation coefficient, use the fact that:
      - Variance of X = $s^2_X$ = 24.3 (computed earlier)
      - Variance of Y = $s^2_Y$ = 1.32 (graciously provided).

  • BIV. DESC.: Rank Correlation Coefficient

    Previous Exercise 8 Au and S values are:

    X (Au)   0.1    0.2    0.7    0.8    0.9    1.2    2.0    2.4    3.5    5.7    18.0
    Y (S)    1.01   2.54   0.60   1.72   2.63   1.82   3.25   2.34   3.68   4.89   2.24

    The ranks $R_{Au}$ and $R_S$ are:

    Au   1   2   3   4   5   6   7   8   9    10   11
    S    2   7   1   3   8   4   9   6   10   11   5

    The rank correlation coefficient is computed from the ranks of the data values, not the values themselves.

    The rank correlation coefficient is:
    - Robust with respect to outliers
    - Often a better measure of correlation for precious metal deposits.

    Example with previous Exercise 8 data:
    - Correlation coefficient = 0.22 (cf. solution)
    - Rank correlation coefficient = 0.59 (cf. below)

    BIV. DESC.: Rank Correlation Coefficient

    Compute intermediate quantities:

    - $N = 11$: number of data
    - $\sum R_{Au,i} = \sum R_{S,i} = 66$
    - $\sum R^2_{Au,i} = \sum R^2_{S,i} = 506$
    - $\sum R_{Au,i} R_{S,i} = 461$

    Compute covariance of the ranks:

    - $m_{R_{Au}} = \frac{1}{N} \sum R_{Au,i} = 66/11 = 6$
    - $m_{R_S} = m_{R_{Au}} = 6$
    - $\mathrm{Cov}(R_{Au}, R_S) = \left(\frac{1}{N} \sum R_{Au,i} R_{S,i}\right) - m_{R_{Au}} m_{R_S} = 461/11 - 6 \times 6 = 5.91$

    Compute rank correlation coefficient:

    - $s^2_{R_{Au}} = \left(\frac{1}{N} \sum R^2_{Au,i}\right) - m^2_{R_{Au}} = 506/11 - 6^2 = 10$
    - $s^2_{R_S} = s^2_{R_{Au}} = 10$
    - $\mathrm{Cor}(R_{Au}, R_S) = \mathrm{Cov}(R_{Au}, R_S)/(s_{R_{Au}} s_{R_S}) = 5.91/10 = 0.59$

    The rank correlation coefficient is not much affected by the 18 g/t Au outlier.
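The linear and rank correlation coefficients of the Exercise 8 data can be checked with a short script. This is a sketch using the 1/N convention of the slides; the helper names (`mean`, `covariance`, `correlation`, `ranks`) are hypothetical, and the ranking function assumes there are no ties, which holds for this data set.

```python
import math

# Exercise 8 data as listed in the text: Au (g/t) and S values.
au = [0.1, 0.2, 0.7, 0.8, 0.9, 1.2, 2.0, 2.4, 3.5, 5.7, 18.0]
s = [1.01, 2.54, 0.60, 1.72, 2.63, 1.82, 3.25, 2.34, 3.68, 4.89, 2.24]

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Mean of products minus product of means (1/N convention)."""
    return mean([xi * yi for xi, yi in zip(x, y)]) - mean(x) * mean(y)

def correlation(x, y):
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

def ranks(values):
    """Rank 1 for the smallest value; assumes no ties, as in this data set."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

linear_cor = correlation(au, s)
rank_cor = correlation(ranks(au), ranks(s))
print(f"linear correlation = {linear_cor:.2f}")
print(f"rank correlation   = {rank_cor:.2f}")
```

Replacing the values by their ranks before computing the correlation is what limits the influence of the single 18 g/t Au value.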


    BIV. DESCRIPTION: Properties of $m$ and $s^2$

    Demonstration given in Exercises 4 and 9.

    Consider two RVs X and Y. We have:
    - The means: $m_X$ and $m_Y$
    - The variances: $s^2_X$ and $s^2_Y$
    - The covariance: $\mathrm{Cov}(X, Y)$
    - Three constants: a, b, c

    Properties of 1 RV (X; recall):

    $m_{aX} = a\, m_X \qquad m_{X+b} = m_X + b$

    $s^2_{aX} = a^2 s^2_X \qquad s^2_{X+b} = s^2_X$

    Properties of 2 RVs (X & Y; new):

    $m_{X+Y} = m_X + m_Y$

    $s^2_{X+Y} = s^2_X + s^2_Y + 2\,\mathrm{Cov}(X, Y)$ (NEW!)

    More generally:

    $m_{aX+bY+c} = a\, m_X + b\, m_Y + c$

    $s^2_{aX+bY+c} = a^2 s^2_X + b^2 s^2_Y + 2ab\,\mathrm{Cov}(X, Y)$ (NEW!)

    The average is a linear operator.
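The general properties of the mean and variance of aX + bY + c can be verified numerically. The sketch below uses hypothetical paired values and constants chosen only for the check, with the 1/N convention for the variance and covariance.

```python
# Hypothetical paired values, used only to check the identities numerically.
x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [3.0, 1.0, 5.0, 4.0, 9.0]
a, b, c = 2.0, -3.0, 5.0

def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Covariance with the 1/N convention; cov(u, u) is the variance."""
    return mean([ui * vi for ui, vi in zip(u, v)]) - mean(u) * mean(v)

# New variable Z = aX + bY + c, computed pair by pair.
z = [a * xi + b * yi + c for xi, yi in zip(x, y)]

mean_direct = mean(z)
mean_formula = a * mean(x) + b * mean(y) + c
var_direct = cov(z, z)
var_formula = a**2 * cov(x, x) + b**2 * cov(y, y) + 2 * a * b * cov(x, y)

print(mean_direct, mean_formula)
print(var_direct, var_formula)
```

The two printed pairs agree up to rounding; note that the constant c shifts the mean but drops out of the variance.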

    BIV. DESCRIPTION: Conditional Statistics

    Conditional statistics:

    [Figure f.92: bivariate histogram with the conditional histograms y|x and x|y]

    - Univariate statistics computed on X, given that Y takes some value(s), and vice versa.
    - Conditional means: $m_{X|Y=y}$ and $m_{Y|X=x}$
    - Conditional variances: $s^2_{X|Y=y}$ and $s^2_{Y|X=x}$

  • BIVARIATE MODEL: Introduction

    As in the univariate case, models are needed to go beyond description to:
    - Build estimators (regression, prediction)
    - Solve problems (minimizing spread or errors when estimating)
    - Demonstrate properties

    In fact, multivariate models are sometimes needed:
    - Simulation (sequential Gaussian)

    The following section is just a quick glance at the bivariate model. The purpose is to see the parallel between description and modeling when two variables are involved.

    BIV. MODEL: Bivariate pdf $f_{XY}(x, y)$

    Description: bivariate histogram:

    [Figure f.90: bivariate histogram]

    $P(4 < X \le 5\ \&\ 4 < Y \le 5) = 2/26 = 7.7\%$.

    Model: bivariate probability distribution function (pdf):

    [Figure f.97: bivariate pdf $f_{XY}(x, y)$ above the rectangle $a < x \le b$, $c < y \le d$]

    - $P(a < X \le b\ \&\ c < Y \le d) = \int_a^b \int_c^d f_{XY}(x, y)\, dy\, dx$ (double integral).

    BIV. MODEL: Marginal pdfs

    Description: marginal histograms:

    [Figure f.91: bivariate histogram with the marginal X and Y histograms]

    Model: marginal probability density functions (pdfs):

    [Figure f.98: bivariate pdf $f_{XY}(x, y)$ with the marginal pdfs $f_X(x)$ and $f_Y(y)$]

    BIV. MODEL: Joint Statistics

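    Description (recall)
    - Covariance:

    $\mathrm{Cov}(X, Y) = \dfrac{1}{N} \sum_{i=1}^{N} [x_i - m_X][y_i - m_Y] = \left(\dfrac{1}{N} \sum_{i=1}^{N} x_i y_i\right) - m_X m_Y$

    - Correlation coefficient:

    $-1 \le \mathrm{Cor}(X, Y) = \dfrac{\mathrm{Cov}(X, Y)}{s_X s_Y} \le 1$

    Model (new)
    - Covariance:

    $\sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$

    - Correlation coefficient:

    $-1 \le \rho_{XY} = \dfrac{\sigma_{XY}}{\sigma_X \sigma_Y} \le 1$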

  • BIV. MODEL: Parameter Properties

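    Properties of $\mu$ and $\sigma^2$ are similar to the properties of $m$ and $s^2$.

    We have:
    - Two RVs (e.g. Au & Cu): X, Y
    - Three constants: a, b, c

    Properties of 1 RV (X; recall):

    $\mu_{aX} = a\,\mu_X \qquad \mu_{X+b} = \mu_X + b$

    $\sigma^2_{aX} = a^2 \sigma^2_X \qquad \sigma^2_{X+b} = \sigma^2_X$

    Properties of 2 RVs (X & Y; new):

    $\mu_{X+Y} = \mu_X + \mu_Y$

    $\sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y + 2\,\sigma_{XY}$

    More generally:

    $\mu_{aX+bY+c} = a\,\mu_X + b\,\mu_Y + c$

    $\sigma^2_{aX+bY+c} = a^2 \sigma^2_X + b^2 \sigma^2_Y + 2ab\,\sigma_{XY}$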

    BIV. MODEL: Conditional pdfs

    Description: conditional histogram of X given $Y = y_0$:

    [Figure f.92a: bivariate histogram, conditioned on Y, with the conditional histogram of X given Y (x|y)]

    Model: conditional pdf of X given $Y = y_0$:

    [Figure f.100a: bivariate pdf $f_{XY}(x, y)$ and conditional pdf $f_{X|Y=y_0}(x)$]

    - Parameters: $\mu_{X|Y=y_0}$, $\sigma_{X|Y=y_0}$, ...

    Same for Y given x.

  • REGRESSION: Introduction

    Function summarizing some conditional statistics:
    - mean of a variable Y given another variable X, or vice versa;
    - median of Y given X;
    - etc.

    Used for prediction, or to study the trend between 2 variables.

    Obtained by minimizing deviations between the experimental values and the regression line.

    Example from Davis (1973):

    [Figure f.102: moisture versus depth with three fitted lines A, B and C (Davis, 1976)]

    - A: minimize deviations in Y (Y|X: Y given X);
    - C: minimize deviations in X (X|Y: X given Y);
    - B: minimize joint deviations;

    ⇒ 3 possible regressions (of the mean).

    LINEAR REGRESSION

    [Figure f.103a: binormal cloud of points with the two regression lines Y|X and X|Y]

    Two regression lines: Y|X and X|Y

    Optimal if samples are (bi)normally distributed and not clustered.

    Regression of Y|X = x (binormal case):

    $\mu_{Y|X=x} = \mu_Y + \dfrac{\sigma_Y}{\sigma_X}\, \rho_{XY}\, (x - \mu_X)$

    $\sigma^2_{Y|X=x} = \sigma^2_Y \left(1 - \rho^2_{XY}\right)$

    Main problem is lack of normality. Solution:
    - Log transform prior to linear regression;
    - Polynomial regression;
    - Smoothed regression.
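The regression of Y given X can be computed directly from the means, standard deviations and correlation coefficient. The sketch below estimates these from a small set of hypothetical paired values (1/N convention) and returns the slope, intercept and conditional variance; the function name `regression_y_given_x` is an assumption of this example.

```python
import math

def regression_y_given_x(x, y):
    """Return (slope, intercept, conditional variance) of the regression of Y on X."""
    n = len(x)
    m_x, m_y = sum(x) / n, sum(y) / n
    var_x = sum((xi - m_x) ** 2 for xi in x) / n
    var_y = sum((yi - m_y) ** 2 for yi in y) / n
    cov_xy = sum((xi - m_x) * (yi - m_y) for xi, yi in zip(x, y)) / n
    rho = cov_xy / math.sqrt(var_x * var_y)
    slope = rho * math.sqrt(var_y / var_x)      # (sigma_Y / sigma_X) * rho_XY
    intercept = m_y - slope * m_x
    cond_var = var_y * (1.0 - rho**2)           # sigma^2_Y * (1 - rho^2_XY)
    return slope, intercept, cond_var

# Hypothetical paired values, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, cond_var = regression_y_given_x(x, y)
print(f"Y|X=x = {intercept:.3f} + {slope:.3f} x, conditional variance = {cond_var:.4f}")
```

Swapping the roles of x and y gives the other regression line, X given Y, which in general is not the same line.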


    LINEAR REGRESSION ON LOGS

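    Recall on lognormal distribution.

    [Figure f 42a: pdf $f_X(x)$ of X (lognormal) and pdf $g_{\ln(X)}(\ln(x))$ of ln(X) (normal); the median $x_{50}$ of X maps to $\alpha = \ln(x_{50})$]

    $X \sim \mathrm{Lognormal} \Leftrightarrow \ln(X) \sim \mathrm{Normal}$

    $X \sim LN(\alpha, \beta^2) \Leftrightarrow \ln(X) \sim N(\alpha, \beta^2)$

    $x_{50} = e^{\alpha} \qquad \mu_X = e^{\alpha + \beta^2/2}$

    If (X, Y) are bi-lognormally distributed, then (ln(X), ln(Y)) are bi-normally distributed.

    - $(X, Y) \sim LN(\alpha_X, \alpha_Y, \beta^2_X, \beta^2_Y, \rho_{XY})$;
    - $(\ln X, \ln Y) \sim N(\alpha_X, \alpha_Y, \beta^2_X, \beta^2_Y, \rho_{XY})$;

    Linear regression on the logs is optimal:

    $\alpha_{Y|X=x} = \alpha_Y + \dfrac{\beta_Y}{\beta_X}\, \rho_{XY}\, (\ln(x) - \alpha_X)$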

    LINEAR REGRESSION ON LOGS

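    Linear regression on the logs is optimal but:

    - The regression is optimal for the logs.

    Taking the anti-log of $\alpha_{Y|X=x}$, the conditional mean of ln(Y), provides the conditional median of Y:

    $y_{50,Y|X} = \exp(\alpha_{Y|X})$

    The conditional mean of Y is obtained by:

    $\mu_{Y|X} = \exp\left(\alpha_{Y|X} + \beta^2_{Y|X}/2\right)$

    where $\beta^2_{Y|X} = \beta^2_Y (1 - \rho^2_{XY})$ is the conditional variance of ln(Y).

    [Figure f.104: bilognormal cloud of points (Y versus X) with the curves $\mu_{Y|X}$ and $y_{50,Y|X}$, and the corresponding binormal cloud of points (ln(Y) versus ln(X)) with the regression line Y|X]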
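The back-transformation from the regression on the logs can be sketched as follows. All parameter values are hypothetical; `alpha_*` and `beta_*` stand for the means and standard deviations of ln(X) and ln(Y), `rho` for the correlation between the logs, and the conditional mean uses the conditional variance of ln(Y), which is the assumption made in this example.

```python
import math

def lognormal_regression(x, alpha_x, alpha_y, beta_x, beta_y, rho):
    """Conditional median and mean of Y given X = x for a bi-lognormal pair.

    alpha_* and beta_* are the means and standard deviations of ln(X) and ln(Y);
    rho is the correlation coefficient between ln(X) and ln(Y).
    """
    # Linear regression on the logs: conditional mean of ln(Y) given ln(x).
    alpha_cond = alpha_y + (beta_y / beta_x) * rho * (math.log(x) - alpha_x)
    # Conditional variance of ln(Y) given ln(x).
    beta2_cond = beta_y**2 * (1.0 - rho**2)
    median = math.exp(alpha_cond)                    # anti-log gives the median
    mean = math.exp(alpha_cond + beta2_cond / 2.0)   # mean needs the variance term
    return median, mean

# Hypothetical parameters, for illustration only.
median, mean = lognormal_regression(x=2.0, alpha_x=0.0, alpha_y=0.5,
                                    beta_x=1.0, beta_y=0.8, rho=0.7)
print(f"conditional median = {median:.3f}, conditional mean = {mean:.3f}")
```

The conditional mean is always above the conditional median, by a factor that depends only on the conditional variance of the logs.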


  • POLYNOMIAL REGRESSION

    Polynomial of order 2 (Davis, 1973):

    [Figure f.105: moisture versus depth with a fitted polynomial of order 2 (Davis, 1976)]

    Don't overdo it!

    [Figure f.106: Y versus X]

    PIECEWISE LINEAR REGRESSION

    2 linear segments:

    [Figure f.105a: moisture versus depth fitted with 2 linear segments]

    SMOOTHED REGRESSION

    Average behaviour of Y is computed within a moving window along X.

    Does not make any assumption about the (X, Y) distribution.

    More difficult to use as a predictor, but good enough in most cases, when we want to look at the general trend between 2 variables.
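A moving-window average can be sketched in a few lines. The function name `smoothed_regression`, the window half-width and the data are all hypothetical; the window here is a simple interval centred on each requested X value, which is only one of several possible choices.

```python
def smoothed_regression(x, y, half_width, centers):
    """Average of y for the pairs whose x lies within half_width of each center."""
    curve = []
    for c in centers:
        window = [yi for xi, yi in zip(x, y) if abs(xi - c) <= half_width]
        # None marks a window without data.
        curve.append(sum(window) / len(window) if window else None)
    return curve

# Hypothetical paired values, for illustration only.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0.4, 1.2, 1.3, 2.5, 2.2, 3.4, 3.1, 4.6]
centers = [1.0, 2.0, 3.0, 4.0]
curve = smoothed_regression(x, y, half_width=0.5, centers=centers)
print(list(zip(centers, curve)))
```

A wider window gives a smoother curve at the cost of local detail, so the half-width is a judgment call for each data set.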

    Example:

    [Figure f107: Original Au versus Reject, LC - Trenches. Reject Au versus original Au, log scales from 1 to 100]

    Nb. of data: 477
    X var.: mean 5.814, std. dev. 6.110
    Y var.: mean 6.021, std. dev. 7.264
    Correlation: 0.935

  • EDA: Checking Twin Holes

    Comparison of Drill Holes D-1 and D-2

    [Figure: down-hole Au profiles (in g/t, log scale from 0.01 to 100) of D-2 and D-1 versus depth (0 to 100), and scattergram of RS2 Au versus DDH161 Au (in g/t, log scales from 0.01 to 100)]

    Number of pairs: 66
    D-1 mean: 1.779
    D-2 mean: 1.854
    D-1 std. dev.: 1.487
    D-2 std. dev.: 0.927
    Linear correlation: 0.553
    Rank correlation: 0.591

    HARD / SOFT GEOLOGICAL BOUNDARIES

    [Figure f.144: pairs of consecutive down-hole assays across a waste/ore contact: Waste / Waste, Waste / Ore, Ore / Ore]

    COMPARISON OF CONSECUTIVE DOWN HOLE AU ASSAYS

    [Figure f.110: Musselwhite, comparison of consecutive down-hole assays; three log-log scattergrams, axes from 0.01 to 100]

                               WASTE               WASTE / ORE     ORE
    X axis                     Au in Waste         Au in Waste     Au in Ore
    Y axis                     Au in Adj. Waste    Au in Ore       Au in Adj. Ore
    Number of pairs            89                  129             102
    X Mean                     1.517               1.852           8.882
    Y Mean                     1.558               7.363           7.142
    X Std.Dev.                 2.238               2.521           7.264
    Y Std.Dev.                 2.239               6.328           6.296
    Correlation (on logs)      0.214               0.102           0.573

    Note: sometimes, mineralization occurs at contact.

    HARD / SOFT GEOLOGICAL BOUNDARIES

    [Figure f144a: contact plot. Mean grade (0.0 to 5.0) versus distance from contact (-8 to 8 m). DOMAIN: In_Stope, N = 2146, Mean = 2.72. DOMAIN: Out_Stope, N = 16721, Mean = 0.79. Numbers of samples posted along the two profiles: 486, 351, 196, 164, 104, 90, 74, 52 and 640, 426, 219, 188, 125, 120, 94, 56]

    BIV. DESC.: QQ Plot

    Useful to compare 2 populations, say A and B.

    The quantiles of A and B, $(a_1, b_1), (a_2, b_2), \dots, (a_{100}, b_{100})$, are plotted on an X/Y graph.

    [Figure f.108: four Q-Q plots of B versus A:
    - Same shape, "A" more variable than "B"
    - Same shape, "A" less variable than "B"
    - Different shape
    - Same distribution]
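The pairs of quantiles used in a Q-Q plot can be computed as sketched below. The helper names `quantile` and `qq_pairs` are hypothetical, the quantile uses linear interpolation between order statistics (one common convention among several), and the two populations are made-up values in which B has the same shape as A but twice the spread.

```python
def quantile(sorted_values, p):
    """Quantile by linear interpolation between order statistics (one common convention)."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction

def qq_pairs(a, b, n_quantiles=100):
    """Pairs (a_k, b_k) of matching quantiles of A and B, ready to plot as a Q-Q plot."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    probabilities = [(k - 0.5) / n_quantiles for k in range(1, n_quantiles + 1)]
    return [(quantile(a_sorted, p), quantile(b_sorted, p)) for p in probabilities]

# Hypothetical populations: B has the same shape as A but twice the spread.
a = [0.2, 0.5, 0.9, 1.4, 2.0, 3.1, 4.5]
b = [2.0 * value for value in a]
pairs = qq_pairs(a, b, n_quantiles=10)
print(pairs)
```

In this made-up case the points fall on a straight line through the origin with slope 2, the signature of two distributions with the same shape and different variability. The two populations do not need to have the same number of data.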


  • BIV. DESC.: Relative Difference Plot

    Useful to investigate conditional bias between two populations, say A and B.

    - X axis: mean of the pair of values, $(A + B)/2$
    - Y axis: relative difference between the values, $\dfrac{A - B}{(A + B)/2} \times 100\%$

    [Figure f.109: relative difference plot, $(A - B)/((A + B)/2) \times 100\%$ (from -30 to +30) versus $(A + B)/2$ (from 0 to 9)]

    Notes on graph:
    - Low values: A < B
    - High values: A > B
    - A few outliers (analytical errors?).

    37
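    The two plotted quantities can be sketched in a few lines of Python. The function name `relative_difference` and the four example pairs are hypothetical; pairs whose mean is zero are returned as NaN, which is an assumed convention.

```python
import numpy as np

def relative_difference(a, b):
    """Return the pair mean (A+B)/2 and the relative difference (A-B)/((A+B)/2) in %."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    pair_mean = (a + b) / 2.0
    rel = np.full_like(pair_mean, np.nan)
    ok = pair_mean != 0          # avoid dividing by a zero mean
    rel[ok] = (a[ok] - b[ok]) / pair_mean[ok] * 100.0
    return pair_mean, rel

# Hypothetical paired assays.
a = np.array([1.0, 2.0, 4.0, 9.0])
b = np.array([1.2, 2.0, 3.6, 6.0])
pair_mean, rel_diff = relative_difference(a, b)
```

    Plotting `rel_diff` against `pair_mean` gives the relative difference plot; a trend away from 0% as the mean increases suggests conditional bias.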

    BIV. DESC.: Checking Pairs of Values

    ASSAY A: NORMAL ORIGINAL; ASSAY B: NORMAL REJECT

    LINEAR CORRELATION: 0.699; RANK CORRELATION: 0.799

    [Figure pdi_0014.eps: seven panels comparing Assay A and Assay B (oz/t): scatterplot, scatterplot (log scaling), Q-Q plot, Q-Q plot (log scaling), reldiff plot of [A-B]/AVG (%) against average [A+B]/2, reldiff plot (log scaling), and side-by-side boxplot of Au (oz/t).]

    Statistic       Assay A   Assay B
    Number              189       189
    Mean              0.143     0.146
    Std. dev.         0.136     0.162
    Maximum           1.207     1.285
    75th %-ile        0.177     0.175
    Median            0.094     0.091
    25th %-ile        0.064     0.055
    Minimum           0.001     0.001

  • OUTLIER TRIMMING / TOPCUT (1)

    When computing statistics within a geological domain, we make the assumption that there is only one population and that all samples belong to that population.

    Outliers or extreme values are often observed. Their impact can be a serious overestimation of the block model grade and variability.

    There is also considerable uncertainty as to the grade and tonnage represented by these very high values.

    Various solutions:
    - Outliers are erroneous: delete or correct them;
    - Outliers are from a different population:
        - define a new geology domain;
        - trim them down prior to computing statistics;
        - restrict their influence during estimation (1/d² or kriging);
        - use indicator kriging.
    - High values are from the same population:
        - trim them down to reduce the risk.

    In this section, trimming = topcut.

    OUTLIER TRIMMING / TOPCUT (2)

    The main questions are:
    - Is trimming/cutting warranted?
    - If yes, which value(s) to choose?

    The answers are subjective.

    Useful graphs to assess outliers:
    - Actual versus smoothed grade profiles
    - Histogram, cumulative probability plots
    - Decile analysis
    - Indicator correlation plot
    - Coefficient of variation plot
    - Quantity of metal plot

    Method to assess the risk:
    - Metal at risk

    Other useful information:
    - Number of trimmed/cut data
    - Quantity of metal reduction after trimming

    OUTLIER TRIMMING / TOPCUT (3)

    Grade profiles along holes

    [Figure f.171: grade profile along hole DHxxxx. Horizontal axis from 0 to 125; vertical axis on a log scale from 1 to 100. Shown: sample grades, smoothed sample grades, geological contacts and outliers.]

    - Outliers stand out with respect to the smoothed grade profile.

    Similar techniques can be applied in 2D and 3D:
    - 2D: sample values & contours; map of residuals.
    - 3D: sample values & 3D estimates; list of residuals.

    Advantage: detect local outliers.

    OUTLIER TRIMMING / TOPCUT (4)

    Histogram

    [Figure f114_a: histogram of Au. Horizontal axis on a log scale from 0.1 to 1000; vertical axis: frequency, from 0.000 to 0.100; 80 g/t marked.]

    Number of data       455
    Nb cut-out           93
    Cut value (min)      0.110
    Mean                 10.528
    Std. dev.            18.818
    Coef. of var.        1.787
    Maximum              201.000
    Minimum              0.110

    - A possible trimming/cutting value is where the histogram classes start to be isolated on the horizontal axis.
    - Possible trimming value from graph: 80 g/t.

  • OUTLIER TRIMMING / TOPCUT (5)

    Cumulative (log)probability plots.

    [Figure f114_b: cumulative distribution of the variable. Vertical axis: cumulative probability on a probability scale, from 0.01% to 99.99%; horizontal axis on a log scale from 0.1 to 1000; 70 g/t marked.]

    - A single population would be represented on the plot by a gradually increasing line.
    - A kink or a break in the curve might indicate two populations or the presence of outliers.
    - A possible trimming/cutting value is around the kink/break, where the second population (outliers) becomes predominant.
    - Possible trimming value from graph: 70 g/t.

    OUTLIER TRIMMING / TOPCUT (6)

    Decile Analysis

    Decile    # of Samples   Average (g/t)   Minimum (g/t)   Maximum (g/t)   Contained Metal (g)   % of Contained Metal
    0-10                27            1.35            0.8             1.9                    36                    1.0
    10-20               27            2.72            2.0             3.4                    73                    1.9
    20-30               27            4.22            3.4             5.25                  114                    3.0
    30-40               27            5.91            5.28            6.65                  159                    4.2
    40-50               27            7.44            6.69            8.4                   200                    5.3
    50-60               27            9.66            8.5            11.0                   260                    6.9
    60-70               27           13.04           11.1            15.0                   352                    9.3
    70-80               27           17.24           15.0            21.0                   465                   12.3
    80-90               27           25.26           21.0            31.5                   681                   18.0
    90-100              23           62.54           32.0           201.0                  1438                   38.0
    Total              266           14.22            0.8           201.0                  3783

    Percentile (of last decile)
    90-91                2           32.35           32.0            32.7                    64                    1.7
    91-92                3           36.67           35.0            39.0                   110                    2.9
    92-93                2           40.35           39.7            41.0                    80                    2.1
    93-94                2           41.5            41.0            42.0                    83                    2.2
    94-95                3           45.5            42.0            47.5                   136                    3.6
    95-96                2           49.25           49.0            49.5                    98                    2.6
    96-97                2           51.5            50.0            53.0                   103                    2.7
    97-98                2           55.5            54.0            57.0                   111                    2.9
    98-99                3           93.33           61.0           110.0                   280                    7.4
    99-100               2          185.5           170.0           201.0                   371                    9.8

    Suggestion: cutting may be warranted

    - Introduced by I.S. Parrish, Min. Eng., Apr. 97.
    - See next page for the 40/10 rule of thumb.
    - The 40/10 rule is to be reduced if the last decile / percentile does not contain a full complement of samples.

    OUTLIER TRIMMING / TOPCUT (7)

    Decile Analysis (Contd.)

    - If the top decile contains:
        - more than 40% of the metal, or
        - more than twice the metal of the previous decile,
      split it into 10 percentiles.
    - If the top percentile contains:
        - more than 10% of the metal,
      trimming is warranted.
    - The suggested trimming value is then:
        - the highest value of the previous percentile.

    Possible trimming value from graph:
    - Note that the last decile / percentile is not full.
    - Trimming may be warranted.
    - Previous percentile's 3 values: 61, 109, 110 g/t
    - Possible trimming value: 100 g/t
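    The 40/10 rule of thumb can be sketched in Python as below. The function names are hypothetical, equal sample weights are assumed when accumulating metal, and the percentages in the example are the ones read from the decile table of the previous slide.

```python
import numpy as np

def metal_share_by_bin(grades, n_bins=10):
    """Percent of contained metal in each of n_bins near-equal-count bins (low to high)."""
    g = np.sort(np.asarray(grades, dtype=float))
    bins = np.array_split(g, n_bins)            # assumed way of forming the deciles
    metal = np.array([b.sum() for b in bins])   # equal sample weights assumed
    return 100.0 * metal / metal.sum()

def split_top_decile(decile_pct):
    """First test: top decile > 40% of metal, or > twice the previous decile."""
    return decile_pct[-1] > 40.0 or decile_pct[-1] > 2.0 * decile_pct[-2]

def trimming_warranted(top_percentile_pct):
    """Second test: top percentile holds more than 10% of the metal."""
    return top_percentile_pct > 10.0

# Percentages from the decile table (0-10 up to 90-100) and its top percentile.
decile_pct = [1.0, 1.9, 3.0, 4.2, 5.3, 6.9, 9.3, 12.3, 18.0, 38.0]
needs_split = split_top_decile(decile_pct)   # 38.0 > 2 x 18.0
strict_trim = trimming_warranted(9.8)        # 9.8 is just under the 10% limit

# Synthetic check of the share computation on grades 1, 2, ..., 100.
shares = metal_share_by_bin(np.arange(1, 101))
```

    With the strict thresholds the top percentile (9.8%) falls just short of 10%, which is consistent with the slide's wording that trimming "may be" warranted once the rule is relaxed for an incomplete last decile.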

    OUTLIER TRIMMING / TOPCUT (8)

    Indicator correlation plot.

    [Figure f114_c: indicator correlation for lag 1 (0.0 to 1.0) against indicator threshold (g/t Au, log scale from 0.1 to 1000); 60 g/t marked.]

    - This plot shows the correlation coefficient of two adjacent down-hole sample indicators for increasing cut-offs.

    - Indicator: $i_c(x) = 1$ if the grade $z(x) \le z_c$, and $i_c(x) = 0$ otherwise, where $z_c$ is the cut-off (indicator threshold).

    - As the cut-off $z_c$ increases, the correlation decreases.

    - A possible trimming/cutting value is where the correlation is, or is getting close to, 0.

    - Possible trimming value from graph: 60 g/t.
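    A minimal Python sketch of one point of this plot follows. It assumes a single drill hole whose samples are listed in down-hole order, and the function name is hypothetical. The indicator is coded as 1 when the grade is at or below the cut-off; coding it the other way round gives the same correlation coefficient.

```python
import numpy as np

def indicator_lag1_correlation(grades, cutoff):
    """Correlation between indicators of adjacent down-hole samples (lag 1)."""
    g = np.asarray(grades, dtype=float)
    ind = (g <= cutoff).astype(float)   # indicator at the chosen cut-off
    head, tail = ind[:-1], ind[1:]      # each sample paired with the next one
    if head.std() == 0 or tail.std() == 0:
        return float("nan")             # correlation undefined for a constant indicator
    return float(np.corrcoef(head, tail)[0, 1])

# Hypothetical profiles: a continuous one and a perfectly erratic one.
continuous = [1, 1, 1, 1, 5, 5, 5, 5]
erratic = [1, 5, 1, 5, 1, 5]
rho_continuous = indicator_lag1_correlation(continuous, cutoff=3.0)
rho_erratic = indicator_lag1_correlation(erratic, cutoff=3.0)
```

    Repeating the calculation over a range of cut-offs and plotting the result against the cut-off reproduces the kind of curve described above.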

  • OUTLIER TRIMMING / TOPCUT (9)

    Coefficient of variation plot.

    [Figure f114_d: coefficient of variation (0.0 to 3.0) against cutting limit (g/t Au, log scale from 0.1 to 1000).]

    - This plot shows the coefficient of variation of the cut grades for increasing cutting limits.
    - As the cutting limit increases, the coefficient of variation increases.
    - Indicates the impact of cutting on the CV.
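    One point of the coefficient of variation plot could be computed as sketched below. The function name, the five grades and the three cutting limits are made up for illustration, and equal sample weights are assumed.

```python
import numpy as np

def cv_after_cut(grades, limit):
    """Coefficient of variation (std / mean) after cutting grades to the limit."""
    cut = np.minimum(np.asarray(grades, dtype=float), limit)
    return float(cut.std() / cut.mean())   # population std, equal weights assumed

# Hypothetical grades with one extreme value.
grades = np.array([1.0, 2.0, 3.0, 4.0, 90.0])
limits = [5.0, 20.0, 100.0]
cvs = [cv_after_cut(grades, limit) for limit in limits]
```

    Here the CV rises from about 0.47 at a 5 g/t limit to about 1.75 when the 90 g/t value is left uncut, which is the behaviour the plot is meant to show.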

    OUTLIER TRIMMING / TOPCUT (10)

    Quantity of metal plot.

    [Figure f114_e: % of contained metal in samples (0 to 100) against trimming value (g/t Au, log scale from 0.1 to 1000).]

    - This plot shows the relative quantity of metal contained within the trimmed-down samples for various trimming values.
    - Useful to know the quantity of metal discarded by trimming.
    - 93% of the metal corresponds to an 80 g/t trimming value, i.e. a 7% metal loss if samples larger than 80 g/t Au are trimmed down to 80 g/t.
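    The quantity plotted on the vertical axis can be sketched as follows, assuming equal sample weights so that metal is proportional to the sum of grades. The function name and the five grades are hypothetical.

```python
import numpy as np

def metal_retained_pct(grades, trim_value):
    """Percent of the samples' metal that remains after trimming down to trim_value."""
    g = np.asarray(grades, dtype=float)
    return float(100.0 * np.minimum(g, trim_value).sum() / g.sum())

# Hypothetical grades totalling 100 units of metal.
grades = [5.0, 10.0, 15.0, 20.0, 50.0]
retained = metal_retained_pct(grades, trim_value=30.0)   # the 50 is trimmed to 30
metal_loss = 100.0 - retained
```

    In this made-up case 80% of the metal is retained and 20% is lost; the slide's example reads 93% retained and 7% lost from its graph.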

    OUTLIER TRIMMING / TOPCUT (11)

    [Figure fig114_f: cutting statistics for "D6: HW Shear -- AU Decl, Trim 1500 g/t, Env=2, Inside Trust." Variable: au (weighted by wtpoly) from 0.1 to 9999.0. Three panels, each on a log scale from 0.1 to 1000 g/t Au:]
    - Indicator Correlation for Lag 1 (0.0 to 1.0) against Indicator Threshold (g/t Au)
    - Coefficient of Variation (0.0 to 3.0) against Trimming Limit (g/t Au)
    - % of Contained Metal in Samples (0 to 100) against Cutting Value (g/t Au)

    OUTLIER TRIMMING / TOPCUT (12)

    Metal at Risk

    Objective:
    - Assess the risk associated with high grade material
    - Trim down high grade values to reduce the risk

    What is the metal at risk?
    - Consider:
        - Ore only, not waste
        - Yearly production increments
    - Distribution of the yearly metal production:
        - Average = estimated yearly production
        - 20th percentile is such that:
            - 4 times out of 5, production will be higher
            - 1 time out of 5, production will be lower

    [Figure f.200: distribution of yearly metal production, with the average, the 20th percentile and the metal at risk marked.]

  • OUTLIER TRIMMING / TOPCUT (13)

    What is the metal at risk? (Contd)

    - Metal at risk is:
        - Average - 20th percentile

    What is the trimming value?

    - Such that the average of the trimmed values is close to the 20th percentile.

    Procedure: Monte Carlo simulation

    1. Get the distribution of ore samples
    2. Assess the number N of ore samples mined out per year
    3. Draw N samples out of the distribution
        - Calculate metal content
    4. Repeat step 3 many times
        - Get the distribution of metal production
    5. Pick the 20th percentile from the distribution. Compute the corresponding topcut
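    The five steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: samples are drawn with replacement from the observed ore grades, every sample represents the same tonnage (so yearly metal is proportional to the mean grade of a draw), and the topcut is found by bisection. The function names, the synthetic lognormal grades and the value N = 200 are all hypothetical.

```python
import numpy as np

def yearly_grade_distribution(ore_grades, n_per_year, n_sims=2000, seed=0):
    """Steps 3 and 4: draw N samples many times, keep the mean grade of each draw."""
    rng = np.random.default_rng(seed)
    g = np.asarray(ore_grades, dtype=float)
    draws = rng.choice(g, size=(n_sims, n_per_year), replace=True)
    return draws.mean(axis=1)

def topcut_for_target_mean(ore_grades, target_mean, n_iter=60):
    """Step 5: trimming value whose trimmed average matches target_mean (bisection)."""
    g = np.asarray(ore_grades, dtype=float)
    lo, hi = g.min(), g.max()
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        # The trimmed mean never decreases as the trimming value is raised.
        if np.minimum(g, mid).mean() < target_mean:
            lo = mid
        else:
            hi = mid
    return float(hi)

# Steps 1 and 2: synthetic ore grades and an assumed N of 200 samples per year.
ore = np.random.default_rng(42).lognormal(mean=1.0, sigma=1.0, size=500)
yearly = yearly_grade_distribution(ore, n_per_year=200)

average = float(ore.mean())
p20 = float(np.percentile(yearly, 20))
metal_at_risk = average - p20             # expressed here as a grade difference
topcut = topcut_for_target_mean(ore, p20)
```

    Raising `n_per_year` narrows the simulated distribution, so the 20th percentile moves towards the average and the resulting topcut rises.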

    OUTLIER TRIMMING / TOPCUT (14)

    Metal at Risk vs. Other Methods

    Metal at risk:
    - Objective
    - Trimming value depends on production
        - The higher the production, the lower the risk, the higher the trimming value
    - Topcut model on the conservative side
        - Actual production assumed to exceed prediction 4 times out of 5.
        - Metal loss exists but we do not know where it is.

    Other methods:
    - Sometimes subjective
    - Trimming does not depend on production
    - Topcut model assumed to be a middle-of-the-road model
        - i.e. metal loss does not exist

    OUTLIER TRIMMING / TOPCUT (15)

    Trimming/Cutting Summary Table

    Histogram                   80 g/t
    Probability Plot            70 g/t
    Decile Analysis             100 g/t
    Indicator Correlation       60 g/t
    Metal at risk               (1)
    Final Choice                80 g/t

    Coefficient of Variation    1.4
    Number of Data Trimmed      4 of 455
    Metal Loss                  7%

    (1) Notes:
    - Metal at risk topcut yet to be included in this example.
    - Amec policy yet to be formalized.

  • BIVARIATE STATISTICS: Summary

    Description
    - Scattergram, bivariate histogram
    - Marginal distributions and statistics
    - Conditional distributions and statistics
    - Covariance, correlation coefficient, rank correlation coefficient

    Model
    - Bivariate PDF of (X, Y)
    - Marginal and conditional PDFs
    - Covariance, correlation coefficient

    Regression
    - Linear, non linear, smoothed

    Special Problems
    - Checking twinned holes
    - Checking hard/soft geological boundaries
    - Checking pairs of values
    - Choosing cutting/trimming values
        - Several graphs
        - Metal at risk

    FACT SHEET

    Bivariate Statistics

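    $s_{X,Y} = \frac{1}{N} \sum_{i=1}^{N} [x_i - m_X][y_i - m_Y] = \frac{1}{N} \sum_{i=1}^{N} [x_i y_i] - m_X m_Y$

    $\mathrm{Cov}[X, Y] = \sigma_{X,Y} = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$

    $\mathrm{Var}(aX + bY + c) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$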
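    As a quick numerical check of the fact sheet, the short Python sketch below evaluates both forms of the sample covariance and both sides of the variance formula on synthetic data. All statistics use the 1/N convention, and the data and coefficients are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)   # y is built to be correlated with x

# Two equivalent forms of the sample covariance (1/N convention).
s_xy = np.mean((x - x.mean()) * (y - y.mean()))
s_xy_alt = np.mean(x * y) - x.mean() * y.mean()

# Variance of a linear combination; the constant c has no effect.
a, b, c = 2.0, -3.0, 5.0
lhs = np.var(a * x + b * y + c)
rhs = a**2 * np.var(x) + b**2 * np.var(y) + 2 * a * b * s_xy
```

    Both pairs agree to rounding error, since the identities hold exactly for sample statistics computed with 1/N.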

  • VARIOGRAM

    Content

    Theory
    - Definition
    - Nugget, sill, range
    - Anisotropy
    - Models: nugget, spherical, exponential
    - Variogram versus covariance

    Practice
    - Proportional effect
    - Alternative variograms:
        - Relative pairwise variogram
        - Correlogram
    - Computing variograms
        - Variogram cross
        - Variogram maps
    - Modeling variograms

    VARIOGRAM: Exercise 10

    Consider the following mineralized zone:

    [Figure f. 71a: Ni saprolite zone with samples A, B, C and D; spacings of 6 m, 3 m and 3 m are marked.]

    How would you compare the similarity of the B, C, and D sample grades with respect to the grade of sample A?

  • VARIOGRAM Introduction

    [Figure f. 71a: the Ni saprolite zone with samples A, B, C and D, as in Exercise 10.]

    The correlation between sample grades:
    - increases with decreasing distance between samples;
    - can vary with direction;
    - varies with sample size;
    - depends on the continuity.

    The variogram is a function that quantifies the notion of geological continuity or spatial correlation.

    VARIOGRAM

    Example with 3 samples: 1 % Cu, 2 % Cu and 3 % Cu, 10 m apart along azimuth 300°.

    Direction: 300° Azim
    Distance:        10 m         20 m
    Difference 1:    2 - 1 = 1    3 - 1 = 2
    Difference 2:    3 - 2 = 1
    Avg Diff:        1            2

    [Figure f. 12b: variogram. Horizontal axis: distance between sample locations along one direction; vertical axis: average difference between sample values. The experimental points and the model are shown, with nugget, sill and range marked.]

    In fact, the variogram is 1/2 the average of the squared differences.

  • VARIOGRAM: Definition

    The variogram quantifies spatial correlation by looking at the average squared difference between two values a distance (and angle) h apart.

    Experimental variogram:

    $\gamma(h) = \frac{1}{2N_h} \sum_{i=1}^{N_h} [z(x_i) - z(x_i + h)]^2$

    where $N_h$ is the number of couples $(z(x_i), z(x_i + h))$ separated by h (distance + direction).

    [Figure f. 12: experimental variogram and model, γ(h) against distance h, with nugget, sill, total sill and range marked.]

    Theoretical expression:

    $\gamma(h) = \frac{1}{2} E[(Z(x) - Z(x + h))^2]$
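    A minimal Python sketch of the experimental variogram is given below for the simplest case: samples regularly spaced along one line, so that lag k pairs each sample with the one k positions further on. The function name is hypothetical, and irregular data would additionally need distance and angle tolerances, which are not covered here. The test data are the three Cu samples (1, 2 and 3 % Cu, 10 m apart) from the introductory example.

```python
import numpy as np

def experimental_variogram_1d(values, spacing, max_lag_steps):
    """Return lags, gamma(h) and pair counts N_h for regularly spaced samples."""
    z = np.asarray(values, dtype=float)
    lags, gammas, npairs = [], [], []
    for k in range(1, max_lag_steps + 1):
        diffs = z[k:] - z[:-k]             # all pairs separated by k spacings
        if diffs.size == 0:
            break
        lags.append(k * spacing)
        gammas.append(0.5 * np.mean(diffs ** 2))   # half the mean squared difference
        npairs.append(diffs.size)
    return np.array(lags), np.array(gammas), np.array(npairs)

# Three samples of 1, 2 and 3 % Cu, 10 m apart.
lags, gammas, npairs = experimental_variogram_1d([1.0, 2.0, 3.0], spacing=10.0, max_lag_steps=2)
```

    For these samples the sketch gives γ(10 m) = 0.5 from 2 pairs and γ(20 m) = 2.0 from 1 pair, i.e. half the average of the squared differences rather than the plain average differences of 1 and 2.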

    VARIOGRAM: Zone of influence

    [Figure f. 72: zone of influence, showing the horizontal range and the vertical range.]

    - Up to 65 m vertically in an iron ore formation
    - No more than 2 m in a bauxite deposit
    - Can be different in different directions

    [Three variogram-against-distance panels: iron ore formation (David, 1977); bauxite (David, 1977); anisotropy (David, 1977).]

    VARIOGRAM: Examples

    There is one variogram for every deposit and for every spatially correlated variable.

    [Figure f. 73: nine variogram-against-distance panels:]
    - U3O8 - Wyoming roll front (David, 1979)
    - U3O8 - New Mexico (David, 1979)
    - Oil in tar sand (David, 1977)
    - Western US coal (SO2/BTU) (David, 1977)
    - Ag - Mexico (David, 1977)
    - Cu - Exotica (David, 1977)
    - Oil grades (J & H, 1978)
    - Topographic heights (J & H, 1978)
    - Cu - Los Bronces (J & H, 1978)

  • VARIOGRAM: Models

    Variances are computed using the variogram model. The model therefore must ensure that all computed variances are positive.

    Examples of models are:

    [Figure f. 16: γ(h) against h for the Gaussian, spherical, exponential, linear and power models, with nugget, sill and range marked.]

    Models with a sill
    - Nugget: zero range
    - Spherical
    - Exponential: practical range
    - Gaussian: practical range

    Models with no sill
    - Linear
    - Power

    Any linear combination of the above (with positive coefficients): models with several structures.
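    The three models with a sill can be sketched in Python as below. The parameterisation is an assumption: the exponential and Gaussian models are written so that `a` is the practical range, i.e. the distance at which about 95% of the sill is reached, which is one common convention. The function names and the parameters of the nested example are illustrative.

```python
import numpy as np

def spherical(h, sill, a, nugget=0.0):
    """Spherical model: reaches nugget + sill exactly at the range a."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    g = nugget + sill * (1.5 * r - 0.5 * r ** 3)
    return np.where(h == 0, 0.0, g)   # gamma(0) = 0 by definition

def exponential(h, sill, a, nugget=0.0):
    """Exponential model, with a taken as the practical range."""
    h = np.asarray(h, dtype=float)
    g = nugget + sill * (1.0 - np.exp(-3.0 * h / a))
    return np.where(h == 0, 0.0, g)

def gaussian(h, sill, a, nugget=0.0):
    """Gaussian model, with a taken as the practical range."""
    h = np.asarray(h, dtype=float)
    g = nugget + sill * (1.0 - np.exp(-3.0 * (h / a) ** 2))
    return np.where(h == 0, 0.0, g)

# A model with several structures: nugget + short spherical + long exponential.
h = np.linspace(0.0, 300.0, 31)
nested = spherical(h, sill=0.6, a=50.0, nugget=0.2) + exponential(h, sill=0.2, a=200.0)
```

    Adding structures with positive sills, as in `nested`, is how a model with several structures is usually assembled.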

    VARIOGRAM: Examples

    [Figure f. 34: five grade transects (grade against distance), each shown next to its likely variogram γ(h) against h: nugget, spherical, power, Gaussian and hole effect. Three of the variograms are annotated "(+ small nugget)".]

  • VARIOGRAM: Geometrical Anisotropy

    Occurs when isotropy can be obtained by stretching/squashing the deposit along 1 or 2 main directions.

    [Figure f.17: directional variograms γ(h) for azimuths 0°, 45° and 90°, with ranges R0, R45 and R90; ellipse of ranges drawn from R0, R45 and R90, with north (N) and the long range azimuth marked.]

    Notes:
    - Variogram model covers all directions.
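    In 2D, the ellipse of ranges can be sketched in Python as below. The function name and parameters are hypothetical. Given the long range, its azimuth and the short range, the function returns the range in any direction; dividing a separation distance by that range then lets a single isotropic model cover all directions.

```python
import numpy as np

def directional_range(azimuth_deg, long_azimuth_deg, long_range, short_range):
    """Range read on the ellipse of ranges in the given direction (2D case)."""
    d = np.radians(azimuth_deg - long_azimuth_deg)   # angle from the long axis
    return 1.0 / np.sqrt((np.cos(d) / long_range) ** 2 + (np.sin(d) / short_range) ** 2)

# Hypothetical ellipse: 100 m long range along azimuth 90, 50 m short range.
r90 = directional_range(90.0, 90.0, 100.0, 50.0)
r45 = directional_range(45.0, 90.0, 100.0, 50.0)
r0 = directional_range(0.0, 90.0, 100.0, 50.0)

# Reduced distance: a 30 m separation along azimuth 45 in units of the local range.
reduced = 30.0 / r45
```

    The range along azimuth 45 falls between the short and long ranges, as expected for a point on the ellipse between its two axes.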

    VARIOGRAM: Anisotropy

    [Figure f.17a: ellipse of ranges, with directional variograms against distance for Azim = 0°, Azim = 45° and Azim = 90°.]

    [Figure f.17b: variogram map for the same three azimuths, drawn in three classes: 0 to 1/3 × sill, 1/3 to 2/3 × sill, and 2/3 to 1 × sill.]

    VARIOGRAM: Variogram Map

    [Figure: "FA1 Traditional Variogram Map, Geology 1: 1-NE" and "FA1 XY map". Axes: West -> East and South -> North, from -218 to 183; legend values from 0.35 to 1.2.]