A copula model to analyze minimum admission scores

Mariela Fernandez (1) and Veronica A. Gonzalez-Lopez (2)

Institute of Mathematics, Statistics and Computing Science, University of Campinas

11th ICNAAM, 21-27 September 2013, Rhodes, Greece

(1) FAPESP Post-doctoral Grant 2011/18285-6.

(2) (a) USP project “Mathematics, computation, language and the brain”; (b) FAPESP's project “Portuguese in time and space: linguistic contact, grammars in competition and parametric change”, 2012/06078-9; (c) FAPESP's project “Research, Innovation and Dissemination Center for Neuromathematics - NeuroMat”, 2013/07699-0.

Motivation Copula theory Application Conclusions References

Topics:

1 Why: Motivation.

2 How: Cumulative conditional expectation in a copula framework.

3 Results: Admission score decisions.

4 Next: Final remarks.

mariela, veronica@ime.unicamp.br Copula model; Minimum admission scores 11th ICNAAM 1 / 15

Motivation

Problem

How to set minimum admission scores in an efficient way.

Solution

Use the statistical measures

E[Language | Mathematics ≥ m0]

and

E[Mathematics | Language ≥ l0],

where m0 is a minimum Mathematics score and l0 is a minimum Language score.

E[Mathematics|Language ≥ l0]

We need to know:

The marginal distributions F(x) and G(y) and the joint distribution H(x, y), where X := Language score and Y := Mathematics score. Recall that

\[
Y \mid X \ge x_0 \sim G_{X \ge x_0}(y) = P(Y \le y \mid X \ge x_0) = \frac{G(y) - H(x_0, y)}{1 - F(x_0)}.
\]

It is more useful to work with the marginal quantiles than with the raw scores, because a quantile acts as the “manager's control variable”: e.g. F(x0) = 0.25 means that we will admit 75% of the candidates.

By taking U = scaled ranks of X and V = scaled ranks of Y, we search for the joint density of (U, V).
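As a rough illustration of the quantity on this slide, the sketch below picks a cutoff x0 from a quantile level and estimates E[Y | X ≥ x0] directly from a sample. The function names, the order-statistic rule for the cutoff and the toy scores are assumptions made for this example; they are not taken from the talk.

```python
def cutoff_from_quantile(xs, q):
    """Cutoff x0 such that roughly a fraction 1 - q of the sample has X >= x0.

    Uses a plain order-statistic rule (an assumption for this sketch).
    """
    ordered = sorted(xs)
    k = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[k]


def cumulative_conditional_mean(xs, ys, x0):
    """Sample estimate of E[Y | X >= x0]."""
    selected = [y for x, y in zip(xs, ys) if x >= x0]
    if not selected:
        raise ValueError("no observation satisfies X >= x0")
    return sum(selected) / len(selected)


# Hypothetical toy scores, not the data used in the talk.
language = [4.0, 5.5, 6.0, 7.0, 7.5, 8.0, 9.0, 9.5]
maths = [6.0, 5.0, 7.5, 6.5, 8.0, 7.0, 9.0, 8.5]

x0 = cutoff_from_quantile(language, 0.25)  # admit about 75% of the candidates
print(x0, cumulative_conditional_mean(language, maths, x0))
```

Working with q rather than x0 mirrors the "control variable" idea: the decision maker chooses the share of candidates to admit, and the cutoff follows from it.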

Cumulative conditional expectation in a copula framework

Definition

A bivariate copula is a bivariate joint distribution with uniform marginals, denoted by C(u, v) for (u, v) ∈ [0, 1] × [0, 1].

Sklar's Theorem

Let H be a joint distribution function with margins F and G. Then there exists a copula C such that

H(x, y) = C(F(x), G(y)). (1)

If F and G are continuous, then C is unique; otherwise, C is uniquely determined on Ran F × Ran G. Conversely, if C is a copula and F and G are distribution functions, then the function H defined by (1) is a joint distribution with margins F and G.
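The converse part of Sklar's theorem can be sketched numerically: plug two margins into a copula and check that the resulting H has those margins. The Farlie-Gumbel-Morgenstern copula, the exponential margins, the parameter values and all function names below are choices made for this illustration only.

```python
import math


def fgm_copula(u, v, alpha):
    """Farlie-Gumbel-Morgenstern copula, alpha in [-1, 1]."""
    return u * v + alpha * u * v * (1.0 - u) * (1.0 - v)


def exp_cdf(x, rate):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def joint_cdf(x, y, alpha=0.5, rate_x=1.0, rate_y=2.0):
    """H(x, y) = C(F(x), G(y)), as in equation (1)."""
    return fgm_copula(exp_cdf(x, rate_x), exp_cdf(y, rate_y), alpha)


# For large y, G(y) is numerically 1 and C(u, 1) = u, so H(x, y) recovers F(x).
print(joint_cdf(0.7, 50.0), exp_cdf(0.7, 1.0))
```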

Some common bivariate Copulas

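Product: $C(u, v) = uv$.

Farlie-Gumbel-Morgenstern: $C(u, v) = uv + \alpha uv(1 - u)(1 - v)$, for $\alpha \in [-1, 1]$.

Clayton: $C(u, v) = \max\{0, (u^{-\alpha} + v^{-\alpha} - 1)^{-1/\alpha}\}$, for $\alpha \in (0, \infty)$.

Gumbel: $C(u, v) = \exp\left(-\left((-\ln u)^{\alpha} + (-\ln v)^{\alpha}\right)^{1/\alpha}\right)$, for $\alpha \in [1, \infty)$.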
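A minimal sketch of these four families as plain Python functions, with a check of the boundary conditions every copula must satisfy (C(u, 1) = u, C(1, v) = v, C(u, 0) = 0). Function names and the parameter values used in the check are assumptions for illustration.

```python
import math


def product(u, v):
    return u * v


def fgm(u, v, alpha):
    return u * v + alpha * u * v * (1 - u) * (1 - v)


def clayton(u, v, alpha):
    if u == 0 or v == 0:
        return 0.0  # limit value on the lower boundary
    return max(0.0, (u ** (-alpha) + v ** (-alpha) - 1.0) ** (-1.0 / alpha))


def gumbel(u, v, alpha):
    if u == 0 or v == 0:
        return 0.0  # limit value on the lower boundary
    s = (-math.log(u)) ** alpha + (-math.log(v)) ** alpha
    return math.exp(-(s ** (1.0 / alpha)))


# Hypothetical parameter values, chosen only to exercise the formulas.
families = {
    "product": product,
    "fgm": lambda u, v: fgm(u, v, 0.5),
    "clayton": lambda u, v: clayton(u, v, 2.0),
    "gumbel": lambda u, v: gumbel(u, v, 2.0),
}

for name, c in families.items():
    ok = all(
        abs(c(t, 1.0) - t) < 1e-12 and abs(c(1.0, t) - t) < 1e-12 and c(t, 0.0) == 0.0
        for t in (0.1, 0.5, 0.9)
    )
    print(name, ok)
```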

Some applications

Actuarial science, e.g. Frees et al. (1996) and Frees and Wang (2005).

Finance and risk management, e.g. Cherubini et al. (2004) and Embrechts et al. (2003).

Hydrology, e.g. Genest and Favre (2007).

Deforestation (spatio-temporal dependence), e.g. Gräler et al. (2010).

Linguistics, e.g. García et al. (2012).

Copula model selection according to the problem's characteristics: we look for simple cross sections (i.e. a simple expression for the intersection of the copula with the plane u = u0, C(u0, v)), since we need

\[
P[V \le v \mid U \ge u_0] = \frac{v - C(u_0, v)}{1 - u_0}
\]

to compute E[V | U ≥ u0].
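For reference, the conditional distribution used here follows in one step from the definition of a copula, and the expectation then follows from the tail formula for a variable supported on [0, 1]. This is a short worked derivation under the assumption that U and V are continuous uniform variables with joint distribution C:

```latex
\[
P[V \le v \mid U \ge u_0]
  = \frac{P[V \le v] - P[U \le u_0,\, V \le v]}{P[U \ge u_0]}
  = \frac{v - C(u_0, v)}{1 - u_0},
\]
\[
E[V \mid U \ge u_0]
  = \int_0^1 \Bigl(1 - P[V \le v \mid U \ge u_0]\Bigr)\, dv
  = 1 - \frac{\tfrac{1}{2} - \int_0^1 C(u_0, v)\, dv}{1 - u_0}.
\]
```

The second line shows why simple cross sections matter: the only model-dependent ingredient is the integral of C(u0, v) over v.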

Farlie-Gumbel-Morgenstern: C(u, v) = uv + αuv(1 − u)(1 − v), for α ∈ [−1, 1]. Quadratic cross sections in both variables, weak dependence and exchangeable copula, i.e. C(u, v) = C(v, u).

Asymmetric Cubic Section (ACS) copula, introduced by Nelsen et al. (1997):

\[
C(u, v) = uv + uv(1 - u)(1 - v)\,[(a - b)\,v(1 - u) + b],
\]

where $|b| \le 1$, $\frac{b - 3 - \sqrt{9 + 6b - 3b^2}}{2} \le a \le 1$ and $a \ne b$. Cubic cross sections in both variables, weak dependence and non-exchangeable copula.
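The ACS family and its parameter constraints can be sketched in a few lines. The function names and the example parameter values below are assumptions for illustration; only the formula and the constraints come from the slide.

```python
import math


def acs_lower_bound(b):
    """Lower limit (b - 3 - sqrt(9 + 6b - 3b^2)) / 2 for the parameter a."""
    return (b - 3.0 - math.sqrt(9.0 + 6.0 * b - 3.0 * b * b)) / 2.0


def acs_params_valid(a, b):
    """True if (a, b) satisfies |b| <= 1, lower bound <= a <= 1 and a != b."""
    return abs(b) <= 1.0 and acs_lower_bound(b) <= a <= 1.0 and a != b


def acs_copula(u, v, a, b):
    """C(u, v) = uv + uv(1 - u)(1 - v)[(a - b)v(1 - u) + b]."""
    return u * v + u * v * (1 - u) * (1 - v) * ((a - b) * v * (1 - u) + b)


a, b = 0.5, -0.3  # hypothetical parameter values
print(acs_params_valid(a, b))
# Non-exchangeability: swapping the arguments changes the value.
print(acs_copula(0.2, 0.7, a, b), acs_copula(0.7, 0.2, a, b))
```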

Copula parameters estimation

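Bayesian approach through a uniform conjugate prior:

\[
\hat{a} = K^{-1} \int_{-1}^{1} \int_{R(b)}^{1} a\, \pi(a, b \mid u)\, da\, db,
\qquad
\hat{b} = K^{-1} \int_{-1}^{1} \int_{R(b)}^{1} b\, \pi(a, b \mid u)\, da\, db,
\]

where π(a, b | u) is the posterior distribution of (a, b), u is the sample data,

\[
K = \int_{-1}^{1} \int_{R(b)}^{1} \pi(a, b \mid u)\, da\, db
\quad \text{and} \quad
R(b) = \frac{b - 3 - \sqrt{9 + 6b - 3b^2}}{2}.
\]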
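One way such posterior means could be approximated is a midpoint rule over the admissible parameter region. The sketch below is an illustration under stated assumptions, not the authors' implementation: it takes a flat prior on the region, uses as likelihood the product of ACS copula densities at the pseudo-observations, and obtains that density by differentiating C(u, v) with respect to u and v. The grid sizes, function names and toy data are hypothetical.

```python
import math


def lower_bound(b):
    """R(b), the lower integration limit for a."""
    return (b - 3.0 - math.sqrt(max(9.0 + 6.0 * b - 3.0 * b * b, 0.0))) / 2.0


def acs_density(u, v, a, b):
    """Mixed partial derivative of the ACS copula (derived for this sketch)."""
    return (1.0 + b * (1 - 2 * u) * (1 - 2 * v)
            + (a - b) * (1 - u) * (1 - 3 * u) * v * (2 - 3 * v))


def posterior_means(data, nb=60, na=60):
    """Midpoint-rule approximation of (a_hat, b_hat) under a flat prior."""
    k_sum = a_sum = b_sum = 0.0
    db = 2.0 / nb
    for j in range(nb):
        b = -1.0 + (j + 0.5) * db
        low = lower_bound(b)
        da = (1.0 - low) / na
        for i in range(na):
            a = low + (i + 0.5) * da
            like = 1.0
            for u, v in data:
                # Guard against tiny negative values from rounding on the boundary.
                like *= max(acs_density(u, v, a, b), 0.0)
            weight = like * da * db
            k_sum += weight
            a_sum += a * weight
            b_sum += b * weight
    return a_sum / k_sum, b_sum / k_sum


# Hypothetical pseudo-observations (u_i, v_i), not the data from the talk.
sample = [(0.1, 0.8), (0.2, 0.6), (0.3, 0.9), (0.4, 0.3), (0.5, 0.5),
          (0.6, 0.7), (0.7, 0.2), (0.8, 0.4), (0.9, 0.1)]
a_hat, b_hat = posterior_means(sample)
print(a_hat, b_hat)
```

A plain grid is enough here because the parameter space is two-dimensional and bounded; for larger samples, working with log-likelihoods would be more robust numerically.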

Cumulative conditional expectation for the ACS copula family

\[
E[V \mid U \ge u] = \int_0^1 v\, dP(v \mid U \ge u) = \frac{1}{12}\left(6 + (a + b)u + (b - a)u^2\right),
\]

\[
E[U \mid V \ge v] = \int_0^1 u\, dP(u \mid V \ge v) = \frac{1}{2} + \frac{b}{6}\, v + \frac{a - b}{12}\, v^2.
\]

Property

i) The vertex of the function E[V | U ≥ u] is $u_0 = \frac{-a - b}{2(b - a)}$. It is a minimum if b > a and a maximum if b < a.

ii) The vertex of the function E[U | V ≥ v] is $v_0 = \frac{-b}{a - b}$. It is a minimum if a > b and a maximum if a < b.
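The two closed forms can be checked numerically by integrating 1 − P[V ≤ v | U ≥ u] (respectively 1 − P[U ≤ u | V ≥ v]) over [0, 1], which equals the conditional expectation for a variable supported on [0, 1]. This is only a sanity-check sketch: the function names, the midpoint rule and the evaluation points are assumptions, and the parameter values are the 2010 estimates reported in the results of this talk.

```python
def acs_copula(u, v, a, b):
    return u * v + u * v * (1 - u) * (1 - v) * ((a - b) * v * (1 - u) + b)


def cce_v_given_u(u, a, b):
    """Closed form of E[V | U >= u] for the ACS family."""
    return (6.0 + (a + b) * u + (b - a) * u * u) / 12.0


def cce_u_given_v(v, a, b):
    """Closed form of E[U | V >= v] for the ACS family."""
    return 0.5 + b * v / 6.0 + (a - b) * v * v / 12.0


def cce_v_given_u_numeric(u, a, b, n=2000):
    """Midpoint rule for the integral of 1 - (v - C(u, v)) / (1 - u) over v."""
    total = 0.0
    for k in range(n):
        v = (k + 0.5) / n
        total += 1.0 - (v - acs_copula(u, v, a, b)) / (1.0 - u)
    return total / n


def cce_u_given_v_numeric(v, a, b, n=2000):
    """Midpoint rule for the integral of 1 - (u - C(u, v)) / (1 - v) over u."""
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        total += 1.0 - (u - acs_copula(u, v, a, b)) / (1.0 - v)
    return total / n


a, b = -2.2658, 0.3253  # 2010 estimates from the talk
print(cce_v_given_u(0.3, a, b), cce_v_given_u_numeric(0.3, a, b))
print(cce_u_given_v(0.6, a, b), cce_u_given_v_numeric(0.6, a, b))
```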

Admission score decisions

Data

Mathematics and Portuguese scores of each student who succeeded at the admission test for the undergraduate course of Electrical Engineering at the University of Campinas in Brazil, from 2010 to 2011.

X = Portuguese score and Y = Mathematics score. An annual standardization was used to avoid the effect of different tests applied each year.

We compute the pseudo-observations

\[
u_i = F(x_i)\,\frac{N}{N + 1} = \frac{\operatorname{rank}(x_i)}{N + 1}
\quad \text{and} \quad
v_i = G(y_i)\,\frac{N}{N + 1} = \frac{\operatorname{rank}(y_i)}{N + 1},
\]

where N is the sample size and F and G are the empirical distributions of X and Y, respectively.
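The pseudo-observations are just ranks divided by N + 1, which keeps every value strictly inside (0, 1). A minimal sketch follows; the function name is hypothetical, and ties are broken by order of appearance, which is an assumption since the talk does not say how ties are treated.

```python
def pseudo_observations(values):
    """Scaled ranks rank(x_i) / (N + 1), with ranks running from 1 to N."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return [r / (n + 1) for r in ranks]


# Hypothetical standardized scores for three students.
print(pseudo_observations([3.0, 1.0, 2.0]))
```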

                                           E[V|U ≥ u]          E[U|V ≥ v]
Year  Students  τ        a        b        Vertex  Type        Vertex  Type
2010  68        -0.0507  -2.2658   0.3253  0.374   min         0.125   max
2011  67        -0.2684  -0.5808  -0.7153  –       decreasing  –       decreasing
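The vertex and type columns can be recomputed from the reported a and b with the vertex formulas of the ACS family. The sketch below treats both conditional expectations as quadratics c0 + c1·t + c2·t² on [0, 1]; the helper name and the rule "monotone when the vertex falls outside (0, 1)" are assumptions of this illustration. With the rounded estimates from the table, the vertices agree with the reported ones to within about 0.001.

```python
def describe_quadratic(c1, c2):
    """Vertex and shape on [0, 1] of t -> c0 + c1*t + c2*t**2 (c2 != 0)."""
    vertex = -c1 / (2.0 * c2)
    if 0.0 < vertex < 1.0:
        return vertex, "min" if c2 > 0 else "max"
    slope_mid = c1 + c2  # derivative at t = 0.5; its sign holds on all of [0, 1]
    return None, "increasing" if slope_mid > 0 else "decreasing"


def describe_year(a, b):
    # E[V | U >= u] = (6 + (a + b)u + (b - a)u^2) / 12
    v_given_u = describe_quadratic((a + b) / 12.0, (b - a) / 12.0)
    # E[U | V >= v] = 1/2 + (b / 6)v + ((a - b) / 12)v^2
    u_given_v = describe_quadratic(b / 6.0, (a - b) / 12.0)
    return v_given_u, u_given_v


# Estimates (a, b) as reported in the table above.
estimates = {2010: (-2.2658, 0.3253), 2011: (-0.5808, -0.7153)}
for year, (a, b) in estimates.items():
    print(year, describe_year(a, b))
```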

Final remarks

We have explored

Copula theory applied to educational data.

Cumulative conditional expectation as a measure for decision making.

Work in progress

Mathematical and statistical properties of the cumulative conditional expectation.

Relation between the cumulative conditional expectation and the directional dependency given by E[V | U = u0].

Analytical expressions for other copula families, for example the Generalized Farlie-Gumbel-Morgenstern C(u, v) = uv + f(u)g(v).

Application to data from other courses.

References

Cherubini, U., Luciano, E. and Vecchiato, W. (2004). Copula Methods in Finance. John Wiley & Sons.

Embrechts, P., Lindskog, F. and McNeil, A. (2003). Modelling Dependence with Copulas and Applications to Risk Management. Handbook of Heavy Tailed Distributions in Finance. Elsevier.

Frees, E., Carriere, J. and Valdez, E. (1996). Annuity valuation with dependent mortality. Journal of Risk and Insurance 63, 229-261.

Frees, E. and Wang, P. (2005). Credibility using copulas. North American Actuarial Journal 9 (2), 31-48.

García, J. E., Gonzalez-Lopez, V. A. and Viola, M. L. L. (2012). Robust model selection and the statistical classification of languages. AIP Conference Proceedings: 11th Brazilian Bayesian Statistics Meeting, v. 1490, p. 160-170.

Genest, C. and Favre, A. C. (2007). Everything you always wanted to know about copula modeling but were afraid to ask. Journal of Hydrologic Engineering 12, 347-368.

Gräler, B., Kazianka, H. and de Espindola, G. M. (2010). Copulas, a novel approach to model spatial and spatio-temporal dependence. GIScience for Environmental Change Symposium Proceedings 40, 49-54.

Nelsen, R. B., Quesada Molina, J. J. and Rodríguez Lallena, J. A. (1997). Bivariate copulas with cubic sections. J Nonparametr Statist 7, 205-220.

Thanks!
