Correcting the population of Brazilian municipalities using the Jackknife model *

Pedro Santos and Enlinson Mattos **

Abstract

This paper proposes a method for identifying and correcting the observed distortion in the population distribution of Brazilian municipalities present in the Demographic Census data (Litschig, 2012 and Monasterio, 2014). This distortion is characterized by a high concentration of municipalities with populations close to the bands used as criteria for distributing funds to municipalities in Brazil (Municipal Participation Fund, MPF). The proposed method uses two steps. First, we follow Gabaix (1999) and use a model that follows Zipf’s Law to estimate the population of cities (Zipf, 1949), as in Monasterio (2014). Next, using the Jackknife method, which minimizes population deviations from the standard model, we identify the candidate municipalities for the adjustment and replace their population with the expected counterpart, so as not to present significant discontinuities in the population distribution (McCrary, 2008). We found no significant change in the MPF distribution after this correction, which suggests that the observed phenomenon of discontinuous population distribution is small and, more importantly, models using fuzzy RDD (Brollo et al, 2011) cannot be completely invalidated.

Keywords: Municipal Participation Fund; Demographic Census; Vertical fiscal transfers; Zipf’s Law; manipulation of the running variable.

JEL codes: H70; H80; C40.

* Submitted on 16/09/2015. Revised on 17/07/2017. We are grateful for the comments of an anonymous reviewer, Paulo Arvate and Leonardo Monasterio. Any errors are the authors’ responsibility.

** Sao Paulo School of Economics. Getulio Vargas Foundation. Email: [email protected] and [email protected].

Brazilian Review of Econometrics v. 38, n. 1, pp. 1–38 May 2018

1. Introduction

The main instrument for distributing federal government resources to municipalities in Brazil is the Municipal Participation Fund (MPF), under which population is the only criterion for distribution to municipalities with fewer than 142,633 inhabitants. Moreover, the distribution is carried out discretely: population bands (up to 10,188, between 10,188 and 13,584, etc.) determine the quantity of resources granted, with a discontinuous increase in transfers for each new band reached. This gives municipalities an incentive to seek strategies to increase their population, in light of the gains they would obtain once the minimum population of the next band was reached. This distortion was initially studied in Litschig (2012) and Monasterio (2014), who found evidence from 1991 onwards of statistically significant discontinuities in the density of municipalities precisely at the MPF band changes.

This paper proposes a strategy to correct the distortion in the population distribution of Brazilian municipalities. This distortion is characterized by a concentration of municipalities with population values close to the band changes of the MPF, present in the data from the Demographic Census. The proposed method uses two steps. First, we follow Gabaix (1999) and use a model that follows Zipf’s Law to estimate the population of cities (Zipf, 1949), which is similar to the strategy found in Monasterio (2014). Next, using the Jackknife method, which minimizes population deviations from the model, we identify the candidate municipalities for adjustment and replace their populations with the estimated counterpart, so that the population distribution no longer presents significant discontinuities (McCrary, 2008).

This question is highly relevant in the area of public finance. As the rule for distributing funds to small Brazilian municipalities is a discontinuous function of population, various studies use this discontinuity as an identification strategy to estimate the impact of an increase in municipal resources on outcomes such as corruption (Brollo et al, 2013), education and poverty (Litschig and Morrison, 2013), and public spending (Arvate et al, 2011, 2015; Castro and Regattieri, 2014).

The main contribution of this study consists of using an outlier detection method (Jackknife) to iteratively identify and treat units in a sample that do not follow a particular distribution or rule of growth. It is worth highlighting the similarity of our strategy with the use of instrumental variables. Zipf’s Law (Zipf, 1949) works here as our instrument for positioning the cities in the distribution. That is, it is an exogenous rule used as an instrument for the municipalities identified as outliers. The success of the method is then measured by comparing the discontinuities before and after its application. After reconstructing the population distribution, we estimate what the MPF distribution would be with this corrected population. The study covers municipalities with up to 40,000 inhabitants, which corresponds to 6 different population bands. As in Monasterio (2014), these were chosen because they have the greatest incentives to overestimate their populations. The data come from the 2000 Demographic Census, from the Federal Court of Auditors (TCU) for 2007, and from the 2010 Demographic Census.1

Our results from the McCrary (2008) test suggest that the Jackknife and correction methods were effective in identifying and eliminating the discontinuity in the population distribution. On the other hand, our results suggest that this municipal population correction does not significantly affect the MPF distribution. This evidence supports the use of fuzzy RDD, in which the running variable is the municipality’s population and the instrument is the MPF amount that the municipality should receive in accordance with the law.

The paper is divided into four sections besides this introduction. In the next

section, we present the institutional background regarding the MPF and Zipf’s Law.

In the following section, we discuss the methodology and the data. In section 4, we

present the results, and then we present the conclusion.

1 Also see Eggers et al (2015) and Gerard et al (2015) for possible solutions for running variables

with possible manipulation problems.

2. Background regarding the MPF distribution and Zipf’s Law (1949)

According to Mendes (2008), the MPF is “a redistributive transfer. That is, resources are sent to each municipality in accordance with a previously determined distribution formula, which has no relationship with the amount of taxes raised in the municipality itself. [...] The distribution is mainly carried out in accordance with municipal population”. The fund was first defined on December 1st 1965 in article 21 of Constitutional Amendment n. 18 to the 1946 Constitution. It consisted of 10% of receipts from IPI (tax on industrialized products) and from IR (income tax). Its distribution began in 1967, after regulation via the National Tax Code (October 25th 1966). At the time, the only criterion for calculating the portion sent to each municipality was its population.

On February 28th 1967, via Complementary Act of the President of the Republic n. 35, the municipalities were categorized into Capitals (which would receive 10% of the MPF) and Interior (which would receive the remaining 90%). On August 27th 1981, Decree-Law n. 1,881 created a new category of municipality, called Reserve, consisting of those with populations greater than 156,216 inhabitants. The division of the MPF amount was changed accordingly (Capitals: 10%, Reserve: 3.6%, and Interior: 86.4%). The 1988 Constitution raised the share of IPI and IR receipts allocated to the fund from the 17% then in force to 22.5%, valid from 1993 onwards. Later, on September 20th 2007, Constitutional Amendment n. 55 raised this value to 23.5%.

Specifically regarding the distribution of resources to Interior municipalities, the methodology takes the form described in Decree-Law n. 1,881 (August 27th 1981), ratified on December 28th 1989 by Complementary Law n. 62. Two main criteria are taken into consideration: the Share of the States (re-released annually) and the Coefficients of the Municipalities.

Thus, the distribution takes the following form: of the total MPF amount

(100%), 86.4% is sent to the Interior municipalities. From this value the percentage

corresponding to the Share (Table 1) of the State in question is extracted. Finally,

to determine the value sent to the Municipalities, the coefficient of each of them is

divided by the sum of the coefficients of all the Municipalities in the State (Table

2). In general, the payment made to Municipality i in State j is given by:

Table 1: Share of the States in the MPF

State                  Share %    State                  Share %
Acre                   0.2630     Paraíba                3.1942
Alagoas                2.0883     Paraná                 7.2857
Amapá                  0.1392     Pernambuco             4.7952
Amazonas               1.2452     Piauí                  2.4015
Bahia                  9.2695     Rio de Janeiro         2.7379
Ceará                  4.5864     Rio Grande do Norte    2.4324
Espírito Santo         1.7595     Rio Grande do Sul      7.3011
Goiás                  3.7318     Rondônia               0.7464
Maranhão               3.9715     Roraima                0.0851
Mato Grosso            1.8949     Santa Catarina         4.1997
Mato Grosso do Sul     1.5004     São Paulo              14.262
Minas Gerais           14.1846    Sergipe                1.3342
Pará                   3.2948     Tocantins              1.2955

Source: TCU Resolution no. 242/90, January 2nd 1990 apud STN (2012)

Table 2: Coefficient of Interior Municipalities in the MPF

Population                Coef    Population                  Coef
Up to 10,188              0.6     From 61,129 to 71,316       2.4
From 10,189 to 13,584     0.8     From 71,317 to 81,504       2.6
From 13,585 to 16,980     1.0     From 81,505 to 91,692       2.8
From 16,981 to 23,772     1.2     From 91,693 to 101,880      3.0
From 23,773 to 30,564     1.4     From 101,881 to 115,464     3.2
From 30,565 to 37,356     1.6     From 115,465 to 129,048     3.4
From 37,357 to 44,148     1.8     From 129,049 to 142,632     3.6
From 44,149 to 50,940     2.0     From 142,633 to 156,216     3.8
From 50,941 to 61,128     2.2     Over 156,216                4.0

Source: Decree-Law no. 1,881/1981 apud STN (2012)

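$$\text{Payment}_{i,j} = \text{MPF} \times 86.4\% \times \text{Share}_j \times \frac{\text{Coefficient}_i}{\sum_{k \in j} \text{Coefficient}_k} \tag{1}$$

where MPF is the total amount of the fund (FPM in its Portuguese acronym), Share_j is the share of State j (Table 1), and the sum runs over all the Interior municipalities k in State j.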

Note that as we are considering municipalities with up to 40,000 inhabitants,

from Table 2 this is equivalent to considering 6 different population levels, and

therefore 6 coefficients (0.6, 0.8, 1.0, 1.2, 1.4, and 1.6).
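As an illustration of equation (1), the following minimal sketch computes the payment for a municipality from the Table 2 coefficients. The function names, the hypothetical state with three municipalities, and the fund size are assumptions made for the example, not figures from the paper; only the band limits, the coefficients, and the Acre share come from Tables 1 and 2.

```python
from bisect import bisect_left

# Upper limits of the first population bands and their coefficients (Table 2).
BAND_UPPER = [10188, 13584, 16980, 23772, 30564, 37356, 44148]
BAND_COEF = [0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8]


def coefficient(population):
    """Return the Table 2 coefficient for a population of at most 44,148."""
    idx = bisect_left(BAND_UPPER, population)
    if idx == len(BAND_UPPER):
        raise ValueError("population above the bands included in this sketch")
    return BAND_COEF[idx]


def payment(total_mpf, state_share_pct, population, state_populations):
    """Equation (1): the municipality's part of its state's Interior pool."""
    state_pool = total_mpf * 0.864 * state_share_pct / 100.0
    coef_sum = sum(coefficient(p) for p in state_populations)
    return state_pool * coefficient(population) / coef_sum


# Hypothetical state with three Interior municipalities and a made-up fund size;
# 0.2630 is the Acre share from Table 1, used only as an example value.
populations = [9000, 10189, 20000]
shares = [payment(1_000_000.0, 0.2630, p, populations) for p in populations]
```

Moving from 10,188 to 10,189 inhabitants raises the coefficient from 0.6 to 0.8, which is the discontinuity that creates the incentive discussed in the next subsection.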

2.1 Problems resulting from the MPF payment calculation

Applying this methodology to distribute resources has advantages and drawbacks (regarding the fiscal gap, fiscal responsibility, and efficient management, among others), discussed in Mendes (2008). One of these points is the focus of this paper: how municipalities with small populations are encouraged to behave strategically in relation to the MPF distribution rules.

Two examples are (i) small municipalities subdividing in order to increase the per capita income derived from this source (Mendes, 2008, Mattos and Ponczeck, 2013, and Arvate et al, 2015) and (ii) encouraging population increases, whether via migratory policies, mobilization of the population for recounts, or deliberate fraud in the population censuses (Monasterio, 2014), with the aim of reaching the next population band and thus increasing the per capita income obtained. This second example can be observed in the population distribution of the 2010 Demographic Census, in which there is a high concentration of municipalities immediately after MPF band changes, represented in Figure 1 by the vertical lines.

2.2 Exogenous instrument: Zipf’s Law

As it is difficult to find the (exogenous) determinants of a municipality’s population in order to identify what the population of that unit really should be, our hypothesis is that it follows Zipf’s Law (Zipf, 1949). The law describes the relationship between the size of an observation unit and its position (ranking) in relation to the other units in the same sample; this relationship becomes linear when the logarithms of the two variables are taken, as in the model below:

$$\log(R_i) = a + b \log(P_i) + \varepsilon_i \tag{2}$$

Figure 1: Histogram of the 2010 population census

Source: IBGE (2011), Monasterio (2014), and the authors’ calculations

where R_i is the ranking of observation i in relation to the other units in the sample, P_i is the value of the variable for observation i, and ε_i is the error. Zipf (1949) already demonstrates an application of this law to the population of cities.2 Gabaix (1999) goes one step further by explaining why this phenomenon repeats in different countries with different structures and antecedents. Figure 1 thus reveals the concentration of municipalities just after the band changes and suggests the influence of non-natural factors in the distribution of the population of Brazilian municipalities. We follow the authors above in calculating the estimated population, as described in Section 3.
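The rank-size relationship in equation (2) can be illustrated with a short sketch. It fits the log-log regression by ordinary least squares on synthetic data that follow the rule exactly; the function name and the synthetic cities are assumptions for the example and do not reproduce the paper's estimates.

```python
import numpy as np


def zipf_fit(populations):
    """OLS of log(rank) on log(population), as in equation (2)."""
    pop = np.sort(np.asarray(populations, dtype=float))[::-1]
    rank = np.arange(1, pop.size + 1)           # rank 1 = largest city
    b, a = np.polyfit(np.log(pop), np.log(rank), deg=1)
    return a, b


# Synthetic cities that follow the rank-size rule exactly: P = 1e6 / rank.
ranks = np.arange(1, 201)
a, b = zipf_fit(1e6 / ranks)
```

With these synthetic data the slope b is -1 and the intercept a equals log(1e6), the textbook form of Zipf's Law; real municipal data would show deviations captured by the error term.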

3. Methodology

Our empirical strategy consists of four steps. First, we estimate a linear model of the positioning of the cities within the distribution when it follows Zipf’s Law (Zipf, 1949). Second, we use the Jackknife method to identify the municipalities with the greatest population distortions in relation to the model’s estimates. Third, we adjust the populations of these municipalities. Fourth, we verify with the McCrary (2008) test whether the discontinuity in the population distribution has been eliminated.

2 Monasterio (2014) also adopts a similar procedure to the one adopted by us in this stage.

Similarly, we use the fourth degree polynomial to obtain an adequate adjustment.

3.1 Linear model

We estimate a linear model with the following format (similar to Monasterio, 2014):

$$\ln(\text{population}_i) = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \beta_3 X_i^3 + \beta_4 X_i^4 + D^a_i + D^p_i + \varepsilon_i \tag{3}$$

where population_i is the population of municipality i in the observed sample, X_i is the position (“rank”) of municipality i with relation to its population within the observed sample, β_j are the linear coefficients, D^a_i and D^p_i are the dummy variables attributed to the municipalities, and ε_i are the errors. Unlike Monasterio (2014),3 we attribute two dummies to each band discontinuity: one before (D^a_i), for municipalities with a population within 97.5% to 100.0% of the MPF cut-off value, and another after (D^p_i), for those within 100.0% to 102.5%.4

The results of the regression of model (3) are presented in Table 3 below, using the year 2010.5 All of the variables were statistically significant, suggesting that there is in fact an impact on the concentration of municipalities around the MPF band changes and that this effect may differ depending on whether the municipality is to the right or left of each threshold.6

Once the coefficients are obtained for all of the variables, the estimated population is calculated for each municipality, considering only the Intercept, Rank, Rank^2, Rank^3, and Rank^4 variables. Removing the dummies from the calculation is expected to approximate the population of each municipality that would be observed in the absence of distortions.

3 The author considers a single dummy variable for each discontinuity. We consider two dummies (before and after), since we allow for different intercepts in relation to the average (β0). With this we are able to observe not only whether there is a difference in average at the distribution band transition threshold, but also whether it differs before and after this frontier. Estimation with a single dummy does not alter our results, which are available from the authors on request.

4 Results are robust to the choice of other bands. 1.5% and 3.5% were tested and the results

are statistically similar. They are available from the authors on request.

5 The results for the other years can be made available by the authors on request. They are all

submitted as supplementary material.

6 Monasterio (2014) had found similar results, but with an absence of significance in the fifth

and sixth dummies. His evaluation covered municipalities with up to 50,000 inhabitants.

Table 3: Linear model of the dependent variable ln(population) of 2010

Coefficients          Estimation                  Standard deviation
Intercept             10.891***                   0.002
Rank                  -0.00125***                 0.000005
Rank^2                0.000000477***              0.000000004
Rank^3                -0.000000000131***          0.000000000002
Rank^4                0.0000000000000129***       0.0000000000000002
dummy previous 1      0.044***                    0.002
dummy previous 2      0.015***                    0.001
dummy previous 3      0.022***                    0.001
dummy previous 4      0.005**                     0.002
dummy previous 5      0.014**                     0.002
dummy previous 6      0.013**                     0.002
dummy posterior 1     0.033***                    0.001
dummy posterior 2     0.012***                    0.001
dummy posterior 3     0.014***                    0.001
dummy posterior 4     0.004***                    0.001
dummy posterior 5     0.01***                     0.001
dummy posterior 6     0.013***                    0.001

N = 3503; 5,000 < pop < 40,000
R2 adj = 0.94

Note: *** 1% significant; ** 5% significant; and * 10% significant. Standard deviation in the second column.

The estimated population of municipality i, obtained from the estimated betas above, takes the form below:

$$\text{estimated population}_i = e^{\beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \beta_3 X_i^3 + \beta_4 X_i^4} \tag{4}$$
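A minimal sketch of how equations (3) and (4) fit together is given here, under stated assumptions: the model is fitted by ordinary least squares with NumPy, each dummy receives its own coefficient, ranks are rescaled to (0, 1] purely for numerical stability, and the input populations are synthetic. The function name `fit_counterfactual` is hypothetical and this is not the authors' code.

```python
import numpy as np


def fit_counterfactual(population, thresholds, window=0.025):
    """Fit equation (3) by OLS and return the equation (4) prediction."""
    pop = np.asarray(population, dtype=float)
    # Rank 1 = largest municipality; ranks are rescaled to (0, 1] only to
    # keep the fourth-degree polynomial well conditioned (an assumption).
    rank = (np.argsort(np.argsort(-pop)) + 1) / pop.size
    poly = np.column_stack([rank ** k for k in range(5)])    # 1, X, ..., X^4
    dummies = []
    for c in thresholds:
        dummies.append((pop >= (1 - window) * c) & (pop < c))   # "before"
        dummies.append((pop >= c) & (pop <= (1 + window) * c))  # "after"
    design = np.column_stack([poly] + [d.astype(float) for d in dummies])
    beta, *_ = np.linalg.lstsq(design, np.log(pop), rcond=None)
    # Equation (4): keep the intercept and rank terms, drop the dummies.
    return np.exp(poly @ beta[:5])


# Synthetic, distortion-free populations between 5,000 and 40,000.
pop = np.exp(np.linspace(np.log(5000), np.log(40000), 500))
est = fit_counterfactual(pop, [10188, 13584, 16980, 23772, 30564, 37356])
```

Because the synthetic data contain no bunching, the dummy coefficients are essentially zero and the estimate reproduces the input; with census data, the gap between observed and estimated values is the residual used in the following steps.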

Figure 2 below shows the difference between the observed and estimated population. The deviations (or residuals) clearly increase gradually, consistently reaching their highest local values where the population lies on the margins of each MPF transfer discontinuity (vertical lines).

Figure 2: Deviations by municipality, 2010

Source: IBGE (2011) and the authors’ calculations

Thus, the quantity of municipalities in each of the bands can be identified and

this can be compared with which band they should rightly be in, in accordance with

the estimate without distortions. This comparison can be seen in Table 4.

Table 4: Quantity of municipalities by MPF band

Rows: observed band. Columns: estimated band.

Observed band     1       2      3      4      5      6      7
1                 1224    –      –      –      –      –      –
2                 106     523    –      –      –      –      –
3                 –       44     384    –      –      –      –
4                 –       –      49     545    –      –      –
5                 –       –      –      22     309    –      –
6                 –       –      –      –      27     203    –
7                 –       –      –      –      –      19     48

Source: IBGE (2011) and the authors’ calculations

It is important to highlight that the example given above is the one that presents the greatest number of municipalities outside of their estimated band, totaling 267 (the off-diagonal entries of Table 4). This corresponds to slightly more than 7% of all the municipalities with fewer than 40,000 inhabitants. A similar strategy was used in Monasterio (2014). This article aims to identify which municipalities contribute more than others to the deviation (from the estimated position); these will have their population replaced with the one estimated in the model, as shown below.

3.2 Jackknife Method

The Jackknife method estimates the bias of an estimator by dividing a sample into different subsamples. It is generally used to construct variance estimators and, because of its versatility, has been applied for various purposes. Its use follows a simple process. The statistic of interest is calculated using the whole sample. Next, one element is eliminated from the sample (yielding a new dataset of size n − 1) and the same statistic is recalculated. This is done for each element of the sample individually. The difference between the statistic of the whole sample and that obtained after removing each element measures that element’s individual influence on the value of the statistic for the entire sample.7

With this magnitude of influence, it can be determined which elements have the most impact on the original statistic, and these can then be treated. Thus, this method has also come to be used in identifying outliers, enabling their adequate treatment. For example, in an efficient frontier context, Sousa (2004) and Sousa and Stosic (2005) identify municipalities in which there could be errors in the collection and storage of the data, or even scale effects. Detecting and possibly excluding outliers enabled the researchers to obtain a more consistent estimate of the efficiency in the provision of services by some Brazilian municipalities. Our next step is to obtain the impact that each municipality has on the population distribution. For this, we use the Jackknife method, which identifies the magnitude of the impact of a municipality on the fit of the linear model proposed in the previous step and shows which ones should be adjusted first in order to obtain a distribution with fewer deviations (McCrary, 2008).

7 This procedure has already been used to correct outliers in other economic studies. See, for example, Sousa and Stosic (2005).

The steps for applying it are (1) define a function J that will be applied to the whole sample and (2) for each element i in the sample, calculate the value J_i that the function takes when only that element is eliminated.

For this use, the standard deviation function was chosen. Thus, each municipality i has a related value J_i in the format:

$$J_i = \sqrt{\frac{\sum_{j=1,\, j \neq i}^{N} \left(\text{residual}_j - \overline{\text{residual}}\right)^2}{N - 1}} \tag{5}$$

J_i is the value resulting from the Jackknife of the residual of the i-th municipality in the sample, N is the sample size, and the numerator sums the squared deviations of each residual (residual = observed − estimated) from the average of the residuals (the overlined term) when observation i is excluded. Note that the residual attributed to each element is found in the previous step, and that j covers all of the elements from 1 to N except i. Thus, the value obtained for each J_i is the standard deviation of the sample without element i.
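The leave-one-out calculation in equation (5) can be sketched as follows. The function name and the toy residuals are assumptions for illustration; the divisor N − 1 follows the equation as written, which equals the number of elements remaining after one is removed.

```python
import numpy as np


def jackknife_std(residuals):
    """J_i of equation (5): standard deviation of the residuals without i."""
    r = np.asarray(residuals, dtype=float)
    n = r.size
    total, total_sq = r.sum(), (r ** 2).sum()
    mean_wo = (total - r) / (n - 1)                   # mean without element i
    ss_wo = (total_sq - r ** 2) - (n - 1) * mean_wo ** 2
    return np.sqrt(np.maximum(ss_wo, 0.0) / (n - 1))  # divisor N - 1 as in (5)


# Toy residuals with one clear outlier at position 3.
toy = np.array([1.0, -2.0, 0.5, 10.0, -0.5])
J = jackknife_std(toy)
most_influential = int(np.argmin(J))
```

Removing the outlier produces the smallest leave-one-out standard deviation, so the most distorting units are the ones with the lowest J_i. This ordering is what the cumulative distribution in Section 3.3 relies on.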

The value obtained for each municipality can be observed in Figure 3. The municipalities with the greatest deviations are similar in Figures 2 and 3. However, the latter enables a comparison of how much each element distorts the statistics of the sample as a whole, and we are thus able to attribute different weights to each municipality depending on how far it is from its estimated population.

3.3 Elimination of the impact

Figure 3: Standard deviation (Jackknife)

Source: IBGE (2011) and the authors’ calculations

After obtaining the impact of each municipality, that is, its contribution to the deviations in relation to Zipf’s Law (Zipf, 1949), we have to determine which treatment should be given to each of these points. One option would be to remove these municipalities from the sample (donut hole strategy, see Barreca et al, 2011). However, this method can affect the continuity of the data exactly at the most important points: the MPF band changes (see Arvate, Mattos, and Rocha, 2015). Thus, we decided to replace their observed populations with the populations estimated using the linear model without the dummies. The process for choosing the municipalities was the following:

First, (1) determine which municipalities have a population within the intervals [MPF_i − d, MPF_i + d], where MPF_i is the value of each band shift and d is the half-width of the interval chosen to contain the municipalities that could be affected by the distortions. We choose 2.5% as the benchmark.8 Next, (2) determine the accumulated density distribution of the municipalities ordered by the standard deviations (Jackknife) of the residuals (Figure 4). With this, we can (3) calculate the standard deviation of the residuals of the original sample. This will be called the initial reference value. Within the iterative process below, this will be the first value that the variable will take.

8 Results are robust to other choices (1.5% and 3.5%).

Figure 4: Accumulated density of the standard deviations (Jackknife) by municipality, 2010

Source: IBGE (2011) and the authors’ calculations

Then, (4) a value λ ∈ [0, 1] is established in the accumulated density distribution. All of the municipalities whose accumulated density is less than λ will have their population adjusted; that is, the adjustment applies to the municipalities whose exclusion generated the lowest standard deviations and which therefore sit at the bottom of the accumulated distribution. After this, (5) the municipality data corrected to the level of λ are saved. The process is (6) iterated by adding the value of α (0.05) to λ and returning to step 4.

For the calibration, the following variable values were chosen:

• d = 1500

• λ (initial) = 0.00

• λ (final) = 0.50

• α = 0.05

After the iterations, the result obtained is a table of data for each year (2000,

2007, 2010) with the population distributions corrected to each value attributed to

λ (from 0 to 0.50, α = 0.05).
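The iterative procedure of steps (1) to (6) might be sketched as below. Several choices here are assumptions rather than details confirmed by the text: the accumulated distribution is computed among the candidate municipalities only, it is measured as the share of candidates with a strictly smaller Jackknife value, and d is expressed in inhabitants as in the calibration list. The function name and the four-municipality example are hypothetical.

```python
import numpy as np


def corrected_populations(observed, estimated, jack_std, thresholds,
                          d=1500, lam_final=0.50, alpha=0.05):
    """Return {lambda: corrected population array} for lambda = 0, alpha, ..."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    jack_std = np.asarray(jack_std, dtype=float)
    # Step 1: candidates lie within d inhabitants of some band change.
    near = np.zeros(observed.size, dtype=bool)
    for c in thresholds:
        near |= np.abs(observed - c) <= d
    # Step 2: cumulative share of candidates ordered by J_i (assumption:
    # the distribution is computed among candidates only).
    cum = np.full(observed.size, np.inf)
    idx = np.flatnonzero(near)
    order = idx[np.argsort(jack_std[idx])]
    cum[order] = np.arange(len(order)) / max(len(order), 1)
    # Steps 4 to 6: raise lambda by alpha and replace flagged populations.
    out = {}
    for k in range(int(round(lam_final / alpha)) + 1):
        lam = round(k * alpha, 10)
        out[lam] = np.where(cum < lam, estimated, observed)
    return out


# Hypothetical example with four municipalities and two band changes.
tables = corrected_populations(
    observed=[10000, 10300, 20000, 13500],
    estimated=[9900, 10100, 20000, 13400],
    jack_std=[0.9, 0.5, 1.0, 0.7],
    thresholds=[10188, 13584],
)
```

Each key of `tables` corresponds to one value of λ, mirroring the table of corrected distributions described above; at λ = 0 nothing is replaced, and at λ = 0.50 the candidates with the lowest Jackknife values receive their estimated population.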

3.4 McCrary Method

McCrary (2008) develops a method for identifying the manipulation of a variable by

evaluating its density function. It is based on a discontinuity estimator measured

at a particular point of interest, in order to ultimately verify whether there is a

distortion there.

The process for defining this estimator follows two steps. The first consists of constructing a histogram of the sample so that each of its classes (bins) contains points from only one side of the discontinuity. That is, one of the histogram cut-offs should fall exactly at the evaluated point of interest (called c). The second step concerns the local linear smoothing of the histogram (LOWESS - locally weighted scatterplot smoothing), where the midpoints of the classes are the regressors (X_j) and the normalized count of the number of observations in each class is the variable of interest (Y_j).

The normalization is calculated in the following way:

$$Y_j = \frac{1}{nb} \sum_{i=1}^{n} \mathbf{1}\left(g(R_i) = X_j\right) \tag{6}$$

where Y_j is the value of the normalized variable of interest, b is the length of each class of the histogram, n is the number of observations, g(R_i) is the midpoint of the class that contains R_i, so that the sum provides the count of the number of observations within the j-th class, and R_i is the running variable, which in this case is population.
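The first step of the test, the normalized histogram of equation (6), can be sketched as follows. The bins are indexed relative to the cut-off c so that c is always a bin edge. The function name and the five sample populations are assumptions for illustration.

```python
import numpy as np


def normalized_histogram(r, c, b):
    """Bin midpoints X_j and heights Y_j, with the cut-off c as a bin edge."""
    r = np.asarray(r, dtype=float)
    # Bin index relative to c, so that no bin straddles the cut-off.
    k = np.floor((r - c) / b).astype(int)
    ks = np.arange(k.min(), k.max() + 1)
    x = c + (ks + 0.5) * b                        # midpoints X_j
    counts = np.array([(k == kk).sum() for kk in ks])
    y = counts / (r.size * b)                     # Y_j = count / (n * b)
    return x, y


# Five hypothetical populations around the first band change.
X, Y = normalized_histogram([10000, 10100, 10188, 10200, 10470],
                            c=10188, b=283)
```

The heights integrate to one over the bins, and an observation exactly at c falls in the first bin to the right of the cut-off.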

To make the visualization of the data clearer, two estimated smoothings (f) are calculated, one for each side of the point (c) where the discontinuity is being investigated. The estimated value of the discontinuity is then obtained as the difference between the values that the two smoothed functions (to the right and to the left) take at point c.9

Finally, there is the definition of the length of each class (b) and of the width of the band (h). McCrary (2008) suggests different possibilities for obtaining these values numerically. Note that this test makes it possible to evaluate whether there has been a specific effort to concentrate elements of a sample on one side of a particular limit. In our case, the verification concerns the various MPF band changes, checking what the distortion is before and after the adjustment of the populations. This distortion is calculated as the difference in the logarithm of the f function evaluated at the point of discontinuity c on the two sides (to the right and to the left).

9 See McCrary (2009).
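A simplified sketch of the second step is given below: a separate weighted linear fit on each side of c, followed by the difference in logarithms at c. The triangular kernel, the function names, and the synthetic histogram are assumptions for illustration; the sketch omits the standard error and the automatic bandwidth choice, so it is not a full implementation of the test.

```python
import numpy as np


def local_linear_at(x, y, c, h):
    """Triangular-kernel weighted linear fit of y on (x - c); value at c."""
    w = np.clip(1.0 - np.abs(x - c) / h, 0.0, None)
    keep = w > 0
    sw = np.sqrt(w[keep])                          # weighted least squares
    design = np.column_stack([np.ones(keep.sum()), x[keep] - c])
    coef, *_ = np.linalg.lstsq(design * sw[:, None], y[keep] * sw, rcond=None)
    return coef[0]


def log_density_gap(x, y, c, h):
    """ln f(c+) - ln f(c-), from separate fits on each side of c."""
    left, right = x < c, x > c
    f_left = local_linear_at(x[left], y[left], c, h)
    f_right = local_linear_at(x[right], y[right], c, h)
    return np.log(f_right) - np.log(f_left)


# Synthetic histogram whose density doubles at the cut-off.
c = 10188.0
x = c + (np.arange(-20, 20) + 0.5) * 283
y = np.where(x < c, 1e-4, 2e-4)
theta = log_density_gap(x, y, c, h=2000.0)
```

For this synthetic histogram the estimated gap equals ln 2, since the density to the right of c is twice the density to the left.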

Following Monasterio (2014), we set b = 283 because “it is prime and a divider of all the MPF shift points” (Monasterio, p. 185). For h, we used the automatic procedure proposed by McCrary (2008), which follows Fan and Gijbels (1996).10
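The arithmetic behind this choice of b is easy to check. The sketch below tests the six band changes that appear as column headers in Tables 5 to 7; the helper `is_prime` is written for this example only.

```python
# Band changes taken from the column headers of Tables 5 to 7.
thresholds = [10188, 13584, 16980, 23772, 30564, 37356]
b = 283

def is_prime(n):
    """Trial division; sufficient for small integers such as 283."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

# Number of classes of length b that fit exactly below each band change.
classes_below = {t: t // b for t in thresholds if t % b == 0}
all_divisible = len(classes_below) == len(thresholds)
```

Since every band change is an exact multiple of 283, a histogram with classes of this length never has a class that straddles a threshold.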

The next step was to apply the method to two samples whose only difference is the adjustment carried out in the previous steps. In the example below, we chose the 2010 census, the first change in MPF band (10,188 inhabitants), and a λ of 0.50. Figures 5 and 6 show visually how the adjustment influenced the distortion.

For the sample prior to the adjustment, the discontinuity estimate (the difference between the functions to the right and to the left of point c) was 1.43 with a standard error of 0.13. After the adjustment, it decreased to 0.10 with a standard error of 0.09. The estimate thus ceased to be statistically significant and the evidence of discontinuity disappeared. This can also be verified visually by comparing the figures.
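The significance statement can be reproduced from the two reported numbers. In the sketch below the test statistic is the estimate divided by its standard error; the star cutoffs are the usual large-sample two-sided critical values for 10%, 5%, and 1%, which is an assumption of this example, since the exact critical values used for the tables are not stated here.

```python
def t_statistic(estimate, std_error):
    """Ratio of the discontinuity estimate to its standard error."""
    return estimate / std_error

def stars(t):
    """Significance stars under assumed large-sample critical values."""
    a = abs(t)
    if a >= 2.576:
        return "***"  # 1%
    if a >= 1.960:
        return "**"   # 5%
    if a >= 1.645:
        return "*"    # 10%
    return ""

t_before = t_statistic(1.43, 0.13)  # roughly 11
t_after = t_statistic(0.10, 0.09)   # roughly 1.1

label_before = stars(t_before)
label_after = stars(t_after)
```

Under these assumed cutoffs the pre-adjustment estimate is significant at 1% and the post-adjustment estimate is not significant at 10%.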

10 The automatic procedure used by McCrary (2008) can be summarized in the following steps: (A) Calculate an initial histogram, which will be used as a basis for the calculations, using a bin size estimated by the algorithm or provided by the programmer. In this study, the value provided was 283. (B) Using the initial version of the histogram, estimate a fourth-order polynomial separately on each side of the discontinuity. For each side, calculate the value of the expression $\theta \left[ \sigma^2 (b - a) / \sum f''(X_j)^2 \right]^{1/5}$ and set h to the average of the two values. Here θ is equal to 3.348, σ² is the mean squared error of the regression, b − a is equal to X_J − c for the right side of the regression and c − X_1 for the left side, and f''(X_j) is the estimate of the second-order derivative of the polynomial found for the overall distribution. This second step is based on a bandwidth selection rule developed by Fan and Gijbels (1996). (C) With the bandwidth determined, the initial histogram (calculated with the pre-determined bin) and the f(r) curve based on the value of h provide a detailed overview of the distribution of the variable around the point of discontinuity. We tested the Cross-Validation option proposed by Stone (1974) and the results are similar.
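Step (B) of footnote 10 can be sketched in a few lines. This is an illustrative reading of the formula, not the DCdensity code: the function names are invented, the mean squared error uses a degrees-of-freedom correction (an assumption), and the regressor is rescaled before fitting purely for numerical stability, with the second derivative converted back to the original units.

```python
import numpy as np

THETA = 3.348  # constant given in footnote 10

def side_bandwidth(x, y, c, span):
    """theta * [sigma^2 * (b - a) / sum f''(X_j)^2]^(1/5) for one side."""
    z = (x - c) / span                          # rescaled regressor
    coef = np.polyfit(z, y, 4)                  # fourth-order polynomial
    resid = y - np.polyval(coef, z)
    mse = np.sum(resid ** 2) / (len(x) - 5)     # assumed df-corrected MSE
    # Second derivative with respect to x, evaluated at every X_j.
    f2 = np.polyval(np.polyder(coef, 2), z) / span ** 2
    return THETA * (mse * span / np.sum(f2 ** 2)) ** 0.2

def automatic_bandwidth(x, y, c):
    """Average of the left-hand and right-hand bandwidths."""
    left, right = x < c, x > c
    h_left = side_bandwidth(x[left], y[left], c, c - x[left].min())
    h_right = side_bandwidth(x[right], y[right], c, x[right].max() - c)
    return 0.5 * (h_left + h_right)

# Invented histogram: a smooth declining density plus a little noise.
rng = np.random.default_rng(0)
c, b = 10188.0, 283.0
x_mid = c + (np.arange(-30, 30) + 0.5) * b
y_norm = 1e-4 * np.exp(-(x_mid - 5000.0) / 8000.0) + rng.normal(0.0, 2e-6, x_mid.size)
h = automatic_bandwidth(x_mid, y_norm, c)
```

The result is expressed in the same units as the running variable (inhabitants), since the ratio inside the brackets has the dimension of population to the fifth power.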

Figure 5: Initial database (2010 census, first change in band)
Source: IBGE (2011)

Figure 6: After adjustment, λ = 0.50 (2010 census, first change in band)
Source: IBGE (2011) and the authors’ calculations

4. Results

After applying our methods to the population samples from the 2000 and 2010 Demographic Censuses and the 2007 TCU data, we obtained the changes in the discontinuity estimates that result from adjusting the municipalities whose exclusion considerably reduced the standard deviation of the deviations from the estimated population. The estimates, their standard errors, and their significance are reported in Tables 5, 6, and 7.

For each band, the λ needed to remove the significance of the discontinuity is different; its value is highlighted in each column. Moreover, as λ increases, the significance of the difference between the functions to the right and to the left declines more and more, in a monotonic way. The only exception is the sixth MPF band in 2000, where the statistic is not significant at λ = 0 but becomes significant once the first adjustment is applied. For all of the subsequent λ values in this band, however, monotonicity holds.

More importantly, except for the fourth 2010 MPF band, a λ of 0.30 eliminates all of the population discontinuities observed in the different MPF bands, according to the t test. This λ also identifies which municipalities can be considered outliers. By assigning to these municipalities the population estimated by Zipf’s Law (Zipf, 1949), as predicted by equation (3), we can note, also graphically, that the population distributions become continuous again in those MPF bands.11

Tables 5, 6, and 7 also indicate that with λ = 0.30, we could correct all of the

population distortions observed in the small municipalities, except close to band

4 of the 2010 Census, when our procedure does not appear to be efficient. Note

that in these tables for each band we highlight the λ that guarantees the absence of

statistical difference in the municipality density histogram between the two sides of

the population bands.

11 Graphs contained in the supplementary file.

One possible interpretation of our method is that we use a rule that is completely exogenous from the point of view of the municipalities (Zipf, 1949) to obtain an estimated (instrumented) population which, in turn, replaces the population of the municipalities whose population can be considered endogenous according to the Jackknife method.
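The mechanics of this interpretation can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' R implementation: the expected population comes from an ordinary least squares fit of log population on log rank, the influence of a municipality is the relative drop in the standard deviation of the residuals when it is left out, and `cutoff` is a hypothetical threshold. How the paper's λ maps to such a threshold is defined in the Methodology and is not reproduced here.

```python
import numpy as np

def zipf_fit(pop):
    """Fitted log population from an OLS rank-size regression (assumed form)."""
    order = np.argsort(-pop)
    rank = np.empty(len(pop))
    rank[order] = np.arange(1, len(pop) + 1)   # rank 1 = largest municipality
    slope, intercept = np.polyfit(np.log(rank), np.log(pop), 1)
    return intercept + slope * np.log(rank)

def jackknife_influence(pop):
    """Relative drop in residual SD when each municipality is left out."""
    pop = np.asarray(pop, dtype=float)
    fitted = zipf_fit(pop)
    full_sd = (np.log(pop) - fitted).std(ddof=1)
    n = len(pop)
    influence = np.empty(n)
    for i in range(n):
        sub = pop[np.arange(n) != i]           # leave observation i out
        sd_i = (np.log(sub) - zipf_fit(sub)).std(ddof=1)
        influence[i] = (full_sd - sd_i) / full_sd
    return influence, np.exp(fitted)

def replace_flagged(pop, cutoff):
    """Replace flagged populations with their expected counterpart."""
    pop = np.asarray(pop, dtype=float)
    influence, expected = jackknife_influence(pop)
    flagged = influence > cutoff               # hypothetical selection rule
    return np.where(flagged, expected, pop), flagged

# Invented sample: a noisy rank-size distribution of 40 municipalities.
rng = np.random.default_rng(0)
pop = 1e6 / np.arange(1, 41) * np.exp(rng.normal(0.0, 0.05, 40))
corrected, flagged = replace_flagged(pop, cutoff=0.0)
```

The loop refits the regression once per municipality, which is quadratic in the sample size but keeps the leave-one-out logic explicit.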

Finally, Table 8 indicates how many municipalities would have their population corrected for each λ and each population band in 2000, 2007, and 2010. With a λ of 0.30, which corrects all of the distortions found (except in band 4 of the 2010 census), we would correct 302, 319, and 337 municipalities in 2000, 2007, and 2010, respectively. This strategy indicates the upper limit of alterations, since we could allow a different λ for each band and year.12

5. MPF Distribution per capita

With both the original and corrected distributions, we investigated what happens to the MPF per capita granted to the municipalities in the two cases. The value obtained corresponds to the share that each municipality would have in the total MPF distributed in the year.13

Once the MPF variables were calculated for the two populations, we treated population as the independent variable and MPF per capita as the dependent variable in a fourth-degree polynomial regression. A specific polynomial was estimated for each MPF band in order to obtain the best possible smoothing of the values of each group of municipalities. A confidence interval of one standard deviation was added to the graphs for comparison.
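A band-by-band fit of this kind can be sketched as follows. The function name `band_polynomial` and the sample data are invented for the example, and the interval is read here as the fitted value plus or minus one residual standard deviation, which is an assumption about how the band in the graphs was built.

```python
import numpy as np

def band_polynomial(pop, mpf_pc, degree=4):
    """Fit MPF per capita on population within one band.

    Returns the fitted values and a band of plus or minus one residual
    standard deviation around them.
    """
    pop = np.asarray(pop, dtype=float)
    mpf_pc = np.asarray(mpf_pc, dtype=float)
    # Centre and scale the regressor so the quartic fit is well conditioned.
    z = pop / pop.mean() - 1.0
    coef = np.polyfit(z, mpf_pc, degree)
    fitted = np.polyval(coef, z)
    sd = np.std(mpf_pc - fitted, ddof=degree + 1)
    return fitted, fitted - sd, fitted + sd

# Invented data for one band: a fixed amount spread over a growing population.
pop = np.linspace(10188, 13583, 50)
mpf_pc = 600.0 / pop
fitted, lower, upper = band_polynomial(pop, mpf_pc)
```

Fitting each band separately lets the curve jump at the band changes instead of forcing one smooth polynomial across thresholds.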

The calculation used the two population samples from each year (2000, 2007, and 2010): the original one, with λ = 0.00, and the corrected one, with λ = 0.30, which, as verified above, is sufficient to remove the discontinuities. The resulting graphs, with the MPF per capita of each municipality and the values expected from the polynomial smoothing, can be observed below.

Visually, there does not appear to be any difference between the two strategies for calculating the MPF, especially at the margins of the population bands, where our procedure acted; that is, the effect of correcting the distortion in the discontinuity of the population distribution does not appear to cause

12 By allowing for different λ’s, the municipalities that would have their populations readjusted

would be 154, 184, and 260, respectively.

13 The value was multiplied by 10^6 for a better representation of the data.


Table 5: Value and significance of the MPF discontinuities (2000)

| Lambda | Statistic | 10188 | 13584 | 16980 | 23772 | 30564 | 37356 |
|---|---|---|---|---|---|---|---|
| 0% | Mean | 0.500 | 0.284 | 0.491 | 0.570 | 0.440 | 0.097 |
| 0% | Standard error | 0.105 | 0.117 | 0.150 | 0.193 | 0.236 | 0.293 |
| 0% | Test | 4.748*** | 2.427** | 3.274*** | 2.947*** | 1.862* | 0.329 |
| 5% | Mean | 0.304 | 0.281 | 0.371 | 0.492 | 0.373 | -0.907 |
| 5% | Standard error | 0.099 | 0.116 | 0.144 | 0.188 | 0.240 | 0.324 |
| 5% | Test | 3.071*** | 2.437** | 2.568** | 2.612*** | 1.553 | 2.8*** |
| 10% | Mean | 0.209 | 0.285 | 0.186 | 0.204 | 0.231 | -0.575 |
| 10% | Standard error | 0.096 | 0.116 | 0.134 | 0.171 | 0.244 | 0.335 |
| 10% | Test | 2.18** | 2.461** | 1.389 | 1.193 | 0.95 | 1.717* |
| 15% | Mean | 0.150 | 0.289 | 0.103 | 0.139 | 0.156 | -0.567 |
| 15% | Standard error | 0.094 | 0.116 | 0.129 | 0.162 | 0.237 | 0.336 |
| 15% | Test | 1.597 | 2.487** | 0.801 | 0.856 | 0.657 | 1.69* |
| 20% | Mean | 0.110 | 0.246 | 0.069 | 0.097 | 0.124 | -0.543 |
| 20% | Standard error | 0.093 | 0.115 | 0.126 | 0.155 | 0.232 | 0.338 |
| 20% | Test | 1.185 | 2.136** | 0.549 | 0.624 | 0.535 | 1.608 |
| 25% | Mean | 0.091 | 0.192 | 0.016 | 0.048 | 0.114 | -0.511 |
| 25% | Standard error | 0.093 | 0.114 | 0.122 | 0.128 | 0.230 | 0.340 |
| 25% | Test | 0.987 | 1.687* | 0.134 | 0.376 | 0.495 | 1.503 |
| 30% | Mean | 0.088 | 0.144 | -0.039 | 0.053 | 0.102 | -0.498 |
| 30% | Standard error | 0.093 | 0.112 | 0.116 | 0.122 | 0.229 | 0.341 |
| 30% | Test | 0.949 | 1.292 | 0.34 | 0.433 | 0.446 | 1.458 |
| 35% | Mean | 0.089 | 0.128 | -0.048 | 0.059 | 0.098 | -0.498 |
| 35% | Standard error | 0.093 | 0.111 | 0.115 | 0.121 | 0.227 | 0.341 |
| 35% | Test | 0.953 | 1.151 | 0.419 | 0.485 | 0.431 | 1.457 |
| 40% | Mean | 0.081 | 0.096 | -0.038 | 0.021 | 0.077 | -0.493 |
| 40% | Standard error | 0.094 | 0.112 | 0.118 | 0.133 | 0.225 | 0.342 |
| 40% | Test | 0.858 | 0.861 | 0.318 | 0.161 | 0.341 | 1.439 |
| 45% | Mean | 0.078 | 0.080 | -0.029 | 0.022 | 0.119 | -0.493 |
| 45% | Standard error | 0.095 | 0.112 | 0.120 | 0.134 | 0.224 | 0.342 |
| 45% | Test | 0.826 | 0.714 | 0.245 | 0.167 | 0.532 | 1.439 |
| 50% | Mean | 0.077 | 0.037 | -0.021 | 0.032 | 0.279 | -0.493 |
| 50% | Standard error | 0.095 | 0.110 | 0.122 | 0.131 | 0.237 | 0.343 |
| 50% | Test | 0.809 | 0.334 | 0.175 | 0.245 | 1.18 | 1.438 |

Columns are the population thresholds of the MPF band changes. Significance values based on the Student t test.


Table 6: Value and significance of the MPF discontinuities (2007)

| Lambda | Statistic | 10188 | 13584 | 16980 | 23772 | 30564 | 37356 |
|---|---|---|---|---|---|---|---|
| 0% | Mean | 0.813 | 0.581 | 0.496 | 0.578 | 0.180 | 0.753 |
| 0% | Standard error | 0.111 | 0.122 | 0.144 | 0.191 | 0.213 | 0.279 |
| 0% | Test | 7.293*** | 4.744*** | 3.442*** | 3.02*** | 0.844 | 2.698*** |
| 5% | Mean | 0.119 | 0.267 | 0.105 | 0.586 | 0.180 | 0.713 |
| 5% | Standard error | 0.093 | 0.114 | 0.125 | 0.196 | 0.211 | 0.301 |
| 5% | Test | 1.284 | 2.347** | 0.843 | 2.987*** | 0.855 | 2.365** |
| 10% | Mean | 0.103 | 0.107 | -0.102 | 0.542 | 0.159 | -0.013 |
| 10% | Standard error | 0.092 | 0.106 | 0.125 | 0.195 | 0.212 | 0.289 |
| 10% | Test | 1.118 | 1.003 | 0.81 | 2.785*** | 0.752 | 0.044 |
| 15% | Mean | 0.105 | 0.010 | -0.132 | 0.379 | 0.128 | -0.052 |
| 15% | Standard error | 0.092 | 0.104 | 0.128 | 0.184 | 0.211 | 0.295 |
| 15% | Test | 1.136 | 0.095 | 1.034 | 2.059** | 0.607 | 0.175 |
| 20% | Mean | 0.112 | -0.012 | -0.118 | 0.312 | 0.138 | -0.027 |
| 20% | Standard error | 0.093 | 0.104 | 0.128 | 0.179 | 0.212 | 0.298 |
| 20% | Test | 1.204 | 0.111 | 0.924 | 1.743* | 0.651 | 0.09 |
| 25% | Mean | 0.112 | -0.025 | -0.102 | 0.190 | 0.168 | -0.016 |
| 25% | Standard error | 0.093 | 0.105 | 0.127 | 0.171 | 0.216 | 0.301 |
| 25% | Test | 1.2 | 0.241 | 0.805 | 1.114 | 0.778 | 0.054 |
| 30% | Mean | 0.116 | -0.030 | -0.087 | 0.104 | 0.205 | 0.000 |
| 30% | Standard error | 0.093 | 0.105 | 0.125 | 0.164 | 0.219 | 0.303 |
| 30% | Test | 1.247 | 0.289 | 0.692 | 0.632 | 0.933 | 0 |
| 35% | Mean | 0.115 | -0.032 | -0.078 | 0.106 | 0.226 | 0.006 |
| 35% | Standard error | 0.093 | 0.105 | 0.124 | 0.164 | 0.221 | 0.303 |
| 35% | Test | 1.243 | 0.304 | 0.631 | 0.642 | 1.024 | 0.02 |
| 40% | Mean | 0.110 | -0.030 | -0.067 | 0.102 | 0.247 | 0.014 |
| 40% | Standard error | 0.094 | 0.105 | 0.123 | 0.164 | 0.223 | 0.303 |
| 40% | Test | 1.176 | 0.281 | 0.543 | 0.623 | 1.11 | 0.045 |
| 45% | Mean | 0.104 | -0.035 | -0.060 | 0.095 | 0.254 | 0.042 |
| 45% | Standard error | 0.094 | 0.107 | 0.123 | 0.164 | 0.224 | 0.300 |
| 45% | Test | 1.103 | 0.325 | 0.486 | 0.58 | 1.135 | 0.139 |
| 50% | Mean | 0.096 | -0.037 | -0.049 | 0.093 | 0.301 | 0.041 |
| 50% | Standard error | 0.095 | 0.108 | 0.123 | 0.164 | 0.227 | 0.300 |
| 50% | Test | 1.004 | 0.343 | 0.398 | 0.566 | 1.325 | 0.137 |

Columns are the population thresholds of the MPF band changes. Significance values based on the Student t test.


Table 7: Value and significance of the MPF discontinuities (2010)

| Lambda | Statistic | 10188 | 13584 | 16980 | 23772 | 30564 | 37356 |
|---|---|---|---|---|---|---|---|
| 0% | Mean | 1.427 | 0.653 | 0.792 | 0.877 | 1.460 | 0.964 |
| 0% | Standard error | 0.131 | 0.121 | 0.145 | 0.199 | 0.280 | 0.327 |
| 0% | Test | 10.904*** | 5.419*** | 5.452*** | 4.401*** | 5.206*** | 2.948*** |
| 5% | Mean | 0.242 | 0.728 | 0.573 | 0.894 | 1.417 | 1.000 |
| 5% | Standard error | 0.093 | 0.126 | 0.139 | 0.204 | 0.281 | 0.338 |
| 5% | Test | 2.614*** | 5.784*** | 4.128*** | 4.39*** | 5.049*** | 2.958*** |
| 10% | Mean | 0.133 | 0.375 | 0.258 | 0.893 | 0.808 | 0.862 |
| 10% | Standard error | 0.088 | 0.115 | 0.123 | 0.203 | 0.238 | 0.321 |
| 10% | Test | 1.506 | 3.271*** | 2.09** | 4.39*** | 3.395*** | 2.686*** |
| 15% | Mean | 0.133 | 0.282 | 0.165 | 0.882 | 0.547 | 0.331 |
| 15% | Standard error | 0.088 | 0.111 | 0.115 | 0.201 | 0.225 | 0.287 |
| 15% | Test | 1.518 | 2.544** | 1.438 | 4.395*** | 2.43** | 1.154 |
| 20% | Mean | 0.123 | 0.209 | 0.107 | 0.801 | 0.450 | 0.242 |
| 20% | Standard error | 0.088 | 0.108 | 0.112 | 0.197 | 0.225 | 0.284 |
| 20% | Test | 1.41 | 1.927* | 0.95 | 4.07*** | 1.994** | 0.853 |
| 25% | Mean | 0.118 | 0.17 | 0.095 | 0.736 | 0.215 | -0.327 |
| 25% | Standard error | 0.088 | 0.107 | 0.114 | 0.195 | 0.228 | 0.281 |
| 25% | Test | 1.339 | 1.59 | 0.835 | 3.781*** | 0.943 | 1.167 |
| 30% | Mean | 0.116 | 0.073 | -0.003 | 0.650 | 0.124 | -0.399 |
| 30% | Standard error | 0.09 | 0.104 | 0.119 | 0.191 | 0.229 | 0.285 |
| 30% | Test | 1.29 | 0.705 | 0.027 | 3.398*** | 0.541 | 1.4 |
| 35% | Mean | 0.114 | 0.038 | -0.057 | 0.606 | 0.066 | -0.362 |
| 35% | Standard error | 0.091 | 0.102 | 0.123 | 0.190 | 0.230 | 0.293 |
| 35% | Test | 1.254 | 0.375 | 0.468 | 3.196*** | 0.287 | 1.238 |
| 40% | Mean | 0.113 | 0.014 | -0.117 | 0.475 | 0.003 | -0.312 |
| 40% | Standard error | 0.091 | 0.101 | 0.127 | 0.183 | 0.230 | 0.299 |
| 40% | Test | 1.242 | 0.14 | 0.923 | 2.596*** | 0.014 | 1.045 |
| 45% | Mean | 0.112 | 0.002 | -0.117 | 0.380 | 0.011 | -0.252 |
| 45% | Standard error | 0.092 | 0.101 | 0.127 | 0.177 | 0.228 | 0.301 |
| 45% | Test | 1.214 | 0.023 | 0.922 | 2.147** | 0.049 | 0.839 |
| 50% | Mean | 0.11 | 0 | -0.119 | 0.297 | 0.014 | -0.211 |
| 50% | Standard error | 0.093 | 0.101 | 0.128 | 0.173 | 0.225 | 0.301 |
| 50% | Test | 1.186 | 0.001 | 0.925 | 1.716* | 0.061 | 0.7 |

Columns are the population thresholds of the MPF band changes. Significance values based on the Student t test.


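Table 8: Number of municipalities with altered population, by MPF band and year

2000 Census - Number of municipalities with altered population by band

| Band \ Lambda (%) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10188 | 43 | 62 | 66 | 72 | 78 | 88 | 91 | 93 | 95 | 97 |
| 13584 | 0 | 0 | 0 | 0 | 4 | 7 | 23 | 36 | 45 | 53 |
| 16980 | 35 | 43 | 47 | 53 | 60 | 68 | 76 | 82 | 89 | 95 |
| 23772 | 2 | 30 | 37 | 39 | 45 | 50 | 51 | 55 | 61 | 65 |
| 30564 | 6 | 16 | 29 | 38 | 41 | 45 | 47 | 52 | 57 | 61 |
| 37356 | 16 | 38 | 44 | 44 | 44 | 44 | 44 | 44 | 44 | 44 |
| Total | 102 | 189 | 223 | 246 | 272 | 302 | 332 | 362 | 391 | 415 |

2007 Census - Number of municipalities with altered population by band

| Band \ Lambda (%) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10188 | 81 | 88 | 91 | 94 | 97 | 99 | 102 | 102 | 104 | 110 |
| 13584 | 18 | 30 | 38 | 41 | 44 | 46 | 48 | 58 | 68 | 74 |
| 16980 | 18 | 40 | 44 | 55 | 67 | 78 | 82 | 87 | 88 | 88 |
| 23772 | 0 | 5 | 8 | 16 | 25 | 29 | 33 | 40 | 42 | 43 |
| 30564 | 2 | 8 | 10 | 12 | 14 | 18 | 21 | 28 | 32 | 42 |
| 37356 | 17 | 39 | 45 | 46 | 47 | 49 | 49 | 51 | 52 | 52 |
| Total | 136 | 210 | 236 | 264 | 294 | 319 | 335 | 366 | 386 | 409 |

2010 Census - Number of municipalities with altered population by band

| Band \ Lambda (%) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10188 | 113 | 130 | 137 | 146 | 146 | 146 | 146 | 146 | 146 | 146 |
| 13584 | 1 | 21 | 34 | 41 | 43 | 48 | 60 | 67 | 73 | 74 |
| 16980 | 30 | 57 | 61 | 67 | 68 | 74 | 77 | 80 | 85 | 92 |
| 23772 | 0 | 0 | 0 | 5 | 6 | 9 | 11 | 13 | 15 | 18 |
| 30564 | 1 | 13 | 20 | 25 | 35 | 40 | 43 | 44 | 45 | 47 |
| 37356 | 0 | 1 | 7 | 15 | 16 | 20 | 26 | 28 | 32 | 39 |
| Total | 145 | 222 | 259 | 299 | 314 | 337 | 363 | 378 | 396 | 416 |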


Figure 7: MPF per capita, original population, 2000

Source: IBGE and the authors’ calculations

an impact on the distribution of the MPF per capita for the municipalities. This

is due to the fact that only some municipalities (around 300) have their population

corrected.

Thus, our proposed method appears to be robust in that it effectively manages

to correct an anomaly observed in the sample, without causing other undesirable

effects due to the methodology for distributing the fund among the other munici-

palities. Another possible interpretation concerns the fact that the use of estimated

populations to correct this anomaly observed in the population of Brazilian munic-

ipalities appears to be a promising strategy. More importantly, the strategies that

use estimated MPF (Fuzzy RDD) should not necessarily be discarded.

Figure 8: MPF per capita, corrected population, 2000
Source: IBGE and the authors’ calculations

Figure 9: MPF per capita, original population, 2007
Source: IBGE and the authors’ calculations

Figure 10: MPF per capita, corrected population, 2007
Source: IBGE and the authors’ calculations

Figure 11: MPF per capita, original population, 2010
Source: IBGE and the authors’ calculations

Figure 12: MPF per capita, corrected population, 2010
Source: IBGE and the authors’ calculations

6. Conclusion

The existence of a distortion in the municipality populations close to the MPF band changes was verified in Litschig (2012) and Monasterio (2014). These authors also identify an apparent increase in the population of the municipalities, which could displace them into the next bands, with the aim of increasing the transfer per capita obtained. This raises the question of how adequate the current criteria are for distributing the Municipal Participation Fund (MPF) and whether population can be correctly used as a running variable in a Regression Discontinuity strategy.

This study contributes in two ways. First, it uses an exogenous rule (Zipf’s Law; Zipf, 1949) to quantify the expected population of each municipality. Second, based on this expected population, we use the Jackknife method to identify the municipalities that contribute most to the deviations from what is expected. For these municipalities, we apply the expected population.

By applying our method to the samples from the 2000 Demographic Census, the 2007 TCU data, and the 2010 Demographic Census, we obtained a significant improvement in the population distortion of municipalities with up to 40,000 inhabitants around the MPF band changes. Originally, a statistically significant discontinuity was observed in 16 of the 18 band changes, showing some form of distortion of the variable. After applying the method for identifying and adjusting the municipalities, we determined in an exploratory way the optimal adjustment point using the McCrary (2008) test and thus managed to eliminate almost all of the distortion in the population of the municipalities for the three observed Censuses. With this, the distortion was smoothed, enabling the resulting sample to be used for purposes that take into account the population order and relationship between Brazilian municipalities.

As can be seen in Tables 5, 6, and 7, the distortions only increased over the years (visible in the increasing value of the discontinuities for λ = 0), and this can concentrate the MPF distribution more and more in the municipalities identified by our strategy (Mendes, 2008). In particular, we estimate that around 300 municipalities are contributing to this practice, which suggests the need to review the way that the MPF is distributed to municipalities in the Interior group.

On the other hand, when evaluating the MPF per capita of the municipalities before and after the proposed correction, no significant alteration is verified in the values obtained close to the band changes. Thus, our strategy does not appear to significantly affect the MPF per capita of the municipalities that should not be the object of correction. Our results suggest that the use of a sharp RD (regression discontinuity) in models that explain MPF by population may not be appropriate, owing to the proven discontinuity in the population distribution of the municipalities.

However, our strategy favors the use of fuzzy RDD regressions in which the MPF is instrumented by its theoretical counterpart, with the corrected population considered as a determinant of the amount to be received by each municipality. That strategy could capture the exogenous variation in the transfers for each economic variable of interest.

References

Arvate, P., Mattos, E. and Rocha, F. (2015). Conditional versus unconditional grants and local public spending in Brazilian municipalities. 35th Meeting of the Brazilian Econometric Society, Foz do Iguaçu, Brazil.

Arvate, P., Mattos, E. and Rocha, F. (2011). Flypaper effect revisited: Evidence for tax collection efficiency in Brazilian municipalities. Estudos Econômicos, v. 41, p. 7-28.


Barreca, A., Guldi, M., Lindo, J. and Waddell, G. (2011). Saving Babies? Revisiting the Effect of the Very Low Birth Weight Classification. Quarterly Journal of Economics, 126.

Brollo, F., Nannicini, T., Perotti, R. and Tabellini, G. (2013). The Political Resource Curse. American Economic Review, 103(5), p. 1759-96.

Castro, M. and Regatieri, R. (2014). Impacto do Fundo de Participação dos Municípios sobre os gastos públicos por função e subfunção: análise através de uma regressão em descontinuidade. 42º Encontro Nacional de Economia, Natal, Rio Grande do Norte, Brasil.

Eggers, A. C., Freier, R., Grembi, V. and Nannicini, T. (2015). Regression Discontinuity Designs Based on Population Thresholds: Pitfalls and Solutions. Working Paper IZA No. 9553, December.

Funaro et al (2009). Diretrizes para apresentação de dissertações e teses da USP: documento eletrônico e impresso. Universidade de São Paulo. Available from: http://www.usp.br/prolam/ABNT 2011.pdf. Accessed on July 10th 2015.

Gabaix, X. (1999). Zipf’s Law for Cities: An Explanation. The Quarterly Journal of Economics, v. 114, n. 3, p. 739-767.

Gerard, F., Rokkanen, M. and Rothe, C. (2015). Identification and Inference in Regression Discontinuity Designs with a Manipulated Running Variable (working paper).

IBGE - Instituto Brasileiro de Geografia e Estatística. Censo 2010. Brasília, IBGE: 2011. Available from: http://www.ibge.gov.br/home/estatistica/populacao/censo2010/default.shtm.

IBGE - Instituto Brasileiro de Geografia e Estatística. Contagem da população 2007. Brasília, IBGE: 2007. Available from: http://www.ibge.gov.br/home/estatistica/populacao/contagem2007.


Litschig, S. (2012). Are rules-based government programs shielded from special-interest politics? Evidence from revenue-sharing transfers in Brazil. Journal of Public Economics, 96(11): 1047-1060.

Litschig, S. and Morrison, K. (2013). The impact of intergovernmental transfers on education outcomes and poverty reduction. American Economic Journal: Applied Economics.

Mattos, E. and Ponczek, V. (2013). Efeitos da divisão municipal na oferta de bens públicos e indicadores sociais. Revista Brasileira de Economia, 67(3), 315-336.

McCrary, J. (2008). Manipulation of the running variable in the regression discontinuity design: A density test. Journal of Econometrics, 142(2): 698–714.

McCrary, J. (2009). Codes for Manipulation of the Running Variable. Contains the code for use in Stata (DCdensity.ado), the calls and outputs of an example (DCdensity example.do, DCdensity example.log, DCdensity example.eps), and an explanation of the implemented code (DCdensity.pdf). Available from: http://eml.berkeley.edu/~jmccrary/DCdensity. Accessed on June 7th 2015.

Mendes, M., Miranda, R. B. and Cossio, F. B. (2008). O Fundo de Participação dos Municípios precisa mudar. Constituição de 1988: O Brasil 20 anos depois - Estado e economia em vinte anos de mudanças, v. 4.

Monasterio, L. M. (2014). A estranha distribuição da população dos pequenos municípios brasileiros. Rev. Econ. NE, Fortaleza, v. 45, n. 4, p. 111-119.

R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available from: http://www.R-project.org/.

S original, from StatLib and by Rob Tibshirani. R port by Friedrich Leisch (2015). bootstrap: Functions for the Book "An Introduction to the Bootstrap". Available from: http://CRAN.R-project.org/package=bootstrap.


Sousa, M. C., Neto, F. C. and Stosic, B. (2004). Explaining DEA Technical Efficiency Scores in an Outlier Corrected Environment: The Case of Public Services in Brazilian Municipalities. Brazilian Review of Econometrics, v. 25, n. 2, p. 287-313, 2005.

Sousa, M. C. S. and Stosic, B. D. (2005). Technical Efficiency of the Brazilian Municipalities: Correcting Nonparametric Frontier Measurements for Outliers. Journal of Productivity Analysis, Springer-Netherlands, v. 24, p. 157-181.

Stone, M. (1974). Cross-Validation and Multinomial Prediction. Biometrika, 61(3), 509-515.

STN - Secretaria do Tesouro Nacional (2012). O que você precisa saber sobre as transferências constitucionais e legais. Available from: http://www3.tesouro.fazenda.gov.br/estados municipios/download/CartilhaMPF.pdf. Accessed on June 14th 2015.

Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort. Addison-Wesley.


A. Appendix

A.1 Codes, databases, and products

All of the code, the databases, and the graphs obtained can be found at the link below: https://goo.gl/zY6Dnu

The code (script and lib folder) covers steps 1 to 3 of the Methodology chapter. The algorithm was written in R. The McCrary code in Stata can be found on the page cited in the References.

The databases (database folder) are the Demographic Censuses and also the variations in MPF per capita.

The products (exports folder) are the graphs and tables generated. The graphs produced with the R code are in the r plots subfolder. For the McCrary algorithm, the tables that were used as a basis are in the stata database subfolder, the file with the command lines is in the stata code subfolder, and the graphs obtained are in the stata plots folder.

A.2 Tables of observed and estimated bands

A.2.1 Year 2000

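Table A.1: 2000, before adjustment (λ = 0.00)

| Observed band | Estimated band 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 1344 | – | – | – | – | – | – |
| 2 | 37 | 566 | – | – | – | – | – |
| 3 | – | 18 | 409 | – | – | – | – |
| 4 | – | – | 29 | 530 | – | – | – |
| 5 | – | – | – | 21 | 315 | 4 | – |
| 6 | – | – | – | – | – | 190 | – |
| 7 | – | – | – | – | – | 13 | 40 |

Source: IBGE (2011) and the authors’ calculations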


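Table A.2: 2000, after adjustment (λ = 0.30)

| Observed band | Estimated band 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 1381 | – | – | – | – | – | – |
| 2 | – | 573 | – | – | – | – | – |
| 3 | – | 11 | 437 | – | – | – | – |
| 4 | – | – | 1 | 551 | – | – | – |
| 5 | – | – | – | – | 315 | 4 | – |
| 6 | – | – | – | – | – | 203 | – |
| 7 | – | – | – | – | – | – | 40 |

Source: IBGE (2011) and the authors’ calculations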

A.2.2 Year 2007

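Table A.3: 2007, before adjustment (λ = 0.00)

| Observed band | Estimated band 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 1284 | – | – | – | – | – | – |
| 2 | 63 | 540 | – | – | – | – | – |
| 3 | – | 36 | 411 | – | – | – | – |
| 4 | – | – | 31 | 553 | – | – | – |
| 5 | – | – | – | 14 | 319 | – | – |
| 6 | – | – | – | – | 6 | 197 | – |
| 7 | – | – | – | – | – | 16 | 53 |

Source: IBGE (2011) and the authors’ calculations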

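Table A.4: 2007, after adjustment (λ = 0.45)

| Observed band | Estimated band 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 1347 | – | – | – | – | – | – |
| 2 | 1 | 575 | – | – | – | – | – |
| 3 | – | – | 442 | – | – | – | – |
| 4 | – | – | 1 | 566 | – | – | – |
| 5 | – | – | – | – | 319 | – | – |
| 6 | – | – | – | – | 7 | 212 | – |
| 7 | – | – | – | – | – | 1 | 52 |

Source: IBGE (2011) and the authors’ calculations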


A.2.3 Year 2010

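Table A.5: 2010, before adjustment (λ = 0.00)

| Observed band | Estimated band 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 1224 | – | – | – | – | – | – |
| 2 | 106 | 523 | – | – | – | – | – |
| 3 | – | 44 | 384 | – | – | – | – |
| 4 | – | – | 49 | 545 | – | – | – |
| 5 | – | – | – | 22 | 309 | – | – |
| 6 | – | – | – | – | 27 | 203 | – |
| 7 | – | – | – | – | – | 19 | 48 |

Source: IBGE (2011) and the authors’ calculations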

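Table A.6: 2010, after adjustment (λ = 0.50)

| Observed band | Estimated band 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| 1 | 1330 | – | – | – | – | – | – |
| 2 | 1 | 564 | 2 | – | – | – | – |
| 3 | – | – | 431 | 2 | – | – | – |
| 4 | – | – | – | 559 | – | – | – |
| 5 | – | – | – | 8 | 336 | – | – |
| 6 | – | – | – | – | – | 222 | – |
| 7 | – | – | – | – | – | 1 | 47 |

Source: IBGE (2011) and the authors’ calculations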

A.3 Results with a similar database to Monasterio (2014)

This study used municipalities with populations between 5,000 and 40,000 inhabitants in the calculations. The interval is slightly different from that of Monasterio (2014), who used municipalities with populations between 5,000 and 50,000 inhabitants. The results of the method applied to each of the discontinuities in this new database (λ = 0.50) are shown below.

The loss and/or reduction of significance in all of the 2010 discontinuity bands (observed in Table 7) also occurs with this new application.

Figure A.1: 2010, first discontinuity
Source: IBGE and the authors’ calculations

Figure A.2: 2010, second discontinuity
Source: IBGE and the authors’ calculations

Figure A.3: 2010, third discontinuity
Source: IBGE and the authors’ calculations

Figure A.4: 2010, fourth discontinuity
Source: IBGE and the authors’ calculations

Figure A.5: 2010, fifth discontinuity
Source: IBGE and the authors’ calculations

Figure A.6: 2010, sixth discontinuity
Source: IBGE and the authors’ calculations

Figure A.7: 2010, seventh discontinuity
Source: IBGE and the authors’ calculations

Table A.7: Value and significance of the MPF discontinuities (λ = 0.50)

| Discontinuity | Mean (Standard error) |
|---|---|
| 1 | 0.135 (0.092) |
| 2 | 0.025 (0.103) |
| 3 | -0.086 (0.106) |
| 4 | 0.155 (0.153) |
| 5 | 0.393 (0.201) * |
| 6 | 0.544 (0.268) ** |
| 7 | -0.618 (0.356) * |

** 5% significance, * 10% significance
Source: IBGE and the authors’ calculations