Correcting the population of Brazilian municipalities using the Jackknife model*
Pedro Santos and Enlinson Mattos**
Abstract
This paper proposes a method for identifying and correcting the observed distor-
tion in the population distribution of Brazilian municipalities present in the De-
mographic Census data (Litschig, 2012 and Monasterio, 2014). This distortion is
characterized by a high concentration of municipalities with populations close to the
bands used as criteria for distributing funds to municipalities in Brazil (Municipal
Participation Fund, MPF). The proposed method uses two steps. First, we follow
Gabaix (1999) and use a model that follows Zipf’s Law to estimate the population
of cities (ZIPF, 1949), as in Monasterio (2014). Next, using the Jackknife method,
which minimizes population deviations from the standard model, we identify the
candidate municipalities for the adjustment and replace their population with the
expected counterpart, so as not to present significant discontinuities in the popu-
lation distribution (McCrary, 2008). We found no significant change in the MPF
distribution after this correction, which suggests that the observed phenomenon of
discontinuous population distribution is small and, more importantly, models using
fuzzy RDD (Brollo et al 2011) cannot be completely invalidated.
Keywords: Municipal Participation Fund; Demographic Census; Vertical fiscal
transfers; Zipf’s Law; manipulation of the running variable.
JEL codes: H70; H80; C40.
* Submitted on 16/09/2015. Revised on 17/07/2017.
We are grateful for the comments of an anonymous reviewer, Paulo Arvate and Leonardo
Monasterio. Any errors are the authors’ responsibility.
** Sao Paulo School of Economics. Getulio Vargas Foundation. Email: [email protected]
and [email protected].
Brazilian Review of Econometrics
v. 38, n. 1, pp. 1–38 May 2018
1. Introduction
The main instrument for distributing federal government resources to municipalities
in Brazil is the Municipal Participation Fund (MPF), which considers population
the only criterion for distribution to those municipalities with fewer than 142,633
inhabitants. Moreover, the distribution itself is carried out discretely, in which
population bands (up to 10,188, between 10,188 and 13,584, etc.) are used to deter-
mine the quantity of resources granted, with there being a discontinuous increase in
transfers for each new band reached. This becomes an incentive for municipalities
to seek strategies to increase their population, in light of the gains they would ob-
tain once the minimum population of the next band was reached. This distortion
was initially studied in Litschig (2012) and Monasterio (2014), in which evidence
was found from 1991 onwards that there was significance in the discontinuity of the
density of municipalities specifically at the MPF band changes.
This paper uses a strategy to correct the distortion in the population distribu-
tion of Brazilian municipalities. This distortion is characterized by a concentration
of municipalities with population values close to the band changes of the Municipal
Participation Fund (MPF), present in the data from the Demographic Census. The
proposed method uses two steps. First, we follow Gabaix (1999) and use a model
that follows Zipf’s Law to estimate the population of cities (ZIPF, 1949), which is
similar to the strategy found in Monasterio (2014). Next, using the Jackknife model,
which minimizes population deviations from the model, we identify the candidate
municipalities for adjustment and replace their populations with the estimated coun-
terpart so as not to present significant discontinuities in the population distribution
(McCrary test, 2008).
This question is highly relevant to public finance. Because the rule for distributing
funds to small Brazilian municipalities is a discontinuous function of population,
various studies use this discontinuity as an identification strategy to estimate
the impact of an increase in municipal resources on economic variables
such as corruption (Brollo et al, 2013), education and poverty (Litschig and Morrison,
2013), and public spending (Arvate et al, 2011, 2015, Castro and Regattieri, 2014).
The main contribution from this study consists of using an outlier detection
method (Jackknife) to iteratively identify and treat units in a sample that do not
follow a particular distribution or rule of growth. It is worth highlighting the similarity
of our strategy with the use of instrumental variables. Zipf’s Law (Zipf, 1949)
works here as our instrument for positioning the cities in the distribution. That is,
it is an exogenous rule used as an instrument for the municipalities identified as
outliers. The success of using this method is then measured by comparing the dis-
continuities before and after its application. After successfully reconstructing the
population distribution, it is then estimated what the MPF distribution would be
with this corrected population. The study covers municipalities with up
to 40,000 inhabitants, which corresponds to 6 different population bands. As in
Monasterio (2014), these were chosen because they have the greatest incentives to
overestimate their populations. The data used came from the 2000 Demographic
Census, from the 2007 Federal Court of Auditors (TCU), and the 2010 Demographic
Census.1
Our results derived from implementing the McCrary (2007) test suggest that
the Jackknife and correction methods were effective in identifying and eliminating
the discontinuity in the population distribution. On the other hand, our results suggest
that this municipal population correction does not significantly affect the
MPF distribution. This evidence supports the use of fuzzy RDD,
in which the running variable is the municipality’s population and the
instrument is the MPF amount that the municipality should receive in accordance with
the law.
The paper is divided into four sections besides this introduction. In the next
section, we present the institutional background regarding the MPF and Zipf’s Law.
In the following section, we discuss the methodology and the data. In section 4, we
present the results, and then we present the conclusion.
1 Also see Eggers et al (2015) and Gerard et al (2015) for possible solutions for running variables
with possible manipulation problems.
2. Background regarding the MPF distribution and Zipf’s Law (1949)
According to Mendes (2008), the MPF is “a redistributive transfer. That is, re-
sources are sent to each municipality in accordance with a previously determined
distribution formula, which has no relationship with the amount of taxes raised in
the municipality itself. [...] The distribution is mainly carried out in accordance
with municipal population”. Its first definition occurred on December 1st 1965 in
Constitutional Amendment n. 18, in article 21, applied to the 1946 Constitution.
It consisted of 10% of receipts from IPI (tax on industrialized products) and from
IR (income tax). Its distribution began in 1967, after regulation via the National
Tax Code (October 25th 1966). At the time, the only criterion for calculating the
portion sent to each municipality was its respective population.
On February 28th 1967, via Complementary Act of the President of the Republic
n. 35, the municipalities were categorized into Capitals (these would receive 10% of
the MPF) and Interior (these would receive the remaining 90%). And on August
27th 1981, in Decree-Law n. 1,881, a new category of municipality was created,
called Reserve, which would consist of those with populations greater than 156,216
inhabitants. Thus, the division of the MPF amount was changed (Capitals: 10%,
Reserve: 3.6%, and Interior: 86.4%). In the 1988 Constitution an increase was
declared in the amount (at the time) of IPI and IR raised, from 17% to 22.5%, valid
from 1993 onwards. Later, on September 20th 2007, via Constitutional Amendment
n. 55, this value rose to 23.5%.
Specifically regarding the distribution of resources of Interior municipalities, the
methodology takes the form described in Decree-Law n. 1,881 (August 27th 1981)
and ratified on December 28th 1989 by Complementary Law n. 62. Two main
criteria are taken into consideration: the Share of the States (re-released annually)
and the Coefficients of the Municipalities.
Thus, the distribution takes the following form: of the total MPF amount
(100%), 86.4% is sent to the Interior municipalities. From this value the percentage
corresponding to the Share (Table 1) of the State in question is extracted. Finally,
to determine the value sent to the Municipalities, the coefficient of each of them is
divided by the sum of the coefficients of all the Municipalities in the State (Table
2). In general, the payment made to each Municipality i in State j is given by:
Table 1. Share of the States in the MPF

State                 Share %    State                  Share %
Acre                   0.2630    Paraíba                 3.1942
Alagoas                2.0883    Paraná                  7.2857
Amapá                  0.1392    Pernambuco              4.7952
Amazonas               1.2452    Piauí                   2.4015
Bahia                  9.2695    Rio de Janeiro          2.7379
Ceará                  4.5864    Rio Grande do Norte     2.4324
Espírito Santo         1.7595    Rio Grande do Sul       7.3011
Goiás                  3.7318    Rondônia                0.7464
Maranhão               3.9715    Roraima                 0.0851
Mato Grosso            1.8949    Santa Catarina          4.1997
Mato Grosso do Sul     1.5004    São Paulo              14.262
Minas Gerais          14.1846    Sergipe                 1.3342
Pará                   3.2948    Tocantins               1.2955

Source: TCU Resolution no. 242/90, January 2nd 1990 apud STN (2012)
Table 2. Coefficient of Interior Municipalities in the MPF

Population                Coef    Population                  Coef
Up to 10,188              0.6     From 61,129 to 71,316       2.4
From 10,189 to 13,584     0.8     From 71,317 to 81,504       2.6
From 13,585 to 16,980     1.0     From 81,505 to 91,692       2.8
From 16,981 to 23,772     1.2     From 91,693 to 101,880      3.0
From 23,773 to 30,564     1.4     From 101,881 to 115,464     3.2
From 30,565 to 37,356     1.6     From 115,465 to 129,048     3.4
From 37,357 to 44,148     1.8     From 129,049 to 142,632     3.6
From 44,149 to 50,940     2.0     From 142,633 to 156,216     3.8
From 50,941 to 61,128     2.2     Over 156,216                4.0

Source: Decree-Law no. 1,881/1981 apud STN (2012)
\[
\text{Payment}_{i,j} = \text{MPF} \times 86.4\% \times \text{Share}_j \times \frac{\text{Coefficient}_i}{\sum_{k \in j} \text{Coefficient}_k} \tag{1}
\]
Note that as we are considering municipalities with up to 40,000 inhabitants,
from Table 2 this is equivalent to considering 6 different population levels, and
therefore 6 coefficients (0.6, 0.8, 1.0, 1.2, 1.4, and 1.6).
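As an illustration of equation (1), the following minimal Python sketch looks up the Table 2 coefficient and computes the payment to one Interior municipality. The function names (`coefficient`, `payment`) and the example figures are hypothetical and are not taken from the paper; the band limits and coefficients are those of Table 2.

```python
from bisect import bisect_left

# Upper population limit of each band (Table 2); the last band is open-ended.
BAND_UPPER = [10188, 13584, 16980, 23772, 30564, 37356, 44148, 50940, 61128,
              71316, 81504, 91692, 101880, 115464, 129048, 142632, 156216]
COEFS = [0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2,
         2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8, 4.0]


def coefficient(population):
    """Return the Table 2 coefficient for a given population."""
    return COEFS[bisect_left(BAND_UPPER, population)]


def payment(mpf_total, state_share_pct, population, state_populations):
    """Equation (1): payment to one Interior municipality.

    state_populations lists the populations of all Interior municipalities
    of the same state, including the municipality itself.
    """
    coef_sum = sum(coefficient(p) for p in state_populations)
    share = state_share_pct / 100
    return mpf_total * 0.864 * share * coefficient(population) / coef_sum


# Hypothetical state with two municipalities and a 10% share of a fund of 1000.
example = payment(1000.0, 10.0, 10189, [10000, 10189])
```

In this hypothetical example a single additional inhabitant (10,188 to 10,189) moves the coefficient from 0.6 to 0.8, which is the discontinuous jump that creates the incentive discussed in the text.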
2.1 Problems resulting from the MPF payment calculation
Applying this methodology to distribute resources implies pros and cons (in the
fiscal gap, in fiscal responsibility, in efficient management, among others) discussed
in Mendes (2008). As the focus of this paper, one of these points becomes relevant:
how municipalities with small populations are encouraged to behave strategically in
relation to the MPF distribution rules.
Two examples are (i) small municipalities subdividing in order to increase the
income per capita derived from this source (Mendes, 2008, Mattos and Ponczeck,
2013, and Arvate et al, 2015) and (ii) encouraging population increases, whether
via migratory policies, mobilization of the population for census recounts, or deliberate
fraud in the population censuses (Monasterio, 2014), with the aim of reaching the
next population band and thus increasing the income per capita obtained. This
second example can be observed in the population distribution of the 2010 Demo-
graphic Census, in which there is a high concentration of municipalities immediately
after MPF band changes, represented in Graph 1 by the vertical lines.
2.2 Exogenous instrument: Zipf’s Law
As it is difficult to find the (exogenous) determinants of the population of a munic-
ipality in order to identify what the population of that unit really should be, our
hypothesis is to follow Zipf’s Law (Zipf, 1949), which consists of the relationship
between the size of an observation unit and its position in relation to the other units
in the same sample (ranking), and this relationship becomes linear when taking into
consideration the logarithm of the two variables, represented by the model below:
\[
\log(R_i) = a + b \log(P_i) + \varepsilon_i \tag{2}
\]
where R_i is the ranking of observation i in relation to the other units in the sample,
P_i is the value of the variable for observation i, and ε_i is the error. Also in
Zipf (1949), one of the demonstrated applications of this law is the population
of cities.2 In Gabaix (1999) this evaluation goes one step further, by explaining
why this phenomenon repeats in different countries with different structures and
antecedents. Graph 1 thus reveals the concentration of municipalities immediately after
the band changes and suggests the influence of non-natural factors in the distribution
of the population of Brazilian municipalities. We follow the authors above for
calculating the estimated population, as shown below.

Figure 1. Histogram of the 2010 population census
Source: IBGE (2011), Monasterio (2014), and the authors’ calculations
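The rank-size relationship in equation (2) can be sketched with a plain ordinary least squares fit. The example below is only an illustration on synthetic data; the function name `zipf_fit` and the synthetic sizes are assumptions, and rank 1 is assigned to the largest unit.

```python
import math


def zipf_fit(populations):
    """OLS fit of log(rank) = a + b * log(population), as in equation (2)."""
    ordered = sorted(populations, reverse=True)  # rank 1 = largest unit
    x = [math.log(p) for p in ordered]
    y = [math.log(r) for r in range(1, len(ordered) + 1)]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic sizes that follow an exact rank-size rule, P = 1,000,000 / rank.
sizes = [1_000_000 / r for r in range(1, 201)]
a, b = zipf_fit(sizes)
```

For data that follow the rule exactly, the slope b equals -1 and the intercept equals the logarithm of the largest size; departures from a straight line in the log-log plot are what the later steps treat as distortions.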
3. Methodology
Our empirical strategy consists of four steps. The first concerns the use of a linear
model that seeks to estimate the positioning of the cities within the distribution
when it follows Zipf’s Law (Zipf, 1949). Next, we use the Jackknife method to
identify the municipalities with the greatest population distortions in relation to
that estimated by the model, in order to adjust their populations and thus eliminate
the discontinuity in the population distribution (McCrary, 2008).
2 Monasterio (2014) also adopts a similar procedure to the one adopted by us in this stage.
Similarly, we use the fourth degree polynomial to obtain an adequate adjustment.
3.1 Linear model
We calculated a linear model with the following format (similar to Monasterio, 2014):
\[
\ln(\text{population}_i) = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \beta_3 X_i^3 + \beta_4 X_i^4 + D^a_i + D^p_i + \varepsilon_i \tag{3}
\]
where population_i is the population of municipality i in the observed sample, X_i
is the position (“rank”) of municipality i with relation to its population within the
observed sample, β_j are the linear coefficients, D^a_i and D^p_i are the dummy variables
attributed to the municipalities, and ε_i are the errors. Unlike Monasterio (2014)3,
the decision was taken to attribute two dummies to each discontinuity of the band,
one before (D^a_i) for municipalities with a population within 97.5% to 100.0% of the
MPF cut-off value and another after (D^p_i) for those within 100.0% to 102.5%.4
The results of the regression of model (3) are presented in Table 3 below,
using the year 2010 (see footnote 5). All of the variables were statistically significant,
suggesting that there is in fact an impact on the concentration of municipalities around the
MPF band changes and that this effect may differ depending on whether the
municipality is to the right or left of each threshold (see footnote 6).
Once the coefficient is obtained for all of the variables, the estimated population
is calculated for each municipality, considering only the Intercept, Rank, Rank^2,
Rank^3, and Rank^4 variables. By removing the dummies from the calculation,
an approximation is expected to the population of each municipality that would be
observed in the absence of distortions.
3 The author considers a single dummy variable for each discontinuity. We considered two
dummies (before and after), which allows for different intercepts in relation to the average
(β0). With this we are able to observe not only whether there is a difference in average at
the distribution band transition threshold, but also whether it differs before and after this
frontier. The estimation with a single dummy does not alter our results and is available
from the authors on request.
4 Results are robust to the choice of other bands. 1.5% and 3.5% were tested and the results
are statistically similar. They are available from the authors on request.
5 The results for the other years can be made available by the authors on request. They are all
submitted as supplementary material.
6 Monasterio (2014) had found similar results, but with an absence of significance in the fifth
and sixth dummies. His evaluation covered municipalities with up to 50,000 inhabitants.
Table 3. Linear model of the dependent variable ln(population) of 2010

Coefficients          Estimate                     Standard deviation
Intercept             10.891***                    0.002
Rank                  -0.00125***                  0.000005
Rank^2                0.000000477***               0.000000004
Rank^3                -0.000000000131***           0.000000000002
Rank^4                0.0000000000000129***        0.0000000000000002
dummy previous 1      0.044***                     0.002
dummy previous 2      0.015***                     0.001
dummy previous 3      0.022***                     0.001
dummy previous 4      0.005**                      0.002
dummy previous 5      0.014**                      0.002
dummy previous 6      0.013**                      0.002
dummy posterior 1     0.033***                     0.001
dummy posterior 2     0.012***                     0.001
dummy posterior 3     0.014***                     0.001
dummy posterior 4     0.004***                     0.001
dummy posterior 5     0.01***                      0.001
dummy posterior 6     0.013***                     0.001

N = 3503 (5,000 < pop < 40,000)
Adjusted R2 = 0.94
Note: *** 1% significant; ** 5% significant; and * 10% significant. Standard deviation
in the second column.
The estimated population of municipality i is obtained from the estimated betas above,
leaving out the dummies:
\[
\text{estimated population}_i = e^{\beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \beta_3 X_i^3 + \beta_4 X_i^4} \tag{4}
\]
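A minimal sketch of how equations (3) and (4) could be combined is given below, assuming NumPy is available. It is not the authors' code: the function name `fit_and_predict` is hypothetical, rank 1 is assumed to be the largest municipality, the rank is rescaled to the unit interval purely for numerical stability (which leaves the fitted values unchanged in exact arithmetic), and the treatment of populations exactly at a cut-off is an assumption based on Table 2.

```python
import numpy as np


def fit_and_predict(population, cutoffs, window=0.025):
    """Fit equation (3) by OLS and return the equation (4) predictions."""
    population = np.asarray(population, dtype=float)
    n = len(population)
    # Rank 1 = largest municipality (assumption; the ordering is not stated).
    order = np.argsort(-population)
    rank = np.empty(n)
    rank[order] = np.arange(1, n + 1)
    x = rank / n  # rescaled rank, only to keep the fourth power well conditioned
    poly = np.column_stack([np.ones(n), x, x**2, x**3, x**4])
    dummies = []
    for c in cutoffs:
        # D^a: within 97.5% to 100% of the cut-off (the cut-off itself belongs
        # to the lower band in Table 2); D^p: above it, up to 102.5%.
        dummies.append((population >= (1 - window) * c) & (population <= c))
        dummies.append((population > c) & (population <= (1 + window) * c))
    design = np.column_stack([poly] + [d.astype(float) for d in dummies])
    beta, *_ = np.linalg.lstsq(design, np.log(population), rcond=None)
    # Equation (4): keep only the intercept and the four rank terms.
    return np.exp(poly @ beta[:5])


# Synthetic populations whose logarithm is exactly linear in the rank.
synthetic = np.exp(11 - 2 * np.arange(1, 501) / 500)
predicted = fit_and_predict(synthetic, [10188, 13584])
```

Dropping the dummy columns at prediction time is what makes the fitted value an approximation of the population that would be observed in the absence of bunching around the cut-offs.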
Graph 2 below shows the difference between the observed and estimated popula-
tion. A gradual increase in the deviations (or residuals) is clearly noted, consistently
reaching their highest local values where the population lies on the margins of each
MPF transfer discontinuity (vertical lines).
Figure 2. Deviations by municipality, 2010
Source: IBGE (2011) and the authors’ calculations
Thus, the quantity of municipalities in each of the bands can be identified and
compared with the band they should be in according to the estimate without
distortions. This comparison can be seen in Table 4.

Table 4. Quantity of municipalities by MPF band
(rows: observed band; columns: estimated band)

Observed band      1      2      3      4      5      6      7
1               1224      –      –      –      –      –      –
2                106    523      –      –      –      –      –
3                  –     44    384      –      –      –      –
4                  –      –     49    545      –      –      –
5                  –      –      –     22    309      –      –
6                  –      –      –      –     27    203      –
7                  –      –      –      –      –     19     48

Source: IBGE (2011) and the authors’ calculations

It is important to highlight that the example given above is the one that presents
the greatest number of municipalities placed outside of their estimated band,
totaling 267. This corresponds to slightly more than 7% of all the municipalities
with fewer than 40,000 inhabitants. A similar strategy was used in Monasterio
(2014). This article aims to identify the municipalities that contribute most to the
deviation (from the estimated position); these will have their population
replaced with the one estimated by the model, as described in the next subsections.
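The cross-tabulation reported in Table 4 can be sketched as follows. This is only an illustration: the function names `band` and `band_crosstab` and the three example populations are hypothetical, and the band limits are the first six upper limits of Table 2.

```python
from bisect import bisect_left
from collections import Counter

# Upper limits of bands 1 to 6 (Table 2); band 7 is everything above 37,356.
CUTOFFS = [10188, 13584, 16980, 23772, 30564, 37356]


def band(population):
    """MPF band number (1 to 7) for a population."""
    return bisect_left(CUTOFFS, population) + 1


def band_crosstab(observed, estimated):
    """Count municipalities by (observed band, estimated band)."""
    return Counter((band(o), band(e)) for o, e in zip(observed, estimated))


# Hypothetical observed and estimated populations for three municipalities.
table = band_crosstab([10200, 9000, 40000], [10100.0, 9100.0, 39000.0])
```

Entries off the main diagonal, such as the pair (2, 1) here, correspond to municipalities observed in a higher band than the one their estimated population would place them in.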
3.2 Jackknife Method
The Jackknife method estimates the bias of an estimator by dividing a sample into
different subsamples. It is generally used to construct variance estimators and,
because of its versatility, it has been applied for various purposes. Its use follows a
simple process. The statistic of interest is calculated using the whole sample. Next,
one element is eliminated from the sample (yielding a new dataset of size n-1) and the
same statistic is recalculated. This is done for each element in the sample
individually. The difference between the initial statistic of the sample and that
obtained by removing each element measures that element's individual influence on
the value of the statistic for the entire sample.7
With this magnitude of influence, it can be determined which elements have
the most impact on the original statistic and then they can be treated. Thus, this
method has also come to be used in identifying outliers, enabling their adequate
treatment. For example, in an efficient frontier context, Sousa (2004) and Sousa
and Stosic (2005) identify municipalities in which there could be errors in the col-
lection and storage of the data or even scale effects of those municipalities. Detecting
and possibly excluding outliers enabled the researchers to obtain a more consistent
7 This procedure has already been used to correct outliers in other economic studies. See, for
example, Sousa and Stosic (2005).
estimate of the efficiency in the provision of services by some of the Brazilian mu-
nicipalities. Our next step is to obtain the impact that each of the municipalities is
causing on the population distribution. For this, the Jackknife method will be used,
which identifies the magnitude of the impact of a municipality in the adequacy of
the linear model proposed in the previous step and shows which ones should be
adjusted first in order to obtain a distribution with fewer deviations (McCrary,
2008).
The steps for applying it are (1) define a J function that will be applied to the
whole sample and (2) for each element in the sample, calculate the value that the
Ji function takes when the element in question is exclusively eliminated.
For this use, the standard deviation function was chosen. Thus, each
municipality i would have a related value J_i in the format:
\[
J_i = \sqrt{\frac{\sum_{j=1,\, j \neq i}^{N} \left(\text{residual}_j - \overline{\text{residual}}\right)^2}{N-1}} \tag{5}
\]
J_i is the value resulting from the Jackknife of the residual of the i-th municipality
in the sample, N is the sample size, and the numerator corresponds to the
deviation (residual = observed – estimated) in relation to the average of deviations
($\overline{\text{residual}}$) when observation i is excluded. Note that the deviation
attributed to each element is found in the previous step, and that j covers all of the
elements from 1 to N (except i). Thus, the value obtained for each J_i is the standard
deviation of the sample without element i.
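Equation (5) can be illustrated with a short leave-one-out computation. The sketch below is an assumption-laden illustration rather than the authors' implementation: the function name `jackknife_sd` and the four residuals are hypothetical, and the N - 1 divisor is taken from equation (5) as printed.

```python
import math


def jackknife_sd(residuals):
    """Return J_i for every i: the standard deviation of the residuals with
    observation i left out, using the N - 1 divisor of equation (5)."""
    n = len(residuals)
    total = sum(residuals)
    total_sq = sum(r * r for r in residuals)
    values = []
    for r in residuals:
        mean_without = (total - r) / (n - 1)  # mean of the other N - 1 residuals
        # Sum of squared deviations of the remaining residuals from their mean.
        ss_without = (total_sq - r * r) - (n - 1) * mean_without ** 2
        values.append(math.sqrt(max(ss_without, 0.0) / (n - 1)))
    return values


# Hypothetical residuals; the last one is a clear outlier.
J = jackknife_sd([0.0, 1.0, -1.0, 10.0])
```

Removing the outlier produces the smallest J_i, so municipalities with low values are the ones whose exclusion most reduces the dispersion of the residuals.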
The value obtained for each municipality can be observed in Graph 3. There
is similarity between the municipalities with greater deviations in Graphs 2 and 3.
However, the latter enables a comparison of how much each element is distorting
the statistics of the sample as a whole, and we are thus able to attribute different
weights to each municipality depending on how far they are from their estimated
population.

Figure 3. Standard deviation (Jackknife)
Source: IBGE (2011) and the authors’ calculations

3.3 Elimination of the impact
After obtaining the impact of each municipality, that is, its contribution to
the deviations in relation to Zipf’s Law (Zipf, 1949), we have to determine which
treatment should be given to each of these points. One of the options would be to
remove these municipalities from the sample (donut hole strategy, see Barreca et
al, 2011). However, this method can affect the continuity of the data exactly at the
most important points: the MPF band changes (see Arvate, Mattos, and Rocha,
2015). Thus, the decision was taken to replace their observed populations with the
populations estimated using the linear model without the dummies. The process
for choosing the municipalities was the following:
First, (1) determine which municipalities had a population within the intervals
[MPFi - d, MPFi + d], where MPFi is the value of each band shift and d is
the value of the interval chosen to contain some of the municipalities that could be
affected by the distortions. We choose 2.5% as the benchmark.8 Next, (2) determine
the accumulated density distribution of the municipalities ordered by the standard
deviations (Jackknife) of the residuals (Graph 4). With this, we can (3) calculate
the standard deviation of the residuals of the original sample. This will be called
the initial reference sample. Within the iterative process below, this will be the first
value that the variable will take.
8 Results are robust to other choices (1.5% and 3.5%).
Figure 4. Accumulated density of the standard deviations (Jackknife) by municipality, 2010
Source: IBGE (2011) and the authors’ calculations
Then, (4) a value λ ∈ [0, 1] is established in the accumulated density distribution.
All of the municipalities whose accumulated density is less than λ will have
their population adjusted; that is, the municipalities which after their exclusion gen-
erated low accumulated densities of deviations will have their population adjusted.
After this, (5) the municipality data corrected to the level of λ are saved. This is
(6) iterated by adding the value of α (0.05) to λ, and we return to step 4 above.
For the calibration, the following variable values were chosen:
• d = 1500
• λ (initial) = 0.00
• λ (final) = 0.50
• α = 0.05
After the iterations, the result obtained is a table of data for each year (2000,
2007, 2010) with the population distributions corrected to each value attributed to
λ (from 0 to 0.50, α = 0.05).
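Steps (1) to (6) can be sketched as a small loop over λ. The code below is a hedged illustration, not the authors' procedure: the function names are hypothetical, d is assumed to be measured in inhabitants (as in the calibration d = 1500), and the accumulated density is assumed to be the empirical cumulative share of all municipalities when sorted by J_i in ascending order.

```python
def adjust_for_lambda(observed, estimated, j_values, cutoffs, d, lam):
    """Steps (1)-(5): replace the observed population of flagged municipalities."""
    n = len(observed)
    # Cumulative share of each municipality when sorted by J_i (ascending).
    order = sorted(range(n), key=lambda i: j_values[i])
    cum_share = [0.0] * n
    for position, i in enumerate(order, start=1):
        cum_share[i] = position / n
    adjusted = list(observed)
    for i in range(n):
        near_cutoff = any(abs(observed[i] - c) <= d for c in cutoffs)
        if near_cutoff and cum_share[i] < lam:
            adjusted[i] = estimated[i]
    return adjusted


def lambda_grid(start=0.0, stop=0.50, alpha=0.05):
    """Step (6): lambda values from start to stop in increments of alpha."""
    steps = round((stop - start) / alpha)
    return [round(start + k * alpha, 10) for k in range(steps + 1)]


# Hypothetical data for four municipalities around the first cut-off.
obs = [10200, 10300, 20000, 9000]
est = [10100, 10250, 19900, 9050]
jv = [0.1, 0.5, 0.2, 0.9]
corrected = {lam: adjust_for_lambda(obs, est, jv, [10188], 1500, lam)
             for lam in lambda_grid()}
```

Each entry of `corrected` plays the role of one column of the table of corrected distributions described above, one per value of λ.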
3.4 McCrary Method
McCrary (2008) develops a method for identifying the manipulation of a variable by
evaluating its density function. It is based on a discontinuity estimator measured
at a particular point of interest, in order to ultimately verify whether there is a
distortion there.
The process for defining this estimator follows two steps. The first consists of
constructing a histogram of the sample to be verified so that each one of its classes
(bins) contains points from only one of the sides of the discontinuity. That is, one of
the histogram cut-offs should occur exactly at the evaluated points of interest (called
c). The second step concerns the linear smoothing of the histogram (LOWESS -
locally weighted scatterplot smoothing), where the average points of each class are
the regressors (Xj) and the normalized count of the number of observations of each
class is the variable of interest (Yj).
The normalization is calculated in the following way:
\[
Y_j = \frac{1}{nb} \sum_{i=1}^{n} \mathbf{1}\left(g(R_i) = X_j\right) \tag{6}
\]
where Y_j is the value of the normalized variable of interest, b is the length of each
class of the histogram, n is the number of observations, g(R_i) is the midpoint of the
class that contains R_i, the sum provides the count of
the number of observations within the j-th class, and R_i is the running variable,
which in this case is population.
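The first step, the construction of the normalized histogram in equation (6), can be sketched as below. The function name `mccrary_histogram` and the five example populations are hypothetical; the midpoint rule used for g follows the requirement that the cut-off c is a class edge, and is stated here as an assumption about how the classes are aligned.

```python
import math
from collections import Counter


def mccrary_histogram(values, c, b):
    """Class midpoints X_j and normalized counts Y_j of equation (6)."""
    n = len(values)

    def g(r):
        # Midpoint of the class containing r; classes are aligned so that
        # the cut-off c is always a class edge.
        return math.floor((r - c) / b) * b + b / 2 + c

    counts = Counter(g(r) for r in values)
    return {x: k / (n * b) for x, k in sorted(counts.items())}


# Hypothetical populations around the first cut-off, with class length 283.
hist = mccrary_histogram([10000, 10100, 10188, 10200, 10470], c=10188, b=283)
```

Because every class lies entirely on one side of c, the two smoothed curves can then be fitted separately to the midpoints to the left and to the right of the cut-off.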
In order to obtain clarity in the visualization of the data, two estimated smooth-
ings (f) are calculated, one for each side of the point (c) where the discontinuity is
being investigated. Thus, in order to obtain the estimated value of the discontinuity,
the difference is calculated between the value that each smoothed function (to the
right and to the left) would take at point c.9
9 See McCrary (2009).
Finally, there is the definition of the length of each class (b) and of the width
of the band (h). McCrary (2008) suggests different possibilities for obtaining these
values numerically. Note that this test makes it possible to evaluate whether there has
been a specific effort to concentrate elements of a sample on one side of a particular
limit. For this case, the verification will concern the various MPF band changes,
checking what the distortion is before and after the adjustment of the populations.
This distortion is calculated via the difference in the logarithm of the f function
calculated at the point of discontinuity c on the two sides (to the right and to the
left).
We chose to follow Monasterio (2014) and b was chosen as 283 because “it is
prime and a divider of all the MPF shift points” (Monasterio, p. 185) and for h
the automatic procedure proposed by the author (McCrary, 2008) was used, which
follows Fan and Gijbels (1996).10
The next step was to apply the method in two different samples, whose only
difference would be the adjustment carried out in the previous steps. Thus, in the
example below the 2010 census, the first change in MPF band (10,188 inhabitants),
and a λ of 0.50 were chosen. Graphs 5 and 6 visually show how the adjustment
influenced the distortion.
For the sample prior to the adjustment, the discontinuity estimate (difference
between the functions to the right and to the left of point c) was 1.43 with a standard
error of 0.13. After the adjustment, it decreased to 0.10 with a standard error of
0.09. With this, the statistic ceased to be statistically significant and the evidence
of discontinuity disappeared. This phenomenon can also be visually verified by
comparing the graphs.
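As a worked check of the figures above, the ratio of each discontinuity estimate to its standard error can be compared with a critical value. The 1.96 threshold below assumes a two-sided test at the 5% level under a normal approximation, which is an assumption made for this illustration and is not stated in the text.

```latex
\[
t_{\text{before}} = \frac{1.43}{0.13} \approx 11.0 > 1.96, \qquad
t_{\text{after}} = \frac{0.10}{0.09} \approx 1.11 < 1.96
\]
```

Under that assumption the discontinuity is strongly significant before the adjustment and not significant after it.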
10 The automatic procedure used by McCrary (2008) can be summarized in the following steps:
A) Calculate an initial histogram, which will be used as a basis for the calculations, using
a bin size estimated by the algorithm or provided by the programmer. In the case of this
study, the value provided was 283. B) Using the initial version of the histogram, a fourth
order polynomial is estimated separately for each side of the discontinuity. Based on this,
for each side the value of the expression
$\theta \left[ \sigma^2 (b - a) / \sum f''(X_j)^2 \right]^{1/5}$ is calculated and the
h value is determined as the average of the two. θ is equal to 3.348, σ2 is the mean squared
error of the regression, b − a is equal to Xj − c for the right side of the regression and
c − X1 for the left side, and f''(Xj) is the estimate of the second order derivative of the
polynomial found for the overall distribution. C) The second step of this algorithm is based
on a bandwidth selection rule developed by Fan and Gijbels. With the bandwidth determined,
the initial histogram (calculated with the pre-determined bin) and the f(r) curve based on
the value of h provide a detailed overview of the distribution of the variable around the
point of discontinuity. We tested the Cross-Validation option proposed by Stone (1974) and
the results are similar.
Figure 5. Initial database (2010 census, first change in band)
Source: IBGE (2011)
Figure 6. After adjustment, λ = 0.50 (2010 census, first change in band)
Source: IBGE (2011) and the authors’ calculations
4. Results
After applying our methods to the population samples from the 2000 and 2010
Demographic Censuses and the 2007 TCU data, we obtained the alterations in the
discontinuity estimates for each municipality whose exclusion considerably reduced
the standard deviation of the deviations from the estimated population. Their
respective values can be observed in Tables 5, 6, and 7, as well as the standard error
found and its significance.
It can be observed that, for each band, the λ needed to remove the significance
of the discontinuity is different; its value is highlighted in each column.
Moreover, as λ increases, the significance of the difference between
the functions to the right and to the left decreases monotonically.
The only exception occurs in the sixth MPF band
for the year 2000 (λ = 0). However, for all of the subsequent λ values in this same band,
monotonicity holds.
More importantly, except for the fourth 2010 MPF band, we can highlight that
with a λ = 0.30 we would eliminate all of the population discontinuities observed in
the different MPF bands, via the t test. Moreover, we detect which municipalities
can be considered as outliers via this λ. By attributing the population estimated by
Zipf’s Law (Zipf, 1949) for these municipalities in accordance with that predicted
by equation (3) we can note, also graphically, that the population distributions
represent continuity again in those MPF bands.11
Tables 5, 6, and 7 also indicate that with λ = 0.30, we could correct all of the
population distortions observed in the small municipalities, except close to band
4 of the 2010 Census, when our procedure does not appear to be efficient. Note
that in these tables for each band we highlight the λ that guarantees the absence of
statistical difference in the municipality density histogram between the two sides of
the population bands.
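As a worked illustration of how the tables can be read, the sketch below recomputes the test statistic as |Mean| / Standard error for the first band (10,188) in 2000, using the rounded values printed in Table 5, and looks for the smallest λ at which the discontinuity is no longer significant. The critical values 1.645, 1.960, and 2.576 (large-sample 10%, 5%, and 1% levels) are an assumption; the source states only that significance is based on the Student t test. The function names are hypothetical.

```python
# Mean and standard error for band 10,188 in 2000, copied from Table 5 (λ = 0% to 30%).
LAMBDAS = [0.00, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
MEANS = [0.500, 0.304, 0.209, 0.150, 0.110, 0.091, 0.088]
SES = [0.105, 0.099, 0.096, 0.094, 0.093, 0.093, 0.093]


def stars(t):
    """Significance stars under assumed large-sample critical values."""
    t = abs(t)
    if t >= 2.576:
        return "***"
    if t >= 1.960:
        return "**"
    if t >= 1.645:
        return "*"
    return ""


def smallest_lambda(lambdas, means, ses, crit=1.645):
    """First λ at which |mean / se| falls below the critical value, or None."""
    for lam, mean, se in zip(lambdas, means, ses):
        if abs(mean / se) < crit:
            return lam
    return None


t_stats = [m / s for m, s in zip(MEANS, SES)]
first_clean = smallest_lambda(LAMBDAS, MEANS, SES)
```

Because the inputs are rounded to three decimals, the recomputed statistics differ slightly from the printed ones (for example 1.596 against 1.597 at λ = 15%), but the star pattern is the same.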
¹¹ Graphs contained in the supplementary file.
One possible interpretation of our method is that we use a variable that is completely
exogenous from the point of view of the municipalities (Zipf, 1949) to estimate an
instrumented population, which in turn replaces the observed population of the
municipalities whose population can be considered endogenous according to the
Jackknife method.
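The interpretation above can be made concrete with a toy example. The sketch below is not the authors' algorithm: it assumes a simple log-log rank-size regression as the Zipf model (equation (3) is not reproduced in this section), scores each municipality by how much the residual standard deviation falls when it is left out, and replaces flagged populations with fitted values. The `cutoff` argument is a hypothetical threshold and should not be read as the paper's λ.

```python
import numpy as np


def jackknife_screen(pop, cutoff):
    """Flag leave-one-out outliers of a rank-size fit and replace their population.

    pop: observed populations; cutoff: hypothetical threshold on the relative
    reduction of the residual standard deviation when one unit is excluded.
    """
    pop = np.asarray(pop, dtype=float)
    order = np.argsort(-pop)
    rank = np.empty(pop.size)
    rank[order] = np.arange(1, pop.size + 1)           # rank 1 = largest population
    log_rank, log_pop = np.log(rank), np.log(pop)

    slope, intercept = np.polyfit(log_rank, log_pop, 1)
    full_sd = np.std(log_pop - (intercept + slope * log_rank))

    reduction = np.empty(pop.size)
    for i in range(pop.size):
        keep = np.arange(pop.size) != i                # leave municipality i out
        s, a = np.polyfit(log_rank[keep], log_pop[keep], 1)
        loo_sd = np.std(log_pop[keep] - (a + s * log_rank[keep]))
        reduction[i] = (full_sd - loo_sd) / full_sd

    flagged = reduction > cutoff
    # Refit without the flagged units and use the fitted value as the expected population.
    s, a = np.polyfit(log_rank[~flagged], log_pop[~flagged], 1)
    corrected = pop.copy()
    corrected[flagged] = np.exp(a + s * log_rank[flagged])
    return flagged, corrected


# Toy data: an exact rank-size rule with one inflated population (index 9).
ranks = np.arange(1, 21)
toy_pop = 40000.0 / np.sqrt(ranks)
toy_pop[9] = 13300.0
flagged, corrected = jackknife_screen(toy_pop, cutoff=0.5)
```

In this toy sample only the inflated unit is flagged, and its corrected value returns to the rank-size prediction of roughly 12,649 inhabitants.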
Finally, Table 8 indicates how many municipalities would have their population
corrected for each λ and each population band in 2000, 2007, and 2010. With λ = 0.30,
which corrects all of the distortions found (except in band 4 of the 2010 Census), we
would correct 302, 319, and 337 municipalities in 2000, 2007, and 2010, respectively.
These figures are an upper limit on the number of alterations, since we could allow a
different λ for each band and year.¹²
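The totals quoted above can be checked directly against the λ = 30% column of Table 8. The short sketch below sums the band-level counts; it assumes that the first block of Table 8 refers to 2000, as the totals in the text imply. The variable names are hypothetical.

```python
# Municipalities altered at λ = 30%, by band (10188, 13584, 16980, 23772, 30564, 37356),
# copied from Table 8. The first block is taken to be the 2000 Census.
ALTERED_AT_30 = {
    2000: [88, 7, 68, 50, 45, 44],
    2007: [99, 46, 78, 29, 18, 49],
    2010: [146, 48, 74, 9, 40, 20],
}

totals = {year: sum(counts) for year, counts in ALTERED_AT_30.items()}
```

The sums reproduce the 302, 319, and 337 municipalities reported in the text.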
5. MPF Distribution per capita
With both the original and corrected distributions, we investigated what happens
to the per capita MPF distribution granted to the municipalities in the two cases.
The value obtained corresponds to the share that each municipality would have in
the total MPF distributed in the year.¹³
Once the MPF variables were calculated for the two populations, we regressed MPF per
capita (the dependent variable) on population (the independent variable) using a
fourth-degree polynomial. A specific polynomial was estimated for each MPF band in
order to obtain the best possible smoothing of the values of each group of
municipalities. A confidence interval of one standard deviation was added to the
graphs for comparison.
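A minimal sketch of this smoothing step for a single band is given below. It assumes that the one-standard-deviation interval is the fitted curve plus or minus the residual standard deviation, which the text does not spell out, and it uses synthetic data; the function name `smooth_band` and the numbers are hypothetical.

```python
import numpy as np


def smooth_band(pop, mpf_pc, degree=4):
    """Fourth-degree polynomial fit of MPF per capita on population within one band.

    Returns the fitted values and a band of one residual standard deviation
    around them (an assumed reading of the interval described in the text).
    """
    pop = np.asarray(pop, dtype=float)
    mpf_pc = np.asarray(mpf_pc, dtype=float)
    z = (pop - pop.mean()) / pop.std()                 # rescale for numerical stability
    coefs = np.polyfit(z, mpf_pc, degree)
    fitted = np.polyval(coefs, z)
    sd = np.std(mpf_pc - fitted, ddof=degree + 1)      # residual standard deviation
    return fitted, fitted - sd, fitted + sd


# Synthetic municipalities in the second band (10,189 to 13,584 inhabitants).
rng = np.random.default_rng(1)
pop = np.sort(rng.uniform(10189, 13584, 80))
true_curve = 8.0e5 / pop                               # fixed transfer spread over more people
mpf_pc = true_curve + rng.normal(0.0, 1.0, pop.size)
fitted, lower, upper = smooth_band(pop, mpf_pc)
```

Fitting each band separately lets the curve jump at the thresholds, where the transfer coefficient changes, instead of forcing one smooth polynomial through the whole population range.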
The calculation used the two population samples for each year (2000, 2007, and 2010):
the original one, with λ = 0.00, and the corrected one, with λ = 0.30, which, as
verified, is sufficient to remove the discontinuities. The resulting graphs, with the
MPF per capita of each municipality and the respective values expected by the
smoothing, can be observed below.
Visually, there does not appear to be any difference between the two strategies for
calculating the MPF, especially at the margins of the population bands, where our
procedure acted; that is, the effect of correcting the distortion in the
discontinuity of the population distribution does not appear to cause
¹² By allowing for different λ’s, the municipalities that would have their populations readjusted
would be 154, 184, and 260, respectively.
¹³ The value was multiplied by 10^6 for a better representation of the data.
Table 5. Value and significance of the MPF discontinuities (2000)

Lambda  Statistic       10188      13584      16980      23772      30564      37356
0%      Mean            0.500      0.284      0.491      0.570      0.440      0.097
        Standard error  0.105      0.117      0.150      0.193      0.236      0.293
        Test            4.748***   2.427**    3.274***   2.947***   1.862*     0.329
5%      Mean            0.304      0.281      0.371      0.492      0.373      -0.907
        Standard error  0.099      0.116      0.144      0.188      0.240      0.324
        Test            3.071***   2.437**    2.568**    2.612***   1.553      2.8***
10%     Mean            0.209      0.285      0.186      0.204      0.231      -0.575
        Standard error  0.096      0.116      0.134      0.171      0.244      0.335
        Test            2.18**     2.461**    1.389      1.193      0.95       1.717*
15%     Mean            0.150      0.289      0.103      0.139      0.156      -0.567
        Standard error  0.094      0.116      0.129      0.162      0.237      0.336
        Test            1.597      2.487**    0.801      0.856      0.657      1.69*
20%     Mean            0.110      0.246      0.069      0.097      0.124      -0.543
        Standard error  0.093      0.115      0.126      0.155      0.232      0.338
        Test            1.185      2.136**    0.549      0.624      0.535      1.608
25%     Mean            0.091      0.192      0.016      0.048      0.114      -0.511
        Standard error  0.093      0.114      0.122      0.128      0.230      0.340
        Test            0.987      1.687*     0.134      0.376      0.495      1.503
30%     Mean            0.088      0.144      -0.039     0.053      0.102      -0.498
        Standard error  0.093      0.112      0.116      0.122      0.229      0.341
        Test            0.949      1.292      0.34       0.433      0.446      1.458
35%     Mean            0.089      0.128      -0.048     0.059      0.098      -0.498
        Standard error  0.093      0.111      0.115      0.121      0.227      0.341
        Test            0.953      1.151      0.419      0.485      0.431      1.457
40%     Mean            0.081      0.096      -0.038     0.021      0.077      -0.493
        Standard error  0.094      0.112      0.118      0.133      0.225      0.342
        Test            0.858      0.861      0.318      0.161      0.341      1.439
45%     Mean            0.078      0.080      -0.029     0.022      0.119      -0.493
        Standard error  0.095      0.112      0.120      0.134      0.224      0.342
        Test            0.826      0.714      0.245      0.167      0.532      1.439
50%     Mean            0.077      0.037      -0.021     0.032      0.279      -0.493
        Standard error  0.095      0.110      0.122      0.131      0.237      0.343
        Test            0.809      0.334      0.175      0.245      1.18       1.438
Significance values based on the Student t test
Table 6. Value and significance of the MPF discontinuities (2007)

Lambda  Statistic       10188      13584      16980      23772      30564      37356
0%      Mean            0.813      0.581      0.496      0.578      0.180      0.753
        Standard error  0.111      0.122      0.144      0.191      0.213      0.279
        Test            7.293***   4.744***   3.442***   3.02***    0.844      2.698***
5%      Mean            0.119      0.267      0.105      0.586      0.180      0.713
        Standard error  0.093      0.114      0.125      0.196      0.211      0.301
        Test            1.284      2.347**    0.843      2.987***   0.855      2.365**
10%     Mean            0.103      0.107      -0.102     0.542      0.159      -0.013
        Standard error  0.092      0.106      0.125      0.195      0.212      0.289
        Test            1.118      1.003      0.81       2.785***   0.752      0.044
15%     Mean            0.105      0.010      -0.132     0.379      0.128      -0.052
        Standard error  0.092      0.104      0.128      0.184      0.211      0.295
        Test            1.136      0.095      1.034      2.059**    0.607      0.175
20%     Mean            0.112      -0.012     -0.118     0.312      0.138      -0.027
        Standard error  0.093      0.104      0.128      0.179      0.212      0.298
        Test            1.204      0.111      0.924      1.743*     0.651      0.09
25%     Mean            0.112      -0.025     -0.102     0.190      0.168      -0.016
        Standard error  0.093      0.105      0.127      0.171      0.216      0.301
        Test            1.2        0.241      0.805      1.114      0.778      0.054
30%     Mean            0.116      -0.030     -0.087     0.104      0.205      0.000
        Standard error  0.093      0.105      0.125      0.164      0.219      0.303
        Test            1.247      0.289      0.692      0.632      0.933      0
35%     Mean            0.115      -0.032     -0.078     0.106      0.226      0.006
        Standard error  0.093      0.105      0.124      0.164      0.221      0.303
        Test            1.243      0.304      0.631      0.642      1.024      0.02
40%     Mean            0.110      -0.030     -0.067     0.102      0.247      0.014
        Standard error  0.094      0.105      0.123      0.164      0.223      0.303
        Test            1.176      0.281      0.543      0.623      1.11       0.045
45%     Mean            0.104      -0.035     -0.060     0.095      0.254      0.042
        Standard error  0.094      0.107      0.123      0.164      0.224      0.300
        Test            1.103      0.325      0.486      0.58       1.135      0.139
50%     Mean            0.096      -0.037     -0.049     0.093      0.301      0.041
        Standard error  0.095      0.108      0.123      0.164      0.227      0.300
        Test            1.004      0.343      0.398      0.566      1.325      0.137
Significance values based on the Student t test
Table 7. Value and significance of the MPF discontinuities (2010)

Lambda  Statistic       10188      13584      16980      23772      30564      37356
0%      Mean            1.427      0.653      0.792      0.877      1.460      0.964
        Standard error  0.131      0.121      0.145      0.199      0.280      0.327
        Test            10.904***  5.419***   5.452***   4.401***   5.206***   2.948***
5%      Mean            0.242      0.728      0.573      0.894      1.417      1.000
        Standard error  0.093      0.126      0.139      0.204      0.281      0.338
        Test            2.614***   5.784***   4.128***   4.39***    5.049***   2.958***
10%     Mean            0.133      0.375      0.258      0.893      0.808      0.862
        Standard error  0.088      0.115      0.123      0.203      0.238      0.321
        Test            1.506      3.271***   2.09**     4.39***    3.395***   2.686***
15%     Mean            0.133      0.282      0.165      0.882      0.547      0.331
        Standard error  0.088      0.111      0.115      0.201      0.225      0.287
        Test            1.518      2.544**    1.438      4.395***   2.43**     1.154
20%     Mean            0.123      0.209      0.107      0.801      0.450      0.242
        Standard error  0.088      0.108      0.112      0.197      0.225      0.284
        Test            1.41       1.927*     0.95       4.07***    1.994**    0.853
25%     Mean            0.118      0.17       0.095      0.736      0.215      -0.327
        Standard error  0.088      0.107      0.114      0.195      0.228      0.281
        Test            1.339      1.59       0.835      3.781***   0.943      1.167
30%     Mean            0.116      0.073      -0.003     0.650      0.124      -0.399
        Standard error  0.09       0.104      0.119      0.191      0.229      0.285
        Test            1.29       0.705      0.027      3.398***   0.541      1.4
35%     Mean            0.114      0.038      -0.057     0.606      0.066      -0.362
        Standard error  0.091      0.102      0.123      0.190      0.230      0.293
        Test            1.254      0.375      0.468      3.196***   0.287      1.238
40%     Mean            0.113      0.014      -0.117     0.475      0.003      -0.312
        Standard error  0.091      0.101      0.127      0.183      0.230      0.299
        Test            1.242      0.14       0.923      2.596***   0.014      1.045
45%     Mean            0.112      0.002      -0.117     0.380      0.011      -0.252
        Standard error  0.092      0.101      0.127      0.177      0.228      0.301
        Test            1.214      0.023      0.922      2.147**    0.049      0.839
50%     Mean            0.11       0          -0.119     0.297      0.014      -0.211
        Standard error  0.093      0.101      0.128      0.173      0.225      0.301
        Test            1.186      0.001      0.925      1.716*     0.061      0.7
Significance values based on the Student t test
Table 8. Number of municipalities with altered population, by MPF band and year

2000 Census - Number of municipalities with altered population by band
Band \ λ (%)     5    10    15    20    25    30    35    40    45    50
10188           43    62    66    72    78    88    91    93    95    97
13584            0     0     0     0     4     7    23    36    45    53
16980           35    43    47    53    60    68    76    82    89    95
23772            2    30    37    39    45    50    51    55    61    65
30564            6    16    29    38    41    45    47    52    57    61
37356           16    38    44    44    44    44    44    44    44    44
Total          102   189   223   246   272   302   332   362   391   415

2007 Census - Number of municipalities with altered population by band
Band \ λ (%)     5    10    15    20    25    30    35    40    45    50
10188           81    88    91    94    97    99   102   102   104   110
13584           18    30    38    41    44    46    48    58    68    74
16980           18    40    44    55    67    78    82    87    88    88
23772            0     5     8    16    25    29    33    40    42    43
30564            2     8    10    12    14    18    21    28    32    42
37356           17    39    45    46    47    49    49    51    52    52
Total          136   210   236   264   294   319   335   366   386   409

2010 Census - Number of municipalities with altered population by band
Band \ λ (%)     5    10    15    20    25    30    35    40    45    50
10188          113   130   137   146   146   146   146   146   146   146
13584            1    21    34    41    43    48    60    67    73    74
16980           30    57    61    67    68    74    77    80    85    92
23772            0     0     0     5     6     9    11    13    15    18
30564            1    13    20    25    35    40    43    44    45    47
37356            0     1     7    15    16    20    26    28    32    39
Total          145   222   259   299   314   337   363   378   396   416

Figure 7. MPF per capita, original population, 2000
Source: IBGE and the authors’ calculations
an impact on the distribution of the MPF per capita for the municipalities. This
is due to the fact that only some municipalities (around 300) have their population
corrected.
Thus, our proposed method appears to be robust in that it effectively manages
to correct an anomaly observed in the sample, without causing other undesirable
effects due to the methodology for distributing the fund among the other
municipalities. Another possible interpretation concerns the fact that the use of
estimated populations to correct this anomaly observed in the population of Brazilian
municipalities appears to be a promising strategy. More importantly, the strategies
that use estimated MPF (Fuzzy RDD) should not necessarily be discarded.
6. Conclusion
The existence of a distortion in the municipality populations close to the MPF band
changes was verified in Litschig (2012) and Monasterio (2014). The authors also
identify an apparent increase in the population of the municipalities, which could
displace them into the next bands, with the aim of increasing the transfer per capita
obtained. This raises the question of how adequate the current criteria are for
distributing the Municipal Participation Fund (MPF) and whether population can be
correctly used as a running variable in a Regression Discontinuity strategy.
Figure 8. MPF per capita, corrected population, 2000
Source: IBGE and the authors’ calculations
Figure 9. MPF per capita, original population, 2007
Source: IBGE and the authors’ calculations
Figure 10. MPF per capita, corrected population, 2007
Source: IBGE and the authors’ calculations
Figure 11. MPF per capita, original population, 2010
Source: IBGE and the authors’ calculations
Figure 12. MPF per capita, corrected population, 2010
Source: IBGE and the authors’ calculations
This study contributes in two ways. It uses an exogenous rule (Zipf’s Law,
1949) to quantify the expected population of each municipality. Based on this
expected population, we use the Jackknife method to identify the municipalities that
contribute most to the deviations from what is expected. For these municipalities,
we apply the expected population.
By applying our method to the samples from the 2000 Demographic Census, the 2007 TCU
data, and the 2010 Demographic Census, we obtained a significant reduction in the
population distortion of municipalities with up to 40,000 inhabitants around the MPF
band changes. Originally, a statistically significant discontinuity was observed in
16 of the 18 band changes, indicating some form of distortion of the variable. After
applying the method for identifying and adjusting the municipalities, we determined
in an exploratory way what the optimal adjustment point would be using the McCrary
(2008) test and thus managed to eliminate almost all of the distortion in the
population of the municipalities for the three observed Censuses. With this, the
distortion was smoothed, enabling the resulting sample to be used for purposes that
take into account the population ordering and the relationship between Brazilian
municipalities.
As can be seen in Tables 5, 6, and 7, the distortions only increased over the years
(visible in the increasing value of the discontinuities for λ = 0), and this can
concentrate the MPF distribution more and more in the municipalities identified by
our strategy (Mendes, 2008). In particular, we estimate that around 300
municipalities are contributing to this practice, which suggests the need to review
the way that the MPF is distributed to municipalities from the Interior group.
On the other hand, when evaluating the MPF per capita of the municipalities
before and after the proposed correction, no significant alteration is verified in the
values obtained close to the band changes. Thus, our strategy does not appear to
significantly affect the MPF per capita of the municipalities that should not be the
object of correction. Our results suggest that the use of a sharp RD (regression
discontinuity) in models that explain MPF by population may not be appropriate, given
the discontinuity observed in the population distribution of the municipalities.
However, our strategy favors the use of fuzzy RDD regressions in which the MPF is
instrumented by its theoretical counterpart and in which the corrected population
could be considered as a determinant of the amount to be received by each
municipality. That strategy could capture the exogenous variation in the transfers
for each economic variable of interest.
References
Arvate, P., Mattos, E. and Rocha, F. (2015). Conditional versus unconditional
grants and local public spending in Brazilian municipalities. 35th Meeting of
the Brazilian Econometric Society, Foz do Iguacu, Brazil.
Arvate, P., Mattos, E. and Rocha, F. (2011). Flypaper effect revisited: Evidence for
tax collection efficiency in Brazilian municipalities. Estudos Economicos, v. 41,
p. 7-28.
Barreca, A.; Guldi, M.; Lindo, J.; Waddell, G. (2011) “Saving Babies? Revisiting
the Effect of the Very Low Birth Weight Classification”. Quarterly Journal of
Economics, 126.
Brollo, F.; Nannicini, T.; Perotti, R.; Tabellini, G. (2013). The Political Resource
Curse, American Economic Review, 103(5), p. 1759-96.
Castro, M. and Regatieri, R. (2014). Impacto do Fundo de Participação dos
Municípios sobre os gastos públicos por função e subfunção: análise através de
uma regressão em descontinuidade. 42 Encontro Nacional de Economia, Natal,
Rio Grande do Norte, Brasil.
Eggers, A. C., Freier, R., Grembi, V., and Nannicini, T. (2015). Regression Dis-
continuity Designs Based on Population Thresholds: Pitfalls and Solutions.
Working Paper IZA No. 9553, December.
Funaro et al (2009). Diretrizes para apresentação de dissertações e teses da USP:
documento eletrônico e impresso. Universidade de São Paulo. Available from:
http://www.usp.br/prolam/ABNT 2011.pdf. Accessed on July 10th 2015.
Gabaix, X. (1999). Zipf’s Law for Cities: An Explanation. The Quarterly Journal
of Economics, v.114, n.3, p.739-767.
Gerard, F., Rokkanen, M., Rothe, C. (2015). Identification and Inference in Re-
gression Discontinuity Designs with a Manipulated Running Variable (working
paper).
IBGE - Instituto Brasileiro de Geografia e Estatística. Censo 2010. Brasília,
IBGE: 2011. Available from:
http://www.ibge.gov.br/home/estatistica/populacao/censo2010/default.shtm.
IBGE - Instituto Brasileiro de Geografia e Estatística. Contagem da população
2007. Brasília, IBGE: 2007. Available from:
http://www.ibge.gov.br/home/estatistica/populacao/contagem2007.
Litschig, S. (2012). Are rules-based government programs shielded from special-
interest politics? Evidence from revenue-sharing transfers in Brazil. Journal of
public Economics 96.11: 1047-1060.
Litschig, S.; Morrison, K. (2013). The impact of intergovernmental transfers on edu-
cation outcomes and poverty reduction. American Economic Journal: Applied
Economics.
Mattos, E. and V. Ponczek (2013). Efeitos da divisão municipal na oferta de bens
públicos e indicadores sociais. Revista Brasileira de Economia 67 (3), 315-336.
McCrary, J. (2008). Manipulation of the running variable in the regression
discontinuity design: A density test. Journal of Econometrics 142(2): 698–714.
McCrary, J. (2009). Codes for Manipulation of the Running Variable. Contains
the code for use in Stata (DCdensity.ado), the calls and outputs of an example
(DCdensity example.do, DCdensity example.log, DCdensity example.eps), and an
explanation of the implemented code (DCdensity.pdf). Available from:
http://eml.berkeley.edu/~jmccrary/DCdensity. Accessed on June 7th 2015.
Mendes, M.; Miranda, R. B.; Cossio, F. B. (2008). O Fundo de Participação dos
Municípios precisa mudar. Constituição de 1988: O Brasil 20 anos depois -
Estado e economia em vinte anos de mudanças, v.4, 2008.
Monasterio, L. M. (2014). A estranha distribuição da população dos pequenos
municípios brasileiros. Rev. Econ. NE, Fortaleza, v.45, n.4, p.111-119, 2014.
R Core Team. (2014). R: A language and environment for statistical computing.
R Foundation for Statistical Computing, Vienna, Austria. Available from:
http://www.R-project.org/.
S original, from StatLib and by Rob Tibshirani. R port by Friedrich Leisch.
(2015). bootstrap: Functions for the Book “An Introduction to the Bootstrap”.
Available from: http://CRAN.R-project.org/package=bootstrap.
Sousa, M. C.; Neto, F. C.; Stosic, B. (2004). Explaining DEA Technical Efficiency
Scores in an Outlier Corrected Environment: The Case of Public Services in
Brazilian Municipalities. Brazilian Review of Econometrics, v.25, n.2, p.287-
313, 2005.
Sousa, M. C. S., Stosic, B. D. Technical Efficiency of the Brazilian Municipali-
ties: Correcting Nonparametric Frontier Measurements for Outliers. Journal
of Productivity Analysis, Springer-Netherlands, v. 24, p. 157-181, 2005.
Stone, M. (1974). Cross-Validation and Multinomial Prediction. Biometrika,
61 (3), 509–515.
STN - Secretaria do Tesouro Nacional. (2012). O que você precisa
saber sobre as transferências constitucionais e legais. Available from:
http://www3.tesouro.fazenda.gov.br/estados municipios/download/CartilhaMPF.pdf.
Accessed on June 14th 2015.
Zipf, G. K. (1949). Human Behavior and the Principle of Least Effort. Addison-Wesley.
A. Appendix
A.1 Codes, databases, and products
All of the code, all of the databases, and the graphs obtained can be found at the
link below: https://goo.gl/zY6Dnu
The code (script and lib folder) covers steps 1 to 3 of the Methodology chapter.
The algorithm was constructed in R. The McCrary code in Stata can be found on
the page referenced in the Bibliography.
The databases (database folder) are the Demographic Censuses and also the
variations in MPF per capita.
The products (exports folder) are the graphs and tables generated.
Using the code in R, the graphs are in the r plots subfolder.
Using the McCrary algorithm, the tables that were used as a basis are in the
stata database subfolder, the file with the command lines is in the stata code
subfolder, and the graphs obtained are in the stata plots folder.
A.2 Tables of observed and estimated bands
A.2.1 Year 2000
Table A.1. 2000, before adjustment (λ = 0.00)

                          Estimated band
Observed band       1      2      3      4      5      6      7
1                1344      –      –      –      –      –      –
2                  37    566      –      –      –      –      –
3                   –     18    409      –      –      –      –
4                   –      –     29    530      –      –      –
5                   –      –      –     21    315      4      –
6                   –      –      –      –      –    190      –
7                   –      –      –      –      –     13     40
Source: IBGE (2011) and the authors’ calculations
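The cross-tabulations in this appendix compare the band of each municipality under its observed population with its band under the estimated population. A minimal sketch of that tabulation follows, assuming that band k ends at the k-th threshold listed in Tables 5 to 8 (so a population of exactly 10,188 still belongs to band 1); the function names and the two-municipality example are hypothetical.

```python
from bisect import bisect_left
from collections import Counter

# Upper limits of bands 1 to 6; band 7 covers populations above the last threshold.
THRESHOLDS = [10188, 13584, 16980, 23772, 30564, 37356]


def band(population):
    """Band number (1 to 7) for a given population."""
    return bisect_left(THRESHOLDS, population) + 1


def cross_tab(observed, estimated):
    """Count municipalities by (observed band, estimated band)."""
    return Counter((band(o), band(e)) for o, e in zip(observed, estimated))


# Two illustrative municipalities: one just above the first threshold whose
# estimated population falls below it, and one unaffected.
table = cross_tab(observed=[10200, 9000], estimated=[10100.0, 9000.0])
```

In a table built this way, off-diagonal cells such as (2, 1) count municipalities observed in one band but placed in the band below by the estimated population.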
Table A.2. 2000, after adjustment (λ = 0.30)

                          Estimated band
Observed band       1      2      3      4      5      6      7
1                1381      –      –      –      –      –      –
2                   –    573      –      –      –      –      –
3                   –     11    437      –      –      –      –
4                   –      –      1    551      –      –      –
5                   –      –      –      –    315      4      –
6                   –      –      –      –      –    203      –
7                   –      –      –      –      –      –     40
Source: IBGE (2011) and the authors’ calculations
A.2.2 Year 2007
Table A.3. 2007, before adjustment (λ = 0.00)

                          Estimated band
Observed band       1      2      3      4      5      6      7
1                1284      –      –      –      –      –      –
2                  63    540      –      –      –      –      –
3                   –     36    411      –      –      –      –
4                   –      –     31    553      –      –      –
5                   –      –      –     14    319      –      –
6                   –      –      –      –      6    197      –
7                   –      –      –      –      –     16     53
Source: IBGE (2011) and the authors’ calculations
Table A.4. 2007, after adjustment (λ = 0.45)

                          Estimated band
Observed band       1      2      3      4      5      6      7
1                1347      –      –      –      –      –      –
2                   1    575      –      –      –      –      –
3                   –      –    442      –      –      –      –
4                   –      –      1    566      –      –      –
5                   –      –      –      –    319      –      –
6                   –      –      –      –      7    212      –
7                   –      –      –      –      –      1     52
Source: IBGE (2011) and the authors’ calculations
A.2.3 Year 2010
Table A.5. 2010, before adjustment (λ = 0.00)

                          Estimated band
Observed band       1      2      3      4      5      6      7
1                1224      –      –      –      –      –      –
2                 106    523      –      –      –      –      –
3                   –     44    384      –      –      –      –
4                   –      –     49    545      –      –      –
5                   –      –      –     22    309      –      –
6                   –      –      –      –     27    203      –
7                   –      –      –      –      –     19     48
Source: IBGE (2011) and the authors’ calculations
Table A.6. 2010, after adjustment (λ = 0.50)

                          Estimated band
Observed band       1      2      3      4      5      6      7
1                1330      –      –      –      –      –      –
2                   1    564      2      –      –      –      –
3                   –      –    431      2      –      –      –
4                   –      –      –    559      –      –      –
5                   –      –      –      8    336      –      –
6                   –      –      –      –      –    222      –
7                   –      –      –      –      –      1     47
Source: IBGE (2011) and the authors’ calculations
A.3 Results with a similar database to MONASTERIO (2014)
This study used municipalities with populations between 5,000 and 40,000 inhabitants
in the calculations. This interval differs slightly from that of Monasterio (2014),
who used municipalities with populations between 5,000 and 50,000 inhabitants. The
results of the method applied to each of the discontinuities in this new database
(λ = 0.50) are shown below.
The loss and/or reduction of significance in all of the 2010 discontinuity bands
(observed in Table 7) also occurs with this new application.
Figure A.1. 2010, first discontinuity
Source: IBGE and the authors’ calculations
Figure A.2. 2010, second discontinuity
Source: IBGE and the authors’ calculations
Figure A.3. 2010, third discontinuity
Source: IBGE and the authors’ calculations
Figure A.4. 2010, fourth discontinuity
Source: IBGE and the authors’ calculations
Figure A.5. 2010, fifth discontinuity
Source: IBGE and the authors’ calculations
Figure A.6. 2010, sixth discontinuity
Source: IBGE and the authors’ calculations
Figure A.7. 2010, seventh discontinuity
Source: IBGE and the authors’ calculations
Table A.7. Value and significance of the MPF discontinuities (λ = 0.50)

Discontinuity   Mean      (Standard error)
1               0.135     (0.092)
2               0.025     (0.103)
3               -0.086    (0.106)
4               0.155     (0.153)
5               0.393     (0.201) *
6               0.544     (0.268) **
7               -0.618    (0.356) *
** 5% significance, * 10% significance
Source: IBGE and the authors’ calculations