Behind The Veil of Power: State Statistics and Benford’s Law in Chosǒn Korea
8/13/2019 Behind The Veil of Power: State Statistics and Benford’s Law in Chosǒn Korea
http://slidepdf.com/reader/full/behind-the-veil-of-power-state-statistics-and-benfords-law-in-choson-korea 1/12
Online available since 2013/Nov/20 at www.oricpub.com
© (2013) Copyright ORIC Publications
Journal of Human and Social Science Research (HSSR), Vol. 2, No. 1 (2013), 9-20
webpage: http://www.oricpub.com/hssr-journal
ORIC Publications www.oricpub.com
All rights reserved. No part of contents of this paper may be reproduced or transmitted in any form or by any means without the written permission of ORIC Publications,
www.oricpub.com.
Keywords: Chosǒn; Korea; History; Benford’s Law
Received: 27 Oct 2013 Accepted: 04 Nov 2013
Milan Hejtmanek
Department of Korean History, College of Humanities, Seoul National
University, Gwanak-gu, Seoul, South Korea
Behind The Veil of Power: State Statistics and Benford’s Law in Chosǒn
Korea
Corresponding Author: Milan Hejtmanek
Department of Korean History, College of Humanities, Seoul National University, Gwanak-gu, Seoul, South Korea
Abstract
Situating the Chosǒn period (1392-1910) in Korea within the larger historiographical
framework has proved no easy challenge. While its economy was basic, relying largely on
agriculture, its system of rule comprised a complex bureaucracy superintended by highly
educated officials selected by impartially administered examinations. While the period is
generally consigned to the catch-all historiographical category of “pre-modern,” some
scholars such as Alexander Woodside have pointed to the rationality of the Chosǒn
bureaucracy and deemed it an alternative form of modernity. Central to such attempts to
reappraise the sophistication of rule in Chosǒn are the sophistication and accuracy of its
administrative record keeping. This paper uses the mathematical insights garnered over the
past century concerning “Benford’s Law” or the “First Digits Law” to examine the internal
consistency of a set of Chosǒn-era government statistics. Benford and others discovered
that the left-most digits in a wide variety of statistics follow a non-uniform
distribution, one not obvious to those unfamiliar with it. Through a comparison of the
actual numbers with their theoretical distribution under Benford’s Law, it has been possible
to uncover a wide variety of malfeasance and accounting fraud. This paper finds that
government documents from both the 15th and 19th centuries generally comport with the
Benford distribution, adding weight to an understanding of the Chosǒn period as more
competent and sophisticated in its rule than has previously been thought.
1. INTRODUCTION
The Chosǒn period (1392-1910) presents a seeming paradox: how did one of
the longest continuous dynastic states in world history achieve such
stability in the midst of not infrequent invasion, domestic rebellions,
fratricidal mayhem and murder in the royal house, the tensions inherent in
the existence of widespread hereditary slavery, and frequent recourse to
brigandage? Scholars have tended to locate the social anchor of the Chosǒn
period in its tenacious commitment to learning generally, especially as
articulated through a turn toward Neo-Confucian thought and social precepts,
as propagated by (and demonstrated through) a state examination system that
yielded hard-earned degrees which every adult aristocrat (the yangban class)
sought, often for long decades into late adulthood.1 Other explanations for
the dynastic longevity must surely involve a careful examination of the
elaborate bureaucratic system that superintended the affairs of state.
1 For context on Korean history in the Chosǒn period generally, see Lee (1984); for specialized information on Confucianism and
its influence on Chosǒn-era institutions, see Palais (1996) and Deuchler (1995).
set of comprehensive reference works, providing extensive details on institutions and locales, vital
for effective governance by the capital bureaucracy. Often referenced but relatively little studied, either in
Korea or elsewhere, these texts reveal a systematic approach to rational governance, based on the patient
assembling of a wide variety of useful and accurate information necessary for the development of thoughtful
and well-designed policies.
2. MATERIALS AND METHODS
The counterintuitive fact that for many (but not all) collections of numbers the first significant digits
(i.e. the first non-zero digit from the left of a number) are not distributed uniformly but instead are highly
skewed toward low digits was noticed only in the late nineteenth century by Newcomb (1881) and was
not explored with any thoroughness until the twentieth century by Benford (1938), a physicist at General
Electric. He discovered that a wide variety of categories of numbers, including the areas of rivers, baseball
statistics, and street addresses, all display a characteristic monotonically decreasing distribution of the first
digits (from 1 to 9), for which he devised a formula treating the numbers logarithmically:
\[ P(d) = \log_{10}\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \ldots, 9 \tag{1} \]
By this model, now known as Benford’s Law (also termed the First Significant Digits (FSD) phenomenon),
numbers beginning with 1 will occur with a frequency of about 30.1%, while those beginning with 9 only
around 4.6%, as shown in Fig. 1.
Figure 1. Benford's Distribution for 1st Digits
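As a quick check on the percentages quoted above, the following minimal Python sketch evaluates the first-digit formula in its standard form, P(d) = log10(1 + 1/d). The function and variable names are illustrative assumptions and do not come from the paper.

```python
import math


def benford_first_digit(d: int) -> float:
    """Expected share of numbers whose first significant digit is d (1 to 9)."""
    if not 1 <= d <= 9:
        raise ValueError("first significant digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Theoretical first-digit (1BL) distribution, keyed by digit.
first_digit_probs = {d: benford_first_digit(d) for d in range(1, 10)}

for digit, prob in first_digit_probs.items():
    print(f"{digit}: {prob:.1%}")
```

The nine probabilities sum to one, since the logarithms telescope to log10(10).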
A related implication of Benford’s Law is that the second digits (from 0 to 9) also follow a non-uniform
distribution, given by the equation:
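\[ P(d_2) = \sum_{k=1}^{9} \log_{10}\left(1 + \frac{1}{10k + d_2}\right), \qquad d_2 = 0, 1, \ldots, 9 \tag{2} \]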
By this rule, the number 0 should appear as the second digit about 12% of the time, in contrast to 9,
which lags at a frequency of about 8.5%, as shown in Fig. 2.
Figure 2. Benford's Distribution for 2nd Digits
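The second-digit frequencies can be checked in the same way. This sketch assumes the standard form of the second-digit law, in which the probability of second digit d is the sum over first digits k = 1 to 9 of log10(1 + 1/(10k + d)). The names are illustrative only.

```python
import math


def benford_second_digit(d: int) -> float:
    """Expected share of numbers whose second significant digit is d (0 to 9)."""
    if not 0 <= d <= 9:
        raise ValueError("second significant digit must be between 0 and 9")
    # Sum over every possible first digit k the probability of the pair (k, d).
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))


# Theoretical second-digit (2BL) distribution, keyed by digit.
second_digit_probs = {d: benford_second_digit(d) for d in range(0, 10)}

for digit, prob in second_digit_probs.items():
    print(f"{digit}: {prob:.1%}")
```

The second-digit distribution is much flatter than the first-digit one, which is why second-digit tests generally need more data to detect a departure.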
Subsequently, over several decades, scholars such as Pinkham (1961), Raimi (1969), Hill (1995), and Berger
and Hill (2011) have probed the mathematical foundations of Benford’s Law, while others have sought to apply it
empirically in a wide variety of disciplines, including genome analysis, detection of scientific fraud, analysis
of macro-economic data, and forensic accounting. Because the FSD distribution is counterintuitive,
tax fraud and other financial malfeasance can at times be ferreted out through its use (Nigrini, 1996, 2012).
Another burgeoning venue for the use of Benford’s Law has been the forensic analysis of elections, seeking
irregularities in the vote through unusual patterns in the count (Berber et al., 2012). Active research in the
field continues, even as Grendar et al. (2007) have refined the underlying approach by introducing a family
of related Benford-like distributions. Benford’s Law has so far been employed to seek out possible
irregularities, but while such analyses at times produce highly suggestive results, they should
not be understood as confirming malfeasance.
This paper uses Benford's Law to examine two related issues. First, do the counts found in historical
Korean statistical compendia generally follow the distributions predicted by Benford’s Law? Second, if
they do, can we find statistically significant departures that might hint at more broadly based
lapses in record keeping, or even deliberate malfeasance?
3. ANALYSIS OF SEJONG ANNALS GEOGRAPHICAL APPENDIX
Using data derived from Pak (1987) and Kang (1998) and the Stata 12 software package (StataCorp, 2011), I analyzed the
household and population statistics given in the Geographical Appendix for 65 locales in 15th century Korea,
comprising 112,874 persons living in 37,749 households. As can be seen from Figure 3, the first digits of
the household data track the Benford distribution well. Using the standard Pearson’s chi-squared test to
compare the observed counts with those expected under the first-digit Benford’s Law (1BL) distribution, the
statistic obtained (chi-square = 3.3 with 8 degrees of freedom, p = 0.914) is far too low to reject the null
hypothesis of concordance with 1BL. Hence, households in this data sample track 1BL quite well.
Under the second-digit test of Benford’s Law (2BL, see Figure 4),
there is likewise no evidence (chi-square = 3.72, p = 0.9287) to reject the null hypothesis. These data on households from
the fifteenth century follow a Benford distribution well and provide no evidence of possible irregularities.
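The first-digit test described above can be sketched in a few lines of Python. This is a stand-in using only the standard library, not the Stata procedure the paper used. The sample is the first sixty powers of two, a sequence known to follow Benford's Law closely, and is not historical data. All names are illustrative assumptions.

```python
import math
from collections import Counter


def first_digit(n: int) -> int:
    """Left-most digit of a positive integer."""
    if n <= 0:
        raise ValueError("expected a positive integer")
    return int(str(n)[0])


def benford_first_digit_test(values):
    """Pearson chi-square statistic and p-value against the 1BL distribution."""
    counts = Counter(first_digit(v) for v in values)
    n = len(values)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    # Nine digit classes give 8 degrees of freedom. For even df = 2m the
    # upper-tail probability has the closed form exp(-x/2) * sum_{k<m} (x/2)^k / k!.
    half = stat / 2
    p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(4))
    return stat, p_value


# Stand-in sample (NOT data from the Sejong Annals): powers of two.
sample = [2 ** k for k in range(1, 61)]
stat, p_value = benford_first_digit_test(sample)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

With roughly sixty observations, the expected counts for the highest digits fall below five, so the chi-square approximation is rough at this sample size. The same caveat applies to the 65 locales analyzed here.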
Figure 3. First Digit Distribution of Households in Sejong Annals
Figure 4. Second Digit Distribution of Households in Sejong Annals
Figure 5. First Digit Distribution of Population in Sejong Annals
Figure 6. Second Digit Distribution of Population in Sejong Annals
Data on the population in these locales from the Sejong Annals are presented in Figures 5 and 6. Here
too, for the first digit, the numbers line up in a distribution statistically indistinguishable from that
predicted by 1BL, with a chi-square value of 6.84 for 8 degrees of freedom (p = 0.554), far
above the p value of 0.05 that could indicate convincing evidence of a difference possibly indicative of
slipshod data collection or fraudulent recording. For the second-digit analysis, the chi-square value of 14.24
(p = 0.1139) also failed to provide strong evidence of a departure from the 2BL distribution, although the
second digits 0, 1, 4, and 5 in particular displayed marked divergences. The data might bear closer examination
to see whether the departures reflect a local pattern of recording. One might speculate that the
number four was underrepresented in both the 1BL and 2BL counts because of the general taboo in East Asia
against the number four, since the word for “four” (sa 四) is a homophone
of the word for “death” (sa 死). It is conceivable that some fours were recorded as fives. However,
any such taboo concerning four is not in evidence in the other data presented in this study. In the end, the data
remain explicable as random departures from the 1BL and 2BL distributions.
4. ANALYSIS OF THE ROYAL COMPENDIUM
Using data from Bae (1995), I first examined the statistics concerning the national system of
strategic royal forests (pongsan), a category of data likely less charged with the potential for malfeasance
than the others examined for the Royal Compendium, and hence useful as a potential baseline for observing
how well the data in this collection might adhere to a Benford-like distribution. Nationally there were 651
strategic forests, found in 74 jurisdictions. The results of the analysis for 1BL are presented in Figure 7.
The first digits in the data generate a chi-square of 8.89 (p = 0.352), providing insufficient evidence to
reject the null hypothesis. These data are, then, statistically concordant with the predictions of the 1BL
distribution. Because only 21 of the 74 locales had 10 or more forests, and a count must be 10 or higher
to have a second digit, no 2BL analysis was attempted.
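The restriction on second-digit analysis can be made concrete with a short sketch: single-digit counts have no second digit and drop out of a 2BL test. The counts below are hypothetical placeholders, not values from the Royal Compendium, and the function name is an assumption.

```python
def second_digit(n: int):
    """Second digit of a positive integer, or None when n is below 10."""
    if n <= 0:
        raise ValueError("expected a positive integer")
    text = str(n)
    return int(text[1]) if len(text) >= 2 else None


# Hypothetical forest counts per jurisdiction (placeholders for illustration).
forest_counts = [3, 12, 7, 25, 1, 40, 9, 108]

# Keep only the values that actually have a second digit.
second_digits = [d for d in map(second_digit, forest_counts) if d is not None]
usable_share = len(second_digits) / len(forest_counts)

print(second_digits)
print(f"{usable_share:.0%} of the values are usable for a second-digit test")
```

In the forest data only 21 of 74 values (about 28%) would survive this filter. That leaves about two expected observations per second digit, too few for a meaningful chi-square test.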
Figure 7. First Digit Distribution of Strategic Forests in Royal Compendium
Next, using raw data compiled by Kim et al. (2010) as part of an ongoing project in Chosŏn-era
economic history at Sungkyunkwan University and Seoul National University, I examined a series of
potentially more contentious statistics from the early 19th century, beginning with tax-exempt land.
Nationwide, a recorded 338,293 kyŏl of land were off the tax rolls for 199 separate purposes and
jurisdictions. (The precise acreage is difficult to know, since the kyŏl measures the productive capacity of
land rather than its absolute area, but it was on the order of a million acres.) The 1BL and 2BL
analyses are presented in Figures 8 and 9. Here as well it is impossible to reject the null hypothesis of
statistical concordance with the Benford distributions, given a chi-square of 10.31 (p = 0.244) for 1BL and
9.83 (p = 0.364) for 2BL. There is no evidence, at this level of aggregation, of possible irregularities
discernible via Benford distributions, which the data follow well.
Figure 8. First Digit Distribution of Tax Exempt Land in Royal Compendium
Figure 9. Second Digit Distribution of Tax Exempt Land in Royal Compendium
Figure 10. First Digit Distribution of Grain Tax in Royal Compendium
Figure 11. Second Digit Distribution of Grain Tax in Royal Compendium
Figures 10 and 11 show the 1BL and 2BL analyses of the national grain tax in rice and beans, paid to
the central government by farmers throughout the country in the early 19th century. Split among 70
categories of place and crop and totaling 107,895 sŏk (1 sŏk equals approximately 40 gallons), the numbers
provided for the grain tax again show no statistically significant difference from the 1BL distribution
(chi-square of 11.68, p = 0.166) or the 2BL distribution (chi-square of 6.07, p = 0.73). Of course, this high
level of aggregation might well conceal a wide variety of departures from Benford distributions that have
canceled each other out, but as they stand, these numbers display no discernible irregularities.
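The chi-square statistics reported so far can be converted back to p-values as a consistency check. The sketch below assumes 8 degrees of freedom for first-digit tests, as the paper states, and 9 for second-digit tests, which is an assumption based on the ten possible second digits. It uses the closed-form upper-tail expressions for integer degrees of freedom. The statistics and reported p-values are those quoted in the text, while the function name and table layout are illustrative.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and df >= 1")
    half = x / 2
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df = 2m + 1: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{k=1}^{m} x^(k-1) / (2k-1)!!
    term, total = 1.0, 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


# (label, chi-square, assumed df, p-value as reported in the text)
reported = [
    ("households 1BL", 3.30, 8, 0.914),
    ("households 2BL", 3.72, 9, 0.9287),
    ("population 1BL", 6.84, 8, 0.554),
    ("population 2BL", 14.24, 9, 0.1139),
    ("forests 1BL", 8.89, 8, 0.352),
    ("tax-exempt land 1BL", 10.31, 8, 0.244),
    ("tax-exempt land 2BL", 9.83, 9, 0.364),
    ("grain tax 1BL", 11.68, 8, 0.166),
    ("grain tax 2BL", 6.07, 9, 0.73),
]

for label, stat, df, p_reported in reported:
    print(f"{label:22s} chi2={stat:6.2f} df={df} p={chi2_sf(stat, df):.4f} (reported {p_reported})")
```

Small discrepancies in the last digit are to be expected, since the statistics in the text are rounded to two decimal places.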
The final category examined was the potentially contentious one of military personnel. The
compendium lists 180,729 men in 79 separate categories. The results of the 1BL and 2BL analyses are shown
in Figures 12 and 13. Unlike the other data presented in this paper, these provide statistically sufficient
evidence to reject the null hypothesis for both first and second digits. The chi-square for 1BL is 21.95
(p = 0.005) and for 2BL it is 38.93 (p < 0.001). Perhaps some of these high chi-square results stem from
low cell counts, but clearly the very high counts for 1 (1BL) and 0 (2BL) bear closer examination. These data
appear to break decisively from a Benford distribution and hence admit the possibility either of special
circumstances that might generate an excess of 1s, 0s, and 5s or, alternatively, of abuse of the military rolls.
One plausible possibility is that the sizes of military units created an excess of 1s in the first digit
position and of 0s in the second position.
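When a chi-square test rejects the null hypothesis, it helps to see which digits drive the result and which cells are too sparse to trust. The sketch below computes per-digit standardized residuals and contributions to the first-digit statistic. The counts are hypothetical, chosen only to mimic an excess of 1s across 79 categories, and are not the military figures from the compendium. All names are illustrative.

```python
import math


def digit_contributions(observed):
    """Expected count, standardized residual, and chi-square contribution per first digit."""
    n = sum(observed.values())
    rows = {}
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        residual = (observed.get(d, 0) - expected) / math.sqrt(expected)
        rows[d] = {
            "expected": expected,
            "residual": residual,
            "contribution": residual ** 2,
        }
    return rows


# Hypothetical first-digit counts for 79 categories (NOT the paper's data).
hypothetical_counts = {1: 38, 2: 10, 3: 8, 4: 5, 5: 9, 6: 3, 7: 2, 8: 2, 9: 2}

rows = digit_contributions(hypothetical_counts)
largest = max(rows, key=lambda d: rows[d]["contribution"])
low_expected = [d for d in rows if rows[d]["expected"] < 5]

for d, row in rows.items():
    print(f"{d}: expected {row['expected']:5.1f}, residual {row['residual']:+.2f}")
print("largest contribution from digit", largest)
print("digits with expected count below 5:", low_expected)
```

With 79 observations the expected counts for digits 7, 8, and 9 fall below five, the conventional threshold for the chi-square approximation. This is one way to read the caution about low cell counts.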
Figure 12. First Digit Distribution of Military Manpower in Royal Compendium
Figure 13. Second Digit Distribution of Military Manpower in Royal Compendium
5. CONCLUSION
To a surprising degree, a wide variety of the data introduced in this paper from 15th- and 19th-century
Chosŏn Korea statistically match the Benford 1BL and 2BL distributions. This consonance is indicative both
of the general underlying power of Benford-like distributions to model data drawn from complex social
arrangements on a variety of scales and of their potential use for further, more intensive investigations at finer
levels of data aggregation. The 19th century military roster data show departures from the Benford
distributions, both 1BL and 2BL, that warrant further study to understand their sources, which may well have
stemmed from the nature of military unit sizes.
While the general absence of non-Benford distributions in the data cannot be used as proof that
malfeasance was not occurring, it is nonetheless reassuring that in the cases of household numbers and their
recorded population (15th century), and of strategic forest preserves, tax-exempt land plots, and grain tax
figures (19th century), there appeared no evidence of gross manipulation or wholesale manufacture of numbers. On
balance, this preliminary examination of two of Chosŏn Korea’s most important statistical compendia has
strengthened our general confidence in their utility and gives credence to historical interpretations of the
Chosǒn period as superintended by a complex and sophisticated bureaucracy capable of producing a variety
of statistical information that passes twenty-first century tests for gross unreliability.
REFERENCES
[1] Bae, J. S. (1995). Chosŏn hugi pongsan wich’i mit kinǔng e kwan han yŏn’gu. Sallim kyŏngjae yŏng’u, 3, 29-44.
[2] Benford, F. (1938). The Law of Anomalous Numbers. Proceedings of the American Philosophical Society, 78, 551-572.
[3] Berger, A. and Hill, T. P. (2011). A Basic Theory of Benford’s Law. Probability Surveys, 8, 1-126.
[4] Berber B., Scacco, A., and Alvarez R. M. (2012). What the Numbers Say: A Digit-Based Test for Election Fraud. Political Analysis, 20(2), 211-234.
[5] Chosǒn wangjo sillok, Sejong chiriji (1454) [http://sillok.history.go.kr].
[6] Chŭngbo munhǒn pigo. (1908). Vol. 1 & 2. Seoul: Tongmun Munhwasa, photo-reproduction 1957.
[7] Deuchler, M. (1995). The Confucian Transformation of Korea: A Study of Society and Ideology. Cambridge: Harvard University Press.
[8] Grendar, M., Judge, M. G., & Schechter, L. (2007). An empirical non-parametric likelihood family of data-based Benford-like distributions. Physica A, 380, 429-438.
[9] Academy of Korean Studies, Han'guk minjok munkwa paekkwa sajǒn, electronic edition [http://encykorea.aks.ac.kr].
[10] Hill, T. P. (1995). A Statistical Derivation of the Significant-digit-law. Statistical Science, 10(4), 354-363.
[11] Kang, E. G. (1998). Koryŏ hugi hojangjǔng ǔi pyŏnhwa wa Sejong sillok chiriji ǔi t’osŏng mangsŏng. Dongbanji, 53-130.
[12] Kim, Kuen Tae et al. (2010). Man’gi yoram electronic database. Department of Korean History, Seoul National University, and the Academy of East Asian Studies, Sunggyunkwan University.
[13] Lee, K. B. (1984). A New History of Korea. Cambridge: Harvard University Press.
[14] Newcomb, S. (1881). Note on the Frequency of Use of the Different Digits in Natural Numbers. American Journal of Mathematics, 4(1), 39-40.
[15] Nigrini, M. J. (1996). A taxpayer compliance application of Benford’s law. J. Am. Taxation Assoc., 18(1), 21-23.
[16] Nigrini, M. J. (2012). Benford's Law: Applications for Forensic Accounting, Auditing, and Fraud Detection. New Jersey: John Wiley & Sons.
[17] Pak, E. G. (1987). Koryŏ hugi hyangnich’ǔng ǔi pyŏngdong. Chindan hakpo, (64), 95-117.
[18] Palais, J. (1996). Confucian Statecraft and Korean Institutions: Yu Hyǒngwǒn and the Late Chosǒn Dynasty. Seattle: University of Washington Press.
[19] Pinkham, R. (1961). On the Distribution of First Significant Digits. Annals of Mathematical Statistics, 32(4), 1223-1230.
[20] Raimi, R. (1969). The First Digit Problem. American Mathematical Monthly, 83(7), 521-538.
[21] StataCorp. (2011). Stata Statistical Software: Release 12. College Station, TX: StataCorp LP.
[22] Sǒ Y. B. & Sim, S. K. (1809). Man'gi yoram, Vol. 1 & 2. Seoul: Kyǒngin Munhwasa, photo-reproduction 1972.
[23] Woodside, A. (2006). Lost Modernities. Cambridge: Harvard University Press.
[24] Yi, H. (1530). Sin tongguk yǒji sŭngnam. Seoul: Tongguk Munhwasa, photo-reproduction 1958.