Benford's Law


The distribution of first digits, according to Benford's law. Each bar represents a digit, and the height of the bar is the percentage of numbers that start with that digit.

A logarithmic scale bar. Picking a random x position uniformly on this number line, roughly 30% of the time the first digit of the number will be 1.

Frequency of first significant digit of physical constants plotted against Benford's law.

Benford's law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 about 30% of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than 5% of the time. This distribution of first digits is the same as the widths of gridlines on the logarithmic scale.

This counter-intuitive result has been found to apply to a wide variety of data sets, including electricity bills, street addresses, stock prices, population numbers, death rates, lengths of rivers, physical and mathematical constants, and processes described by power laws (which are very common in nature). It tends to be most accurate when values are distributed across multiple orders of magnitude.

The graph to the right shows Benford's law for base 10. There is a generalization of the law to numbers expressed in other bases (for example, base 16), and also a generalization to second and later digits.

It is named after physicist Frank Benford, who stated it in 1938,[1] although it had been previously stated by Simon Newcomb in 1881.[2]

Mathematical statement

A set of numbers is said to satisfy Benford's law if the leading digit d (d ∈ {1, …, 9}) occurs with probability

P(d) = log10(d + 1) − log10(d) = log10(1 + 1/d)

Numerically, the leading digits have the following distribution in Benford's law, where d is the leading digit and P(d) the probability:


d    P(d)
1    30.1%
2    17.6%
3    12.5%
4     9.7%
5     7.9%
6     6.7%
7     5.8%
8     5.1%
9     4.6%

The quantity P(d) is proportional to the space between d and d + 1 on a logarithmic scale. Therefore, this is the distribution expected if the logarithms of the numbers (but not the numbers themselves) are uniformly and randomly distributed. For example, a one-digit number x starts with the digit 1 if 1 ≤ x < 2, and starts with the digit 9 if 9 ≤ x < 10. Therefore, x starts with the digit 1 if log 1 ≤ log x < log 2, or starts with 9 if log 9 ≤ log x < log 10. The interval [log 1, log 2] is much wider than the interval [log 9, log 10] (0.30 and 0.05 respectively); therefore if log x is uniformly and randomly distributed, it is much more likely to fall into the wider interval than the narrower interval, i.e. more likely to start with 1 than with 9. The probabilities are proportional to the interval widths, and this gives the equation above. (The above discussion assumed x is a one-digit number, but the result is the same no matter how many digits x has.)

An extension of Benford's law predicts the distribution of first digits in other bases besides decimal; in fact, any base b ≥ 2. The general form is:

P(d) = logb(1 + 1/d)  for d ∈ {1, …, b − 1}
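The probabilities above are easy to compute directly. The following sketch (function name is my own, for illustration) evaluates P(d) in base 10 and in an arbitrary base b:

```python
import math

def benford_p(d, base=10):
    """Probability that the leading digit is d under Benford's law."""
    return math.log(1 + 1 / d, base)

# Base-10 probabilities for digits 1..9; they sum to 1.
probs = {d: benford_p(d) for d in range(1, 10)}
assert abs(sum(probs.values()) - 1.0) < 1e-12
print(round(probs[1], 3))  # 0.301
```

Note that `benford_p(1, base=2)` returns exactly 1, matching the observation below that the law is trivially true in binary.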

For b = 2 (the binary number system), Benford's law is true but trivial: all binary numbers (except for 0) start with the digit 1. (On the other hand, the generalization of Benford's law to second and later digits is not trivial, even for binary numbers.) Also, Benford's law does not apply to unary systems such as tally marks.

Benford's "law" is different from a typical mathematical theorem: it is an empirical statement about real-world datasets. It applies to some datasets but not all, and even when it applies it is at best only approximate, never exact.

Example

Distribution of first digits (in %, red bars) in the population of the 237 countries of the world. Black dots indicate the distribution predicted by Benford's law.

Examining a list of the heights of the 60 tallest structures in the world by category shows that 1 is by far the most common leading digit, irrespective of the unit of measurement:


Leading digit   In meters         In feet           Benford's law
                Count    %        Count    %
1               26       43.3%    18       30.0%    30.1%
2               7        11.7%    8        13.3%    17.6%
3               9        15.0%    8        13.3%    12.5%
4               6        10.0%    6        10.0%    9.7%
5               4        6.7%     10       16.7%    7.9%
6               1        1.7%     5        8.3%     6.7%
7               2        3.3%     2        3.3%     5.8%
8               5        8.3%     1        1.7%     5.1%
9               0        0.0%     2        3.3%     4.6%
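A tally like the one above is straightforward to reproduce. The sketch below (function names are my own; the height values are a short illustrative sample in meters, not the full 60-structure list) extracts leading digits and computes their frequencies:

```python
from collections import Counter

def leading_digit(x):
    """First significant digit of a nonzero number (sign, zeros, and
    decimal point ignored), found via scientific-notation formatting."""
    s = f"{abs(x):e}"          # e.g. 37.3447 -> '3.734470e+01'
    return int(s[0])

def digit_frequencies(values):
    """Relative frequency of each leading digit 1-9 in a list of numbers."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    n = sum(counts.values())
    return {d: counts.get(d, 0) / n for d in range(1, 10)}

# Illustrative sample of structure heights in meters, and the same
# heights converted to feet.
heights_m = [828, 601, 530, 509, 492, 484, 450, 442, 423, 381]
freq_m = digit_frequencies(heights_m)
freq_ft = digit_frequencies([h * 3.28084 for h in heights_m])
```

Running this on the small sample already shows the unit-independence effect: the digit counts shift between meters and feet, yet 1 stays prominent in the feet column.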

History

The discovery of this fact goes back to 1881, when the American astronomer Simon Newcomb noticed that in logarithm tables (used at that time to perform calculations), the earlier pages (which contained numbers that started with 1) were much more worn than the other pages.[2] Newcomb's published result is the first known instance of this observation and includes a distribution on the second digit as well. Newcomb proposed a law that the probability of a single number N being the first digit of a number was equal to log(N + 1) − log(N).

The phenomenon was rediscovered in 1938 by the physicist Frank Benford,[1] who checked it on a wide variety of data sets and was credited for it. In 1995, Ted Hill proved the result about mixed distributions mentioned below.[3]

The discovery was named after Benford, making it an example of Stigler's law.

Explanations

Benford's law has been explained in various ways.

Outcomes of exponential growth processes

The precise form of Benford's law can be explained if one assumes that the logarithms of the numbers are uniformly distributed; for instance, that a number is just as likely to be between 100 and 1000 (logarithm between 2 and 3) as it is between 10,000 and 100,000 (logarithm between 4 and 5). For many sets of numbers, especially sets that grow exponentially such as incomes and stock prices, this is a reasonable assumption.

For example, if a quantity increases continuously and doubles every year, then it will be twice its original value after one year, four times its original value after two years, eight times its original value after three years, and so on. When this quantity reaches a value of 100, the value will have a leading digit of 1 for a year, reaching 200 at the end of the year. Over the course of the next year, the value increases from 200 to 400; it will have a leading digit of 2 for a little over seven months, and 3 for the remaining five months. In the third year, the leading digit will pass through 4, 5, 6, and 7, spending less and less time with each succeeding digit, reaching 800 at the end of the year. Early in the fourth year, the leading digit will pass through 8 and 9. The leading digit returns to 1 when the value reaches 1000, and the process starts again, taking a year to double from 1000 to 2000. From this example, it can be seen that if the value is sampled at uniformly distributed random times throughout those years, it is more likely to be measured when the leading digit is 1, and successively less likely to be measured with higher leading digits.

This example makes it plausible that data tables that involve measurements of exponentially growing quantities will agree with Benford's law. But the law also appears to hold for many cases where an exponential growth pattern is not obvious.
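The doubling argument can be checked by simulation. This sketch (names and parameters are my own choices) samples a quantity value(t) = 2^t at uniformly random times over 50 doubling periods and measures how often the leading digit is 1:

```python
import random

def leading_digit(x):
    """First significant digit, via scientific-notation formatting."""
    return int(f"{x:e}"[0])

# A quantity that doubles every year, sampled at uniformly distributed
# random times t over 50 doubling periods: value(t) = 2**t.
random.seed(0)
samples = [2 ** random.uniform(0, 50) for _ in range(100_000)]
frac_ones = sum(1 for v in samples if leading_digit(v) == 1) / len(samples)
print(frac_ones)  # close to log10(2) ≈ 0.301
```

Since t is uniform, log10 of the value is uniform, which is exactly the assumption that yields Benford's law.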

Scale invariance

For each positive integer n, this graph shows the probability that a random integer between 1 and n starts with each of the nine possible digits. For any particular value of n, the probabilities do not precisely satisfy Benford's law; however, looking at a variety of different values of n and averaging the probabilities for each, the resulting probabilities do exactly satisfy Benford's law.

The law can alternatively be explained by the fact that, if it is indeed true that the first digits have a particular distribution, it must be independent of the measuring units used (otherwise the law would be an effect of the units, not the data). This means that if one converts from feet to yards (multiplication by a constant), for example, the distribution must be unchanged: it is scale invariant, and the only continuous distribution that fits this is one whose logarithm is uniformly distributed.

For example, the first (non-zero) digit of the lengths or distances of objects should have the same distribution whether the unit of measurement is feet, yards, or anything else. But there are three feet in a yard, so the probability that the first digit of a length in yards is 1 must be the same as the probability that the first digit of a length in feet is 3, 4, or 5. Applying this to all possible measurement scales gives a logarithmic distribution, and combined with the fact that log10(1) = 0 and log10(10) = 1 gives Benford's law. That is, if there is a distribution of first digits, it must apply to a set of data regardless of what measuring units are used, and the only distribution of first digits that fits that is Benford's law.
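Scale invariance can be checked numerically. In the sketch below (function name and parameters are my own; a log-uniform sample stands in for Benford-conforming data), multiplying every value by a unit-conversion constant leaves the leading-digit frequencies essentially unchanged:

```python
import random
from collections import Counter

def digit_freqs(values):
    """Leading-digit frequencies for digits 1..9."""
    counts = Counter(int(f"{abs(v):e}"[0]) for v in values)
    n = len(values)
    return [counts.get(d, 0) / n for d in range(1, 10)]

# A Benford-conforming sample: 10**u with u uniform over 6 decades,
# so log10 of each value is uniformly distributed.
random.seed(2)
data = [10 ** random.uniform(0, 6) for _ in range(100_000)]
feet = digit_freqs(data)
meters = digit_freqs([v * 0.3048 for v in data])  # same data, different units

# The two frequency vectors agree to within sampling noise, even though
# individual leading digits change under the conversion.
```

The same experiment with, say, a normal sample instead of a log-uniform one would show the frequencies shifting under conversion, since that distribution is not scale invariant.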

Multiple probability distributions

For numbers drawn from certain distributions, for example IQ scores, human heights or other variables following normal distributions, the law is not valid. However, if one "mixes" numbers from those distributions, for example by taking numbers from newspaper articles, Benford's law reappears. This can also be proven mathematically: if one repeatedly "randomly" chooses a probability distribution (from an uncorrelated set) and then randomly chooses a number according to that distribution, the resulting list of numbers will obey Benford's law.[4][3] Élise Janvresse and Thierry de la Rue from CNRS advanced a similar probabilistic explanation for the appearance of Benford's law in everyday-life numbers, by showing that it arises naturally when one considers mixtures of uniform distributions.[5]
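The mixing effect can be illustrated with a crude stand-in for "choosing a random distribution": below, each number is drawn from a uniform distribution whose upper bound is itself the product of random factors (construction and parameters are my own, in the spirit of the mixture-of-uniforms explanation, not taken from the cited proofs). No individual uniform distribution obeys Benford's law, but the mixture comes close:

```python
import random
from collections import Counter

def leading_digit(x):
    return int(f"{abs(x):e}"[0])

random.seed(1)
values = []
for _ in range(100_000):
    # Pick a "random distribution": uniform on (0, upper) with a random upper
    # bound, then draw one number from it.
    upper = random.uniform(0, 1000) * random.uniform(0, 1000)
    values.append(random.uniform(0, upper))

counts = Counter(leading_digit(v) for v in values if v > 0)
print(counts[1] / len(values))  # roughly 0.30
```

Drawing all 100,000 numbers from a single fixed uniform distribution instead would give a leading-digit histogram far from Benford's law.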

Applications

In 1972, Hal Varian suggested that the law could be used to detect possible fraud in lists of socio-economic data submitted in support of public planning decisions. Based on the plausible assumption that people who make up figures tend to distribute their digits fairly uniformly, a simple comparison of first-digit frequency distribution from the data with the expected distribution according to Benford's law ought to show up any anomalous results.[6]

Following this idea, Mark Nigrini showed that Benford's law could be used in forensic accounting and auditing as an indicator of accounting and expenses fraud.[7] In the United States, evidence based on Benford's law is legally admissible in criminal cases at the federal, state, and local levels.[8]
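A minimal first-digit screen of the kind described above can be sketched as follows. This is not Nigrini's actual methodology; it is an illustrative chi-squared comparison (function name mine) of observed leading-digit counts against the counts Benford's law predicts:

```python
import math
from collections import Counter

def benford_chi2(values):
    """Chi-squared statistic of observed leading-digit counts against
    Benford's law. A large value flags data worth a closer look; this is
    a screen, not proof of fraud."""
    digits = [int(f"{abs(v):e}"[0]) for v in values if v != 0]
    n = len(digits)
    counts = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        chi2 += (counts.get(d, 0) - expected) ** 2 / expected
    return chi2

# Powers of 2 follow Benford's law closely; "round" fabricated figures
# with uniformly spread leading digits do not.
honest = [2 ** n for n in range(1, 500)]
fabricated = [d * 1000 for d in range(1, 10)] * 50
print(benford_chi2(honest), benford_chi2(fabricated))
```

In practice an auditor would compare the statistic against a significance threshold and, as noted later in the article, would usually screen later digits as well.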

Benford's law has been invoked as evidence of fraud in the 2009 Iranian elections.[9] However, other experts consider Benford's law essentially useless as a statistical indicator of election fraud in general.[10][11]


Limitations

Benford's law can only be applied to data that are distributed across multiple orders of magnitude. For instance, one might expect that Benford's law would apply to a list of numbers representing the populations of UK villages beginning with 'A', or representing the values of small insurance claims. But if a "village" is a settlement with population between 300 and 999, or a "small insurance claim" is a claim between $50 and $100, then Benford's law will not apply.[12][13]

Consider the probability distributions shown below, plotted on a log scale.[14] In each case, the total area in red is the relative probability that the first digit is 1, and the total area in blue is the relative probability that the first digit is 8.

A narrow probability distribution on a log scale

For the left distribution, the sizes of the red and blue areas are approximately proportional to the widths of each red and blue bar. Therefore the numbers drawn from this distribution will approximately follow Benford's law. On the other hand, for the right distribution, the ratio of the areas of red and blue is very different from the ratio of the widths of each red and blue bar.

Rather, the relative areas of red and blue are determined more by the heights of the bars than the widths. The heights, unlike the widths, do not satisfy the universal relationship of Benford's law; instead, they are determined entirely by the shape of the distribution in question. Accordingly, the first digits in this distribution do not satisfy Benford's law at all.[13]

Thus, real-world distributions that span several orders of magnitude rather smoothly like the left distribution (e.g. income distributions, or populations of towns and cities) are likely to satisfy Benford's law to a very good approximation. On the other hand, a distribution that covers only one or two orders of magnitude, like the right distribution (e.g. heights of human adults, or IQ scores), is unlikely to satisfy Benford's law well.[12][13]

Distributions that exactly satisfy Benford's law

Some well-known infinite integer sequences provably satisfy Benford's law exactly (in the asymptotic limit as more and more terms of the sequence are included). Among these are the Fibonacci numbers,[15][16] the factorials,[17] the powers of 2,[18][19] and the powers of almost any other number.[18]
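This is easy to check empirically for the powers of 2. The sketch below (parameter N is my own choice) tallies the first digits of 2^n for n = 1..N and compares the frequencies with the Benford probabilities; the agreement tightens as N grows, as the equidistribution theorem predicts for irrational log10(2):

```python
import math
from collections import Counter

# Leading digits of the first N powers of 2, using exact integer arithmetic.
N = 5000
counts = Counter(int(str(2 ** n)[0]) for n in range(1, N + 1))
for d in range(1, 10):
    print(d, counts[d] / N, round(math.log10(1 + 1 / d), 4))
```

Substituting Fibonacci numbers or factorials for the powers of 2 gives the same close agreement.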

Likewise, some continuous processes satisfy Benford's law exactly (in the asymptotic limit as the process continues longer and longer). One is an exponential growth or decay process: if a quantity is exponentially increasing or decreasing in time, then the percentage of time that it has each first digit satisfies Benford's law asymptotically (i.e., more and more accurately as the process continues for more and more time).

Generalization to digits beyond the first

It is possible to extend the law to digits beyond the first.[20] In particular, the probability of encountering a number starting with the string of digits n is given by:

P(n) = log10(1 + 1/n)

(For example, the probability that a number starts with the digits 3, 1, 4 is log10(1 + 1/314) ≈ 0.0014.) This result can be used to find the probability that a particular digit occurs at a given position within a number. For instance, the probability that a "2" is encountered as the second digit is[20]

log10(1 + 1/12) + log10(1 + 1/22) + … + log10(1 + 1/92) ≈ 0.109


And the probability that d (d = 0, 1, …, 9) is encountered as the n-th (n > 1) digit is

∑ log10(1 + 1/(10k + d)), where the sum runs over k = 10^(n−2), …, 10^(n−1) − 1.

The distribution of the n-th digit, as n increases, rapidly approaches a uniform distribution with 10% for each of the ten digits.[20]
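The positional formula above can be evaluated directly. This sketch (function name mine) computes the probability that digit d occupies position n, and shows both the second-digit value quoted above and the rapid flattening toward uniformity:

```python
import math

def p_digit_at_position(d, n):
    """Benford probability that digit d (0-9) occupies position n (n >= 2):
    the sum of log10(1 + 1/(10k + d)) over all leading blocks k of n-1 digits."""
    return sum(
        math.log10(1 + 1 / (10 * k + d))
        for k in range(10 ** (n - 2), 10 ** (n - 1))
    )

second = [p_digit_at_position(d, 2) for d in range(10)]
print(round(second[2], 4))   # P(second digit is 2) ≈ 0.1088

# By the third digit the distribution is already nearly uniform.
third = [p_digit_at_position(d, 3) for d in range(10)]
```

For n = 2 the sum runs over k = 1..9, reproducing the nine-term sum given for the second digit "2".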

In practice, applications of Benford's law for fraud detection routinely use more than the first digit.[7]

Notes

[1] Frank Benford (March 1938). "The law of anomalous numbers". Proceedings of the American Philosophical Society 78 (4): 551–572. JSTOR 984802. (subscription required)
[2] Simon Newcomb (1881). "Note on the frequency of use of the different digits in natural numbers". American Journal of Mathematics 4 (1/4): 39–40. doi:10.2307/2369148. JSTOR 2369148. (subscription required)
[3] Theodore P. Hill (1995). "A Statistical Derivation of the Significant-Digit Law" (http://www.tphill.net/publications/BENFORD PAPERS/statisticalDerivationSigDigitLaw1995.pdf). Statistical Science 10: 354–363.
[4] Theodore P. Hill (July–August 1998). "The first digit phenomenon" (http://www.tphill.net/publications/BENFORD PAPERS/TheFirstDigitPhenomenonAmericanScientist1996.pdf). American Scientist 86: 358.
[5] Élise Janvresse and Thierry de la Rue (2004). "From Uniform Distributions to Benford's Law". Journal of Applied Probability 41: 1203–1210. Preprint: https://www.univ-rouen.fr/LMRS/Persopage/Delarue/Publis/PDF/uniform_distribution_to_Benford_law.pdf
[6] Varian, Hal. "Benford's law". The American Statistician 26: 65.
[7] Mark J. Nigrini (May 1999). "I've Got Your Number" (http://www.journalofaccountancy.com/Issues/1999/May/nigrini). Journal of Accountancy.
[8] "From Benford to Erdös" (http://www.wnyc.org/shows/radiolab/episodes/2009/10/09/segments/137643). Radio Lab. 2009-09-30.
[9] Stephen Battersby. "Statistics hint at fraud in Iranian election" (http://www.newscientist.com/article/mg20227144.000-statistics-hint-at-fraud-in-iranian-election.html). New Scientist, 24 June 2009.
[10] Joseph Deckert, Mikhail Myagkov and Peter C. Ordeshook (2010). "The Irrelevance of Benford's Law for Detecting Fraud in Elections" (http://www.vote.caltech.edu/drupal/node/327). Caltech/MIT Voting Technology Project Working Paper No. 9.
[11] Charles R. Tolle, Joanne L. Budzien, and Randall A. LaViolette (2000). "Do dynamical systems follow Benford's law?" Chaos 10 (2): 331–336. doi:10.1063/1.166498.
[12] See http://www.dspguide.com/ch34.htm, in particular http://www.dspguide.com/ch34/10.htm.
[13] Fewster, R. M. (2009). "A simple explanation of Benford's Law". The American Statistician 63 (1): 26–32. doi:10.1198/tast.2009.0005.
[14] Note that if you have a regular probability distribution (on a linear scale), you have to multiply it by a certain function to get a proper probability distribution on a log scale: the log scale distorts the horizontal distances, so the height has to be changed also, in order for the area under each section of the curve to remain true to the original distribution. See, for example, http://www.dspguide.com/ch34/4.htm.
[15] L. C. Washington, "Benford's Law for Fibonacci and Lucas Numbers", The Fibonacci Quarterly 19.2 (1981): 175–177.
[16] R. L. Duncan, "An Application of Uniform Distribution to the Fibonacci Numbers", The Fibonacci Quarterly 5 (1967): 137–140.
[17] P. B. Sarkar, "An Observation on the Significant Digits of Binomial Coefficients and Factorials", Sankhya B 35 (1973): 363–364.
[18] In general, the sequence k^1, k^2, k^3, etc., satisfies Benford's law exactly, under the condition that log10 k is an irrational number. This is a straightforward consequence of the equidistribution theorem.
[19] That the first 100 powers of 2 approximately satisfy Benford's law is mentioned by Ralph Raimi. Ralph A. Raimi, "The First Digit Problem", American Mathematical Monthly 83 (7) (August–September 1976): 521–538.
[20] Theodore P. Hill, "The Significant-Digit Phenomenon", The American Mathematical Monthly 102 (4) (April 1995): 322–327. Official link (subscription required): http://www.jstor.org/stable/2974952. Free copy: http://www.math.gatech.edu/~hill/publications/cv.dir/digit.pdf



External links

General audience
• Benford Online Bibliography (http://www.benfordonline.net), an online bibliographic database on Benford's Law.
• Following Benford's Law, or Looking Out for No. 1 (http://www.rexswain.com/benford.html), 1998 article from The New York Times.
• A further five numbers: number 1 and Benford's law (http://www.bbc.co.uk/radio4/science/further5.shtml), BBC radio segment by Simon Singh.
• From Benford to Erdös (http://www.wnyc.org/shows/radiolab/episodes/2009/10/09/segments/137643), radio segment from the Radiolab program.
• Looking out for number one (http://plus.maths.org/issue9/features/benford/index-gifd.html) by Jon Walthoe, Robert Hunt and Mike Pearson, Plus Magazine, September 1999.
• Video showing Benford's Law applied to web data, including Minnesota lakes, US Census data and Digg statistics (http://www.kirix.com/blog/2008/07/22/fun-and-fraud-detection-with-benfords-law/).
• An illustration of Benford's Law (http://www.mpi-inf.mpg.de/~fietzke/benford.html), showing how first-digit distributions of various sequences evolve over time; interactive.
• Testing Benford's Law (http://testingbenfordslaw.com), an open source project showing Benford's Law in action against publicly available datasets.

More mathematical
• Weisstein, Eric W., "Benford's Law" (http://mathworld.wolfram.com/BenfordsLaw.html) from MathWorld.
• Benford's law, Zipf's law, and the Pareto distribution (http://terrytao.wordpress.com/2009/07/03/benfords-law-zipfs-law-and-the-pareto-distribution/) by Terence Tao.
• Country Data and Benford's Law (http://demonstrations.wolfram.com/CountryDataAndBenfordsLaw/) and Benford's Law from Ratios of Random Numbers (http://demonstrations.wolfram.com/BenfordsLawFromRatiosOfRandomNumbers/) at the Wolfram Demonstrations Project.
• Benford's Law Solved with Digital Signal Processing (http://www.dspguide.com/CH34.PDF).


Chapter 34: Explaining Benford's Law

Digital Signal Processing usually involves signals with either time or space as the independent parameter, such as audio and images, respectively. However, the power of DSP can also be applied to signals represented in other domains. This chapter provides an example of this, where the independent parameter is the number line. The particular example we will use is Benford's Law, a mathematical puzzle that has caused people to scratch their heads for decades. The techniques of signal processing provide an elegant solution to this problem, succeeding where other mathematical approaches have failed.

Frank Benford’s DiscoveryFrank Benford was a research physicist at General Electric in the 1930swhen he noticed something unusual about a book of logarithmic tables.The first pages showed more wear than the last pages, indicating thatnumbers beginning with the digit 1 were being looked up more often thannumbers beginning with 2 through 9. Benford seized upon this idea andspent years collecting data to show that this pattern was widespread innature. In 1938, Benford published his results, citing more than 20,000values such as atomic weights, numbers in magazine articles, baseballstatistics, and the areas of rivers.

This pattern of numbers is unexpected and counterintuitive. In fact, many do not believe it is real until they conduct an experiment for themselves. I didn't! For instance, go through several pages of today's newspaper and examine the leading digit of each number. That is, start from the left of each number and ignore the sign, the decimal point and any zeros. The first digit you come to, between 1 and 9, is the leading digit. For example, 3 is the leading digit of 37.3447, and 6 is the leading digit of -0.06345. Since there are nine possible digits, you would expect that one-ninth (11.11%) of the numbers would have 1 in the leading digit position. However, this is not what you will find: about 30.1% of the numbers will start with 1. It gets even stranger from here.


FIGURE 34-1. Frank Albert Benford, Jr. (1883–1948) was an American electrical engineer and physicist. In 1938 he published a paper entitled "The Law of Anomalous Numbers." This is now commonly called Benford's Law.

Figure 34-2 shows two examples of Benford's law. The histogram on the left is for 14,414 numbers taken from the income tax returns of U.S. corporations. The pattern here is obvious and very repeatable. The leading digit in these numbers is a 1 about 30.1% of the time, a 2 about 17.6% of the time, and so on. Mathematicians immediately recognize that these values correspond to the spacing on the logarithmic number line. That is, the distance between 1 and 2 on the log scale is log(2) - log(1) = 0.301. The distance between 2 and 3 is log(3) - log(2) = 0.176, and so on. Benford showed us that this logarithmic pattern of leading digits is extremely common in nature and human activities. In fact, even the physical constants of the universe follow this pattern: just look at the tables in a physics textbook.

On the other hand, not all sets of numbers follow Benford's law. For example, the histogram in Fig. 34-2b was generated by taking a large number of samples from a computer random number generator. These particular numbers follow a normal distribution with a mean of five and a standard deviation of three. Changing any of these parameters will drastically change the shape of this histogram, with little apparent rhyme or reason. Obviously, these numbers do not follow the logarithmic leading-digit distribution. Likewise, most of the common distributions you learned about in statistics classes do not follow Benford's law. One of the primary mysteries of Benford's law has been this seemingly unpredictable behavior. Why does one set of numbers follow the logarithmic pattern, while another set of numbers does not?

As if this wasn’t mysterious enough, Benford’s law has another propertythat is certain to keep you up at night. Figure 34-2a was created fromnumbers that appear in U.S. tax returns, and therefore each of thesenumbers is a dollar value. But what is so special about the U.S. dollar?Suppose that you are a financial expert in India and want to examine thisset of data. To make it easier you convert all of the dollar values toIndian rupees by multiplying each number by the current conversion rate.It is likely that the leading digit of all 14,414 numbers will be changed

Page 11: Benford's Law

Chapter 34- Explaining Benford’s Law 703

FIGURE 34-2Two examples of leading-digit histograms. The left figure shows the leading-digit distribution for14,414 numbers taken from U.S. Federal income tax returns. The figure on the right is for numbersproduced by a computer random number generator (RNG). This shows one of the longstandingmysteries of Benford’s law– Why do some sets of numbers follow the law (such as tax returns), whileothers (such as this RNG) do not? Many have claimed that this is some sort of secret code hidden inthe fabric of Nature.

by this conversion. Nevertheless, about 30.1% of the converted numberswill still have a leading digit of 1. In other words, if a set of numbersfollows Benford’s law, multiplying the numbers by any possible constantwill create another set of numbers that also follows Benford’s law. Asystem that remains unchanged when multiplied by a constant is calledscale invariant. Specifically, groups of numbers that follow Benford’slaw are scale invariant. Likewise, groups of numbers that do not followBenford’s law are not. For instance, this procedure would scramble theshape of the histogram in Fig. 34-2b.

Now suppose that this tax return data is being examined by an alien from another planet. Since he has eight fingers, he converts all of his numbers to base 8. Like before, most or all of the leading digits will change in this procedure. In spite of this, the new group of numbers also follows Benford's law (taking into account that there are no 8's or 9's in base 8). This property is called base invariance. In general, if a group of numbers follows Benford's law in one base, it will also follow Benford's law if converted to another base. However, there are some exceptions to this that we will look at later.


The Scientist and Engineer's Guide to Digital Signal Processing, page 704

What does this all mean? Over the last seven decades Benford's law has achieved almost a cult following. It has been widely claimed to be evidence of some mysterious or paranormal property of our universe. For instance, Benford himself tried to connect the mathematics with Nature, claiming that mere Man counts arithmetically, 1, 2, 3, 4..., while Nature counts e^0, e^x, e^2x, e^3x, and so on. In another popular version, suppose that nature contains some underlying and universal distribution of numbers. Since it is universal, it should look the same regardless of how we choose to examine it. In particular, it should not make any difference what units we associate with the numbers. The distribution should appear the same if we express it in dollars or rupees, feet or meters, Fahrenheit or Celsius, and so on. Likewise, the appearance should not change when we examine the numbers in different bases. It has been mathematically proven that the logarithmic leading-digit pattern is the only distribution that fulfills these invariance requirements. Therefore, if there is an underlying universal distribution, Benford's law must be it. Based on this logic, it is very common to hear that Benford's law only applies to numbers that have units associated with them. On the other end of the spectrum, crackpots abound that associate Benford's law with psychic and other paranormal claims.

Don't waste your time trying to understand the above ideas; they are completely on the wrong track. There is no "universal distribution" and this phenomenon is unrelated to "units." In the end, we will find that Benford's law looks more like a well-executed magic trick than a hidden property of the universe.

Homomorphic Processing

Enjoy learning about Benford's law, but don't lose sight of the purpose of this chapter. Focus on the overall method:

“If the tool you have is a hammer, make the problem look like a nail.”

In DSP this approach is called homomorphic processing, meaning "the same structure." In science and engineering it is common to encounter signals that are difficult to understand or analyze. The strategy of homomorphic processing is to convert this unmanageable situation into a conventional linear system, where the analysis techniques are well understood. This is done by applying whatever mathematical transforms or tricks are needed for the particular application.

For instance, the classic use of homomorphic processing is to separate signals that have been multiplied, such as: a(t) = b(t) × c(t). This can be converted into a linear system, i.e., signals that are added together, by taking the logarithm: log[a(t)] = log[b(t)] + log[c(t)]. Notice that this is taking the log of the dependent parameter. In our analysis of Benford's law we will take the log of the independent parameter. Two different


techniques to keep in your bag of DSP tricks. In the next section several other tricks will be presented, such as inventing the Ones Scaling Test, and evoking a sampling function.

If this sounds complicated, you're right; it certainly can be. There is no guarantee that it is even possible to convert an arbitrary problem into the form of a linear system. Even if it is possible, it may require a series of nasty steps that take considerable time to develop. However, if you are successful in applying the homomorphic approach, the rewards will immediately flow. You can say goodbye to a difficult problem, and hello to a representation that is simple and straightforward.
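The classic multiplied-signals example can be sketched in a few lines. The two component signals below are arbitrary, chosen only to be positive so the logarithm is defined everywhere.

```python
import math

# Two positive signals multiplied together: a slow envelope times a fast ripple.
b = [1.0 + 0.5 * math.sin(2 * math.pi * t / 100) for t in range(200)]
c = [2.0 + math.cos(2 * math.pi * t / 10) for t in range(200)]
a = [bi * ci for bi, ci in zip(b, c)]          # a(t) = b(t) x c(t)

# Taking the log of the dependent parameter turns the product into a sum,
# i.e., a linear system that standard techniques can operate on.
log_a = [math.log(ai) for ai in a]
log_b_plus_log_c = [math.log(bi) + math.log(ci) for bi, ci in zip(b, c)]

print(max(abs(x - y) for x, y in zip(log_a, log_b_plus_log_c)))   # ~0
```

Once in log form, the two components can be separated by ordinary linear filtering, since the envelope and the ripple occupy different frequency ranges.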

The following analysis of Benford's law is conducted in three steps. In step one we will define a statistical procedure for determining how well a set of numbers follows Benford's law, called the Ones Scaling Test. In step two we will move from statistics to probability, expressing the problem in the form of a convolution. In step three we use the Fourier transform to solve the convolution, giving us the explanation we are looking for.

The Ones Scaling Test

Given a set of numbers, the simplest test for Benford's law is to count how many of the numbers have 1 as the leading digit. This fraction will be about 0.301 if Benford's law is being followed. However, even finding this value exactly is not sufficient to conclude that the numbers are obeying the law. For instance, the set might have 30.1% of the numbers with a value of 1.00, and 69.9% with a value of 2.00. We can overcome this problem by including a test for scale invariance. That is, we multiply each number in the set by some constant, and then recount how many numbers have 1 as their leading digit. If Benford's law is truly being followed, the percentage of numbers beginning with the digit 1 will remain about 30.1%, regardless of the constant we use.

A computer program can make this procedure more systematic, such as the example in Table 34-1. This program loops through the evaluation 696 times, with each loop multiplying all numbers in the group by 1.01. On the first loop each of the original numbers will be multiplied by 1.01. On the second loop each number will be multiplied by 1.01 again, in addition to the multiplication that took place on the first loop. By the time we reach the 80th loop, each number will have been multiplied by 1.01 a total of 80 times. Therefore, the numbers on the 80th loop are the same as multiplying each of the original numbers by 1.01^80, or 2.217. At the completion of the program the numbers will have been multiplied 696 times, equivalent to multiplying the original numbers by a constant of 1.01^696 ≈ 1,000. In other words, this computer program systematically scales the data in small increments over about three orders of magnitude.

The fraction of numbers having 1 as the leading digit is tallied on each of these 696 steps and stored in an array, which we will call the Ones Scaling Test. Figure 34-3 shows the values in this array for the two


FIGURE 34-3
The Ones Scaling Test for the examples in Fig. 34-2. The Ones Scaling Test determines the fraction of numbers having a leading digit of one, as the set of numbers is repeatedly multiplied by a constant slightly greater than unity, such as 1.01. If the set of numbers follows Benford's law, the fraction will remain close to 0.301, as shown in (a). The fraction departing from 0.301 proves that the numbers do not follow Benford's law, such as in (b).

examples in Fig. 34-2. As expected, the Ones Scaling Test for the income tax numbers is a relatively constant value around 30.1%, proving that it follows Benford's law very closely. As also expected, the Ones Scaling Test for the random number generator shows wild fluctuations, as high as 51% and as low as 12%.

An important point to notice in Fig. 34-3 is that the Ones Scaling Test is periodic, repeating itself when the multiplication constant reaches a factor of ten. In this example the period is 232 entries in the array, since 1.01^232 ≈ 10. Say you start with the number 3.12345 and multiply it by 10 to get 31.2345. These two numbers, 3.12345 and 31.2345, are exactly the same when you are only concerned with the leading digit, and the entire pattern repeats.

Pay particular attention to the operations in lines 400 to 430 of Table 34-1. This is where the program determines the leading digit of the number being evaluated. In line 310, one of the 10,000 numbers being tested is transferred to the variable, TESTX. The leading digit of TESTX, eventually held in the variable LD, is calculated in four steps. In line 400 we eliminate the sign of the number by taking the absolute value. Lines 410 and 420 repeatedly multiply or divide the number by a factor of ten, as needed, until the number is between 1 and 9.999999. For instance, line 410 tests the number for being less than 1. If it is, the number is multiplied by 10, and the line is repeated. When the number finally equals or exceeds 1, the program moves to the next line. In line 430 we extract the integer portion of the number, which is the leading digit. Make sure you understand these steps; they are key to understanding what is really going on in Benford's law.


100 ' INVESTIGATING BENFORD'S LAW: THE ONES SCALING TEST
110 '
120 '                          'DIMENSION THE ARRAYS
130 DIM OST(696)               'The "Ones Scaling Test" array.
140 DIM X(9999)                'The 10,000 numbers being tested.
150 '
160 FOR I = 0 TO 9999          'GENERATE 10,000 NUMBERS FOR TESTING
170   X(I) = RND               'RND returns a random number uniformly
180 NEXT I                     'distributed between 0 and 1.
190 '
200 '                          'CALCULATE THE ONES SCALING TEST ARRAY
210 FOR K = 0 TO 696           'Loop for each entry in the OST array.
220   NRONES = 0               'NRONES counts how many leading digits are one.
230 '
300 FOR I = 0 TO 9999          'Loop through all 10,000 numbers being tested.
310   TESTX = X(I)             'Load number being tested into variable, TESTX.
320 '
330 '                          'Find the leading digit, LD, of TESTX.
400   TESTX = ABS(TESTX)
410   IF TESTX < 1 THEN TESTX = TESTX * 10: GOTO 410
420   IF TESTX >= 10 THEN TESTX = TESTX / 10: GOTO 420
430   LD = INT(TESTX)
440 '
500 '                          'If leading digit is 1, increment counter.
510   IF LD = 1 THEN NRONES = NRONES + 1
520 NEXT I
530 '
540 OST(K) = NRONES / 10000    'Store the calculated fraction in the array.
550 '
600 FOR I = 0 TO 9999          'Multiply test numbers by 1.01, for next loop.
610   X(I) = X(I) * 1.01
620 NEXT I
630 '
700 NEXT K
710 '                          'The Ones Scaling Test now resides in OST( ).

TABLE 34-1
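For readers who don't run BASIC, lines 400-430 of the table translate directly into Python. This is a sketch of just the leading-digit extraction; like the original, it ignores the case of an input of exactly zero.

```python
def leading_digit(testx):
    """Python translation of lines 400-430 of Table 34-1."""
    testx = abs(testx)          # line 400: eliminate the sign
    while testx < 1:            # line 410: too small -- multiply by ten
        testx *= 10
    while testx >= 10:          # line 420: too large -- divide by ten
        testx /= 10
    return int(testx)           # line 430: integer portion is the leading digit

print(leading_digit(0.00314))   # prints 3
print(leading_digit(-271.8))    # prints 2
```

Note that the function never looks at the digits directly; it repeatedly rescales the number by powers of ten, which is exactly the hidden manipulation discussed later in the chapter.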

Writing Benford's Law as a Convolution

The previous section describes the Ones Scaling Test in terms of statistics, i.e., the analysis of actual numbers. Our task now is to rewrite this test in terms of probability, the underlying mathematics that govern how the numbers are generated.

As discussed in Chapter 2, the mathematical description of a process that generates numbers is called the probability density function, or pdf. In general, there are two ways that the shape of a particular pdf can be known. First, we can understand the physical process that generates the numbers. For instance, the random number generator of a computer falls in this category. We know what the pdf is, because it was specifically designed to have this pdf by the programmer that developed the routine.


EQUATION 34-1
Correction needed when converting a pdf from the linear to the base ten logarithmic number line: multiply the point-to-point mapped curve by ln(10) · 10^g.

Second, we can estimate the pdf by examining the generated values. The income tax return numbers are an example of this. It seems unlikely that anyone could mathematically understand or predict the pdf of these numbers; the processes involved are just too complicated. However, we can take a large group of these numbers and form a histogram of their values. This histogram gives us an estimate of the underlying pdf, but isn't exact because of random statistical variations. As the number of samples in the histogram becomes larger, and the width of the bins is made smaller, the accuracy of the estimate becomes better.

The statistical version of the Ones Scaling Test analyzes a group of numbers. Moving into the world of probability, we will replace this group of numbers with its probability density function. The pdf we will use as an example is shown in Fig. 34-4a. The mathematical name we will give this example curve is pdf(g). However, there is an important catch here; we are representing this probability density function along the base-ten logarithmic number line, rather than the conventional linear number line. The position along the logarithmic axis will be denoted by the variable, g. For instance, g = -2 corresponds to a value of 0.01 on the linear scale, since log(0.01) = -2. Likewise, g = 0 corresponds to 1, g = 1 corresponds to 10, and so on.

Many science and engineering graphs are presented with a logarithmic x-axis, so this probably isn't a new concept for you. However, a special problem arises when converting a probability density function from the linear to the logarithmic number line. The usual way of moving between these domains is simple point-to-point mapping. That is, whatever value is at 0.01 on the linear scale becomes the value at -2 on the log scale; whatever value is at 10 on the linear scale becomes the value at 1 on the log scale, and so on. However, the pdf has a special property that must be taken into account. For instance, suppose we know the shape of a pdf and want to determine how many of the numbers it generates are greater than 3 but less than 4. From basic probability, this fraction is equal to the area under the curve between the values of 3 and 4. Now look at what happens in a point-to-point mapping. The locations of 3 and 4 on the linear scale become log(3) = 0.477 and log(4) = 0.602, respectively, on the log scale. That is, the distance between the two points is 1.00 on the linear scale, but only 0.125 on the logarithmic number line. This changes the area under the curve between the two points, which is simply not acceptable for a pdf.

Fortunately, this is quite simple to correct. First, transfer the pdf from the linear scale to the log scale by using a point-to-point mapping. Second, multiply this mapped curve by an exponential function to correct the area problem, as given by Eq. 34-1.
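A quick numerical check of this correction: from the change of variables x = 10^g, the correcting function is ln(10) · 10^g. The sketch below (an illustration, not from the book) maps a uniform pdf from the linear to the log scale and verifies that the corrected curve still has unit area.

```python
import math

def pdf_linear(x):
    # Example pdf: uniform between 1 and 10, so its area on the linear scale is 1.
    return 1.0 / 9.0 if 1.0 <= x <= 10.0 else 0.0

def pdf_log(g):
    # Point-to-point mapping, times the ln(10) * 10**g area correction.
    return pdf_linear(10.0 ** g) * math.log(10.0) * 10.0 ** g

# Trapezoid rule over g = 0 to 1 (where this pdf lives on the log scale):
n = 100_000
h = 1.0 / n
area = (pdf_log(0.0) + pdf_log(1.0)) / 2 * h + sum(pdf_log(i * h) for i in range(1, n)) * h
print(round(area, 4))   # 1.0: still a valid probability density
```

Without the multiplying factor, the same integral would come out to roughly 0.43 instead of 1, which is the area distortion described above.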


There is also another way to look at this issue. A histogram is created for a group of numbers by breaking the linear number line into equally spaced bins. But how would this histogram be created on the logarithmic scale? There are two choices. First, you could calculate the histogram on the linear scale, and then transfer the value of the bins to the log scale. However, the equally spaced bins on the linear scale become unequally spaced on the log scale, and Eq. 34-1 would be needed as a correction. Second, you could break the logarithmic number line into equally spaced bins, and directly fill up these bins with the data. This procedure accurately estimates the pdf on the log scale without any additional corrections.

Now back to Fig. 34-4a. The example shown is a Gaussian (normal) curve with a mean of -0.25 and a standard deviation of 0.25, measured on the base ten logarithmic number line. Since it is a normal distribution when displayed on the logarithmic scale, it is given the special name: log-normal. When this pdf is displayed on the linear scale it looks entirely different, as we will see shortly. About 95% of the numbers generated from a normal distribution lie within +/- 2 standard deviations of the mean, or in this example, from -0.75 to 0.25 on the log scale. Converting back to the linear scale, this particular random process will generate 95% of its samples between 10^-0.75 and 10^0.25, that is, between 0.178 and 1.778.

The important point is that this is a single process that generates numbers, but we can look at those numbers on either the linear or the logarithmic scale. For instance, on the linear scale the numbers might look like: 1.2034, 0.3456, 0.9643, 1.8567, and so on. On the log scale these same numbers would be log(1.2034) = 0.0804, -0.4614, -0.0158, 0.2687, respectively. When we ask if this distribution follows Benford's law, we are referring to the numbers on the linear scale. That is, we are looking at the leading digits of 1.2034, 0.3456, 0.9643, 1.8567, etc. However, to understand why Benford's law is being followed or not followed, we will find it necessary to work with their logarithmic counterparts.

The next step is to determine what fraction of samples produced by this pdf have 1 as their leading digit. On the linear number line there are only certain regions where a leading digit of 1 is produced, such as: 0.1 to 0.199999; 1 to 1.99999; 10 to 19.9999; and so on. The corresponding locations on the base ten log scale are: -1.000 to -0.699; 0.000 to 0.301; and 1.000 to 1.301, respectively. In Fig. 34-4b these regions have been marked with a value of one, while all other sections of the logarithmic number line are given a value of zero. This allows the waveform in Fig. (b) to be used as a sampling function, and therefore we will call it, sf(g).

Here is how it works. We multiply pdf(g) by sf(g) and display the result in Fig. (c). As shown, this isolates those sections of the pdf where 1 is the leading digit. We then find the total area of these regions by integrating from negative to positive infinity. Now you can see one


EQUATION 34-2
Calculating the Ones Scaling Test from the probability density function, by use of a scaling function. This equation also appears in Fig. 34-4i.

reason this analysis is carried out on the logarithmic number line: the sampling function is a simple periodic pattern of pulses. In comparison, think about how this sampling function would appear on the linear scale: far too complicated to even consider.

The above procedure is expressed by the equation in (d), which calculates the fraction of numbers produced by the distribution with 1 as the leading digit. However, as before, even if this number is exactly 0.301, it would not be conclusive proof that the pdf follows Benford's law. To show this we must conduct the Ones Scaling Test. That is, we will adjust pdf(g) such that the numbers it produces are multiplied by a constant that is slightly above unity. We then recalculate the fraction of ones in the leading digit position, and repeat the process many times.

Here we find a second reason to use the logarithmic scale: multiplication on the linear number line becomes addition in the logarithmic domain. On the linear scale we calculate: n × 1.01, while on the logarithmic scale this becomes: log(n) + log(1.01). In other words, on the logarithmic number line we scale the distribution by adding a small constant to each number that is produced. This has the effect of shifting the entire pdf(g) curve to the right a small distance, which we represent by the variable, s. This is shown in Fig. (f). Mathematically, shifting the signal pdf(g) to the right a distance, s, is written pdf(g-s).

The sampling function in Fig. (g) is the same as before; however, it now isolates a different section of the pdf, shown in (h). The integration also goes on as before, with the addition of the shift, s, represented in the equation. In short, we have derived an equation that provides the probability that a number produced by pdf(g) will have 1 in the leading digit position, for any scaling factor, s. As before, we will call this the Ones Scaling Test, and denote it by ost(s). This equation is given in (i), and reprinted below:

ost(s) = ∫ pdf(g - s) · sf(g) dg

where the integral runs from negative to positive infinity.
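The shift-multiply-integrate procedure can be evaluated numerically. The sketch below (an illustration; the grid limits and step size are arbitrary choices) uses the log-normal example of Fig. 34-4a, a Gaussian on the log axis with mean -0.25 and standard deviation 0.25.

```python
import math

LOG2 = math.log10(2.0)   # 0.30103, width of each "leading digit is 1" region

def sf(g):
    # Sampling function: one where the linear-scale leading digit is 1.
    return 1.0 if (g % 1.0) < LOG2 else 0.0

def pdf(g, mean=-0.25, sigma=0.25):
    # Log-normal example of Fig. 34-4a: a Gaussian on the log number line.
    return math.exp(-(g - mean) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def ost(s, lo=-4.0, hi=4.0, n=8000):
    # Shift, multiply, integrate (rectangle rule).
    h = (hi - lo) / n
    return sum(pdf(lo + i * h - s) * sf(lo + i * h) for i in range(n)) * h

# This narrow pdf does not follow Benford's law: ost(s) swings with s
# instead of sitting at 0.301 ...
print(round(ost(0.0), 2), round(ost(0.5), 2))
# ... yet its average over one full period is still close to 0.301:
print(round(sum(ost(k / 50) for k in range(50)) / 50, 2))
```

The second print foreshadows Mystery #1: no matter how badly a distribution fails the test, the average of ost(s) over one decade of scaling remains log(2) - log(1).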

The signal ost(s) is nothing more than a continuous version of the graphs shown in Fig. 34-3. If pdf(g) follows Benford's law, then ost(s) will be approximately a constant value of 0.301. If ost(s) deviates from this key value, Benford's law is not being followed. For instance, we can easily see from Fig. (e) that the example pdf in (a) does not follow the law.

These last steps and Eq. 34-2 should look very familiar: shift, multiply, integrate. That's convolution! Comparing Eq. 34-2 with the definition


FIGURE 34-4
Expressing Benford's law as a convolution. Figures a-e show how to calculate the probability that a sample produced by pdf(g) will have a leading digit of 1. Figures f-i extend this calculation into the complete Ones Scaling Test. This shows that the Ones Scaling Test, ost(g), is equal to the convolution of the probability density function, pdf(g), and the scaling function, sf(g).

EQUATION 34-3
Benford's law written as a convolution: ost(g) = sf(g) * pdf(-g), where * denotes convolution. The negative sign in pdf(-g) is an artifact of how the equation is derived and is not important.

of convolution (Eq. 13-1 in chapter 13), we have succeeded in expressing Benford's law as the straightforward linear system of Eq. 34-3.

There are two small issues that need to be mentioned in this equation. First, the negative sign in pdf(-g). As you recall, convolution requires that one of the two original signals be flipped left-for-right before the shift, multiply, integrate operations. This is needed for convolution to properly represent linear system theory. On the other hand, this flip is not needed in examining Benford's law; it's just a nuisance. Nevertheless, we need to account for it somewhere. In Eq. 34-3 we account for it by


EQUATION 34-4
The Fourier transform converts the difficult operation of convolution into a simple multiplication: OST(f) = SF(f) × PDF*(f).

pre-flipping pdf(g) by making it pdf(-g). This pre-flip cancels the flip inherent in convolution, keeping the math straight. However, the whole issue of using pdf(-g) instead of pdf(g) is unimportant for Benford's law; it disappears completely in the next step.

The second small issue is a signal processing notation, the elimination of the variable, s. In Fig. 34-4 we write pdf(g) and sf(g), meaning that these two signals have the logarithmic number line as their independent variable, g. However, the Ones Scaling Test is written ost(s), where s is a shift along the logarithmic number line. This distinction between g and s is needed in the derivation to understand how the three signals are related. However, when we get to the shorthand notation of Eq. 34-3, we eliminate s by changing ost(s) to ost(g). This places the three signals, pdf(g), sf(g) and ost(g) all on equal footing, each running along the logarithmic number line.

Solving in the Frequency Domain

Figure 34-5 is what we have been working toward, a systematic way of understanding the operation of Benford's law. The left three signals, the logarithmic domain, are pdf(g), sf(g) and ost(g). The particular examples in this figure are the same ones we used previously (i.e., Fig. 34-4). These three signals are related by convolution (Eq. 34-3), a mathematical operation that is not especially easy to deal with. To overcome this we move the problem into the frequency domain by taking the Fourier transform of each signal. Using standard DSP notation, we will represent the Fourier transforms of pdf(g), sf(g), and ost(g), as PDF(f), SF(f), and OST(f), respectively. These are shown on the right side of Fig. 34-5.

By moving the problem into the frequency domain we replace the difficult operation of convolution with the simple operation of multiplication. That is, the six signals in Fig. 34-5 are related by Eq. 34-4.

A small detail: The Fourier transform of pdf(g) is PDF(f), while the Fourier transform of pdf(-g) is PDF*(f). The star in PDF*(f) means it is the complex conjugate of PDF(f), indicating that all of the phase values are changed in sign. However, notice that Fig. 34-5 only shows the magnitudes; we are completely ignoring the phases. The reason for this is simple: the phase does not contain information we are interested in for this particular problem. This makes it unimportant if we use pdf(g) vs. pdf(-g), or PDF(f) vs. PDF*(f).
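The multiplication relationship can be verified with a discrete simulation. Below is a sketch using NumPy (the grid size and limits are arbitrary choices); the circular cross-correlation computed through the FFT reproduces the direct shift-multiply-integrate result.

```python
import numpy as np

h = 0.001
g = np.arange(-5.0, 5.0, h)                      # the logarithmic number line
pdf = np.exp(-(g + 0.25) ** 2 / (2 * 0.25 ** 2)) / (0.25 * np.sqrt(2 * np.pi))
sf = ((g % 1.0) < np.log10(2.0)).astype(float)   # one where the leading digit is 1

# Frequency domain: convolution becomes multiplication, OST(f) = SF(f) x PDF*(f).
ost = np.fft.ifft(np.fft.fft(sf) * np.conj(np.fft.fft(pdf))).real * h

# ost[0] is the fraction of ones with no scaling; check it against
# integrating pdf(g) x sf(g) directly:
direct = np.sum(sf * pdf) * h
print(round(ost[0], 3), round(direct, 3))   # identical values
# The average of ost over all shifts is log(2) - log(1):
print(round(ost.mean(), 2))
```

Taking the conjugate of the pdf's transform is the frequency-domain counterpart of the pre-flip pdf(-g) discussed above; as noted, it has no effect on the magnitudes.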


Notice how these signals represent the key components of Benford's law. First, there is a group of numbers or a probability density function that can generate a group of numbers. This is represented by pdf(g) and PDF(f). Second, we modify each number in this group or distribution by taking its leading digit. This action is represented by convolving pdf(g) with sf(g), or by multiplying PDF(f) by SF(f). Third, we observe that the leading digits often have an unusual property. This unusual characteristic is seen in ost(g) and OST(f).

All six of these signals have specific characteristics that are fixed by the definition of the problem. For instance, the value at f=0 in the frequency domain always corresponds to the average value of the signal in the logarithmic domain. In particular, this means that PDF(0) will always be equal to one, since the area under pdf(g) is unity. In this example we are using a Gaussian curve for pdf(g). One of the interesting properties of the Gaussian is that its Fourier transform is also a Gaussian, one-sided in this case, as shown in Fig. (d). These are related by σf = 1/(2πσg).

Since sf(g) is periodic with a period of one, SF(f) consists of a series of spikes at f = 0, 1, 2, 3, ..., with all other values being zero. This is a standard transform pair, given by Fig. 13-10 in chapter 13. The zeroth spike, SF(0), is the average value of sf(g). This is equal to the fraction of the time that the signal is in the high state, or log(2) - log(1) = 0.301. The remaining spikes have amplitudes: SF(1) = 0.516, SF(2) = 0.302, SF(3) = 0.064, and so on, as calculated from the above reference.
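These spike amplitudes can be reproduced from the standard Fourier series of a rectangular pulse train with unit period and duty cycle d = log(2); the one-sided amplitude of harmonic n is |2 sin(πnd)/(πn)|, which is assumed here to be the form of the referenced transform pair.

```python
import math

d = math.log10(2.0)   # duty cycle of sf(g): fraction of time in the high state

def SF(n):
    # One-sided Fourier series amplitude of a unit-period pulse train.
    if n == 0:
        return d                                      # the average value
    return abs(2.0 * math.sin(math.pi * n * d) / (math.pi * n))

for n in range(4):
    print(n, round(SF(n), 3))
# 0 0.301
# 1 0.516
# 2 0.302
# 3 0.064
```

The computed values match the amplitudes quoted in the text.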

Lastly we come to ost(g) and OST(f). If Benford's law is being followed, ost(g) will be a flat line with a value of 0.301. This corresponds to OST(0) = 0.301, with all other values in OST(f) being zero. However, if Benford's law is not being followed, then ost(g) will be periodic with a period of one, as shown in Fig. (c). Therefore, OST(f) will be a series of spikes at f = 0, 1, 2, 3, ..., with the space between being zero.

Solving Mystery #1

There are two main mysteries in Benford's law. The first is this: Where does the logarithmic pattern of leading digits come from? Is it some hidden property of Nature? We know that ost(g) is a constant value of 0.301 if Benford's law is being followed. Using Fig. 34-5 we can find where this number originates. By definition, the average value of ost(g) is OST(0); likewise, the average value of sf(g) is SF(0). However, OST(0) is always equal to SF(0), since PDF(0) has a constant value of one. That is, the average value of ost(g) is equal to the average value of sf(g), and does not depend on the characteristics of pdf(g). As shown above, the average value of sf(g) is log(2) - log(1) = 0.301, which dictates that the average value of ost(g) is also 0.301. If we repeated this procedure looking for 2 as the leading digit, the average value of sf(g) would be log(3) - log(2) = 0.176. The remaining digits, 3-9, are handled in the same way. In answer to our question, the logarithmic pattern of leading digits derives


solely from sf(g) and the convolution, and not at all from pdf(g). In short, the logarithmic pattern of leading digits comes from the manipulation of the data, and has nothing to do with patterns in the numbers being investigated.

This result can be understood in a simple way, showing how Benford's law resembles a magician's sleight of hand. Say you tabulate a list of numbers appearing in a newspaper. You tally the histogram of leading digits and find that they follow the logarithmic pattern. You then wonder how this pattern could be hidden in the numbers. The key to this is realizing that something has been concealed: a big something.

Recall the program in Table 34-1, where lines 400-430 extract the leading digit of each number. This is done by multiplying or dividing each number repeatedly by a factor of ten until it is between 1 and 9.999999. This manipulation of the data is far from trivial or benign. You don't notice this procedure when manually tabulating the numbers because your brain is so efficient. But look at what this manipulation involves. For example, successive numbers might be multiplied by: 0.01, 100, 0.1, 1, 10, 1000, 0.001, and so on.

This changes the numbers in a pattern based on powers of ten, i.e., the anti-logarithm. You then examine the processed data and marvel that it looks logarithmic. Not realizing that your brain has secretly manipulated the data, you attribute this logarithmic pattern to some hidden feature of the original numbers. Voila! The mystery of Benford's law!

Solving Mystery #2

The second mystery is: Why does one set of numbers follow Benford's law, while another set of numbers does not? Again we can answer this question by examining Fig. 34-5. Our goal is to find the characteristics of pdf(g) that result in ost(g) having a constant value of 0.301. As shown above, the average value of ost(g) will always be 0.301, regardless of whether Benford's law is being followed or not. So our only concern is whether ost(g) has oscillations, or is a flat line.

For ost(g) to be a flat line it must have no sinusoidal components. In the frequency domain this means that OST(f) must be equal to zero at all frequencies above f=0. However, OST(f) is equal to SF(f) × PDF(f), and SF(f) is nonzero only at the integer frequencies, f = 0, 1, 2, 3, 4, and so on. Therefore, ost(g) will be flat, if and only if, PDF(f) has a value of zero at the integer frequencies. The particular example in Fig. 34-5 clearly does not meet this condition, and therefore does not follow Benford's law. In Fig. (d), PDF(1) has a value of 0.349. Multiplying this by the value of SF(1) = 0.516, we find OST(1) = 0.18. Therefore, ost(g) has a sinusoidal component with a period of one, and an amplitude of 0.18. This is a key result, describing what criterion a distribution must


FIGURE 34-5
Benford's law analyzed in the frequency domain. In the logarithmic domain Benford's law is represented as a convolution, ost(g) = sf(g) * pdf(-g). In the frequency domain this becomes the much simpler operation of multiplication, OST(f) = SF(f) × PDF*(f).

meet to follow Benford's law. This is important enough that we will express it as a theorem.

Benford's Law Compliance Theorem
Let P be a random process generating numbers in base B on the linear number line, pdf(g) its probability density function expressed on the base B logarithmic number line, and PDF(f) the Fourier transform of pdf(g). The numbers generated by P will follow Benford's law, if and only if, PDF(f) = 0 at all nonzero integer frequencies.

Our next step is to examine what type of distributions comply with this theorem. There are two distinct ways that PDF(f) can have a value of zero at the nonzero integer frequencies. As shown in Fig. 34-6b, PDF(f) can be oscillatory, periodically hitting zero at frequencies that include the integers. In the logarithmic domain this corresponds to two or more discontinuities spaced an integer distance apart, such as sharp edges or abrupt changes in the slope. Figure (a) shows an example of this, a rectangular pulse with edges at -1 and 1. These discontinuities can easily


be created by human manipulation, but seldom occur in natural or unforced processes. This type of distribution does follow Benford's law, but it is mainly just a footnote, not the bulk of the mystery.

Figure (d) shows a far more important situation, where PDF(f) smoothly decreases in value with increasing frequency. This behavior is more than common, it is the rule. It is what you would find for most any set of random numbers you examine. The key parameter we want to examine is how fast the curve drops to zero. For instance, the curve in Fig. 34-6d drops so rapidly that it has a negligible value at f=1 and all higher frequencies. Therefore, this distribution will follow Benford's law to a very high degree. Now compare this with Fig. 34-5d, an example where PDF(f) drops much more slowly. Since it has a significant value at f=1, this distribution follows Benford's law very poorly.

Now look at pdf(g) for the above two examples, Figs. 34-6c and 34-5a. Both of these are normal distributions on the logarithmic scale; the only difference between them is their width. A key property of the Fourier transform is the compression/expansion between the domains. If you need to refresh your memory, look at Figure 10-12 in chapter 10. In short, if the signal in one domain is made narrower, the signal in the other domain will become wider, and vice versa. For example, in Fig. 34-5a the standard deviation of pdf(g) is σg = 0.25. This results in PDF(f) having a standard deviation of: σf = 1/(2πσg) = 0.637. In Fig. 34-6 the log domain is twice as wide, σg = 0.50, making the frequency domain twice as narrow, σf = 0.318. In these figures the width of the distribution is indicated as 2σ, that is, -σ to σ. This is common, but certainly not the only way to measure the width.

In short, if pdf(g) is narrow, then PDF(f) will be wide. This results in PDF(f) having a significant amplitude at f=1, and possibly at higher frequencies. Therefore, the distribution will not follow Benford's law. However, if pdf(g) is wide, then PDF(f) will be narrow. This results in PDF(f) falling near zero before f=1, and Benford's law is followed.

A key issue is how wide or narrow pdf(g) needs to be to toggle between the two behaviors. To follow Benford's law, PDF(f) must drop to near zero by f=1. Further, f=1 in the frequency domain corresponds to a sinusoid with a period of one on the log scale, making this the critical distance. This gives us the answer to our question. With a few caveats, Benford's law is followed by distributions that are wide compared with unit distance along the logarithmic scale. Likewise, the law is not followed by distributions that are narrow compared with unit distance.
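
This width criterion is easy to check numerically. The following sketch (my own illustration, not code from the book; the function names are hypothetical) draws samples from two log-normal distributions, one wide and one narrow on the base 10 logarithmic scale, and counts how often the leading digit is 1. Benford's law predicts log10(2), about 30.1%.

```python
import math
import random

def leading_digit(x: float) -> int:
    """First significant digit of a positive number (base 10)."""
    while x >= 10.0:
        x /= 10.0
    while x < 1.0:
        x *= 10.0
    return int(x)

def leading_one_fraction(sigma_log10: float, n: int = 100_000, seed: int = 1) -> float:
    """Fraction of leading 1s among n log-normal samples whose log10 values
    have standard deviation sigma_log10 (median 1)."""
    rng = random.Random(seed)
    sigma_ln = sigma_log10 * math.log(10)  # lognormvariate works in natural logs
    return sum(
        leading_digit(rng.lognormvariate(0.0, sigma_ln)) == 1 for _ in range(n)
    ) / n

wide = leading_one_fraction(1.0)     # broad compared with unit log distance
narrow = leading_one_fraction(0.25)  # narrow compared with unit log distance
```

With σ = 1.0 on the log scale the fraction lands very close to 30.1%; with σ = 0.25 it misses badly, exactly as the width criterion predicts.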

To be clear, one exception occurs when PDF(f) is oscillatory, such as in Fig. 34-6b. The other exception is when PDF(f) does not smoothly decrease in value with increasing frequency. Also, the definition of "width" used here is slightly fuzzy. We will improve upon this in the next section. However, these are minor issues and details; do not let them distract from your understanding of the mainstream phenomenon.


Chapter 34- Explaining Benford’s Law 717

FIGURE 34-6 Two ways of obeying Benford's law. The Benford's Law Compliance Theorem shows that a distribution will obey the law only if PDF(f) has a value of zero at f = 1, 2, 3, ... . This can be achieved in two different ways. In (b) the oscillations hit zero at these frequencies, while in (d) the curve has dropped to zero before f=1.

More on Following Benford’s law

This last result is very surprising; the mystery of Benford's law turns out to be nothing more than distribution width. Figure 34-7 demonstrates this using our previous examples. Figures (a) and (c) are the histograms of the income tax return and the RNG numbers, respectively, on the logarithmic scale. Figures (b) and (d) are their Fourier transforms. The Benford's Law Compliance Theorem tells us that (b) will follow Benford's law very closely, while (d) will follow it very poorly. That is, PDF(f) falls to near zero before f=1 for the income tax numbers, but does not for the RNG numbers. The next step of this is less rigorous, but still perfectly clear. Figure (b) falls to zero quickly because (a) is broad. Likewise, (d) falls to zero more slowly because (c) is narrow.

This also tells us something about the magic trick. If the distribution is wide compared with unit distance on the log axis, it means that the spread in the set of numbers being examined is much greater than ten. For instance, look back at the income tax numbers shown in Fig. 34-2a. The largest numbers in this set are about a million times greater in value than the smallest numbers. This extensive spread is a key part of stamping the logarithmic pattern into the data. That is, 543,923,100 must be divided by 100,000,000 to place it between 1 and 9.99999, while 1,221 only needs to be divided by 1,000. In other words, different numbers are being treated differently, all according to an anti-logarithmic pattern.


Now look at the RNG numbers in Fig. 34-2, a group that does not obey Benford's law. The largest numbers in this set are about four times the smallest numbers (measured from -σ to +σ). That is, they are grouped relatively close together in value. When we extract the leading digits from these numbers, most of them are treated exactly the same. For instance, both 7.844026 and 1.230605 are divided by 1 to place them between 1 and 9.999999. Likewise, numbers clustered around 5,000 would all be divided by 1,000 to extract the leading digits. Since the vast majority of the numbers are being treated the same, or nearly the same, the distortion of the data is relatively weak. That is, the logarithmic pattern cannot be introduced into the data, and the magic trick fails.

How does Benford's law behave in other bases? Suppose you repeat the previous derivation in base 4 instead of base 10. The base 4 logarithmic number line is used and the Benford's Law Compliance Theorem still holds. The difference comes in when we compare the width of our test distribution with one unit of distance on the logarithmic scale. One unit of distance in base 4 is only log10(4) = 0.602 times the length of one unit in base 10, making it easier for the distribution to comply with Benford's law. In terms of the magic trick, the spread in the numbers being examined only needs to be much greater than four, rather than ten. In the common case where PDF(f) smoothly decreases, Benford's law will always be followed better when the numbers are converted to a lower base, and worse when converted to a higher base. For instance, the income tax numbers will not follow Benford's law if converted to base 10,000 or above (making the unit distance on the log scale four times greater). Likewise, the RNG numbers will follow Benford's law if converted to base 2 (shortening the unit distance to log10(2) = 0.301).
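
The base dependence can be demonstrated numerically. The sketch below (my own illustration, not the book's code) takes a narrow distribution, standard deviation 0.25 on the base 10 log scale with median 1, and extracts leading digits in base 10 and in base 4; the Benford predictions are log10(2) = 0.301 and log4(2) = 0.5, respectively. For this particular distribution the base 4 fraction comes out very close to the prediction, while the base 10 fraction does not.

```python
import math
import random

def leading_digit(x: float, base: int = 10) -> int:
    """First significant digit of a positive number in the given base."""
    while x >= base:
        x /= base
    while x < 1.0:
        x *= base
    return int(x)

# Narrow distribution: standard deviation 0.25 on the base 10 log scale.
rng = random.Random(7)
samples = [rng.lognormvariate(0.0, 0.25 * math.log(10)) for _ in range(100_000)]

frac10 = sum(leading_digit(x, 10) == 1 for x in samples) / len(samples)
frac4 = sum(leading_digit(x, 4) == 1 for x in samples) / len(samples)

# Benford predicts log10(2) = 0.301 in base 10, but log4(2) = 0.5 in base 4.
```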

A note for advanced readers: You may have noticed a problem with this last statement, that is: all numbers in base 2 have a leading digit of 1. However, a more sophisticated definition of Benford's law can be used to eliminate issues of this sort. The leading digit of a number can be found by repeatedly multiplying or dividing the number by ten until it is between 1 and 9.99999, and then taking the integer portion. The advanced method stops after the first step, and directly looks at the pdf of the numbers running between 1 and 9.99999. We will call these the modified numbers. If Benford's law is being followed, a(n) = k/n, where a(n) is the probability density function of the modified numbers on the linear scale, and k is a constant providing unity area under the pdf curve. If needed for some purpose, we can find the fraction of numbers that have a leading digit of 1 by integrating a(n) from 1 to 2. Since the integral of k/n is the logarithm, if Benford's law is being followed this fraction is given by: log(2) - log(1) = 0.301. That is, we can easily move from the advanced representation to the simpler leading-digit definition.
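
A quick numerical check of this correspondence (illustrative code, not from the book): with k = 1/ln(10), integrating a(n) = k/n from 1 to 2 reproduces the 0.301 figure, and summing the integrals for all nine leading digits gives unity, confirming that this k normalizes the pdf.

```python
import math

k = 1.0 / math.log(10.0)  # normalizes a(n) = k/n to unit area over [1, 10)

def fraction_with_leading_digit(d: int, steps: int = 100_000) -> float:
    """Midpoint-rule integral of a(n) = k/n from d to d+1."""
    h = 1.0 / steps
    return sum(k / (d + (i + 0.5) * h) for i in range(steps)) * h

frac_ones = fraction_with_leading_digit(1)                         # log10(2) = 0.301
total = sum(fraction_with_leading_digit(d) for d in range(1, 10))  # unit area
```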

FIGURE 34-7 Two examples for understanding Benford's law. A distribution will follow Benford's law only if PDF(f) falls to near zero before f=1 (excluding the oscillatory case). In turn, this requires that pdf(g) be broad compared with one unit of distance on the logarithmic scale. This explains why the income tax numbers follow the law, while the RNG numbers do not.

This "k/n" form of Benford's law can also be derived from the method of Fig. 34-5. The fraction of the modified numbers that are greater than p but less than q is found by integrating a(n) between p and q. Further, this fraction will remain constant under the scaling test if Benford's law is being followed. However, this value is also equal to the average value of the appropriate scaling function. The logic here is the same used to show that the average value of ost(g) is equal to the average value of sf(g) in "Solving Mystery #1." These two factors become the left and right sides of the following equation, respectively:

EQUATION 34-5 Derivation of the k/n form of Benford's law.

Solving this equation results in Benford's law, i.e., a(n) = k/n.

Analysis of the Log-Normal Distribution

We have looked at two log-normal distributions, one having a standard deviation of 0.25 and the other a standard deviation of 0.5. Surprisingly, one follows Benford's law extremely well, while the other does not follow it at all. In this section we will examine the analytical transition between these two behaviors for this particular distribution.

As shown in Fig. 34-5d, we can use the value of OST(1) as a measure of how well Benford's law is followed. Our goal is to derive an equation relating the standard deviation of psf(g) to the value of OST(1), that is, relating the width of the distribution to its compliance with Benford's law. Notice that this has rigorously defined the problem (removed the fuzziness) by specifying three things: the shape of the distribution, how we are measuring compliance with Benford's law, and how we are defining the distribution width.

EQUATION 34-5 Compliance of the log-normal distribution with Benford's law.

The next step is to write the equation for PSF(f), a one-sided Gaussian curve with its peak at f=0 and a standard deviation of σf:

Next we plug in the conversion from the logarithmic-domain standard deviation, σf = 1/(2πσg), and evaluate the expression at f=1:

Lastly, we use OST(1) = SF(1) × PSF(1), where SF(1) = 0.516, to reach the final equation:
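
The three equations referenced above did not survive this transcription. From the steps just described they can be reconstructed as follows; this is a best-effort reconstruction consistent with the numerical values in the text, not the book's typeset originals:

```latex
PSF(f) = e^{-f^2 / 2\sigma_f^2}, \qquad f \ge 0

PSF(1) = e^{-2\pi^2 \sigma_g^2} \qquad \left(\text{using } \sigma_f = \tfrac{1}{2\pi\sigma_g}\right)

OST(1) = 0.516 \, e^{-2\pi^2 \sigma_g^2} \qquad \text{(Eq. 34-5)}
```

As a check, σg = 0.25 gives OST(1) ≈ 0.516 × e^(-1.23) ≈ 0.15, matching the 45% and 16% extremes (0.301 ± 0.15) quoted below for the 2σ = 0.5 case.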

As illustrated in Fig. 34-5c, the highest value in ost(g) is OST(1) plus 0.301, and the lowest value is 0.301 - OST(1). These highest and lowest values are graphed in Fig. 34-8a. As shown, when the 2σ width of the distribution is 0.5 (as in Fig. 34-5a), the Ones Scaling Test will have values as high as 45% and as low as 16%, a very poor match to Benford's law. However, doubling the width to 2σ = 1.0 results in a high-to-low fluctuation of less than 1%, a good match.

There are a number of interesting details in this example. First, notice how rapidly the transition occurs between following and not following Benford's law. For instance, two cases are indicated by A and B in Fig. 34-8, with 2σ = 0.60 and 2σ = 0.90, respectively. In Fig. (b) these are shown on the linear scale. Now imagine that you are a researcher trying to understand Benford's law, before reading this chapter. Even though these two distributions appear very similar, one follows Benford's law very well, and the other doesn't follow it at all! This gives you an idea of the frustration Benford's law has produced.

Second, even though the curves in Fig. (a) move together extremely rapidly, they never actually meet (excluding infinity, which isn't allowed for a pdf). For instance, from Eq. 34-5 a log-normal distribution with a standard deviation of three will follow Benford's law to within about 1 part in 100,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000. That's pretty close! In fact, you could not statistically detect this error even with a billion computers, each generating a billion numbers each second, since the beginning of the universe.


FIGURE 34-8 Analyzing the log-normal distribution for compliance with Benford's law. Even a slight difference in the width of this distribution, shown by A and B, can drastically change how well it follows the law.

Nevertheless, this is a finite error, and it has caused frustration of its own. Again imagine that you are a researcher trying to understand Benford's law. You proceed by writing down some equation describing when Benford's law will be followed, and then you solve it. The answer you find is: never! There is no distribution (excluding the oscillatory case of Fig. 34-6b) that follows Benford's law exactly. An equation doesn't give you what is close, only what is equal. In other words, you find no understanding, just more mystery.

Lastly, the log-normal distribution is more than just an example; it is an important case where Benford's law arises in Nature. The reason for this is one of the most powerful driving forces in statistics, the Central Limit Theorem (CLT). As discussed in chapter 2, the CLT states that adding many random numbers produces a normal distribution. This accounts for the normal distribution being so commonly observed in science and engineering. However, if a group of random numbers is multiplied together, the result will be a normal distribution on the logarithmic scale. Accordingly, the log-normal distribution is also commonly found in Nature. This is probably the single most important reason that some distributions are found to follow Benford's law while others do not. Normal distributions are not wide enough to follow the law. On the other hand, broad log-normal distributions follow it to a very high degree.

Want to generate numbers that follow Benford's law for your own experiments? You can take advantage of the CLT. Most computer languages have a random number generator that produces values uniformly distributed between 0 and 1. Call this function multiple times and multiply the numbers. It can be shown that PDF(1) = 0.344 for the uniform distribution, and therefore the product of these numbers follows Benford's law according to OST(1) = 51.6% × 0.344^α, where α is how many random numbers are multiplied. For instance, ten multiplications produce a random number that comes from a log-normal distribution with a standard deviation of approximately 0.75. This corresponds to OST(1) = 0.0012%, a very good fit to Benford's law.

If you do try some of these experiments, remember that the statistical variation (noise) on N random events is about SQRT(N). For instance, suppose you generate 1 million numbers in your computer and count how many have 1 as the leading digit. If Benford's law is being followed, this count will be about 301,000. However, when you repeat the experiment several times you find it changes randomly by about 1,000, since SQRT(1,000,000) = 1,000. In other words, using 1 million numbers allows you to conclude that the percentage of numbers with one as the leading digit is about 30.1% +/- 0.1%. As another example, the ripple in Fig. 34-3a is a result of using 14,414 samples. For a more precise measurement you need more numbers, and the required count grows very quickly. For instance, to detect an error of OST(1) = 0.0012% (the above example), you will need in excess of a billion numbers.
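
The recipe above can be sketched in a few lines (my own code with hypothetical names; the book does not tie this to any particular language). It multiplies α = 10 uniform random numbers per sample and counts leading 1s:

```python
import random

def leading_digit(x: float) -> int:
    """First significant digit of a positive number (base 10)."""
    while x < 1.0:
        x *= 10.0
    while x >= 10.0:
        x /= 10.0
    return int(x)

rng = random.Random(42)
n = 100_000
alpha = 10  # how many uniform random numbers are multiplied per sample

ones = 0
for _ in range(n):
    x = 1.0
    for _ in range(alpha):
        x *= 1.0 - rng.random()  # uniform on (0, 1]; avoids an exact zero
    if leading_digit(x) == 1:
        ones += 1

fraction = ones / n  # Benford predicts log10(2) = 0.301
```

With 100,000 samples the noise floor is roughly SQRT(100,000) ≈ 316 counts, about ±0.3% on the fraction, far larger than the 0.0012% model error discussed above.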

The Power of Signal Processing

Benford's law has never been viewed as a major mathematical problem, only a minor mystery. Nevertheless, many bright and creative people have spent time trying to understand it. The primary goal of this chapter has been to demonstrate the power of DSP in nontraditional applications. In the case of Benford's law this power is clear; signal processing has succeeded where other mathematical techniques have failed.

Nowhere is this more apparent than in a review article published in 1976 by mathematician Ralph Raimi. He examined the many approaches in explicit mathematical detail, and his paper has become a landmark in the history of this problem. Buried in the detailed math, Raimi makes the brief comment: "...many writers ... have said vaguely that Benford's law holds better when the distribution ... covers several orders of magnitude." As we now know, this is the root of the phenomenon. In one of the most colorful events of this history, a small error in logic prompted Raimi to argue that this could not be correct. [Specifically, scaling a distribution does not change how many orders of magnitude it covers.] While this slight misdirection probably made no difference, it shows just how little success had been achieved by traditional mathematics. An understanding of the basic operation of Benford's law was nowhere on the horizon.

Lastly, this discussion would be incomplete without mentioning the practical applications of Benford's law. Next time you file your income tax return or other financial report, consider what happens to the distribution of leading digits if you fabricate some of the numbers. I'm not going to help you cheat, so I won't give the details away. Simply put, the numbers you make up will probably not follow Benford's law, making your fraudulent report distinguishable from an accurate one. I'll let you imagine who might be interested in this.