<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<?xml-stylesheet type="text/css" href="http://pear.math.pitt.edu/mathzilla/Examples/Fermat/mathmlBase.css"?>
<head>
  <title>Benford's Law</title>
  <meta name="Version" content="$Rev: 18 $ - $URL: svn+ssh://svn.omnifarious.org/home/hopper/src/svn/homepage/trunk/technical/benfordslaw.xml $" />
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>

<body>
<h2>This is a mirror of <a
href="http://www.mathpages.com/home/kmath302/kmath302.htm">this document</a>.</h2>

<p>I'm mirroring it because the other document may go away someday,
and I find the information too useful to lose.</p>

<h2>This document uses <a href="http://www.w3.org/Math/">MathML</a>.</h2>

<p>If, as a consequence, the equations don't look right to you, try
looking at <a href="equations.png">this image</a> that shows what the
properly rendered equations would look like.</p>

<h2>Benford's Law</h2>

<p>Benford's Law (which was first stated by Simon Newcomb in 1881) states that
if you randomly select a number from a table of physical constants or
statistical data, the probability that the first digit will be a "1" is about
0.301, rather than 1/9 (about 0.111) as we might expect if all nine possible
leading digits were equally likely. In general, the "law" says that the
probability of the first digit being a "d" is</p>

<p align="center">
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <msub>
      <mi>log</mi>
      <mn>10</mn>
    </msub>
    <mrow>
      <mo>(</mo>
      <mn>1</mn>
      <mo>+</mo>
      <mfrac>
        <mn>1</mn>
        <mi>d</mi>
      </mfrac>
      <mo>)</mo>
    </mrow>
  </mrow>
</math>
</p>
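
<p>As a quick numerical check, the formula can be evaluated for each digit.
The following is a minimal Python sketch, not part of the original article;
the function name "benford_probability" is purely illustrative.</p>
<pre>
```python
import math

def benford_probability(d):
    """Probability that the leading decimal digit is d (1 through 9)."""
    return math.log10(1 + 1 / d)

# Print the predicted frequency of each leading digit.
for d in range(1, 10):
    print(d, round(benford_probability(d), 3))
```
</pre>
<p>The value for d = 1 comes out at about 0.301, matching the figure quoted
above.</p>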

<p>This implies that a number in a table of physical constants is more likely
to begin with a smaller digit than a larger digit. Newcomb published the law
in a paper entitled "Note on the Frequency of Use of the Different Digits in
Natural Numbers", which appeared in the American Journal of Mathematics 4
(1881), pp. 39-40. Benford rediscovered it in 1938 and published an article
called "The Law of Anomalous Numbers" in Proc. Amer. Phil. Soc. 78, pp.
551-572.</p>

<p>Just for fun, I tabulated the first digits of the physical constants listed
in Table 2.3 of Abramowitz and Stegun's "Handbook of Mathematical
Functions":</p>
<pre>      1
      1
      1*
      1
      1
      1  2
      1  2
      1  2*
      1  2
      1  2   *
      1  2     4  5
      1  2     4* 5
      1  2     4  5* 6        9
      1  2     4  5  6   *    9*
      1  2  3  4  5  6  7  8  9</pre>

<p>The "*" symbols indicate the approximate predicted number of occurrences
according to Benford's Law. Aside from the conspicuous deficiency of 3's,
that's not a bad match for just 44 data points.</p>
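
<p>The predicted counts marked by the "*" symbols can be reproduced with a few
lines of Python. This is only a sketch: the observed counts were read off the
histogram above by hand, and the names "observed" and "expected_count" are
illustrative.</p>
<pre>
```python
import math

# Observed first-digit counts, read off the histogram above (44 values in total).
observed = {1: 15, 2: 10, 3: 1, 4: 5, 5: 5, 6: 3, 7: 1, 8: 1, 9: 3}
total = sum(observed.values())

def expected_count(d, n):
    """Expected number of values with leading digit d among n values."""
    return n * math.log10(1 + 1 / d)

# Columns: digit, observed count, count predicted by Benford's Law.
for d in range(1, 10):
    print(d, observed[d], round(expected_count(d, total), 2))
```
</pre>
<p>For 44 values the formula predicts about 13.2 ones and about 5.5 threes,
against 15 and 1 in the table.</p>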

<p>Although there have been many lengthy "explanations" for Benford's Law, it
seems to me this is a good candidate for a "Proof Without Words":</p>
<pre>   1---------------2---------3-------4-----5----6---7--8--9</pre>

<p>[Okay, not entirely without words:] The underlying premise is simply that
physical constants, expressed in base 10 and more or less arbitrary units,
will be somewhat evenly distributed on a logarithmic scale. This is confirmed
by the fact that the exponents on these constants are fairly uniformly
distributed, at least over several "decades". As a result, the probability of
the leading digit being "d" clearly approaches the fraction of a decade that
lies between d and d+1 on that scale:</p>

<p align="center">
<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">
  <mrow>
    <mfrac>
      <mrow>
        <mrow>
          <msub>
            <mi>log</mi>
            <mn>10</mn>
          </msub>
          <mrow>
            <mo>(</mo>
            <mi>d</mi>
            <mo>+</mo>
            <mn>1</mn>
            <mo>)</mo>
          </mrow>
        </mrow>
        <mo>-</mo>
        <mrow>
          <msub>
            <mi>log</mi>
            <mn>10</mn>
          </msub>
          <mi>d</mi>
        </mrow>
      </mrow>
      <mrow>
        <mrow>
          <msub>
            <mi>log</mi>
            <mn>10</mn>
          </msub>
          <mn>10</mn>
        </mrow>
        <mo>-</mo>
        <mrow>
          <msub>
            <mi>log</mi>
            <mn>10</mn>
          </msub>
          <mn>1</mn>
        </mrow>
      </mrow>
    </mfrac>
    <mo>=</mo>
    <mrow>
      <msub>
        <mi>log</mi>
        <mn>10</mn>
      </msub>
      <mrow>
        <mo>(</mo>
        <mn>1</mn>
        <mo>+</mo>
        <mfrac>
          <mn>1</mn>
          <mi>d</mi>
        </mfrac>
        <mo>)</mo>
      </mrow>
    </mrow>
  </mrow>
</math>
</p>
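
<p>The premise can be tried out with a small simulation. The Python sketch
below draws exponents uniformly over a few decades (the choice of six decades,
the sample size and the fixed seed are arbitrary assumptions) and counts the
leading digits of the resulting values.</p>
<pre>
```python
import math
import random
from collections import Counter

def simulate_leading_digits(samples, decades=6, seed=0):
    """Draw values uniformly on a log scale and count their leading digits."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(samples):
        exponent = rng.uniform(0, decades)   # log10 of the simulated constant
        mantissa = 10 ** (exponent % 1.0)    # value scaled into [1, 10)
        counts[int(mantissa)] += 1           # truncation gives the leading digit
    return counts

samples = 100000
counts = simulate_leading_digits(samples)

# Columns: digit, simulated frequency, frequency predicted by the formula.
for d in range(1, 10):
    print(d, round(counts[d] / samples, 3), round(math.log10(1 + 1 / d), 3))
```
</pre>
<p>With this many samples the simulated frequencies should agree with the
formula to about two decimal places.</p>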

<p>Of course, we COULD have chosen units for our physical constants such that
the leading digits were all 9's (for example), but evidently we have a natural
tendency to choose units so that our numbers are evenly distributed by order
of magnitude, rather than absolute value. This may be related to our basic
impressions of hearing and sight (not to mention earthquakes), where our
intuitive senses of loudness and brightness are logarithmic.</p>

<p>Naturally we can apply Benford's Law to numbers expressed in any base, not
just the base 10. In general the probability of the leading digit d (in the
range 1 to b-1) for the base b is</p>

<p align="center">
<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">
  <mrow>
    <mrow>
      <mi>Pr</mi>
      <mo>{</mo>
      <mi>d</mi>
      <mo>}</mo>
    </mrow>
    <mo>=</mo>
    <mrow>
      <mfrac>
        <mrow>
          <mi>ln</mi>
          <mo>(</mo>
          <mn>1</mn>
          <mo>+</mo>
          <mfrac>
            <mn>1</mn>
            <mi>d</mi>
          </mfrac>
          <mo>)</mo>
        </mrow>
        <mrow>
          <mi>ln</mi>
          <mo>(</mo>
          <mi>b</mi>
          <mo>)</mo>
        </mrow>
      </mfrac>
    </mrow>
  </mrow>
</math>
</p>

<p>Notice that for binary numbers, i.e., numbers expressed in the base 2, the
probability of the leading digit being 1 is 1.000, as it must be, since the
leading digit of a binary number is necessarily 1. The distributions of
probabilities of the digits 1 to b-1 for each base b from 2 to 10 are shown
below:</p>
<img align="middle" src="302fig1.png" alt="Leading-digit probabilities for each base b from 2 to 10" />
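
<p>The same distributions can be tabulated directly from the general formula.
This is a minimal Python sketch; the function name "leading_digit_probability"
is an illustrative choice.</p>
<pre>
```python
import math

def leading_digit_probability(d, b):
    """Probability of leading digit d (1 to b-1) for numbers written in base b."""
    return math.log(1 + 1 / d) / math.log(b)

# One row per base, one column per possible leading digit.
for b in range(2, 11):
    row = [leading_digit_probability(d, b) for d in range(1, b)]
    print(b, " ".join(format(p, ".3f") for p in row))
```
</pre>
<p>The row for base 2 contains the single value 1.000, and the row for base 10
starts with 0.301.</p>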

<p>We can also easily verify that the sum of all the probabilities for digits
1 through b-1 equals 1.0000, as it must, since the leading digit must be one
of these. Multiplying the sum through by ln(b), this requirement becomes</p>

<p align="center">
<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">
  <mrow>
    <mrow>
      <mi>ln</mi>
      <mo>(</mo>
      <mrow>
        <mn>1</mn>
        <mo>+</mo>
        <mfrac>
          <mn>1</mn>
          <mn>1</mn>
        </mfrac>
      </mrow>
      <mo>)</mo>
    </mrow>
    <mo>+</mo>
    <mrow>
      <mi>ln</mi>
      <mo>(</mo>
      <mrow>
        <mn>1</mn>
        <mo>+</mo>
        <mfrac>
          <mn>1</mn>
          <mn>2</mn>
        </mfrac>
      </mrow>
      <mo>)</mo>
    </mrow>
    <mo>+</mo>
    <mrow>
      <mi>. . .</mi>
    </mrow>
    <mo>+</mo>
    <mrow>
      <mi>ln</mi>
      <mo>(</mo>
      <mrow>
        <mn>1</mn>
        <mo>+</mo>
        <mfrac>
          <mn>1</mn>
          <mrow>
            <mi>b</mi>
            <mo>-</mo>
            <mn>1</mn>
          </mrow>
        </mfrac>
      </mrow>
      <mo>)</mo>
    </mrow>
    <mo>=</mo>
    <mrow>
      <mi>ln</mi>
      <mo>(</mo>
      <mi>b</mi>
      <mo>)</mo>
    </mrow>
  </mrow>
</math>
</p>

<p>By the product rule for logarithms,
<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">
  <mrow>
    <mrow>
      <mi>ln</mi>
      <mo>(</mo>
      <mrow>
        <mi>x</mi>
        <!-- <mo>&InvisibleTimes;</mo> -->
        <mi>y</mi>
      </mrow>
      <mo>)</mo>
    </mrow>
    <mo>=</mo>
    <mrow>
      <mrow>
        <mi>ln</mi>
        <mo>(</mo>
        <mi>x</mi>
        <mo>)</mo>
      </mrow>
      <mo>+</mo>
      <mrow>
        <mi>ln</mi>
        <mo>(</mo>
        <mi>y</mi>
        <mo>)</mo>
      </mrow>
    </mrow>
  </mrow>
</math>,
the sum of logarithms is the logarithm of a single product, so this is
equivalent to</p>

<p align="center">
<math xmlns="http://www.w3.org/1998/Math/MathML" display="inline">
  <mrow>
    <mrow>
      <mi>ln</mi>
      <mo>(</mo>
      <mrow>
        <mfrac>
          <mn>2</mn>
          <mn>1</mn>
        </mfrac>
        <mo>*</mo>
        <mfrac>
          <mn>3</mn>
          <mn>2</mn>
        </mfrac>
        <mo>*</mo>
        <mfrac>
          <mn>4</mn>
          <mn>3</mn>
        </mfrac>
        <mo>*</mo>
        <mi>. . .</mi>
        <mo>*</mo>
        <mfrac>
          <mi>b</mi>
          <mrow>
            <mi>b</mi>
            <mo>-</mo>
            <mn>1</mn>
          </mrow>
        </mfrac>
      </mrow>
      <mo>)</mo>
    </mrow>
    <mo>=</mo>
    <mrow>
      <mi>ln</mi>
      <mo>(</mo>
      <mi>b</mi>
      <mo>)</mo>
    </mrow>
  </mrow>
</math>
</p>

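<p>The product telescopes, since each numerator cancels the denominator that
follows it, leaving b/1 = b, which confirms the result.</p>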
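
<p>The cancellation can also be checked with exact rational arithmetic. The
Python sketch below is only an illustration (the function name
"telescoping_product" is hypothetical); it multiplies the factors (d+1)/d for
d from 1 to b-1.</p>
<pre>
```python
from fractions import Fraction

def telescoping_product(b):
    """Exact product (2/1)(3/2)...(b/(b-1)) of the factors 1 + 1/d."""
    product = Fraction(1)
    for d in range(1, b):
        product *= Fraction(d + 1, d)   # each factor equals 1 + 1/d
    return product

# For every base the product reduces to the base itself.
for b in range(2, 11):
    print(b, telescoping_product(b))
```
</pre>
<p>Using fractions rather than floating point keeps the check exact, so the
product equals b with no rounding error.</p>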
<hr />
<a href="http://www.mathpages.com/home/index.htm">Return to MathPages Main
Menu</a> 
<hr />
</body>
</html>
