Monday, June 30, 2014

Wednesday, June 25, 2014

More on pi vs. tau

First, a few links:

You can't toot your own horn if you only have an OBOE.

I would not at all be surprised if there were off-by-one errors (OBOE) in some of the positions I reported in pi and tau. When programming, I do zero-based counting, but I tried to express position numbers with more familiar one-based counting ordinals. A further complication is that it seems to be traditional to count the digits of pi starting after the decimal point, ignoring the three. Either these sources of error will cancel out or I'll actually have Off By Two Error, I don't know anymore, my head hurts. I tell you, Off By One Error are the two most annoying things about programming.
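Here is a minimal sketch of how the two adjustments interact. It assumes the digits sit in a string that includes the leading 3; the post does not say how the program actually stored them, and `position_after_point` is a hypothetical helper name used only for illustration.

```python
# Assumption: the digit string includes the leading 3 of pi.
PI_DIGITS = "31415926535897932384"


def position_after_point(index: int) -> int:
    """Convert a zero-based string index to a one-based position
    counted from the first digit after the decimal point."""
    if index < 1:
        raise ValueError("index 0 is the leading 3, which has no position after the point")
    one_based = index + 1   # zero-based -> one-based
    return one_based - 1    # skip the leading 3


index = PI_DIGITS.find("926")          # zero-based index of the first "926"
position = position_after_point(index)
print(index, position)
```

Under that assumption the +1 for one-based counting and the -1 for skipping the three cancel, so the zero-based index is already the traditional position. If the string held only the digits after the decimal point, the position would be the index plus one instead.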

You keep using that word, "random". I do not think it means what you think it means.

It is very problematic to use the word random in relation to pi: there's absolutely nothing random about it. It's a constant, and every digit is perfectly predictable if you have the time, patience, and computing ability. I know what you're going to say, "but it's random in that its digits don't, aren't..." and I'll just stop you right there. I thought that too, but I'm told on good authority, in no uncertain terms, that that's not what random means. Mathematicians describe the behavior they expect from the digits of pi, or tau, or e, or √2, etc. as that of normal numbers, i.e. the longer you count the digits, the more equal the count of every single digit (and, strictly, of every block of digits of a given length). There are two problems with this nomenclature: (1) the word "normal" is abnormal in that it means a lot of different things even within one discipline such as mathematics (normal numbers have little to do with normal distributions), so it's prone to be misunderstood, and (2) there is no proof that pi is normal; it just looks really, really, really normal every time someone carries the calculation a little farther.
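As a small illustration of "the more equal the count of every single digit", the sketch below tallies the first 100 digits of pi after the decimal point. The names `PI_DECIMALS` and `digit_shares` are made up for this example, and 100 digits is far too few to say anything about normality; it only shows the kind of tally involved.

```python
from collections import Counter

# First 100 digits of pi after the decimal point.
PI_DECIMALS = (
    "1415926535897932384626433832795028841971"
    "6939937510582097494459230781640628620899"
    "86280348253421170679"
)


def digit_shares(digits: str) -> dict:
    """Return the share of each digit 0-9 in the given digit string."""
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in "0123456789"}


shares = digit_shares(PI_DECIMALS)
for digit, share in shares.items():
    print(digit, f"{share:.2f}")
```

For a normal number every share would tend toward 0.10 as the digit string grows. A fuller check would tally pairs, triples, and longer blocks the same way.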

tl;dr: I'm calling pi randomesque. It's totally meaningless, but instantly grokkable, I think.

It's hip to R Squared

I was unsure whether to use the term R squared, since it's so often used in a context of regression analysis. But I went with the familiar term rather than Coefficient of Determination. Just remember, the R Squared is not comparing the distribution to its best fit regression line, but to the line of a uniform distribution (with a slope of zero). R squared is most often used to analyze a regression model, but it's perfectly valid to use the term to compare a distribution to any model.
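The general idea of scoring data against a fixed model rather than a fitted line can be sketched as follows. The post does not say exactly which values went into the statistic, so this example makes an assumption: it compares the cumulative share of digits (0, then 0-1, then 0-2, and so on) with the cumulative shares a perfectly uniform distribution would give. The names `r_squared`, `cumulative_shares`, and `UNIFORM_CDF` are hypothetical.

```python
# First 100 digits of pi after the decimal point.
PI_DECIMALS = (
    "1415926535897932384626433832795028841971"
    "6939937510582097494459230781640628620899"
    "86280348253421170679"
)

# Assumed model: cumulative shares under a perfectly uniform digit distribution.
UNIFORM_CDF = [(d + 1) / 10 for d in range(10)]


def r_squared(observed, model):
    """Coefficient of determination of observed values against a fixed model."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - m) ** 2 for o, m in zip(observed, model))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def cumulative_shares(digits):
    """Share of digits <= d, for d = 0..9."""
    running = 0
    shares = []
    for d in "0123456789":
        running += digits.count(d)
        shares.append(running / len(digits))
    return shares


print(r_squared(cumulative_shares(PI_DECIMALS), UNIFORM_CDF))
```

Nothing is fitted here: the model values are fixed in advance, and the statistic only measures how far the observed values stray from them. A different choice of observed values would give different numbers, so the output should not be expected to reproduce the figures reported in this post.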

Home on the range

I found this somewhat interesting; there are plateaus and long reversals in the range between the most common and least common digits in the cumulative count.

The stats, ma'am, just the stats

Here are the complete stats mentioned in passing in part two of the competition:

NOTE: Some of the Tau ones might be wrong; a courageous reader found a bug in my program. I think the pi ones are correct, though, and any errors in Tau would make Tau better in comparison to Pi, so it would not change the results of my totally arbitrary contest, thank goodness.
                                        pi                    tau
                                    length      position   length      position
Consecutive even numbers                29    36,454,143       28    39,904,078
Consecutive odd numbers                 30    92,438,125       28     2,946,687
Consecutive prime numbers               26   896,631,791       22    84,259,349
Consecutive binary numbers               8    42,408,101        8    65,607,193
Longest stretch lacking one number     196    18,522,937      210   362,783,626
Recapitulation of itself                 9    50,366,471       10    19,683,238
Recapitulation of the other             10    19,683,238       11    52,567,169
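Several rows of the table above are longest-run searches: find the longest stretch in which every digit belongs to some allowed set. The sketch below is one way to do that, not the program used for the table. `longest_run` is a hypothetical name, the sample is only the first 20 digits of pi after the decimal point, and positions are one-based counting from the first digit after the point.

```python
EVEN = set("02468")
ODD = set("13579")
PRIME = set("2357")
BINARY = set("01")

# First 20 digits of pi after the decimal point (a tiny sample for illustration).
SAMPLE = "14159265358979323846"


def longest_run(digits, allowed):
    """Return (length, one_based_start) of the first longest run of digits
    that all belong to `allowed`. A length of 0 means no digit matched."""
    best_len, best_start = 0, 0
    run_len, run_start = 0, 0
    for i, d in enumerate(digits):
        if d in allowed:
            if run_len == 0:
                run_start = i          # a new run begins here
            run_len += 1
            if run_len > best_len:     # strict '>' keeps the earliest longest run
                best_len, best_start = run_len, run_start
        else:
            run_len = 0
    return best_len, best_start + 1    # convert zero-based start to one-based


print(longest_run(SAMPLE, EVEN))
print(longest_run(SAMPLE, ODD))
# "Longest stretch lacking one number": allow every digit except the missing one.
print(longest_run(SAMPLE, set("0123456789") - {"7"}))
```

The same function covers the "lacking one number" row by running it once per excluded digit and keeping the longest result.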

Number nine, number nine

When talking about R squared, a common question is "how many nines?", i.e. is the statistic above 0.9 (one nine), 0.99 (two nines), 0.999 (three nines), etc. The R Squared of Pi's cumulative digit distribution vs. a uniform distribution hits seven nines at position 72,819,444; I did not check past 100 million. The pattern seems to be approximately adding a nine with every tenfold increase in position. Maybe that's some sort of inevitability with near-uniform distributions; it sounds likely. If any mathematicians read this and are like, "well, duh," let me know.

R squared position
0.9 19
0.99 577
0.999 6,410
0.9999 55,049
0.99999 461,828
0.999999 4,840,626
0.9999999 72,819,444
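The "one more nine per tenfold increase" pattern can be checked directly against the positions in the table above. This short sketch only does arithmetic on those reported numbers; the variable names are made up for the example.

```python
# Positions at which R squared first reaches one nine, two nines, ... seven nines,
# copied from the table above.
POSITIONS = [19, 577, 6_410, 55_049, 461_828, 4_840_626, 72_819_444]

# Growth factor in position for each additional nine.
ratios = [later / earlier for earlier, later in zip(POSITIONS, POSITIONS[1:])]

# Geometric mean of the growth factors across all six steps.
geometric_mean = (POSITIONS[-1] / POSITIONS[0]) ** (1 / len(ratios))

for nines, ratio in enumerate(ratios, start=2):
    print(f"{nines} nines: x{ratio:.1f}")
print(f"geometric mean: x{geometric_mean:.1f}")
```

The step-to-step factors range from roughly 8 to roughly 30, with a geometric mean a little above 12, so "approximately tenfold" is a fair description of most steps, with the first step as the main outlier.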

A final thought

It wasn't long ago that it was difficult to calculate pi. Now I could have downloaded a program to generate hundreds and hundreds of millions of digits on my desktop computer in a reasonable amount of time. Maybe the fact that supercomputers have pushed our knowledge into the hundreds of billions of digits makes it less fascinating, but many of the geek-out pi pages on the Internet date from the '90s.

• • •