Sunday, August 30, 2015

211: Saving A Few Million Years

Audio Link

Those of you who follow the Math Mutation Facebook feed may have noticed that a book I co-authored was just released: "Formal Verification: An Essential Toolkit for Modern VLSI Design". Now, I need to caution you that this is not a Math Mutation book-- it's a technical book, aimed at electrical and computer engineers involved in chip design. However, I do think Math Mutation listeners might have some interest in the core concepts on which the book is based. So, today I'll try to give you a brief summary of what Formal Verification is, and why it's worth writing a book about.

You're probably aware that modern computer chips are pretty complicated, by many measures the most complex devices ever created by man. For new ones coming out this year, the number of transistors is measured in the billions. So it makes sense to ask the question: how do we know these things will work? It would be prohibitively expensive to just build them first and then test them, so we need to discover and fix as many of the bugs as possible during the design stage. The process of finding these bugs is known as design validation. For many years, the most popular technology used in design validation has been simulation: testing how a software model of the design will behave for various sets of inputs.

Unfortunately, even a simple chip design has so many possible behaviors that there is no way to simulate them all. For example, suppose you are designing a simple integer adder: it will take 2 numbers as inputs, each expressed with 32 bits, or binary digits, which can each be 1 or 0. It will then output their sum. How many possible input sets are there for this design? Since each input has 32 bits, it has 2^32 possible values, thus resulting in 2^64 possible values for the pair. If we assume we have a fast simulator that can check 2^20 values per second, that means that we will need 2^44 seconds to check all possible values-- over half a million years. Most managers are not very happy when given a time estimate on this order to finish a project. And needless to say, most chips sold today are many orders of magnitude more complex than a simple adder.
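
For anyone who wants to check that estimate, here is a minimal Python sketch of the arithmetic. The variable names and the 365.25-day year are my own assumptions for illustration; the 2^20 checks per second is just the hypothetical simulator speed from the paragraph above.

```python
# Back-of-the-envelope check of the exhaustive-simulation estimate.
BITS_PER_INPUT = 32

# Two 32-bit inputs give 2^64 possible input pairs.
input_pairs = 2 ** (2 * BITS_PER_INPUT)

# Assumed simulator speed: 2^20 input pairs checked per second.
checks_per_second = 2 ** 20

# 2^64 / 2^20 = 2^44 seconds of simulation.
seconds_needed = input_pairs // checks_per_second

# Assumption: an average year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years_needed = seconds_needed / SECONDS_PER_YEAR

print(f"{seconds_needed} seconds is about {years_needed:,.0f} years")
```

Under these assumptions the result is roughly 557,000 years, consistent with the "over half a million years" figure.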

So, what can we do to make sure our chip designs really will work? A variety of technologies have been developed over the past few decades to try to find a good set of example values to simulate. And they do seem to be doing a decent job: most electronic devices you buy today seem to more-or-less do what you want them to. But it still seems like there should be a better way to validate them: no matter how good you make it, simulation and related methods can never cover more than a tiny portion of your design's possible behaviors.

That's where formal verification comes in. The idea of formal verification is to take a totally different approach: instead of trying specific values for your design, why not just mathematically prove that it will always be correct? That way you don't have to worry about trying every possible test case. If there is a single set of values that would generate an incorrect result, your proof will fail, and you will know your design has a bug. If you do succeed in mathematically proving your design correct, then you know that there is no bug, and do not need to waste time simulating lots of testcases. In effect, formally verifying a design is equivalent to simulating all possible values. Many would argue that philosophically, this is really the "right" way to validate chip designs. You may have heard the famous Guindon quote, "Writing is nature’s way of letting you know how sloppy your thinking is." Formal Verification pioneer Leslie Lamport expanded on this with "Math is nature's way of letting you know how sloppy your writing is.", and later added "Formal math is nature's way of letting you know how sloppy your math is."

You've probably guessed by now that there has to be a catch. Formal verification is easier defined than done: when billions of transistors are involved, how do we even get our heads around the problem of creating mathematical proofs? It's far beyond what anyone could manually do, so to make this method a possibility, humans need to be aided by intelligent software that helps to automate proofs. To further complicate matters, it's also been shown that any formal verification system needs to internally solve what are known as NP-complete problems. If you remember our discussion way back in episode 13, an NP-complete problem is believed to be inherently hard: no known algorithm can solve it efficiently in 100% of cases, and most computer scientists expect that none ever will. However, researchers have worked for many years to try to develop practical software that could utilize clever tricks to enable real proofs on a wide variety of actual industrial product designs.

The good news is that, in the past decade, formal technology has advanced to the point where it really is practical for an average design engineer to use in many cases. While formal verification software can't handle full multi-billion-transistor chip designs, it can often enable an engineer to create solid proofs on major sub-blocks that go into a chip design, massively reducing overall risk of bugs. Using formal verification software remains a bit of an art though. Due to the NP-completeness issue, the software may get stuck or progress very slowly: the user must often give subtle hints and suggest shortcuts to enable the proofs to complete. In addition, formal verification is a problem that is impossible to fully automate: no matter how good your software gets at proving stuff, a human still has to somehow be able to tell it what stuff to prove-- what is the overall intent of the design in the first place? Ultimately, someone has to carefully transfer the design intent from their human brain into a machine-readable form, and understand the possible limitations and pitfalls in this process. Sadly, computer software that directly plugs into your brain is probably still many years away, and even then I have the feeling that many of us think too sloppily to enable this level of verification directly.

Thus, the need for a Formal Verification book. While there have been many books on Formal Verification published over the past few decades, most have focused on internal algorithms that would be needed to develop the software involved. Our book is one of the first real practical manuals designed to help design and validation engineers use formal verification software on real-life design targets.

Anyway, that quick summary should give you an idea of what our new book is about. If you're in a field where you do chip design or something related, please visit our book's website, order a copy, and tell all your friends about it! If you're not in this field, the book probably won't make much sense to you, but hopefully you've still enjoyed this episode of the podcast.

And this has been your Math Mutation for today.


Sunday, July 26, 2015

210: Two Plus Two Equals Five

Audio Link

Before we start, I'd like to thank listener D. Zemke, who posted another nice review on iTunes.  Thanks D!

Now, on to today's topic.   Recently I heard someone quote a clever metaphor in a casual conversation,  "Life is when nature takes 2 and 2 to make 5."   It's a nice statement of how living creatures are more than the sum of their parts.  If you took all the chemical compounds in my body and dumped them on the ground in the right proportions, all you would get is a mess.   Yet somehow I am here, and at least sentient enough to record math podcasts.    I went online to try to find the source of this quotation, and was surprised to see the number of references to this seemingly silly nonsense equation, 2+2=5.

Most of us are probably familiar with the equation from George Orwell's classic novel 1984. As you probably recall, in the book, people are told that if the government says that 2+2=5, it is the duty of all citizens to believe it-- not just say it, but actually come to believe that it is true. Surprisingly, Orwell did not come up with this out of thin air: a real-life totalitarian government, the Soviet Union, actually did use 2+2=5 as part of its propaganda, in a poster with the title "2+2=5: Arithmetic of a counter-plan plus the enthusiasm of the workers." It wasn't quite as blatantly absurd as in 1984, but the Soviet propaganda poster used it as a metaphor: supposedly a 5-year plan could be completed in 4 years, because the enthusiasm of the workers provided an invisible additive factor. Sadly, most of this "enthusiasm" was mainly due to fear of being sent to the Gulag prison camps, which resulted in many managers doctoring the statistics to match the results that the government wanted-- on paper only. It's also reported that Nazi leader Hermann Goering actually used this metaphor in real life, once saying "If the Führer wants it, two and two makes five!"

The phrase 2+2=5 actually existed in the arts dating from the early 19th century. According to Wikipedia, the phrase was first coined in a letter from Lord Byron, where he wrote "I know that two and two make four-- & should be glad to prove it too if I could-- though I must say if by any sort of process I could convert 2 & 2 into five it would give me much greater pleasure." He may have been making an indirect reference to Rene Descartes' Meditations, where the famous philosopher discussed whether equations such as 2+3=5 exist outside the human mind, and whether they can be doubted: "And further, as I sometimes think that others are in error respecting matters of which they believe themselves to possess a perfect knowledge, how do I know that I am not also deceived each time I add together two and three, or number the sides of a square, or form some judgment still more simple, if more simple indeed can be imagined?"

Later Victor Hugo used this concept in a critique of the mob rule that had led to Napoleon, foreshadowing Orwell's later political metaphor: "Now, get seven million five hundred thousand votes to declare that two and two make five, that the straight line is the longest road, that the whole is less than its part; get it declared by eight millions, by ten millions, by a hundred millions of votes, you will not have advanced a step." Russian authors Ivan Turgenev, Leo Tolstoy, and Fyodor Dostoyevsky also made use of this metaphor. Turgenev used it to symbolize divine intervention: "Whatever a man prays for, he prays for a miracle. Every prayer reduces itself to this: Great God, grant that twice two be not four." In the 20th century, there are many instances of authors following Orwell's lead and again using this metaphor for the struggle against totalitarianism, including Albert Camus and Ayn Rand.

An intriguing question is whether there are cases when it is actually valid to say that 2+2=5. A well-known mathematicians' joke is that "2+2=5, for particularly large values of 2." This may refer to issues with rounding: if you start, for example, with the obviously correct equation "2.4 + 2.4 = 4.8", and ask someone to round all the numbers to the nearest integer, you do indeed derive "2+2=5" from this true equation. It also might be a case of playing with the definitions of symbols: perhaps you can define the symbol that we normally write as "2" to actually be an algebraic variable representing the value 2.5. A trickier example is a "proof" that 2+2=5 that is circulating on the web, where many lines of complex algebra are used. These many lines are artificially complex in order to misdirect you from one invalid step, where a term t is replaced with the square root of t squared. Remember that you can only do such a replacement if t is not negative, since the square root of t squared is really the absolute value of t, a fact glossed over in the so-called proof. I won't bore you by trying to cite all the lines of equations in an audio podcast, but you can find them linked online in the show notes if you're curious.
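
Both tricks are easy to try for yourself. Here is a small Python sketch of the rounding example and of the square-root step; the variable names and the sample value t = -3 are my own choices for illustration, not taken from the online "proof".

```python
import math

# Rounding: a true equation turns into "2 + 2 = 5" once each term is rounded.
a, b = 2.4, 2.4
total = a + b                       # 4.8
rounded = (round(a), round(b), round(total))
print(f"{a} + {b} = {total:.1f}  rounds to  {rounded[0]} + {rounded[1]} = {rounded[2]}")

# The invalid algebra step: sqrt(t^2) equals |t|, not t,
# so swapping one for the other breaks down when t is negative.
t = -3.0                            # hypothetical negative term
print(math.sqrt(t ** 2), abs(t), t)
```

The first line of output shows the rounded terms 2, 2, and 5; the second shows that the square root of t squared matches the absolute value of t rather than t itself.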

An amusing spoof article online points out some real-life situations where 2 and 2 might really make 5.   Ancient Incas used knotted ropes to track business transactions, and if you tie together two ropes that each have two knots, the resulting rope will have 5 knots, including the one used to tie them together.   Another example is if you put 2 male and 2 female rabbits in a cage-- pretty soon you will see numbers way larger than 5.   I'm pretty sure that most people who experience these situations in real life can make the distinction between the messiness of reality and the related arithmetic though.

But that last example brings us back around to the quote that started this whole thing.   Ironically, my web searching did not succeed in uncovering the source of the clever comparison between life and making two plus two equal five.   Most likely I didn't remember the phrasing exactly right, or else someone was just coining this on the fly and it didn't really come from a famous quote.   If you have heard it before and know its origin, please send me an email!

And this has been your math mutation for today.


Sunday, June 21, 2015

209: Wheels That Aren't Round

Audio Link

Most of us are generally familiar with the concept that wheels are usually round.   But do they have to be this way?    What are the properties that make a round wheel useful?   Yes, you might think that eight years of math podcasting has finally driven me insane, to question something so obvious.     But math geeks are famous for requiring proofs of the obvious-- and this is a case where common instincts might lead us astray.   Now of course, for a wide variety of reasons, circular shapes do tend to make the best wheels.     In certain cases, though, there is a more general class of figures that can be substituted for the circular shape, with some important real-life applications.   These are known as curves of constant width.

To simplify the discussion and avoid the complications of axles, let's discuss simple rollers. Suppose you want to smoothly roll a large plank across the top of a bunch of logs. If the logs have a circular cross section, it's pretty obvious that the plank can roll smoothly along, without wobbling up and down. But what is the property that enables this? The reason for the plank's smooth rolling is that the circle is a curve of constant width. This means that if you put two parallel lines on opposite sides of a circle, each just touching it, the distance between them will be the same no matter how the lines are oriented: the diameter of the circle. However, a surprising fact discovered by Euler in the 18th century is that there are many other curves of constant width that could be used instead and still allow smooth rolling.

The most famous non-circular curve in this class is known as the Reuleaux Triangle, a kind of equilateral triangle with rounded edges.   To create one, start with an ordinary equilateral triangle.  Then, for each vertex, replace the opposite side with the arc of a circle whose center is that vertex, and whose radius matches the side of the triangle.   If you think about it for a minute, you should see that this curve will be of constant diameter:   if a plank is rolling over the top, at any given moment either the plank or the ground will be touching a vertex, and the opposite surface will be touching a curved edge.    Since the circle used to form that curved edge is defined as the set of points equidistant from its center, the opposite vertex in this case, the distance between the plank and the ground will be a constant value equal to the circle's radius.    Thus, logs with a Reuleaux cross section will be rolled over just as smoothly as circular ones.   
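
If you'd rather check the constant-width claim numerically than take my word for it, here is a rough Python sketch. It builds a Reuleaux triangle from the construction just described and measures the gap between two parallel lines squeezing it at many different angles. The function names, the side length of 1, and the sampling density are all my own assumptions for illustration.

```python
import math

def reuleaux_points(side=1.0, samples_per_arc=2000):
    """Sample boundary points of a Reuleaux triangle built on an equilateral triangle."""
    a = (0.0, 0.0)
    b = (side, 0.0)
    c = (side / 2, side * math.sqrt(3) / 2)
    # Each arc is centered on one vertex and replaces the opposite side.
    # Angles are in degrees, measured from that arc's center.
    arcs = [(a, 0.0, 60.0), (b, 120.0, 180.0), (c, 240.0, 300.0)]
    points = []
    for (cx, cy), start, end in arcs:
        for i in range(samples_per_arc + 1):
            ang = math.radians(start + (end - start) * i / samples_per_arc)
            points.append((cx + side * math.cos(ang), cy + side * math.sin(ang)))
    return points

def width(points, direction_deg):
    """Distance between the two parallel supporting lines perpendicular to a direction."""
    ux = math.cos(math.radians(direction_deg))
    uy = math.sin(math.radians(direction_deg))
    projections = [x * ux + y * uy for x, y in points]
    return max(projections) - min(projections)

points = reuleaux_points()
widths = [width(points, deg) for deg in range(180)]
print(min(widths), max(widths))
```

Under these assumptions the smallest and largest measured widths both land within sampling error of the side length, 1. Running the same `width` function on the four corners of a unit square would instead give values ranging from 1 up to the square root of 2, which is why square logs make for a bumpy ride.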

As you can probably see from how we constructed it, the Reuleaux Triangle is just one representative of a large class of curves of constant width. Take any regular polygon with an odd number of sides, and replace the side opposite each vertex with an arc of a circle centered at that vertex. There are also many other curves in this class, with more complicated construction methods; you can read up on these in the show notes if you're curious.

The surprising discovery of this large class of shapes has led to some useful real-life applications. Reuleaux, the 19th-century engineer for whom the triangle is named (despite Euler's earlier knowledge of it), became famous for investigating a variety of uses based on converting circular into other types of motion. Later this led to applications in mechanisms as diverse as film projectors and automotive engines. Since a rotating Reuleaux triangle traces a shape that is nearly square, it has also been used to construct a special drill bit that enables woodworkers to drill square holes. By basing the drill on other curves of constant width, a similar method can be used to drill pentagon-, hexagon-, or octagon-shaped holes as well. This shape has also been used in the design of pencils, with the claim that the constant diameter but non-circular shape provides a comfortable grip while reducing the chance of spontaneously rolling off a table. And in several countries, non-circular curves of constant width have been used as the shape of coins, with their constant diameter providing advantages in the design of vending machines. You can see nice pictures of these and other applications at the links in the show notes.

But one of the most useful aspects of the Reuleaux Triangle and related shapes is as a non-circular counterexample, forcing us to question basic assumptions about simple geometric properties. According to some sources, engineers working on the doomed space shuttle Challenger tried to verify the cylindrical shape of some components by measuring their width at various sampling points, not being aware of the existence of non-circular curves of constant diameter. Too bad they didn't have math podcasts back then, though technically the engineers could have read Martin Gardner's classic essay on the topic. Anyway, if the shapes were not circular, this would mean that various types of stress would affect the parts unevenly. This may have contributed to the shuttle's eventual destruction.

So, be sure to think about the existence of these non-circular curves of constant width, next time you are assembling a mechanical device, minting your nation's currency, or designing a space shuttle.

And this has been your math mutation for today.


Sunday, May 24, 2015

208: Your Kids Are Smarter Than You

Audio Link

Before we start, I'd like to thank listeners Don-e Merson and Seasoncolor, who have posted some more nice reviews on iTunes.   Thanks guys!

Now, on to today's topic. Did you know that, measured by constant standards, the average Intelligence Quotient, or IQ, of the world's population has been steadily increasing as long as it has been measured? In fact, by today's standards, your great-grandparents most likely would be formally diagnosed as mentally retarded. It's a little confusing, since the IQ tests are continually re-normalized, so the "average IQ" at any given time is pegged to 100. But if we look at the raw test scores and compare them across decades, we see that in every modern industrialized country, the IQ has slowly been creeping upwards. This effect is known as the Flynn Effect, named after the New Zealand researcher who first noticed it in the 1980s. This seems pretty surprising-- could our entire population really be steadily increasing its intelligence?

When I first heard about this effect, I was a bit skeptical.     If you've read Stephen Jay Gould's classic "The Mismeasure of Man", you have learned about all sorts of broken and ridiculous ways in which people have attempted to measure intelligence at various times.   My favorite example was an IQ test from the early 20th century where your intelligence was, in part, dependent on your ability to recall the locations of certain Ivy League colleges.    Even though such egregious examples no longer are likely to appear, you could easily hypothesize that the Flynn Effect was merely measuring the fact that over the past century, kids have been progressively exposed to a lot more miscellaneous trivia first through radio, then TV, growing mass media, and finally on the Internet.   

Even simple things such as the expanding access to books and magazines throughout the 20th century might have contributed; I remember all the hours I spent biking between local used bookstores as a teenager, looking for cool math and science books, and I doubt my father had such an opportunity at his age.   My daughter won't even have to think about such absurdities, having instant access to virtually all major literature published by the human race over the Internet.   But it turns out that the belief that this IQ growth is just measuring access to accumulated factoids is not quite right-- the growth has been very minor in tests dependent on this type of factual knowledge, and is really measuring an increased ability to do abstract reasoning using simple concepts.

In our modern lives, we take the concept of abstraction for granted:  the ability to talk about and compare ideas, rather than just discuss concrete items and actions that are immediately relevant.   And of course all of modern mathematics, including topics we often discuss in this podcast, is dependent on the ability to do this kind of abstraction.   But this is not something to take for granted:  it has been slowly growing in our society from generation to generation.    For example, one of the online articles linked in the show notes talks about a study done on an isolated tribe in Liberia.   They took a bunch of random objects from the village and asked the villagers to sort them into categories.    Instead of sorting into groups of clothing, tools, and food, as we might do, they put items together that were used together, such as a potato with a knife, since the knife is used to cut the potato.     So apparently modern IQ tests are largely measuring our ability to think in abstract categories, and this is the ability that is increasing.   Flynn has argued that we should really label this kind of thinking as "more modern" rather than "more intelligent"-- can we really say objectively that one kind of thinking is better?   However, we probably can say that this modern thinking is a critical component in the explosion of science and technology that we observe in the modern world.

There are numerous theories to try to explain the Flynn Effect.   Most center on social or societal factors.   Perhaps the explosion of media exposure is important not because of miscellaneous factoids, but because of the generally more cognitively complex environment, forcing us to think in abstractions to make sense of the massive bombardment of ideas coming at us from literature, television, and the Internet.   The growth of intellectually demanding work, where more and more of us have jobs that involve at least some thinking rather than pure manual labor, may also contribute.     Another possible factor is the reduced family size in the Western world:  with fewer kids around, each gets more parental attention, and this may foster development of abstract thought.    And of course, in recent years, I'm sure there has been an IQ explosion among the very important subset of the population who listen to Math Mutation.

Aside from social factors, there are more basic physical ones:    basic improvements to health and welfare, such as massively reduced malnutrition and disease, could also be important here.     You may remember that back in podcast 110, "One Intestinal Worm Per Child", we discussed how simple health can have a much bigger effect on educational success than fancy computers.   There is also the theory that we are simply measuring the effects of Darwinian natural selection, where parents with this more modern thinking style are more likely to reproduce, due to coping better in our technological 20th-21st century society.     But most biologists believe that the Flynn effect has set upon us too quickly to be evolution-based.

To further complicate the discussion, some recent studies in Northern Europe seem to show that the Flynn Effect is disappearing or getting reversed. It's unclear whether this is a real effect, or an artifact of recent population shifts: over the past two decades, there has been massive immigration from the Third World into these countries, and it could be that we are just measuring the fact that a lot of new immigrants are just in earlier stages of the Flynn Effect treadmill. But as in every generation, there is no shortage of commentators who can find good reasons why today's young whippersnappers are supposedly getting dumber, such as a focus on repetitive video games and social-network inanity. We need to contrast this with their parents' more intellectual pursuits, such as Looney Tunes and Jerry Springer.

So, what does this all mean?    We certainly do see some effects in society that may very well be partially due to the Flynn Effect, such as the explosion of new technology in recent years.    I think we should do whatever we can to continue making our kids smarter, and enabling more modern and abstract thinking-- though of course, that would be true with or without the Flynn Effect anyway.    Encourage your kids to engage in cognitively complex tasks such as reading lots of books, learning to play a musical instrument, and discussing cool math podcasts.   But when they tell you in a few years that you're going senile, don't take it personally, you really are dumber than they are, due to the Flynn Effect.

And this has been your math mutation for today.


Sunday, April 12, 2015

207: Answering All Possible Questions

Audio Link

Have you ever wished, in your daily life, that you had a simple way to find all the answers about any subject that was vexing you? Perhaps you are in a personal crisis wondering whether God exists, or maybe have a mundane issue as simple as finding your way home when lost. Well, according to 13th-century monk Ramon Llull, you're in luck. Llull devised a unique philosophical system, based on combining a set of primitive concepts, that he believed would provide the path to solving any conceivable dilemma. His primary goal was to find a way to discuss religious issues and rationally convert heathens to Christianity, without relying on unprovable statements from the Bible or other holy books. As a philosophy, his system was far from definitive or complete, and gradually faded into obscurity. But along the way he became a major contributor to mathematics, making advances in areas as diverse as algebra, combinatorics, and computer science as he tried to elaborate upon his strange philosophical methods.

Llull began by listing a set of nine attributes in each of several categories of thought, intended to represent a complete description of that category, which could be agreed upon both by Christians and non-Christians.   For example, his first list was the nine attributes of God:   goodness, greatness, eternity, power, wisdom, will, virtue, truth, and glory.    He wanted to discuss all combinations of these virtues, but repeating them endlessly was kind of tedious in the days before word processing, so he labeled each with a letter:  B, C, D, E, F, G, H, I, K.    He then drew a diagram in which he connected each letter to each of the others, forming kind of a nine-pointed star with fully connected vertices; you can see a picture at one of the links in the show notes.    By examining a particular connection, you could spur a discussion of the relationship of two attributes of God:  for example, by observing the connection between B and C, you could discuss how God's goodness is great, and how his greatness is good.    Whatever you might think of his religious views, this was actually a major advance in algebra:   while the basics of algebra had existed by then, variables were commonly represented by short words rather than letters, and had been thought of as simply representing an unknown to be solved for in a single equation.    For the first time, Llull was using letters to represent something more complex than numbers, and mixing and matching them in arbitrary expressions.    In addition, his diagram of the relations between attributes was what we now call a graph, an important basic data structure in computer science.   He also created another depiction of the possible combinations as a square half-matrix, another data structure that is common today but was unknown in Llull's time.
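
To make the graph and half-matrix ideas concrete, here is a short Python sketch. It treats Llull's nine letters as vertices, lists every connection between two different letters, and prints them in a triangular layout. The layout is my own guess at a readable rendering, not a reproduction of Llull's actual figure.

```python
from itertools import combinations

# Llull's nine labels for the attributes of God (there is no J in his list).
letters = "BCDEFGHIK"

# Every line in the nine-pointed star joins two different letters:
# these are the edges of a fully connected graph on nine vertices.
edges = list(combinations(letters, 2))
print(len(edges))  # 36

# The same pairs laid out as a half-matrix: row i pairs letter i
# with every letter that comes after it.
for i, row in enumerate(letters[:-1]):
    pairs = " ".join(row + col for col in letters[i + 1:])
    print(" " * (3 * i) + pairs)
```

The first row starts with BC, the connection used above to discuss how God's goodness is great and his greatness is good.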

Llull's system got even more complicated when he introduced additional sets of attributes, and tried to find more combinations.     For example, another set of his concepts consisted of relationships:  difference, concordance, contrariety, beginning, middle, end, majority, equality, minority.   He also had a list of subjects:   God, angel, heaven, man, imaginative, sensitive, vegetative, elementative, instrumentative.   Even deeper philosophical conversations could theoretically result from combining elements from several lists.   This created some challenges, however.   He would again label each element of these lists with letters, but keeping track of all combinations led to an explosion of possibilities:  just the three lists we have so far make 9x9x9, or 729 combinations, and he had a total of 6 major lists.   So to facilitate discussion of arbitrary combinations, he created a set of three nested wheels, each divided into 9 sectors, one for each letter.   One would be drawn on a sheet of paper, and the other two would be progressively smaller and drawn on separate sheets that could be placed over the first one and independently rotated.    Thus, he had developed a kind of primitive machine for elaborating the combinations of multiple sets:  for each 9 turns of one wheel, you would turn the next larger wheel once, and by the time you returned to your starting point, you would have explored all the combinations possible on the three wheels.    Several centuries later, the great mathematician Gottfried Leibniz cited Llull as a major influence when inventing the first mechanical calculating machines.
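
The nested wheels behave much like a car's odometer, and that is easy to imitate in code. Below is a minimal Python sketch that enumerates every setting of three nine-letter wheels; reusing the same letters on every wheel and modeling the wheels with `itertools.product` are my own simplifications for illustration.

```python
from itertools import product

letters = "BCDEFGHIK"   # nine sectors per wheel

# product() advances the last position fastest, like the fastest-turning wheel:
# after nine steps of that wheel, the next wheel clicks forward once.
combos = list(product(letters, repeat=3))

print(len(combos))                  # 729
print(combos[0], combos[1], combos[9])
```

The total matches the 9x9x9 count above, and the tenth setting shows the middle wheel advancing after the fastest wheel completes a full turn.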

There were also several other contributions resulting from this work, which you can read about in more detail at the links in the show notes:   Llull can be thought of as the first person to discuss ternary relations, or functions of more than one variable; and he anticipated some of Condorcet's contributions to election theory, which we discussed back in podcast 183.  Llull, of course, was not really concerned with making contributions to mathematics, as he was concentrating on developing a comprehensive philosophical system.   In his own mind, at least, he believed that he had succeeded:   he claimed that "everything that exists is implied, and there is nothing that exists outside it".   To help prove this point, he wrote a long treatise elaborating upon physical, conceptual, geometrical, cosmological, and social applications of his ideas.     Apparently he even spent five pages showing how his system could aid the captain of a ship that was lost at sea.    Personally, I would prefer to have a GPS.   But even if our modern thought processes don't strictly follow Llull's guidelines, we still owe him a debt of gratitude for his contributions to mathematics along the way.

And this has been your math mutation for today.


Sunday, March 22, 2015

206: Deceptive Digits

Audio Link

Imagine that you are a crooked corporate manager, and are trying to convince your large financial firm's customers that they own a set of continually growing stocks, when in fact you blew the whole thing investing in math podcasts over a decade ago. You carefully create artificial monthly statements indicating made-up balances and profits, choosing numbers where each digit 1-9 appears as the leading digit about 1/9th of the time, so everything looks random just like real balances would. You are then shocked when the cops come and arrest you, telling you that the distribution of these leading digits is a key piece of evidence. In fact, due to a bizarre but accurate mathematical rule known as Benford's Law, the first digit should have been 1 about 30% of the time, with probabilities trailing off until 9s only appear about 5% of the time. How could this be? Could the random processes of reality actually favor some digits over others?

This surprising mathematical law was first discovered by American astronomer Simon Newcomb back in 1881, in a pre-automation era when performing advanced computations efficiently required a small book listing tables of logarithms. Newcomb noticed that in his logarithm book, the earlier pages, which covered numbers starting with 1, were much more worn than later ones. In 1938, physicist Frank Benford investigated this in more detail, which is why he got to put his name on the law. He looked at thousands of data sets as diverse as the surface areas of rivers, a large set of molecular weights, 104 physical constants, and all the numbers he could gather from an issue of Reader's Digest. He found the results remarkably consistent: a 1 would be the leading digit about 30% of the time, followed by 2 at about 18%, and gradually trailing down to about 5% each for 8 and 9.

While counterintuitive at first, Benford's Law actually makes a lot of sense if you look at a piece of logarithmic graph paper. You probably saw this kind of paper in high school physics class: it has a large interval between 1 and 2, with shrinking intervals as you get up to 9, and then the interval grows again to represent the beginning of the next order of magnitude. The idea is that this scale can represent values that may be very small and very large on the same graph, by having the same amount of space on a graph represent much larger intervals as the order of magnitude grows. It effectively transforms exponential intervals to linear ones. If you can generate a data set that tends to vary evenly across orders of magnitude, it is likely to generate numbers which appear at random locations on this log scale-- which means that the probability of a value landing in a 1-2 interval is much larger than for a 2-3 interval, a 3-4 interval, and so on.
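To make the graph-paper picture concrete, here is a small Python sketch that measures how much of one order of magnitude each leading digit occupies on a log scale. The function name benford_probability is just a label chosen for this example; the width of the interval from d to d+1 works out to log10(1 + 1/d), which is the usual statement of Benford's Law.

```python
import math

def benford_probability(d):
    """Fraction of a base-10 log scale taken up by the interval [d, d+1)."""
    return math.log10(d + 1) - math.log10(d)   # same as log10(1 + 1/d)

# Print the share of each leading digit as a percentage.
for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

The widths come out to roughly 30% for 1 and 18% for 2, shrinking to about 5% for 8 and 9, and the nine of them add up to one full order of magnitude.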

Now, you are probably thinking of the next logical question: why would a data set vary smoothly across several orders of magnitude? Actually, there are some very natural ways this could happen. One way is if you are choosing a bunch of totally arbitrary numbers generated from diverse sources, as in the Reader's Digest example, or the set of assorted physical constants. Another simple explanation is exponential growth. Take a look, for example, at the powers of 2: 2, 4, 8, 16, 32, 64, 128, etc. You can see that for each count of digits in the number, you only go through a few values before jumping to having more digits, or the next order of magnitude. When you add new digits by doubling values, you will jump up to a larger number that begins with a 1. If you try writing out the first 20 or so powers of 2 and look at the first digits, you will see that we are already not too far off from Benford's Law, with 1s appearing most commonly in the lead.
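If you would rather not write out the powers of 2 by hand, a few lines of Python can do the tallying. This is only a quick sketch; the variable names are made up for the example.

```python
from collections import Counter

# The first 20 powers of 2: 2, 4, 8, ..., 1048576.
powers = [2 ** k for k in range(1, 21)]

# Tally the first digit of each power.
counts = Counter(int(str(p)[0]) for p in powers)

for d in range(1, 10):
    print(d, counts[d])
```

With only 20 values the match is rough, but 1 already leads 6 of the 20 numbers, or 30%, while 7 and 9 have not shown up in the lead at all yet.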

Sets of arbitrarily occurring human or natural data that can span multiple orders of magnitude also tend to share this Benford distribution. The key is that you need to choose a data set that does have this kind of span, due to encompassing both very small and very large examples. If you look at populations of towns in England, ranging from the tiniest hovel to London, you will see that it obeys Benford's law. However, if you define "small town" as a town with 100-999 residents, creating a category that is restricted to three-digit numbers only, this phenomenon will go away, and the leading digits will likely show a roughly equal distribution.

The most intriguing part of Benford's law is the fact that it leads to several powerful real-life applications. As we alluded to in the intro to this topic, Benford's Law is legally admissible in cases of accounting fraud, and can often be used to ensnare foolish fraudsters who haven't had the foresight to listen to Math Mutation. (Or who are listening too slowly and haven't reached this episode yet.) A link in the show notes goes to an article that demonstrates fraud in several bankrupt U.S. municipalities based on their reported data not conforming to Benford's law. It was claimed that this law proves fraud in Iran's 2009 election data as well, and in the economic data Greece used to enter the Eurozone. It has also been proposed that this could be a good test for detecting scientific fraud in published papers. Naturally, however, once someone knows about Benford's law they can use it to generate their fake data, so compliance with this law doesn't prove the absence of fraud.

So, next time you are looking at a large data set in an accounting table, scientific article, or newspaper story, take a close look at the first digits of all the numbers. If you don't see the digits appearing in the proportions identified by Benford, you may very well be seeing a set of made-up numbers.
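For anyone who wants to try this on real data, here is one possible way to automate the check in Python. It is a sketch under assumptions rather than the method used in any of the cases above: the helper names are invented for this example, and the chi-square comparison (with 15.507 as the standard 5% cutoff for 8 degrees of freedom) is one common choice of test that I have swapped in, not something from the articles in the show notes.

```python
import math
from collections import Counter

def leading_digit(x):
    """Return the first significant digit (1-9) of a nonzero number."""
    if x == 0:
        raise ValueError("zero has no leading digit")
    # Scientific notation always starts with the leading digit, e.g. '3.14...e+02'.
    return int(f"{abs(x):.15e}"[0])

def benford_chi_square(values):
    """Chi-square distance between observed leading digits and Benford's Law."""
    digits = [leading_digit(v) for v in values]
    n = len(digits)
    observed = Counter(digits)
    total = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)   # Benford's expected count for digit d
        total += (observed[d] - expected) ** 2 / expected
    return total

# Assumed cutoff: chi-square critical value for 8 degrees of freedom at the 5% level.
CRITICAL_5_PERCENT = 15.507

powers_of_two = [2 ** k for k in range(1, 1001)]   # spans about 300 orders of magnitude
three_digit = list(range(100, 1000))               # every leading digit equally often

print(benford_chi_square(powers_of_two) < CRITICAL_5_PERCENT)
print(benford_chi_square(three_digit) < CRITICAL_5_PERCENT)
```

The powers of 2 should pass comfortably, while the evenly spread three-digit numbers fail by a wide margin. Keep in mind that a failed check is only a hint to look closer, and a passed check proves nothing about honesty.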

And this has been your math mutation for today.



Sunday, March 1, 2015

205: The Converse of a CEO

Audio Link

Ever since I was a small child, I aspired to grow up to become a great Rectangle. When I was only six years old, my father took me to meet one of the leading Rectangles of New Jersey, and I will always remember his advice: "Be sure to have four sides and four angles." All through my teenage years, I worked on developing my four sides and four angles, as I read similar advice in numerous glossy magazines aimed at Rectangle fans. In high school, my guidance counselor showed me many nice pamphlets with profiles of famous Rectangles who had ridden their four sides and four angles to success. Finally, soon after I turned 18, I took a shot at realizing my dream, lining up for many hours to audition for a spot on the popular TV show "American Rectangle". But when I made it up onto the stage, I was mortified to be met by a chorus of laughter, and ended up as one of the foolish dorks that Simon Cowell makes fun of on the failed auditions episode. With all my years of effort, I had not become a Rectangle, but a mere Trapezoid.

OK, that anecdote might be slightly absurd, but think for a moment about the premise.   Suppose you want to become successful in  some difficult profession or task.   A natural inclination is to find others who have succeeded at that, and ask them for advice.   If you find something that a large proportion of those successful people claim to have done, then you conclude that following those actions will lead you to success.     Most of us don't actually aspire to become geometric shapes, but you can probably think of many miscellaneous pieces of advice you have heard in this area:   practicing many hours, waking up early every day, choosing an appropriate college major, etc.    I started reflecting on this concept after looking at a nice career planning tool aimed at high school students, which lets them select professions they are interested in, and then read about attributes and advice from those successful in it.

Unfortunately, this kind of advice-seeking from the successful is actually acting out a basic mathematical fallacy. In simple logic terms, an implication statement "A implies B" is logically different from its converse, "B implies A". Neither statement logically follows from the other: "A implies B" does not mean that "B implies A". When we look at the case of rectangles, this seems fairly easy to understand: the condition A of having four sides and four angles does NOT imply the consequent B, that the object is a rectangle. By observing that all rectangles have these characteristics, we are learning the opposite: Being a rectangle implies that the object has four sides and four angles. This is important to recognize because there may be infinitely many non-rectangle objects that meet this condition, and actual rectangles might represent only a small portion of the possibilities. If we wanted to isolate conditions that will imply something is a rectangle, we need to look at both rectangles and non-rectangles, to identify unique rectangle conditions, such as having four right angles. Once we have a set of properties that will pertain only to rectangles and not to non-rectangles, then we might be able to come up with an intelligent set of preconditions.
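The difference between a statement and its converse can be spelled out with a tiny truth table. The Python sketch below uses a helper called implies, a name chosen just for this illustration, and then plugs in our unlucky trapezoid, with A standing for "has four sides and four angles" and B for "is a rectangle".

```python
from itertools import product

def implies(a, b):
    """Material implication: 'a implies b' fails only when a is true and b is false."""
    return (not a) or b

# Full truth table: columns are A, B, "A implies B", "B implies A".
for a, b in product([False, True], repeat=2):
    print(a, b, implies(a, b), implies(b, a))

# The trapezoid: four sides and four angles (A is true), but not a rectangle (B is false).
has_four_sides_and_angles = True
is_rectangle = False

trapezoid_fits_rectangle_rule = implies(is_rectangle, has_four_sides_and_angles)
trapezoid_fits_converse = implies(has_four_sides_and_angles, is_rectangle)
print(trapezoid_fits_rectangle_rule, trapezoid_fits_converse)
```

The two columns of the table disagree whenever exactly one of A and B is true. The trapezoid is one of those cases: it is perfectly consistent with "every rectangle has four sides and four angles", yet it is a counterexample to the converse.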

Sadly, real life does not always offer us geometric shapes. When we substitute a real aspiration people might have, too many try to infer the keys to success just from looking at the successful. Without thinking through this basic logical fallacy about a statement and its converse, "A implies B" does not mean "B implies A", many people waste lots of time and money following paths where their likelihood of success is minimal. A common case among today's generation of middle class kids is the hopeful young writer who decides to major in English. An aspiring writer might see that many successful writers have degrees in English, without taking the time to note that the proportion of English majors who become successful writers is infinitesimally small. The statement "If you are a successful writer today, you probably have a college degree in English" does not imply "if you earn a degree in English, you will probably become a successful writer." In contrast, if looking at computer engineering, they might see a similar profile among the most successful-- but will also find that unlike in English, a huge majority of computer engineering majors do end up with a well-paying job in that field upon graduation. So in that case, the implication really does work both ways-- but this is a coincidence, since the statement and its converse are independent.
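A little arithmetic shows how far apart those two statements about writers can be. In the Python sketch below, every count is invented purely for illustration and is not a real statistic; the point is only that the same group of people gives very different percentages depending on which direction you read the implication.

```python
# Hypothetical counts, made up for this example (NOT real data).
english_majors = 100_000                 # people who earned an English degree
successful_writers = 1_000               # people who became successful writers
writers_with_english_degree = 700        # people in both groups

# "If you are a successful writer, you probably have an English degree."
p_degree_given_writer = writers_with_english_degree / successful_writers

# "If you earn an English degree, you will probably become a successful writer."
p_writer_given_degree = writers_with_english_degree / english_majors

print(f"{p_degree_given_writer:.1%}")
print(f"{p_writer_given_degree:.1%}")
```

With these made-up numbers, most successful writers hold the degree, yet fewer than one in a hundred degree holders became a successful writer. Asking only the successful would show you the first figure and hide the second.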

Even famous business consultants are subject to this fallacy.   Have you heard of  the influential 1980s business book "In Search of Excellence", where the authors closely looked at a set of successful companies to find out what characteristics they were built upon?      That became one of the all-time best-selling business books, and many leaders followed their sweeping conclusions, hoping to someday make their companies as successful as NCR, Wang, or Data General.     But some have criticized the basic premise of this research for this same basic flaw:  trying to determine the conditions of success by looking only at the successful will inherently get you the wrong kind of implication.   It may enable you to find a set of preconditions that being successful means you must have had, while these same preconditions are met by endless numbers of failed companies.   You really need to study both success and failure to find conditions that uniquely imply success.

So, when you or your children are thinking about their future, look carefully at all the available information, not just at instances of success.   Always keep in mind that a logical statement "A implies B" is truly distinct from its converse "B implies A", and take this into account in your decision making.

And this has been your math mutation for today.