When iTunes introduced its “shuffle” feature, Apple received many complaints. Listeners noticed the same songs coming up again and again, and they concluded that the shuffle was not truly random. Apple changed the algorithm, and it works a bit better now. However, the change actually made the process non-random; it was the previous iteration of the software that was random. Why, then, did the complaints arise?
If you take a carton of toothpicks and throw them across the room in a truly random manner, you will notice that the toothpicks form clusters. This “clumping” is characteristic of a Poisson point process and of related families such as Cox processes. Simply put, because each toothpick lands independently of the others, nothing prevents several from landing close together, so some regions end up crowded while others stay empty. The same occurred in World War II, when the Germans were bombing Britain. The impacts showed the same type of clustering one would see in iTunes: certain areas were hit more often than others. This led the British to think that the Germans had some strategy to their bombing when, in fact, the pattern was what a purely random process would produce. We tend to think that a random process should be evenly distributed, and when reality defies that expectation, we no longer see the randomness in the random process. Apple decided to change their algorithm to a less random but more evenly distributed one, and customers were happier.
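The clumping is easy to reproduce. Here is a minimal sketch, assuming we stand in for the room with a row of equally sized cells and drop each "toothpick" into a cell chosen independently and uniformly. The function names, the 500-by-500 sizes, and the fixed seed are my own choices for illustration. The tally of cells by hit count is printed next to the count a Poisson model predicts.

```python
import math
import random
from collections import Counter

def scatter_counts(n_points, n_cells, seed=0):
    """Drop n_points independently and uniformly into n_cells; return hits per cell."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    hits = [0] * n_cells
    for _ in range(n_points):
        hits[rng.randrange(n_cells)] += 1
    return hits

def poisson_expected(n_points, n_cells, k):
    """Expected number of cells with exactly k hits under a Poisson model."""
    lam = n_points / n_cells  # average hits per cell
    return n_cells * math.exp(-lam) * lam ** k / math.factorial(k)

hits = scatter_counts(n_points=500, n_cells=500)
observed = Counter(hits)

# Columns: hits per cell, observed number of cells, Poisson expectation.
for k in range(max(observed) + 1):
    print(k, observed.get(k, 0), round(poisson_expected(500, 500, k), 1))
```

Even with one point per cell on average, a large share of cells stay empty while others collect two, three, or more points. An "even" spread with exactly one point in every cell would be the suspicious outcome.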
I could discuss different types of randomness at length, but I would rather touch on two approaches to random number generation: pseudo-random number generators and true random number generators. Pseudo-random number generators use mathematical formulae or precomputed tables to produce numbers that appear random. This process is efficient, and it is deterministic rather than stochastic: the same starting seed always yields the same sequence. The problem is that these generators are periodic and will eventually cycle through the same sequence of pseudo-random numbers. While they may be excellent for drawing random numbers on small scales, generators with short periods or correlated outputs can cause significant problems in large-scale simulations, where the lack of true randomness creates artifacts in the data and confounds proper analysis.
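To make the determinism and the periodicity concrete, here is a toy linear congruential generator, one of the oldest formula-based designs. The parameters (a=5, c=3, m=16) are deliberately tiny values I picked so the cycle is visible; they are not taken from any real library.

```python
from itertools import islice

def lcg(seed, a=5, c=3, m=16):
    """Yield an endless stream from the recurrence x -> (a*x + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

def period(seed, **params):
    """Count the steps between the first two visits to the same state."""
    seen = {}
    for step, value in enumerate(lcg(seed, **params)):
        if value in seen:
            return step - seen[value]
        seen[value] = step

# Same seed, same formula, same numbers: the process is deterministic.
print(list(islice(lcg(1), 5)))  # [8, 11, 10, 5, 12]

# After 16 draws the sequence starts over.
print(period(1))  # 16
```

Production generators work on the same principle with far longer cycles. The Mersenne Twister behind Python's `random` module, for instance, has a period of 2**19937 - 1, so in practice trouble tends to come from weak or badly seeded generators rather than from running out of numbers.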
True random number generators, on the other hand, use real data. Typically, measurements of physical phenomena, such as atmospheric noise or radioactive decay, are extracted and used to generate random values. The lavarand generator, for example, used images of lava lamps to generate random numbers. These true random number generators are nondeterministic and do not suffer from the periodicity of pseudo-random number generators.
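Most of us do not have a lava lamp rig, but the operating system offers something in the same spirit. The sketch below contrasts a seeded pseudo-random generator with Python's interfaces to the operating system's randomness source. One caveat: that source is typically a cryptographic generator that is reseeded from hardware and environmental noise, so it is not a pure physical generator like lavarand, only a practical stand-in for one.

```python
import os
import random
import secrets

# A seeded pseudo-random generator replays the same sequence every time.
a = random.Random(42)
b = random.Random(42)
same = [a.random() for _ in range(3)] == [b.random() for _ in range(3)]
print(same)  # True

# The OS source cannot be seeded or replayed by the caller.
raw = os.urandom(8)             # 8 unpredictable bytes
die = secrets.randbelow(6) + 1  # an integer from 1 to 6
sysrng = random.SystemRandom()  # random.Random interface backed by os.urandom
sample = sysrng.random()        # a float in [0.0, 1.0)
print(raw.hex(), die, sample)
```

The trade-off is reproducibility: a seeded generator lets you rerun a simulation exactly, while draws from the operating system source cannot be replayed.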
This distinction is important in the simulation of data. How can one best generate random numbers? If the internal clock is used to seed a generator and the code is run thousands of times in a loop, the seeds, and therefore the outputs, can repeat or follow a pattern tied to the computation time, which generates artifacts. The use of atmospheric noise could overcome this, though pulling the data takes time and could slow down computation.
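The clock problem is easiest to see when a generator is reseeded inside the loop. This is a sketch of that anti-pattern, assuming a clock truncated to whole seconds; the function names and the `clock` parameter are hypothetical, added so the effect can be shown with any time source.

```python
import random
import time

def draws_reseeding_each_time(n, clock=time.time):
    """Anti-pattern: reseed from a whole-second clock before every draw."""
    out = []
    for _ in range(n):
        rng = random.Random(int(clock()))  # same second -> same seed -> same value
        out.append(rng.random())
    return out

def draws_seeding_once(n, clock=time.time):
    """Seed a single generator once, then keep drawing from it."""
    rng = random.Random(int(clock()))
    return [rng.random() for _ in range(n)]

bad = draws_reseeding_each_time(1000)
good = draws_seeding_once(1000)

# Number of distinct values in each batch of 1000 draws.
print(len(set(bad)), len(set(good)))
```

On a typical machine the loop finishes within a second or two of clock time, so the reseeded version yields only one or two distinct values out of a thousand draws, and how many you get depends on how long the computation takes. Seeding once, outside the loop, avoids the artifact.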
The world around us is filled with processes both random and nonrandom. It is a challenge to generate artificial random processes, and it is surprising that truly random processes often appear nonrandom to human observers.