Add Me!Close Menu Navigation

Randomness

People tend to trivialize the concept of randomness in computing. While true randomness cannot be replicated by a computer, the occurrences of randomness in nature have provided us with considerable insight into the properties of a random sequence of numbers. We can replicate these properties in defined numerical sequences that behave tolerably similar to random sequences.

In this post, we will implement and examine a simple yet well known pseudorandom number generation algorithm known as the linear congruential method.

Randomness isn’t really something that can be defined, we can only observe certain properties of randomness such as the equal likelihood for any number within the domain to occur. In order to fulfill this definition in our generator, we also have to make sure that the random number is taken from a countable domain set. Beyond that however, randomness is more or less an ambiguous concept rather than anything tangible. It’s not at all insane (meh, maybe a little) to theorize that randomness is in fact a byproduct of a snowman pissing out a rainbow of low energy spectrum.

But weird pictures of snowmen aside, we can actually define what a random sequence of number may look like and fool the user into thinking that a non-random sequence is random because it appears to be random. For example,

0.0012512588885159
0.56358531449324
0.19330423902097
0.80874050111393
0.58500930814539
0.47987304300058
0.3502914517655
0.89596240119633
0.82284005249184
0.7466048158208
0.17410809656056
0.85894344920194
0.71050141911069

may look like a sequence of random numbers, but if you type in

math.randomseed(1)
for i=1,13 do 
    print(math.random()) 
end

you will always get this sequence of numbers.

Now I’m not saying that random sequences are unique, true randomness just cannot be generated from a preset pattern. But if all we’re asking for is something that looks like random sequence of numbers, then these pseudorandom number generators work pretty well.

1. The Linear Congruential PRNG

This simple, efficient, and still somewhat effective (for a PRNG that provides arbitrary numbers and can cycle through every element of a domain) algorithm was developed by D. Lehmer in 1951, which computes successive random numbers from an initial seed. The initial formulation is as follows:

X[i+1] = (b*X[i] + 1) mod m

Which is endowed with the property that the resulting X can never be larger than m-1. Also, for well chosen values of b and m, the PRNG will have a long period. Knuth suggests the following rules when choosing the b and m parameters:

  1. m needs to be large, commonly set to 2^32 (largest unsigned 32bit int) or a large number of base 2 so that modulus can be optimized to a bitwise and.
  2. b shouldn’t be too large or too small, it is usually 1 digit less than m
  3. b should end with …n21 where n is an even number. Admittedly this is a very weird requirement, but it has been shown to prevent the occurrence of a few “bad cases” uncovered by mathematical analysis.

Following these rules, the following is a simple implementation of a LCG:

a = 0 -- Seed
function rand()
   a = (385934821*a + 1) % (2^32)
   return a
end

which returns the following list:

1
385934822
192177344
2638077888
1098901248
17605376
1042075393
1810960640
853095680
3367332096
3509841152
2524184832
1521083648

Meh, looks kinda “random” to me. For all intensive purposes, this could pass off as a stream of random numbers in everyday use. (LCG’s are not usually used for cryptography as they’re not inherently secure)

2. A Slight Variation

Modern lightweight PRNG’s commonly implements a variation of the 1951 algorithm which introduces an increment parameter, c, that defines how the PRNG grows.

X[i+1] = (b*X[i] + c) mod m

In 1961, Greenberger also formalized a set of rules in order to maximize the noncyclic generation of numbers (IE: prevent the generation of the seed itself, which will cause the PRNG to cycle back)

  1. m and c are relatively prime, or that they have no common divisors / factors other than 1
  2. If there exists a prime number q that divides wholly into m, then the said q also divides into (b-1)
  3. If 4 divides into m, then 4 also divides into (b-1)

The rationales behind these and other requirements are discussed in heavier details elsewhere, just google the keywords “linear congruential generator”.

Common parameters

Shamelessly stolen from Wikipedia:

Source m b c
Numerical Recipes 232 1664525 1013904223
Borland C/C++ 232 22695477 1
glibc 232 1103515245 12345
ANSI C 232 1103515245 12345
Borland Delphi 232 134775813 1
Microsoft Visual/Quick C/C++ 232 214013 2531011
MMIX by Donald Knuth 264 6364136223846793005 1442695040888963407
Random class in Java API 248 25214903917 11

Implementations

So to replicate the PRNG used by gcc, we can simply write

a = 0 -- Seed
function rand()
   a = (1103515245*a + 12345) % (2^32) 
   return a -- Note, we'll also need to mask the highest 2 bits (via mod 2^30), but for all intensive purposes...
end

And to create a function that generates these PRNGs, we can write a factory function:

a = 0 -- Seed
function randFactory(m, b, c)
    return function()
        a = (b*a + c) % m
        return a
    end
end

However, a crucial advantage of the LCG is its ability to run multiple RNG streams without any significant overhead. Hence, in order to run concurrent instances of rand, we can cache the state of the RNG in a local variable that is an upvalue with respect to the actual rand function:

function randFactory(m, b, c, a) -- a is seed
    if not a then a = 0 end
    return function()
        a = (b*a + c) % m
        return a
    end
end

so that here

rand = randFactory(2^32, 1103515245, 12345)
for i=1,10 do print(rand()) end

rand2 = randFactory(2^32, 1103515245, 12345)
for i=1,10 do print(rand2()) end

would result in

12345
3554416254
2802067456
1358247936
2138638336
1459132416
1445521408
2518349824
4044081152
3908327424
12345 -- rand2() starts here
3554416254
2802067456
1358247936
2138638336
1459132416
1445521408
2518349824
4044081152
3908327424

Bit Masks

The Glibc implementation actually consumes only 31 bits, this can be easily rectified via

a = 0 -- Seed
function rand()
   a = (1103515245*a + 12345) % (2^32) 
   return a%(2^31) -- The internal state is not pruned, still 32 bit
end

In order to improve the effectiveness of the generator, some implementations produce more bits than are consumed. This is because the lower bits are relatively prone to collision with the seed so consuming only the higher bits will often improve the period of the RNG. In lua, bitwise shifts to the right can be implemented via division of (2^n), so we can replicate Java’s RNG via

a = 0 -- Seed
function rand()
   a = (25214903917*a + 11) % (2^48) 
   return a/(2^32) -- again, internal state is not modified
end

And of course, the cheek in tongue solution:

rand = math.random
Posted By Lee

Woosh

  • http://phailed.me/ Phailure

    btw, if anyone wants the Illustrator file for the snowman pissing out a rainbow, just holler and I’ll upload the .ai

  • I’mBack

    Lee, you’re smart. – I’mBack

  • http://phailed.me/ Phailure

    xD kevin

  • Pingback: best beat making software for pc()

  • Three Balls to the Wind

    I think there’s a problem here in that accurately computing the product of two 32-bit integers requires 64 bits for the result. Lua’s numbers are floating point and do not have 64 bits of precision. So some of the multiplications b*a will produce an incorrect result . Thus the random sequences above will “misbehave”.

  • Three Balls to the Wind

    To see what I mean, try multiplying Delphi’s b of 134775813 (from above) with 4294967295 using lua vs. the venerable bc:

    echo ‘134775813 * 4294967295’ | bc

    lua5.1 -e ‘print(string.format(“%.0f”, 134775813 * 4294967295))’

    The lua output is close but not exact.

  • Three Balls to the Wall

    Years ago a college paper I had written something like “…the horseshoe crab is, for all intensive purposes, a crustacean…” or some such nonsense. When I got back the paper, the teacher had crossed out the expression “for all intensive purposes” and written “for all intents and purposes” above it in red.