Thursday, August 15, 2013

Random but not really: A pseudo-Bell Curve Data Generator

The Problem

Have been contemplating how to create a list of data that is unordered, and in that sense "random", but not truly random from a statistician's viewpoint (You know, what you get from "RAND()" and "RANDBETWEEN()"). I was looking for something more like what you would see on a Bell Curve, weighted to the middle and trailing to the left and right.  For example, I need a pseudo-schedule with dates around  "NOW()" some typed in date.

Finding the general solution

Well, it was not in EXCEL or VBA help from what I could see, so I went to the next great fount of knowledge: the Internet.  Found a few references that reminded me of the most common Bell Curve-like structure I know: rolling board game dice.  There are more combinations for 7 than anything else and the fewest for 2 and 12.  All other numbers arc nicely from up from 2 to 7 and down to 12.

When I saw a reference to an RPGer (no, not someone with a rocket propelled grenade or an AS400 programmer) the light went on.  Role Pay Gaming depends on dice with different numbers of sides for different needs, and the combinations of the dice defined the number of different outcomes.

The Formula Solution

Rolling a pair of standard, six-sided dice, with one to six dots per side, can be simulated adding two executions of the  RANDBETWEEN() function.
  • RANDBETWEEN(1,6) + RANDBETWEEN(1,6) would be the result of rolling 2 standard dice.
  • CHOOSE(RANDBETWEEN(1,3),"Rock","Paper","Scissors")

The VBA Solution

The VBA solution, DiceThrow(), uses RANDBETWEEN().  A single call of following code is not sophisticated.  
  • There can be only one kind of die.  
  • The numbers on the dice are sequential, e.g. if you say the lowest number on a face is 1 and the largest 6, it assumes the numbers are 1, 2, 3, 4, 5, and 6.
  • The dice are "fair" in the sense that each face is as likely to appear as any other.
  • The inputs(parameters) are 
    • Bottom:  The lowest number on a face of the dice.
    • Top: The highest number.
    • Dice: The number of dice

Function DiceThrow(Bottom as Integer, Top as Integer, Dice as Integer) As Integer
    Dim i As Integer
    DiceThrow = 0
    For i = 1 To Dice
        DiceThrow = DiceThrow + Application.WorksheetFunction.RandBetween(Bottom, Top)
    Next i
End Function

Using DiceThrow()

What's the number?

The simplest use of DiceThrow is to plug it into a cell with numbers for Top, Bottom, and Dice.  
  • DiceThrow(1,6,2) would give the roll of two normal dice each time the sheet was recalculated.
For combinations of dice, e.g. a 10-sided die, numbers 7 to 16, and a normal die, the entry would be 
  • DiceThrow(7,16,1)+DiceThrow(1,6,1) 

One suggestion would be to make the Bottom, Top, and Dice entries refer to Names in the workbook.  for example
  • Names: Set1Bottom, Set1Top, and Set1Dice could be 1, 6, and 2, respectively, indicating a pair of "normal" dice.
  • DiceThrow(Set1Bottom,Set1Top, Set1Dice) would give the roll.

Bell Curve pseudo-data

While not a true Bell Curve (Bell Curves have continuous data, this is discrete), the following sub-routine will give an approximation of a Bell Curve and put it on a worksheet.  This could be used for demos, testing, presentations of concepts.

Sub BellCurve()
    Const Tries As Double = 1000000
    Const Dice As Integer = 5
    Const Bottom As Integer = 1
    Const Top As Integer = 10
    Dim arrBellCurve() As Double
    Dim i As Double
    Dim n As Integer
    
    ReDim arrBellCurve(Dice * Top, 2)
    For i = 1 To UBound(arrBellCurve)
        arrBellCurve(i, 1) = i
    Next i
    For i = 1 To Tries
        n = DiceThrow(Bottom, Top, Dice)
        arrBellCurve(n, 2) = arrBellCurve(n, 2) + 1
    Next i
    For i = Dice * Bottom To Dice * Top
        ActiveSheet.Range("I" & i).Value = arrBellCurve(i, 1)
        ActiveSheet.Range("J" & i).Value = arrBellCurve(i, 2)
    Next i

End Sub

Practice with it to see what happens with changing the number of faces on the dice and the number of dice.  they make the curves shallower and higher, respectively.

Data begins at ...

To get the Bell shape, it takes at least 2 "dice", otherwise the "shape" of a series of throws is rectangular, aka random within a range.  However, with two standard dice, the lowest throw is 2. 

Sometimes the test data needs to start at one or zero or just about any other number.  There are several ways to manipulate this 
  • DiceThrow(1,6,2) - 1 ' A pair of standard dice, subtracting 1 from the result each time
  • DiceThrow(0,5,2) + 1 ' Six-sided dice, numbered blank (zero) to five, adding 1 to the result
If the problem requires many different types of dice(we'll touch on that later)
  • DiceThrow(1,6,1) + DiceThrow(0,5,1) ' unequal dice

...and to what ever number is needed

Suppose the range to cover is 1-23.  Just as the lowest number possible from a set of dice is the sum of the lowest numbers on the individual dice, the highest result is the sum of the highest on each.  The following are possibilities of for the 1-23 range
  • DiceThrow(1,3,1) + DiceThrow(0,5,4) 
  • DiceThrow(1,5,1) + DiceThrow(0,6,3) 
  • DiceThrow(1,2,1) + DiceThrow(0,1,3) + DiceThrow(0,6,3) 

back and forth in time...

Another use would be for time data.  Forward from a point in time would mean adding whatever DiceThrow() combination desired to a given point in time.

  • Round(Now(),0) + DiceThrow(x,y,z) would give dates in the future
  • Round(Now(),0) - DiceThrow(x,y,z) would give dates in the past
  • Round(Now(),0) + iif(DiceThrow(0,2,1) = 1,-1,1) * DiceThrow(x,y,z) would give dates in the past and future
  • The DateAdd() funcion could be used for larger & smaller time increments.

Games:

The coin flip

Function HeadsOrTails() as String
   HeadsOrTails = iif(DiceThrow0,1,1) = 1,"Heads","Tails")
Function End

Matching Coin Flips

<tbd>

Rock, Paper, Scissors

<tbd>

Rock, Paper, Scissors, Lizard, Spock

<tbd>