Thursday, January 8, 2015

Random Characters: Generating Passwords

Introduction

Excel Zoom has a nice article on generating random characters for passwords.  In short, it uses RANDBETWEEN() for and CHAR(RANDBETWEEN()) for other characters
  • Digits, i.e.  0-9: = RANDBETWEEN(0,9)
  • UpperCase Letters, i.e. A-Z: =CHAR(RANDBETWEEN(65,90))
  • LowerCase Letters, i.e. a-z: =CHAR(RANDBETWEEN(97,122))
  • SpecialCharacters , i.e. !”#$%&'()*+,-./: =CHAR(RANDBETWEEN(33,47))
As usual, I see something new and it's a jumping off point for me.

What else  to do: 

Randomize the order

In the article, these four formulas are then concatenated, one after another.  This means that a hacker would be able to tell which subset of the characters would go into which position.  To solve this problem, one needs to randomize which subset goes into each position.  The formula for each position becomes
=CHOOSE(RANDBETWEEN(1,4),
   RANDBETWEEN(0,9),
   CHAR(RANDBETWEEN(65,90)),
   CHAR(RANDBETWEEN(97,122))
   CHAR(RANDBETWEEN(33,47)))

The above formula uses the RANDBETWEEN() within the CHOOSE to select which of the subsets might be used.

Shorten the Formula

As you can see, this formula will be quite long if expanded to 8 or more characters.  It can be shortened by putting the formulas for each into a Name
  • RandDigit: = RANDBETWEEN(0,9)
  • RandUpperCase: =CHAR(RANDBETWEEN(65,90))
  • RandLowerCase: =CHAR(RANDBETWEEN(97,122))
  • RandSpecial: =CHAR(RANDBETWEEN(33,47))
Now the formula for a single character is 
=CHOOSE(RANDBETWEEN(1,4),RandDigit,RandUpperCase,RandLowerCase,RandSpecial)
This can be shortened further by replacing RandUpperCase & RandLowerCase with RandAlpha  =CHOOSE(RANDBETWEEN(1,2),CHAR(RANDBETWEEN(65,90)),CHAR(RANDBETWEEN(97,122))) The formula for a single character now becomes 
  =CHOOSE(RANDBETWEEN(1,3),RandDigit,RandAlpha,RandSpecial)                      

Balance the Odds

Both the article and the above have a common issue:  There are some characters more likely to show up than others.  As mentioned, the article's formula results in only one of the subsets being in certain positions.  In this case, the odds of each subset is equal at each position but since the some subsets are larger than the others, the likelihood of a specific member of the larger subset(s) being selected is less likely than that of  a smaller subset.  A specific letter being selected is less likely than that of a specific special character, and a specific digit is more likely of than any letter of special character.

Unfortunately, now we go into a realm that takes more typing and less knowledge of functions.  The solution is to create a Name with all of the valid characters, e.g. ValidPwDChars = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz!#$%&'()*+,-./" & """"
  • NOTE 1:  Pay attention to the treatment of the double-quotation mark.  This is necessary to include it in a string that is in the RefersTo field.  However, if the Name refers to a cell where the valid characters are typed in, then the quote of the quote is unnecessary.
    NOTE 2: An alternative to this would be to make Names for each of the subsets then concatenating them here, e.g. ValidPwdDigits = "0123456789" etc then ValidPwdChars = ValidPasswordDigits  & ValidPwDAlpha & ValidSpecial.
  • The new formula relies on the MID() function, using RANDBETWEEN() within CHOOSE() to pick a character on the new Name
    =MID(ValidPWDChars,CHOOSE(RANDBETWEEN(1,77),1)

Non-contiguous Subsets (Internationalising)

The initial solution established there are four subsets of characters that are valid for password, coming in four contiguous sequences in the ASCII table: All very convenient for US-based people.

I currently work solely in the US.  In the past I have worked in several countries in Europe as well as with teams that included people from every continent except Antarctica.  One of the first things I learned when working internationally is that all keyboards are not the same.  Looking at the ASCII tables I notice there are some symbols that show up on some keyboards, like for the British Pound and Euro, which are nowhere near the Special Characters range (33 to 47).
Therefore, if the solution is to be usable for non-US organizations it needs to take this into consideration.  This means adding keyboard characters, and perhaps deleting some.
If setting up the password generator is being done on a keyboard that does not support characters one wants to include, there are two options.
  • OPTION 1: Insert, Copy, Paste
    • Using the Insert menu of the Ribbon, click the Insert Symbol option
    • Choose the symbol needed, double-clicking or clicking the Insert button
    • Close the dialogue
    • Copy/paste the symbol into the Name RefersTo field
  • OPTION 2: CHAR()
    • Using the Insert menu of the Ribbon, click the Insert Symbol option
    • Choose the symbol needed, noting the ASCII code
    • Close the dialogue
    • Copy/paste the symbol into the Name RefersTo field

Position-related Limits

As a reminder, the formulas discussed here are only for a single character of a password.  To make a complete (i.e. minimum length) password, the formula is repeated/concatenated as many times as necessary.
For some applications, the first character of a password is limited, perhaps to just alphabetic characters.  Therefore, the whole formula may start with =MID(ValidPwDAlpha,CHOOSE(RANDBETWEEN(1,52),1)

Improve =MID(ValidPWDChars,CHOOSE(RANDBETWEEN(1,xx),1)

Likewise, some characters may be invalid for certain applications that are valid for others.  Since characters may be added/removed, the upper-bound of the RANDBETWEEN either needs to be maintained with each change or made to reflect the current state.  
For the latter (ie. the recommended route). use the LEN() function with the Name of the field with the all of the valid characters.
=MID(ValidPWDChars,CHOOSE(RANDBETWEEN(1,LEN(ValidPWDChars)),1)

Flex the Solution

Let us suppose this password generator is going to be used by a support desk that interacts with users in multiple technical environments, two factors we've identified as influencing the set of valid characters.
In these cases, although the values are different the formulas should be clear enough to allow someone to troubleshoot issues.
Therefore, the creation of Names should follow a standard yet support maintainability.
  • Start with the digits, establishing "PWDDigits_Default"
  • If there are cases where 0-9 is not right (maybe some passwords are Octal or Hex), create that set and name it properly, e.g. "PWDDigits_Octal", "PWDDigits_Hex", then prepare for the argument theat "PWDDigits_Default" should be "PWDDigits_Decimal"
  • Another case is that although all characters are valid, some are easier to confuse and therefore harder to communicate in a phone conversation: "O" and "0"' "1", "l" and "I".  characters can be removed from a literal string using =SUBSTITUTE(<string>,<char>,"")
  • Develop a strategy for alphabetic and special characters.  One method would be to establish something like "PWDAlpha_Core" then add on for characters with diacritical marks.
  • For special characters, do the same.  For example, the core special characters might be "!#%&'()*+,-./" & """"  (Same as above, but without the dollar sign.)  Localization would allow for simply creating a new name like PWD_Special_EU = PWD_Special_Core & "€". and PWD_Special_ US = PWD_Special_Core & "$"
  • The end-point formulas should reflect their purpose, the names always linked by standardization.  For example, a character for Oracle users in the UK may be "=MID(ValidPWDChars_Ora_UK,CHOOSE(RANDBETWEEN(1,LEN(ValidPWDChars_Ora_UK)),1)".  This improves maintainability.

Ensure Enough Subsets Used

One of the standards we see for passwords is that they must use at least 3 of the 4 subsets.  There are two ways to ensure this.

The first method takes us back to the original article, i.e. make a certain number of the characters specific to the subset.  For example, if the standard is at least three of the different subsets be used, the first character may be limited to an uppercase letter, the fourth to a special character, and the sixth to a digit.

The second method is more complex, but it ensures the order of the subsets is randomized while ensuring enough of the subsets are used for sufficient password strength.
Strong Password Generation

  1. Begin with a "password seed."  This is a string that is 4 characters less than the number of characters needed for the password.  It uses a concatenation of the randomly selected characters, e.g.
    A3 =MID(ValidPwdCharacters,RANDBETWEEN(1,LEN(ValidPwdCharacters)),1)
    &MID(ValidPwdCharacters,RANDBETWEEN(1,LEN(ValidPwdCharacters)),1)
    &MID(ValidPwdCharacters,RANDBETWEEN(1,LEN(ValidPwdCharacters)),1)
    &MID(ValidPwdCharacters,RANDBETWEEN(1,LEN(ValidPwdCharacters)),1)
  2. Calculate a position where a character will be inserted.
    B3 =RANDBETWEEN(0,LEN(A3))
  3. Using the position, insert a random character from the specified subset
    C3 =LEFT(A3,B2)&RandDigit&RIGHT(A3,LEN(A3)-B2)
  4. Drag-Copy to complete the table.  E3 has the generated password.