I recently picked up a book called “Ghost in the Wires: My Adventures as the World’s Most Wanted Hacker” by Kevin Mitnick. To be totally honest, I’d never heard of Mitnick before. He was a little before my time—when he was arrested in 1995, I was still struggling to understand why the hell I had to learn fractions and everything I knew about computers was based on WarGames.
I haven’t finished it yet, but thus far, I have come to learn that he was accused of disconnecting phone lines, deleting his own arrest records, harassing some actress named Kristy McNichol, hacking into the NSA, and just generally being a nuisance. He maintains that the more outlandish claims are false. He does admit to screwing with a drive-through speaker to make it scream, “What the fuck are you looking at?” at a McDonald’s manager.
Mitnick is a little vague about his approach to writing code. It’s a narrative, not an instruction manual. He focuses more on the social engineering aspect of his activities. The con-artist part. Where you trick people into giving up their user names and passwords and such. Not my cup of tea. You need social skills for that. I was getting ready to put the book back when I noticed that he starts off every chapter with a scramble of letters. Like this:
MAX VHLM HY MAX UNL WKBOXK INGVA B NLXW MH INGVA FR HPG MKTGLYXKL
Except with each chapter, they seem to get more complicated. He introduces numbers, non-alphanumeric characters, then towards the end, it’s only ones and zeros.
The first one appears to be the easiest because it’s just letters and spaces. And the MAX/MAX/MH is a give-away. If you know a cipher is written in proper English (complete sentences) and you suspect that it’s a simple substitution, an easy way to start is to look for a three letter word that appears more than once. Decent chance it’ll be “the.” And if you know M=T, then MH can only be TO. Then you look at the two letter words. HY has to be OF, OR, or ON. And you just keep plugging away.
It’s easy enough to do by hand but if you’ve got a computer handy, why not make it help?
Let’s Talk About Letter Frequency
Some letters come up more frequently than others in English. That’s where a macros that tells you how many times a particular letter occurs can come in handy. Look at the number of times each letter appears in the Gettysburg Address:
Here’s a character count for all the letters in a recent Wired article about the OurMine hacks.
And here’s the first 3rd of the script for Deadpool.
E and T frequently end up in the top two slots. The other vowels and S generally follow. X, Z, and Q tend to be on the low end.
So if you plugged the scramble in not knowing anything about it, you could very well guess that M and X are either T or E. L is likely to be a vowel or S. The more characters you have, the more accurate it tends to be, but it can help you get started even with a small sample like this. The letters they turned out to be are below the percentages.
The scramble in the second chapter is slightly harder because there are no “the”s to guide the way. But you look at the character count of it:
ESTD MZZV ESLE ELFRSE XP SZH EZ NCPLEP YPH TOPYETETPD HSPY T HLD L ACP-EPPY
…You can guess that E and P are going to be T or E. I figured he wasn’t going to assign E to itself, so E = T and P = E. Then all of a sudden, ACP-EPPY becomes a give-away because what word in English matches this pattern: _ _ E-TEE_?
It helps that the chapter is about Mitnick’s Survival Bookstore excursions in his youth.
So keep in mind—I start with E and T as a rule, but on shorter sentences, it doesn’t always work. That’s when I start defaulting to the other vowels, S’s, R’s, and N’s. That’s how I got the third scramble to cooperate:
PBZFSOBP DKFOBTPKX LQ PBKFI PPBKFPRY AOXTOLC IIXZ LQ ABPR BOBT PBZFSBA CL BMVQ OBAIL BPBEQ
The beauty of frequency distribution is it works even if the cipher is backwards. Once you start plugging things in, it’s pretty easy to tell if the entire sentence has been reversed. Easy, that is, if there are spaces that let you know where each word starts. Mitnick’s fourth cipher has no spaces. I’m still working on that one.
The VBA script for the letter counter is below. It’s wired to work with a spreadsheet formatted like the ones above (with A2 as a merged cell for the cipher text).
The Code Explained
The Copy/Paste Version
Sub FrequencyDistribution() Dim Punct As Integer Dim Counter As Integer Dim dict As Object Punct = 0 Set dict = CreateObject("Scripting.Dictionary") Text = Cells(2, 1).Value For Counter = 1 To Len(Text) Letter = UCase(Mid(Text, Counter, 1)) If dict.Exists(Letter) Then dict(Letter) = dict(Letter) + 1 ElseIf Letter = " " Or Letter = "," Or Letter = "'" Or Letter = vbLf Or Letter = "(" Or Letter = ")" Or Letter = "-" Or Letter = "." Or Letter = "—" Or Letter = ":" Then Punct = Punct + 1 Else dict.Add Key:=Letter, Item:=1 End If Next For i = 0 To dict.Count - 1 Cells(1, i + 3).Value = dict.keys()(i) Cells(2, i + 3).Value = dict.items()(i) Cells(5, i + 3).Value = dict.items()(i) / (Len(Text) - Punct) Next i Columns("C:AE").Sort key1:=Range("C2"), _ order1:=xlDescending, Header:=xlYes End Sub
It wasn’t until after I’d written this that I looked at the scrambles again and realized I’m an idiot. Plug these into a Brute Force VBA—it’s not just a simple substitution. The first three Mitnick jumbles are Caesar Shifts (the last one in reverse). The fourth, however, does not appear to be simple shift. I’ll have to look closer at that one later.