I want to know1
and understand1
But I will not1
-- Hashes cracked from the KoreLogic CMIYC 2023 competition
In the previous two posts on the CMIYC competition [Part 1, Part 2], I focused on how to integrate data science tools into your password cracking workflow and showed how to crack passwords on limited hardware (e.g. my laptop without using a GPU). Of course it's better to have some firepower to crack hashes! One of the hurdles to overcome is that I don't have a lot of firepower at my disposal. Despite being super interested in (OK, obsessed with) password cracking, I've never invested in a dedicated cracking rig. Still, when I do get serious about cracking passwords I turn to Hashcat and GPU-based attacks to do the heavy lifting, even if I only have a single NVIDIA GeForce GTX 1070 GPU. That's still significantly faster than trying to run CPU-only attacks.
To that end, let's talk about how to leverage Hashcat when competing in these competitions. Full disclaimer: I'm going to go full spoiler in how I'm approaching my cracking. At this point, I've been running cracking sessions way longer than the competition would have lasted if I had competed. Also, I've been on the various Discord and Twitter conversations about the contest this year and know how the hashes were generated. Heck, KoreLogic even posted themselves how they created the challenges [Full Spoiler Link]. So I'm not going to even pretend that this post represents how I would have done. Instead I want to focus on "given what we know, how can someone use Hashcat to crack those hashes".
One issue that pops up a lot for me when using both John the Ripper and Hashcat to crack hashes is that while their file formats are *mostly* the same, they are not directly compatible. This goes for how these tools expect hashes to be formatted when loading them up, and for the .pot file formats they save their cracked passwords to.
The hash format in particular has been a long source of annoyance for me, and writing this blog post inspired me to finally submit a GitHub issue about it to the hashcat repo. The long story short is that John the Ripper uses hash type identifiers that Hashcat doesn't recognize. For example, here is a raw-md5 hash (from the CMIYC2023 contest) that John the Ripper can load:
And here is the same hash format that Hashcat expects:
Side note, while you can have usernames in your hash lists, Hashcat won't load the hashes unless you include the "--username" flag on the command line telling Hashcat to strip/ignore those usernames. E.g.:
What this really means is that to support both John the Ripper and Hashcat, I now have two sets of hash lists and two sets of pot files. It would be nice to incorporate some scripts in my Jupyter Notebook to sync up both of the pot files between them so I'm not cracking the same hashed password twice. Given that's a rabbit hole which would totally side-track any hash cracking, I'm going to push that project off for another day. For now I'm just going to use Hashcat, and I modified my Notebook to support the Hashcat file formats (mostly by copying and pasting the JtR code into another cell and then making small modifications). Once again, this is one of the super-powers of using Jupyter notebooks. I can load up my JtR cracked hashes, then write and load up my Hashcat plaintexts, and perform analysis on both in a very short period of time. It's not pretty but it works.
The commands to run Hashcat are very different from those to run John the Ripper. There are pros and cons to both methods. File autocomplete works much better with Hashcat's command line, and Hashcat does directory inclusion (such as using all wordlists in a directory) better. But John the Ripper's command line is less position dependent, has a ton of super powerful features for different attack modes, and quite honestly I'm just more used to it.
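As a starting point for that kind of sync script, here is a minimal sketch. It assumes the JtR pot file tags raw-MD5 entries with a `$dynamic_0$` prefix and that Hashcat stores the bare digest. The tag and the function names are assumptions for illustration, so check them against your own pot files.

```python
# Sketch: fold John the Ripper pot lines into Hashcat-style "hash:plain" lines.
# ASSUMPTION: JtR tags raw-MD5 entries with "$dynamic_0$"; adjust JTR_TAGS to
# match whatever tags actually appear in your own pot file.
JTR_TAGS = ("$dynamic_0$",)


def jtr_to_hashcat(line):
    """Strip a known JtR type tag from the front of one pot line."""
    line = line.rstrip("\n")
    for tag in JTR_TAGS:
        if line.startswith(tag):
            return line[len(tag):]
    return line  # already a bare hash, or a tag this sketch does not know


def merge_pots(jtr_lines, hashcat_lines):
    """Return {hash: plaintext} covering both pot files, without duplicates."""
    merged = {}
    for line in list(hashcat_lines) + [jtr_to_hashcat(l) for l in jtr_lines]:
        line = line.rstrip("\n")
        if ":" not in line:
            continue  # skip blank or malformed lines
        digest, plain = line.split(":", 1)
        merged.setdefault(digest, plain)
    return merged
```

The sketch compares hashes exactly as written, so an upper-case and a lower-case copy of the same hex digest would be kept as two entries.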
The basic command line for hashcat is:
So for a standard wordlist + rules attack you can run
To break this down:
Running variations of the above attack using standard large dictionaries and a few other hashcat rules cracked a few more MD5 passwords but not many....
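For reference, a wordlist + rules run of this kind can be sketched as an argument list built in Python (handy if you launch Hashcat from a notebook). The file names are placeholders, not the ones used in this post. The flags are standard Hashcat options: `-m` for hash type, `-a` for attack mode, `-r` for a rule file, and `--username` to ignore a leading username field.

```python
import shlex


def wordlist_attack(hash_file, wordlist, rule_file, hash_mode=0):
    """Build the argument list for a wordlist + rules (-a 0) run."""
    return [
        "hashcat",
        "-m", str(hash_mode),  # hash type; 0 is raw MD5
        "-a", "0",             # attack mode 0: wordlist, optionally with rules
        "--username",          # hash file lines look like user:hash
        hash_file,
        wordlist,
        "-r", rule_file,       # mangling rules applied to every word
    ]


# Placeholder file names, purely for illustration.
cmd = wordlist_attack("md5_hashes.txt", "wordlist.txt", "example.rule")
print(shlex.join(cmd))
```

Passing a directory in place of `wordlist.txt` gives the "all dictionaries in a folder" variant.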
One cool feature of Hashcat is that you can specify a directory instead of a wordlist though. So you can use the following command to run a quick set of mangling rules against all of your dictionaries:
When running these attacks, the hashes.org-20202 wordlist did the best. It's a super effective wordlist to use in general and can be obtained from hashmob [link]. Side note, I'm not using Hashmob's own cracked wordlists for this blog post since I'm pretty sure the contest hashes were uploaded to them.
Given the limited success of these attacks (a few raw-MD5 cracks aren't going to give a lot of points), there are really three paths that I can take.
Side note: Options #2 and #3 are generally the ones picked on real dumps as the individual passwords are only loosely related to each other. Also password crackers (at least me) are lazy.
Going with the lazy options first, let's dive in on how to run them. To auto-generate rules you can use the --generate-rules=X option where X is the number of rules to generate. For example:
When you do this, and I can't stress this enough, enable --debug-mode=5. Also log that info to a file using the --debug-file debug.txt option. This will output both the rule that successfully cracks a password as well as the plain-text word. Don't get lazy, and do not skip this option. In fact, you probably should be running that for all your password cracking sessions.
Now you may be asking yourself, why "--debug-mode=5"? It's because the debug info will append itself to the debug-file (vs. overwrite it) and you'll be running a lot of cracking sessions. Going back and remembering which dictionary created which cracked password is super helpful. You want all that info. Why throw that info away with a lower debugging option?
Long story short, if you don't know what to do, a default option can be to generate rules for a dictionary you've had some success with, log the results, and then turn the successful rules into a contest specific ruleset to use with other dictionaries.
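Turning the debug log into a contest-specific ruleset can be sketched roughly as below. This assumes each debug line has the layout original-word:rule:processed-word:wordlist, which is my reading of debug mode 5. It also assumes only the rule field may itself contain a colon, since `:` is Hashcat's no-op rule. Verify both against your own debug.txt before trusting the counts.

```python
from collections import Counter


def successful_rules(debug_lines):
    """Count how often each rule produced a crack.

    ASSUMPTION: each line is word:rule:processed-word:wordlist, and only the
    rule field may contain a colon of its own.
    """
    counts = Counter()
    for line in debug_lines:
        line = line.rstrip("\n")
        try:
            _word, rest = line.split(":", 1)
            rule, _processed, _wordlist = rest.rsplit(":", 2)
        except ValueError:
            continue  # line does not have the four expected fields
        counts[rule] += 1
    return counts


def to_ruleset(counts):
    """Most productive rules first, one per line, ready to save as a .rule file."""
    return "\n".join(rule for rule, _n in counts.most_common()) + "\n"
```

Sorting by hit count puts the rules most likely to pay off at the top, which matters once the ruleset is pointed at slower hashes.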
But what if your input dictionaries are the problem? That's where brute-forcing small key lengths can be helpful using masks.
I'll admit, I started to go into a long, long diversion about the mechanics behind Hashcat's Masks and Markov optimizations. I really hate calling what Hashcat does a Markov attack and there's a ton of optimizations that Hashcat developers can make to it. But that's totally beside the point if you are trying to crack passwords RIGHT NOW. So I'll save that side tangent for a different post and instead focus on cracking these contest hashes.
Masks are one area where having more computational power makes a huge difference. They let serious cracking rigs just chew through keyspace without requiring much skill or ability from their operators. Contest organizers know this and tend to create passwords that are resistant to un-optimized mask attacks. This means going through the entire key-space for 5/6/7/8 character passwords is unlikely to be very successful.
As an example of that, I left Hashcat running for a couple of hours brute forcing all ASCII passwords of length 1 through 7 for the raw-MD5 hashes. I didn't crack a single new hash that wasn't caught by earlier runs I had performed with John the Ripper. Going back to my Jupyter Notebook I decided to display password cracks by length, and then also the number of ASCII only (aka no Cyrillic) passwords cracked by length.
You probably don't have the GPU power to brute force 8-9 character passwords during the contest, and you certainly don't have that for the high value hashes that are worth a lot of points. Therefore to be successful in a contest with Hashcat Masks you need to tailor them to find gaps in base-words or mangling-rules that you have already identified. I talked about this earlier with the attacks I ran using John the Ripper in Part 2 of these write-ups. For example, if you were looking to find more base-words for Sales passwords where many of them started with '2023' and ended with a special character, then you could try something like:
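To put rough numbers on that, here is a small worked calculation. Hashcat's `?a` charset has 95 printable ASCII characters, so every extra position multiplies the work by 95. The guess rate below is a made-up round figure for illustration, not a benchmark of any particular GPU.

```python
def keyspace(charset_size, min_len, max_len):
    """Number of candidates for every length from min_len to max_len."""
    return sum(charset_size ** n for n in range(min_len, max_len + 1))


def hours_to_exhaust(candidates, guesses_per_second):
    """Wall-clock hours needed to try every candidate at a steady rate."""
    return candidates / guesses_per_second / 3600


PRINTABLE_ASCII = 95    # size of Hashcat's ?a charset
RATE = 10_000_000_000   # HYPOTHETICAL 10 billion guesses/second

for length in (7, 8, 9):
    total = keyspace(PRINTABLE_ASCII, length, length)
    print(length, total, round(hours_to_exhaust(total, RATE), 1))
```

Swap in the speed Hashcat reports for your own card and hash type. For slow hashes like bcrypt the rate drops by many orders of magnitude, which is why untargeted masks are a non-starter there.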
There's a lot going on in the above command. Let's break this command down by parts:
That's great, but what if you want to try 5 lower case characters vs. 6? Running these attacks by hand is a pain so it's nice to queue up a bunch of mask attacks at once using a saved mask file (e.g. a .hcmask file). Unfortunately, the format is a bit different so let's look at how we can do that next. First, here is the hashcat command line to run a .hcmask file:
You'll notice that all the mask info has been removed from the command line and instead I'm calling an external sales.hcmask file. Let's take a look at what's in that file:
Breaking this file format down:
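As an illustration of the format (not the exact file used here), the sketch below generates a small mask file. As I understand the .hcmask layout, any custom charsets come first on each line, separated by commas, and the mask itself is the last field. The '2023' prefix follows the Sales example above. The `?d?s` custom charset and the 4 to 6 letter range are assumptions picked for the example.

```python
def sales_masks(prefix="2023", min_letters=4, max_letters=6):
    """Yield .hcmask lines: custom charset 1, a comma, then the mask itself."""
    custom_1 = "?d?s"  # ?1 = any digit or special character (example choice)
    for n in range(min_letters, max_letters + 1):
        yield f"{custom_1},{prefix}{'?l' * n}?1"


def mask_keyspace(n_letters, custom_size=10 + 33):
    """Candidates for one line: 26 per ?l, times the size of ?1 (?d=10, ?s=33)."""
    return 26 ** n_letters * custom_size


print("\n".join(sales_masks()))
```

Each line is run as its own mask, so the total work is the sum of `mask_keyspace(n)` over the lines in the file.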
With all of that, I managed to identify a couple more base words to use targeting sales passwords. This in turn allowed me to target higher value hashes easier. The same can be done by targeting known words to find the mangling rules. E.g.:
Yes you can also do that with a wordlist and mangling rules, but if you only have a couple of words you want to check it can sometimes be easier to do that with Masks instead. Now if you have a lot of words you want to try, then you can look into Hashcat's "-a 6" (Wordlist + Mask) and "-a 7" (Mask + Wordlist) attack modes. John the Ripper doesn't have this specifically because *cough cough* its rule preprocessor supports masks already in its normal mangling rules. But these attack modes can be very helpful if you are using Hashcat.
One thing you'll notice though with the hybrid -a [6/7] attacks is that you can't mangle or apply masks to both sides of a guess at the same time. Also, unlike with standard wordlist modes (-a 0) you cannot pipe a wordlist in to -a [6/7] modes via stdin. This is a problem. The whole reason you are using Masks is probably because you don't know what mangling rules have been applied to the base-word.
The key then is to create custom word-lists that contain one side of the mangling rules. I'd recommend picking the "shorter" of the mangling rules to limit how much you write to disk. This is super annoying, but it works. So for example if you want to prepend 2022 and 2023 to a word and then append a mask attack, you could first create a word-list containing all the words with 2022 and 2023 prepended to them (this only doubles the size of the original input dictionary). In this case I'm accomplishing this by using Hashcat's rules and saving the results to disk. To do that, and then run the resulting full Mask attack, you can use the following commands:
Rule file: append_year.rule (Capitalize word and prepend 2022 and 2023).
Generate wordlist command:
Now that we have a wordlist containing words like 2023Sales, run the mask hybrid attack:
Is all of this a pain? Absolutely! But it can be very effective so it's usually worth creating these temporary wordlists for your attacks and then combine them with masks.
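To make the prepend step concrete, here is a small sketch of both halves: the rule text that capitalizes a word and prepends a year, and a pure-Python stand-in that produces the same temporary wordlist. The Python generator is a substitute for running Hashcat with the rule file and `--stdout`, not the method used above. Hashcat's `^X` rule puts one character at the front of the word, so the year has to be fed in reverse.

```python
def prepend_rule(prefix):
    """Hashcat rule text: capitalize (c), then prepend each char with ^.

    ^ adds one character at the front, so the prefix is applied in reverse.
    """
    return " ".join(["c"] + ["^" + ch for ch in reversed(prefix)])


def expand(words, prefixes=("2022", "2023")):
    """Pure-Python equivalent of applying those rules to a wordlist."""
    for word in words:
        for prefix in prefixes:
            yield prefix + word.capitalize()


print(prepend_rule("2023"))
print(list(expand(["sales"])))
```

The resulting wordlist is what gets paired with a trailing mask in a `-a 6` run.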
As mentioned earlier, the whole reason to try different "spray and pray" attacks against fast hashes is to crack enough to identify how the passwords were created and develop highly targeted attacks against expensive and high value hashes like BCrypt. The mangling rule that received the most post-contest conversation among all of the teams was that several users' passwords were their creation time (found in their metadata) converted to Unix epoch timestamps.
Creating a wordlist of all the various timestamps is certainly one way to go, but what we really want to do is crack bcrypt hashes. This is a perfect opportunity to talk about association (-a 9) attacks in Hashcat. Association attacks take one word per hash and target that hash with it. The word in association attacks can be combined with rules as well. This is a huge improvement when targeting a large number of salted hashes where you may have some idea what the plaintext for each account might be.
To perform an association attack you need to create a hashlist of the hashes you want to target, and then have a 1 to 1 mapping to a wordlist you want to target those hashes with. So for example you might have two files:
HashList.txt:
Wordlist.txt:
For this particular challenge I created the wordlists + uncracked bcrypt hashlist using the following python script in Jupyter Notebook:
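A minimal sketch of building the two paired files is below. It assumes the account metadata is already loaded as (hash, creation-time) pairs, that the time string looks like `2023-01-01 00:00:00`, and that it should be read as UTC. All three are assumptions for illustration; the real contest metadata may be laid out differently.

```python
from datetime import datetime, timezone


def to_epoch(created, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a creation-time string (treated as UTC) to a Unix timestamp string."""
    parsed = datetime.strptime(created, fmt).replace(tzinfo=timezone.utc)
    return str(int(parsed.timestamp()))


def association_files(accounts):
    """Return (hash_lines, word_lines); line N of one pairs with line N of the other."""
    hash_lines, word_lines = [], []
    for bcrypt_hash, created in accounts:
        hash_lines.append(bcrypt_hash)
        word_lines.append(to_epoch(created))
    return hash_lines, word_lines
```

The ordering is the whole point: if a hash is dropped from one list, the matching word has to be dropped from the other, or every later pair is misaligned.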
Next, let's run some attacks. First, let's just do a quick naïve attack using (-a 0) and the timestamps as a normal wordlist.
Running this attack for an hour and a half isn't the end of the world. But this is a contest. You are a busy hacker. You have hashes to crack and other wordlists to run. Let's try Hashcat's association attack. Here is the command I ran:
ONE IMPORTANT THING TO KNOW: By default '-a 9' mode will not save to your standard .potfile. So if you want to capture these hashes you MUST specify an output file on the command line using the '-o FILENAME' option (Hashcat's --outfile). I learned this fact the hard way when none of my cracks were showing up. I asked some Hashcat developers about this and they said there's still some "weirdness" with '-a 9' mode. For example, it will "recrack" hashes you have already cracked and post duplicate cracks/plaintexts to your potfile. So if you are running this attack it is probably good to run it on a new potfile vs. your global one, and then merge the new cracks back into your main potfile after the fact.
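The merge-back step can be sketched as a simple de-duplicating union. It assumes the separate file holds one `hash:plain` entry per line, the same layout as the main potfile. The function name is made up for this example.

```python
def merge_new_cracks(main_pot_lines, new_lines):
    """Return the main pot lines plus any new hash:plain lines not already present."""
    seen = set()
    merged = []
    for line in list(main_pot_lines) + list(new_lines):
        line = line.rstrip("\n")
        if line and line not in seen:
            seen.add(line)
            merged.append(line)
    return merged
```

Keeping the original order and only appending unseen lines means repeated '-a 9' runs can be merged as often as you like without the potfile growing duplicates.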
And here's the results:
Over 100 Bcrypt hashes cracked in a couple of seconds! That's super fun. As some backstory, association attacks are amazing if you have known passwords for users. Aka you obtained passwords from a different password dump and you are attacking the fact that users re-use passwords between multiple sites. Leveraging association attacks, you can run common mangling attacks against those known passwords to crack computationally expensive hashes for a subset of users.
The next area to focus on is multi-words and phrases. KoreLogic gave out a hint during the contest that several of the Engineering passwords were created from phrases taken from sci-fi books and movies, with the number '1' appended on the end [Link]. This can be seen in some of the cracks I made earlier:
Going back to the hash breakdown by department, Engineering is also a huge department to target:
The approach here then is to crack as many hashes as possible with fast hashing algorithms to try and figure out the source materials. Then we need to target high-value hashes in the engineering department using phrases from those source materials. Basically dumb, untargeted attacks first, then smart attacks later. Let's start with those dumb untargeted attacks!
At a high level this looks like a Correct Horse Battery Staple problem. To target that, let's try all the common English words in two and three word phrases and add the number '1' to the end. For a dictionary we can use the following corpus which contains various word-lists of 10k English words sorted in probability order [Link]. The first really "just get it to work" option I selected was to write a quick Python program that loops through the word-list and outputs possible phrases while appending the number 1 to them. I then used the fact that if you do not specify a dictionary, Hashcat's '-a 0' mode will read in words from stdin. So I can run my attack using the following command:
This wasn't pretty, but it did crack a number of hashes. Still, my guess generation was super slow as it runs a slow Python script and then pipes those guesses into hashcat (piping guesses is also slow). Raw-MD5 is fast to compute. Basically this option wastes a lot of time and limits the key-spaces I can search. How about we speed this up using Hashcat's combinator attack?
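For reference, a quick generator of that kind can be sketched as follows. It is an illustration, not the script used here. It writes every ordered two and three word phrase, separated by spaces and ending in '1', to stdout so the output can be piped into a '-a 0' run. The wordlist path is whatever you pass on the command line.

```python
import itertools
import sys


def phrases(words, n_words, suffix="1"):
    """Yield every ordered n-word phrase (repeats allowed) with a suffix."""
    for combo in itertools.product(words, repeat=n_words):
        yield " ".join(combo) + suffix


def main(wordlist_path):
    """Stream two and three word phrases built from a wordlist to stdout."""
    with open(wordlist_path, encoding="utf-8") as handle:
        words = [line.strip() for line in handle if line.strip()]
    for n_words in (2, 3):
        for guess in phrases(words, n_words):
            sys.stdout.write(guess + "\n")


# Only runs when called as a script with exactly one argument (the wordlist).
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

With a 10k-word list this is 10^8 two-word and 10^12 three-word candidates, which is exactly why the Python loop becomes the bottleneck on a fast hash.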
Hashcat's combinator attack '-a 1' allows you to combine two dictionaries together to target multi-word passwords. For example, let's assume you have the following two word-lists
dic1.txt
dic2.txt
If you run the following command:
You'll get the following output:
You can also apply one (AND ONLY ONE) rule to each dictionary if you want using the '-j' (applied to the left word list) and '-k' (applied to the right word list) options. So for example if you use the following command:
It'll create the following guesses
As reference the '$' rule appends a character to the end of a guess. So '$ ' appends a space, and '$1' appends a '1'. I think you might see where this is going....
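The effect of combining two lists with one append rule per side can be emulated in a few lines of Python. This is only a model of the guesses produced, using made-up example words. It makes no claim about the order in which Hashcat itself emits them.

```python
def combinator(left_words, right_words, left_suffix="", right_suffix=""):
    """Model -a 1 with one append rule per side, e.g. -j '$ ' and -k '$1'."""
    for left in left_words:
        for right in right_words:
            yield left + left_suffix + right + right_suffix


# Example words are placeholders; ' ' and '1' mirror the '$ ' and '$1' rules.
for guess in combinator(["watch", "mind"], ["food", "step"], " ", "1"):
    print(guess)
```

The number of guesses is simply the product of the two list sizes, so two 10k-word lists already give 100 million candidates.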
The problem is, this works great for two word phrases. But what about three and four word phrases? I wish I knew of a better solution, but the short answer is I hope your cracking system has some free hard-drive space! You can only use combinator with two input dictionaries, and you can't pipe guesses into hashcat if you are using '-a 1' mode. The fastest option then is to create a word-list of all two word phrases. If you don't want to write a custom program to do this, you can always use hashcat and pipe the guesses to a file. For example:
Then to try three words you can run
To try four words you can simply run
Side note, I also had success by capitalizing the first letter by changing the -j rule to:
This attack yielded a ton of cracks. Looking through them I started trying to find "unique" and "odd" phrases to try and figure out where the source material came from. This is because while the above attack works great against fast hashes like raw-md5, they will not scale against slow hashes like Bcrypt. We need to further optimize our attacks. Given that, here is a subsection of my cracked passwords:
Most of these phrases were spectacularly unhelpful. But some of them stood out such as 'watch your food'. Running a quick Google search on that + "scifi" highlighted Project Hail Mary [link]. That was a book I loved and hated in equal parts so it brought up a number of mixed feelings, but it certainly seems like a good candidate. The challenge is that the book isn't in the public commons. Still, let's try and create a dictionary of quotes copied from that article.
Next step was to create a janky Python program that would output all 2, 3, and 4 word phrases from the book paragraphs I had found. I know janky Python programs are slow, but so is cracking Bcrypt hashes. In this case it is better to minimize the number of guesses I make vs. focusing on how fast those guesses are generated.
Side note: I apologize for putting this as a screenshot. I really wish Google's blogger had a code insert option...
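A text version of that kind of program might look like the sketch below. It is an illustration under a few assumptions: words are runs of letters and apostrophes, punctuation is dropped, case is kept as-is, and phrases are allowed to run across sentence boundaries.

```python
import re


def ngram_phrases(text, sizes=(2, 3, 4), suffix="1"):
    """Return unique runs of consecutive words from the text, each with a suffix."""
    words = re.findall(r"[A-Za-z']+", text)
    seen = set()
    out = []
    for size in sizes:
        for start in range(len(words) - size + 1):
            phrase = " ".join(words[start:start + size]) + suffix
            if phrase not in seen:
                seen.add(phrase)
                out.append(phrase)
    return out


sample = "Watch your food, now"  # stand-in for pasted source paragraphs
print("\n".join(ngram_phrases(sample, sizes=(2, 3))))
```

A paragraph of a few hundred words yields only a few hundred phrases per size, which is small enough to throw at bcrypt.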
Running this through hashcat again yielded a new cracked hash!
That's also a pretty unusual phrase, so I have high confidence that Project Hail Mary is one of the sources for the plain-texts. Let's try this against the bcrypt hashes!!!!!
Annnnd nothing cracked.......
This was disappointing, but it's probably because I was only using two paragraphs from the book. I need to find a better source to grab quotes from.
Let me take a step back and say, this workflow loop is one of the keys to this contest. If the cracked fast hashes (raw-md5, raw-sha1, etc) are any indication, around 1/3rd of the high value hashes are phrases taken from books and movies.
Key workflow for CMIYC 2023:
The problem for me is that workflow is manually intensive, time consuming, and quite frankly boring as hell. During a competition it can be fun to get that dopamine hit as you crack new bcrypt hashes. After the contest, I'm simply wasting time while running up my power bill. So the question is, can I automate this at all? My power bill will still be high, but at least then I can watch new episodes of Ahsoka vs. staring at my computer screen! How about I train my PCFG guess generator on cracked passphrases and let it crunch away at generating guesses? I mean, it worked for the Hashcat team! [Link].
There are various ways to create the training set, but given how KoreLogic generated these passwords, and the plain-text values I was seeing, I just threw everything that had a 'space' into a training file using the following command line:
I know, I could have done the word-list generation much better as a short Python script in my Jupyter Notebook, but I got places to be and Star Wars episodes to watch! Now that I had a good training set, I then trained a PCFG grammar on it using the following command:
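The same filtering step can be sketched in Python. This is an illustration, not the command used here. It assumes a Hashcat-style potfile of unsalted `hash:plain` lines, and it unwraps the `$HEX[...]` form Hashcat uses for plaintexts it will not print raw.

```python
def decode_plain(plain):
    """Undo Hashcat's $HEX[...] wrapping so the real plaintext can be inspected."""
    if plain.startswith("$HEX[") and plain.endswith("]"):
        return bytes.fromhex(plain[5:-1]).decode("utf-8", errors="replace")
    return plain


def passphrases(pot_lines):
    """Yield cracked plaintexts that contain at least one space."""
    for line in pot_lines:
        line = line.rstrip("\n")
        if ":" not in line:
            continue
        # ASSUMPTION: unsalted hashes, so the first colon ends the hash field.
        _digest, plain = line.split(":", 1)
        plain = decode_plain(plain)
        if " " in plain:
            yield plain
```

Writing the yielded plaintexts one per line gives a training file with only multi-word passwords in it.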
I set coverage (-c) to be 1 so the PCFG guesser will not generate any brute force (OMEN) guesses. I then gave this attack a test run against raw-sha256 hashes using the following command:
And.... Yup this looks promising:
Let's see how it does with Bcrypt using the following command:
Success! Limited Success!
There is still a ton of optimization I could do. You'll notice I haven't re-added / merged my potfiles in from the previous cracking of the Unix Epoch timestamp hashes. I also am targeting all of the Bcrypt hashes vs. just the ones in the engineering department. By reducing the target hashes I could easily double the speed of plain-text guesses I am making against the target hash list. I also don't want to give the false impression that this is the best attack method for these hashes. It's not. You would be much more successful by trying to find the source material and creating custom word-lists from that. What this attack workflow has going for it though is it is one of the most automatable options. You can let this run while trying to figure out better methods. Or... you can go do something else besides crack passwords. Call your parents maybe? I'm sure they would appreciate it!
I think this is a good spot to end this blog post. Looking back at it, I somehow managed to cover every attack mode in Hashcat. There's still more techniques to dig into, and there's a ton of uncracked hashes left in this contest. But I might leave that for a future post. If you have any tips, suggestions, or comments, feel free to leave them in the comments. Good luck, and I hope to see everyone at CMIYC 2024! Also thanks once again to the KoreLogic team for putting together such a great contest!