Everything that happens once can never happen again. But everything that happens twice will surely happen a third time.
-- Paulo Coelho
For the last couple of months, I've been (slowly) working on building out a new backend/framework to be able to manage password cracking sessions using JupyterLab as the frontend/GUI. The current version of this framework is available [here].
This project is under active development (well active for me anyways), and I'd really appreciate feedback and suggestions on how to extend and improve it. My goal is to have an opensource, community driven alternative for Team Hashcat's List Condense (LC) collaboration server ready by CMIYC2024.
I view JuypterLabs as a stone soup. It provides a good interface, interactive Python debugger, and a way to save and share analysis results. But it is still up to you to do all of the backend analysis. That became very evident when I used JupyterLabs in the CMIYC2023 contest (as detailed in the previous three blog posts on this site). I spent a lot of time debugging my code and messing with data structures when I really needed to be focused on cracking passwords. I quickly realized for this approach to be effective in future password cracking contests that I'd need to develop back-end data structures and classes to better organize all the data and provide built-in features to aid in common tasks.
Backing up a bit, in the past I've really enjoyed using MITRE CRITs (Collaborative Research Into Threats) [Link]. Unfortunately CRITs is no longer being maintained (the switch from Python2 to Python3 killed the project), but it was a tool for CSOC (Cyber Security Operation Center) team members to collaborate with each other when analyzing intrusion sets and threat actors. CRITs organized Top Level Objects (TLO) into the following buckets/categories:
It then made the data available in these different buckets cross linkable as well as accessible to various plugins. Following that approach, I created several different TLOs for this password cracking framework:
In the future as this toolset becomes more developed, I may end up taking a lot of the metadata out of Targets and putting it into its own TLO much like CRITs did. Also longer term, I may end up incorporating disk storage or a database, but for now I'm keeping this focused on helping with password cracking competitions, (vs. activities like pen-testing and managing ALL my password lists). You can download the framework right now, and it already has some example Notebooks in it that show how to use it on the CMIYC2023 challenges.
What this framework WILL NOT do is crack hashes or run your actual cracking sessions. I may add some scripting/logging support in the future, but this framework is focused on helping with data analysis as well as automating some of the busywork/repetitive tasks in a password cracking competitions such as creating custom wordlists, left list, translating between JtR and Hashcat, and hash submission.
A big question I have is how effective will this framework be in the NEXT password cracking challenge? Since I can't predict what Korelogic is going to do (beyond the fact that Bon Jovi references will somehow be involved), probably the best thing I can do is look back at past competitions. To that end I figured I might dig up the 2022 contest hashes and attempt to crack them again.
To obtain the challenge files, you can visit Korelogic's 2022 contest page [here].
Side note: Mad props to Korelogic for continuing to host past contest files. I really appreciate it!
Also, I had written a short blog post about my experiences in the contest, which is available [here].
I'll admit, it's always hard looking back on past work/documentation, but I'm a bit annoyed with my past self. I think that blog post has a lot of good information in it, but it really doesn't go into too much detail about the contest itself. On the plus side, that will make using this framework for this challenge more "realistic" since I can't just rely on my past documentation, and I have very hazy memories of what happened over a year ago.
The CMIYC2022 contest had three file "drops" over the course of the challenge. Aka not all the hashes were released at the start, so this gave teams something constantly new to bang their heads against.
The contest files are PGP encrypted and as a player you need to decrypt them with the password that KoreLogic provided. Since the first thing I always do is Google "How do you decrypt PGP files" I'm going to put the command here for future me. Also this will hopefully disabuse you early on the false idea that I know what I'm doing ;p
Do that for all three files, saving them with .tgz extensions. Then unzip the files using the command:
This creates three different directories filled with different encrypted file types. These are:
This competition starting to come back to me. The main issue with this challenge was to figure out how to crack the various encrypted file types. Once you cracked the top level file, it would present you an internal hash_list of fast to compute hashes that you need to crack for actual points. For the street teams, the top level file hashes are encrypted with fairly easy to guess passwords. For the pro teams ... not as much.
In my previous writeup [here] the first two "Tips" cover how to set up John the Ripper to crack these files. I'm going to largely skip those tips here, but assuming you followed them, the following commands can be used to extract and save the hash for the above file to a single file you can crack. I'm appending them all to a "encrypted_file_hashes.hash" file that I can load into JtR. If you want to use Hashcat to crack the hashes instead, you'll need to do some additional fixups to remove things like the username (or run hashcat with the --username field). Side note, I also highly recommend checking out [this] external blog entry about using John the Ripper to crack different file formats. I heavily leveraged it since it highlights things like to crack .odt files you need to use lbreoffice2john.
While I could certainly load these hashes into the JupyterLab Framework ... I don't see a lot of value doing so as there isn't a lot of "analysis" to do on them. Perhaps if/when I add the ability to keep track of targeted attacks, wordlists, and mangling rules run against a hash then it will make sense, but for now let's just crack these hashes so we can get access to the larger hash lists which we will be able to leverage the current JupyterLab Framework against. Below I'm going to list the JtR command I used (including mode) to crack the password as well as the plaintext. To help with spoilers, I'm setting the background of the plaintext to be black, but to read it you can just highlight the text and copy/paste it somewhere else.
The next step is to decrypt/unzip/open all the files and save the hashes from them. You may laugh, but I always forget the command line options to do this, so I'm listing how I did that below for future me. I'm also listing what types of hashes were in each file. To determine the hash type I mostly relied on looking at the scoreboard that KoreLogic published [link] and matching it to the hash. If that wasn't provided though, there would be several ways to figure out the hash such as using mdxfind.
Now that we have tens of thousands of password hashes to crack, it's time to use the JupyterLab password cracking framework. To do this, we'll first need to load the hashes into it, and to do that we'll need to configure the config file.
The framework uses the YAML file format for its configs. That means spaces/whitespace is important, but it is also a pretty flexible file format. For our config, we'll define our JtR and hashcat potfile locations, how much the hashes are worth, and where and how to load the hashes. I'll be including # Comments as well to help explain why I'm doing what I'm doing.
---
# This defines where the potfiles are for this contest. This lets you load cracked hashes from them
# as well as keep your JtR and HC potfiles synced.
jtr_config:
main_pot_file: "./challenge_files/CMIYC2022_Street/jtr_cmiyc2022.pot"
hashcat_config:
main_pot_file: "./challenge_files/CMIYC2022_Street/hc_cmiyc2022.potfile"
# This is information on where to load the challenge files. If they have additional metadata you
# may need to write a custom function to import them, but in this case they are pure raw hash lists
# so we can use the "plain_hash" plugin to import them.
#
# This plugin requires the hash type to be specified (if not it will default to "unknown"). Typically
# I use the JtR naming format for the hash type. The "source" field is used to list a source for the
# hashes in the framework. Basically it's a note for you later when looking at the cracked hashes.
#
challenge_files:
list14:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list14-4214-BrunnersMentalPrisoner.hashes"
format: "plain_hash"
type: "mysqlna"
source: "list14-BrunnersMentalPrisoner"
list16:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list16-FL_kdIZUGpI.txt"
format: "plain_hash"
type: "half-md5"
source: "list16-FL_kdIZUGpI"
list17:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list17.txt"
format: "plain_hash"
type: "raw-md5"
source: "list17"
list18:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list18.hash"
format: "plain_hash"
type: "raw-sha1"
source: "list18"
list19:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list19-paidanextra500000.hashes"
format: "plain_hash"
type: "raw-sha256"
source: "list19-paidanextra500000"
list20:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.hashes"
format: "plain_hash"
type: "raw-sha384"
source: "list20-Authoritiesappearto"
list21:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list21.hashes"
format: "plain_hash"
type: "mssql05"
source: "list21"
list24:
file: "./challenge_files/CMIYC2022_Street/sample_hashes/list24.hashes"
format: "plain_hash"
type: "ssha"
source: "list24"
# The score info is taken from the Korelogic scoreboard. This isn't necessary, but it is nice to have
# a local count of what your score should be so you can compare it to the official score to validate that
# you are submitting your cracks properly
score_info:
raw-sha384: 46
mysqlna: 17
raw-sha256: 13
mssql05: 9
raw-sha1: 5
ssha: 5
half-md5: 3
raw-md5: 1
This is the easy part for these hashes. The biggest challenge is setting up the config file. Once that's done you can just load it up in the current framework. At this point I should mention that I'll be eventually including the Notebook in this blog post in the example files in the framework github repo.
Once it is loaded you can run the built in tools to merge Hashcat and JtR potfiles, display cracked passwords, and calculate your expected score.
There's not a ton more analysis to do (besides look at the cracks). And don't worry, we'll get to that! But this is where I'm glad that I decided to go back through these old CMIYC challenges. The last contest in 2023 was very heavy in its use of metadata. So I of course focused on metadata analysis for this framework. But for this contest, there is very little metadata. The focus instead was on cracking encrypted containers and (cough, SPOILER ALERT, cough) building custom wordlists from online articles. This highlights other new capabilities that I really want to add into this framework. For example, going through logfiles, extracting the rule/dictionary that cracked a password, and then storing that data with the hash in this framework. That would be super cool! How about doing google searches on passwords for you? That would be cool too. That points to the stone-soup approach of this framework. Once we have these hashes/passwords/metadata stored in a searchable framework with a Python3 backend we can build upon this to add new capabilities.
Enough talking! Let's look at some cracks. Looking above, I managed to crack over 50% of the ssha hashes in under a minute cracking run in JtR. I wonder what those plaintexts look like. I can use the SessionMgr.print_all_plaintext(meta_fields=['source']) to display them.
You don't have to be a master password cracker to design an attack against these. That being said, going back to the score graph, they aren't worth a lot of points. Even if you cracked all 2k ssha hashes (which is totally doable) you would only get 10k points. That's a lot less than most of the other hash types. So this is a list to play around with when you don't have anything better to crack and need that serotonin hit to see cracks flash across your terminal.
That leads us to the other hash lists. Let's check out the raw-sha1 hashes:
At least for the limited cracks so far, it looks like these plaintexts were generated using some common words that have various mangling rules applied to them. You can start to see why I want to automate pulling out dictionaries + rules from JtR and HC logfiles to make it easier to reverse engineer these rules vs. having to do it by hand.
Actually, this might be a good time to end this blog entry and work on some of those new features.... All the code is uploaded to the gitlab site and I'll upload some sample hashes as well so you can follow along. But for now I probably need to look into parsing some Hashcat log files.