After the initial investigation entitled “AWS CloudQuarry: Digging for secrets in Public AMIs” was finalized, we continued with the same idea on Azure in order to search for hidden and forgotten secrets in Azure VM Images.
I will try to keep this article short and present how we managed to collect approximately 120GB of data by scanning 15.000 images. This was achieved by using a simple and efficient tool we created called “Azure Image Scanner” or AIS (GitHub Link).
Article author: Stefan Tita (https://www.linkedin.com/in/stefan-tita-55a349a3/)
Shoutout to Eduard and Matei for the original idea on AWS.
When you want to deploy a new VM in Azure you have the option to choose a public image template from Azure Marketplace or Community Images.
Community images, as the name suggests, can be shared by anyone and may come from an unknown and unverified source, so you should avoid using them unless you know and trust the publisher. At the time of this investigation there were approximately 6.000 community images, but this number keeps increasing, so you can find the updated list at the following link:
Generating the community images file is easy as the Azure Portal allows “Export to CSV” and the exported file can be used with Azure Image Scanner.
Azure Marketplace images, on the other hand, come from validated providers and can be used for commercial purposes, but you should still exercise some care and avoid expensive images or unknown providers when deploying VMs.
At the time of the investigation there were approximately 52.000 Marketplace Images, but this number has also been increasing over time. The following command generates the updated list of Marketplace Images ready to be used when running the AIS tool.
#export marketplace images to file (collapses repeated spaces and removes table headers)
az vm image list --all -o table | sed -E 's/ +/ /g' | tail -n +3 > images_marketplace.txt
Once we exported the images we proceeded to exclude images from well known providers and decided to scan a total of 15.000 images from lesser known providers (10.000 Marketplace Images and 5.000 Community Images).
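The exclusion step can be sketched roughly as follows. This is not part of AIS: the function name `filter_publishers` and the file names in the usage comment are hypothetical, and the sketch only assumes that every exported line contains the image URN in the standard `publisher:offer:sku:version` form somewhere among its space-separated fields.

```shell
#!/usr/bin/env bash
# Sketch: drop images whose publisher appears in an exclusion list.
# Assumption: each input line has a URN "publisher:offer:sku:version"
# in one of its fields (the column order is not assumed).

# filter_publishers IMAGES_FILE EXCLUDE_FILE
# EXCLUDE_FILE holds one publisher name per line (case-insensitive).
filter_publishers() {
    local images_file="$1" exclude_file="$2"
    awk '
        # First file: remember the excluded publishers, lowercased.
        FILENAME == ARGV[1] { if ($1 != "") skip[tolower($1)] = 1; next }
        # Second file: locate the URN field and test its publisher.
        # Lines without a URN field are dropped.
        {
            for (i = 1; i <= NF; i++) {
                if (split($i, parts, ":") == 4) {
                    if (!(tolower(parts[1]) in skip)) print
                    next
                }
            }
        }
    ' "$exclude_file" "$images_file"
}

# Example usage (hypothetical file names):
# filter_publishers images_marketplace.txt known_publishers.txt > images_filtered.txt
```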
Deciding on the best approach to scan 15.000 images was not difficult because Azure allows creating Managed Disks from images and attaching them to virtual machines. This can be observed in the following image:
So we created Azure Image Scanner to automate the above process and scan Managed Disks to extract potential secrets.
The initial downside was that each of these commands (create, attach, detach, delete Managed Disk) takes some time to complete, and waiting for each one in sequence is very inefficient. To solve this, AIS launches all steps in advance, in the background, and each step checks that its prerequisites are met before executing. This saves at least 30 seconds on each image scanned.
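The pattern of starting every step up front and letting each one wait for its prerequisite can be sketched in plain bash. This is not the AIS source code: `wait_until` is a hypothetical helper, and the `sleep` call and marker files stand in for the real Managed Disk operations.

```shell
#!/usr/bin/env bash
# Sketch: start all steps in the background; each step polls until its
# prerequisite holds, then runs. Marker files simulate Azure resource state.

# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND once per second until it succeeds or the timeout expires.
wait_until() {
    local timeout="$1"; shift
    local waited=0
    until "$@"; do
        (( waited >= timeout )) && return 1
        sleep 1
        waited=$(( waited + 1 ))
    done
}

state_dir=$(mktemp -d)

# Step 1: "create disk" (simulated by a short sleep).
( sleep 1; touch "$state_dir/disk_created" ) &

# Step 2: "attach disk", started immediately but gated on step 1.
( wait_until 30 test -e "$state_dir/disk_created" \
    && touch "$state_dir/disk_attached" ) &

# Step 3: "scan", gated on step 2.
( wait_until 30 test -e "$state_dir/disk_attached" \
    && echo "scanning" > "$state_dir/scan_result" ) &

wait   # block until every background step has finished
result=$(cat "$state_dir/scan_result")
echo "$result"
rm -rf "$state_dir"
```

Because every step is already running and waiting, the next one starts as soon as its prerequisite appears, with no extra hand-off delay between commands.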
Initiating a scan with AIS is a rather simple process that includes the following steps.
1. Deploy a Debian/Ubuntu based VM in Azure. The AIS script should run on any size VM that is Debian/Ubuntu based.
2. Copy the AIS script along with the exported images file to any location on the VM.
3. Edit the following variables inside the AIS script. You will need to create an Azure Storage Account container and a SAS token, so that AIS can store the extracted data.
#MUST CHANGE
imagesfile="images.txt"
containerurl="" #Storage account container for data storage
sastoken="" #Required to access the container
4. Using the root user, install Azure CLI on the VM and log into your Azure account.
sudo su
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
az login
5. Run Azure Image Scanner as root. (Optional: use “screen” command to run the script in a background session)
sudo su #root access required to execute elevated commands
screen -L #run screen session and log data to file
bash AIS.sh
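Step 3 needs a container URL and a SAS token. One possible way to produce them with Azure CLI is sketched below. The helper names and resource names are hypothetical placeholders, the permission set and 14-day default expiry are assumptions (the article does not state which permissions AIS needs), and the storage account is assumed to already exist and be reachable with your current `az login` session. The expiry calculation relies on GNU `date`, as found on Debian/Ubuntu.

```shell
#!/usr/bin/env bash
# Sketch: helpers to create a blob container and a SAS token for it.
# All account and container names are placeholders chosen by you.

# container_url ACCOUNT CONTAINER -> prints the blob container URL
# (public Azure cloud endpoint assumed)
container_url() {
    printf 'https://%s.blob.core.windows.net/%s\n' "$1" "$2"
}

# make_container ACCOUNT CONTAINER
make_container() {
    az storage container create --account-name "$1" --name "$2" --output none
}

# make_sas_token ACCOUNT CONTAINER [DAYS_VALID]
# Permissions: a=add, c=create, w=write, l=list (assumed enough for uploads).
make_sas_token() {
    local expiry
    expiry=$(date -u -d "+${3:-14} days" '+%Y-%m-%dT%H:%MZ')
    az storage container generate-sas \
        --account-name "$1" --name "$2" \
        --permissions acwl --expiry "$expiry" \
        --https-only --output tsv
}

# Example usage (hypothetical names):
# make_container mystorageacct ais-results
# containerurl=$(container_url mystorageacct ais-results)
# sastoken=$(make_sas_token mystorageacct ais-results)
```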
Let us go over some of the functionalities and constraints of the tool:
find "$mount_point" \( ! -path "$mount_point/Windows/*" -a ! -path "$mount_point/Program Files/*" -a ! -path "$mount_point/Program Files \(x86\)/*" \) -size -25M \
    \( -name ".aws" -o -name ".ssh" -o -name "credentials.xml" -o -path "*secrets/master.key" -o -ipath "*secrets/hudson.util.Secret" \
    -o -name "secrets.yml" -o -name "config.php" -o -name "_history" -o -ipath "*.azure/accessToken.json" -o -name ".kube" -o -iwholename "/var/run/secrets/kubernetes.io/serviceaccount/token" \
    -o -name "autologin.conf" -o -iname "web.config" -o -name ".env" \
    -o -name ".git" \) -not -empty
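To see how this kind of filter behaves, the sketch below runs a reduced version of the predicate against a throwaway directory tree. The tree, the shortened name list and the `demo_root` variable are made up for illustration; the real scan uses the full command above.

```shell
#!/usr/bin/env bash
# Sketch: reduced version of the find filter, run on a fake mount point.

demo_root=$(mktemp -d)
mkdir -p "$demo_root/home/user/.ssh" "$demo_root/Windows/System32" "$demo_root/srv/app"
echo "fake-key"  > "$demo_root/home/user/.ssh/id_rsa"
echo "DB_PASS=x" > "$demo_root/srv/app/.env"
echo "noise"     > "$demo_root/Windows/System32/.env"   # under an excluded path
: > "$demo_root/srv/app/config.php"                      # empty, so skipped

# Same structure as the full command: path exclusions, a size cap,
# a list of interesting names, and a final non-empty check.
matches=$(find "$demo_root" \
    \( ! -path "$demo_root/Windows/*" \) -size -25M \
    \( -name ".ssh" -o -name ".env" -o -name "config.php" \) \
    -not -empty | sed "s|^$demo_root||" | sort)

echo "$matches"
rm -rf "$demo_root"
```

Only `/home/user/.ssh` and `/srv/app/.env` are reported: the `.env` under `Windows` is filtered out by the path exclusion, and the empty `config.php` is dropped by `-not -empty`.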
Scanning the 10.000 Marketplace Images and 5.000 Community Images produced 120GB of data. It took 9 days to scan all 15.000 images with 5 VMs running AIS concurrently, and the total cost was about 90 EUR…not bad (this included running the VMs, using Managed Disks, storing the extracted data in the Storage Account and some network traffic).
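These totals translate into per-image figures with some quick arithmetic. The sketch below uses only the numbers quoted above and assumes that the five VMs ran continuously for the full nine days and that 120GB is taken as 120,000 MB.

```shell
#!/usr/bin/env bash
# Back-of-the-envelope per-image figures from the totals in the article.
images=15000; days=9; vms=5; cost_eur=90; data_mb=120000

summary=$(LC_ALL=C awk -v n="$images" -v d="$days" -v v="$vms" \
              -v c="$cost_eur" -v mb="$data_mb" 'BEGIN {
    printf "minutes per image per VM: %.2f\n", d * 24 * 60 * v / n
    printf "cost per image (EUR): %.3f\n", c / n
    printf "average data per image (MB): %.0f\n", mb / n
}')
echo "$summary"
```

Under those assumptions each VM spent about 4.32 minutes per image, at roughly 0.006 EUR and 8 MB of extracted data per image.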
The original article goes into more detail on how we parsed the data to identify potential secrets, so I won’t repeat it here. We mainly used a combination of Gitleaks, Trufflehog and Keyhacks, which simplified the work immensely. However, some manual work was still needed to parse and validate some of the results.
Even though the results are not as juicy as in the AWS research, I was able to find some impactful accounts:
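As a rough illustration of a first automated pass over the downloaded data, the two scanners could be run as sketched below. This is not the exact pipeline from the original article: the function name `scan_extracted` and the directory names are hypothetical, and the flags assume Gitleaks v8 and TruffleHog v3, so check them against the versions you have installed.

```shell
#!/usr/bin/env bash
# Sketch: run Gitleaks and TruffleHog over a directory of extracted files.

# scan_extracted DATA_DIR REPORT_DIR
scan_extracted() {
    local data_dir="$1" report_dir="$2"
    mkdir -p "$report_dir"

    # Gitleaks v8: --no-git treats the source as plain files, not a repository.
    # It exits non-zero when leaks are found, hence the "|| true".
    gitleaks detect --no-git --source "$data_dir" \
        --report-format json --report-path "$report_dir/gitleaks.json" || true

    # TruffleHog v3: the filesystem subcommand scans a directory tree and
    # --json prints one JSON object per finding.
    trufflehog filesystem "$data_dir" --json > "$report_dir/trufflehog.json" || true
}

# Example usage (hypothetical paths):
# scan_extracted ./ais_data ./reports
```

The JSON reports still need the manual review and validation mentioned above.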
On top of this there were a significant number of database credentials found in config files, Kubernetes tokens, SSH keys and various others that are not associated with public services and cannot be validated.
Unfortunately, after attempting to contact the impacted providers I did not get any response back.
It is worth mentioning that the most frequently discovered and most impactful type of secret was AWS access keys, both when scanning AWS AMIs and Azure Images. Access keys are a convenient and popular way to grant granular user access, but they must be properly handled, stored and disposed of to avoid leaks. Moreover, access keys should only be used when long-term access is required; otherwise, alternatives like interactive logon through IAM Identity Center should be used.
When it comes to numbers, there is a big difference between the more than 2 million public AMIs on AWS and the roughly 60k public images on Azure. But as time passes and these numbers keep increasing, there are definitely going to be new images with hidden secrets, so new scans might be a good idea in the future and Azure Image Scanner might come in handy.