State File Storage Security
Introduction
This fifth episode in our series of posts dealing with the basics of HashiCorp Terraform for Azure is going to be a bit shorter than usual. I felt we needed a little breather after the last epic that we had to work through. I’m going to discuss how to lock down the Terraform state storage account and manipulate its firewall while we perform DevOps operations.
So what have we done so far? We’ve covered the prerequisites, repositories and pipelines, deploying resources and last time we got to stages, environments and branches. Hopefully you’re organically learning how to use Terraform and Azure DevOps to deploy resources into Azure in a secure and governed manner. The usual message at the start of each post in this series applies:
This series is designed as a memory aid for myself, that can maybe help others out going through the same learning process about Terraform and Azure. It’s not a detailed guide for somebody new to Terraform, DevOps or Azure, it probably doesn’t meet best practices in the DevOps industry, but it’s what I’ve found works for me!
Note that I’m writing this in the middle of 2024, and technology moves on quickly – things may well have changed by the time you’re reading it. As always I’d like to thank James Meegan for his original documentation which I’ve used as a foundation for this series.
As I go forward with each post I’m going to assume that you’ve read each previous post in the series, and that you’re following along with the code and have everything at the same level, with a validation and build pipeline created and secured, a resource group deployed in Azure, a develop and a main branch, and an understanding of how to pull code from the develop branch into main.
Storage Account Firewall
We know that we have a storage account in Azure that is hosting our Terraform state file. Best practice is to ensure that the firewall for your storage accounts is turned on, preventing public access over the internet. Unfortunately, the Microsoft-hosted DevOps agents that examine our infrastructure and deploy resources are themselves on the internet, running in Microsoft's own data centres. The question, then, is how do we make sure our agent can read the state file and deploy resources, whilst keeping our storage account secured? I've got three answers, and I'll explore the benefits and drawbacks of each:
- Whitelist the DevOps IP address(es)
- Self-Hosted Agents
- Temporary Storage Account Firewall Disable
Whitelist the DevOps IP Address(es)
This seems like a great idea: add the source IP address of the Microsoft DevOps agent and we're away. Unfortunately, each time the agent runs, Microsoft spins up a fresh image, and the IP address can be different every time. There is a way around this: the pipeline can pull the public IP address of the agent VM and whitelist it temporarily on the storage account's firewall (see the sketch below). Unfortunately, there is a known issue with this process: if the storage account is in the same region as your Azure DevOps organisation, or even a paired region, the pipeline can pick up the agent's private IP address and try to whitelist that instead of the public one, which will fail and cause the pipeline to fail. For this reason, I'm not going to go any further into this option, but you could research it and see if it's right for you.
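For completeness, here's a minimal sketch of what that temporary whitelist could look like. The service connection and storage account names match the ones used later in this post, but the ipify lookup and the exact task layout are my own assumptions rather than a tested recipe:

# Temporarily whitelist the agent's public IP (sketch only)
- task: AzureCLI@2
  displayName: Whitelist Agent Public IP
  inputs:
    azureSubscription: 'terraform-series-sc'
    scriptType: 'pscore'
    scriptLocation: 'inlineScript'
    inlineScript: |
      # Ask an external service for the agent's public IP (assumption: ipify)
      $agentIp = Invoke-RestMethod -Uri "https://api.ipify.org"
      # Add a temporary rule to the storage account firewall
      az storage account network-rule add --resource-group "uks-tfstatefiles-rg-01" --account-name "ukstfstatefilessa01" --ip-address $agentIp

A matching "az storage account network-rule remove" call at the end of the pipeline would then take the rule back out.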
Self-Hosted Agents
To make sure there's no internet access at all to your storage account, you can use a self-hosted agent in your own environment. Just stand up a virtual machine in Azure (or a scale set, which is my own preference for larger deployments) and install the agent software. Not only can you then be more confident that your storage account is secure, but you can also use the agent to deploy resources to, for example, an internal-only web application with no internet connectivity. You can learn a little more about self-hosted agents here. I think these are a great idea with a lot of benefits, but there is obviously a cost involved, and you're deploying extra infrastructure before your IaC is even ready and deployed. It's definitely another one to research, but for the purposes of this series, I'm not going to use it.
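If you do want to experiment, registering an agent on a Linux VM looks roughly like this after downloading and extracting the agent package from Azure DevOps. The organisation name, pool name and personal access token are placeholders you'd supply yourself:

# Configure and start a self-hosted agent (organisation, pool and PAT are placeholders)
./config.sh --unattended \
  --url https://dev.azure.com/your-organisation \
  --auth pat --token <your-pat> \
  --pool Default \
  --agent $(hostname)
./run.sh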
Temporary Storage Account Firewall Disable
So of the options I posited, that leaves us with temporarily disabling the storage account firewall while we access the state file, then re-enabling it on completion. You could control the firewall manually, turning it off before you run your pipeline, but our goal is automation, and we want to remove the risk of forgetting to turn it back on. The method I've chosen for this series of posts is to have the pipeline turn off the storage account firewall, perform its tasks, then, as a final step in the process, turn the firewall back on again. Note that this final step should be executed whether or not your plan or apply was successful.
Disable the Firewall in the Validation Pipeline
We have two pipelines: a validation pipeline and an apply pipeline. In the validation pipeline we will turn off the storage account firewall, run the init and plan steps, then turn the firewall back on. The apply pipeline has two stages, and we need to repeat the process for each, so we will disable the firewall for the first stage, run that stage's init and plan, then re-enable it. After our check is passed and the apply stage runs, we will turn off the firewall, run init, plan and apply, then turn the firewall back on. In a later post we will look at putting the code that performs these tasks into separate files called from our pipelines, but for now we'll keep everything in the same file to help with a logical understanding of the process.
We will be adding a new task with an "inlineScript" option, which will run the code that disables and enables the storage account's firewall. The task will be the first one in our "steps" code block, so it will sit right at the top. Taking care with our indentation, we will put in a comment, a task type (this being an Azure CLI script rather than a Terraform operation, it will be AzureCLI rather than TerraformTask), a display name and some inputs that prepare for the actual line of code that disables the firewall. We could add a code block to the pipeline in Azure DevOps using the helper, as we did in the second post of this series, but as usual we're going to use Visual Studio Code. Have a play yourself, though, using the Azure CLI helper and see what you can come up with. Open your validation.yml file and add the following code:
# Disable the Storage Account Firewall
- task: AzureCLI@2
  displayName: Disable Storage Account Firewall
  continueOnError: false
  inputs:
    azureSubscription: 'terraform-series-sc'
    scriptType: 'pscore'
    scriptLocation: 'inlineScript'
    inlineScript: |

To break this down a little further, this is what we’ve added:
- Comment – let’s show people (ourselves included) what’s going on
- Task – it’s a CLI script, so we’re telling Azure it’s AzureCLI@2
- Display Name – helps us visualise the steps when reviewing the pipeline job
- Continue on Error – this is a new one; setting it to false means that if this task fails, the pipeline stops rather than carrying on with the remaining tasks.
- Inputs – this is what we want the task to actually do:
- Azure Subscription – this is as usual the service connection, rather than the subscription name
- Script Type – we’re going to be using PowerShell Core
- Script Location – this could be another file as previously mentioned, but for now this is an inline script
- Inline Script – so this is our actual code. Note that I've put a pipe (|) symbol at the end of the line. In YAML this starts a literal block: the indented lines that follow are treated as the script, each kept as its own line. It lets us lay out a multi-line script clearly rather than cramming everything onto one line (see the small example after this list).
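As a quick illustration of the literal block behaviour, these two Write-Host lines (placeholders, not part of our pipeline) arrive at the agent as two separate script lines:

inlineScript: |
  Write-Host "This is line one of the script"
  Write-Host "This is line two, kept as a separate line"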
OK, so we've told the pipeline agent what to expect, but we've not actually given it any code to run yet. This will go on the next line. Microsoft always give examples of how to do things in code, so I've done a bit of hunting and found that we use the command "az storage account update". Further testing showed me that setting public network access to Enabled on its own wasn't enough: it only enabled access from selected virtual networks and IP addresses. We need to do that, and then also set the default action to Allow. So the actual script I need to use is:
inlineScript: |
  az storage account update --resource-group "uks-tfstatefiles-rg-01" --name "ukstfstatefilessa01" --public-network-access Enabled
  az storage account update --resource-group "uks-tfstatefiles-rg-01" --name "ukstfstatefilessa01" --default-action Allow

If you want, you can now save and synchronise your pipeline and watch as your storage account firewall is turned off. Turn it back on manually in the Azure portal, then run your pipeline again and it'll turn off once more. Great! But let's do more!
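Incidentally, if you'd rather check from the command line than the portal, a query like this should show the two settings we're changing. The output field names here are my reading of the storage account JSON, so treat it as a sketch:

az storage account show --resource-group "uks-tfstatefiles-rg-01" --name "ukstfstatefilessa01" --query "{publicNetworkAccess: publicNetworkAccess, defaultAction: networkRuleSet.defaultAction}" --output table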
One of the nice things about using code is that when we're repeating the same information again and again, we can save ourselves a bit of time and reduce the risk of typos by using variables. We're repeating our resource group name and our storage account name, so I've created variables for those and replaced their entries in the script lines. You'll also notice that I've added another line of code at the end of the script beginning "Start-Sleep". This tells the pipeline to wait for the specified length of time before moving on to the next task. Azure can take a while to actually disable the firewall, and I've noticed that without this pause the next task can fail because it still can't reach the storage account. The last thing I've added, just after defining the variables, is a "Write-Host" statement, which puts a message on the screen when you view the pipeline job, letting you know where you are in the process:
inlineScript: |
  $resourceGroupName = "uks-tfstatefiles-rg-01"
  $storageAccountName = "ukstfstatefilessa01"
  Write-Host "Disabling Storage Account Firewall, Please Wait..."
  az storage account update --resource-group $resourceGroupName --name $storageAccountName --public-network-access Enabled
  az storage account update --resource-group $resourceGroupName --name $storageAccountName --default-action Allow
  Start-Sleep -Seconds 60
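A fixed 60-second sleep works, but if you'd rather wait only as long as necessary, one alternative is to replace the Start-Sleep line inside the script above with a loop that polls the data plane until the state container is actually reachable. This is a sketch: the container name "tfstate" and the use of --auth-mode login are assumptions about your setup:

# Poll until the storage container responds, up to roughly two minutes (sketch only)
$tries = 0
do {
  Start-Sleep -Seconds 10
  $tries++
  # Exits non-zero while the firewall still blocks us
  az storage container show --name "tfstate" --account-name $storageAccountName --auth-mode login --only-show-errors
} while ($LASTEXITCODE -ne 0 -and $tries -lt 12)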

So that's enough to disable our storage account firewall, after which our tasks will be able to read the Terraform state file and perform the init and plan steps. Next, we need to enable the firewall again after everything's done. All we really need to do here is copy the block we've just created and paste it in again after our Terraform Plan task, changing just a couple of things:
- Update the comment
- Update the display name
- Update the “Write-Host” comment
- Update the actual script to disable public access
- Reduce the sleep time to 20 seconds as there are no further tasks
You’ll notice that the variables are defined again at the beginning of the script. This is a new task with a new script so it’s required:
# Enable the Storage Account Firewall
- task: AzureCLI@2
  displayName: Enable Storage Account Firewall
  continueOnError: false
  inputs:
    azureSubscription: 'terraform-series-sc'
    scriptType: 'pscore'
    scriptLocation: 'inlineScript'
    inlineScript: |
      $resourceGroupName = "uks-tfstatefiles-rg-01"
      $storageAccountName = "ukstfstatefilessa01"
      Write-Host "Enabling Storage Account Firewall, Please Wait..."
      az storage account update --resource-group $resourceGroupName --name $storageAccountName --public-network-access Disabled
      Start-Sleep -Seconds 20
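One thing worth noting: because the earlier tasks have continueOnError set to false, a failed init or plan will stop the pipeline before this task runs, leaving the firewall open. Azure DevOps lets you force a step to run regardless of earlier failures with a condition. This is a tweak you could make yourself; I haven't wired it into the files for this post:

- task: AzureCLI@2
  displayName: Enable Storage Account Firewall
  condition: always()   # run this step even if an earlier step failed
  continueOnError: false
  inputs:
    # ...same inputs and inlineScript as the block above...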

That’s it. With a suitable comment, save your file, check your formatting and commit & sync up to your develop branch. If you want to see what’s happening, just go into the networking section of your storage account in the Azure portal and refresh the screen regularly after hitting commit. As your pipeline runs you should see the firewall access change from Disabled to “Enabled from all networks” and back again:

If you look at the steps your pipeline has run in its job, you will notice new entries for each of the firewall blocks you've just added:

Disable the Firewall in the Apply Pipeline
What we need to do now is add these steps to our apply pipeline. Remember that this pipeline has two stages, which are treated as completely independent, so we'll need to add the steps individually to each stage. As always with YAML, we need to be very careful with our spacing and indentation. The tasks can effectively be copied from our validation pipeline and pasted into the "steps" section of each stage of our apply pipeline, indented to the same level as the other tasks. The rough outline below shows where they end up.
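Here's a rough outline of the shape we end up with. The stage and job names are placeholders of mine rather than the exact names from the linked files, and the task inputs are elided:

stages:
- stage: terraform_plan
  jobs:
  - job: plan
    steps:
    - task: AzureCLI@2   # Disable the Storage Account Firewall
    - task: ...          # Terraform init and plan tasks
    - task: AzureCLI@2   # Enable the Storage Account Firewall
- stage: terraform_apply
  jobs:
  - job: apply
    steps:
    - task: AzureCLI@2   # Disable the Storage Account Firewall
    - task: ...          # Terraform init, plan and apply tasks
    - task: AzureCLI@2   # Enable the Storage Account Firewall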

If after pasting into the new file in Visual Studio Code the indentation isn't quite right, you can select all the text and press <Tab> to move everything at the same time and keep all the same relative locations. In my own files I've added a couple of lines to the comments at the top to explain the new firewall manipulation, to help those who might follow on from me. Rather than use code blocks here, I've linked the pipeline files if you want to see exactly how I've configured them:
And that's it. Save, commit and sync your files to your develop branch, and pull them into main if you want. If you look back, we've actually covered a lot of ground in this post. We've automated the manipulation of our storage account firewall using scripts, we've added and used variables, and we've updated our validation pipeline and both stages of our apply pipeline, all with different indentation requirements. We've gained a deeper understanding of why the DevOps agent needs access to the Terraform state files, and of how we can ask our pipelines to pause for a while to let Azure do its thing.
In the next post, we’ll try and add a bit more refinement to our pipelines by adding formatting checks and putting some of our more sensitive information into variables that are locked away in our DevOps library.
Until next time…
– The Zoo Keeper