Terraform for Azure: Basics (8)


Directories and Stages

Hi there, this post was originally going to be the tail end of the seventh post in this series exploring the basics of using HashiCorp Terraform to deploy resources into Microsoft Azure using Azure DevOps. When I got to the end, however, I realised the post was already way too long, so I decided to split it. That means this post should be a bit shorter. Emphasis on the *should*.

When writing these posts I assume that you're following along with me and that your code is at the same level as mine, which generally means you've worked through the previous episodes:

  1. Prerequisites
  2. Repositories and Pipelines
  3. Build Pipeline and Resource Deployment
  4. Pipeline Security and Governance
  5. State File Storage Security
  6. Pipeline Refinement
  7. Modules

I'm going to write my usual preamble here: I'm writing this series as a memory aid for myself, with the hope that others on the same learning path can use it too. It's not a detailed guide to Terraform, DevOps or Azure, and it probably doesn't meet all the best practices that a DevOps engineer would use, but it works for me. I'm writing in the middle of 2024, so by the time you read this things might have moved on a little, but I suspect the core concepts will remain the same. As always, I'd like to credit my friend and colleague James Meegan for his original documentation, which has been used as a foundation for the series.

In this post I want to cover the differences between directories and workspaces (I use directories, I’ll explain why, but there are arguments for both and people’s preferences differ – choose what works for you). I’ll try and give examples of how we can create separate environments using pipeline stages, and I’ll also start adding a little more structure to our repository’s root folder.

Directories vs Workspaces

As I just mentioned above, there are arguments for using directories and those for using workspaces, and all are valid. What are they though, and what are they used for?

Let’s consider a typical deployment of a landing zone in Azure. You’re likely to want different regions, to give you business continuity or disaster recovery options. You’re probably also going to want production and non-production areas. All of these can be considered different environments, and HashiCorp offer two separate approaches using Terraform to separate out and manage these environments logically: directories and workspaces.

I'm not going to delve into the depths of each of these; you can research workspaces here, for example, if you have an interest in exploring that option. The brief summary, though, is that workspaces are separate instances of state data within the same working directory, and you refer to the currently selected workspace in your code via the built-in terraform.workspace value. Directories work just like Windows directories or folders, follow the same structure, and are what I'm using in this series. Because workspaces aren't visible in the code, I find it easy to lose track of which workspace I'm in, and I worry that I could accidentally make a change to the wrong environment. I use directories because, for me, they're easier to follow while reading the code, and I can see at a glance which environment I'm working in.
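Just to illustrate the difference, here's a minimal sketch of how a workspace-based approach typically appears in code, using Terraform's built-in terraform.workspace value. This isn't something we'll use in this series, and the resource shown is purely an example:

# Illustrative only: with workspaces the environment is implied by whichever
# workspace is currently selected, not by anything visible in the repository.
locals {
  environment = terraform.workspace # e.g. "production" or "development"
}

resource "azurerm_resource_group" "example" {
  name     = "rg-do-series-${local.environment}"
  location = "uksouth"
}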

Working with Directories

I'm going to put a bit of a caveat here before we move on to the next section. We are going to look at creating an environment in preparation for a second one, but we won't actually do anything with that second environment until a later post. If we added it now, we would need to use more advanced concepts that will come up organically as we move forward. This first step, though, will put us in a great position for later on.

So let’s start looking at our code from a different angle. I want two environments: production and development (development will be the one that comes in a later post!). So far, all our work has been done in the root directory, with modules held in their own separate subdirectory. We are going to use directories to separate out our environments. In the previous post we abstracted all our values from our code to make it repeatable and portable. We can now take advantage of that by putting the values for our separate environments into separate directories. All our values were stored in a .tfvars file, and it’s this that defines what goes into each of our environments, so each environment directory will have its own separate and distinct .tfvars file.

As with modules, I want to create a containing folder that I'm going to call "config". The name is not important as long as you understand its contents. Inside "config" I'm going to create a "production" directory and a "development" directory, each containing its own .tfvars file. Again, the name prefix can be anything you want, but for that at-a-glance understanding of which environment you're updating, it's a good idea to make it descriptive:
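For reference, the repository root ends up looking roughly like this. The development file name is just a placeholder at this stage (we'll only be wiring up the production one in this post), and the other folder and file names are whatever you set up in the earlier posts:

config/
  production/
    do_series_prd.tfvars
  development/
    do_series_dev.tfvars
modules/
main.tf
providers.tf
variables.tf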

There are many ways you can choose to organise your environments, and you should choose your options individually for each project. You might for example want to have your environments broken down by region. Maybe you’ll even want your different environments in totally different repositories; a repository for development and a repository for production for example, with each having different access for different people using granular permissions. How you do this is up to you, but this series should give you the skills to understand the processes you should follow when you’ve made your decisions.

So back to our own deployment. Although we now have environment-specific .tfvars files, the pipelines have no knowledge of them, so when Terraform runs it will just use the .tfvars file in the root. We reference the new environment files using the "commandOptions" key/value pair in our YAML pipelines, with the value being the path to the specific .tfvars file we need for our environment. We're also going to disable state locking with -lock=false, so that the state file can't be left locked if the pipeline run ends unexpectedly. Finally, we'll add -out to write the results of the plan to a file that can be used in an apply stage. We're focussing on the production environment for now, so in our validation.yml file we will add the following line, after "command" towards the end of our "plan" task:

commandOptions: "-var-file=config/production/do_series_prd.tfvars -lock=false -out=prdplan"
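For context, the end of the plan task will then look something along these lines. I'm assuming the Microsoft DevLabs Terraform task here; the surrounding inputs are illustrative and yours will be whatever you set up in the earlier posts, with only the commandOptions line being new (match the indentation of your existing file):

- task: TerraformTaskV4@4
  displayName: Run Terraform Plan
  inputs:
    provider: 'azurerm'
    command: 'plan'
    commandOptions: "-var-file=config/production/do_series_prd.tfvars -lock=false -out=prdplan"
    environmentServiceNameAzureRM: 'Your-Service-Connection'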

Now we’ve done that, we need to move all the values from the .tfvars file in our root directory to the .tfvars file in our production directory:
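Purely as an illustration, do_series_prd.tfvars might now contain something like this; the variable names and values here are placeholders, and yours will be whatever you abstracted out in the previous post:

# config/production/do_series_prd.tfvars (illustrative values only)
resource_group_name = "rg-do-series-prd"
location            = "uksouth"
vnet_name           = "vnet-do-series-prd"
vnet_address_space  = ["10.0.0.0/16"]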

Ensure the .tfvars file in the root is now empty, then save all your files, check your formatting, commit and sync, then check your validation pipeline job to make sure there are no errors and that it's picked up your resources with no changes. When you've confirmed all is OK, we need to add the same information to our apply pipeline. This pipeline is a little different to validation. First of all, remember that there are two plan tasks (one for each stage) and one apply task. Also, we already have a "commandOptions" line for each task. We need to replace each of the two plan task lines with our new command options, remembering to indent appropriately. As for the apply task, that already has a "commandOptions" line with the value "plan", telling the apply task to apply exactly what the plan task has just produced; we just need to change that value to "prdplan". We don't need to do anything else, not even point the apply task at the new .tfvars file:
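As a quick sketch, the relevant lines in the apply pipeline end up as follows (everything else in those tasks stays exactly as it is):

# Both plan tasks (one per stage):
commandOptions: "-var-file=config/production/do_series_prd.tfvars -lock=false -out=prdplan"

# The apply task:
commandOptions: "prdplan"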

Once you’ve done that, again, save all files, check formatting, commit and sync, then once your validation pipeline job is complete, perform a pull into main and watch your apply pipeline job to make sure all goes well there, and your resources remain in-situ.

Adding a new Environment

As mentioned earlier, we won't be adding a new environment in this post, but there are a couple of things we need to take note of. We're now at a point where we have our deployment code in the root directory and our production data in a subdirectory, and it's that production data that defines the production environment. I pointed out back in the first post of this series that different environments would likely need different backend state files. If we pointed a second apply stage at the development .tfvars file we created in the development directory, it would compare those development values against the same backend state file, see that none of our production resources appear in the development values, and delete them, replacing them with our development resources. So we need a separate state file that has no knowledge of our production environment, and therefore won't delete anything it doesn't find in our development data. As we move forward in the series we'll look at how we create that .tfstate file and reference it in our pipelines.
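To give a flavour of where this is heading, the separation ultimately comes down to each environment's initialisation pointing at its own state file, which for the azurerm backend is typically just a different key per environment. A rough sketch only; the key names are placeholders, the rest of the backend settings would normally come from your init task, and we'll do this properly in a later post:

# Illustrative partial backend configuration; each environment gets its own key
terraform {
  backend "azurerm" {
    key = "production.terraform.tfstate" # development would use a key of its own
  }
}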

To prepare for that, we need to update our validation pipeline so that it can run the plan tasks for multiple environments. This means creating stages in the pipeline, as we did in the apply pipeline, so that each initialisation (which is what refers to the state file) is treated as a separate piece of code. We will need the following tasks in each stage:

  • Check formatting and check out the repo
  • Open the storage account firewall
  • Perform the Terraform initialisation
  • Perform the Terraform plan
  • Lock down the storage account firewall

As with our apply pipeline, we’re going to have to be especially careful around the formatting and indentation. Let’s start with the initial stage information, which we’ll put straight after our agent image information:

# Stages tied to environments
stages:
# Production Plan Stage
- stage: prodplan
  jobs:
  - deployment: NoCheck
    displayName: No Check
    environment: 'Test-Resources-Repo-Validation'
    strategy:
      runOnce:
        deploy:

Note here that I’ve given the stage a name relating to production, and that I’m using the validation environment I set up for the apply pipeline. You might find it makes more sense for you to create completely new pipeline environments for your stages, to help you understand their tasks and give you granular choices over governance, but for the purposes of this series I’m just going to make use of the existing ones.

Next we need to add a line to check out the repository for the stage and indent our current plan steps to match the stage appropriately (select the remaining data in the pipeline and press tab until we’re one step in from “deploy”):

        deploy:
          # Steps to perform as part of the pipeline operations
          steps:
          # Check out the current repository branch
          - checkout: self
          # Check Terraform Formatting
          - script: |
                terraform fmt -check -recursive --diff
            displayName: Check Terraform Formatting

If you want, you can add comments or other hints that this is your production stage. I’ve updated the following display name lines:

displayName: Run Terraform Init (PRD)
displayName: Run Terraform Plan (PRD)

That’s our production stage done, and we’re in a position to add stages for new environments as we move forwards.

We've now got stages in our validation pipeline, initially for a single environment. As we add more environments or stages, each will have its own backend state file. Save all your files, check your formatting, commit and sync, then watch your validation pipeline job run. Because the new stage is accessing the pipeline environment for the first time, you'll need to grant it permission to do so:

Tidying up the Root Directory

Before we move on to our next post, it's time to do a little housekeeping. We know that Terraform reads every .tf file in a folder together, as if they were a single file. We also know that as we add more and more resources to deploy, our main.tf file is going to keep expanding, which will make it difficult to manage. I find a good way of dealing with that expansion is to break out each resource type into its own .tf file in the folder. I also find it useful to number the files so they always remain in a consistent order that's easy to scan. I'll begin by renaming my providers.tf and variables.tf to begin with a low number, so they'll always appear at the top of my list:
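Something like this, for example (the exact numbers and names are entirely down to personal preference):

01_providers.tf
02_variables.tf
main.tf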

In my main.tf file, my first code block is deploying resource groups, so I’ll create a new numbered .tf file for resource groups, then move the code from main.tf to that new file:
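Purely as a sketch, and assuming a plain resource block rather than whatever module call you may actually have in main.tf, the new file might look like this:

# 03_resource_groups.tf - moved unchanged from main.tf (illustrative block)
resource "azurerm_resource_group" "rg" {
  name     = var.resource_group_name
  location = var.location
}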

Next in main.tf comes VNets, so I create a new numbered vnets.tf file and move the code from main.tf into it:
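Again purely as a sketch, assuming a plain resource block rather than a module call:

# 04_vnets.tf - moved unchanged from main.tf (illustrative block)
resource "azurerm_virtual_network" "vnet" {
  name                = var.vnet_name
  location            = var.location
  resource_group_name = var.resource_group_name
  address_space       = var.vnet_address_space
}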

The main.tf file should now be empty, with all code for different resources in their own sensibly labelled files. You can take this further if you wish and rename any of the other files. When you're done, save, check formatting, commit and sync, make sure all is good in your plan, then pull to main (checking that your apply runs smoothly)!

Summary

So there we go, I managed a relatively short post! We are now a long way towards understanding different methods of creating multiple environments, and we’ve got our code ready for a more complex deployment. In the next post, which I can’t promise will be as short, we’ll be looking at putting sections of our YAML pipelines into templates, separating out repeated code that is project or client agnostic as we do with our Terraform modules.

Let me know how you’re doing, leave me a comment with your progress and any tips or gotchas you might be coming across yourselves.

Until next time

– The Zoo Keeper

