Lately I have been working through a number of automation tasks to create environments and deal with various customer scenarios.
I will say right now: I am all about 'cattle, not sacred cows', meaning my configurations are always kept separate from the machine. All of that configuration state can always come from someplace else.
15 years ago, I was doing this without automation. We rebuilt our servers at every upgrade of the primary application running on them, and I also updated all the firmware, etc. So I have been in this school of thought for a long time. We just didn't use source control back then; it was documents, settings files, and a vault.
The primary concept behind 'infrastructure as code' is that you can run some set of automation, then bundle up all of the artifacts that drove that automation as a documented source of truth.
Some folks think of the 'infrastructure as code' part as just the settings files: just the code and a set of variables. But I would argue that it is much larger than that.
For example, take an Ansible playbook. You have the source playbook and the environment variables passed in. But don't stop there.
You might also have some Jinja2 templates that the playbook used as a transform, temporary variables in flight, files that needed to be purged to harden the machine in production, etc.
All of that is part of the infrastructure as code. Not just what is fed in, but also the scripts and automation that drive the result. All of it.
I should be able to take an archive, open it, play it, and get the same result.
Which means that entire archive is your source of truth at that moment in time.
It is the 'moment in time' part where some type of source control comes in.
But in reality, your truth might not be in GitLab; it might be an archive in Artifactory, since it might include binaries and other things that don't source control well.
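To make the archive idea a little more concrete, here is a minimal sketch in Python. It assumes everything that drove a run (playbook, variables, templates, and so on) has been collected under one directory, and it bundles that directory into a tarball alongside a manifest of SHA-256 hashes. The names bundle_run and MANIFEST.json are made up for illustration, and this is only one way such a bundle could be laid out.

```python
import hashlib
import io
import json
import tarfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def bundle_run(root: Path, archive_path: Path) -> dict:
    """Archive every file under root, plus a manifest of their hashes.

    Assumptions (hypothetical layout): archive_path lives outside root,
    and root does not already contain a file named MANIFEST.json.
    """
    files = sorted(p for p in root.rglob("*") if p.is_file())
    manifest = {p.relative_to(root).as_posix(): sha256_of(p) for p in files}
    manifest_bytes = json.dumps(manifest, indent=2, sort_keys=True).encode()

    with tarfile.open(archive_path, "w:gz") as tar:
        # Store each file under its path relative to root.
        for p in files:
            tar.add(p, arcname=p.relative_to(root).as_posix())
        # Add the manifest as an in-memory member of the archive.
        info = tarfile.TarInfo(name="MANIFEST.json")
        info.size = len(manifest_bytes)
        tar.addfile(info, io.BytesIO(manifest_bytes))
    return manifest
```

The resulting tarball is the kind of thing that could be pushed to an artifact repository, and the manifest gives you a way to confirm later that what you are replaying is what actually ran.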
So think about your pipelines, the artifacts that make them up and move through them, and the end results.
Ask yourself: 'can you replay that and get the same result?' or 'can you replay that offline?'
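One rough way to test the 'same result' question is to replay the automation twice into two separate output directories and compare them. The sketch below, in Python, hashes every file path and its contents under a directory into a single digest. The function names tree_digest and same_result are hypothetical, and it assumes the result of a run can be captured as files on disk.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Hash the relative path and contents of every file under root."""
    h = hashlib.sha256()
    for p in sorted(root.rglob("*")):
        if p.is_file():
            rel = p.relative_to(root).as_posix().encode()
            data = p.read_bytes()
            # Length-prefix each field so different layouts cannot collide.
            h.update(len(rel).to_bytes(8, "big") + rel)
            h.update(len(data).to_bytes(8, "big") + data)
    return h.hexdigest()


def same_result(first: Path, second: Path) -> bool:
    """Return True when two replay output trees are byte-for-byte identical."""
    return tree_digest(first) == tree_digest(second)
```

Keep in mind that outputs which embed timestamps or generated IDs will differ between runs even when the replay is otherwise faithful, so a check like this works best on the parts of the result you expect to be stable.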
I know that way back when we started to look at our rebuild process and combined it with a regular disaster recovery exercise, we really started to refine things and get a handle on the entire process and the dependencies across processes.
Those are details that are really easy to overlook in the daily grind of making it all just work.