The path from a piece of code to a shippable product can be a long one. Depending on the complexity of our application and the level of reliability we expect from it, the effort we need to put into building and maintaining quality control can soon exceed the effort we put into the pieces of the application that bring us actual business value.
Build and continuous integration pipelines, executing hundreds of different tests, may run for minutes -- if not hours. Even more upsetting is when such a pipeline fails in the middle of its execution due to a problem as minuscule as a typo in some hidden part of our source code.
Imagine this little Python try/except block here, which should help us catch errors while connecting to a database:
    try:
        db = dblib.connect(dbaddress)
    except:
        raise DBConnectionError("Failed to connect to DB %s" % dbadress)
When we write code, we're often under pressure, and even if we aren't, silly mistakes happen all the time. In the example above, we (well, I) accidentally misspelled dbaddress as dbadress in the except block.
Now imagine we committed the code like this to our repository.
A couple of seconds or minutes later, a build process will kick off. Our codebase gets checked out onto a build executor and the process of building and testing the software begins. By default, Python does not notice mistakes like the one above until the interpreter actually reaches this very piece of code and tries to execute it.
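To make this behaviour tangible, here is a minimal sketch. The names fake_connect and open_db are made up for illustration, and fake_connect merely stands in for dblib.connect, so treat it as a demonstration of the principle rather than real database code. The misspelled name only blows up once the except branch is actually executed:

```python
class DBConnectionError(Exception):
    """Hypothetical application-level error, mirroring the snippet above."""


def fake_connect(address, available=True):
    """Stand-in for dblib.connect; fails when the database is unavailable."""
    if not available:
        raise OSError("connection refused")
    return "connection to %s" % address


def open_db(dbaddress, available=True):
    try:
        return fake_connect(dbaddress, available)
    except OSError:
        # Same typo as in the article: 'dbadress' instead of 'dbaddress'.
        raise DBConnectionError("Failed to connect to DB %s" % dbadress)


# Happy path: the except block never runs, so the typo goes unnoticed.
print(open_db("db.example.org"))

# Failure path: only now does Python evaluate the misspelled name.
try:
    open_db("db.example.org", available=False)
except NameError as err:
    print("NameError:", err)
```

Note that the caller never gets to see the intended DBConnectionError; the typo turns it into a NameError instead.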
If we have a unit test written for this code path, the error will be caught, at the earliest, when this specific test is run.
If we don't have a unit test for this branch of our program, a system test might catch it -- provided we have written one that covers the case where the database is not available.
If we don't have that either, the code gets released and one of our customers might be the first one to see it. That's not great. But that would probably be the worst-case scenario. Let's not be so cynical and instead take a step back.
When Molehills Become Mountains
Because we are sensible software developers, we wrote a system test that makes sure our application can connect to its database and a test that makes sure the program fails in a controlled way if a connection couldn't be established. The latter is the test case that will cause our pipeline to fail, due to the typo we made.
But it takes some time until we reach this point. First the application gets built, then we run some unit tests, followed by the setup of the system test environment, and only then are the system tests run. All in all, it takes us -- let's say -- five minutes before the pipeline fails and reveals our mistake.
Five minutes isn't the end of the world, of course, but it's long enough that, right after committing your code, you get up from your desk to fetch a fresh cup of coffee. On the way, you meet a colleague and have a quick chat with them. By the time you get back to your desk, ten minutes have passed since you did the git push.
If you check the build status straight away and find the cause of the failing build at first glance, you can quickly open the file in your IDE, fix the typo, and recommit your code.
[Figure: Small code change, late notice]
Chances are, however, you might not check your build straight away (especially if you don't have any notifications set up), and even once you do, it perhaps won't be obvious to you what the error was. The point is, there are a lot of factors involved in how long it takes from writing a line of erroneous code until the detection and resolution of the issue.
If the issue was that we made a semantic error when putting together our business logic, this may not be such a big problem; after all, we write tests so they can tell us about those errors. If it's a silly typo in some exception we should never even reach in production operation, finding out about it this way is demotivating, annoying, or at the very least inconvenient.
Thankfully, there are things we can do about this. Things that can help us reduce the time between producing and detecting such an error, from minutes or even hours down to seconds.
Static code analysis is one of the things we should be looking at: a static analyzer is a tool that walks through our codebase, without executing it, and checks whether what we have written is in line with the lexical, syntactic, and semantic rules of the language we use, as well as the design guidelines we are meant to stick to in our organization.
An example of such a tool, which would catch the error in the above Python code, is pyflakes. Executing it manually would look like this:
    $ python -m pyflakes .
    main.py:11: undefined name 'dbadress'
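To give a rough idea of how a tool can spot such a name without running the program, here is a toy sketch built on Python's standard ast module. This is not how pyflakes is implemented; the function undefined_names is invented for this illustration, it ignores scoping entirely, and it needs to be told about names that would normally come from imports (the known parameter):

```python
import ast
import builtins

SOURCE = '''
def open_db(dbaddress):
    try:
        return dblib.connect(dbaddress)
    except Exception:
        raise DBConnectionError("Failed to connect to DB %s" % dbadress)
'''


def undefined_names(source, known=()):
    """Return (line, name) pairs for names that are read but never bound.

    Toy check: it ignores scopes and accepts a binding anywhere in the file.
    """
    tree = ast.parse(source)
    bound = set(dir(builtins)) | set(known)
    loads = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loads.append((node.lineno, node.id))
            else:
                bound.add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.FunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.alias):
            bound.add((node.asname or node.name).split(".")[0])
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
    return sorted((line, name) for line, name in loads if name not in bound)


# dblib and DBConnectionError are assumed to be imported elsewhere.
print(undefined_names(SOURCE, known=("dblib", "DBConnectionError")))
```

The code is parsed but never executed, which is exactly why the typo in the rarely reached except block is found just as easily as one on the happy path.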
Of course, we can't possibly rely on ourselves running this manually each time we modify code (especially not if we need to run multiple analyzers).
If you are a software developer, you probably have corresponding integrations in your IDE; if you are more of a systems engineer, a complex IDE may be overkill for you, in which case a text editor might be your coding environment of choice (which may or may not have similar integrations).
Either way, there are occasions where we just want to quickly add or amend something in our code and open a file in a plain vi on the console because starting up some graphical editor and finding the right file just for that one small change is simply not worth it. So we open our file in vi, jump to the line we are interested in, make our changes, save, and exit vi again. We are confident that everything is fine and commit and push our code.
This is where we made our mistake, and since we didn't have any code analysis running, we won't find out about it for a while.
The question now is, how can we get this analysis to run without compromising the comfort we get from quickly editing files in a plain text editor? What we need is for the check to run independently of our development environment. It needs to be triggered by an action that is part of our workflow regardless of how we edited our file.
And indeed, there is one thing we always need to do, no matter how we modify our code; namely, committing and pushing the changes to the version control system.
According to the most recent Stack Overflow developer survey, over 87 percent of developers use git as their version control system, hence it is the VCS I chose for exemplifying this concept of commit-triggered code checks (by the way, if you belong to the 7.9 percent of developers who use zip file backups as version control, I would love to have a chat with you over a beer).
A git hook, essentially, is an event-triggered script. A piece of code that runs whenever a certain git-related action is performed. There are hooks for all kinds of git events, and it is certainly worthwhile having a look through all the options available to you as they are a great way to automate repetitive tasks.
The kind of hook we are interested in is the pre-commit hook, which is fired off whenever you run, well, git commit. And the best thing is that if the script you installed as the pre-commit hook exits with a non-zero code, the commit is aborted and your code won't end up in the git repository. This makes pre-commit hooks the perfect place for quality gates (of course, you can always explicitly ignore the failure and force the commit, if need be, for example with git commit --no-verify).
Installing such a pre-commit hook is as simple as it gets. Write the script you want to have triggered upon a commit and store it as .git/hooks/pre-commit (make sure it's executable).
In our case, we simply put the line python -m pyflakes . in this file. And there we go:
    $ git commit
    main.py:11: undefined name 'dbadress'
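If you would rather script the installation than do it by hand, it could look roughly like the sketch below. The function install_pre_commit_hook is a name I made up for this example, the shebang line in the hook body is my addition to the one-liner above, and the sketch overwrites any pre-commit hook that already exists, so adapt it before using it on a real repository:

```python
import os
import stat
import tempfile
from pathlib import Path

# Hook contents: the one-line check from the article plus a shebang line,
# which is an addition of this sketch.
HOOK_BODY = "#!/bin/sh\npython -m pyflakes .\n"


def install_pre_commit_hook(repo_root):
    """Write the hook into repo_root/.git/hooks and mark it executable."""
    hook_path = Path(repo_root) / ".git" / "hooks" / "pre-commit"
    hook_path.write_text(HOOK_BODY)
    # Add execute permission for user, group, and others (like chmod +x).
    mode = hook_path.stat().st_mode
    hook_path.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook_path


# Demonstration in a throwaway directory that only mimics a repository layout.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, ".git", "hooks"))
    hook = install_pre_commit_hook(tmp)
    print(hook.read_text())
```

Keep in mind that the .git/hooks directory is not part of what gets committed, so every clone of the repository needs the hook installed separately.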
Refining the Process
The above works fine, but if you are concerned about slowing down your workflow with this, you could implement some performance-enhancing tweaks. One example would be to only apply the static code analysis to files that have been added or modified and staged for commit:
    #!/bin/bash

    # Find all Python files staged for commit
    # that have been added or modified (ignore deletes)
    added_modified=$(git diff --cached --diff-filter=AM --name-only | grep '\.py$')

    # If no Python files are affected, skip the analysis and exit successfully
    [[ -z $added_modified ]] && exit 0

    # Otherwise run the analysis across all affected files
    python -m pyflakes $added_modified
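Since a hook can be any executable, the same idea can also be written in Python. The following is a sketch of one possible variant, not a drop-in replacement; the function names are mine, and it assumes Python 3.7 or later as well as git and pyflakes being available on the machine:

```python
import subprocess
import sys


def select_python_files(names):
    """Keep only the paths that end in .py."""
    return [name for name in names if name.endswith(".py")]


def staged_python_files():
    """Return added or modified Python files that are staged for commit."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--diff-filter=AM", "--name-only", "-z"],
        check=True,
        capture_output=True,
        text=True,
    )
    # With -z, git separates the paths with NUL characters.
    return select_python_files(result.stdout.split("\0"))


def main():
    files = staged_python_files()
    if not files:
        # No Python files affected: skip the analysis and let the commit pass.
        return 0
    # The exit code of pyflakes becomes the exit code of the hook.
    return subprocess.run([sys.executable, "-m", "pyflakes", *files]).returncode


# When saved as .git/hooks/pre-commit, finish the file with:
#     sys.exit(main())
```

Asking git for NUL-separated output and passing the paths as a list means that file names containing spaces are handled as well, which the shell version above does not attempt.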
Another tweak would be to containerize checks and make them execute in parallel. This one is tricky; first of all, it's not as trivial to implement as the above method, and second, it could actually make the checks run even slower if not done carefully. But if you're interested in this anyway, I'd like to refer you to a blog post I wrote about this last year.
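Leaving the container part aside, the parallel half of that idea can be sketched with nothing but the standard library. This is not the approach from the blog post mentioned above, just a minimal illustration; run_checks_in_parallel is a made-up name, and the two commands are harmless stand-ins so that the sketch runs anywhere:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_check(command):
    """Run one check and return (command, exit code)."""
    return command, subprocess.run(command).returncode


def run_checks_in_parallel(commands):
    """Run all checks concurrently; return 0 only if every check passed."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        results = list(pool.map(run_check, commands))
    failed = [command for command, code in results if code != 0]
    for command in failed:
        print("check failed:", " ".join(command), file=sys.stderr)
    return 1 if failed else 0


# Stand-in commands; a real hook would list analyzers here instead,
# such as [sys.executable, "-m", "pyflakes", "."].
CHECKS = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(3)"],
]
print(run_checks_in_parallel(CHECKS))
```

Whether this actually saves time depends on how long the individual analyzers run; for a single fast check like pyflakes, the overhead may well outweigh the gain.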
The Moral of the Story
Making errors is inevitable, but letting them pass unnoticed doesn't have to be. Static code analysis tools can help us catch certain types of errors early on in the software build process, before they get the chance to break integration pipelines and ruin our day. Having these analysis procedures triggered by version control system events allows us to integrate them into our workflow without adding manual steps, while at the same time keeping them separate from our coding environments.
I encourage you to experiment with hooks like these and would love to hear your feedback and opinions.