Software Automation On a Budget

Written by: Matthew Setter

When a business is getting off the ground or a startup is launching, it’s understandable that money will be tight and cash flow all but nonexistent. There's a tangible sense of urgency in getting the business’ core product ready, along with its marketing strategy and other core business functions, rather than focusing on the ideal automation solution.

Anything that isn’t directly related to validating the service offering, winning new customers, or making sales is a legitimate candidate for being put off till some stage in the future. Given that the success of a business isn’t guaranteed, it’s easy to see how thinking long term is an optional extra, even a luxury; something to leave to a later time when the business has obtained some level of certainty and predictability. So many things can fall into this category, one of them being software automation.

However, if a business has a strong desire for success over the long term, consideration needs to be given to activities that will reduce costs, such as software automation. While it is important to maintain a short-term focus during the bootstrapping stage, businesses must also think about handling long-term costs.

Given that, a well-thought-out software automation solution should be considered almost right from the outset. This leads us to two central questions:

  1. How can we create a solution that's easy to implement at the start, yet can scale with the business as its needs change and grow?

  2. How do we build a solution with an investment proportionate to our time available and the development stage of the business?

Those are not easy questions to answer, so let’s work through a series of potential solutions to find out. But before we do so, it’s important to recognize that not all businesses are the same, especially in the world of technology.

For this article, I’m going to base the solutions around a hypothetical business that's releasing an online video training website, where people can log in and purchase courses around their interests. Let’s assume that the site’s written in one of the modern, dynamic languages, such as PHP, Python, or Ruby. Let's also assume that it uses one of the more popular JavaScript front-end frameworks, such as AngularJS, Ember.js, or Backbone.js.

I could have proposed a much more sophisticated architecture, but let's keep the focus on the discussion of the automation solution, and not on the product or service.

Is The Automation Solution Easy to Implement?

What’s easy for one development team may be a challenge for the next. But that being said, let’s consider some solutions at the easier end of the scale.

What is likely going to be involved in deploying an automation solution? Well, you’re probably going to have the following components:

  1. Server Provisioning: Do new services need to be installed, stopped, started, or restarted?

  2. Database Migration: Do database migrations need to be run to update the schema, potentially as the result of code changes?

  3. Code Deployment: Are there new code changes which need to be rolled out, whether to testing, staging, live, or another environment?

  4. Cache Clearing: Do cache directories or services need to be flushed?

  5. Authentication Credentials: What services or servers are required? Details for these could be provided directly in a configuration file or extractable from the server environment.

  6. Version Control: What system manages the code? I’ll assume that all source code is versioned by Git.

These are what I’d consider a minimum set of requirements for an automation process to manage. Now let’s discuss how to implement such a process.

A Basic Software Automation Solution

I’ve said many times that computers are designed for automation. Humans aren’t. We make mistakes at the best of times, let alone at times of stress, fatigue, or emotional distraction. So the first thing is to ensure that our automation solution is scriptable so that the process repeats the same way every time.

Let's start with a solution written in either Perl or one of the many Linux shells, such as Bash or Zsh, and initiated by a Git hook.

After a release was tagged, a Git hook would launch our test suite and, if the tests passed, would launch our automation script. This would then use a command, such as the following, to find the changed files between our tag and the previous one, giving us the list of files to deploy.

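# Resolve both release tags to the commits they point at
PREV_TAG=$(git rev-list -n 1 0.13.0)
LATEST_TAG=$(git rev-list -n 1 0.14.0)
# List files added, copied, modified, or renamed between the two releases
git diff --name-only --diff-filter=ACMR "$PREV_TAG..$LATEST_TAG"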

The returned file list could then be deployed to the testing server using standard tools such as SCP, SFTP, or Rsync over SSH.
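As one way to picture that transfer step, here is a minimal sketch built on Rsync's --files-from option, which reads the paths to copy from a file. It assumes the changed paths were saved to a manifest, one per line and relative to the repository root. The function name sync_changed_files, the DRY_RUN switch, the manifest name, and the target address are placeholders invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: copy only the files named in a manifest to a target.
#   $1 = manifest file (one path per line, relative to the current directory)
#   $2 = rsync destination, e.g. user@host:/path (assumed, not from the article)
# With DRY_RUN=1 the command is printed instead of being run.
sync_changed_files() {
  manifest=$1
  target=$2
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "rsync -az --files-from=$manifest ./ $target"
  else
    # -a preserves permissions and timestamps, -z compresses in transit
    rsync -az --files-from="$manifest" ./ "$target"
  fi
}
```

Running it as DRY_RUN=1 sync_changed_files changed-files.txt deploy@testing.example.com:/var/www/app prints the command first, which is a cheap way to check the list before anything is copied.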

Once finished, the script could call a remote script to clear cache directories on the remote server, run any database migration scripts, and make changes to packages on the testing server.
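Most of what the hook and script do at this point is sequencing: run the tests, then each deployment step, and stop as soon as something fails. A rough sketch of that control flow follows. The run_pipeline name and the step commands in the usage note are made up for this example; each step is simply a command string handed to sh.

```shell
#!/bin/sh
# Run each argument as a shell command, in order.
# Stops at the first command that exits non-zero and returns 1.
run_pipeline() {
  for step in "$@"; do
    echo "==> $step"
    if ! sh -c "$step"; then
      echo "FAILED: $step" >&2
      return 1
    fi
  done
  echo "deploy finished"
}
```

A hook might then call something like run_pipeline "./run-tests.sh" "./sync-files.sh" "ssh deploy@testing.example.com ./post-deploy.sh", where all three scripts are assumed to exist and the last one clears caches and runs migrations on the remote side.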

As an added integrity requirement, we could have the script require a user number or password, which it could use to identify the user running the script. Alternatively, it could use either an operating-system-level command such as whoami or a service such as LDAP to identify the currently logged-in user.
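For the operating-system route, a small audit log is often enough to answer "who deployed what, and when". The sketch below is one possible shape, not a prescribed format: the record_deploy name, the tab-separated layout, and the log path are assumptions.

```shell
#!/bin/sh
# Append one tab-separated line per deploy: UTC timestamp, user, release tag.
#   $1 = release tag being deployed
#   $2 = path of the audit log (hypothetical location)
record_deploy() {
  tag=$1
  log_file=$2
  # whoami reports the effective user; fall back to the numeric id if the
  # account has no name on this machine
  user=$(whoami 2>/dev/null || id -u)
  stamp=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  printf '%s\t%s\t%s\n' "$stamp" "$user" "$tag" >> "$log_file"
}
```

Calling record_deploy 0.14.0 /var/log/deploys.log at the end of a successful run leaves a trail that can later be matched against the release tags in Git.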

The positives

Let’s consider what we've achieved with this basic software automation solution. We’ve created a homegrown, automated, repeatable solution that incorporates the six core requirements that we listed earlier in this post.

The deployed files are only those that Git identifies as having changed. If something goes wrong, such as the creation of a new bug, we can quickly track down where it came from and who ran the last deploy. It can be run anytime, day or night, with the same outcome, regardless of who runs it.

Finally, since it’s written in a Linux shell or Perl, there’s a large potential pool of developers who could maintain it.

The negatives

While this should be relatively quick to set up, it creates a new codebase that in turn requires maintenance. It may or may not be scalable across as many servers as the business requires, at least not without significant refactoring or overhead. It could also have performance or security flaws.

Then there’s the cost of developing and maintaining the solution. Is developing a complete solution in-house and from scratch cost-effective?

This approach will require no further cost for external services or software. But what about the cost of developer time? Arguably this is significantly higher than fees for external services or additional software, when your devs should be focused on developing the company’s product or service.

So what about something more mature, more reusable, and less demanding of developer time?

Improve Provisioning and Simplify Deployment

Let’s refactor the solution and address some of the original version’s shortcomings. Let's start with replacing the component of the script responsible for server provisioning with an open-source solution, such as Ansible, Chef, or Puppet.

All of these are tools that are independently developed and maintained and have large repositories of third-party package additions for all manner of scenarios. They’re all well-known solutions, have extensive online documentation, hold accredited training sessions around the world, and are used throughout the industry.

The code uses standard conventions and notations, keeping it well organized, versionable, and understandable. Any one of these will be easier to develop and scale than our homegrown solution whether we’re managing several or several hundred servers.

Given the near de facto nature of these tools, it will also be easier to find experienced developers to maintain our code.

Now let’s take on code deployment. One solution is to have a bare Git directory on the remote server, along with a clone of the source code. In this scenario, after the tests pass, instead of finding a list of files to sync, a push would be made to the remote, bare, repository, which in turn would handle updating the remote deployment directory.
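A common way to wire this up is a post-receive hook in the bare repository that checks the pushed branch out into the deployment directory. The sketch below assumes that approach. The paths, the branch name, and the handle_push function are illustrative defaults, not values from a real setup.

```shell
#!/bin/sh
# Sketch of hooks/post-receive for a bare repository.
# All three values are assumptions; override them to match your server.
BARE_REPO=${BARE_REPO:-/srv/git/app.git}
WORK_TREE=${WORK_TREE:-/var/www/app}
DEPLOY_BRANCH=${DEPLOY_BRANCH:-main}

# Git feeds the hook one "<old-rev> <new-rev> <ref-name>" line per updated ref.
handle_push() {
  while read -r oldrev newrev refname; do
    # Ignore pushes to any branch other than the one we deploy from
    [ "$refname" = "refs/heads/$DEPLOY_BRANCH" ] || continue
    echo "deploying $newrev to $WORK_TREE"
    # Force the deployment directory to match the pushed branch
    git --git-dir="$BARE_REPO" --work-tree="$WORK_TREE" checkout -f "$DEPLOY_BRANCH"
  done
}

# When this file is installed as hooks/post-receive, end it with:  handle_push
```

Because the checkout is forced, any manual edits made in the deployment directory are overwritten on the next push, which is usually what you want on a server.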

I’ve used it in the past for small applications, and it works just fine. There’s no need to write extra scripts or be worried that a file may get overlooked, have incorrect permissions, and so on. However, it lacks some key capabilities, such as being able to roll back to a previous release or manage related processes such as database migrations or cache clearing.

So let’s look at other solutions to fill the gap. Gladly, there is a range of tools available. These include Deployer, Capistrano, and Mina.

Each of these, regardless of language, is a configurable solution that can deploy code in releases, roll back to a previous release, manage a fixed number of releases, perform role filtering, host filtering, and so on. What’s more, they provide a preset list of premade deployment tasks and allow for creating custom tasks. They can also integrate with language-specific tools, such as Composer for PHP, which can run related tasks such as database migration scripts.

Once the configuration is created, again following the tool’s documentation, deploying code changes can be as simple as running a command such as deploy <deployment environment>. Everything’s taken care of, the same way, every time.
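To make the idea of releases and rollback concrete, here is a hand-rolled sketch of the releases-directory-plus-current-symlink layout that tools in this family use. It is not the code of Deployer, Capistrano, or Mina. The function names are invented, and release names are assumed to sort chronologically (timestamps, for example).

```shell
#!/bin/sh
# Assumed layout:
#   <root>/releases/<name>/   one directory per deployed release
#   <root>/current            symlink to the release being served

# Point "current" at a release. With -n, ln replaces the existing link
# instead of creating a new link inside the directory it points to.
activate_release() {   # usage: activate_release <root> <release-name>
  ln -sfn "releases/$2" "$1/current"
}

# Print the name of the release that "current" points at.
current_release() {    # usage: current_release <root>
  basename "$(readlink "$1/current")"
}

# Re-point "current" at the release that sorts just before the live one.
rollback() {           # usage: rollback <root>
  live=$(current_release "$1")
  prev=$(ls -1 "$1/releases" | sort |
    awk -v live="$live" '$0 == live { print last } { last = $0 }')
  if [ -z "$prev" ]; then
    echo "no earlier release to roll back to" >&2
    return 1
  fi
  activate_release "$1" "$prev"
}

# Keep the newest <keep> releases, never deleting the live one.
prune_releases() {     # usage: prune_releases <root> <keep>
  live=$(current_release "$1")
  ls -1 "$1/releases" | sort -r | tail -n +"$(($2 + 1))" |
    while read -r old; do
      [ "$old" = "$live" ] || rm -rf "$1/releases/$old"
    done
}
```

The web server only ever serves the current path, so switching or rolling back a release means changing one symlink rather than copying files.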

The positives

At this stage, we have a reasonably well-featured automation solution, adaptable to one or more languages, able to perform all the tasks we need, in a well-documented and well-organized manner. While there’s still developer time required to build and maintain the solution, it’s likely going to cost less than the previous solution.

The negatives

It is arguable that our solution is starting to get a little complex, with a range of tools in the mix that would demand certain skills of future maintainers. However, I don’t consider this a negative, as the tools chosen are well documented, well supported, and well understood.

A Fully Hosted Automation Solution

But we could do better. What about a solution that demands even less from us? What about an online solution? As soon as we’ve done a push to a particular branch, it would:

  1. Launch the tests.

  2. If the tests pass, begin the deployment process.

  3. Provide integrations with a range of services, such as those we need.

  4. Provide integrations for all the software languages we may need.

  5. Provide a range of notification options.

  6. Do this all from a professional web, or command line, interface.

The positives

In this scenario, we don’t need any code to determine which files need to be synced, to sync them, or to call services that do so. That’s all taken care of by the integration between an online service, like Codeship, and either GitHub or Bitbucket.

Via configuration choices, we’re able to build upon predefined test pipelines and configurations, such as PHPUnit for PHP. We’re able to build upon predefined deployment pipelines and configurations and deploy to such services as Heroku, Google App Engine, or Amazon S3. We’re able to integrate with notification services, such as HipChat and Slack.

Here, we’re able to get an automation process up and running, provisioning aside, in a relatively short period with a minimum investment of developer time and effort.

We’re able to stay abreast of what’s worked and what’s failed without having to code a user interface or report infrastructure. We can look, almost at a granular level, to inspect what went wrong when necessary. We can take the pressure off our team and our hosting environment and offload it on an external service with a support team.

The negatives

Having either worked with, built, or maintained solutions of each type, the one I prefer is this last one. However, when things go wrong, you have to remember that you've surrendered some control in return for the benefits you’ve received. It may take the vendor’s staff time to resolve the situation or to help you to solve it yourself.

However, even that negative can be a benefit. In addition to your staff troubleshooting the situation, you have support from an additional team, which will likely help you resolve the situation that much faster.

Conclusion

Now that we’ve evolved a potential software automation solution, let’s consider what we’ve achieved.

Budget

We’ve built a software automation solution that can work on any budget, whether extremely tight or with funds to spare. We’ve done a good job of balancing features, cost, and risk exposure.

Sophistication

The solutions are increasingly mature, sophisticated, maintainable, and scalable. What’s more, they use tools and technologies which are rapidly becoming the de facto standards. Additional developers can be brought on to either aid in maintaining them or fully take over maintenance and development.

Scalability

The solutions work, whether we’re working with a few or many servers, and also take into account different environments, such as testing, staging, and live.

Software automation doesn’t necessarily require large quantities of time or money, either in the initial creation or maintenance phases. It's achievable on even the most modest of budgets, and it's possible to grow your solution along with the business.

It’s not always easy to see in the early days exactly what a business’ needs will be in the future. But with research, forethought, and careful planning, we can build solutions that grow with us, as much as is practically possible.
