Should You Deploy -SNAPSHOTs?

The first time you come upon Maven you will encounter the magical -SNAPSHOT versioning of artifacts. The first question people have is usually “What in ***’s name are these ‘effin -SNAPSHOTs?”

Well that is easy. They are informal snapshots of the artifact as you head towards the version without the -SNAPSHOT. Say we are working on version 2.5 of our Überwizbang product. Well in that case we will give it the version 2.5-SNAPSHOT… when we feel ready to cut a formal alpha release (i.e. a snapshot that we think is somewhat stable) we will create a 2.5-alpha-1 version… or maybe we will use milestones, cutting 2.5-m1. -SNAPSHOT is just another qualifier that lets you know this is not the final release. Maven says anything with a qualifier comes before the actual release version.

As an aside, this may explain for you why some organizations (I’m looking at you Mr. Trilby-esque Linux) use qualifiers for all their releases… where you see -ALPHA-___, -BETA-___, -CR-___ (candidate releases), -GA (the actual release), -MR-___ (maintenance releases) with CR being carefully chosen to come before GA… and never will a non-qualified version be released.

Nothing wrong with this approach, as long as you are consistent. Of course adding another dot-separated digit and using that to indicate the maintenance releases works just as well, and has the side effect that when you have more than 10 maintenance releases you don’t need to start giving them strange version numbers, e.g. 1, 2, 3, … 9, followed by 91, 92, 93, … 99, 991, 992, 993, etc., or plan up front and pre-pad with enough 0’s just to ensure that ASCII sorting results in numeric sorting.

Maven treats -SNAPSHOT versions specially. Maven expects that -SNAPSHOTs will change, and conversely that non-SNAPSHOTs will never change. The result is that once Maven has cached 2.5-alpha-1 (which is not a -SNAPSHOT) in the local cache (a.k.a. local repository) it will never look to refresh the cache. So if you respin 2.5-alpha-1 several times until you “get it right”… well anyone who pulled the first one into their local repository will only ever have the first spin and not the “real” 2.5-alpha-1 which was the third spin.

When it comes to -SNAPSHOTs though, Maven periodically checks for updates.

I hear you ask how often… well, that depends on what you configure in your settings.xml. The default is daily, but every developer can pick any interval that suits them.

It is unfortunate that the XML schema for repositories reused the same structure for <releases> as for <snapshots>, because giving <releases> an apparently unused <updatePolicy> child element confuses users. (Note: it’s not really unused. It controls how often Maven checks for changes to the maven-metadata.xml files, but it still causes confusion.)
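
As a rough sketch of where this setting lives, the settings.xml fragment below declares one repository inside a profile and gives <releases> and <snapshots> their own <updatePolicy>. The profile id, repository id and URL are placeholders invented for illustration. The values Maven accepts for <updatePolicy> are always, daily, interval:X (X in minutes) and never.

```xml
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0">
  <profiles>
    <profile>
      <!-- "internal-repo" is a made-up profile id -->
      <id>internal-repo</id>
      <repositories>
        <repository>
          <!-- placeholder id and URL; substitute your repository manager -->
          <id>internal</id>
          <url>https://repo.example.com/maven</url>
          <releases>
            <enabled>true</enabled>
            <!-- for releases this only governs maven-metadata.xml checks -->
            <updatePolicy>daily</updatePolicy>
          </releases>
          <snapshots>
            <enabled>true</enabled>
            <!-- check for new -SNAPSHOTs at most once every 60 minutes -->
            <updatePolicy>interval:60</updatePolicy>
          </snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>internal-repo</activeProfile>
  </activeProfiles>
</settings>
```

A developer who wants full control could change the <snapshots> policy to never and run Maven with -U whenever they actually want fresh -SNAPSHOTs.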

So what happens is that you kick off a Maven build and the first thing it does is go checking for updates to your -SNAPSHOT dependencies that are outside the reactor (i.e. the list of modules you are currently asking Maven to build).

Now when you have a very big project with many developers, you sometimes see a pattern whereby groups of developers tend to work on only a small subset of modules within the project. This can happen in one of two ways:

  • The whole über-project is released as one big bang.
  • Each “team” of developers releases their components one by one, sometimes re-releasing individual components as needed, until finally the über-project installer/distribution module is ready to be released, bundling together the individual components.

With the big-bang release, if your build takes a long time to complete (even with skipped tests) or some modules need specific toolchains then as a “convenience” to developers you might configure a CI server (such as Jenkins) to deploy -SNAPSHOTs so that if I am only working on one module I don’t have to build everything all the time.

With the component based release, you might configure a CI server to deploy -SNAPSHOTs of each component so that downstream teams can integrate with upstream changes without having to wait for the upstream team to cut their next release.

In general, though, we need to keep in mind that when introducing deployment of -SNAPSHOTs we don’t want to start forcing the developer to build on sand. There are three factors that interact in this area:

  • How often are -SNAPSHOTs deployed (controlled by the CI server… and perhaps some individual developers pushing -SNAPSHOTs manually when others get blocked waiting for changes if the interval is too long for their needs)
  • How often is Maven configured to check for updates (controlled by the individual developers… and perhaps overridden with either the -o option to fake “offline” and prevent over-eager updates, or the -U option to force an update)
  • How good the rest of the developers are at not breaking things

I am the type of developer who likes to know the foundations on which I am building. So I prefer to start my day by updating source control and doing a full build after each update. I don’t want -SNAPSHOTs deployed to the remote repository because then, when working in a submodule, I can’t be sure whether the dependency I am using is the one I built locally when I did my update from SCM or the one that Maven downloaded from the remote repository. I am likely to want to set <updatePolicy>never</updatePolicy> to prevent such uncertainty.

There are other developers who don’t mind working on such a shifting sand… and indeed there are times when I don’t have the full toolchain on my machine and I am forced to develop in such a situation… but if you want productive developers, they need the ability to control when their updates take place and to have certainty as to what might have just changed. 

Define the -SNAPSHOT deployment policy

This is the most important thing you can do. Let everyone know that -SNAPSHOTs will be deployed, e.g. every day at 1am GMT, or every Sunday, or every hour, or on every commit, or whenever Joe decides to push a new -SNAPSHOT to the repository. I don’t mind so much what policy you pick. Just pick a policy and let everyone know. Some thoughts of mine:

  • Please consider using an atomic deploy mechanism. 
  • Define the -SNAPSHOT retention policy (so that people using timestamped -SNAPSHOTs know how long they can rely on them)
  • Please consider the time-zones for developers. Try to ensure that the deploy happens before all developers start their work day. The worst thing that can happen is you are working on a feature and half-way through your morning’s work Maven starts downloading -SNAPSHOTs and the code you were working on now won’t even compile because of changes elsewhere… you want to fix those issues when you are ready to commit or merge back your changes… you don’t want to get knocked out of the flow fixing these issues while you are trying to solve a specific problem.
  • Either use a CI server to deploy -SNAPSHOTs at a fixed cadence or have developers push -SNAPSHOTs as required. Don’t mix and match.
  • It is perfectly fine to say you will never deploy -SNAPSHOTs. There should be no need to deploy -SNAPSHOTs as long as:
      • mvn clean install -DskipTests of the über-project is relatively fast (i.e. it will be finished by the time I have made my cup of coffee)
      • I don’t need some specific toolchain to build the über-project. For example, if you need Visual Studio to build the JNI .dll and GCC to build the .so and… well then you probably need the CI server to be deploying -SNAPSHOTs
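
If you do decide that a CI server will deploy -SNAPSHOTs, the deploy target is normally declared in the POM. The fragment below is only a sketch: the groupId, artifactId, repository ids and URLs are hypothetical, and your repository manager will have its own layout. The point it illustrates is that <snapshotRepository> keeps -SNAPSHOT deployments apart from releases, so a retention policy can be applied to one without touching the other.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- hypothetical coordinates for the über-project parent -->
  <groupId>com.example</groupId>
  <artifactId>uberwizbang-parent</artifactId>
  <version>2.5-SNAPSHOT</version>
  <packaging>pom</packaging>

  <distributionManagement>
    <!-- non-SNAPSHOT versions are deployed here -->
    <repository>
      <id>internal-releases</id>
      <url>https://repo.example.com/releases</url>
    </repository>
    <!-- versions ending in -SNAPSHOT are deployed here instead -->
    <snapshotRepository>
      <id>internal-snapshots</id>
      <url>https://repo.example.com/snapshots</url>
    </snapshotRepository>
  </distributionManagement>
</project>
```

With something like this in place, the CI job would run mvn deploy on whatever cadence the policy announces, and the ids would need matching <server> credentials in the CI server's settings.xml.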

Let developers decide their <updatePolicy>

Don’t mandate a specific <updatePolicy> as that will force the developer’s hands. By all means you should educate your developers and recommend an <updatePolicy>. Just don’t lock down their settings.xml and force them to live with a specific <updatePolicy>.

Define what the -SNAPSHOT policy is for feature branches

Feature branches that will last longer than about a day or two really need their own version number, even if that is only “2.5-feature-blah-SNAPSHOT”. If you’re going to deploy -SNAPSHOTs for those feature branches, you don’t want two different branches to collide, so they will need different version numbers.
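
For illustration, the POM on such a feature branch might look like the sketch below. The coordinates are hypothetical, and only the <version> differs from the mainline. One way to apply the change across a multi-module build is the Versions Maven Plugin, e.g. mvn versions:set -DnewVersion=2.5-feature-blah-SNAPSHOT.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <!-- hypothetical coordinates -->
  <groupId>com.example</groupId>
  <artifactId>uberwizbang</artifactId>
  <!-- mainline stays at 2.5-SNAPSHOT; the branch gets its own qualifier
       so deployed -SNAPSHOTs from the two lines cannot overwrite each other -->
  <version>2.5-feature-blah-SNAPSHOT</version>
</project>
```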

Use timestamped SNAPSHOTs

One of the features Maven has is so-called timestamped -SNAPSHOTs where, at deploy time, the -SNAPSHOT is replaced by the -YYYYMMDD.hhmmss-n timestamp and build number of the deployment. This can resolve some of the issues with feature branches, especially in conjunction with the Versions Maven Plugin, which has goals for “locking” and “unlocking” your -SNAPSHOT dependencies to the current timestamp version, thereby removing uncertainty.
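
As a sketch of what a “locked” dependency looks like, the fragment below pins a dependency to one specific deployed -SNAPSHOT. The coordinates and the timestamp are invented for illustration. The Versions Maven Plugin goals versions:lock-snapshots and versions:unlock-snapshots rewrite the <version> in this way and back again.

```xml
<dependencies>
  <dependency>
    <!-- hypothetical upstream module -->
    <groupId>com.example</groupId>
    <artifactId>module-a</artifactId>
    <!-- was 2.3.0-SNAPSHOT; now pinned to the third deployment made on
         2013-01-15 at 10:30:00 (an invented timestamp) -->
    <version>2.3.0-20130115.103000-3</version>
  </dependency>
</dependencies>
```

A pinned version like this only resolves for as long as the repository manager retains that timestamped artifact, which is why the retention policy matters.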

Those are just some of my thoughts. Hope you find them helpful.

—Stephen Connolly
CloudBees
www.cloudbees.com

Stephen Connolly has nearly 20 years’ experience in software development. He is involved in a number of open source projects, including Jenkins. Stephen was one of the first non-Sun committers to the Jenkins project and developed the weather icons. Stephen lives in Dublin, Ireland - where the weather icons are particularly useful. Follow Stephen on Twitter and on his blog.

Comments

Stephen, I do not get what you mean by timestamped SNAPSHOTs being helpful for feature branches. Can you give an example?
Stephen Connolly's picture

@milus The issue is around how you handle your feature branch and how your code is organized in SCM. If you use an SCM like Git that encourages many separate repositories, one for each chunk of stuff that should be released at the same time, then you can have a situation where you have 3 release roots: A, B and C. You start a feature branch on module C, so you switch its POMs from version 3.2.4-SNAPSHOT to 3.2.4-feature-blah-SNAPSHOT. The problem you face is that both A & B are still moving, and you really want to get the feature working on a stable base.

One way of handling that would be to create the same feature branch in A & B with the version number in all POMs switched to 2.3.0-feature-blah-SNAPSHOT and 1.4.8-feature-blah-SNAPSHOT, and then you update C's POMs to use those feature branch versions of A & B... well now you have three problems... 1st, you have to have permission to branch A & B. 2nd, you now have to deploy those versions of A & B to the internal repository. 3rd, you have to make a lot of changes that will get unwound again... of course there is an upside, namely if you need to make changes to either A or B in order to support the feature work in C you have a branch you can work on immediately, but that is an upfront cost that gets higher the more release roots you have while the benefit gets smaller (if you have 20 release roots the upfront cost is massive but the opportunity gain is just 1 module out of 20).

The other way is to just point C's POMs at the timestamped -SNAPSHOTs. That straight away gives you a solid base, they are already in the repository manager, and you are not having to make lots of changes all over the place. If you make the locking of -SNAPSHOTs one commit with no other changes, then unwinding that one commit should be easy... and if it turns out you need to make a change in module B to support the feature branch in C, you can use the timestamp of the -SNAPSHOT to infer which commit you need to fork B from (or just rebase to newer -SNAPSHOTs and fork from the head of B for its feature branch).

Neither of the above is the *best* way of doing things, nor the *worst*. My point is that they can be helpful; you need to pick the right approach for the task in hand... it all depends on your SCM(s) and your project structure.
