Versioned Go Get

Since Go's 1.0 release over a year ago, people have struggled with go get and its lack of versioning. I've been thinking about this problem, and though I'm not alone, I wanted to describe the status quo, the problems people have with it, and the issues with some commonly proposed solutions.

First, it's worth explaining up front that Go's import paths correspond to directory paths under $GOPATH/src/. As long as this relationship holds, go build can find those imports and build your program. If the path is also the URL of a Go package, go get can install it using that same path.
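As a minimal sketch of that mapping, the snippet below joins a workspace root and an import path into the directory the go tool would look in. The helper name srcDir and the example workspace /home/me/go are made up for illustration, and the sketch assumes a $GOPATH with a single entry; it is not how the go tool itself is implemented.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// srcDir is a hypothetical helper: it returns the directory where a package
// with the given import path would live, assuming gopath is a single
// workspace root rather than a list of them.
func srcDir(gopath, importPath string) string {
	// Import paths always use forward slashes; convert them to the
	// host's separator before joining.
	return filepath.Join(gopath, "src", filepath.FromSlash(importPath))
}

func main() {
	// On a Unix-like system this prints
	// /home/me/go/src/github.com/jmoiron/sqlx
	fmt.Println(srcDir("/home/me/go", "github.com/jmoiron/sqlx"))
}
```

The same string, github.com/jmoiron/sqlx, is what you would write in an import statement and what you would pass to go get.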

This duality of import path and get path is great for open source, and because of it Go has quickly built up an impressive number of third-party libraries that are installable and importable in this way. Superficially, this looks a lot like the package systems many developers are used to, like gem, npm, cpan, or pypi, but without a central package registry. The Go authors are clear, however, that go get exists as a convenience for a common workflow, not as a solution to the many problems of dependency management.

Of course, it works well enough and is in wide enough use that it is overwhelmingly the main way to install libraries and packages. Given that reality, the obvious fly in the ointment is that if upstream dependencies change, your code might no longer build. Centralized package repositories provide frozen versions of packages, so you can always specify the particular versions or sets of versions required to build your program, but Go's ad hoc method requires that packages maintain backwards compatibility (or that downstream maintainers remain vigilant) for things to remain go getable.

If you are working on a standalone product, this is easily overcome by freezing the versions you need in your $GOPATH, maintaining external dependencies in a separate repo, or even importing them into your product's repo. This involves some manual management (perhaps aided by SCM features like git subtrees or submodules), but it works fine.

The problem is worse when you're working on a library or a piece of reusable code that you want to be able to "distribute" via go get. With the current system, ensuring that a particular version of a dependency gets installed requires that you give that version its own go getable URL. Some members of the Go community have already approached the versioning problem this way: the MongoDB driver mgo has two import paths, labix.org/v1/mgo and labix.org/v2/mgo. Of course, many hosted code repositories do not allow this type of URL control, and splitting a project up among multiple versioned repositories (i.e. sqlx, sqlx2) is not always appropriate.

It's worth noting at this point that much of the above is perceived as positive by a significant number of people in the Go community. The friction involved in maintaining a mature, versioned library provides a threshold of effort which can be taken as a positive indicator of quality. Many find the idea that software dependencies are resolvable at build time by resources on the web to be undesirable in the first place. Pulling all dependencies into one tree and managing them manually is either common or expected at Go's birthplace, Google.

By far the most popular suggestion for addressing this issue is to add something to go get paths that the go tool will interpret as a tag or a branch. Usually, the path looks something like github.com/jmoiron/sqlx:2.0, with the version number coming at the end. Unfortunately, this would violate either Go's package and binary naming conventions or the duality of import path and get path, which an install/get command needs in order to resolve dependencies recursively. Neither is acceptable. Another possible solution in this vein is to allow URLs like github.com/jmoiron/@tag:2.0/sqlx, which avoids the problems mentioned above but loses the common property that the go get path is also reachable by a browser.
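To make the trade-off concrete, here is a rough sketch of what handling the proposed path:version form might involve. The go tool does not support this syntax; splitVersion is a hypothetical helper written only for illustration, and it naively splits on the last colon.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// splitVersion is a hypothetical helper that separates a proposed
// "importpath:version" spec into its two parts. A spec without a colon
// is returned unchanged with an empty version.
func splitVersion(spec string) (importPath, version string) {
	if i := strings.LastIndex(spec, ":"); i >= 0 {
		return spec[:i], spec[i+1:]
	}
	return spec, ""
}

func main() {
	spec := "github.com/jmoiron/sqlx:2.0"
	importPath, version := splitVersion(spec)

	// Option 1: treat the whole spec as the import path. The last path
	// element, which conventionally names the package or binary, is
	// then "sqlx:2.0".
	fmt.Println(path.Base(spec))

	// Option 2: strip the version before importing. The package is
	// "sqlx" again, but the get path and the import path now differ.
	fmt.Println(path.Base(importPath), version)
	fmt.Println(spec == importPath)
}
```

Either the version leaks into the name that source files import, or the string you get is no longer the string you import, and a recursive get can no longer learn the wanted version from import statements alone.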

There's a case to be made that the problem is simply intractable. If you're relying on the existence of a URL that you do not control and that no trusted third party guarantees to preserve, you're opening yourself up to build failures. You might have a reasonable assurance that a tag will not change or that a branch will stay backwards compatible, but you have no such assurance that these repositories won't simply be deleted. In the Google I/O 2013 fireside chat, bradfitz remarked, "Your deploy to production script shouldn't involve fetching some random dude's stuff on github."

Despite such arguments, I believe the authors of Go might underestimate the value of go get to the growth in use of the language. If you're deploying to production, of course you aren't using go get, but if you are building a library to be distributed, it's best to be able to build on the increasingly rich ecosystem of Go packages in a reliable way.

There's no convincing argument that a similar system to go get could not be developed that fixes some of the problems with the current system, or that the current system could not be amended with additional features that would make tracking and upgrading dependencies better while still playing nicely with the basic workspace conventions and build tools. At this point, the choices are manual management or the shotgun go get -u approach, and if the versioning issue is going to persist, then surely more can be done to help downstream developers manage these issues.

May 15 2013

jmoiron plays the blues
