Golang ORMs
Go's lack of a traditional class/type duality and completely flat type "hierarchy" makes the concept of an ORM an odd fit for the language. Furthermore, its strict syntax requirements make it difficult to duplicate the declarative ActiveRecord pattern that many developers are used to. Still, there is clearly a widespread desire for something to slot into the M of MVC.
Elephants
ORMs get a lot of bad press because the assumptions they make about your database design can make them difficult or impossible to use. The Django ORM's decision to add an implicit serial integer id field to models by default can become crippling technical debt when that table grows to the point where it must be sharded. ActiveRecord and the Django ORM both allow foreign key fields that lazy-load from the database, often without the knowledge of inexperienced developers. When these time bombs blow up, they consume developer resources and create a climate of developer anger.
These pitfalls are numerous and manifest; they get blogged and reblogged, and independently discovered by countless young developers who get caught out by them. When this happens, the tools are blamed. And it's right that they share some of the blame, because they've provided the environments without which these anti-patterns could not exist.
Philosophy
ORMs generally do at least some of the following:
- map query results to structs/classes (serialization)
- simple CRUD-style object persistence
- provide programmatic query construction
- create INSERT schemas or provide schema migration
- hooks into database usage (e.g. logging all query times, pre/post save, etc.)
- manage database connections
- manage caching
Of these, the last is not attempted by any currently available Go ORM, and managing database connections is generally left to the underlying database/sql library.
Implementations
For SQL databases, there are four roughly equally popular packages that try to fill the ORM role: gorp, beedb, hood, and qbs. Reading these libraries and their documentation has been an interesting view into how people are approaching these types of interfaces, for which I initially thought Go was ill-suited.
Serialization is implemented the same way in all Go ORMs, and can be done regardless of the provenance of the result set. sql.Rows.Columns() is used to match column names to struct fields, generally with the optional help of struct tags. This is covered by sqlx as well, which is not an ORM.
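The matching step can be sketched without a database at all. This is a minimal illustration, not the code of any of the libraries discussed here: the User type, the db tag key, and the columnTargets helper are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// User is a hypothetical model used only for this illustration.
type User struct {
	ID    int64  `db:"id"`
	Name  string `db:"name"`
	Email string // no tag: falls back to the lowercased field name
}

// columnTargets returns one pointer per column, in column order, each
// pointing at the matching field of dest (a pointer to a struct). The
// result is suitable for rows.Scan(targets...).
func columnTargets(columns []string, dest interface{}) ([]interface{}, error) {
	v := reflect.ValueOf(dest)
	if v.Kind() != reflect.Ptr || v.IsNil() || v.Elem().Kind() != reflect.Struct {
		return nil, fmt.Errorf("dest must be a non-nil pointer to a struct, got %T", dest)
	}
	v = v.Elem()
	t := v.Type()

	// Build a column name -> field index lookup from tags or field names.
	index := make(map[string]int, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath != "" { // skip unexported fields
			continue
		}
		name := f.Tag.Get("db")
		if name == "" {
			name = strings.ToLower(f.Name)
		}
		index[name] = i
	}

	targets := make([]interface{}, len(columns))
	for i, col := range columns {
		fi, ok := index[col]
		if !ok {
			return nil, fmt.Errorf("no field for column %q", col)
		}
		targets[i] = v.Field(fi).Addr().Interface()
	}
	return targets, nil
}

func main() {
	var u User
	// In real code the column list would come from rows.Columns().
	targets, err := columnTargets([]string{"name", "id"}, &u)
	if err != nil {
		panic(err)
	}
	// These two assignments stand in for rows.Scan filling the fields.
	*(targets[0].(*string)) = "gopher"
	*(targets[1].(*int64)) = 7
	fmt.Println(u.ID, u.Name)
}
```

Because the targets are built from the column list rather than from the struct, the same helper works for any result set whose columns have matching fields, which is what makes the approach independent of where the rows came from.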
From there, they start to differ. The Reader's Digest version is that I prefer gorp. The problem with the others is that they attempt to pack too much into struct tags in an attempt to get a more declarative data model. Struct tags have the same problems that going with raw SQL has; they are opaque strings which happily compile with semantic errors and can fail at runtime all too easily.
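To make the opacity concrete, here is a small sketch with two deliberate mistakes in the tags. The Account type, the db tag key, and the dbTag helper are hypothetical and exist only for this illustration; the point is that the compiler accepts both mistakes without complaint.

```go
package main

import (
	"fmt"
	"reflect"
)

// Account is a hypothetical model with two deliberate tag mistakes.
type Account struct {
	ID    int64  `db:"id"`
	Email string `db:"emial"` // misspelled column name
	Name  string `bd:"name"`  // misspelled tag key
}

// dbTag returns the value of the "db" tag on the named field and whether
// such a tag was present at all.
func dbTag(v interface{}, field string) (string, bool) {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return "", false
	}
	return f.Tag.Lookup("db")
}

func main() {
	for _, name := range []string{"ID", "Email", "Name"} {
		tag, ok := dbTag(Account{}, name)
		fmt.Printf("%-5s tag=%q found=%v\n", name, tag, ok)
	}
}
```

Neither mistake surfaces until a query runs: the Email field silently maps to a column that does not exist, and the Name field looks untagged to anything reading the db key.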
There are only a few issues with gorp. It re-implements the Scanner and Valuer interfaces from the builtin database/sql and database/sql/driver packages unnecessarily, and it (optionally, but unnecessarily) uses empty struct literals as a way of passing types in places where idiomatic Go would pass struct pointers or slice pointers and keep memory allocation control with the caller. Since gorp's API is more or less frozen now, I forked it and created modl, which removes these redundancies and replaces its serialization layer with sqlx. In future, I hope to come to a more unified interface, where some of the advanced features of sqlx (like NamedQuery and Rebind) will feel more natural in modl.
Though I prefer gorp, I really like the direction that the beedb API for query construction is going. It's extremely simple conceptually and in the code, but it allows complex and parametric composition and preserves much of the power of handwritten SQL. Unfortunately, its simplicity also means that it retains some of the "string opacity" drawbacks of raw SQL.
Despite liking the interface and the simplicity of the code, I'm a bit concerned about the possibility of injection attacks (or just outright bugs: table and column names are never quoted) because of its widespread use of Sprintf in the query construction process and the difficulty in providing args... to let the driver escape when the order of the bindvars isn't necessarily the same as they appear in your source. I believe better can be done.
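One possible direction, sketched below, is to keep each SQL fragment together with its own arguments so that the final argument list is assembled in the same order as the placeholders. This is not beedb's API; the Select type, From, Where, Build, and quoteIdent are names invented for the example, and the ANSI double-quote style of identifier quoting is an assumption that does not hold for every database.

```go
package main

import (
	"fmt"
	"strings"
)

// cond pairs a SQL fragment with the values for its placeholders, so the
// two can never drift out of order.
type cond struct {
	sql  string
	args []interface{}
}

// Select is a hypothetical, deliberately tiny query builder.
type Select struct {
	table string
	conds []cond
}

// From starts a SELECT against the given table.
func From(table string) *Select {
	return &Select{table: table}
}

// Where adds a condition along with the arguments for its placeholders.
func (s *Select) Where(fragment string, args ...interface{}) *Select {
	s.conds = append(s.conds, cond{sql: fragment, args: args})
	return s
}

// quoteIdent double-quotes an identifier, escaping embedded quotes.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// Build returns the query text and the arguments in placeholder order.
// Values are never interpolated into the text; they are left for the
// driver to bind.
func (s *Select) Build() (string, []interface{}) {
	var b strings.Builder
	b.WriteString("SELECT * FROM " + quoteIdent(s.table))
	var args []interface{}
	for i, c := range s.conds {
		if i == 0 {
			b.WriteString(" WHERE ")
		} else {
			b.WriteString(" AND ")
		}
		b.WriteString("(" + c.sql + ")")
		args = append(args, c.args...)
	}
	return b.String(), args
}

func main() {
	query, args := From("users").
		Where("age > ?", 21).
		Where("name = ?", "o'brien").
		Build()
	fmt.Println(query)
	fmt.Println(args...)
}
```

The result would be passed on as db.Query(query, args...). The fragments themselves are still opaque strings, so this only addresses argument ordering and table-name quoting, not the string opacity problem.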
hood and qbs additionally provide data validation and sophisticated control over schema output, but this is shunted into the struct tags and it comes at some considerable complexity cost. gorp's hook approach is less elegant and less DRY but more flexible and, in the long run, probably enough.
Missing from all of these, of course, is the R in ORM. And the O, sort of.
But there's still a long way to go. Gorp's TraceOn feature is half-baked; production applications often require extensive logging parameterized on the context of the query being logged, like the tables being queried, for instance. Many of the points in my seven-point ORM philosophy are underdeveloped or not fully explored.
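As a rough idea of what context-aware logging could look like, the sketch below wraps anything with an Exec method and hands a hook the table name alongside the query and its timing. None of this is gorp's API: LoggedDB, QueryInfo, and the explicit table argument are assumptions for the example, and fakeDB exists only so the code runs without a driver.

```go
package main

import (
	"database/sql"
	"fmt"
	"time"
)

// execer is the subset of *sql.DB and *sql.Tx that this sketch wraps.
type execer interface {
	Exec(query string, args ...interface{}) (sql.Result, error)
}

// QueryInfo is a hypothetical record handed to the logging hook.
type QueryInfo struct {
	Table   string
	Query   string
	Elapsed time.Duration
	Err     error
}

// LoggedDB wraps an execer and reports every statement to Hook.
type LoggedDB struct {
	DB   execer
	Hook func(QueryInfo)
}

// Exec runs the statement, then passes its context to the hook. The
// caller names the table explicitly rather than having it parsed out of
// the SQL.
func (l LoggedDB) Exec(table, query string, args ...interface{}) (sql.Result, error) {
	start := time.Now()
	res, err := l.DB.Exec(query, args...)
	if l.Hook != nil {
		l.Hook(QueryInfo{Table: table, Query: query, Elapsed: time.Since(start), Err: err})
	}
	return res, err
}

// fakeDB stands in for a real connection so the sketch runs without a
// database driver.
type fakeDB struct{}

func (fakeDB) Exec(query string, args ...interface{}) (sql.Result, error) {
	return nil, nil
}

func main() {
	db := LoggedDB{
		DB: fakeDB{},
		Hook: func(q QueryInfo) {
			fmt.Printf("table=%s elapsed=%s query=%q err=%v\n", q.Table, q.Elapsed, q.Query, q.Err)
		},
	}
	if _, err := db.Exec("users", "UPDATE users SET name = ? WHERE id = ?", "gopher", 7); err != nil {
		panic(err)
	}
}
```

An ORM that already knows which table a struct maps to could fill in the table name itself, which is the kind of context a bare trace of query strings cannot provide.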
Simply getting data out and putting data back into the database still comes with a lot of friction, due to the lack of adequate query builders. Partial two-way serialization is very difficult because it's impossible to distinguish between zero values and values that were not present in the initial load. There's a general lack of documentation on how to solve anything but the most basic of use cases with these libraries, and admittedly I did no one any favors by adding a fifth undocumented ORM into the mix.
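The zero-value problem is easy to demonstrate, and one common workaround is to use pointer fields (or the sql.Null* types) so that "absent" and "zero" are different states. The sketch below is an illustration under that assumption, not a feature of any of the libraries above; UserPatch and SetClause are hypothetical names.

```go
package main

import (
	"database/sql"
	"fmt"
	"strings"
)

// UserPatch is a hypothetical partial update. A nil pointer means the
// field was not loaded or should not be written; a non-nil pointer means
// it should, even when it points at a zero value.
type UserPatch struct {
	Name *string
	Age  *int64
}

// SetClause builds the SET portion of an UPDATE from only the fields
// that are present, with arguments in placeholder order.
func (p UserPatch) SetClause() (string, []interface{}) {
	var parts []string
	var args []interface{}
	if p.Name != nil {
		parts = append(parts, "name = ?")
		args = append(args, *p.Name)
	}
	if p.Age != nil {
		parts = append(parts, "age = ?")
		args = append(args, *p.Age)
	}
	return strings.Join(parts, ", "), args
}

func main() {
	// An explicit zero is written; the untouched Name is left alone.
	zero := int64(0)
	clause, args := UserPatch{Age: &zero}.SetClause()
	fmt.Println(clause, args)

	// sql.NullInt64 draws the same distinction on the read side: a NULL
	// and a stored 0 scan to different Valid flags.
	var n sql.NullInt64
	if err := n.Scan(nil); err != nil {
		panic(err)
	}
	fmt.Println(n.Valid)
	if err := n.Scan(int64(0)); err != nil {
		panic(err)
	}
	fmt.Println(n.Valid, n.Int64)
}
```

The cost is that every field access now needs a nil check, which is part of the friction described above: the type system can express the distinction, but not cheaply.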