A few weeks ago, Alex Gaynor gave a talk at Waza entitled "Why Python, Ruby, and Javascript are slow". A video is now available, and it is a good watch if you use any of these languages. Even if you don't, or have only viewed the slides, it is still thought-provoking.

He rightly cites (unsurprisingly, given his PyPy experience) the old saws about dynamic typing or monkey patching being slow as the modern version of the 1990s "VMs are slow" mantra: oft repeated, but essentially untrue. Instead, he identifies hash/map addiction and over-allocation as the major sources of slowness in scripting languages.

Hash addiction is a term I've re-coined here to describe the prevalence and importance of hashing and hash tables in scripting languages and the predilection of programmers who mainly write in these languages to reach for the hash table wherever named lookups are desired.

Some hash addiction is cultural. Programmers of elegant scripting languages desire similarly elegant-looking code with flexible data structures, even where no flexibility is in play. Even those who know better understand that, for many popular language runtimes, technical limitations make the difference between hashes and other kinds of named lookups essentially nil.

These programmers know that the other side of hash addiction is technical and implicit. In CPython in particular, nearly all namespace lookups are done by hashing. For object namespaces, this hash is even easily accessible via obj.__dict__. It's a well-known optimization idiom in CPython code to reduce the number of namespace traversals by re-binding out-of-scope variables or deeply nested names to locals in your critical sections, avoiding excess lookups that are costly partly because of their hashy nature and partly because they are repeated on every pass through a loop. Not to be outdone, in JavaScript the object and the hash table are even more intimately related.

This was, ca. 2001, a difficult thing for Java exiles to wrap their heads around, because statically typed languages handle namespace lookups at compile time. Any given reference to a variable, whether it is Foo or com.java.sys.std.coolguys.Foo, incurs the same runtime cost. Go does this as well, which largely removes implicit hash addiction. Go does, however, include modern hash map semantics with its map type; more on these below.
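To make that contrast concrete, here's a minimal sketch (my own toy example, not taken from the talk): the struct field access is resolved at compile time to a fixed offset, while the map lookup hashes its key on every access.

```go
package main

import "fmt"

// point is a fixed-shape record; p.X compiles down to a load at a
// constant offset, with no hashing at runtime.
type point struct {
	X, Y int
}

func main() {
	p := point{X: 1, Y: 2}
	fmt.Println(p.X + p.Y) // names resolved at compile time

	// The map type trades that away for flexibility: every lookup
	// hashes its key at runtime.
	m := map[string]int{"x": 1, "y": 2}
	fmt.Println(m["x"] + m["y"])
}
```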

The case of allocation is more interesting, because it's the primary source of slowness in many languages that have garbage collection. Garbage collection means cleaner APIs that can intelligently allocate their own buffers, relieving the user of what can be a tiresome or error-prone burden. Over-copying and over-allocation also have both cultural and technical causes.

Alex talks about how Python programmers are addicted to string methods like str.split or str.strip despite the fact that they must allocate new strings every time they are used. Any time you've ignored the results of map (yes, even pool.map), you've in some way succumbed to over-copying. Think about range vs xrange, or lists versus iterators in general. But languages like Python also lack APIs to explicitly allocate buffers and lists for reuse and to avoid unnecessary and costly runtime resizing (allocation!), which means that none of the tools built on top of these languages can make use of these time-honored techniques either.

Allocation is a source of slowness in Go as well. In perhaps the seminal work on the subject of performance in Go, Russ Cox profiles a naively written Go program that commits both cardinal performance sins; by excising hashes in favor of structs and lists, and by reducing allocations through classic techniques like caching/memoization, he manages to speed the program up by 15x while reducing its memory usage by nearly 2.5x. But while Go has implicit allocation for single-value variables, it has widely used, idiomatic ways to pre-allocate slices and pointers to assist in reuse and to reduce copying.
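As a rough sketch of those idioms (my own toy code, not anything from Russ Cox's article): allocate a slice at its final capacity up front, and reuse one scratch buffer instead of allocating a fresh one inside the loop.

```go
package main

import (
	"os"
	"strconv"
)

func main() {
	// Allocate the backing array once, at its final capacity, instead of
	// letting repeated appends grow (and copy) it.
	squares := make([]int, 0, 1000)
	for i := 0; i < 1000; i++ {
		squares = append(squares, i*i)
	}

	// Reuse one scratch buffer across iterations instead of allocating a
	// fresh string on every pass through the loop.
	buf := make([]byte, 0, 64)
	for i := 0; i < 3; i++ {
		buf = buf[:0]                    // reset length, keep capacity
		buf = append(buf, "square: "...) // no reallocation while cap suffices
		buf = strconv.AppendInt(buf, int64(squares[i]), 10)
		buf = append(buf, '\n')
		os.Stdout.Write(buf)
	}
}
```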

This all might seem rather unremarkable, but there's a philosophical point to be made here: there are cultural and technical sources of performance issues, and they are related. In his talk, Alex criticizes the naïveté of both tools and programmers. He bemoans the lack of sophistication in tools that make implicit hash addiction and allocation problems unavoidable, but he also complains about how common poorly performing idioms are in scripting-language code. The tools, or in some cases the lack of tools, encourage those poorly performing idioms.

But with Go, you have allocation APIs, you have compile-time name lookups, and you have ways to avoid copying ([]byte vs String, pointers); furthermore, the best ways to take advantage of these techniques are generally the idiomatic ways of solving problems in the language. You have a map type which is, compared to its scripting cousins, inflexible: the key and value types are fixed at compile time. You have duck typing in the form of interface{}, which works incredibly well when you absolutely must have it, but is otherwise cumbersome, requiring costly (in terms of line count and time) type assertions to employ effectively. These things combined mean that most Go programmers would rather reach for a struct when some kind of named collection is required. The struct they build has type safety, implicit and statically sized allocation known at compile time, and essentially free namespace lookups. You have a language where the cultural and technical performance deficits are minimized in a way that's positively reinforced by the semantics (and, sometimes, lack thereof) of the language.
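To illustrate, a small hypothetical example of the same record written both ways:

```go
package main

import "fmt"

// user as a struct: field types are checked at compile time, the fields
// live at fixed offsets, and the whole record is allocated in one shot.
type user struct {
	Name string
	Age  int
}

func main() {
	u := user{Name: "gopher", Age: 3}
	fmt.Println(u.Name, u.Age)

	// The scripting-style alternative: a map of empty interfaces. Every
	// read costs a hash lookup plus a type assertion before the value is
	// usable, and mistakes surface only at runtime.
	m := map[string]interface{}{"name": "gopher", "age": 3}
	name, _ := m["name"].(string)
	age, _ := m["age"].(int)
	fmt.Println(name, age)
}
```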

I've written a fair amount of Go code, but I recently found the opportunity to spend about a month writing nothing but. Having come back to Python this past week, I've had a visceral feeling of being downright wasteful with time and memory. Some of that waste is cultural and avoidable; it's something I catch myself doing in a cute, elegant, slower way. A lot of it is technical, inescapable, and maddening.

Yesterday, Travis Reeder published an article titled "How we went from 30 servers to 2" about iron.io's migration from a Ruby backend to a Go one. People have commented that these savings could have been achieved by writing critical sections in C or by going over the original code with an eye for performance. Putting aside the obvious parallelization benefits that Go has over most other languages, the point they're missing is that you more or less achieve these results by writing normal Go code. You needn't inline C, perform semantically odd tricks, or open yourself up to the litany of programming errors that lie in wait for you in most languages that lack garbage collection. What's going on in your code is right there in your code, unmasked by thick layers of powerful but obfuscating runtime complexity.

Mar 13 2013