Typical

Dynamic and static typing seem to be getting some hot press recently. I've had a long history on both sides of this argument and my opinion has changed over the years.

In the beginning, there was Fortran and Lisp ...

typing... of the dead

The first contemporary piece I remember reading on the subject was an artima.com blog post by Bruce Eckel in 2004, well worth reading in its entirety. I've reproduced a relevant snippet here:

We believe that static type checking prevents bugs, and yet a dynamically-typed language (Python) produces very good results anyway. As I have tried to delve more deeply into this mystery, many of my preconceptions – the major one being that static type checking is essential – have been challenged. An initial response to this is often to simply deny that it's the case, but once you begin denying evidence your theories rapidly become nothing more than fantasies. [...] In trying to do [determine why Python works so well] I have discovered many things and gained greater understanding about the other languages I use [...].

My guess is that Python allows me to think more clearly about the concepts of the problem that I'm trying to solve. It is less distracting because it doesn't force me to think so much about the rules imposed by the language – rules that are basically arbitrary when I'm trying to produce an effective model of my problem space. By getting out of the way, Python and similar dynamic languages allow me to spend more of my brain's "seven plus or minus two" items on the problem itself, and less on the details of the language implementation.

This was satisfying to my tribal lizard brain. I had just spent the last year or two of uni telling everyone that Python was the best, leading a professor of mine to quip that "Python must be the union of all programming languages". In my enthusiasm, I told Aaron Seigo that KDE would be better and get more contributions if the UI standardized on PyQt, and he politely told me that was nonsense. I was unbearable. By an accident of history, Eckel's post is a superb, mature, and honest exploration of the subject and its nuances, even though I couldn't fully appreciate it at the time.

At the risk of over-simplifying, there were a few prevailing classes of languages in 2004. Functional languages, both dynamic (like Lisp, Scheme, Erlang) and static (like ML, Haskell), were not widely used in industry. The statically typed C and C++ were particularly damaging to the notion that static typing reduced errors, as their type systems were trivially and habitually escaped, and they tended to fail in ways (memory leaks, segfaults, bus errors) that dynamically typed interpreted programs didn't.

Java was statically typed and managed, but autoboxing and generics were still very new and not yet widely adopted, which meant it was still a caricature of cumbersome "enterprise" development, not an entirely undeserved reputation. It was also too resource hungry for things you'd write in C/C++, and too bureaucratic to fully take advantage of the freedom provided by garbage collection. It suffered from this categorization for a long time, and perhaps still does.

Into this milieu came Python and Ruby. They were general purpose, classically imperative, and dynamically typed without being inscrutable (Perl) or lacking in coherence (PHP, JavaScript). They carved out a notable niche in a way many dynamically typed programming languages in the past, notably Smalltalk, never quite managed.

The hegemony of these popular languages was strong, and static vs. dynamic typing was framed for a long time in this context. The pitfalls of C, the shackles of Java, and the unreasonable effectiveness of Python/Ruby¹. Unrelated strengths and weaknesses of these platforms came to embody the whole static vs dynamic dichotomy, rather than the other way around. Concepts became bad or good by association. A generation of developers primarily exposed to dynamically typed programming languages inherited this worldview implicitly.

As an example, it was commonly accepted wisdom that while static languages did compile-time checking, compilation was slow and delayed turnaround times. Dynamically typed languages, nearly all interpreted, lacked the checking, but allowed more rapid iteration during the discovery and exploration phase.

These lines are today increasingly blurred, if not altogether shifted. There is a ton of interesting, valuable work being done that challenges these associations and exposes them as incidental rather than intrinsic. Sadly, instead of trying them out, many people seem content to bitterly fight to defend their old dogma. Without diving in, there's no mystery to unravel, no preconceptions to challenge, and no progress. Viewed through this prism, the questions being asked are fascinating:

Go asks: what if a statically typed language had structural typing, a good generic hash type, proper modules, compiled quickly, and used type inference? Would having static type checking and fast development iteration give you the best of both worlds, the worst, or somewhere in between? Are generics actually worth the added complexity in the presence of these other features?

Rust asks: is encoding ownership and borrowing in the language to skip garbage collection worth the conceptual overhead? How many of the benefits of immutability do you get by making it the default?

Multiple dynamic languages are asking: what would happen if you introduced optional static type checking? Python is going through this now, and one of the most succinct and persuasive cases for static typing came in the recent PyCon MyPy talk introducing this feature. PHP even got there beforehand, albeit with its traditional stable of arbitrary limitations. TypeScript has even managed to add static typing to a loosely typed language with no integers.

I'm very excited to see how these hybrid approaches work in practice, which is something we likely won't know for a while². I'm excited because years of programming in Python followed by years of programming in Go have taught me that static typing does prevent classes of real errors I make routinely.

An absolute classic is 'NoneType' object has no attribute..., which is an oversight that static type systems at best make impossible and at worst make visible. Many claim diligent testing methodologies fix this, which is kind of like the programmer version of fat shaming. Without a way to limit the surface area of your API to some set of values, people invariably end up testing preposterous inputs to ensure behavior in the event of hypothetical future abuse, or give up and try to encode these requirements in variable names or comments³.
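As a rough illustration of the "make visible" case, here is a minimal sketch using Python's optional type hints. The `User` class, the `_USERS` table, and both functions are hypothetical names invented for this example. The idea is that once a lookup is declared to return `Optional[User]`, a checker such as mypy can flag any attribute access that doesn't first rule out `None`.

```python
from typing import Dict, Optional


class User:
    def __init__(self, name: str) -> None:
        self.name = name


# Hypothetical in-memory "database", for illustration only.
_USERS: Dict[int, User] = {1: User("ada")}


def find_user(user_id: int) -> Optional[User]:
    """Return the matching user, or None when the id is unknown."""
    return _USERS.get(user_id)


def greeting(user_id: int) -> str:
    user = find_user(user_id)
    # Dropping this check leaves `user.name` reachable with user = None.
    # A static checker can report that before the program runs, instead
    # of it surfacing at runtime as an AttributeError on NoneType.
    if user is None:
        return "hello, stranger"
    return "hello, " + user.name
```

The annotation doesn't change what runs; it only moves the discovery of the missing check from production to the editor.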

I've also become convinced that living in a dynamic-only world leads to weaknesses in API design, where specificity is an asset. The result is a proliferation of HTTP JSON APIs where a key can hold either one object or several because you're just going to check at runtime anyway, or an entire object's structure can change depending on the presence or value of another key because it's assumed you're going to load the whole thing into a hash first anyway.
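To make that concrete, here is a small sketch of what a consumer of such an API ends up writing. The `tags` key and the `parse_tags` helper are hypothetical, not taken from any real service. The `Union` annotation spells out the ambiguity the API pushes onto every caller.

```python
import json
from typing import List, Union


def parse_tags(payload: str) -> List[str]:
    """Normalize a hypothetical "tags" key that may hold one string or a list."""
    doc = json.loads(payload)
    tags: Union[str, List[str]] = doc.get("tags", [])
    # The runtime check the API design forces on every consumer.
    if isinstance(tags, str):
        return [tags]
    return list(tags)
```

Had the API always returned a list, the function would collapse to a single line and the `Union` would disappear.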

Finally, as the MyPy presentation covered, and as I've alluded to recently, I'm increasingly skeptical that context and names are sufficient to write easily readable programs. Domain data types in real world programs are often passed and stored in a variety of formats: a user can be an integer ID, an email address, an object/struct, a protobuf or JSON representation of that object, a specialized cache representation of a subset of fields, an ad-hoc tuple with additional presentation information for a template context, a database result, etc. If you're going to try to encode this into your variable and function names, why not have a succinct way to enforce those requirements?
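One succinct way to do this in Python is `typing.NewType`, sketched below. The type names, the `_EMAILS` table, and `load_user` are assumptions made up for the example. At runtime a `UserID` is still a plain `int`, but a static checker treats it as distinct, so passing a bare integer or an email where an ID is expected gets reported.

```python
from dataclasses import dataclass
from typing import Dict, NewType

# Distinct names for values that are plain ints and strs at runtime.
UserID = NewType("UserID", int)
Email = NewType("Email", str)


@dataclass
class User:
    id: UserID
    email: Email


# Hypothetical lookup table standing in for a real data store.
_EMAILS: Dict[UserID, Email] = {UserID(7): Email("ada@example.com")}


def load_user(user_id: UserID) -> User:
    """Accepts only a UserID; a checker rejects a bare int or an Email here."""
    return User(id=user_id, email=_EMAILS[user_id])


user = load_user(UserID(7))
```

The signature now carries what a name like `user_id_int` would otherwise have to, and a tool can check it.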

These are unsatisfyingly empirical feelings, and as such I don't expect them to be persuasive. Attempts to back them up with data generally devolve into a discussion on methodology. It says something about the virtue of that pursuit that we still seem to be moving in some kind of vaguely forward direction. On the whole, the language landscape is slowly accepting, rejecting, and re-evaluating ideas, taking them apart and putting them together in different configurations, and trying to see what happens. Go check it out for yourself and push things forward!

¹ Functional languages were largely ignored, as they had been since the 60s. This too is slowly changing ...

² There are other problems I have with dynamic interpreted languages, but I'm cautiously optimistic that these are accidents of design rather than intrinsic limitations.

³ User input is, unfortunately, notoriously shit at reading comments. The point of course isn't that testing is for suckers, but reproducing in tests what you'd get anyway by type specification negates the benefits of dynamic typing and obscures your tests, which may actually encode sophisticated value expectations difficult to express with type systems.

Jun 6 2016

jmoiron plays the blues
