jmoiron.net

Howdy Y'all from Austin, TX

posted June 19th, 2008 @ 01:10:36

- tags: development , life , travel

- comments: 0

I suppose this post is long overdue in a lot of ways. I've ended my employment at Attila Technologies and started a position at Advance Internet. I've quit my job as a toolsmith/utility/architect and have become a python/django developer. I started my new job not in Journal Square (where the company is located), but in Austin, Texas.

I have some experience with Austin; my aunt Mel, herself fantastic, lived here with her wonderful friends before she passed away, and I've been down here a few times before to visit. A few months ago, really, after not having my contract formally renewed at Attila, and not getting any raise whatsoever, or any offer of any kind of compensation at all other than what I had been receiving, I decided to pursue other options. I decided on Advance, and upon that decision they told me that they were sending some developers down to Austin and wanted me to join.

The long and the short of it is that I wanted to join, too (even at the cost of a week's vacation), and given the flurry of activity and learning this experience has been, I'm glad I did it. We are visiting a contracting company called Optaros who have developed the application that it seems a few other developers and I will soon take over.

Optaros Austin is, to this point, the epitome of a laid back awesome "agile" development environment. The people on our project are cool, varied, professional (in a good way), intelligent, excited about their work, outgoing; pretty much everything good imaginable and with the relaxed, calmly upbeat tone Attila lacked. Part of it is probably Austin itself, and another part of it is probably them. I came into the situation here completely lost on both sides (having not worked a day in AI's office and having never met any Optaros guys), so it's been pretty interesting for me so far. I think that I've been able to make a fairly decent contribution all things considered, although I'm not really sure. The separation from the familiar for me has been very good, and seeing the architecture and the way in which they go about development has given me a lot of ideas on how I will want to work upon my return to New Jersey.

The Unified Theory of "Goodness"

posted June 5th, 2008 @ 00:46:26

- tags: life

- comments: 0

A question has really plagued me for quite some time now. What makes something "good"?

I started thinking about this a long time ago in college. Back then, good design to me was kinda like pornography: in the immortal words of Justice Potter Stewart, I couldn't define it but "I know it when I see it." For me, it isn't enough to just recognize something as good, I have to know why I recognize it as good.

I don't mince words here. I firmly state here that some things are good and some are bad and that the difference between these things is not merely subjective. It's not that I think something is better than something else, in some sense I know it's better. It's easiest for me to assert this on things that I am an expert on; I know that some programs are better than others, and I know when code is bad and I know when code is good. In this world, there are some easy measurements for goodness (space & time complexity) and some difficult measures for goodness (cleanliness, elegance). It's the difficult measures of goodness where people run into trouble.

I originally started to think more deeply on this after reading Paul Graham's Taste for Makers. In it, he describes how designers (whether they be designing paintings or programs) get "better" as time goes on, which strongly implies some objective and universal measure of 'goodness'.

Some years later, I read Gödel, Escher, Bach. Within, Douglas Hofstadter explains Gödel's incompleteness theorem, the artistic works of M.C. Escher, and the intricate puzzle canons of Johann Sebastian Bach. He uses these concepts along with a host of other muses to talk about how systems themselves can hold and process information not implied by their constituent parts (like the human brain or a colony of ants), and from there to talk about AI. The meat of this book, for me, though, was the inherent beauty and elegance of Bach's work.

Music is largely considered a matter of "taste" in the United States, and at least people of my generation were taught (virtually brainwashed) that matters of taste were purely subjective. In the "easy case" described above, where there are simple numerical constructs that can define goodness and badness, people are able to concede that one thing is better than another: for instance, Lionel Messi is a better soccer player than my grandfather, because his statistics are superior. But Hofstadter introduces the genius of Bach and his music (and contrapuntal baroque music in general) via mathematics, an utterly stringent framework in which elegance is immediately obvious to trained professionals and the relative complexity and difficulty of problems is very well explored.

Recently, I watched a program on PBS' Nova called "Secrets of the Parthenon". I was amazed to learn from this program that the Ancient Greeks designed the Parthenon to have "perfect imperfections"; the Parthenon is completely devoid of straight lines and right angles. A popular theory for this and the use of entasis (slight curvature in columns & foundations found in the Parthenon and other Greek structures) in ancient Greek architecture is that it subdues or corrects for the effects of optical illusions that you would get in perfectly straight architecture. Over time, the Greeks would have understood that if you line up a bunch of straight lines and view them on a horizon, they no longer appear straight. The ratios between height and width of their temples and the width and spacing are also carefully constructed and come from common ratios in nature (the "golden" ratio, and others).

The Greeks not only came to understand the nature of beauty (which to them, was defined by nature itself and the ratios therein) but were able to duplicate it in their works, from beautiful statues to magnificent architectural feats. 200 years ago, Goethe, one of the most intelligent polymaths in modern history, began to unravel basic laws of human perception and revolutionized the understanding of optics and color. And yet today, we've thrown away their hard work and generations of experience; it's all just a matter of personal preference.

This isn't to say that there is a final universal answer to whether or not Pele is indeed better than Maradona, or if The Beatles are better than The Rolling Stones, but merely that there is "good taste" and "bad taste", and the difference is not merely subjective.

I've found out that this idea is sometimes met with hostility, because if you accept that design can be good or bad and that your opinion of something is not necessarily right, it brings up the possibility that things you like might be crap and things you hate might be great. It also means that, although you might be fine with your particular tastes, they can be improved.

To not accept it is worse yet, I think, for anyone who aspires to design anything. It allows you to dismiss any critics you don't agree with without cause, and also means that you can't improve upon your skills. If you hold someone to be a better designer than yourself, it's difficult to explain this and to reconcile the differences in design skill within this framework of thought.

Good of Society?

posted May 10th, 2008 @ 18:47:37

- tags: development , python

- comments: 0

I've disabled comments because I have some neat captcha-less ideas for how to tell bots to fuck the hell off from my comments section but I don't feel like implementing them for django. The new system (mostly homegrown glue w/ selector, beaker, mako & werkzeug thrown in) has been in heavy development the past few weeks and commenting should be available by the summer. Sorry Mike; now you've no reason to come here.

For a while the bots were just hitting a few old posts about 5-10 times a day, so I'd kill the spam every once in a while, but then I started to get pretty disgusting stuff on brand new posts so now it's gone. It's one of the classic models of digital social interaction at work, I suppose.

Otherwise, I've been hard at work developing the new site. As part of creating the new site, I wanted to develop very specific things and then generalize them out quickly to create a sort of reusable toolkit. That toolkit (which is standalone, basically) is davenport, and it's been getting a lot of love recently. Some couch-specific modeling stuff and hopefully a davenport CRUD generator will be in the future, as I flesh out the backend of the Hot New Shit. I also have to get my act together and send cmlenz some patches for python-couchdb. Busy Busy Busy.

Swap is good for computers

posted May 1st, 2008 @ 02:06:49

- tags: linux

- comments: 0

Or, using the distribution upgrade for Ubuntu to move from 7.10 to 8.04. I've tried on 2.5 machines so far. The full install, if you are using Ubuntu and have already downloaded the ~1GiB of package data, can still take around 45 minutes to an hour, and stopping halfway could seriously bork things beyond what a journeyman could repair.

But do not despair. Everything (except some translation file for Russian Blackjack; sorry Russia, you're off the island) upgrades fairly smoothly. However, no upgrade is without its hiccups, and here are two that "got" me. The "got" means that I was mildly annoyed but knew how to fix them.

One was unique to my situation and probably not due to the Ubuntu upgrade per se, but was definitely something I've been seeing more under the heavier-on-the-memory 8.04 than 7.10, and that was a complete system lockup. Happened when I "did too much". I checked free a couple of times too many before I noticed that my swap wasn't there, but that's how it was. As I might have mentioned earlier, swap is good for computers.

The cause of this is almost certainly some dodgy partition swapping I did a month ago when I installed Windows XP on this computer to play Dreamfall. I had a 10GiB partition all set up for Windows + Game for a long time, but never bothered with the install (YAGNI). When I needed it, I realized (after a few runs of waiting 15 minutes for the Windows installer to load all of its drivers) that I was trying to install to a logical partition and Windows doesn't appreciate it thank you very much. I had to shift around my swap and Windows partition to make the swap logical and the Windows primary before it would yield and install.

Ubuntu uses drive UUIDs in the fstab (and exposes them under /dev/disk/by-uuid/); these UUIDs don't change as long as the partitions don't change (HEH) and they are also device dependent, so you can put your card reader on /media/cardreader/ and your ipod on /media/ipod even if both are going to be '/dev/sdb'. UUIDs for devices are generally a good idea, especially since Linux has a habit of switching disks (sdb? sda? who cares!) when their position on the bus switches (or just randomly, it seems). Some people don't understand, but they probably just don't like seeing a big ugly UUID in their fstab. They should get over it because for now it's the best solution if your system is going to be managed programmatically. Anyway, I never even remade the swap partition, so after mkswapping it and turning it on, I added the new UUID for the partition into the fstab and presto changeo things are nice.

The other issue is with flash sound. 8.04 moved to the PulseAudio sound server, which is a Good Thing. PulseAudio is like every other sound server in existence except that it's a lot better and it is easier to integrate via wrappers. One such wrapper is the "libflashsupport" wrapper distributed in 8.04 to wrap the non-free flash player's direct use of /dev/dsp (or ALSA, I forget which) in calls to PulseAudio. For people who think that this is just "stupid and pointless", PulseAudio buys you lots of nifty things like program specific volume control and a far better backend architecture than esd or arts, and it would behoove people to finally solve playing a sound without fighting over /dev/ resources once and for all in Linux.

So, if your flash stuff isn't working, the first thing to do is check if libflashsupport is installed. If it is, then chances are PulseAudio is not working correctly on your system. If you didn't have problems before with sound, or just want your goddamn YouTube, remove libflashsupport. If your flash stuff isn't working and you don't have libflashsupport, install it. Remember, if you don't want to use PulseAudio, edit your /etc/firefox/firefoxrc to use 'aoss' as the audio backend for firefox.

Mercurious

posted April 23rd, 2008 @ 22:49:26

- tags: development

- comments: 0

This past week/end, I set up Mercurial on my server and started a few projects there. One is called slipcover, which I talked about last time. The other project, which I've started more recently, isn't really called anything. It was started as a direct result of me using Hg and I must say that so far I am pretty happy.

That project is called Beaker and it already has a perfectly nice home. Beaker is a session management library for Python that allows a number of different storage backends: memcached, database (via sqlalchemy), memory, files, etc. My project is a branch that has a CouchDB backend extension for Beaker.

In Hg, the "checkout" operation actually creates a branch that is no different in terms of operation from upstream: you can checkout from it, checkin new revisions, etc. If you are interested in how Hg being distributed helps make this type of operation simple, and are coming from a centralized versioning system background (as I was), I highly recommend you read Armin Ronacher's "Mercurial for SVN Users", which does the job of explaining the philosophy of Hg to people used to svn/cvs admirably and better than I could without lots of effort.

Hg is as of just last month "1.0" software, and has already proven itself in major projects like Java. There is a fairly mature trac plugin and various rcs conversion tools. Although there is no WebDAV support, hgwebserve supports most of the same uses with respect to checking in/out via https, and comes with a revision browsing interface that includes colorized diffs and various changelog views. In the coming weeks, I'll be migrating my old svn repositories to Hg and migrating my trac instances to use TracMercurial.

Slipcover

posted April 19th, 2008 @ 19:42:50

- tags: development , python

- comments: 0

I've been doing a lot with CouchDB and WSGI the past few months, with positive and negative results. I'm finally getting the "hang" of Document Oriented Databases (DODB? Rubbish acronym), and starting to understand what is possible and what isn't, how to do one-to-many and many-to-many relationships without having lots of queries. It struck me that although the per-query time in CouchDB is thus far a lot slower than relational databases, the overall database time for individual webpage loads is far less, and there are far fewer queries and complex information joins going on.

There are a few reasons for this. The first is that Django is a general purpose framework, and its ORM is similarly meant to be general purpose. The result is that getting the comment counts for 10 posts on the front page takes 10 queries. This type of a query will be possible in one view in CouchDB when Reduce is implemented, but the only way to do this without modifying your "data schema" is to get all of the comments and count them manually.
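To make the "10 posts, 10 queries" pattern concrete, here is a rough sketch using Python's built-in sqlite3 module rather than Django's ORM. The post and comment tables, their columns, and both function names are made up for illustration; this is not the schema behind this site. It only contrasts a per-post COUNT loop with a single hand-written grouped query.

```python
import sqlite3

# Hypothetical schema for illustration only; not the blog's real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
""")
conn.executemany("INSERT INTO post (id, title) VALUES (?, ?)",
                 [(i, "post %d" % i) for i in range(1, 11)])
conn.executemany("INSERT INTO comment (post_id, body) VALUES (?, ?)",
                 [(1, "first"), (1, "second"), (3, "hello")])


def counts_one_query_per_post(conn):
    """The naive pattern: one COUNT query for each post on the page."""
    post_ids = [row[0] for row in conn.execute("SELECT id FROM post ORDER BY id")]
    counts = {}
    for post_id in post_ids:
        row = conn.execute(
            "SELECT COUNT(*) FROM comment WHERE post_id = ?", (post_id,)
        ).fetchone()
        counts[post_id] = row[0]
    return counts


def counts_single_query(conn):
    """The same numbers from one hand-written grouped query."""
    rows = conn.execute("""
        SELECT post.id, COUNT(comment.id)
        FROM post LEFT JOIN comment ON comment.post_id = post.id
        GROUP BY post.id
    """)
    return dict(rows.fetchall())


# Both approaches agree; only the number of round trips differs.
print(counts_one_query_per_post(conn) == counts_single_query(conn))
```

The loop issues one query per post plus one to list the posts, which is the cost a general purpose layer tends to hide from you.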

Because data is so fluid, it's no problem to add a "comments_n" field to each of your commented-on documents. This kind of application specific update-on-write hack is probably going to become very commonplace as CouchDB gains mindshare and the "best practices" are discovered, not because there are inherent limitations with the view system (or at least, there won't be, once it's done), but because the lack of a schema makes it extremely easy, and it's always more efficient to write the application this way than to calculate it over and over.
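A minimal sketch of that update-on-write hack, assuming a plain in-memory dict as a stand-in for the document store. The save and add_comment helpers and the document ids are hypothetical; a real CouchDB client would replace the dict.

```python
# In-memory stand-in for a schemaless document store. All names here are
# hypothetical; this is not the couchdb-python API.
documents = {}


def save(doc):
    documents[doc["_id"]] = doc
    return doc


def add_comment(post_id, comment_id, body):
    """Store the comment and bump the denormalized counter on the post."""
    post = documents[post_id]
    save({"_id": comment_id, "type": "comment", "post": post_id, "body": body})
    # The field need not exist beforehand: there is no schema to migrate.
    post["comments_n"] = post.get("comments_n", 0) + 1
    save(post)
    return post["comments_n"]


save({"_id": "post-1", "type": "post", "title": "Slipcover"})
add_comment("post-1", "comment-1", "Nice writeup")
add_comment("post-1", "comment-2", "Agreed")
print(documents["post-1"]["comments_n"])  # prints 2
```

The count is paid for once per write instead of once per page load, at the price of keeping the counter honest whenever comments are deleted.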

In light of all this tinkering, I've done a few things. I've started to take a closer look at Erlang especially of late, as I've started to use CouchDB trunk instead of 0.7.2. I've also done a lot of python speed & feature related hacking with both the more or less official couchdb-python library and with my own re-implementation using pycurl instead of httplib2 which I've called curlcon. Originally, I had devised an httplib-style replacement object that used curl as its transport layer (Hence Curl Connection) to be plugged into python-couchdb, but it evolved into a sort of experiment to see how fast I could get the http part of CouchDB to run.

The code is now available in my mercurial repository under the "slipcover" repository. As I start to add testing to some of my "web framework" for the next version of jmoiron.net, "slipcover" will become more of a full project whose purpose (besides to run my website) is to be an example of an idiomatic CouchDB/Python web application.

Clonable Server Demarshaller

posted March 18th, 2008 @ 21:05:38

- tags: politik

- comments: 2

I might just take all of my title names from classnamer from now on. It really saves me from the hardest part of writing these damned things.

This past weekend (and even now) I've had some foul flu-like disease which has really destroyed any Joementum I've had coming into the stretch here. I was ready to go to 2 concerts this week and reconnect with some old mates when this bug came in and kicked me straight in the ass. I was out even worse than Bear Stearns this weekend. It's pretty funny, right? Hah, Hah! But I'm actually extremely pissed off; I'm mad as hell and I'm not going to take it anymore. It's just so far past "I told you so" that I can't say that anymore. Doesn't have the requisite OOMPH. The only thing I can think to say is fuck you; It's the new black.

Let's put things into a ludicrous perspective. Anyone can go onto Yahoo and find out just how bad the dollar has taken a tumble since 2002 against the Euro (protip: it was on parity, now it's around $1.60 for 1 EUR), and it's very startling to an American's sense of idiotic superiority to learn that the CAD is worth more than the USD, but to me the anguish really concerns Japan.

A mere 2 years ago, I went to said island nation and was able to fetch 127 JPY for a single greenback.
Not 1 year ago, I revisited the land of the rising sun with my younger brother, and we received not less than 117 JPY per hard earned American Dollar at the very same money change counter in Narita. In those 15 months, something that I enjoy doing got 8% more expensive purely because our economy is in the shitter and nobody trusts the complete idiots in chief to right it.

This monday, the dollar hit as low as 96 JPY. 96. That's a roughly 18% drop in 9 months. In January, it was already down around 106, 105. Something that I liked doing in January 2006 is now 25% more expensive to do purely because our economy is in the shitter and... you get it. I'd love ever so much to blame this on Ohio, and I do; but the problem is really a sociopolitical perfect storm that can only be described as completely fucking retarded. I mean this with the utmost respect to the actual mentally retarded, because they don't deserve to be compared to the intellectually indigent beltway assholes steering everyone straight into the shit pile.

This is what happens, America, when you allow anti-intellectualism to become the dominant culture; you get people trying to stress already underfunded schools by making them teach idiotic filth alongside evolution, you get a hulking mass of voters who want their political representatives to be typical instead of exemplary, and you get a bunch of deified capitalist idealistic monsters who just plain have no fucking place in reality as we know it. And you ruin my fucking trips to Japan. So if you are out there, and you voted for Bush because Kerry seemed like he was too boring, or his face looks like that of a horse, or you think John Edwards's hair and plastic smile and double chin looked a bit too perfect, or (and I sincerely hope against hope this doesn't apply) that you actually agreed with anything that ridiculous, impish conman said in his first 4 years, then I say to you fuck you as emphatically as a sick man can.

Happy Anniversary: An Introspective

posted February 22nd, 2008 @ 22:40:36

- tags: development , life , python , site news

- comments: 2

Happy Anniversary to me!

6 years ago I started my blog on the now defunct IRIX server attila.stevens-tech.edu. The first post in this blog was made with a bash script that basically used cat & sed to style a quasi-structured text file that I'd shell into the server and create. The server did not offer any CGI services (for fear that the students would screw up and bring down the server, which also handled email), so it would be a year or so until I was able to move my website to PHP.

From there, I wrote two versions of jmoiron.net using the classic LAMP stack. The second one used PEAR:DB for database safety and a thin home-grown templating system based in PHP (both heavily inspired by Jeremy Mikola).

In May, 2003, I started to learn the Python programming language (This was the beginning of my summer vacation for the Junior->Senior year of University). My very first mention of the language in these hallowed pages was, well, completely retarded: I was "gonna write an interpreter, or a C compiler, or something." It's funny ha-ha.

By late 2004/early 2005, I had ditched the PHP beginnings and had written some custom mod_python stuff to run the site. I ditched mysql at the same time. By late 2005/early 2006, I started writing my own plain-text -> HTML markup language based on MoinMoin syntax. I had almost my whole database converted (automatically) to this script when, in mid 2006, I traded manual mod_py for django and my markup language for markdown.

I quickly ditched the first iteration of the django site and ended up with what you see today; it's been here for over a year and a half! It was by far the harshest transition, because I also made a switch from storing my posts as HTML to storing them as markdown (which I am, of course, now unhappy with); and back then the automagic html->markup filters left a bit to be desired. I am working on yet another iteration with yet another set of technologies; the next iteration (probably set to finish around April) won't even involve SQL at all, and will be the first one without a real solid framework since I moved to Django.

This site is kind of a technical experiment of mine; it's where I express myself through code, through design, and through words. Hopefully sometime soon, through pictures too. I hope that it will continue to be something that I can hang out there to dissuade future employers from hiring me!

On WSGI, CouchDB

posted January 30th, 2008 @ 01:25:41

- tags: development , python

- comments: 0

pythons on a couch

I've been thinking about WSGI and CouchDB recently, while on the subject of digital inflexibility. First, I want to clarify a few things about what I mean by flexibility with respect to an application, and how the current crop of frameworks approach this problem. If you want to follow this musing well, I highly suggest reading "What PHP Deployment Gets Right" by Ian Bicking; or just his entire blog, and most of the crosstalk on the web about REST, Web Services, and the evolution of the WWW.

How do modern frameworks (Rails, Django, Turbogears, or other "canonical" ones) deal with the problem of flexibility? They don't, for the most part. For one, flexibility is hard both programmatically and conceptually. Secondly, they replace flexibility with simplicity, which is almost always a tradeoff that results in quality. They achieve this simplicity by strictly dividing tasks and then conquering each task by building up a structure around how one is supposed to go about solving that task.

These are not negative qualities at all; one of the things that Rails has gotten right is that design ought to be opinionated. So your task when you go to develop in your now "classic" framework is to set up your REST API (your "Routes" or "urls.py") and your controllers, design your models, and set up your views, and you've got the whole MVC ready to go. The problem is rigidity, repetition, and BigDesignUpFront.

The solution is flexibility. Joel defends BigDesignUpFront, and when you are working with a team on some critical make-or-break software for your company, BDUF might be well worth it for its benefits in fleshing out potential problems, helping with schedule and cost estimation, etc. But for prototyping, exploring technology, or exploring a problem space, BDUF is deadly. For "agile" or TDD, popular buzzwords that are worth far less than their hype but still provide useful insight for all developers, this is potentially damaging. Coupled with rigidity (in the form of SQL) and repetition (even in DRY espousing frameworks like Django) this is tough to overcome, especially when looking at migrating lots of data to a new application framework.

The first way to overcome inflexibility I want to talk about is WSGI, whose design is inspired in part (or so I understand it) by Java Servlets. At its core, WSGI is a specification for how web servers and python applications communicate; but more interesting (and far more necessary in the statically typed world of Java) it also defines specifically how various python applications are called by the web server. This means that other python applications, given that they abide by the specs, are free to call other WSGI applications themselves with impunity and expect them to work.

The way it's implemented, you need only define the __call__ method to receive 2 passed arguments and return an iterable in order to qualify as a WSGI application. These are incredibly weak requirements on applications, and make many middlewares truly plug and play. What's more, the effort was originally to define a standard that the existing plethora of Python frameworks could all use so that their component pieces would be interoperable with each other. WSGI is still pretty new, and opinionated frameworks like Django are probably not eager to ditch their middleware integration layers for pure WSGI interfaces anytime soon (although Django does work w/ WSGI, I think that's more of an interface between a web server and a Django application taken as a whole), but the proposition of using, say, Django's caching middleware, for any python web application written to conform to WSGI is really exciting.
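Here is a minimal sketch of those requirements, written against current Python 3 where WSGI bodies are byte strings. HelloApp, UppercaseMiddleware and the call_app helper are names invented for this example; nothing here comes from Django, Pylons or Paste.

```python
class HelloApp:
    """A WSGI application: a callable taking (environ, start_response)."""

    def __call__(self, environ, start_response):
        body = b"Hello from " + environ.get("PATH_INFO", "/").encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]  # any iterable of byte strings will do


class UppercaseMiddleware:
    """Middleware is itself a WSGI app that wraps and calls another one."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # bytes.upper() only touches ASCII letters, so the length is unchanged.
        return [chunk.upper() for chunk in self.app(environ, start_response)]


def call_app(app, path):
    """Call a WSGI app directly, with no web server, and collect the result."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(UppercaseMiddleware(HelloApp()), "/blog")
print(status, body)  # 200 OK b'HELLO FROM /BLOG'
```

Because the middleware only relies on the calling convention, it can wrap any conforming application without knowing which framework produced it. The standard library's wsgiref.simple_server can serve the same object over HTTP.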

This gives you flexibility in designing your own "framework" built of hand chosen component pieces. Pylons is essentially a framework built upon PythonPaste that helps you choose these WSGI middleware components, but I've found some of the areas (particularly the URI routing) to be a little less flexible than I'd like (and, sadly, the documentation is a far cry from Django's). Accepting the dogma of one framework or another does come with a practical advantage; you avoid writing the necessary glue between components. But as the glue itself is agonized over, standardized and simplified, it becomes just another component.

It also gives you another interesting flexibility: the ability to attach applications written completely differently (even in different frameworks) to different URIs at the same site, all of them using the same middleware. This blog works as a Django application; why change it? But my Gallery might be better implemented using other technologies (and I discuss this below); with everyone on board using WSGI, it'd be trivial to attach a different application to handle the '/gallery/' URI space but keep both applications using the same caching, gzip, and authentication middleware. This idea is extremely powerful, because it allows one to select the proper tool for the job and align with whatever tool chain most closely reflects the problem at hand.
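A rough sketch of that arrangement, assuming two toy applications standing in for the blog and the gallery. PrefixDispatcher, AddHeader, the X-Served-By header and the request helper are all hypothetical names for this example. A production dispatcher would also adjust SCRIPT_NAME and PATH_INFO for the mounted app, which this sketch skips.

```python
def blog_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"blog"]


def gallery_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"gallery"]


class PrefixDispatcher:
    """Route requests to a mounted app by URI prefix, else to a default app."""

    def __init__(self, default, mounts):
        self.default = default
        self.mounts = mounts  # e.g. {"/gallery": gallery_app}

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "/")
        for prefix, app in self.mounts.items():
            if path == prefix or path.startswith(prefix + "/"):
                return app(environ, start_response)
        return self.default(environ, start_response)


class AddHeader:
    """Shared middleware: appends one response header for every app below it."""

    def __init__(self, app, name, value):
        self.app, self.name, self.value = app, name, value

    def __call__(self, environ, start_response):
        def wrapped(status, headers, exc_info=None):
            extra = list(headers) + [(self.name, self.value)]
            return start_response(status, extra, exc_info)
        return self.app(environ, wrapped)


# One middleware stack, two differently written applications underneath it.
site = AddHeader(PrefixDispatcher(blog_app, {"/gallery": gallery_app}),
                 "X-Served-By", "shared-middleware")


def request(app, path):
    """Call the app without a server; return the body and the shared header."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["headers"] = dict(headers)

    body = b"".join(app({"REQUEST_METHOD": "GET", "PATH_INFO": path}, start_response))
    return body, seen["headers"].get("X-Served-By")


print(request(site, "/gallery/2008"))  # (b'gallery', 'shared-middleware')
print(request(site, "/posts/1"))       # (b'blog', 'shared-middleware')
```

Caching, gzip or authentication middleware would slot in the same way AddHeader does here, wrapping the dispatcher once instead of each application separately.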

What about flexibility at the genesis of the application? Web applications these days deal mostly with the storage and presentation of data. Certainly, the current crop of frameworks reinforce this idea; ditch Django's ORM or ActiveRecord and see what's left with respect to creation of a data driven website. This is where CouchDB, or what I perceive CouchDB to offer, enters the equation.

As a metaphor, let's look at programming languages and type binding as a method of describing and manipulating data. In a statically typed programming language, the structure of data is described explicitly and is enforced by the compiler. You go about defining what a widget is, and then create instances of widget. Methods that would manipulate widgets must receive a widget as their input.

Where statically typed languages provide subtypes, super types, and other ways to make the definition of what qualifies as a widget more malleable, databases struggle at this. You describe data (in the form of tables, relations, etc) beforehand as before, with each field being a strict type and each table describing some strictly typed record. To alter these definitions, you have to define new tables to make additions to the previously defined record types, and modifying the type of their existing data is not possible.

If you want to act on all widgets, you must be cognizant of other widget-like tables. Even if your new widget is exactly like the old one, grouping both is either manual or inane and always slow. So how do you do migration? You dump the database, add or massage the types of the new table columns you will be adding, and then re-import. Some frameworks provide tools around this process, but the necessity is fundamentally broken.

The document oriented approach CouchDB takes is much more like having a large, flat, "duck typed" table where you can store anything. You define views of your large data soup that pick out items based on specific characteristics of those items, not on their structure. Want all "things" published on some day? It isn't a problem; everything is a thing. A quick stab at structure is to add a type field that allows you to filter out "things" that match a type string. These things are guaranteed, upon delivery, only to have matched that type string and nothing more. This is a weak guarantee, but weak guarantees buy us flexibility.

In Python, often times functions are described as taking objects that allow certain actions on them; for instance, iterable. Requiring only that an object be iterable is a very weak requirement, far weaker than acting on "anything of type foo". In practice, many functions merely require that the objects they manipulate only contain certain methods or attributes, not necessarily that they satisfy some larger unused type structure. This is a trade off, to be sure, but it's a trade off towards both simplicity and flexibility.
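As a small illustration of how weak the "iterable" requirement is, here is a sketch in which one function accepts a list, a generator and a custom class alike. total_length and Shelf are names invented for this example.

```python
def total_length(items):
    """Works on anything iterable whose elements support len()."""
    return sum(len(item) for item in items)


class Shelf:
    """Not a list and not a subclass of anything special; it just iterates."""

    def __init__(self, *titles):
        self.titles = titles

    def __iter__(self):
        return iter(self.titles)


print(total_length(["ab", "cde"]))                # 5
print(total_length("x" * n for n in range(4)))    # 0 + 1 + 2 + 3 = 6
print(total_length(Shelf("gallery", "blog")))     # 7 + 4 = 11
```

Nothing declares that Shelf "is a" sequence of strings; the function only asks that it can be looped over.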

As a concrete example of how this can be useful, let's take my ever languishing gallery application. The goal is to keep in the database my images as well as their EXIF tags such that I could easily perform a search like "Find me all images with this aperture" or "Find me all images taken with this camera." Because I have images taken from at least 3 different cameras (not to mention pictures my friends or family take that I might want to include), and camera makers all add their own types of tags in the "MakerNote" section, I can't have a single per-image "tags" table.

As it stands now, my proto-SQL database has 3 simplistic tables to handle this: a gallery_image consisting of id, title, description, etc; a gallery_image_tag, which is supposed to represent a single EXIF tag consisting of an id, title, desc, etc, and 'gallery_image_tags' which allows me to tie the two together so I can get "an image and all of its EXIF tags" in one query. This is straightforward (albeit painfully unoptimized) using the Django ORM, but it's a horrible rigid design that sees me making potentially dozens of database updates for each uploaded image.

In CouchDB, I could simply designate my images as having a type field of "image", and then dump in the tags as key/value pairs. Creating the kinds of views described above is as trivial as creating a view that returns all images: the 'all images' view maps documents based on their satisfaction of (type == image), and the more specific views just add another condition (camera_model_name == ...).
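CouchDB's own views are map functions written in JavaScript, so what follows is only a plain-Python analogy over in-memory dicts, not CouchDB's API. The sample documents, the aperture field, and the run_view helper are hypothetical.

```python
# Made-up sample documents; only the "type" field and camera_model_name
# come from the text above.
documents = [
    {"_id": "img1", "type": "image", "camera_model_name": "ExampleCam A", "aperture": 2.8},
    {"_id": "img2", "type": "image", "camera_model_name": "ExampleCam B", "aperture": 5.6},
    {"_id": "post1", "type": "post", "title": "Digital Inflexibility"},
]


def all_images(doc):
    """Map function: emit every document whose type is "image"."""
    if doc.get("type") == "image":
        yield doc["_id"], doc


def by_camera(doc):
    """Map function: key each image by its camera model, if it has one."""
    if doc.get("type") == "image" and "camera_model_name" in doc:
        yield doc["camera_model_name"], doc["_id"]


def run_view(map_fn, docs):
    """Apply a map function to every document and return the rows sorted by key."""
    rows = [row for doc in docs for row in map_fn(doc)]
    return sorted(rows, key=lambda row: row[0])


for key, value in run_view(by_camera, documents):
    print(key, value)
```

The two map functions differ by a single condition. Neither one needs the blog post document to have a camera field, or to be declared anywhere as not having one.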

Looking to the future, it is also far easier to modify the CouchDB database to allow for new features. Let's look at some potentially interesting ones: an algorithm that gauges the color temperature of a photo to group "like" photographs together, such that you can view a "gallery" of dusk pictures, dark pictures, or black and white pictures algorithmically rather than by manual tags; or facial recognition to determine whether or not a picture is a portrait. I could run these algorithms on my database images in batch mode and then simply update each document with its temperature score or its boolean portrait status without ever explicitly modifying any structure. As the temperature scores or portrait statuses are tabulated, they are added to each document and the "gallery" views incorporate them automatically.
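A small sketch of that batch pass, again using plain Python dicts as stand-in documents rather than a real CouchDB connection. The scoring function is a placeholder that looks up made-up numbers; the field name "temperature" and the 0.5 threshold are assumptions.

```python
# Pretend output of a color temperature algorithm (made-up numbers).
FAKE_SCORES = {"img1": 0.82, "img2": 0.15}


def temperature_score(doc):
    """Hypothetical stand-in for the real algorithm; None means 'not scored'."""
    return FAKE_SCORES.get(doc["_id"])


def annotate(docs):
    """Batch pass: attach a score to every image document that has one."""
    for doc in docs:
        if doc.get("type") == "image":
            score = temperature_score(doc)
            if score is not None:
                doc["temperature"] = score


def warm_gallery(docs, threshold=0.5):
    """View: ids of images scored at or above the threshold."""
    return [
        doc["_id"]
        for doc in docs
        if doc.get("type") == "image" and doc.get("temperature", 0.0) >= threshold
    ]


documents = [
    {"_id": "img1", "type": "image"},
    {"_id": "img2", "type": "image"},
    {"_id": "img3", "type": "image"},  # never scored
]

print(warm_gallery(documents))  # nothing has a score yet
annotate(documents)
print(warm_gallery(documents))  # scored documents now show up
```

The view was written before any document had a temperature field, and documents that never get one are simply skipped. No migration step sits between the two print calls.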

Software developers have this kind of wish-list view of the future, where writing a web gallery can quickly turn into pushing the forefront of computer vision technology or spawn a Perl-to-Python compilation project. Sometimes these whims manifest themselves as something very interesting or inspiring, and wherever they aren't too critical they should be possible!

Digital Inflexibility

posted January 25th, 2008 @ 18:45:28

- tags: development

- comments: 1

I've been playing with a few ideas recently, stemming from the perceived inflexibility in my blog software and my inability to develop a gallery over the past, oh, year and a half.

I already know what level of flexibility I want; the wiki. No logging into an admin interface and filling out some convoluted form to create a flat page; you go to the URI you want to represent your text and you add it.

Wikis can afford this flexibility by sacrificing a lot of what makes a traditional website powerful: structure. When you add data to a wiki page, there is no structure to the text within that page relative to the system. When I add a blog post, there is a strict structure: there is some text that is a "body", "title", "timestamp", etc. This structure is very powerful, because then you can treat all posts the same and provide different views on them. This blog post is rendered by the exact same tiny template as any other blog post at any other URL that you find on my site.

But in a traditional "backend" driven website, that structure exists in a relational database. Changing databases is difficult and annoying, and gets more so as the amount of data you store in them increases. More unfortunately, you have to create that database structure at the beginning of your website development, forcing you to think about the exact structure and interdependence of your data before you have anything off the ground. I've spent 12 months being paralyzed by the daunting task of coming up with all of the possible metadata I might want to save on an image to create photo gallery software.

Some of this is tool failure, some of it is personal failure, some of it might even be my very own ignorance. But there is a kernel of structural failure in the very way that the classic "web application" is created. As it turns out, the "new jersey method" of web content delivery, the completely unstructured wiki, is better for most things.

At its core, at its deepest philosophical level (ignoring community driven content, which is outside the scope of this discussion), a wiki is a way of linking a UID, represented by a URL, to a piece of text. The other step forward in wikis is the pervasive use of simple, human friendly markup.

What if you could apply these to more than just text? What if you could give structure itself an easy, wiki-like markup? A wiki's handling of rich content like images or movies is notoriously bad; what if you applied pythonic namespace theory to a wiki so that /!img/foo.png allowed you to "create" foo.png if it did not exist, or /!mov/bar.flv allowed you to "create" the flash video bar.flv at that "location"?
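As a thought experiment, the namespace idea could be sketched like this. Everything here is hypothetical: the dict stands in for a real backing store, and parse_path and get_or_create are made-up names, not part of any existing wiki or framework.

```python
# Maps (namespace, name) -> content; a stand-in for a real backend.
store = {}


def parse_path(path):
    """Split "/!img/foo.png" into ("img", "foo.png"); other paths are plain text."""
    stripped = path.lstrip("/")
    parts = stripped.split("/", 1)
    if len(parts) == 2 and parts[0].startswith("!"):
        return parts[0][1:], parts[1]
    return "text", stripped


def get_or_create(path, default=None):
    """Wiki behaviour: visiting a path that does not exist yet creates it.

    Returns a (content, created) pair.
    """
    key = parse_path(path)
    created = key not in store
    if created:
        store[key] = default
    return store[key], created


print(parse_path("/!img/foo.png"))
print(parse_path("/!mov/bar.flv"))
print(parse_path("/about/me"))
```

The namespace prefix tells the system what kind of thing lives at the URL, so an image handler or a video handler could take over where a plain wiki would only know how to show text.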

I've been mulling over this recently, and over my own dissatisfaction with django/turbogears/pylons and just SQL in general. I've had some interesting ideas for wiki extension (like pervasive use of DAV, using an RCS backend instead of a FS backend, etc), and I've been looking into "thin" web glue technologies like paste to help create a framework suited to my needs, using mod_wsgi, django's URL routing, CouchDB as a backing store and whatever templating language I end up preferring (probably the fastest one).