Visualizing the Gnu GPL

My suggestion for the Decoding Digital Humanities meeting has been accepted, by both the London and Melbourne groups, for next Tuesday (24th August) here in the Great Wen, and next Thursday (26th August) down under. I’m feeling the warm glow of internationalism!

One reason I suggested the Gnu GPL as a text was for its unfamiliarity of form. It’s a software license, a genre often viewed but rarely read. I’ve clicked through many, barely registering the dense legalese, meaning I’ve probably promised to sacrifice my first-born to Bill Gates. The GPL, to its great credit, has a clear and concise preamble. Nevertheless, it is a legal document, written to withstand exacting juridical scrutiny.

As digital humanists, we shouldn’t be frightened of such things, for we make tools to deal with such difficulties. Whether the texts are in another language, damaged, obscured, fragmentary, long-winded, self-referential, or simply too numerous – not forgetting that no text is so transparent that one simple reading will comprehend it entirely – we can hack them.

One popular way of doing this is with wordles. These are, in essence, visualized concordances. The words are weighted according to frequency, then displayed as clouds. There are various options for colour, layout and font, but these do not reflect any aspect of the text; they are there for aesthetic appeal, which is one cause of wordles’ popularity. (The creator of Wordle, Jonathan Feinberg, discusses this in Viégas et al., “Participatory Visualization with Wordle.”)
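For anyone who wants to try the frequency weighting themselves, here is a minimal sketch in Python. It is not Wordle's code, which is closed. The stop list, the function names and the linear scaling to a font size are all my own assumptions, chosen only to illustrate the idea of a weighted concordance.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop list, not the list Wordle itself uses.
STOP_WORDS = {"the", "and", "a", "an", "of", "to", "in", "or", "is",
              "for", "that", "this", "you", "it", "as", "be", "by"}


def top_words(text, n=100, stop_words=STOP_WORDS):
    """Return the n most frequent words as (word, count) pairs,
    ignoring case and skipping anything in the stop list."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)


def font_sizes(ranked, smallest=10, largest=60):
    """Scale each count to a font size between smallest and largest.
    Assumption: a plain linear scale relative to the top count;
    the point sizes are hypothetical."""
    if not ranked:
        return {}
    top = ranked[0][1]
    return {word: smallest + (largest - smallest) * count / top
            for word, count in ranked}


if __name__ == "__main__":
    sample = ("The program and the work: the Program is free, "
              "and the work is free software.")
    for word, size in font_sizes(top_words(sample)).items():
        print(f"{word}: {size:.0f}pt")
```

Feeding it the full text of a license instead of the sample sentence gives the ranked list a cloud would be drawn from; the layout itself is a separate problem.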

So here I present the three versions of the Gnu GPL as wordles. They are made from the 100 most used words, filtered for the common and ordinary (‘the’, ‘and’). I have attempted to minimize the extraneous as much as possible, having the words displayed horizontally, (near) alphabetically, in plain black and white.

Wordle of the 100 most used words in the Gnu GPL v.1.

Wordle of the 100 most used words in the Gnu GPL v.2, 1991.

Wordle of 100 most used words, GPL v.3, 2007.

By taking the three versions, I’m treating the GPL historically, as changing over time. The most obvious and startling finding is that the term ‘program’ has dramatically declined in use from version 2 to version 3, changing the whole picture from arrow-shaped to something more cloud-like. (The algorithm for laying out the words is in Viégas et al.) The synonym of ‘program’, ‘Work’, has risen in its place. ‘Free’ has declined proportionally, but in absolute terms the story is quite different: it appears 23 times in v.1, 28 times in v.2, and 20 times in v.3. ‘Freedom’, not found in the graphics above, rises from 3 usages in v.1 to 4 in v.2, and then doubles to 8 in v.3.
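Counts like these are easy to check by hand or by machine. The sketch below shows one way to do it in Python, under my own assumptions: whole-word, case-insensitive matching (so ‘free’ does not also count ‘freedom’ or ‘freely’), and a rate per thousand words as the proportional measure. The function names are hypothetical, and different tokenizing choices may give slightly different figures from the ones quoted above.

```python
import re


def count_term(text, term):
    """Count whole-word, case-insensitive occurrences of term in text."""
    pattern = rf"\b{re.escape(term)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def per_thousand(text, term):
    """Occurrences of term per 1,000 words: a rough proportional measure."""
    total = len(re.findall(r"\b\w+\b", text))
    return 1000 * count_term(text, term) / total if total else 0.0


def compare_versions(versions, terms):
    """Build {term: {label: absolute count}} for each labelled text."""
    return {term: {label: count_term(text, term)
                   for label, text in versions.items()}
            for term in terms}


if __name__ == "__main__":
    # Toy strings standing in for the license texts, which would
    # normally be read from files.
    versions = {
        "a": "Free software means freedom. Free as in free.",
        "b": "Freedom, freedom!",
    }
    print(compare_versions(versions, ["free", "freedom"]))
    print(per_thousand(versions["a"], "free"))
```

Comparing the absolute counts with the per-thousand rates is what separates a word that is genuinely used less from one that is merely diluted by a longer text.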

I could spend all day poring over these things, but I’ve probably spent too long already when I have a dissertation to write. In any case, the purpose has been to suggest ways of reading the Gnu GPL, and I will leave discussion to the convivial atmosphere of the meetings.

NB: The code behind wordle.net is owned by IBM, and closed. A free version that allows adjusting and playing with the code would be most desirable.

Reference: Fernanda B. Viégas, Martin Wattenberg, Jonathan Feinberg, “Participatory Visualization with Wordle,” IEEE Transactions on Visualization and Computer Graphics, vol. 15, no. 6, pp. 1137-1144, Nov./Dec. 2009, doi:10.1109/TVCG.2009.171 Behind a paywall, sadly, but abstract available.
