The first lesson is that linkrot is incredibly rapid. The second lesson is that it thus becomes critically important not just to link but to quote--and to quote extensively. The third lesson is that not even fear, surprise, and ruthless efficiency can defeat linkrot. If you want your links to be worth anything in two, three, or five years, download to your hard disk *all* the pages you're linking to.
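For what it's worth, here is a minimal sketch of what "download all the pages" might look like in Python, using only the standard library. The names `local_name`, `save_copy`, and the `link-archive` directory are made up for illustration, and the sketch saves only the raw page bytes; images, stylesheets, and anything loaded by scripts are not captured.

```python
import hashlib
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_name(url: str) -> str:
    """Build a stable filename from the host plus a short hash of the full URL."""
    # Hypothetical naming scheme: host keeps the file recognizable,
    # the hash keeps two pages on the same host from colliding.
    host = urlparse(url).netloc.replace(":", "_") or "unknown-host"
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:12]
    return f"{host}-{digest}.html"


def save_copy(url: str, archive_dir: str = "link-archive") -> Path:
    """Download one linked page and write its raw bytes under archive_dir."""
    target = Path(archive_dir) / local_name(url)
    target.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url, timeout=30) as response:
        target.write_bytes(response.read())
    return target
```

Calling `save_copy(url)` for every URL at the moment you post gives you something to quote from later, even if the original address has gone to a 404.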
Posted by DeLong at January 13, 2003 09:57 PM
Idle Words: I've been working with some of the many Movable Type weblogs I got this week, seeing how my search code works and scarfing down the content. I purposefully picked weblogs that had been running for years, and left the dates out of the search display. I'd heard people go on and on about the chains and shackles of reverse chronological order, and I thought I'd experiment with just reading things by topic. Well, it doesn't work. I mean, the search itself works -- you search on dog and get back results on dog -- but what doesn't work is the links. By far the majority of weblog posts are short one-liners with a link in them. The next category after that is the tossed salad variety format -- a paragraph full of loosely connected ideas built around pointers to interesting sites. Of course this is the whole point -- we're supposed to be making a reasonable stab at hypertext -- but it turns out the links are terribly brittle. Reading these grizzled posts is like looking through an old scrapbook, where the writing is clear but the pictures have all bleached to white. The further back you go in the past, the fewer working links you can find. 'Permalinks' to other bloggers get broken as people change ISPs, domain names, or software. Links to novelty sites and flavors of the month dry up; links to bubble-era dot coms have gone down with the ship. 'Permanent' links to news sites get retired to a polite 404 every time the software changes.
The irony here is that most of this content still exists. More things get moved around than disappear, and much of what is really gone still lives on in the Internet Archive. But the cost of finding that information skyrockets once a link goes down. Something as simple as a tabbed interface made a difference to thousands of web users because it became easier to open new links. By the same token, any rotted link throws up a wall to the user. Even a custom 404 with a good search box on it, guaranteed to find the content you are after, is no match for a working link. And very often the link is an integral part of the content. Just think of dear old Suck, itself now defunct, where the links were their own commentary. Try reading a few of their back issues from 1998 and see if you can find anything in that link graveyard. The sad part is that these old sites and old posts aren't old by any meaningful standard. The oldest blog entry I've looked at dates from 1998, and the blogger who wrote it is still in his twenties. I have book reports from the fourth grade in a paper bag in my closet, but I can't find a silly Jakob Nielsen parody done two years ago.
We're so caught up in keeping track of who is linking to what just at the moment that we've neglected to think about what is going to remain of the "blogosphere" ten years from now. *Two* years from now, for many sites. The average half-life of a link on an education site is fifty-five months -- less than five years. What do you think the figure is for weblogs? What do you think it will be for trackbacks, or site comments? I keep thinking of the museum up in St. Johnsbury, where they have case after case of stuffed tiny birds, meticulously catalogued, with their feet glued to the branches and their feathers all falling out. And in the corner, a gigantic piebald moose. We need some better way of capturing the web for posterity than just a bunch of screenshot grabs, essential as they are. There's got to be a way to make our links less brittle.
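To put the fifty-five-month half-life in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes links die off by simple exponential decay, which is an assumption of this sketch and not something the figure above guarantees; the function name `surviving_fraction` is likewise just illustrative.

```python
def surviving_fraction(age_months: float, half_life_months: float = 55.0) -> float:
    """Fraction of links expected to still work after age_months.

    Assumes exponential decay: half the remaining links die every
    half_life_months. The 55-month default is the education-site figure
    quoted above; the decay model itself is an assumption.
    """
    return 0.5 ** (age_months / half_life_months)


for years in (2, 5, 10):
    share = surviving_fraction(12 * years)
    print(f"{years:>2} years: about {share:.0%} of links still working")
```

Under that assumption roughly three quarters of links survive two years, a bit under half survive five, and only about a fifth make it to ten. A shorter half-life for weblogs would make those numbers worse still.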