August 27, 2003

Clay Shirky Is Enthusiastic

Clay Shirky is very enthusiastic about world-writable strongly-versioned information systems--"wikis":

Social Software: Every time I show a wiki to someone who has never seen one, I invariably see the same two reactions: "That's pretty cool", followed seconds later by "It'll never work." This second reaction is understandable, as wikis take a radically different attitude towards process than almost any other piece of group software.

Process is an embedded reaction to prior stupidity. When I was CTO of a web design firm, I noticed in staff meetings that we only ever talked about process when we were avoiding talking about people. "We need a process to ensure that the client does not get half-finished design sketches" is code for "Greg f***ed up." The problem, of course, is that much of this process nevertheless gets put in place, meaning that an organization slowly forms around avoiding the dumbest behaviors of its mediocre employees, resulting in layers of gunk that keep its best employees from doing interesting work, because they too have to sign The Form Designed to Keep You From Doing The Stupid Thing That One Guy Did Three Years Ago.

Wikis dispense with all that -- all of it. A wiki in the hands of a healthy community works. A wiki in the hands of an indifferent community fails. The software makes no attempt to add 'process' in order to keep people from doing stupid things. Instead, it provides more flexibility, a crazy amount of flexibility, an intoxicating amount of flexibility, allowing massive amounts of stupidity and intentional damage to be done, at will, by roving and anonymous posters. And it provides rollback.

Bad things happen on wikis. All the time. As historyflow shows (w00t!), pages frequently get deleted outright. But, as historyflow also shows, in a healthy community they also get restored, quickly.

Programmers have known for decades that a version control system covers a multitude of sins, and wikis embrace versioning as the cardinal virtue. With versioning, there's no need to try to prevent bad things from happening, so long as they can be quickly undone. "Detect badness? Get back to the last good version, then start out again from there."

I was recently reminded of this marvelous property when checking out wikitravel.org. I noticed a page for my city, and checking it out, I saw that it was an entirely fake review, making reference to places that do not exist and events that never happened. It looked like an attempt at humor writing (though a fairly lame one), but of course the side effect (or perhaps intentional effect) was to undermine the goal of the site.

Seeing this, I simply deleted the current content and put an "Add content here" stub on the page, then went to Recent Changes to see if there had been other such graffiti entries. The same IP address came up in two other places, both also fake entries, and I deleted them as well.

Looking at the timestamps in recent changes, I saw that our budding satirist had spent an hour and three-quarters working on his trio of masterpieces. They were on the site less than two hours when I came along, and I undid everything he had done in two minutes.

And this, mirabile dictu, is why wikis can have so little protective armor and yet be so resistant to damage. It takes longer to set fire to the building than to put it out, longer to graffiti the wall than to clean it, longer to damage the page than to restore it. If nearly two hours of work spent trying to subtly undermine a site can be erased in minutes, it's a lousy place to hang out if your goal is to get people's goat. Better to go back to posting Microsoft trolls on Slashdot.

The freedom from process is quite remarkable, and is also the hardest thing to explain about why wikis don't just fall apart with the first attack.
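
Shirky's "get back to the last good version" mechanism is simple to sketch: store every revision of a page, never destroy anything, and make rollback nothing more than re-saving an old revision as the newest one. The Python below is a minimal illustration; the class and method names are invented here, not taken from any actual wiki engine.

```python
# Minimal sketch of a wiki-style versioned page store.  Nothing is ever
# destroyed: every edit appends a revision, and "rollback" is just saving
# an old revision as the newest one.

from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Revision:
    text: str
    author: str                           # on a wiki, often just an IP address
    timestamp: datetime = field(default_factory=datetime.utcnow)


class Page:
    def __init__(self, title: str):
        self.title = title
        self.revisions: List[Revision] = []

    def edit(self, text: str, author: str) -> None:
        """Append a new revision; earlier revisions remain available."""
        self.revisions.append(Revision(text, author))

    @property
    def current(self) -> str:
        return self.revisions[-1].text if self.revisions else ""

    def rollback(self, to: int, author: str) -> None:
        """Undo damage by re-saving an earlier revision as the newest one."""
        self.edit(self.revisions[to].text, author)


# The vandal spends hours producing damage; undoing it is one call.
page = Page("MyTown")
page.edit("A real description of the town.", "203.0.113.7")
page.edit("An elaborate fake review.", "198.51.100.9")
page.rollback(to=0, author="203.0.113.7")
assert page.current == "A real description of the town."
```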

Posted by DeLong at August 27, 2003 09:19 AM

Comments

I don't have the impression that the people behind Wiki really know much about library science/information science; thus they might be condemning themselves to reinventing the wheel.

Posted by: Stephen J Fromm on August 27, 2003 09:27 AM

Interesting comment. Albeit, a generalization. Got any specifics or examples?

Posted by: Ross Mayfield on August 27, 2003 11:06 AM

Are you talking about problems of taxonomy and replication, Stephen?

Posted by: nick sweeney on August 27, 2003 03:26 PM

When I read the comment about it taking longer to cause the damage than to undo it, my reaction was that this would only hold until somebody automated the process of doing the damage.

Posted by: Barry on August 27, 2003 04:47 PM

Ross Mayfield wrote, "Interesting comment. Albeit, a generalization. Got any specifics or examples?"

I might have jumped the gun a bit on Wiki in particular. I'll comment on that below. And I'm hardly an expert on library/info science, so my words should be taken with a grain of salt.

I myself am trying to put together a website/knowledge base on politics and public policy:
http://www.truthandpolitics.org/
Before doing so, I read up a little bit on library science.

Librarians and related info science people have been thinking about classifying knowledge for hundreds of years. Many "online" efforts seem to ignore this body of knowledge, as far as I can tell.

One example is the concept of "controlled vocabulary." I saw one nascent site---it might or might not have been a Wiki site, I don't remember---in which it was suggested that posts to the site be marked by a single keyword. There are two problems with that. First, it's pretty rare that a single keyword will suffice. Second, and more relevant to my point here: there was no effort at vocabulary control that I could see.
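
To make "controlled vocabulary" concrete: the idea is just that keywords must come from an agreed, maintained term list, so that "cars", "autos", and "automobiles" don't end up indexed as three unrelated subjects. Something like this toy sketch (the term list and function here are my own invention, purely for illustration):

```python
# Toy illustration of a controlled vocabulary: free-text keywords are mapped
# onto preferred terms, and unknown terms are rejected rather than silently
# becoming new, uncontrolled subjects.  The term list is made up.

CONTROLLED_VOCABULARY = {"automobiles", "taxation", "public policy"}

USE_FOR = {            # "use for" cross-references: variant -> preferred term
    "cars": "automobiles",
    "autos": "automobiles",
    "taxes": "taxation",
}


def normalize_keyword(raw: str) -> str:
    term = raw.strip().lower()
    term = USE_FOR.get(term, term)
    if term not in CONTROLLED_VOCABULARY:
        raise ValueError(f"'{raw}' is not in the controlled vocabulary")
    return term


# "Cars" and "autos" both end up indexed under the same preferred term.
assert normalize_keyword("Cars") == normalize_keyword("autos") == "automobiles"
```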

Re Wiki, I just went to wiki.org. I searched on "thesaurus," "controlled vocabulary," and "vocabulary," and got no hits.

A random, non-Wiki example is topic maps. I'm looking at the page:
http://www.ontopia.net/topicmaps/materials/tao.html
I understand what they're trying to do, but the library science/info science people tried similar things decades ago, and (in terms of what turned out to be practicable) they were largely failures. (Even such a simple scheme as "links and roles", whereby keywords are linked together if they occur together, failed.) Anything that attempts to do detailed semantic modelling is, IMHO, very ambitious and likely to fail. The most successful "semantic" undertakings are different: they use computer-generated semantic networks based on statistical properties of a corpus of texts, which is a different thing.
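
By "computer-generated semantic networks based on statistical properties of a corpus" I mean roughly the following kind of thing: terms get linked not by hand-assigned roles but because they tend to occur in the same documents. A crude sketch (the corpus and threshold are made up):

```python
# Crude sketch of a corpus-driven "semantic network": two terms get linked
# simply because they occur together in enough documents, with no hand-coded
# roles or links.

from collections import Counter
from itertools import combinations

corpus = [
    {"wiki", "versioning", "rollback"},
    {"wiki", "community", "rollback"},
    {"thesaurus", "indexing", "vocabulary"},
    {"thesaurus", "vocabulary", "wiki"},
]

cooccurrence = Counter()
for terms in corpus:
    for pair in combinations(sorted(terms), 2):
        cooccurrence[pair] += 1

# Pairs that co-occur more than once become the edges of the network.
network = {pair for pair, count in cooccurrence.items() if count > 1}
print(network)   # {('rollback', 'wiki'), ('thesaurus', 'vocabulary')}
```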

Of course, I don't want to be a snoot and would be interested in evidence to the contrary...

Posted by: Stephen J Fromm on August 28, 2003 09:43 AM

nick sweeney wrote, "Are you talking about problems of taxonomy and replication, Stephen?"

For example, yes. It's well known that it's difficult to classify/categorize knowledge, objects, etc.

Replication---I'll read that as "consistency"---is a very important issue. For example, Bella Hass Weinberg, a leading library scientist (her specialization is indexing) wrote:
"Roles and links were invented in the early 1960s to overcome the problem of false drops in postcoordinate indexing. Artandi and Hines (1963) criticized these devices in a paper entitled 'Roles and links--or forward to Cutter,' arguing that they served the same purpose as subdivisions in traditional subject heading systems. Studies showed that roles and links were not applied consistently, and their use was abandoned. There have been recent proposals for refining the coding of related-term references in thesauri, to specify the nature of the relationship. Similar problems of consistency are expected to result, as noted by Milstead (1995, p. 94)."
See:
http://www.asis.org/annual-96/ElectronicProceedings/weinberg.html

Note BHW's skepticism regarding the fine-tuning of RT references. I think (my contention only; I haven't asked BHW about this) that a similar skepticism should be applied toward topic maps (see URL in my previous post). On the topic maps page, it's written that:
"Just as topics and occurrences can be grouped according to type (e.g., composer/opera/country and mention/article/commentary, respectively), so too can associations between topics be grouped according to their type. The association type for the relationships mentioned above are written_by, takes_place_in, born_in, is_in (or geographical containment), and influenced_by. As with most other constructs in the topic map standard, association types are themselves defined in terms of topics.

"The ability to do typing of topic associations greatly increases the expressive power of the topic map, making it possible to group together the set of topics that have the same relationship to any given topic. This is of great importance in providing intuitive and user-friendly interfaces for navigating large pools of information."

My read is that this sort of effort is bound (at this time; maybe not decades from now) to fail.
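
To be concrete about what I'm skeptical of: the typed associations described on that page are, at bottom, labeled links between topics, something like the toy sketch below. The association types are the ones quoted above; the specific topics are just my illustrative examples, and whether editors can apply such types consistently at scale is exactly where I expect trouble.

```python
# Rough sketch of typed topic-map associations, using the association types
# quoted above (written_by, takes_place_in, born_in, is_in).  The specific
# topics are illustrative only.

from collections import defaultdict

# (topic, association type, topic) triples
associations = [
    ("Tosca", "written_by", "Puccini"),
    ("Tosca", "takes_place_in", "Rome"),
    ("Puccini", "born_in", "Lucca"),
    ("Rome", "is_in", "Italy"),
    ("Lucca", "is_in", "Italy"),
]

# Typing the associations lets you group every pair that stands in the same
# relationship; that grouping is the "expressive power" the page claims.
by_type = defaultdict(list)
for subject, assoc_type, obj in associations:
    by_type[assoc_type].append((subject, obj))

print(by_type["is_in"])    # [('Rome', 'Italy'), ('Lucca', 'Italy')]
```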

Posted by: Stephen J Fromm on August 28, 2003 09:54 AM

I believe the Information Architecture wiki does show awareness of library science.

http://iawiki.net

Good place to visit if you found the discussion above interesting.

Posted by: Seb on August 29, 2003 04:59 AM

Seb wrote, "I believe the Information Architecture wiki does show awareness of library science."

Not clear. On page
http://www.iawiki.net/ThesaurusRelationships
it claims "In practice there should only be one BroaderTerm? for each term, forming a simple hierarchy. Example: 'Windows 95' would have a BT of 'Microsoft Windows'."
To the best of my knowledge, this is simply wrong; thesauri are *polyhierarchical*. Also, I don't see any reference to the thesaurus standard, which is available online.
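
What I mean by polyhierarchical, in a toy sketch (the terms below are standard textbook examples, chosen only for illustration):

```python
# A strict hierarchy forces each term under exactly one broader term; real
# thesauri are polyhierarchical, so one term can sit under several.
# "Pianos" is a commonly cited example of polyhierarchy.

broader_terms = {
    "Windows 95": ["Microsoft Windows"],          # the IAwiki example: one BT
    "Pianos": ["Keyboard instruments",            # polyhierarchy: the same term
               "Stringed instruments",            # has several broader terms
               "Percussion instruments"],
}

def broader(term):
    """Return all broader terms for a term, not just one."""
    return broader_terms.get(term, [])

print(broader("Pianos"))   # three broader terms, not a simple hierarchy
```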

Posted by: Stephen J Fromm on August 29, 2003 05:34 AM