andrewducker ([personal profile] andrewducker) wrote 2005-01-06 07:53 pm

Categorisation - a few thoughts

Various sites, including bookmarking site del.icio.us, email site Gmail, photo site Flickr, and our very own LiveJournal Photo Hosting, allow users to categorise the items they store. This is generally done through the use of 'tags', allowing us to label photos as "Family", emails as "To-Do" and bookmarks as "Porn" so that we can find what we need when we need it.
Rather than impose arbitrary 'top-down' categories on users, these sites allow us to define our own tags and use the labels that we would naturally use, which makes the filing system we create much simpler to use.

However, there are problems with allowing this. Tags can conflict between users (for instance, one user could use "Mac" to refer to items related to Apple Macintoshes while another uses "Macintosh" and a third uses "Apple") and within a single user's own collection (when their nomenclature changes or they mistype). This can make sharing information difficult, and can even make it hard to find all of the information you've stored yourself.

There are a few obvious solutions to this:
1) Reuse: Help the user to re-use their old tags by offering them a list of previously used tags - this will prevent typos and unintentional changes.
2) Synonyms: Help users to lump tags together by stating that "Mac" and "Macintosh" mean the same, as far as they are concerned. When they look for tags in the same category as "Mac" the search will automatically be broadened to include similar ones.
3) Common categories: Build categories from the most commonly used tags. This returns to the top-down imposition of structure, but builds it from the tags that people actually use. If a tag is used by more than x% of the population then categorise it and assign it a detailed description. For instance, if more than 1% of people are using "Mac" as a tag, then "Apple Macintosh Computer" could be assigned as its detailed description. Users could then choose to use the 'official' tag. Synonyms would also exist, so that "Macintosh" and "Apple" would both link to this single 'anchor'.
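
As a rough illustration, the three ideas above could be sketched in Python along these lines. None of this reflects how del.icio.us, Gmail, Flickr or LiveJournal actually work: the function names, the fuzzy-match cutoff and the data shapes are all assumptions made up for the example.

```python
from collections import Counter
from difflib import get_close_matches


def suggest_tags(typed, previous_tags, limit=3):
    """Reuse: offer previously used tags that resemble what was typed."""
    # Compare case-insensitively, but hand back the user's own spelling.
    lowered = {t.lower(): t for t in previous_tags}
    hits = get_close_matches(typed.lower(), list(lowered), n=limit, cutoff=0.6)
    return [lowered[h] for h in hits]


def expand_with_synonyms(tag, synonym_groups):
    """Synonyms: broaden a search to every tag the user has lumped together."""
    for group in synonym_groups:
        if tag in group:
            return set(group)
    return {tag}  # no synonyms declared, so search for the tag alone


def common_tags(tags_by_user, threshold=0.01):
    """Common categories: tags used by more than `threshold` of all users."""
    users_per_tag = Counter()
    for tags in tags_by_user.values():
        users_per_tag.update(set(tags))  # count each user once per tag
    total_users = len(tags_by_user)
    return {t for t, n in users_per_tag.items() if n / total_users > threshold}
```

A mistyped "Macintsh" would then be steered back to an existing "Macintosh" tag, a search for "Mac" would also pick up items tagged "Macintosh" once the user has grouped the two, and any tag that clears the threshold becomes a candidate for an 'official' description.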

The use of more-defined descriptions would allow multiple meanings for the same tag to exist, so that someone using "Apple" as a tag could be offered the choice of attaching that tag to the definition "Apple Macintosh Computer", "Apple Fruit" or "Apple Music Corporation". The user could obviously also attach it to any other definition or leave it definitionless.
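
One way to picture this is to treat the tag text and the definition it is attached to as separate things. The sketch below is only an assumption about how that might be modelled: the `DEFINITIONS` registry, the `Tagging` class and the matching rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical registry: one tag text can point at several definitions.
DEFINITIONS = {
    "apple": ["Apple Macintosh Computer", "Apple Fruit", "Apple Music Corporation"],
    "mac": ["Apple Macintosh Computer"],
    "macintosh": ["Apple Macintosh Computer"],
}


@dataclass(frozen=True)
class Tagging:
    tag: str                          # the label the user typed
    definition: Optional[str] = None  # None means "definitionless"


def offer_definitions(tag):
    """Definitions a user could be offered when they type this tag."""
    return DEFINITIONS.get(tag.lower(), [])


def same_meaning(a, b):
    """Match on the shared definition if both have one, else on the tag text."""
    if a.definition is not None and b.definition is not None:
        return a.definition == b.definition
    return a.tag.lower() == b.tag.lower()
```

Under this rule, "Apple" attached to "Apple Macintosh Computer" matches "Mac" attached to the same definition, but not "Apple" attached to "Apple Fruit". Two definitionless uses of "Apple" still match each other on the text alone.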

I am, of course, assuming that most people would find utility in using common definitions, as it would allow them to find things that used the same tags, whilst leaving them the freedom to use any tag they like for their own use.

[identity profile] sbisson.livejournal.com 2005-01-06 08:06 pm (UTC)
Taxonomy and ontology are hard... Especially if you want user support...

XML

[identity profile] the-magician.livejournal.com 2005-01-06 11:50 pm (UTC)
This has always been the failing of XML as a replacement for EDI.

EDI has standard tags which are set by an authority (e.g. EDIFACT tags are set by UN teams, and ANSI X12 is set by, unsurprisingly, the ANSI team: the American National Standards Institute, I think).

But there is no equivalent authority for XML (unless you count things like AS2, RosettaNet, Commerce One etc.), and so every trading partner comes up with their own tags for addresses, for describing items, for putting together packs/kits/sets, for identifying currencies and quantities etc.

But people are like ventriloquists' dummies: they don't want to go back in the box :-) They don't like being fenced in, but when they then want to search on Google, they expect webpages to have the tags they would use. That is why some of us are better at internet searches than others: we can guess better what other people may have put on the pages we want to find.

[identity profile] whumpdotcom.livejournal.com 2005-01-07 06:35 am (UTC)
Flickr gives you feedback on tag usage with a "most popular tags" function.

http://www.flickr.com/photos/tags/

[identity profile] channelpenguin.livejournal.com 2005-01-07 08:55 am (UTC)
A large part of the hell that was one of my previous jobs was trying to come up with a perfect, logical and 'correct' (the boss was crazy, don't forget) categorisation system for storing our files. Never mind that he wanted some sort of prefix on every filename that indicated (in the same sort of way) precisely what it was. As I said, he was completely nuts - if you like labels, I'd say borderline autistic and very very paranoid.