Monday, March 14, 2011

Session 5

Social tagging
vs.
Professional cataloging and classification


First, I want to thank Philip, who commented on my post last week and brought up the term “controlled vocabulary” (CV). It made me start thinking about the connection between tags on bookmarking sites and CV, and luckily, the readings in this session helped me a lot in working through that question.


In Tagging Video: Conventions and Strategies of the YouTube Community, the authors use YouTube’s tagging system as an example to show the two main reasons why a social tagging system is a necessity: the fast growth in the number of items, and the difficulty of cataloging or classifying them with an established professional cataloging or classification system. Although the authors only refer to resources like digital videos or moving images, I think the two reasons they give also apply to resources like websites or web pages, such as those in De.li.cious.


According to the instructions from De.li.cious, tags are “one-word descriptors” that can be used to organize bookmarks. They “do not form a hierarchy”, and a user can apply as many of them as they like to a particular bookmark. Basically, they’re keywords that users create or modify by themselves, like what Geisler and Burns call “assigned free-form terms”. And the advantages of these tags, according to De.li.cious, are that they are “driven by personal interests” and “more flexible than fitting information into preconceived categories or folders”.


Speaking of preconceived categories, or professional cataloging and classification terms, the first one that popped into my mind is the Library of Congress Subject Headings (LCSH). As an authoritative and commonly used system, these authorized headings play an important role in bibliographic control in libraries, and help librarians to “collect, organize and disseminate documents” in an effective way. However, it seems that few social network sites or online communities consult LCSH when naming their tags. This could be due to a number of reasons, but it is interesting that some of these self-generated tags are also terms in LCSH.


Therefore, in order to compare the “LCSH-tags” and “non-LCSH-tags” and see if there is any kind of relationship between them, I chose the top 38 tags from the Popular Tag Cloud in De.li.cious.




Among these tags, 23 can be found as Subject Headings in the Library of Congress Online Catalog, and 15 can’t. Several cases can be confusing. For example, the tag “web” is not a Subject Heading because “web” is too broad as a term in LCSH; instead, there are Subject Headings like “web archiving”, “web browsing”, or “WEB (Computer program language)”. “Reference” is not found either, but there is a corresponding Subject Heading, “reference and research services”. Similarly, “development” is not a Subject Heading on its own, but there are Subject Headings like “development and education” and “development associates”. At the same time, even when a tag can also be found in LCSH, the meaning of the same term can be different. Take “mac”: as a tag in De.li.cious it mostly relates to Apple’s Macintosh and its operating system, but as a Subject Heading it is the pseudonym of Mario Medina Correa.



| Also LCSH | Not LCSH | Also LCSH | Not LCSH | Also LCSH |
|---|---|---|---|---|
| design | blog | web2.0 | development | technology |
| tools | video | google | news | travel |
| music | software | inspiration | flash | shopping |
| programming | web design | photography | blogs | books |
| art | reference | food | tips | mac |
| how to | tutorial | css | politics | science |
| javascript | web | education | opensources | games |
| linux | free | business | | |
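The counts from the table above can be double-checked with a short script. This is only a sketch: the two sets are transcribed by hand from the table, `overlap_summary` is a hypothetical helper name, and the set of matched tags stands in for a real lookup against the Library of Congress authority file, which is not queried here.

```python
# Tags transcribed from the table above; the grouping is the post's own result.
also_lcsh = {
    "design", "web2.0", "technology", "tools", "google", "travel",
    "music", "inspiration", "shopping", "programming", "photography",
    "books", "art", "food", "mac", "how to", "css", "science",
    "javascript", "education", "games", "linux", "business",
}
not_lcsh = {
    "blog", "development", "video", "news", "software", "flash",
    "web design", "blogs", "reference", "tips", "tutorial", "politics",
    "web", "opensources", "free",
}


def overlap_summary(tags, authorized):
    """Split tags into those found in an authorized list and those not found.

    `authorized` is assumed to be a plain set of heading strings; a real
    comparison would need a proper LCSH lookup instead.
    """
    matched = sorted(t for t in tags if t in authorized)
    unmatched = sorted(t for t in tags if t not in authorized)
    rate = len(matched) / len(tags) if tags else 0.0
    return matched, unmatched, rate


top_tags = also_lcsh | not_lcsh
matched, unmatched, rate = overlap_summary(top_tags, also_lcsh)
print(len(matched), len(unmatched), f"{rate:.1%}")  # 23 15 60.5%
```

So roughly 60% of the 38 most popular tags also appear as Subject Headings, at least under exact, whole-term matching.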



Interestingly, the overlap is larger than I expected. Given that these 38 tags are the most popular ones in De.li.cious, the relatively high rate could be due to conformity. Users tend to pick common or formal terms when they create a tag because it makes it easier for them to identify and classify both groups and individual items. So there is a good chance these tags can be found in LCSH, because LCSH headings are also conventionally derived from common terms. It is hard to estimate how many of the remaining tags in De.li.cious are also LCSH headings, but we can suppose that there would be fewer among the less popular ones, and very few among the least popular ones, since the less popular tags are likely to be either individualized or to describe newly emerging information that few people know about.


The fact that new information keeps emerging, especially when it comes to websites, is a big reason why free-form tags should exist. If there are terms that none of the professional cataloging and classification systems have come up with yet, these tags can be a great complement for De.li.cious or any other site. Another advantage of free-form tags is that professional cataloging and classification can be hard for general users. It might take users a long time to learn, let alone memorize, all the standard terms, which would probably frustrate them. Moreover, free-form tags personalize tagging and can give a community its own character. According to Geisler and Burns, since tagging systems “enable all members of the community to see the tags that have been previously used to describe content”, this sharing feature can strengthen users’ sense of belonging to the community.


The disadvantage of free-form tags, however, is that they can be non-standard and sometimes duplicated, as in the case above where both “blog” and “blogs” appear in the top 38 popular tags. Which tag to use is the user’s personal choice, but it can cause confusion when searching for information or doing research. So in general, I think the social tagging system and professional cataloging and classification systems can complement each other to create a more effective, easy-to-use, and “fancier” tagging system.
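As a rough illustration of the duplicate-tag problem mentioned above, the sketch below groups the 38 popular tags by a very naive singular form. The function names and the "drop one trailing s" rule are my own assumptions for illustration. This is not how De.li.cious handles tags, and a real system would need proper stemming or a controlled vocabulary to link variants reliably.

```python
from collections import defaultdict

# The 38 popular tags discussed in this post.
top_tags = [
    "design", "blog", "web2.0", "development", "technology", "tools",
    "video", "google", "news", "travel", "music", "software",
    "inspiration", "flash", "shopping", "programming", "web design",
    "photography", "blogs", "books", "art", "reference", "food", "tips",
    "mac", "how to", "tutorial", "css", "politics", "science",
    "javascript", "web", "education", "opensources", "games", "linux",
    "free", "business",
]


def naive_singular(tag):
    """Crude normalization: drop one trailing 's', but leave 'ss' endings alone."""
    if len(tag) > 3 and tag.endswith("s") and not tag.endswith("ss"):
        return tag[:-1]
    return tag


def find_variant_groups(tags):
    """Return only the normalized forms that are shared by more than one tag."""
    groups = defaultdict(list)
    for tag in tags:
        groups[naive_singular(tag.lower())].append(tag)
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


print(find_variant_groups(top_tags))  # {'blog': ['blog', 'blogs']}
```

Even this crude rule catches the “blog”/“blogs” pair, but it also turns “news” into “new” and “politics” into “politic”, which shows why merging free-form tags automatically is harder than it looks.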




References:

http://www.delicious.com/help/faq#tags

http://www.ieee-tcdl.org/Bulletin/v4n1/geisler/geisler.html

http://authorities.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&PAGE=First

http://www.slate.com/id/2179393/fr/rss/

9 comments:

  1. Nan - Interesting comparison and methodology! I was also surprised to see such overlap between the tags on De.li.cious and the LCSH. I agree with you that the informal and formal tags can be complementary. Thanks for sharing!

  2. Very interesting choice to compare LCSH and the website Delicious. I had never considered the use of LCSH as common terms that one would use in a topic search. It would be interesting to take a look at some of the more unusual LCSH and also see if they could be found in Delicious.. like one of the more technical or scientific terms.. hmm..

  3. Terrific post! I really enjoyed reading about your thoughts regarding the use of LCSH and free-form terminology when tagging pages. I was a little surprised to see that you found 23 tags that were LCSH and 15 tags that were unique. In this sense, I also wonder about LCSH tags being applied correctly to content. For example, if I use a controlled vocabulary and receive incorrect results, how much will it deter me from using a specific method of searching? I also enjoyed reading your discussion about how tags can be easier for lay individuals to use, as opposed to LCSH. However, I do wonder about the long-term impact of this type of tagging, as jargon changes constantly over time, and content becomes harder to locate when it is tagged with terms that are no longer used. In this sense, LCSH would be beneficial for long-term searchability. Overall, I really like your idea about using the systems in a complementary fashion, as it would allow for the best of both worlds. However, I do wonder which version would be used if both existed simultaneously. It might just be the peer-constructed tags, as the law of least effort may indicate that people will search for terms that come to mind, as opposed to finding the LCSH term(s).

  4. Your comparison between the del.icio.us tags and LCSH tags was very interesting. The observation you made about users tending to conform to popular tags both excites and concerns me. On one hand, a more consistent system of tags makes it easy for a user to find what they need. Having set categories helps to separate relevant content. On the other hand, by conforming to a set of popular tags, they lose out on the benefits that you mention with free-form tags. It seems to me that for narrowing content down, using conventional tags would be more helpful, but as you go deeper and look for more specific content, that is where free-form tags shine.

  5. I agree with your conclusion. Looking over your table above, I see Tags as descriptions of pages and LCSH as descriptions of sets. So an LCSH more or less could just be set 1, set 2, set 3 ... and as users follow those links they will, or should, find equally relevant pages. The Tags, on the other hand, are more like items or characteristics of the page, and those links may or may not lead to other relevant pages. It is interesting that blogs and blog are both listed as top tags. This leads me to believe that the site is not automatically linking these two tags together. This would double the work of someone using this feature.

  6. I mentioned a tag cloud in the comments to another student's blog, but you've taken the idea much further here--very well done! As a very rough guide, when researchers study the work of professional indexers to see how consistently they apply terms from controlled vocabularies like LCSH, the results usually cluster around 55%. In other words, even professionally trained indexers using a standard controlled vocabulary will agree on the appropriate descriptive term for the same item a little more than half the time. Tags embrace and encourage the natural scatter around the 'ideal' descriptive term (if there is such a thing), and I agree that a merged approach is perhaps the only way to link the fullest range of diverse query expressions with relevant items.

  7. Wow, that's awesome that you went and looked up what keywords were also LCSH. I agree that social tagging can co-exist with traditional cataloging and classification schemes. My attitude is, why not try it?

  8. Caloha, your suggestion is interesting. I'll pay some attention to the unusual technical and scientific terms, but looking at the table, there are some terms like css or javascript that overlap as well.

    mbco, Philip, it is a problem that free-form terms are not as accurate as LCSH headings and can either be confusing or double users' work when searching. I guess we can call it a dilemma because, since there are so many different sites out there, a practical standard tagging system seems just "impractical".

    Dr. Gazan, thanks for the data you provided, and I do agree with your viewpoint of the "merge approach".

  9. Interesting look at CV vs. NL descriptors in this posting, and yes, I agree with your conclusion that the two systems complement each other until we can find 'the one' system that can truly transform our current information retrieval and preservation methods.
