Tuesday, March 25, 2008

Archive.org takes a big step into academia

I am constantly surprised by what is available at archive.org. For me, it's a fantastic junk bin of media, a fair amount of it in the public domain. I have used many bits and pieces of content from archive.org in my own projects.

This announcement could be something important for academia. Zotero and archive.org have announced an alliance to make academic papers, research, and media searchable in ways useful for others' research. This includes, of course, metadata tagging and the like, but also the ability to convert scanned documents into text via server-based OCR. Someone submits scanned documents, the server ingests them, and gives back searchable text. I am not certain about the accuracy of the conversions -- it may be low for damaged documents -- but I would guess there is a way to correct mistakes after the conversion. It is certainly better than nothing.

The plan is ambitious in scope. I just looked at Zotero, and it is quite nice. I have been using delicious bookmarks to tag web content for research, but it appears to me that Zotero may be much better. I use the delicious bookmarks manager extension for Firefox, and Zotero is an extension as well. How convenient! I am installing it right now -- in fact, I have to quit my browser to load it. That is enough for now, then.