Friday, August 01, 2008

Digital repositories

    ". . . the digital collections that libraries, museums and archives create with great effort and expense are not always well-indexed by Web search engines, thus decreasing the potential use and impact of those digital resources. OAIster, a "union catalog of digital resources" developed at the University of Michigan, provides access to over 16 million digital resources by harvesting OAI metadata from over 1000 repositories worldwide. About 45% of this material, the authors determine, is also indexed by Google, leaving the remaining 55% "hidden" in the deep web, unindexed by Web search engines." Hagedorn, Kat, and Joshua Santelli. "Google Still Not Indexing Hidden Web URLs" D-Lib Magazine 14(7/8)(July/August 2008)
No surprise to me. Hidden, probably, because librarians had little say in the design, from the looks of it. I’ve never seen anything more poorly indexed than OSU’s Knowledge Bank. Some items look like they were retrieved from the circular file or storeroom by the secretary and then scanned and cataloged by the lowest-paid, newest hire in the department: sometimes no title page, no date of publication, no thought to subject terms or even the official name of the department. And really, folks, a lot of “senior thesis papers” need to be tossed in a box and stored at their parents’ house, not indexed on the internet where a junior high kid or left-wing blogger can find them.

Here, in my opinion, lies the problem (from an October 2007 presentation). Keep in mind that a "community" is any division or department within the Ohio State University.
    KB Community & Collection Policies

    A Knowledge Bank Community has the right to:
    • decide policy regarding content to be submitted
    • decide who may submit content
    • limit access to content
    • customize interfaces to community content
You can search by author, title, subject, "community," or date. There is no search for "creator" or "publisher," even though that information appears on whatever main page you bring up. In a database by and about OSU, I'd expect more than five entries to come up for the author "Ohio. . .", but that was it. As a subject, however, Ohio State University brings up 11. Adding subdivisions, there are probably hundreds, including Ohio State University--Libraries, and Ohio State University Libraries, and library and libraries. But to actually find documents created, sponsored, or published by, or about, Ohio State University Libraries and its faculty, you'd have to search "community," and sorry, but that's not what comes to mind when I think of a university department. If in desperation you try a general search on the word Ohio, you'll get thousands of results, including the "front matter" and "back matter" of scanned journals with the word Ohio in the title.

If other repositories created with DSpace and our tax money “with great effort and expense” are this poor, why should Google have to rescue them with private money?
