Showing posts with label digital repositories. Show all posts

Wednesday, May 08, 2019

Trying out the aggregator/repository CORE

Periodically I try out new search engines, data collectors and repositories I haven’t used, so today I discovered CORE, https://core.ac.uk/search, which at the moment has 135,500,000 open access publications, documents, blogs, trivia, and thoughts. According to my search for “President Trump,” it has over 725,000 items on our president, even though he’s only been in office two years.

CORE’s mission is to aggregate all open access research outputs from repositories and journals worldwide and make them available to the public. In this way CORE facilitates free unrestricted access to research for all.

The co-investigator on the Euro Crisis in the Press (Japan) concluded even before Trump took office that he was a phenomenon like the world had never seen.

“Brexit has prevailed, the EU is in tatters, and finally Mr. Donald J. Trump has been elected President of the United States of America. Without any possible overstatement, the consequences of his ascent to the US presidency cannot be underestimated. It is a veritable game changer for global politics, an unexpected and glorious triumph for some, an unfathomable disaster for others.”

Checking his Tweets against the SOTU by three (American or British names) authors at Instituto Complutense de Análisis Económico (ICAE)

“State of the Union Addresses (SOUA) by two recent US Presidents, President Obama (2016) and President Trump (2018), and a series of recent of tweets by President Trump, are analysed by means of the data mining technique, sentiment analysis. The intention is to explore the contents and sentiments of the messages contained, the degree to which they differ, and their potential implications for the national mood and state of the economy. President Trump's 2018 SOUA and his sample tweets are identified as being more positive in sentiment than President Obama's 2016 SOUA.”

And from a UK blog, an interesting quote from Hillary Clinton even before he was the Republican candidate.

“It’s clear he doesn’t have a clue what he’s talking about. So we can’t be certain which of these things he would do. But we can be certain that he’s capable of doing any or all of them. Letting ISIS run wild. Launching a nuclear attack. Starting a ground war. These are all distinct possibilities with Donald Trump in charge.” –  Hillary Clinton, Speech in San Diego, CA, June 2, 2016

Yes, if I had time to browse 725,000 bad predictions, slanders, and hysteria, it would be interesting, but we’ve heard it all for 3 years.

Friday, January 02, 2009

Preserving Special Media

If ever a government guide should be digitized and on the web so you could see it, I would think this one should be: "Records management handbook for United States senators and their archival repositories / Karen Dawley Paul ; prepared under the direction of the Secretary of the Senate by the Senate Historical Office. [Washington, D.C.] : U.S. Senate, 2006. Series: S. pub; 109-19." Then you'd know why information has disappeared through theft, deterioration, mishandling, or other oopsies as administrations come and go. Leafing through the copy at Ohio State University I see things that are also of interest to us average folk who increasingly are relying on non-paper to store our information. Say what you will about the way our grandparents did things, I can still read my grandparents' 1890s grocery lists, farm records and book notes, something I can't do for much of my own material from the 1990s. In the above photo (1988), I'm using one of the most advanced systems in the OSU Libraries--none of it works today--not even the curly perm.

But back to the senators. On p. 50 it says senators are supposed to have established guidelines for maintaining permanently valuable electronic records, including e-mail. Now, I don't see in this publication what those guidelines are, only that they are supposed to have them and the senator's staff is supposed to understand them (written in-house?) and archive the paper and e-documents. There are lots of questions on her check list, like are attachments systematically saved, are documents labeled, is scheduling information retained permanently, but I don't see the requirement to do so.

So how do they dredge this stuff up for the special prosecutors 5 years later, if the guidelines are not specific about who, what, when and where? The answer seems to be on p. 1:
    "United States senators personally own and control the records created and maintained within their own offices. Because of the private status of these records, members must personally establish office policies and procedures that will preserve historically valuable documentation."
So it would seem that Senator Obama can withhold from our view anything he wants about discussions with Blago--he's not required to keep anything he doesn't deem historically valuable. He's still a senator until someone else is appointed, president-elect or not.

But back to the rest of us and our special media. According to Ms. Paul:
    More audio and videotapes are lost by accidental erasure than by misuse.

    Fax paper lasts about 5 years.

    Videotape must be re-recorded after 15 years.

    Color photographs need cool, dark storage.

    Audiocassettes need to be rewound every 2 years to prevent "printthrough."

    Use of "fast forward" and reverse speeds can distort tape tension (I think anyone who has borrowed a tape has discovered that).

    Computer tapes used for archival storage should be copied to new tapes every 10 years.

    Computer software has a 3-5 year period of use before becoming obsolete.

    Newsprint should be copied onto bond paper.

    Permanently valuable mail should be copied onto bond paper, or it should be scanned and microfilmed.

    Irradiation can erase magnetic media, expose film and fade color photographs.

    CD-ROM and DVD are not considered suitable for long-term storage of permanent records.

    Digitization is not an alternative for preservation because of technology becoming obsolete.

    Microfilm remains, for now, the preferred long-term preservation medium.
And to think when I was in library school we'd shake our heads over the brittle, "burning" paper in books of the 19th century. Now we've got stuff that won't even last a decade. We're going backwards. And we're throwing the paper stuff out!

Friday, October 31, 2008

Digital Repositories

Digital Repositories or digital suppositories.
Is it just me and my need to rhyme?
To poke fun, or just poke all the time?
Or is it the experience I've had
with these library-wanna-be's bad
that just take a dump and have no class
order or sense up their ass pass
just all half-digested, and plain
with nary a librarian to ease the pain.



Digital Repositories--what is it? Your guess is as good as mine.

A librarian has taken offense at my language. That's a hoot (223:1 liberal to conservative). Probably has also objected to filters on the library's computers that protect children and has decided to be a hypocrite. Anyway, for that hypervigilant liberal, I've found a new word to rhyme with "class," that will convey the passage of undigested matter to the posterior opening of the alimentary canal.

Friday, August 01, 2008

Digital repositories

    ". . . the digital collections that libraries, museums and archives create with great effort and expense are not always well-indexed by Web search engines, thus decreasing the potential use and impact of those digital resources. OAIster, a "union catalog of digital resources" developed at the University of Michigan, provides access to over 16 million digital resources by harvesting OAI metadata from over 1000 repositories worldwide. About 45% of this material, the authors determine, is also indexed by Google, leaving the remaining 55% "hidden" in the deep web, unindexed by Web search engines." Hagedorn, Kat, and Joshua Santelli. "Google Still Not Indexing Hidden Web URLs" D-Lib Magazine 14(7/8)(July/August 2008)
No surprise to me. Hidden probably because librarians had little to say in the design, from the looks of it. I’ve never seen anything more poorly indexed than OSU’s Knowledge Bank. Some items look like they were retrieved from the circular file or store room by the secretary and then scanned and cataloged by the lowest paid, newest hire in the department--sometimes no title page, no date of publication, no thought to subject terms or even the official name of the Department. And really folks, a lot of “senior thesis papers” need to be tossed in a box and stored at their parents’, not indexed on the internet where a junior high kid or left wing blogger can find them.

Here lies the problem (from an October 2007 presentation) in my opinion. Keep in mind that a "community" is any division or department within the Ohio State University.
    KB Community & Collection Policies

    A Knowledge Bank Community has the right to:
    • decide policy regarding content to be submitted
    • decide who may submit content
    • limit access to content
    • customize interfaces to community content
You can search by author, title, subject, "community," or date. There is no search for "creator," or "publisher," even though that information appears in whatever main page you bring up. In a database by and about OSU, I'd expect more than five entries to come up for the author, "Ohio. . .", but that was it. As subject, however, Ohio State University brings up 11. Adding subdivisions, there are probably hundreds, including Ohio State University--Libraries, and Ohio State University Libraries, and library and libraries. But to actually find documents created, sponsored, published or about Ohio State University Libraries and its faculty, you'd have to search "community," and sorry, but that's not what comes to mind when I think of a university department. If in desperation you try a general search on the word Ohio, you'll get thousands, including "front matter," and "back matter," of scanned journals with the word Ohio in the title.

If other repositories created with DSpace with our tax money “with great effort and expense” are this poor, why should Google have to rescue them with private money?

Tuesday, January 22, 2008

4549

DNA = Darn Nuisance Again

It's in our genes. Something in our DNA gurgles forth when we find a problem on the web. Talking with other librarians at the retirees lunch last Friday I realize I'm not the only one who gets sidetracked in the middle of researching something to offer the webmaster or IT staff some suggestions about broken links, links that misdirect, or bad printing advice. It just happened again, although not at a library or church site (where I usually suggest they at least mention the name of the town or city when giving the street address). This was a very lovely letter from Campus Crusade for "Rapid Deployment Kits" providing spiritual resources for our troops. Because my husband doesn't use the computer (and sleeps in longer than I do and I would forget this by the time we see each other), I wanted to print it. We usually consult with each other before straying from our list of parachurch donations.

To print a webpage I first do a print preview, because I hate getting that 3rd or 4th page with one line of advertising on it. But some web pages get around this by printing the pretty stuff (don't know the technical term) on page 1 after you've adjusted your printer to print only one page based on a "print preview." So after it spits it out, you have the colorful heading and no letter. For some reason, my printer (HP Photosmart, 3 in one, don't ever buy one), will then grab 5 or 6 pages if you try to turn that sheet over and print the "real" information, jamming as it goes, requiring a 5 minute hassle when all you wanted to do was donate $10 so a soldier in Iraq or Afghanistan will have a New Testament.

After the paper is unjammed and you have the letter (if you have librarian DNA), you then stop to send the company webmaster/IT staff an e-mail explaining how they could be more helpful to the ordinary reader who isn't 21 years old and gaming in their off time. It takes a while to do this because the comment window is well hidden behind the FAQ which they want you to read first. Since they rarely acknowledge that their own stuff might be unhelpful, printing instructions are rarely included in an FAQ (also printers differ). The easiest thing is to have a "print only this page" or similar option, but that might leave out the advertising, so not all sites offer this. My prontomail e-mail is a mess to print. I have to copy and drop it in a word processing document or it tries all sorts of funny things. No matter what I comment on, they tell me to clear my cache.

I tried to order 2 copies of a book for my husband and his friend from Amazon. I had checked it out of the library for him (he also never goes to a library) and he loved it. Amazon offered me an additional 30% off if I'd get their charge card. I don't really want another charge card, but I thought it might be useful for ordering books. "Just a few seconds" ran into many minutes and gazillions of pages of tiny print I'm sure no one but a librarian would read, so I backed out of that and went back to my one credit card. Five times I tried. Five times it told me I didn't select the name of the card (but I did). So I backed out again, and decided I would just go up the street to Barnes and Noble and talk to a human being and tell them I wanted to order 2 copies of one title for 2 retired architects who want to own the wonderful book on architectural drawings that's available at UAPL.

But my DNA kicked in and I stopped to search for the comment window (not easy to do) and tell them about the problem. After 3 days, I finally got a response, apologizing for nothing, and telling me I hadn't selected the right card name (I only have one card so I think I know which company I use), or my number didn't match their database (why is my number in their database?) or other snarky suggestions. I wrote back that I'd done it all correctly and I was going to a bookstore, but their response was automated and said I couldn't reply to their reply! I will waste no more valuable librarian DNA on Amazon. They can just keep their old books.

When I shared with the retirees group my opinion on the lack of flexibility and awkwardness of "digital repository" software (many libraries use the same program which looks like no librarian ever sat on the selection committee) one retiree told me I should volunteer for the committee at OSU that handles that. A committee? No way! If I wanted to spend my life on a committee I wouldn't have retired.

Rant over. I feel better.