
Monday, 11th October 2004

OCLC Opens Up the Entire WorldCat Database to Web Engines and Other Partners

By Gary Price and Steven M. Cohen
Today, OCLC announced that they're expanding the Open WorldCat program and offering the entire WorldCat database to Google, Yahoo, and other partners to crawl.

Barbara Quint has all of the details in this ITI NewsBreak. It's a must read!

Today's announcement is good news. Potentially, more books will get circulated since more WorldCat items will be available via Google, Yahoo, and other partners. At the same time, let's hope that we can also use this as a marketing tool for all libraries.

Will Google and Yahoo want all of this material? Let's hope so. Fifty million records is a lot of content. The announcement doesn't indicate whether Google and Yahoo have said they will crawl all of the records, or how long it will take to get the material into their databases.

With a commitment to crawl, some of the material can start getting harvested in about a month. Then it will be up to the web engines as to how long it will take to get all of this material into their databases. Remember: As of today, the entire WorldCat database is not indexed in Google or Yahoo.

Here's the official language:
"Harvesting by partners will occur gradually over a period of time." Read the full text of the OCLC Open WorldCat Fact Sheet.

What type of arrangements will OCLC make with Google and Yahoo as to how often the data will be updated and new material harvested? We think new records would have to be entered at least once a week. OCLC could use RSS and provide feeds to Yahoo and Google (or other engines) so new content could be indexed on the fly. Feedster, for example, has been able to create a useful engine by indexing RSS feeds rather than HTML.
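To make the RSS idea concrete, here is a minimal sketch of how an engine might pick up only the records added since its last harvest. It is an illustration under stated assumptions, not anything OCLC has announced: the feed content, the example.org URLs, and the `new_items` helper are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical RSS 2.0 feed of newly added records; titles and URLs are invented.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>New catalog records (hypothetical)</title>
    <link>http://example.org/records</link>
    <description>Illustrative feed only</description>
    <item>
      <title>Record A</title>
      <link>http://example.org/records/1</link>
      <pubDate>Mon, 04 Oct 2004 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Record B</title>
      <link>http://example.org/records/2</link>
      <pubDate>Mon, 11 Oct 2004 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def new_items(feed_xml, last_harvest):
    """Return (title, link) pairs for items published after last_harvest."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # RSS 2.0 dates use the RFC 822 format, which email.utils can parse.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published > last_harvest:
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh


# Assume the engine last harvested on 8 October 2004 (UTC).
last_harvest = datetime(2004, 10, 8, tzinfo=timezone.utc)
print(new_items(SAMPLE_FEED, last_harvest))
```

With a feed like this, the engine only has to fetch and index the items newer than its last visit instead of re-crawling every record page.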

Over the weekend, we reviewed the Open WorldCat Fact Sheet and had a few thoughts. Consider these as constructive criticisms and ideas that might make the program even better.

Open WorldCat is here and growing. What can we do to make it better? Remember, books are for use!
---
+ Participation in Open WorldCat (making your library's holdings visible) is only available if your library buys access to WorldCat on FirstSearch. This makes sense; OCLC is trying to protect itself from libraries taking but not giving back. In other words, participation in Open WorldCat is a member benefit. You can read more about it in this letter from OCLC CEO Jay Jordan to members.

+ The fact sheet talks about the increase in clicks (3.4 million last month) on the Open WorldCat records, which is good news.

+ However, the number of actual visits to holdings info pages is only about 8% of those clicks, or 272,000. We wonder how many of those click-throughs came from librarians doing bibliographic verification. What we would really like to know is the total number of items circulated or even "influenced" after they were first discovered via Open WorldCat. Did people actually get the books? What was the end result? Compare this process to Amazon or, even better, to the ROI when people search local library catalogs. Let's hope the new statistical tools OCLC plans to offer will answer these and other questions.

+ Will local libraries spend less time and money maintaining and upgrading their own catalogs and just tell patrons to go to Google or Yahoo? Do local OPACs still have search value for the typical patron? Is it worth the money to provide access to this service when all of the material is available on the web, accessible via an interface with which the public is familiar?

+ One thing that we would like to see (which would make Open WorldCat records even more useful to the public) is hyperlinked subject headings -- i.e., visible and clickable. This would allow a searcher to get a list of all books with a particular subject heading.

+ OCLC must help improve subject access to material. We ran several subject-type searches (random topics) over the weekend (even including the word "library" in some of them) and got very poor results at both Yahoo and Google. Is one engine better than the other? Yes. It looks like you have a "better" chance with Yahoo. The searches we ran, along with the results, are accessible here.

+ Want to make Open WorldCat better and easier to use? How about OCLC working with Clusty and allowing them to crawl the material? Not only would this be a new outlet (more visibility), but the clustering would help with subject access. We're going to mention this to Clusty's CEO. Clusters could be based on subject headings and other parts of the record. This is just what Vivisimo does with ClusterMed (using MeSH). We think working with Gurunet and others would also be a good idea.

+ What about RLG's RedLightGreen site?
Since RLG also provides access to Google, RLG will be indirectly providing access to OCLC records. Will this cause any issues -- especially in trying to market RedLightGreen?

+ Since OCLC is making all 52 million records available for Google and Yahoo, will they make the entire catalog available on the web at WorldCatLibraries.org? OCLC could make additional user services available, including the ability to create lists, save results, and build a bibliography.

+ The fact sheet makes no mention of what OCLC's own people have said is a problem -- where Open WorldCat records fall on search engine results pages. This is a key issue. Will people find (and use) the records if they're on the second or third page of results?

+ Last week when I talked with a Google spokesperson about the new Google Print program, I asked (BQ did too!) if in addition to providing links to purchase the Google Print item, Google would also provide a direct link to the OCLC record. The response that we got was that they'll have to look into it. Google said the same thing to LJ.

+ OCLC needs to develop a method to show holdings info only for those libraries a user has access to -- unless his or her library will pay for an ILL request.

+ If we need to teach the public to use a tool (bookmarklet) or a specialized interface to access Open WorldCat via Yahoo or Google, couldn't we also show them how easy it is to reach individual OPACs via the web? When we do have a chance to teach users to access library records, should we show them Google/Yahoo (something they're already familiar with) or the local library OPAC web site?

+ How about Yahoo adding a "library" shortcut? They do this for gas price databases and movie times. Actually, both Yahoo and Google recognize ISBNs if they're entered directly into the search box. In addition to providing links to book merchants, couldn't they also provide links to the Open WorldCat record? Ask Jeeves has been very big into creating "ready reference" answers. OCLC should also see if they're interested.

+ Worth noting: With the new Google Print program underway, users in many cases will be offered access to full text books before they even have a chance to view a library record. This is another reason why Open WorldCat records need to be represented in the first few search results.

+ Since Google and Yahoo show only one or two results from any single domain, it's going to be important to teach people how to turn the clustering off (clicking "more results from link...") or to develop tools that do this and then share them with the public. In other words, a library might make numerous books about a topic available, but the user will only see one or two.

+ We found a problem with Open WorldCat records via Yahoo. Nothing major, but...
Example Search: library books about ben franklin.

An Open WorldCat result is available at number 6. A typical searcher will likely not use "library books" in his or her search, but #6 isn't bad. Btw, if you remove the word "library" from the search, the #6 result drops to the 65th result.

OCLC WorldCat has other Ben Franklin books available, but Yahoo only shows one result per domain. So, I click "see more results." Will the typical searcher do this? Not sure, but let's continue.

You're now seeing the same record. Where are the others? Well, you'll need to click one more time on the "repeat the search with the omitted results included" link.
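
The per-domain collapsing described above can be sketched roughly as follows. This is only an illustration of the general idea, assuming a simple "keep the first N results per host" rule; how Yahoo and Google actually implement it is not something we know. The `collapse_by_domain` function and the example URLs are hypothetical.

```python
from urllib.parse import urlparse


def collapse_by_domain(urls, per_domain=1):
    """Keep at most per_domain results from each host, preserving rank order."""
    seen = {}
    shown, omitted = [], []
    for url in urls:
        host = urlparse(url).netloc.lower()
        seen[host] = seen.get(host, 0) + 1
        if seen[host] <= per_domain:
            shown.append(url)
        else:
            # These only appear if the searcher asks for the omitted results.
            omitted.append(url)
    return shown, omitted


# Hypothetical ranked results: three library records from one domain,
# plus one page from another site.
ranked = [
    "http://catalog.example.org/record/1",
    "http://books.example.com/franklin",
    "http://catalog.example.org/record/2",
    "http://catalog.example.org/record/3",
]

shown, omitted = collapse_by_domain(ranked)
print(shown)
print(omitted)
```

In this sketch the searcher sees one library record and one page from the other site, while the two remaining library records stay hidden until the omitted results are requested.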
