Wednesday, 18th October 2006

Cornell Joins Microsoft Book Scanning Project and Other Scanning News And Tools

Let's digitize some books.

That's great, even amazing, in theory (and don't forget that many libraries and archives have their own projects), but you have to wonder how much duplication is going on or will take place. Put another way, how much time is being wasted scanning the same material, both across separate projects (the same book scanned by the OCA and by Google Book Search, or GBS) and within a single project (the same item contributed by several libraries)? But that's business, I guess. Peter Suber of Open Access News also mentions this topic.

Yesterday, Cornell University signed a deal to be part of the Microsoft/OCA program.

Today, Microsoft also announced that it has licensed high-speed scanning technology from Kirtas for its scanning program, which is part of the Open Content Alliance (OCA). The works scanned by Kirtas will become available via Windows Live Book Search starting in early 2007. Cornell librarians will have a hand in choosing which versions of books to scan and in overseeing quality control of the digitization process, according to Cornell.

Schools like the University of California are part of both the OCA and Google Book Search. Last year, the University of California system announced it would be part of the Open Content Alliance (members include Yahoo and Microsoft) digitization program.

Then, a couple of months ago, the UC system announced it was also joining the Google Book Search program.

From Candace Lombardi's article today:

The project, when complete, will make public domain works, as well as copyright material from publishers who opt-in, freely available through Microsoft's online Web application.


What about Yahoo? In fact, their blog was the first place where the OCA was announced, on October 2, 2005:

Yahoo will index the content and is also funding the digitization of an initial corpus of American literature collection that the University of California system is selecting, Adobe and HP are helping with the processing software, University of Toronto and O'Reilly are adding books, Prelinger Archives and the National Archives of the UK are adding movies, etc. We hope to add more institutions and fine tune the principles of working together.


Microsoft announced its involvement with the Open Content Alliance and Microsoft Book Search a few weeks later.

From SEW Blog, October 26, 2005

According to [Danielle] Tiedt, Microsoft has currently committed to fund the scanning of 150K books. In the case of these books (public domain content), Microsoft is making deals on their own with libraries (we don't know which ones) who will provide the content. Then, some (but not all of this material, depending on the library and the actual content) will be available as part of the OCA database. Every library that provides a copy of the book for scanning will also receive a file for local use.


Other organizations and schools that are part of the OCA include:
* European Archive
* Internet Archive
* National Archives (UK)
* O'Reilly Media
* Prelinger Archives
* University of California [now also part of Google Book Search]
* University of Toronto

As we pointed out in this post from earlier this week, a good portion (we can't get actual numbers) of the content in Google Book Search so far comes as limited preview material direct from the publisher. This is very similar to, if not exactly, what Amazon.com offers with Search Inside the Book. An Amazon.com/OCA hook-up would be very powerful. Let's also not forget that access doesn't guarantee retrievability, especially when it comes to a subject search in a massive database coupled with the poor searching skills many have.

So, that's the story. Confusing? Of course. We also don't think the masses understand (though Google has tried hard to explain) the differences between the various types of scans. We hear all the time from people who think that once the project is complete ALL BOOKS (new, old, or in between) will be available from their computer for free. They seem to miss the snippet part of the story. In terms of what Google calls limited view books, Amazon.com is also doing a great job with Search Inside the Book.

Btw, say you're online today and want to look at eBooks. Here are a few places to review:

+ World eBook Fair
Free access all of this month. More than 500,000 full text books all in PDF. Rest of the year, $8.95.

+ International Children's Digital Book Library
Full text books in many languages and a very cool search interface.

+ ebrary
Free remote access to more than 20,000 books. All full text and full image. Pay only to print or copy a page. About 25 cents. No limit on how much you can view.

+ NetLibrary
Available free from many libraries. Full text, no limit on how much you can view. Remote access with a library card.

+ The Online Books Page
More than 25,000 full text books from various sources. All free. All public domain material.

+ The OpenBook Library
Cool technology. Reminds me of the "Turning the Pages" technology used by NLM and the British Library (12 full text books).

+ DigitalBookIndex
Over 128,000 titles. Some fee, some free; about 88,000 are free.

+ eBook Locator

See Also: Let’s Scan: The First Contribution from Univ. of Pittsburgh to Open Content Alliance

See Also: Microsoft to offer book search (10/2005)

See Also: A video of the book scanning robot at the University of Toronto in action.

See Also: An article about U of Toronto Scanning: Building an Online Library, One Volume at a Time (via WSJ, free)

