Monday, 22nd October 2007

NY Times, Book Scanning, and Lots of Resources

OK, let's see if we can try to make this story clear, one step at a time. This post will focus on content. We can save other issues for future posts. With so much scanning going on, it can be very easy to get confused. Bottom Line: Book scanning involves many more projects than the ones that get a lot of the attention. BTW, the NY Times ran another story about digitization projects in March.

The Story: Libraries Shun Deals to Place Books on Web
The focus is on Google Book Search, Live Book Search from Microsoft (when was the last time you visited that service?), and the Open Content Alliance.

(An example of a topical collection from the Open Content Alliance -- Illinois Harvest. Includes books about Chicago, Abraham Lincoln, and many other topics.)

+++ List of Open Content Alliance Contributors
+++ Google Book Search Library Partners

1) In this post, we're talking about digitizing books (both in and out of copyright) that are found in library collections. We're NOT talking about material made available from publishers directly to Google Book Search (Google Book Search Partner Program) and Amazon's Search Inside the Book databases. We have found that this difference can confuse people.

2) Some libraries are working with Google/Microsoft/Open Content Alliance.
In fact, both Cornell and the University of California Libraries have announced they will work with both projects. However, when you look at the number of libraries in the world (and don't forget about archives, museums, etc.), it's really only a small number. Sadly, money (not a major issue in this case) and TIME (a key issue) are likely being spent scanning the same titles multiple times. We could all think of other uses for the dollars going to digitize the same title more than once.

As the article points out, both MSFT and Yahoo are members of the Open Content Alliance. It also discusses the pluses and minuses of each program. Here's how we covered it almost two years ago. Then, as today's article notes:

A year after joining, Microsoft added a restriction that prohibits a book it has digitized from being included in commercial search engines other than Microsoft’s.

3) Book digitization is NOT NEW. It's difficult to believe that the NY Times article makes NO mention of Project Gutenberg, which has been digitizing books for over 36 years. That's right, 36 years! BTW, Project Gutenberg Canada launched a few months ago.

4) Keep in mind that access and organization are two different things here. We also know that search habits (for many) will have people searching for phrases like "Dallas Cowboys" or "London Underground" or "New York City Fire Department." We know that most searchers will not use quotation marks to search the words as a phrase. That means millions and millions of hits. This is an excellent example of what constitutes a good part of the invisible or deep web in 2007. True, Universal Search, Onesearch, 3D search, etc., can help but that's another story.
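To make the quotation-mark point concrete, here is a minimal sketch in Python. It is an illustration only: the three sample "pages" and the helper functions `matches_all_words` and `matches_phrase` are made up for this example, and real search engines rank and match in far more sophisticated ways.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_all_words(query, text):
    """Unquoted search (hypothetical): every query word appears somewhere."""
    words = set(tokenize(text))
    return all(w in words for w in tokenize(query))

def matches_phrase(query, text):
    """Quoted search (hypothetical): query words appear adjacent, in order."""
    q = tokenize(query)
    t = tokenize(text)
    n = len(q)
    return any(t[i:i + n] == q for i in range(len(t) - n + 1))

# Made-up sample pages, standing in for pages of scanned books.
pages = [
    "The Dallas Cowboys opened the season at home.",
    "Cowboys drove cattle north from Dallas after the war.",
    "Dallas grew quickly; rodeo cowboys were a common sight.",
]

unquoted_hits = [p for p in pages if matches_all_words("Dallas Cowboys", p)]
phrase_hits = [p for p in pages if matches_phrase("Dallas Cowboys", p)]

print(len(unquoted_hits), "unquoted hits;", len(phrase_hits), "phrase hits")
```

In this toy set, every page contains both words, but only one contains them as a phrase. Scale that up to millions of scanned pages and it is easy to see how the relevant material gets buried.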

5) Other issues for other ResourceShelf posts include:

A) Book digitization from companies like:
++ ebrary. (ebrary Discover offers more than 20,000 full text books for free. Pay only to copy or print a page.)
++ NetLibrary, available free from many public libraries -- which just passed the 150,000 book milestone
++ Books 24x7
++ Safari Tech Books (O'Reilly and Pearson)

B) Quality of the scanning and how it appears on the web.

C) The issue of whether people really want to read books on a computer screen -- be it a large monitor, an iPhone, or a Treo.

6) Let's review some projects, services, and where to find digitized books:
+ Online Books Page
Thousands and thousands of FREE, full text books from many sources. If you browse the "What's New" page, you'll see links to freely available full text books -- both old and new -- being digitized by organizations like:

+ American Historical Association
+ John F. Kennedy Library
+ The Online Library of Liberty
+ LibraryIreland
+ Rice University Press
+ Internet Sacred Text Archive
+ Doctortee.net
+ University of Virginia Digital Collections
+ Making of America (U of Michigan), Over 12,000 Volumes
+ Illinois Institute of Technology

And these are just the tip of the iceberg.

In other words, many organizations, and LIBRARIES, are digitizing books.

Info pros should know about a variety of sources. Here are a few more:
+ International Children's Digital Library
Both old and new books. Free Access. Fun for all!!!
+ Digital Book Index
130,000 titles listed, over 100,000 free. Also note the list of organizations providing content in the right rail.
+ World Public Library
Over 500,000 titles, searchable, available for a very small yearly fee.
+ Internet Archive--Texts
Comprises several projects and has the same leadership as the Open Content Alliance. Also, many titles are available in several formats, from simple text to HTML to PDF.
+ UK: Full text books and cool technology from the Turning the Pages service at The British Library.
+ UK: British Library books go digital
+ OpenLibrary.org
+ Shakespeare Full Text and Full Image on the Web
Some gorgeous work.

Want More? Projects from Around the Globe? Dave Mattison's British Columbia International Digital Library is the place to begin.
Start browsing here and here. Wow!!!

Publishers Get in the Act: The National Academies Press Offers Thousands of Full Text Books to Search/Read (Unlimited Amount) at No Charge.

See Also: Bradley on Changes at Google Book Search: Google Book Search Improved(?) (via SEL)

See Also: an article about U of Toronto Scanning: Building an Online Library, One Volume at a Time (via WSJ, free)

See Also: 2004 Video of Book Scanning Robot at University of Toronto

