OK, let's try to make this story clear, one step at a time. This post will focus on content. We can save other issues for future posts. With so much scanning going on, it's very easy to get confused. Bottom Line: Book scanning involves many more projects than the ones that get most of the attention. BTW, the NY Times ran another story about digitization projects in March.
(An example of a topical collection from the Open Content Alliance -- Illinois Harvest. Includes books about Chicago, Abraham Lincoln, and many other topics.)
1) In this post, we're talking about digitizing books (both in and out of copyright) that are found in library collections. We're NOT talking about material made available from publishers directly to Google Book Search (Google Book Search Partner Program) and Amazon's Search Inside the Book databases. We have found that this difference can confuse people.
2) Some libraries are working with Google/Microsoft/Open Content Alliance.
In fact, both Cornell and the University of California Libraries have announced they will work with both projects. However, when you look at the number of libraries (and don't forget about archives, museums, etc.) in the world, it's really only a small number. It's sad to see that money (not a major issue, in this case) and TIME (a key issue) are likely being spent scanning the same titles multiple times. We could all think of other uses for the dollars going to digitize the same title more than once.
The article also points out that both MSFT and Yahoo are members of the Open Content Alliance, and it discusses the pluses and minuses of each program. Here's how we covered it almost two years ago. Then, as today's article notes:
A year after joining, Microsoft added a restriction that prohibits a book it has digitized from being included in commercial search engines other than Microsoft’s.
3) Book digitization is NOT NEW. It's difficult to believe that the NY Times article makes NO mention of Project Gutenberg, which has been digitizing books for over 36 years. That's right, 36 years! BTW, Project Gutenberg Canada launched a few months ago.
4) Keep in mind that access and organization are two different things here. We know that many people will search for phrases like "Dallas Cowboys" or "London Underground" or "New York City Fire Department." We also know that most searchers will not use quotation marks to search the words as a phrase. That means millions and millions of hits, with the relevant books buried among them. This is an excellent example of what constitutes a good part of the invisible or deep web in 2007. True, Universal Search, Onesearch, 3D search, etc., can help but that's another story.
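To make the quotation-mark point concrete, here's a rough sketch in Python. The four "pages" are made up for illustration, and the two matching functions are simplified assumptions about how an unquoted search and a quoted phrase search behave. This is not how any particular search engine actually works.

```python
import re

# Hypothetical mini-corpus standing in for pages of digitized books.
PAGES = [
    "The London Underground opened its first line in 1863.",
    "An underground river runs beneath parts of London.",
    "New York City has an extensive subway system.",
    "Jack London wrote about the underground economy of the docks.",
]

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def match_all_words(query, pages):
    """Unquoted search (simplified): every query word appears somewhere on the page."""
    terms = set(tokenize(query))
    return [p for p in pages if terms <= set(tokenize(p))]

def match_phrase(query, pages):
    """Quoted search (simplified): the query words appear adjacent and in order."""
    terms = tokenize(query)
    hits = []
    for p in pages:
        words = tokenize(p)
        last_start = len(words) - len(terms) + 1
        if any(words[i:i + len(terms)] == terms for i in range(last_start)):
            hits.append(p)
    return hits

loose = match_all_words("London Underground", PAGES)
exact = match_phrase("London Underground", PAGES)
print(len(loose), "pages match the words;", len(exact), "match the phrase")
```

Even in this tiny sample, the unquoted search pulls in pages about an underground river and Jack London. Scale that up to millions of scanned pages and you can see the problem.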
5) Other issues for other ResourceShelf posts include:
A) Book digitization from companies like:
++ ebrary. (ebrary Discover offers more than 20,000 full text books for free. Pay only to copy or print a page.)
++ NetLibrary, available free from many public libraries -- which just passed the 150,000 book milestone
++ Books 24x7
++ Safari Tech Books (O'Reilly and Pearson)
B) Quality of the scanning and how it appears on the web.
C) The issue of whether people really want to read books on a computer screen -- be it a large monitor, an iPhone, or a Treo.
6) Let's review some projects, services, and where to find digitized books:
+ Online Books Page
Thousands and thousands of FREE, full text books from many sources. If you browse the "What's New" page, you'll see links to freely available full text books -- both old and new -- being digitized by organizations like:
+ American Historical Association
+ John F. Kennedy Library
+ The Online Library of Liberty
+ LibraryIreland
+ Rice University Press
+ Internet Sacred Text Archive
+ Doctortee.net
+ University of Virginia Digital Collections
+ Making of America (U of Michigan), Over 12,000 Volumes
+ Illinois Institute of Technology
And these are just the tip of the iceberg.
In other words, many organizations, and LIBRARIES, are digitizing books.
Publishers Get in the Act: The National Academies Press Offers Thousands of Full Text Books to Search/Read (Unlimited Amount) at No Charge.