Friday, 9th July 2010
Clay Shirky Asks, "Can the Internet Save the Book?" and a Bit of Recent Book Search History
Access Full Text of Salon.com Article
This is a reprint (with additional material) of an edited transcript of a conversation between Shirky, Barnes & Noble Review Editor-in-Chief James Mustich, and BNR contributor Andrew Keen.
Here is One Exchange:
Andrew Keen: Can we go back to the book? It occurs to me that one of the reasons the traditional book lasted so long--one of the reasons it was Russia [the Russia/Poland references refer to a theory Shirky proposes earlier in the article]--is that the form and the function went very well together, and the book was a great way of tracking talent. Take the birth of the 19th-century novel, which is the classic way of putting together a finished product, which then the Industrial Revolution was able to polish and distribute. So when Poland went, it wasn't so dramatic. But when Russia goes, it's going to be really dramatic. We haven't even seen the beginning in the book revolution, have we?
Clay Shirky: I think we are literally just seeing the beginning now. Just yesterday, Google says "Our negotiating position vis-a-vis the publishers has changed dramatically in the last 30 days." Google has been doing this stuff quietly, one way or another, since 2005--Google Scholar, Google Books, digitization, negotiating digital rights, and so forth. It was because they were essentially going to be the second entrant in a monopolistic environment largely dominated by Amazon. The rise of the iPad and the at least not completely accidental renegotiation of the MacMillan-Amazon relationship at the same time has meant that supply and demand are more nearly balanced now, and that the publishers have greater leverage to use that platform.
That is a two-edged sword, which is to say that the ability to engage in price competition with one another cuts both ways in a digital environment because the marginal cost of distribution is still zero. But I think that the last 60 days are the beginning of the real change.
AK: Ok. But digitalizing a book doesn't change anything about it in grand historical terms, does it?
CS: I think it does, because it puts it into an ecosystem where more people have access to more books. The digitizing of a book adds to searchability, it adds to portability, it adds to...
CS: Search is essentially the current model of information-finding, where the old model was you go to the library and they tell you that you have to know what database you're looking in before you look. That's fine when there are 500 databases, maybe, and someone can help me decide. But when there's an unlimited number of data sources, search becomes the intellectual model of the age. I remember knowing when I'd switched over to thinking digitally when I picked up a copy of "Naked Lunch", and I wanted to find a little passage called "Hauser and O'Brien," about two cops in New York City. I realized, "I can't search for that." I had to remember that it was about three-quarters of the way through the book, and I can kind of vaguely remember it was midway on the right-hand page or something. That experience of not being able to recapture what you've done before is one of the great infelicities of the book world, and I think it's especially frustrating to people with nonfiction when there is a particular point they want to go back to.
The other thing it does, though--which is good or bad, depending on your taste--is it encourages the ability to skip ahead to the parts they want to read. I mean, nonfiction books are going to be transformed, I think, much more dramatically than fiction, precisely because their utility means that people are going to essentially disassemble them mentally even if they're sold as a single package. So to your point about Dickens being assembled in the book after having been created in this disassembled way, we may potentially be seeing something like that on the demand side, which is: I'd like to be able to take this nonfiction book and take it apart again, and preserve or flag the parts I'm going to refer to continually.
1) Tools already exist, and continue to be developed, that help a searcher decide which database(s) to select, or that simply select them outright based on a library's subscriptions and holdings. Call it federated searching, Summon (from Serials Solutions), EBSCO Discovery Service, or even Dialog's DIALINDEX.
2) In terms of non-fiction, publishers must make sure that back-of-book indexes (or their electronic equivalent) are hyperlinked to provide real value. Full text searching will work in many cases, but with other content (technical material, travel guidebooks, textbooks, directories, etc.) the cross-references that bring related terms and concepts together, and that are often found in back-of-book indexes, must be available in an ebook environment. Otherwise, relying ONLY on full text searching can lead to overload, missed material, and wasted time.
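To make the second point concrete, here is a minimal Python sketch of the difference between a literal full text search and an index lookup that follows cross-references. Everything in it is an assumption made up for illustration: the three-page "travel guide", the page numbers, the index entries, and the function names `full_text_search` and `index_lookup` do not come from any real ebook format or product.

```python
# Hypothetical three-page travel guide; page numbers and text are invented.
PAGES = {
    12: "Budget hotels cluster near the central station.",
    47: "Hostels and guesthouses are cheaper outside the old town.",
    90: "Lodging taxes are added to every bill.",
}

# Hypothetical back-of-book index: each entry has its own page locators
# plus "see also" cross-references to related entries.
INDEX = {
    "lodging": {"pages": [90], "see_also": ["hotels", "hostels"]},
    "hotels": {"pages": [12], "see_also": []},
    "hostels": {"pages": [47], "see_also": []},
}


def full_text_search(term):
    """Return sorted page numbers whose text contains the literal term."""
    needle = term.lower()
    return sorted(n for n, text in PAGES.items() if needle in text.lower())


def index_lookup(term):
    """Return sorted page numbers for an entry and every entry it cross-references."""
    seen, pages, stack = set(), set(), [term.lower()]
    while stack:
        entry = stack.pop()
        if entry in seen or entry not in INDEX:
            continue  # skip unknown entries and avoid cross-reference loops
        seen.add(entry)
        pages.update(INDEX[entry]["pages"])
        stack.extend(INDEX[entry]["see_also"])
    return sorted(pages)


print(full_text_search("lodging"))  # only the page that uses the word itself
print(index_lookup("lodging"))      # also the pages reached via "see also"
```

In this toy case the literal search finds only the page that happens to use the word "lodging", while the index lookup also reaches the pages on hotels and hostels. That is the kind of missed material a hyperlinked index is meant to prevent.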
Google Scholar debuted in late 2004 (November 17, 2004, to be precise). Since then, Microsoft attempted something similar, but Microsoft Academic Live was awful. We think their new Microsoft Academic Research from MSR Asia (live since October 2009) is MUCH improved, offers several unique features (vs. Google Scholar), and continues to grow and improve.
Also, the concept of autonomously crawling the open web for "academic" content and adding a great deal of value-added material is much older than Google Scholar. For example, CiteSeer (now CiteSeerX) has been around since 1997. This page (left side) lists some of the value-added tools CiteSeerX provides.
Google Books is really the combination of Google's Library Project (the digitization of library materials, announced in December 2004) and the Google Partner Program for Publishers. If you want to be even more technical about it, Google Print (the pre-library program) was FORMALLY announced at the Frankfurt Book Fair in October 2004. Here's an FAQ about Google Print from that year. HOWEVER, the Google Print project actually began in December 2003.
This post has links to several items as well as a timeline that was prepared for the second anniversary of the project.
Finally, on October 23, 2003, Amazon.com debuted "Search Inside the Book," which allows users (it's still available) to keyword search the full text of thousands of NEW titles. The amount of content that can be viewed online for free is determined by the publisher (just like Google for Publishers). Today, to tell whether a book is available for searching, look for the text "Look Inside the Book" above the book cover on the right side of the book page. Here's an example. It's also the case that some books available for full text searching on Amazon are not available from Google Books, and vice versa.
For example: John Battelle's "The Search" (a book about Google) is NOT available for full text searching via Google Books (they provide a book overview), BUT Battelle's book is AVAILABLE for full text searching from Amazon.com.
Also, Amazon was first to begin providing a variety of statistics about some books. You can see them beginning with "Inside This Book," about halfway down this page.
As we said earlier, the opposite is also true (some books have "full view" previews from Google but are unavailable from Amazon). Plus, Google Books is now providing searchable full text of various archived periodicals. Here's an example from August 11, 1969. Amazon.com does not offer this wealth of material.
See Also: Google Books Chronology