Wednesday, 6th September 2006
Google Launches News Archive Search: Commentary
Last Saturday we posted about the possible discovery of a new Google service of news archives. Kudos to Garett Rogers for another on-target prediction. That post has more, so please be sure to stop by.
Today, Google released Google News Archive, and Chris Sherman offers an overview of it here. He also writes:
News archive results are also returned when you search on Google News or do a general Google web search and your query has relevant historical news results.
A couple of comments:
1) A large portion of the material still costs the end user money. Libraries (of all types) offer this content and more for free. Kudos to Chris for pointing it out and linking to our post. So, in essence, this is a potential revenue stream for Google. As the FAQ points out, Google is not receiving payment for sold articles and (at least for now) there are no ads on results pages. We will have to watch and see how this might develop. Might a subscription service be in the future plans? Thanks to R.M. for the help. Btw, in some cases, even on preview pages for articles you have to purchase, Google ads are visible. So, while the FAQ is completely accurate, more content means more eyeballs, and more eyeballs mean more potential clicks on Google ads.
2) As we noted in our Saturday post, this is not a new idea. Northern Light offered a similar service years ago and, aside from Northern Light's other problems, people were not keen on paying. Here was their title list. Again, more in this post from last Saturday.
3) About two years ago Yahoo released Yahoo Subscriptions. Interesting idea but I don't think it has gotten the notice or use Yahoo hoped for. In fact, some of the Yahoo content providers (Factiva, LexisNexis, Thomson Gale) are also part of the Google program. On Google's side, if you know about the specialized interface, Google News Archive is easier to use.
From Chris Sherman's article:
Search results available for a fee are labeled "pay-per-view" or with a specific price indicated. Google does not host this content; clicking on a link for fee-based content takes you to the content owner or aggregator's web site where you must complete the transaction before gaining access to the content...Google has no plans to become a content aggregator itself, or to even offer a streamlined payment system where you can use your Google account to pay for content, according to Google content partnerships director Jim Gerber. "At this point we are focusing on trying to make the content easily searchable and navigable," he said...
Chris does point out the direct links to library holdings via Google Scholar***, but for now, Google has no plans to build gateways to content through public libraries.
I'll assume that also means special and school libraries.
4) We often hear from people that Google's single box interface is easier to use than what you might find from InfoTrac, Factiva, ProQuest, etc. Perhaps. But these companies have done a great job in creating a single search box for those who want it. We think many people say this because they haven't seen what these companies now offer. Remember that these interfaces also provide the power that some users might want, for example, browsing by journal title and, in essence, creating a virtual newsstand. As we've said many times, users can't use what they don't know about. Sad but true.
Sherman reports that providers are happy about this service. Why? They make money. The question then becomes, are licensed databases from public libraries, academic libraries, and special libraries going to disappear? We're not so sure about the academic and special libraries, but public libraries might be another case. That said, they do bring in lots of revenue for these companies. The problem has been that the providers have done nothing to help libraries promote them. In other words, sell them and say goodbye.
A) Advanced Interface Available
However, no list of sources is online. Google uses NewsBank as a source. NewsBank is an aggregator, not a publisher like The New York Times. How would a typical end user know that an article is in NewsBank? Would they even know what NewsBank is and what it contains?
B) Remember, this is a news search tool. However, due to access to HighBeam, lots of "non-news" sources are listed. Again, since a list of sources and run dates isn't provided, it's hard to plan. Yes, some archived articles are available from Esquire, but how is the typical user supposed to know the range of dates? Could they be missing something?
Btw, source:[foo] works.
C) HighBeam articles. When you access one (see the 2nd and 3rd results here) you get a small citation. However, to read the full text you need to subscribe to the service. Yes, a free one-week trial is available, but first you need to give a credit card and, second, it's YOUR responsibility to cancel the subscription within that week. A subscription costs $29.95 per month or $149.95 per year. Couldn't a free trial be made available and automatically cancelled after the one week? Is it necessary to put the burden on the end user? We think in this day and age, and especially with Google's "pull," taking the burden off the end user to cancel is the proper thing to do.
D) International Audiences
We will have to await reports. Not sure (no documentation) if this material can also be purchased by those outside of the U.S. There is non-English material in the database with more to come.
E) No subject searching available using the controlled vocab many databases provide. Sure, most end users will not use it but it would be useful for advanced searchers.
F) Title Confusion
We searched for the current magazine The Weekly Standard and found materials. However, when we went to limit by date (left rail) we were given dates in the 1850's and 1860's. Same title, different publication.
G) Timeline Search, Cool Idea But Needs Work
KUDOS on this one to Google when it works. Neat idea. Of course, Topix.net also launched something like this a few weeks ago, graphical too! However, Timeline Search has issues.
A timeline search for Google shows nothing about the search company. It begins in the 1890's and ends in the 1930's with articles about Barney Google.
NOTE: We were working on this article during Danny Sullivan's podcast today so we missed it. Sorry D.S. However, we heard that Danny also mentioned this. :-) Btw, Google Timeline Search does better with Yahoo, Microsoft, and even Gigablast.
A timeline search for Baseball shows a couple of articles from the 1890's and then jumps to 1991. Where are articles from the other 100 years? Searching without the timeline turns up plenty. No football (American, we guess) during the 1980's that would make a timeline? What happened to stories about the birth of Monday Night Football in 1970? No timeline mention. In fact, nothing for the 1970's.
Some material from The New York Times is part of the Google archive. However, it's not all there. We were unable to find any documentation. As we said, many libraries offer for free the full text and full image of the NYT back to 1851. It's also available (for a fee) from the NY Times web site. Google News Archive has zero articles about Monday Night Football from The Times 1/1/1969-12/31/1970 but the New York Times Historic Archive (again free from many libraries) lists eleven. Again, without a catalog of what is and is not available and the archived dates available for each publication, it makes for many issues, missed articles, and wasted time.
Finally, New York Times articles available via the Google News Archive do not contain graphics, charts, etc. In other words, they are TEXT ONLY. The complete NY Times Archive offers the full image of each page with this material included. Advertisements are even searchable and retrievable. VERY USEFUL. Other papers that some libraries offer for free in this format include the Washington Post and the Wall St. Journal. Check out this list for more. Don't forget that Cold North Wind is doing work to digitize Canadian papers.
Yes, it has issues, but FindArticles, around for years, is still available and allows the user to narrow by source. Remember, some of FindArticles' content is fee-based (via HighBeam) but they do have a lot of content for free. You can limit your search to free content.
NOTE: I did a few quick searches and found articles that Google is selling (in one form or another) available for free from FindArticles.com. Here are three examples.
+ Apple's 'Hypercard': promise equals hype. (Software Review)
Via Google, you need to subscribe to HighBeam ||| Via Find Articles (full text, free)
++ New Light on Weimar. (Weimar, Germany)
Via Google, you need to subscribe to HighBeam to access full text ||| Via Find Articles (full text, free)
+++ Petersen plots Ford's future - Donald Petersen, Ford Motor
fee-based via Google, HighBeam ||| Via Find Articles (full text, free)
Bottom Line: Libraries left out of the picture and end users being charged for what they could get for free with just a bit of knowledge. Plus, other places to look online for free.
Like all things Google, only time will tell. Has Google Base or Google Co-op become household words? What about Orkut? Even with Google's massive, and I mean massive, reach, MySpace is the dominant player. Let's check back in a few months. I'm sure publishers will make some $$$ from this, but the question is: will it be enough, and will they continue to care about libraries and the library marketplace? Of course, we will note again what Marissa Mayer told BusinessWeek:
Marissa Mayer estimated that up to 60% to 80% of Google's products may eventually crash and burn.
"We anticipate that we're going to throw out a lot of products," says Mayer. "But [people] will remember the ones that really matter and the ones that have a lot of user potential."
Said another way, don't put all of your eggs in one basket.
POSTSCRIPT 1: To Librarians
If we're paying for these databases, we must do a better job of making them noticeable. Don't take my word for it. Noted Forbes tech columnist Steve Manes loves what libraries offer database-wise, but he notes that it's often hard to find out about and discover.
POSTSCRIPT 2: Same Article For Sale, Many Databases
Take a look at this 2005 article from our friends at Computers in Libraries. Yes, it's the same article from many vendors. Let's review.
1) Buy it as part of the $19.95/month HighBeam subscription service.
2) Purchase from Alacra for $9.95 in a pay-per-view model.
3) Purchase from Goliath for $9.95 in a pay-per-view model.
4) We perhaps can get an idea of archive dates when we see that a 2001 article from CIL is available from only one source.
POSTSCRIPT 3: EBSCO and AccessMyLibrary Reviewed in TV Guide Style
If the user happens across an article via EBSCO or the AccessMyLibrary program, their library participates, and they have a library card, they can access some material for free. The user will get to a page that directs them to find the article via a library. I searched for a few libraries and came up empty (EBSCO). Jeer.
Jeer: simply changing your Zip Code on AccessMyLibrary takes a couple of clicks. When I entered my specific Zip Code I got a list of 86 libraries. Confusing. And for more confusion, my local public library (that subscribes to Gale products) was not listed. When I changed Zip Codes (not the easiest link to find, jeer) and entered a Zip for midtown Manhattan, the NYC Public was not listed. The same for a downtown Chicago Zip: no listing for the Chicago Public Library. This could be confusing for the end user. Btw, each AccessMyLibrary article is also available for a fee. Cheer: you can browse titles; however, no archive information is provided. Jeer: when compared to what a user could access directly from a Thomson Gale database via a library, the title list is rather small. Btw, directory pages contain ads, often to subscribe to magazines. Full text articles also contain ads.
Cheer and kudos to TG for placing a link to a library's set of TG licensed databases. We realize it's business, but offering a list of all databases at that specific library would be a plus for the entire industry. Also, we couldn't find any info on how often the database is updated (it seems frequent). However, we don't know how often those updates are sent to Google News Archive.
AccessMyLibrary is not new. It went live last June. You can also access some TG print publications. Finally, the AccessMyLibrary "Browse Pages" contain ads to subscribe to magazines.
A jeer. A results page containing AccessMyLibrary articles makes NO mention of the possibility of getting the article for free. However, the price of the article ($6.95/US) if purchased from the AccessMyLibrary program is listed. Why no mention of the potential of free access?
Finally, confusion (jeer). Going directly to AccessMyLibrary we found an archived press release from Business Wire. The press release is titled, "Zacks Sell List Highlights: Andrx Corporation, H&R Block, Inc., Ceridian Corporation and H.J. Heinz Company" and was published on December 30, 2004. However, it seems that some AccessMyLibrary material CANNOT be found through Google archive search. You'll see that this press release is for sale for $9.95. It seems some Business Wire content from 2004 is available via AML while other content is not. Our guess is that the full contents of AccessMyLibrary are NOT in the Google News Archive database. More confusion for end users.
POSTSCRIPT 4: One news digitization company we like a lot is NewspaperArchive.com. Currently, they offer more than 45 million digitized pages. They do good work and some of their material is in Google News Archive. Two quick points.
1) Did you know that they give their ENTIRE database AWAY for FREE to K-12 schools and public libraries? See this post.
2) They also offer numerous freebies that contain full text and full image articles. Again, searchable and free. We've listed many of these specialty archives on ResourceShelf but to save time, see this list. Impressive.
Everyone Likes Free Stuff
Let's also note that some articles from Google News Archive via NewspaperArchive are available in both places. One free, one fee-based. For example, see this search for Wilt Chamberlain.
+ Google (first article from Sheboygan Press)
+ College Basketball Archive from NewspaperArchive.com
Same one. Click. It's free in PDF.
*** Sidebar: Google Scholar and Libraries
We did some random searches for libraries participating in the Google Scholar Library Links program. Yes, MANY are listed and just a click away. However, we were a bit surprised that several large schools and many well-known liberal arts universities and community colleges were not part of the program. Interesting, since this program has received massive amounts of attention. You would have thought just about every library would be listed and active.
+ Baruch College, CUNY (Not Listed)
+ Butler University (Not Listed)
+ Beloit College (Not Listed)
+ Boston University (Not Listed)
+ Cambridge University (Not Listed)
+ Chaminade University (Not Listed)
+ College of DuPage (Not Listed)
+ Emerson College (Not Listed)
+ Fullerton College (Not Listed)
+ Georgetown (Available)
+ Grambling (Not Listed)
+ Grinnell College (Not Listed)
+ Illinois Institute of Technology (Not Listed)
+ London School of Economics (Not Listed)
+ Loyola University, Chicago (Not Listed)
+ McGill University (Not Listed)
+ Macalester College (Available)
+ Maricopa County Community College (Not Listed)
+ Miami of Ohio (Not Listed)
+ New School University (Not Listed)
+ Northern Virginia Community College (Not Listed)
+ Oakton Community College (Not Listed)
+ Oregon State (Not Listed)
+ Oxford University (Available)
+ Philadelphia University (Not Listed)
+ Princeton (Listed but not activated)
+ Rheinisch-Westfälische Technische Hochschule (Not Listed)
+ Salem State College (Not Listed)
+ San Jose State (Available)
+ Saddleback Community College (Not Listed)
+ U.S. Air Force Academy (Not Listed)
+ U.S. Naval Academy (Available, but not activated)
+ UCLA (Not Listed under UCLA; listed as University of California, Los Angeles, with no cross reference)
+ University of Alaska (Not Listed)
+ University of Chicago (Not Listed)
+ University of Delaware (Not Listed)
+ University of Kentucky (Not Listed)
+ University of Texas, Austin (Listed)
+ University of Toronto (Available)
+ University of Washington (Not Listed)
+ University of Wyoming (Not Listed)
+ College of William and Mary (Listed, but not activated)
+ Yale University (Listed but not activated)
+ York University (Not Listed)