
Friday, 21st May 2010

A Look Around the New Home of the Internet Archive & A Few Comments from Brewster

Rob Pegoraro from the Washington Post headed out to San Francisco and visited Brewster Kahle and his team at the Internet Archive (IA). It's a very interesting article and an excellent primer for anyone interested in one of our favorite tools, one that goes well beyond The Wayback Machine. That said, Wayback is one of the top two or three essential tools for Internet researchers. As time moves on, it will become even more important.

In terms of physical location, the Internet Archive is now located in the former home of a Christian Science church. They moved to this location from the Presidio of San Francisco last fall. The article also points out that some Internet Archive scanning takes place in a former Christian Science reading room. But this is not the only place where books and other materials are scanned. If you take a look at the various collections of books in the IA, you can see collections digitized at other places. For example, here is the U. of Toronto page.

We were thrilled to see that Pegoraro mentions The Open Library project, which is an Internet Archive "initiative." They just relaunched their database, and we posted an extended item about what the enhanced database can offer users. It's one of the cooler searchable databases we've seen, and it's also great to see that The Open Library is working with LibraryThing and Goodreads.

Here's one quote from the article. Pegoraro asks Kahle about data formats that would work well for long-term storage.

I [Pegoraro] wrapped up our interview by asking Kahle for his preferred file formats for long-term storage, since I get that kind of question fairly often from readers. He said the archive uses FLAC (Free Lossless Audio Compression) for music, had adopted H.264 for video storage after trying five other formats, used JPEG for photos and employed a related format, JPEG 2000, for text-heavy images. But he also said that for personal storage, PDF or nearly universally supported commercial formats -- even Microsoft Office -- would be fine, too.

Recognizing that newspaper articles can't go on forever (and have to pass by several editors), the only thing we would have loved to have seen is a mention of Archive-It.

If you're unaware of the service, here's a brief overview.

Archive-It is a fee-based service that many non-profits, schools (K-12 and higher ed), libraries, archives, and others use to archive their own websites or collections based on topics of interest to that organization.

As of today, Archive-It has more than 1,000 public collections that you can access and search. Plus, unlike The Wayback Machine, you can use keywords when you search an archived collection. For example, this page lists all of the public collections. Near the top you'll see that the complete ACLU web site is archived here.

Users of Archive-It can archive an entire site or just a set of pages. Here's a collection from Stanford's Humanities Lab. As you'll see (look for the URLs), they're archiving web sites or web pages that deal with video games, with a focus on single-player games. Btw, the National Institutes of Health and their Daily Web Snapshot of what appears to be top-level pages is an example of a daily crawl.

Overall, an excellent read that would be a great resource to share with others, especially those of you who teach web search and discuss The Wayback Machine and the archiving of web content.

Source: The Washington Post

Access the Complete Article by Rob Pegoraro

Note from Gary: I was able to visit IA HQ the week they moved in to this new location. Mucho cool and I can only imagine the move-in is complete and more cool things are going on.

