
Friday, 21st May 2010

A Look Around the New Home of the Internet Archive & A Few Comments from Brewster

Rob Pegoraro of the Washington Post headed out to San Francisco and visited Brewster Kahle and his team at the Internet Archive (IA). It's a very interesting article and an excellent primer for anyone interested in one of our favorite resources, one that goes well beyond The Wayback Machine. That said, Wayback is one of the top two or three essential tools for Internet researchers, and it will become even more important as time moves on.

In terms of physical location, the Internet Archive is now housed in a former Christian Science church. They moved to this location from the Presidio of San Francisco last fall. The article also points out that some Internet Archive scanning takes place in a former Christian Science reading room. But this is not the only place where books and other materials are scanned. If you take a look at the various collections of books in the IA, you can see collections digitized at other places. For example, here is the U. of Toronto page.

We were thrilled to see that Pegoraro mentions The Open Library project, which is an Internet Archive "initiative." They just relaunched their database, and we posted an extended item about what the enhanced database can offer users. It's one of the cooler searchable databases we've seen. Also very cool: The Open Library is doing work with LibraryThing and Goodreads, which is great to see.

Here's one quote from the article. Pegoraro asks Kahle about data formats that would work well for long-term storage.

I [Pegoraro] wrapped up our interview by asking Kahle for his preferred file formats for long-term storage, since I get that kind of question fairly often from readers. He said the archive uses FLAC (Free Lossless Audio Compression) for music, had adopted H.264 for video storage after trying five other formats, used JPEG for photos and employed a related format, JPEG 2000, for text-heavy images. But he also said that for personal storage, PDF or nearly universally supported commercial formats -- even Microsoft Office -- would be fine, too.

With the realization that newspaper articles can't go on forever (and have to pass by several editors), the one thing we would have loved to see mentioned is Archive-It.

If you're unaware of the service, here's a brief overview.

Archive-It is a fee-based service that many non-profits, schools (K-12 and higher ed), libraries, archives, and others use to archive their own websites or collections based on topics of interest to that organization.

As of today, Archive-It has more than 1,000 public collections that you can access and search. Plus, unlike The Wayback Machine, you can use keywords when you search an archived collection. For example, this page lists all of the public collections. Near the top you'll see that the complete ACLU web site is archived here.

Users of Archive-It can archive an entire site or just a set of pages. Here's a collection from Stanford's Humanities Lab. As you'll see (look for the URLs), they're archiving web sites or web pages that deal with video games, with a focus on single-player games. Btw, the National Institutes of Health and their Daily Web Snapshot of what appear to be top-level pages is an example of a daily crawl.

Overall, an excellent read and a great resource to share with others, especially for those of you who teach web search and discuss The Wayback Machine and the archiving of web content.

Source: The Washington Post

Access the Complete Article by Rob Pegoraro

Note from Gary: I was able to visit IA HQ the week they moved into this new location. Mucho cool, and I can only imagine that the move-in is now complete and more cool things are going on.


