Friday, 29th September 2006
A Selected List of Web Page Preservation and Archiving Projects
Before we begin, please note that this is far from a comprehensive list. It's just a beginning. Many large web archiving projects (in many languages) are coming online all the time. Plus, others already exist that we did not mention in this first go-around. In other words, more to come.
European Archive Foundation Launches Free Digital Library
The European Archive Foundation said Thursday it has launched its massive digital library of free music and film. The nonprofit organization collaborates with national libraries and other organizations to make non-copyrighted, or free-use material available to the public.
Direct to the European archive
At launch, contents include:
+ 22 British Government Public Information Films
+ Recordings (limited accessibility by region)
+ Web Pages and Sites
++ European Constitution Web Archive
++ UKGOV Weekly Web archive
Weekly collection of 11 UK government websites
Thanks to Peter Suber from Open Access News (essential reading) for the news tip.
Here's a List of Some Other Web Archiving Projects
Remember, more to come.
+ Don't forget: the Internet Archive is full of music, film, text, and numerous special collections, along with the essential Wayback Machine. Some of the special web collections include:
+ Hurricanes Katrina and Rita
+ Web Pioneers
+ Here are a few of the collections built so far using Archive-It technology from the Internet Archive. Learn about each of these archives and find links to many more on this page.
A collection of websites of anarchist organizations (groups, networks) around the world.
+ Canadian Labour Unions
+ Canadian Political Parties And Political Interest Groups
+ Canadian Political Interest Groups
+ Islamic Middle East
+ Latin American Government Documents Archive, LAGDA
The Latin American Government Documents Archive (LAGDA) seeks to preserve and facilitate access to a wide range of ministerial and presidential documents from 18 Latin American and Caribbean countries.
+ Archive Of Political Parties And Elections In Latin America
+ North Carolina State Government Web Site Archive
+ South Dakota, Legislative Research Council
+ Archive Of Venezuelan Political Discourse
+ Virginia State Government, Judicial Branch, Collection
+ Indiana University Web Sites
+ University Of Southern California Website Archive
+ University Of Toronto Web Archives
+ 2004 Presidential Term Web Harvest
Note: Keyword searchable using Nutch software.
The 2004 Presidential Term Web Harvest is a National Archives and Records Administration (NARA) project that produced a collection of federal web sites copied, or harvested, from the world wide web between 10/14/04 and 11/19/04. The Heritrix web harvester and a list of 982 active and unrestricted second level URLs were used to capture all linked federal sites down to the fourth level. Those initial 982 ".gov" and ".mil" URLs were provided by U.S. General Services Administration's (GSA) ".GOV" Internet Domain Registry and the Defense Information Systems Agency (DOD/DISA)... The harvest collection contains approximately 6.5 terabytes of information, roughly 75 million web pages and represents about 50,000 ".gov" and ".mil" unrestricted federal web sites active between 10/14/04 and 11/19/04.
+ MINERVA (Mapping the Internet Electronic Resources Virtual Archive) (via LC)
Web Archives Available:
+ 107th Congress
December 12, 2002
+ Election 2002
Jul. 1, 2002 - Nov. 30, 2002
+ September 11, 2001
Sep. 11, 2001 - Dec. 1, 2001
Aug 1, 2000 - Jan 21, 2001
+ White House Web Site "Snap Shots" (via Clinton Library)
1994, 1999, 2000, Late 2000-2001 (Final Days), White House Virtual Library (1993-Mid Jan 2001)
+ Australia, Pandora Archive (via NLA and Partners)
Australia's Web Archive
PANDORA, Australia's Web Archive is a growing collection of copies of Australian online publications, established initially by the National Library of Australia in 1996, and now built in collaboration with nine other Australian libraries and other cultural collecting organisations.
+ United Kingdom Web Archiving Consortium
Despite our apparent dependence on this medium, very little attention has been paid to the long-term preservation of websites. Indeed, with the life of an average website estimated to be around 44 days (about the same lifespan as a housefly) there is a danger that invaluable scholarly, cultural, and scientific resources will be lost to future generations. To address this problem, a consortium of six leading UK institutions is working collaboratively on a project to develop a test-bed for selective archiving of UK websites. View the project timeline here.
+ Books: Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web (via George Mason University, Center for History and New Media)
A book that provides a plainspoken and thorough introduction to the web for historians who wish to produce online work, or to build upon and improve the projects they have already started.
See Also: David Mattison's British Columbia Digital Library