
Wednesday, 28th April 2010

More Facts About the Library of Congress / Twitter Archive; Includes Text of LC/Twitter Agreement

UPDATE (8/30/2010): We've learned from the Library of Congress that they're still deciding what credentials, beyond a reader ID card, a person would need to access the archive. Also, as we said in earlier posts, some curated portions of the Twitter archive will likely be accessible on the Internet, but it still needs to be decided whether the Twitter Archive as a whole will be accessible on the web.

BOTTOM LINE: All of the decisions that need to be made are still up in the air.

A couple of weeks ago a massive amount of attention was given to the fact that a copy of the Twitter Archive (back to day one in 2006) along with rolling updates would be a gift to the Library of Congress from Twitter. In fact, Twitter approached LC about the project.

It was also one of the only times that we can think of in the Google "age" that another organization, in this case the Library of Congress, released a similar service on the same day and received more attention than Google.

Earlier that same day, Google announced that "Google Replay" was live on the web and that, while very small at launch, the entire archive would eventually be searchable back to day one in 2006.

What we thought was perhaps most interesting is that very few news organizations of any kind made it clear that the LC Twitter Archive would NOT BE ACCESSIBLE to the general public, either online or at the Library of Congress itself, while the Google version of the archive was already online and would be accessible to anyone with a web connection.

In a ResourceShelf post about both the LC Twitter and the Google Twitter Archives from April 19th, we used multiple sources including a conversation with an LC spokesperson to do our best to provide a complete and factual overview (at that time) of both services.

TODAY, in a second blog post on the Library of Congress blog, Matt Raymond provides more information in the first of a "starter list" of FAQs, which also includes the complete text of the LC/Twitter agreement. 1) As we said in our post on the 19th, LC is getting the archive from Twitter as a gift.

Here's the text of the agreement (PDF; 1 page) along with Twitter's current (8 pages; PDF) and their previous terms of service (4 pages; PDF) that were also included as addenda.

The FAQ continues with why it's important to archive and preserve Twitter. As info pros and many others know, LC collects a wide range of materials. Matt writes:

Individually tweets might seem insignificant, but viewed in the aggregate, they can be a resource for future generations to understand life in the 21st century.

That's true. However, in the original news release from LC, a link to a tweet by President Obama the morning after his victory speech in Chicago is provided. In our view, it's a single tweet that had significance as it was posted and continues to have significance today and will likely have it years from now. A single tweet or a small group of tweets that seem insignificant today might have great significance in the future be it 20, 50, or 100 years from now. Of course, the opposite may also be true.

Next, we read that deleted tweets, private account info, and links to pictures and websites will not be archived. So, if one person or many people tweet a link to an important government report, the info (a link) on where to find it will not be accessible from the archive. That's a new fact to us and seems a bit strange. Links are central to what Twitter can do. We also learn that LC does not plan to "collect" the linked sites.

Of course, the Internet Archive, Archive-It, and other projects including those from Harvard U. and the University of California are collecting and archiving sites.

Finally, there is the six-month window between the time a tweet is tweeted and the time it has the potential to reach the database. We don't know how often the Twitter archive will be updated (daily, monthly, quarterly?). We mentioned some of this in our April 19th report.

The FAQ concludes with some ideas about how LC wants to use this archive as a tool to learn more about digital preservation, as a case study for developing processes for usage and tools for researcher access (read about the Stanford group in the April 19th post), "as well as from the Library’s ongoing experience with serving collections and protecting privacy and rights."

LC will NOT try to reproduce Twitter's functionality (we're guessing that has more to do with retrieval of material than the actual posting of tweets). Two archives already online that might serve as examples of what LC could do with the Twitter material are the National Elections Web Archive and the Supreme Court Nominations Web Archive. LC will make an announcement when the archive is ready for researcher use.

Perhaps the bottom line is that a lot of decisions at LC have to be made. We said that two weeks ago. Patience! It's also why they are calling this a starter list of FAQs.

However, issues remain that we were surprised not to have seen mentioned in this first FAQ since they were mentioned in the press.

A) This page lists many web archives that are being developed or have been completed by LC. This page lists only the completed archives.

For example, the last one on this page is to an archive of about 800 sites from and about the 2000 election.

The collection is browsable, searchable and was commissioned by LC from The Internet Archive. In other words, the Library of Congress did not do the actual crawling and archiving.

The 2000 Election Archive is accessible online and to the public.

We have been told that no part of the Twitter Archive would be web accessible and available to the general public. It would only be available to qualified researchers (a definition that is TBD) at the Library of Congress. No access would be available outside of the LC buildings. In our view, this example confuses the situation.

C) To be clear, we weren't able to find a mention of who will and who will not have access to LC's Twitter Archive.

D) Will LC make any of this content accessible to the public perhaps through a display at one of the LC buildings in D.C.?

E) Who will be doing the actual archiving? We know that it's very likely LC will be handling the preservation work but will LC have the technology in place to capture each tweet? What we think is most likely to happen is Twitter will be sending files to LC (on a predetermined basis) via the Internet.

That's about it. If/when we think of more questions, we'll add them.

We want to be clear. We are beyond thrilled that LC is part of the project. As they point out in the FAQ, if nothing else this will be a tremendous learning experience for future digital archiving projects with the scope and notoriety that Twitter provides. Also, kudos to Twitter for thinking of LC and offering the archive as a gift.

Of course, there is one question only Microsoft and Twitter can answer: is Bing also going to offer a Twitter archive?

For more about the Twitter archive from Google (aka Google Replay), see the second part of this post. You can also learn by doing. Start here and replace the search term, MoMA (Museum of Modern Art), with your term(s). To focus your results, simply move the timeline and arrows to a desired date and time (down to the minute). Here's an example: tweets with the company name Netflix in them that were tweeted on April 4th, from 11:45-11:53am.

Sources: LC Blog, ResourceShelf
