More Facts About the Library of Congress / Twitter Archive; Includes Text of LC/Twitter Agreement
UPDATE (8/30/2010): We've learned from the Library of Congress that they're still deciding what credentials, beyond a reader ID card, a person would need to access the archive. Also, as we said in earlier posts, some curated portions of the Twitter Archive will likely be accessible on the Internet, but it still needs to be decided whether the Twitter Archive as a whole will be accessible on the web.
BOTTOM LINE: All of the decisions that need to be made are still up in the air.
A couple of weeks ago a massive amount of attention was given to the fact that a copy of the Twitter Archive (back to day one in 2006) along with rolling updates would be a gift to the Library of Congress from Twitter. In fact, Twitter approached LC about the project.
It was also one of the only times that we can think of in the Google "age" that another organization, in this case the Library of Congress, released a similar service on the same day and received more attention than Google.
Earlier that same day, Google announced that "Google Replay" was live on the web and that, while very small at launch, the entire archive would eventually be searchable (back to day one in 2006).
What we thought was perhaps most interesting is that very few news organizations of any kind made it clear that the LC Twitter Archive would NOT BE ACCESSIBLE to the general public, either online or at the Library of Congress itself, while the Google version of the archive was already online and would be accessible to anyone with a web connection.
The FAQ continues with why it's important to archive and preserve Twitter and, as info pros and many others know, LC collects a wide range of materials. Matt writes:
Individually tweets might seem insignificant, but viewed in the aggregate, they can be a resource for future generations to understand life in the 21st century.
That's true. However, in the original news release from LC, a link to a tweet by President Obama the morning after his victory speech in Chicago is provided. In our view, it's a single tweet that had significance as it was posted and continues to have significance today and will likely have it years from now. A single tweet or a small group of tweets that seem insignificant today might have great significance in the future be it 20, 50, or 100 years from now. Of course, the opposite may also be true.
Next, we read that deleted tweets, private account info, links to pictures and websites will not be archived. So, if someone or the masses tweet and link to an important government report, info (a link) on where to find it will not be accessible from the archive. That's a new fact to us and seems a bit strange. Links are central to what Twitter can do. We also learn that LC does not plan to "collect" the linked sites.
Of course, the Internet Archive, Archive-It, and other projects including those from Harvard U. and the University of California are collecting and archiving sites.
Finally, on the six-month window between the time a tweet is tweeted and the time it has the potential to reach the database: we don't know how often the Twitter archive will be updated (daily, monthly, quarterly?). We mentioned some of this in our April 19th report.
The FAQ concludes with some ideas about how LC wants to use this archive as a tool to learn more about digital preservation, as a case study for developing processes for usage, developing tools for researcher access (read about the Stanford group in the April 19th post), "as well as from the Library’s ongoing experience with serving collections and protecting privacy and rights."
LC will NOT try to reproduce Twitter's functionality (we're guessing that has more to do with retrieval of material than the actual posting of tweets). Two examples of archives already online that might serve as examples of what LC could do with the Twitter material are the National Elections Web Archive and the Supreme Court Nominations Web Archive. They will make an announcement when they are ready for researcher use.
Perhaps the bottom line is that a lot of decisions at LC have to be made. We said that two weeks ago. Patience! It's also why they are calling this a starter list of FAQs.
However, issues remain that we were surprised not to have seen mentioned in this first FAQ since they were mentioned in the press.
The collection is browsable, searchable and was commissioned by LC from The Internet Archive. In other words, the Library of Congress did not do the actual crawling and archiving.
We have been told that no part of the Twitter Archive would be web accessible and available to the general public. It would only be available to qualified researchers (a definition that is TBD) at the Library of Congress. No access would be available outside of the LC buildings. In our view, this example confuses the situation.
C) To be clear, we weren't able to find a mention of who will and who will not have access to LC's Twitter Archive.
D) Will LC make any of this content accessible to the public perhaps through a display at one of the LC buildings in D.C.?
E) Who will be doing the actual archiving? We know that it's very likely LC will be handling the preservation work but will LC have the technology in place to capture each tweet? What we think is most likely to happen is Twitter will be sending files to LC (on a predetermined basis) via the Internet.
That's about it. If/when we think of more we'll add them.
We want to be clear. We are beyond thrilled that LC is part of the project. As they point out in the FAQ, if nothing else this will be a tremendous learning experience for future digital archiving projects with the scope and notoriety that Twitter provides. Also, kudos to Twitter for thinking of LC and offering the archive as a gift.
Of course, there is one question only Microsoft and Twitter can answer: is Bing also going to offer a Twitter archive?