
Wednesday, 28th April 2010

More Facts About the Library of Congress / Twitter Archive; Includes Text of LC/Twitter Agreement

UPDATE (8/30/2010): We've learned from the Library of Congress that they're still deciding what credentials, beyond a reader ID card, a person would need to access the archive. Also, as we said in earlier posts, some curated portions of the Twitter Archive will likely be accessible on the Internet, but it still needs to be decided whether the Twitter Archive as a whole will be accessible on the web.

BOTTOM LINE: All of the decisions that need to be made are still up in the air.

A couple of weeks ago a massive amount of attention was given to the fact that a copy of the Twitter Archive (back to day one in 2006) along with rolling updates would be a gift to the Library of Congress from Twitter. In fact, Twitter approached LC about the project.

It was also one of the only times that we can think of in the Google "age" that another organization, in this case the Library of Congress, released a similar service on the same day and received more attention than Google.

Earlier that same day, Google announced that "Google Replay" was live on the web and that, while very small at launch, the entire archive would eventually be searchable (back to day one in 2006).

What we thought was perhaps most interesting is that very few news organizations of any kind made it clear that the LC Twitter Archive would NOT BE ACCESSIBLE to the general public, either online or at the Library of Congress itself, while the Google version of the archive was already online and accessible to anyone with a web connection.

In a ResourceShelf post about both the LC Twitter and the Google Twitter Archives from April 19th, we used multiple sources including a conversation with an LC spokesperson to do our best to provide a complete and factual overview (at that time) of both services.

TODAY, in a second blog post on the Library of Congress blog, Matt Raymond provides more information in the first of a "starter list" of FAQs that also includes the complete text of the LC/Twitter agreement. As we said in our post on the 19th, LC is getting the archive from Twitter as a gift.

Here's the text of the agreement (PDF; 1 page) along with Twitter's current terms of service (PDF; 8 pages) and its previous terms of service (PDF; 4 pages), which were also included as addenda.

The FAQ continues with why it's important to archive and preserve Twitter. As info pros and many others know, LC collects a wide range of materials. Matt writes:

Individually tweets might seem insignificant, but viewed in the aggregate, they can be a resource for future generations to understand life in the 21st century.

That's true. However, in the original news release from LC, a link to a tweet by President Obama the morning after his victory speech in Chicago is provided. In our view, it's a single tweet that had significance as it was posted and continues to have significance today and will likely have it years from now. A single tweet or a small group of tweets that seem insignificant today might have great significance in the future be it 20, 50, or 100 years from now. Of course, the opposite may also be true.

Next, we read that deleted tweets, private account info, and links to pictures and websites will not be archived. So, if someone (or the masses) tweets a link to an important government report, the link showing where to find it will not be accessible from the archive. That's a new fact to us and seems a bit strange. Links are central to what Twitter can do. We also learn that LC does not plan to "collect" the linked sites.

Of course, the Internet Archive, Archive-It, and other projects including those from Harvard U. and the University of California are collecting and archiving sites.

Finally, on the six-month window between the time a tweet is posted and the time it has the potential to reach the database: we don't know how often the Twitter archive will be updated (daily, monthly, quarterly?). We mentioned some of this in our April 19th report.

The FAQ concludes with some ideas about how LC wants to use this archive as a tool to learn more about digital preservation, as a case study for developing processes for usage, and for developing tools for researcher access (read about the Stanford group in the April 19th post), "as well as from the Library’s ongoing experience with serving collections and protecting privacy and rights."

LC will NOT try to reproduce Twitter's functionality (we're guessing that has more to do with retrieval of material than the actual posting of tweets). Two archives already online that might serve as examples of what LC could do with the Twitter material are the National Elections Web Archive and the Supreme Court Nominations Web Archive. LC will make an announcement when the archive is ready for researcher use.

Perhaps the bottom line is that a lot of decisions at LC have to be made. We said that two weeks ago. Patience! It's also why they are calling this a starter list of FAQs.

However, issues remain that we were surprised not to have seen mentioned in this first FAQ since they were mentioned in the press.

A) This page lists many web archives that LC is developing or has completed. This second page lists only completed archives.

For example, the last entry on this page links to an archive of about 800 sites from and about the 2000 election.

The collection is browsable, searchable and was commissioned by LC from The Internet Archive. In other words, the Library of Congress did not do the actual crawling and archiving.

The 2000 Election Archive is accessible online and to the public.

We have been told that no part of the Twitter Archive would be web accessible and available to the general public. It would only be available to qualified researchers (a definition that is TBD) at the Library of Congress. No access would be available outside of the LC buildings. In our view, this example confuses the situation.

C) To be clear, we weren't able to find a mention of who will and who will not have access to LC's Twitter Archive.

D) Will LC make any of this content accessible to the public perhaps through a display at one of the LC buildings in D.C.?

E) Who will be doing the actual archiving? We know that it's very likely LC will be handling the preservation work but will LC have the technology in place to capture each tweet? What we think is most likely to happen is Twitter will be sending files to LC (on a predetermined basis) via the Internet.

That's about it. If/when we think of more we'll add them.

We want to be clear. We are beyond thrilled that LC is part of the project. As they point out in the FAQ, if nothing else this will be a tremendous learning experience for future digital archiving projects with the scope and notoriety that Twitter provides. Also, kudos to Twitter for thinking of LC and offering the archive as a gift.

Of course, there is one question only Microsoft and Twitter can answer: is Bing also going to offer a Twitter archive?

For more about the Twitter archive from Google (aka Google Replay), see the second part of this post. You can also learn by doing. Start here and replace the search term, MoMA (Museum of Modern Art), with your own term(s). To focus your results, simply move the timeline and arrows to a desired date and time (down to the minute). Here's an example: tweets containing the company name Netflix that were posted on April 4th, from 11:45-11:53am.

Sources: LC Blog, ResourceShelf

