
Saturday, 10th July 2010

HathiTrust Digital Library Adds New Content Accessible to All Users, Nearly 100,000 Volumes Digitized by Internet Archive

Access HathiTrust Digital Library Search and Collection Tools

Cool! Visualize the Collection by Call Number, Languages, and Date

The HathiTrust Digital Library Update for June 2010, released today, has info on a couple of new features. We've bolded the items that are available to all users.

Shibboleth was released for partner authentication in June.

Authenticated users can now download full PDFs of all public domain volumes in HathiTrust, and access the Collection Builder feature through local sign-on. Shibboleth also lays the groundwork for future augmented services to partner institutions, potentially including the ability to make uses of digital volumes allowed by Section 108 of U.S. copyright law, and to allow full access to in-copyright volumes for users with print disabilities.

The release of Shibboleth was made in conjunction with improvements to PageTurner that enabled delivery of high-resolution PDF files with embedded OCR for entire volumes. While only individuals at member institutions have access to this service across the repository, all public domain volumes that were not digitized by Google are available for full-PDF download to members and non-members alike. Right now these include nearly 100,000 Internet Archive-digitized volumes that have been contributed by the University of California, and thousands of volumes digitized locally by the University of Michigan. The partners are poised to significantly increase the amount of non-Google-digitized content preserved in HathiTrust in the near future, making many more public domain volumes freely available for download and distribution.

While we're talking user tools, let's step away from the HTDL Update for a moment and check a recent post on Hathi's Large Scale Search Blog by Tom Burton-West. He writes:

When you do a search, you will see check boxes next to each search result. You can select items you want from the search results and create a personal collection. This should make it much easier to do repeated searches and explore a targeted subset of the HathiTrust volumes. If you are not logged in, the collection will be temporary. If you log in you can save the collection permanently. This enables users to do focused searching within a selected subset of search results.

If you're not from a partner institution, follow the links on this page to create a "friend" account from the University of Michigan. All you need is a working email address, and signing up takes no more than five minutes.

That's it. Now, back to highlights from the newsletter.

SEASR

HathiTrust is in the process of investigating SEASR, the Software Environment for the Advancement of Scholarly Research, as a means to provide computational access to materials stored in the repository. Staff at the University of Michigan began installation of SEASR in the HathiTrust development environment in June, and expect to gain more knowledge about SEASR and what would be involved in applying it to HathiTrust over the next several weeks.

Next, Highlights from Hathi Working Groups

Discovery

As of the end of June, there are nearly 3.1 million HathiTrust records in WorldCat. Record loading is now continuing at a quicker pace, and is nearly complete.

OCLC is also making several alterations to the catalog’s functionality to fully meet HathiTrust’s requirements. This work is expected to extend into early August, after which time the interface will be reviewed for public beta release.

Collaborative Development Environment

University of Michigan staff continued the migration of HathiTrust applications into the new development environment in June, performing testing and configuration of the GlusterFS distributed file system that will be used as the storage back-end for the environment as well... When configuration is complete, the environment will support HathiTrust development efforts broadly across the partnership.

Quality, Ingest, and Error Rate

The quality working group is still working through a set of scenarios for gating volumes of poor quality from entering HathiTrust, and developing a justification and recommendation for the best approach to follow.

Development Updates come next.

Large-scale Search

The full text search index in Indiana was put into production by Michigan staff in early June, making the infrastructure for full text search fully redundant. Two new index build servers were also put into production in Michigan. All of the new systems have been functioning well, and the new build servers have substantially improved the performance of index building and maintenance... Michigan staff also developed a Lucene utility in June (Solr uses Lucene) to read an index and print out the total number of occurrences of a term.
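To make the idea of "total number of occurrences of a term" concrete, here is a minimal pure-Python sketch. It is not the Michigan utility and does not use Lucene or Solr; the toy corpus, the `build_index` helper, and the other function names are all assumptions made up for illustration. It shows the difference between how many documents contain a term and how many times the term occurs across the whole index.

```python
from collections import Counter

# Toy corpus standing in for indexed volumes (hypothetical data, not HathiTrust content).
documents = {
    "vol1": "the library preserves the digital library record",
    "vol2": "a digital volume in the public domain",
    "vol3": "search the full text of every volume",
}

def build_index(docs):
    """Map each term to a dict of {doc_id: occurrences of the term in that doc}."""
    index = {}
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index.setdefault(term, {})[doc_id] = count
    return index

def document_frequency(index, term):
    """Number of documents that contain the term at least once."""
    return len(index.get(term, {}))

def total_occurrences(index, term):
    """Total number of times the term appears across all documents."""
    return sum(index.get(term, {}).values())

index = build_index(documents)
for term in ("the", "library", "volume", "missing"):
    print(term, document_frequency(index, term), total_occurrences(index, term))
```

In this toy index, "the" appears in three documents but four times in total, which is the kind of number a per-term occurrence count reports and a simple hit count does not.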

Collection Builder

Integration of Collection Builder functionality with large-scale search is in the final stages of testing and will be deployed in July.

Storage Upgrade

Michigan staff have ordered and received additional storage for the Indiana and Michigan sites and will be putting it into service during July and August. The upgrade requires the installation of a new, larger storage network switch, so staff will be using the opportunity to introduce a new cabling layout for the entire system.

Outages

HathiTrust services were unavailable on Monday, June 7 from 7:10-10:00am and on Tuesday, June 8 from 5:00-5:30pm due to a connectivity problem with one of the web servers; and on Saturday, June 25 from 8:30-10:00am due to a database server disk space shortage.

Database Growth

Indiana University: 236 volumes added in June; 177,333 total volumes in collection
Penn State University: 328 volumes added in June; 22,824 total volumes in collection
University of California: 616 volumes added in June; 1,509,169 total volumes in collection
University of Michigan: 34,605 volumes added in June; 4,056,835 total volumes in collection
University of Minnesota: 173 volumes added in June; 73,856 total volumes in collection
University of Wisconsin: 10,073 volumes added in June; 353,639 total volumes in collection
Totals: 46,031 volumes added in June; 6,193,386 total volumes in collection
Public Domain Volumes: 1,208,351 total volumes in collection (~20% of total)
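
For readers who want to check the arithmetic, here is a short Python sketch that sums the per-partner figures listed above and compares them with the reported totals. The variable names are our own. With the numbers as they appear in this post, the June additions sum exactly to the reported 46,031, while the per-partner collection sizes sum to 270 more than the reported 6,193,386. The newsletter excerpt does not explain that difference.

```python
# (volumes added in June, total volumes in collection) per partner, as listed in the post
june_growth = {
    "Indiana University": (236, 177_333),
    "Penn State University": (328, 22_824),
    "University of California": (616, 1_509_169),
    "University of Michigan": (34_605, 4_056_835),
    "University of Minnesota": (173, 73_856),
    "University of Wisconsin": (10_073, 353_639),
}

# Totals as reported in the post
reported_added = 46_031
reported_total = 6_193_386

added_sum = sum(added for added, _ in june_growth.values())
total_sum = sum(total for _, total in june_growth.values())

print("added:", added_sum, "difference from reported:", added_sum - reported_added)
print("total:", total_sum, "difference from reported:", total_sum - reported_total)
```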

Statistics

6,197,125 total volumes
3,627,903 book titles
146,794 serial titles
2,168,993,750 pages
230 terabytes
73 miles
5,035 tons
1,208,634 volumes (~20% of total) in the public domain
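
The "~20%" figure can be checked the same way. This sketch uses only the numbers in the Statistics list above, with variable names that are our own. It also divides pages by volumes, which comes out to exactly 350. That suggests, though the post does not say so, that the page count may be an estimate based on a fixed pages-per-volume figure rather than an actual count.

```python
# Figures from the Statistics list in the post
total_volumes = 6_197_125
public_domain_volumes = 1_208_634
pages = 2_168_993_750

# Share of volumes in the public domain
public_domain_share = public_domain_volumes / total_volumes
print(f"public domain share: {public_domain_share:.1%}")  # 19.5%

# Average pages per volume implied by the two reported numbers
pages_per_volume = pages / total_volumes
print("pages per volume:", pages_per_volume)  # 350.0
```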

The HathiTrust Update is also available as a PDF.


