Tuesday, 23rd March 2010

Chemistry: Databases: Comparing Two Chemical Name Dictionaries & Manual vs. Auto Curation

From the Announcement:

Just like the rest of us, scientists today are swamped with information. As more chemical resources become freely available, text mining applications, previously focused on correctly identifying gene and protein names, are now shifting towards also correctly identifying chemical names. Now database experts have compared two chemical name dictionaries head to head, and report on the payoffs of manual versus automatic data curation in the open access publication Journal of Cheminformatics.

Chemlist's creators wanted to investigate the effect that extensive manual curation of a multi-source chemical dictionary would have on chemical term identification in text. Kristina Hettne and her team, based in the Netherlands, together with US-based colleagues, compared two dictionaries. The first is Chemlist, a dictionary for identifying small molecules and drugs in text, which is automatically generated from a number of publicly available databases. The second is a dictionary extracted from the ChemSpider database, which has been curated manually to establish valid chemical name-to-structure relationships. To compare automatic curation with manual curation, the authors used only the ChemSpider component containing manually curated names and synonyms in their research.

[Snip]

This means that although ChemSpider achieved the best precision, the Chemlist dictionary had a higher recall and the best F-score, a measure of a test's accuracy that incorporates both precision and recall. "Rule-based filtering and disambiguation is necessary to achieve high precision for both automatically generated and the manually curated dictionaries," Hettne concludes. Antony Williams, project lead for ChemSpider, comments, "Such validated name-structure dictionaries studied in this work provide a strong foundation for semantic markup technologies, interlinking and various online resources." Both ChemSpider and the chemical databases included in Chemlist continue to grow at high speed, and further investigation is needed to see how this growth affects the performance of the dictionaries.
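For readers unfamiliar with these measures, the short Python sketch below shows how precision, recall and the balanced F-score (F1) relate, and how a dictionary with lower precision can still come out ahead on F-score if its recall is high enough. The function name and all of the counts are hypothetical and chosen only for illustration; they are not figures from the study, and the announcement does not say which variant of the F-score the authors used.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw match counts."""
    predicted = true_positives + false_positives   # terms the dictionary tagged
    actual = true_positives + false_negatives      # terms that should have been tagged
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    total = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical counts, NOT results from the paper.
# Dictionary A: few wrong matches, but it misses many chemical names.
high_precision = precision_recall_f1(true_positives=40, false_positives=10, false_negatives=60)
# Dictionary B: more wrong matches, but it finds many more chemical names.
high_recall = precision_recall_f1(true_positives=70, false_positives=50, false_negatives=30)

for label, (p, r, f) in [("A", high_precision), ("B", high_recall)]:
    print(f"Dictionary {label}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

In this made-up example, dictionary A has the better precision while dictionary B has the better recall and the better F1, which is the same pattern the announcement describes for ChemSpider and Chemlist.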

Access ChemSpider

Access Chemlist

Access the Complete Announcement

Source: BioMed Central / EurekAlert

