The Web puzzle of online information resources often hinders end-users from effective and efficient access to these resources. Clustering resources into appropriate subject-based groupings may help alleviate these difficulties, but will it work with heterogeneous material? The University of Michigan and the University of California Irvine joined forces to test automatically enhancing metadata records using the Topic Modeling algorithm on the varied OAIster corpus.
Clustering not only helps searchers find what they are (or think they are) looking for; we think it is just as valuable for surfacing ideas, concepts, names, trends, etc. that a searcher might not see by browsing one record at a time.
Notes 1: We have posted about OAIster many times on ResourceShelf. This database aggregates over 12 million records from more than 854 contributors. A very useful and important resource.
Notes 2: Of course, no conversation about clustering can take place without mentioning Raul Valdes-Perez, Co-Founder and CEO of Vivisimo. Vivisimo powers Clusty and offers dynamic clustering for enterprise search. While white papers can only say so much, this paper from Vivisimo (PDF) about information overload and what they call "selective ignorance" is worthy of your attention. You can learn more by browsing the Vivisimo site, reading the company blog (with commentary from Valdes-Perez) and using some of their projects beyond Clusty.
Examples:
+ Clustermed.info
Clusters PubMed results (and note the many ways to cluster), since this search uses controlled fields.
Notes 3: Since Ask.com provides "Zoom Related Info" that offers terms to narrow and/or expand a search (based on concepts) and, in some cases, shows related names (again, based on concepts), I'll keep my comments to this example and some of what has been written elsewhere. 1 ||| 2 ||| 3.
Btw, Zoom Related Search is also available with Ask Image search.
Here's an example using a search for Paul McCartney. Note the Zoom Related Results in the left pane.