Web Search
Source: Searchblog, A Couple of Comments on Search Variables
A post on J.B.'s site says "bravo" to this post on the Scobleizer blog asking the big engines to give the public access to the "guts" of their search technologies so end users can better tune their results. Scoble also mentions that small engines (Gigablast, Feedster, Technorati and many others) will be the "real innovators." Tim Bray correctly reminds us that most users don't even take the time to use advanced syntax when developing a search strategy.
A couple of comments
+ I don't see the "big guys" sharing the guts of their technology with us anytime soon. Heck, I can't even get Google to explain how their stemming technology works. Does Coca-Cola give out the exact formulas for its products? McDonald's still calls it "secret sauce." (-:
--
+ Many people believe that a single search tool can be all things to all people. I tend to disagree. Is there just a single reference book on your bookshelf? For the librarians out there, you know that many LexisNexis libraries and Dialog files exist.
--
+ Scoble mentions that innovations will come from smaller databases. I agree. I would also toss into the mix "focused" and targeted tools like SmealSearch and ResearchIndex. Just last week, NEC was awarded a patent for focused crawler technology.
--
+ Why shouldn't these -- along with thousands of other smaller, focused, and specialized databases -- be "federated" at search time, which would help create a "better" search tool? Instead of incorporating many search tools, each with a different interface, a common interface could be designed to run "on top" of selected resources. All databases would and should remain completely independent.
--
For example -- searching for news? Bring together results from Technorati, Feedster, Yahoo News, Rocket News, and perhaps a fee-based service like Factiva (if you have access to it or if the publisher offers a pay-per-view option). At the enterprise level, results from the web, fee-based databases, and local search tools can be combined. Dialog offers something similar to what I'm describing, called OneSearch.
+ Technology (similar to what Dialog has offered for years) could, if needed, help the user select the most pertinent databases to incorporate into a search, either based on the specific query or by letting the searcher simply tell the tool to make the decisions.
--
+ Large databases like Google and Yahoo could also be exploited (if needed) by using them with advanced search strategies that have been "pre-built" for the user. For example, instead of a searcher having to go to the "advanced" page or know specific syntax to limit the query to government info, the federated tool would automatically append the appropriate syntax to the query, enabling a more precise search.
--
+ Personalization would also be part of the system. Results could be post-processed with dupes removed, results clustered, and then sorted and re-sorted according to the search need. Initial result sets could be based on a user's profile and/or past usage, but could be easily tweaked at the time of the search. As I mentioned yesterday, I think Yahoo is doing good things with their SmartSort technology. It's easy to understand and use. Other views of the information, including visualization of results, could also be made available if the user found them helpful.
--
+ A common interface (designed for the needs of a specific user group) will make it easier for people to take full advantage of these disparate tools.
--
+ Let's not forget that people often don't want links. They want an answer. So, in addition to simultaneously searching disparate sources, technology should also be able to summarize and, where the query allows it, present one or more possible answers to fact-based queries.
--
+ We're already starting to see this with FAST's new ESP technology, WebFountain, and other federated search technology* providers.
* Full disclosure: A company offering federated technology is a sponsor of ResourceShelf.
--
+ Yes, this sounds like meta searching and to a certain degree it is. However, what we've come to think of as meta searching is just the tip of the iceberg. With additional search technologies, access to a wide variety of databases (web, fee-based, local), and post-processing technology, we could create a robust resource that would offer full power (for those who want it), but also be simple enough for the average user. Btw, NISO is also doing work in the federated search/meta search arena.
--
+ It's very easy to envision how information professionals could be valuable in the creation and maintenance of tools like these.
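To make the federated idea a bit more concrete, here is a minimal Python sketch of the steps described in the list above: append a "pre-built" strategy to the query, send it to several independent sources, remove dupes, and sort what comes back. It is only an illustration under stated assumptions, not how any of the products mentioned actually work. Every name in it (`Result`, `PREBUILT_STRATEGIES`, `federated_search`, the two demo sources) is hypothetical, the `site:.gov` limiter is just an example of the kind of syntax a tool might append, and the sketch assumes each source reports a relevance score that can be compared across sources, which real engines generally do not.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List
from urllib.parse import urlsplit


@dataclass
class Result:
    title: str
    url: str
    source: str
    score: float  # assumed comparable across sources (a simplification)


# Hypothetical "pre-built" strategies: extra syntax appended for the user.
PREBUILT_STRATEGIES: Dict[str, str] = {
    "government": "site:.gov",  # illustrative limiter only
}

# A source is any callable that takes a query string and returns results.
Source = Callable[[str], List[Result]]


def build_query(query: str, strategy: str = "") -> str:
    """Append the syntax for the chosen strategy, if there is one."""
    suffix = PREBUILT_STRATEGIES.get(strategy, "")
    return f"{query} {suffix}".strip()


def normalize_url(url: str) -> str:
    """Crude dedupe key: lowercase, drop scheme, 'www.' and trailing slash."""
    parts = urlsplit(url.lower())
    host = parts.netloc[4:] if parts.netloc.startswith("www.") else parts.netloc
    return host + parts.path.rstrip("/")


def federated_search(
    query: str, sources: Dict[str, Source], strategy: str = ""
) -> List[Result]:
    """Query every source, keep the best-scored copy of each URL, then sort."""
    full_query = build_query(query, strategy)
    best: Dict[str, Result] = {}
    for search in sources.values():  # each source stays independent
        for result in search(full_query):
            key = normalize_url(result.url)
            if key not in best or result.score > best[key].score:
                best[key] = result
    return sorted(best.values(), key=lambda r: r.score, reverse=True)


# Two stand-in sources with canned data, so the sketch runs without a network.
def demo_news_a(query: str) -> List[Result]:
    return [
        Result("Budget passes", "http://www.example.gov/budget/", "news_a", 0.9),
        Result("Budget commentary", "http://example.org/commentary", "news_a", 0.4),
    ]


def demo_news_b(query: str) -> List[Result]:
    return [Result("Budget passes (copy)", "http://example.gov/budget", "news_b", 0.7)]


if __name__ == "__main__":
    sources = {"news_a": demo_news_a, "news_b": demo_news_b}
    for r in federated_search("budget", sources, strategy="government"):
        print(f"{r.score:.1f}  {r.source:<7} {r.title}")
```

The sources are plain callables on purpose: the common interface sits "on top" and never needs to know how each database works internally. Clustering, profile-based ranking, and database selection would be further post-processing steps layered on the same result list.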