Barry Schwartz at Search Engine Land points out a new "exclusive" article by noted tech writer Steven Levy, published in Wired. It tells the story of how Google's algorithm (what the company will make public, of course) "rules the web."
Here are a Few Sections of the Article (It's Fairly Long) We Found Most Interesting
This year, Google will introduce 550 or so improvements to its algorithm.
Even the Bingers [those who work at Bing] confess that, when it comes to the simple task of taking a search term and returning relevant results, Google is still miles ahead. But they also think that if they can come up with a few areas where Bing excels, people will get used to tapping a different search engine for some kinds of queries.
The search engine currently uses more than 200 signals to help rank its results.
Note: Directly above this sentence, the article mentions how Google exploited hyperlinks. That could be the reason you sometimes come across a result page where your search terms are not found. This is still an issue: just today we came across several pages that did not have our search terms.
The most recent major change, codenamed Caffeine, revamped the entire indexing system to make it even easier for engineers to add signals.
The Article Includes a Chart of "Some of the Most Significant Additions and Adaptations Since the Dawn of PageRank."
This initiative allows Google to update its index constantly, instead of in big batches.
5. June, 2005: "Personalized Results"
6. December, 2005: "Big Daddy"
Engine update allows for more-comprehensive Web crawling.
7. May, 2007: Universal Search
8. December, 2009:
Displays results from Twitter and blogs as they are published.
NOTE: Want to learn more about Google and PageRank? The original research paper written by Larry Page and Sergey Brin, "The Anatomy of a Large-Scale Hypertextual Web Search Engine" (PDF), is still an engaging, interesting, and, for the most part, not too technical read. Remember, the paper is more than 10 years old and things have changed.
Two things are missing, in our view, from Levy's article. The first is an example or two of things that did not work, and why. Of course, Google's success has been nothing short of amazing, and some might even say unique. But that doesn't mean they haven't had a problem or two. It would be interesting to learn about them.
Also, it would have been worth a paragraph or two to discuss search algorithms that preceded PageRank. We're specifically thinking about the work of Jon Kleinberg and his hubs and authorities approach. Several articles in a section titled "Web Analysis and Search: Hubs and Authorities" on this page are worth looking at. This paper from Scientific American (1999) is excellent and compares some core concepts of Google and Clever. Kleinberg was a member of IBM's Clever team. The engine itself was never publicly released, but the team's web site is still online. You have to wonder if the world would be a different place if Clever had been released before Google (which was possible). Finally, one of the papers on Kleinberg's page, "Authoritative Sources in a Hyperlinked Environment," is cited by Page and Brin in their "Anatomy" paper.