Wednesday, 9th June 2010

Google Releases the Much Discussed "Caffeine" Index

You might remember that last August Google announced a new project, code-named Caffeine, that was essentially being built to replace the entire infrastructure that Google uses to crawl, index, and rank pages. While Caffeine was being tested, especially in the first days after the announcement, some said they noticed that searches completed and returned results faster.

Tonight, Google announced that the Caffeine technology is now live for all Google searches. A Google blog post titled "Our new search index: Caffeine" has the details.

Facts (According to the Google Blog Post):

+ 50% Fresher Results Compared to the Old Indexing System (We Will Try to Get a Precise Definition of What This Means in Terms of Actual Time)

+ Largest Index Ever

+ Every Second Caffeine Processes Hundreds of Thousands of Pages in Parallel
If this were a pile of paper it would grow three miles taller every second.

+ Caffeine Takes Up Nearly 100 million Gigabytes of Storage in One Database

+ Caffeine Adds New Information at a Rate of Hundreds of Thousands of Gigabytes Per Day

From the Google post:

Our old index had several layers, some of which were refreshed at a faster rate than others; the main layer would update every couple of weeks. To refresh a layer of the old index, we would analyze the entire web, which meant there was a significant delay between when we found a page and made it available to you.

With Caffeine, we analyze the web in small portions and update our search index on a continuous basis, globally. As we find new pages, or new information on existing pages, we can add these straight to the index. That means you can find fresher information than ever before, no matter when or where it was published.

As for what this means to the searcher and how to construct a search, nothing has changed. However, if pages are being refreshed more frequently, it means the cache is also being updated more frequently. So, if you want a copy of a page the way it looked at noon on Wednesday, it's probably a good idea to make a copy for yourself (have you tried Zotero?). Why? Because by 12:15 on Wednesday the content on the page might have changed, and that means the cache may have been updated. This new index could bring more attention to the importance of personal index management.

Do these faster times (I'm sure MANY will be testing to see how accurate Google's numbers are) mean anything to the typical Google searcher? Obviously, for the "power" searcher the potential for better results seems strong.

Remember when all search engines posted their total index size on their homepages? It meant little, if anything, and it's no longer being done. Will recrawl and refresh times be a new metric that search engines use to promote/market themselves to users?

See Also: Vanessa Fox at Search Engine Land Has a Great Post That Should Be Read by All Content Owners and Webmasters (via SEL)

Note: Vanessa makes an essential point. The Caffeine index has not changed Google's ranking algorithm. Two different things.

Here's one more point from her post that is important to keep in mind. Thanks, Vanessa.

Note that the introduction of Caffeine doesn’t necessarily mean that pages will be crawled on a faster schedule than before. It simply means that once those pages are crawled, they are made available to searchers much more quickly. (Remember, you can estimate how often your pages are crawled by taking a look at your server logs or checking the cache dates in Google.)
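For anyone who wants to try the server-log suggestion above, here is a minimal sketch in Python. It is only an illustration, not something from Vanessa's post or from Google. It assumes your server writes Apache-style "combined" access logs, and it treats any request whose user agent mentions "Googlebot" as a crawl (user agents can be spoofed, so treat the numbers as an estimate). The function names and the sample log lines, including the IP addresses and file path, are made up for the example.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: Apache-style "combined" log format. Adjust the pattern if your
# server logs in a different layout.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Hypothetical sample lines, invented for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [08/Jun/2010:10:00:00 -0400] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [09/Jun/2010:10:00:00 -0400] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.5 - - [09/Jun/2010:11:00:00 -0400] "GET /index.html HTTP/1.1" '
    '200 5120 "-" "Mozilla/5.0 (Windows NT 6.1)"',
]


def googlebot_visits(lines):
    """Map each requested path to a sorted list of Googlebot visit times."""
    visits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match and "Googlebot" in match.group("agent"):
            when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            visits[match.group("path")].append(when)
    return {path: sorted(times) for path, times in visits.items()}


def average_gap_hours(times):
    """Average hours between consecutive visits; None with fewer than two visits."""
    if len(times) < 2:
        return None
    total_seconds = (times[-1] - times[0]).total_seconds()
    return total_seconds / (len(times) - 1) / 3600


if __name__ == "__main__":
    for path, times in googlebot_visits(SAMPLE_LOG).items():
        print(path, len(times), "visits, average gap (hours):", average_gap_hours(times))
```

To run it against a real log, pass an open file object to `googlebot_visits` instead of `SAMPLE_LOG`. Keep in mind this estimates crawl frequency only; as noted above, Caffeine changes how quickly crawled pages reach the index, not necessarily how often they are crawled.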

UPDATE: We posted this item at 10pm EDT. It was in the main Google database less than a minute after we posted it. Impressive!
