
Friday, 22nd September 2006

Search Briefs: Where Have they Been Doing? Publishers aim for some control of search results

+ Publishers aim for some control of search results (via Reuters)

Global publishers, fearing that Web search engines such as Google are encroaching on their ability to generate revenue, plan to launch an automated system for granting permission on how to use their content.

This is far from a new issue. Publishers who don't want their full-text content accessible (for free) via a large web engine after a certain period of time can simply take the article(s) offline. Content management systems can handle this easily. Yes, it's hard to stop users who simply copy and reprint, but that's another issue. Also, caching/archiving can be stopped by using the no-archive option that all of the large crawlers and the Internet Archive offer.
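As a rough illustration of the no-archive option, one common form is a robots meta tag in the page's HTML whose content includes "noarchive". The sketch below checks a page for that directive. The sample page and the names RobotsMetaParser and allows_caching are made up for this example, and each engine's exact handling of the tag is an assumption to verify against its own documentation.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr = {name.lower(): (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def allows_caching(html):
    """Return False if the page asks crawlers not to keep a cached copy."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" not in parser.directives


# Hypothetical article page, not taken from any real publisher.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="index, follow, noarchive">
</head><body>Article text</body></html>"""

print(allows_caching(SAMPLE_PAGE))
```

A page carrying this tag can still be indexed and found in search results; the directive only asks the engine not to offer a cached copy.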

"Since search engine operators rely on robotic 'spiders' to manage their automated processes, publishers' Web sites need to start speaking a language which the operators can teach their robots to understand," according to a document seen by Reuters that outlines the publishers' plans.

Seems like some publishers already understand how it all works.

For example, Washington Post material is not cached by the Internet Archive, and it's also hard to find cached content in Google. Their robots.txt file shows that the Internet Archive is not allowed to crawl the WPOST site. Another option might be to make the full text free for a certain period (without having it cached) and then sell the content after that set date. Of course, good researchers know that lots of this content is available at no charge, 24x7x365, using remote access to a library.
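To make the robots.txt approach concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The rules in SAMPLE_ROBOTS_TXT are hypothetical and are not the Washington Post's actual file; ia_archiver is the user-agent name the Internet Archive's crawler has historically answered to, and example.com and the helper can_crawl are placeholders for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block the Internet Archive entirely,
# and keep every other crawler out of /private/ only.
SAMPLE_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


article = "http://example.com/news/article.html"
print(can_crawl(SAMPLE_ROBOTS_TXT, "ia_archiver", article))
print(can_crawl(SAMPLE_ROBOTS_TXT, "Googlebot", article))
```

With rules like these, the archive crawler is turned away from the whole site while general search crawlers can still index the articles, which matches the pattern described above.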

See Also: Global Publishers Head Off Legal Clash With Search Engines
See Also: No Archive/No Cache Info from Ask, Gigablast, Google, Yahoo.

See Also: Topix.net (with one of the largest open web indices available and a ResourceShelf favorite) now offers a wonderful one-year archive. However, Topix constantly checks these URLs and, if the pages are removed, they are taken out of the database OR a link to purchase the content is made available.

+ Test out the new result pages for Google (via ZDNet)
Garett explains how to take a peek at this test.
