
Wednesday, 23rd July 2003

Challenging Assumptions Leads to Web Search Insights

Web Organization and Searching
Source: Research/Penn State
The Work of Lee Giles
Read about the important and interesting work of Lee Giles at Penn State University. From the article: "Lee Giles is not much interested in surfing. Mining and extraction are terms more to his liking. Giles, the David Reese professor of information sciences and technology at Penn State, has devoted his career to finding better ways to get at information, to wring the most out of it, to marshal it efficiently." A few key passages follow. Make sure to read the entire article.
-
+ "The Web exists as a distributed sort of information base," Giles says, with typical understatement. Unregulated, decentralized, the work of tens of millions of disparate authors, and constantly growing at an ever-accelerating rate, the Web is no easy object to take the measure of. Yet characterizing the Web, understanding its parameters and its behavior, was the first thing Giles set about doing. "What's there, how it is connected, how it changes, who uses it, why they use it: the more you know about these things, the more efficiently you're able to use it," he says.
-
+ In another study, published last year in the Proceedings of the National Academy of Sciences, he and his co-authors challenge the widely held notion that the competition for attention on the Web is purely "winner-take-all," i.e., that new sites on the Web are more likely to attach themselves to sites that already have many links, ensuring that a small number of established sites will always receive a disproportionate share of Web traffic. While this preferential behavior does accurately describe the Web as a whole, Giles and his co-authors write, it varies significantly by the type of site considered. Thus, while a new newspaper or entertainment site might find it difficult competing with similar sites that are already popular, university sites and the pages of individual scientists exhibit a more egalitarian link growth. "The behavior is more complicated than had been thought," Giles says.
-
+ But automatic engines have their limitations, too. For one thing, most current crawlers are unable to recognize spam, which in this context means unreliable information. "In the unregulated environment of the Web," Giles says, "people claiming to be what they're not is an ongoing problem."
-
+ A more practical solution [than a completely personalized search tool], at least in the short term, is what Giles calls the niche search engine, designed specifically to meet the needs of a group of people with similar interests: employees of a company, say, or members of a profession. By limiting its crawling to a specific subject area, the niche engine can burrow deeper, providing more consistently useful information. A prime example is CiteSeer [aka ResearchIndex], a tool that Giles and Steve Lawrence created for the field of computer and information science.
Note: We completely agree with Dr. Giles. Those of you who read ResourceShelf on a regular basis know that we try hard to provide info about useful specialized and 'niche' search tools.
----
See Also: eBizSearch
Another niche search tool that Giles has developed. It focuses on materials about electronic business. eBizSearch was a Resource of the Week when it was officially launched in January 2003.
--
See Also: Direct to Lee Giles Home Page
Plenty of interesting reading here.
