How does the web search behavior of "rich" and "poor" people differ? Do men and women tend to click on different results for the same query? What are some queries almost exclusively issued by African Americans? These are some of the questions we address in this study. Our research combines three data sources: the query log of a major US-based web search engine, profile information provided by 28 million of its users (birth year, gender and ZIP code), and US census information including detailed demographic information aggregated at the level of ZIP code. Through this combination we can annotate each query with, e.g., the average per-capita income in the ZIP code it originated from. Though conceptually simple, this combination immediately creates a powerful demographic profiling tool. The main contributions of this work are the following. First, we provide a demographic description of a large sample of search engine users in the US and show that it agrees well with the distribution of the US population. Second, we describe how different segments of the population differ in their search behavior, e.g. with respect to the diversity of formulated queries or with respect to the clicked URLs. Third, we explore applications of our methodology to improve web search and, in particular, to help with query reformulation. These results enable the creation of a powerful tool for improved user modeling in practice, with many applications including improving web search and advertising. For instance, advertisements for "family vacations" could be adapted to the (expected) income of the person issuing the query, or search suggestions shown to users could be adapted to items that are more interesting given their particular characteristics.
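The combination of data sources described in this abstract can be illustrated with a minimal sketch. It assumes toy in-memory data: the names `user_profiles`, `census_by_zip`, `query_log`, `annotate_queries` and `mean_income_per_query`, as well as all values, are hypothetical and are not taken from the study, whose actual pipeline is not described here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data; field names and values are illustrative only.
user_profiles = {
    "u1": {"birth_year": 1980, "gender": "f", "zip": "10001"},
    "u2": {"birth_year": 1992, "gender": "m", "zip": "94105"},
    "u3": {"birth_year": 1975, "gender": "f", "zip": "10001"},
}
census_by_zip = {
    "10001": {"per_capita_income": 50000},
    "94105": {"per_capita_income": 90000},
}
query_log = [
    ("u1", "family vacations"),
    ("u2", "family vacations"),
    ("u3", "cheap flights"),
    ("u4", "family vacations"),  # no profile on record, so it is skipped
]


def annotate_queries(log, profiles, census):
    """Attach the ZIP-level per-capita income to each logged query."""
    annotated = []
    for user_id, query in log:
        profile = profiles.get(user_id)
        if profile is None:
            continue  # user provided no profile information
        area = census.get(profile["zip"])
        if area is None:
            continue  # no census record for this ZIP code
        annotated.append((query, area["per_capita_income"]))
    return annotated


def mean_income_per_query(annotated):
    """Average the ZIP-level income over all occurrences of each query."""
    incomes = defaultdict(list)
    for query, income in annotated:
        incomes[query].append(income)
    return {query: mean(values) for query, values in incomes.items()}


annotated = annotate_queries(query_log, user_profiles, census_by_zip)
income_by_query = mean_income_per_query(annotated)
```

Note that the income attached to a query is the average for the ZIP code, not the income of the individual user, so any per-query figure is an area-level estimate.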
by Mendoza, M.; Poblete, B.; Castillo, C.
in: Social Media Analytics, KDD '10 Workshops, ACM, Washington, USA (2010)
In this article we explore the behavior of Twitter users during an emergency situation. In particular, we analyze the activity related to the 2010 earthquake in Chile and characterize Twitter in the hours and days following this disaster. Furthermore, we perform a preliminary study of certain social phenomena, such as the dissemination of false rumors and confirmed news. We analyze how this information propagated through the Twitter network, with the purpose of assessing the reliability of Twitter as an information source under extreme circumstances.
by Baeza-Yates, R.; Middleton, C.; Castillo, C.
in: Web Intelligence, IEEE, Milan, Italy (2009)
This article describes a geographical study of the usage of a search engine, focusing on the traffic details at the level of countries and continents. The main objective is to understand, from a geographic point of view, how the needs of the users are satisfied, taking into account the geographic location of the host from which the search originates and the geographic location of the host of the clicked URL. Our results confirm that the Web is a cultural mirror of society and shed light on the implicit social network behind search. These results are also useful as input for the design of distributed search engines.
A family of resources to help information workers be more effective, raise the value of information in their organisations and contribute to success.
Recently I have found myself cooing over visualisation maps (and heat maps) of health and wellbeing resources. The content-rich data is overlaid with mapping technologies, and some interesting themes and patterns are emerging.
A lot of the talk around social media in the last year has been around information overload. Social media has provided us with new and exciting ways to create content. But it has also meant learning new ways to manage and engage with social media tools. Are we teetering on the edge of an information overload precipice?
Information overload is a figment of your imagination. Or a failure of your filter. Or a symptom of your technological submissiveness. Depends on who you ask.
What if you had to sort through 3.5 million articles and social media posts a day and try to pull out the most relevant items for your organisation? What if you then had to cobble it all together into something readable for the top groups and executives in your organisation?
Alacra Compliance saves time by aggregating information from both free and fee-based sources and enabling users to conduct an accurate federated search across these sources (which Alacra calls “simultaneous search”).