Learn About and Try OpenCalais (a Free Service from Thomson Reuters)
OpenCalais (OC) is a free service that we first mentioned six months ago and have mentioned several times since. This post from June, 2009 mentions some of the organizations using the service from Thomson Reuters.
In a nutshell, OpenCalais uses semantic technology and natural language processing to analyze text and add metadata by drawing out entities from documents, blog posts, news stories, etc. In some cases, this type of data can identify or help identify relationships between people, businesses, etc.
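To make the idea of "drawing out entities" concrete, here is a toy sketch in Python. It is not how OpenCalais works internally and it does not call the OpenCalais service. It simply looks up names from a small hand-made list (the `GAZETTEER` dictionary, the `tag_entities` function, and the sample sentence are all our own inventions for illustration) and returns them grouped by entity type, which is roughly the shape of metadata such a service hands back.

```python
import re
from collections import defaultdict

# Hypothetical, hand-made gazetteer: entity type -> known names.
# A real service relies on far richer linguistic models than a lookup list.
GAZETTEER = {
    "City": ["Arcadia"],
    "ProvinceOrState": ["Florida"],
    "Holiday": ["Halloween"],
    "IndustryTerm": ["clean energy"],
}


def tag_entities(text):
    """Return {entity_type: sorted list of gazetteer names found in text}."""
    found = defaultdict(set)
    for entity_type, names in GAZETTEER.items():
        for name in names:
            # Whole-word, case-insensitive match on the literal name.
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, text, re.IGNORECASE):
                found[entity_type].add(name)
    return {etype: sorted(names) for etype, names in found.items()}


# Made-up sample sentence, not a quote from any real document.
sample = (
    "In Arcadia, Florida, new clean energy projects are under way. "
    "Happy Halloween, everybody."
)

metadata = tag_entities(sample)
for etype, names in metadata.items():
    print(etype, names)
```

A lookup list like this only finds names it already knows, which is exactly the limitation that natural language processing is meant to overcome.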
A visualization tool might make OpenCalais even more powerful. For example, it might be interesting to pair it with visualization tools like Muckety or NNDB Mapper to quickly "see" relationships that might go unnoticed without OpenCalais or other services.
Sure, it would be wonderful if all web content could be analyzed by a human and then have high quality metadata associated with it.
However, that's far from possible given the massive amount of content generated each minute of each day.
We entered the full text of last Saturday's Weekly Presidential Address and got back lots of stats and commentary.
+ Topic (Labor) Worth noting that we did not put the title of the address in the viewer box. The title is "President Obama Says Recovery Act Creating Jobs and Strengthening Economy."
+ Social Tags (Labor, Unemployment, Presidency of Barack Obama, etc.)
+ Entities, including:
+ Cities (Arcadia, FL) It is mentioned in the speech.
+ Holiday (he ends by wishing everyone a Happy Halloween)
+ Continent (America) Well, we'll cut it some slack. Close, but incorrect.
+ Industry Terms (Clean energy, Less Energy) Good.
+ Province or State (Florida) Again, accurate.
Finally, Events & Facts:
+ Generic Relations (announce, Florida, United States, the largest set of) At first we were puzzled. Then, by cursoring over the entry, we saw that it refers to Florida having the largest set of clean energy projects.
Btw, if you cursor over any of the entities you'll find additional info.
For example, with Florida we find a relevance score and the lat/long for Arcadia, FL, the town mentioned in the address.
Although we did not see them in our document, OC might also provide direct links to Wikipedia, CIA World Factbook, etc.
Overall, very good. But, it's just one example and one example search does not make a service.
One question that we would like to get an answer to: why is Thomson Reuters providing free access to OpenCalais? Does it plan to charge for additional services in the future?
UPDATE: Krista Thomas from OpenCalais sent along the following goals in a Twitter message.
1) Better software faster.
2) Connect all the world's business information.
For bloggers, OC offers a WordPress plug-in, a service for Drupal users and more. The WordPress tool analyzes blog postings and suggests tags, and even images from Flickr.
Other services have technology that draws out indexing terms, descriptors, etc., but OpenCalais appears to be much more sophisticated. Somewhat similar is Silobreaker news search. Silobreaker's algorithm draws out entities from stories and then makes them clickable or searchable. It also offers a couple of cool visualization tools.
Krista Thomas from OpenCalais recently gave a presentation to the San Diego Software Industry Council. Krista's slides are available online. The charts on pages 4 and 5 are difficult to read, so we're trying to get copies to share.
At the present time:
+ 18,000 Developers
+ 20+ Publishers
+ 50 Apps and Services Created
+ 4 million docs processed daily
Again, you can try OpenCalais yourself by typing or pasting text into the viewer box.
Finally, here's one more OC example using the content from this post.
Overall, it's easy to see how this service could be of value to the individual blogger and, even more so, to publishing companies with a non-stop stream of content.