Search Every Word Spoken in a Video: From YouTube to the PBS NewsHour
ResourceShelf friend, writer, and web guru Marshall Kirkpatrick, who recently became a member of the Splashcast team, informs all of us (with a great review) of a new feature from Podzinger that provides transcript searching (every spoken word) in some YouTube categories.
Kirkpatrick writes:
Searches can currently be performed in a limited number of categories (sports, anime, entertainment or all) but further categorization could be interesting as well...The results are different than what you get when you search YouTube itself, so combining the two search feeds is probably the best idea.
For those of you who are unfamiliar with Podzinger, they offer "transcript search" of podcasts. Podscope from TVEyes offers a similar service.
If you are new to video or audio transcript searching, it's a topic we've mentioned several times on ResourceShelf over the years. Here's a quick review and lots of links. Have fun!
1) TVEyes (Fee-based and free).
The fee-based service offers near real-time transcripts and the option to go directly to the part of the video where your words were spoken. You can search almost all of the major TV networks and many local stations in the U.S., as well as National Public Radio, Al Jazeera, and TV and radio from the U.K. Alert services too! In other words, if someone says your company's name on CNBC at 7:45 a.m., by 7:55 or so you'll have an alert with a direct link to the transcript and video.
TVEyes also offers a FREE service (look for the search box on the home page) that provides transcript search of TV news video available on the open web from MSNBC, Reuters, and others. The free service does not display transcripts but does let you go directly to the point where your keywords are spoken.
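To make the "go directly to where your keywords are spoken" idea concrete, here is a minimal sketch in Python. It assumes a transcript is already available as a list of timestamped segments; the `Segment` class, the sample text, and the `find_keyword_offsets` function are all made up for illustration and say nothing about how TVEyes or any other service actually works.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped chunk of a transcript (hypothetical structure)."""
    start_seconds: float
    text: str


def find_keyword_offsets(segments, keyword):
    """Return the start time of every segment whose text contains the keyword."""
    needle = keyword.lower()
    return [seg.start_seconds for seg in segments if needle in seg.text.lower()]


# Invented sample transcript, for illustration only.
segments = [
    Segment(0.0, "Good morning and welcome to the program"),
    Segment(12.5, "Shares of Acme Corp rose sharply today"),
    Segment(47.0, "Analysts say Acme may raise guidance"),
]

# Each offset is a point a video player could seek to.
offsets = find_keyword_offsets(segments, "acme")
print(offsets)
```

An alert service could run the same kind of lookup against each new segment as it arrives and send a link to the matching offset.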
2) Similar fee-based services (all with free trials) from Critical Mention, ShadowTV, and Fednet.net (U.S. Congress) are all worth a look. As Marshall points out, Blinkx also offers some keyword transcript searching.
Virage from Autonomy has also been in this space for years.
3) Perhaps the most interesting player in this space is Nexidia. This Georgia-based company approaches transcript searching from a different angle. Most others search text produced by speech-to-text or taken from closed captioning. Nexidia is different. They break the spoken word into phonetic sounds (phonemes, about 40 in the English language), which makes for strong (though not perfect) accuracy, requires less computing power, and needs less "training" for jargon and other unusual words. Nexidia has a strong presence in the call center and government marketplaces. Another strength is that it will work with most languages. In the fall of 2006, Nexidia launched a demo site using video from a TV station in Atlanta.
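For readers curious about what searching on phonemes rather than words might look like, here is a toy sketch in Python. It is not Nexidia's method: the pronunciation table, the phoneme symbols, and the function names are all assumptions made up for this example, and it uses exact matching where a real system would presumably score approximate matches.

```python
# Toy pronunciation table (hypothetical entries, ARPAbet-style symbols).
PRONUNCIATIONS = {
    "new": ["N", "UW"],
    "knew": ["N", "UW"],
    "york": ["Y", "AO", "R", "K"],
    "times": ["T", "AY", "M", "Z"],
}


def to_phonemes(phrase):
    """Convert a query phrase into a flat list of phonemes."""
    phonemes = []
    for word in phrase.lower().split():
        phonemes.extend(PRONUNCIATIONS[word])
    return phonemes


def find_phoneme_matches(audio_phonemes, query):
    """Return every index where the query's phonemes occur in the audio track."""
    target = to_phonemes(query)
    n = len(target)
    return [
        i
        for i in range(len(audio_phonemes) - n + 1)
        if audio_phonemes[i:i + n] == target
    ]


# Pretend this phoneme sequence was extracted from the audio of "the New York Times".
audio_track = ["DH", "AH", "N", "UW", "Y", "AO", "R", "K", "T", "AY", "M", "Z"]

matches = find_phoneme_matches(audio_track, "new york")
print(matches)
```

Because the search works on sounds, a query for "knew york" finds the same spot as "new york", which hints at why this approach needs less vocabulary training than matching against transcribed words.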
It would be interesting to see a comparison of Nexidia, Podzinger, TVEyes, and others working with YouTube and other user-created content.
C) American Field Guide
Keyword search (or browse), “the sights and sounds from a wide variety of environments throughout America. We’ve collected over 1400 video clips that enable you to experience America’s wilderness firsthand…”
Finally, not entirely keyword searchable (yet?) but still worthy of mention is the National Public Radio archive. Here you can search written show rundowns and then go directly to the segment. Just about every news and public affairs show back to 1996 is included. Look for the advanced search box in the right margin.
Look for a future post with a compilation of archived broadcasters from around the world.
Thanks again to Marshall Kirkpatrick for his post and motivation to compile the links in this post.