This post is part of our ReadWriteStart channel, which is a resource and guide for first-time entrepreneurs and startups. The channel is sponsored by Microsoft BizSpark. To sign up for BizSpark, click here.
A 2-year-old Silicon Valley startup is working on technology to solve the problems of machine vision. Their new product, launching today at DEMO, will lay the foundation for true video search, wherein software will learn, recognize, and understand patterns and take subsequent actions rather than relying on machine-readable tags and metadata.
In its current form, Vitamin D Video is aimed at video analytics and monitoring, but the underlying technology, called Hierarchical Temporal Memory, or HTM, is modeled on the human neocortex and is the stuff of AI-yet-to-come.
Vitamin D, founded by some of the key UX, engineering, and marketing folks behind the original Palm and Treo products, tells us that their video application can distinguish between people and other objects in live or archived video in ways that no other system can, not even the highest enterprise-grade analytics and monitoring technologies.
It quickly condenses enormous amounts of video into distinct, valuable events. It also requires no configuration and includes a simple rules wizard for setting up notifications for specific events.
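To make the idea of condensing video into events concrete, here is a minimal sketch in Python. It is not Vitamin D's code or API; the names `Event`, `condense`, and `matching`, the `max_gap` parameter, and the input format of per-frame (timestamp, label) detections are all assumptions made for illustration. It merges detections that are close together in time into single events and then applies a simple rule of the kind a rules wizard might produce.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Event:
    """A hypothetical event: one label seen continuously over a time span."""
    label: str
    start: float  # seconds
    end: float    # seconds

    @property
    def duration(self) -> float:
        return self.end - self.start


def condense(detections: Iterable[Tuple[float, str]],
             max_gap: float = 1.0) -> List[Event]:
    """Merge per-frame detections (timestamp, label) into events.

    Detections with the same label that are at most max_gap seconds
    apart are treated as one continuous event.
    """
    open_events: Dict[str, Event] = {}  # label -> event being extended
    events: List[Event] = []
    for ts, label in sorted(detections):
        current = open_events.get(label)
        if current is not None and ts - current.end <= max_gap:
            current.end = ts  # extend the running event
        else:
            current = Event(label, ts, ts)  # start a new event
            open_events[label] = current
            events.append(current)
    return events


def matching(events: List[Event], label: str,
             min_duration: float = 0.0) -> List[Event]:
    """A simple rule: events with this label lasting at least min_duration."""
    return [e for e in events
            if e.label == label and e.duration >= min_duration]


# Hypothetical detections: (timestamp in seconds, detected label)
detections = [(0.0, "person"), (0.5, "person"), (1.0, "person"),
              (10.0, "person"), (10.2, "car")]
events = condense(detections)
alerts = matching(events, "person", min_duration=0.5)
```

In this sketch, hours of frames reduce to a short list of events, and a notification rule only has to look at that list rather than at raw video.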
Ultimately, Vitamin D Video could eliminate the need for humans to stuff text-based data around their videos. "Vitamin D envisions a near future in which all the world's video can be searched, monitored, understood, and reacted to in more automated ways," company rep Allen Bush wrote to us in an email. "If computers could understand video content without having to be told by humans, many disruptive applications would surface."
This application will be available to any user with a simple webcam or IP camera and will be free throughout the beta testing period. Vitamin D sees the app as particularly useful for video surveillance and security as well as mobile capture and lookup, contextual advertising, entertainment, and general visual search. Interested parties can check out their demo.


The company is based in Menlo Park and has received small amounts of angel funding from investors that include HTC CEO Peter Chou.
In 2005, Jeff Hawkins, Palm Computing founder and brain researcher, founded a company called Numenta. Numenta's HTM technology is focused on a new generation of artificial intelligence or, more specifically, on enabling computers to recognize, learn, and understand patterns in massive amounts of data. Their HTM platform, the foundation for Vitamin D's app, is also applicable to a broad class of other problems, from machine vision to fraud detection to semantic analysis of text. HTM is based on a theory of the neocortex that was first described in Hawkins' book On Intelligence and subsequently turned into mathematical form by Numenta co-founder Dileep George.
Other HTM-based applications solve such problems as recognizing objects in images, recognizing behaviors in videos, identifying the gender of a speaker, predicting traffic patterns, doing optical character recognition on messy text, evaluating medical images, and predicting click-through patterns on the web. Check out some sample applications and their demo packages, developed in conjunction with Vitamin D.
The demos show small video clips being processed in real time for machine recognition of humans in video, even in low light, in crowded environments, and in environments with multiple moving objects.

It is important to note that HTM systems are trained rather than programmed in the traditional sense. According to the Numenta website, "Sensory data is applied to the bottom of the hierarchy of an HTM system, and the HTM automatically discovers the underlying patterns in the sensory input. HTMs learn what objects or movements are in the world and how to recognize them, just as a child learns to identify new objects."
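The difference between training and programming can be shown with a toy example. The sketch below is not HTM and does not use Numenta's software; it has no hierarchy and no temporal memory. It is a plain nearest-centroid classifier, swapped in only because it is small enough to read at a glance. The function names and the two made-up features (object height and width in meters) are assumptions for illustration. The point is that nothing in the code says what a person or a car looks like; that knowledge comes entirely from the examples.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def train(examples: List[Tuple[Vector, str]]) -> Dict[str, List[float]]:
    """Learn one average feature vector (centroid) per label."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(centroids: Dict[str, List[float]], features: Vector) -> str:
    """Return the label whose centroid is closest to the features."""
    def squared_distance(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical training data: (height in m, width in m) -> label
examples = [
    ((1.8, 0.5), "person"),
    ((1.7, 0.4), "person"),
    ((1.5, 4.2), "car"),
    ((1.4, 4.0), "car"),
]
model = train(examples)
guess = classify(model, (1.75, 0.6))
```

Changing what the system recognizes means changing the examples, not the code, which is the sense in which such systems are trained rather than programmed.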
Vitamin D CEO Celeste Baranski said, "We've witnessed the incredible growth and importance of searching text-based content. But we're still in the dark ages when it comes to finding things in video... There is a clear need for a better way to detect, recognize, and filter objects. Vitamin D Video is the first of many smart applications that will unlock the value inside video across a broad range of consumer and business applications."
Comments
Amazing and cool to see machine learning systems on the edge of impacting our everyday lives. The recent wave of related announcements has got to have people who don't understand these trends scratching their heads and starting to wonder - http://bit.ly/vYfam. Exciting stuff.
Wow. On one hand, this is fascinating and intriguing. On the other, it is a bit spooky. Anyone else see the movie "Eagle Eye"? Seriously, it will be interesting to see what the next five to 20 years bring about in the world of computers and technology. Let's hope we use these "advancements" in a responsible manner.
I wonder if you tried the application yourself, or anything else built upon HTM?
Sentences like "Other HTM-based applications solve such problems as recognizing objects in images" should be backed up with concrete data. There are established image databases to test such applications, and I haven't seen anything from Numenta claiming state-of-the-art performance on them.
I would expect RWW to investigate more thoroughly before publishing stuff with such scientific importance.
The demo does not show any novel ideas, nor does it suggest any implemented use cases. Assuming that it will "Ultimately, eliminate the need for humans to stuff text-based data around their videos" is an overstatement on their part. RWW needs more qualitative postings.
There's no doubt that machine vision is leaving the factory and taking on an even larger role in our lives. I'd really like to see more debate about the implications of 'total surveillance' - please visit my blog http://machinevision4users.blogspot.com/ to learn more about where the technology is heading.