20 February 2011

IBM's Watson: It's all about the software

Along with many other people across the nation, I watched the episodes of Jeopardy earlier this week in which IBM's newest supercomputer, Watson, played against former champions. It was amazing to see how well it played against its opponents, interpreting clues and formulating responses with remarkable accuracy. It was astounding to see how it dominated the opponents in the first two tournament games it has ever played, far and away more powerful than its chess-playing counterpart, Deep Blue. Deep Blue didn't make significant headlines until it finally beat a chess grandmaster, and it took several attempts for that to happen. Watson, on the other hand, only tried once - and it swept the competition. But I think the praise is misguided, as the entire match seemed like a giant advertisement for IBM.
Granted, Watson is poised to make significant advancements in natural language processing (NLP), but I don't believe it to be revolutionary. It is a huge step in NLP, but it is still lacking in accuracy, response formulation, and overall power. Moreover, IBM makes it appear proprietary. Watson was touted as a system that could only run on IBM's hardware, but in actuality, Watson is merely a glorified information query system that could quite possibly run on any off-the-shelf system. After all, Watson is just software. But, of course, this is not the IBM way. Just as OS X may not legally be run on anything other than Apple's hardware, IBM makes it extremely difficult to run its software on anything but its own hardware. By continuing to press the idea that Watson's NLP systems run on IBM's POWER7 hardware, IBM creates the misconception that its hardware is superior, when that is simply not true.
Granted, IBM's POWER line of server hardware is quite powerful, but its short- and long-term costs are prohibitive for widespread adoption. If IBM really cares about the future of people, and not just its profits, the natural language processor that Watson employs should be made publicly available. With public access, computer scientists the world over could improve the system, making it more powerful and mature enough for day-to-day consumer-level use. It could be augmented with speech recognition, or even image processing, to add oh-so-important contextual information. It could even be used to build better language translation engines. Imagine being able to call people in other countries, who speak completely different languages, and speak to them in your native tongue, with the other party hearing it in their language and vice versa. This could be accomplished by using a calling center to route the call, similar to a collect call, except that the routing center actively translates between the two languages using speech recognition (with inflection recognition) and text-to-speech systems (like the one Watson used when it responded to clues).
It irks me to no end to think that Watson, in its current form, will probably never see the light of day on more powerful, cheaper, easier-to-deploy computing clusters, such as the incredible server clusters at Google or Amazon, which process terabytes of information scattered throughout the Internet every day. It will likely remain within the confines of IBM and its POWER platform, consistently falling short of its true potential because of its closed nature and because IBM will be the only one able to re-purpose and re-engineer the Watson NLP system to function in more useful and creative ways. IBM would be the gatekeeper, picking and choosing what it is used for and restricting its applications to those who are willing to pay the seemingly exorbitant price that would be needed to make Watson work within specialized environments.
I am sure that if Watson could be used on a variety of platforms, so that companies could purchase it for integration within their own applications and computer scientists could expand the platform, IBM would make more money than it will by tying it to its proprietary platforms. But thanks to the greater interest in NLP sparked by this Jeopardy challenge, we may see groups of computer scientists create their own open-source NLP software that would prove to be a cheaper, more flexible, and quite possibly more powerful option. IBM would be left behind, wondering how the world passed it by again - and unfortunately, the answer would be glaring at it, as it always does for corporations: you were being too greedy, and people worked to find an alternative.
