Thursday, February 01, 2007

Microsoft Vista Voice Recognition Exploit

One of the great features of the new Microsoft Vista operating system is its voice recognition and control software. It's supposed to be really good, as in good enough to replace Dragon NaturallySpeaking or similar products. Its most distinctive feature is that it does not require "training," meaning the software doesn't have to be taught what your voice sounds like. This is great because, well, reading 20 pages of text to some software to teach it what you sound like is a real bore. Not to mention, most of these products display the training text only on the screen, which makes it more difficult for those with visual impairments to train the software.

Unfortunately, not having to train the software has a downside: the computer will recognize any voice and respond to its commands. So, if you happen to visit a website that plays a sound clip containing voice commands, your speakers will play the clip, your microphone will pick it up, and your computer will follow the commands.

A few things have to be in place for this to work. First, Vista's speech recognition software has to be enabled and configured. Second, speakers and a microphone have to be connected to your computer and turned on. This is a far-fetched scenario, but one that users should be aware of, especially people with visual impairments, who may be more likely to switch to Vista because of this new speech recognition software.

Edit: After pondering solutions for this, I realize that it is a difficult problem.

My first instinct was to create a "voiceprint" and associate it with authorized users on the computer, then require that the command giver's voice match the voiceprint for any system commands (delete, file navigation, file transfer, etc.). But that's easier said than done. First, creating a reliable voiceprint is difficult because voices change. Second, constantly comparing the speaker's voice to the voiceprint adds significant overhead. Implementation would be time- and resource-intensive, and, once implemented, it would likely raise the minimum hardware requirements.
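To make the voiceprint idea a bit more concrete, here is a rough Python sketch of the gating step only. It assumes the hard part is already solved: that some other component has turned the speaker's audio into a list of numbers (a feature vector) and that an enrolled voiceprint of the same kind is on file. The command list, the function names, and the 0.9 threshold are all made up for illustration; this is not how Vista works.

```python
import math

# Hypothetical list of commands sensitive enough to require a voice match.
SENSITIVE_COMMANDS = {"delete", "move", "copy", "open"}


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector cannot match anything
    return dot / (norm_a * norm_b)


def is_command_allowed(command, speaker_features, enrolled_voiceprint,
                       threshold=0.9):
    """Allow harmless commands from anyone; require a voice match otherwise.

    The threshold is an assumed value and would need tuning on real data.
    """
    if command not in SENSITIVE_COMMANDS:
        return True
    score = cosine_similarity(speaker_features, enrolled_voiceprint)
    return score >= threshold


# Toy example: the enrolled user may delete, a different voice may not.
enrolled = [1.0, 0.0, 0.5]
print(is_command_allowed("delete", [1.0, 0.0, 0.5], enrolled))
print(is_command_allowed("delete", [0.0, 1.0, 0.0], enrolled))
```

Even in this toy form, the comparison runs on every sensitive command, which is where the overhead comes from.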

The other option would be to develop a method to detect which sounds are coming from the computer's own speakers, so the software could ignore them. But again, it's easier said than done and would require a HUGE amount of work and code (not to mention, I don't think there's any current research into this sort of technology).
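As a toy illustration of the second idea, the Python sketch below checks whether the microphone signal looks like a delayed, quieter copy of what the computer just sent to the speakers. It assumes the software can see the samples being played, and it ignores room echo, noise, and someone talking at the same time, all of which a real system would have to handle. The function names, the maximum delay, and the 0.8 threshold are my own assumptions.

```python
import math


def normalized_correlation(a, b):
    """Return the correlation of two sample lists, scaled to [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mic_contains_playback(playback, mic, max_lag=50, threshold=0.8):
    """Return True if `mic` looks like a delayed copy of `playback`.

    Tries every delay from 0 to max_lag samples and keeps the best match.
    """
    best = 0.0
    for lag in range(max_lag + 1):
        overlap = min(len(playback), len(mic) - lag)
        if overlap <= 0:
            break
        score = abs(normalized_correlation(playback[:overlap],
                                           mic[lag:lag + overlap]))
        best = max(best, score)
    return best >= threshold


# Toy signals: a "sound clip" and a microphone that hears it 10 samples late.
clip = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(200)]
heard = [0.0] * 10 + [0.6 * s for s in clip]
print(mic_contains_playback(clip, heard))
```

If the check comes back true, the speech software could simply drop that stretch of audio instead of treating it as a command.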

I don't really see Microsoft fixing this any time soon.

1 comment:

Rosemary said...

Thanks for sharing this information with the Wheelie Catholic blog. I don't need Voice Recognition yet, but I can already tell my typing requires more and more spell check and revisions. Eventually, I will need it, so I'll follow this story carefully.