Wave to CallWave, Another Speech to Text Voicemail System, Part 1
Brandon Miniman
09-17-2007, 08:37 PM
You must all think I'm crazy trying so many voicemail systems. First, it was SpinVox (http://pocketnow.com/index.php?a=portal_detail&t=news&id=4208), which did a great job of converting voicemail speech into text and sending it via email or SMS. The accuracy was very high, and I recommended it in the end. The only thing that was missing was contact association, so that when the transcription comes through, it says "Mom" instead of (625)5551212.
Then, I signed up for SimulScribe (http://pocketnow.com/index.php?a=portal_detail&t=news&id=4420) and downloaded the Windows Mobile application, called SimulSays (http://www.simulsays.com/), which is the first true Visual Voicemail client for Windows Mobile (Smartphone only). The problem with SimulScribe was that the accuracy wasn't as high as SpinVox's, the message delivery was delayed by up to five minutes, and the Windows Mobile application wasn't yet ready for prime time.
And recently, I've stumbled on yet another voicemail service, called CallWave (http://www.callwave.com/company/). Before I write about my experience on the system, let me explain how CallWave is different.
Apparently, SimulScribe and SpinVox use humans to convert the speech to text (I thought it was done with software), so there is 1. always a delay in receiving the text/email message and 2. limited privacy since someone on the other end always has to listen to your voicemail. Also, since they use humans to do the work, the price is higher: for SimulScribe, the price per month is $10 for 40 messages, or $30 for unlimited; and SpinVox is a bit cheaper at about $10 for 20 messages and $20 for 50 messages.
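To compare the capped plans quoted above on equal terms, the prices can be reduced to a cost per message. The snippet below is only a small worked example using the figures from this post; the variable and function names are illustrative, and the $30 unlimited SimulScribe plan is left out because it has no message count to divide by.

```python
# Capped plans as quoted in the post: (service, monthly price in USD, messages included).
plans = [
    ("SimulScribe", 10, 40),
    ("SpinVox", 10, 20),
    ("SpinVox", 20, 50),
]


def cost_per_message(price, messages):
    """Monthly price divided by the number of messages included."""
    return price / messages


# Map each (service, message cap) pair to its per-message cost.
per_message = {
    (name, messages): cost_per_message(price, messages)
    for name, price, messages in plans
}

for (name, messages), cost in per_message.items():
    print(f"{name} ({messages} messages): ${cost:.2f} per message")
```

On these figures, the SimulScribe capped plan comes to $0.25 per message, against $0.50 and $0.40 for the two SpinVox tiers.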
The way that CallWave is different is that it uses software to transcribe messages. That means that it can be offered at little or no cost (it's still in beta, so pricing info is pending), message delivery is VERY fast, and no human has to listen to your voicemail.
But can software match the accuracy of a human? I'll have the answer in a few days!
motionmind
09-18-2007, 03:59 PM
Spinvox has been free in the US. I'm assuming it still is since I've never had to pay for it.
Brandon Miniman
09-18-2007, 06:11 PM
True, it's still free - but they'll be starting to charge soon.
motionmind
09-19-2007, 01:32 AM
Hopefully AT&T will have some sort of visual voice for all their customers like the iPhone's, but I'll not be holding my breath any time soon.
Brandon Miniman
09-19-2007, 11:04 AM
Yeah, it'd be tough to do a cross-platform visual voicemail system. I'm sure they're working on it, though, since they now know how to do it. It'd be a huge selling point if AT&T could say "the only network with visual voicemail on all our devices."
Chuong Nguyen
09-19-2007, 11:27 AM
I've been a user of CallWave since before the speech to text feature. Didn't realize that other services were using human transcribers. That's good to know!
Brandon Miniman
09-19-2007, 11:32 AM
How is the accuracy of the transcriptions for you?
Chuong Nguyen
09-19-2007, 11:41 AM
Accuracy isn't bad at all. CallWave refers to the transcribed text as the "gist" of the message and that's what it is--the gist. Usually, when users speak loudly, clearly, and semi-slow, it is pretty good. When there is a lot of background noise and when users don't speak as clearly, the recognition is bad, but it is still decipherable as to what is intended.
The cool thing is even with bad accuracy, you still have the link to the actual voicemail in your email inbox so you can always click it and go directly to the message to find out what's being said.
sumdmgai
09-19-2007, 11:55 AM
"Apparently, SimulScribe and SpinVox (http://www.spinvox.com) use humans to convert the speech to text"
Here is a quote from SpinVox's Daniel Doulton, Co-founder and VP Marketing, Strategy and Development:
"At the heart of VMCS (Voicemail Conversion System) is an automated process that combines state-of-the-art techniques - including artificial intelligence, voice recognition and natural linguistics - with the ability to learn from human beings, all whilst maintaining the highest standards of security and privacy.
Whatever language you speak, the majority of the conversion of your voice messages is carried out by powerful computers. This means we can convert your messages very quickly and accurately. VMCS is very clever but one of the features of the system, of which we are most proud, is that it is modest enough 'to know what it doesn't know'.
That means that when it encounters words or phrases it hasn’t come across before, it contacts a SpinVox language expert for assistance. Once the word or phrase is understood, the language expert then immediately updates the VMCS, making it a truly live-learning system, which knows more and more, hour by hour, constantly improving the level of service we are able to offer.
VMCS expertise has been built up over the past four years - converting millions of messages from millions of different voices and accents in English, French, Spanish and German - to create a system so accurate and effective that once people try SpinVox, 85% of them decide they can’t do without it."
I hope that clears up any misconceptions, I'm a happy SpinVoxer.
As to the question of price, read the fine print, Callwave is "Free during Beta"...
While it’s true that the SpinVox voice-to-text conversion service is incredibly accurate, it’s not accurate to say that 'SpinVox uses humans to convert the speech to text'. This leads to a misunderstanding, so I’d like to clarify. SpinVox has developed a sophisticated learning system called the Voice Message Conversion System to automatically carry out the majority of conversions. When the system encounters a word or phrase it does not know or understand, it is able to refer to a human for assistance. The human then trains the system so that word or phrase becomes known to the VMCS for future use. In that way, the VMCS is constantly evolving and learning, increasing in accuracy and speed with each conversion. SpinVox is integrating its services with the largest global carriers and has already signed deals in North America with the likes of Alltel and Cincinnati Bell in the U.S. and Rogers and Sasktel in Canada. By next year, hundreds of millions of subscribers worldwide will be able to receive their voicemails in text form as a service offered directly from their carrier and powered by the SpinVox VMCS. This would be impossible to achieve without machine-based conversion.
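The flow described in the two posts above (automatic conversion, a human consulted only for unknown words, and the answer fed back into the system) can be sketched as a toy model. This is not SpinVox's implementation: the class `GistTranscriber`, its `ask_human` stand-in, and the use of uppercase tokens in place of recognized audio segments are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class GistTranscriber:
    """Toy model: automatic conversion with a human fallback for unknown tokens."""

    # Hypothetical vocabulary mapping a recognized token to its text form.
    vocabulary: dict = field(default_factory=dict)
    # Counts how many times the (simulated) human expert was consulted.
    human_lookups: int = 0

    def ask_human(self, token: str) -> str:
        # Stand-in for the language expert; here the "expert" simply
        # returns the token in lowercase.
        self.human_lookups += 1
        return token.lower()

    def convert(self, tokens: list) -> str:
        words = []
        for token in tokens:
            if token not in self.vocabulary:
                # Unknown token: escalate once, then remember the answer
                # so later messages are handled automatically.
                self.vocabulary[token] = self.ask_human(token)
            words.append(self.vocabulary[token])
        return " ".join(words)


transcriber = GistTranscriber({"CALL": "call", "ME": "me"})
first = transcriber.convert(["CALL", "ME", "BACK"])
lookups_after_first = transcriber.human_lookups
second = transcriber.convert(["CALL", "ME", "BACK"])
lookups_after_second = transcriber.human_lookups
```

In this sketch the first message triggers one human lookup (for the unknown token), and the second identical message triggers none, which is the "learning" behaviour the posts describe.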
Brandon Miniman
09-22-2007, 07:32 PM
Very interesting! It seems that CallWave doesn't use such a system, one that flags a message or word that isn't understood so that a human can interpret it.
Art Rosenberg
09-25-2007, 01:18 PM
I wrote about the move to "voice-to-text messaging" in my Unified-View column earlier this year and was shocked to get an email from a UK reader claiming that human transcribers were doing all the heavy lifting for SpinVox. Rather than publicize the allegation, I tracked it down through my industry contacts and eliminated it as a source of confusion to the marketplace.
I am sorry that you did not do the same. I have known the folks at CallWave for many years and they don't need this kind of misleading endorsement.
As to the reality of this kind of service, it is obviously a sign that voice messaging is moving towards the reality of separating caller needs from call recipient needs with the help of mature speech recognition technology. This will apply to other areas of interface technology, e.g., user interfaces for multimodal mobile devices, where self-service applications will accept speech input, but provide more efficient visual information output. (See Intervoice announcements and my interview on www.ucstrategies.com)
One reality about speech recognition is that it will never be 100% accurate due to limitations of the vocabulary database, human accents, or ambient noise interference. For this reason, the recorded voice must still be retained for human reference, whether it is the recipient personally or a hosted, outsourced service.
Another aspect of voice messaging is that it is usually very ad lib and not well structured, so CallWave's "gist" approach is very efficient as a form of message notification. Storing the voice message in hosted storage, rather than a personal mailbox is also practical for both personal mobile devices as well as enterprise mail systems.
With unified communications and mobile "smart phones," we will see the rise of visual interfaces and text output, while VoIP telephony will become an option for less used voice conversations/conferencing between people in business communications. It's just a matter of time. Even there, federated presence management will reduce the guesswork for initiating a call, and the need for telephone answering voice messaging will be minimized accordingly.
Brandon Miniman
09-26-2007, 01:07 AM
What is the link to your article? I couldn't pull it up.
mgreene
09-28-2007, 11:22 AM
I've been using Callwave for a couple of weeks. I guess if I tell people to speak more slowly and clearly it might improve, but so far the accuracy has been problematic. I also suspect some problems in how it is syncing with my Verizon cell phone, as I now have several messages listed as new voice mail - but no messages there. Anyone have this type of experience?
- Michael
playdeep
11-26-2007, 02:57 PM
Any idea when SpinVox will start charging? Never mind, looks like they are now, as I see on their site. In the US, I mean.