Control your computer using your voice: input text, browse the web,
simulate keyboard and mouse input, and open and control programs and
documents.
Automatic speech recognition is the
process by which a computer maps an acoustic speech signal to text.
Automatic speech understanding is the process by which a computer maps
an acoustic speech signal to some form of abstract meaning of the
speech. Speech synthesis is the task of transforming written input to
spoken output. The input can either be provided in a graphemic/orthographic
or a phonemic script, depending on its source. Because speech technology
draws on phonology, linguistics, signal processing, statistics,
computer science, acoustics, connectionist networks, psychology and
other fields, it involves many distinct technologies.
Speech recognition programs are either "speaker
dependent" or "speaker independent." The speaker-dependent programs
adjust to the way you speak, which requires a "training" process. The
speaker-independent programs attempt to recognize anyone's speech,
without benefit of training.
Speaker-dependent programs tend to be smaller,
faster, and more accurate than speaker-independent programs. However,
they also require more time to learn because you have to train the
program to recognize your voice patterns. Voice LookUp, reviewed below,
is a speaker-dependent program. Speaker-independent programs are easier
to learn and use, but they tend to be larger and require more power.
PDsay, reviewed below, is speaker independent.
Speech recognition programs also differ in purpose.
Both of the Pocket PC programs I review in this article recognize spoken
commands. Both are capable of looking up contacts by name and launching
programs. Additionally, one of them has text-to-speech capability: it
will "speak out" the information you request. However, neither of these
programs will translate your speech into text. Current speech-to-text
programs are too large and require too much CPU power to be practical
for the Pocket PC. They are available for desktop PCs, but even then
they use speaker-dependent recognition to improve accuracy and increase
speed.
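Looking up a contact from a spoken name can be pictured as matching the recognizer's best-guess text against the address book. The sketch below is only an illustration of that idea, not how either reviewed program works: the contact list, the `lookup_contact` helper and the 0.6 similarity cutoff are all assumptions, and the matching is done on text with Python's standard `difflib` rather than on audio.

```python
import difflib

# Hypothetical contact list; a real program would read the device's address book.
CONTACTS = ["Alice Johnson", "Bob Smith", "Carol Diaz"]


def lookup_contact(heard, contacts=CONTACTS, cutoff=0.6):
    """Return the contact whose name is closest to the recognized text, or None."""
    # Compare case-insensitively, but return the name as stored.
    by_lower = {name.lower(): name for name in contacts}
    matches = difflib.get_close_matches(
        heard.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


if __name__ == "__main__":
    print(lookup_contact("bob smyth"))  # a slightly misrecognized name
    print(lookup_contact("zzz"))        # nothing close enough
```

A tolerant match like this is one way a command recognizer could cope with small recognition errors without attempting full speech-to-text.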
There are a few things about speech recognition on
the Pocket PC that you should know:
Speech recognition software requires a great deal
of space: both of these programs occupy over 4 MB of storage and may
require another 4 to 5 MB of RAM to run.
The quality of the built-in microphones ranges
from acceptable to bad—The microphones included in Pocket PCs aren't
designed for high-fidelity recording. As a result, the programs have to
work hard to recognize your speech. Speak clearly and keep your lips
within an inch or so of the microphone.
Support for Pocket PC 2002 isn't fully ironed out—Because
Microsoft hasn't officially released the software development kit (SDK)
for Pocket PC 2002, there were some definite rough edges when I tested
this software on a Pocket PC 2002. Both vendors have committed to
addressing Pocket PC 2002 issues as soon as Microsoft releases the SDK.
Let's take a look at two speech recognition programs
for the Pocket PC.
Voice LookUp from HandHeld Speech
Voice LookUp, from HandHeld Speech (www.handheldspeech.com),
lets you look up contacts and switch applications using voice commands.
Its author freely admits that he doesn't invest heavily in developing
Voice LookUp's user interface. His efforts are aimed squarely at
improving the recognition engine, and this strategy seems to have paid
off. Voice LookUp is faster and more accurate than PDsay.
The installation file is a self-extracting archive
file. Click on the archive file to extract the necessary program files.
Then click on "Install.exe" to put the program on your Pocket PC. The
program itself is large, but the majority of it can be installed on a
storage card, making the storage requirements for the program a little
more palatable. After the installation is complete you must train the
program to recognize your voice patterns. This takes less than five
minutes and involves speaking a few sentences into your Pocket PC.
Can Skype replace my local phone company service?
Now that SkypeIn is here, the answer is yes. SkypeIn gives you a real
phone number, so anyone can call you (actually, your computer) from any
regular landline telephone. I also use my cell phone for long-distance
calls, since they are free after 7pm, but I have had a cell phone for
many years, so any saving it provides is not counted in the example
below.
Here is a breakdown of my typical phone bill BEFORE Skype:
Local phone service through SBC: $38.00 per month US
Long-distance phone service through various providers: $26.00 per month US
TOTAL before Skype: $64.00 per month US
Here is a breakdown of my typical phone bill AFTER Skype:
Local phone service through Vonage and SkypeIn: $16.00 + $3.41 per month US
Long-distance phone service through Vonage and SkypeOut: $5.00 per month US
TOTAL after Skype and Vonage: $24.41 per month US
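The arithmetic behind the two breakdowns can be checked with a short script. This is a minimal sketch using only the dollar figures quoted above; the function name `monthly_savings` and the grouping of the amounts into lists are choices made for illustration.

```python
from decimal import Decimal

# Monthly charges in US dollars, taken from the two breakdowns above.
BEFORE = [Decimal("38.00"), Decimal("26.00")]                 # local, long distance
AFTER = [Decimal("16.00"), Decimal("3.41"), Decimal("5.00")]  # local (two parts), long distance


def monthly_savings(before, after):
    """Return (total_before, total_after, savings, percent_saved)."""
    total_before = sum(before)
    total_after = sum(after)
    savings = total_before - total_after
    # Round the percentage to one decimal place.
    percent = (savings / total_before * 100).quantize(Decimal("0.1"))
    return total_before, total_after, savings, percent


if __name__ == "__main__":
    total_before, total_after, savings, percent = monthly_savings(BEFORE, AFTER)
    print(f"Before: ${total_before}  After: ${total_after}")
    print(f"Savings: ${savings} per month ({percent}%)")
```

`Decimal` is used instead of floating point so the cents add up exactly.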
What Skype can do
What is a local phone service anyway? For starters, it is a phone
number you give out to everyone you know. It lets you make local and
long-distance calls, and it offers call waiting, caller ID, three-way
calling and voicemail options. SkypeOut handles your long-distance
phone service needs.
Skype with SkypeIn and SkypeOut gives you everything I just mentioned
that your local and long-distance phone companies provide, but far
cheaper. Skype also gives you computer-to-computer calling, so many of
your friends and family may not need the telephone at all: they can
just call you on the computer for free using Skype.
But I don't want to use my computer every time I make or receive a phone call
With SkypeIn, SkypeOut and a USB-to-RJ11 gateway device, you can get rid
of your local telephone service and still make and receive phone calls
anywhere in your home, just as you do now.
What Skype can NOT do
There are three things you cannot do with Skype: dial 411 (information),
use the 911 emergency service, or send faxes. Faxing via Skype or
SkypeOut just doesn't work. This is why I keep Vonage for 911 service.
If I need 411 service, which is not often, I use my cell phone, the
Yellow Pages or an Internet Yellow Pages site.
Conclusion - Join the Skype revolution
With the addition of SkypeIn and SkypeOut, Skype can replace your
local and long-distance phone service at a fraction of the cost of your
current services. In my example the bill is nearly two-thirds less
than I used to pay. Skype even provides voicemail if you want to use
it, or you can do what I do and set Skype voicemail to answer after
90 seconds or more so that your home answering machine picks up before
Skype voicemail does.
Send and Receive Fax on Skype using PLUS FAX
PLUS FAX lets you send and receive faxes with your Skype account. You can
fax paper documents to any Skype user using PLUS FAX. The process of
faxing a paper document to a Skype user is similar to faxing a paper
document to an e-mail address, which I described in my previous post on
PLUS FAX. The only difference is that you enter the Skype ID instead of
an e-mail address. Faxing a digital document to a Skype user involves
downloading a virtual printer from PLUS FAX. Once this virtual printer
has been downloaded and installed, you can send faxes directly from
any MS Windows application with printing capability. Visit the PLUS FAX
web site for more information and a trial.
Speech recognition (also known as automatic speech recognition or
computer speech recognition) converts spoken words to text. The term
"voice recognition" is sometimes used to refer to speech recognition
where the recognition system is trained to a particular speaker, as is
the case for most desktop recognition software; hence there is an
element of speaker recognition, which attempts to identify the person
speaking, to better recognize what is being said. Speech recognition is
the broader term: it covers systems that can recognize almost anybody's
speech, such as a call-centre system designed to recognize many voices.
Voice recognition refers to a system trained to a particular user, which
recognizes that user's speech based on their unique vocal sound.
Speech recognition applications include voice dialing (e.g., "Call
home"), call routing (e.g., "I would like to make a collect call"),
domotic appliance control and content-based spoken audio search (e.g.,
find a podcast where particular words were spoken), simple data entry
(e.g., entering a credit card number), preparation of structured
documents (e.g., a radiology report), speech-to-text processing (e.g.,
word processors or emails), and voice control in aircraft cockpits
(usually termed Direct Voice Input).
The first speech recognizer appeared in 1952 and consisted of a device for
the recognition of single spoken digits [1]. Another early device was the
IBM Shoebox, exhibited at the 1962 Seattle World's Fair.
One of the most notable domains for the commercial application of speech
recognition in the United States has been health care and in particular
the work of the medical transcriptionist (MT).
According to industry experts, at its inception, speech recognition (SR)
was sold as a way to completely eliminate transcription rather than make
the transcription process more efficient, hence it was not accepted. It
was also the case that SR at that time was often technically deficient.
Additionally, to be used effectively, it required changes to the ways
physicians worked and documented clinical encounters, which many if not
all were reluctant to do. The biggest limitation to speech recognition
automating transcription, however, is seen as the software. The nature
of narrative dictation is highly interpretive and often requires
judgment that may be provided by a real human but not yet by an
automated system. Another limitation has been the extensive amount of
time required by the user and/or system provider to train the software.
A distinction in ASR is often made between "artificial syntax systems"
which are usually domain-specific and "natural language processing"
which is usually language-specific. Each of these types of application
presents its own particular goals and challenges.
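An "artificial syntax" system accepts only utterances that fit a small, domain-specific grammar. The sketch below illustrates that idea on already-recognized text; the two command patterns, the action names and the `parse_command` helper are hypothetical, chosen to echo the voice-dialing example ("Call home") mentioned earlier.

```python
import re

# Hypothetical command grammar: each pattern maps recognized text to an action.
GRAMMAR = [
    (re.compile(r"^call (?P<who>[a-z ]+)$"), "dial"),
    (re.compile(r"^open (?P<app>[a-z ]+)$"), "launch"),
]


def parse_command(text):
    """Return (action, slots) for recognized text, or None if outside the grammar."""
    # Lowercase and collapse whitespace so minor transcript differences don't matter.
    normalized = " ".join(text.lower().split())
    for pattern, action in GRAMMAR:
        match = pattern.match(normalized)
        if match:
            return action, match.groupdict()
    return None


if __name__ == "__main__":
    print(parse_command("Call home"))
    print(parse_command("Open  Calendar"))
    print(parse_command("What is the weather like"))  # not covered by the grammar
```

Anything outside the grammar is simply rejected, which is what keeps such systems small and accurate compared with open-ended natural language processing.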
In the health care domain, even in the wake of improving speech recognition
technologies, medical transcriptionists (MTs) have not yet become
obsolete. Many experts in the field anticipate that with increased
use of speech recognition technology, the services provided may be
redistributed rather than replaced. Speech recognition is also used to
help deaf people understand the spoken word via speech-to-text conversion.
Speech recognition can be implemented at the front end or the back end
of the medical documentation process.
Front-End SR is where the provider dictates into a speech-recognition
engine, the recognized words are displayed right after they are spoken,
and the dictator is responsible for editing and signing off on the
document. It never goes through an MT/editor.
Back-End SR or Deferred SR is where the provider dictates into a digital
dictation system and the voice is routed through a speech-recognition
machine. The recognized draft document is then routed, along with the
original voice file, to the MT/editor, who edits the draft and finalizes
the report. Deferred SR is currently in wide use in the industry.
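The difference between the two workflows is mainly who edits the recognized draft and when. The following sketch lays out the steps described above side by side; the `Dictation` class, the `workflow` function and the example names are hypothetical and only restate the process in code form.

```python
from dataclasses import dataclass


@dataclass
class Dictation:
    provider: str    # the person dictating
    audio_file: str  # the recorded voice file


def workflow(dictation, mode):
    """Return the ordered steps for 'front-end' or 'back-end' (deferred) SR."""
    if mode == "front-end":
        return [
            f"{dictation.provider} dictates into the speech-recognition engine",
            "recognized words are displayed as they are spoken",
            f"{dictation.provider} edits and signs off on the document",
        ]
    if mode == "back-end":
        return [
            f"{dictation.provider} dictates into the digital dictation system",
            f"{dictation.audio_file} is run through speech recognition to produce a draft",
            f"draft and {dictation.audio_file} are routed to the MT/editor",
            "MT/editor edits the draft and finalizes the report",
        ]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    d = Dictation(provider="Dr. Example", audio_file="visit-001.wav")
    for mode in ("front-end", "back-end"):
        print(mode)
        for step in workflow(d, mode):
            print("  -", step)
```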
Many Electronic Medical Records (EMR) applications can be more effective
and may be performed more easily when deployed in conjunction with a
speech-recognition engine. Searches, queries, and form filling may all
be faster to perform by voice than by using a keyboard.
Speech Technology: voice recording, digital voice recorders, audio
analysis, noise cancellation, noise reduction, DSP board, embedded
solutions
Speech recognition is a natural gift of human beings. We all speak and
understand what others say. Even other animals communicate in their own
ways. When someone speaks, the listener's ear receives the sound waves
and converts them into signals that the brain can interpret. You may
wonder why I am talking about medical science in an article about
Windows Vista.
Computers have no such natural intelligence. Instead, a voice database
is created for each user; incoming speech from the microphone is
compared with it, and the best-matching word is produced as the result.
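Comparing incoming speech with a per-user voice database can be pictured as nearest-template matching. The sketch below is a deliberately simplified illustration, not a description of how Windows Vista works: it assumes each word has already been reduced to a short numeric feature vector, and the vectors, the `recognize` function and the plain Euclidean distance are all made up for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, templates):
    """Return the word whose stored template is nearest to the incoming features."""
    return min(templates, key=lambda word: distance(features, templates[word]))


# Hypothetical per-user "voice database": word -> feature vector from training.
TEMPLATES = {
    "open": [0.9, 0.1, 0.3],
    "close": [0.2, 0.8, 0.5],
    "save": [0.4, 0.4, 0.9],
}

if __name__ == "__main__":
    incoming = [0.85, 0.15, 0.35]  # features extracted from the microphone signal
    print(recognize(incoming, TEMPLATES))
```

Real recognizers work on sequences of acoustic features and use statistical models rather than a single vector per word, but the compare-and-pick-the-closest idea is the same.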
Windows Vista solves, to a great extent, the problem of text and
commands being misinterpreted when entered by speech, so speech
recognition is much more pleasant to use in Vista. It helps the user
type very large documents with less use of the keyboard and mouse.
Speech recognition has improved in Vista and will keep improving in the
years to come.
LIVE DEMO SITES FOR SPEECH RECOGNITION AND SYNTHESIS EVALUATION
Online speech to text
www.wavetotext.com
Create TTS Voices
www.ttsbuilder.com
100% .Net soft phones
www.h323phone.com
Voice Verification Online
www.windentify.com
Q&A: The Future Of Speech
Editor's Note: As we reported earlier this week, IBM unveiled new
speech recognition technology on Tuesday that can comprehend the
nuances of spoken English, translate it on the fly, and even create
on-the-fly subtitles for foreign-language television programs.
After spending the day at IBM headquarters viewing demos of the
company's latest research projects, reporter Robyn Peterson caught
up with two of the research leaders in IBM's speech recognition
group, Dr. David Nahamoo, manager of Human Language Technologies,
and Dr. Roberto Sicconi, manager of Multimodal Conversational
Solutions. The following transcript has been edited for clarity.
PC Magazine: What's wrong with speech recognition software that's
available on the market today?
Dr. David Nahamoo (DN): The way to look at speech technology is
to look at the progression of human language.
Microsoft's .NET Speech group is focused on
supporting new speech technologies, including the new Speech Application
Language Tags (SALT) standard. They started the SALT forum (www.saltforum.org)
with five other companies in hopes of defining a standard that will help
speech-enable the entire Internet. In other words, their goal is to let
you talk to your computer to interact with the Internet. This is an
ambitious goal given the relatively low penetration of speech
recognition technology today. Until the user interface on our devices is
Web-centric, SALT probably won't help us much in our quest to access our
personal information with our voice, but it gives us something to look
forward to.
Speech recognition technology has improved to the
point that we now have reasonable (though less-than-perfect) command
recognition programs available for the Pocket PC. One of these programs
will even read information to us. Which one of the current offerings do
you pick? The decision is based on whether you favor the recognition
accuracy of Voice LookUp or PDsay's ability to read your information to
you.
We're not quite ready for speech-to-text yet—no
dictation on the Pocket PC. But that can't be too far in the future, and
when it comes, the utility of the Pocket PC and similar devices will
increase dramatically.