How to control text-only aircraft using voice recognition in EuroScope
Controlling text-only aircraft has been a long-standing problem. Aliases cover most cases, and EuroScope also offers automatic text message generators, but communicating with a text-only aircraft still takes more time. Probably the most difficult part is that other pilots are not aware that you are typing a text message: the radio channel is silent because of that, not because you are waiting for them to call.
A would-be solution has been around for a long time: a speech recognition engine. With one we could speak to text-only aircraft and send the recognized message as chat. Unfortunately, I found that current speech recognition technology is still far from reliable and precise enough for that. The engines recognize many words, but they make too many mistakes and misinterpretations. In my tests I was never able to get more than 70-80 % of the words recognized correctly, which is far less than our purpose requires. I also found some words (e.g. zero) that I was simply unable to teach the engine to recognize, and without those we cannot control.
To reach a better, usable level of recognition we have to reduce the set of words and sentences that the program has to recognize. If we do this, then even if some words are missed or misrecognized, the final result will still be what we want to send to the pilot. For that I created a simple grammar (with regular expressions). You can define words, one-ofs, repeats, sequences and (probably the most useful) sounds like statements. From these blocks you build sentences, and these sentences are the real aliases. EuroScope will try to understand only these sentences and nothing more. The output text message can only be one of the sentences you defined.
Like nearly everything in the EuroScope environment, the grammar files are simple text files. You can edit them with Notepad. Every line is compiled on its own and contains a complete description.
The basic elements of a voice alias are the words. You have to define every word you would like to use in your aliases. This also helps the voice recognition engine, because only these words have to be recognized. A word that is not defined here will never be recognized.
There are two kinds of word definition. The first is the simple one:
WORD:approach
WORD:runway
WORD:squawk
WORD:land
WORD:takeoff
WORD:taxi
These lines are plain word definitions.
The second version is the word with a replacement:
WORD:zero:0
WORD:one:1
WORD:two:2
WORD:alpha:A
WORD:bravo:B
WORD:charlie:C
WORD:thousand:000
WORD:hundred:00
In the second case, when the engine recognizes the word it enters the replacement string into the message. Here you can also play with the spaces. When a word has a replacement string, no spaces are added around it. When it has no replacement, spaces are added around the word (multiple spaces are collapsed later).
In this way “victor echo bravo oscar sierra” will be converted to “VEBOS”, and “squawk two six two two” is converted to “squawk 2622”.
WORD:direct: proceed direct :
I also found that I was unable to get the engine to recognize the word “proceed”. So I defined the word “direct” to be replaced with “proceed direct”. It is just a trick.
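The spacing rule for word definitions can be sketched in Python. This is only an illustration of the rule as described above, not EuroScope's actual code: the function names `parse_words` and `render` are made up for this example, and trimming the leading and trailing space of the final message is an assumption.

```python
import re

def parse_words(grammar_text):
    """Map each spoken word to its replacement (None when it has none)."""
    words = {}
    for line in grammar_text.splitlines():
        if not line.startswith("WORD:"):
            continue
        # keyword, spoken word, optional replacement string
        parts = line.split(":", 2)
        words[parts[1]] = parts[2] if len(parts) == 3 else None
    return words

def render(spoken, words):
    """Build the chat text for a list of recognized words."""
    out = []
    for w in spoken:
        replacement = words[w]
        # No replacement: the word itself with spaces around it.
        # Replacement: inserted as-is, with no spaces added.
        out.append(f" {w} " if replacement is None else replacement)
    # Runs of spaces are collapsed; stripping the ends is an assumption.
    return re.sub(r" +", " ", "".join(out)).strip()

GRAMMAR = """WORD:squawk
WORD:two:2
WORD:six:6
WORD:thousand:000"""

words = parse_words(GRAMMAR)
print(render(["squawk", "two", "six", "two", "two"], words))
```

Because digits carry a replacement and “squawk” does not, the digits stick together while the keyword stays separated from them.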
Some words were simply unrecognizable by the engine. When I spoke them, the engine picked another word from the list. To cope with these regular misrecognitions you can use sounds like statements. With them, if the misrecognized word matches the regular expression, the correct word is used instead.
In the statement, the first word is the one you say and want to be recognized, while the second word is the one the speech recognition engine understood. Only words defined earlier can be used in either place.
Some sounds like pairs are common:
SOUNDS:four:for
SOUNDS:two:to
SOUNDS:descend:descent
SOUNDS:descent:descend
Without these rules the words cannot be recognized at all. Others are specific to the speaker:
SOUNDS:via:zero
SOUNDS:victor:zero
SOUNDS:gate:eight
Just try your words, and if you regularly receive another word, add a sounds like statement.
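One way to picture how sounds like statements behave is the following Python sketch. It is a simplified model under the assumption that the matcher knows which word the grammar expects at a given position; `parse_sounds` and `accepts` are hypothetical names, not part of EuroScope.

```python
def parse_sounds(grammar_text):
    """Collect, per intended word, the words the engine may return instead."""
    sounds = {}
    for line in grammar_text.splitlines():
        if line.startswith("SOUNDS:"):
            _, intended, heard = line.split(":")
            sounds.setdefault(intended, set()).add(heard)
    return sounds

def accepts(expected, heard, sounds):
    """True if the heard word can stand in for the word the grammar expects."""
    return heard == expected or heard in sounds.get(expected, set())

SOUNDS = parse_sounds(
    "SOUNDS:four:for\n"
    "SOUNDS:via:zero\n"
    "SOUNDS:victor:zero"
)

print(accepts("four", "for", SOUNDS))    # engine heard "for" where "four" fits
print(accepts("victor", "zero", SOUNDS)) # "zero" may also stand for "victor"
```

Note that the same misheard word (“zero”) can stand for several intended words. In this model the position in the sentence decides which one is meant, which is why the statement is checked against the expected word rather than applied as a blind substitution.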
From now on I am talking about expressions, not only words. An expression can be a word or an element defined by one of the statements below. There are some simple general rules:
The one-of statement is part of a regular expression definition. With it you can define an element that is exactly one of the listed expressions. The first string is the name of the expression, and the following ones are the options.
ONEOF:number:zero:one:two:three:four:five:six:seven:eight:nine
Here I defined the “number” expression, which can be any of “zero”, “one”, “two” … “nine”.
ONEOF:direction:left:right
When turning we can add the “direction” expression, which can be “left” or “right”.
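Since the grammar is described in terms of regular expressions, a one-of statement can be thought of as an alternation. The Python sketch below shows that reading. It is an interpretation for illustration only, and `oneof_to_regex` is a hypothetical helper.

```python
import re

def oneof_to_regex(line):
    """Turn 'ONEOF:name:a:b:c' into (name, pattern matching exactly one option)."""
    keyword, name, *options = line.split(":")
    if keyword != "ONEOF" or not options:
        raise ValueError(f"not a ONEOF statement: {line!r}")
    # Each option is escaped so it is matched literally.
    pattern = re.compile("(?:" + "|".join(map(re.escape, options)) + ")")
    return name, pattern

name, direction = oneof_to_regex("ONEOF:direction:left:right")
print(name, bool(direction.fullmatch("left")), bool(direction.fullmatch("straight")))
```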
When you need the same type of element zero, one or more times, create a repeated expression. You can also use it for optional elements, with a minimum of 0 and a maximum of 1 occurrence. The first string is the name of the expression, followed by the expression to repeat and then the minimum and maximum number of occurrences.
REPEAT:two_numbers:number:2:2
REPEAT:three_numbers:number:3:3
REPEAT:four_numbers:number:4:4
REPEAT:optional_direction:direction:0:1
In these examples “three_numbers” is good for a heading and “four_numbers” is good for a squawk, for example. The “optional_direction” can be used as an optional runway designator.
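The minimum and maximum of a repeated expression can be illustrated with a small Python function that consumes words from the front of what was spoken. This is a greedy sketch under simplifying assumptions (the repeated element is a plain set of words), and `match_repeat` is a hypothetical name.

```python
def match_repeat(words, options, minimum, maximum):
    """Consume between minimum and maximum leading words drawn from options.

    Returns the number of words consumed, or None if fewer than
    minimum words match.
    """
    count = 0
    while count < maximum and count < len(words) and words[count] in options:
        count += 1
    return count if count >= minimum else None

NUMBER = {"zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"}
DIRECTION = {"left", "right"}

# REPEAT:four_numbers:number:4:4
print(match_repeat(["two", "six", "two", "two"], NUMBER, 4, 4))
# REPEAT:optional_direction:direction:0:1 with no direction spoken
print(match_repeat(["heading"], DIRECTION, 0, 1))
```

With a minimum of 0 the expression always matches, consuming nothing when the element is absent, which is what makes it usable for optional parts.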
To concatenate several expressions into one, use the sequence statement. It is as simple as it looks. The first string is the name of the expression, and the rest are the expressions to be used.
SEQUENCE:qnh_data:by:QNH:four_numbers
SEQUENCE:badov:bravo:alpha:delta:oscar:victor
SEQUENCE:wind_normal:wind:three_numbers:at:two_numbers:knots:optional_gusting
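To see how words, one-ofs, repeats and sequences combine, here is a Python sketch that translates the statements into ordinary regular expressions over space-separated words. It is one possible reading of the grammar, not EuroScope's implementation. The name `compile_grammar` is hypothetical, replacement strings and sounds like statements are ignored, and every expression is assumed to be defined before it is used.

```python
import re

def compile_grammar(text):
    """Translate WORD/ONEOF/REPEAT/SEQUENCE lines into regex fragments by name."""
    frag = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, name, *args = line.split(":")
        if kind == "WORD":
            # One spoken word followed by a separating space.
            frag[name] = re.escape(name) + " "
        elif kind == "ONEOF":
            frag[name] = "(?:" + "|".join(frag[a] for a in args) + ")"
        elif kind == "REPEAT":
            element, lo, hi = args
            frag[name] = "(?:%s){%s,%s}" % (frag[element], lo, hi)
        elif kind == "SEQUENCE":
            frag[name] = "".join(frag[a] for a in args)
    return frag

def matches(pattern, spoken_words):
    """Check a list of spoken words against a compiled expression."""
    return pattern.fullmatch("".join(w + " " for w in spoken_words)) is not None

DIGITS = "zero one two three four five six seven eight nine".split()
GRAMMAR = "\n".join(
    [f"WORD:{w}" for w in ["by", "QNH"] + DIGITS]
    + [
        "ONEOF:number:" + ":".join(DIGITS),
        "REPEAT:four_numbers:number:4:4",
        "SEQUENCE:qnh_data:by:QNH:four_numbers",
    ]
)

qnh = re.compile(compile_grammar(GRAMMAR)["qnh_data"])
print(matches(qnh, "by QNH one zero one three".split()))
print(matches(qnh, "by QNH one zero one".split()))
```

Building each named expression from the fragments of earlier ones keeps every line independent, in the spirit of each grammar line being compiled on its own.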
The sentences are the real voice aliases. Technically they are just nameless sequences. The alias matcher code takes all the words spoken so far and selects the sentence that matches best.
First of all you need to install the speech recognition engine. Download the Speech SDK 5.1 from http://www.microsoft.com/downloads/details.aspx?FamilyID=5e86ec97-40a7-453f-b0ee-6583171b4530&displaylang=en and install it. You need SpeechSDK51.exe (approx. 70 MB).
In EuroScope, open the voice setup dialog. At the bottom you can find the Voice Alias grammar field. Enter your grammar file name there and tick the Enable voice alias recognition check box. A message dialog may pop up listing all the problems in your grammar file. Fix them and reload the file. You do not have to leave EuroScope to change the grammar.
If the grammar file is OK, you can press the Test grammar button and start speaking right away. Say all the words and check whether the engine recognizes them well. Which words are problematic? Try adding sounds like statements to make recognition more stable. And of course test all your sentences again and again. Be patient: building a usable grammar file takes some time and practice. You can also test your grammar without speaking. Just edit the content of the source edit box to see which sentence is recognized (see the next chapter about it).
When voice alias recognition is enabled, you can use it in the following way:
I packed a VoiceAlias.txt file into the ZIP. It works well for me, but I guess you will need to adapt it to your environment.
I hope it will make controlling text-only planes a bit more fun.