How to control text-only aircraft using voice recognition in EuroScope

About voice aliases

Controlling text-only aircraft has been a problem for a long time. There are aliases you can use for most cases, and EuroScope also offers automatic text message generators. But a text-only aircraft still takes more time to communicate with. And probably the most difficult thing is that other pilots are not aware that you are typing a text message: the radio channel is silent because of that, not because you are waiting for them to call.

A would-be solution has been around for a long time: a speech recognition engine. With one we could speak to the text-only aircraft and then send the recognized message as chat. Unfortunately, I found that current speech recognition technology is still not reliable and precise enough for that. The engines can recognize many words, but there are too many mistakes and misinterpretations. In my tests I was never able to get more than 70-80% of the words recognized successfully. That is far less than our purpose requires. I also found some words (e.g. zero) that I was simply unable to train the engine to recognize. Without them we cannot control.

To achieve a better, usable level of recognition we have to reduce the set of words and sentences that the program has to recognize. If we do this, then even if some words are missed or misrecognized, the final result can still be what we want to send to the pilot. For that I created a simple grammar (with regular expressions). You can define words, one-ofs, repeats, sequences and (probably the most useful) sounds-like statements. From these blocks you build sentences. These sentences are the real aliases. EuroScope will try to understand only these sentences and nothing more. The output text message can only be one of the sentences you defined.

The grammar file structure

Like nearly everything in the EuroScope environment, grammar files are simple text files. You can edit them with Notepad. Every line is compiled on its own and contains a complete description.

The words

The basic elements of a voice alias are the words. You have to define all the words you would like to use in your aliases. This also helps the voice recognition engine, because only these words have to be recognized. A word that is not defined here will never be recognized.

There are two kinds of word definition. The first is the simple one:

WORD:approach
WORD:runway
WORD:squawk
WORD:land
WORD:takeoff
WORD:taxi

These lines are plain word definitions.

The second version is the word with replacement:

WORD:zero:0
WORD:one:1
WORD:two:2
WORD:alpha:A
WORD:bravo:B
WORD:charlie:C
WORD:thousand:000
WORD:hundred:00

In the second case, if the engine recognizes the word, it inserts the replacement string into the message. Here you can also play with the spaces. When a word has a replacement string, no spaces are added around it. If there is no replacement, spaces are added around the word (multiple spaces are collapsed later).

In this way “victor echo bravo oscar sierra” is converted to “VEBOS” and “squawk two six two two” is converted to “squawk 2622”. Likewise “one thousand” becomes “1000” and “five hundred” becomes “500”. The compiler contains just one special rule: when the word “hundred” is recognized, “seven thousand five hundred” is converted to “7500” (plain replacement would give “7000500”).

WORD:direct: proceed direct :

I also found that I could not get the engine to recognize the word “proceed”. So I defined the word “direct” to be replaced by “proceed direct”. It is just a trick.
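
The spacing rules above can be illustrated with a minimal Python sketch. It is not EuroScope's code, only a model of the behaviour described here: the `WORDS` table and the `render` function are hypothetical names, the phonetic letters and digits not listed above are assumed to follow the same pattern, and the special “thousand … hundred” rule is not modelled.

```python
import re

# Hypothetical word table mirroring WORD lines: None means "no replacement".
WORDS = {
    "squawk": None,
    "runway": None,
    "zero": "0", "one": "1", "two": "2", "five": "5", "six": "6",
    "alpha": "A", "bravo": "B", "charlie": "C",
    "echo": "E", "oscar": "O", "sierra": "S", "victor": "V",
    "thousand": "000", "hundred": "00",
    "direct": " proceed direct ",   # replacement carries its own spaces
}

def render(spoken: str) -> str:
    """Turn a string of recognized words into the text message."""
    parts = []
    for word in spoken.split():
        replacement = WORDS[word]        # KeyError: word not in the grammar
        if replacement is None:
            parts.append(f" {word} ")    # plain word: spaces around it
        else:
            parts.append(replacement)    # replacement: no spaces added
    # Multiple spaces are collapsed at the end.
    return re.sub(r" +", " ", "".join(parts)).strip()

print(render("victor echo bravo oscar sierra"))  # VEBOS
print(render("squawk two six two two"))          # squawk 2622
```

Because “direct” carries its spaces inside the replacement string, `render("direct victor echo bravo oscar sierra")` gives “proceed direct VEBOS”.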

The sounds like statements

Some words were simply unrecognizable by the engine. When I spoke them, the engine picked another word from the list. To cope with these regular misrecognitions you can use sounds-like statements. With such a statement, if the misrecognized word matches the regular expression, the correct word is used instead.

In the statement, the first word is the one you say and want to be recognized, while the second word is the one the speech recognition engine actually understood. Only words that have already been defined can be used in either place.

Some sounds-like pairs are common:

SOUNDS:four:for
SOUNDS:two:to
SOUNDS:descend:descent
SOUNDS:descent:descend

Without these rules such words cannot be recognized at all. Then there are pairs that are specific to the speaker:

SOUNDS:via:zero
SOUNDS:victor:zero
SOUNDS:gate:eight

Just try your words, and if you regularly get another word instead, add a sounds-like statement.
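
As a rough illustration of how such pairs could be applied, here is a small Python sketch. It assumes the matcher knows which words the grammar allows at the current position (the `expected` set); the `resolve` function and that interface are hypothetical and not taken from EuroScope.

```python
# (spoken word, word the engine heard), as in the SOUNDS lines above.
SOUNDS = [
    ("four", "for"),
    ("two", "to"),
    ("descend", "descent"),
    ("descent", "descend"),
    ("via", "zero"),
    ("victor", "zero"),
    ("gate", "eight"),
]

def resolve(heard, expected):
    """Return the word to use for `heard`, or None if nothing fits.

    `expected` is the set of words the grammar allows at this position.
    """
    if heard in expected:
        return heard                     # direct match, nothing to fix
    for intended, misheard in SOUNDS:
        if misheard == heard and intended in expected:
            return intended              # sounds-like substitution
    return None                          # no match at this position

print(resolve("for", {"four", "five"}))  # four
print(resolve("zero", {"victor"}))       # victor
```

A word that is itself allowed at the position always wins, so “zero” stays “zero” wherever a number is expected.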

Regular expressions

From now on I am talking about expressions, not only words. An expression can be a word or an element defined by one of the statements below. There are some simple general rules:

In an expression, only words and elements that have been defined earlier can be used. So recursion is impossible (which avoids endless loops).
All names must be globally unique (no two expressions or words can have the same name).
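
Both rules can be checked mechanically. The Python sketch below is only an illustration and not the EuroScope compiler: the `check_grammar` function and its messages are hypothetical, and it assumes the colon-separated line layout used by the WORD and SOUNDS lines above and by the ONEOF, REPEAT and SEQUENCE statements described in the following sections.

```python
def check_grammar(lines):
    """Return a list of problems: use before definition and duplicate names."""
    defined = set()
    problems = []
    for number, line in enumerate(lines, start=1):
        fields = line.split(":")
        kind = fields[0]
        if kind == "WORD":
            name, refs = fields[1], []            # the rest is replacement text
        elif kind == "SOUNDS":
            name, refs = None, fields[1:3]        # defines no new name
        elif kind == "REPEAT":
            name, refs = fields[1], fields[2:3]   # then minimum and maximum
        else:                                     # ONEOF and SEQUENCE
            name, refs = fields[1], fields[2:]
        for ref in refs:
            if ref not in defined:
                problems.append(f"line {number}: '{ref}' used before definition")
        if name is not None:
            if name in defined:
                problems.append(f"line {number}: duplicate name '{name}'")
            defined.add(name)
    return problems

for problem in check_grammar(["WORD:left", "ONEOF:direction:left:right", "WORD:left"]):
    print(problem)
# line 2: 'right' used before definition
# line 3: duplicate name 'left'
```

Since a name only becomes usable after its own line has been processed, an expression can never refer to itself, which is the reason recursion cannot occur.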

The one of statements

This statement is part of the regular expression definition. With it you can define an element that matches exactly one of the listed expressions. The first string is the name of the expression and the following ones are the options.

ONEOF:number:zero:one:two:three:four:five:six:seven:eight:nine

Here I defined the “number” expression, which can be any of “zero”, “one” … “nine”.

ONEOF:direction:left:right

When turning, we can add the “direction” expression, which can be “left” or “right”.

The repeat statements

When you need the same type of element zero, one or more times, create a repeated expression. You can also use it for optional elements, with a minimum of 0 and a maximum of 1 occurrence. The first string is the name of the new expression, followed by the expression to repeat and then the minimum and the maximum number of occurrences.

REPEAT:two_numbers:number:2:2
REPEAT:three_numbers:number:3:3
REPEAT:four_numbers:number:4:4
REPEAT:optional_direction:direction:0:1

In these examples “three_numbers” is good for a heading and “four_numbers” for a squawk, for example. The “optional_direction” can be used as an optional runway designator.

The sequence statements

To concatenate several expressions into one you can use the sequence statement. It is as simple as it looks. The first string is the name of the expression and the rest are the expressions to be used, in order.

SEQUENCE:qnh_data:by:QNH:four_numbers
SEQUENCE:badov:bravo:alpha:delta:oscar:victor
SEQUENCE:wind_normal:wind:three_numbers:at:two_numbers:knots:optional_gusting
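
To show how ONEOF, REPEAT and SEQUENCE expressions combine, here is a minimal Python matcher. It is a sketch under assumptions, not the EuroScope implementation: the grammar is held in a hypothetical dictionary instead of being parsed from a file, and the words “turn” and “heading” and the “turn_heading” sequence are invented for the example.

```python
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

# Hypothetical in-memory form of a grammar: name -> (kind, arguments...).
GRAMMAR = {name: ("WORD",) for name in DIGITS + ["left", "right", "turn", "heading"]}
GRAMMAR["number"] = ("ONEOF", *DIGITS)
GRAMMAR["direction"] = ("ONEOF", "left", "right")
GRAMMAR["three_numbers"] = ("REPEAT", "number", 3, 3)
GRAMMAR["optional_direction"] = ("REPEAT", "direction", 0, 1)
GRAMMAR["turn_heading"] = ("SEQUENCE", "turn", "optional_direction",
                           "heading", "three_numbers")

def ends(expr, tokens, start):
    """Yield every index at which `expr` can stop when it starts at `start`."""
    kind, *args = GRAMMAR[expr]
    if kind == "WORD":
        if start < len(tokens) and tokens[start] == expr:
            yield start + 1
    elif kind == "ONEOF":                # exactly one of the options
        for option in args:
            yield from ends(option, tokens, start)
    elif kind == "SEQUENCE":             # each part follows the previous one
        positions = {start}
        for part in args:
            positions = {e for p in positions for e in ends(part, tokens, p)}
        yield from sorted(positions)
    elif kind == "REPEAT":               # between low and high occurrences
        inner, low, high = args
        if low == 0:
            yield start
        positions = {start}
        for count in range(1, high + 1):
            positions = {e for p in positions for e in ends(inner, tokens, p)}
            if count >= low:
                yield from sorted(positions)

def matches(expr, text):
    """True if the whole text is matched by the expression."""
    tokens = text.split()
    return len(tokens) in ends(expr, tokens, 0)

print(matches("turn_heading", "turn left heading two seven zero"))  # True
print(matches("turn_heading", "turn heading zero nine zero"))       # True
print(matches("turn_heading", "turn left heading two seven"))       # False
```

Each expression only refers to names defined before it, so the recursion in `ends` always terminates.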

The sentences

The sentences are the real voice aliases. Technically they are just nameless sequences. The alias matcher code takes all the words spoken so far and selects the sentence that matches best.
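
How “best” is measured is not described here. As one possible interpretation, the Python sketch below counts how many spoken words appear in the same order in each candidate, using the standard `difflib` module. The sentences are hypothetical and are written as flat word lists for simplicity, whereas real sentences are built from expressions.

```python
from difflib import SequenceMatcher

# Hypothetical sentences, flattened to plain word lists.
SENTENCES = [
    "cleared to land runway".split(),
    "cleared for takeoff runway".split(),
    "taxi to runway".split(),
]

def score(spoken, sentence):
    """Number of spoken words found, in order, in the sentence."""
    matcher = SequenceMatcher(None, spoken, sentence, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())

def best_sentence(spoken_text):
    """Pick the sentence with the highest score (the first one wins ties)."""
    spoken = spoken_text.split()
    return " ".join(max(SENTENCES, key=lambda s: score(spoken, s)))

# "for" was missed by the engine, but the sentence is still found.
print(best_sentence("cleared takeoff"))  # cleared for takeoff runway
```

This shows why a small, closed set of sentences tolerates missed words: the output is always one of the defined sentences, never the raw recognition result.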

How to use it

Microsoft Speech Recognition engine

First of all you need to install the speech recognition engine. Download the Speech SDK 5.1 from http://www.microsoft.com/downloads/details.aspx?FamilyID=5e86ec97-40a7-453f-b0ee-6583171b4530&displaylang=en and install it. You need SpeechSDK51.exe (approx. 70 MB).

EuroScope setup

In EuroScope open the voice setup dialog. At the bottom you can find the Voice Alias grammar field. Enter your grammar file name there and tick the Enable voice alias recognition check box. A message dialog may pop up listing all the problems in your grammar file. Fix them and reload the file. You do not have to leave EuroScope to change the grammar.

If the grammar file is OK, you can press the Test grammar button and start speaking the words immediately. Say all the words and test whether the engine recognizes them well. Which words are problematic? Try adding sounds-like statements to make recognition more stable. And of course test all your sentences again and again. Be patient: building a usable grammar file takes some time and practice. You can also test your grammar without speaking. Just edit the content of the source edit box to see which sentence is recognized (see the next chapter about it).

Use it in EuroScope

When the voice alias recognition is enabled you can use it in the following way:

Press the primary PTT button. I expect you to say the callsign of the text aircraft here (I do suggest trying to recognize it).
Then, while the primary PTT is held down, press the secondary PTT as well. That switches on the recognition engine (and does not connect the secondary devices to the microphone). When it is enabled, the bottom line (prompt and message line) is hidden and two new read-write edit boxes are shown in its place. Both are empty at first.
As you talk, the recognized words are appended to the first edit box, separated by spaces. In the right-hand box you can see the best matching alias so far. Matched words are shown without any marking, sounds-like matches are flagged with {} around them, and non-matches are flagged with [].
When ready, release the secondary PTT button. You can then manually edit either of the edit boxes. If you change the input edit box, the content is re-analyzed for a better sentence match on every keystroke. Here you can use the replacement strings too (e.g. type just 2 instead of two, or A instead of alpha).
If the result is OK, press ENTER to copy the result to the command line. Alternatively, if you press the secondary PTT again (while the result edit box is not empty), the content is copied to the command editor and recognition of a new sentence starts. This is useful when multiple sentences are to be sent to the plane.
Send the text message to the plane in the usual way.
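
The marking used in the result box can be pictured with a short Python sketch. It rests on assumptions: the `annotate` function is hypothetical, words are compared position by position against one already chosen sentence, and it is assumed that {} is shown around the intended word while [] is shown around the word that was heard.

```python
# Heard word -> words it may stand for (from sounds-like statements).
SOUNDS_LIKE = {"for": ["four"], "to": ["two"], "eight": ["gate"]}

def annotate(heard_words, expected_words):
    """Mark each word as a match, a sounds-like match {} or a non-match []."""
    shown = []
    for heard, expected in zip(heard_words, expected_words):
        if heard == expected:
            shown.append(heard)                  # plain match
        elif expected in SOUNDS_LIKE.get(heard, []):
            shown.append("{" + expected + "}")   # sounds-like match
        else:
            shown.append("[" + heard + "]")      # non-match
    return " ".join(shown)

print(annotate("squawk for two one seven".split(),
               "squawk four two one six".split()))
# squawk {four} two one [seven]
```

A [] entry is a hint to correct that word by hand in the input edit box before pressing ENTER.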

The VoiceAlias.txt

I packed a VoiceAlias.txt file into the ZIP. It works well for me, but I guess you will need to adapt it to your environment.

I hope it will make controlling text-only planes more fun.