Android and Linux

Monday, October 25, 2010

Voice command/Tasker errata

More natural commands

Having played around with voice commands a little after my last post, I decided my speech wasn't natural enough. Wait, what? With voice commands and speech synthesis, isn't it usually the computer that doesn't sound natural? Well, in this case, it certainly doesn't feel natural to look at my phone and say "Forecast. Tuesday."

Based on the commands and tasks in my last post, I would say "forecast tuesday": the word "forecast" would trigger the forecast task, and the second word would pick the day to prepare the forecast for.

When my voice commands need filtering, I've gravitated toward keeping only the last word of what was recognized:
awk '{print $NF}' /sdcard/.voice > /sdcard/.voicetmp
and then changing the task from this:

5- Perform Action WXDAY if %VOICE matches forecast*

to this:

5- Perform Action WXDAY if %VOICE matches *forecast*

It's a small change, but before, the trigger had to be the first word and the day had to be the second. Now the trigger can be anywhere in the sentence and the day only needs to come last. "What's the forecast for Friday?" or "Boy, I sure do wish my phone could tell me what the forecast is supposed to be for the day of the 29th, which just happens to be a Friday" will both work and sound a lot better than "Forecast. Friday."
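To see what the awk filter hands to the task, here is a small sketch that runs it on sample sentences instead of /sdcard/.voice. The function name last_word and the sample phrases are only for illustration, and it assumes the recognizer returns plain words without punctuation.

```shell
#!/bin/sh
# last_word: print the final whitespace-separated field of each input line.
# Hypothetical helper name; it wraps the same awk call, reading stdin
# instead of /sdcard/.voice so it can be tried anywhere.
last_word() {
    awk '{print $NF}'
}

# Both the terse and the natural phrasing end in the same word.
echo "forecast friday" | last_word
echo "what's the forecast for friday" | last_word
```

Both calls print the same day, which is why the task no longer cares how long the sentence is.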

I'm only using the weather task as an example since it's been posted here. This can be used with any command which needs to extract a trigger and a variable from your speech.

I haven't had the need to input multiple variables yet, but I did whip up a couple scripts to give that flexibility:
#! /system/bin/sh
# Keep the last N words of the voice input, one per line (default: 1)
if test -z "$1"
then
  tr ' ' '\n' < /sdcard/.voice | tail -n1 > /sdcard/.voicetmp
else
  tr ' ' '\n' < /sdcard/.voice | tail -n${1} > /sdcard/.voicetmp
fi
You can execute this script followed by the number of items you want to extract, or followed by nothing for extracting one item. For example, if you want to extract three items and the script is named "vfilter," you'd put "vfilter 3" in the Locale Execute Plugin then set up your task accordingly.

I'm not sure what use this is, but I hate being limited to a single word, so I hammered this out and put it on my sdcard in case I need it.
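For a feel of what the multi-word filter produces, here is a rough sketch of the same tr and tail pipeline as a function that reads stdin. The name last_words and the sample sentence are made up for the example; the real script reads /sdcard/.voice and writes /sdcard/.voicetmp.

```shell
#!/bin/sh
# last_words: print the last N words of stdin, one per line.
# N defaults to 1 when no argument is given.
# Hypothetical stand-in for the "vfilter" script described above.
last_words() {
    tr ' ' '\n' | tail -n "${1:-1}"
}

# Keep the last three words of a sample sentence.
echo "remind me to buy milk" | last_words 3
```

Each extracted word lands on its own line, in the order it was spoken.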

Another approach, if you need more flexibility, would be to say the number of words you want as the last word of the command, and run either of these commands:
#! /system/bin/sh
awk '{ for(i=$NF;(NF-i)<NF;i--) { printf "%s%s",$(NF-i),FS } printf RS }' /sdcard/.voice > /sdcard/.voicetmp
awk '{ for(i=$NF;(NF-i)<NF;i--) { printf "%s%s",$(NF-i),FS } printf RS }'  /sdcard/.voice | tr ' ' '\n' > /sdcard/.voicetmp
The only difference is that the first one will output everything on one line and the second will output each word on its own line.

With this, you can say "Please google more common hades three" and, with "google" as the keyword that opens a search URL, the task will fill in the three words before the number: "more common hades."
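One thing worth checking before relying on this: the awk program treats the last field as a number, so it only works if the recognizer writes the count as a digit. A spelled-out "three" evaluates to 0 in awk and nothing is extracted. The sketch below runs the same awk program on stdin; the function name take_counted and the sample phrases are only for illustration.

```shell
#!/bin/sh
# take_counted: treat the last field as a count N and print the N words
# that come just before it, on one line.
# Hypothetical function name; the awk program is the one-line variant above.
take_counted() {
    awk '{ for(i=$NF;(NF-i)<NF;i--) { printf "%s%s",$(NF-i),FS } printf RS }'
}

# Count given as a digit: the three words before it are printed.
echo "please google more common hades 3" | take_counted

# Count spelled out: awk reads "three" as 0, so only an empty line is printed.
echo "please google more common hades three" | take_counted
```

The count also has to be smaller than the number of words spoken; this sketch does not guard against a larger one.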

Of course, this is more easily accomplished by using "google" as a variable split point and using VAR2 as the search term, but, who knows, it might come in handy for a flexible task where you need to input a different number of items on the fly without changing a handful of actions in the task.
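For comparison, the split-at-the-keyword idea can be sketched in plain shell as well. This is not how Tasker does it internally, just the same idea expressed with parameter expansion; after_keyword is a hypothetical helper name.

```shell
#!/bin/sh
# after_keyword KEYWORD PHRASE: print whatever follows the first
# "KEYWORD " in PHRASE, or return 1 if the keyword is not there.
# Hypothetical helper mirroring the split-on-keyword idea.
after_keyword() {
    keyword=$1
    phrase=$2
    case "$phrase" in
        *"$keyword "*) printf '%s\n' "${phrase#*"$keyword "}" ;;
        *) return 1 ;;
    esac
}

# Everything after "google" becomes the search term.
after_keyword google "please google more common hades"
```

With this approach there is no need to say a count at all, since the search term is simply everything after the keyword.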

Temp files

You may have noticed I used .voice and .voicetmp as temp files. You can bypass temp files by putting the words directly into the system clipboard by editing the Python script to this:
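import android
droid = android.Android()
# recognizeSpeech() returns a result tuple; index 1 holds the recognized text
Speech = droid.recognizeSpeech()[1]
# Copy the recognized words to the system clipboard instead of a temp file
droid.setClipboard(Speech)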
I toyed around with this and decided it was simpler to use files. Other tasks may set or use the clipboard, there may be something important in the clipboard that I don't want an unrelated task to erase, and it's easier to set up tasks without worrying about backing up the clipboard every time.

Now that the human side sounds better...

Oh yeah, I haven't really mentioned speech synthesis here. The only thing worth mentioning is that the best voices I've heard are made by Svox and are available in the Market. They have many voices and a free app that gives a sample of them all. I personally like the British female voice. When using Tasker, I set the tone to 6 and speed to 10 and o-la-la does she sound good.

Do I have any cool voice tasks?

Probably not. I actually have 15 tasks that are controlled by voice, but some are for controlling my home computer over ssh and are probably only interesting to me. Here are a few that may be useful. I won't post the whole task, just the idea. The rest should be easy to figure out.

Saying "pic" takes a photo.

Saying "text" runs the voice script twice more, once to get the SMS recipient, again to get the SMS body, then opens the SMS app and fills out the text.

Saying "map [address]" opens the map to the address I specify.

Saying "search [phrase]" opens a Google search for the phrase.

I'm trying to buy a house so saying "mls search [number]" opens the browser to[number] to look up homes by their MLS listing.

I have all those in a separate task which the voice task executes. After that, it executes another task which is just for loading apps. I have a dozen set up to open on command, like saying "terminal" opens the terminal, saying "mail" opens gmail, etc. I haven't really used them, but it's another example of using voice control. Other voice apps can open apps by name, but how can they possibly interpret names like "BTEP" or "QuickSSHD?" Using your own voice control, you can open them by nicknames, which is much more powerful.
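The nickname idea boils down to a lookup table. In my setup that lives in a Tasker task, but as a rough shell sketch it might look like the following. The function name app_for_nickname and the nickname "ssh" are made up for the example, and the function only prints the app name rather than launching anything.

```shell
#!/bin/sh
# app_for_nickname: map a spoken nickname to the app it should open.
# Hypothetical sketch: it only prints the app name; launching is left out.
app_for_nickname() {
    case "$1" in
        terminal) echo "terminal" ;;
        mail)     echo "gmail" ;;
        ssh)      echo "QuickSSHD" ;;  # made-up nickname for a hard-to-say name
        *)        return 1 ;;          # unknown nickname
    esac
}

app_for_nickname ssh
```

The point is that the spoken word never has to resemble the real app name.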