I've had my rounds with voice input on smartphones. For an entire week this summer, I took on a very tough challenge: I didn't touch a soft keyboard on my phones at all, relying only on voice input.
I didn't expect it to be fun by any means, but it turned out to be a week of pure torture. Getting people's names right was the worst, as was punctuation. I'm a stickler for proper grammar, even in text messages and on Twitter. (Exceptions can be made for character constraints, of course.)
However, I learned quite a bit over those very long seven days. I found some glaring limitations of Android voice input and learned that iOS dictation was significantly more well-rounded and practical to use long-term. The dictation feature in iOS reacts to virtually any punctuation command, whereas the Android voice input system would type out "colon" or "new line".
I also spent a lot of time using Siri – both during and after the challenge. At the time, I was using the HTC One X and only had access to Google Now via the inactive Galaxy Nexus on my desk. But through my testing, I found that I only used Siri and Google Now to search for things. I never used these services for one of their most outstanding features: voice actions.
Using Google Now or Siri, you can say things like "Open Contacts" to open the Contacts app, or say someone's name to pull up their contact information. You can tell Siri or Google Now to create a reminder or a note to yourself. You can even compose emails using voice commands.
It's really simple and self-explanatory. You just have to mentally bridge the gap between phones of yesteryear and the newer ones, which are actually pretty smart.
In my week-long test, I don't recall ever opening an application with a voice action, outside of just testing the feature. I never long-pressed the home button for Siri or slid up from the home button for Google Now just to launch an app. I have found that, in most cases, physically navigating to an application's icon is quicker and more efficient than speaking a command. And typing out a reminder or email is generally quicker, so long as I have a free hand (sometimes two, since I now use the Galaxy Note II).
Over the last few months, though, I have found myself consulting Google Now more and more. Mainly, I'm just asking questions, searching for things, fact-checking and such. But last week, Google added several new features to Google Now, such as a barcode scanner that allows users to image search, à la Google Goggles, and a song recognition engine.
Simply open Google Now and say, "What is this song?" It takes just a few seconds for Google Now to record a clip of the song, compare it against a database and return the name, album, artist and, of course, a Google Play link.
And image search works just as quickly. Say, "Scan this barcode," and the typical Google Now voice input screen turns into a barcode scanner. Point the viewfinder at something other than a barcode and Google will search for similar images or things related to an object in the picture.
Since the update, I have switched to using Google Now to identify songs, though I'm not sure how much longer that will stick, considering SoundHound ∞ links me directly to the song in Spotify. The novelty of the quick image search has yet to wear off, either; Google Goggles came in handy back in the day, and having that capability built in alongside the many other features of Google Now might make it one of my frequent, go-to tools.
I can't see myself ever using voice actions to launch applications or make reminders until I can wake my phone just by speaking to it. At that point, voice actions will have crossed the very important gap between a cool novelty to show off to your friends and a truly useful tool.
Do you use voice actions, readers? Do you use Siri or Google Now to launch applications and perform mini tasks? Or, like me, do you prefer the manual approach? What will it take for you to use voice actions?