
Culture Change and Speech Input

Do you use speech recognition software as an input method on your desktop or mobile devices? The basic tradeoffs have been studied for ages. Speech keeps the user's hands free, which is particularly helpful for maintenance workers and health care providers whose hands are busy while they need to enter notes. It also helps people with physical disabilities who have limited motor control for typing.

On the other hand, there are costs as well. Most people can type faster than they speak, at least after a little practice. Speech recognition software also produces more errors than typical typing does. Accuracy is much better now than it was a few years ago, but it has not quite caught up. Autocorrect errors can be particularly hard to find when proofreading.

With the prevalence of mobile devices and apps, speech recognition has really taken off. This more extensive use provides some visibility into much more fundamental changes. Clive Thompson has a piece in the August issue of Wired magazine that touches on these, some of which really amazed me. I wonder just how much this shift could change us.

In a sense, voice-writing requires people to change their cognitive style. It’s relatively free and easy, more like speech than writing. But because it’s hard to edit and tinker, dictating to a phone is most like working on an old manual typewriter, where you have to map out each sentence in your head before clacking away. “I think through more completely what I’m trying to say,” Erik Olsen, a video journalist at The New York Times and another dictation adherent, told me.

My Take

Let me explain.

Thompson noticed several changes in his own behavior. In some dimensions, speech input becomes more formal. He used fewer contractions, for example. But in other ways, speech input becomes less formal. He used more first-person pronouns instead of titles.

There are other direct contradictions as well. He tended to ramble more, making the content longer. But he unconsciously adapted to the needs of the computer, using shorter and simpler words that were less likely to be incorrectly received.

There is also a difference when you are in public. If you are typing, it doesn’t matter whether you are at home, in the office, or sitting on the train. But with voice, each of these settings imposes different constraints. You are not going to announce sensitive topics (good or bad) to your fellow commuters on the train. At work, you are more likely to share good news verbally and less likely to share bad news.

Could this lead to an increase in team situation awareness? Perhaps we will become more aware of what our coworkers are doing when we can’t help but hear them dictating notes on the current project into their computers. Perhaps it will lead to more social situation awareness when we hear them dictating an email to their spouses. We get that already when eavesdropping on phone calls, but this could expand that significantly.

Your Turn

Of course, Clive’s article is anecdotal. Longitudinal field studies are needed to see if these behaviors generalize. So I am really interested to hear what you think about them.

  • Have you experienced any of these yourself?
  • Have any of them appeared in your own testing?

Image Credit: Dion Gillard
