Voice user interfaces like Siri and Alexa have improved in some ways, but in many others they still resemble the IVR phone systems of the 1970s. This book covers what to look out for when designing them. For example:
- Being as brief as possible, and using a visual mode when there is a lot of information to display.
- Providing feedback to the user that he was understood.
- Confirming an action explicitly or implicitly, and the tradeoffs between the two.
- Handling different types of errors, such as no speech, out-of-grammar responses, and ambiguity.
- Designing around ASR features like endpoint detection and barge-in, and correcting for similar-sounding words with N-best lists.
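To make a few of these points concrete, here is a minimal sketch of how a dialog manager might choose between implicit confirmation, explicit confirmation, and a reprompt, given an ASR N-best list. Everything in it is hypothetical and made up for illustration: the `Hypothesis` type, the `choose_action` function, the confidence scale, and the threshold values are my assumptions, not something taken from the book or from any real ASR API.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Hypothesis:
    text: str          # one recognized utterance from the N-best list
    confidence: float  # recognizer score, assumed to lie in [0, 1]


def choose_action(n_best: List[Hypothesis],
                  grammar: Set[str],
                  high: float = 0.85,    # assumed threshold, tune per app
                  low: float = 0.45,     # assumed threshold, tune per app
                  margin: float = 0.10) -> str:
    """Pick a response strategy for a single user turn."""
    # No hypotheses at all: treat it as a no-speech timeout.
    if not n_best:
        return "no_speech: reprompt"

    # Keep only hypotheses the application can act on, best first.
    candidates = sorted(
        (h for h in n_best if h.text in grammar),
        key=lambda h: h.confidence,
        reverse=True,
    )
    if not candidates:
        return "out_of_grammar: reprompt with examples"

    top = candidates[0]
    # Two in-grammar hypotheses with close scores (similar-sounding
    # words): ask the user to disambiguate instead of guessing.
    if len(candidates) > 1 and top.confidence - candidates[1].confidence < margin:
        return f"ambiguous: did you mean {top.text} or {candidates[1].text}?"

    if top.confidence >= high:
        # Confident: echo the value back and move on (implicit).
        return f"implicit_confirm: OK, {top.text}."
    if top.confidence >= low:
        # Unsure: ask a yes/no question before acting (explicit).
        return f"explicit_confirm: did you say {top.text}?"
    return "low_confidence: reprompt"
```

With a grammar of `{"austin", "boston"}` (my own example of a similar-sounding pair), two close hypotheses produce the disambiguation question, while a single high-confidence "boston" is confirmed implicitly. The tradeoff is the usual one: explicit confirmation costs the user an extra turn, so it is reserved for cases where the recognizer is unsure.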
Most of the advice seems obvious once pointed out, but it is subtle and easy to miss if you’re not experienced. For example, when asking the user to choose between two items, it’s bad to say “What would you like? We have A and B”, because the user might barge in right after the question, before hearing the options; it’s better to put the question at the end. The design mindset is quite different from engineering: most of the suggestions are technically trivial, the kind of thing engineers don’t even think about, but they can make a big difference to the customer experience.
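The prompt-ordering advice is small enough to show in a few lines. This is only a sketch, and `build_choice_prompt` is a hypothetical helper of my own, not something from the book: it lists the options first and ends on the question, so that a user who barges in at the question has already heard the choices.

```python
from typing import List


def build_choice_prompt(options: List[str]) -> str:
    """Build a spoken prompt that names the options, then asks."""
    if len(options) < 2:
        raise ValueError("need at least two options to offer a choice")
    # Join as "A and B" or "A, B and C".
    listed = ", ".join(options[:-1]) + " and " + options[-1]
    # The question goes last, where a barge-in is harmless.
    return f"We have {listed}. Which would you like?"


print(build_choice_prompt(["A", "B"]))
```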
I skimmed the book as it’s only slightly relevant to my work. Still, it was interesting to see how the product designer has to understand, at a high level, the characteristics and limitations of the underlying speech technology so that he can design around them (though he needn’t be concerned with the exact details of how ASR systems work).