In a demonstration of how algorithms are playing an increasing role in determining user experience, hundreds of millions of Google customers are about to see a change to their voice search and dictation experience. This results not from a fresh set of visual branding tweaks or a further development of its in-app ‘material design’ philosophy, but from an evolution of the neural network techniques employed to process voice input.
“Today, we’re happy to announce we built even better neural network acoustic models using Connectionist Temporal Classification (CTC) and sequence discriminative training techniques. These models are a special extension of recurrent neural networks (RNNs) that are more accurate, especially in noisy environments, and they are blazingly fast!” — Google Research Blog, 24th September 2015.
If you take a few moments to wade through the dense language of the blog post, the potential for this change becomes clear: the way in which Google attempts to understand users’ voices is evolving significantly. A user who previously found Google’s voice recognition too slow or inaccurate may now find it has crossed that critical line where latency and success rates are just good enough to make voice input a habitual behaviour rather than an occasional novelty.
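To give a flavour of what CTC actually does: the technique lets a network emit one label per audio frame, including a reserved “blank” symbol, without needing a precise frame-by-frame alignment between the audio and the characters. The sketch below is a minimal, illustrative greedy decoder for such per-frame output; the frame sequence, symbols, and function name are assumptions for illustration, not Google’s implementation.

```python
BLANK = "-"  # reserved CTC blank symbol (an assumption for this sketch)

def ctc_greedy_decode(frame_labels):
    """Collapse a per-frame label sequence into an output string.

    CTC's collapsing rule: merge consecutive repeated labels, then
    drop blanks. The blank is what allows genuine double letters,
    such as the "ll" in "hello", to survive the merge step.
    """
    out = []
    prev = None
    for label in frame_labels:
        if label != prev:       # merge runs of the same label
            if label != BLANK:  # discard the blank symbol
                out.append(label)
        prev = label
    return "".join(out)

# A hypothetical network emitting one symbol per audio frame:
frames = ["h", "h", "-", "e", "l", "l", "-", "l", "o", "o"]
print(ctc_greedy_decode(frames))  # -> "hello"
```

Real systems score many candidate alignments rather than taking the single best path per frame, but the collapsing rule shown here is the core idea that frees the model from needing exact timing information.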
Users gravitate towards the input mechanism with the lowest overall latency. It is the reason most users’ primary digital touchpoint is now a mobile device, which can be pulled from the pocket and engaged in less than a second. The same effect will be seen in choosing between keyboard and voice: in most situations (excepting, for instance, where privacy concerns override), speed will win.
This improvement in voice recognition holds the potential for a ripple effect: the ability to dictate documents into a smartphone rather than typing them out on a touchscreen keyboard may mean more business users can leave their laptop behind; the handsfree interactions enabled by accurate voice input may make users more likely to engage with Google in environments like the car or kitchen; looking deeper, it may also mean the messages you exchange with friends and family are input less frequently with your fingertips and more often spoken aloud.
Google’s use of these algorithms extends across its services. The 2015 relaunch of Photos, for instance, relied heavily on advances in the neural network techniques used to identify and organise image content. The overall user experience of Google Photos simply wouldn’t be possible without the algorithm behind it, which is responsible for some of the most meaningful interactions it offers, including the ability to automatically search for colours and objects by simply typing (or speaking!) them into the search box.
This means experience designers must start to think about the previously fixed and predictable framework which underpins the digital environment in a much more plastic way. Just as the arrival of different screen sizes necessitated a responsive approach to visual design, the growing role of algorithms in delivering deeply personalised and contextually sensitive experiences will require a more flexible approach to experience design.
Google is not alone in adopting this approach. Every major name in digital is experimenting with neural networks to govern everything from the products they suggest on shopping sites to allocating resources like taxis and delivery trucks. There are also venture capital firms specifically targeting opportunities in this area, such as London-based Playfair Capital, which held an event (you can watch all the talks on YouTube) in June highlighting advances in machine learning. As these techniques become more widespread, especially in situations such as pre-emptively addressing customer service issues, there is an ever-increasing chance of an algorithm becoming the single most important factor in determining customer experience at a given moment.