Differences in how a voice assistant is activated seem to influence the overall perception of user experience, control and acceptance of such an interactive system.

ABSTRACT 

While voice assistants are on the rise for a variety of applications, talking to the television still feels less natural to users than talking to their friends or neighbours. Subtle differences in how and when a voice assistant is activated seem to influence the overall perception of users in terms of user experience, control and acceptance of such an interactive system.

To investigate the influence of using speech to search for content with an ambient voice assistant, compared to a more traditional solution with a microphone in a remote control, an experimental study was performed. Fourteen participants took part in a within-subject experiment comparing ambient speech interaction with speech search using a remote control that has a dedicated button to activate the microphone. The two modalities were compared in terms of privacy, usability and user experience.

Results indicate a slightly higher impression of control, as well as fewer privacy concerns, for the button-based speech search modality. In terms of user experience, the hands-free ambient speech search does not perform significantly better than the traditional button-based speech search approach.

INTRODUCTION

The usage of speech interaction is on the rise: labelled by marketers as “voice assistants”, a variety of products has appeared on the market. In this paper we use speech to refer to a user talking to a system, while voice is used to refer to a person’s unique voice that allows identification. Since the introduction of “Siri” on Apple’s iOS on mobile devices back in 2011, voice assistants have gained popularity and are getting smarter every day. All major brands have presented voice assistants: Cortana (Microsoft), Google Assistant (Google), Alexa (Amazon).

With Alexa, Amazon started a new trend: the voice assistant is now ambient. It is not only embedded in the user’s smartphone or laptop, but has its own dedicated device, which stands in the home, always listening and at the user’s service. While Alexa was the only device designed to be a standalone vocal assistant back in 2014, Google has since released Google Home, and NVIDIA’s “Spot”, a connected ambient microphone, will allow the user to control her home and her NVIDIA Shield TV in a hands-free manner, thanks to the Google Assistant embedded in the device. Microsoft’s Cortana will also be embedded in a speaker device, designed by Harman Kardon.

All these ambient vocal assistants are changing the speech-search landscape. When it comes to TV, until now the user had to use the built-in microphone on the TV or set-top box remote control to perform her speech search.

Now, with the ambient microphone, users will be free to perform their speech search without having to talk to their remote control, and will have a complete hands-free experience. This leads us to these questions: are users ready to open their homes to these technologies?

What are the differences between this “ambient” modality and a voice-controlled remote control solution, in terms of usability, user experience, feeling of control, acceptance, and privacy concerns for the TV use case? Is there still room in the living room for the remote control?

The goal of this study was to understand whether ambient speech search enhances the user experience of watching TV, compared to a traditional speech search using the remote control’s built-in microphone. We also wanted to measure how determining factors, such as acceptance of the technology and users’ privacy concerns, influence the overall experience of the user.
