Improving the UX of sending voice messages: a concept for a new interaction model

Chat applications are among the most used apps on a daily basis: more than 65 billion messages are sent every day on WhatsApp alone.

Often used for quick interactions, messaging applications have seen the use of voice messages grow over time, mainly among younger users and users in China.

This concept proposes a new interaction model, complementary to the existing ones, that improves the user experience in the specific use case of “sending a single voice message”, making it faster and easier.

How does it work?

The proposed interaction is similar to using a walkie-talkie: an interaction model that does not require the user to look at or interact with the screen of the device.

Recording a voice message

Pressing and holding both volume buttons* at the same time launches the app and starts recording the voice message (if the phone is locked, it could be unlocked by facial recognition).
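
To make the “both buttons at once” trigger a bit more concrete, here is a minimal Python sketch. It assumes the OS delivers key down/up events together with a timestamp. The `ChordDetector` class, the key names and the 150 ms tolerance are hypothetical and only illustrate the idea; this is not how any real platform API looks.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical key names, used only for this illustration.
VOLUME_UP = "volume_up"
VOLUME_DOWN = "volume_down"


@dataclass
class ChordDetector:
    """Tracks the two volume keys and reports when recording should start or stop."""

    # Assumed tolerance: both keys must go down within this many seconds.
    max_gap_s: float = 0.15
    recording: bool = False
    _down_at: Dict[str, float] = field(default_factory=dict, repr=False)

    def on_key(self, key: str, pressed: bool, t: float) -> Optional[str]:
        """Feed one key event; returns "start", "stop" or None."""
        if key not in (VOLUME_UP, VOLUME_DOWN):
            return None

        if pressed:
            # Keep the first press time so auto-repeat events do not shift it.
            self._down_at.setdefault(key, t)
            if not self.recording and len(self._down_at) == 2:
                first, second = self._down_at.values()
                if abs(first - second) <= self.max_gap_s:
                    self.recording = True
                    return "start"
            return None

        # Releasing either key ends the recording.
        self._down_at.pop(key, None)
        if self.recording:
            self.recording = False
            return "stop"
        return None
```

The tolerance is there because two fingers never land at exactly the same instant. A press on only one button, or two presses too far apart, keeps behaving like a normal volume change.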


Selecting the recipient

When the buttons are released, the recording stops and the user can choose the recipient of the message by voice input or from the interface.

Feedback after selection by voice


If the recipient was selected via the interface, a short confirmation is displayed, the message is sent, and the app closes right after. If the recipient was selected by voice input, the recipient's name is displayed, giving the user a short time to cancel the action before the message is sent automatically and the app closes.

This interaction mode lets users send single voice messages very quickly: it reduces the number of steps required and does not force users to look at the screen or interact with it. IMHO, it is also faster and easier than using Siri or Google Assistant.
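
The flow described above (record, pick a recipient, then either send right away or wait for a short cancel window) can be sketched as a small state machine. This is only an illustration in Python: the `VoiceMessageFlow` class, the state names and the 3-second cancel window are assumptions of mine, not part of any existing app.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    RECORDING = auto()   # volume buttons are held down
    CHOOSING = auto()    # buttons released, waiting for a recipient
    PENDING = auto()     # recipient picked by voice, cancel window open
    SENT = auto()
    CANCELLED = auto()


class VoiceMessageFlow:
    # Assumed length of the "short time to cancel", in seconds.
    CANCEL_WINDOW_S = 3.0

    def __init__(self) -> None:
        self.state = State.RECORDING
        self.recipient: Optional[str] = None
        self._deadline: Optional[float] = None

    def release_buttons(self) -> None:
        """Releasing the volume buttons stops the recording."""
        if self.state is State.RECORDING:
            self.state = State.CHOOSING

    def choose_recipient(self, name: str, by_voice: bool, now: float) -> None:
        if self.state is not State.CHOOSING:
            return
        self.recipient = name
        if by_voice:
            # Voice recognition may be wrong, so leave time to cancel.
            self.state = State.PENDING
            self._deadline = now + self.CANCEL_WINDOW_S
        else:
            # A tap on the interface is explicit: send immediately.
            self.state = State.SENT

    def cancel(self, now: float) -> None:
        if self.state is State.PENDING and now < self._deadline:
            self.state = State.CANCELLED

    def tick(self, now: float) -> None:
        """Called periodically; sends once the cancel window has elapsed."""
        if self.state is State.PENDING and now >= self._deadline:
            self.state = State.SENT
```

The two branches differ on purpose: a recipient tapped on the screen is unambiguous, while a recipient recognized from speech could be wrong, so only that path gets the delay.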

Current interaction model, steps:

  1. Find the app
  2. Launch the app
  3. Select the recipient
  4. Press the Record button, speak and release the button
  5. Close/switch the app

Proposed interaction model, steps:

  1. Press the volume buttons, speak and release the buttons
  2. Say/select the recipient

In the use case where users are already interacting with their phone, this interaction model could be very handy because they don’t have to switch apps: they just press the volume buttons to record and send the message. The app closes automatically after sending the message, so the user does not have to return to the previous app manually.

I don’t like voice messages that much, but almost the same concept could be used to send text messages by adding a speech-to-text feature that converts the voice message before sending it.

Leave a comment if you have any feedback. I would love to hear it!

*At the moment, Android allows apps to customize/remap the behavior of the physical keys (via an accessibility service); as far as I know, this is not possible on iOS devices.
