Digital Ventriloquism: Giving Voice to Everyday Objects

Smart speakers with voice agents have seen rapid adoption in recent years, with 41% of U.S. consumers owning one by the end of 2018. These devices use traditional speaker coils, which means the agent’s voice always emanates from the device itself, even when that information might be more contextually and spatially relevant elsewhere. One could instrument every object or place multiple speakers in each room, but both options come with installation, maintenance, and aesthetic downsides.

In this research, we describe our work on Digital Ventriloquism, which allows a single smart speaker to render sounds onto passive objects in the environment. Not only can these items speak, but they can also make other sounds, such as notification chimes. Importantly, objects need not be modified in any way: the only requirement is line of sight to our speaker. As smart speaker microphones are omnidirectional, it is possible to have interactive conversations with totally passive objects, such as doors and plants.

To achieve this effect, we use a dense, 2D array of ultrasonic transducers. This produces a highly directional emission due to the Huygens-Fresnel principle, which is critical for rendering sounds onto specific objects in the environment. We amplitude-modulate a 40 kHz ultrasonic signal, which is inaudible “in flight” prior to collision with an object’s surface. Upon collision, it demodulates to audible frequencies through parametric interaction. Thus, the object becomes the origin of the audible sound. Humans can then localize this digital ventriloquism as they would any other sound (i.e., through binaural localization and their head-related transfer function).
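The amplitude-modulation step can be illustrated with a minimal sketch. This is not the authors' implementation: only the 40 kHz carrier comes from the text, while the sample rate, modulation depth, test tone, and the function names `tone` and `amplitude_modulate` are assumptions chosen for illustration. The sketch covers only the modulation itself and does not model the transducer array or the demodulation that occurs at the object's surface.

```python
import math

CARRIER_HZ = 40_000    # ultrasonic carrier frequency stated in the text
SAMPLE_RATE = 192_000  # assumed sample rate, well above twice the carrier
MOD_DEPTH = 0.8        # assumed modulation depth, between 0 and 1


def tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Generate a sine test tone with samples in [-1, 1] (stand-in for speech)."""
    count = round(duration_s * sample_rate)
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(count)]


def amplitude_modulate(audio, sample_rate=SAMPLE_RATE,
                       carrier_hz=CARRIER_HZ, depth=MOD_DEPTH):
    """Scale the carrier by an envelope that follows the audio samples."""
    modulated = []
    for n, sample in enumerate(audio):
        envelope = 1.0 + depth * sample  # stays positive while depth < 1
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        modulated.append(envelope * carrier)
    return modulated


audio = tone(1_000, 0.01)           # 10 ms of a 1 kHz tone
signal = amplitude_modulate(audio)  # samples to send to the emitter
```

With a 1 kHz tone as input, the modulated signal contains energy only at the 40 kHz carrier and at sidebands of 39 kHz and 41 kHz, all above the range of human hearing, which is consistent with the emission being inaudible in flight.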

Reference

Iravantchi, Y., Goel, M. and Harrison, C. 2020. Digital Ventriloquism: Giving Voice to Everyday Objects. In Proceedings of the 38th Annual SIGCHI Conference on Human Factors in Computing Systems. CHI '20. ACM, New York, NY.

© Chris Harrison