Binaural Telephony

Spatial Audio-Visual Conferencing using Binaural HD Voice

Vision: Participate in meetings from faraway places, connected via a novel binaural mobile communication device.

Benefit: High intelligibility and thrilling communication experience due to the acoustic preservation of the speakers' positions in the room as well as natural rendering of background sounds.

Example use case

Participants of a meeting are located at different places. With binaural telephony, the remote participant has the same listening impression as if seated in the meeting room together with the other participants. Because binaural acoustic signals reach the remote participant, sounds appear to arrive from different virtual directions. It is like being there, even though the remote participant may be located far away.

Demonstration

Telephony today is of low quality and monaural. In the future, people will use high definition binaural telephony. Get an impression of what communication will be like in the future by watching the demonstration video below. In this video, the impact of binaural telephony is demonstrated by switching between today's telephony and future communication (binaural HD telephony) during playback.

Note: To experience the full benefit of the binaural audio signals, be sure to use a stereo headset when watching the video. The soundtrack starts after the first 30 seconds of the video.

Technical Background

The vision of binaural telephony can be achieved by using novel binaural audio recording and rendering devices in combination with a novel binaural communication system. Binaural recordings are not just stereo: because the acoustic sensors are placed in close proximity to the ears, the two-channel audio signal also captures the typical components involved in human audio perception, such as acoustic reflections caused by the head and acoustic shadowing caused by the human body. These components are transmitted to the communication partner along with the speech itself.
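
To make the difference between plain stereo and binaural signals more concrete, the following minimal sketch places a mono signal at a virtual direction using two simplified cues: an interaural time difference (from Woodworth's spherical-head approximation) and a single broadband level difference standing in for head shadowing. This is an illustration only, not the recording or rendering method of the system described here. The head radius, the speed of sound, the 16 kHz sample rate, the 6 dB maximum level difference, and the function names are all assumptions chosen for the example; real head shadowing is frequency dependent and is normally captured by measurement at the ears.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed, air at about 20 degrees C)
HEAD_RADIUS = 0.0875    # m (assumed average head radius)


def itd_seconds(azimuth_deg):
    """Interaural time difference from Woodworth's spherical-head formula.

    Positive azimuth means the source is to the listener's right.
    Valid for azimuths between -90 and +90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS / SPEED_OF_SOUND * (theta + math.sin(theta))


def render_binaural(mono, azimuth_deg, sample_rate=16000, max_ild_db=6.0):
    """Return (left, right) sample lists for a mono signal at a virtual azimuth.

    The ear facing away from the source receives the signal later (ITD)
    and quieter (a crude broadband level difference, hypothetical value).
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be within [-90, 90] degrees")

    # Delay of the far ear, rounded to whole samples.
    delay = round(abs(itd_seconds(azimuth_deg)) * sample_rate)

    # Attenuation of the far ear, growing with the sine of the azimuth.
    ild_db = max_ild_db * abs(math.sin(math.radians(azimuth_deg)))
    far_gain = 10.0 ** (-ild_db / 20.0)

    near = list(mono) + [0.0] * delay
    far = [0.0] * delay + [far_gain * s for s in mono]

    # Source on the right: the right ear is the near ear.
    return (far, near) if azimuth_deg > 0 else (near, far)


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    left, right = render_binaural(impulse, azimuth_deg=90.0)
    print("ITD at 90 degrees (ms):", itd_seconds(90.0) * 1000.0)
    print("left :", left)
    print("right:", right)
```

Even this crude model shows that the two ear signals differ in timing and level depending on the source direction, which ordinary amplitude-panned stereo does not reproduce. A binaural recording made close to the ears contains these cues, and finer ones, without any such modelling.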

References

[ruengeler2013c]
Matthias Rüngeler, Hauke Krüger, Gottfried Behler, and Peter Vary
HD-Voice-3D: Herausforderungen und Lösungen bei der Audiosignalverarbeitung
Workshop Audiosignal- und Sprachverarbeitung (WASP), September 2013

[schaefer12]
Magnus Schäfer, Mohammad Bahram, and Peter Vary
Improved Binaural Model for Localization of Multiple Sources
ITG-Fachtagung Sprachkommunikation, September 2012

[loellmann12b]
Heinrich W. Löllmann and Peter Vary
Efficient Speech Dereverberation for Binaural Hearing Aids
ITG-Fachtagung Sprachkommunikation, September 2012

[loellmann12c]
Heinrich W. Löllmann and Peter Vary
Beamformer for Driving Binaural Speech Enhancement
Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), September 2012

[ruengeler12]
Matthias Rüngeler, Hauke Krüger, Thomas Schlien, and Peter Vary
Spatial Audio Conferencing using Binaural HD Voice
International Workshop on Acoustic Signal Enhancement (IWAENC), September 2012

[geiser11]
Bernd Geiser, Magnus Schäfer, and Peter Vary
Binaural Wideband Telephony Using Steganography
Konferenz Elektronische Sprachsignalverarbeitung (ESSV), September 2011

[jeub11a]
Marco Jeub, Hauke Krüger, Heinrich W. Löllmann, Robert Bücs, Christopher Bulla, Thomas Schlien, and Peter Vary
Real-Time Dereverberation for Hearing Aids with Binaural Link
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2011