3D Audio in FascinatE
Creating a format-agnostic interactive broadcast experience poses some interesting challenges to the partners from Technicolor and The University of Salford, who are responsible for the audio aspects of FascinatE. Of chief importance is the need to record a given audio scene in such a way that the content can be rendered on any reproduction system at the user end and can update depending on the dynamic viewing point. This demands a paradigm shift from how audio has traditionally been recorded for broadcast. Instead of broadcasting to match a specific hardware setup such as stereo, 5.1 or 7.1, we adopt an object-oriented approach which can be reproduced on any system. The audio scene is considered to be made up of a set of audio objects (point sources with a specific location) and an ambient sound field contribution.

The challenge at the recording side is therefore to record the sound field as well as the content and location of the audio objects at the scene. This often involves recording techniques completely different from what is considered standard practice in the broadcast industry. Ideally each sound source would be individually close-miked and tracked in space; however, in many cases (such as the first FascinatE test shoot at a football match) this is not possible, and the content and position of the audio objects need to be derived by processing the signals from the available microphones near the sources.
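To illustrate the object-oriented idea, the sketch below renders a set of audio objects (each a signal plus a scene azimuth) to a stereo output for a given view direction. The function name, the azimuth convention (0° straight ahead, positive to the right) and the constant-power pan law are illustrative assumptions, not the project's actual renderer; a real system would also model distance, delay and the ambient sound field contribution.

```python
import numpy as np

def render_objects_stereo(objects, view_azimuth_deg):
    """Mix point-source audio objects to stereo for one view direction.

    objects: list of (signal, azimuth_deg) pairs; azimuth is the source
    direction in the scene (assumed: 0 = ahead, positive = right).
    Hypothetical sketch -- constant-power panning only, no distance model.
    """
    n = max(len(sig) for sig, _ in objects)
    out = np.zeros((2, n))                       # rows: left, right
    for sig, az in objects:
        rel = az - view_azimuth_deg              # direction relative to view
        p = np.clip(rel, -90.0, 90.0) / 90.0     # pan position in [-1, 1]
        phi = (p + 1.0) * np.pi / 4.0            # constant-power pan law
        gain_l, gain_r = np.cos(phi), np.sin(phi)
        out[0, :len(sig)] += gain_l * np.asarray(sig)
        out[1, :len(sig)] += gain_r * np.asarray(sig)
    return out
```

Because the objects carry their own positions, the same scene description can feed a stereo, 5.1 or wave field synthesis renderer simply by swapping the panning stage.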
The ambient sound field can also be recorded in such a way that it can be updated to match a given viewing position, for example by using sound field microphones such as the Eigenmike® (Figure 1) or the SoundField® microphone, which record the three-dimensional sound field at a given point. With the audio objects and sound field accurately recorded, it is possible to encode these sources in various sound field representations, such as ambisonics B-format or wave field synthesis (WFS), which can in turn be decoded into any output format. As the user pans around the visual scene, it is possible to both rotate and translate this sound field to match the new viewing position based on camera pan and zoom.

On the rendering side, it is important that the audio updates accurately with the changing view and that it matches the user's preferences. FascinatE bridges the gap between passive viewer and active participant scenarios. Current television broadcasts could be considered passive viewing, where the audio remains stationary regardless of the camera position; conversely, active participant viewing is more akin to a video game scenario, where the audio updates completely with the viewing position. Of interest for FascinatE is which of these viewing paradigms the user subscribes to when navigating around the scene. Future work will therefore be centred not only on recording the audio scene such that the content is format agnostic, but also on determining how best to render the audio to match user preferences.
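Rotating an ambisonic sound field to follow a camera pan can be sketched in a few lines for first-order B-format (W, X, Y, Z channels). A yaw rotation leaves the pressure channel W and the vertical component Z untouched and applies a standard 2-D rotation to the horizontal components X and Y. The function name and sign conventions below (azimuth counterclockwise from front, field counter-rotated against the pan) are assumptions for illustration, not the project's actual processing chain.

```python
import numpy as np

def rotate_bformat_yaw(w, x, y, z, pan_deg):
    """Rotate a first-order B-format scene about the vertical axis.

    pan_deg: camera pan angle; the sound field is counter-rotated so a
    source the camera turns toward ends up in front of the listener.
    Sign and axis conventions are illustrative assumptions.
    """
    theta = np.deg2rad(-pan_deg)       # counter-rotate the field
    c, s = np.cos(theta), np.sin(theta)
    x_rot = c * x - s * y              # 2-D rotation of the horizontal
    y_rot = s * x + c * y              # velocity components
    return w, x_rot, y_rot, z          # W (pressure) and Z are unchanged
```

Translation to a new listening position is harder than rotation, since a first-order recording is only exact at the microphone point; this is one reason object positions are recorded separately from the ambient field.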