Audiovisual integration of impaired streams

To understand the combined human experience of audio and video, this research focuses on audiovisual perception and the factors that weaken the integration between the two modalities. With the popularity of online services that provide entertainment on the go, we consider how the impairments associated with such services may influence perceptual integration. These impairments include latency, loss, active adaptation, asynchrony, and displacement, but we also consider factors such as reverberation, speech intelligibility, and content.

We pursue new knowledge about humans’ perception of audiovisual quality in the context of video streaming and immersive audiovisual systems. Despite the long tradition of objective assessment of auditory, as well as visual quality, in telecommunication systems, this body of research has covered only a limited set of situations and conditions. Moreover, objective metrics are currently unable to represent the full range of perceptual mechanisms that are at work to process audiovisual content.

In recent years, we have defined common vocabularies of quality attributes and we have worked to understand their relations. We have also looked into each attribute’s importance for scalable video, audiovisual content, and live mixed-reality performances based on multi-modal content. Furthermore, as a step towards more consistent subjective quality evaluation, we have developed robust and cost-efficient assessment methods for audiovisual quality perception.

With the increased relevance of 3D technology in both entertainment and communication platforms, we have moved into a new perceptual reality where several new challenges confront the human senses. Due to the limitations dictated by the available network and hardware resources of a multimedia system, the quality of the delivered content is likewise limited. Thus, with the exception of large-scale productions (like Hollywood movies), an increasing number of immersive applications depend on imperfect devices for capturing and rendering. Moreover, they often rely on immediate communication of content between end-users. This is the context for the work that we have set out to perform.

So far, our studies on human audiovisual perception have yielded important insights into perceptual diversity and how human attributes can influence audiovisual processing and temporal sensitivity. In 3D-Sense, we aim to extend this work and explore how the perceptual system combines audiovisual sensory information when both audio and video contribute spatial cues in three dimensions. 3D-Sense unites two research areas: cognitive psychology, which focuses on human cognitive and perceptual processes, and multimedia systems, which develops algorithms and protocols to provide the best user experience.