Category Archives: News

RATS at the Festival of Lights 2019

RATS is short for Real-time Adaptive Three-sixty Streaming, software developed mainly by Trevor Ballardt during his stay at TU Darmstadt, where he worked for the MAKI SFB. RATS was used in a test case of the H2020 project 5Genesis, which is preparing to showcase 5G during the Festival of Lights 2020 in Berlin. The test was conducted at Humboldt University during the Festival of Lights 2019.

Fraunhofer Fokus, one of the local partners in Berlin, wrote a piece summarizing the test from the 5Genesis perspective. We contributed the video streaming and describe it in this article.

FoL 2019 at the Humboldt University (© Magnus Klausen 2019)

The original idea of RATS was to use NVENC to convert an input stream from a 360 camera into a set of tiles in real time, encoded at several qualities on the server before being stitched into a set of tiled H.265 videos. These H.265 videos would offer a selection of quality combinations suited to the viewer's orientation. This idea was published in a demo paper at ACM MMSys 2019, and the intended application in 5Genesis as a short paper at ACM Mobicom's S3 workshop. The code for RATS can be found on GitHub.

However, 5Genesis is also about dense user populations accessing a live video feed, and such density can only be achieved if users can stream to their mobile phones without installing any additional software. The RATS idea would work perfectly here if mobile browsers supported H.265. Unfortunately, those on Android phones do not.

So, instead of tiling in the sensible manner, we modified our clone of ffmpeg (a fork with minor modifications that RATS requires). ffmpeg can be configured to use NVENC for encoding video streams in H.264, and it can also generate fragmented MPEG-4 with suitable HLS and DASH manifest files. In the case of DASH, MPD (manifest) files can take the form of templates, which removes the need for clients to download manifest updates even for live streams, whereas HLS clients must refresh their playlists. Instead of merging tiles after compressing them separately, we used Gaussian filtering on tile-shaped regions of the video to reduce the coding complexity. An arbitrary number of these versions can be generated in parallel, using our new ffmpeg CUDA module for partial blurring.
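The partial-blur idea can be sketched in a few lines of NumPy. This is only an illustration of the principle; the real implementation is a CUDA module inside ffmpeg, and the tile grid, sigma and function names here are our own choices for the sketch:

```python
import numpy as np

def gaussian_kernel(sigma, radius):
    """1-D Gaussian kernel, normalized to sum to 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x * x) / (2.0 * sigma * sigma))
    return k / k.sum()

def blur_tile(frame, tile, sigma=2.0):
    """Blur one tile-shaped region (y0, y1, x0, x1) of a grayscale frame
    with a separable Gaussian; pixels outside the tile are untouched."""
    y0, y1, x0, x1 = tile
    region = frame[y0:y1, x0:x1].astype(np.float64)
    k = gaussian_kernel(sigma, radius=int(3 * sigma))
    # Separable convolution: filter rows, then columns.
    region = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, region)
    region = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, region)
    out = frame.copy()
    out[y0:y1, x0:x1] = region.astype(frame.dtype)
    return out

def degrade(frame, grid=(4, 8), keep=frozenset()):
    """One 'quality version': blur every tile except those in `keep`,
    so only the region the viewer looks at stays expensive to encode."""
    h, w = frame.shape
    th, tw = h // grid[0], w // grid[1]
    for i in range(grid[0]):
        for j in range(grid[1]):
            if (i, j) not in keep:
                frame = blur_tile(frame, (i * th, (i + 1) * th, j * tw, (j + 1) * tw))
    return frame
```

Each call to `degrade` with a different `keep` set yields one parallel stream variant; the blurred tiles carry far less high-frequency detail, which is what lowers the encoder's bit cost.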

The camera that we installed at the FoL 2019 was a surveillance camera with a fisheye lens (actually a panomorph lens, but close enough to fisheye to make our life easy), and we settled on VideoJS for panorama display in the cross-platform web pages that show the FoL videos in arbitrary browsers. It was a bit irritating that the current version of VideoJS had lost fisheye projection support while gaining both DASH and HLS support.

Consequently, we had to convert our panorama from fisheye to equirectangular projection. We followed two approaches. In one, we added a reprojection module to ffmpeg that uses CUDA to make the conversion before streaming, combined with a VideoJS configuration that projects onto only one half of a sphere, since a fisheye camera records only a single hemisphere. In the other, we extended VideoJS to support fisheye lenses directly. While the first piece of code may be more generally useful, we found that the second approach (which will be published in a master thesis next year) performs only a single conversion and provides better visual quality.
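Assuming an ideal equidistant fisheye model (the panomorph lens deviates from this, and our actual module runs in CUDA inside ffmpeg), the reprojection amounts to computing, for every equirectangular output pixel, its viewing ray and the fisheye image point that ray maps to. A NumPy sketch of that inverse mapping, with nearest-neighbour sampling:

```python
import numpy as np

def fisheye_to_equirect(src, out_w=512, out_h=256, fov=np.pi):
    """Remap an ideal equidistant fisheye image covering one hemisphere
    to an equirectangular panorama. `src` is a square image with the
    optical axis through its centre."""
    h, w = src.shape[:2]
    cx, cy, r_max = w / 2.0, h / 2.0, min(w, h) / 2.0

    # Output grid: longitude spans one hemisphere, latitude the full range.
    lon = (np.arange(out_w) / out_w - 0.5) * fov
    lat = (0.5 - np.arange(out_h) / out_h) * np.pi
    lon, lat = np.meshgrid(lon, lat)

    # Viewing ray for each output pixel (z is the optical axis).
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    # Equidistant fisheye: image radius is proportional to the off-axis angle.
    theta = np.arccos(np.clip(z, -1.0, 1.0))   # angle from the optical axis
    phi = np.arctan2(y, x)                     # angle around the axis
    r = r_max * theta / (np.pi / 2.0)

    u = np.clip(cx + r * np.cos(phi), 0, w - 1).astype(int)
    v = np.clip(cy - r * np.sin(phi), 0, h - 1).astype(int)
    return src[v, u]
```

Doing this once on the server (approach one) or once in the client shader (approach two) avoids chaining two resampling steps, which is why the direct fisheye support in VideoJS gave the better visual quality.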


About 2.8 million new luminal GI cancers (esophagus, stomach, colorectal) are detected worldwide every year, and mortality is about 65%. In addition to these cancers, numerous other chronic diseases affect the human GI tract. The most common include gastroesophageal reflux disease, peptic ulcer disease, inflammatory bowel disease, celiac disease and chronic infections. All have a significant impact on the patients' health-related quality of life. Consequently, gastroenterology is one of the most significant medical branches.

Colorectal cancer (CRC), for example, has among the highest incidence and mortality of GI-tract diseases, and early detection is essential for the prognosis. Minimally invasive endoscopic and surgical treatment is most often curative in the early stages (I-II), with a 5-year survival probability of about 90%, but the advanced stages (III-IV) often require radiation and/or chemotherapy and have a 5-year survival of only 10-30%. Colonoscopy is considered the gold standard for examining the colon for early detection of cancer and precancerous pathology. However, it is not the ideal screening test: on average, 20% of polyps are missed or incompletely removed, i.e., the risk of getting CRC largely depends on the endoscopist's ability to detect polyps. Colonoscopy is also a demanding procedure requiring a significant time investment from the medical professional, and it is unpleasant and can cause great discomfort for the patient. This may lead to reduced participation rates and less efficient screening; as a result, ongoing colonoscopy screening programs have a low attendance rate.

Our overall idea is to develop a system for automatic analysis of (video) data from the entire GI tract. As a first approach, and to show how complex the target is, we developed a multimedia system that supports doctors in disease detection in the GI tract. The main requirements for such a system are that it is (i) easy to use, (ii) easy to extend to different diseases, (iii) capable of real-time handling of multimedia content, (iv) usable as a live system and (v) able to deliver high classification performance with minimal false negative classification results. The system therefore consists of three main parts: the annotation sub-system, the detection and automatic analysis sub-system, and the visualization and computer-aided diagnosis sub-system.

Using the ASU Mayo polyp dataset, our prototype EIR, based on content-based visual information retrieval, achieved a detection accuracy above 90% at a speed of about 300 frames per second, i.e., it enables real-time feedback.


Selected publications:

“Multimedia and Medicine: Teammates for Better Disease Detection and Survival”, Michael Riegler, Mathias Lux, Carsten Griwodz, Concetto Spampinato, Thomas de Lange, Sigrun L. Eskeland, Konstantin Pogorelov, Wallapak Tavanapong, Peter T. Schmidt, Cathal Gurrin, Dag Johansen, Håvard Johansen, Pål Halvorsen, Proceedings of ACM Multimedia (ACM MM), Amsterdam, The Netherlands, October 2016, pp. 968-977 [pdf] [DOI: 10.1145/2964284.2976760] [slides]

“GPU-accelerated Real-time Gastrointestinal Diseases Detection”, Konstantin Pogorelov, Michael Riegler, Pål Halvorsen, Peter Thelin Schmidt, Carsten Griwodz, Dag Johansen, Sigrun Losada Eskeland, Thomas de Lange, Proceedings of the International Symposium on Computer-Based Medical Systems (CBMS), Dublin, Ireland/Belfast, Northern Ireland, June 2016 [pdf]

“EIR – Efficient Computer Aided Diagnosis Framework for Gastrointestinal Endoscopies”, Michael Riegler, Konstantin Pogorelov, Pål Halvorsen, Thomas de Lange, Carsten Griwodz, Peter Thelin Schmidt, Sigrun Losada Eskeland, Dag Johansen, Proceedings of the International Workshop on Content-based Multimedia Indexing (CBMI), Bucharest, Romania, June 2016 [pdf]

“Explorative Hyperbolic-Tree-Based Clustering Tool for Unsupervised Knowledge Discovery”, Michael Riegler, Konstantin Pogorelov, Mathias Lux, Pål Halvorsen, Carsten Griwodz, Sigrun Losada Eskeland, Thomas de Lange, Proceedings of the International Workshop on Content-based Multimedia Indexing (CBMI), Bucharest, Romania, June 2016 [pdf]

“Computer Aided Disease Detection System for Gastrointestinal Examinations”, Michael Riegler, Konstantin Pogorelov, Jonas Markussen, Mathias Lux, Håkon Kvale Stensland, Thomas de Lange, Carsten Griwodz, Pål Halvorsen, Dag Johansen, Peter Thelin Schmidt, Sigrun L. Eskeland, Proceedings of the ACM Multimedia Systems Conference (MMSys), Klagenfurt am Wörthersee, Austria, May 2016 [DOI:10.1145/2910017.2910629]

“Efficient Processing of Videos in a Multi Auditory Environment Using Device Lending of GPUs”, Konstantin Pogorelov, Michael Riegler, Jonas Markussen, Håkon Kvale Stensland, Pål Halvorsen, Carsten Griwodz, Sigrun Losada Eskeland, Thomas de Lange, Proceedings of the ACM Multimedia Systems Conference (MMSys), Klagenfurt am Wörthersee, Austria, May 2016 [DOI: 10.1145/2910017.2910636]

“Expert Driven Semi-Supervised Elucidation Tool for Medical Endoscopic Videos” (demo), Zeno Albisser, Michael Riegler, Pål Halvorsen, Jiang Zhou, Carsten Griwodz, Ilangko Balasingham, Cathal Gurrin, Proceedings of the ACM Multimedia Systems Conference (MMSys), Portland, OR, USA, March 2015, pp. 73-76, [pdf] [DOI: 10.1145/2713168.2713184]

“Event Understanding in Endoscopic Surgery Videos”, Mario Guggenberger, Michael Riegler, Mathias Lux, Pål Halvorsen, Proceedings of the ACM International Workshop on Human Centered Event Understanding from Multimedia (HuEvent), Orlando, FL, USA, November 2014, pp. 17-22 [DOI: 10.1145/2660505.2660509]

LADIO project will follow up on POPART

The Horizon 2020 project proposal LADIO: Live Action Data Input and Output has been accepted.

Like POPART before it, LADIO is an 18-month innovation action. This time, we focus on maximizing the collection of metadata on film sets in order to simplify collaboration between post-production facilities. The project will have a strong computer-vision aspect, especially since the Technical University of Prague is joining the POPART team, while the Media Performance Group will concentrate on aspects of storage and transmission. Stay tuned for more news from LADIO.

ACM Multimedia papers accepted

We can report two more accepted papers, in the two most competitive tracks of the ACM Multimedia Conference 2016.

Multimedia and Medicine: Teammates for Better Disease Detection and Survival
Michael Riegler, Mathias Lux, Carsten Griwodz, Concetto Spampinato, Thomas de Lange, Sigrun L. Eskeland, Konstantin Pogorelov, Wallapak Tavanapong, Peter T. Schmidt, Cathal Gurrin, Dag Johansen, Håvard Johansen, Pål Halvorsen

OpenVQ – A Video Quality Assessment Toolkit
Kristian Skarseth, Henrik Bjørlo, Pål Halvorsen, Michael Riegler, Carsten Griwodz

Special session accepted for MMM conference 2017

The MPG group and some of our collaborators proposed a special session for the Multimedia Modeling Conference 2017 in Iceland. We can now announce that our proposal has been accepted. The special session will be an evolved version of CrowdMM — Crowdsourcing for Multimedia.


The session will focus on advancing the state of the art in best practices for the use of crowdsourcing in multimedia research. A wealth of topics will be addressed, cross-cutting all of the main conference areas: for example, crowdsourcing-based identification and evaluation of multimedia QoE (Area: Multimedia HCI and QoE), visual and audio indexing via human computation or hybrid techniques (Area: Multimedia Search and Recommendation), and the use of crowdsourcing for analysing affect portrayed in or elicited by multimedia content. Contributions from the emerging area of Emotional and Social Signals in Multimedia, such as user intent and affect, are also welcome, as long as they present a strong methodological focus on crowdsourcing.

We will put emphasis on tackling the methodological challenges listed in the “topics” section below. Crowdsourcing cannot be considered a mature technology for multimedia research if the results it produces are not repeatable. To take crowdsourcing to the next level, it is necessary to determine best practices for the design of tests and incentive schemes, as well as robust data analysis and quality-control techniques. In the longer term (beyond 2017), our goal is to generate guidelines and recommendations for the use of crowdsourcing in multimedia, possibly also involving standardization bodies. To do so, it is necessary to focus on crowdsourcing not only as a means for multimedia research, but also as an end.

The main proposers are:

Guillaume Gravier, IRISA, France,

Guillaume Gravier is a senior research scientist at the Centre National de la Recherche Scientifique (CNRS). Since 2002, he has been working at the IRISA lab, where he currently leads the multimedia group. With a background in statistical speech modeling, his research activities focus on multimedia analytics: multimodal content modeling, multimedia pattern mining, natural language processing and video hyperlinking, among others. Guillaume Gravier is president of the French-speaking Speech Communication Association and co-founded the ISCA SIG on Speech and Language in Multimedia (SLIM), which he has been chairing since 2013. He is a member of the board of the national ICT cluster Images et Réseaux and the technical representative of Inria in the PPP BDVA. He has also been involved in the organization of major conferences and of national and international evaluation benchmarks.

Mathias Lux, Klagenfurt University, Austria,

Mathias Lux is Associate Professor at the Institute for Information Technology (ITEC) at Klagenfurt University. He is working on user intentions in multimedia retrieval and production and emergent semantics in social multimedia computing. In his scientific career he has (co-) authored more than 80 scientific publications, has served in multiple program committees and as reviewer of international conferences, journals and magazines, and has organized multiple scientific events. Mathias Lux is also well known for the development of the award winning and popular open source tools Caliph & Emir and LIRE for multimedia information retrieval.

Michael Riegler, Simula, Norway,

Michael Riegler is a PhD student at Simula Research Laboratory. He received his master's degree with distinction from Klagenfurt University; his master's thesis, on large-scale content-based image retrieval, was written at Delft University of Technology under the supervision of Martha Larson. He is part of the EONS project in the Media Performance Group. His research interests are endoscopic video analysis and understanding, image processing, image retrieval, parallel processing, gamification and serious games, crowdsourcing, social computing and user intentions. He is also involved in several initiatives, such as the MediaEval Benchmarking initiative for Multimedia Evaluation, and has (co-)authored more than 30 scientific publications.

Steering Committee:

Martha Larson, Delft University of Technology, Netherlands, and Radboud University Nijmegen, Netherlands

Judith Redi, Delft University of Technology, Netherlands

Papers accepted at MMSys and associated workshops

With the final notification deadlines past, we can report that MPG is going to return to MMSys with a nice number of papers, datasets and demos.


  • A High-Precision, Hybrid GPU, CPU and RAM Power Model for Generic Multimedia Workloads
    Kristoffer Robin Stokke, Håkon Kvale Stensland, Carsten Griwodz, Pål Halvorsen


  • Device Lending in PCI Express Networks
    Lars Bjørlykke Kristiansen, Jonas Markussen, Håkon Kvale Stensland, Michael Riegler, Hugo Kohmann, Friedrich Seifert, Roy Nordstrøm, Carsten Griwodz, Pål Halvorsen

MMSys Special Session on AR

  • Robustness of 3D Point Positions to Camera Baselines in Markerless AR Systems
    Deepak Dwarakanath, Carsten Griwodz, Pål Halvorsen

MMSys Dataset papers

  • Right inflight? A dataset for exploring the automatic prediction of movies suitable for a watching situation
    Michael Riegler, Martha Larson, Concetto Spampinato, Pål Halvorsen, Mathias Lux, Jonas Markussen, Konstantin Pogorelov, Carsten Griwodz, Håkon Kvale Stensland
  • Heimdallr: A dataset for sport analysis
    Michael Riegler, Duc-Tien Dang-Nguyen, Bård Winther, Carsten Griwodz, Konstantin Pogorelov, Pål Halvorsen

MMSys Demo papers

  • Computer Aided Disease Detection System for Gastrointestinal Examinations
    Michael Riegler, Konstantin Pogorelov, Jonas Markussen, Mathias Lux, Håkon Kvale Stensland, Thomas de Lange, Carsten Griwodz, Pål Halvorsen, Dag Johansen, Peter Thelin Schmidt, Sigrun L. Eskeland
  • Immersed gaming in Minecraft
    Milan Loviska, Otto Krause, Herman A. Engelbrecht, Jason B. Nel, Gregor Schiele, Alwyn Burger, Stephan Schmeißer, Christopher Cichiwskyj, Lilian Calvet, Carsten Griwodz, Pål Halvorsen
  • Ultra-Low Delay for All: Live Experience, Live Analysis
    Olga Bondarenko, Koen De Schepper, Ing-Jyh Tsang, Bob Briscoe, Andreas Petlund, Carsten Griwodz
  • Efficient Processing of Videos in a Multi Auditory Environment Using Device Lending of GPUs
    Konstantin Pogorelov, Michael Riegler, Jonas Markussen, Håkon Kvale Stensland, Pål Halvorsen, Carsten Griwodz, Sigrun Losada Eskeland, Thomas de Lange

C2Tag, a robust and accurate fiducial marker system for image-based localization from challenging images

Our paper on C2Tags has been accepted for publication at CVPR 2016.

C2Tags are a new approach to visual marker tracking that has been designed for localization in challenging environments, such as the film sets in our H2020 project POPART.

Like a few earlier marker systems, C2Tags consist of concentric rings whose positions in space are reconstructed before tracking occurs. The new contribution of C2Tags lies in the detection algorithms, which tolerate considerable partial occlusion and intense motion blur while still locating and subsequently identifying the marker.

Examples of C2Tag resilience


For POPART, where we use C2Tags to track film camera movement on film sets where natural feature detectors fail or faster processing is required, this ability to handle fast motion and occlusion is a major game changer.

Paper Abstract

Fiducials offer reliable detection and identification of images of known planar figures in a view. They are used in a wide range of applications, especially when a reliable reference is needed to, e.g., estimate the camera movement in cluttered or textureless environments. A fiducial designed for such applications must be robust to partial occlusions, varying distances and angles of view, and fast camera movements.

In this paper, we present a new fiducial system whose markers consist of concentric circles: relying on their geometric properties, the proposed system accurately detects the position of the image of the circles' common center. Moreover, the different thicknesses of its rings can be used to encode the information associated with the marker, thus allowing its unambiguous identification. We demonstrate that the proposed fiducial system can be detected in very challenging conditions, and the experimental results show that it outperforms other recent fiducial systems.
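To illustrate the principle of encoding an identifier in ring thicknesses, here is a toy sketch. This is not the paper's actual coding scheme; the function, threshold and bit assignment are entirely hypothetical. The point it demonstrates is that using only ratios of thicknesses makes the ID independent of scale and viewing distance:

```python
def decode_marker(radii):
    """Toy decoder: each ring is 'thin' (bit 0) or 'thick' (bit 1)
    relative to the innermost ring's thickness, and the bit string
    is read as the marker ID. `radii` are the rings' outer radii."""
    radii = sorted(radii)
    unit = radii[1] - radii[0]               # reference thickness
    bits = []
    for inner, outer in zip(radii[1:], radii[2:]):
        ratio = (outer - inner) / unit       # scale-invariant ratio
        bits.append(1 if ratio > 1.5 else 0)
    return sum(b << i for i, b in enumerate(bits))
```

Because only the ratios enter the decision, scaling all radii by the same factor (moving the camera closer or farther) leaves the decoded ID unchanged.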

Trailer of Third Life premiere available

Artists Otto Krause and Milan Loviška took to the stage in three public performances of the Third Life Project, while their research collaborators from Stellenbosch University in South Africa, the University of Duisburg-Essen in Germany and Simula Research Laboratory in Norway followed closely from around the stage, ready to step in and assist on either the virtual or the real-world side of the performance.

WUK: Third Life (8.10.–10.10.2015, dress rehearsal) | Photos:

High Performance Computing meets Performance Art

In their performative lecture, the artists, together with an international team of experts, exploit technology and employ artistic vision to blur the lines between human beings and machines, and between reality and imagination. They explore current possibilities for developing an avatar performance for a real-life audience, one that operates within mixed realities (real and “second life”) and aspires to open the door to the “third life”, where virtuality can transgress directly into reality.

Using a “smart stage”, they address the new performative possibilities of virtual environments that are not limited or constrained by the local space that the physical bodies inhabit. This unique interface of a simulated virtual world, the Internet of Things and novel tracking technologies allows virtual characters to perform activities in the real world, while activities of performers in the real world trigger changes in the virtual world. The notion of a third life is manifested here not only in the synchronous interconnection of the virtual and the real, but also in their divergence, and it calls into question and re-examines what a body is, how a body operates, and whether that body is alive or dead, real or virtual.

The Third Life Project, initiated in early 2014, combines artistic and scientific research and is devised as an ongoing, networked collaboration across national boundaries.

See more photos and a video summary from the 3 performances at

Premiere @ WUK Vienna, 08 October 2015.

Concept/Dramaturgy/Scenography/Performance: Otto Krause & Milan Loviška
Virtual environments of Minecraft: Otto Krause alias Aproktas
Minecraft expertise and gesture control: Herman Engelbrecht, Jason Bradley Nel (Stellenbosch University/MIH Medialab, South Africa)
Tracking: Carsten Griwodz, Lilian Calvet (Simula Research Lab & LABO Mixed Realities, Norway)
Cyberphysical devices and Non-Player-Characters: Gregor Schiele, Alwyn Burger, Stephan Schmeißer, Christopher Cichiwskyj (University of Duisburg-Essen, Germany)
Server: René Griessl (Bielefeld University, Germany)

A co-production of Territorium – Kunstverein and WUK Performing Arts in Vienna.

With the kind support of the City of Vienna's Department of Cultural Affairs, the Arts Division, and the Arts and Culture Division of the Federal Chancellery of Austria. With contributions from the FiPS project, funded from the EU's 7th Framework Programme for research, technological development and demonstration under grant agreement no. 609757. Thanks to LABO Mixed Realities in Norway, the EU project POPART (Previz for On-set Production – Adaptive Realtime Tracking), funded under grant agreement no. 644874, Bielefeld University in Germany and Stellenbosch University in South Africa.


