Author Archives: Michael Riegler

ACM Multimedia papers accepted

We can report another two accepted papers in the two most competitive tracks for the ACM Multimedia Conference 2016.

Multimedia and Medicine: Teammates for Better Disease Detection and Survival
Michael Riegler, Mathias Lux, Carsten Griwodz, Concetto Spampinato, Thomas de Lange, Sigrun L. Eskeland, Konstantin Pogorelov, Wallapak Tavanapong, Peter T. Schmidt, Cathal Gurrin, Dag Johansen, Håvard Johansen, Pål Halvorsen

OpenVQ – A Video Quality Assessment Toolkit
Kristian Skarseth, Henrik Bjørlo, Pål Halvorsen, Michael Riegler, Carsten Griwodz

Special session accepted for MMM conference 2017

MPG and some of our collaborators proposed a special session for the Multimedia Modeling Conference 2017 in Iceland. We can now announce that our proposal has been accepted. The special session will be an evolved version of CrowdMM: Crowdsourcing for Multimedia.


The session will focus on advancing the state of the art in best practices for the use of crowdsourcing in multimedia research. It will address a wide range of topics within the field of multimedia, cutting across all of the main conference areas. We welcome contributions dealing with, for example, crowdsourcing-based identification and evaluation of multimedia QoE (Area: Multimedia HCI and QoE), visual and audio indexing via human computation or hybrid techniques (Area: Multimedia Search and Recommendation), or the use of crowdsourcing for analysing affect portrayed in or elicited by multimedia content. Contributions in the emerging areas of Emotional and Social Signals in Multimedia, such as User Intent and Affection, are specifically welcome, as long as they have a strong methodological focus on crowdsourcing. We will put emphasis on tackling the methodological challenges listed in the “topics” section below.

Crowdsourcing cannot be considered a mature technology for multimedia research if the results it produces are not repeatable. To take crowdsourcing to the next level, it is necessary to determine best practices for the design of tests and incentive schemes, as well as robust data analysis and quality control techniques. In the longer term (beyond 2017), our goal is to generate guidelines and recommendations for the use of crowdsourcing in multimedia, possibly also involving standardization bodies. To do so, it is necessary to focus on crowdsourcing not only as a means for multimedia research, but also as an end.

The main proposers are:

Guillaume Gravier, IRISA, France

Guillaume Gravier is a senior research scientist at the Centre National de la Recherche Scientifique (CNRS). Since 2002, he has been working at the IRISA lab, where he currently leads the multimedia group. With a background in statistical speech modeling, his research focuses on multimedia analytics: multimodal content modeling, multimedia pattern mining, natural language processing, and video hyperlinking. He is president of the French-speaking Speech Communication Association and co-founded the ISCA SIG on Speech and Language in Multimedia (SLIM), which he has chaired since 2013. He is a member of the board of the national ICT cluster Images et Réseaux and the technical representative of Inria in the PPP BDVA. He has also been involved in the organization of major conferences and of national and international evaluation benchmarks.

Mathias Lux, Klagenfurt University, Austria

Mathias Lux is Associate Professor at the Institute for Information Technology (ITEC) at Klagenfurt University. He works on user intentions in multimedia retrieval and production and on emergent semantics in social multimedia computing. In his scientific career he has (co-)authored more than 80 scientific publications, has served on multiple program committees and as a reviewer for international conferences, journals and magazines, and has organized multiple scientific events. Mathias Lux is also well known for developing the award-winning and popular open-source tools Caliph & Emir and LIRE for multimedia information retrieval.

Michael Riegler, Simula, Norway

Michael Riegler is a PhD student at Simula Research Laboratory. He received his master's degree with distinction from Klagenfurt University. His master's thesis, on large-scale content-based image retrieval, was written at Delft University of Technology under the supervision of Martha Larson. He is part of the EONS project at the Media Performance Group. His research interests are endoscopic video analysis and understanding, image processing, image retrieval, parallel processing, gamification and serious games, crowdsourcing, social computing and user intentions. Furthermore, he is involved in several initiatives, such as the MediaEval Benchmarking Initiative for Multimedia Evaluation, and has (co-)authored more than 30 scientific publications.

Steering Committee:

Martha Larson, Delft University of Technology, Netherlands, and Radboud University Nijmegen, Netherlands

Judith Redi, Delft University of Technology, Netherlands

Papers accepted at CBMI 2016

We can report another three accepted papers for CBMI 2016.

Crowdsourcing as Self Fulfilling Prophecy: Influence of Discarding Workers in Subjective Assessment Tasks. Michael Riegler, Vamsidhar Reddy Gaddam, Martha Larson, Ragnhild Eg, Pål Halvorsen and Carsten Griwodz

Explorative Hyperbolic-Tree-Based Clustering Tool for Unsupervised Knowledge Discovery. Michael Riegler, Konstantin Pogorelov, Mathias Lux, Pål Halvorsen, Carsten Griwodz, Thomas de Lange and Sigrun Losada Eskeland

EIR – Efficient Computer Aided Diagnosis Framework for Gastrointestinal Endoscopies. Michael Riegler, Konstantin Pogorelov, Pål Halvorsen, Thomas de Lange, Carsten Griwodz, Peter Thelin Schmidt, Sigrun Losada Eskeland and Dag Johansen

Visitor from Japan

Last weekend we had a visitor from Gifu University, Dr. Satoshi Tamura.

Satoshi has been working in the following areas: speech signal processing, computer vision and image processing, music information processing, and natural language (text) processing. He has also been investigating multimodal information processing, such as audio-visual speech recognition, voice activity detection, speech conversion, and model adaptation using speech signals as well as lip images, and cross-modal research, such as the application of speech technologies to other areas. Through these activities, he would like to improve the performance of individual pattern recognition tasks, e.g. speech recognition, and to explore a universal recognition algorithm that can be applied across many pattern recognition areas. In addition, he has been collaborating with doctors and researchers in the school of medicine.

Satoshi will work with us on our medical projects, such as the live colonoscopy video analysis, and he will also support our collaboration and communication with the Hiroshima University Hospital.






The problem of overbuffering in today’s Internet (termed bufferbloat) has recently drawn a great deal of attention from the research community. This has led to the development of various active queue management (AQM) schemes. Recent years have seen a lot of effort to show the benefits of AQMs over simple tail-drop queuing and to encourage deployment. Yet it is still unknown to what extent AQMs are deployed in the Internet. We have developed an active measurement tool, called TADA (Tool for Automatic Detection of AQMs), that can detect whether the bottleneck router on a particular communication path uses AQM. Our detection technique is based on analyzing the patterns of queue delays and packet losses. The tool is composed of a Sender process running on the sender machine and a Receiver process running on the receiver machine. The tool uses UDP for sending constant bit rate (probing) streams and TCP for a control channel between the endpoints.
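To give a feeling for what "analyzing the patterns of queue delays and packet losses" can mean, here is a minimal Python sketch. It is not TADA's actual algorithm: the heuristic, the `Probe` record, the function name and the two thresholds are all assumptions made for illustration. The idea is that a tail-drop queue only drops packets when it is full, so losses should coincide with the highest observed queue delay, while an AQM tends to drop earlier.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Probe:
    """One probe packet of a constant bit rate stream (hypothetical record)."""
    seq: int
    delay_ms: Optional[float]  # one-way delay; None means the probe was lost


def classify_bottleneck(probes: List[Probe],
                        near_full: float = 0.9,
                        majority: float = 0.8) -> str:
    """Illustrative heuristic only; thresholds are assumed, not from TADA."""
    received = [p.delay_ms for p in probes if p.delay_ms is not None]
    if not received:
        return "unknown"
    base = min(received)          # approximates propagation delay (empty queue)
    peak = max(received) - base   # largest queue delay seen in the trace
    if peak <= 0:
        return "unknown"

    # For every lost probe, use the queue delay of the last received probe
    # as an estimate of how full the queue was when the loss happened.
    loss_levels = []
    last_queue_delay = None
    for p in probes:
        if p.delay_ms is not None:
            last_queue_delay = p.delay_ms - base
        elif last_queue_delay is not None:
            loss_levels.append(last_queue_delay / peak)

    if not loss_levels:
        return "unknown"
    near_peak = sum(level >= near_full for level in loss_levels)
    if near_peak / len(loss_levels) >= majority:
        return "tail-drop-like"
    return "aqm-like"
```

A trace in which every loss follows the highest delay is reported as "tail-drop-like", while a trace with losses at low queue delays is reported as "aqm-like". A real detector would need many more probes and has to cope with cross traffic and clock offsets.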

The source code can be found at:

DigSys Pillcam Medical Workshop

The Media Performance Group hosted a workshop for the DigSys Pillcam pre-project. In this workshop we tried to bring the medical and the computer science worlds together. After two days of fruitful discussion and hard work we achieved this goal, and we even managed to define a catalogue of abnormalities in the digestive system that medical doctors need to be able to detect.





OpenSea – A search-based classification tool

OpenSea contains software for experimenting with image recognition and classification based on global image features. It is able to classify single images but it also can extract frames from videos and classify them in real time. The software is easy to handle and can for example be used as a simple but strong baseline for evaluating classifiers.
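The general idea of search-based classification on global image features can be sketched in a few lines of Python. This is not OpenSea's code: the intensity histogram, the L1 distance, the value of k and all function names are assumptions chosen to keep the example self-contained. An image is reduced to one feature vector for the whole frame, the most similar indexed examples are looked up, and their labels vote.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def global_histogram(pixels: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized intensity histogram over the whole image (8-bit pixels)."""
    hist = [0.0] * bins
    for value in pixels:
        hist[min(value * bins // 256, bins - 1)] += 1.0
    total = len(pixels) or 1
    return [count / total for count in hist]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify(query_pixels: Sequence[int],
             index: List[Tuple[List[float], str]],
             k: int = 3) -> str:
    """Search the index for the k nearest features and return the majority label."""
    query = global_histogram(query_pixels)
    ranked = sorted(index, key=lambda entry: l1_distance(query, entry[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Tiny made-up index: each entry is (feature vector, label).
index = [
    (global_histogram([5] * 16), "dark"),
    (global_histogram([20] * 16), "dark"),
    (global_histogram([250] * 16), "bright"),
    (global_histogram([230] * 16), "bright"),
]
print(classify([10] * 16, index))
```

For video, the same `classify` call would simply be applied to each extracted frame.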

This video is a demonstration of how the software works and how it can be used in a medical scenario.

OpenSea can be accessed via

New Collaboration with AHO

We recently started collaborating with Kjetil Nordby from the Ocean Industries Concept Lab at the Oslo School of Architecture and Design (AHO).

One of their projects is the Ulstein Bridge Vision. The visualisation shows how multimodal interaction may be applied on future offshore service vessels. The concepts are developed as part of the Ulstein Bridge Concept research project.
Videos about their work:

The MPG Demos @ ACM MMSys 2015

This year our group was able to get several demos accepted at the ACM MMSys Conference 2015 in Portland. The demos are:

Scaling Virtual Camera Services to a Large Number of Users. In this demo we show how a PTZ camera system can be used without consuming much bandwidth. To reduce bandwidth usage, we adaptively lower the quality in regions where full-quality data is not needed. Video:

Energy Efficient Video Encoding Using the Tegra K1 Mobile Processor. This demonstration shows how hardware and software configuration impacts the running power usage of a live video encoder. The encoder, Codec 63, is architecturally similar to H.264 and Google’s VP8 and runs on a Tegra K1 mobile processor. Participants can offload video processing to a GPU, change the CPU and GPU operating frequencies, migrate between CPU clusters and turn off CPU cores. The effects of these settings in terms of achieved frame rate, power usage and energy per encoded frame are displayed live. How energy-efficiently can you encode yourself? Video:

How much delay is there really in current games? This demonstration uses a typical gaming setup wired to an oscilloscope to show how long the total local delay is. Participants can also bring their own computers and games so that they can measure delays in the games or other software.

Expert Driven Semi-Supervised Elucidation Tool for Medical Endoscopic Videos. In this demo we present a semi-supervised annotation tool for medical experts. The tool should help to collect medical data for machine learning and computer vision approaches. To this end, we combine lightweight, time-efficient manual annotations with object tracking algorithms. Video:
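For the Tegra K1 encoding demo above, the three displayed quantities are related by simple arithmetic: energy per encoded frame is average power divided by achieved frame rate. The Python sketch below only illustrates that relation. The function name and the numbers are made up for the example and are not measurements from the demo.

```python
def energy_per_frame_joules(power_watts: float, frames_per_second: float) -> float:
    """Energy spent per encoded frame: watts are joules per second, so W / fps = J/frame."""
    if frames_per_second <= 0:
        raise ValueError("frame rate must be positive")
    return power_watts / frames_per_second


# Two hypothetical configurations (illustrative numbers only).
slow_and_frugal = energy_per_frame_joules(power_watts=3.0, frames_per_second=10.0)
fast_and_hungry = energy_per_frame_joules(power_watts=5.0, frames_per_second=25.0)
print(f"{slow_and_frugal:.2f} J/frame vs {fast_and_hungry:.2f} J/frame")
```

With these example values, the configuration that draws more power is still the more energy-efficient one per frame, because it finishes each frame faster. That trade-off is the reason the demo shows energy per frame next to raw power.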

