
RATS at the Festival of Lights 2019

RATS is short for Real-time Adaptive Three-sixty Streaming, software developed mainly by Trevor Ballardt while he was at TU Darmstadt working for the MAKI SFB. RATS was used in a test case of the H2020 project 5Genesis, which is preparing to showcase 5G during the Festival of Lights 2020 in Berlin. The test was conducted at Humboldt University during the Festival of Lights 2019.

Fraunhofer Fokus, one of the local partners in Berlin, wrote a piece to summarize the test from the 5Genesis perspective. We contributed video streaming and talk about it in this article.

FoL 2019 at the Humboldt University (© Magnus Klausen 2019)

The original idea of RATS was to use NVenc to convert an input stream from a 360 camera into a set of tiles in real time, which could be encoded at several qualities on the server before stitching them into a set of tiled H.265 videos. These H.265 videos would form a succession of qualities suitable for the orientation of the 360 camera. This idea was published in a demo paper at ACM MMSys 2019, and the intended application in 5Genesis was published as a short paper at ACM Mobicom’s S3. The code for RATS can be found on GitHub.

However, 5Genesis is also about dense user populations that access a live video feed, and such density can only be achieved if users can stream to their mobile phones without installing any additional software. The RATS idea would work perfectly for this if mobile phones’ browsers supported H.265. Unfortunately, Android phones do not.

So, instead of tiling in the sensible manner, we modified a clone of ffmpeg (a clone with minor modifications that is required for RATS). ffmpeg can be configured to use NVenc for encoding video streams in H.264, and it can also generate fragmented MPEG4 with suitable HLS and DASH manifest files. In the case of DASH, MPD (manifest) files can take the form of templates, which removes the need for clients to download updates even for live streams, while HLS clients require updates. Instead of merging tiles after compressing them separately, we used Gaussian filtering on tile-shaped regions of the video to reduce the coding complexity. An arbitrary number of these versions can be generated in parallel, using our new ffmpeg CUDA module for partial blurring.
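The partial blurring idea can be illustrated with a small CPU sketch. This is not the ffmpeg CUDA module itself; it is a minimal pure-Python illustration that assumes a grayscale image stored as a list of rows. The function names, the default kernel parameters and the choice to clamp samples at the tile border are all assumptions made for this example.

```python
import math

def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_region(image, x0, y0, x1, y1, sigma=1.0, radius=2):
    """Blur the rectangle [x0, x1) x [y0, y1) of a grayscale image.

    Pixels outside the rectangle are returned unchanged. Sample positions
    are clamped to the rectangle (an assumption of this sketch), so the
    blur never mixes in pixels from neighbouring tiles.
    """
    kernel = gaussian_kernel(sigma, radius)
    tmp = [row[:] for row in image]
    out = [row[:] for row in image]
    # Horizontal pass over the tile.
    for y in range(y0, y1):
        for x in range(x0, x1):
            acc = 0.0
            for k, w in enumerate(kernel):
                sx = min(max(x + k - radius, x0), x1 - 1)
                acc += w * image[y][sx]
            tmp[y][x] = acc
    # Vertical pass over the horizontally blurred tile.
    for y in range(y0, y1):
        for x in range(x0, x1):
            acc = 0.0
            for k, w in enumerate(kernel):
                sy = min(max(y + k - radius, y0), y1 - 1)
                acc += w * tmp[sy][x]
            out[y][x] = acc
    return out

# Blur only the top-left 4x4 tile of an 8x8 checkerboard.
checkerboard = [[255.0 if (x + y) % 2 else 0.0 for x in range(8)] for y in range(8)]
blurred = blur_region(checkerboard, 0, 0, 4, 4)
```

Blurring removes high-frequency detail inside the chosen tile, which is what lowers the coding cost of that region, while the rest of the frame keeps its full detail.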

The camera that we installed at the FoL 2019 was a surveillance camera with a fisheye lens (actually a panomorph lens, but close enough to fisheye to make our life easy), while we settled on VideoJS for panorama display in our cross-platform web pages, which should show the FoL videos on arbitrary browsers. It was a bit irritating that the current version of VideoJS has lost fisheye projection support while it has gained both DASH and HLS support.

Consequently, we had to project our panorama from fisheye to equirectangular projection. We followed two approaches. In one, we added a reprojection module to ffmpeg that uses CUDA to make the conversion before streaming, followed by a configuration of VideoJS that projects only one half of a sphere, since a fisheye camera records only a single hemisphere. In the other, we extended VideoJS to support fisheye lenses directly. While the first piece of code may be more generally useful, we found that the single conversion of the second approach (which will be published in a master’s thesis next year) provides better visual quality.
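As a rough illustration of the geometry behind such a reprojection, the following sketch maps a viewing direction (the longitude and latitude of an equirectangular pixel) to normalized fisheye image coordinates. It assumes an ideal equidistant fisheye, where the distance from the image centre grows linearly with the angle from the optical axis. A panomorph lens has a different radial mapping, and the function name, axis conventions and sign choices here are assumptions for this example only.

```python
import math

def equirect_to_fisheye(lon, lat, fov=math.pi):
    """Map a viewing direction to normalized fisheye image coordinates.

    lon, lat: longitude and latitude in radians, with (0, 0) on the optical axis.
    fov: full field of view of the lens in radians (pi covers one hemisphere).
    Returns (u, v) inside the unit disc, or None if the direction lies
    outside the field of view of the lens.
    """
    # Unit direction vector; z points along the optical axis.
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    theta = math.acos(max(-1.0, min(1.0, z)))  # angle from the optical axis
    if theta > fov / 2:
        return None
    r = theta / (fov / 2)      # equidistant model: radius proportional to angle
    phi = math.atan2(y, x)     # direction around the optical axis
    return (r * math.cos(phi), r * math.sin(phi))

# A direction 45 degrees to the side lands halfway to the rim of the image circle.
print(equirect_to_fisheye(math.pi / 4, 0.0))
```

A converter would evaluate this mapping once per output pixel and sample the fisheye frame at the resulting position. Directions behind the camera return None, which corresponds to the empty half of the sphere.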

Papers accepted at CBMI 2016

We can report another three accepted papers for CBMI 2016.

Crowdsourcing as Self Fulfilling Prophecy: Influence of Discarding Workers in Subjective Assessment Tasks. Michael Riegler, Vamsidhar Reddy Gaddam, Martha Larson, Ragnhild Eg, Pål Halvorsen and Carsten Griwodz

Explorative Hyperbolic-Tree-Based Clustering Tool for Unsupervised Knowledge Discovery. Michael Riegler, Konstantin Pogorelov, Mathias Lux, Pål Halvorsen, Carsten Griwodz, Thomas de Lange and Sigrun Losada Eskeland

EIR – Efficient Computer Aided Diagnosis Framework for Gastrointestinal Endoscopies. Michael Riegler, Konstantin Pogorelov, Pål Halvorsen, Thomas de Lange, Carsten Griwodz, Peter Thelin Schmidt, Sigrun Losada Eskeland and Dag Johansen

C2Tag, a robust and accurate fiducial marker system for image-based localization from challenging images

Our paper on C2Tags has been accepted for publication in CVPR 2016.

C2Tags are a new approach to visual marker tracking that has been designed for localization in challenging environments, such as the film sets in our H2020 project POPART.

Like the markers in a few earlier papers, C2Tags consist of concentric rings, whose position in space has been reconstructed before tracking occurs. The new contribution of C2Tags lies in the detection algorithms, which tolerate considerable partial occlusion and intense motion blur and still locate and subsequently identify the marker.

Examples of C2Tag resilience


For POPART, where we use the C2Tags to track film camera movement on film sets where natural feature detectors fail or faster processing is required, this ability to handle fast motion and occlusion is a major game changer.

Paper Abstract

Fiducials offer a reliable detection and identification of images of known planar figures in a view. They are used in a wide range of applications, especially when a reliable reference is needed to, e.g., estimate the camera movement in cluttered or textureless environments. A fiducial designed for such applications must be robust to partial occlusions, varying distances and angles of view, and fast camera movements. In this paper, we present a new fiducial system, whose markers consist of concentric circles: relying on their geometric properties, the proposed system allows accurate detection of the position of the image of the circles’ common center. Moreover, the different thickness of its rings can be used to encode the information associated to the marker, thus allowing the univocal identification of the marker. We demonstrate that the proposed fiducial system can be detected in very challenging conditions and the experimental results show that it outperforms other recent fiducial systems.

Redundant Data Bundling in TCP

Redundant Data Bundling (RDB) is a mechanism for TCP that aims to reduce the per-packet latency for traffic produced by interactive applications. The master’s thesis Taming Redundant Data Bundling: Balancing fairness and latency for redundant bundling in TCP[1] presents a new implementation in the Linux kernel, along with a detailed set of experiments that show the benefits of RDB. The thesis builds on the original work on RDB[2], and addresses some unresolved issues making the new implementation a candidate for widespread deployment. The paper Latency and Fairness Trade-Off for Thin Streams using Redundant Data Bundling in TCP[3] presented at LCN2015 describes the RDB mechanism in detail.

Paper Abstract

Time-dependent applications using TCP often send thin-stream traffic, characterised by small packets and high intertransmission-times. Retransmissions after packet loss can result in very high delays for such flows as they often cannot trigger fast retransmit. Redundant Data Bundling is a mechanism that preempts the experience of loss for a flow by piggybacking unacknowledged segments with new data as long as the total packet size is lower than the flow maximum segment size. Although successful at reducing retransmission latency, this mechanism had design issues leaving it open for abuse, effectively making it unsuitable for general Internet deployment. In this paper, we have redesigned the RDB mechanism to make it safe for deployment. We improve the trigger for when to apply it and evaluate its fairness towards competing traffic. Extensive experimental results confirm that our proposed modifications allow for inter-flow fairness while maintaining the significant latency reductions from the original RDB mechanism.

Thin streams and latency reducing mechanisms

Applications like online games, remote control systems, high-frequency trading and sensor networks provide a hugely increased utility when their per-packet latencies are at their lowest. They have in common that they produce mostly traffic with thin-stream characteristics, consisting of small packet payloads and high inter-transmission times (ITTs). Table 1 shows some examples of traffic properties for applications producing thin streams.

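| Application | Payload avg (B) | Payload min (B) | Payload max (B) | Inter-arrival avg (ms) | Inter-arrival med (ms) | Inter-arrival min (ms) | Inter-arrival max (ms) | Inter-arrival 1% (ms) | Inter-arrival 99% (ms) | Avg. BW req (pps) | Avg. BW req (bps) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| VNC (from client) | 8 | 1 | 106 | 34 | 8 | 0 | 5451 | 0 | 517 | 29.412 | 17K |
| Skype (2 users) | 236 | 14 | 1267 | 34 | 40 | 0 | 1671 | 4 | 80 | 29.412 | 69K |
| SSH text session | 48 | 16 | 752 | 323 | 159 | 0 | 76610 | 32 | 3616 | 3.096 | 2825 |
| Anarchy Online | 98 | 8 | 1333 | 632 | 449 | 7 | 17032 | 83 | 4195 | 1.582 | 2168 |
| World of Warcraft | 26 | 6 | 1228 | 314 | 133 | 0 | 14855 | 0 | 3785 | 3.185 | 2046 |
| Age of Conan | 80 | 5 | 1460 | 86 | 57 | 0 | 1375 | 24 | 386 | 11.628 | 12K |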
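The packets-per-second column in the table appears to follow directly from the average inter-arrival time. The short sketch below checks this relationship for a few rows. The function name is hypothetical and the relationship is inferred from the numbers in the table, not stated in the source.

```python
def packets_per_second(avg_interarrival_ms):
    """Average packet rate implied by an average inter-arrival time in milliseconds."""
    return 1000.0 / avg_interarrival_ms

# Average inter-arrival times (ms) copied from the table.
avg_interarrival_ms = {
    "VNC (from client)": 34,
    "SSH text session": 323,
    "Anarchy Online": 632,
    "World of Warcraft": 314,
    "Age of Conan": 86,
}

for app, itt in avg_interarrival_ms.items():
    print(f"{app}: {packets_per_second(itt):.3f} pps")
```

At only a few packets per second, such flows rarely have enough packets in flight to generate the duplicate ACKs that trigger fast retransmit.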


Some existing mechanisms in the Linux kernel aimed at reducing latencies for TCP are

  • Thin Linear Timeouts (LT)
  • Modified fast retransmit (mFR)
  • Early Retransmit (ER)
  • Tail Loss Probe (TLP)

These mechanisms are reactive, i.e. they modify how the TCP sender reacts to loss or possible loss by triggering retransmits faster. The problem is that in a best case scenario, they still require at least an RTT of extra delay for the loss signal to reach the sender host before the data is retransmitted. The exception to this is in cases where the TLP packet retransmits the lost packet.

Read more about time-dependent networking (TDN) and earlier work on thin streams and interactive applications.

Redundant Data Bundling

In contrast to the earlier mentioned reactive mechanisms, RDB is a proactive mechanism which tries to prevent unnecessary delays when packet loss occurs. By bundling already sent data with packets containing new data, RDB can be considered a retransmission mechanism that retransmits segments even before any loss signals are present.

An important property of RDB is that it does not produce extra packets on the network, but instead utilizes the “free” space in small TCP packets produced by interactive applications.


Figure 1: Example of an Ethernet frame for a TCP packet with 100 bytes payload.

TCP requires that the payload in each packet is sequential; however, a receiver is required to accept any new data in a packet even if some of the data has already been received. RDB takes advantage of this by bundling already sent (but un-ACKed) data segments in packets with new data.

Figure 2 shows an example with four separate data segments (S1-S4) illustrating how RDB organizes the data segments in each packet.



Figure 2: Example showing how RDB bundles the data of previously sent packets in packets with new data.
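The bundling pattern can be sketched in a few lines. This is a simplified model, not the kernel implementation: it assumes that no ACK arrives between sends, so every earlier segment is still un-ACKed, and that older segments are added newest-first while the payload stays within one MSS. The function name, the MSS value of 1460 bytes and the `max_bundled` parameter are assumptions for this example.

```python
def build_rdb_packets(segment_sizes, mss=1460, max_bundled=None):
    """Return, for each new segment, the segment indices carried in its packet.

    segment_sizes: payload size in bytes of each application write.
    mss: assumed maximum segment size in bytes.
    max_bundled: optional cap on how many older segments one packet may carry.
    """
    packets = []
    for new in range(len(segment_sizes)):
        payload = segment_sizes[new]
        carried = [new]
        older = new - 1
        while older >= 0:
            if max_bundled is not None and len(carried) - 1 >= max_bundled:
                break
            if payload + segment_sizes[older] > mss:
                break
            payload += segment_sizes[older]
            carried.insert(0, older)   # keep the payload sequential
            older -= 1
        packets.append(carried)
    return packets

# Four 100-byte segments S1-S4 (indices 0-3): each packet repeats all earlier segments.
print(build_rdb_packets([100, 100, 100, 100]))
# Same writes, but each packet may bundle at most one earlier segment.
print(build_rdb_packets([100, 100, 100, 100], max_bundled=1))
```

Because the carried segments always form a contiguous range that ends with the new segment, the payload of every packet stays sequential, as TCP requires.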


Note about “Free” or unused space in Ethernet frames
For small TCP packets with less than 1 * MSS worth of payload, the payload does not fill the maximum segment size, so the Ethernet frame has room to spare. “Free” in this context refers to the additional space that may be used without increasing the number of Ethernet frames sent through the network.

RDB in action

Figure 3 shows a scenario where an application performs write calls to the TCP socket with 365 bytes per call. When two consecutive packets are lost, the lost segments are “recovered” by the third packet which bundles the (new) segments from the previous two packets.



Figure 3: Packet timeline of a TCP thin stream using RDB.
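The recovery behaviour in this scenario can be reproduced with a toy receiver model. The sketch assumes an MSS of 1460 bytes, so up to three older 365-byte segments fit beside each new one, and it ignores ACKs, retransmissions and timing. The function name and the way packets are represented as segment ranges are assumptions for this example.

```python
def deliver(packets, lost):
    """Simulate an in-order receiver.

    packets: list of (first_segment, last_segment) ranges, one per packet sent.
    lost: set of packet indices dropped by the network.
    Returns a dict mapping segment index to the index of the packet that
    delivered it to the application.
    """
    delivered_by = {}
    next_expected = 0
    for idx, (first, last) in enumerate(packets):
        if idx in lost:
            continue
        # Only a packet that covers the next expected segment lets the receiver advance.
        if first <= next_expected <= last:
            for seg in range(next_expected, last + 1):
                delivered_by[seg] = idx
            next_expected = last + 1
    return delivered_by

# With RDB: packet k carries segment k plus up to three older segments.
rdb_packets = [(max(0, k - 3), k) for k in range(4)]
print(deliver(rdb_packets, lost={0, 1}))

# Without RDB: one segment per packet, so nothing is delivered until a retransmission.
plain_packets = [(k, k) for k in range(4)]
print(deliver(plain_packets, lost={0, 1}))
```

In the RDB case the third packet delivers the first three segments at once. In the plain TCP case the later packets arrive but cannot be handed to the application, because the first segment is still missing.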


Limiting RDB to thin streams

To ensure only thin streams can use RDB, a mechanism called Dynamic Packets in Flight Limit (DPIFL) is used. By specifying a minimum allowed ITT, a maximum packets in flight limit is calculated dynamically based on the current RTT measurements. Figure 4 shows how the DPIFL is calculated for the minimum ITT limits 5, 10, 20 and 30 ms, with different RTTs.


Figure 4: Example of DPIFL for different minimum ITT values


For comparison, a Static Packets in Flight Limit (SPIFL) of 3 is included in figure 4 to show how the thin stream test currently used in the Linux kernel does not adjust the limit based on RTT measurements.
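A plausible reading of the dynamic limit is that a stream sending one packet per minimum ITT has RTT divided by ITT packets in flight, so the limit scales with the RTT. The sketch below uses that assumption. The exact formula and rounding used in the kernel implementation may differ, and the function names are hypothetical.

```python
def dpifl(rtt_ms, min_itt_ms):
    """Dynamic packets-in-flight limit, assuming limit = RTT / minimum ITT."""
    return rtt_ms / min_itt_ms

def may_bundle(packets_in_flight, limit):
    """A stream counts as thin, and may bundle, while its PIFs do not exceed the limit."""
    return packets_in_flight <= limit

STATIC_LIMIT = 3  # the static limit (SPIFL) used for comparison

for rtt_ms in (50, 100, 150, 200):
    limits = {itt: dpifl(rtt_ms, itt) for itt in (5, 10, 20, 30)}
    print(f"RTT {rtt_ms} ms: dynamic limits {limits}, static limit {STATIC_LIMIT}")
```

Under this assumption a stream with an ITT of 30 ms on a 150 ms path has 5 packets in flight, which a static limit of 3 would reject although the stream is thin.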

Experiments and results

A wide range of tests have been performed to evaluate the effects of RDB with regard to reduced latencies, as well as fairness towards competing network traffic. The experiments have been performed in a testbed with Linux (Debian) hosts as depicted in figure 5.



Figure 5: Testbed setup

Tests with uniform loss rate

A set of tests was performed where netem on BRIDGE2 was configured with a uniform loss rate. These tests illustrate the difference in latency between regular TCP and RDB with different bundling rates. Figure 6 shows the result from one of the test sets, with a uniform loss rate of 10%. The plot shows the ACK latency for three separate tests, each of which started 20 different TCP streams. One test used only regular TCP. The two other tests each ran 10 regular TCP streams and 10 RDB streams, with a limit imposed on when redundant data could be bundled. For RDB SPIFL 3, each TCP stream was allowed to bundle only when there were three or fewer packets in flight (PIFs). With an ITT of 30 ms and a network delay of 75 ms in each direction (RTT of 150 ms), each stream would have 5 PIFs as long as it is not congestion limited.

Figure 6: Experiment with uniform loss



Comparing the latency of the different stream types, we see a significant difference when RDB is enabled. The RDB streams that were allowed to bundle while PIFs <= 7 have a significantly better ACK latency, where 90% of the TCP segments have no extra delay. For the regular TCP streams we see that almost 60% of the TCP segments have a higher ACK latency than the ideal 150 ms, even when only 10% of the packets are lost. This is due to head-of-line blocking causing delays not only for the lost segment, but for every segment transmitted after the lost segment, until a retransmission arrives at the receiver side.

Latency tests with cross traffic over a shared bottleneck

In these experiments a set of RDB-enabled thin streams and regular TCP streams are tested while competing against greedy TCP flows over a shared bottleneck. Each of the following plots shows the results from two test runs, one where 20 TCP streams (TCP Reference) compete against 5 greedy TCP flows, and another where 10 RDB-enabled thin streams and 10 TCP thin streams compete against 5 greedy TCP flows. The shared bottleneck is configured with a rate limit of 5 Mbit/s and a pfifo queue of one bandwidth-delay product (63 packets).
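The queue length of 63 packets is consistent with a simple bandwidth-delay product calculation. The sketch below assumes the 150 ms RTT from the uniform-loss tests, full-size packets of 1500 bytes and rounding up to whole packets. None of these three assumptions is stated for this experiment, and the function name is hypothetical.

```python
import math

def bdp_packets(rate_bps, rtt_ms, packet_bytes=1500):
    """Bandwidth-delay product expressed in whole packets, rounded up."""
    bdp_bytes = rate_bps * rtt_ms / 1000 / 8
    return math.ceil(bdp_bytes / packet_bytes)

# 5 Mbit/s bottleneck with an assumed RTT of 150 ms and 1500-byte packets.
print(bdp_packets(5_000_000, 150))
```

With these values the product is 93,750 bytes, or 62.5 full-size packets, which rounds up to the 63 packets given for the pfifo queue.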

Figure 7: Experiment with no bundling limit




The RDB streams plotted in figure 7 are allowed to bundle any available previous segments as long as the payload does not exceed one MSS. Looking at the latencies, we see a significant difference between the RDB streams and the TCP streams. However, comparing the latencies to the TCP reference values, we see that the increased bandwidth due to the redundant data causes extra queuing delays, which may not be desirable.

Figure 8: Experiment with bundling limit





Figure 8 shows the results from a test with the same setup as in figure 7, except that the redundancy level for the RDB streams is limited so that they may bundle only one previous segment with each packet. The results show that with the reduced redundancy level, the RDB streams still have a much lower latency than the competing TCP streams, while avoiding the extra queuing delay found in the previous tests.



The RDB mechanism uses a proactive approach to reduce latencies for time-dependent traffic over TCP. By continually bundling (retransmitting) data instead of waiting for any loss signals before triggering a retransmission, RDB is able to avoid the extra delay of at least 1 RTT needed for a loss signal to reach the sender host. This results in significantly reduced latencies for thin stream traffic that experiences sporadic packet loss.

Main benefits of RDB:

  • RDB is a backwards-compatible TCP mechanism that requires sender-side modifications only. This means that e.g. a Linux server can enable RDB on connections to unmodified TCP receivers. In this case, only the data sent from the server to the client will benefit from RDB, and the data in the reverse path will use regular TCP.
  • RDB does not produce extra packets on the network, and achieves the reduced latencies by using the packets that are already scheduled for transmission.
  • RDB reduces latencies for RDB-enabled streams without unreasonable negative effects on competing traffic.


[1] B. R. Opstad, “Taming redundant data bundling” Master’s thesis, University of Oslo, May 2015. (Download PDF)
[2] K. Evensen, A. Petlund, C. Griwodz, and P. Halvorsen, “Redundant bundling in TCP to reduce perceived latency for time-dependent thin streams” Comm. Letters, IEEE, vol. 12, no. 4, pp. 324–326, April 2008.
[3] B. R. Opstad, J. Markussen, I. Ahmed, A. Petlund, C. Griwodz, and P. Halvorsen, “Latency and Fairness Trade-Off for Thin Streams using Redundant Data Bundling in TCP”, IEEE Conference on Local Computer Networks (LCN), Clearwater Beach, Florida, USA, Oct. 2015. (Download PDF)

The MPG Demos @ ACM MMSys 2015

This year our group was able to get several demos accepted at the ACM MMSys Conference 2015 in Portland. The demos are:

Scaling Virtual Camera Services to a Large Number of Users. In this demo we show how a PTZ camera system can be used without consuming much bandwidth. For reducing the bandwidth usage, we reduce the quality adaptively in the regions where the data is not required to be present. Video:

Energy Efficient Video Encoding Using the Tegra K1 Mobile Processor. This demonstration shows how hardware and software configuration impacts the running power usage of a live video encoder. The encoder, Codec 63, is architecturally similar to H.264 and Google’s VP8 and runs on a Tegra K1 mobile processor. Participants can offload video processing to a GPU, change CPU and GPU operating frequency, migrate between CPU clusters and turn off CPU cores. The effects of these settings in terms of achieved frame rate, power usage and energy per encoded frame are displayed live. How energy-efficiently can you encode yourself? Video:

How much delay is there really in current games? This demonstration uses a typical gaming setup wired to an oscilloscope to show how long the total local delay is. Participants can also bring their own computers and games so that they can measure delays in the games or other software.

Expert Driven Semi-Supervised Elucidation Tool for Medical Endoscopic Videos. In this demo we present a semi-supervised annotation tool for medical experts. The tool should help to collect medical data for machine learning and computer vision approaches. Therefore we combine lightweight and time efficient manual annotations with object tracking algorithms. Video:


Multimedia and Healthcare

In the last month our group published several papers on research that can help improve how doctors use multimedia technology in their work. The most recently accepted publication is about an elucidation tool for endoscopic videos, which will be presented at the ACM MMSys Conference 2015 in Portland. An overview of our work in this area can be found at:




Paper for SoHuman 2014 accepted

Our paper “Mobile Picture Guess: A Crowdsourced Serious Game for Simulating Human Perception” by Michael Riegler (Simula Research Laboratory), Mathias Lux (University of Klagenfurt), Ragnhild Eg (Simula Research Laboratory) and Markus Schicho (Econob) has been accepted for the 3rd International Workshop on Social Media for Crowdsourcing and Human Computation at the 6th International Conference on Social Informatics.

In this paper we present a novel idea of combining a mobile game with a crowdsourcing campaign in order to collect information on the saliency of image segments. The goal of the game is to make the players guess what is depicted in an image while it is still uncovered. The game mechanics allow us to collect information about which image segment is necessary for the user to correctly guess the image content.

Several contributions accepted for ACM MM 2014

The various papers were written with colleagues from other institutes in Europe and elsewhere. They represent the University of Klagenfurt (Alpen-Adria Universität – AAU), TU Graz (TUG), University of Tromsø (UiT), TU Delft (TUD), University of Trento (UT), University of Southampton (SOT), Research Center L3S (L3S), University of Avignon (UA), University of Toulouse (IRT) and University of Singapore (NUS).

  1. “How ‘How’ Reflects What’s What: Content-based Exploitation of How Users Frame Social Images”:
    Authored by Michael Riegler, Martha Larson (TUD), Mathias Lux (AAU), Christoph Kofler (AAU)
  2. “Gone: An Interactive Experience for Two People”:
    Authored by Michael Riegler, Mathias Lux (AAU), Christian Zellot (AAU), Lukas Knoch (AAU), Horst Schnattler (?), Sabrina Natpetschnig (AAU)
  3. “Getting by with a Little Help from the Crowd: Optimal Human Computation Approaches to Social Image Labeling”:
    Authored by Babak Loni (TUD), Jonathon Hare (SOT), Mihai Georgescu (L3S), Michael Riegler, Xiaofei Zhu (L3S), Mohamed Morchid (UA), Richard Dufour (UA), Martha Larson (TUD)
  4. “Crowd to the Rescue for Hard-to-Find Data: Collecting Contexts in which Edited Images are used Online”:
    Authored by Valentina Conotter (UT), Duc-Tien Dang-Nguyen (UT), Michael Riegler, Giulia Boato (UT), Martha Larson (TUD)
  5. “Event Understanding in Endoscopic Surgery Videos”:
    Authored by Mario Guggenberger (AAU), Michael Riegler, Mathias Lux (AAU), Pål Halvorsen
  6. “Real-Time HDR Panorama Video”:
    This is a poster paper authored by L. Kellerer (TU Graz), V. R. Gaddam, R. Langseth, H. K. Stensland, C. Griwodz, D. Johansen (UiT), and P. Halvorsen. The paper is concerned with the addition of HDR to the Bagadus rendering, which is an important component in the harsh light conditions of Alfheim stadium in Tromsø.
  7. “Automatic real-time zooming and panning on salient objects from a panoramic video”:
    This is a demo paper authored by V. R. Gaddam, R. Langseth, H. K. Stensland, C. Griwodz, and P. Halvorsen. The paper demonstrates automatic panning and zooming in the panorama video that is generated as part of the Bagadus pipeline.
  8. “3D Interest Maps From Simultaneous Video Recordings”:
    Authored by A. Carlier (IRT), L. Calvet, D. T. D. Nguyen (NUS), W. T. Ooi (NUS), P. Gurdjos (IRT), V. Charvillat (IRT)

Best Presentation Award @ NOSSDAV 2014

In an earlier blog post, we reported that the paper “Interactive Zoom and Panning from Live Panoramic Video” by Vamsidhar Reddy Gaddam, Ragnar Langseth, Sigurd Ljødal, Pierre Gurdjos, Vincent Charvillat, Carsten Griwodz and Pål Halvorsen was accepted at NOSSDAV 2014.

Even though his PC broke an hour before the session and the PowerPoint slides had to be remade, Vamsi gave a perfect presentation, and at the conference banquet he received the BEST PRESENTATION AWARD.

An extended version of the Bagadus ISM paper accepted for IJSC

An extended version of the ISM paper about the real-time panorama recording pipeline in Bagadus has been accepted for publication in the International Journal of Semantic Computing (IJSC). The paper has the title “Efficient Implementation and Real-time Processing of Panorama Video” and is authored by Håkon Kvale Stensland, Vamsidhar Reddy Gaddam, Marius Tennøe, Espen Helgedagsrud, Mikkel Næss, Henrik Kjus Alstad, Carsten Griwodz, Pål Halvorsen and Dag Johansen.



There are many scenarios where high resolution, wide field of view video is useful. Such panorama video may be generated using camera arrays where the feeds from multiple cameras pointing at different parts of the captured area are stitched together. However, processing the different steps of a panorama video pipeline in real-time is challenging due to the high data rates and the stringent timeliness requirements. In our research, we use panorama video in a sport analysis system called Bagadus. This system is deployed at Alfheim stadium in Tromsø, and due to live usage, the video events must be generated in real-time. In this paper, we describe our real-time panorama system built using a low-cost CCD HD video camera array. We describe how we have implemented different components and evaluated alternatives. The performance results from experiments run on commodity hardware with and without co-processors like graphics processing units (GPUs) show that the entire pipeline is able to run in real-time.
