RATS at the Festival of Lights 2019

RATS is short for Real-time Adaptive Three-sixty Streaming, software developed mainly by Trevor Ballardt during his time at TU Darmstadt, where he worked for the MAKI SFB. RATS was used in a test case of the H2020 project 5Genesis, which is preparing to showcase 5G during the Festival of Lights 2020 in Berlin. The test was conducted at Humboldt University during the Festival of Lights 2019.

Fraunhofer FOKUS, one of the local partners in Berlin, wrote a piece summarizing the test from the 5Genesis perspective. We contributed the video streaming, which we describe in this article.

FoL 2019 at the Humboldt University (© Magnus Klausen 2019)

The original idea of RATS was to use NVENC to convert an input stream from a 360 camera into a set of tiles in real time, encode each tile at several qualities on the server, and stitch the tiles into a set of tiled H.265 videos. These H.265 videos would offer a selection of quality combinations suited to the viewer's orientation. This idea was published in a demo paper at ACM MMSys 2019, and the intended application in 5Genesis as a short paper at ACM MobiCom's S3 workshop. The code for RATS can be found on GitHub.
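To illustrate the tiling idea, the sketch below assigns a quality to each tile column based on its angular distance from the viewer's yaw. The grid size and three-step quality ladder are hypothetical and only stand in for the general principle, not the exact RATS scheme:

```python
def pick_tile_qualities(num_cols, viewer_yaw_deg,
                        qualities=("high", "medium", "low")):
    """Toy policy: tiles near the viewer's yaw get the highest quality,
    neighbouring tiles a medium one, everything else the lowest."""
    tile_width = 360.0 / num_cols
    result = []
    for col in range(num_cols):
        tile_center = (col + 0.5) * tile_width
        # shortest angular distance between tile center and viewer yaw
        diff = abs((tile_center - viewer_yaw_deg + 180.0) % 360.0 - 180.0)
        if diff < tile_width:
            result.append(qualities[0])
        elif diff < 2 * tile_width:
            result.append(qualities[1])
        else:
            result.append(qualities[2])
    return result

# a viewer looking at yaw 90 degrees in an 8-column tiling
print(pick_tile_qualities(8, 90.0))
```

On the server, one such quality combination would be stitched per supported viewing direction, and the client picks the stream matching its current orientation.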

However, 5Genesis is also about dense user populations accessing a live video feed, and such density can only be achieved if users can stream to their mobile phones without installing additional software. The RATS idea would work perfectly for this if mobile browsers supported H.265. Unfortunately, those on Android phones do not.

So, instead of tiling in the sensible manner, we modified a clone of ffmpeg (a clone with minor modifications is required for RATS anyway). ffmpeg can be configured to use NVENC for encoding video streams in H.264, and it can also generate fragmented MPEG-4 with suitable HLS and DASH manifest files. In the case of DASH, MPD (manifest) files can take the form of templates, which removes the need for clients to download manifest updates even for live streams, whereas HLS clients require such updates. Instead of merging tiles after compressing them separately, we used Gaussian filtering on tile-shaped regions of the video to reduce the coding complexity. An arbitrary number of these versions can be generated in parallel, using our new ffmpeg CUDA module for partial blurring.
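The partial blurring can be sketched in plain Python on a grayscale frame held as a NumPy array; our actual module does this with CUDA inside ffmpeg, and the function names here are invented for illustration:

```python
import numpy as np

def _gaussian_kernel(sigma):
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def _gaussian_blur(tile, sigma):
    # separable Gaussian: filter along rows, then along columns
    k = _gaussian_kernel(sigma)
    tmp = np.apply_along_axis(np.convolve, 1, tile, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, tmp, k, mode="same")

def blur_tiles(frame, tile_grid=(4, 8), keep=((0, 0),), sigma=2.0):
    """Blur every tile-shaped region of a grayscale frame except the
    (row, col) tiles listed in `keep`.  Blurred regions carry less
    detail, so the encoder spends fewer bits on them."""
    rows, cols = tile_grid
    h, w = frame.shape
    th, tw = h // rows, w // cols
    out = frame.astype(float)
    for r in range(rows):
        for c in range(cols):
            if (r, c) in keep:
                continue
            region = out[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            out[r * th:(r + 1) * th, c * tw:(c + 1) * tw] = \
                _gaussian_blur(region, sigma)
    return out
```

Running one such pass per viewing direction, each with a different set of sharp tiles, yields the parallel versions mentioned above, and each version is then encoded as an ordinary H.264 stream.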

The camera that we installed at the FoL 2019 was a surveillance camera with a fisheye lens (actually a panomorph lens, but close enough to fisheye to make our life easy), and we settled on VideoJS for panorama display in our cross-platform web pages, which were meant to show the FoL videos in arbitrary browsers. It was a bit irritating that the current version of VideoJS has gained both DASH and HLS support but lost its fisheye projection support.

Consequently, we had to reproject our panorama from fisheye to equirectangular projection. We followed two approaches. In the first, we added a reprojection module to ffmpeg that uses CUDA to make the conversion before streaming, combined with a VideoJS configuration that projects only one half of a sphere, since a fisheye camera records only a single hemisphere. In the second, we extended VideoJS to support fisheye lenses directly. While the first piece of code may be more generally useful, we found that the second approach (which will be published in a master thesis next year), with its single conversion step, provides better visual quality.
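The first approach boils down to computing, for every output pixel of the equirectangular image, where to sample in the fisheye image. A minimal sketch of that mapping, assuming an ideal equidistant fisheye lens with a centred principal point (the real panomorph lens would need a calibrated distortion profile):

```python
import math

def equirect_to_fisheye(u, v, out_w, out_h, fish_w, fish_h, fov_deg=180.0):
    """Map output equirectangular pixel (u, v), covering one hemisphere,
    to source coordinates in an equidistant fisheye image."""
    # longitude/latitude of the output pixel, hemisphere only
    lon = (u / out_w - 0.5) * math.pi        # [-90 deg, 90 deg]
    lat = (0.5 - v / out_h) * math.pi        # [-90 deg, 90 deg]
    # direction vector; the camera looks along +z
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    theta = math.acos(max(-1.0, min(1.0, z)))   # angle from optical axis
    phi = math.atan2(y, x)                      # azimuth in the image plane
    # equidistant projection: radius grows linearly with theta
    r = theta / math.radians(fov_deg / 2.0) * (min(fish_w, fish_h) / 2.0)
    return (fish_w / 2.0 + r * math.cos(phi),
            fish_h / 2.0 - r * math.sin(phi))
```

In the CUDA module, this mapping runs once per output pixel per frame with bilinear sampling; the VideoJS extension of the second approach instead applies the equivalent mapping in a fragment shader, directly from fisheye to screen, which is why it needs only a single conversion.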