Our software collection: mint.tools


Having the hardware (sensors) to do the recording is only one part of the task (a rather small one, in comparison, as it turns out). To make any sense of the resulting sensor data, the individual streams coming from the sensors must be integrated, at the very least temporally: the recordings need to be in sync. For video and audio, this is often achieved by recording both on the same medium, but this already poses problems when there are several concurrent video recordings (i.e., several cameras running at the same time). In the olden days, synchronisation was achieved by creating an event that was recorded on all media---this was the job of the clapperboard.

The problem of synchronising the video recordings is solved in hardware for us---the cameras we use offer a master/slave mode, in which one master camera controls the internal timecode of the other cameras. This ensures that all cameras record at exactly the same moments (recall that cameras record only 25 pictures per second; this makes sure that these capture times are synchronised between all cameras), that frames (pictures) corresponding to the same time are linked together, and that starting playback on one camera also starts playback on the others.

But as we are interested in multimodal data, this is only part of the solution. Synchronising the other data streams (eye-tracking data and motion capture, in our case) poses two further problems: how to synchronise these streams with each other, and how to synchronise them with the cameras. We solve these problems separately and in different ways. To synchronise the data streams with each other, we record from a central server to which these streams are sent; all events are then logged with a timestamp from the server machine. This incurs some network lag, but otherwise it ensures that the timestamps come from synchronised clocks (in fact, they come from only one clock). Our solution does more, however. The logging is done through a virtual reality environment, instant reality (built by Fraunhofer IGD), which also allows us to transparently transform the 3D data sent by the sensors into a single coordinate system, which we can then instantly visualise. This gives us a 3D view of the scene already during recording, which allows for live monitoring of the recording process.
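The central-clock idea can be sketched in a few lines of Python. The function and field names here are hypothetical, not the actual mint.tools API; the point is simply that every incoming event, whichever sensor it comes from, is stamped by the one server clock, so timestamps across streams are directly comparable.

```python
import time

def stamp_event(stream_id, payload, clock=time.monotonic):
    """Attach a timestamp from the single server-side clock to an
    incoming sensor event. Because every stream is stamped by this
    one clock, timestamps are directly comparable across streams."""
    return {"stream": stream_id, "t": clock(), "data": payload}

# Events from different sensors end up on one common timeline:
log = [stamp_event("eyetracker", {"x": 0.4, "y": 0.1}),
       stamp_event("mocap", {"marker": 3, "pos": (1.0, 0.2, 0.5)})]
```

Network lag between sensor and server shows up in such timestamps as a small offset, which is exactly the trade-off described above.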

The following diagram gives an overview of our tool collection. We have written specific servers that use the APIs of the sensors (Kinect and faceLab) and send data in the format expected by the instant reality / instantIO framework. Within that framework, we can visualise the incoming data streams in a 3D scene and also log all data. Playing back logs then uses the same scene for visualisation.
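Transforming each sensor's 3D data into a single coordinate system amounts to applying a rigid transform (a rotation plus a translation) per sensor. The sketch below illustrates the idea with a made-up sensor pose; instantIO handles this declaratively inside the scene description, so this is not its actual API.

```python
import math

def to_world(point, yaw_deg, offset):
    """Map a 3D point from a sensor's local frame into the shared world
    frame, assuming (hypothetically) that the sensor is rotated about
    the vertical axis by yaw_deg and translated by offset."""
    x, y, z = point
    a = math.radians(yaw_deg)
    # Rotate about the y (vertical) axis, then translate.
    xr = x * math.cos(a) - z * math.sin(a)
    zr = x * math.sin(a) + z * math.cos(a)
    return (xr + offset[0], y + offset[1], zr + offset[2])

# A point one metre in front of a camera that stands at (0, 0, 4)
# and faces back towards the origin:
world_point = to_world((0.0, 0.0, 1.0), 180.0, (0.0, 0.0, 4.0))
```

Once all streams are expressed in this one frame, they can be rendered together in the same 3D scene.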


The following image shows such a virtual representation of a recording scene.

We have also patched ELAN, the popular tool for annotation of multimodal data, so that it can control our replayer for 3D data, in effect making the 3D data another track that can be used for annotation. This combination is shown in the following image.


To support computational work on the data, we have developed a Python package that reads in the logged data and makes it usable for a modern, interactive style of data analysis using tools such as IPython and powerful plotting libraries.
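As a sketch of that interactive workflow, the snippet below parses a hypothetical CSV-style motion-capture log into plain Python records. The actual mint.tools log format and package API will differ, but the resulting list of records is the kind of structure that drops straight into pandas or a plotting library for exploration.

```python
import csv
import io

# Hypothetical log excerpt: timestamp, stream name, 3D position.
SAMPLE = """t,stream,x,y,z
0.04,mocap,1.00,0.20,0.50
0.08,mocap,1.01,0.21,0.50
"""

def read_log(fileobj):
    """Read a logged data stream into a list of typed records.
    The column layout here is an illustrative assumption, not the
    actual mint.tools log format."""
    records = []
    for row in csv.DictReader(fileobj):
        records.append({
            "t": float(row["t"]),
            "stream": row["stream"],
            "pos": (float(row["x"]), float(row["y"]), float(row["z"])),
        })
    return records

records = read_log(io.StringIO(SAMPLE))
```

From an IPython session, such records can then be filtered, resampled, and plotted interactively.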

We describe these components separately below.


This website reports on some results of the "Multimodal Interaction Lab", which is located at Bielefeld University and is led by David Schlangen. The information contained in this website is for general information purposes only; we make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the website or the information, products, services, or related graphics contained on the website for any purpose. Any reliance you place on such information is therefore strictly at your own risk.

Through this website you are able to link to other websites which are not under our control or that of Bielefeld University. We have no control over the nature, content and availability of those sites. The inclusion of any links does not necessarily imply a recommendation or endorsement of the views expressed within them.