Variable Frame Rates in Video

by S3D Centre

Research Centre - S3D

We are currently experimenting with Variable Frame Rates (VFRs), which combine High Frame Rates (HFRs) with standard frame rates (such as 24fps). As part of our evaluation of HFR and VFR content, we need to know whether intra-scene VFR can direct the viewer’s gaze and, if so, how. To find out, we track how the viewer’s gaze moves across the screen for standard frame rate, HFR, and intra-scene VFR content, and compare the differences.

Intra-scene describes the use of multiple techniques within the same scene (and possibly within the same frame). In this case, the technique being varied is the frame rate.

Why might this be relevant?

With “Soul Mates 3D”, we experimented with changing the frame rate from scene to scene. With intra-scene VFR, we propose varying the frame rate within the frame itself. For example, a foreground of a horse running past the camera may be filmed or animated at 120fps, while a background consisting of a stadium crowd may be filmed or animated at 24fps. The two elements are composited together within the same timebase (let’s say 60fps), so we may appreciate the lack of motion artifacts in the foreground (the horse) while enjoying the familiar 24fps aesthetic of the background visuals. The two elements within the scene provide completely different motion information, and we as viewers have a choice of where to look within the frame.
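To make the shared timebase concrete, the sketch below works out which source frame of each element would be on screen at each frame of a 60fps composite. It assumes a simple frame-hold mapping (each output frame shows the most recent source frame), which is only one possible choice and not necessarily how the composite described here was built. The function names are hypothetical.

```python
def source_frame_index(output_index, output_fps, source_fps):
    """Index of the source frame held on screen at a given output frame.

    Assumption: frame-hold mapping, i.e. the latest source frame whose
    timestamp is not later than the output frame's timestamp.
    Integer arithmetic avoids floating-point rounding.
    """
    return (output_index * source_fps) // output_fps


def composite_schedule(n_output_frames, output_fps=60,
                       foreground_fps=120, background_fps=24):
    """List of (output frame, foreground frame, background frame) indices."""
    return [
        (i,
         source_frame_index(i, output_fps, foreground_fps),
         source_frame_index(i, output_fps, background_fps))
        for i in range(n_output_frames)
    ]


# Show the first few frames of the 60fps composite.
for out_i, fg_i, bg_i in composite_schedule(6):
    print(f"output {out_i}: foreground frame {fg_i}, background frame {bg_i}")
```

Under this mapping, the 24fps background is held for alternately three and two output frames, while the 120fps foreground contributes every second source frame. A timebase of 120fps would be needed to show every foreground frame.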

In cinema, television and gaming, viewers primarily focus their attention on the area of a moving image that carries the most visual information. Several techniques are used to control where that area is, for example:

  • depth of field/focus (blurring the area that is not meant to be the focus removes information from it, so we focus our attention on the area of the frame that is sharp)
  • convergence (in stereoscopic 3D, this is used to target the viewer’s attention onto a chosen area of the frame by changing the 3D plane of the moving image)
  • composition and camera movement (using image obstruction and reveal, called occlusion)
  • moving subjects or objects

The talent performs on camera. Two cameras record at two separate very high frame rates (60fps and 120fps).

We selected a face as the image to experiment with because, from birth, faces are central to social interaction and to how we communicate with one another throughout our lives. We look at faces daily, and we draw a vast amount of information from them.

Tools and Techniques

It is challenging to find cameras and software that can handle frame rates as high as we need in order to obtain data. The key is to use the highest frame rate possible, in this case 120fps. We simultaneously record a 24fps version of the performance. We also considered the difference between Ultra HD (4K) and HD resolution in terms of seeing minute detail in the image: will resolution affect the ability to see motion detail in HFR content?

To measure fixation patterns (used here as a proxy for attention), we use eye tracking hardware and software. We then visualize the data as patterns that are easy to interpret.
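As an illustration of how raw gaze samples can be turned into fixations, here is a minimal sketch of a dispersion-threshold approach (often called I-DT). This is an assumption for illustration only: it is not necessarily the algorithm used by our eye tracking software. The function names, the sample format (time in milliseconds, normalized screen coordinates) and the threshold values are all hypothetical.

```python
def dispersion(window):
    """Spread of a group of gaze samples: (max x - min x) + (max y - min y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion=0.03, min_duration_ms=100):
    """Group gaze samples into fixations with a dispersion threshold.

    samples: list of (t_ms, x, y) tuples sorted by time, with x and y in
             normalized screen coordinates (0..1). Hypothetical format.
    Returns a list of (start_ms, end_ms, centroid_x, centroid_y).
    """
    fixations = []
    start, n = 0, len(samples)
    while start < n:
        # Grow the window until it spans at least the minimum duration.
        end = start
        while end < n and samples[end][0] - samples[start][0] < min_duration_ms:
            end += 1
        if end >= n:
            break  # not enough remaining data for another fixation
        if dispersion(samples[start:end + 1]) <= max_dispersion:
            # Extend the window while the samples stay close together.
            while end + 1 < n and dispersion(samples[start:end + 2]) <= max_dispersion:
                end += 1
            window = samples[start:end + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append((window[0][0], window[-1][0], cx, cy))
            start = end + 1
        else:
            # Samples are too spread out (e.g. a saccade): slide forward.
            start += 1
    return fixations


# Synthetic 100 Hz data: gaze rests at one point, then jumps to another.
samples = [(i * 10, 0.5, 0.5) for i in range(20)]
samples += [(i * 10, 0.8, 0.2) for i in range(20, 40)]
fixations = detect_fixations(samples)
```

The resulting fixation centroids and durations are the kind of data that can then be plotted over the frame, for example as a heat map or a scan path.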

Stereoscopic 3D (S3D)