We recently posted comparison videos highlighting the differences between composite video cables and our component video cables. We chose YouTube as the video host because of its popularity and accessibility. However, YouTube does not support frame rates greater than 30fps (frames per second), which was a hurdle we had to overcome when preparing our videos for upload.
In this blog post, we discuss a simple method for downsampling a video from 60fps to 30fps that retains the flicker and jitter present in the original 60fps video. Televisions in the United States and Japan display video at a 60fps rate, which is the frame rate provided to them by consoles like the Super Nintendo Entertainment System and Sega Genesis. YouTube, on the other hand, displays video at a maximum rate of 30fps. Therefore, when video recorded at 60fps is uploaded for sharing, it is downsampled (and possibly pre-filtered) by a factor of two. When this happens, any flicker that is produced by pixels turning on and off from frame to frame is lost.
A typical example is a character blinking during temporary invincibility after taking damage. (See here for a real-life example.) In that case, the downsampled 30fps video shows either a solid character or no character at all. During our work we ran into a similar issue with a certain type of jitter in the original video, caused by the composite video output of the SNES. After the downsampling to 30fps required by YouTube, this jitter was no longer visible, which left us no way of providing a representative comparison with the HD Retrovision component cables, which alleviate the problem.
To solve this issue, we considered a simple model of jitter. In this model, we imagine a single row of pixels, each either 0 (black) or 1 (white), that shifts by one pixel from frame to frame at 60fps:
Frame 0: 1 | 0 | 1 | 0 | 1
Frame 1: 0 | 1 | 0 | 1 | 0
Frame 2: 1 | 0 | 1 | 0 | 1
Frame 3: 0 | 1 | 0 | 1 | 0
Frame 4: 1 | 0 | 1 | 0 | 1
Assuming a simple scheme of dropping frames (although this will work similarly with an averaging pre-filter), it is easy to see why the flicker disappears in the 30fps video:
Frame 0: 1 | 0 | 1 | 0 | 1
Frame 2: 1 | 0 | 1 | 0 | 1
Frame 4: 1 | 0 | 1 | 0 | 1
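The loss of flicker can be reproduced with a small sketch of the toy model. This is only an illustration, not the code used to prepare our videos; the helper names (make_frames, drop_every_other, average_pairs) are hypothetical, and the averaging pre-filter is assumed to be a plain mean of each pair of frames.

```python
def make_frames(count, width=5):
    """Build the toy model: frame n is the alternating row shifted by n pixels."""
    return [[(n + x + 1) % 2 for x in range(width)] for n in range(count)]


def drop_every_other(frames):
    """Standard 2:1 downsampling: keep the first frame of each pair."""
    return frames[::2]


def average_pairs(frames):
    """Assumed pre-filter: average the two frames of each pair, pixel by pixel."""
    return [
        [(a + b) / 2 for a, b in zip(first, second)]
        for first, second in zip(frames[::2], frames[1::2])
    ]


frames = make_frames(6)               # frames 0..5 at 60fps
dropped = drop_every_other(frames)    # frames 0, 2, 4
averaged = average_pairs(frames)      # pairs [0 1], [2 3], [4 5]

for row in dropped:
    print(row)   # every output row is identical, so the flicker is gone
for row in averaged:
    print(row)   # every pixel becomes a constant 0.5 (uniform gray)
```

Either way, every 30fps output frame is the same, which is why the blinking pattern cannot survive.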
If we imagine that frames are paired up as [0 1], [2 3], [4 5], [6 7], ..., then the standard downsampling method simply keeps the first frame in each pair. As above, we'd get [0], [2], [4], [6], ... for our frames. What we'd like instead is to alternate which frame we drop from each pair. To do this, we can swap the two frames in every other pair before running the standard downsampling method. We rearrange the frames as [0 1], [3 2], [4 5], [7 6], ..., so that the output frames in the 30fps video are [0], [3], [4], [7], ... and so on. The result retains the jitter in our simple model:
Frame 0: 1 | 0 | 1 | 0 | 1
Frame 3: 0 | 1 | 0 | 1 | 0
Frame 4: 1 | 0 | 1 | 0 | 1
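As a rough sketch of the frame selection described above (an illustration only, with hypothetical function names, operating on lists of pixel rows rather than real video), the alternating pick can be written as an index formula: from pair k, keep the first frame when k is even and the second when k is odd.

```python
def alternating_indices(frame_count):
    """Indices kept from a 60fps sequence: 0, 3, 4, 7, 8, 11, ..."""
    return [2 * k + (k % 2) for k in range(frame_count // 2)]


def downsample_alternating(frames):
    """Keep one frame per pair, alternating between first and second."""
    return [frames[i] for i in alternating_indices(len(frames))]


# Toy model from the text: the alternating row shifts by one pixel per frame.
model = [[(n + x + 1) % 2 for x in range(5)] for n in range(6)]

kept = alternating_indices(8)
output = downsample_alternating(model)

print(kept)
for row in output:
    print(row)   # consecutive rows differ, so the flicker is preserved
```

Selecting by index is equivalent to swapping every other pair and then dropping the second frame of each pair; it just skips the intermediate reordering.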
So how does this hold up in practice? Is our incredibly simplistic model at all representative of reality, or is the output an unwatchable mess that looks nothing like the original 60fps video? It turns out that despite the simplicity of the model, this downsampling scheme works quite well. In the example video below, you can see both the flicker and the jitter inherent in the higher frame rate video, even though it is being displayed at a downsampled rate of 30fps.
Note: All three segments below are from video captured using the standard composite input cables and do not represent the improvements gained by using HD Retrovision cables. The right pane shows the HD Retrovision downsampling scheme applied to composite video.