Sunday, February 9, 2014

Distortion methods in the Rift and their performance

There are three main approaches to producing the required Rift distortion that I'm aware of and have implemented.

  • Shader based direct calculation
  • Shader based lookup 
  • Mesh based distortion
Here's an overview of the three, along with some information about the different performance characteristics.

Terminology

There are two different coordinates we're interested in at any given point (regardless of reference frame): the coordinates of the pixel to be rendered, and the coordinates in the scene texture of the content that should be rendered there.  To make what follows clearer, we should establish names for these two coordinates.  

Distorting the image moves the position of a given pixel in the scene texture to a given pixel on the screen.  I will therefore refer to the coordinates of the pixel being rendered as the distorted coordinates, and the scene texture coordinates as the undistorted coordinates.  

Shader based direct calculation distortion


Shader based distortion using direct calculation of the undistorted texture coordinates from the distorted ones is what I've covered in most of my posts on the topic.  It's also what's presented in the Oculus SDK examples in the 0.2.x and earlier versions of the SDK.  

The nature of vertex and fragment shaders means that you're going to be starting from the distorted coordinates.  From there you need to calculate the undistorted coordinates.  I've covered this at length in this post so I won't delve too deeply into it here.  The executive summary is that in the fragment shader you:
  1. Convert from texture coordinates to screen coordinates to Rift coordinates
  2. Find the scaling factor to convert from the distorted Rift coordinate to the undistorted one
  3. Multiply your distorted coordinate by the scale to get the undistorted Rift coordinate
  4. Convert back into screen coordinates, and from there back into texture coordinates
  5. Fetch the color from the texture using your newly found undistorted texture coordinate
Some of that can be optimized out, but that's basically it. All the conversion and calculation is done in the shader.  
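
As a rough illustration, the steps above can be emulated on the CPU in C++.  This is only a sketch, not the shader from the SDK examples.  The coefficient values are assumed for illustration, the names Vec2, undistortionScale, undistortTexCoord and inTexture are made up for this example, and the lens center offset and aspect ratio corrections are left out (the lens axis is assumed to pass through the center of the texture).

```cpp
#include <cassert>

struct Vec2 { double x, y; };

// Assumed, illustrative polynomial coefficients; real values come from the
// headset configuration.
const double K[4] = {1.0, 0.22, 0.24, 0.0};

// Step 2: the scale factor is a polynomial in the squared radius.
double undistortionScale(double rSq) {
  assert(rSq >= 0.0);
  return K[0] + rSq * (K[1] + rSq * (K[2] + rSq * K[3]));
}

// Steps 1 to 4 for a single fragment.  Input and output are texture
// coordinates, where [0, 1] covers the scene texture.
Vec2 undistortTexCoord(Vec2 distortedTex) {
  // Step 1: texture [0, 1] -> lens-centered [-1, 1] (simplified).
  double x = distortedTex.x * 2.0 - 1.0;
  double y = distortedTex.y * 2.0 - 1.0;
  // Steps 2 and 3: scale the distorted coordinate.
  double scale = undistortionScale(x * x + y * y);
  // Step 4: back to texture coordinates.
  return {x * scale * 0.5 + 0.5, y * scale * 0.5 + 0.5};
}

// Step 5 would fetch the color at the returned coordinate.  Coordinates
// outside the texture have nothing to fetch.
bool inTexture(Vec2 t) {
  return t.x >= 0.0 && t.x <= 1.0 && t.y >= 0.0 && t.y <= 1.0;
}
```

With these assumed coefficients the center of the texture maps to itself, while a fragment at the right edge, (1.0, 0.5), maps to a coordinate beyond the texture, which is why the corners and edges of the distorted image end up empty.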

Shader based lookup distortion

The calculations you perform in the direct calculation method are the same every frame, so they're a natural candidate for pre-calculation.  The obvious way to do this is to find the undistorted coordinate for every distorted pixel coordinate and store them in a texture.  Of course, it turns out you don't really need to do it for every coordinate.  You can create a texture of substantially lower resolution than your rendering target (say, 128x128) and rely on the linear interpolation between precisely calculated undistorted coordinates to be good enough.  The distortion function isn't linear, but the difference between the precise calculation and the linearly interpolated value over 5-10 pixels isn't likely to be noticeable.  

This method functions exactly the same way as the direct calculation method.  The difference is that you're doing the calculation in your application, rather than on the video card, and you only do it once (or twice, once for each eye).  
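
Here is a minimal CPU-side sketch of the idea in C++: build the table once, then blend the four nearest entries per fragment, which is roughly what linear texture filtering does on the GPU.  The function names and the coefficients are assumptions made for this example, and the layout is simplified so that table entries sit exactly on the corners of the [0, 1] range.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec2 { double x, y; };

// Precise calculation, same steps as the direct method.  The coefficients
// are assumed, illustrative values.
Vec2 undistort(Vec2 t) {
  double x = t.x * 2.0 - 1.0;
  double y = t.y * 2.0 - 1.0;
  double rSq = x * x + y * y;
  double scale = 1.0 + rSq * (0.22 + rSq * 0.24);
  return {x * scale * 0.5 + 0.5, y * scale * 0.5 + 0.5};
}

// Done once at startup: one precisely calculated coordinate per entry.
std::vector<Vec2> buildLookup(int size) {
  assert(size >= 2);
  std::vector<Vec2> table(size * size);
  for (int j = 0; j < size; ++j) {
    for (int i = 0; i < size; ++i) {
      table[j * size + i] =
          undistort({i / double(size - 1), j / double(size - 1)});
    }
  }
  return table;
}

// Done per fragment: bilinear blend of the four surrounding entries.
Vec2 sampleLookup(const std::vector<Vec2>& table, int size, Vec2 t) {
  assert(t.x >= 0.0 && t.x <= 1.0 && t.y >= 0.0 && t.y <= 1.0);
  double fx = t.x * (size - 1);
  double fy = t.y * (size - 1);
  int i = std::min(static_cast<int>(fx), size - 2);
  int j = std::min(static_cast<int>(fy), size - 2);
  double ax = fx - i;
  double ay = fy - j;
  const Vec2& a = table[j * size + i];
  const Vec2& b = table[j * size + i + 1];
  const Vec2& c = table[(j + 1) * size + i];
  const Vec2& d = table[(j + 1) * size + i + 1];
  double topX = a.x + (b.x - a.x) * ax;
  double topY = a.y + (b.y - a.y) * ax;
  double botX = c.x + (d.x - c.x) * ax;
  double botY = c.y + (d.y - c.y) * ax;
  return {topX + (botX - topX) * ay, topY + (botY - topY) * ay};
}
```

Comparing sampleLookup against undistort at a few coordinates that fall between table entries is a quick way to check whether a given table size is accurate enough for your display.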

Mesh based distortion

The shader based methods rely on manipulating the texture coordinates when you render a quad containing your scene texture.  Mesh based distortion is different.  Rather than rendering a single quad and using a mechanism to alter the texture coordinates of what's painted at a given pixel, you render the scene as a mesh of many triangles.  The texture coordinates are left alone, and instead the vertex positions of the triangles are manipulated.  

A 128x128 mesh of triangles with distorted coordinates

Instead of altering the texture coordinate we alter the geometry.  However, there's a bump in the road with this method.  Instead of starting with the distorted coordinate, you're starting with the undistorted one.  Getting the distorted coordinate from the undistorted one isn't immediately obvious.  At first glance it would seem like you'd have to invert the polynomial used in doing the reverse calculation.  I have no idea how to do that and no real inclination to learn.  

But I do know how to apply a binary search to find the input value to a given function that produces the desired output.  Our desired output is the undistorted coordinates, so the binary search can be used to find the distorted ones.  

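  // Finds the scale that maps an undistorted radius (rTarget) back to the
  // distorted radius, by bisecting on the forward (undistortion) function.
  // closeEnough() and getUndistortionScaleForRadiusSquared() are not shown.
  double getDistortionScaleForRadius(double rTarget) {
    double max = rTarget * 2;
    double min = 0;
    double distortionScale = 1.0;
    // Bound the iterations so the loop still ends if closeEnough() uses a
    // tolerance finer than the available floating point precision.
    for (int i = 0; i < 64; ++i) {
      double rSource = ((max - min) / 2.0) + min;
      distortionScale = getUndistortionScaleForRadiusSquared(
          rSource * rSource);
      double rResult = distortionScale * rSource;
      if (closeEnough(rResult, rTarget)) {
        break;
      }
      if (rResult < rTarget) {
        min = rSource;
      } else {
        max = rSource;
      }
    }
    return 1.0 / distortionScale;
  }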

Just as with the texture lookup method, the mesh approach isn't per-pixel accurate.  You still get linear interpolation between vertices, but again, at 128x128 for the Rift DK1 screen resolution, the difference is probably imperceptible.  
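
To show where the inverse fits, here is a self-contained C++ sketch that builds the vertex list for such a mesh.  It is an illustration under assumptions rather than the code from the book's repository: the Vertex layout, the function names and the coefficients are invented for the example, the lens is assumed to be centered, and the bisection uses a fixed iteration count instead of a closeEnough() test.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Position is where the vertex is drawn (distorted), tex is the scene
// texture coordinate it carries (undistorted, left on a regular grid).
struct Vertex { double posX, posY, texU, texV; };

// Forward function used by the shader methods (assumed coefficients).
double undistortionScale(double rSq) {
  return 1.0 + rSq * (0.22 + rSq * 0.24);
}

// Inverse by bisection: find the distorted radius whose undistorted radius
// is rTarget, and return it as a scale relative to rTarget.
double distortionScale(double rTarget) {
  if (rTarget == 0.0) {
    return 1.0;
  }
  // The forward scale is never below 1 here, so the answer is in [0, rTarget].
  double lo = 0.0;
  double hi = rTarget;
  for (int i = 0; i < 60; ++i) {
    double mid = 0.5 * (lo + hi);
    if (mid * undistortionScale(mid * mid) < rTarget) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return 0.5 * (lo + hi) / rTarget;
}

// Builds n x n vertices; index data for the triangles is omitted.
std::vector<Vertex> buildMesh(int n) {
  assert(n >= 2);
  std::vector<Vertex> verts;
  verts.reserve(n * n);
  for (int j = 0; j < n; ++j) {
    for (int i = 0; i < n; ++i) {
      double u = i / double(n - 1);
      double v = j / double(n - 1);
      double x = u * 2.0 - 1.0;
      double y = v * 2.0 - 1.0;
      double s = distortionScale(std::sqrt(x * x + y * y));
      verts.push_back({x * s, y * s, u, v});
    }
  }
  return verts;
}
```

Because the work happens once when the mesh is built, the cost of the search doesn't show up in the per-frame rendering time.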

Which method should you use?

It depends on your target environment.  Despite the fact that it's doing calculation for every single pixel for every single frame, on my hardware direct calculation is the fastest approach.  But direct calculation has drawbacks.  It results in a much more complicated shader implementation than mesh or lookup based methods, and so is harder to debug.  It is also strongly tied to a particular kind of distortion, i.e. radially symmetrical lenses described by a polynomial.  If you have asymmetrical lenses or you don't know the polynomial coefficients, you can't implement direct calculation distortion.  For instance, the recently released SteamVR API treats the distortion function as a black box.  You can call a method and provide distorted texture coordinates and get back undistorted ones, but that's the extent of it. 

I suspect that more VR APIs based around the Rift style design of distorting the image in software to correct for the lens distortion will start to follow this approach.  It leaves the dirty work of determining the best method of calculating distortion, and implementing that calculation, in the hands of people whose jobs revolve around that knowledge, rather than pushing it out among game developers everywhere.  It also gives HMD designers more latitude in their designs, since they don't have to worry about imposing the burden of some hideous mathematical computation on their customers (the game and application developers kind, not the general market for HMDs).  

So that leaves the mesh approach and the texture lookup approach.  For a given resolution of mesh / texture, the mesh based approach should be more accurate.  This is because for a texture, there will be pixels in the lookup texture that store undistorted coordinates that lie outside the scene texture.  So for a 128x128 texture you've got quite a few texture pixels that don't contribute to the accurate rendering of the image.  This means you effectively have more interpolated pixels per correctly calculated pixel.  By comparison, every vertex in the mesh is used for rendering the output.  On the other hand, the mesh is almost certain to take up more space in memory.  Further, it's not as fast as the texture lookup, and doesn't scale to higher resolutions as well.  What the mesh approach does do well is function in environments where shader support might be poor or non-existent.  

So the answer is 'probably lookup texture based distortion' unless the target environment prohibits that somehow, but of course the answer really is 'benchmark it and figure it out for yourself, preferably at runtime'.  
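
A runtime benchmark can be as simple as timing a handful of frames with each candidate and keeping the fastest.  The following C++ sketch shows that idea only; averageMicros and pickFastest are hypothetical helpers, and each candidate is assumed to be a callable that renders one frame and blocks until the GPU has finished (for example by ending with glFinish()), since otherwise you'd only be timing command submission.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Average wall-clock microseconds per call of renderFrame over `frames` calls.
double averageMicros(const std::function<void()>& renderFrame, int frames) {
  assert(frames > 0);
  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < frames; ++i) {
    renderFrame();
  }
  auto elapsed = std::chrono::steady_clock::now() - start;
  return std::chrono::duration<double, std::micro>(elapsed).count() / frames;
}

// Returns the index of the candidate with the lowest average frame time.
std::size_t pickFastest(
    const std::vector<std::function<void()>>& candidates, int frames) {
  assert(!candidates.empty());
  std::size_t best = 0;
  double bestTime = averageMicros(candidates[0], frames);
  for (std::size_t i = 1; i < candidates.size(); ++i) {
    double t = averageMicros(candidates[i], frames);
    if (t < bestTime) {
      bestTime = t;
      best = i;
    }
  }
  return best;
}
```

In an application the candidates would be the mesh, lookup texture and direct calculation render paths, and the check would run once at startup on the user's actual hardware.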

I did a performance comparison of the three methods, as well as running a version of the application that did no distortion at all as a baseline.  

Running on an i7-4770k processor, with a GeForce GTX 650 Ti, the results were as follows:


Distortion Type                Native   64x64   128x128   512x512
None                           59       n/a     n/a       n/a
Mesh                           n/a      79      137       1040
Shader - Lookup Texture        n/a      81      82        90
Shader - Direct Calculation    69       n/a     n/a       n/a

All times are in microseconds.  As you can see, the mesh approach scales very poorly as you increase the number of vertices.  The texture lookup approach, on the other hand, isn't strongly impacted by the resolution of the texture.  After all, there is likely to be some amount of interpolation on every single pixel, so changing the number of pixels being interpolated between has very little impact.  I suspect the performance difference largely results from increased cache misses due to the larger size of the texture, but that's pure speculation.

The code

The code for these numbers as well as implementations of all these methods of performing distortion is available in the example code repository for the Oculus Rift in Action book, specifically the examples for chapter 5, section 2.

The book

Oculus Rift in Action is available through the Manning Early Access Program, and if you buy it you can see the chapters as they're produced and provide feedback on what could be improved or what you'd like to see.

Buy my book.

Now.

Go on, do it.

I'll wait.

4 comments:

  1. Do you think a geometry shader or tessellation shader implementation of the mesh approach would yield better scaling with mesh size?

  2. Did you actually test your undistortedScale function? I tried to implement it for my purposes but I always ran into the problem of an infinite loop. I implemented the closeEnough function as (abs(rResult - rTarget) < epsilon). Where I tried epsilon values between 0.1 and 0.0001 but the values from rResult or rTarget get too small and for some reason the break is never reached. I also tried to scale rTarget Value by some big number and divide the distortionScale by the same value but the same occurs after a while with this approach. I found nothing of interest in your code examples and therefore ask if I do something fundamentally wrong? Maybe you could push me in the right direction. Thank you.

  3. It's possible the code is not functional or out of date, but it did work for me at the time with the DK1. You might take a look at similar code inside the Oculus SDK and see if it's easier to follow: https://github.com/jherico/OculusSDK/blob/11403fcdf4e64eb46b0b9fd0bd0ff6d8529d4dbc/LibOVR/Src/OVR_Stereo.cpp#L245
