Saturday, March 8, 2014

Field of view in the Oculus VR

User Pogo on the Oculus VR forums asked the other day about why some applications didn't cover more of the available screen real estate on the Rift, and consequently occupied a smaller field of view than was possible.

Every pixel on the Rift display has a fixed relationship with the lenses (discounting the slight differences between the three sets of lenses that come with the Rift). Each pixel has a specific θ (pronounced theta), which is the angle between the lens axis and the line from the pixel through your eye. Each pixel also has a specific 'perceived θ', which is the angle between the lens axis and the place where the pixel appears to be because of the lens distortion.

What this means is that if you're drawing on a smaller area of the Rift screen, you're essentially providing a smaller field of view.  

As an example of the issue Pogo posted the following images.  The first is from the Oculus configuration utility used to test the IPD and height settings a user has entered.

The second is from the lauded RedFrame environment demo.

And the last is from Half Life 2.  

All three applications are using native Rift support, but are rendering different fields of view.  But why?  Well, there are a number of reasons that this might happen.

How it happens

If you're working with the Oculus SDK and basing your code on the examples, you may have seen OVR::Util::Render::StereoConfig. This class takes information about the headset in the form of an OVR::HMDInfo structure and spits back out a number of important variables used to do the distortion. It returns an OVR::Util::Render::StereoEyeParams structure for each eye containing some of that information, most critically the projection matrix, which determines the field of view and consequently the area of the screen covered. OVR::Util::Render::StereoConfig also has a method GetDistortionScale(), which tells you how much to scale up the image after the distortion coefficients have been applied. Seeing this in action is non-trivial because all the actual computations occur in GL or DirectX shaders, but I assure you it is the case.

The FOV, the projection matrix, the area of the screen covered, and the distortion scale returned by the above function all have to act in lockstep. If you do everything else right but compute the projection matrix on your own with a FOV of your own choosing, you run the risk of making your users ill unless you calculate it exactly the same way the Oculus SDK would in StereoConfig. Otherwise the perceived FOV won't match the physical FOV, and when users turn their heads they'll see a turn rate either slower or faster than their inner ear expects.
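To make the lockstep relationship concrete, here is a minimal sketch of one simple geometric model. It is not the SDK's confirmed computation. It assumes the vertical FOV is the angle subtended at the eye by the panel's half-height after that half-height has been magnified by the distortion scale. The function names, and the idea of passing the screen height and eye-to-screen distance in meters, are my own illustrative assumptions.

```cpp
#include <cmath>

// Assumed model: the perceived half-height of the image is the physical
// half-height of the panel multiplied by the distortion scale, and the FOV
// is the angle that this half-height subtends at the eye, doubled.
double verticalFovRadians(double screenHeightMeters,
                          double eyeToScreenMeters,
                          double distortionScale) {
    const double perceivedHalfHeight =
        0.5 * screenHeightMeters * distortionScale;
    return 2.0 * std::atan(perceivedHalfHeight / eyeToScreenMeters);
}

// The vertical scale term of a standard perspective projection matrix
// for a given vertical FOV: 1 / tan(fov / 2).
double projectionYScale(double verticalFovRad) {
    return 1.0 / std::tan(0.5 * verticalFovRad);
}
```

Under this model a larger distortion scale always yields a larger FOV, and the projection term follows directly from that FOV, so changing one value without the others breaks the match between what is rendered and what the lenses present.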

However, the means of influencing these values isn't immediately obvious. There's no mechanism in the SDK to specify a FOV or a desired projection matrix. There is a mechanism to specify how much of the screen you want covered, but it's somewhat obscure by my reckoning. StereoConfig has a method called SetDistortionFitPointVP(float x, float y), which takes a two-dimensional coordinate specified in normalized screen coordinates (where the X and Y sizes of the screen are both treated as length 2 and the origin (0,0) is at the middle of the screen). The distortion fit point is interpreted by the SDK as a request that the edge of the distorted image for the left eye touch that point. It defaults to (-1, 0), which means the very left edge of the screen horizontally and centered vertically. All of the Oculus examples use the default value, hence the results we see in the first image above.

Even though the SDK provides no way to specify the FOV directly, it's pretty trivial to do a binary search: set the distortion fit point to a value, ask StereoConfig for the computed FOV, move the fit point based on whether the returned FOV was higher or lower than your target, and iterate until you reach the requested FOV.
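A minimal sketch of that search follows. To keep it self-contained it does not call the SDK at all. Instead it takes a hypothetical callback, fovForFitX, which stands in for "set the horizontal fit point, then ask StereoConfig for the FOV". The callback, the function name, and the tolerance are assumptions of mine, and the search assumes the FOV changes monotonically across the interval you give it.

```cpp
#include <cmath>
#include <functional>

// Hypothetical callback: applies a horizontal fit point and returns the
// resulting FOV in degrees. In real code this would wrap the SDK calls.
using FovForFitX = std::function<double(double)>;

// Binary search for the fit-point X whose FOV matches targetFovDeg.
// Assumes the FOV is monotonic between fitLo and fitHi.
double findFitXForFov(const FovForFitX& fovForFitX, double targetFovDeg,
                      double fitLo, double fitHi,
                      double toleranceDeg = 0.01, int maxIterations = 64) {
    // Work out whether FOV rises or falls as the fit point moves right.
    const bool increasing = fovForFitX(fitHi) > fovForFitX(fitLo);
    double mid = 0.5 * (fitLo + fitHi);
    for (int i = 0; i < maxIterations; ++i) {
        mid = 0.5 * (fitLo + fitHi);
        const double fov = fovForFitX(mid);
        if (std::fabs(fov - targetFovDeg) <= toleranceDeg) {
            break;  // Close enough to the requested FOV.
        }
        // Keep the half of the interval that still brackets the target.
        if ((fov < targetFovDeg) == increasing) {
            fitLo = mid;
        } else {
            fitHi = mid;
        }
    }
    return mid;
}
```

If the target FOV lies outside what the interval can produce, this simply returns a point near one end of the interval, so a real implementation would want to check the result against the target before using it.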

So given this information, we have to ask why you'd see screenshots with less of the screen covered than you see in the Oculus examples.  Well, there are a few obvious reasons....

Performance constraints

It's important to remember that in order for the end image to look good, you have to render at a significantly higher resolution than the actual display panel resolution. The distortion shader in the Oculus SDK examples performs what is called a barrel distortion on the scene. This means that straight lines in the original image become bowed inwards toward the center, like the sides of an old-timey barrel.

Perhaps a modern reader might more easily think of it as a 'keg' distortion.

The specific values chosen for the distortion coefficients have the overall effect of shrinking the image. Before it's displayed on the screen, the image has to be sized back up uniformly, creating a magnification effect. You could choose different coefficients that didn't shrink the image overall, but that would essentially just bake the magnification step in, resulting in the same overall effect.

This magnification effect can in turn cause a blurriness at the center of the screen, if you don't have enough pixels in the source image to map to the pixels on the physical display panel.  For instance, here is an animated gif showing the difference between rendering distortion using a source image of 960x1200 per eye, versus using the native per-eye resolution of 640x800.

Of course that's scaled down to fit in the blog, so you'll probably need to open it in a new window and view it at its original resolution in order to actually see anything.

Here's a small spot on the wall near the center of view, zoomed 50%.

The difference may be subtle but it can have a big impact on the feel of your game.  As you move away from the center of the image, the blurriness diminishes.  Because of the warping, the further you are from the center of the per-eye images, the fewer source image pixels are required to render the destination pixel.  Near the center of the image the required source texel density is high but it drops continuously as you move outward, and eventually drops to a point where a source image that has the same resolution as the panel provides enough source pixels for every output distorted pixel.

So we've established that to render a non-blurry image for the Rift that takes up the full FOV, you have to render at a higher resolution than the actual panel. The ratio between the display panel resolution and the required resolution turns out to be exactly 1/GetDistortionScale(), so the offscreen framebuffer size for a given field of view is the physical resolution of the headset times GetDistortionScale().

For the default fit point, the scaling factor is about 1.7. So even though the display panel per-eye resolution is only 640x800, in order to get no blurring in the center of the image due to magnification, you'd have to render to a framebuffer resolution of about 1100x1400. And you'd have to do it twice per frame in order to render to the Rift. This is about 1.5 times the number of pixels you have to render if you're targeting a normal display running at 1920x1080. What's more, it's likely to be more expensive than that, because you're running the entire scene through the rendering pipeline twice.
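The arithmetic here is easy to check with a few lines of code. This is only a sketch: the Size struct and both function names are hypothetical helpers of mine, and the scale is passed in as a plain number rather than read from GetDistortionScale().

```cpp
#include <cmath>

struct Size {
    int width;
    int height;
};

// Offscreen framebuffer size for one eye: the per-eye panel resolution
// multiplied by the distortion scale, rounded to whole pixels.
Size framebufferSize(Size perEyePanel, double distortionScale) {
    return {static_cast<int>(std::lround(perEyePanel.width * distortionScale)),
            static_cast<int>(std::lround(perEyePanel.height * distortionScale))};
}

// Pixels rendered per frame for two eyes, relative to a single
// conventional display of the given resolution.
double stereoPixelRatio(Size perEyeFramebuffer, Size referenceDisplay) {
    const double stereoPixels =
        2.0 * perEyeFramebuffer.width * perEyeFramebuffer.height;
    const double referencePixels =
        static_cast<double>(referenceDisplay.width) * referenceDisplay.height;
    return stereoPixels / referencePixels;
}
```

Plugging in 640x800 and a scale of exactly 1.7 gives 1088x1360 per eye, and two of those come to roughly 1.4 times the pixels of a 1920x1080 display. That is in line with the rounded figures above, which start from about 1100x1400.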

If your game is designed to run on current hardware at a given frame rate, asking it to render two frames and 1.5 times as many pixels in the same amount of time might not be feasible.  To solve this problem you can either start reducing the visual quality of the game, or you can try to render fewer pixels.  The way to render fewer pixels is to adjust the FOV, or more accurately, change the fit point.

Middleware constraints

If you look at the implementation of the Valve Steam VR API here you'll see they are setting the distortion fit point to a value of 0.6 (horizontally) and 0.0 (vertically). This changes the distortion scale to a more manageable 1.05, and thus the required resolution of the framebuffer is about 680x850. I'm working from the assumption that they've done this for performance purposes, since it appears to have been intentional. However, I've also opened a bug, because as far as I can tell they offer no mechanism to change the value, either by requesting a specific FOV or otherwise, and that seems like making too many assumptions about the performance constraints of the application. For that matter, I'm honestly surprised that they wouldn't opt to let a game like Half Life 2, now 10 years old, run with as high a FOV as is physically possible on the Rift.

Regardless of the reasoning, as it currently stands I see no way of utilizing the full field of view of the Rift while working with the Steam VR API, because it's in charge of all the communication with the Oculus SDK, and doesn't expose the items that you'd need to change in order to correct it.

Similarly, other integration tools or platforms that manage the Rift interaction for you, such as Unity, UDK, or plugins for things like Torque3D or OpenFrameworks may not provide access to the values you need to be able to change.

Engine constraints

Some gaming engines simply may not be designed to have a customizable FOV, or may have limits on how far it can be extended before you start to get undesirable artifacts. Many games render their HUD (Heads Up Display) through their 3D engine, even when the HUD is just 2D text, and that means the field of view can affect where HUD elements appear on the screen; maybe it's not worth the engineering time to fix the UI layout. Perhaps for optimization reasons a game assumes that objects outside a given maximum view frustum don't need to be rendered and drops them early in the rendering pipeline; the amount of work to fix that might be high enough that it's not worth doing. In these cases the only way to avoid making the artifacts visible would be to reduce the FOV.

Adapter constraints

There are a set of products out there, not games themselves, but tools designed to let you run existing games that don't have native Rift support on the Rift.

These are similar to and in some cases directly related to previous tools meant to do a similar job providing depth functionality on 3D monitors even when an application hadn't been written with stereoscopy in mind.

I don't have a detailed understanding of how tools like these work, but I can hazard a guess that it involves intercepting calls from the application to rendering APIs like DirectX and modifying the parameters to the calls. While the fine details of rendering individual objects in a scene are almost infinitely varied, the basic flow of rendering the scene as a whole is fairly predictable, so if you know what you're looking for it's probably not too hard to find the calls that set the projection matrix or the eye position for a given frame and inject alterations. This is supposition, but if I were to attempt to implement something like these, this is likely where I would start.

However, there will always be limitations to how much these applications can do.  Similar to the engine constraints mentioned above, if a rendering application thinks it's working with a given field of view and you change that on the fly between the application and the rendering API, then you're likely to start seeing problems, since the application may simply fail to render (or render properly) items that would now be in view but that the application thinks aren't in view.


Some people may simply not have understood the distortion shader and may have left out certain components of it, or not properly integrated the code into their own systems. And if you're working with something like Unity, rather than directly with the SDK, you have to dig into the integration code or documentation to figure out how to set the required values, or the results can get pretty wonky.

To some extent I'd put the Steam VR issue in this category, since setting the fit point to a non-default value and then not exposing it at all in the API seems like an oversight.  Of course, I could be wrong.
