Tuesday, June 11, 2013

HDR Lighting And Tone Mapping Introduction

This post is an introduction to the HDR Lighting and Tone Mapping series. I will be covering some of the different techniques used for tone mapping and publishing the results in the next few posts.

So, what is HDR Lighting and Tone Mapping?

Here are some lines from the DirectX9 HDR Lighting Tutorial:

"Light in the real world has a very high ratio between highest and lowest luminances; this ratio is called dynamic range and is approximately 10^12:1 for viewable light in everyday life, from sun to starlight. Computer displays and print media have a dynamic range around 100:1, so compromises will have to be made when trying to display a scene with a higher dynamic range; this sample borrows from a technique originally developed for use in photography called tone mapping, which describes the process of mapping an HDR image into a low dynamic range space. Tone mapping effectively produces the same effect as the automatic exposure control which happens in the human visual system."

HDR buffers (floating point buffers) capture the wide range of luminances in the scene, and tone mapping lets the picture preserve detail in both the bright and the dark areas of a frame when it is displayed on a computer screen. I'm not going to explain more about this topic here, but you can read more about it at the following links:

http://mynameismjp.wordpress.com/2010/04/30/a-closer-look-at-tone-mapping/
http://en.wikipedia.org/wiki/Tone_mapping

The following is the process for HDR lighting / tone mapping.

1. Create a floating point render target. It could be R32G32B32A32 or R16G16B16A16. I would personally go for 16 bits per channel, as it halves the bandwidth needed when writing pixels into the render target.
2. Render your scene with lighting (forward, deferred, tiled, etc.) into this buffer. Your scene can now contain lights that have intensity values greater than 1.0f.
3. Calculate the luminance of the whole scene.
4. Apply a tonemap operator, using that luminance, to the floating point render target and render the tonemapped scene into an 8 bits per channel buffer. This could be your back buffer.
5. Display the back buffer.
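
Steps 3 and 4 above can be sketched on the CPU in plain C++. This is only an illustration under assumptions, not the code from the SDK sample or from my engine: the names (HdrPixel, AverageLuminance, ToneMap), the 0.18 "key" value, and the simple x / (1 + x) Reinhard-style curve are all choices made for the example.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side stand-in for one texel of a floating point render target.
struct HdrPixel { float r, g, b; };

// Step 3: average luminance of the whole HDR buffer.
float AverageLuminance(const std::vector<HdrPixel>& hdr)
{
    if (hdr.empty()) return 0.0f;
    double sum = 0.0;
    for (const HdrPixel& p : hdr)
        sum += p.r * 0.299 + p.g * 0.587 + p.b * 0.114;
    return static_cast<float>(sum / hdr.size());
}

// Step 4 (assumed operator): scale by exposure, then compress with x / (1 + x).
std::uint8_t ToneMapChannel(float value, float exposure)
{
    float scaled = std::max(value, 0.0f) * exposure;
    float mapped = scaled / (1.0f + scaled);            // always in [0, 1)
    return static_cast<std::uint8_t>(mapped * 255.0f + 0.5f);
}

// HDR floats in, 8 bits per channel out (RGB interleaved).
std::vector<std::uint8_t> ToneMap(const std::vector<HdrPixel>& hdr, float key = 0.18f)
{
    float average = AverageLuminance(hdr);
    float exposure = key / std::max(average, 1e-4f);    // guard against a black frame
    std::vector<std::uint8_t> ldr;
    ldr.reserve(hdr.size() * 3);
    for (const HdrPixel& p : hdr) {
        ldr.push_back(ToneMapChannel(p.r, exposure));
        ldr.push_back(ToneMapChannel(p.g, exposure));
        ldr.push_back(ToneMapChannel(p.b, exposure));
    }
    return ldr;
}
```

On the GPU the same two stages become a luminance reduction followed by a full-screen pass that reads the floating point target and writes the 8-bit one.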

This is the basic process for HDR Lighting and Tonemapping. Now there are a number of implementations for the luminance/Tonemapping part. I shall be covering a few of them in this series of posts.

If you want a head start in implementing this feature, I would suggest taking a look at the HDRToneMappingCS11 sample that is available in the DirectX SDK.



The luminance calculation is basically as follows. When we sample the "color" from a pixel, we apply the following equation

float luminance = color.r * 0.299f + color.g * 0.587f + color.b * 0.114f;

or in an optimized manner

float luminance = dot(color.rgb, float3(.299, .587, .114));

To calculate the luminance for the whole scene, the sample explores two methods: a Pixel Shader version and a Compute Shader version.

In the Pixel Shader version, we use downsampling to work out the luminance. We run a series of passes; on each pass we sample the higher resolution target a few times, average the samples, and store the result in a lower resolution target. We then repeat the process with the lower resolution target as the input, until we render into a 1x1 target. The sample above takes 3x3 samples around each pixel, averages them, and stores the result in the lower resolution target. In my case, I use 2x2 samples. This results in many more render targets, but the shader only needs a single bilinear tap to get the average. If you want to get fancy, you could use just 2 render targets and ping-pong between them, using the viewport to control the area being rendered.

In the Compute Shader version, we perform 2 parallel reduction passes. The first pass reduces the 2D color target to a one-dimensional UAV buffer, and the second pass reduces that to a single-float UAV buffer. Fewer passes and the use of group shared memory are some of the advantages of using the compute shader, so the overall CPU time is lower than in the Pixel Shader version (where we have to downsample repeatedly until we reach a single pixel). Group shared memory is very fast, almost like using a register, so a compute shader that reads from it will hardly see any fetch stalls. It lives on chip, and Direct3D 11 (cs_5_0) allows up to 32 KB of it per thread group. The sample has some more documentation on what happens in the compute shader.


This sample uses a simple average luminance and a simple tone mapping operator, and the results look pretty extreme, but it shows how to use the PS or the CS for this purpose. I will be posting more tone mapping techniques implemented in my engine, and you should be able to see some videos/screenshots of them in action.




Sunday, June 2, 2013

Taking Performance Measures? ALWAYS DO IT IN RELEASE MODE

So, I started using the font engine to display the cost of the entire frame. I added time bars to the engine to find out how much time a certain section of the code takes on the CPU side. This is really useful for measuring the CPU-side cost of any implementation.
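
A time bar of this kind can be fed by a small scoped timer. The sketch below is not my engine's code: TimeBar and ScopedCpuTimer are hypothetical names, and it uses only std::chrono from the standard library. The timer starts when the object is created and records the elapsed milliseconds when it goes out of scope.

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one time bar: a label and its CPU cost in milliseconds.
struct TimeBar {
    std::string label;
    double milliseconds;
};

class ScopedCpuTimer {
    using Clock = std::chrono::steady_clock;   // monotonic, suited to intervals

public:
    ScopedCpuTimer(std::string label, std::vector<TimeBar>& out)
        : label_(std::move(label)), out_(out), start_(Clock::now()) {}

    // On scope exit, append the elapsed time to the list the renderer will draw.
    ~ScopedCpuTimer()
    {
        const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start_;
        out_.push_back({label_, elapsed.count()});
    }

    ScopedCpuTimer(const ScopedCpuTimer&) = delete;
    ScopedCpuTimer& operator=(const ScopedCpuTimer&) = delete;

private:
    std::string label_;
    std::vector<TimeBar>& out_;
    Clock::time_point start_;
};
```

Wrapping a section in braces with a ScopedCpuTimer at the top is enough to time it. Note that this only measures CPU time spent submitting work, not how long the GPU takes to execute it.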

With this little gadget, I decided to implement some tone mapping algorithms and see how much they cost. I was shocked to see that the whole frame cost around 40ms when I rendered just one model plus tone mapping. I spent a lot of time trying to figure out whether it was a bad configuration, whether the model was really dense, or whether there was some issue with my engine.

And then I realized that I was running it in Debug mode. This means unoptimized C++ code, unoptimized shaders, and bloated debug DirectX DLLs. I set up the Release configuration correctly and, voila, the total frame time came down to 7ms, which was much closer to what I was expecting.

So, just in case you're trying to time something you implemented, ALWAYS DO IT IN RELEASE MODE :).