Ray-tracing vs Rasterization
Rasterization: The term refers to the popular rendering algorithm for displaying three-dimensional shapes on a computer, and it is currently the most popular technique for producing real-time 3D computer graphics. Real-time applications need to respond immediately to user input, and generally need to produce frame rates of at least 24 frames per second to achieve smooth animation. It is used in games, CAD, medical imaging, and more generally in any application that needs interactive rendering of complex 3D scenes.
Ray-tracing: Ray-tracing is a technique for generating an image by tracing the path of light through the pixels of an image plane and simulating the effects of its encounters with virtual objects. It produces much better quality images than rasterization, but it is also much slower, because it is far more computationally expensive.
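To make the ray-tracing side concrete, here is a minimal sketch of its core loop, with illustrative names (Vec3, Ray, Hit, intersect) that stand in for a real renderer's types, not an actual implementation: cast one ray per pixel and keep the closest intersection.

    #include <cfloat>

    // Illustrative types only; a real renderer has richer ones.
    struct Vec3 { float x, y, z; };
    struct Ray  { Vec3 origin, direction; };
    struct Hit  { float t = FLT_MAX; int objectId = -1; }; // t = distance along the ray

    // Assumed helper: tests the ray against one object, fills 'hit' on success.
    bool intersect(const Ray& ray, int objectId, Hit& hit);

    // One ray per pixel: test every object and keep the nearest intersection.
    Hit traceClosest(const Ray& ray, int objectCount)
    {
        Hit closest;
        for (int i = 0; i < objectCount; ++i) {
            Hit h;
            if (intersect(ray, i, h) && h.t < closest.t)
                closest = h; // the nearer object wins
        }
        return closest;
    }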
Appetite for high quality visualization
Why do we need real-time and interactive visualization?
I don’t know if you know this story,
but there was an experiment done by, I think, the American intelligence services. They
were asked to write an application that could quickly identify buildings on
satellite pictures. So they started working on complex algorithms involving
machine learning, neural networks, image processing and anything else that could help
identify the shape of a building on a 2D image. But then they realized that,
with brain analysis and electrodes, using humans was a much simpler and cheaper
way to achieve what they wanted. So they hired people for a few dollars an
hour, and as the photos were shown to them, some parts of their brain were
activated whenever they could identify a building. And this happened in a matter of
milliseconds. The human brain is just good at that and has a very optimized
recognition algorithm. All this to say that our unconscious is very good at
picking things up and can understand more when facing realistic images.
Also, our brain understands dynamic scenes
better than static ones, and because the quality of what we see directly
influences our comprehension of complex events, it has to be as high as
possible.
Where is the challenge?
With rasterization, every object is
independent of the others. It is very suitable for embarrassingly parallel
algorithms and has a processing time that grows linearly and predictably with the number of objects to
process.
With ray-tracing, every object potentially
interacts with all of the others. So what you get is a processing
time that grows quadratically, not linearly, as the number of objects increases.
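To put rough numbers on it (my own back-of-the-envelope simplification, ignoring constant factors):

    rasterization:     cost ∝ N      (each of the N objects is processed once, independently)
    naive ray-tracing: cost ∝ N × N  (each of the N objects can interact with the N-1 others)

So doubling the scene size doubles the rasterization work, but quadruples the ray-tracing work.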
Solving this problem requires a specific management
of the objects to be rendered, and this is what I am going to show you now.
Six degrees of separation
One day, a friend told me about the theory
of “Six degrees of separation”. Do you
know about this theory?
Six degrees of separation is the theory that everyone and everything is six or fewer steps away, by way of introduction, from any other person in the world, so that a chain of "a friend of a friend" statements can be made to connect any two people in a maximum of six steps.
I did not take this sentence very seriously
at the time but for some reason, it got stuck in my head. And a few years
later, as I was encountering performance problems on one of my projects, that
theory came back to me as if it were self-evident.
If there are 7 billion people on this planet,
and we are never more than 6 steps away from each other, why would it be
different in the memory of my computer? If I have 7 billion objects to process
in my rendering algorithm, it should not take more than 6 steps to reach the
object that I need to work on…
Ok, that’s the theory. How do we do that in
practice? It depends, of course, on the parameters that you pass
to this clever equation, as shown on the slide.
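The slide’s equation is not reproduced here, but the arithmetic behind the idea is easy to check (my own back-of-the-envelope version, not the slide’s exact formula): for a hierarchy of depth 6 with branching factor k to cover 7 billion objects, you need

    k^6 ≥ 7 × 10^9
    k   ≥ (7 × 10^9)^(1/6) ≈ 44

so nesting groups of roughly 44 elements each is enough to reach any one of 7 billion objects in 6 steps.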
Bounding boxes
In order to apply this algorithm, the first
thing to do is to group your objects, so that the group can be considered
rather than the objects that it contains. In this example, we talk about
“Bounding boxes”.
A bounding box is basically a 3D volume that
contains a number of objects.
Ray-tracing is all about intersections. In the previous slides, I told you that when you launch a ray, you need to find the intersection with the closest object. If you have 10,000 objects, then you need to compute the intersection between the ray and each of those objects, and then keep the closest intersection. By grouping objects in boxes, you can start by computing the intersection between the ray and the boxes, and only if there is an intersection with a box will you look into that box and compute the intersections with the objects it contains.
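Here is a minimal sketch of that two-level loop, assuming hypothetical helpers intersectBox and intersectObject (the names and types are illustrative, not from an actual implementation):

    #include <cfloat>
    #include <vector>

    struct Vec3 { float x, y, z; };
    struct Ray  { Vec3 origin, direction; };
    struct Box  { Vec3 corner0, corner1; std::vector<int> objectIds; }; // a group of objects

    bool  intersectBox(const Ray& ray, const Box& box);   // cheap coarse test
    float intersectObject(const Ray& ray, int objectId);  // expensive; FLT_MAX on miss

    // Test the boxes first; only descend into a box the ray actually hits.
    float closestHit(const Ray& ray, const std::vector<Box>& boxes)
    {
        float closest = FLT_MAX;
        for (const Box& box : boxes) {
            if (!intersectBox(ray, box))
                continue;                        // a whole group is skipped at once
            for (int id : box.objectIds) {
                float t = intersectObject(ray, id);
                if (t < closest)
                    closest = t;                 // keep the nearest intersection
            }
        }
        return closest;
    }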
Fine tuning
A box intersection takes 17 cycles, using
the efficient and robust slab-based algorithm
published by the University of Utah (http://people.csail.mit.edu/amy/papers/box-jgt.pdf).
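For reference, here is a sketch of that slab test in the spirit of the cited paper (the paper’s version additionally precomputes the direction signs to stay robust when a direction component is zero; this simplified variant only conveys the idea):

    #include <algorithm>

    struct Vec3 { float x, y, z; };
    struct Ray  { Vec3 origin, invDirection; }; // invDirection = 1/direction, precomputed
    struct Box  { Vec3 min, max; };

    // Clip the ray against the three pairs of axis-aligned slabs and check
    // that the resulting interval [tmin, tmax] overlaps [t0, t1].
    bool intersectBox(const Ray& r, const Box& b, float t0, float t1)
    {
        float tmin = (b.min.x - r.origin.x) * r.invDirection.x;
        float tmax = (b.max.x - r.origin.x) * r.invDirection.x;
        if (tmin > tmax) std::swap(tmin, tmax);

        float tymin = (b.min.y - r.origin.y) * r.invDirection.y;
        float tymax = (b.max.y - r.origin.y) * r.invDirection.y;
        if (tymin > tymax) std::swap(tymin, tymax);

        if (tmin > tymax || tymin > tmax) return false;
        tmin = std::max(tmin, tymin);
        tmax = std::min(tmax, tymax);

        float tzmin = (b.min.z - r.origin.z) * r.invDirection.z;
        float tzmax = (b.max.z - r.origin.z) * r.invDirection.z;
        if (tzmin > tzmax) std::swap(tzmin, tzmax);

        if (tmin > tzmax || tzmin > tmax) return false;
        tmin = std::max(tmin, tzmin);
        tmax = std::min(tmax, tzmax);

        return tmin < t1 && tmax > t0; // the ray crosses the box within [t0, t1]
    }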
An object intersection is a lot more
expensive and can cost several hundred cycles. So there is a trade-off to
find between the number of boxes and the number of objects per box.
OO vs Algorithm Specific Structure
Object-oriented design is good, because it
uses concepts that we, humans, are comfortable with. A car object is a set of wheels, tyres
and seats, and it has a color, etc. You can define inheritance that keeps you from
replicating code.
With OO, you can write readable and
maintainable code, which is essential. But OO is not optimal for performance,
because it fragments memory and systematically forces you to consider an object
as a whole.
Take the example of a person with three attributes: a first name, a last name, and an age. Now imagine that you have a collection of
persons and you want to compute the average age of that population. With OO,
you will not only bring the age attribute down to the processor, you will
also bring the first and last names down to the CPU, even though you do not need them in your
algorithm. Only because OO forces you to do so, by design.
On the other hand, if you have an algorithm-specific
data structure, you will separate your “person” object into two parts:
one for the first and last names, the
other one for the age. And only the data structure containing the age will be
taken down to the processor. This will dramatically increase the performance.
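As a minimal sketch of that split (the names Person and Population are made up for illustration):

    #include <cstdint>
    #include <string>
    #include <vector>

    // Object-oriented layout: one struct per person. Averaging the ages
    // drags the two strings through the cache along with every age.
    struct Person {
        std::string  firstName;
        std::string  lastName;
        std::uint8_t age;
    };

    double averageAgeOO(const std::vector<Person>& people)
    {
        if (people.empty()) return 0.0;          // avoid dividing by zero
        std::uint64_t sum = 0;
        for (const Person& p : people)
            sum += p.age;                        // the whole Person is fetched
        return static_cast<double>(sum) / people.size();
    }

    // Algorithm-specific layout: names and ages live in separate arrays, so
    // the loop below streams through a dense array of ages and nothing else.
    struct Population {
        std::vector<std::string>  firstNames;
        std::vector<std::string>  lastNames;
        std::vector<std::uint8_t> ages;          // contiguous, cache-friendly
    };

    double averageAgeSoA(const Population& pop)
    {
        if (pop.ages.empty()) return 0.0;
        std::uint64_t sum = 0;
        for (std::uint8_t a : pop.ages)
            sum += a;                            // only the ages touch the cache
        return static_cast<double>(sum) / pop.ages.size();
    }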
In the case of the ray-tracer, we do
exactly the same. A primitive such as a
sphere, for example, is composed of a number of attributes such as the center,
the radius, the material properties, texture coordinates, etc. But when
processing intersections in the first pass of the algorithm, only the center and
the radius are needed, and these can be represented in the form of a single float4.
For the boxes, only the two corners of the
box are needed, meaning two float3.
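Written out with CUDA-style vector types (defined inline here so the sketch stands alone; the names SphereHotData and BoxHotData are mine, purely illustrative):

    // CUDA/GLSL-style vector types, written out for a plain C++ sketch.
    struct float3 { float x, y, z; };
    struct float4 { float x, y, z, w; };

    // Hot data for the intersection pass: a sphere fits in one float4,
    // center in (x, y, z) and radius in w.
    using SphereHotData = float4;

    // A box needs only its two opposite corners: two float3.
    struct BoxHotData {
        float3 corner0;
        float3 corner1;
    };

    // Cold data (material properties, texture coordinates, ...) lives in a
    // separate structure that the intersection pass never loads.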
Order in RAM
Now that we have defined our data
structures, they need to be optimized for the computer.
In our case, we have boxes and we need to
compute the ray-box intersection for each of them. So let's make sure that the
boxes are laid out contiguously in memory.
As we go through our boxes, if an intersection is found, we need to get deeper into the box to make sure that there is an intersection with a primitive. This means that ordering the primitives is essential too, and this is what is done at the second level. And as explained in the previous slide, only the minimal information is taken down to the processor.
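One way to realize this layout (my own illustration of the idea, not the project's actual code) is to keep the boxes in one contiguous array and the primitives in another, sorted box by box, with each box addressing its primitives as an index range:

    #include <vector>

    struct float3 { float x, y, z; };
    struct float4 { float x, y, z, w; };

    // All boxes sit contiguously, so the ray-box loop walks straight through
    // memory. Each box addresses its primitives as a [first, first + count)
    // range in the primitives array, which is itself sorted box by box.
    struct Box {
        float3 corner0, corner1;   // the two corners needed for the slab test
        int    firstPrimitive;     // index into Scene::primitives
        int    primitiveCount;
    };

    struct Scene {
        std::vector<Box>    boxes;       // level 1: tested for every ray
        std::vector<float4> primitives;  // level 2: e.g. spheres as center + radius,
                                         // grouped so a box's primitives are adjacent
    };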