V-Ray RT and GPU rendering
Supported hardware and drivers
OpenCL run-time compilation
Choosing which devices to use for rendering
Balancing the GPU load
Supported features on the GPU
Common OpenCL errors
GPU rendering allows V-Ray RT to perform the raytracing calculations on the GPUs installed in the system, rather than the CPU. Since GPUs are specifically designed for massively parallel calculations, they can speed up the rendering process by an order of magnitude.
To enable GPU rendering, select the OpenCL (single kernel) or CUDA (single kernel) value for the Engine type parameter in the V-Ray RT settings.
V-Ray RT for GPU has two back-ends (or engines): one based on OpenCL (see the references section below for more information on OpenCL) and one based on the nVidia CUDA platform.
The OpenCL engine should be able to run on any OpenCL-compatible hardware. However, at the time of this writing (April 28th, 2012), only the nVidia implementation of OpenCL is sufficiently advanced to run it properly. For best results, a Fermi- or Kepler-based card with at least 2 GB of video RAM is recommended. Older cards will work, but performance will be significantly worse. Because compiling the OpenCL code needs a large amount of RAM, the OpenCL engine currently works only in 64-bit builds of V-Ray RT. It may be possible to run the OpenCL engine on the software CPU implementations of OpenCL from AMD and Intel; however, this has not been thoroughly tested.
The CUDA engine is supported only in 64-bit builds of V-Ray RT for Fermi- and Kepler-based nVidia cards. It is recommended to use the CUDA engine on nVidia GPUs.
Rendering on multiple GPUs is supported, and by default V-Ray RT for GPU uses all available OpenCL/CUDA devices. See the sections below for how to choose which devices V-Ray RT GPU runs on.
V-Ray RT for GPU has been tested on a number of graphics cards including:
nVidia GeForce 680 GTX;
nVidia GeForce 580 GTX;
nVidia GeForce 590 GTX;
nVidia GeForce 570;
nVidia GeForce 480 GTX;
nVidia Tesla C2050;
nVidia Quadro 2000M;
If V-Ray RT for GPU cannot find a supported OpenCL/CUDA device on the system, it will silently fall back to CPU code. To see if the V-Ray render server is really rendering on the GPU, check out its console output.
In general, for portability reasons, OpenCL code is compiled at run-time when a program runs (much like OpenGL shaders written in GLSL and DirectX shaders written in HLSL), as opposed to CUDA code, which is precompiled and stored in a binary format inside the program executable. This allows the OpenCL code to be portable and optimized for the particular hardware on which it runs. The downside is that the compilation may take a while, depending on the complexity of the OpenCL code, the number of OpenCL devices in the system, and the OpenCL compiler and driver versions. Luckily, the binary version of the compiled OpenCL code can be cached and then reloaded much faster later on.
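The compile-once, reload-later idea can be sketched in a few lines of Python. This is only an illustration of the general caching technique, not V-Ray's actual implementation: the function names `cache_key` and `load_or_build`, the `build` callback, and the choice of hashing the source together with the device name and driver version are all assumptions made for the example.

```python
import hashlib
import os
import tempfile


def cache_key(source, device_name, driver_version):
    """Hash everything that could change the compiled binary (assumed inputs)."""
    h = hashlib.sha256()
    for part in (source, device_name, driver_version):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator, so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def load_or_build(source, device_name, driver_version, build, cache_dir):
    """Return (binary, was_cached).

    `build` is a hypothetical callback standing in for the slow
    run-time compiler; it takes the source text and returns bytes.
    """
    name = cache_key(source, device_name, driver_version) + ".bin"
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        # Fast path: reuse the binary produced by an earlier run.
        with open(path, "rb") as f:
            return f.read(), True
    # Slow path: compile, then store the result for next time.
    binary = build(source)
    with open(path, "wb") as f:
        f.write(binary)
    return binary, False


if __name__ == "__main__":
    def fake_build(src):
        # Placeholder for a real compiler call.
        return b"BIN:" + src.encode("utf-8")

    with tempfile.TemporaryDirectory() as cache_dir:
        for attempt in (1, 2):
            _, cached = load_or_build("kernel", "GeForce GTX 480", "1.0",
                                      fake_build, cache_dir)
            print("run", attempt, "cached:", cached)
```

Including the device and driver in the key reflects the point made above: a compiled binary is specific to the hardware and compiler that produced it, so a change in either should trigger a fresh build.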
The first time you render on the GPU with the OpenCL engine after installing V-Ray RT GPU, V-Ray will compile the OpenCL code for your hardware. This may take anywhere from 30 seconds to several minutes, depending on the number of graphics cards and the driver version. In the V-Ray RT render server console window you will see something like this:
[2010/Sep/6|21:05:44] Running RTEngine
[2010/Sep/6|21:05:44] Initializing OpenCL renderer (single kernel version)...
[2010/Sep/6|21:05:44] Number of OpenCL devices found: 1
[2010/Sep/6|21:05:44] OpenCL device list:
[2010/Sep/6|21:05:44] Device 0: GeForce GTX 480
[2010/Sep/6|21:05:44] VRAY_OPENCL_DEVICES environment variable not specified; using all available devices
[2010/Sep/6|21:05:44] cl_nv_compiler_options supported!
[2010/Sep/6|21:05:44] Building OpenCL trace program...
[2010/Sep/6|21:06:34] OpenCL program built in 49.156 s
The resulting compiled binary code is cached to disk in the temporary folder for the current user. On subsequent runs, the compilation phase is skipped and the code is loaded directly from the disk:
[2010/Sep/6|21:46:54] Building OpenCL trace program...
[2010/Sep/6|21:46:54] OpenCL program built in 0.016 s
Such run-time compilation is not required by the CUDA engine and it will start rendering right away.
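If you save the render server console output to a text file, a short script can report which devices were found and whether the OpenCL program was compiled or loaded from the cache. The sketch below relies only on the message wording shown in the excerpts above; the function name `summarize_log` and the one-second threshold used to guess at a cache hit are assumptions for illustration, and other V-Ray versions may word these messages differently.

```python
import re

# Patterns taken from the console excerpts shown in this section.
COUNT_RE = re.compile(r"Number of OpenCL devices found: (\d+)")
DEVICE_RE = re.compile(r"Device (\d+): (.+)")
BUILD_RE = re.compile(r"OpenCL program built in ([0-9.]+) s")


def summarize_log(lines, cache_threshold_s=1.0):
    """Collect device and build-time information from console lines.

    `cache_threshold_s` is an assumed heuristic: a build that finishes
    faster than this was most likely loaded from the on-disk cache.
    """
    summary = {
        "device_count": None,
        "devices": [],
        "build_seconds": None,
        "likely_cached": None,
    }
    for line in lines:
        m = COUNT_RE.search(line)
        if m:
            summary["device_count"] = int(m.group(1))
            continue
        m = DEVICE_RE.search(line)
        if m:
            summary["devices"].append(m.group(2).strip())
            continue
        m = BUILD_RE.search(line)
        if m:
            seconds = float(m.group(1))
            summary["build_seconds"] = seconds
            summary["likely_cached"] = seconds < cache_threshold_s
    return summary


if __name__ == "__main__":
    sample = [
        "[2010/Sep/6|21:05:44] Number of OpenCL devices found: 1",
        "[2010/Sep/6|21:05:44] Device 0: GeForce GTX 480",
        "[2010/Sep/6|21:06:34] OpenCL program built in 49.156 s",
    ]
    print(summarize_log(sample))
```

A reported device count of zero, or no device lines at all, would suggest that the render server has fallen back to CPU code.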
You may not want to use all available OpenCL/CUDA devices for rendering, especially if you have multiple GPUs and you want to leave one of them free for working on the user interface. To do this, you can use the supplied GUI tool, which you can find in Start Menu > Programs > Chaos Group > V-Ray RT Adv for 3ds Max > Select OpenCL devices for V-Ray RT.
After changing this option, you need to restart the V-Ray RT render server (if it is running) for the changes to take effect. If the V-Ray RT render server is running as a Windows service, you may need to stop it from the Services applet in the Control Panel.
Note that the tool determines the devices to use for both CUDA and OpenCL rendering.
If you have only one GPU on your system, you may find that the user interface becomes sluggish and unresponsive while V-Ray RT is rendering on the GPU. To alleviate this problem, reduce the Rays per pixel and/or the Ray bundle size parameters in the Performance section of the V-Ray RT renderer settings in the 3ds Max Render Setup dialog. For example, you can try values like 128/8 or 128/4. This breaks up the data passed to the GPU into smaller chunks, so that user interface requests can be processed faster. Note, however, that this will reduce the rendering speed. Turn on the statistics display to compare render speeds and find the optimal settings for your system.
On the GPU, V-Ray uses a simplified version of the V-Ray renderer, which supports only a sub-set of all features of the CPU code. The features listed below are supported; anything else will likely not work.
Triangle meshes and VRayProxy objects are supported. Instancing is also supported - see the Instancing and Forest Pro support page.
Note that even for supported lights, only a sub-set of the light parameters are implemented.
Bitmap textures, the Falloff map, and the VRayHDRI and VRaySky maps are supported. Other procedural textures (Checker, Noise etc.) are supported by baking them, provided that they have the Explicit UVW mapping type. If the Resize textures for GPU option is turned on, then all textures uploaded to the GPU are resampled to the resolution specified by the GPU texture size parameter in the V-Ray RT settings in the Render Setup dialog. The Mix and ColorCorrection textures are also supported.
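To get a rough feel for how the GPU texture size setting affects video memory use, a back-of-the-envelope estimate can help. The sketch below is not how V-Ray accounts for memory: it assumes square textures, 4 bytes per texel (8-bit RGBA) and no additional overhead, and the function names are made up for the example.

```python
def estimate_texture_bytes(texture_count, texture_size, bytes_per_texel=4):
    """Rough VRAM estimate for resized textures.

    Assumptions (not taken from V-Ray): every texture is resampled to a
    square of texture_size x texture_size texels, stored at
    bytes_per_texel bytes each, with no mipmaps or padding.
    """
    if texture_count < 0 or texture_size <= 0 or bytes_per_texel <= 0:
        raise ValueError("counts and sizes must be positive")
    return texture_count * texture_size * texture_size * bytes_per_texel


def to_mib(byte_count):
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return byte_count / (1024 * 1024)


if __name__ == "__main__":
    for size in (512, 1024, 2048):
        total = estimate_texture_bytes(50, size)
        print(f"50 textures at {size}x{size}: {to_mib(total):.0f} MiB")
```

Under these assumptions, 50 textures need about 50 MiB at 512 x 512 but about 800 MiB at 2048 x 2048, because memory grows with the square of the texture size.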
Only the background texture from the Environment dialog is supported and used for background, GI, reflections and refractions. Only spherical, mirror ball and angular environment mapping types are supported.
Motion blur is supported, provided that the Motion blur option in V-Ray RT is enabled and motion blur is enabled in the production renderer or through a physical camera.
Below is a list of some common OpenCL errors that you may get in the V-Ray RT render server console:
Error -4 at line XXX, in file ./src/ocl_tracedevice.cpp !!!
This error means that there is not enough VRAM on the GPU to complete the rendering (OpenCL status code -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE). You can try one or more of the following to fix the error: