## LuxCore: new BIASPATHOCL with OpenCL micro-kernels


### LuxCore: new BIASPATHOCL with OpenCL micro-kernels

Introduction

I have merged the biaspathocl_microkernels branch into the main branch, so the new BIASPATHOCL with OpenCL micro-kernels is now available in the main version. It introduces 2 (+1) major benefits:

1) BIASPATHOCL OpenCL kernels can now be compiled in a shorter time (or compiled at all, as the AMD/NVIDIA compilers were often unable to compile the old BIASPATHOCL kernels). In general, compiling BIASPATHOCL kernels now takes exactly the same time as compiling PATHOCL kernels.

2) Thanks to micro-kernels, the new BIASPATHOCL outperforms the old BIASPATHOCL by a large margin in terms of samples/sec. BIASPATHOCL is about as fast as PATHOCL in terms of samples/sec; however, thanks to the trade-off between quality and speed (i.e. the "BIAS" part), it can easily outperform PATHOCL. For instance, this is the LuxMark hotel scene with PATHOCL:

and BIASPATHOCL:

2.86M samples/sec vs. 3.65M samples/sec. Note the difference in light quality due to the introduced bias. Without the cuts, BIASPATHOCL produces the same image quality and performance as PATHOCL.

There is an additional result (i.e. the +1):

3) PATHOCL and BIASPATHOCL now share 99% of the OpenCL code (while the CPU code is still totally different). The result is that PATHOCL has gained some of the BIASPATHOCL features (i.e. the optional bias) and BIASPATHOCL has gained some of the PATHOCL features (i.e. micro-kernels, Russian Roulette, etc.).

NOTE: BIASPATHOCL requires a special sampler (like RTPATHOCL)

Code: Select all
sampler.type = BIASPATHSAMPLER

New PATHOCL features

As I said, PATHOCL has gained some of the BIASPATHOCL features:

Code: Select all
path.pathdepth.total
path.pathdepth.diffuse
path.pathdepth.glossy
path.pathdepth.specular

The new PATHOCL properties give the same control as BIASPATHOCL over the maximum depth of light bounces. path.pathdepth.diffuse offers basic control over the "amount" of global illumination you want in the rendering and heavily influences rendering performance (i.e. you can increase performance a lot by reducing the max. depth).

New BIASPATHOCL features

As I said, BIASPATHOCL has gained some of the PATHOCL features. The most obvious is micro-kernels with all their advantages (for more information about micro-kernels go to viewtopic.php?f=8&t=11346). It has also gained Russian Roulette support:

Code: Select all
biaspath.russianroulette.depth
biaspath.russianroulette.cap
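
For readers unfamiliar with Russian Roulette, here is a minimal Python sketch of the generic technique. It is not LuxCore's OpenCL code: the function name, the scalar throughput, and the reading of the two properties (depth as the bounce at which the roulette starts, cap as a lower bound on the continuation probability) are assumptions made for illustration only.

```python
def russian_roulette(throughput, depth, rr_depth, rr_cap, u):
    """Return the updated path throughput, or None if the path is terminated.

    throughput: scalar stand-in for the path's accumulated weight
    depth:      current bounce index of the path
    rr_depth:   bounce at which the roulette starts
                (assumed meaning of biaspath.russianroulette.depth)
    rr_cap:     minimum continuation probability
                (assumed meaning of biaspath.russianroulette.cap)
    u:          uniform random number in [0, 1)
    """
    # Short paths are never terminated.
    if depth < rr_depth:
        return throughput

    # Continue with a probability tied to how much the path can still
    # contribute, but never lower than the cap.
    prob = min(1.0, max(rr_cap, throughput))

    # Terminate the path when the random number falls outside prob.
    if u >= prob:
        return None

    # Survivors are re-weighted so the estimate stays unbiased.
    return throughput / prob
```

A higher cap terminates fewer paths (less noise, more work); a lower start depth starts cutting paths earlier (more speed, more noise).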

Lost BIASPATHOCL features

BIASPATHOCL has also lost one major feature in the process: path splitting (i.e. the capability to trace multiple shadow rays, glossy rays and diffuse rays). It now always traces only one ray. Path splitting is not a viable option with micro-kernels, so the following properties are no longer supported:

Code: Select all
biaspath.sampling.diffuse.size
biaspath.sampling.glossy.size
biaspath.sampling.specular.size
biaspath.sampling.directlight.size
biaspath.lights.firstvertexsamples

biaspath.sampling.aa.size is still supported as it doesn't require path splitting (i.e. the AA samples are just different paths). I have also removed the support for:

Code: Select all
biaspath.lights.lowthreshold
biaspath.lights.nearstart

I removed these more because I had the feeling they were not used (or were made obsolete by variance clamping) than because it is not possible to implement them. They can be re-added if required.

BIASPATHCPU

BIASPATHCPU still supports path splitting (i.e. biaspath.sampling.diffuse.size, biaspath.sampling.glossy.size, biaspath.sampling.specular.size, biaspath.sampling.directlight.size, biaspath.lights.firstvertexsamples). I would like to remove these features for symmetry with BIASPATHOCL, but CPUs have no problems with path splitting and it doesn't look like a good idea to remove a major feature that is already available.

The GPU load of BIASPATHOCL is heavily related to the tile size and the AA samples. The number of samples sent to the GPU to render is equal to <tile width> x <tile height> x <AA samples>, and if this value is too small, the GPU will sit there without enough work to do. So it is important to pick large tile sizes and/or AA samples if you have a fast GPU.
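
The formula above can be sketched in a few lines of Python. The function names and the min_samples target are hypothetical (the post gives no threshold for "enough work"; that depends on the GPU), and aa_samples stands for the total number of AA samples per pixel as used in the formula, not necessarily the raw value of biaspath.sampling.aa.size.

```python
def samples_per_tile(tile_width, tile_height, aa_samples):
    """Number of samples sent to the GPU for one tile."""
    return tile_width * tile_height * aa_samples


def smallest_square_tile(min_samples, aa_samples):
    """Smallest square tile edge whose workload reaches min_samples.

    min_samples is a hypothetical tuning target, not a LuxCore setting.
    """
    edge = 1
    while samples_per_tile(edge, edge, aa_samples) < min_samples:
        edge += 1
    return edge


if __name__ == "__main__":
    # Compare the workload of a small and a large tile at the same AA setting.
    for edge in (32, 64, 128):
        print(edge, samples_per_tile(edge, edge, 9))
```

Because the workload grows with the square of the tile edge, doubling the tile size quadruples the samples per tile, while increasing the AA samples scales it only linearly.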

PATHOCL or BIASPATHOCL ?

The true question is "No-Tile or Tile rendering?" because PATHOCL and BIASPATHOCL now have about the same set of features. The major differences are:

1) BIASPATHOCL can render images of any size as it stores only one (or a few) tiles in GPU RAM. PATHOCL image resolution is limited by having to store the whole image in GPU RAM.

2) PATHOCL can use better samplers (i.e. Sobol and Metropolis). The BIASPATHOCL-specific sampler is currently basically a Random sampler (but it may gain something similar to Sobol in the future).

3) PATHOCL maximizes the GPU load out of the box while BIASPATHOCL may require some tuning of the tile size and AA samples. PATHOCL is more "fire and forget" in general.

4) PATHOCL has "unbiased" default values for all parameters while BIASPATHOCL has "biased" defaults. With default values, BIASPATHOCL may deliver better performance at the cost of bias. However, you can now bias PATHOCL as much as BIASPATHOCL and unbias BIASPATHOCL as much as PATHOCL.

5) BIASPATHOCL has adaptive rendering and a convergence test.

As a general rule, you are likely to work better with PATHOCL than with BIASPATHOCL. However, if you are rendering high resolution images, BIASPATHOCL may simply be the only available option.

Posts: 8404
Joined: Sat Apr 19, 2008 6:04 pm
Location: Italy

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

I have finished writing the above post. I'm also going to start a new thread with a proposal for "Path engine unification for exporters": exposing a single PATH rendering engine with just several flags: CPU/OpenCL, Tile/No-tile, RT/No-RT, etc.

Posts: 8404
Joined: Sat Apr 19, 2008 6:04 pm
Location: Italy

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

Atm. always getting:
Code: Select all
[LuxCore][0.833] [PathOCLRenderThread::0] Rendering thread ERROR: clWaitForEvents(-9999)
terminate called after throwing an instance of 'cl::Error'
  what():  clEnqueueReadBuffer
Abgebrochen

with pathocl used from Blender or from a formerly exported .cfg. From the repo, luxball-hdr for example seems to work, still investigating ...

EDIT: looks for now like a massive exporter compat issue, still on it ....

EDIT2: one observation: whenever I export pathdepth 8, the filesaver exporter makes it 9!

EDIT3: aaaah, think I see now .... uhmm, okay the pathdepth is odd, must ask Simon why he did that.

EDIT4: the other cl error seems to be bound to using METROPOLIS only, SOBOL works

Jens

jensverwiebe

Posts: 3429
Joined: Wed Apr 02, 2008 4:34 pm

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

Jepp, METROPOLIS is broken atm., looking into this right now ...
Testing for now with SOBOL or RANDOM and this new stuff is really hot

Jens
Last edited by jensverwiebe on Mon Oct 10, 2016 8:30 am, edited 1 time in total.

jensverwiebe

Posts: 3429
Joined: Wed Apr 02, 2008 4:34 pm

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

Piita

Posts: 611
Joined: Sat Aug 06, 2011 2:09 pm
Location: Finland

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

Super great news, good feeling for LuxRender's future! Thank you so much!
i7 6700k + 32 Gb DDR4 + 2X R9 390 sapphir nitro.

sharlybg

Posts: 730
Joined: Tue Nov 02, 2010 10:22 am
Location: Ivory coast

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

jensverwiebe wrote: EDIT2: one observation: whenever I export pathdepth 8, the filesaver exporter makes it 9!

EDIT3: aaaah, think I see now .... uhmm, okay the pathdepth is odd, must ask Simon why he did that.

It is intended. It was introduced because LuxCore's way of counting path vertices sounds odd to end users (it accounts for the problem of (not) doing MIS on the last vertex). So Simon lets the user enter a more intuitive value and then adds 1 to account for LuxCore's way. LuxCore's way of counting:

#1 only direct light (without MIS);

#2 only direct light (with MIS);

#3 direct light (with MIS) + global illumination (1 bounce without MIS);

#4 direct light (with MIS) + global illumination (1 bounce with MIS);

#5 etc.

#2 is shown as #1 to the end user in LuxBlend, #3 as #2, etc.
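
The offset between the two conventions can be written as a tiny Python sketch. The function names are hypothetical and this is not LuxBlend's actual exporter code; it only mirrors the "+1" rule described above.

```python
def luxblend_to_luxcore_depth(user_depth):
    """Convert the depth shown to the LuxBlend user into LuxCore's count."""
    return user_depth + 1


def luxcore_to_luxblend_depth(core_depth):
    """Convert LuxCore's path depth back into the user-facing value."""
    return core_depth - 1


if __name__ == "__main__":
    # A user-facing depth of 8 ends up as 9 in the exported .cfg.
    print(luxblend_to_luxcore_depth(8))
```

This is why a pathdepth of 8 set in Blender shows up as 9 in the file written by the filesaver exporter.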

P.S. going to check Metropolis.

Posts: 8404
Joined: Sat Apr 19, 2008 6:04 pm
Location: Italy

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

jensverwiebe wrote:EDIT4: the other cl error seems to be bound to using METROPOLIS only, SOBOL works

It seems to happen only with NVIDIA; AMD and CPU devices are working fine here.

Posts: 8404
Joined: Sat Apr 19, 2008 6:04 pm
Location: Italy

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

jensverwiebe wrote:EDIT4: the other cl error seems to be bound to using METROPOLIS only, SOBOL works

It seems to happen only with NVIDIA; AMD and CPU devices are working fine here.

Jepp, just back on testing and did the same: checked Intel OCL and that's fine here too. Grrrrr ...

Edit: found something new in the messages when running with luxcoreui:
Code: Select all
[LuxRays][6.730] [Device GeForce GTX 980 Intersect] RADIANCE_PER_PIXEL_NORMALIZEDs[0] buffer size: 4800Kbytes
[LuxCore][6.730] [PathOCLRenderThread::0] Rendering thread ERROR: clEnqueueNDRangeKernel(CL_OUT_OF_RESOURCES)

Is that the culprit?

Jens

jensverwiebe

Posts: 3429
Joined: Wed Apr 02, 2008 4:34 pm

### Re: LuxCore: new BIASPATHOCL with OpenCL micro-kernels

jensverwiebe wrote:Edit, found something new in message when going with luxcoreui:
Code: Select all
[LuxRays][6.730] [Device GeForce GTX 980 Intersect] RADIANCE_PER_PIXEL_NORMALIZEDs[0] buffer size: 4800Kbytes
[LuxCore][6.730] [PathOCLRenderThread::0] Rendering thread ERROR: clEnqueueNDRangeKernel(CL_OUT_OF_RESOURCES)

Is that the culprit?

CL_OUT_OF_RESOURCES is an error the NVIDIA driver usually throws when ... it doesn't know what is going wrong on the GPU. It is a bit like a memory fault exception on the CPU.

Anyway, I should have fixed the problem with my last commit and Metropolis should now work fine again.