SmallLuxGPU 1.3beta1

Postby Dade » Sun Feb 07, 2010 3:59 pm

First of all, I discovered a bug in "high bandwidth" mode: the first 32 passes were done with a capped path depth, as in "low latency" mode (the bug was that this responsiveness trick was not being disabled). This explains why we were seeing high samples/sec values at the start and then the value kept going down and down (it was averaged with the initial boost).

Second, I increased the default value of max. path depth from 2 to 3 (... I'm trying to slow down this little monster :D ).

So be careful when you compare values from v1.2 and v1.3beta1. Don't mix apples and oranges. It is probably better to use "low latency" mode for comparisons between v1.2 and v1.3beta1 (and don't forget to lower the max. depth too).
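
To illustrate why the averaged samples/sec figure kept falling in v1.2, here is a minimal sketch of a cumulative average where only the first 32 passes run at a boosted rate. Only the "32 passes" figure comes from this post; the rates, the samples-per-pass value and the function name are made-up assumptions, not SLG code.

```python
# Hypothetical numbers: only the "first 32 passes" figure comes from the post.
BOOSTED_PASSES = 32        # passes rendered with the capped path depth
BOOSTED_RATE = 3.0e6       # assumed samples/sec while the depth is capped
FULL_RATE = 1.0e6          # assumed samples/sec at full path depth
SAMPLES_PER_PASS = 1.0e6   # assumed number of samples in one pass


def average_rate(passes_done: int) -> float:
    """Cumulative samples/sec after `passes_done` passes."""
    boosted = min(passes_done, BOOSTED_PASSES)
    full = passes_done - boosted
    total_samples = passes_done * SAMPLES_PER_PASS
    total_time = (boosted * SAMPLES_PER_PASS / BOOSTED_RATE
                  + full * SAMPLES_PER_PASS / FULL_RATE)
    return total_samples / total_time


# The average starts at the boosted rate and decays towards the full-depth rate.
for n in (32, 64, 256, 4096):
    print(f"{n:5d} passes: {average_rate(n) / 1e6:.2f} Msamples/sec")
```

With these assumed values the displayed average never settles quickly: it keeps sliding towards the real full-depth rate as more passes are accumulated.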

Anyway, let's come to the interesting things. I ported the LuxRender QBVH code to both the CPU and the GPU. This code is highly optimized for vector types and it shows its muscles on the CPU thanks to SSE. This is v1.2 with BVH (only native threads):

cpu-1.2.jpg


And this is v1.3 with QBVH (only native threads):

cpu-1.3beta1.jpg


As you can see, it is something like 5 times faster ... woot :!:
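
For readers who have not met a QBVH before, the idea of a 4-wide node can be sketched roughly as follows. This is not the LuxRender code: it is a plain Python illustration where each node keeps the bounds of four children in a structure-of-arrays layout, so that a vector unit such as SSE could test one ray against all four boxes at once. The class name, field names and the example boxes are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class QBVHNode:
    # Structure-of-arrays layout: bounds[axis][child], 3 axes x 4 children.
    mins: List[List[float]]
    maxs: List[List[float]]


def intersect_children(node: QBVHNode, origin: Sequence[float],
                       inv_dir: Sequence[float], t_max: float) -> List[bool]:
    """Slab test of one ray against the four child boxes of a node."""
    t_near = [0.0] * 4
    t_far = [t_max] * 4
    for axis in range(3):
        # A SIMD version would handle these four children in one instruction.
        for child in range(4):
            t0 = (node.mins[axis][child] - origin[axis]) * inv_dir[axis]
            t1 = (node.maxs[axis][child] - origin[axis]) * inv_dir[axis]
            if t0 > t1:
                t0, t1 = t1, t0
            t_near[child] = max(t_near[child], t0)
            t_far[child] = min(t_far[child], t1)
    return [t_near[c] <= t_far[c] for c in range(4)]


# Four unit-sized boxes: two on the ray's path, two shifted up in y.
node = QBVHNode(
    mins=[[0.0, 2.0, 0.0, 2.0], [0.0, 0.0, 2.0, 2.0], [0.0, 0.0, 0.0, 0.0]],
    maxs=[[1.0, 3.0, 1.0, 3.0], [1.0, 1.0, 3.0, 3.0], [1.0, 1.0, 1.0, 1.0]],
)
origin = (-1.0, 0.5, 0.5)
direction = (1.0, 0.01, 0.01)  # kept non-zero on every axis to avoid 1/0
inv_dir = tuple(1.0 / d for d in direction)
hits = intersect_children(node, origin, inv_dir, t_max=100.0)
print(hits)
```

The inner loop over the four children is the part that maps naturally onto 4-wide vector types, which is why this kind of layout benefits from SSE on the CPU.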

The results on the GPU are less eye-popping, but you should still see a speedup of 20-40%. The total speedup depends very much on your hardware and on the balance between the CPU and the GPU; anyway, it runs more than 2 times faster than the previous version on my hardware:

cpu+gpu-1.3beta1.jpg


The sources and binaries are available here: http://davibu.interfree.it/opencl/small ... 3beta1.tgz

Postby SATtva » Sun Feb 07, 2010 4:06 pm

I'd say just one word: outstanding!

Postby Chiaroscuro » Sun Feb 07, 2010 5:21 pm

:mrgreen: Nice.

Hmmm, don't know what yet, but something is different... ;) :D
15 seconds (batch mode) on Sponza scene, using same config file for both:
Attachments
image.jpg
15 seconds Sponza - v1.3
image.jpg
15 seconds Sponza - v1.2

Postby vildanovak » Sun Feb 07, 2010 5:39 pm

3.6 megasamples on luxball????? wow.

Postby Chiaroscuro » Sun Feb 07, 2010 7:04 pm

vildanovak wrote:3.6 megasamples on luxball????? wow.
Dade, I haven't been paying much attention to the stats (for SLG) on my PC until now, but I wonder if something is not right with them? If I disable one of my GPUs using the config file, I get similar stats, if not better? (Attached is with both GPUs going, wide so you can see the stats.) i7 920 + 4890 x 2, Win7 64-bit, all standard clocks. (I just compared with v1.2 and it's similar for 1 vs 2 GPUs; but with v1.2 I get higher samps/sec. and lower rays/sec. vs v1.3, where I get lower samps/sec. and higher rays/sec., like they're inversely proportional?) Otherwise your PC is completely destroying mine, or you weren't kidding about the Windows performance hit. :P (or could it just be Windows' timer?)

EDIT: I've been doing more tests, and it doesn't appear to be your stats; it appears to be a catch-22 situation with trying to utilize all of the processors in my PC. You can see a little bit of the pattern in the attachment: one single GPU performs well... bring in two and there's some loss; then start bringing in CPU threads, one at a time, and for every gain in performance from a native thread I get some performance loss from the dual GPUs. It's like they trade off against each other instead of adding up. And the more native threads, the worse the impact on the dual GPUs, to the point that it performs nearly as well with a single GPU with the max native threads. So I reach a practical plateau long before reaching the theoretical one. Sucks. :(
Attachments
slgstats2.jpg
Gradual loss in performance
slgstats.jpg
Initial stats
Last edited by Chiaroscuro on Mon Feb 08, 2010 1:16 am, edited 1 time in total.

Postby Eros » Sun Feb 07, 2010 8:04 pm

I have a test scene with about 433k tris and I get a jump from 720k samples/s to 900k samples/s. Quite nice!

As noted in the LuxrenderGPU thread, updating my drivers in Linux fixed a few problems. One problem affecting 1.3beta1 was that I would get workload problems regardless of the value I set. Version 195.30b of the drivers appears to actually work :D

Postby jensverwiebe » Mon Feb 08, 2010 5:06 am

Hi all

smallluxGPU-v1.3beta1_OSX _x86_64 : http://www.jensverwiebe.de/LuxRender/Lu ... x86_64.zip

Fastest result is now with 4 native + gpu:


fastest_confi.png






Jens

Postby Dade » Mon Feb 08, 2010 7:25 am

Chiaroscuro wrote:you weren't kidding about the Windows performance hit. :P (or could it just be Windows' timer?)


Linux 64bit is about 2 times faster than Windows 7 64bit on my hardware. I'm not sure why; I think it is a combination of the GCC optimizer being a lot better than the VisualC++ optimizer and the Windows scheduler being pure crap. Check my last screenshot: my 5870 is about as fast as your 2x4890 and our CPUs are equivalent in speed too, yet I can throw 8 threads at it without losing too much GPU workload while you can just barely use 2 (!).

Did I say the Windows scheduler is crap ?

You could do a test with a linux live cd to check how much you could gain with Linux on your hardware :idea:

Chiaroscuro wrote:EDIT: I've been doing more tests, and it doesn't appear to be your stats; it appears to be a catch-22 situation with trying to utilize all of the processors in my PC. You can see a little bit of the pattern in the attachment: one single GPU performs well... bring in two and there's some loss; then start bringing in CPU threads, one at a time, and for every gain in performance from a native thread I get some performance loss from the dual GPUs. It's like they trade off against each other instead of adding up. And the more native threads, the worse the impact on the dual GPUs, to the point that it performs nearly as well with a single GPU with the max native threads. So I reach a practical plateau long before reaching the theoretical one. Sucks. :(


Up to now the contribution of the CPU was quite negligible in terms of samples/sec, so it wasn't really a problem to dedicate 1 core to each GPU. However, now that native threads are so fast, it is becoming a suboptimal solution. So I'm exploring various solutions to keep the GPU busy at a lower cost for the CPU:

1) generate more rays for each step: for instance, trace more shadow rays instead of only one. This would greatly increase the GPU load (4 triangle light sources in the luxball scene = 3 more rays traced per step = 5 rays traced instead of 2 = 2.5 times the load on the GPU). It would produce fewer samples/sec but with a lot less noise;

2) move the image pipeline to the GPU (in order to have less load on the CPU, add true image filtering, etc.);

3) move ray setup and result collection to the GPU too.

I'm doing #1; #2 is very interesting and I will look into it soon; #3 could be ridiculously fast but is not applicable to LuxrenderGPU, so I'm not planning to move in that direction.
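
The arithmetic behind option 1 can be checked with a tiny sketch. It assumes, as in the luxball example above, that each step traces one path ray plus one shadow ray per sampled light source; the function name is just for illustration.

```python
def rays_per_step(shadow_rays: int) -> int:
    """One path ray plus the given number of shadow rays."""
    return 1 + shadow_rays


current = rays_per_step(1)    # one shadow ray per step, as done now
proposed = rays_per_step(4)   # one shadow ray per light source (4 in luxball)
load_factor = proposed / current

print(f"{current} rays -> {proposed} rays per step, "
      f"{load_factor:.1f}x the GPU load")
```

The CPU-side work per step stays roughly the same while the GPU gets more rays to trace, which is the point of the change.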

P.S. the difference you are seeing in Sponza scene is exactly the fix of path depth I was talking about. v1.2 was just doing the first 32 passes with a cut path depth resulting in a direct lighting-only rendering (this is useful in "low latency" mode to increase the responsiveness but totally useless in "high bandwidth" mode and it was artificially boosting the samples/sec).

Postby Chiaroscuro » Mon Feb 08, 2010 10:37 am

Dade wrote:[...]You could do a test with a linux live cd to check how much you could gain with Linux on your hardware :idea: [...]
Will try that. Actually, I have to re-format soon as I'm still using Win 7 RC which expires in March, so I'll put aside a partition for linux at the same time.

Thank you for the details.

Postby Dade » Mon Feb 08, 2010 12:24 pm

Chiaroscuro wrote:
Dade wrote:[...]You could do a test with a linux live cd to check how much you could gain with Linux on your hardware :idea: [...]
Will try that. Actually, I have to re-format soon as I'm still using Win 7 RC which expires in March, so I'll put aside a partition for linux at the same time.


Another report about differences between linux and windows on the same hardware: viewtopic.php?f=34&t=3447&p=31730#p31716