KyungSoo wrote:Thank you for the information; the new nVidia driver makes it work.
And there is a large speedup, too, from 2,500K to 3,400K.
But the load on the GPUs has not improved yet. It is still around 40%.
Quite impressive to see 8 GPUs at work. I have a new SmallLuxGPU version where you can assign multiple threads to keep a GPU busy: all GPUs (i.e. intersection devices) are grouped together under a virtual device that the system sees as a single GPU (the workload is dynamically assigned to the least busy real GPU). Then I can define any number of threads to produce work for this single virtual device (for instance, you can have 4 threads producing work for 2 GPUs).
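To make the idea concrete, here is a rough Python sketch of the dispatch logic. It is not the actual SmallLuxGPU code: the class VirtualDevice, its methods and the run helper are hypothetical names made up for illustration, and the "GPU work" is simulated by completing each item immediately.

```python
import threading


class VirtualDevice:
    """Hypothetical stand-in for several real GPUs seen as a single device."""

    def __init__(self, num_gpus):
        self._lock = threading.Lock()
        self.pending = [0] * num_gpus    # work items currently queued per real GPU
        self.assigned = [0] * num_gpus   # total items ever sent to each real GPU

    def submit(self):
        """Send one work item to the least busy real GPU and return its index."""
        with self._lock:
            gpu = min(range(len(self.pending)), key=self.pending.__getitem__)
            self.pending[gpu] += 1
            self.assigned[gpu] += 1
            return gpu

    def complete(self, gpu):
        """Mark one work item on the given real GPU as finished."""
        with self._lock:
            self.pending[gpu] -= 1


def producer(device, n_items):
    """A CPU thread producing work for the virtual device."""
    for _ in range(n_items):
        gpu = device.submit()
        # A real GPU would process the item here; the sketch finishes at once.
        device.complete(gpu)


def run(num_threads=4, num_gpus=2, items_per_thread=1000):
    """E.g. 4 producer threads feeding 2 GPUs through one virtual device."""
    device = VirtualDevice(num_gpus)
    threads = [
        threading.Thread(target=producer, args=(device, items_per_thread))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return device
```

The producer threads never pick a GPU themselves; they only talk to the virtual device, so the number of threads and the number of GPUs can be chosen independently.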
However, it would not solve the problem in your case, because I assume you have more GPUs than CPU cores on your test system. I'm afraid the only option on a system like yours is to make the CPU code run faster and/or move more work from the CPU to the GPUs.
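A back-of-the-envelope way to see why, as a Python sketch. This is a simplified model and an assumption on my part, not a measurement: each producer thread needs cpu_time to prepare one batch of work, each GPU needs gpu_time to process it, and the function name and all the numbers below are invented purely for illustration.

```python
def gpu_utilization(producers, gpus, cpu_time, gpu_time):
    """Estimated fraction of time the GPUs are busy (simplified model).

    producers: CPU threads producing work (at most one per CPU core)
    gpus:      real GPUs consuming work
    cpu_time:  time for one thread to prepare one batch (arbitrary units)
    gpu_time:  time for one GPU to process one batch (same units)
    """
    supply = producers / cpu_time    # batches the CPU side can produce per unit time
    capacity = gpus / gpu_time       # batches the GPUs could consume per unit time
    return min(1.0, supply / capacity)


# Illustrative numbers only: 4 producer threads feeding 8 GPUs.
print(gpu_utilization(4, 8, cpu_time=2, gpu_time=2))  # 0.5
# Making the CPU code twice as fast fills the GPUs in this model.
print(gpu_utilization(4, 8, cpu_time=1, gpu_time=2))  # 1.0
```

In this model the GPUs can only be kept full by lowering cpu_time (faster CPU code) or by raising gpu_time relative to it (more of the work done on the GPU side, or a heavier scene).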
There is also the option of increasing the complexity of the scene; for instance, you could try the scene with 2.7M triangles (but you probably need an even more complex scene):
http://davibu.interfree.it/opencl/small ... scenes.tgz