Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 10:28 am
by Abel
Great! :)

I'm very curious to see how good this will look on a more complex interior scene - if it looks good and fast, I'll start dreaming of having this in Blender's viewport someday.

Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 11:07 am
by Lord Crc
Very nice Dade. Unfortunately I haven't had much time to play with OpenCL yet... just wish these exams could finish soon! :P

Will indeed be interesting to see how it behaves as one makes it more complex.

Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 2:52 pm
by jensverwiebe
Hmm, although it compiles after changing the OS X includes to -framework OpenCL and such,
it always fails with:
OpenCL Device 0: Type = TYPE_GPU
OpenCL Device 0: Name = GeForce 8800 GT
OpenCL Device 0: Compute units = 14
OpenCL Device 0: Max. work group size = 512
Failed to open file 'rendering_kernel.cl'


Any idea?

Jens

Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 3:12 pm
by Lord Crc
Yeah Dade forgot (or didn't want?) to include the kernel file :)
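
For context, OpenCL kernels are usually shipped as source text that the host program reads at run time, which is why a missing rendering_kernel.cl only shows up when the program starts. A minimal sketch of such a loader in plain C follows; read_kernel_source is a hypothetical helper name chosen for illustration, not necessarily what SmallptGPU does internally.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read a whole text file into a NUL-terminated heap buffer.
 * Returns NULL on failure and prints a message if the file cannot be opened.
 * The caller owns the returned buffer and must free() it.
 * NOTE: read_kernel_source is a hypothetical name used for this sketch. */
char *read_kernel_source(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        fprintf(stderr, "Failed to open file '%s'\n", path);
        return NULL;
    }

    /* Determine the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) { fclose(f); return NULL; }

    /* Read the whole file in one call and verify nothing was cut short. */
    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    if (got != (size_t)size) { free(buf); return NULL; }

    buf[size] = '\0';
    if (size_out != NULL) *size_out = (size_t)size;
    return buf;
}
```

The returned buffer is what a host program would typically hand to the OpenCL runtime as program source; running from a directory that lacks the .cl file makes this function return NULL.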

Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 4:18 pm
by psor
Great stuff Dade, thanks for sharing! :shock: 8-) :mrgreen:



take care
psor

Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 5:18 pm
by Dade
Lord Crc wrote: Yeah Dade forgot (or didn't want?) to include the kernel file :)


Ahah, the kernelz is mine !!11onone ... sorry, I included *.c, *.h, *.txt ... and no *.cl :roll:

Jens, I updated the .tgz; it now includes the kernel too.

EDIT: Jens, your 14 compute units and 512 max. work group size look sexy, it should run even faster than on my hardware.

Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 7:41 pm
by dougal2
Both your packaged binary and my own rebuilt binary produce the following error in GPU mode:

Code:
doug@l64dev:~/dev/SmallptGPU-v1.0$ ./smallptGPU 0 1 1024 768 scenes/simple.scn
Usage: ./smallptGPU
Usage: ./smallptGPU <use CPU device (0 or 1)> <use GPU device (0 or 1)> <window width> <window height> <scene file>
Reading scene: scenes/simple.scn
Scene size: 5
OpenCL Device 0: Type = TYPE_GPU
OpenCL Device 0: Name = GeForce 8400 GS
OpenCL Device 0: Compute units = 1
OpenCL Device 0: Max. work group size = 512
Reading file 'rendering_kernel.cl' (size 2634 bytes)
Failed to build OpenCL kernel: -42
OpenCL Programm Build Log: ptxas ptx input, line 2088; fatal   : Parsing error near '.': syntax error
: Retrieving binary for 'anonymous_jit_identity', for gpu='sm_11', usage mode='


CPU version seems to run at about 1,570,000 S/s (Q8300 2.5GHz)

Re: ATI released OpenCL SDK with hardware support

Posted: Sun Dec 06, 2009 11:34 pm
by Lord Crc
Thanks Dade!

For the Cornell scene, the CPU gets about the same as dougal2's: 1.5 million samples/sec on my Q6600. After some slight tweaking I got the GPU version on my 4790 to stabilize at 5.9 million S/s. Both numbers are using OpenCL (I haven't compiled the single-threaded version yet).

The small tweak I did was to copy the seeds into a local (per-work-item) array, use that during the tracing, then copy it back out. So:

Code:
__kernel void RadianceGPU(
    __global Vec *colors, __global unsigned int *seedsInput,
    __global Sphere *Spheres, __global Camera *camera,
    const unsigned int SphereCount, const int width, const int height,
    const int currentSample,
    __global int *pixels) {
    const int gid = get_global_id(0);
    const int lid = get_local_id(0);
    const int gid2 = 2 * gid;
    const int x = gid % width;
    const int y = gid / width;

    // move seed to local store
    unsigned int seeds[2];
    mem_fence(CLK_GLOBAL_MEM_FENCE);
    seeds[0] = seedsInput[gid2+0];
    seeds[1] = seedsInput[gid2+1];

...
    Radiance(Spheres, SphereCount, &ray,
        &seeds[0], &r);

...

    mem_fence(CLK_GLOBAL_MEM_FENCE);
    seedsInput[gid2+0] = seeds[0];
    seedsInput[gid2+1] = seeds[1];
}


Not sure if the fences are necessary, but they didn't affect performance, so I'd say better safe than sorry ;) Of course you'll need to comment out the __global before the seed param declaration in the other files as well.
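
To illustrate why the tweak is safe, here is a host-side sketch in plain C (not OpenCL) of the same copy-in/copy-out pattern. It assumes a two-seed multiply-with-carry generator as a stand-in RNG; the function names next_random, trace_in_place and trace_with_local_copy are hypothetical and exist only for this example. Since each work item touches only its own pair of seeds, working on a private copy and writing it back once should leave exactly the same state as updating the shared array on every draw.

```c
#include <stdint.h>

/* Stand-in RNG for this sketch: a two-seed multiply-with-carry generator.
 * Assumption: the real kernel's RNG may differ; only the access pattern matters. */
static uint32_t next_random(uint32_t *seed0, uint32_t *seed1)
{
    *seed0 = 36969u * (*seed0 & 65535u) + (*seed0 >> 16);
    *seed1 = 18000u * (*seed1 & 65535u) + (*seed1 >> 16);
    return (*seed0 << 16) + *seed1;
}

/* Original pattern: every draw reads and writes the shared ("global") array. */
uint32_t trace_in_place(uint32_t *seeds_global, int gid, int draws)
{
    uint32_t sum = 0;
    for (int i = 0; i < draws; i++)
        sum += next_random(&seeds_global[2 * gid], &seeds_global[2 * gid + 1]);
    return sum;
}

/* Tweaked pattern: copy the two seeds in, draw from the copy, copy back once. */
uint32_t trace_with_local_copy(uint32_t *seeds_global, int gid, int draws)
{
    uint32_t seeds[2] = { seeds_global[2 * gid], seeds_global[2 * gid + 1] };
    uint32_t sum = 0;
    for (int i = 0; i < draws; i++)
        sum += next_random(&seeds[0], &seeds[1]);
    seeds_global[2 * gid] = seeds[0];
    seeds_global[2 * gid + 1] = seeds[1];
    return sum;
}
```

Both variants produce the same random sequence and the same final seeds; the second just replaces many shared-array accesses with two reads and two writes per work item.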

Cheers!

Re: ATI released OpenCL SDK with hardware support

Posted: Mon Dec 07, 2009 6:38 am
by Dade
Lord Crc wrote: Not sure if the fences are necessary but didn't affect performance, so I'd say better safe than sorry ;)


Thanks Lord, I applied your patch and gained about 300,000-400,000 samples/sec. You can safely remove the barriers: there is one seed per kernel instance, so they can all work in parallel.

I get only 400,000 samples/sec out of the OpenCL CPU device, so I guess it is just the Linux version that sucks (we have exactly the same CPU). I get about 1 million fewer samples/sec than you on the GPU, but your board is newer than mine (even if mine is a 48xx instead of a 47xx). It could again be the Linux drivers being slower (nothing new :?).

Dougal2: your error is really strange; it looks like an internal error of the compiler (bug?). -42 stands for CL_INVALID_BINARY. It looks like the compiler is generating something wrong.
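
For anyone else decoding raw numbers like -42, a small lookup helper can make build failures easier to read. The sketch below hard-codes a handful of error values as defined in the OpenCL cl.h header so that it compiles without the SDK; cl_error_name is a hypothetical helper name, and the list is deliberately incomplete.

```c
/* Map a few OpenCL error codes to their symbolic names.
 * The numeric values match the definitions in the OpenCL cl.h header;
 * they are hard-coded here so the sketch builds without the SDK.
 * NOTE: cl_error_name is a hypothetical helper, and the list is partial. */
const char *cl_error_name(int code)
{
    switch (code) {
    case 0:   return "CL_SUCCESS";
    case -1:  return "CL_DEVICE_NOT_FOUND";
    case -3:  return "CL_COMPILER_NOT_AVAILABLE";
    case -5:  return "CL_OUT_OF_RESOURCES";
    case -6:  return "CL_OUT_OF_HOST_MEMORY";
    case -11: return "CL_BUILD_PROGRAM_FAILURE";
    case -30: return "CL_INVALID_VALUE";
    case -42: return "CL_INVALID_BINARY";
    case -43: return "CL_INVALID_BUILD_OPTIONS";
    case -44: return "CL_INVALID_PROGRAM";
    default:  return "UNKNOWN_CL_ERROR";
    }
}
```

A host program could print cl_error_name(status) next to the numeric code in its "Failed to build OpenCL kernel" message.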

Re: ATI released OpenCL SDK with hardware support

Posted: Mon Dec 07, 2009 8:21 am
by dougal2
Dade wrote: Dougal2: your error is really strange; it looks like an internal error of the compiler (bug?). -42 stands for CL_INVALID_BINARY. It looks like the compiler is generating something wrong.


:(

I'll see if there are any SDK/driver updates.