Summary: 0001224: GlossyReflection kernel can compute a NaN Vector
Description: In the interesting case, the two inputs to pow() are base = 0x3f7ff9df (slightly less than 1.0f) and exponent = 0x39d0a3c4 (a small positive value).

- Windows calculator computes this as a value close to and slightly less than 1.0f, or 0x3F7FFFFF.
- A conformant version of pow() can compute this as a value slightly above 1.0f, or 0x3f800001.

const float cosTheta = pow(1.f - u1, exponent); // cosTheta = slightly >1.0f
const float sinTheta = sqrt(1.f - cosTheta * cosTheta); // sinTheta = NaN!
const float x = cos(phi) * sinTheta; // x = NaN!
const float y = sin(phi) * sinTheta; // y = NaN!
const float z = cosTheta;

Eventually, our entire ray direction is nothing but NaNs. This causes the entire QBVH to get walked as part of Intersect(), which means >200K iterations vs. ~100 if the ray direction is non-NaN. That obviously impacts performance.

If we clamp the results of pow() to [0.0f, 1.0f]:

const float cosTheta = min( 1.f, pow(1.f - u1, exponent) );

everything is fine.

Notes

SATtva (developer) 2012-04-23 20:39
Can you recheck with 1.0RC1 or the current trunk?

raun (reporter) 2012-04-25 09:42
Appears patched in this changeset:

author Asbjørn Heid
Wed Apr 25 09:05:49 2012 +0200 (9 hours ago)
changeset 1048 6033098d1c43
parent 1046 6f1d6b9c48bd
child 1049 271407fbc7b9

void GlossyReflection(const Vector *wo, Vector *wi, const float exponent,
        const Vector *shadeN, const float u0, const float u1) {
    const float phi = 2.f * M_PI * u0;
    const float cosTheta = pow(1.f - u1, exponent);
    const float sinTheta = sqrt(max(0.f, 1.f - cosTheta * cosTheta));
    const float x = cos(phi) * sinTheta;
    const float y = sin(phi) * sinTheta;
    const float z = cosTheta;

This fixes the NaN result from sqrt, but it's still possible that cosTheta is > 1.0f. I'm not sure if this is a problem since the vector is probably normalized eventually. Thanks for the quick response!

