Steve Wart | 17 Nov 21:08

Re: TBoundSphere performance

Thanks David,

I am also thinking of pre-calculating the spheres to cache them with my meshes.

The hit-testing is fast, it's the initial creation that's expensive.

Moving this complex code into C is a bit daunting, although I found a few free(ish) libraries that can take an arbitrary vector of 3d points and return the center and radius of a minimal (or close to it) sphere containing the points. Since we are doing most everything else in Smalltalk that might also provide a nice win. We are creating a large number of objects in this method, but as you say, it is only invoked at certain times.

I am activating about 200 frames when I enter a new space and the bounds init is taking on the order of 500ms to 3000ms for each one.

Steve

On Mon, Nov 17, 2008 at 11:53 AM, David P. Reed <dpreed <at> reed.com> wrote:
It's not surprising that so much time is spent on this.   It takes the mesh and creates a tight sphere-covering to accelerate ray intersection testing.

It's possible to trade space for time by storing the resulting sphere-tree with the mesh source, rather than recalculating it at load time.   But the storage space explosion factor is enormous, and it's not clear whether reading in the sphere-tree wouldn't be slow due to file reading (it certainly is slow to send across a slowish network).

You could get a constant factor by just making a primitive out of a few of the inner loops, or even much better improvement by using CUDA organized parallelized code (nice research project).

Alternatively, and probably much better, change the ray-mesh intersection code (the only client for the sphere-tree) by subclassing the operation by known techniques optimiized for your mesh's properties.  OOP is designed for this!




Steve Wart wrote:
Hi,

I have some large meshes that are taking a long time to activate. It looks like a lot of the time is spent in a method called initBounds, which in turn is spending a lot of time in TBoundSphere>>calcTree:faces:depth:

Have there been any efforts in optimizing this method (e.g. making it a primitive)?

Any suggestions on how to approach this? Slang? FFI? Something more clever?

Steve


Gmane