8 Sep 2003 08:37
Re: a query on floating-point operations
At 11:56 am +0200 03/9/6, james anderson wrote:
>is this a consequence of the cache architecture / internal processor
>parallelism which makes the array references as effective as
>referencing constants from the code? is there some effect i'm
>overlooking?

I am more lost than you, James.

I wondered whether the compiler was unable to use the declaration

  (type (simple-array double-float (4)) p result)

when within the compile-transform macro. So I tried using a macro to wrap a local declaration around each access, i.e. in lieu of (* ... I used (*d ... where *d is

  (defmacro *d (a b) `(* (the double-float ,a) (the double-float ,b)))

This had a deleterious effect on speed.

At a tangent, I tried svref in lieu of aref in the symbol-macrolet on the simple-arrays p and result, which produced massive consing.

My hunch is that array access is not really as fast as raw float access; rather, some other inefficiency in the compile-transform version is slowing it down. The next test is to manually substitute the example float values into the compile-transform function, which removes the runtime compiler phase and so shows whether that phase is the source of the inefficiency.

My trial-and-error optimization technique needs wholesale recalibration in the light of compiler and other internal changes in MCL 5 and MacOS X.

James, you don't say what environment you are using (MCL, MacOS, hardware).

I look forward to hearing from those who know far more about compiler internals and appropriate MCL optimization techniques.

Parting shot: I presume that we cannot use this type of constant-folding example in delivery applications, as it invokes the compiler at runtime (that is, without buying an expensive compiler delivery licence).
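
For anyone following the thread, here is a minimal sketch of the two variants being compared: one reads its coefficients from a declared double-float array at run time, the other has the same values spliced in as literals and compiled at run time. Only *d is taken from the message above; dot-arrays, make-folded-dot and the sample data are hypothetical names invented for illustration, and this is not James's compile-transform code. The sketch makes no claim about which variant is faster under MCL.

```lisp
;;; Illustrative sketch only. Apart from *D, the names are hypothetical.

(defmacro *d (a b)
  "Multiply A and B, telling the compiler that both are double-floats."
  `(* (the double-float ,a) (the double-float ,b)))

(defun dot-arrays (p q)
  "Dot product with the coefficients read from the array P at run time."
  (declare (type (simple-array double-float (4)) p q)
           (optimize (speed 3) (safety 1)))
  (let ((sum 0d0))
    (declare (type double-float sum))
    (dotimes (i 4 sum)
      (incf sum (*d (aref p i) (aref q i))))))

(defun make-folded-dot (p)
  "Return a function, compiled at run time, with the elements of P as literals."
  (values
   (compile nil
            `(lambda (q)
               (declare (type (simple-array double-float (4)) q)
                        (optimize (speed 3) (safety 1)))
               (+ ,@(loop for i below 4
                          collect `(* ,(aref p i) (aref q ,i))))))))

(defun vec4 (a b c d)
  "Build a (simple-array double-float (4)) from four double-floats."
  (make-array 4 :element-type 'double-float
                :initial-contents (list a b c d)))

;; Usage: both calls return 70.0d0 for this sample data.
(let ((p (vec4 1d0 2d0 3d0 4d0))
      (q (vec4 5d0 6d0 7d0 8d0)))
  (list (dot-arrays p q)
        (funcall (make-folded-dot p) q)))
```

Two notes on the sketch. First, make-folded-dot calls compile at run time, which is exactly the dependency the parting shot worries about for delivery. Second, svref is specified only for simple-vectors, i.e. one-dimensional simple arrays able to hold any object, so applying it to an array specialized on double-float is not a like-for-like replacement for aref; that may be related to the consing reported above, but this is a guess.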