25 Oct 09:13
Re: [patch 2/2] backport of sba sg list management to ccio-dma
Joel Soete <soete.joel <at> scarlet.be>
2007-10-25 07:13:18 GMT
2007-10-25 07:13:18 GMT
> On Tue, Oct 23, 2007 at 06:15:34PM +0200, Joel Soete wrote:
> ...
> > This seems to be better fls then ffs like:
> > int get_iovp_order_faster(unsigned long size)
> > {
> > return fls((size - 1) >> (IOVP_SHIFT));
> > }
>
> Yes, I didn't really think about fls() vs ffs().
> But you figured it out. :)
>
> I just wanted the asm() version that's 10x faster than looping through
> and testing each bit.
>
I attached 2 test files (GetOrder-o.c using the original kernel get_iovp_order
routine and GetOrder-f.c using supposed faster release) and here are some
results:
# gcc -O2 -o GetOrder-o GetOrder-o.c
# gcc -O2 -o GetOrder-f GetOrder-f.c
# time ./GetOrder-o
real 0m40.576s
user 0m39.990s
sys 0m0.004s
# time ./GetOrder-f
real 0m1.401s
user 0m1.348s
sys 0m0.008s
# gcc -o GetOrder-o GetOrder-o.c
# gcc -o GetOrder-f GetOrder-f.c
# time ./GetOrder-o
real 4m54.438s
user 4m48.486s
sys 0m0.044s
# time ./GetOrder-f
real 1m22.989s
user 1m21.317s
sys 0m0.016s
I didn't check in more details the difference of results between not-optimized
and optimized (-O2) compile but it's clear it's faster
(why not push this in upstream? [I mean ./include/asm-generic/page.h])
>
> > Even thought following test seems to be ok:
> > unsigned int a;
> > unsigned long ul;
> >
> > /* for proof of concept
> > */
> > for (ul=0; ul<536870914; ul++) {
> > if ( get_iovp_order(ul) != get_iovp_order_faster(ul) ) {
> > printf("get_iovp_order(%ld) = %ld (0x%x)\n\n", ul, a =
get_iovp_order(ul), a);
> > printf("get_iovp_order_faster(%ld) = %ld (0x%x)\n\n", ul, a =
> > get_iovp_order_faster(ul), a);
> > exit (1);
> > }
> > }
> >
> > But what would give the kernel generic (include/asm-generic/page.h: this is
> > what get_iovp_order() is if IOVP_SHIFT==PAGE_SHIFT) get_order(0): 0?
> > the answer is 20 for paric and ia32?
>
> I don't know offhand.
> I'm not sure we have to handle that case...or if we do, make sure it's
handled correctly in get_iovp_order().
>
I just hope so that linux check to never call get_order(0)
> > > ...
> > > +#ifdef CCIO_MAP_STATS
> > > + ioc->usg_pages += sg_dma_len(sglist) >> IOVP_SHIFT;
> > > + ioc->usingle_calls--; /* kluge since call is unmap_sg() */
> > > +#endif
> > >
> > > I wouldn't add MAP_STATS here.
> > > They aren't enabled in SBA becuase they impact performance too much.
> > > I expect that will always be true and would seriously consider removing
> > > MAP_STATS code from SBA as well. Timing the bitmap search is
> > > probably the only critical bit of info that really matters.
> > > And that's for developement/testing only.
> > >
> > mmm, I would so let stay here but with just additional comment to prevent to
> > activate it outside this development/testing context?
>
> That's fine too.
In fact it would be useless as the begining of the src this #define was
already commented this way
> Just making a suggestion here since stats are alot less
> useful when stats interfere with the actual performance measurements.
> Reducing the stats only measuring the search time would make more sense
> to me.
>
understand. (I will remove this too so, even thought tbh I didn't yet test
this feature)
> cheers,
> grant
>
>
Many tx,
J.
PS: I play a bit with this stuff and surprisingly with latest git grab of
2.6.23, the narrow s-e disk (on LASI ncr53c710 hba) seems to be usable: I
reach to update an debian unstable installation not updated since a year
(about 430 upgrade) with just few and harmless "Bus Reset ..." and "failing
command because of reset, ..."
---
Pack Scarlet One, ADSL 6 Mbps + Telephonie, a partir de EUR 29,95...
http://www.scarlet.be/
_______________________________________________ parisc-linux mailing list parisc-linux <at> lists.parisc-linux.org http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
RSS Feed