forked from len0rd/rockbox
Commit graph

9 commits

Author SHA1 Message Date
Andrew Mahone
5313bf52b5 Invert divisor earlier in udiv32_arm, allowing the div0 test to be done before entering the 32-bit divide portion of the code, and making the handling of div0 simpler.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24166 a1c6a512-1295-4272-9138-f99709370657
2010-01-03 15:57:03 +00:00
Andrew Mahone
686c4e53ce Use long jump to reach __div0 from udiv32_arm if building with IRAM and without EABI.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24152 a1c6a512-1295-4272-9138-f99709370657
2010-01-03 04:48:19 +00:00
Andrew Mahone
c1f4d4037a More comments for udiv32_armv4.S, reduce zero divisor test to one cycle for the skipped branch by setting flags when inverting divisor, 32-bit numerators are handled by calling the 31-bit divider and fixing the results.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24151 a1c6a512-1295-4272-9138-f99709370657
2010-01-03 04:30:13 +00:00
Andrew Mahone
d03768bc14 Add missing EOF newline.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24143 a1c6a512-1295-4272-9138-f99709370657
2010-01-02 15:25:34 +00:00
Andrew Mahone
934514558b Remove special cases from udiv32_armv4.S, except for zero divisor and large numerator. Improvement of 1.23MHz on e200 with ape normal.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24142 a1c6a512-1295-4272-9138-f99709370657
2010-01-02 15:15:21 +00:00
Andrew Mahone
822abc1236 Add 31/31-bit unsigned division in apps/codecs/lib/udiv_arm.S, with 2 cycles / iteration, falling back to previous 32-bit, 3 cycle / iteration code when needed (well under 1% of divisions in sample file). APE normal sample is now 96.90% realtime, approx 1.3% improved vs svn. TODO: unify divisor normalization for both trial subtraction routines, possibly use divisor bits to select 31- vs 32-bit division.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@24130 a1c6a512-1295-4272-9138-f99709370657
2009-12-31 08:32:15 +00:00
Jens Arnold
545b51e2e4 ARMv4 unsigned integer division: Using an overflow-safe comparison method in the main calculation allows putting back the 1.5 cycle (average) optimisation. Shaved off another instruction, as we don't need the remainder. * Use the very efficient ffs algorithm from ffs-arm.S for dividing by a power of 2.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19032 a1c6a512-1295-4272-9138-f99709370657
2008-11-06 21:21:33 +00:00
Jens Arnold
0eb6ae938e This optimisation breaks for very large divisors (MSB set), so remove it.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19012 a1c6a512-1295-4272-9138-f99709370657
2008-11-05 07:36:39 +00:00
Jens Arnold
fe04e40be7 Further optimised (vs. libgcc) unsigned 32 bit division for ARMv4 (based on the ARMv5(+) version from libgcc), in IRAM on PP for better performance on PP5002, and put into the codeclib for possible reuse. APE -c1000 is now usable on both PP502x and PP5002 (~138% realtime, they're on par now). Gigabeat F/X should also see an APE speedup.
git-svn-id: svn://svn.rockbox.org/rockbox/trunk@19009 a1c6a512-1295-4272-9138-f99709370657
2008-11-05 00:10:05 +00:00
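
For readers unfamiliar with the technique these commits tune, the following is a minimal C sketch of unsigned 32-bit division by trial subtraction, with the zero-divisor test done before the divide loop. It is not the Rockbox assembly: the names `udiv32_sketch` and `clz32` and the `div0` out-parameter are hypothetical, and comparing the shifted numerator against the divisor is only one possible way to keep the comparison overflow-safe, assumed here for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: count leading zero bits. x must be nonzero. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Hypothetical sketch: returns num / den (remainder is discarded).
 * On a zero divisor, sets *div0 to 1 and returns 0. */
uint32_t udiv32_sketch(uint32_t num, uint32_t den, int *div0)
{
    *div0 = 0;
    if (den == 0) {          /* zero-divisor test before the divide loop */
        *div0 = 1;
        return 0;
    }
    if (den > num)
        return 0;

    /* Normalization: align the divisor's top bit with the numerator's.
     * Since num >= den > 0, clz32(den) >= clz32(num). */
    unsigned shift = clz32(den) - clz32(num);
    uint32_t quot = 0;

    /* One trial subtraction per quotient bit, from bit `shift` down to 0. */
    for (unsigned i = shift + 1; i-- > 0; ) {
        /* Compare (num >> i) with den rather than num with (den << i),
         * so a divisor with its MSB set cannot overflow the comparison. */
        if ((num >> i) >= den) {
            num -= den << i;
            quot |= (uint32_t)1 << i;
        }
    }
    return quot;
}
```

A caller would check `div0` after the call, which stands in for the branch to `__div0` mentioned in the commit messages. The assembly versions unroll this loop and spend 2 or 3 cycles per iteration, which a C sketch does not attempt to reproduce.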