Implements three algorithms: two almost-naive for small sizes (less than a few hundred), and Peter Luschny's Swing-Divide&Conquer[1] for larger operands. On my laptop the new code is ~20% faster than current GMP-5.0[2] implementation for all sizes, see the graphs below.
Behaviour can be deeply different on different architectures. If you try the code and obtain much different results, please let me know: <>.
The code contains an implementation of the Eratosthenes' sieve, the (odd) swing factorial, and some other service functions (e.g. a function to compute the product of an unordered list of numbers).
To test the code you can directly replace the
file mpz/fac_ui.c with the new one, and recompile the GMP
library. Otherwise you can use it stand-alone, compiling it with
the -DMAIN option from within the GMP source directory
adding all required options (after make&&(cd tune;make
speed)). I use:
gcc -O2 -DMAIN -I. -Itune fac_ui.c .libs/libgmp.a tune/.libs/libspeed.a -lrt -o fac-main .
The syntax is: fac-main [limit] [option
[step]] , where
-f factorial
test, -F factorial timings, -b binomial
test, -B binomial(n,n/3) timings;The code contains two thresholds: FAC_DSC_THRESHOLD
and FAC_ODD_THRESHOLD, the first is determined by the
size where bc_oddfac_1 start being slower
than dsc_oddfac_1; the second one by the
comparing bc_fac with any of the two oddfac
implementations.
TODO: tune both thresholds with tune/speed.
Licence: GNU GPLv3+.
Download: gmp-5.0.1/mpz/fac_ui.c.
References:
[1]Divide, Swing, and Conquer the Factorial and the lcm{1, 2, …, N} by Peter Luschny, preprint (April 2008)
[2]GMP - The GNU Multi Precision library
Implements four algorithms: a small update to current GMP-5.0[2] implementation and another naïve algorithm, two Divide&Conquer experiments (to be generalised) for medium-sizes, and the asymptotically fast algorithm by P. Goetgheluck[1] using factorisation.
This code heavily depend on the code above... Precisely,
it is the same file. It can be tested by
replacing mpz/fac_ui.c with the new file,
replace mpz/bin_uiui.c with the two lines below, and
recompile the GMP.
#define OPERATION_bin_uiui #include "fac_ui.c"
WARNING! This code is not ready for inclusion in
the library yet! There are many couples (n,k) where it
is slower than current implementation, because NO
threshold mechanism implement yet. Only the algoritm
marked new is used, but this is good, as you can see from
the pictures below, only for big values of k.
The graphs below show the regions (depending on values (n,k)) where each algorithm is the fastest.
A very short description of the algorithms referred to by the graphs:
Licence: GNU GPLv3+.
Download: gmp-5.0.1/mpz/fac_ui.c.
References:
[1]Computing Binomial Coefficients by P. Goetgheluck, The American Mathematical Monthly, Vol. 94, No. 4 (April 1987), pp. 360-365.
[2]GMP - The GNU Multi Precision library
The new code for was inserted again in the same
file. It can be tested by compiling it, I use the following
line
gcc -O2 -DMAIN -Impz -Itune fac_ui.c .libs/libgmp.a tune/.libs/libspeed.a -lrt -o fac-main .
When invoked by fac-main -d, the program will test the correctness of the code for the double factorial.
When invoked by fac-main -p, the program will test the correctness of the code for the primorial.
TODO: the code is in an early developement state, nevertheless it should be asymptotically fast.
Licence: GNU AGPLv3+.
Download: gmp-5.0.2/mpz/fac_ui.c.
References:
[*]GMP - The GNU Multi Precision library