To appear (accepted): Paolo D'Alberto, Marco Bodrato, and Alexandru Nicolau Exploiting Parallelism in Matrix-Computation Kernels for Symmetric Multiprocessor Systems: Matrix-Multiplication and Matrix-Addition Algorithm Optimizations by Software Pipeline and Threads Allocation in TOMS, ACM press.
Download: DVI-paper, PDF-paper, PDF-slides, BibTex-entry, software.
Abstract: Although Strassen method is not the
asymptotically faster matrix multiplication known, it is the most
widely used for large matrices on finite fields. After his first
paper, some variant have been proposed, with different additive
complexity, here we describe a new one.
The new variant is as
good as those already known for a simple matrix multiplication, but
can save operations either when more than two matrices are to be
multiplied or for squaring. Moreover it can be proved optimal for this
tasks.
The biggest gain is shown for nth-power computation,
in this scenario the additive complexity can be halved, with respect
to original Strassen's.
Published: Preprint N.622 Centro Interdipartimentale "Vito Volterra" - Università di Roma "Tor Vergata" (2008)
Published (full revised version): A Strassen-like matrix multiplication suited for squaring and higher power computation in Proceedings of the ISSAC 2010 conference, München, Germany, 25-28 July, 2010, ACM press.
Download: DVI-paper (29 kB), PDF-paper (265 kB), DjVu-paper (139 kB), PDF-slides (224 kB), TeX-slides (6 kB), DVI-preprint (25 kB), PDF-preprint (247 kB), DjVu-preprint (71 kB), BibTex-entry, software.
Abstract: Some hints and tricks to automatically obtain high degree Toom-Cook implementations, i.e. functions for integer or polynomial multiplication with a reduced complexity. The described method generates quite an efficient sequence of operations and the memory otprint is kept low by using a new strategy: mixing evaluation, interpolation and recomposition phases. It is possible to automatise the whole procedure obtaining a general Toom-n function, and to extend the method to polynomials in any characteristic except two.
Published: in E. Antelo, D. Hough and P. Ienne, editors, Proceedings of the 20th IEEE Symposium on Computer Arithmetic, pages 15-22, IEEE, Tübingen, Germany, July 25-27, 2011.
Download: PDF-paper (363 kB), PDF-slides (467 K), TeX-slides (- K), software.
Abstract:
Toom-Cook strategy is a well-known method for building algorithms to
efficiently multiply dense univariate polynomials. Efficiency of the
algorithm depends on the choice of interpolation points and on the
exact sequence of operations for evaluation and interpolation. If
carefully tuned, it gives the fastest algorithm for a wide range of
inputs. This work smoothly extends the Toom strategy to polynomial
rings, with a focus on
GF(2)[x]. Moreover a method is
proposed to find the faster Toom multiplication algorithm for any
given splitting order. New results found with it, for polynomials in
characteristic 2, are presented.
A new extension for
multivariate polynomials is also introduced; through a new definition
of density leading Toom strategy to be efficient.
Published: In C.Carlet and B.Sunar, editors, WAIFI'07 proceedings, volume 4547 of LNCS, pages 116-133. Springer, Madrid, España, June 21-22, 2007.
Also known as: Searching Optimal Toom-Cook Algorithms for Polynomials in Characteristic 2 and 0 (working title).
Download: DVI-paper (38 kB), PDF-paper (320 kB), DjVu-paper (132 kB), PDF-slides (380K), TeX-slides (7K), BibTex-entry, software.
Abstract: The use of Toom-Cook sub-quadratic
polynomial multiplication was recently shown to be possible also when
the coefficient field does not have elements enough, particularly for
𝔽2[x]. This paper focus on how Toom's
strategies can be adapted to polynomials on non-binary small
fields. In particular we describe the Toom-3 algorithm for
𝔽p[x] with p∈{3, 5, 7}, and
some other unbalanced algorithms.
Algorithms are described with full details, not only the asymptotic
complexity is given, but the exact sequence of operations for possibly
optimal implementations.
Algorithms given here for
𝔽p[x] perfectly
works, as well, for
𝔽pn[x],
and may be useful for computations inside
𝔽pn for large enough
n.
Moreover some connections with FFT-based multiplication are found,
slightly improving the worst cases of FFT.
Published: Preprint Centro Interdipartimentale "Vito Volterra" - Università di Roma "Tor Vergata" (2007)
Download: DVI-paper (32 kB), PDF-paper (307 kB), DjVu-paper (104 kB), BibTex-entry, software.
Abstract:
In this paper the problem of univariate polynomial evaluation
is considered. When both polynomial coefficients and
the evaluation point
are integers, unbalanced multiplications (one
factor having many more digits than the other one) in classical
Ruffini-Horner rule do not let computations completely benefit of
subquadratic methods, like Karatsuba, Toom-Cook and
Schönhage-Strassen's.
We face this problem by applying an approach originally proposed by
Estrin to augment parallelism exploitation in computation. We show
that it is also effective in the sequential case, whenever data dimensions
grow, e.g. in the long integer case.
We add some adjustments to Estrin's proposal obtaining a smoother
behaviour around corner cases, and to avoid performance degradation when
most of the coefficients are zero.
This way, a new general algorithm is obtained, improving both
theoretical complexity and actual performance. The algorithm itself is
very simple, and its use can be usefully extended to evaluation of
polynomials on rationals or on polynomials (polynomial composition).
Some tests, results and comparisons obtained with PARI/GP are also presented,
for both dense and ``sparse'' polynomials.
Published: Long Integers and Polynomial Evaluation with Estrin's Scheme in Proceedings of the 13th SYNASC Symposium, Timişoara, Romania, September 26-29, 2011
Download: PDF-paper (309 kB), DjVu-paper (- kB), PDF-slides (610kB), BibTex-entry, software.
Abstract: Karatsuba and Toom-Cook are well-known methods used to multiply efficiently two long integers. There have been different proposal about the interpolating values used to determine the matrix to be inverted and the sequence of operations to invert it. A definitive word about which is the optimal matrix (values) and the (number of) basic operations to invert it seems still not to have been said. In this paper we present some particular examples of useful matrices and a method to generate automatically, by means of optimised exhaustive searches on a graph, the best sequence of basic operations to invert them.
Published: Preprint N.605 Centro Interdipartimentale "Vito Volterra" - Università di Roma "Tor Vergata" (2006)
Published (full revised version): Integer and Polynomial Multiplication: Towards Optimal Toom-Cook Matrices in Proceedings of the ISSAC 2007 conference, Ontario, Canada, July 29-August 1, 2007, ACM press.
Download: PDF-preprint (263 kB), DjVu-preprint (104 kB), PDF-paper (287 kB), PDF-slides (475kB), BibTex-entry, software.
Abstract: In Gröbner bases computation, as in other algorithms in commutative algebra, a general open question is how to guide the calculations coping with numerical coefficients and/or not exact input data. It often happens that, due to error accumulation and/or insufficient working precision, the obtained result is not one expects from a theoretical derivation. The resulting basis may have more or less polynomials, a different number of solution, roots with different multiplicity, another Hilbert function, and so on. Augmenting precision we may overcome algorithmic errors, but one does not know in advance how much this precision should be, and a trial-and-error approach is often the only way to follow. Coping with initial errors is an even more difficult task. In this experimental work we propose the combined use of syzygies and interval arithmetic to decide what to do at each critical point of the algorithm.
Published: Proceedings workshop CASC 2006. V.G. Ganzha, E.W. Mayr, E.V. Vorozhtsov ed. Springer-Verlag LNCS 4194. Chişinau, Moldova 2006, pp. 64-76, ISBN: 3-540-45182-X
Download: PDF-paper (250 kB), DjVu-paper (87 kB), PDF-slides (538K), BibTex-entry,
software.
Abstract: In Gröbner bases computation a general open question is how to guide calculations coping with numerical coefficients and/or not exact input data. It may happen that, due to error accumulation or insufficient working precision, the result is not one theoretically expects. The basis may have more or less polynomials, a different number of solutions, a zero set with wrong multiplicity, and so on. Augmenting precision we may overcome algorithmic errors, but we don't know in advance how much it should be, and a trial-and-error approach is often the only way. Coping with initial errors is an even more difficult task. In this work the combined use of syzygies and interval arithmetic is proposed as a technique to decide at each critical point of the algorithm what to do.
Published: (reduced version) Proceedings of the 6th SYNASC Symposium. D. Petcu, D. Zaharie, V. Negru, T. Jebelean ed. MIRTON, Timişoara, Romania 2004, pp. 77 - 89, ISBN 973-661-441-7
Published: (full version) Analele Universitatii din Timisoara, Vol. XLII, Fasc.special 2, T. Jebelean, V. Negru, A. Popovici ed., Timişoara, Romania 2004, pp. 13 - 30, ISSN 1224-970X
Download: PDF-paper, BibTex-entry, software.
Abstract: We improve our proposal of a new variant of the McEliece cryptosystem based on QC-LDPC codes. The original McEliece cryptosystem, based on Goppa codes, is still unbroken up to now, but has two major drawbacks: long key and low transmission rate. Our variant is based on QC-LDPC codes and is able to overcome such drawbacks, while avoiding the known attacks. Recently, however, a new attack has been discovered that can recover the private key with limited complexity. We show that such attack can be avoided by changing the form of some constituent matrices, without altering the remaining system parameters. We also propose another variant that exhibits an overall increased security level. We analyse the complexity of the encryption and decryption stages by adopting efficient algorithms for processing large circulant matrices. The Toom-Cook algorithm and the short Winograd convolution are considered, that give a significant speed-up in the cryptosystem operations.
Published: in Proceedings of the Sixth Conference SCN ,volume 5229 of LNCS, pages 246-262. Springer, Amalfi, Italy, September 10-12, 2008.
Download: DVI-paper (34 kB), PDF-paper (251 kB), DjVu-paper (111 kB), PDF-slides (523K), BibTex-entry, software.
Ask for a copy: waiting for expiration of publisher restrictions, ask for a copy of the paper to one of the authors: Marco Bodrato <>.