<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- This manual describes how to install and use the GNU multiple precision
arithmetic library, version 6.1.0.

Copyright 1991, 1993-2015 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being "A GNU Manual", and with the Back-Cover
Texts being "You have freedom to copy and modify this GNU Manual, like GNU
software". A copy of the license is included in
GNU Free Documentation License. -->
<!-- Created by GNU Texinfo 6.4, http://www.gnu.org/software/texinfo/ -->
<head>
<title>Binary GCD (GNU MP 6.1.0)</title>

<meta name="description" content="How to install and use the GNU multiple precision arithmetic library, version 6.1.0.">
<meta name="keywords" content="Binary GCD (GNU MP 6.1.0)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link href="index.html#Top" rel="start" title="Top">
<link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
<link href="Greatest-Common-Divisor-Algorithms.html#Greatest-Common-Divisor-Algorithms" rel="up" title="Greatest Common Divisor Algorithms">
<link href="Lehmer_0027s-Algorithm.html#Lehmer_0027s-Algorithm" rel="next" title="Lehmer's Algorithm">
<link href="Greatest-Common-Divisor-Algorithms.html#Greatest-Common-Divisor-Algorithms" rel="prev" title="Greatest Common Divisor Algorithms">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en">
<a name="Binary-GCD"></a>
<div class="header">
<p>
Next: <a href="Lehmer_0027s-Algorithm.html#Lehmer_0027s-Algorithm" accesskey="n" rel="next">Lehmer's Algorithm</a>, Previous: <a href="Greatest-Common-Divisor-Algorithms.html#Greatest-Common-Divisor-Algorithms" accesskey="p" rel="prev">Greatest Common Divisor Algorithms</a>, Up: <a href="Greatest-Common-Divisor-Algorithms.html#Greatest-Common-Divisor-Algorithms" accesskey="u" rel="up">Greatest Common Divisor Algorithms</a> [<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Binary-GCD-1"></a>
<h4 class="subsection">15.3.1 Binary GCD</h4>

<p>At small sizes GMP uses an <em>O(N^2)</em> binary style GCD. This is described
in many textbooks, for example Knuth section 4.5.2 algorithm B. It simply
consists of successively reducing odd operands <em>a</em> and <em>b</em> using
</p>
<blockquote>
<p><em>a,b = abs(a-b),min(a,b)</em> <br>
strip factors of 2 from <em>a</em>
</p></blockquote>
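<p>The reduction above can be sketched for single-word operands as follows.
This is a minimal illustration only, not GMP's actual code: the function name
<code>binary_gcd_odd</code> and the use of <code>uint64_t</code> are assumptions
made for the example.
</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-word binary GCD, for illustration only.
   Both operands must be odd (and therefore nonzero). */
static uint64_t binary_gcd_odd(uint64_t a, uint64_t b)
{
    assert((a & 1) && (b & 1));
    while (a != b) {
        /* a,b = abs(a-b), min(a,b): the difference of two odd
           numbers is even and, since a != b, nonzero. */
        uint64_t diff = (a > b) ? a - b : b - a;
        b = (a < b) ? a : b;
        a = diff;
        /* Strip factors of 2 from a; b is odd, so none of them
           can belong to the GCD. */
        while ((a & 1) == 0)
            a >>= 1;
    }
    return a;
}
```

<p>For instance, under these assumptions <code>binary_gcd_odd(21, 35)</code>
reduces 21,35 to 7,21 and then to 7,7.
</p>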

<p>The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly
computes the quotient <em>q = floor(a/b)</em> and replaces
<em>a,b</em> by <em>b, a - q b</em>. The binary algorithm has so far been found to
be faster than the Euclidean algorithm everywhere. One reason the binary
method does well is that the implied quotient at each step is usually small,
so often only one or two subtractions are needed to get the same effect as a
division. Quotients 1, 2 and 3 for example occur about 67.8% of the time, see Knuth
section 4.5.3 Theorem E.
</p>
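<p>As a rough check on the figure quoted above, and assuming the quotients
approximately follow the Gauss-Kuzmin distribution of continued fraction
partial quotients (an assumption stated here for illustration; Knuth gives the
precise statement), the combined probability of quotients 1, 2 and 3 works
out as
</p>

```latex
\Pr[q = k] \approx \log_2 \frac{(k+1)^2}{(k+1)^2 - 1},
\qquad
\sum_{k=1}^{3} \Pr[q = k]
  \approx \log_2\!\left(\frac{4}{3}\cdot\frac{9}{8}\cdot\frac{16}{15}\right)
  = \log_2 1.6 \approx 0.678 .
```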
<p>When the implied quotient is large, meaning <em>b</em> is much smaller than
<em>a</em>, then a division is worthwhile. This is the basis for the initial
<em>a mod b</em> reductions in <code>mpn_gcd</code> and <code>mpn_gcd_1</code> (the latter
for both Nx1 and 1x1 cases). But after that initial reduction,
big quotients occur too rarely to make it worth checking for them.
</p>
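<p>The idea of one initial division followed by subtract-and-shift steps might
look like the sketch below for single-word operands. It is not the code of
<code>mpn_gcd_1</code>; the names <code>strip_twos</code> and
<code>gcd_reduce_then_binary</code> are hypothetical, and <em>b</em> is assumed
to be odd.
</p>

```c
#include <assert.h>
#include <stdint.h>

/* Remove all factors of 2 from a nonzero word (hypothetical helper). */
static uint64_t strip_twos(uint64_t x)
{
    assert(x != 0);
    while ((x & 1) == 0)
        x >>= 1;
    return x;
}

/* GCD of a and an odd b, where a may be far larger than b. */
static uint64_t gcd_reduce_then_binary(uint64_t a, uint64_t b)
{
    assert(b & 1);
    a %= b;                 /* one division absorbs a large implied quotient */
    if (a == 0)
        return b;
    a = strip_twos(a);      /* b is odd, so twos in a are not in the GCD */
    while (a != b) {        /* from here on, subtraction and shifts only */
        if (a > b) {
            a -= b;
        } else {
            uint64_t t = b - a;
            b = a;
            a = t;
        }
        a = strip_twos(a);
    }
    return a;
}
```

<p>After the single <code>%</code> the operands are of similar size, so the
loop never checks for a large quotient again.
</p>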
<br>
<p>The final <em>1x1</em> GCD in <code>mpn_gcd_1</code> is done in the generic C
code as described above. For two N-bit operands, the algorithm takes about
0.68 iterations per bit. For optimum performance some attention needs to be
paid to the way the factors of 2 are stripped from <em>a</em>.
</p>
<p>Firstly it may be noted that in two's complement the number of low zero bits on
<em>a-b</em> is the same as on <em>b-a</em>, so counting or testing can begin on
<em>a-b</em> without waiting for <em>abs(a-b)</em> to be determined.
</p>
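<p>A small sketch of that observation, assuming 64-bit unsigned words whose
subtraction wraps modulo 2^64. The helpers <code>low_zeros</code>,
<code>gcd_step</code> and <code>gcd_odd</code> are hypothetical names, and the
portable counting loop stands in for whatever bit-counting method a real
implementation would use.
</p>

```c
#include <assert.h>
#include <stdint.h>

/* Count the low zero bits of a nonzero word (portable loop). */
static int low_zeros(uint64_t x)
{
    int n = 0;
    assert(x != 0);
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* One reduction step on odd operands with *a != *b. */
static void gcd_step(uint64_t *a, uint64_t *b)
{
    uint64_t d = *a - *b;           /* wraps modulo 2^64 when *a < *b */
    int shift = low_zeros(d);       /* same count as for *b - *a */
    /* abs(a-b) is selected separately; the count above did not
       have to wait for it. */
    uint64_t mag = (*a > *b) ? d : (uint64_t)0 - d;
    if (*a < *b)
        *b = *a;                    /* b = min(a,b) */
    *a = mag >> shift;              /* strip factors of 2 */
}

/* Repeat the step until the odd operands meet. */
static uint64_t gcd_odd(uint64_t a, uint64_t b)
{
    assert((a & 1) && (b & 1));
    while (a != b)
        gcd_step(&a, &b);
    return a;
}
```

<p>In this sketch the zero count depends only on the wrapped difference, so it
can be computed alongside the comparison that picks <em>abs(a-b)</em>.
</p>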
<p>A loop stripping low zero bits tends not to branch predict well, since the
condition is data dependent. But on average there are only a few low zeros, so
an option is to strip one or two bits arithmetically then loop for more (as
done for AMD K6). Or use a lookup table to get a count for several bits then
loop for more (as done for AMD K7). An alternative approach is to keep just
one of <em>a</em> or <em>b</em> odd and iterate
</p>
<blockquote>
<p><em>a,b = abs(a-b), min(a,b)</em> <br>
<em>a = a/2</em> if even <br>
<em>b = b/2</em> if even
</p></blockquote>
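<p>The alternative iteration can be sketched as below, again for single words
and under stated assumptions: both operands are nonzero, at least one is odd,
and the name <code>gcd_one_bit_per_step</code> is hypothetical. Whether a
compiler turns the conditional expressions into branch-free instructions is
not guaranteed.
</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-bit-per-step binary GCD.
   Requires a != 0, b != 0 and at least one of them odd. */
static uint64_t gcd_one_bit_per_step(uint64_t a, uint64_t b)
{
    assert(a != 0 && b != 0 && ((a | b) & 1));
    while (a != 0) {
        uint64_t diff = (a > b) ? a - b : b - a;   /* abs(a-b) */
        uint64_t lo = (a < b) ? a : b;             /* min(a,b) */
        /* Halve each value only if it is even: the shift count is
           1 for an even word and 0 for an odd one, so no loop is
           needed. At least one operand stays odd throughout. */
        a = diff >> ((diff & 1) ^ 1);
        b = lo >> ((lo & 1) ^ 1);
    }
    return b;   /* a reached 0, so the GCD is left in b */
}
```

<p>For example, starting from 12,9 this sketch passes through 3,9 and 3,3
before <em>a</em> reaches zero with 3 left in <em>b</em>.
</p>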

<p>This requires about 1.25 iterations per bit, but stripping of a single bit at
each step avoids any branching. Repeating the bit strip reduces to about 0.9
iterations per bit, which may be a worthwhile tradeoff.
</p>
<p>Generally with the above approaches a speed of perhaps 6 cycles per bit can be
achieved, which is still not terribly fast, with for instance a 64-bit GCD
taking nearly 400 cycles. It&rsquo;s this sort of time that means it&rsquo;s not usually
advantageous to combine a set of divisibility tests into a GCD.
</p>
<p>Currently, the binary algorithm is used for GCD only when <em>N &lt; 3</em>.
</p>

</body>
</html>