<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- This manual describes how to install and use the GNU multiple precision
arithmetic library, version 6.1.0.
Copyright 1991, 1993-2015 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being "A GNU Manual", and with the Back-Cover
Texts being "You have freedom to copy and modify this GNU Manual, like GNU
software". A copy of the license is included in
GNU Free Documentation License. -->
<!-- Created by GNU Texinfo 6.4, http://www.gnu.org/software/texinfo/ -->
<head>
<title>Assembly Basics (GNU MP 6.1.0)</title>
<meta name="description" content="How to install and use the GNU multiple precision arithmetic library, version 6.1.0.">
<meta name="keywords" content="Assembly Basics (GNU MP 6.1.0)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link href="index.html#Top" rel="start" title="Top">
<link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
<link href="Assembly-Coding.html#Assembly-Coding" rel="up" title="Assembly Coding">
<link href="Assembly-Carry-Propagation.html#Assembly-Carry-Propagation" rel="next" title="Assembly Carry Propagation">
<link href="Assembly-Code-Organisation.html#Assembly-Code-Organisation" rel="prev" title="Assembly Code Organisation">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en">
<a name="Assembly-Basics"></a>
<div class="header">
<p>
Next: <a href="Assembly-Carry-Propagation.html#Assembly-Carry-Propagation" accesskey="n" rel="next">Assembly Carry Propagation</a>, Previous: <a href="Assembly-Code-Organisation.html#Assembly-Code-Organisation" accesskey="p" rel="prev">Assembly Code Organisation</a>, Up: <a href="Assembly-Coding.html#Assembly-Coding" accesskey="u" rel="up">Assembly Coding</a> &nbsp; [<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Assembly-Basics-1"></a>
<h4 class="subsection">15.8.2 Assembly Basics</h4>
<p><code>mpn_addmul_1</code> and <code>mpn_submul_1</code> are the most important routines
for overall GMP performance. All multiplications and divisions come down to
repeated calls to these. <code>mpn_add_n</code>, <code>mpn_sub_n</code>,
<code>mpn_lshift</code> and <code>mpn_rshift</code> are the next most important.
</p>
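<p>As a rough illustration of what these two routines compute, the following
portable C sketch adds or subtracts a multiple of a limb vector in place and
returns the carry or borrow limb. The names <code>limb_t</code>,
<code>ref_addmul_1</code> and <code>ref_submul_1</code> and the fixed 32-bit limb
size are assumptions made for this example only; they are not GMP&rsquo;s actual
types or code, which normally use the native word size and assembly loops.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type, chosen so a 64-bit integer can hold a
   full limb-by-limb product.  Not GMP's mp_limb_t. */
typedef uint32_t limb_t;

/* Sketch of the addmul_1 contract: {rp,n} += {up,n} * v.
   Returns the carry-out limb. */
limb_t ref_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    assert(n >= 1);
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Cannot overflow: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;          /* low half is the result limb */
        carry = (limb_t)(t >> 32);  /* high half feeds the next limb */
    }
    return carry;
}

/* Sketch of the submul_1 contract: {rp,n} -= {up,n} * v.
   Returns the borrow-out limb. */
limb_t ref_submul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    assert(n >= 1);
    limb_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        /* Cannot overflow: (2^32-1)^2 + (2^32-1) < 2^64. */
        uint64_t p = (uint64_t)up[i] * v + borrow;
        limb_t lo = (limb_t)p;
        limb_t hi = (limb_t)(p >> 32);
        limb_t r = rp[i];
        rp[i] = r - lo;             /* wraps modulo 2^32 */
        borrow = hi + (r < lo);     /* hi is at most 2^32-1 only when lo is 0 */
    }
    return borrow;
}
```

<p>Each iteration performs one multiply and a dependent chain of additions,
which is why the speed of this loop tends to dominate the cost of larger
operations built on top of it.
</p>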
<p>On some CPUs, assembly versions of the internal functions
<code>mpn_mul_basecase</code> and <code>mpn_sqr_basecase</code> give significant speedups,
mainly by avoiding function call overhead. They can also potentially
make better use of a wide superscalar processor, as can bigger primitives like
<code>mpn_addmul_2</code> or <code>mpn_addmul_4</code>.
</p>
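<p>To see where the function call overhead comes from, here is a minimal C
sketch of a schoolbook (basecase) multiplication written as one
addmul-style call per limb of the second operand. The names
<code>limb_t</code>, <code>sketch_addmul_1</code> and
<code>sketch_mul_basecase</code> and the 32-bit limb size are hypothetical and
chosen for illustration; this is not GMP&rsquo;s implementation.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* hypothetical limb type for illustration */

/* {rp,n} += {up,n} * v, returning the carry-out limb. */
limb_t sketch_addmul_1(limb_t *rp, const limb_t *up, size_t n, limb_t v)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)up[i] * v + rp[i] + carry;
        rp[i] = (limb_t)t;
        carry = (limb_t)(t >> 32);
    }
    return carry;
}

/* {rp,un+vn} = {up,un} * {vp,vn}.
   Assumes rp does not overlap either source. */
void sketch_mul_basecase(limb_t *rp, const limb_t *up, size_t un,
                         const limb_t *vp, size_t vn)
{
    assert(un >= 1 && vn >= 1);

    /* First row: clear the low un limbs, then accumulate up * vp[0]. */
    for (size_t i = 0; i < un; i++)
        rp[i] = 0;
    rp[un] = sketch_addmul_1(rp, up, un, vp[0]);

    /* Remaining rows: one call per limb of vp, each shifted up by one
       limb.  The carry-out becomes the new top limb of the result. */
    for (size_t j = 1; j < vn; j++)
        rp[un + j] = sketch_addmul_1(rp + j, up, un, vp[j]);
}
```

<p>The outer loop makes <code>vn</code> separate calls. A dedicated assembly
basecase routine can fold both loops into one function, and a primitive that
handles two or four limbs of the multiplier per pass makes fewer passes over
the destination.
</p>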
<p>The restrictions on overlaps between sources and destinations
(see <a href="Low_002dlevel-Functions.html#Low_002dlevel-Functions">Low-level Functions</a>) are designed to facilitate a variety of
implementations. For example, knowing <code>mpn_add_n</code> won&rsquo;t have partly
overlapping sources and destination means reading can be done far ahead of
writing on superscalar processors, and loops can be vectorized on a vector
processor, depending on the carry handling.
</p>
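<p>The read-ahead idea can be sketched in portable C as an addition loop
unrolled by two, in which both limbs of each source are loaded before either
result limb is stored. The names <code>limb_t</code> and
<code>sketch_add_n</code> and the 32-bit limb size are assumptions for this
example, not GMP&rsquo;s code. The sketch relies on the destination being either
identical to a source or entirely separate from it.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t limb_t;  /* hypothetical limb type for illustration */

/* {rp,n} = {up,n} + {vp,n}, returning the carry-out (0 or 1).
   Assumes rp is identical to, or fully separate from, each source. */
limb_t sketch_add_n(limb_t *rp, const limb_t *up, const limb_t *vp, size_t n)
{
    assert(n >= 1);
    limb_t carry = 0;
    size_t i = 0;

    for (; i + 2 <= n; i += 2) {
        /* Load two limbs from each source before any store. */
        limb_t u0 = up[i], u1 = up[i + 1];
        limb_t v0 = vp[i], v1 = vp[i + 1];

        uint64_t s0 = (uint64_t)u0 + v0 + carry;
        uint64_t s1 = (uint64_t)u1 + v1 + (limb_t)(s0 >> 32);

        rp[i] = (limb_t)s0;
        rp[i + 1] = (limb_t)s1;
        carry = (limb_t)(s1 >> 32);
    }

    /* Odd limb left over when n is not a multiple of two. */
    if (i < n) {
        uint64_t s = (uint64_t)up[i] + vp[i] + carry;
        rp[i] = (limb_t)s;
        carry = (limb_t)(s >> 32);
    }
    return carry;
}
```

<p>If the destination partly overlapped a source, for instance
<code>rp == up + 1</code>, the early load of <code>up[i + 1]</code> could see a
different value than a limb-at-a-time loop would, so the two implementations
would disagree. Ruling out such overlaps lets either be used.
</p>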
</body>
</html>