You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

121 lines
5.9 KiB
HTML

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- This manual describes how to install and use the GNU multiple precision
arithmetic library, version 6.1.0.
Copyright 1991, 1993-2015 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document under
the terms of the GNU Free Documentation License, Version 1.3 or any later
version published by the Free Software Foundation; with no Invariant Sections,
with the Front-Cover Texts being "A GNU Manual", and with the Back-Cover
Texts being "You have freedom to copy and modify this GNU Manual, like GNU
software". A copy of the license is included in
GNU Free Documentation License. -->
<!-- Created by GNU Texinfo 6.4, http://www.gnu.org/software/texinfo/ -->
<head>
<title>Assembly Cache Handling (GNU MP 6.1.0)</title>
<meta name="description" content="How to install and use the GNU multiple precision arithmetic library, version 6.1.0.">
<meta name="keywords" content="Assembly Cache Handling (GNU MP 6.1.0)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link href="index.html#Top" rel="start" title="Top">
<link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
<link href="Assembly-Coding.html#Assembly-Coding" rel="up" title="Assembly Coding">
<link href="Assembly-Functional-Units.html#Assembly-Functional-Units" rel="next" title="Assembly Functional Units">
<link href="Assembly-Carry-Propagation.html#Assembly-Carry-Propagation" rel="prev" title="Assembly Carry Propagation">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en">
<a name="Assembly-Cache-Handling"></a>
<div class="header">
<p>
Next: <a href="Assembly-Functional-Units.html#Assembly-Functional-Units" accesskey="n" rel="next">Assembly Functional Units</a>, Previous: <a href="Assembly-Carry-Propagation.html#Assembly-Carry-Propagation" accesskey="p" rel="prev">Assembly Carry Propagation</a>, Up: <a href="Assembly-Coding.html#Assembly-Coding" accesskey="u" rel="up">Assembly Coding</a> &nbsp; [<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Cache-Handling"></a>
<h4 class="subsection">15.8.4 Cache Handling</h4>
<a name="index-Assembly-cache-handling"></a>
<p>GMP aims to perform well both on operands that fit entirely in L1 cache and
those which don&rsquo;t.
</p>
<p>Basic routines like <code>mpn_add_n</code> or <code>mpn_lshift</code> are often used on
large operands, so L2 and main memory performance is important for them.
<code>mpn_mul_1</code> and <code>mpn_addmul_1</code> are mostly used for multiply and
square basecases, so L1 performance matters most for them, unless assembly
versions of <code>mpn_mul_basecase</code> and <code>mpn_sqr_basecase</code> exist, in
which case the remaining uses are mostly for larger operands.
</p>
<p>For L2 or main memory operands, memory access times will almost certainly be
more than the calculation time. The aim therefore is to maximize memory
throughput, by starting a load of the next cache line while processing the
contents of the previous one. Clearly this is only possible if the chip has a
lock-up free cache or some sort of prefetch instruction. Most current chips
have both these features.
</p>
<p>Prefetching sources combines well with loop unrolling, since a prefetch can be
initiated once per unrolled loop (or more than once if the loop covers more
than one cache line).
</p>
<p>On CPUs without write-allocate caches, prefetching destinations will ensure
individual stores don&rsquo;t go further down the cache hierarchy, limiting
bandwidth. Of course for calculations which are slow anyway, like
<code>mpn_divrem_1</code>, write-throughs might be fine.
</p>
<p>The distance ahead to prefetch will be determined by memory latency versus
throughput. The aim of course is to have data arriving continuously, at peak
throughput. Some CPUs have limits on the number of fetches or prefetches in
progress.
</p>
<p>If a special prefetch instruction doesn&rsquo;t exist then a plain load can be used,
but in that case care must be taken not to attempt to read past the end of an
operand, since that might produce a segmentation violation.
</p>
<p>Some CPUs or systems have hardware that detects sequential memory accesses and
initiates suitable cache movements automatically, making life easy.
</p>
<hr>
<div class="header">
<p>
Next: <a href="Assembly-Functional-Units.html#Assembly-Functional-Units" accesskey="n" rel="next">Assembly Functional Units</a>, Previous: <a href="Assembly-Carry-Propagation.html#Assembly-Carry-Propagation" accesskey="p" rel="prev">Assembly Carry Propagation</a>, Up: <a href="Assembly-Coding.html#Assembly-Coding" accesskey="u" rel="up">Assembly Coding</a> &nbsp; [<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>