<html lang="en">
<head>
<title>Looping Patterns - GNU Compiler Collection (GCC) Internals</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="GNU Compiler Collection (GCC) Internals">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="Machine-Desc.html#Machine-Desc" title="Machine Desc">
<link rel="prev" href="Jump-Patterns.html#Jump-Patterns" title="Jump Patterns">
<link rel="next" href="Insn-Canonicalizations.html#Insn-Canonicalizations" title="Insn Canonicalizations">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
Copyright (C) 1988-2015 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below).  A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

     A GNU Manual

(b) The FSF's Back-Cover Text is:

     You have freedom to copy and modify this GNU Manual, like GNU
     software.  Copies published by the Free Software Foundation raise
     funds for GNU development.-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format  { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat  { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp    { font-size:smaller }
  span.sc    { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
</head>
<body>
<div class="node">
<a name="Looping-Patterns"></a>
<p>
Next: <a rel="next" accesskey="n" href="Insn-Canonicalizations.html#Insn-Canonicalizations">Insn Canonicalizations</a>,
Previous: <a rel="previous" accesskey="p" href="Jump-Patterns.html#Jump-Patterns">Jump Patterns</a>,
Up: <a rel="up" accesskey="u" href="Machine-Desc.html#Machine-Desc">Machine Desc</a>
<hr>
</div>

<h3 class="section">16.13 Defining Looping Instruction Patterns</h3>

<p><a name="index-looping-instruction-patterns-3677"></a><a name="index-defining-looping-instruction-patterns-3678"></a>
Some machines have special jump instructions that can be used to
make loops more efficient.  A common example is the 68000 ‘<samp><span class="samp">dbra</span></samp>’
instruction, which decrements a register and branches if the
result is greater than or equal to zero.  Other machines, in particular digital
signal processors (DSPs), have special block repeat instructions to
provide low-overhead loop support.  For example, the TI TMS320C3x/C4x
DSPs have a block repeat instruction that loads special registers to
mark the top and end of a loop and to count the number of loop
iterations.  This avoids the need to fetch and execute a
‘<samp><span class="samp">dbra</span></samp>’-like instruction and avoids the pipeline stalls associated with
the jump.

<p>GCC has three special named patterns to support low-overhead looping:
‘<samp><span class="samp">decrement_and_branch_until_zero</span></samp>’, ‘<samp><span class="samp">doloop_begin</span></samp>’,
and ‘<samp><span class="samp">doloop_end</span></samp>’.  The first pattern,
‘<samp><span class="samp">decrement_and_branch_until_zero</span></samp>’, is not emitted during RTL
generation but may be emitted during the instruction combination phase.
This requires the assistance of the loop optimizer, which uses information
collected during strength reduction to reverse a loop so that it counts down to
zero.  Some targets also require the loop optimizer to add a
<code>REG_NONNEG</code> note to indicate that the iteration count is always
positive.  This is needed if the target performs a signed loop
termination test.  For example, the 68000 uses a pattern similar to the
following for its <code>dbra</code> instruction:

<pre class="smallexample">     (define_insn "decrement_and_branch_until_zero"
       [(set (pc)
             (if_then_else
               (ge (plus:SI (match_operand:SI 0 "general_operand" "+d*am")
                            (const_int -1))
                   (const_int 0))
               (label_ref (match_operand 1 "" ""))
               (pc)))
        (set (match_dup 0)
             (plus:SI (match_dup 0)
                      (const_int -1)))]
       "find_reg_note (insn, REG_NONNEG, 0)"
       "...")
</pre>
<p>Note that because the insn is a jump insn that also has an output, it must
deal with its own reloads, hence the ‘<samp><span class="samp">m</span></samp>’ constraints.  Also,
because this insn is generated by the instruction combination phase
combining two sequential insns together into an implicit parallel insn,
the iteration counter in the comparison needs to be biased by the same amount as the
decrement operation, in this case −1.  The following similar
pattern, which compares the unbiased counter against 1, will not be
matched by the combiner.

<pre class="smallexample">     (define_insn "decrement_and_branch_until_zero"
       [(set (pc)
             (if_then_else
               (ge (match_operand:SI 0 "general_operand" "+d*am")
                   (const_int 1))
               (label_ref (match_operand 1 "" ""))
               (pc)))
        (set (match_dup 0)
             (plus:SI (match_dup 0)
                      (const_int -1)))]
       "find_reg_note (insn, REG_NONNEG, 0)"
       "...")
</pre>
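
<p>The bias follows from the way the combiner builds the insn.  As a
simplified sketch only (the pseudo register number 100 and the label
number 23 are made up for illustration, and a real target may also have a
separate compare insn), a reversed loop might end with roughly these two
insns before combination:

<pre class="smallexample">     ;; Hypothetical RTL before combination; register 100 and
     ;; label 23 are illustrative only.
     (set (reg:SI 100)
          (plus:SI (reg:SI 100)
                   (const_int -1)))
     (set (pc)
          (if_then_else
            (ge (reg:SI 100)
                (const_int 0))
            (label_ref 23)
            (pc)))
</pre>
<p>Substituting the source of the first insn for the register tested by the
second gives the comparison
<code>(ge (plus:SI (reg:SI 100) (const_int -1)) (const_int 0))</code>,
which has the shape of the first pattern above and not that of the
comparison against <code>(const_int 1)</code>.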

<p>The other two special looping patterns, ‘<samp><span class="samp">doloop_begin</span></samp>’ and
‘<samp><span class="samp">doloop_end</span></samp>’, are emitted by the loop optimizer, using
information collected during strength reduction, for certain
well-behaved loops with a finite number of loop iterations.

<p>The ‘<samp><span class="samp">doloop_end</span></samp>’ pattern describes the actual looping instruction
(or the implicit looping operation) and the ‘<samp><span class="samp">doloop_begin</span></samp>’ pattern
is an optional companion pattern that can be used for initialization
needed for some low-overhead looping instructions.
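
<p>As a rough sketch only, a ‘<samp><span class="samp">doloop_end</span></samp>’ pattern for a
hypothetical target might look like the following.  It assumes that
operand 0 is the iteration counter register and operand 1 is the label at
the top of the loop; the <code>register_operand</code> predicate, the
‘<samp><span class="samp">r</span></samp>’ constraint and the comparison against 1 are illustrative
choices and are not taken from any real port:

<pre class="smallexample">     ;; Hypothetical: decrement the counter and branch back while
     ;; iterations remain.  Being a jump insn with an output, a real
     ;; pattern must also deal with its own reloads of operand 0.
     (define_insn "doloop_end"
       [(set (pc)
             (if_then_else
               (ne (match_operand:SI 0 "register_operand" "+r")
                   (const_int 1))
               (label_ref (match_operand 1 "" ""))
               (pc)))
        (set (match_dup 0)
             (plus:SI (match_dup 0)
                      (const_int -1)))]
       ""
       "...")
</pre>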

<p>Note that some machines require the actual looping instruction to be
emitted at the top of the loop (e.g., the TMS320C3x/C4x DSPs).  Emitting
the true RTL for a looping instruction at the top of the loop can cause
problems with flow analysis.  So instead, a dummy <code>doloop</code> insn is
emitted at the end of the loop.  The machine-dependent reorg pass checks
for the presence of this <code>doloop</code> insn and then searches back to
the top of the loop, where it inserts the true looping insn (provided
there are no instructions in the loop that would cause problems).  Any
additional labels can be emitted at this point.  In addition, if the
desired special iteration counter register was not allocated, this
machine-dependent reorg pass could emit a traditional compare and jump
instruction pair.

<p>The essential difference between the
‘<samp><span class="samp">decrement_and_branch_until_zero</span></samp>’ and the ‘<samp><span class="samp">doloop_end</span></samp>’
patterns is that the loop optimizer allocates an additional pseudo
register for the latter as an iteration counter.  This pseudo register
cannot be used within the loop (i.e., general induction variables cannot
be derived from it); however, in many cases the loop induction variable
may become redundant and be removed by the flow pass.

</body></html>