<html lang="en">
<head>
<title>IPA - GNU Compiler Collection (GCC) Internals</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="GNU Compiler Collection (GCC) Internals">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="LTO.html#LTO" title="LTO">
<link rel="prev" href="LTO-object-file-layout.html#LTO-object-file-layout" title="LTO object file layout">
<link rel="next" href="WHOPR.html#WHOPR" title="WHOPR">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
Copyright (C) 1988-2015 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Funding Free Software'', the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
``GNU Free Documentation License''.

(a) The FSF's Front-Cover Text is:

A GNU Manual

(b) The FSF's Back-Cover Text is:

You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development.-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
  pre.display { font-family:inherit }
  pre.format { font-family:inherit }
  pre.smalldisplay { font-family:inherit; font-size:smaller }
  pre.smallformat { font-family:inherit; font-size:smaller }
  pre.smallexample { font-size:smaller }
  pre.smalllisp { font-size:smaller }
  span.sc { font-variant:small-caps }
  span.roman { font-family:serif; font-weight:normal; }
  span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
</head>
<body>
<div class="node">
<a name="IPA"></a>
<p>
Next: <a rel="next" accesskey="n" href="WHOPR.html#WHOPR">WHOPR</a>,
Previous: <a rel="previous" accesskey="p" href="LTO-object-file-layout.html#LTO-object-file-layout">LTO object file layout</a>,
Up: <a rel="up" accesskey="u" href="LTO.html#LTO">LTO</a>
<hr>
</div>

<h3 class="section">24.3 Using summary information in IPA passes</h3>

<p>Programs are represented internally as a <em>callgraph</em> (a
multi-graph where nodes are functions and edges are call sites)
and a <em>varpool</em> (a list of static and external variables in
the program).

<p>Inter-procedural optimization is organized as a sequence of
individual passes, which operate on the callgraph and the
varpool. To make the implementation of WHOPR possible, every
inter-procedural optimization pass is split into several stages
that are executed at different times during WHOPR compilation:

<ul>
<li>LGEN time
<ol type=1 start=1>
<li><em>Generate summary</em> (<code>generate_summary</code> in
<code>struct ipa_opt_pass_d</code>). This stage analyzes every function
body and variable initializer and stores relevant
information into a pass-specific data structure.

<li><em>Write summary</em> (<code>write_summary</code> in
<code>struct ipa_opt_pass_d</code>). This stage writes all the
pass-specific information generated by <code>generate_summary</code>.
Summaries go into their own <code>LTO_section_*</code> sections that
have to be declared in <samp><span class="file">lto-streamer.h</span></samp>:<code>enum
lto_section_type</code>. A new section is created by calling
<code>create_output_block</code> and data can be written using the
<code>lto_output_*</code> routines.
</ol>

<li>WPA time
<ol type=1 start=1>
<li><em>Read summary</em> (<code>read_summary</code> in
<code>struct ipa_opt_pass_d</code>). This stage reads all the
pass-specific information in exactly the same order that it was
written by <code>write_summary</code>.

<li><em>Execute</em> (<code>execute</code> in <code>struct
opt_pass</code>). This performs inter-procedural propagation. This
must be done without actual access to the individual function
bodies or variable initializers. Typically, this results in a
transitive closure operation over the summary information of all
the nodes in the callgraph.

<li><em>Write optimization summary</em>
(<code>write_optimization_summary</code> in <code>struct
ipa_opt_pass_d</code>). This writes the result of the inter-procedural
propagation into the object file. This can use the same data
structures and helper routines used in <code>write_summary</code>.
</ol>

<li>LTRANS time
<ol type=1 start=1>
<li><em>Read optimization summary</em>
(<code>read_optimization_summary</code> in <code>struct
ipa_opt_pass_d</code>). The counterpart to
<code>write_optimization_summary</code>. This reads the inter-procedural
optimization decisions in exactly the same format emitted by
<code>write_optimization_summary</code>.

<li><em>Transform</em> (<code>function_transform</code> and
<code>variable_transform</code> in <code>struct ipa_opt_pass_d</code>).
The actual function bodies and variable initializers are updated
based on the information passed down from the <em>Execute</em> stage.
</ol>
</ul>
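
<p>The stage split above can be sketched with a toy model. The following
is a minimal, self-contained C++ sketch and not GCC code:
<code>ToyFunction</code>, <code>Callgraph</code>, <code>Summary</code> and the
free functions are hypothetical stand-ins that only borrow the stage
names, and a text string stands in for an LTO section. It illustrates that
only the <em>Generate summary</em> stage looks at function bodies, while
<em>Execute</em> works purely on summaries and callgraph edges.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical toy model; none of these types are GCC's.
struct ToyFunction {
  std::string name;
  bool body_writes_memory;           // stands in for the function body
  std::vector<std::string> callees;  // call sites, i.e. callgraph edges
};

using Callgraph = std::map<std::string, std::vector<std::string>>;
using Summary = std::map<std::string, bool>;  // function name -> flag

// LGEN, Generate summary: the only stage here that inspects "bodies".
Summary generate_summary(const std::vector<ToyFunction>& fns) {
  Summary s;
  for (const ToyFunction& f : fns) s[f.name] = f.body_writes_memory;
  return s;
}

// Write summary / Write optimization summary: one "name flag" record
// per line; the string plays the role of a section in the object file.
std::string write_summary(const Summary& s) {
  std::ostringstream out;
  for (const auto& [name, flag] : s) out << name << ' ' << flag << '\n';
  return out.str();
}

// Read summary / Read optimization summary: consume the records in
// exactly the order in which they were written.
Summary read_summary(const std::string& data) {
  Summary s;
  std::istringstream in(data);
  std::string name;
  bool flag;
  while (in >> name >> flag) s[name] = flag;
  return s;
}

// WPA, Execute: propagate along callgraph edges using summaries only.
// A function may write memory if it or any transitive callee does.
Summary execute(const Callgraph& cg, Summary s) {
  bool changed = true;
  while (changed) {  // iterate to a fixed point (transitive closure)
    changed = false;
    for (const auto& [caller, callees] : cg)
      for (const std::string& callee : callees)
        if (s[callee] && !s[caller]) {
          s[caller] = true;
          changed = true;
        }
  }
  return s;
}

// LTRANS, Transform: apply the decision to one function at a time.
std::string transform(const std::string& name, const Summary& decisions) {
  return decisions.at(name) ? name : name + " [const]";
}
```

<p>In this sketch the same serialization helpers serve both the summary
and the optimization summary, mirroring the remark above that
<code>write_optimization_summary</code> can reuse the routines used by
<code>write_summary</code>.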

<p>The implementation of the inter-procedural passes is shared
between LTO, WHOPR and classic non-LTO compilation.

<ul>
<li>During the traditional file-by-file mode every pass executes its
own <em>Generate summary</em>, <em>Execute</em>, and <em>Transform</em>
stages within the single execution context of the compiler.

<li>In LTO compilation mode, every pass uses <em>Generate
summary</em> and <em>Write summary</em> stages at compilation time,
while the <em>Read summary</em>, <em>Execute</em>, and
<em>Transform</em> stages are executed at link time.

<li>In WHOPR mode all stages are used.
</ul>
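
<p>The three modes can be summarized as a small lookup. This is only an
illustrative sketch: the <code>Mode</code> enumeration, the
<code>Schedule</code> type and <code>stages_for</code> are hypothetical
names, and the strings are labels taken from the lists above rather than
identifiers in GCC.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical names; the compiler has no such enumeration.
enum class Mode { FileByFile, Lto, Whopr };

// (when, stage) pairs in execution order; the strings are labels only.
using Schedule = std::vector<std::pair<std::string, std::string>>;

Schedule stages_for(Mode mode) {
  switch (mode) {
    case Mode::FileByFile:
      // Everything happens within one compiler invocation.
      return {{"compile", "Generate summary"},
              {"compile", "Execute"},
              {"compile", "Transform"}};
    case Mode::Lto:
      // Summaries are written at compile time and consumed at link time.
      return {{"compile", "Generate summary"},
              {"compile", "Write summary"},
              {"link", "Read summary"},
              {"link", "Execute"},
              {"link", "Transform"}};
    case Mode::Whopr:
      // All stages are used, spread over LGEN, WPA and LTRANS.
      return {{"LGEN", "Generate summary"},
              {"LGEN", "Write summary"},
              {"WPA", "Read summary"},
              {"WPA", "Execute"},
              {"WPA", "Write optimization summary"},
              {"LTRANS", "Read optimization summary"},
              {"LTRANS", "Transform"}};
  }
  return {};
}
```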

<p>To simplify development, the GCC pass manager differentiates
between normal inter-procedural passes and small inter-procedural
passes. A <em>small inter-procedural pass</em>
(<code>SIMPLE_IPA_PASS</code>) is a pass that does
everything at once and thus cannot be executed during WPA in
WHOPR mode. It defines only the <em>Execute</em> stage and during
this stage it accesses and modifies the function bodies. Such
passes are useful for optimization at LGEN or LTRANS time and are
used, for example, to implement early optimization before writing
object files. Small inter-procedural passes can also be used
for easier prototyping and development of a new inter-procedural
pass.

<h4 class="subsection">24.3.1 Virtual clones</h4>

<p>One of the main challenges of introducing the WHOPR compilation
mode was addressing the interactions between optimization passes.
In LTO compilation mode, the passes are executed in a sequence,
each of which consists of analysis (or <em>Generate summary</em>),
propagation (or <em>Execute</em>) and <em>Transform</em> stages.
Once the work of one pass is finished, the next pass sees the
updated program representation and can execute. This makes the
individual passes dependent on each other.

<p>In WHOPR mode all passes first execute their <em>Generate
summary</em> stage. Then summary writing marks the end of the LGEN
stage. At WPA time, the summaries are read back into memory and
all passes run the <em>Execute</em> stage. Optimization summaries
are streamed and sent to LTRANS, where all the passes execute the
<em>Transform</em> stage.

<p>Most optimization passes split naturally into analysis,
propagation and transformation stages. But some do not. The
main problem arises when one pass performs changes and the
following pass gets confused by seeing different callgraphs
between the <em>Transform</em> stage and the <em>Generate summary</em>
or <em>Execute</em> stage. This means that the passes are required
to communicate their decisions to each other.

<p>To facilitate this communication, the GCC callgraph
infrastructure implements <em>virtual clones</em>, a method of
representing the changes performed by the optimization passes in
the callgraph without needing to update function bodies.

<p>A <em>virtual clone</em> in the callgraph is a function that has no
associated body, just a description of how to create its body based
on a different function (which itself may be a virtual clone).

<p>The description of function modifications includes adjustments to
the function's signature (which allows, for example, removing or
adding function arguments), substitutions to perform on the
function body, and, for inlined functions, a pointer to the
function that it will be inlined into.

<p>It is also possible to redirect any edge of the callgraph from a
function to its virtual clone. This implies updating the call
site to adjust for the new function signature.

<p>Most of the transformations performed by inter-procedural
optimizations can be represented via virtual clones. For
instance, a constant propagation pass can produce a virtual clone
of a function that replaces one of its arguments with a
constant. The inliner can represent its decisions by producing a
clone of a function whose body will later be integrated into
a given function.
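
<p>To make the idea concrete, here is a minimal C++ sketch of a virtual
clone as "a node without a body plus a recipe". It is a toy model under
stated assumptions, not GCC's callgraph: <code>Node</code>,
<code>Graph</code>, <code>create_virtual_clone</code> and
<code>materialize</code> are hypothetical names, a body is just a list of
tokens, and the only supported modification is replacing a parameter with
a constant. Edge redirection and inlining are left out.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy node; GCC's callgraph nodes look different.
struct Node {
  std::vector<std::string> params;
  std::vector<std::string> body;  // token list standing in for the body
  bool is_virtual = false;        // true: no body yet, only a recipe
  std::string clone_of;           // origin, which may itself be virtual
  std::map<std::string, std::string> replace;  // parameter -> constant
};

using Graph = std::map<std::string, Node>;

// Execute-time step: record the decision. No function body is touched.
void create_virtual_clone(Graph& g, const std::string& origin,
                          const std::string& name,
                          std::map<std::string, std::string> replace) {
  Node n;
  n.is_virtual = true;
  n.clone_of = origin;
  n.replace = std::move(replace);
  g[name] = std::move(n);
}

// LTRANS-time step: turn a virtual clone into a real function by
// applying its recipe to the (recursively materialized) origin.
const Node& materialize(Graph& g, const std::string& name) {
  Node& n = g.at(name);
  if (!n.is_virtual) return n;
  const Node& origin = materialize(g, n.clone_of);
  // Signature adjustment: drop parameters that became constants.
  for (const std::string& p : origin.params)
    if (n.replace.count(p) == 0) n.params.push_back(p);
  // Body substitution: replace uses of those parameters.
  for (const std::string& tok : origin.body) {
    auto it = n.replace.find(tok);
    n.body.push_back(it == n.replace.end() ? tok : it->second);
  }
  n.is_virtual = false;
  return n;
}
```

<p>For example, cloning a function with parameters <code>x</code> and
<code>y</code> while replacing <code>y</code> with <code>4</code> yields a
node with an empty body until <code>materialize</code> runs, after which it
has the single parameter <code>x</code>.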

<p>Using <em>virtual clones</em>, the program can be easily updated
during the <em>Execute</em> stage, solving most of the pass interaction
problems that would otherwise occur during <em>Transform</em>.

<p>Virtual clones are later materialized in the LTRANS stage and
turned into real functions. Passes executed after the virtual
clones were introduced also perform their <em>Transform</em> stage
on the new functions, so for a pass there is no significant
difference between operating on a real function and operating on
a virtual clone introduced before its <em>Execute</em> stage.

<p>Optimization passes then work on virtual clones introduced before
their <em>Execute</em> stage as if they were real functions. The
only difference is that clones are not visible during the
<em>Generate summary</em> stage.

<p>To keep function summaries updated, the callgraph interface
allows an optimizer to register a callback that is called every
time a new clone is introduced as well as when the actual
function or variable is generated or when a function or variable
is removed. These hooks are registered in the <em>Generate
summary</em> stage and allow the pass to keep its information intact
until the <em>Execute</em> stage. The same hooks can also be
registered during the <em>Execute</em> stage to keep the
optimization summaries updated for the <em>Transform</em> stage.
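
<p>The hook mechanism described above can be sketched as a plain observer
pattern. The code below is a hedged illustration, not GCC's hook
interface: <code>ToyCallgraph</code>, <code>SizeSummaries</code> and all
member names are hypothetical, and only clone insertion and node removal
are modeled.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical hook registry; GCC's real interface differs.
class ToyCallgraph {
 public:
  using CloneHook =
      std::function<void(const std::string& origin, const std::string& clone)>;
  using RemoveHook = std::function<void(const std::string& node)>;

  void add_clone_hook(CloneHook h) { clone_hooks_.push_back(std::move(h)); }
  void add_remove_hook(RemoveHook h) { remove_hooks_.push_back(std::move(h)); }

  // Called when an optimizer introduces a clone of `origin`.
  void notify_clone(const std::string& origin, const std::string& clone) {
    for (const CloneHook& h : clone_hooks_) h(origin, clone);
  }
  // Called when a function or variable is removed.
  void notify_remove(const std::string& node) {
    for (const RemoveHook& h : remove_hooks_) h(node);
  }

 private:
  std::vector<CloneHook> clone_hooks_;
  std::vector<RemoveHook> remove_hooks_;
};

// A pass-specific summary that stays consistent with the callgraph.
struct SizeSummaries {
  std::map<std::string, int> size;  // node name -> estimated size

  // Register the hooks, as a pass would in its Generate summary stage.
  // The object must outlive the callgraph because the hooks capture it.
  void attach(ToyCallgraph& cg) {
    cg.add_clone_hook(
        [this](const std::string& origin, const std::string& clone) {
          auto it = size.find(origin);
          if (it != size.end()) size[clone] = it->second;  // copy summary
        });
    cg.add_remove_hook(
        [this](const std::string& node) { size.erase(node); });
  }
};
```

<p>With the hooks attached, a clone inherits the summary of its origin
without the pass having to re-analyze any body, which is what keeps the
information usable in later stages.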

<h4 class="subsection">24.3.2 IPA references</h4>

<p>GCC represents IPA references in the callgraph. For a function
or variable <code>A</code>, the <em>IPA reference</em> is a list of all
locations where the address of <code>A</code> is taken and, when
<code>A</code> is a variable, a list of all direct stores and reads
to/from <code>A</code>. References represent a directed multi-graph on
the union of nodes of the callgraph and the varpool. See
<samp><span class="file">ipa-reference.c</span></samp>:<code>ipa_reference_write_optimization_summary</code>
and
<samp><span class="file">ipa-reference.c</span></samp>:<code>ipa_reference_read_optimization_summary</code>
for details.
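
<p>As a rough picture of the data involved, the sketch below models
references as labeled edges between named nodes. It is an assumption-laden
illustration rather than GCC's representation: <code>RefKind</code>,
<code>Ref</code>, <code>RefGraph</code> and the query
<code>only_loaded</code> are hypothetical, and nodes are identified by
plain strings. Parallel edges are allowed, which is what makes the
structure a multi-graph.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

// Hypothetical reference kinds: address taken, direct read, direct store.
enum class RefKind { Address, Load, Store };

// One directed edge from a referring node to a referred node.
struct Ref {
  std::string referring;  // function or variable containing the reference
  std::string referred;   // function or variable being referenced
  RefKind kind;
};

struct RefGraph {
  std::vector<Ref> refs;  // duplicates allowed: a multi-graph

  void add(std::string from, std::string to, RefKind kind) {
    refs.push_back(Ref{std::move(from), std::move(to), kind});
  }

  // The reference list of `target`: every edge that points at it.
  std::vector<Ref> references_to(const std::string& target) const {
    std::vector<Ref> out;
    std::copy_if(refs.begin(), refs.end(), std::back_inserter(out),
                 [&](const Ref& r) { return r.referred == target; });
    return out;
  }

  // True if `var` is never stored to and never has its address taken,
  // judging only by the recorded references.
  bool only_loaded(const std::string& var) const {
    return std::all_of(refs.begin(), refs.end(), [&](const Ref& r) {
      return r.referred != var || r.kind == RefKind::Load;
    });
  }
};
```

<p>One plausible use of such a structure is a query like
<code>only_loaded</code>, which a pass could consult when deciding whether
a variable can be treated as read-only.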

<h4 class="subsection">24.3.3 Jump functions</h4>

<p>Suppose that an optimization pass sees a function <code>A</code> and it
knows the values of (some of) its arguments. The <em>jump
function</em> describes the value of a parameter of a given function
call in function <code>A</code> based on this knowledge.

<p>Jump functions are used by several optimizations, such as the
inter-procedural constant propagation pass and the
devirtualization pass. The inliner also uses jump functions to
perform inlining of callbacks.
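
<p>A jump function can be pictured as a small recipe attached to each
argument of a call site. The following C++ sketch is a simplified
illustration with hypothetical names (<code>JumpFunction</code>,
<code>KnownValues</code>, <code>evaluate</code>,
<code>evaluate_call</code>); it assumes integer arguments and models only
three kinds: unknown, constant, and pass-through of one of the caller's
own parameters.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified jump function for one call argument.
struct JumpFunction {
  enum class Kind { Unknown, Constant, PassThrough };
  Kind kind = Kind::Unknown;
  int constant = 0;              // used when kind == Constant
  std::size_t formal_index = 0;  // used when kind == PassThrough:
                                 // index of the caller's parameter
};

// Known values of the caller's parameters; nullopt means "not known".
using KnownValues = std::vector<std::optional<int>>;

// Value of one argument of a call in A, given what is known about A.
std::optional<int> evaluate(const JumpFunction& jf,
                            const KnownValues& caller_params) {
  switch (jf.kind) {
    case JumpFunction::Kind::Constant:
      return jf.constant;
    case JumpFunction::Kind::PassThrough:
      if (jf.formal_index < caller_params.size())
        return caller_params[jf.formal_index];
      return std::nullopt;
    case JumpFunction::Kind::Unknown:
      break;
  }
  return std::nullopt;
}

// Values of all arguments at one call site, in argument order.
KnownValues evaluate_call(const std::vector<JumpFunction>& args,
                          const KnownValues& caller_params) {
  KnownValues out;
  for (const JumpFunction& jf : args)
    out.push_back(evaluate(jf, caller_params));
  return out;
}
```

<p>For a call such as <code>B (x, 7, g ())</code> inside
<code>A (x, y)</code>, the three arguments would be described as
pass-through of parameter 0, the constant 7, and unknown. If <code>x</code>
is known to be 3 in <code>A</code>, evaluating the jump functions gives
3, 7 and "not known" for the parameters of <code>B</code>.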

</body></html>