<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1987-2018 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation. A copy of
the license is included in the
section entitled "GNU Free Documentation License".

This manual contains no Invariant Sections. The Front-Cover Texts are
(a) (see below), and the Back-Cover Texts are (b) (see below).

(a) The FSF's Front-Cover Text is:

A GNU Manual

(b) The FSF's Back-Cover Text is:

You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development. -->
<!-- Created by GNU Texinfo 6.4, http://www.gnu.org/software/texinfo/ -->
<head>
<title>Initial processing (The C Preprocessor)</title>

<meta name="description" content="Initial processing (The C Preprocessor)">
<meta name="keywords" content="Initial processing (The C Preprocessor)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Index-of-Directives.html#Index-of-Directives" rel="index" title="Index of Directives">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Overview.html#Overview" rel="up" title="Overview">
<link href="Tokenization.html#Tokenization" rel="next" title="Tokenization">
<link href="Character-sets.html#Character-sets" rel="prev" title="Character sets">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en">
<a name="Initial-processing"></a>
<div class="header">
<p>
Next: <a href="Tokenization.html#Tokenization" accesskey="n" rel="next">Tokenization</a>, Previous: <a href="Character-sets.html#Character-sets" accesskey="p" rel="prev">Character sets</a>, Up: <a href="Overview.html#Overview" accesskey="u" rel="up">Overview</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index-of-Directives.html#Index-of-Directives" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Initial-processing-1"></a>
<h3 class="section">1.2 Initial processing</h3>

<p>The preprocessor performs a series of textual transformations on its
input. These happen before all other processing. Conceptually, they
happen in a rigid order, and the entire file is run through each
transformation before the next one begins. CPP actually does them
all at once, for performance reasons. These transformations correspond
roughly to the first three “phases of translation” described in the C
standard.
</p>
<ol>
<li> <a name="index-line-endings"></a>
The input file is read into memory and broken into lines.

<p>Different systems use different conventions to indicate the end of a
line. GCC accepts the ASCII control sequences <kbd>LF</kbd>, <kbd>CR LF<!-- /@w --></kbd> and <kbd>CR</kbd> as end-of-line markers. These are the canonical
sequences used by Unix, DOS and VMS, and the classic Mac OS (before
OS X) respectively. You may therefore safely copy source code written
on any of those systems to a different one and use it without
conversion. (GCC may lose track of the current line number if a file
doesn’t consistently use one convention, as sometimes happens when it
is edited on computers with different conventions that share a network
file system.)
</p>
<p>If the last line of any input file lacks an end-of-line marker, the end
of the file is considered to implicitly supply one. The C standard says
that this condition provokes undefined behavior, so GCC will emit a
warning message.
</p>
</li><li> <a name="index-trigraphs"></a>
<a name="trigraphs"></a>If trigraphs are enabled, they are replaced by their
corresponding single characters. By default GCC ignores trigraphs,
but if you request a strictly conforming mode with the <samp>-std</samp>
option, or you specify the <samp>-trigraphs</samp> option, then it
converts them.

<p>Trigraphs are nine three-character sequences, all starting with ‘<samp>??</samp>’,
that are defined by ISO C to stand for single characters. They permit
obsolete systems that lack some of C’s punctuation to use C. For
example, ‘<samp>??/</samp>’ stands for ‘<samp>\</samp>’, so <tt>'??/n'</tt> is a character
constant for a newline.
</p>
<p>Trigraphs are not popular and many compilers implement them
incorrectly. Portable code should not rely on trigraphs being either
converted or ignored. With <samp>-Wtrigraphs</samp> GCC will warn you
when converting a trigraph would change the meaning of your program.
See <a href="Invocation.html#Wtrigraphs">Wtrigraphs</a>.
</p>
<p>In a string constant, you can prevent a sequence of question marks
from being confused with a trigraph by inserting a backslash between
the question marks, or by splitting the string literal at the
trigraph and making use of string literal concatenation. <tt>"(??\?)"</tt>
is the string ‘<samp>(???)</samp>’, not ‘<samp>(?]</samp>’. Traditional C compilers
do not recognize these idioms.
</p>
<p>The nine trigraphs and their replacements are:
</p>
<div class="smallexample">
<pre class="smallexample">Trigraph:       ??(  ??)  ??&lt;  ??&gt;  ??=  ??/  ??'  ??!  ??-
Replacement:      [    ]    {    }    #    \    ^    |    ~
</pre></div>

</li><li> <a name="index-continued-lines"></a>
<a name="index-backslash_002dnewline"></a>
Continued lines are merged into one long line.

<p>A continued line is a line which ends with a backslash, ‘<samp>\</samp>’. The
backslash is removed and the following line is joined with the current
one. No space is inserted, so you may split a line anywhere, even in
the middle of a word. (It is generally more readable to split lines
only at white space.)
</p>
<p>The trailing backslash on a continued line is commonly referred to as a
<em>backslash-newline</em>.
</p>
<p>If there is white space between a backslash and the end of a line, that
is still a continued line. However, as this is usually the result of an
editing mistake, and many compilers will not accept it as a continued
line, GCC will warn you about it.
</p>
</li><li> <a name="index-comments"></a>
<a name="index-line-comments"></a>
<a name="index-block-comments"></a>
All comments are replaced with single spaces.

<p>There are two kinds of comments. <em>Block comments</em> begin with
‘<samp>/*</samp>’ and continue until the next ‘<samp>*/</samp>’. Block comments do not
nest:
</p>
<div class="smallexample">
<pre class="smallexample">/* <span class="roman">this is</span> /* <span class="roman">one comment</span> */ <span class="roman">text outside comment</span>
</pre></div>

<p><em>Line comments</em> begin with ‘<samp>//</samp>’ and continue to the end of the
current line. Line comments do not nest either, but it does not matter,
because they would end in the same place anyway.
</p>
<div class="smallexample">
<pre class="smallexample">// <span class="roman">this is</span> // <span class="roman">one comment</span>
<span class="roman">text outside comment</span>
</pre></div>
</li></ol>
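<p>Because each comment becomes a single space, a comment placed between
two tokens separates them just as white space would. The following is a
minimal sketch, assuming a preprocessor that follows ISO C; the macro
<code>STR</code> and the function <code>tokens_around_comment</code> are
illustrative names that do not come from this manual.
</p>

```c
/* STR turns the tokens of its argument into a string literal. */
#define STR(x) #x

/* The comment between the two tokens is replaced by one space before
   macro processing, so the argument is seen as two separate tokens. */
const char *tokens_around_comment(void)
{
    return STR(a/* removed */b);
}
```

<p>Called from a program, <code>tokens_around_comment()</code> yields the
string ‘<samp>a b</samp>’ rather than ‘<samp>ab</samp>’.
</p>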

<p>It is safe to put line comments inside block comments, or vice versa.
</p>
<div class="smallexample">
<pre class="smallexample">/* <span class="roman">block comment</span>
   // <span class="roman">contains line comment</span>
   <span class="roman">yet more comment</span>
 */ <span class="roman">outside comment</span>

// <span class="roman">line comment</span> /* <span class="roman">contains block comment</span> */
</pre></div>

<p>But beware of commenting out one end of a block comment with a line
comment.
</p>
<div class="smallexample">
<pre class="smallexample"> // <span class="roman">l.c.</span> /* <span class="roman">block comment begins</span>
    <span class="roman">oops! this isn’t a comment anymore</span> */
</pre></div>

<p>Comments are not recognized within string literals.
<tt>"/* blah */"<!-- /@w --></tt> is the string constant ‘<samp>/* blah */<!-- /@w --></samp>’, not
an empty string.
</p>
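<p>A small sketch of this rule follows. The function names
<code>comment_like_string</code> and <code>comment_like_length</code> are
hypothetical and chosen only for illustration.
</p>

```c
/* Comment delimiters inside a string literal are ordinary characters,
   so nothing is removed from this constant. */
const char *comment_like_string(void)
{
    return "/* blah */";
}

/* Length of the literal, excluding the terminating null character. */
int comment_like_length(void)
{
    return (int)(sizeof "/* blah */" - 1);
}
```

<p>All ten characters survive preprocessing, so
<code>comment_like_length()</code> returns 10.
</p>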
<p>Line comments are not in the 1989 edition of the C standard, but they
are recognized by GCC as an extension. In C++ and in the 1999 edition
of the C standard, they are an official part of the language.
</p>
<p>Since these transformations happen before all other processing, you can
split a line mechanically with backslash-newline anywhere. You can
comment out the end of a line. You can continue a line comment onto the
next line with backslash-newline. You can even split ‘<samp>/*</samp>’,
‘<samp>*/</samp>’, and ‘<samp>//</samp>’ onto multiple lines with backslash-newline.
For example:
</p>
<div class="smallexample">
<pre class="smallexample">/\
*
*/ # /*
*/ defi\
ne FO\
O 10\
20
</pre></div>

<p>is equivalent to <code>#define FOO 1020<!-- /@w --></code>. All these tricks are
extremely confusing and should not be used in code intended to be
readable.
</p>
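<p>A tamer sketch of the same joining rule, limited to backslash-newline
inside a macro definition, may make the effect easier to check. The names
<code>SPLIT_VALUE</code> and <code>split_value</code> are hypothetical and
used only for illustration.
</p>

```c
/* The macro name is split in the middle of a word and its value in the
   middle of a number. The continuation lines start in column one,
   because any leading space there would become part of the joined line. */
#define SPLIT_VAL\
UE 10\
20

/* After the lines are joined, this reads: return 1020; */
int split_value(void)
{
    return SPLIT_VALUE;
}
```

<p>Here <code>split_value()</code> returns 1020, since the three physical
lines are merged into the single logical line
<code>#define SPLIT_VALUE 1020</code> before the directive is processed.
</p>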
<p>There is no way to prevent a backslash at the end of a line from being
interpreted as a backslash-newline. This cannot affect any correct
program, however.
</p>


</body>
</html>