<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1987-2023 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation. A copy of
the license is included in the
section entitled "GNU Free Documentation License".
This manual contains no Invariant Sections. The Front-Cover Texts are
(a) (see below), and the Back-Cover Texts are (b) (see below).
(a) The FSF's Front-Cover Text is:
A GNU Manual
(b) The FSF's Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development. -->
<!-- Created by GNU Texinfo 6.7, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Initial processing (The C Preprocessor)</title>
<meta name="description" content="Initial processing (The C Preprocessor)">
<meta name="keywords" content="Initial processing (The C Preprocessor)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html" rel="start" title="Top">
<link href="Index-of-Directives.html" rel="index" title="Index of Directives">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Overview.html" rel="up" title="Overview">
<link href="Tokenization.html" rel="next" title="Tokenization">
<link href="Character-sets.html" rel="prev" title="Character sets">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en">
<span id="Initial-processing"></span><div class="header">
<p>
Next: <a href="Tokenization.html" accesskey="n" rel="next">Tokenization</a>, Previous: <a href="Character-sets.html" accesskey="p" rel="prev">Character sets</a>, Up: <a href="Overview.html" accesskey="u" rel="up">Overview</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Index-of-Directives.html" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<span id="Initial-processing-1"></span><h3 class="section">1.2 Initial processing</h3>
<p>The preprocessor performs a series of textual transformations on its
input. These happen before all other processing. Conceptually, they
happen in a rigid order, and the entire file is run through each
transformation before the next one begins. CPP actually does them
all at once, for performance reasons. These transformations correspond
roughly to the first three &ldquo;phases of translation&rdquo; described in the C
standard.
</p>
<ol>
<li> <span id="index-line-endings"></span>
The input file is read into memory and broken into lines.
<p>Different systems use different conventions to indicate the end of a
line. GCC accepts the ASCII control sequences <kbd>LF</kbd>, <kbd>CR&nbsp;LF<!-- /@w --></kbd> and <kbd>CR</kbd> as end-of-line markers. These are the canonical
sequences used by Unix, DOS and VMS, and the classic Mac OS (before
OS&nbsp;X) respectively. You may therefore safely copy source code written
on any of those systems to a different one and use it without
conversion. (GCC may lose track of the current line number if a file
doesn&rsquo;t consistently use one convention, as sometimes happens when it
is edited on computers with different conventions that share a network
file system.)
</p>
<p>If the last line of any input file lacks an end-of-line marker, the end
of the file is considered to implicitly supply one. The C standard says
that this condition provokes undefined behavior, so GCC will emit a
warning message.
</p>
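<p>The line-ending rules above can be sketched in a few lines of C. This is
an illustration only, not how GCC itself is written: <code>count_lines</code>
is a hypothetical name, and the sketch assumes the whole file is already
in memory.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Count the lines in buf[0..len), accepting LF, CR LF and CR as
   end-of-line markers.  A final line without a marker still counts,
   mirroring the implicit end-of-line supplied at end of file.  */
size_t count_lines(const char *buf, size_t len)
{
    size_t lines = 0;
    size_t i = 0;

    assert(buf != NULL || len == 0);

    while (i < len) {
        if (buf[i] == '\r') {
            /* CR LF is a single marker; a lone CR is a marker too. */
            if (i + 1 < len && buf[i + 1] == '\n')
                i++;
            lines++;
        } else if (buf[i] == '\n') {
            lines++;
        }
        i++;
    }

    /* Unterminated last line: the end of the file supplies the marker. */
    if (len > 0 && buf[len - 1] != '\n' && buf[len - 1] != '\r')
        lines++;

    return lines;
}
```

<p>Because each marker is judged on its own, a buffer that mixes the three
conventions is still split sensibly by this sketch. It makes no attempt
to reproduce GCC&rsquo;s line-number bookkeeping for such files.
</p>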
</li><li> <span id="index-trigraphs"></span>
<span id="trigraphs"></span>If trigraphs are enabled, they are replaced by their
corresponding single characters. By default GCC ignores trigraphs,
but if you request a strictly conforming mode with the <samp>-std</samp>
option, or you specify the <samp>-trigraphs</samp> option, then it
converts them.
<p>These are nine three-character sequences, all starting with &lsquo;<samp>??</samp>&rsquo;,
that are defined by ISO C to stand for single characters. They permit
obsolete systems that lack some of C&rsquo;s punctuation to use C. For
example, &lsquo;<samp>??/</samp>&rsquo; stands for &lsquo;<samp>\</samp>&rsquo;, so <tt>'??/n'</tt> is a character
constant for a newline.
</p>
<p>Trigraphs are not popular and many compilers implement them
incorrectly. Portable code should not rely on trigraphs being either
converted or ignored. With <samp>-Wtrigraphs</samp> GCC will warn you
when a trigraph would change the meaning of your program if it were
converted. See <a href="Invocation.html#Wtrigraphs">Wtrigraphs</a>.
</p>
<p>In a string constant, you can prevent a sequence of question marks
from being confused with a trigraph by inserting a backslash between
the question marks, or by separating the string literal at the
trigraph and making use of string literal concatenation. <tt>&quot;(??\?)&quot;</tt>
is the string &lsquo;<samp>(???)</samp>&rsquo;, not &lsquo;<samp>(?]</samp>&rsquo;. Traditional C compilers
do not recognize these idioms.
</p>
<p>The nine trigraphs and their replacements are
</p>
<div class="example">
<pre class="example">Trigraph:       ??(  ??)  ??&lt;  ??&gt;  ??=  ??/  ??'  ??!  ??-
Replacement:      [    ]    {    }    #    \    ^    |    ~
</pre></div>
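<p>The table above translates directly into a small lookup. The following is
a minimal sketch of the replacement step, not GCC&rsquo;s code:
<code>replace_trigraphs</code> is a hypothetical name, and the caller is
assumed to supply an output buffer at least as large as the input. The
source deliberately never writes two adjacent question marks, so it means
the same thing whether or not trigraphs are enabled.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src to dst, replacing each trigraph with its single character.
   dst must have room for strlen(src) + 1 bytes.  Returns the length
   of the result, not counting the terminating NUL.  */
size_t replace_trigraphs(const char *src, char *dst)
{
    /* Third character of each trigraph and, at the same index, its
       replacement.  The two leading question marks are implied.  */
    static const char third[]       = "()<>=/'!-";
    static const char replacement[] = "[]{}#\\^|~";
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        const char *hit = NULL;

        /* Only look up src[2] when it is a real character; strchr
           would otherwise match the table's terminating NUL.  */
        if (src[0] == '?' && src[1] == '?' && src[2] != '\0')
            hit = strchr(third, src[2]);

        if (hit != NULL) {
            dst[out++] = replacement[hit - third];
            src += 3;
        } else {
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return out;
}
```

<p>Scanning left to right, three question marks followed by &lsquo;<samp>)</samp>&rsquo;
become &lsquo;<samp>?]</samp>&rsquo;: the first question mark is copied and the remaining
three characters form a trigraph. To feed such input to the function from
C source, write the test string with the backslash idiom described above,
for example <tt>&quot;(?\?\?)&quot;</tt>.
</p>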
</li><li> <span id="index-continued-lines"></span>
<span id="index-backslash_002dnewline"></span>
Continued lines are merged into one long line.
<p>A continued line is a line that ends with a backslash, &lsquo;<samp>\</samp>&rsquo;. The
backslash is removed and the following line is joined with the current
one. No space is inserted, so you may split a line anywhere, even in
the middle of a word. (It is generally more readable to split lines
only at white space.)
</p>
<p>The trailing backslash on a continued line is commonly referred to as a
<em>backslash-newline</em>.
</p>
<p>If there is white space between a backslash and the end of a line, that
is still a continued line. However, as this is usually the result of an
editing mistake, and many compilers will not accept it as a continued
line, GCC will warn you about it.
</p>
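<p>Line merging can be sketched as a single pass that drops each
backslash-newline pair. This is a simplified illustration under two
assumptions: line endings have already been normalized to <kbd>LF</kbd>, and
the backslash is immediately followed by the newline (the sketch does not
handle the white-space case that GCC accepts with a warning).
<code>splice_lines</code> is a hypothetical name.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Copy src to dst with every backslash-newline pair removed.
   dst must have room for strlen(src) + 1 bytes.  Returns the length
   of the result, not counting the terminating NUL.  */
size_t splice_lines(const char *src, char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            /* Drop both characters and insert nothing, so the two
               physical lines join with no space between them.  */
            src += 2;
        } else {
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return out;
}
```

<p>Since nothing is inserted at the join, input split in the middle of a
word comes back as the whole word. A backslash that is not followed by a
newline is copied unchanged.
</p>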
</li><li> <span id="index-comments"></span>
<span id="index-line-comments"></span>
<span id="index-block-comments"></span>
All comments are replaced with single spaces.
<p>There are two kinds of comments. <em>Block comments</em> begin with
&lsquo;<samp>/*</samp>&rsquo; and continue until the next &lsquo;<samp>*/</samp>&rsquo;. Block comments do not
nest:
</p>
<div class="example">
<pre class="example">/* <span class="roman">this is</span> /* <span class="roman">one comment</span> */ <span class="roman">text outside comment</span>
</pre></div>
<p><em>Line comments</em> begin with &lsquo;<samp>//</samp>&rsquo; and continue to the end of the
current line. Line comments do not nest either, but it does not matter,
because they would end in the same place anyway.
</p>
<div class="example">
<pre class="example">// <span class="roman">this is</span> // <span class="roman">one comment</span>
<span class="roman">text outside comment</span>
</pre></div>
</li></ol>
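<p>The last step of the list, replacing each comment with a single space,
can be observed from inside a program by stringizing a macro argument
that contains a comment. The names <code>STRINGIZE</code> and
<code>stringized_sample</code> below are made up for this illustration.
</p>

```c
#include <assert.h>

/* Turn the macro argument into a string literal. */
#define STRINGIZE(x) #x

/* Returns "a b": the comment between a and b has already become one
   space by the time the argument is stringized.  */
const char *stringized_sample(void)
{
    static const char text[] = STRINGIZE(a/* this comment becomes one space */b);

    assert(sizeof text == 4);   /* 'a', ' ', 'b' and the terminating NUL */
    return text;
}
```

<p>The comment therefore separates <code>a</code> and <code>b</code> into two tokens.
It does not paste them together into <code>ab</code>.
</p>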
<p>It is safe to put line comments inside block comments, or vice versa.
</p>
<div class="example">
<pre class="example">/* <span class="roman">block comment</span>
// <span class="roman">contains line comment</span>
<span class="roman">yet more comment</span>
*/ <span class="roman">outside comment</span>
// <span class="roman">line comment</span> /* <span class="roman">contains block comment</span> */
</pre></div>
<p>But beware of commenting out one end of a block comment with a line
comment.
</p>
<div class="example">
<pre class="example"> // <span class="roman">l.c.</span> /* <span class="roman">block comment begins</span>
<span class="roman">oops! this isn&rsquo;t a comment anymore</span> */
</pre></div>
<p>Comments are not recognized within string literals.
<tt>&quot;/*&nbsp;blah&nbsp;*/&quot;<!-- /@w --></tt> is the string constant &lsquo;<samp>/*&nbsp;blah&nbsp;*/<!-- /@w --></samp>&rsquo;, not
an empty string.
</p>
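<p>A short check of the string-literal rule, written as a sketch with a
hypothetical function name: if the text between the quotes were treated
as a comment, the array below would hold only a terminating NUL.
</p>

```c
#include <assert.h>

/* Returns the ten-character string that merely looks like a comment. */
const char *comment_lookalike(void)
{
    static const char text[] = "/* blah */";

    assert(sizeof text == 11);  /* ten characters plus the terminating NUL */
    return text;
}
```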
<p>Line comments are not in the 1989 edition of the C standard, but they
are recognized by GCC as an extension. In C++ and in the 1999 edition
of the C standard, they are an official part of the language.
</p>
<p>Since these transformations happen before all other processing, you can
split a line mechanically with backslash-newline anywhere. You can
comment out the end of a line. You can continue a line comment onto the
next line with backslash-newline. You can even split &lsquo;<samp>/*</samp>&rsquo;,
&lsquo;<samp>*/</samp>&rsquo;, and &lsquo;<samp>//</samp>&rsquo; onto multiple lines with backslash-newline.
For example:
</p>
<div class="example">
<pre class="example">/\
*
*/ # /*
*/ defi\
ne FO\
O 10\
20
</pre></div>
<p>is equivalent to <code>#define&nbsp;FOO&nbsp;1020<!-- /@w --></code>. All these tricks are
extremely confusing and should not be used in code intended to be
readable.
</p>
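<p>A compilable variant of the same tricks is sketched below, shown only to
demonstrate the ordering of the transformations. The function name
<code>foo_value</code> is hypothetical. Both ends of the block comment and the
macro definition are split with backslash-newline.
</p>

```c
#include <assert.h>

/\
* The opener and closer of this block comment are each split by a
   backslash-newline, which is removed before comments are recognized. *\
/
#define FO\
O 10\
20

/* Returns the value of FOO, which the preprocessor sees as 1020. */
int foo_value(void)
{
    assert(FOO == 1020);
    return FOO;
}
```

<p>The merged directive reads <code>#define&nbsp;FOO&nbsp;1020</code>, so the identifier and the
number each survive being cut in half.
</p>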
<p>There is no way to prevent a backslash at the end of a line from being
interpreted as a backslash-newline. This cannot affect any correct
program, however.
</p>
</body>
</html>