| 123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337 | <?xml version="1.0" encoding="ISO-8859-1"?><!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [  <!ENTITY % general-entities SYSTEM "../general.ent">  %general-entities;]><sect1 id="ch-tools-toolchaintechnotes">  <?dbhtml filename="toolchaintechnotes.html"?>  <title>Toolchain Technical Notes</title>  <para>This section explains some of the rationale and technical details  behind the overall build method. It is not essential to immediately  understand everything in this section. Most of this information will be  clearer after performing an actual build. This section can be referred  to at any time during the process.</para>  <para>The overall goal of this chapter and <xref  linkend="chapter-temporary-tools"/> is to produce a temporary area that  contains a known-good set of tools that can be isolated from the host system.  By using <command>chroot</command>, the commands in the remaining chapters  will be contained within that environment, ensuring a clean, trouble-free  build of the target LFS system. The build process has been designed to  minimize the risks for new readers and to provide the most educational value  at the same time.</para>  <para>The build process is based on the process of  <emphasis>cross-compilation</emphasis>. Cross-compilation is normally used  for building a compiler and its toolchain for a machine different from  the one that is used for the build. This is not strictly needed for LFS,  since the machine where the new system will run is the same as the one  used for the build. But cross-compilation has the great advantage that  anything that is cross-compiled cannot depend on the host environment.</para>  <sect2 id="cross-compile" xreflabel="About Cross-Compilation">    <title>About Cross-Compilation</title>    <para>Cross-compilation involves some concepts that deserve a section on    their own. Although this section may be omitted in a first reading, it    is strongly suggested to come back to it later in order to get a full    grasp of the build process.</para>    <para>Let us first define some terms used in this context:</para>    <variablelist>      <varlistentry><term>build</term><listitem>        <para>is the machine where we build programs. Note that this machine        is referred to as the <quote>host</quote> in other        sections.</para></listitem>      </varlistentry>      <varlistentry><term>host</term><listitem>        <para>is the machine/system where the built programs will run. Note        that this use of <quote>host</quote> is not the same as in other        sections.</para></listitem>      </varlistentry>      <varlistentry><term>target</term><listitem>        <para>is only used for compilers. It is the machine the compiler        produces code for. It may be different from both build and        host.</para></listitem>      </varlistentry>    </variablelist>    <para>As an example, let us imagine the following scenario: we may have a    compiler on a slow machine only, let's call the machine A, and the compiler    ccA. We may have also a fast machine (B), but with no compiler, and we may    want to produce code for a another slow machine (C). Then, to build a    compiler for machine C, we would have three stages:</para>    <informaltable align="center">      <tgroup cols="5">        <colspec colnum="1" align="center"/>        <colspec colnum="2" align="center"/>        <colspec colnum="3" align="center"/>        <colspec colnum="4" align="center"/>        <colspec colnum="5" align="left"/>        <thead>          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>               <entry>Target</entry><entry>Action</entry></row>        </thead>        <tbody>          <row>            <entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>            <entry>build cross-compiler cc1 using ccA on machine A</entry>          </row>          <row>            <entry>2</entry><entry>A</entry><entry>B</entry><entry>B</entry>            <entry>build cross-compiler cc2 using cc1 on machine A</entry>          </row>          <row>            <entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>            <entry>build compiler ccC using cc2 on machine B</entry>          </row>        </tbody>      </tgroup>    </informaltable>    <para>Then, all the other programs needed by machine C can be compiled    using cc2 on the fast machine B. Note that unless B can run programs    produced for C, there is no way to test the built programs until machine    C itself is running. For example, for testing ccC, we may want to add a    fourth stage:</para>    <informaltable align="center">      <tgroup cols="5">        <colspec colnum="1" align="center"/>        <colspec colnum="2" align="center"/>        <colspec colnum="3" align="center"/>        <colspec colnum="4" align="center"/>        <colspec colnum="5" align="left"/>        <thead>          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>               <entry>Target</entry><entry>Action</entry></row>        </thead>        <tbody>          <row>            <entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>            <entry>rebuild  and test ccC using itself on machine C</entry>          </row>        </tbody>      </tgroup>    </informaltable>    <para>In the example above, only cc1 and cc2 are cross-compilers, that is,    they produce code for a machine different from the one they are run on.    The other compilers ccA and ccC produce code for the machine they are run    on. Such compilers are called <emphasis>native</emphasis> compilers.</para>  </sect2>  <sect2 id="lfs-cross">    <title>Implementation of Cross-Compilation for LFS</title>    <note>      <para>Almost all the build systems use names of the form      cpu-vendor-kernel-os referred to as the machine triplet. An astute      reader may wonder why a <quote>triplet</quote> refers to a four component      name. The reason is history: initially, three component names were enough      to designate unambiguously a machine, but with new machines and systems      appearing, that proved insufficient. The word <quote>triplet</quote>      remained. A simple way to determine your machine triplet is to run      the <command>config.guess</command>      script that comes with the source for many packages. Unpack the binutils      sources and run the script: <userinput>./config.guess</userinput> and note      the output. For example, for a 32-bit Intel processor the      output will be <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit      system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>.</para>      <para>Also be aware of the name of the platform's dynamic linker, often      referred to as the dynamic loader (not to be confused with the standard      linker <command>ld</command> that is part of binutils). The dynamic linker      provided by Glibc finds and loads the shared libraries needed by a      program, prepares the program to run, and then runs it. The name of the      dynamic linker for a 32-bit Intel machine will be <filename      class="libraryfile">ld-linux.so.2</filename> (<filename      class="libraryfile">ld-linux-x86-64.so.2</filename> for 64-bit systems). A      sure-fire way to determine the name of the dynamic linker is to inspect a      random binary from the host system by running: <userinput>readelf -l      <name of binary> | grep interpreter</userinput> and noting the      output. The authoritative reference covering all platforms is in the      <filename>shlib-versions</filename> file in the root of the Glibc source      tree.</para>    </note>    <para>In order to fake a cross compilation, the name of the host triplet    is slightly adjusted by changing the "vendor" field in the    <envar>LFS_TGT</envar> variable. We also use the    <parameter>--with-sysroot</parameter> option when building the cross linker and    cross compiler to tell them where to find the needed host files. This    ensures that none of the other programs built in <xref    linkend="chapter-temporary-tools"/> can link to libraries on the build    machine. Only two stages are mandatory, and one more for tests:</para>    <informaltable align="center">      <tgroup cols="5">        <colspec colnum="1" align="center"/>        <colspec colnum="2" align="center"/>        <colspec colnum="3" align="center"/>        <colspec colnum="4" align="center"/>        <colspec colnum="5" align="left"/>        <thead>          <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>               <entry>Target</entry><entry>Action</entry></row>        </thead>        <tbody>          <row>            <entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>            <entry>build cross-compiler cc1 using cc-pc on pc</entry>          </row>          <row>            <entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>            <entry>build compiler cc-lfs using cc1 on pc</entry>          </row>          <row>            <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>            <entry>rebuild and test cc-lfs using itself on lfs</entry>          </row>        </tbody>      </tgroup>    </informaltable>    <para>In the above table, <quote>on pc</quote> means the commands are run    on a machine using the already installed distribution. <quote>On    lfs</quote> means the commands are run in a chrooted environment.</para>    <para>Now, there is more about cross-compiling: the C language is not    just a compiler, but also defines a standard library. In this book, the    GNU C library, named glibc, is used. This library must    be compiled for the lfs machine, that is, using the cross compiler cc1.     But the compiler itself uses an internal library implementing complex    instructions not available in the assembler instruction set. This    internal library is named libgcc, and must be linked to the glibc    library to be fully functional! Furthermore, the standard library for    C++ (libstdc++) also needs being linked to glibc. The solution    to this chicken and egg problem is to first build a degraded cc1 based libgcc,    lacking some fuctionalities such as threads and exception handling, then    build glibc using this degraded compiler (glibc itself is not    degraded), then build libstdc++. But this last library will lack the    same functionalities as libgcc.</para>    <para>This is not the end of the story: the conclusion of the preceding    paragraph is that cc1 is unable to build a fully functional libstdc++, but    this is the only compiler available for building the C/C++ libraries    during stage 2! Of course, the compiler built during stage 2, cc-lfs,    would be able to build those libraries, but (1) the build system of    gcc does not know that it is usable on pc, and (2) using it on pc    would be at risk of linking to the pc libraries, since cc-lfs is a native    compiler. So we have to build libstdc++ later, in chroot.</para>  </sect2>  <sect2 id="other-details">    <title>Other procedural details</title>    <para>The cross-compiler will be installed in a separate <filename    class="directory">$LFS/tools</filename> directory, since it will not    be part of the final system.</para>    <para>Binutils is installed first because the <command>configure</command>    runs of both GCC and Glibc perform various feature tests on the assembler    and linker to determine which software features to enable or disable. This    is more important than one might first realize. An incorrectly configured    GCC or Glibc can result in a subtly broken toolchain, where the impact of    such breakage might not show up until near the end of the build of an    entire distribution. A test suite failure will usually highlight this error    before too much additional work is performed.</para>    <para>Binutils installs its assembler and linker in two locations,    <filename class="directory">$LFS/tools/bin</filename> and <filename    class="directory">$LFS/tools/$LFS_TGT/bin</filename>. The tools in one    location are hard linked to the other. An important facet of the linker is    its library search order. Detailed information can be obtained from    <command>ld</command> by passing it the <parameter>--verbose</parameter>    flag. For example, <command>$LFS_TGT-ld --verbose | grep SEARCH</command>    will illustrate the current search paths and their order. It shows which    files are linked by <command>ld</command> by compiling a dummy program and    passing the <parameter>--verbose</parameter> switch to the linker. For    example,    <command>$LFS_TGT-gcc dummy.c -Wl,--verbose 2>&1 | grep succeeded</command>    will show all the files successfully opened during the linking.</para>    <para>The next package installed is GCC. An example of what can be    seen during its run of <command>configure</command> is:</para><screen><computeroutput>checking what assembler to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/aschecking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>    <para>This is important for the reasons mentioned above. It also    demonstrates that GCC's configure script does not search the PATH    directories to find which tools to use. However, during the actual    operation of <command>gcc</command> itself, the same search paths are not    necessarily used. To find out which standard linker <command>gcc</command>    will use, run: <command>$LFS_TGT-gcc -print-prog-name=ld</command>.</para>    <para>Detailed information can be obtained from <command>gcc</command> by    passing it the <parameter>-v</parameter> command line option while compiling    a dummy program. For example, <command>gcc -v dummy.c</command> will show    detailed information about the preprocessor, compilation, and assembly    stages, including <command>gcc</command>'s included search paths and their    order.</para>    <para>Next installed are sanitized Linux API headers. These allow the    standard C library (Glibc) to interface with features that the Linux    kernel will provide.</para>    <para>The next package installed is Glibc. The most important    considerations for building Glibc are the compiler, binary tools, and    kernel headers. The compiler is generally not an issue since Glibc will    always use the compiler relating to the <parameter>--host</parameter>    parameter passed to its configure script; e.g. in our case, the compiler    will be <command>$LFS_TGT-gcc</command>. The binary tools and kernel    headers can be a bit more complicated. Therefore, take no risks and use    the available configure switches to enforce the correct selections. After    the run of <command>configure</command>, check the contents of the    <filename>config.make</filename> file in the <filename    class="directory">build</filename> directory for all important details.    Note the use of <parameter>CC="$LFS_TGT-gcc"</parameter> (with    <envar>$LFS_TGT</envar> expanded) to control which binary tools are used    and the use of the <parameter>-nostdinc</parameter> and    <parameter>-isystem</parameter> flags to control the compiler's include    search path. These items highlight an important aspect of the Glibc    package—it is very self-sufficient in terms of its build machinery    and generally does not rely on toolchain defaults.</para>    <para>As said above, the standard C++ library is compiled next, followed in    Chapter 6 by all the programs that need themselves to be built. The install    step of libstdc++ uses the <envar>DESTDIR</envar> variable to have the    programs land into the LFS filesystem.</para>    <para>In Chapter 7 the native lfs compiler is built. First binutils-pass2,    with the same <envar>DESTDIR</envar> install as the other programs is    built, and then the second pass of GCC is constructed, omitting libstdc++    and other non-important libraries.  Due to some weird logic in GCC's    configure script, <envar>CC_FOR_TARGET</envar> ends up as    <command>cc</command> when the host is the same as the target, but is    different from the build system. This is why    <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is put explicitely into    the configure options.</para>    <para>Upon entering the chroot environment in <xref    linkend="chapter-chroot-temporary-tools"/>, the first task is to install    libstdc++. Then temporary installations of programs needed for the proper    operation of the toolchain are performed. Programs needed for testing    other programs are also built. From this point onwards, the    core toolchain is self-contained and self-hosted.  In     <xref linkend="chapter-building-system"/>, final versions of all the    packages needed for a fully functional system are built, tested and    installed.</para>  </sect2></sect1>
 |