- <?xml version="1.0" encoding="ISO-8859-1"?>
- <!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
- "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
- <!ENTITY % general-entities SYSTEM "../general.ent">
- %general-entities;
- ]>
- <sect1 id="ch-tools-toolchaintechnotes" xreflabel="Toolchain Technical Notes">
- <?dbhtml filename="toolchaintechnotes.html"?>
- <title>Toolchain Technical Notes</title>
- <para>This section explains some of the rationale and technical details
- behind the overall build method. It is not essential to immediately
- understand everything in this section. Most of this information will be
- clearer after performing an actual build. This section can be referred
- to at any time during the process.</para>
- <para>The overall goal of <xref linkend="chapter-cross-tools"/> and <xref
- linkend="chapter-temporary-tools"/> is to produce a temporary area that
- contains a known-good set of tools that can be isolated from the host system.
- By using <command>chroot</command>, the commands in the remaining chapters
- will be contained within that environment, ensuring a clean, trouble-free
- build of the target LFS system. The build process has been designed to
- minimize the risks for new readers and to provide the most educational value
- at the same time.</para>
- <para>The build process is based on
- <emphasis>cross-compilation</emphasis>. Cross-compilation is normally used
- for building a compiler and its toolchain for a machine different from
- the one that is used for the build. This is not strictly needed for LFS,
- since the machine where the new system will run is the same as the one
- used for the build. But cross-compilation has the great advantage that
- anything that is cross-compiled cannot depend on the host environment.</para>
- <sect2 id="cross-compile" xreflabel="About Cross-Compilation">
- <title>About Cross-Compilation</title>
- <para>Cross-compilation involves some concepts that deserve a section of
- their own. Although this section may be omitted on a first reading,
- coming back to it later is strongly recommended in order to get a full
- grasp of the build process.</para>
- <para>Let us first define some terms used in this context:</para>
- <variablelist>
- <varlistentry><term>build</term><listitem>
- <para>is the machine where we build programs. Note that this machine
- is referred to as the <quote>host</quote> in other
- sections.</para></listitem>
- </varlistentry>
- <varlistentry><term>host</term><listitem>
- <para>is the machine/system where the built programs will run. Note
- that this use of <quote>host</quote> is not the same as in other
- sections.</para></listitem>
- </varlistentry>
- <varlistentry><term>target</term><listitem>
- <para>is only used for compilers. It is the machine the compiler
- produces code for. It may be different from both build and
- host.</para></listitem>
- </varlistentry>
- </variablelist>
- <para>As an example, let us imagine the following scenario (sometimes
- referred to as <quote>Canadian Cross</quote>): we have a compiler only
- on a slow machine; let us call the machine A and the compiler ccA. We
- also have a fast machine (B) with no compiler, and we want to produce
- code for another slow machine (C). Building a compiler for machine C
- takes three stages:</para>
- <informaltable align="center">
- <tgroup cols="5">
- <colspec colnum="1" align="center"/>
- <colspec colnum="2" align="center"/>
- <colspec colnum="3" align="center"/>
- <colspec colnum="4" align="center"/>
- <colspec colnum="5" align="left"/>
- <thead>
- <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
- <entry>Target</entry><entry>Action</entry></row>
- </thead>
- <tbody>
- <row>
- <entry>1</entry><entry>A</entry><entry>A</entry><entry>B</entry>
- <entry>build cross-compiler cc1 using ccA on machine A</entry>
- </row>
- <row>
- <entry>2</entry><entry>A</entry><entry>B</entry><entry>C</entry>
- <entry>build cross-compiler cc2 using cc1 on machine A</entry>
- </row>
- <row>
- <entry>3</entry><entry>B</entry><entry>C</entry><entry>C</entry>
- <entry>build compiler ccC using cc2 on machine B</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- <para>Then, all the other programs needed by machine C can be compiled
- using cc2 on the fast machine B. Note that unless B can run programs
- produced for C, there is no way to test the built programs until machine
- C itself is running. For example, for testing ccC, we may want to add a
- fourth stage:</para>
- <informaltable align="center">
- <tgroup cols="5">
- <colspec colnum="1" align="center"/>
- <colspec colnum="2" align="center"/>
- <colspec colnum="3" align="center"/>
- <colspec colnum="4" align="center"/>
- <colspec colnum="5" align="left"/>
- <thead>
- <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
- <entry>Target</entry><entry>Action</entry></row>
- </thead>
- <tbody>
- <row>
- <entry>4</entry><entry>C</entry><entry>C</entry><entry>C</entry>
- <entry>rebuild and test ccC using itself on machine C</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- <para>In the example above, only cc1 and cc2 are cross-compilers, that is,
- they produce code for a machine different from the one they are run on.
- The other compilers ccA and ccC produce code for the machine they are run
- on. Such compilers are called <emphasis>native</emphasis> compilers.</para>
- </sect2>
- <sect2 id="lfs-cross">
- <title>Implementation of Cross-Compilation for LFS</title>
- <note>
- <para>Almost all build systems use names of the form
- cpu-vendor-kernel-os, referred to as the machine triplet. An astute
- reader may wonder why a <quote>triplet</quote> refers to a four-component
- name. The reason is historical: initially, three-component names were enough
- to designate a machine unambiguously, but as new machines and systems
- appeared, that proved insufficient. The word <quote>triplet</quote>
- remained. A simple way to determine your machine triplet is to run
- the <command>config.guess</command>
- script that comes with the source for many packages. Unpack the binutils
- sources and run the script: <userinput>./config.guess</userinput> and note
- the output. For example, for a 32-bit Intel processor the
- output will be <emphasis>i686-pc-linux-gnu</emphasis>. On a 64-bit
- system it will be <emphasis>x86_64-pc-linux-gnu</emphasis>.</para>
- <para>Also be aware of the name of the platform's dynamic linker, often
- referred to as the dynamic loader (not to be confused with the standard
- linker <command>ld</command> that is part of binutils). The dynamic linker
- provided by Glibc finds and loads the shared libraries needed by a
- program, prepares the program to run, and then runs it. The name of the
- dynamic linker for a 32-bit Intel machine will be <filename
- class="libraryfile">ld-linux.so.2</filename> (<filename
- class="libraryfile">ld-linux-x86-64.so.2</filename> for 64-bit systems). A
- sure-fire way to determine the name of the dynamic linker is to inspect a
- random binary from the host system by running: <userinput>readelf -l
- &lt;name of binary&gt; | grep interpreter</userinput> and noting the
- output. The authoritative reference covering all platforms is in the
- <filename>shlib-versions</filename> file in the root of the Glibc source
- tree.</para>
- </note>
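- <para>As a hedged illustration of the two checks described in the note
- above, the commands below print the machine triplet and the name of the
- dynamic linker. They assume that the binutils sources are unpacked in the
- current directory and that <filename>/bin/ls</filename> is a dynamically
- linked binary on the host; both are assumptions made for this sketch, and
- the output depends on the machine:</para>
- <screen><userinput>./config.guess
- readelf -l /bin/ls | grep interpreter</userinput></screen>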
- <para>In order to fake a cross-compilation, the name of the host triplet
- is slightly adjusted by changing the <quote>vendor</quote> field in the
- <envar>LFS_TGT</envar> variable. We also use the
- <parameter>--with-sysroot</parameter> option when building the cross linker and
- cross compiler to tell them where to find the needed host files. This
- ensures that none of the other programs built in <xref
- linkend="chapter-temporary-tools"/> can link to libraries on the build
- machine. Only two stages are mandatory, and one more for tests:</para>
- <informaltable align="center">
- <tgroup cols="5">
- <colspec colnum="1" align="center"/>
- <colspec colnum="2" align="center"/>
- <colspec colnum="3" align="center"/>
- <colspec colnum="4" align="center"/>
- <colspec colnum="5" align="left"/>
- <thead>
- <row><entry>Stage</entry><entry>Build</entry><entry>Host</entry>
- <entry>Target</entry><entry>Action</entry></row>
- </thead>
- <tbody>
- <row>
- <entry>1</entry><entry>pc</entry><entry>pc</entry><entry>lfs</entry>
- <entry>build cross-compiler cc1 using cc-pc on pc</entry>
- </row>
- <row>
- <entry>2</entry><entry>pc</entry><entry>lfs</entry><entry>lfs</entry>
- <entry>build compiler cc-lfs using cc1 on pc</entry>
- </row>
- <row>
- <entry>3</entry><entry>lfs</entry><entry>lfs</entry><entry>lfs</entry>
- <entry>rebuild and test cc-lfs using itself on lfs</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- <para>In the above table, <quote>on pc</quote> means the commands are run
- on a machine using the already installed distribution. <quote>On
- lfs</quote> means the commands are run in a chrooted environment.</para>
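- <para>The adjustment of the vendor field can be sketched in plain shell.
- The snippet below is only an illustration: the variable names
- <envar>triplet</envar>, <envar>cpu</envar> and <envar>rest</envar> are
- hypothetical, the triplet value is the 64-bit example given in the note
- above, and the book may set <envar>LFS_TGT</envar> by other means:</para>
- <screen><userinput># Example triplet as reported by config.guess (assumed value)
- triplet=x86_64-pc-linux-gnu
- # Keep the cpu field (the text before the first dash)
- cpu=${triplet%%-*}
- # Drop the cpu and vendor fields, keeping kernel-os
- rest=${triplet#*-*-}
- # Reassemble the name with "lfs" as the vendor
- LFS_TGT=$cpu-lfs-$rest
- echo $LFS_TGT</userinput></screen>
- <para>With the triplet above, this prints x86_64-lfs-linux-gnu. Because
- that name no longer matches the triplet of the build machine, the build
- systems treat the build as a cross-compilation.</para>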
- <para>There is more to cross-compiling than the compiler: the C language
- also defines a standard library. In this book, the
- GNU C library, named glibc, is used. This library must
- be compiled for the lfs machine, that is, using the cross-compiler cc1.
- But the compiler itself uses an internal library implementing complex
- instructions not available in the assembler instruction set. This
- internal library is named libgcc, and it must be linked to the glibc
- library to be fully functional. Furthermore, the standard library for
- C++ (libstdc++) also needs to be linked to glibc. The solution to this
- chicken-and-egg problem is to first build a degraded, cc1-based libgcc,
- lacking some functionality such as threads and exception handling, then
- build glibc using this degraded compiler (glibc itself is not
- degraded), then build libstdc++. This last library, however, will lack
- the same functionality as libgcc.</para>
- <para>This is not the end of the story: the conclusion of the preceding
- paragraph is that cc1 is unable to build a fully functional libstdc++, but
- this is the only compiler available for building the C/C++ libraries
- during stage 2! Of course, the compiler built during stage 2, cc-lfs,
- would be able to build those libraries, but (1) the build system of
- GCC does not know that it is usable on pc, and (2) using it on pc
- would risk linking to the pc libraries, since cc-lfs is a native
- compiler. So we have to build libstdc++ later, in chroot.</para>
- </sect2>
- <sect2 id="other-details">
- <title>Other procedural details</title>
- <para>The cross-compiler will be installed in a separate <filename
- class="directory">$LFS/tools</filename> directory, since it will not
- be part of the final system.</para>
- <para>Binutils is installed first because the <command>configure</command>
- runs of both GCC and Glibc perform various feature tests on the assembler
- and linker to determine which software features to enable or disable. This
- is more important than one might first realize. An incorrectly configured
- GCC or Glibc can result in a subtly broken toolchain, where the impact of
- such breakage might not show up until near the end of the build of an
- entire distribution. A test suite failure will usually highlight this error
- before too much additional work is performed.</para>
- <para>Binutils installs its assembler and linker in two locations,
- <filename class="directory">$LFS/tools/bin</filename> and <filename
- class="directory">$LFS/tools/$LFS_TGT/bin</filename>. The tools in one
- location are hard linked to the other. An important facet of the linker is
- its library search order. Detailed information can be obtained from
- <command>ld</command> by passing it the <parameter>--verbose</parameter>
- flag. For example, <command>$LFS_TGT-ld --verbose | grep SEARCH</command>
- will illustrate the current search paths and their order. To find out which
- files are linked by <command>ld</command>, compile a dummy program and
- pass the <parameter>--verbose</parameter> switch to the linker. For
- example,
- <command>$LFS_TGT-gcc dummy.c -Wl,--verbose 2&gt;&amp;1 | grep succeeded</command>
- will show all the files successfully opened during the linking.</para>
- <para>The next package installed is GCC. An example of what can be
- seen during its run of <command>configure</command> is:</para>
- <screen><computeroutput>checking what assembler to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/as
- checking what linker to use... /mnt/lfs/tools/i686-lfs-linux-gnu/bin/ld</computeroutput></screen>
- <para>This is important for the reasons mentioned above. It also
- demonstrates that GCC's configure script does not search the PATH
- directories to find which tools to use. However, during the actual
- operation of <command>gcc</command> itself, the same search paths are not
- necessarily used. To find out which standard linker <command>gcc</command>
- will use, run: <command>$LFS_TGT-gcc -print-prog-name=ld</command>.</para>
- <para>Detailed information can be obtained from <command>gcc</command> by
- passing it the <parameter>-v</parameter> command line option while compiling
- a dummy program. For example, <command>gcc -v dummy.c</command> will show
- detailed information about the preprocessor, compilation, and assembly
- stages, including <command>gcc</command>'s include search paths and their
- order.</para>
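- <para>The <filename>dummy.c</filename> file used in these examples is not
- provided by any package. A minimal sketch, assuming that an empty main
- function is enough to exercise the compiler and the linker, and that the
- cross-compiler and Glibc are already installed, might look like this:</para>
- <screen><userinput># Create a trivial test program (its content is an assumption)
- echo 'int main(){}' &gt; dummy.c
- # List the files the linker opened successfully
- $LFS_TGT-gcc dummy.c -Wl,--verbose 2&gt;&amp;1 | grep succeeded
- # Show the preprocessor, compilation and assembly details
- $LFS_TGT-gcc -v dummy.c</userinput></screen>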
- <para>Next installed are sanitized Linux API headers. These allow the
- standard C library (Glibc) to interface with features that the Linux
- kernel will provide.</para>
- <para>The next package installed is Glibc. The most important
- considerations for building Glibc are the compiler, binary tools, and
- kernel headers. The compiler is generally not an issue since Glibc will
- always use the compiler relating to the <parameter>--host</parameter>
- parameter passed to its configure script; e.g. in our case, the compiler
- will be <command>$LFS_TGT-gcc</command>. The binary tools and kernel
- headers can be a bit more complicated. Therefore, take no risks and use
- the available configure switches to enforce the correct selections. After
- the run of <command>configure</command>, check the contents of the
- <filename>config.make</filename> file in the <filename
- class="directory">build</filename> directory for all important details.
- Note the use of <parameter>CC="$LFS_TGT-gcc"</parameter> (with
- <envar>$LFS_TGT</envar> expanded) to control which binary tools are used
- and the use of the <parameter>-nostdinc</parameter> and
- <parameter>-isystem</parameter> flags to control the compiler's include
- search path. These items highlight an important aspect of the Glibc
- package: it is very self-sufficient in terms of its build machinery
- and generally does not rely on toolchain defaults.</para>
- <para>As said above, the standard C++ library is compiled next, followed in
- <xref linkend="chapter-temporary-tools"/> by all the programs that require
- themselves in order to be built. The install step of all those packages
- uses the <envar>DESTDIR</envar> variable to have the
- programs land in the LFS filesystem.</para>
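- <para>As a sketch of such an install step, assuming that the package's
- makefiles honor <envar>DESTDIR</envar> and that the <envar>LFS</envar>
- variable holds the mount point of the LFS filesystem, the command would
- look like this:</para>
- <screen><userinput># Install under $LFS instead of the root of the build machine
- make DESTDIR=$LFS install</userinput></screen>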
- <para>At the end of <xref linkend="chapter-temporary-tools"/> the native
- lfs compiler is installed. First binutils-pass2 is built,
- with the same <envar>DESTDIR</envar> install as the other programs,
- then the second pass of GCC is constructed, omitting libstdc++
- and other non-essential libraries. Due to some odd logic in GCC's
- configure script, <envar>CC_FOR_TARGET</envar> ends up as
- <command>cc</command> when the host is the same as the target, but is
- different from the build system. This is why
- <parameter>CC_FOR_TARGET=$LFS_TGT-gcc</parameter> is put explicitly into
- the configure options.</para>
- <para>Upon entering the chroot environment in <xref
- linkend="chapter-chroot-temporary-tools"/>, the first task is to install
- libstdc++. Then temporary installations of programs needed for the proper
- operation of the toolchain are performed. Programs needed for testing
- other programs are also built. From this point onwards, the
- core toolchain is self-contained and self-hosted. In
- <xref linkend="chapter-building-system"/>, final versions of all the
- packages needed for a fully functional system are built, tested and
- installed.</para>
- </sect2>
- </sect1>