
Updated Man-DB text to account for recent Man-DB development. Many thanks to Alexander Patrakov for patiently guiding me through this.

git-svn-id: http://svn.linuxfromscratch.org/LFS/trunk/BOOK@8698 4aa44e1e-78dd-0310-a6d2-fbcd4c07a689
DJ Lucas 17 years ago
parent
commit
7f89db8a15
2 changed files with 130 additions and 77 deletions
  1. 11 0
      chapter01/changelog.xml
  2. 119 77
      chapter06/man-db.xml

+ 11 - 0
chapter01/changelog.xml

@@ -36,6 +36,17 @@
     </listitem>
 
 -->
+    <listitem>
+      <para>2008-10-25</para>
+      <itemizedlist>
+        <listitem>
+          <para>[dj] - Updated the text on the Man-DB page to account for recent
+          changes in Man-DB.  Thanks to Alexander Patrakov for providing most
+          of the included text, explanations, and examples.</para>
+        </listitem>
+      </itemizedlist>
+    </listitem>
+
     <listitem>
       <para>2008-10-23</para>
       <itemizedlist>

+ 119 - 77
chapter06/man-db.xml

@@ -111,71 +111,95 @@
 
 <screen><userinput remap="install">make install</userinput></screen>
 
-    <para>Some packages provide UTF-8 manual pages, which previous versions of
-    <application>Man-DB</application> were unable to display.  This limitation
-    has been fixed in recent versions, and <application>Man-DB</application>
-    can now convert manual pages from legacy encodings to UTF-8
-    (and vice-versa) on the fly.  This used to be a rather annoying
-    problem across different distributions, as packages written for one
-    distribution would require changes to work on another. The following
-    script will allow you to convert manual pages to and from legacy and UTF-8
-    encodings.</para>
-
-<screen><userinput remap="install">cat &gt;&gt; convert-mans &lt;&lt; "EOF"
-<literal>#!/bin/sh -e
-FROM="$1"
-TO="$2"
-shift ; shift
-while [ $# -gt 0 ]
-do
-        FILE="$1"
-        shift
-        iconv -f "$FROM" -t "$TO" "$FILE" >.tmp.iconv
-        mv .tmp.iconv "$FILE"
-done</literal>
-EOF
-install -m755 convert-mans  /usr/bin</userinput></screen>
-
-    <para>Additional information regarding the compression of
-    man and info pages can be found in the BLFS book at
-    <ulink url="&blfs-root;view/cvs/postlfs/compressdoc.html"/>.</para>
-
   </sect2>
 
   <sect2>
     <title>Non-English Manual Pages in LFS</title>
+<!--
+    <para>Some packages provide UTF-8 manual pages, which previous versions of
+    <application>Man-DB</application> were unable to display correctly because
+    the expected (8-bit) encoding for each language was hard-coded in the
+    source of <application>Man-DB</application>.
+    <application>Man-DB</application> now uses the extension of the directory
+    name in order to determine the encoding of the manual pages stored within.
+    If no extension exists, <application>Man-DB</application> uses a built-in
+    table (see below) to determine the encoding.  E.g., because of "UTF-8" in
+    the directory name, it knows that all manual pages residing in 
+    <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
+    encoded and, according to the built-in table, expects all manual pages
+    residing in <filename class="directory">/usr/share/man/ru</filename> to
+    be encoded using KOI8-R.</para>
 
     <para>Linux distributions have different policies concerning the character
     encoding in which manual pages are stored in the filesystem. E.g., RedHat
     stores all manual pages in UTF-8, while Debian previously used
-    language-specific (mostly 8-bit) encodings. As mentioned above, this leads
-    to incompatibility of packages with manual pages designed for different
-    distributions.</para>
-
-    <para>LFS previously used the same convention as Debian. This was chosen
-    because <application>Man-DB</application> did not understand manual pages
-    stored in UTF-8 at the time of its introduction into LFS.  For our purposes
-    at that time, <application>Man-DB</application> was preferable to
-    <application>Man</application> as it worked without any additional
-    configuration in any locale.  This is still true today as
-    <application>Man-DB</application> with Debian patched
-    <application>Groff</application> will now dynamically convert UTF-8 encoded
-    manual pages to the user's locale.  Additionally, this combination provides
-    support for Chinese and Japanese locales, and limited support for Korean,
-    whereas <application>Man</application> does not. The current offering of
-    <application>Man</application> as used in RedHat requires major
-    modifications to both the <application>Man</application> and
-    <application>Groff</application> packages, and still falls short on
-    Chinese, Japanese, and Korean encodings.</para>
-
-    <para>Finally, most distributions, including Debian, are rapidly migrating
-    to all UTF-8 encoded manual pages. Upstream packagers will very likely drop
-    legacy encodings in favor of UTF-8, though adoption has been slow due to
-    the hacks required to make the current <application>Man</application> and
-    <application>Groff</application> packages work correctly together.</para>
-
-    <para>The relationship between language codes and the expected encoding
-    of legacy manual pages is listed below.</para>
+    language-specific (mostly 8-bit) encodings. Many other distributions simply
+    ignore the problem altogether.  LFS also used the legacy encodings in
+    previous versions of the book. This was chosen because of the ease of
+    configuration associated with <application>Man-DB</application>.
+    Additionally, <application>Man-DB</application> provided support for
+    Chinese and Japanese locales, and limited support for Korean, whereas
+    <application>Man</application> did not at that time.</para>
+
+    <para>In contrast, the setup in Fedora Core expects all manual pages
+    to be UTF-8 encoded, and stored in directories without suffixes.
+    Disagreement about the expected encoding of manual pages amongst
+    distribution vendors has led to confusion for upstream package maintainers.
+    Some packages contain UTF-8 manual pages, while others ship with manual
+    pages in legacy encodings.  Unlike the
+    <application>Man</application>/<application>Groff</application> setup in
+    Fedora Core, <application>Man-DB</application> can make very good decisions
+    about the on-disk encoding and present the information to the user in their
+    preferred format, without complex configurations.</para>
+
+    <para><application>Man-DB</application> has, for the most part, made this
+    problem completely transparent to end users, as long as the manual pages
+    are installed into the correct directory.  There may be times, however,
+    where one encoding is preferred over the other.  For this purpose, the
+    <command>convert-mans</command> script was written. It will convert manual
+    pages to another encoding before (or after) installation.  Install the
+    <command>convert-mans</command> script with the following
+    instructions:</para>
+-->
+    <para>Some packages provide non-English manual pages. They are displayed 
+    correctly only if their location and encoding match the expectations of
+    the <command>man</command> program. However, different Linux distributions have different
+    policies (expressed in the choice of the <command>man</command> program,
+    its configuration and patches applied to it) concerning the character 
+    encoding in which manual pages are stored in the filesystem.</para>
+
+    <para>E.g., Debian previously required Russian manual pages to be encoded
+    in KOI8-R and to be placed in
+    <filename class="directory">/usr/share/man/ru</filename>. Now, in addition,
+    their <command>man</command> program (<application>Man-DB</application>)
+    searches for UTF-8 encoded Russian manual pages in
+    <filename class="directory">/usr/share/man/ru.UTF-8</filename>. On the
+    other hand, Fedora uses UTF-8 encoded manual pages exclusively. Russian
+    manual pages  are found in
+    <filename class="directory">/usr/share/man/ru</filename> and their
+    <command>man</command> program doesn't acknowledge
+    <filename class="directory">/usr/share/man/ru.UTF-8</filename>.  Many
+    other distributions ignore the on-disk encodings completely, leaving the
+    end user with a mix of improperly encoded manual pages for their
+    configuration. When <command>man</command> processes the requested page,
+    it will display the contents as configured, resulting in completely
+    unreadable text if the on-disk encoding is not what is expected for that
+    configuration.</para>
+
+    <para>Disagreement about the expected encoding of manual pages amongst
+    distribution vendors has led to confusion for upstream package
+    maintainers. One package may contain UTF-8 manual pages, while another
+    ships with manual pages in legacy encodings. <command>man</command>
+    searches for manual pages based on the user's locale settings.
+    <application>Man-DB</application> uses a built-in table (see below) to
+    determine the on-disk encoding of manual pages found for a user's
+    locale, only if the directories found do not have an extension that
+    describes the encoding. E.g., because of ".UTF-8" in the directory name,
+    <application>Man-DB</application> knows that all manual pages residing in 
+    <filename class="directory">/usr/share/man/fr.UTF-8</filename> are UTF-8
+    encoded and, according to the built-in table, expects all manual pages
+    residing in <filename class="directory">/usr/share/man/ru</filename> to
+    be encoded using KOI8-R.</para>
 
     <!-- Origin: man-db-2.5.2/src/encodings.c -->
     <table>
@@ -308,7 +332,7 @@ install -m755 convert-mans  /usr/bin</userinput></screen>
             <entry>GBK</entry>
           </row>
           <row>
-            <entry>Simplified Chinese,Singapore} (zh_SG)</entry>
+            <entry>Simplified Chinese, Singapore (zh_SG)</entry>
             <entry>GBK</entry>
           </row>
           <row>
@@ -330,12 +354,36 @@ install -m755 convert-mans  /usr/bin</userinput></screen>
       Norwegian does not work because of the transition from no_NO to
       nb_NO locale, and will be fixed in the next release of 
       <application>Man-DB</application>.  Korean is currently non functional
-      because of incomplete fixes in the Groff patch.</para>
+      because of incomplete fixes in the Debian
+      <application>Groff</application> patch applied in LFS.</para>
     </note>
 
+    <para>Packages may install manual pages into an improperly named directory,
+    depending on which distributions the author develops the package for. To
+    assist in the conversion of the manual pages to the proper encoding for the
+    directory in which they are installed, the <command>convert-mans</command>
+    script was written. It will convert manual pages to another encoding before
+    (or after) installation.  Install the <command>convert-mans</command>
+    script with the following instructions:</para>
+
+<screen><userinput remap="install">cat &gt;&gt; convert-mans &lt;&lt; "EOF"
+<literal>#!/bin/sh -e
+FROM="$1"
+TO="$2"
+shift ; shift
+while [ $# -gt 0 ]
+do
+        FILE="$1"
+        shift
+        iconv -f "$FROM" -t "$TO" "$FILE" >.tmp.iconv
+        mv .tmp.iconv "$FILE"
+done</literal>
+EOF
+install -m755 convert-mans  /usr/bin</userinput></screen>
+
 
-    <para>If upstream distributes the manual pages in a legacy encoding,
-    the manual pages can simply be copied to
+    <para>If upstream distributes the manual pages in a legacy encoding, the
+    manual pages can simply be copied to
     <filename class="directory">/usr/share/man/<replaceable>&lt;language
     code&gt;</replaceable></filename>. For example, <ulink
     url="http://www.infodrom.org/projects/manpages-de/download/manpages-de-0.5.tar.gz">
@@ -353,26 +401,20 @@ cp -rv man? /usr/share/man/de</userinput></screen>
     code&gt;</replaceable>.UTF-8</filename>.</para>
 
     <para>For example, to install <ulink
-    url="http://ditec.um.es/~piernas/manpages-es/man-pages-es-1.55.tar.bz2">
-    Spanish manual pages</ulink> in the legacy encoding, use the following
+    url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2">
+    French manual pages</ulink> in the legacy encoding, use the following
     commands:</para>
 
-<screen role="nodump"><userinput>mv man7/iso_8859-7.7{,X}
-convert-mans UTF-8 ISO-8859-1 man?/*.?
-mv man7/iso_8859-7.7{X,}
-make install</userinput></screen>
+<screen role="nodump"><userinput>convert-mans UTF-8 ISO-8859-1 man?/*.?
+mkdir -p /usr/share/man/fr
+cp -rv man? /usr/share/man/fr</userinput></screen>
 
-    <note>
-      <para>The <filename>man7/iso_8859-7.7</filename> file needs to be
-      excluded from the conversion process because it is already in
-      ISO-8859-1 format.  This is a packaging bug in man-pages-es-1.55.
-      Future versions should not require this workaround.</para>
-    </note>
+    <note><para>The French manual pages ship with ready-made scripts to do the
+    same conversion. The above instructions are used only as an example for
+    use of the <command>convert-mans</command> script.</para></note>
 
-    <para>Finally, as an example installation of UTF-8 manual pages, the <ulink 
-    url="http://manpagesfr.free.fr/download/man-pages-fr-2.40.0.tar.bz2">
-    French manual pages</ulink> can be installed with the following
-    commands:</para>
+    <para>Finally, as an example installation of UTF-8 manual pages, the same
+    French manual pages could be installed with the following commands:</para>
 
 <screen role="nodump"><userinput>mkdir -p /usr/share/man/fr.UTF-8
 cp -rv man? /usr/share/man/fr.UTF-8</userinput></screen>
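
The new text describes how the encoding is inferred from the directory name: an extension such as ".UTF-8" names the encoding directly, and otherwise a built-in table keyed on the language code is consulted. A minimal shell sketch of that rule follows. It is an illustration only, not Man-DB's implementation (the real table lives in src/encodings.c). The function name expected_encoding is hypothetical, and the table holds only three entries taken from this page (ru, zh_SG, and fr as implied by the French conversion example); every other code is reported as "unknown".

```shell
#!/bin/sh
# Hypothetical helper, not part of Man-DB: guess the on-disk encoding
# expected for a manual page directory from the directory name alone.
expected_encoding() {
    dir=$(basename "$1")
    case "$dir" in
        *.*)
            # An extension such as ".UTF-8" names the encoding directly.
            printf '%s\n' "${dir#*.}"
            ;;
        # Tiny stand-in for the built-in table; only entries that
        # appear on this page are listed.
        ru)    printf '%s\n' "KOI8-R" ;;
        fr)    printf '%s\n' "ISO-8859-1" ;;
        zh_SG) printf '%s\n' "GBK" ;;
        *)     printf '%s\n' "unknown" ;;
    esac
}

expected_encoding /usr/share/man/fr.UTF-8
expected_encoding /usr/share/man/ru
```

A result from this helper could be passed as the second argument to convert-mans when pages need converting to match the directory they will be installed into.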