- <refentry xmlns="http://docbook.org/ns/docbook"
 
-           xmlns:xlink="http://www.w3.org/1999/xlink"
 
-           xmlns:xi="http://www.w3.org/2001/XInclude"
 
-           xmlns:src="http://nwalsh.com/xmlns/litprog/fragment"
 
-           xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 
-           version="5.0" xml:id="make.index.markup">
 
- <refmeta>
 
- <refentrytitle>make.index.markup</refentrytitle>
 
- <refmiscinfo class="other" otherclass="datatype">boolean</refmiscinfo>
 
- </refmeta>
 
- <refnamediv>
 
- <refname>make.index.markup</refname>
 
- <refpurpose>Generate XML index markup in the index?</refpurpose>
 
- </refnamediv>
 
- <refsynopsisdiv>
 
- <src:fragment xml:id="make.index.markup.frag">
 
- <xsl:param name="make.index.markup" select="0"/>
 
- </src:fragment>
 
- </refsynopsisdiv>
 
- <refsection><info><title>Description</title></info>
 
- <para>When set to a non-zero value, this parameter enables a very neat trick for getting properly
 
- merged, collated back-of-the-book indexes. G. Ken Holman suggested
 
- this trick at Extreme Markup Languages 2002 and I'm indebted to him
 
- for it.</para>
 
- <para>Jeni Tennison's excellent code in
 
- <filename>autoidx.xsl</filename> does a great job of merging and
 
- sorting <tag>indexterm</tag>s in the document and building a
 
- back-of-the-book index. However, there's one thing that it cannot
 
- reasonably be expected to do: merge page numbers into ranges. (I would
 
- not have thought that it could collate and suppress duplicate page
 
- numbers, but in fact it appears to manage that task somehow.)</para>
 
- <para>Ken's trick is to produce a document in which the index at the
 
- back of the book is <quote>displayed</quote> in XML. Because the index
 
- is generated by the FO processor, all of the page numbers have been resolved.
 
- It's a bit hard to explain, but what it boils down to is that instead of having
 
- an index at the back of the book that looks like this:</para>
 
- <blockquote>
 
- <formalpara><info><title>A</title></info>
 
- <para>ap1, 1, 2, 3</para>
 
- </formalpara>
 
- </blockquote>
 
- <para>you get one that looks like this:</para>
 
- <blockquote>
 
- <programlisting>&lt;indexdiv&gt;A&lt;/indexdiv&gt;
 
- &lt;indexentry&gt;
 
- &lt;primaryie&gt;ap1&lt;/primaryie&gt;,
 
- &lt;phrase role="pageno"&gt;1&lt;/phrase&gt;,
 
- &lt;phrase role="pageno"&gt;2&lt;/phrase&gt;,
 
- &lt;phrase role="pageno"&gt;3&lt;/phrase&gt;
 
- &lt;/indexentry&gt;</programlisting>
 
- </blockquote>
 
- <para>After building a PDF file with this sort of odd-looking index, you can
 
- extract the text from the PDF file and the result is a proper index expressed in
 
- XML.</para>
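 
- <para>As a minimal sketch of how such a build might be set up, a customization
 
- layer can switch the parameter on before running the FO stylesheet. The import
 
- path below is a placeholder (an assumption, not a path taken from this document);
 
- point it at wherever your copy of the stock FO stylesheet actually lives:</para>
 
- <programlisting>&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 
-                 version="1.0"&gt;
 
-   &lt;!-- Placeholder path (assumption): adjust to your local FO stylesheet --&gt;
 
-   &lt;xsl:import href="path/to/docbook-xsl/fo/docbook.xsl"/&gt;
 
-   &lt;!-- Any non-zero value turns on the XML-markup index shown above --&gt;
 
-   &lt;xsl:param name="make.index.markup" select="1"/&gt;
 
- &lt;/xsl:stylesheet&gt;</programlisting>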
 
- <para>Now you have data that's amenable to processing and a simple Perl script
 
- (such as <filename>fo/pdf2index</filename>) can
 
- merge page ranges and generate a proper index.</para>
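 
- <para>As a purely illustrative sketch (hand-written here, not actual output
 
- from <filename>fo/pdf2index</filename>, whose exact format is not shown in this
 
- description), the entry above might come out of the merge step looking something
 
- like this, with pages 1, 2, and 3 collapsed into a single range:</para>
 
- <programlisting>&lt;index&gt;
 
-   &lt;!-- Hypothetical merged result; element layout is an assumption --&gt;
 
-   &lt;indexdiv&gt;&lt;title&gt;A&lt;/title&gt;
 
-     &lt;indexentry&gt;
 
-       &lt;primaryie&gt;ap1, 1-3&lt;/primaryie&gt;
 
-     &lt;/indexentry&gt;
 
-   &lt;/indexdiv&gt;
 
- &lt;/index&gt;</programlisting>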
 
- <para>Finally, reformat your original document using this literal index instead of
 
- an automatically generated one and <quote>bingo</quote>!</para>
 
- </refsection>
 
- </refentry>
 
 