Kirill Smelkov | 25 Jul 19:48 2012
Re: Multiple definition list terms, single definition?

+milde, +docutils-devel

On Fri, Apr 20, 2007 at 09:35:56AM -0400, David Goodger wrote:
> On 4/19/07, Tony Ibbs <tibs <at> tibsnjoan.co.uk> wrote:
> > Checking back on the reStructuredText documentation, it looks as if
> > it defines definition lists (not unreasonably) as term/definition
> > pairs, without the ability to have more than one term associated
> > with the same definition.
> 
> Correct.
> 
> > My colleague has reformatted the document (the traditional
> > workaround), but I wonder if either (a) there *is* a way to do it,
> > which we just couldn't think of,
> 
> No, not without ugly hacks (like forced line breaks via substitutions
> and "raw:: html").
> 
> > or (b) it is something that we *should* be supporting?
> 
> Perhaps; there's precedent.  HTML defines DL element contents as
> "(DT|DD)+", which allows multiple terms (and multiple definitions
> too).  That's very loose and a bit nonsensical (the definition list
> can start with a definition; but for what term?).  DocBook has
> "variable lists", and the contents of varlistentry elements is
> "(term+,listitem)", which allows multiple terms.  AFAICT OpenDocument
> doesn't have a concept of definition lists; it's very
> presentation-oriented.
> 
> Currently reST doesn't allow multiple-line terms.  If you try it
> you'll get an "unexpected indentation" error.  Perhaps we could define
> definition list items as one term per line, so this input:
> 
>     term 1
>     term 2
>        definition
> 
> would produce this output:
> 
>     <definition_list>
>         <definition_list_item>
>             <term>
>                 term 1
>             <term>
>                 term 2
>             <definition>
>                 <paragraph>
>                     definition
> 
> The current output is:
> 
>     <paragraph>
>         term 1
>         term 2
>     <system_message level="3" line="3" source="<stdin>" type="ERROR">
>         <paragraph>
>             Unexpected indentation.
>     <block_quote>
>         <paragraph>
>             definition
> 
> I think this type of mistake is currently quite common.  For example,
> a paragraph followed immediately (no blank line) by an indented bullet
> list.  If we make this change to definition lists a lot of those
> helpful errors will disappear, and potentially unhelpful definition
> lists will take their place.  Will the new behavior seem surprising or
> ambiguous?
> 
> The internal representation is tricky since we also have classifiers
> to contend with.  Currently a definition_list_item is defined as
> 
>     (term, classifier*, definition)
> 
> The new definition would be:
> 
>     ((term, classifier*)+, definition)
> 
> I'm not sure if the change is worthwhile or not.  It's certainly
> worthwhile discussing though.
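
For reference, the two content models quoted above can be compared with a small stand-alone check. This is only a sketch: it encodes the child element names as single letters and uses regular expressions in place of the DTD. `OLD_MODEL`, `NEW_MODEL` and `matches` are names made up for this illustration; they are not docutils API.

```python
import re

# Child element names of a definition_list_item, encoded one letter each.
_CODES = {"term": "t", "classifier": "c", "definition": "d"}

OLD_MODEL = re.compile(r"tc*d")        # (term, classifier*, definition)
NEW_MODEL = re.compile(r"(?:tc*)+d")   # ((term, classifier*)+, definition)


def matches(model, children):
    """Return True if the sequence of child names satisfies the model."""
    encoded = "".join(_CODES[name] for name in children)
    return model.fullmatch(encoded) is not None


single = ["term", "classifier", "definition"]
multi = ["term", "term", "classifier", "definition"]

print(matches(OLD_MODEL, single), matches(NEW_MODEL, single))
print(matches(OLD_MODEL, multi), matches(NEW_MODEL, multi))
```

Every item that is valid under the old model stays valid under the new one, so the change only widens what is accepted.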

FYI, with the following patch I've taught the RST parser to accept
multi-term definition lists. And with another patch to the latex2e writer
and a custom TeX stylesheet, the output looks similar to e.g.

    http://lwn.net/images/pdf/LDD3/ch02.pdf , page 26(40)
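
The grouping rule itself, taken out of the docutils state machine, could be sketched roughly as follows. This is a simplification, not what the patch does internally: it assumes that a definition ends at the first blank line, it ignores classifiers, and it silently skips plain paragraphs and block quotes. `group_definition_items` is a hypothetical helper name.

```python
def group_definition_items(lines):
    """Group lines into (terms, definition_lines) pairs.

    Simplified rule: one or more consecutive flush-left lines followed
    directly by an indented block form one item. Flush-left runs that are
    not followed by an indented block (plain paragraphs) are skipped.
    """
    items = []
    terms = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            terms = []  # a blank line ends a pending run of terms
            i += 1
        elif line[0].isspace():
            # indented block: collect until a blank or flush-left line
            body = []
            while i < len(lines) and lines[i].strip() and lines[i][0].isspace():
                body.append(lines[i].strip())
                i += 1
            if terms:
                items.append((terms, body))
            terms = []
        else:
            terms.append(line.strip())
            i += 1
    return items


demo = [
    "function1(some,arguments)",
    "   Behaves like this",
    "",
    "function2(some,arguments)",
    "function3(some,other,arguments)",
    "   All behave in this other way.",
]

for terms, body in group_definition_items(demo):
    print(terms, "->", body)
```

The second item comes out with two terms sharing one definition, which is the case the patch below adds to the parser.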

Thanks,
Kirill

P.S. Sorry for my English
P.P.S. http://repo.or.cz/w/docutils/kirr.git/shortlog/refs/heads/y/multidef

---- 8< ----
===============
 Multidef demo
===============

Test document to demonstrate definition lists with multi-term entries.

function1(some,arguments)
   Behaves like this

function2(some,arguments)
function3(some,other,arguments)
function4(some,variant,on,those)
   All behave in this other way, which is best described in
   one place together. Blah blah blah...

function5(whatever)
   Goes back to the normal pattern.

---- 8< ----
From 43e39327bdf01f769376c883d686c8b9db90f55d Mon Sep 17 00:00:00 2001
From: Kirill Smelkov <kirr <at> mns.spb.ru>
Date: Wed, 25 Jul 2012 21:29:45 +0400
Subject: [PATCH 1/2] rst: Add support for parsing multiterm definitions

e.g. for definition list like this:

    term1
        definition ...

    term2
    term3
        common definition for terms 2 and 3

I've updated the tests minimally (i.e. they all pass, though it would still
be better to change the wording), but the doctree spec remains untouched.
---
 docutils/docutils/parsers/rst/states.py            | 38 +++++++++++++++++-----
 .../test_parsers/test_rst/test_block_quotes.py     | 18 +++++-----
 .../test_parsers/test_rst/test_definition_lists.py | 18 +++++-----
 .../test_rst/test_directives/test_include.py       | 18 +++++-----
 .../test_parsers/test_rst/test_literal_blocks.py   | 25 ++++++++------
 5 files changed, 71 insertions(+), 46 deletions(-)

diff --git a/docutils/docutils/parsers/rst/states.py b/docutils/docutils/parsers/rst/states.py
index 5776287..aa048b4 100644
--- a/docutils/docutils/parsers/rst/states.py
+++ b/docutils/docutils/parsers/rst/states.py
@@ -2707,15 +2707,17 @@ class Text(RSTState):
         return [], next_state, []

     def text(self, match, context, next_state):
-        """Paragraph."""
+        """Paragraph or definition terms."""
         startline = self.state_machine.abs_line_number() - 1
         msg = None
         try:
             block = self.state_machine.get_text_block(flush_left=True)
         except statemachine.UnexpectedIndentationError, err:
+            # it was a multiterm definition
             block, src, srcline = err.args
-            msg = self.reporter.error('Unexpected indentation.',
-                                      source=src, line=srcline)
+            # continue parsing -> it should call .indent() next
+            return context+list(block), 'Text', []
+
         lines = context + list(block)
         paragraph, literalnext = self.paragraph(lines, startline)
         self.parent += paragraph
@@ -2756,19 +2758,20 @@ class Text(RSTState):
         self.goto_line(new_abs_offset)
         return parent_node.children

-    def definition_list_item(self, termline):
+    def definition_list_item(self, termlines):
         indented, indent, line_offset, blank_finish = \
               self.state_machine.get_indented()
         itemnode = nodes.definition_list_item(
-            '\n'.join(termline + list(indented)))
+            '\n'.join(termlines + list(indented)))
         lineno = self.state_machine.abs_line_number() - 1
         (itemnode.source,
          itemnode.line) = self.state_machine.get_source_and_line(lineno)
-        termlist, messages = self.term(termline, lineno)
-        itemnode += termlist
+        for termline in termlines:
+            termlist, messages = self.term([termline], lineno)
+            itemnode += termlist
         definition = nodes.definition('', *messages)
         itemnode += definition
-        if termline[0][-2:] == '::':
+        if termlines[-1][-2:] == '::':
             definition += self.reporter.info(
                   'Blank line missing before literal block (after the "::")? '
                   'Interpreted as a definition list item.',
@@ -2830,7 +2833,7 @@ class Definition(SpecializedText):

     def eof(self, context):
         """Not a definition."""
-        self.state_machine.previous_line(2) # so parent SM can reassess
+        self.state_machine.previous_line(len(context)+1) # so parent SM can reassess
         return []

     def indent(self, match, context, next_state):
@@ -2841,6 +2844,23 @@ class Definition(SpecializedText):
         return [], 'DefinitionList', []

 
+    def text(self, match, context, next_state):
+        startline = self.state_machine.abs_line_number() - 1
+
+        try:
+            block = self.state_machine.get_text_block(flush_left=True)
+        except statemachine.UnexpectedIndentationError, err:
+            # ok, it was a multiterm definition
+            block, src, srcline = err.args
+        else:
+            # no, unroll (only block, context will be unrolled in .eof)
+            self.state_machine.previous_line(len(block))
+            raise EOFError
+
+        # continue parsing -> it should call .indent() next
+        return context+list(block), 'Definition', []
+
+
 class Line(SpecializedText):

     """
diff --git a/docutils/test/test_parsers/test_rst/test_block_quotes.py b/docutils/test/test_parsers/test_rst/test_block_quotes.py
index 2d55fa0..125ad95 100755
--- a/docutils/test/test_parsers/test_rst/test_block_quotes.py
+++ b/docutils/test/test_parsers/test_rst/test_block_quotes.py
@@ -60,15 +60,15 @@ Line 2.
 """,
 """\
 <document source="test data">
-    <paragraph>
-        Line 1.
-        Line 2.
-    <system_message level="3" line="3" source="test data" type="ERROR">
-        <paragraph>
-            Unexpected indentation.
-    <block_quote>
-        <paragraph>
-            Unexpectedly indented.
+    <definition_list>
+        <definition_list_item>
+            <term>
+                Line 1.
+            <term>
+                Line 2.
+            <definition>
+                <paragraph>
+                    Unexpectedly indented.
 """],
 ["""\
 Line 1.
diff --git a/docutils/test/test_parsers/test_rst/test_definition_lists.py b/docutils/test/test_parsers/test_rst/test_definition_lists.py
index 76251b6..22fd462 100755
--- a/docutils/test/test_parsers/test_rst/test_definition_lists.py
+++ b/docutils/test/test_parsers/test_rst/test_definition_lists.py
@@ -94,15 +94,15 @@ a term may only be one line long
 """,
 """\
 <document source="test data">
-    <paragraph>
-        this is not a term;
-        a term may only be one line long
-    <system_message level="3" line="3" source="test data" type="ERROR">
-        <paragraph>
-            Unexpected indentation.
-    <block_quote>
-        <paragraph>
-            this is not a definition
+    <definition_list>
+        <definition_list_item>
+            <term>
+                this is not a term;
+            <term>
+                a term may only be one line long
+            <definition>
+                <paragraph>
+                    this is not a definition
 """],
 ["""\
 term 1
diff --git a/docutils/test/test_parsers/test_rst/test_directives/test_include.py b/docutils/test/test_parsers/test_rst/test_directives/test_include.py
index 5fce520..22686fa 100755
--- a/docutils/test/test_parsers/test_rst/test_directives/test_include.py
+++ b/docutils/test/test_parsers/test_rst/test_directives/test_include.py
@@ -504,15 +504,15 @@ Testing errors in included file:
                 Invalid context: the "date" directive can only be used within a substitution definition.
             <literal_block xml:space="preserve">
                 .. date::
-        <paragraph>
-            not a
-            definition list:
-        <system_message level="3" line="29" source="%(source)s" type="ERROR">
-            <paragraph>
-                Unexpected indentation.
-        <block_quote>
-            <paragraph>
-                as a term may only be one line long.
+        <definition_list>
+            <definition_list_item>
+                <term>
+                    not a
+                <term>
+                    definition list:
+                <definition>
+                    <paragraph>
+                        as a term may only be one line long.
         <system_message level="3" line="31" source="%(source)s" type="ERROR">
             <paragraph>
                 Error in "admonition" directive:
diff --git a/docutils/test/test_parsers/test_rst/test_literal_blocks.py b/docutils/test/test_parsers/test_rst/test_literal_blocks.py
index d1738dc..4157e07 100755
--- a/docutils/test/test_parsers/test_rst/test_literal_blocks.py
+++ b/docutils/test/test_parsers/test_rst/test_literal_blocks.py
@@ -94,16 +94,21 @@ one line::
 """,
 """\
 <document source="test data">
-    <paragraph>
-        A paragraph
-        on more than
-        one line:
-    <system_message level="3" line="4" source="test data" type="ERROR">
-        <paragraph>
-            Unexpected indentation.
-    <literal_block xml:space="preserve">
-        A literal block
-        with no blank line above.
+    <definition_list>
+        <definition_list_item>
+            <term>
+                A paragraph
+            <term>
+                on more than
+            <term>
+                one line::
+            <definition>
+                <system_message level="1" line="5" source="test data" type="INFO">
+                    <paragraph>
+                        Blank line missing before literal block (after the "::")? Interpreted as a definition list item.
+                <paragraph>
+                    A literal block
+                    with no blank line above.
 """],
 ["""\
 A paragraph::
-- 
1.7.11.1.213.gb567ea5.dirty

---- 8< ----
From b97723dc3ab11eeae16e363fd7ba4d03b8502005 Mon Sep 17 00:00:00 2001
From: Kirill Smelkov <kirr <at> mns.spb.ru>
Date: Wed, 25 Jul 2012 21:32:50 +0400
Subject: [PATCH 2/2] latex2e: Add support for multiterm definitions

emits \item for first term and \xitem for second and subsequent terms.
Needs \xitem support from stylesheet.
---
 docutils/docutils/writers/latex2e/__init__.py | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/docutils/docutils/writers/latex2e/__init__.py b/docutils/docutils/writers/latex2e/__init__.py
index 4db4036..1d65002 100644
--- a/docutils/docutils/writers/latex2e/__init__.py
+++ b/docutils/docutils/writers/latex2e/__init__.py
@@ -1248,6 +1248,9 @@ class LaTeXTranslator(nodes.NodeVisitor):
         self.out = self.body
         self.out_stack = []  # stack of output collectors

+        # Seen terms in current definition_list_item so far
+        self._deflistitem_nterms = 0
+
         # Process settings
         # ~~~~~~~~~~~~~~~~
         # Encodings:
@@ -1766,7 +1769,7 @@ class LaTeXTranslator(nodes.NodeVisitor):
         self.out.append( '\\end{description}\n' )

     def visit_definition_list_item(self, node):
-        pass
+        self._deflistitem_nterms = 0

     def depart_definition_list_item(self, node):
         pass
@@ -2821,7 +2824,11 @@ class LaTeXTranslator(nodes.NodeVisitor):
         """definition list term"""
         # Commands with optional args inside an optional arg must be put
         # in a group, e.g. ``\item[{\hyperref[label]{text}}]``.
-        self.out.append('\\item[{')
+        multidef = (self._deflistitem_nterms > 0)
+        self.out.append('%s\\%sitem[{' %
+                            (multidef and '\n' or '',
+                             multidef and 'x' or ''))
+        self._deflistitem_nterms += 1

     def depart_term(self, node):
         # \leavevmode results in a line break if the
-- 
1.7.11.1.213.gb567ea5.dirty
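
As a rough illustration of what the writer change amounts to, the choice between `\item` and `\xitem` can be written as a stand-alone function. This is a sketch, not the writer code: `term_commands` is a hypothetical name, and closing each label with a bare `}]` is a simplification of what `depart_term` really emits.

```python
def term_commands(terms):
    """Return LaTeX label commands for the terms of one definition list item.

    The first term gets \\item; the second and later terms get \\xitem,
    each starting on a new line, as in the patch above.
    """
    out = []
    for n, term in enumerate(terms):
        command = "\\xitem" if n > 0 else "\\item"
        prefix = "\n" if n > 0 else ""
        # Simplification: the label is closed right after the term text.
        out.append("%s%s[{%s}]" % (prefix, command, term))
    return "".join(out)


print(term_commands(["function2()", "function3()", "function4()"]))
```

The conditional expressions do the same job as the `multidef and 'x' or ''` idiom in the patch, without relying on the truthiness of the middle operand.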

---- 8< ---- (multidef.sty,  `rst2xelatex.py --stylesheet=multidef.sty $<`)
% demo style with \xitem to support multidef output
% XXX \item is simplified - 
\usepackage{parskip}

% description: term in italic (was bold by default)
% http://stackoverflow.com/questions/2740437/changing-style-of-latex-description-lists
\renewcommand{\descriptionlabel}[1]{\hspace{\labelsep}\textit{#1}}

% description: text goes from newline
% http://stackoverflow.com/questions/486104/redefining-commands-in-a-new-environment
% XXX had to do via \item redefinition - had no luck modifying only \makelabel
\newcommand{\Xisdesci}{n}
\newcommand{\Xisdescii}{n}
\newcommand{\Xisdesciii}{n}
\newcommand{\Xisdesciv}{n}
\newcommand{\Xisdescv}{n}

\let\Xsaveitem\item
\renewcommand{\item}[1][Xnoarg]{%    FIXME Xnoarg is lame
    \ifthenelse{\equal{#1}{Xnoarg}}{
        \Xsaveitem%
    }{
        \Xsaveitem[#1]%
    }

    % in description text after label goes from newline
    \ifthenelse{\equal{\csname Xisdesc\romannumeral\the\@listdepth\endcsname}{y}}{%
        \hfil\linebreak%
    }{}
}

\let\Xorigdescription\description
\renewenvironment{description}{
  \Xorigdescription

  % mark the level as description
  \expandafter\def\csname Xisdesc\romannumeral\the\@listdepth\endcsname{y}
}
{
  % unmark on exit
  \expandafter\def\csname Xisdesc\romannumeral\the\@listdepth\endcsname{n}
  \endlist%
}

% corrected \item for descriptions where several terms come first and share
% common content, e.g.
%
%   func1();
%   func2();
%     These functions do that-and-that.
%
% \xitem should be used for the second and subsequent terms
\newcommand{\xitem}[1][]{%
    \Xsaveitem[#1]%
    % compensate \item's vspace
    \vskip -\parsep  \vskip -\baselineskip

    % now force text to go from newline, again
    \hfil\linebreak%

    % compensate some vspace back (XXX why is this needed?)
    \vskip -\parsep  \vskip -\baselineskip

    % now vspace and paragraph settings are the same as after \item here. good.
}
