Re: pod2mdoc, docbook2mdoc

From: Ingo Schwarze <schwarze@usta.de>
To: discuss@mdocml.bsd.lv
Subject: Re: pod2mdoc, docbook2mdoc
Date: Thu, 3 Apr 2014 15:53:02 +0200
Message-ID: <20140403135302.GA22574@iris.usta.de>
In-Reply-To: <533D552E.1080407@bsd.lv>

Hi Kristaps and Thomas,

Kristaps Dzonsons wrote on Thu, Apr 03, 2014 at 02:33:50PM +0200:
> Thomas Klausner wrote:

>> Some suggestions:

Thomas, you are already aiming high.
All your suggestions definitely make sense, but
they are beyond what the current state-of-the-art tools can do.

For comparison, i have run:

  pod2man --official --release="OpenBSD 5.5" --center=OpenSSL \
    --section=3 --name=DSA_SIG_NEW DSA_SIG_new.pod > DSA_SIG_new.man
  doclifter < DSA_SIG_new.man > DSA_SIG_new.xml

The idea is that

 * pod2man contains some Smarts, so the .man is already going to be
   semantically enriched compared to the *.pod.
 * doclifter contains many more Smarts, so the .xml is going to be
   much more semantically enriched compared to the *.man.
   This grade of enrichment is what i call the state of the art
   regarding Smarts.  Eric S. Raymond basically says that it took
   him a decade to get there, and he calls it very difficult.
   Which doesn't mean that Kristaps can't do better in a few days,
   but...  well...  ESR isn't exactly known for stupidity in the
   free software world, either, so bear with Kristaps.  :)

>> * foo() -> .Fn foo

> I know that pod2man does this--that'd be an easy smart.

Wait, the pod2man(1) code is not exactly easy.  It contains
quite a few Smarts, and it isn't pretty.  That's exactly why
i chose not to embark on this project some time ago.

But let's see what current tools do:

 - *.pod has no markup at all.
 - *.man has:
      - in the SYNOPSIS: .Vb, which is basically .nf (almost nothing)
      - in the text: \fI...\fR (presentational)
 - *.xml has:
      - in the SYNOPSIS:
        <funcdef><function>DSA_SIG</function> *DSA_SIG_new</funcdef>
        which is outright WRONG (DSA_SIG is not a function,
        and DSA_SIG_new lacks markup)
        <funcdef>void <function>DSA_SIG_free</function></funcdef>
        which is a bit better, but the function type still lacks markup
      - in the text:
        <emphasis remap='I'>DSA_SIG_new()</emphasis>
        <emphasis remap='I'>DSA_SIG_free()</emphasis>
        So even though DSA_SIG_free() is recognized as a function
        in the SYNOPSIS, doclifter is not smart enough to remember
        that for text formatting.

So what you ask for is well beyond the state of the art.

I'm not impressed by doclifter here.
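
For illustration, a minimal sketch of the "foo() -> .Fn foo" smart
discussed above.  It is not taken from pod2mdoc or pod2man; the helper
names funcname_len() and fn_macro() are hypothetical, and it assumes
the input has already been split into words and that a word of the
form identifier() is always a function reference.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if word looks like "name()", return the
 * length of name, otherwise return 0.
 */
static size_t
funcname_len(const char *word)
{
	size_t i, len;

	assert(word != NULL);
	len = strlen(word);
	if (len < 3 || strcmp(word + len - 2, "()") != 0)
		return 0;
	len -= 2;	/* Drop the trailing "()". */
	/* A C identifier starts with a letter or an underscore. */
	if (!isalpha((unsigned char)word[0]) && word[0] != '_')
		return 0;
	for (i = 1; i < len; i++)
		if (!isalnum((unsigned char)word[i]) && word[i] != '_')
			return 0;
	return len;
}

/*
 * Hypothetical helper: write ".Fn name" to buf if word is "name()".
 * Returns false if word is not a function reference or buf is too small.
 */
static bool
fn_macro(const char *word, char *buf, size_t bufsz)
{
	size_t len;
	int n;

	assert(buf != NULL);
	if ((len = funcname_len(word)) == 0)
		return false;
	n = snprintf(buf, bufsz, ".Fn %.*s", (int)len, word);
	return n >= 0 && (size_t)n < bufsz;
}
```

A word that fails the check would be left to the ordinary font
handling, so the worst case is the presentational markup that the
tools produce today.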

>> * too many Ns before punctuation

> This is a big one, but complicated to handle properly.  Getting the
> spacing right was really a nightmare.  Consider:
> 
> B<foo>B< bar>
> B<foo >B<bar>
> B<foo B<bar Z<>><B foo> >
> 
> and so on.  pod2mdoc does fine with all but one combination noted in
> the manual.  As mandoc(1) will ignore the superfluous "Ns", it's
> safer to leave it as-is than try to be smarter.

Sure, the .Ns isn't incorrect, just superfluous.

By the way, mandoc(1) already contains the necessary logic,
in the function isdelim() in the file mdoc.c.  If you are about to
print .Ns, the next word is a single character, and isdelim() returns
DELIM_CLOSE for it, you can skip the .Ns.  With sugar on top.
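
A minimal sketch of that check, for illustration only.  It does not
call into mandoc: is_closing_delim() is a simplified stand-in for
isdelim() returning DELIM_CLOSE, and the set of closing delimiters
in it is an assumption rather than a copy of the mandoc source.
need_ns() is a hypothetical name as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Simplified stand-in for the mandoc delimiter check: true if word
 * is exactly one character long and that character is a closing
 * delimiter.  The delimiter set is assumed.
 */
static bool
is_closing_delim(const char *word)
{
	assert(word != NULL);
	if (word[0] == '\0' || word[1] != '\0')
		return false;
	return strchr(".,:;)]?!", word[0]) != NULL;
}

/*
 * Decide whether .Ns has to be printed before next_word.
 * Before a lone closing delimiter it would be superfluous.
 */
static bool
need_ns(const char *next_word)
{
	return !is_closing_delim(next_word);
}
```

With this, B<foo>. would no longer get an .Ns before the period,
while B<foo>B<bar> would still get one.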

>> * section number is 1 instead of 3 (not sure how to detect that)

There is no safe way.  The build system is supposed to supply that
information externally, see the pod2man command above.

That said, you could try to be Smart.  If the SYNOPSIS contains .Fn,
you are more likely to be in section 3 than section 1.
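
A sketch of that heuristic, assuming the SYNOPSIS has already been
translated to an array of mdoc(7) lines.  guess_section() is a
hypothetical name, and the fallback argument stands for whatever the
build system supplied or the tool would otherwise default to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical heuristic: return "3" if any SYNOPSIS line is an
 * .Fn macro line, otherwise return the caller's fallback section.
 */
static const char *
guess_section(const char *const *lines, size_t nlines,
    const char *fallback)
{
	size_t i;

	assert(lines != NULL || nlines == 0);
	for (i = 0; i < nlines; i++)
		if (strncmp(lines[i], ".Fn ", 4) == 0)
			return "3";
	return fallback;
}
```

This is only a guess: an explicitly supplied section should always
take precedence over it.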

> perlpod says that, absent a suffix of ".pm", the manual should be
> considered a section 1.  I guess that ".pod" should be treated
> similarly--can you verify that?

No, it should not.  The *.pod suffix just means "plain old documentation",
and such a file can belong in any section, even 3p - though 1, 3, 5, 7, 8
are maybe more likely than 3p because most Perl module docs are embedded
in the *.pm.  Anyway, you shouldn't conclude anything from a *.pod filename.

> I also have a #define for which
> section should be the Perl module section.  OpenBSD has 3p.  (And a
> note that pod2mdoc needs to be changed if it's redefined.)

>> * .Bd -literal + #include <foo> + .Ed -> .In foo

> Yes, this is TODO for more Smarts.

 - *.pod only has a leading blank character.
 - *.man has .Vb, see above, almost nothing.
 - *.xml has:

   <funcsynopsisinfo>
    #include &lt;openssl/dsa.h&gt;

   </funcsynopsisinfo>

That's pathetic.  It doesn't even remove the leading blank
and provides almost no semantics.  So again, Thomas, you aim
well beyond the state of the art.  That's already the second
case of an easy smart that doclifter fails to deliver.
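
To show how little is needed, here is a minimal sketch of the
#include smart.  It is not pod2mdoc code: include_to_in() is a
hypothetical helper, and it assumes it is handed one line of a
verbatim POD paragraph at a time and only handles the <header> form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if line has the form " #include <header>",
 * write ".In header" to buf and return true.  Return false for any
 * other line, or if buf is too small.
 */
static bool
include_to_in(const char *line, char *buf, size_t bufsz)
{
	static const char kw[] = "#include";
	const char *start, *end;
	int n;

	assert(line != NULL && buf != NULL);
	/* Skip the leading blanks that mark a verbatim paragraph. */
	line += strspn(line, " \t");
	if (strncmp(line, kw, sizeof(kw) - 1) != 0)
		return false;
	line += sizeof(kw) - 1;
	line += strspn(line, " \t");
	if (*line != '<')
		return false;
	start = line + 1;
	/* Require a non-empty header name terminated by '>'. */
	if ((end = strchr(start, '>')) == NULL || end == start)
		return false;
	n = snprintf(buf, bufsz, ".In %.*s", (int)(end - start), start);
	return n >= 0 && (size_t)n < bufsz;
}
```

For the DSA_SIG_new.pod example, the line " #include <openssl/dsa.h>"
would come out as ".In openssl/dsa.h", and all other lines of the
verbatim block would be left for the function-splitting logic.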

>> * splitting up the functions in the SYNOPSIS the same way
>>   is probably too much effort?

> Same thing: more Smarts...

See above:  The state of the art is to try and fail.  :-o

>> Related question:
>> How do people use Sy vs. Dv?
>> I personally always mark up NULL as a Dv and similarly for most
>> other defined C symbols.

> I use Dv for preprocessor symbols and NULL.  I've never used Sy on
> my own.  A fairly low-hanging smart would be to decorate NULL.

Confirmed, i'd call .Dv NULL best practice.

 - *.pod has B<NULL>
 - *.man has \fB\s-1NULL\s0\fR
 - *.xml has <emphasis remap='B'><?troff ps -1?>NULL<?troff ps 0?></emphasis>

Again, pathetic.  No attempt at being smart.
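
Decorating NULL could be as small as the following sketch.
markup_for_bold() is a hypothetical name, and the sketch assumes
that the translator otherwise renders B<...> as .Sy.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: choose the mdoc(7) macro for the content of
 * a POD B<...> sequence.  NULL gets .Dv; everything else keeps the
 * assumed default of .Sy.
 */
static const char *
markup_for_bold(const char *text)
{
	assert(text != NULL);
	return strcmp(text, "NULL") == 0 ? "Dv" : "Sy";
}
```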

So, to summarize the state of the art:

 * #include: Do not even try.
 * functions in the SYNOPSIS: Try in a half-assed way
   and partially fail even regarding the part that was tried.
 * functions in text: Do not even try.
 * NULL: Do not even try.

No, i'm not impressed by the state of the art.

The bright side is that this means Kristaps will almost certainly
be able to do better, without prohibitive effort.  Given the
amount of boasting by ESR, that's a bit surprising...

Yours,
  Ingo
