discuss@mandoc.bsd.lv
* Converting the Xenocara manpages
@ 2019-04-20 12:43 Stephen Gregoratto
  2019-04-20 14:43 ` Ingo Schwarze
From: Stephen Gregoratto @ 2019-04-20 12:43 UTC (permalink / raw)
  To: discuss

Since you mentioned converting the DocBook files in the Xenocara tree
on Undeadly, I thought I'd take a quick stab at converting some of them
myself. To start, I'm only going to focus on <refentry> manpages. A
quick recursive grep shows that there are only 47 of these. Some notes:

One particularly bad case is lib/libXcomposite/man/Xcomposite.xml.
It starts as a <reference>, a collection of <refentry>'s. This is
strange considering that it only has one: Xcomposite(3). docbook2mdoc
renders this as one big, bold string. Piping just the refentry node
to docbook2mdoc is almost perfect: the only error is encountering the
unknown "<void />" element, which we would output as .Ft void.
Might make a patch for this later.

All of the sgml files in dist/fontconfig/doc contain more than one
refentry node (funny considering that this is where a <reference> would
be useful). Only the first node gets its own ".Dd .Dt .Os" preamble
with the headers/text of subsequent nodes tacked on after. There are
some approaches on how new refentry nodes could be handled:

  1. Reset the parser and print another unique preamble. The resultant
  output could be split using a regular expression (e.g. perl or
  chaining awk+sed). I've tested csplit from the GNU coreutils which
  does this and it works quite well.

  2. Reset the parser and write to a new file. I can already imagine
  this being a PITA to implement, and it changes the program from being
  a simple input filter to a full blown converter, so I don't think
  this is the best choice.

  3. Split the sgml file itself using the method in approach 1 and
  run docbook2mdoc over each.  No code needed then :).
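
As a rough illustration of approach 3, the splitting could be sketched
in POSIX shell and awk as below. The function name split_refentries and
the output naming scheme are made up for this example, and it assumes
that every <refentry> start tag and </refentry> end tag sits on a line
of its own, which may not hold for every file in the tree.

```shell
# Hypothetical helper, not part of docbook2mdoc.
# Usage: split_refentries FILE PREFIX
# Writes PREFIX-01.sgml, PREFIX-02.sgml, ..., one per <refentry> block.
# Assumption: each <refentry ...> and </refentry> tag is on its own line.
split_refentries() {
	awk -v prefix="$2" '
	# "[ >]" keeps <refentryinfo> and <refentrytitle> from matching.
	/<refentry[ >]/ {
		n++
		out = sprintf("%s-%02d.sgml", prefix, n)
		inside = 1
	}
	# Copy every line of the current block, including both tags.
	inside { print > out }
	/<\/refentry>/ {
		if (inside)
			close(out)
		inside = 0
	}
	' "$1"
}
```

Each resulting file could then be converted on its own, e.g. with
"docbook2mdoc out-01.sgml > out-01.3", where the file names are only
placeholders. Anything outside a refentry block is silently dropped.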

Many files reference variables that are set by autoconf. Some of them
could be stripped out (e.g. version numbers) whilst others should
be replaced entirely (e.g. the *mansuffix variables).
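
To make the idea concrete, here is a minimal sketch of such a
substitution pass in POSIX shell and sed. The helper name subst_vars
and the placeholder names LIBMANSUFFIX and VERSION are assumptions for
illustration only; the real files may spell their variables
differently. Values containing "|", "&" or a backslash are not handled.

```shell
# Hypothetical helper, not taken from the Xenocara build system.
# Usage: subst_vars NAME=value ... < input > output
# Replaces every @NAME@ on stdin with value; an empty value strips it.
subst_vars() (
	# Turn each NAME=value argument into a "-e s|@NAME@|value|g"
	# pair by rotating the argument list in place.
	for pair in "$@"; do
		shift
		set -- "$@" -e "s|@${pair%%=*}@|${pair#*=}|g"
	done
	sed "$@"
)
```

For instance, "subst_vars LIBMANSUFFIX=3 VERSION= < in > out" would
replace the first placeholder with 3 and delete the second one. At
least one NAME=value argument is required.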

Anyway, that's all I've got for now. If you're not actively working
on it, I might send patches to tech@ for some edited conversions
of these files. The only problem I would have is how deep autoconf
integration should be (i.e. using @VARIABLES@) for the new files.
But I'll cross that bridge when I get there.
-- 
Stephen Gregoratto
PGP: 3FC6 3D0E 2801 C348 1C44 2D34 A80C 0F8E 8BAB EC8B
--
 To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv


* Re: Converting the Xenocara manpages
  2019-04-20 12:43 Converting the Xenocara manpages Stephen Gregoratto
@ 2019-04-20 14:43 ` Ingo Schwarze
  2019-04-25 18:03   ` Ingo Schwarze
  0 siblings, 1 reply; 3+ messages in thread
From: Ingo Schwarze @ 2019-04-20 14:43 UTC (permalink / raw)
  To: Stephen Gregoratto; +Cc: discuss

Hi Stephen,

Stephen Gregoratto wrote on Sat, Apr 20, 2019 at 10:43:45PM +1000:

> Since you mentioned converting the DocBook files in the Xenocara tree
> on Undeadly, I thought I'd take a quick stab at converting some of them
> myself. To start, I'm only going to focus on <refentry> manpages. A
> quick recursive grep shows that there are only 47 of these. Some notes:
> 
> One particularly bad case is lib/libXcomposite/man/Xcomposite.xml.

That one is already installed as /usr/X11R6/man/man3/Xcomposite.3,
unless i'm mistaken.  The content seems to be very similar, if not
identical.

> It starts as a <reference>, a collection of <refentry>'s. This is
> strange considering that it only has one: Xcomposite(3). docbook2mdoc
> renders this as one big, bold string.

Right, i should fix that.  What happens here is that the document (= top
level) element <reference> is unknown, so it gets ignored, and the first
element within, <title>, becomes the document element.  Then the
<refentry> is *after* the document element, so it gets included *into*
the title because an XML file cannot have two top-level elements.
But an isolated <title> - i.e. one that isn't in a <section> or a
similar element - is converted to bold text, which isn't what we want
here.  You can see the misparsing with

 $ docbook2mdoc -T tree /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml \
     | head -n 3
 /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:4:1: \
     ERROR: unknown element <reference>
 /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:74:4: \
     ERROR: unknown element <void>

 title
   (t) X Composite Extension Library
   refentry id='Xcomposite.man'

> Piping just the refentry node
> to docbook2mdoc is almost perfect: the only error is encountering the
> unknown "<void />" element, which we would output as .Ft void.
> Might make a patch for this later.

Yes, <void> needs to be implemented.

> All of the sgml files in dist/fontconfig/doc

It seems (at least part of) their content is already installed as manual
pages, see /usr/X11R6/man/man3/Fc*.3.

> contain more than one
> refentry node (funny considering that this is where a <reference> would
> be useful). Only the first node gets its own ".Dd .Dt .Os" preamble
> with the headers/text of subsequent nodes tacked on after.  There are
> some approaches on how new refentry nodes could be handled:
> 
>   1. Reset the parser and print another unique preamble. The resultant
>   output could be split using a regular expression (e.g. perl or
>   chaining awk+sed). I've tested csplit from the GNU coreutils which
>   does this and it works quite well.
> 
>   2. Reset the parser and write to a new file. I can already imagine
>   this being a PITA to implement and changes the program from being
>   a simple input filter to a full blown converter, so I don't think
>   this is the best choice.
> 
>   3. Split the sgml file itself using the method in approach 1 and
>   run docbook2mdoc over each.  No code needed then :).

I tend to postpone that decision until we really need it.
All three options seem viable, i'm not yet sure which one
will be best for the specific needs that might arise.
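
For reference, the post-processing half of option 1 might look roughly
like the sketch below. It is hypothetical in two ways: docbook2mdoc
does not currently print a separate preamble per refentry, and the
function name split_mdoc_pages and the output names are invented here.
It only assumes that every page starts with a .Dd line.

```shell
# Hypothetical helper, not part of docbook2mdoc.
# Usage: ... | split_mdoc_pages PREFIX
# Splits concatenated mdoc pages on stdin into PREFIX-01.mdoc,
# PREFIX-02.mdoc, ..., starting a new file at every .Dd line.
split_mdoc_pages() {
	awk -v prefix="$1" '
	/^\.Dd( |$)/ {
		if (out != "")
			close(out)
		n++
		out = sprintf("%s-%02d.mdoc", prefix, n)
	}
	# Lines before the first .Dd have no page to go to and are dropped.
	out != "" { print > out }
	'
}
```

This does in awk what csplit does with a repeated pattern, without
depending on GNU extensions.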

> Many files reference variables that are set by autoconf. Some of them
> could be stripped out (e.g. version numbers) whilst others should
> be replaced entirely (e.g. the *mansuffix variables).

Yes, that is among the final aspects to polish once we decide to
install specific files.  It will not be difficult to fix.

> Anyway, that's all I've got for now. If you're not actively working
> on it, I might send patches to tech@ for some edited conversions
> of these files.

We should probably not start with files the content of which is already
being installed, unless we can show that what is being installed is
outdated and the newly converted versions are better.

> The only problem I would have is how deep autoconf
> integration should be (i.e. using @VARIABLES@) for the new files.
> But I'll cross that bridge when I get there.

We probably shouldn't waste time in autohell; instead, KISS.
All that is needed can probably be done easily in Makefile.bsd-wrappers.
But you are right that build system questions are hardly a priority;
they can get sorted out once we know which files we want to install,
if any.

Yours,
  Ingo


* Re: Converting the Xenocara manpages
  2019-04-20 14:43 ` Ingo Schwarze
@ 2019-04-25 18:03   ` Ingo Schwarze
  0 siblings, 0 replies; 3+ messages in thread
From: Ingo Schwarze @ 2019-04-25 18:03 UTC (permalink / raw)
  To: Stephen Gregoratto; +Cc: discuss

Hi Stephen,

Ingo Schwarze wrote on Sat, Apr 20, 2019 at 04:43:57PM +0200:
> Stephen Gregoratto wrote on Sat, Apr 20, 2019 at 10:43:45PM +1000:

>> One particularly bad case is lib/libXcomposite/man/Xcomposite.xml.
>> It starts as a <reference>, a collection of <refentry>'s. This is
>> strange considering that it only has one: Xcomposite(3). docbook2mdoc
>> renders this as one big, bold string.

> Right, i should fix that.  What happens here is that the document (= top
> level) element <reference> is unknown, so it gets ignored, and the first
> element within, <title>, becomes the document element.  Then the
> <refentry> is *after* the document element, so it gets included *into*
> the title because an XML file cannot have two top-level elements.
> But an isolated <title> - i.e. one that isn't in a <section> or a
> similar element - is converted to bold text, which isn't what we want
> here.  You can see the misparsing with
> 
>  $ docbook2mdoc -T tree /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml \
>      | head -n 3
>  /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:4:1: \
>      ERROR: unknown element <reference>
>  /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:74:4: \
>      ERROR: unknown element <void>
> 
>  title
>    (t) X Composite Extension Library
>    refentry id='Xcomposite.man'

The below commit is an easy fix for this issue.

Given the rarity of <reference>, i refrained from writing a full-blown
handler for it, at least for now.

Yours,
  Ingo


Log Message:
-----------
Make <reference> an alias for <section>.

This is not perfect because if <reference> starts with a <title>,
that title is still printed before the NAME section.  But at least
the <refentry>s no longer get lumped *into* the <title>.

Issue found by Stephen Gregoratto <dev at sgregoratto dot me>
in Xcomposite(3).

Modified Files:
--------------
    docbook2mdoc:
        parse.c
        statistics.c

Revision Data
-------------
Index: parse.c
===================================================================
RCS file: /home/cvs/mdocml/docbook2mdoc/parse.c,v
retrieving revision 1.51
retrieving revision 1.52
diff -Lparse.c -Lparse.c -u -p -r1.51 -r1.52
--- parse.c
+++ parse.c
@@ -104,6 +104,7 @@ static	const struct alias aliases[] = {
 	{ "phrase",		NODE_IGNORE },
 	{ "primary",		NODE_DELETE },
 	{ "property",		NODE_PARAMETER },
+	{ "reference",		NODE_SECTION },
 	{ "refsect1",		NODE_SECTION },
 	{ "refsect2",		NODE_SECTION },
 	{ "refsect3",		NODE_SECTION },
Index: statistics.c
===================================================================
RCS file: /home/cvs/mdocml/docbook2mdoc/statistics.c,v
retrieving revision 1.35
retrieving revision 1.36
diff -Lstatistics.c -Lstatistics.c -u -p -r1.35 -r1.36
--- statistics.c
+++ statistics.c
@@ -368,6 +368,7 @@ main(int argc, char *argv[])
 		table_add("ROOT", "part");
 		table_add("ROOT", "preface");
 		table_add("ROOT", "refentry");
+		table_add("ROOT", "reference");
 		table_add("ROOT", "sect1");
 		table_add("ROOT", "sect2");
 		table_add("acronym", "TEXT");
@@ -527,6 +528,7 @@ main(int argc, char *argv[])
 		table_add("refentryinfo", "date");
 		table_add("refentryinfo", "productname");
 		table_add("refentrytitle", "TEXT");
+		table_add("reference", "refentry");
 		table_add("refmeta", "manvolnum");
 		table_add("refmeta", "refentrytitle");
 		table_add("refmeta", "refmiscinfo");
