Since you mentioned converting the DocBook files in the Xenocara tree
on Undeadly, I thought I'd take a quick stab at converting some of them
myself. To start, I'm only going to focus on <refentry> manpages. A
quick recursive grep shows that there are only 47 of these. Some notes:

One particularly bad case is lib/libXcomposite/man/Xcomposite.xml.
It starts as a <reference>, a collection of <refentry>'s. This is
strange considering that it only has one: Xcomposite(3). docbook2mdoc
renders this as one big, bold string. Piping just the refentry node
to docbook2mdoc is almost perfect: the only error is encountering the
unknown "<void />" element, which we would output as .Ft void.
Might make a patch for this later.

All of the sgml files in dist/fontconfig/doc contain more than one
refentry node (funny considering that this is where a <reference> would
be useful). Only the first node gets its own ".Dd .Dt .Os" preamble
with the headers/text of subsequent nodes tacked on after. There are
some approaches on how new refentry nodes could be handled:

1. Reset the parser and print another unique preamble. The resultant
   output could be split using a regular expression (e.g. perl or
   chaining awk+sed). I've tested csplit from the GNU coreutils which
   does this and it works quite well.

2. Reset the parser and write to a new file. I can already imagine
   this being a PITA to implement and it changes the program from being
   a simple input filter to a full-blown converter, so I don't think
   this is the best choice.

3. Split the sgml file itself using the method in approach 1 and
   run docbook2mdoc over each. No code needed then :).

Many files reference variables that are set by autoconf. Some of them
could be stripped out (e.g. version numbers) whilst others should
be replaced entirely (e.g. the *mansuffix variables).

Anyway, that's all I've got for now. If you're not actively working
on it, I might send patches to tech@ for some edited conversions
of these files. The only problem I would have is how deep autoconf
integration should be (i.e. using @VARIABLES@) for the new files.
But I'll cross that bridge when I get there.
--
Stephen Gregoratto
PGP: 3FC6 3D0E 2801 C348 1C44 2D34 A80C 0F8E 8BAB EC8B
--
To unsubscribe send an email to discuss+unsubscribe@mandoc.bsd.lv
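
Approach 3 above (split the sgml file first, then run docbook2mdoc over
each piece) could be sketched roughly as follows. This is only an
illustration, not what was tested in the message: it uses awk rather
than csplit, the function name split_refentries is made up, and it
assumes that <refentry> nodes are not nested and that at most one node
starts or ends on any given line.

```shell
# split_refentries FILE PREFIX  (hypothetical helper, not part of docbook2mdoc)
# Write each <refentry>...</refentry> node of FILE to PREFIX.N.xml,
# with N = 1, 2, ...  Text outside of refentry nodes is dropped.
split_refentries() {
	awk -v prefix="$2" '
	# An opening tag starts a new output file; <refentryinfo> and
	# similar longer element names do not match this pattern.
	/<refentry([ \t>]|$)/ { n++; out = prefix "." n ".xml"; inside = 1 }
	inside { print > out }
	# A closing tag ends the current node.
	/<\/refentry>/ { if (inside) close(out); inside = 0 }
	' "$1"
}

# Possible use, assuming docbook2mdoc takes a file name argument and
# writes the converted page to standard output:
#   split_refentries fcpattern.sgml /tmp/fc
#   for f in /tmp/fc.*.xml; do docbook2mdoc "$f" > "${f%.xml}.3"; done
```

The output file names would still need to be mapped to the real page
names by hand or by looking at each <refname>.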
Hi Stephen,

Stephen Gregoratto wrote on Sat, Apr 20, 2019 at 10:43:45PM +1000:

> Since you mentioned converting the DocBook files in the Xenocara tree
> on Undeadly, I thought I'd take a quick stab at converting some of them
> myself. To start, I'm only going to focus on <refentry> manpages. A
> quick recursive grep shows that there are only 47 of these. Some notes:
>
> One particularly bad case is lib/libXcomposite/man/Xcomposite.xml.

That one is already installed as /usr/X11R6/man/man3/Xcomposite.3,
unless i'm mistaken. The content seems to be very similar, if not
identical.

> It starts as a <reference>, a collection of <refentry>'s. This is
> strange considering that it only has one: Xcomposite(3). docbook2mdoc
> renders this as one big, bold string.

Right, i should fix that. What happens here is that the document (= top
level) element <reference> is unknown, so it gets ignored, and the first
element within, <title>, becomes the document element. Then the
<refentry> is *after* the document element, so it gets included *into*
the title because an XML file cannot have two top-level elements.
But an isolated <title> - i.e. one that isn't in a <section> or a
similar element - is converted to bold text, which isn't what we want
here. You can see the misparsing with

   $ docbook2mdoc -T tree /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml \
     | head -n 3
  /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:4:1: \
    ERROR: unknown element <reference>
  /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:74:4: \
    ERROR: unknown element <void>

  title
    (t) X Composite Extension Library
    refentry id='Xcomposite.man'

> Piping just the refentry node
> to docbook2mdoc is almost perfect: the only error is encountering the
> unknown "<void />" element, which we would output as .Ft void.
> Might make a patch for this later.

Yes, <void> needs to be implemented.

> All of the sgml files in dist/fontconfig/doc

It seems (at least part of) their content is already installed as
manual pages, see /usr/X11R6/man/man3/Fc*.3.

> contain more than one
> refentry node (funny considering that this is where a <reference> would
> be useful). Only the first node gets its own ".Dd .Dt .Os" preamble
> with the headers/text of subsequent nodes tacked on after. There are
> some approaches on how new refentry nodes could be handled:
>
> 1. Reset the parser and print another unique preamble. The resultant
>    output could be split using a regular expression (e.g. perl or
>    chaining awk+sed). I've tested csplit from the GNU coreutils which
>    does this and it works quite well.
>
> 2. Reset the parser and write to a new file. I can already imagine
>    this being a PITA to implement and it changes the program from being
>    a simple input filter to a full-blown converter, so I don't think
>    this is the best choice.
>
> 3. Split the sgml file itself using the method in approach 1 and
>    run docbook2mdoc over each. No code needed then :).

I tend to postpone that decision until we really need it. All three
options seem viable, i'm not yet sure which one will be best for
the specific needs that might arise.

> Many files reference variables that are set by autoconf. Some of them
> could be stripped out (e.g. version numbers) whilst others should
> be replaced entirely (e.g. the *mansuffix variables).

Yes, that is among the final aspects to polish once we decide to
install specific files. It will not be difficult to fix.

> Anyway, that's all I've got for now. If you're not actively working
> on it, I might send patches to tech@ for some edited conversions
> of these files.

We should probably not start with files the content of which is
already being installed, unless we can show that what is being
installed is outdated and the newly converted versions are better.

> The only problem I would have is how deep autoconf
> integration should be (i.e. using @VARIABLES@) for the new files.
> But I'll cross that bridge when I get there.

We probably shouldn't waste time in autohell; instead, KISS.
All that is needed can probably be done easily in Makefile.bsd-wrappers.
But you are right that build system questions are hardly a priority;
they can get sorted out once we know which files we want to install,
if any.

Yours,
  Ingo
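
As a rough illustration of how simple the autoconf cleanup could be
without touching autoconf itself, a sed filter along the following
lines might be enough. The function name and the variable names
(@libmansuffix@, @filemansuffix@, @VERSION@) are hypothetical
placeholders, and the section numbers are only example values; the
real names and values would have to be checked against the files
being converted.

```shell
# subst_autoconf: filter standard input to standard output.
# Hypothetical sketch: variable names and replacement values are
# placeholders, not taken from the Xenocara tree.
subst_autoconf() {
	# Replace the man section variables with fixed section numbers
	# and strip version placeholders together with leading blanks.
	sed -e 's/@libmansuffix@/3/g' \
	    -e 's/@filemansuffix@/5/g' \
	    -e 's/ *@VERSION@//g'
}

# Possible use on a file before conversion:
#   subst_autoconf < Xcomposite.xml > Xcomposite.clean.xml
```

A filter like this could be called from a makefile rule, which keeps
the substitution in one visible place.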
Hi Stephen,

Ingo Schwarze wrote on Sat, Apr 20, 2019 at 04:43:57PM +0200:
> Stephen Gregoratto wrote on Sat, Apr 20, 2019 at 10:43:45PM +1000:

>> One particularly bad case is lib/libXcomposite/man/Xcomposite.xml.
>> It starts as a <reference>, a collection of <refentry>'s. This is
>> strange considering that it only has one: Xcomposite(3). docbook2mdoc
>> renders this as one big, bold string.

> Right, i should fix that. What happens here is that the document (= top
> level) element <reference> is unknown, so it gets ignored, and the first
> element within, <title>, becomes the document element. Then the
> <refentry> is *after* the document element, so it gets included *into*
> the title because an XML file cannot have two top-level elements.
> But an isolated <title> - i.e. one that isn't in a <section> or a
> similar element - is converted to bold text, which isn't what we want
> here. You can see the misparsing with
>
>    $ docbook2mdoc -T tree /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml \
>      | head -n 3
>   /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:4:1: \
>     ERROR: unknown element <reference>
>   /usr/xenocara/lib/libXcomposite/man/Xcomposite.xml:74:4: \
>     ERROR: unknown element <void>
>
>   title
>     (t) X Composite Extension Library
>     refentry id='Xcomposite.man'

The below commit is an easy fix for this issue.
Given the rarity of <reference>, i refrained from writing
a full-blown handler for it, at least for now.

Yours,
  Ingo


Log Message:
-----------
Make <reference> an alias for <section>.
This is not perfect because if <reference> starts with a <title>,
that title is still printed before the NAME section.
But at least the <refentry>s no longer get lumped *into* the <title>.
Issue found by Stephen Gregoratto <dev at sgregoratto dot me>
in Xcomposite(3).

Modified Files:
--------------
    docbook2mdoc:
        parse.c
        statistics.c

Revision Data
-------------
Index: parse.c
===================================================================
RCS file: /home/cvs/mdocml/docbook2mdoc/parse.c,v
retrieving revision 1.51
retrieving revision 1.52
diff -Lparse.c -Lparse.c -u -p -r1.51 -r1.52
--- parse.c
+++ parse.c
@@ -104,6 +104,7 @@ static const struct alias aliases[] = {
 	{ "phrase", NODE_IGNORE },
 	{ "primary", NODE_DELETE },
 	{ "property", NODE_PARAMETER },
+	{ "reference", NODE_SECTION },
 	{ "refsect1", NODE_SECTION },
 	{ "refsect2", NODE_SECTION },
 	{ "refsect3", NODE_SECTION },
Index: statistics.c
===================================================================
RCS file: /home/cvs/mdocml/docbook2mdoc/statistics.c,v
retrieving revision 1.35
retrieving revision 1.36
diff -Lstatistics.c -Lstatistics.c -u -p -r1.35 -r1.36
--- statistics.c
+++ statistics.c
@@ -368,6 +368,7 @@ main(int argc, char *argv[])
 	table_add("ROOT", "part");
 	table_add("ROOT", "preface");
 	table_add("ROOT", "refentry");
+	table_add("ROOT", "reference");
 	table_add("ROOT", "sect1");
 	table_add("ROOT", "sect2");
 	table_add("acronym", "TEXT");
@@ -527,6 +528,7 @@ main(int argc, char *argv[])
 	table_add("refentryinfo", "date");
 	table_add("refentryinfo", "productname");
 	table_add("refentrytitle", "TEXT");
+	table_add("reference", "refentry");
 	table_add("refmeta", "manvolnum");
 	table_add("refmeta", "refentrytitle");
 	table_add("refmeta", "refmiscinfo");
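
To check which elements are still reported as unknown after a change
like this one (for example <void>), the diagnostics could be condensed
with a small filter such as the one below. It is only a sketch: the
function name unknown_elements is made up, and it assumes that each
diagnostic is printed on a single line in the
"file:line:col: ERROR: unknown element <name>" form quoted in this
thread, without the backslash line wrapping used in the mail.

```shell
# unknown_elements: read docbook2mdoc diagnostics on standard input
# and print the name of each unknown element once, sorted.
# Hypothetical helper, not part of docbook2mdoc.
unknown_elements() {
	# Keep only the element name between the angle brackets,
	# drop all lines that are not "unknown element" errors.
	sed -n 's/.*ERROR: unknown element <\([^>]*\)>.*/\1/p' | sort -u
}

# Possible use, assuming the diagnostics go to standard error:
#   docbook2mdoc -T tree Xcomposite.xml 2>&1 >/dev/null | unknown_elements
```

Running this over all 47 <refentry> files would give a short list of
elements that still lack a handler.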