tech@mandoc.bsd.lv
* Re: mdocml: Unify mdoc and man enums and structs into mandoc.h.
       [not found] ` <20101002175621.GB19515@iris.usta.de>
@ 2010-10-03 16:49   ` Kristaps Dzonsons
  2010-10-03 22:36     ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Kristaps Dzonsons @ 2010-10-03 16:49 UTC (permalink / raw)
  To: Ingo Schwarze, tech

>> Log Message:
>> -----------
>> Unify mdoc and man enums and structs into mandoc.h.
>> This is part of the slow process of logically splitting
>> formatting frontend and parser backend without pollution.
> 
> Hmm, i fear i don't understand this.
> Given the relation "depends on" shown as "<-",
> i would expect the following dependencies:
> 
>   frontends (html.h, term.h)    <- general & output utils
>   backends (*man.h, *mdoc.h)    <- general utils
>   preprocessor (roff.h)         <- general utils
>   output utils (chars.h, out.h) <- none
>   general utils (*mandoc.h)     <- none
> 
> But i don't understand why two different frontends or two
> different backends should depend on each other, or why including
> one might make it necessary to include the other.
> 
> Sorry, i'm lacking the time right now to look deeper into this...

Ingo, cross-posting to tech@....

I want to make a simple mandoc.h and libmandoc.a that has all the 
ingredients for writing front-ends, such as a fancier makewhatis and 
apropos, or man.cgi or whatnot.

To begin with: roff.h, mdoc.h, and man.h -> mandoc.h; libmdoc.a, 
libroff.a, and libman.a (and associated stuff) being merged into a 
single libmandoc.a.  Then libmdoc.h, libman.h, and libroff.h being 
merged into libmandoc.h, used internally within libmandoc.a.

This will reduce structural complexity that's been bothering me for a while.

Once this is done, I will abstract and push the fdesc() function into 
the library: it implements parts of the grammar (such as escaped 
newlines) that should be internal to the library.
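
As a rough illustration of the escaped-newline handling mentioned above, here is a minimal sketch of joining continued lines.  It is not the actual fdesc() code; join_escaped_newlines() is a made-up name, and the rule that a backslash directly followed by a newline continues the line is the only piece of roff grammar it assumes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not mandoc's fdesc(): copy `in` to `out`,
 * dropping every backslash-newline pair so that continued input
 * lines are joined into one logical line.  Any other escape is
 * copied through unchanged, so an escaped backslash ("\\") that
 * happens to precede a newline is not mistaken for a continuation.
 * Returns the number of bytes written, excluding the terminating
 * NUL, or (size_t)-1 if `out` is too small.
 */
size_t
join_escaped_newlines(const char *in, char *out, size_t outsz)
{
	size_t i, o = 0;

	assert(in != NULL && out != NULL);
	if (outsz == 0)
		return (size_t)-1;

	for (i = 0; in[i] != '\0'; i++) {
		if (in[i] == '\\' && in[i + 1] == '\n') {
			i++;			/* skip the newline, too */
			continue;
		}
		if (in[i] == '\\' && in[i + 1] != '\0') {
			/* Some other escape: copy both bytes. */
			if (o + 2 >= outsz)
				return (size_t)-1;
			out[o++] = in[i++];
			out[o++] = in[i];
			continue;
		}
		if (o + 1 >= outsz)
			return (size_t)-1;
		out[o++] = in[i];
	}
	out[o] = '\0';
	return o;
}
```

Keeping this inside the library would mean that frontends only ever see logical lines and never have to know about continuation at all.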

Another push is to get the escape routines in one place; right now, the 
functionality is duplicated.  Restructuring is a necessary precondition 
before I do so.

Thoughts?

Kristaps
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv


* Re: mdocml: Unify mdoc and man enums and structs into mandoc.h.
  2010-10-03 16:49   ` mdocml: Unify mdoc and man enums and structs into mandoc.h Kristaps Dzonsons
@ 2010-10-03 22:36     ` Ingo Schwarze
  2010-10-04  6:35       ` Kristaps Dzonsons
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Schwarze @ 2010-10-03 22:36 UTC (permalink / raw)
  To: tech

Hi Kristaps,

> Thoughts?

Many, and conflicting ones; so i cannot present final solutions,
but some thoughts indeed.

> I want to make a simple mandoc.h and libmandoc.a that has all the
> ingredients for writing front-ends, such as a fancier makewhatis and
> apropos, or man.cgi or whatnot.

Regarding libmandoc.a, sure.  Actually, i don't see much of a point
in having libraries at all in this context; i doubt that anybody will
ever want to use the parsers outside the mandoc program, or, put
the other way round, all functionality that can reasonably be based
on the mdoc language can probably reasonably be included into the
mandoc binary program.

Regarding makewhatis, apropos and man.cgi, i do not have much hope.
Remember that those must be able to work on the -Tascii output,
at least in OpenBSD, because that's the only version of the manuals
getting installed, and there is next to no hope to have that changed,
based on what Theo and Bob say.  Besides, i don't really see a need
to install manual source code either.  On a typical production
system, you don't need manual source code, just as you don't need
program source code; besides, the src.tar.gz ball is readily
available for each release, and anonymous CVS is not rocket science
either, in case you need the sources for some reason.

Regarding mandoc.h, actually, i still don't see the point.
Why should a file like mdoc_macro.c, or even mdoc_term.c,
be forced to include man data structures and function prototypes?
In the current implementation, there is not a single file
including both man.h and mdoc.h or both libman.h and libmdoc.h,
except main.c and tree.c.  And even if there were one or two
such files:  What is the advantage of a frontend file including
just mandoc.h instead of man.h and mdoc.h?

On the contrary:  In the frontends, i think it is good to keep
the following parts separate:
 1. language-independent output code
    e.g. doing things like indentation, line breaking, filling,
    hyphenation - term.c being a typical example
 2. language-dependent output code common to man and mdoc
    e.g. character translating tables like in chars.c
 3. language-specific AST-interpretation code
    e.g. deciding how much indentation .Bd needs - mdoc_term.c
Here, 1 & 2 do not need any language-dependent headers (but
probably language-independent headers like mandoc.h), while 3.
needs headers for *one* language (but not two).

> To begin with: roff.h, mdoc.h, and man.h -> mandoc.h; libmdoc.a,
> libroff.a, and libman.a (and associated stuff) being merged into a
> single libmandoc.a.  Then libmdoc.h, libman.h, and libroff.h being
> merged into libmandoc.h, used internally within libmandoc.a.
> 
> This will reduce structural complexity that's been bothering me for a while.
> 
> Once this is done, I will abstract and push the fdesc() function
> into the library: it implements parts of the grammar (such as
> escaped newlines) that should be internal to the library.
> 
> Another push is to get the escape routines in one place; right now,
> the functionality is duplicated.  Restructuring is a necessary
> precondition before I do so.

Wouldn't that suggest a structure like the following?
Admittedly, i'm just drawing a big picture, and a somewhat vague one.
Non-trivial design devils will certainly hide in the details...

1. A common lower layer, including:
   1.1. utilities used everywhere
        like memory management, error handling...
   1.2. roff parser, including fdesc() and escape parsing
   1.3. roff output, including escape rendering
   1.4. language-independent output handling (see 1. above)

2. Two middle layers for two languages, man and mdoc:
   2.2. macro parsers, producing ASTs, using 1.2.
   2.3. AST renderers, using 1.3 and 1.4

3. Upper layer:
   The main program tying 2.2. and 2.3. together for both backends
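
The layering above can be sketched in a few lines of C.  This is only an illustration of the shape, not existing mandoc code: struct lang, lang_select() and lang_run() are hypothetical names, the parsers and renderers are stubs, and the rule "a document starting with .Dd is mdoc, anything else is man" is an assumption made for the sketch.

```c
#include <assert.h>
#include <string.h>

/* Layer 2 interface (hypothetical): one entry per language. */
struct lang {
	const char	*name;
	int		(*parse)(const char *src);	/* 2.2: source -> AST */
	int		(*render)(void);		/* 2.3: AST -> output */
};

/* Stub middle layers; real ones would call into layer 1. */
static int mdoc_parse(const char *src) { return src != NULL; }
static int mdoc_render(void) { return 1; }
static int man_parse(const char *src) { return src != NULL; }
static int man_render(void) { return 1; }

static const struct lang langs[] = {
	{ "mdoc", mdoc_parse, mdoc_render },
	{ "man",  man_parse,  man_render  },
};

/*
 * Layer 3: pick a language from the first macro line.
 * Assumption for this sketch: .Dd means mdoc, anything else man.
 */
const struct lang *
lang_select(const char *firstline)
{
	assert(firstline != NULL);
	if (strncmp(firstline, ".Dd", 3) == 0)
		return &langs[0];
	return &langs[1];
}

/* Layer 3: run parser and renderer of the selected language. */
int
lang_run(const char *src)
{
	const struct lang *l = lang_select(src);

	return l->parse(src) && l->render();
}
```

Only the upper layer sees both languages here; each middle layer needs the headers of its own language plus the common lower layer.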


That said, here the conflicting thoughts i mentioned at the
beginning will show up: There IS a reason to bind man and mdoc
closer together.  Both languages include features of one third
language, roff.  And it is not only escapes which are common
to both: There are also common macros.  Here is a list of roff
macros that *might* be relevant to mandoc - this list is definitely
incomplete, some of these are already implemented in both mandoc
backends, some only in one, some in libroff, some not at all:

 .ad        - adjust output lines left, center, right...
 .bp        - eject current page
 .br        - break line
 .break     - break out of repeated execution
 .char      - define character to string
 .continue  - start next cycle of repeated execution
 .de        - define macro
 .di        - divert output to macro
 .ds        - define string
 .el        - else clause for conditional execution
 .fi        - fill output lines
 .hy        - enable hyphenation
 .ie        - conditional execution allowing else clause
 .if        - conditional execution
 .ig        - ignore following input
 .in        - indent
 .length    - store the length of a string into a register
 .ll        - set line length
 .nf        - do not fill output lines
 .nh        - disable hyphenation
 .nm        - output line numbering
 .nr        - define and set number register
 .ns        - no-space mode
 .os        - output saved vertical distance
 .papersize - set the paper size (think of -Tps)
 .pl        - set page length in lines
 .rm        - remove request, macro or string
 .rn        - rename request, macro or string
 .rr        - remove register
 .rs        - restore spacing mode
 .sentchar  - define sentence-ending characters
 .sp        - vertical space
 .substring - replace string by a substring
 .sv        - save vertical distance
 .ta        - tab settings
 .tl        - three part title
 .tm        - print string on terminal (stderr)
 .tr        - translate characters on output
 .ul        - underline
 .while     - repeated execution

Besides, the distinction of macros and escapes is fuzzy.
Here is a list of a few roff escapes actually behaving more
like macros, that is, not just producing one output character,
but having non-local effects on the parsing process:

 \" - start a comment
 \* - interpolate a string
 \d - half vertical space (oops - similar to .sp)
 \f - switch font (oops - similar to .Em)
 \n - interpolate number register
 \p - break output line (oops - similar to .br)
 \R - set number register (oops - similar to .nr)
 \s - set font size

So, in the very long term a need might arise to

 1. Handle roff macros in a common module, and be able to intermix
    them with high level, in particular man, macros
 2. Handle roff escape sequences in a way similar to macros,
    such that they create elements (\n, \p) or even blocks (\f, \s)

Note that not all of the macros can be handled well by a preprocessor,
for example .bp .br .sp are clearly elements and .ad .fi .in .ul are
clearly blocks.  Besides, even part of the stuff that, on first
sight, can be handled by a preprocessor, actually cannot, e.g. .ds:
Strings can be set dynamically, deleted and reset, and they may
even interpolate registers influenced by high-level macros.
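
To make the .ds point concrete, here is a minimal sketch of a string table with expansion at the point of use.  All names (str_define(), str_expand(), the fixed-size table) are invented for the example and are not mandoc's; it only handles the \*x and \*(xx forms and treats undefined strings as empty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXSTR 16

/* Hypothetical string table; names are one or two characters here. */
struct rstr {
	char	name[3];
	char	val[32];
};
static struct rstr tab[MAXSTR];
static size_t ntab;

/* Rough equivalent of ".ds name value": define or redefine a string. */
int
str_define(const char *name, const char *val)
{
	size_t i, nlen = strlen(name);

	if (nlen < 1 || nlen > 2 || strlen(val) >= sizeof(tab[0].val))
		return 0;
	for (i = 0; i < ntab; i++)
		if (strcmp(tab[i].name, name) == 0)
			break;
	if (i == ntab) {
		if (ntab == MAXSTR)
			return 0;
		strcpy(tab[ntab++].name, name);
	}
	strcpy(tab[i].val, val);
	return 1;
}

/* Look up a name of length `len`; undefined strings expand to "". */
static const char *
str_lookup(const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < ntab; i++)
		if (strlen(tab[i].name) == len &&
		    strncmp(tab[i].name, name, len) == 0)
			return tab[i].val;
	return "";
}

/* Expand \*x and \*(xx in `in`; returns 1 on success, 0 on error. */
int
str_expand(const char *in, char *out, size_t outsz)
{
	size_t o = 0, n, vlen;
	const char *v;

	assert(in != NULL && out != NULL);
	while (*in != '\0') {
		if (in[0] == '\\' && in[1] == '*' && in[2] != '\0') {
			n = (in[2] == '(') ? 2 : 1;
			in += (n == 2) ? 3 : 2;
			if (strlen(in) < n)
				return 0;	/* truncated name */
			v = str_lookup(in, n);
			vlen = strlen(v);
			if (o + vlen >= outsz)
				return 0;
			memcpy(out + o, v, vlen);
			o += vlen;
			in += n;
			continue;
		}
		if (o + 1 >= outsz)
			return 0;
		out[o++] = *in++;
	}
	if (o >= outsz)
		return 0;
	out[o] = '\0';
	return 1;
}
```

Calling str_expand() on the same input before and after a second str_define() gives different results, which is why a single up-front preprocessing pass is not enough once definitions change in the middle of a document.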

On top of that, i have seen stray man macros, for example .B,
used in mdoc documents.  Taking all that together, it *might* make
sense in the distant future to have a common macro table for roff,
man and mdoc.  Or perhaps that's overkill and it might not, i'm
not sure.  Even if we don't go for a full common table, some way
to include the same roff macros into both man and mdoc ASTs might
turn out to be useful, without implementing them twice.  And some
way to handle at least some escape sequences as elements and blocks.
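
One possible shape for such a common table, purely as a sketch: every macro carries a bit mask of the languages that accept it, and a lookup is done against the mask of the current document.  The names (struct macro, macro_find(), the L_* flags) and the handful of entries are illustrative assumptions, not a proposal for the real tables.

```c
#include <assert.h>
#include <string.h>

/* Which language(s) accept a macro; values can be OR-ed together. */
enum {
	L_ROFF = 1 << 0,
	L_MAN  = 1 << 1,
	L_MDOC = 1 << 2
};

/* Rough node kinds, following the element/block split above. */
enum kind { K_ELEM, K_BLOCK };

struct macro {
	const char	*name;
	int		 langs;
	enum kind	 kind;
};

/* Hypothetical shared table; entries and flags are illustrative. */
static const struct macro macros[] = {
	{ "br", L_ROFF | L_MAN | L_MDOC, K_ELEM  },
	{ "sp", L_ROFF | L_MAN | L_MDOC, K_ELEM  },
	{ "in", L_ROFF | L_MAN | L_MDOC, K_BLOCK },
	{ "B",  L_MAN,                   K_ELEM  },
	{ "Bd", L_MDOC,                  K_BLOCK },
};

/*
 * Find `name` among the macros accepted by any language in `langs`.
 * Returns NULL if the macro is unknown or not valid in that context.
 */
const struct macro *
macro_find(const char *name, int langs)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(macros) / sizeof(macros[0]); i++)
		if ((macros[i].langs & langs) != 0 &&
		    strcmp(macros[i].name, name) == 0)
			return &macros[i];
	return NULL;
}
```

With this layout, a strict mdoc parser would call macro_find(name, L_MDOC) and reject a stray .B, while a lenient one could pass L_MDOC | L_MAN and accept it; the roff requests are written down once and shared by both.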

Now, this is certainly inconsistent - just some thoughts.

Yours,
  Ingo

* Re: mdocml: Unify mdoc and man enums and structs into mandoc.h.
  2010-10-03 22:36     ` Ingo Schwarze
@ 2010-10-04  6:35       ` Kristaps Dzonsons
  2010-10-04 20:05         ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Kristaps Dzonsons @ 2010-10-04  6:35 UTC (permalink / raw)
  To: tech

> Many, and conflicting ones; so i cannot present final solutions,
> but some thoughts indeed.

Ingo, yeah, by and large I agree---for the time being I'll revert the 
changes.

> Regarding libmandoc.a, sure.  Actually, i don't see much of a point
> in having libraries at all in this context; i doubt that anybody will
> ever want to use the parsers outside the mandoc program, or, put
> the other way round, all functionality that can reasonably be based
> on the mdoc language can probably reasonably be included into the
> mandoc binary program.
> 
> Regarding makewhatis, apropos and man.cgi, i do not have much hope.
> Remember that those must be able to work on the -Tascii output,
> at least in OpenBSD, because that's the only version of the manuals
> getting installed, and there is next to no hope to have that changed,
> based on what Theo and Bob say.  Besides, i don't really see a need
> to install manual source code either.  On a typical production
> system, you don't need manual source code, just as you don't need
> program source code; besides, the src.tar.gz ball is readily
> available for each release, and anonymous CVS is not rocket science
> either, in case you need the sources for some reason.

I agree, but this is not relevant: if OpenBSD doesn't want to use the 
libraries, the object files can be linked directly into mandoc.

> Regarding mandoc.h, actually, i still don't see the point.
> Why should a file like mdoc_macro.c, or even mdoc_term.c,
> be forced to include man data structures and function prototypes?
> In the current implementation, there is not a single file
> including both man.h and mdoc.h or both libman.h and libmdoc.h,
> except main.c and tree.c.  And even if there were one or two
> such files:  What is the advantage of a frontend file including
> just mandoc.h instead of man.h and mdoc.h?

Yep, this is the reason for my patch reversion.  Pushing libmdoc, 
libman, and libroff tighter together can occur without header merging of 
the {man,mdoc,roff}.h headers.

Kristaps

* Re: mdocml: Unify mdoc and man enums and structs into mandoc.h.
  2010-10-04  6:35       ` Kristaps Dzonsons
@ 2010-10-04 20:05         ` Ingo Schwarze
  0 siblings, 0 replies; 4+ messages in thread
From: Ingo Schwarze @ 2010-10-04 20:05 UTC (permalink / raw)
  To: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Mon, Oct 04, 2010 at 08:35:17AM +0200:

> if OpenBSD doesn't want to use the libraries,
> the object files can be linked directly into mandoc.

Sure, and that's what we do, and why the bsd.lv Makefile (332 lines)
is about thirteen times longer than the OpenBSD one (25 lines).

Of course, there is no problem with you playing with libraries,
as long as it doesn't hinder development, which indeed happened
*very* rarely, nearly not at all.  I only remember one single
occasion:  I think the memory management routines could not be
used in the libraries because they were part of the
non-library code, or something like that...

> Pushing libmdoc, libman, and libroff tighter together
> can occur without header merging of the {man,mdoc,roff}.h headers.

Good, thanks, i like that.

Yours,
  Ingo
