tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
* Re: mdocml: Renamed mandoc.c to libmandoc.c.
       [not found] <201007052000.o65K0uQH002309@krisdoz.my.domain>
@ 2010-07-06  3:15 ` Ingo Schwarze
  2010-07-06  9:24   ` Kristaps Dzonsons
  0 siblings, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2010-07-06  3:15 UTC (permalink / raw)
  To: tech

Hi Kristaps,

kristaps@mdocml.bsd.lv wrote on Mon, Jul 05, 2010 at 04:00:56PM -0400:

> Log Message:
> -----------
> Renamed mandoc.c to libmandoc.c.  This is in the efforts of getting a
> cleaner namespace for functions across the entire system (mandoc.h:
> getting parsed-string values, or declarations necessary for the AST
> data), and compiler functions (libmandoc.h: back-end functions and
> declarations).

I dislike this change.
I think the attempt to get a cleaner namespace this way is futile.

The whole concept of x.h vs. libx.h doesn't work out, there is
no real separation.  As far as i understand, the idea is to have
the pure AST parts in x.h, such that x.h is self-contained
except for requiring mandoc.h, while libx.h is supposed to be
the interface to one specific X parser which happens to be
using x.h, while x.h does not depend on the specific parser
interface defined by libx.h.

But this is already busted, because lots of functions from man.h
use the forward declaration of struct man from libman.h,
and lots of functions from mdoc.h use the forward declaration
of struct mdoc from libmdoc.h, so the AST cannot exist without
the one specific parser we have.  I don't say that is a problem,
because i guess nobody will ever try to write an alternative parser
using man.h or mdoc.h.  My point is just that the separation
of x.h and libx.h is mere fiction.  We could as well put
x.h and libx.h into one header (of course, don't, that would
just cause useless churn).

That said, i see no point in extending this failed concept to
mandoc.h and libmandoc.h.  I didn't speak up yet because i don't
feel that strongly about which function is defined in which header
and implemented in which .c-file, so i was mostly happy with
what we have.

But i do care about CVS history, so i think renaming mandoc.c
is a really bad idea.  We lose history for no reason.

I think this change ought to be undone.  That is, undone on the
RCS level, not just reverted by renaming the file once more.

Yours,
  Ingo


P.S.
Btw., while trying to figure out how headers and .c-files are
related, i found a few weirdnesses:

 * libmdoc.h declares mdoc_isescape and mdoc_atotime
   which are neither defined nor used
 * man.c: man_pmacro lacks static on its definition
 * man.c implements man.h without explicitely including it;
   it is only included via libman.h
 * man_macro.c: blk_close, blk_exp, blk_exp, in_line_eoln
   lack static on their definitions
 * mdoc.c: mdoc_node_free and mdoc_pmacro
   lack static on their definitions
 * mdoc.c implements mdoc.h without explicitely including it
   it is only included via libmdoc.h

These are probably easily fixed, while the following are not:

 * roff.h forward-declares struct roff
   which is not declared in any header at all,
   but only in roff.c, using more local roff.c declarations.
   That is, the roff.h interface is *not* independent of the
   particular roff.c implementation.
 * The forward declaration of struct man and struct mdoc
   in main.h is ugly, too, because effectively, that makes
   main.h depend on libman.h and libmdoc.h, thus also on
   man.h, mdoc.h and mandoc.h.  Now, main.h is used all over
   the place (html.c, man_html.c, man_term.c, mdoc_html.c,
   mdoc_term.c, term.c, term_ascii.c, term_ps.c, tree.c),
   which means that all these files depend on the full AST
   structure and BOTH parsers - yes, even the man frontends
   are not cleanly separated from the mdoc parser and vice
   versa.
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: mdocml: Renamed mandoc.c to libmandoc.c.
  2010-07-06  3:15 ` mdocml: Renamed mandoc.c to libmandoc.c Ingo Schwarze
@ 2010-07-06  9:24   ` Kristaps Dzonsons
  2010-07-06 10:58     ` Kristaps Dzonsons
  2010-07-06 23:30     ` Ingo Schwarze
  0 siblings, 2 replies; 7+ messages in thread
From: Kristaps Dzonsons @ 2010-07-06  9:24 UTC (permalink / raw)
  To: tech

Ingo,

> I dislike this change.
> I think the attempt to get a cleaner namespace this way is futile.

Yep, I agree, but was too tired to revert it last night.  Basically, I 
was trying to put together another roff_getstr() interface.  I was 
starting to abstract everything then realised---as you wrote in your 
message and I failed to see at the time---that of course you /can't/ 
have roff_getstr() in term.c because the value would be out-of-band. 
The fix is pretty mechanical (namely, doing it in mdoc_validate.c where 
the string pointer is reallocated on-the-fly if the escape is flagged), 
but I didn't have the energy to do it.

> I think this change ought to be undone.  That is, undone on the
> RCS level, not just reverted by renaming the file once more.

Sigh.  Yep.

>  * libmdoc.h declares mdoc_isescape and mdoc_atotime
>    which are neither defined nor used

I'll remove these (they've long since gone into libmandoc.h).

>  * man.c: man_pmacro lacks static on its definition

Will fix.

>  * man.c implements man.h without explicitely including it;
>    it is only included via libman.h

So do all man_* and mdoc_* files (with libman.h and libmdoc.h, resp.). 
If this bothers you, it's a mechanical change.

>  * man_macro.c: blk_close, blk_exp, blk_exp, in_line_eoln
>    lack static on their definitions
>  * mdoc.c: mdoc_node_free and mdoc_pmacro
>    lack static on their definitions

Will fix.

>  * mdoc.c implements mdoc.h without explicitely including it
>    it is only included via libmdoc.h

Ah--see above.

> 
> These are probably easily fixed, while the following are not:
> 
>  * roff.h forward-declares struct roff
>    which is not declared in any header at all,
>    but only in roff.c, using more local roff.c declarations.
>    That is, the roff.h interface is *not* independent of the
>    particular roff.c implementation.

I'm aware of this.  The fix, of course, is pretty easy: dumping the 
structures into an extern.h file and keeping the compiler functions 
within mdoc.h.

>  * The forward declaration of struct man and struct mdoc
>    in main.h is ugly, too, because effectively, that makes
>    main.h depend on libman.h and libmdoc.h, thus also on
>    man.h, mdoc.h and mandoc.h.  Now, main.h is used all over
>    the place (html.c, man_html.c, man_term.c, mdoc_html.c,
>    mdoc_term.c, term.c, term_ascii.c, term_ps.c, tree.c),
>    which means that all these files depend on the full AST
>    structure and BOTH parsers - yes, even the man frontends
>    are not cleanly separated from the mdoc parser and vice
>    versa.

Ja.  I don't worry about this too much, as fixing it's (1) not necessary 
right now, and (2) easy to do.

For the record, there should ideally be a division as follows:

               +-----libmandoc----+
            libmdoc   libman   libroff
               |         |        |
               +---------+--------+
                         |
                        main
                         |
             +-----------+-----------+
             |           |           |
           term---------html--------tree
         ps/ascii    html/xhtml  pine/birch

It's clear that /all/ of these need the mdoc.h and man.h structures. 
Only the libmdoc/man/roff/main need the parse/access functions.

Ingo, your thoughts: after the `Bk' stuff is checked in, I make another 
little tag, we clean up, fix this architectural cruft while it's fresh 
in our minds, then tag and make a real release?

Kristaps
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: mdocml: Renamed mandoc.c to libmandoc.c.
  2010-07-06  9:24   ` Kristaps Dzonsons
@ 2010-07-06 10:58     ` Kristaps Dzonsons
  2010-07-06 22:01       ` Ingo Schwarze
  2010-07-06 23:30     ` Ingo Schwarze
  1 sibling, 1 reply; 7+ messages in thread
From: Kristaps Dzonsons @ 2010-07-06 10:58 UTC (permalink / raw)
  To: tech

>> I think this change ought to be undone.  That is, undone on the
>> RCS level, not just reverted by renaming the file once more.

I don't understand this.  Isn't a `cvs add' good enough to resurrect the
file after it's been removed?
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: mdocml: Renamed mandoc.c to libmandoc.c.
  2010-07-06 10:58     ` Kristaps Dzonsons
@ 2010-07-06 22:01       ` Ingo Schwarze
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2010-07-06 22:01 UTC (permalink / raw)
  To: tech

Hi Kristaps,

>> I think this change ought to be undone.  That is, undone on the
>> RCS level, not just reverted by renaming the file once more.

> I don't understand this.  Isn't a `cvs add' good enough to resurrect the
> file after it's been removed?

Yes, it is, i only worried about the history, but even that it fully
preserved, as i just tested:

 * create a file (rev. 1.1)
 * change a line (rev. 1.2)
 * delete the file (rev. 1.3)
 * resurrect the file (rev. 1.4)

I assumed cvs annotate would then show 1.4 for all lines.
But it doesn't, it still shows 1.1 and 1.2 just like it did
before the removal; cvs annotate apparently just ignores the
deleted revision.

Thus, you are right, just cvs add & commit is the way to go.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: mdocml: Renamed mandoc.c to libmandoc.c.
  2010-07-06  9:24   ` Kristaps Dzonsons
  2010-07-06 10:58     ` Kristaps Dzonsons
@ 2010-07-06 23:30     ` Ingo Schwarze
  2010-07-07 10:56       ` Kristaps Dzonsons
  1 sibling, 1 reply; 7+ messages in thread
From: Ingo Schwarze @ 2010-07-06 23:30 UTC (permalink / raw)
  To: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Tue, Jul 06, 2010 at 11:24:18AM +0200:

[...]
> The fix is pretty mechanical (namely, doing it in mdoc_validate.c
> where the string pointer is reallocated on-the-fly if the escape is
> flagged), but I didn't have the energy to do it.

Hm, i shall wait and see what you are driving at, though i'm
a bit suspicious whether the parser backend it the best place
for this.

Apparently, groff supports constructs like the following,
see /usr/src/gnu/usr.bin/groff/tmac/tmac.doc:

.nr aP 3
.ds A3 Fl
.\*(A\n(aP

resolving via ".\*(A3" to ".Fl" which will be evaluated as -
the mdoc(7) flag macro.  Obfuscation, anybody?

I don't say we need to implement all that, probably most such
stuff will be rarely needed, but when we have the choice, it
is perhaps worthwhile doing some of the expansions on the
preprocessor level, maybe even iteratively.

Of course, things like special characters and font escapes,
which depend critically on the choice of the frontend, have
no business in the preprocessor, but belong into the frontend.
Yet, the escape sequence parsing logic for \*x, \*(xx, \*[xxx]
is the same for both cases.  Not sure how to organize the code
to deal with that situation, to minimize code duplication as
well as duplicate processing...

In any case, the one place where i would least expect to find
escape sequence parsing is in the mdoc/man parser backends.

[...]
>> * man.c implements man.h without explicitely including it;
>>   it is only included via libman.h

> So do all man_* and mdoc_* files (with libman.h and libmdoc.h,
> resp.). If this bothers you, it's a mechanical change.

I tend to directly include a header when i directly use symbols
defined there, even when another header i'm using implicitely
includes it anyway - but that is a matter of style and not so
important.

What i do find surprising is that a .c-file defining a function
(say, man.c defining man_parseln) does not explicitely include
the header declaring it (i.e., man.h declaring man_parseln),
but relies on another header it happens to be using (libman.h)
to implicitely include man.h.

As far as i see, that happens in two cases only:
  man.c   not including  man.h   directly
  mdoc.c  not including  mdoc.h  directly

But even that is certainly not critical.


[...]
> Ja.  I don't worry about this too much, as fixing it's
> (1) not necessary right now,

Yes, i agree there is no need to reshuffle the headers
right now.  I only looked at it because i was under the
impression you were about to start that kind of reshuffling,
and wanted to figure out how good or bad the situation
actually is, and whether one could expect perfect results
with reasonable effort.

Right now, let's not do more than simple fixes, let's not
move files around or change the basic structure, it works
well enough.


> For the record, there should ideally be a division as follows:
> 
>               +-----libmandoc----+
>            libmdoc   libman   libroff
>               |         |        |
>               +---------+--------+
>                         |
>                        main
>                         |
>             +-----------+-----------+
>             |           |           |
>           term---------html--------tree
>         ps/ascii    html/xhtml  pine/birch
> 
> It's clear that /all/ of these need the mdoc.h and man.h structures.
> Only the libmdoc/man/roff/main need the parse/access functions.

Yes, that basic picture is fine.  Except that many functions in man.h
and mdoc.h use pointers to struct man and struct mdoc, which contain
all the parser functions, so there is a bit of a looping.  But again,
no need to unwind that right now.

> Ingo, your thoughts: after the `Bk' stuff is checked in, I make
> another little tag, we clean up, fix this architectural cruft while
> it's fresh in our minds, then tag and make a real release?

Sounds good.  So i will do the main merging to OpenBSD after the
little tag, and perhaps we can then also put in two small fixes
to .Sm and .No i'm currently preparing.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: mdocml: Renamed mandoc.c to libmandoc.c.
  2010-07-06 23:30     ` Ingo Schwarze
@ 2010-07-07 10:56       ` Kristaps Dzonsons
  2010-07-07 21:58         ` Ingo Schwarze
  0 siblings, 1 reply; 7+ messages in thread
From: Kristaps Dzonsons @ 2010-07-07 10:56 UTC (permalink / raw)
  To: tech

Ingo,

> resolving via ".\*(A3" to ".Fl" which will be evaluated as -
> the mdoc(7) flag macro.  Obfuscation, anybody?

That's the nail in the coffin to put this in libroff, as once the 
character is re-written, it can be pumped (ROFF_RERUN) back in for more 
processing.  Sigh.

> I don't say we need to implement all that, probably most such
> stuff will be rarely needed, but when we have the choice, it
> is perhaps worthwhile doing some of the expansions on the
> preprocessor level, maybe even iteratively.
> 
> Of course, things like special characters and font escapes,
> which depend critically on the choice of the frontend, have
> no business in the preprocessor, but belong into the frontend.
> Yet, the escape sequence parsing logic for \*x, \*(xx, \*[xxx]
> is the same for both cases.  Not sure how to organize the code
> to deal with that situation, to minimize code duplication as
> well as duplicate processing...

mandoc_special() should be merged with a2roffdeco(), which would address 
this.

> In any case, the one place where i would least expect to find
> escape sequence parsing is in the mdoc/man parser backends.

Warnings about malformed escape sequences should be raised somewhere, 
and this shouldn't be in the front-end.

Thanks,

Kristaps
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: mdocml: Renamed mandoc.c to libmandoc.c.
  2010-07-07 10:56       ` Kristaps Dzonsons
@ 2010-07-07 21:58         ` Ingo Schwarze
  0 siblings, 0 replies; 7+ messages in thread
From: Ingo Schwarze @ 2010-07-07 21:58 UTC (permalink / raw)
  To: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Wed, Jul 07, 2010 at 12:56:37PM +0200:

>> resolving via ".\*(A3" to ".Fl" which will be evaluated as -
>> the mdoc(7) flag macro.  Obfuscation, anybody?

> That's the nail in the coffin to put this in libroff, as once the
> character is re-written, it can be pumped (ROFF_RERUN) back in for
> more processing.  Sigh.

From inspection, your code looks good!

It is still handling predefined strings later than user-defined
strings, but that is not so important that it needs to be fixed now.
In both old and new groff, the following works (horrors!-):

  .ds pi surprising
  This is \*(\*(Pi.

Which gives us:

  This is surprising.

The predefined string \*(Pi evaluates to "pi",
and we have defined \*(pi ourselves.  Oh well.  :-)

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-07-07 21:58 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <201007052000.o65K0uQH002309@krisdoz.my.domain>
2010-07-06  3:15 ` mdocml: Renamed mandoc.c to libmandoc.c Ingo Schwarze
2010-07-06  9:24   ` Kristaps Dzonsons
2010-07-06 10:58     ` Kristaps Dzonsons
2010-07-06 22:01       ` Ingo Schwarze
2010-07-06 23:30     ` Ingo Schwarze
2010-07-07 10:56       ` Kristaps Dzonsons
2010-07-07 21:58         ` Ingo Schwarze

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).