tech@mandoc.bsd.lv
* Implementing roff requests.
@ 2012-05-30 20:58 Jesse Hagewood
  2012-05-31  1:22 ` Ingo Schwarze
  2012-05-31 11:25 ` Kristaps Dzonsons
  0 siblings, 2 replies; 4+ messages in thread
From: Jesse Hagewood @ 2012-05-30 20:58 UTC (permalink / raw)
  To: tech

Hello,

I am a student participating in Google Summer of Code, working under
FreeBSD. Part of my project this summer is to add features to mdocml so
that it can deprecate groff in the FreeBSD core.

What I'm trying to do is implement the following roff requests:

.ad (adjust margins)
.na
.it
.ns (no-space mode)
.rs (no-space mode off)
.ti (temporary indent)
.ta (tab settings)
.hy (hyphenation)
.ne
.nh
.ni
.ps

Right now I'm working on the requests .ns and .rs. Here's how I figure I
will do that: 1) Create a flag that checks for no-space mode 2) in the
function roffhash_find, if the no-space flag is on, check if the current
roff node has the .sp or .br request and skip to the next node. 3) .rs will
simply turn the no-space flag off.

I'm not sure if this would be the correct way to implement these, so I was
wondering if anyone here had any input. I am assuming the only code I will
have to modify is in roff.c. So far all I have added to the code is
function stubs for the requests, and temporarily using an int for a
no-space flag and a simple if statement in roffhash_find to check if the
flag is on/off and if the roff macro is .sp, although that gives me a seg
fault right now.

Here are links to my project wiki and SVN repository for anyone interested:
http://wiki.freebsd.org/SummerOfCode2012/JesseHagewood
https://socsvn.freebsd.org/socsvn/soc2012/jhagewood/

Any input or advice would be greatly appreciated. Thanks!

* Re: Implementing roff requests.
  2012-05-30 20:58 Implementing roff requests Jesse Hagewood
@ 2012-05-31  1:22 ` Ingo Schwarze
  2012-05-31 11:25 ` Kristaps Dzonsons
  1 sibling, 0 replies; 4+ messages in thread
From: Ingo Schwarze @ 2012-05-31  1:22 UTC (permalink / raw)
  To: Jesse Hagewood; +Cc: tech

Hi Jesse,

Jesse Hagewood wrote on Wed, May 30, 2012 at 04:58:58PM -0400:

> I am a student participating in Google Summer of Code, working under
> FreeBSD. Part of my project this summer is to add features to mdocml

Wow.  Welcome!

> so that it can deprecate groff in the FreeBSD core.
> 
> What I'm trying to do is implement the following roff requests:
> 
> .ad (adjust margins)
> .na

Those are tough ones.

When attacking those, do not take changing term_flushln() lightly.
It is not very long, but among the trickiest parts of mandoc.

As you are drawing up a TODO list for mandoc, i guess you know

  http://mdocml.bsd.lv/cgi-bin/cvsweb/TODO?cvsroot=mdocml   ?

Adjustment is one of the missing features listed there -
even though it's neither easy nor high priority.

In general, that file usually doesn't tell you about priorities
or difficulties, and some of the entries may be unintelligible
without access to my private mailbox containing most of the
bug reports and feature requests listed there.

If in doubt about what that file wants to tell you, just ask.

> .it

Trying to implement that in mandoc sounds really crazy.
Off the top of my head, i cannot think of any reasonable way
to do so.

You know, the basic idea of mandoc is to first transform the input
text and macros into an abstract syntax tree, without making any
decisions about later visual representation, then let the formatters
work from that syntax tree, without ever going back to the original
source.

To implement .it, you need to know the output representation
before you can even start building the syntax tree.
Under the mandoc paradigm, that's a very nasty contradiction.

I advise against even trying to attack that part of the task unless
you feel like a mandoc expert *and* have a lot of time on your hands.

> .ns (no-space mode)
> .rs (no-space mode off)
> .ti (temporary indent)
> .ta (tab settings)

That sounds all feasible, though not exactly easy.
There are much easier tasks available in the TODO file,
if you want to get up to speed with mandoc hacking.
In general, roff.c and everything related to it is not
an easy entry into mandoc.

> .hy (hyphenation)
> .nh

What?  Hyphenation - as one small entry in a longer list?
Be careful to not get stuck in there, that's a can of worms.

That said, I do have code for hyphenation, but we decided to not
commit it because the feature is not very useful, and rather
intrusive on top of it.  If you really want that, i can dig it up.
The diff won't apply to current code, but it's a start.

Again, i advise against even trying that unless you feel like a
mandoc expert *and* have a lot of time on your hands.

> .ne
> .ps

I don't understand what you mean to change in that respect.
In mandoc, there is no concept of pages whatsoever.
Nor of fonts or font sizes.

So, parsing and ignoring these macros just like we do now
seems the right thing to do.
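
A minimal sketch of the "parse and ignore" approach, assuming a plain
lookup table rather than mandoc's real request hash.  The names
`struct request`, `req_ignore`, `req_sp` and `dispatch` are made up
for illustration and do not exist in roff.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of looking up one request. */
enum result { REQ_HANDLED, REQ_IGNORED, REQ_UNKNOWN };

/* Hypothetical table entry: request name plus its handler. */
struct request {
	const char	*name;
	enum result	(*handler)(const char *args);
};

/* Accept the request and its arguments, then do nothing. */
static enum result
req_ignore(const char *args)
{
	(void)args;
	return REQ_IGNORED;
}

/* Placeholder for a request that really is implemented. */
static enum result
req_sp(const char *args)
{
	(void)args;
	return REQ_HANDLED;
}

static const struct request requests[] = {
	{ "ne", req_ignore },	/* no concept of pages */
	{ "ps", req_ignore },	/* no concept of font sizes */
	{ "sp", req_sp },
};

/* Linear lookup; requests missing from the table are reported as unknown. */
static enum result
dispatch(const char *name, const char *args)
{
	size_t	 i;

	assert(name != NULL);
	for (i = 0; i < sizeof(requests) / sizeof(requests[0]); i++)
		if (strcmp(requests[i].name, name) == 0)
			return requests[i].handler(args);
	return REQ_UNKNOWN;
}
```

The point of the shared no-op handler is that an ignored request is
still recognised, so it is neither printed as text nor reported as an
unknown macro.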

> .ni

What's that?
I fail to find it anywhere, neither in the
Ossanna/Kernighan/Ritter troff manual nor in the GNU troff manual.

> Right now I'm working on the requests .ns and .rs. Here's how I figure I
> will do that: 1) Create a flag that checks for no-space mode 2) in the
> function roffhash_find, if the no-space flag is on, check if the current
> roff node has the .sp or .br request and skip to the next node.

That's clearly insufficient:  The point of .ns is not to switch off
.sp completely, but to disable it until the next output occurs.
Consider:

.ns
.fi
.sp

.ns
.Ux
.sp

In the first (man(7)) example, the .fi doesn't produce output,
so the .sp is suppressed; whereas in the second (mdoc(7)) example,
the .Ux does produce output, so the .sp is not suppressed.

So, what you have to do instead is the following:

 - In libmandoc.h, enum regs, define REG_ns (not to be confused
   with REG_nS).
 - In roff.c, implement .ns/.rs to set/clear it.
 - In the formatters (term.c, html.c), use roff_regunset()
   when output occurs - it might be tricky to get this right,
   it is not conceptually easy to say what exactly output is.
   For example, \& produces zero-length "output" in this respect.
 - In the macro-formatters (mdoc_term.c, mdoc_html.c, man_term.c,
   man_html.c) modify the .sp handlers to do nothing when the
   flag is set; use roff_regisset() to find out.

So, you have to touch at least eight files, and that's one of the
easier tasks you have set yourself.

> 3) .rs will
> simply turn the no-space flag off.

Sure.

> I'm not sure if this would be the correct way to implement these, so I was
> wondering if anyone here had any input. I am assuming the only code I will
> have to modify is in roff.c. So far all I have added to the code is
> function stubs for the requests, and temporarily using an int for a
> no-space flag and a simple if statement in roffhash_find to check if the
> flag is on/off and if the roff macro is .sp, although that gives me a seg
> fault right now.
> 
> Here are links to my project wiki and SVN repository for anyone interested:
> http://wiki.freebsd.org/SummerOfCode2012/JesseHagewood

Hum.  Strange.  des@freebsd never contributed anything to mandoc
as far as i remember, strange he should mentor a project on it.
Actually, i never heard him express interest in mandoc before this.

When working on mandoc in the FreeBSD context, you should get in
touch with Ulrich Spoerlein <uqs@> who has done a lot of integration
work so far.  He may be able to really help you.

Regarding Ben Fiedler:  That guy sent one "hello" mail when he started
his GSoC session and didn't even bother to answer the mail he received
in response to that.  He never sent any patches, and not a single
mandoc commit resulted from what he did.  Actually, i have no idea
whether he achieved anything at all, i kind of doubt it...

> https://socsvn.freebsd.org/socsvn/soc2012/jhagewood/

What, you want to implement all those features in less than three
weeks (until June 17), without having any prior experience in
mandoc hacking?  

If i were to address the list you have set up *myself*, i'd plan
for approximately the following times:

 - .ad, .na - probably a week or two
 - .it - at least a month, if it's feasible at all;
   it would probably require re-architecting the basic
   structure of mandoc
 - .ns, .rs - maybe a day, maybe two
 - .ti - maybe a day, maybe two
 - .ta - probably a few days
 - .hy, .nh - starting from what i have, maybe a week,
   maybe two, until it's barely useable in production;
   reaching good quality would take much longer
 - .ne, .ps, .ni - no estimation,
   as i don't understand what the point is

But that's with a lot of mandoc hacking experience, not at all the
times suggested for a beginner.

> Any input or advice would be greatly appreciated. Thanks!

I'd suggest you start on .ns/.rs.  If you get that committed
to at least one of bsd.lv or openbsd.org within a week, i'd
call that a success.  Then try .ti and .ta; if you manage that
until June 17, great.  If you manage .ad, i will be really
impressed (it is not impossible, but don't be disappointed
if your time is up before you even get round to start it).

Stay clear of .it and .hy.  I don't think doing that is even
remotely realistic.

Regarding the working style:  Get a CVS checkout of mdocml.bsd.lv.
Send patches against that repo as early and as often as possible.
Do *not* send large patches.  Send small patches that make one
specific thing better.  If your patches improve mandoc and cause
no regressions, i will try to test them and commit them one by one,
to both openbsd.org and bsd.lv.  If they have defects, i will try
to provide feedback.

I can't promise short reaction times, i'm sometimes rather
swamped at work.  But the next few weeks might be a bit better
than usual for me, so i *hope* i can provide some feedback and
be of some help.

In any case, have fun!
  Ingo

-- 
Ingo Schwarze <schwarze@openbsd.org>
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

* Re: Implementing roff requests.
  2012-05-30 20:58 Implementing roff requests Jesse Hagewood
  2012-05-31  1:22 ` Ingo Schwarze
@ 2012-05-31 11:25 ` Kristaps Dzonsons
  2012-05-31 12:52   ` Ingo Schwarze
  1 sibling, 1 reply; 4+ messages in thread
From: Kristaps Dzonsons @ 2012-05-31 11:25 UTC (permalink / raw)
  To: tech; +Cc: Jesse Hagewood

Hi Jesse,

Welcome!

As Ingo has mentioned, a few of these macros aren't possible given the 
layering of mandoc: roff.c is an opaque preprocessor and has limited 
functionality to cue the next processors (libmandoc.h), and no control 
at all over the front-ends (main.h).  `.ad', for example, is not 
possible given the current layering.  You'd need a separate <div> tag in 
HTML mode and it's uncertain how this would play with other 
implicit-margin-changing constructs (like lists).

As you can see, roff.c itself is quite delicate---too delicate, if you 
ask me, bowing under conditionals and loops and so on.  roff and mandoc 
don't mix well: mandoc is [more or less] reflowable, while roff is line- 
and character-based.  If your focus is moving more and more into roff, 
you may want to revive Heirloom troff instead of using mandoc.  I 
actually prefer to move /away/ from roff unless explicitly demanded. 
Things like .ns, for example, already have -mdoc equivalents.

You might get partial coverage by re-writing `ns' as an appropriate 
-mdoc or -man no-space macro, but I'm not certain about behaviour. 
Furthermore, this would pollute the layering even more with knowledge of 
the underlying parse type.  As written above, there's a well-defined 
point of diminishing returns here.

I also note that tabs are a headache: I designed mandoc poorly in this 
regard.  I think an awesome coup would be to throw out all the 
tab-handling in -mdoc, consider tabbed `Bl -column' lists to be synonyms 
for `Bd', and use the standard argument-parser.  This has been in my 
TODO OF DEATH since forever.  It would throw out a lot of ugly, 
workaround code, and also accommodate for lots of tab-separated `Bl 
-column' syntax found in the wild.

Meanwhile, if you've any implementations of macros, please send them to 
tech and we'll look them over!

Best,

Kristaps

* Re: Implementing roff requests.
  2012-05-31 11:25 ` Kristaps Dzonsons
@ 2012-05-31 12:52   ` Ingo Schwarze
  0 siblings, 0 replies; 4+ messages in thread
From: Ingo Schwarze @ 2012-05-31 12:52 UTC (permalink / raw)
  To: tech; +Cc: Jesse Hagewood

Hi,

just a few brief comments...

Kristaps Dzonsons wrote on Thu, May 31, 2012 at 01:25:08PM +0200:

> As Ingo has mentioned, a few of these macros aren't possible given
> the layering of mandoc: roff.c is an opaque preprocessor and has
> limited functionality to cue the next processors (libmandoc.h), and
> no control at all over the front-ends (main.h).  `.ad', for example,
> is not possible given the current layering.

Well, if i had to do .ad, i'd probably use a roff register to save
the mode and put the actual alignment into term_flushln.  That won't
be straightforward and might need a few more smart ideas, but that
one seems certainly possible to me - and maybe even marginally useful
because so far we don't have alignment to the right margin at all.
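
To make "alignment to the right margin" concrete, here is a minimal
sketch of adjusting one output line.  It is not term_flushln() and
does not use any mandoc data structures; `enum adjust`, `fill_line`
and the choice to hand surplus spaces to the leftmost gaps first are
all assumptions made for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical adjustment modes, as a saved register value might encode them. */
enum adjust { ADJ_LEFT, ADJ_BOTH };

/*
 * Join nwords words into out, separated by at least one space.
 * With ADJ_BOTH, extra spaces are spread over the gaps (leftmost
 * gaps first) so that the line is exactly width columns wide.
 * Returns the resulting length, or 0 if out is too small.
 */
static size_t
fill_line(const char *const *words, size_t nwords, size_t width,
    enum adjust mode, char *out, size_t outsz)
{
	size_t	 textlen = 0, gaps, pad, each, extra, i, j, pos = 0;

	assert(words != NULL && out != NULL && nwords > 0);

	for (i = 0; i < nwords; i++)
		textlen += strlen(words[i]);
	gaps = nwords - 1;

	/* Default: one space per gap; widen only if there is room. */
	pad = gaps;
	if (mode == ADJ_BOTH && gaps > 0 && width > textlen + gaps)
		pad = width - textlen;

	if (textlen + pad + 1 > outsz)
		return 0;

	each = gaps ? pad / gaps : 0;
	extra = gaps ? pad % gaps : 0;

	for (i = 0; i < nwords; i++) {
		size_t	 len = strlen(words[i]);

		memcpy(out + pos, words[i], len);
		pos += len;
		if (i == gaps)
			break;	/* no padding after the last word */
		for (j = 0; j < each + (i < extra ? 1 : 0); j++)
			out[pos++] = ' ';
	}
	out[pos] = '\0';
	return pos;
}
```

For the words "a", "bb", "c" and a width of 9, ADJ_BOTH yields
"a   bb  c" while ADJ_LEFT yields "a bb c".  A real implementation
would also have to leave the last line of a paragraph and lines ended
by a break unadjusted, which this sketch ignores.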

> You'd need a separate
> <div> tag in HTML mode and it's uncertain how this would play with
> other implicit-margin-changing constructs (like lists).

In the HTML frontend, adding style attributes to some selected
tags - for example, <p> - might suffice.  Then again, it's not
a serious problem if some frontends ignore some macros.
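
A minimal sketch of the style-attribute idea, assuming the saved .ad
mode is available as a single character.  `ad_to_css` and `open_para`
are hypothetical helpers, not functions from html.c; the mapping uses
the standard roff .ad arguments (l, r, c, b, n) and the standard CSS
text-align values:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Map a roff .ad argument to a CSS text-align value; NULL if unknown. */
static const char *
ad_to_css(char mode)
{
	switch (mode) {
	case 'l':
		return "left";
	case 'r':
		return "right";
	case 'c':
		return "center";
	case 'b':
	case 'n':
		return "justify";
	default:
		return NULL;
	}
}

/*
 * Format an opening <p> tag into buf.  An unrecognised mode gives a
 * plain <p>, i.e. the frontend silently ignores the request.
 * Returns what snprintf() returns.
 */
static int
open_para(char *buf, size_t bufsz, char mode)
{
	const char	*align = ad_to_css(mode);

	assert(buf != NULL && bufsz > 0);
	if (align == NULL)
		return snprintf(buf, bufsz, "<p>");
	return snprintf(buf, bufsz, "<p style=\"text-align: %s;\">", align);
}
```

Falling back to a bare <p> matches the remark above that it is not a
serious problem if a frontend ignores a macro.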

> As you can see, roff.c itself is quite delicate---too delicate, if
> you ask me, bowing under conditionals and loops and so on.  roff and
> mandoc don't mix well: mandoc is [more or less] reflowable, while
> roff is line- and character-based.  If your focus is moving more and
> more into roff, you may want to revive Heirloom troff instead of
> using mandoc.

Well, as i see it, improving roff(7) request support in mandoc(1)
is useful purely for backward compatibility:  As Jesse says, make
people accept mandoc(1) as the default man(7) formatter.  Those
features are certainly not intended to be used in new documents.

> I actually prefer to move /away/ from roff unless
> explicitly demanded.

Indeed, in particular for any kind of new manuals, and for
improving existing manuals as well.  I heartily agree.

> Things like .ns, for example, already have
> -mdoc equivalents.
> 
> You might get partial coverage by re-writing `ns' as an appropriate
> -mdoc or -man no-space macro, but I'm not certain about behaviour.

I think you confuse apples and oranges here:  .ns partially
suppresses *vertical* spacing, while .Ns and .Sm and even stuff
like .BI is concerned with *horizontal* spacing.

Then again, .ns is mostly pointless without traps, which are not
likely to appear soon, and of little use in mdoc(7) and man(7)
documents in general.  The point about such macros isn't that
we want to use the functionality, but that we can process
historical, messily written manuals.

> Furthermore, this would pollute the layering even more with
> knowledge of the underlying parse type.  As written above, there's a
> well-defined point of diminishing returns here.

There is a point of diminishing return, indeed; the problem is,
it isn't well-defined at all.  You remember your original stance
on roff(7) in mandoc(1) in general?  Wasn't it somewhat similar
to "over my dead body", if we go back enough before Rostock?  :)

At some point, i will possibly start to oppose low-level roff(7)
patches, maybe...  But i think Jesse is still safely away from that
point, wherever that point may actually lie; the patches he sent
so far weren't going over the top.  ;-)

And i'm going to warn him before he comes close to that point,
i think.  Right now, there is room for improvement of roff(7)
inside mandoc(1).  It's not the easiest corner of mandoc(1) nor
the most pressing, but if that's what he is interested in,
totally fine with me!

> I also note that tabs are a headache: I designed mandoc poorly in
> this regard.  I think an awesome coup would be to throw out all the
> tab-handling in -mdoc, consider tabbed `Bl -column' lists to be
> synonyms for `Bd', and use the standard argument-parser.  This has
> been in my TODO OF DEATH since forever.  It would throw out a lot of
> ugly, workaround code, and also accommodate for lots of tab-separated
> `Bl -column' syntax found in the wild.

Yes, the .Bl -column framework is complex and fragile and could
profit from refactoring (though i don't consider it completely
busted).  What you draft here is a bold plan!  In a way, it does
sound charming, but not as easy as that:  The crucial property
of .Bl -column is that the tabs are *not* at fixed positions, but
vary with the width of individual cell contents in individual rows.
Yes, it's probably feasible to design that one way or another
even in .Bd, but don't uproot everything while Jesse is trying
to find his way around the system!

> Meanwhile, if you've any implementations of macros, please send them
> to tech and we'll look them over!

Indeed.  Send often, send early - the smaller, less intrusive, and
more focussed a patch is, the easier and quicker to review and get
committed.

Yours,
  Ingo