From: Ingo Schwarze <schwarze@usta.de>
To: Jesse Hagewood <jesse.hagewood@gmail.com>
Cc: tech@mdocml.bsd.lv
Subject: Re: Implementing roff requests.
Date: Thu, 31 May 2012 03:22:46 +0200
Message-ID: <20120531012246.GA11297@iris.usta.de>
In-Reply-To: <CACfFK-UWc9Wq2cA8=W3aG4yUm1onVaAbeOi8+gF+bxBO3GOL3g@mail.gmail.com>

Hi Jesse,

Jesse Hagewood wrote on Wed, May 30, 2012 at 04:58:58PM -0400:

> I am a student participating in Google Summer of Code, working under
> FreeBSD. Part of my project this summer is to add features to mdocml

Wow.  Welcome!

> so that it can deprecate groff in the FreeBSD core.
> 
> What I'm trying to do is implement the following roff requests:
> 
> .ad (adjust margins)
> .na

Those are tough ones.

When attacking those, do not take changing term_flushln() lightly.
It is not very long, but among the trickiest parts of mandoc.

As you are drawing up a TODO list for mandoc, i guess you know

  http://mdocml.bsd.lv/cgi-bin/cvsweb/TODO?cvsroot=mdocml   ?

Adjustment is one of the missing features listed there -
even though it's neither easy nor high priority.

In general, that file usually doesn't tell you about priorities
or difficulties, and some of the entries may be unintelligible
without access to my private mailbox containing most of the
bug reports and feature requests listed there.

If in doubt about what that file wants to tell you, just ask.

> .it

Trying to implement that in mandoc sounds really crazy.
Off the top of my head, i cannot think of any reasonable way
to do so.

You know, the basic idea of mandoc is to first transform the input
text and macros into an abstract syntax tree, without making any
decisions about later visual representation, then let the formatters
work from that syntax tree, without ever going back to the original
source.

To implement .it, you need to know the output representation
before you can even start building the syntax tree.
Under the mandoc paradigm, that's a very nasty contradiction.
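
To make the two-phase idea concrete, here is a minimal sketch, not mandoc's actual code: all names (struct node, parse(), format_term()) are hypothetical, and a flat node array stands in for the real syntax tree. The point it illustrates is that the parser knows nothing about the output, and the formatter sees only nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node types; mandoc's real tree is far richer. */
enum node_type { NODE_TEXT, NODE_MACRO };

struct node {
	enum node_type	 type;
	const char	*arg;	/* text, or macro name without the dot */
};

/*
 * Phase 1: turn input lines into nodes.
 * Nothing here knows about line width, pages, or the output device.
 */
size_t
parse(const char *const *lines, size_t nlines, struct node *out)
{
	size_t i;

	for (i = 0; i < nlines; i++) {
		assert(lines[i] != NULL);
		if (lines[i][0] == '.') {
			out[i].type = NODE_MACRO;
			out[i].arg = lines[i] + 1;
		} else {
			out[i].type = NODE_TEXT;
			out[i].arg = lines[i];
		}
	}
	return nlines;
}

/*
 * Phase 2: a formatter that sees only the nodes, never the source.
 * This toy version joins text nodes with single spaces and skips
 * macro nodes.  Returns the number of bytes written, excluding NUL.
 */
size_t
format_term(const struct node *nodes, size_t n, char *buf, size_t bufsz)
{
	size_t i, len, used = 0;

	if (bufsz == 0)
		return 0;
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		if (nodes[i].type != NODE_TEXT)
			continue;
		len = strlen(nodes[i].arg);
		if (used + len + 2 > bufsz)	/* space + text + NUL */
			break;
		if (used > 0)
			buf[used++] = ' ';
		memcpy(buf + used, nodes[i].arg, len);
		used += len;
		buf[used] = '\0';
	}
	return used;
}
```

A request whose effect depends on what the formatter has already produced has no natural home in parse(), which is the contradiction described above.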

I advise against even trying to attack that part of the task unless
you feel like a mandoc expert *and* have a lot of time on your hands.

> .ns (no-space mode)
> .rs (no-space mode off)
> .ti (temporary indent)
> .ta (tab settings)

That all sounds feasible, though not exactly easy.
There are much easier tasks available in the TODO file,
if you want to get up to speed with mandoc hacking.
In general, roff.c and everything related to it is not
an easy entry into mandoc.

> .hy (hyphenation)
> .nh

What?  Hyphenation - as one small entry in a longer list?
Be careful to not get stuck in there, that's a can of worms.

That said, I do have code for hyphenation, but we decided to not
commit it because the feature is not very useful, and rather
intrusive on top of it.  If you really want that, i can dig it up.
The diff won't apply to current code, but it's a start.

Again, i advise against even trying that unless you feel like a
mandoc expert *and* have a lot of time on your hands.

> .ne
> .ps

I don't understand what you mean to change in that respect.
In mandoc, there is no concept of pages whatsoever.
Nor is there one of fonts or font sizes.

So, parsing and ignoring these macros just like we do now
seems the right thing to do.
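
As an illustration of "parse and ignore", here is a small sketch under assumed names; the table and req_lookup() are hypothetical and do not mirror the real request table in roff.c. A request in the table is recognised, so it is not reported as unknown, but it produces nothing for the formatters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum req_action { REQ_UNKNOWN, REQ_IGNORE };

/* Hypothetical list of requests that are accepted but have no effect. */
static const char *const ignored_reqs[] = { "ne", "ps" };

/* Look up a request name (without the leading dot). */
enum req_action
req_lookup(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(ignored_reqs) / sizeof(ignored_reqs[0]); i++)
		if (strcmp(name, ignored_reqs[i]) == 0)
			return REQ_IGNORE;
	return REQ_UNKNOWN;
}
```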

> .ni

What's that?
I fail to find it anywhere, neither in the
Ossanna/Kernighan/Ritter troff manual nor in the GNU troff manual.

> Right now I'm working on the requests .ns and .rs. Here's how I figure I
> will do that: 1) Create a flag that checks for no-space mode 2) in the
> function roffhash_find, if the no-space flag is on, check if the current
> roff node has the .sp or .br request and skip to the next node.

That's clearly insufficient:  The point of .ns is not to switch off
.sp completely, but to disable it until the next output occurs.
Consider:

.ns
.fi
.sp

.ns
.Ux
.sp

In the first (man(7)) example, the .fi doesn't produce output,
so the .sp is suppressed; whereas in the second (mdoc(7)) example,
the .Ux does produce output, so the .sp is not suppressed.

So, what you have to do instead is the following:

 - In libmandoc.h, enum regs, define REG_ns (not to be confused
   with REG_nS).
 - In roff.c, implement .ns/.rs to set/clear it.
 - In the formatters (term.c, html.c), use roff_regunset()
   when output occurs - it might be tricky to get this right,
   it is not conceptually easy to say what exactly output is.
   For example, \& produces zero-length "output" in this respect.
 - In the macro-formatters (mdoc_term.c, mdoc_html.c, man_term.c,
   man_html.c) modify the .sp handlers to do nothing when the
   flag is set; use roff_regisset() to find out.

So, you have to touch at least eight files, and that's one of the
easier tasks you have set yourself.
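
The plan above can be sketched roughly as follows. This is a self-contained toy, not a patch: the register set and the helpers reg_set(), reg_unset() and reg_isset() are hypothetical stand-ins, because the signatures of roff_regunset() and roff_regisset() are not shown here, and req_ns(), req_rs(), note_output() and handle_sp() are invented names for the four places the list mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register ids; REG_ns is distinct from REG_nS. */
enum reg_id { REG_nS, REG_ns, REG__MAX };

struct regset {
	bool	set[REG__MAX];
};

static void
reg_set(struct regset *r, enum reg_id id)
{
	assert(r != NULL);
	r->set[id] = true;
}

static void
reg_unset(struct regset *r, enum reg_id id)
{
	assert(r != NULL);
	r->set[id] = false;
}

static bool
reg_isset(const struct regset *r, enum reg_id id)
{
	assert(r != NULL);
	return r->set[id];
}

/* Parser side: .ns enters no-space mode, .rs leaves it. */
void req_ns(struct regset *r) { reg_set(r, REG_ns); }
void req_rs(struct regset *r) { reg_unset(r, REG_ns); }

/*
 * Formatter side: to be called whenever output occurs.
 * Deciding what counts as output is the hard part; even
 * zero-length output such as \& would end no-space mode.
 */
void note_output(struct regset *r) { reg_unset(r, REG_ns); }

/*
 * Macro-formatter side: the .sp handler does nothing while the
 * flag is set.  Returns the number of blank lines to emit.
 */
int
handle_sp(const struct regset *r, int lines)
{
	return reg_isset(r, REG_ns) ? 0 : lines;
}
```

With this sketch, the first example corresponds to req_ns() followed directly by handle_sp(), which yields 0 because .fi never calls note_output(). In the second example, .Ux calls note_output() in between, so handle_sp() returns the requested number of lines.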

> 3) .rs will
> simply turn the no-space flag off.

Sure.

> I'm not sure if this would be the correct way to implement these, so I was
> wondering if anyone here had any input. I am assuming the only code I will
> have to modify is in roff.c. So far all I have added to the code is
> function stubs for the requests, and temporarily using an int for a
> no-space flag and a simple if statement in roffhash_find to check if the
> flag is on/off and if the roff macro is .sp, although that gives me a seg
> fault right now.
> 
> Here are links to my project wiki and SVN repository for anyone interested:
> http://wiki.freebsd.org/SummerOfCode2012/JesseHagewood

Hum.  Strange.  des@freebsd never contributed anything to mandoc
as far as i remember, strange he should mentor a project on it.
Actually, i never heard him express interest in mandoc before this.

When working on mandoc in the FreeBSD context, you should get in
touch with Ulrich Spoerlein <uqs@> who has done a lot of integration
work so far.  He may be able to really help you.

Regarding Ben Fiedler:  That guy sent one "hello" mail when he started
his GSoC session and didn't even bother to answer the mail he received
in response to that.  He never sent any patches, and not a single
mandoc commit resulted from what he did.  Actually, i have no idea
whether he achieved anything at all, i kind of doubt it...

> https://socsvn.freebsd.org/socsvn/soc2012/jhagewood/

What, you want to implement all those features in less than three
weeks (until June 17), without having any prior experience in
mandoc hacking?  

If i were to address the list you have set up *myself*, i'd plan
for approximately the following times:

 - .ad, .na - probably a week or two
 - .it - at least a month, if it's feasible at all;
   it would probably require re-architecting the basic
   structure of mandoc
 - .ns, .rs - maybe a day, maybe two
 - .ti - maybe a day, maybe two
 - .ta - probably a few days
 - .hy, .nh - starting from what i have, maybe a week,
   maybe two, until it's barely useable in production;
   reaching good quality would take much longer
 - .ne, .ps, .ni - no estimation,
   as i don't understand what the point is

But that's with a lot of mandoc hacking experience, not at all the
times suggested for a beginner.

> Any input or advice would be greatly appreciated. Thanks!

I'd suggest you start on .ns/.rs.  If you get that committed
to at least one of bsd.lv or openbsd.org within a week, i'd
call that a success.  Then try .ti and .ta; if you manage that
by June 17, great.  If you manage .ad, i will be really
impressed (it is not impossible, but don't be disappointed
if your time is up before you even get round to starting it).

Stay clear of .it and .hy.  I don't think doing that is even
remotely realistic.

Regarding the working style:  Get a CVS checkout of mdocml.bsd.lv.
Send patches against that repo as early and as often as possible.
Do *not* send large patches.  Send small patches that make one
specific thing better.  If your patches improve mandoc and cause
no regressions, i will try to test them and commit them one by one,
to both openbsd.org and bsd.lv.  If they have defects, i will try
to provide feedback.

I can't promise short reaction times, i'm sometimes rather
swamped at work.  But the next few weeks might be a bit better
than usual for me, so i *hope* i can provide some feedback and
be of some help.

In any case, have fun!
  Ingo

-- 
Ingo Schwarze <schwarze@openbsd.org>