9front - general discussion about 9front
From: sl @ 2018-08-10  1:17 UTC
  To: 9front

WARNING

I'm posting this to the 9front list because only spammers
are subscribed to the werc list.

BACKGROUND

werc/apps/wman/app.rc prints man pages as HTML:

	fn wman_page_gen {
	    #troff -manhtml $1| troff2html -t 'Plan 9 from User Space'
	    troff -N -m$wman_tmac $1 | wman_out_filter
	}

The function wman_default_out_filter then performs some magic to
transform the resulting plain text into markdown, which is in turn
processed by the standard werc handlers.  This produces minimal
HTML (as opposed to the commented-out, original troff pipeline,
which produces hard to read HTML containing tables and other
relatively complex structures).

Recently, someone pointed out that wman botches HREF links when troff
-N wraps a line at the hyphen inside a man page reference:

	; hget http://man.9front.org/8/venti | grep fmt | sed -n 2,3p
	          were formatted with fmtarenas or fmtisect (see venti-
	          <a href="../8/fmt">fmt(8)</a>). In particular, only the configuration needs to be

FIX

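Currently, I have instituted a medium- to low-quality fix
on the running system by inserting an ssam(1) line into the
wman_default_out_filter function.  It deletes the newline and
indentation between the hyphen and the rest of the reference,
so the sed(1) stage that follows sees the whole name: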

	fn wman_default_out_filter {
	    # col -x syntax is the same for UNIX and Plan 9.
	    escape_html \
	    | ssam 'x/[a-z]+-\n[ ]+[a-z]+\([0-9]\)/s/\n[ ]+//g' \
	    | sed 's!([\.\-a-zA-Z0-9]+)\(('^`{echo $wman_cat_list|tr ' ' '|'}^')\)!<a href="../\2/\1">&</a>!g' \
	    | awk '/^$/ {if(n != 1) print; n=1; next} /./ {n=0; print}' \
	    | col -x
	}

Now we get:

	; hget http://man.9front.org/8/venti | grep fmt | sed -n 2,3p
	          were formatted with fmtarenas or fmtisect (see <a href="../8/venti-fmt">venti-fmt(8)</a>). In particular, only the configuration needs to be
	                       fmtarenas.

WHINING

This sucks for a couple of reasons:

	- ssam(1) creates a temporary file on disk.

	- page formatting is now dicked-up, as we remove
	a newline every time we fix a link, so the joined
	line runs far past the wrap column (see the output
	above).

Plan 9 sed(1) and awk(1) do not recognize the \n shorthand for
newline that is available in sam(1).

It should be possible to address this with awk(1), but I'm out of
time for today.
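
One way the awk(1) approach might look, as a sketch rather than a
tested replacement: hold each line back by one, and when the held line
ends in a hyphenated fragment and the current line starts with the rest
of a page reference, move the fragment down onto the current line
instead of joining the two lines.  No newline is removed, so the line
count is preserved, and the program never needs \n.  The variable name
wman_join_refs is hypothetical, and the patterns are copied from the
ssam(1) line above.  The assignment is written so that it should be
valid in both rc and sh.

```shell
# awk program kept in a variable (hypothetical name wman_join_refs).
# It works one line behind the input, so no newline escape is needed.
wman_join_refs='
NR > 1 {
	# prev ends in "word-" and this line starts with "word(N)":
	# troff wrapped a page reference at its hyphen.
	if (match(prev, /[a-z]+-$/) && $0 ~ /^[ \t]+[a-z]+\([0-9]\)/) {
		frag = substr(prev, RSTART)          # e.g. "venti-"
		prev = substr(prev, 1, RSTART - 1)   # drop it from the old line
		sub(/[ \t]+$/, "", prev)             # and the blank left behind
		match($0, /^[ \t]+/)                 # measure the indentation
		$0 = substr($0, 1, RLENGTH) frag substr($0, RLENGTH + 1)
	}
	print prev
}
{ prev = $0 }
END { if (NR > 0) print prev }
'
```

In wman_default_out_filter, this stage would presumably take the place
of the ssam(1) line:

	    | awk $wman_join_refs \

The line that gives up its fragment ends up shorter than troff would
have left it, and the line that receives it grows by the same amount,
but no lines are joined or dropped.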

Suggestions welcome.

sl
