tech@mandoc.bsd.lv
* [PATCH] Understand EQ/EN blocks.
@ 2011-02-03 10:26 Kristaps Dzonsons
  2011-02-05 14:25 ` Ingo Schwarze
  0 siblings, 1 reply; 2+ messages in thread
From: Kristaps Dzonsons @ 2011-02-03 10:26 UTC
  To: tech, Jason McIntyre

[-- Attachment #1: Type: text/plain, Size: 1189 bytes --]

Hi,

Enclosed is a patch that sets us down the road of understanding EQ/EN 
blocks.  It's quite simple.  Like TBL, it intercepts equations in 
libroff (roff.c).  It then passes the EQN parse (eqn.c) back to the 
main driver (main.c), which will eventually (but does not yet) push 
these into libmdoc and libman.

The difference between TBL and EQN is that EQN simply sucks in 
everything between .EQ and .EN, while TBL must interpret what happens 
inside its .TS/.TE blocks.
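
As a rough illustration of that "suck everything in" behaviour (a
standalone sketch, not the code in the attached patch), the accumulation
could look like the following.  The names eqn_buf and eqn_buf_line are
made up for this example, and joining lines with a newline is an
assumption of the sketch; the patch itself concatenates lines directly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct eqn: raw text plus length. */
struct eqn_buf {
	size_t	 sz;	/* bytes stored, excluding the terminating NUL */
	char	*data;	/* accumulated equation text, or NULL if empty */
};

/*
 * Feed one input line to an open equation.
 * Returns 1 if the line was ".EN" (equation closed), 0 if the line
 * was appended, and -1 on allocation failure.
 */
static int
eqn_buf_line(struct eqn_buf *b, const char *line)
{
	size_t	 len;
	char	*np;

	assert(b != NULL && line != NULL);

	if (strcmp(line, ".EN") == 0)
		return 1;

	len = strlen(line);
	/* Room for the old text, a separating newline, the line, and NUL. */
	np = realloc(b->data, b->sz + len + 2);
	if (np == NULL)
		return -1;
	b->data = np;

	/* Assumption of this sketch: keep line boundaries as '\n'. */
	if (b->sz > 0)
		b->data[b->sz++] = '\n';
	memcpy(b->data + b->sz, line, len + 1);
	b->sz += len;
	return 0;
}
```

A caller would zero-initialise a struct eqn_buf on .EQ, pass every
following line to eqn_buf_line() until it returns 1, and free() the data
member afterwards.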

For now, the bloat is one more file (the parser for EQN, which is for 
now just a few lines), some conditionals in the libroff parser to check 
for active equation parses, and handling of EQ/EN.

The equations themselves are thrown away.  This is only because I wanted 
to float a patch early with the libroff framework.  It's trivial to add 
an addspan()-style routine to both libmdoc and libman that at least 
prints out the data.  I'll do that with an Ok or two.

Thoughts?

By the way, we'll be able to fully support "delim", as libroff allows us 
to chop up its input lines and send them to the other backends, so

  foo $a + b$ bar

assuming `$$' are the EQN delimiters, would be passed back as

  foo
  EQN:a + b
  bar

Neat.
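
The splitting described above might be sketched roughly as follows.
This is only an illustration: delim_split, chunk_fn, chunk_log and
log_chunk are names invented for the example and do not exist in
libroff, and unlike the output shown above, the sketch leaves the
whitespace around each chunk untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical callback type: receives one chunk of the line, with
 * is_eqn set when the chunk sat between the delimiters.
 */
typedef void (*chunk_fn)(const char *chunk, size_t len, int is_eqn,
    void *arg);

/*
 * Split `line` at the opening and closing delimiters and report each
 * non-empty chunk in order.  Returns the number of chunks reported.
 * An opening delimiter without a closing one is treated as plain text.
 */
static int
delim_split(const char *line, char open, char close, chunk_fn fn,
    void *arg)
{
	const char	*start, *end;
	int		 n = 0;

	assert(line != NULL && fn != NULL);
	assert(open != '\0' && close != '\0');

	while (*line != '\0') {
		start = strchr(line, open);
		end = start != NULL ? strchr(start + 1, close) : NULL;
		if (end == NULL) {
			/* No complete inline equation left: all text. */
			fn(line, strlen(line), 0, arg);
			return n + 1;
		}
		if (start > line) {
			fn(line, (size_t)(start - line), 0, arg);
			n++;
		}
		if (end > start + 1) {
			fn(start + 1, (size_t)(end - start - 1), 1, arg);
			n++;
		}
		line = end + 1;
	}
	return n;
}

/* Example sink: collects "chunk\n" or "EQN:chunk\n" in a fixed buffer. */
struct chunk_log {
	char	 buf[256];
	size_t	 used;
};

static void
log_chunk(const char *chunk, size_t len, int is_eqn, void *arg)
{
	struct chunk_log *log = arg;
	size_t	 room = sizeof(log->buf) - log->used;
	int	 n;

	n = snprintf(log->buf + log->used, room, "%s%.*s\n",
	    is_eqn ? "EQN:" : "", (int)len, chunk);
	/* Only advance when the whole chunk fit into the buffer. */
	if (n > 0 && (size_t)n < room)
		log->used += (size_t)n;
}
```

Calling delim_split("foo $a + b$ bar", '$', '$', log_chunk, &log) with a
zeroed struct chunk_log reports three chunks: the text before the
equation, the equation body, and the text after it.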

Thanks,

Kristaps

[-- Attachment #2: patch.eqn.txt --]
[-- Type: text/plain, Size: 10221 bytes --]

? XTextWidth.man
? awk.1
? config.h
? config.log
? foo.1
? foo.1.html
? foo.1.xhtml
? foo.3
? foo.3.ps
? foo.html
? gcc.1
? gm.1
? gm.1.html
? mandoc
? mandoc.1.htm
? patch.eqn.txt
? patch.txt
? pcap-savefile.manfile.in
? roff.patch
? style.old.css
Index: Makefile
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/Makefile,v
retrieving revision 1.309
diff -u -r1.309 Makefile
--- Makefile	7 Jan 2011 15:22:21 -0000	1.309
+++ Makefile	3 Feb 2011 10:14:03 -0000
@@ -31,11 +31,11 @@
 
 LINTFLAGS += $(VFLAGS)
 
-ROFFLNS    = roff.ln tbl.ln tbl_opts.ln tbl_layout.ln tbl_data.ln
+ROFFLNS    = roff.ln tbl.ln tbl_opts.ln tbl_layout.ln tbl_data.ln eqn.ln
 
-ROFFSRCS   = roff.c tbl.c tbl_opts.c tbl_layout.c tbl_data.c
+ROFFSRCS   = roff.c tbl.c tbl_opts.c tbl_layout.c tbl_data.c eqn.c
 
-ROFFOBJS   = roff.o tbl.o tbl_opts.o tbl_layout.o tbl_data.o
+ROFFOBJS   = roff.o tbl.o tbl_opts.o tbl_layout.o tbl_data.o eqn.o
 
 MANDOCLNS  = mandoc.ln
 
Index: eqn.c
===================================================================
RCS file: eqn.c
diff -N eqn.c
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ eqn.c	3 Feb 2011 10:14:03 -0000
@@ -0,0 +1,78 @@
+/*	$Id: tbl.c,v 1.22 2011/01/25 12:24:27 schwarze Exp $ */
+/*
+ * Copyright (c) 2011 Kristaps Dzonsons <kristaps@bsd.lv>
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "mandoc.h"
+#include "roff.h"
+#include "libmandoc.h"
+#include "libroff.h"
+
+/* ARGSUSED */
+enum rofferr
+eqn_read(struct eqn_node **epp, int ln, const char *p, int offs)
+{
+	size_t		 sz;
+	struct eqn_node	*ep;
+
+	if (0 == strcmp(p, ".EN")) {
+		*epp = NULL;
+		return(ROFF_EQN);
+	}
+
+	ep = *epp;
+
+	sz = strlen(&p[offs]);
+	ep->eqn.data = mandoc_realloc(ep->eqn.data, ep->eqn.sz + sz + 1);
+	if (0 == ep->eqn.sz)
+		*ep->eqn.data = '\0';
+
+	ep->eqn.sz += sz;
+	strlcat(ep->eqn.data, &p[offs], ep->eqn.sz + 1);
+	return(ROFF_IGN);
+}
+
+struct eqn_node *
+eqn_alloc(int pos, int line)
+{
+	struct eqn_node	*p;
+
+	p = mandoc_calloc(1, sizeof(struct eqn_node));
+	p->line = line;
+	p->pos = pos;
+	return(p);
+}
+
+void
+eqn_end(struct eqn_node *e)
+{
+
+	/* Nothing to do. */
+}
+
+void
+eqn_free(struct eqn_node *p)
+{
+
+	free(p->eqn.data);
+	free(p);
+}
Index: libroff.h
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/libroff.h,v
retrieving revision 1.17
diff -u -r1.17 libroff.h
--- libroff.h	25 Jan 2011 12:24:27 -0000	1.17
+++ libroff.h	3 Feb 2011 10:14:03 -0000
@@ -43,6 +43,13 @@
 	struct tbl_node	 *next;
 };
 
+struct	eqn_node {
+	int		  pos; /* invocation column */
+	int		  line; /* invocation line */
+	struct eqn	  eqn;
+	struct eqn_node	 *next;
+};
+
 #define	TBL_MSG(tblp, type, line, col) \
 	(*(tblp)->msg)((type), (tblp)->data, (line), (col), NULL)
 
@@ -57,6 +64,10 @@
 int		 tbl_cdata(struct tbl_node *, int, const char *);
 const struct tbl_span	*tbl_span(struct tbl_node *);
 void		 tbl_end(struct tbl_node *);
+struct eqn_node	*eqn_alloc(int, int);
+void		 eqn_end(struct eqn_node *);
+void		 eqn_free(struct eqn_node *);
+enum rofferr 	 eqn_read(struct eqn_node **, int, const char *, int);
 
 __END_DECLS
 
Index: main.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/main.c,v
retrieving revision 1.142
diff -u -r1.142 main.c
--- main.c	2 Feb 2011 21:40:45 -0000	1.142
+++ main.c	3 Feb 2011 10:14:03 -0000
@@ -863,6 +863,8 @@
 				else
 					mdoc_addspan(curp->mdoc, span);
 			}
+		} else if (ROFF_EQN == rr) {
+			assert(curp->man || curp->mdoc);
 		} else if (curp->man || curp->mdoc) {
 			rc = curp->man ?
 				man_parseln(curp->man, 
Index: mandoc.h
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc.h,v
retrieving revision 1.54
diff -u -r1.54 mandoc.h
--- mandoc.h	2 Feb 2011 21:40:45 -0000	1.54
+++ mandoc.h	3 Feb 2011 10:14:03 -0000
@@ -267,6 +267,11 @@
 	struct tbl_span	 *next;
 };
 
+struct	eqn {
+	size_t		  sz;
+	char		 *data;
+};
+
 /*
  * Available registers (set in libroff, accessed elsewhere).
  */
Index: roff.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/roff.c,v
retrieving revision 1.124
diff -u -r1.124 roff.c
--- roff.c	25 Jan 2011 01:12:02 -0000	1.124
+++ roff.c	3 Feb 2011 10:14:03 -0000
@@ -64,6 +64,8 @@
 	ROFF_TS,
 	ROFF_TE,
 	ROFF_T_,
+	ROFF_EQ,
+	ROFF_EN,
 	ROFF_cblock,
 	ROFF_ccond, /* FIXME: remove this. */
 	ROFF_USERDEF,
@@ -93,6 +95,9 @@
 	struct tbl_node	*first_tbl; /* first table parsed */
 	struct tbl_node	*last_tbl; /* last table parsed */
 	struct tbl_node	*tbl; /* current table being parsed */
+	struct eqn_node	*last_eqn; /* last equation parsed */
+	struct eqn_node	*first_eqn; /* first equation parsed */
+	struct eqn_node	*eqn; /* current equation being parsed */
 };
 
 struct	roffnode {
@@ -151,6 +156,8 @@
 static	enum rofferr	 roff_so(ROFF_ARGS);
 static	enum rofferr	 roff_TE(ROFF_ARGS);
 static	enum rofferr	 roff_TS(ROFF_ARGS);
+static	enum rofferr	 roff_EQ(ROFF_ARGS);
+static	enum rofferr	 roff_EN(ROFF_ARGS);
 static	enum rofferr	 roff_T_(ROFF_ARGS);
 static	enum rofferr	 roff_userdef(ROFF_ARGS);
 
@@ -189,6 +196,8 @@
 	{ "TS", roff_TS, NULL, NULL, 0, NULL },
 	{ "TE", roff_TE, NULL, NULL, 0, NULL },
 	{ "T&", roff_T_, NULL, NULL, 0, NULL },
+	{ "EQ", roff_EQ, NULL, NULL, 0, NULL },
+	{ "EN", roff_EN, NULL, NULL, 0, NULL },
 	{ ".", roff_cblock, NULL, NULL, 0, NULL },
 	{ "\\}", roff_ccond, NULL, NULL, 0, NULL },
 	{ NULL, roff_userdef, NULL, NULL, 0, NULL },
@@ -311,15 +320,22 @@
 roff_free1(struct roff *r)
 {
 	struct tbl_node	*t;
+	struct eqn_node	*e;
 
-	while (r->first_tbl) {
-		t = r->first_tbl;
+	while (NULL != (t = r->first_tbl)) {
 		r->first_tbl = t->next;
 		tbl_free(t);
 	}
 
 	r->first_tbl = r->last_tbl = r->tbl = NULL;
 
+	while (NULL != (e = r->first_eqn)) {
+		r->first_eqn = e->next;
+		eqn_free(e);
+	}
+
+	r->first_eqn = r->last_eqn = r->eqn = NULL;
+
 	while (r->last)
 		roffnode_pop(r);
 
@@ -477,6 +493,8 @@
 	 * First, if a scope is open and we're not a macro, pass the
 	 * text through the macro's filter.  If a scope isn't open and
 	 * we're not a macro, just let it through.
+	 * Finally, if there's an equation scope open, divert it into it
+	 * no matter our state.
 	 */
 
 	if (r->last && ! ROFF_CTL((*bufp)[pos])) {
@@ -485,18 +503,26 @@
 		e = (*roffs[t].text)
 			(r, t, bufp, szp, ln, pos, pos, offs);
 		assert(ROFF_IGN == e || ROFF_CONT == e);
-		if (ROFF_CONT == e && r->tbl)
+		if (ROFF_CONT != e)
+			return(e);
+		if (r->eqn)
+			return(eqn_read(&r->eqn, ln, *bufp, *offs));
+		if (r->tbl)
 			return(tbl_read(r->tbl, ln, *bufp, *offs));
-		return(e);
+		return(ROFF_CONT);
 	} else if ( ! ROFF_CTL((*bufp)[pos])) {
+		if (r->eqn)
+			return(eqn_read(&r->eqn, ln, *bufp, *offs));
 		if (r->tbl)
 			return(tbl_read(r->tbl, ln, *bufp, *offs));
 		return(ROFF_CONT);
-	}
+	} else if (r->eqn)
+		return(eqn_read(&r->eqn, ln, *bufp, *offs));
 
 	/*
 	 * If a scope is open, go to the child handler for that macro,
 	 * as it may want to preprocess before doing anything with it.
+	 * Don't do so if an equation is open.
 	 */
 
 	if (r->last) {
@@ -532,6 +558,13 @@
 		(*r->msg)(MANDOCERR_SCOPEEXIT, r->data,
 				r->last->line, r->last->col, NULL);
 
+	if (r->eqn) {
+		(*r->msg)(MANDOCERR_SCOPEEXIT, r->data, 
+				r->eqn->line, r->eqn->pos, NULL);
+		eqn_end(r->eqn);
+		r->eqn = NULL;
+	}
+
 	if (r->tbl) {
 		(*r->msg)(MANDOCERR_SCOPEEXIT, r->data, 
 				r->tbl->line, r->tbl->pos, NULL);
@@ -1140,6 +1173,33 @@
 
 /* ARGSUSED */
 static enum rofferr
+roff_EQ(ROFF_ARGS)
+{
+	struct eqn_node	*e;
+
+	assert(NULL == r->eqn);
+	e = eqn_alloc(ppos, ln);
+
+	if (r->last_eqn)
+		r->last_eqn->next = e;
+	else
+		r->first_eqn = r->last_eqn = e;
+
+	r->eqn = r->last_eqn = e;
+	return(ROFF_IGN);
+}
+
+/* ARGSUSED */
+static enum rofferr
+roff_EN(ROFF_ARGS)
+{
+
+	(*r->msg)(MANDOCERR_NOSCOPE, r->data, ln, ppos, NULL);
+	return(ROFF_IGN);
+}
+
+/* ARGSUSED */
+static enum rofferr
 roff_TS(ROFF_ARGS)
 {
 	struct tbl_node	*t;
@@ -1240,7 +1300,6 @@
 	   ROFF_REPARSE : ROFF_APPEND);
 }
 
-
 static char *
 roff_getname(struct roff *r, char **cpp, int ln, int pos)
 {
@@ -1274,7 +1333,6 @@
 	return(name);
 }
 
-
 /*
  * Store *string into the user-defined string called *name.
  * In multiline mode, append to an existing entry and append '\n';
@@ -1344,7 +1402,6 @@
 	*c = '\0';
 }
 
-
 static const char *
 roff_getstrn(const struct roff *r, const char *name, size_t len)
 {
@@ -1357,7 +1414,6 @@
 	return(n ? n->string : NULL);
 }
 
-
 static void
 roff_freestr(struct roff *r)
 {
@@ -1378,4 +1434,11 @@
 {
 	
 	return(r->tbl ? tbl_span(r->tbl) : NULL);
+}
+
+const struct eqn *
+roff_eqn(const struct roff *r)
+{
+	
+	return(r->last_eqn ? &r->last_eqn->eqn : NULL);
 }
Index: roff.h
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/roff.h,v
retrieving revision 1.22
diff -u -r1.22 roff.h
--- roff.h	1 Jan 2011 16:18:39 -0000	1.22
+++ roff.h	3 Feb 2011 10:14:03 -0000
@@ -25,6 +25,7 @@
 	ROFF_SO, /* include another file */
 	ROFF_IGN, /* ignore current line */
 	ROFF_TBL, /* a table row was successfully parsed */
+	ROFF_EQN, /* an equation was successfully parsed */
 	ROFF_ERR /* badness: puke and stop */
 };
 


* Re: [PATCH] Understand EQ/EN blocks.
  2011-02-03 10:26 [PATCH] Understand EQ/EN blocks Kristaps Dzonsons
@ 2011-02-05 14:25 ` Ingo Schwarze
  0 siblings, 0 replies; 2+ messages in thread
From: Ingo Schwarze @ 2011-02-05 14:25 UTC
  To: tech

Hi Kristaps,

> Enclosed is a patch that sets us down the road of understanding
> EQ/EN blocks.

From reading, merging and running it, this makes sense to me.

[...]
> The equations themselves are thrown away.  This is only because I
> wanted to float a patch early with the libroff framework.  It's
> trivial to add an addspan()-style routine to both libmdoc and libman
> that at least prints out the data.  I'll do that with an Ok or two.

That is indeed the logical next step.

> Thoughts?

I guess I would merge it for release if we get at least a bit of
added value compared to just ignoring the .EQ lines with an ERROR,
as we currently do.  After looking at

  http://heirloom.sourceforge.net/doctools/eqn.1b.html

I think the easiest and most useful bits that could make rendering
better are:

 * remove "left" and "right"
 * pass through strings enclosed in double quotes "..."
   untouched, removing the quotes
 * convert "over" to "/"
 * convert "x sub i" to "x_i" and "x sup n" to "x^n"
 * replace ~ and ^ by a space, and collapse multiple spaces,
   tabs, and/or newlines to become a single space
 * remove spaces after ({[ and before )}]
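
A rough, token-based sketch of part of that first group of rules is
given here.  It is an illustration only: eqn_simplify and tok_is are
made-up names, and the sketch covers just "left"/"right", "over",
"sub"/"sup", the ~ and ^ spacing characters and whitespace collapsing.
Quoted strings and the spacing around brackets are not handled.

```c
#include <assert.h>
#include <string.h>

/* Return non-zero if the token of length `len` equals keyword `kw`. */
static int
tok_is(const char *tok, size_t len, const char *kw)
{
	return len == strlen(kw) && strncmp(tok, kw, len) == 0;
}

/*
 * Rewrite eqn source `in` into `out` (capacity `outsz`) using a subset
 * of the simplifications listed above.  Returns 0 on success, -1 if
 * `out` is too small.
 */
static int
eqn_simplify(const char *in, char *out, size_t outsz)
{
	/* ~ and ^ are eqn spacing characters, so treat them as blanks. */
	static const char sep[] = " \t\n~^";
	const char	*tok;
	size_t		 len, used = 0;
	int		 glue = 1;	/* no space before the next token */
	int		 op, space;

	assert(in != NULL && out != NULL && outsz > 0);

	for (;;) {
		in += strspn(in, sep);	/* collapse runs of separators */
		if (*in == '\0')
			break;
		tok = in;
		len = strcspn(in, sep);
		in += len;

		/* Drop the bracket sizing keywords entirely. */
		if (tok_is(tok, len, "left") || tok_is(tok, len, "right"))
			continue;

		op = 0;
		if (tok_is(tok, len, "sub")) {
			tok = "_"; len = 1; op = 1;
		} else if (tok_is(tok, len, "sup")) {
			tok = "^"; len = 1; op = 1;
		} else if (tok_is(tok, len, "over")) {
			tok = "/"; len = 1;
		}

		/* "_" and "^" attach directly to both neighbours. */
		space = !glue && !op;
		if (used + (size_t)space + len + 1 > outsz)
			return -1;
		if (space)
			out[used++] = ' ';
		memcpy(out + used, tok, len);
		used += len;
		glue = op;
	}
	out[used] = '\0';
	return 0;
}
```

With this sketch, the input "x sub i ~=~ left ( a over b right ) sup 2"
is rewritten to "x_i = ( a / b )^2".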

A bit less important:

 * remove "size [+|-]n", roman, italic, bold, and "font n"
 * remove "mark" and "lineup"

Much less important:

 * implement roman, italic, bold
 * implement under

Everything else should probably just be passed through, at least
for now:

 * Let braces { } be braces; converting them to parentheses ( )
   would just make the eqn text harder to grok, since some formulae
   would end up with lots of adjacent, similar tokens.
 * just keep keywords like: sqrt, from, to, pile, matrix, ccol,
   dot, hat, tilde, bar, vec, sum, int, inf
 * leave "define" untouched
 * don't worry about GNU extensions for now

> By the way, we'll be able to fully support "delim", as libroff
> allows us to chop up its input lines and send them to the other
> backends, so
> 
>  foo $a + b$ bar
> 
> assuming `$$' are the EQN delimiters, would be passed back as
> 
>  foo
>  EQN:a + b
>  bar

That does seem worthwhile, too, maybe even more useful than the
two items marked "less important" above.

Thanks for getting this started,
  Ingo
