* [PATCH] Massive restructuring into mandoc.h/libmandoc.a.
From: Kristaps Dzonsons @ 2011-03-21 17:37 UTC
To: tech

[-- Attachment #1: Type: text/plain, Size: 2175 bytes --]

Hi,

This monster patch is the final part of restructuring the libraries.
If the patch doesn't work, I've included the patched sources.

With this, the structure of the mdocml package looks like:

  mdocXXX.o (libmdoc.h)-----+
                            |
  manXXX.o (libman.h)---(libmandoc.h)    libmandoc.a (mandoc.h)
                            |      |          |
  roffXXX.o (libroff.h)-----+      |        main.o
  tblXXX.o                         |        XXXterm.o
  eqnXXX.o                    libmandoc.o   XXXhtml.o
                              read.o        etc.

In other words, mandoc.h is the public interface to ALL parsing
functionality (via mparse_readfd()). No more having main.c dance
around libroff and libmdoc, no more inclusion of mdoc.h and man.h in
the front-ends. It's all in one header file.

Internally, libmandoc.h is now the shared header file used by all
back-end compilers. libmdoc.h, libman.h, and libroff.h have been
retained as the private headers of each compiler back-end.

There's no more libmdoc, libroff, or libman. They all now live in
libmandoc.a.

This considerably simplifies the spaghetti-mess of inclusions,
hierarchies, and so on, although there still remains some clean-up to
be done (chars.h, out.h, main.h, etc.).

On a related note, mdoc.3, man.3, and roff.3 have had their important
parts merged into mandoc.3 (this is in progress, as the functions
themselves still need to be documented, although it's kind of
obvious). The rest I've discarded.

I also re-wrote the Makefile to be much more readable and fixable, and
to track dependencies much more closely. This changes a few things on
the www site: I no longer print the ChangeLog (which, according to my
logs, is rarely visited and anyway supplanted by cvsweb), I added
eqn.7, and I removed the other .3 manuals.
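
As a rough illustration of the layering described above (this is not
code from the patch): the sketch below shows one public entry point
hiding two private back-ends, so a front-end never touches them
directly. Every toy_* name is made up for the example, and the
".Dd means mdoc, .TH means man" check is only a toy heuristic. The
real entry point is mparse_readfd() in mandoc.h, whose exact signature
should be taken from the header rather than from this sketch.

```c
/*
 * Toy sketch of the "one public header" layering.
 * NOT mandoc code: every toy_* name here is hypothetical.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What a public header (compare mandoc.h) would expose. */
enum toy_lang {
	TOY_NONE,	/* no back-end selected yet */
	TOY_MDOC,
	TOY_MAN
};

struct toy_parse {
	enum toy_lang	 lang;	/* back-end chosen from the input */
	int		 lines;	/* lines handed to that back-end */
};

/*
 * Private back-ends (compare libmdoc.h and libman.h).  They are
 * static, so a front-end can only reach them via toy_parse_line().
 */
static int
toy_mdoc_line(struct toy_parse *p, const char *ln)
{
	(void)ln;	/* a real back-end would build its AST here */
	p->lines++;
	return 1;
}

static int
toy_man_line(struct toy_parse *p, const char *ln)
{
	(void)ln;
	p->lines++;
	return 1;
}

/* Toy heuristic: pick a back-end from the first macro line. */
static enum toy_lang
toy_detect(const char *ln)
{
	if (0 == strncmp(ln, ".Dd", 3))
		return TOY_MDOC;
	if (0 == strncmp(ln, ".TH", 3))
		return TOY_MAN;
	return TOY_NONE;
}

/*
 * The single public entry point.  Returns 1 if the line was handed
 * to a back-end, 0 if no back-end has been selected yet.
 */
int
toy_parse_line(struct toy_parse *p, const char *ln)
{
	assert(p != NULL && ln != NULL);

	if (TOY_NONE == p->lang)
		p->lang = toy_detect(ln);

	switch (p->lang) {
	case TOY_MDOC:
		return toy_mdoc_line(p, ln);
	case TOY_MAN:
		return toy_man_line(p, ln);
	default:
		return 0;
	}
}
```

A front-end would start from a struct toy_parse initialised to
{ TOY_NONE, 0 } and feed it lines through toy_parse_line() only. Once
a back-end is selected it stays selected for the rest of the input,
which is why the front-end no longer needs the per-language headers.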

I notice that the compiled binary has bloated. Any ideas why? This
troubles me a great deal.

Things that remain to be done in the immediate future:

 - get rid of mdoc_isdelim() (using cues somehow?)
 - merge chars.h into out.h
 - do something about out.h/main.h

Thoughts? Reactions?

Kristaps

[-- Attachment #2: patch.txt --]
[-- Type: text/plain, Size: 83495 bytes --]

? bar.1
? baz.1
? foo.1
? index.c
? patch.txt
? regress/output
Index: ChangeLog.xsl
===================================================================
RCS file: ChangeLog.xsl
diff -N ChangeLog.xsl
--- ChangeLog.xsl 21 Sep 2009 15:12:03 -0000 1.4
+++ /dev/null 1 Jan 1970 00:00:00 -0000
@@ -1,43 +0,0 @@
-<?xml version='1.0' encoding="utf-8"?>
-<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" >
-<xsl:output encoding="utf-8" method="html" indent="yes" doctype-public="-//W3C//DTD HTML 4.01 Transitional//EN" />
-<xsl:template match="/changelog">
-<html>
- <head>
- <title>mdocml - CVS-ChangeLog</title>
- <link rel="stylesheet" href="index.css" type="text/css" media="all" />
- </head>
- <body>
- <xsl:for-each select="entry">
- <div class="clhead">
- <xsl:text>Files modified by </xsl:text>
- <xsl:value-of select="concat(author, ': ', date, ' (', time, ')')" />
- </div>
- <div class="clbody">
- <strong>
- <xsl:text>Note: </xsl:text>
- </strong>
- <xsl:value-of select="msg"/>
- <ul class="clbody">
- <xsl:for-each select="file">
- <li>
- <xsl:value-of select="name"/>
- <span class="rev">
- <xsl:text> — Rev: </xsl:text>
- <xsl:value-of select="revision"/>
- <xsl:text>, Status: </xsl:text>
- <xsl:value-of select="cvsstate"/>
- <xsl:if test="tag">
- <xsl:text>, Tag: </xsl:text>
- <xsl:value-of select="tag" />
- </xsl:if>
- </span>
- </li>
- </xsl:for-each>
- </ul>
- </div>
- </xsl:for-each>
- </body>
-</html>
-</xsl:template>
-</xsl:stylesheet>
Index: Makefile
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/Makefile,v
retrieving revision 1.315 diff -u -r1.315 Makefile --- Makefile 20 Mar 2011 11:41:24 -0000 1.315 +++ Makefile 21 Mar 2011 17:23:12 -0000 @@ -1,344 +1,304 @@ -.SUFFIXES: .html .xml .sgml .1 .3 .7 .md5 .tar.gz -.SUFFIXES: .1.txt .3.txt .7.txt -.SUFFIXES: .1.xhtml .3.xhtml .7.xhtml -.SUFFIXES: .1.sgml .3.sgml .7.sgml -.SUFFIXES: .h .h.html -.SUFFIXES: .1.ps .3.ps .7.ps -.SUFFIXES: .1.pdf .3.pdf .7.pdf - -PREFIX = /usr/local -BINDIR = $(PREFIX)/bin -INCLUDEDIR = $(PREFIX)/include -LIBDIR = $(PREFIX)/lib -MANDIR = $(PREFIX)/man -EXAMPLEDIR = $(PREFIX)/share/examples/mandoc -INSTALL = install -INSTALL_PROGRAM = $(INSTALL) -m 0755 -INSTALL_DATA = $(INSTALL) -m 0444 -INSTALL_LIB = $(INSTALL) -m 0644 -INSTALL_MAN = $(INSTALL_DATA) - -VERSION = 1.10.10 -VDATE = 20 March 2011 - -VFLAGS = -DVERSION="\"$(VERSION)\"" -WFLAGS = -W -Wall -Wstrict-prototypes -Wno-unused-parameter -Wwrite-strings -CFLAGS += -g $(WFLAGS) $(VFLAGS) -DHAVE_CONFIG_H +.PHONY: clean install +.SUFFIXES: .sgml .html .md5 .h .h.html +.SUFFIXES: .1 .3 .7 +.SUFFIXES: .1.txt .3.txt .7.txt +.SUFFIXES: .1.pdf .3.pdf .7.pdf +.SUFFIXES: .1.ps .3.ps .7.ps +.SUFFIXES: .1.html .3.html .7.html +.SUFFIXES: .1.xhtml .3.xhtml .7.xhtml # Specify this if you want to hard-code the operating system to appear # in the lower-left hand corner of -mdoc manuals. 
-# CFLAGS += -DOSNAME="\"OpenBSD 4.5\"" +# CFLAGS += -DOSNAME="\"OpenBSD 4.5\"" -LINTFLAGS += $(VFLAGS) +VERSION = 1.10.10 +VDATE = 20 March 2011 +CFLAGS += -g -DHAVE_CONFIG_H -DVERSION="\"$(VERSION)\"" +CFLAGS += -W -Wall -Wstrict-prototypes -Wno-unused-parameter -Wwrite-strings +PREFIX = /usr/local +BINDIR = $(PREFIX)/bin +INCLUDEDIR = $(PREFIX)/include +LIBDIR = $(PREFIX)/lib +MANDIR = $(PREFIX)/man +EXAMPLEDIR = $(PREFIX)/share/examples/mandoc +INSTALL = install +INSTALL_PROGRAM = $(INSTALL) -m 0755 +INSTALL_DATA = $(INSTALL) -m 0444 +INSTALL_LIB = $(INSTALL) -m 0644 +INSTALL_MAN = $(INSTALL_DATA) + +all: mandoc + +SRCS = Makefile \ + arch.c \ + arch.in \ + att.c \ + att.in \ + chars.c \ + chars.h \ + chars.in \ + compat.c \ + config.h.post \ + config.h.pre \ + eqn.7 \ + eqn.c \ + example.style.css \ + external.png \ + html.c \ + html.h \ + index.c \ + index.css \ + index.sgml \ + lib.c \ + lib.in \ + libman.h \ + libmandoc.h \ + libmdoc.h \ + libroff.h \ + main.c \ + main.h \ + man.7 \ + man.c \ + man_argv.c \ + man_hash.c \ + man_html.c \ + man_macro.c \ + man_term.c \ + man_validate.c \ + mandoc.1 \ + mandoc.3 \ + mandoc.c \ + mandoc.h \ + mandoc_char.7 \ + mdoc.7 \ + mdoc.c \ + mdoc_argv.c \ + mdoc_hash.c \ + mdoc_html.c \ + mdoc_macro.c \ + mdoc_term.c \ + mdoc_validate.c \ + msec.c \ + msec.in \ + out.c \ + out.h \ + read.c \ + roff.7 \ + roff.c \ + st.c \ + st.in \ + style.css \ + tbl.7 \ + tbl.c \ + tbl_data.c \ + tbl_html.c \ + tbl_layout.c \ + tbl_opts.c \ + tbl_term.c \ + term.c \ + term.h \ + term_ascii.c \ + term_ps.c \ + test-strlcat.c \ + test-strlcpy.c \ + tree.c \ + vol.c \ + vol.in + +LIBMAN_OBJS = man.o \ + man_argv.o \ + man_hash.o \ + man_macro.o \ + man_validate.o +LIBMDOC_OBJS = arch.o \ + att.o \ + chars.o \ + lib.o \ + mdoc.o \ + mdoc_argv.o \ + mdoc_hash.o \ + mdoc_macro.o \ + mdoc_validate.o \ + msec.o \ + st.o \ + vol.o +LIBROFF_OBJS = eqn.o \ + roff.o \ + tbl.o \ + tbl_data.o \ + tbl_layout.o \ + tbl_opts.o +LIBMANDOC_OBJS = 
$(LIBMAN_OBJS) \ + $(LIBMDOC_OBJS) \ + $(LIBROFF_OBJS) \ + mandoc.o \ + read.o + +arch.o: arch.in +att.o: att.in +chars.o: chars.in +lib.o: lib.in +msec.o: msec.in +st.o: st.in +vol.o: vol.in + +$(LIBMAN_OBJS): libmdoc.h +$(LIBMDOC_OBJS): libmdoc.h +$(LIBROFF_OBJS): libroff.h +$(LIBMANDOC_OBJS): mandoc.h libmandoc.h config.h + +MANDOC_HTML_OBJS = html.o \ + man_html.o \ + mdoc_html.o \ + tbl_html.o +MANDOC_TERM_OBJS = man_term.o \ + mdoc_term.o \ + term.o \ + term_ascii.o \ + term_ps.o \ + tbl_term.o +MANDOC_OBJS = $(MANDOC_HTML_OBJS) \ + $(MANDOC_TERM_OBJS) \ + main.o \ + out.o \ + tree.o + +$(MANDOC_HTML_OBJS): html.h +$(MANDOC_TERM_OBJS): term.h +$(MANDOC_OBJS): main.h mandoc.h config.h out.h chars.h + +compat.o: config.h + +INDEX_MANS = mandoc.1.html \ + mandoc.1.xhtml \ + mandoc.1.ps \ + mandoc.1.pdf \ + mandoc.1.txt \ + mandoc.3.html \ + mandoc.3.xhtml \ + mandoc.3.ps \ + mandoc.3.pdf \ + mandoc.3.txt \ + eqn.7.html \ + eqn.7.xhtml \ + eqn.7.ps \ + eqn.7.pdf \ + eqn.7.txt \ + man.7.html \ + man.7.xhtml \ + man.7.ps \ + man.7.pdf \ + man.7.txt \ + mandoc_char.7.html \ + mandoc_char.7.xhtml \ + mandoc_char.7.ps \ + mandoc_char.7.pdf \ + mandoc_char.7.txt \ + mdoc.7.html \ + mdoc.7.xhtml \ + mdoc.7.ps \ + mdoc.7.pdf \ + mdoc.7.txt \ + roff.7.html \ + roff.7.xhtml \ + roff.7.ps \ + roff.7.pdf \ + roff.7.txt \ + tbl.7.html \ + tbl.7.xhtml \ + tbl.7.ps \ + tbl.7.pdf \ + tbl.7.txt + +$(INDEX_MANS): mandoc + +INDEX_OBJS = $(INDEX_MANS) \ + mandoc.h.html \ + mdocml.tar.gz \ + mdocml.md5 -ROFFLNS = roff.ln tbl.ln tbl_opts.ln tbl_layout.ln tbl_data.ln eqn.ln - -ROFFSRCS = roff.c tbl.c tbl_opts.c tbl_layout.c tbl_data.c eqn.c - -ROFFOBJS = roff.o tbl.o tbl_opts.o tbl_layout.o tbl_data.o eqn.o - -MANDOCLNS = mandoc.ln - -MANDOCSRCS = mandoc.c - -MANDOCOBJS = mandoc.o - -MDOCLNS = mdoc_macro.ln mdoc.ln mdoc_hash.ln \ - mdoc_argv.ln mdoc_validate.ln \ - lib.ln att.ln arch.ln vol.ln msec.ln st.ln - -MDOCOBJS = mdoc_macro.o mdoc.o mdoc_hash.o \ - mdoc_argv.o mdoc_validate.o 
lib.o att.o \ - arch.o vol.o msec.o st.o - -MDOCSRCS = mdoc_macro.c mdoc.c mdoc_hash.c \ - mdoc_argv.c mdoc_validate.c lib.c att.c \ - arch.c vol.c msec.c st.c - -MANLNS = man_macro.ln man.ln man_hash.ln man_validate.ln \ - man_argv.ln - -MANOBJS = man_macro.o man.o man_hash.o man_validate.o \ - man_argv.o -MANSRCS = man_macro.c man.c man_hash.c man_validate.c \ - man_argv.c - -MAINLNS = main.ln mdoc_term.ln chars.ln term.ln tree.ln \ - compat.ln man_term.ln html.ln mdoc_html.ln \ - man_html.ln out.ln term_ps.ln term_ascii.ln \ - tbl_term.ln tbl_html.ln read.ln - -MAINOBJS = main.o mdoc_term.o chars.o term.o tree.o compat.o \ - man_term.o html.o mdoc_html.o man_html.o out.o \ - term_ps.o term_ascii.o tbl_term.o tbl_html.o read.o - -MAINSRCS = main.c mdoc_term.c chars.c term.c tree.c compat.c \ - man_term.c html.c mdoc_html.c man_html.c out.c \ - term_ps.c term_ascii.c tbl_term.c tbl_html.c read.c - -LLNS = llib-llibmdoc.ln llib-llibman.ln llib-lmandoc.ln \ - llib-llibmandoc.ln llib-llibroff.ln - -LNS = $(MAINLNS) $(MDOCLNS) $(MANLNS) \ - $(MANDOCLNS) $(ROFFLNS) - -LIBS = libmdoc.a libman.a libmandoc.a libroff.a - -OBJS = $(MDOCOBJS) $(MAINOBJS) $(MANOBJS) \ - $(MANDOCOBJS) $(ROFFOBJS) - -SRCS = $(MDOCSRCS) $(MAINSRCS) $(MANSRCS) \ - $(MANDOCSRCS) $(ROFFSRCS) - -DATAS = arch.in att.in lib.in msec.in st.in \ - vol.in chars.in - -HEADS = mdoc.h libmdoc.h man.h libman.h term.h \ - libmandoc.h html.h chars.h out.h main.h roff.h \ - mandoc.h libroff.h - -GSGMLS = mandoc.1.sgml mdoc.3.sgml mdoc.7.sgml \ - mandoc_char.7.sgml man.7.sgml man.3.sgml roff.7.sgml \ - roff.3.sgml tbl.7.sgml eqn.7.sgml - -SGMLS = index.sgml - -XHTMLS = mandoc.1.xhtml mdoc.3.xhtml \ - man.3.xhtml mdoc.7.xhtml man.7.xhtml mandoc_char.7.xhtml \ - roff.7.xhtml roff.3.xhtml tbl.7.xhtml eqn.7.xhtml - -HTMLS = ChangeLog.html index.html man.h.html mdoc.h.html \ - mandoc.h.html roff.h.html mandoc.1.html mdoc.3.html \ - man.3.html mdoc.7.html man.7.html mandoc_char.7.html \ - roff.7.html roff.3.html 
tbl.7.html eqn.7.html - -PSS = mandoc.1.ps mdoc.3.ps man.3.ps mdoc.7.ps man.7.ps \ - mandoc_char.7.ps roff.7.ps roff.3.ps tbl.7.ps eqn.7.ps - -PDFS = mandoc.1.pdf mdoc.3.pdf man.3.pdf mdoc.7.pdf man.7.pdf \ - mandoc_char.7.pdf roff.7.pdf roff.3.pdf tbl.7.pdf eqn.7.pdf - -XSLS = ChangeLog.xsl - -TEXTS = mandoc.1.txt mdoc.3.txt man.3.txt mdoc.7.txt man.7.txt \ - mandoc_char.7.txt ChangeLog.txt \ - roff.7.txt roff.3.txt tbl.7.txt eqn.7.txt - -EXAMPLES = example.style.css - -XMLS = ChangeLog.xml - -STATICS = index.css style.css external.png - -MD5S = mdocml-$(VERSION).md5 - -TARGZS = mdocml-$(VERSION).tar.gz - -MANS = mandoc.1 mdoc.3 mdoc.7 mandoc_char.7 man.7 \ - man.3 roff.7 roff.3 tbl.7 eqn.7 - -BINS = mandoc - -TESTS = test-strlcat.c test-strlcpy.c - -CONFIGS = config.h.pre config.h.post - -DOCLEAN = $(BINS) $(LNS) $(LLNS) $(LIBS) $(OBJS) $(HTMLS) \ - $(TARGZS) tags $(MD5S) $(XMLS) $(TEXTS) $(GSGMLS) \ - config.h config.log $(PSS) $(PDFS) $(XHTMLS) - -DOINSTALL = $(SRCS) $(HEADS) Makefile $(MANS) $(SGMLS) $(STATICS) \ - $(DATAS) $(XSLS) $(EXAMPLES) $(TESTS) $(CONFIGS) - -all: $(BINS) - -lint: $(LLNS) +www: index.html clean: - rm -f $(DOCLEAN) - -dist: mdocml-$(VERSION).tar.gz - -www: all $(GSGMLS) $(HTMLS) $(XHTMLS) $(TEXTS) $(MD5S) $(TARGZS) $(PSS) $(PDFS) - -ps: $(PSS) - -pdf: $(PDFS) + rm -f libmandoc.a $(LIBMANDOC_OBJS) + rm -f mandoc $(MANDOC_OBJS) + rm -f config.h compat.o config.log + rm -f mdocml.tar.gz + rm -f index.html $(INDEX_OBJS) -installwww: www - $(INSTALL_DATA) $(HTMLS) $(XHTMLS) $(PSS) $(PDFS) $(TEXTS) $(STATICS) $(DESTDIR)$(PREFIX)/ - $(INSTALL_DATA) mdocml-$(VERSION).tar.gz $(DESTDIR)$(PREFIX)/snapshots/ - $(INSTALL_DATA) mdocml-$(VERSION).md5 $(DESTDIR)$(PREFIX)/snapshots/ - $(INSTALL_DATA) mdocml-$(VERSION).tar.gz $(DESTDIR)$(PREFIX)/snapshots/mdocml.tar.gz - $(INSTALL_DATA) mdocml-$(VERSION).md5 $(DESTDIR)$(PREFIX)/snapshots/mdocml.md5 - -install: +install: all mkdir -p $(DESTDIR)$(BINDIR) mkdir -p $(DESTDIR)$(EXAMPLEDIR) mkdir -p 
$(DESTDIR)$(MANDIR)/man1 + mkdir -p $(DESTDIR)$(MANDIR)/man3 mkdir -p $(DESTDIR)$(MANDIR)/man7 $(INSTALL_PROGRAM) mandoc $(DESTDIR)$(BINDIR) + $(INSTALL_LIB) libmandoc.a $(DESTDIR)$(LIBDIR)/ $(INSTALL_MAN) mandoc.1 $(DESTDIR)$(MANDIR)/man1 + $(INSTALL_MAN) mandoc.3 $(DESTDIR)$(MANDIR)/man3 $(INSTALL_MAN) man.7 mdoc.7 roff.7 eqn.7 tbl.7 mandoc_char.7 $(DESTDIR)$(MANDIR)/man7 $(INSTALL_DATA) example.style.css $(DESTDIR)$(EXAMPLEDIR) -uninstall: - rm -f $(DESTDIR)$(BINDIR)/mandoc - rm -f $(DESTDIR)$(MANDIR)/man1/mandoc.1 - rm -f $(DESTDIR)$(MANDIR)/man7/mdoc.7 - rm -f $(DESTDIR)$(MANDIR)/man7/roff.7 - rm -f $(DESTDIR)$(MANDIR)/man7/eqn.7 - rm -f $(DESTDIR)$(MANDIR)/man7/tbl.7 - rm -f $(DESTDIR)$(MANDIR)/man7/man.7 - rm -f $(DESTDIR)$(MANDIR)/man7/mandoc_char.7 - rm -f $(DESTDIR)$(EXAMPLEDIR)/example.style.css - -$(OBJS): config.h - -$(LNS): config.h - -man_macro.ln man_macro.o: man_macro.c libman.h - -lib.ln lib.o: lib.c lib.in libmdoc.h - -att.ln att.o: att.c att.in libmdoc.h - -arch.ln arch.o: arch.c arch.in libmdoc.h - -vol.ln vol.o: vol.c vol.in libmdoc.h - -chars.ln chars.o: chars.c chars.in chars.h - -msec.ln msec.o: msec.c msec.in libmdoc.h - -st.ln st.o: st.c st.in libmdoc.h - -mdoc_macro.ln mdoc_macro.o: mdoc_macro.c libmdoc.h - -mdoc_term.ln mdoc_term.o: mdoc_term.c term.h mdoc.h - -man_hash.ln man_hash.o: man_hash.c libman.h - -mdoc_hash.ln mdoc_hash.o: mdoc_hash.c libmdoc.h - -mdoc.ln mdoc.o: mdoc.c libmdoc.h - -man.ln man.o: man.c libman.h - -main.ln main.o: main.c mdoc.h man.h roff.h - -compat.ln compat.o: compat.c - -term.ln term.o: term.c term.h man.h mdoc.h chars.h - -term_ps.ln term_ps.o: term_ps.c term.h main.h - -term_ascii.ln term_ascii.o: term_ascii.c term.h main.h - -html.ln html.o: html.c html.h chars.h - -mdoc_html.ln mdoc_html.o: mdoc_html.c html.h mdoc.h - -man_html.ln man_html.o: man_html.c html.h man.h out.h - -out.ln out.o: out.c out.h - -mandoc.ln mandoc.o: mandoc.c libmandoc.h - -tree.ln tree.o: tree.c man.h mdoc.h - -mdoc_argv.ln 
mdoc_argv.o: mdoc_argv.c libmdoc.h - -man_argv.ln man_argv.o: man_argv.c libman.h - -man_validate.ln man_validate.o: man_validate.c libman.h - -mdoc_validate.ln mdoc_validate.o: mdoc_validate.c libmdoc.h - -libmdoc.h: mdoc.h - -ChangeLog.xml: - cvs2cl --xml --xml-encoding iso-8859-15 -t --noxmlns -f $@ - -ChangeLog.txt: - cvs2cl -t -f $@ - -ChangeLog.html: ChangeLog.xml ChangeLog.xsl - xsltproc -o $@ ChangeLog.xsl ChangeLog.xml - -mdocml-$(VERSION).tar.gz: $(DOINSTALL) - mkdir -p .dist/mdocml/mdocml-$(VERSION)/ - cp -f $(DOINSTALL) .dist/mdocml/mdocml-$(VERSION)/ - ( cd .dist/mdocml/ && tar zcf ../../$@ mdocml-$(VERSION)/ ) +installwww: www + $(INSTALL_DATA) $(INDEX_MANS) $(PREFIX) + $(INSTALL_DATA) mandoc.h.html $(PREFIX) + $(INSTALL_DATA) external.png style.css index.css $(PREFIX) + $(INSTALL_DATA) mdocml.tar.gz $(PREFIX)/snapshots + $(INSTALL_DATA) mdocml.md5 $(PREFIX)/snapshots + $(INSTALL_DATA) mdocml.tar.gz $(PREFIX)/snapshots/mdocml-$(VERSION).tar.gz + $(INSTALL_DATA) mdocml.md5 $(PREFIX)/snapshots/mdocml-$(VERSION).md5 + +libmandoc.a: compat.o $(LIBMANDOC_OBJS) + $(AR) rs $@ compat.o $(LIBMANDOC_OBJS) + +mandoc: $(MANDOC_OBJS) libmandoc.a + $(CC) -o $@ $(MANDOC_OBJS) libmandoc.a + +mdocml.md5: mdocml.tar.gz + md5 mdocml.tar.gz >$@ + +mdocml.tar.gz: $(SRCS) + mkdir -p .dist/mdocml-$(VERSION)/ + $(INSTALL) -m 0444 $(SRCS) .dist/mdocml-$(VERSION) + ( cd .dist/ && tar zcf ../$@ ./ ) rm -rf .dist/ -llib-llibmdoc.ln: $(MDOCLNS) - $(LINT) -Clibmdoc $(MDOCLNS) - -llib-llibman.ln: $(MANLNS) - $(LINT) -Clibman $(MANLNS) - -llib-llibmandoc.ln: $(MANDOCLNS) - $(LINT) -Clibmandoc $(MANDOCLNS) - -llib-llibroff.ln: $(ROFFLNS) - $(LINT) -Clibroff $(ROFFLNS) - -llib-lmandoc.ln: $(MAINLNS) llib-llibmdoc.ln llib-llibman.ln llib-llibmandoc.ln llib-llibroff.ln - $(LINT) -Cmandoc $(MAINLNS) llib-llibmdoc.ln llib-llibman.ln llib-llibmandoc.ln llib-llibroff.ln - -libmdoc.a: $(MDOCOBJS) - $(AR) rs $@ $(MDOCOBJS) +index.html: $(INDEX_OBJS) -libman.a: $(MANOBJS) - $(AR) rs $@ 
$(MANOBJS) - -libmandoc.a: $(MANDOCOBJS) - $(AR) rs $@ $(MANDOCOBJS) - -libroff.a: $(ROFFOBJS) - $(AR) rs $@ $(ROFFOBJS) - -mandoc: $(MAINOBJS) libroff.a libmdoc.a libman.a libmandoc.a - $(CC) $(CFLAGS) -o $@ $(MAINOBJS) libroff.a libmdoc.a libman.a libmandoc.a +config.h: config.h.pre config.h.post + rm -f config.log + ( cat config.h.pre; \ + echo; \ + if $(CC) $(CFLAGS) -Werror -o test-strlcat test-strlcat.c >> config.log 2>&1; then \ + echo '#define HAVE_STRLCAT'; \ + rm test-strlcat; \ + fi; \ + if $(CC) $(CFLAGS) -Werror -o test-strlcpy test-strlcpy.c >> config.log 2>&1; then \ + echo '#define HAVE_STRLCPY'; \ + rm test-strlcpy; \ + fi; \ + echo; \ + cat config.h.post \ + ) > $@ -.sgml.html: - validate --warn $< - sed -e "s!@VERSION@!$(VERSION)!" -e "s!@VDATE@!$(VDATE)!" $< > $@ +.h.h.html: + highlight -I $< >$@ .1.1.txt .3.3.txt .7.7.txt: - ./mandoc -Tascii -Wall,stop $< | col -b > $@ + ./mandoc -Tascii -Wall,stop $< | col -b >$@ -.1.1.sgml .3.3.sgml .7.7.sgml: - ./mandoc -Thtml -Wall,stop -Ostyle=style.css,man=%N.%S.html,includes=%I.html $< > $@ +.1.1.html .3.3.html .7.7.html: + ./mandoc -Thtml -Wall,stop -Ostyle=style.css,man=%N.%S.html,includes=%I.html $< >$@ .1.1.ps .3.3.ps .7.7.ps: - ./mandoc -Tps -Wall,stop $< > $@ + ./mandoc -Tps -Wall,stop $< >$@ .1.1.xhtml .3.3.xhtml .7.7.xhtml: - ./mandoc -Txhtml -Wall,stop -Ostyle=style.css,man=%N.%S.xhtml,includes=%I.html $< > $@ + ./mandoc -Txhtml -Wall,stop -Ostyle=style.css,man=%N.%S.xhtml,includes=%I.html $< >$@ .1.1.pdf .3.3.pdf .7.7.pdf: - ./mandoc -Tpdf -Wall,stop $< > $@ + ./mandoc -Tpdf -Wall,stop $< >$@ -.tar.gz.md5: - md5 $< > $@ - -.h.h.html: - highlight -I $< >$@ - -config.h: config.h.pre config.h.post - rm -f config.log - ( cat config.h.pre; \ - echo; \ - if $(CC) $(CFLAGS) -Werror -o test-strlcat test-strlcat.c >> config.log 2>&1; then \ - echo '#define HAVE_STRLCAT'; \ - rm test-strlcat; \ - fi; \ - if $(CC) $(CFLAGS) -Werror -o test-strlcpy test-strlcpy.c >> config.log 2>&1; then \ - echo '#define 
HAVE_STRLCPY'; \ - rm test-strlcpy; \ - fi; \ - echo; \ - cat config.h.post \ - ) > $@ +.sgml.html: + validate --warn $< + sed -e "s!@VERSION@!$(VERSION)!" -e "s!@VDATE@!$(VDATE)!" $< >$@ Index: chars.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/chars.h,v retrieving revision 1.7 diff -u -r1.7 chars.h --- chars.h 30 Jan 2011 16:05:37 -0000 1.7 +++ chars.h 21 Mar 2011 17:23:12 -0000 @@ -18,6 +18,10 @@ #ifndef CHARS_H #define CHARS_H +/* + * FIXME: MERGE THIS WITH OUT.H. + */ + __BEGIN_DECLS enum chars { Index: eqn.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/eqn.c,v retrieving revision 1.3 diff -u -r1.3 eqn.c --- eqn.c 15 Mar 2011 16:23:51 -0000 1.3 +++ eqn.c 21 Mar 2011 17:23:13 -0000 @@ -25,7 +25,6 @@ #include <time.h> #include "mandoc.h" -#include "roff.h" #include "libmandoc.h" #include "libroff.h" Index: index.sgml =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/index.sgml,v retrieving revision 1.105 diff -u -r1.105 index.sgml --- index.sgml 7 Jan 2011 15:22:21 -0000 1.105 +++ index.sgml 21 Mar 2011 17:23:13 -0000 @@ -39,10 +39,9 @@ </P> <P> - <SPAN CLASS="nm">mdocml</SPAN> consists of the <A HREF="mdoc.3.html">libmdoc</A>, <A - HREF="man.3.html">libman</A>, and <A HREF="roff.3.html">libroff</A> validating compilers; and <A - HREF="mandoc.1.html">mandoc</A>, which interfaces with the compiler libraries to format output for UNIX - terminals, XHTML, HTML, PostScript, and PDF. It is a <A CLASS="external" + <SPAN CLASS="nm">mdocml</SPAN> consists of the <A HREF="mandoc.3.html">libmandoc</A> validating + compilers and <A HREF="mandoc.1.html">mandoc</A>, which interfaces with the compiler library to format + output for UNIX terminals, XHTML, HTML, PostScript, and PDF. It is a <A CLASS="external" HREF="http://bsd.lv/">BSD.lv</A> project. 
</P> @@ -60,8 +59,7 @@ <P> <SPAN CLASS="nm">mdocml</SPAN> is architecture- and system-neutral, written in plain-old C. The most - current version is <SPAN CLASS="attn">@VERSION@</SPAN>, dated <SPAN class="attn">@VDATE@</SPAN>. A full - <A HREF="ChangeLog.html">ChangeLog</A> (<A HREF="ChangeLog.txt">txt</A>) is written with each release. + current version is <SPAN CLASS="attn">@VERSION@</SPAN>, dated <SPAN class="attn">@VDATE@</SPAN>. </P> <H2> @@ -172,38 +170,14 @@ </TD> </TR> <TR> - <TD VALIGN="top"><A HREF="man.3.html">man(3)</A></TD> + <TD VALIGN="top"><A HREF="mandoc.3.html">mandoc(3)</A></TD> <TD VALIGN="top"> - man macro compiler library + mandoc macro compiler library <SPAN STYLE="font-size: smaller;"> - (<A HREF="man.3.txt">text</A> | - <A HREF="man.3.xhtml">xhtml</A> | - <A HREF="man.3.pdf">pdf</A> | - <A HREF="man.3.ps">postscript</A>) - </SPAN> - </TD> - </TR> - <TR> - <TD VALIGN="top"><A HREF="mdoc.3.html">mdoc(3)</A></TD> - <TD VALIGN="top"> - mdoc macro compiler library - <SPAN STYLE="font-size: smaller;"> - (<A HREF="mdoc.3.txt">text</A> | - <A HREF="mdoc.3.xhtml">xhtml</A> | - <A HREF="mdoc.3.pdf">pdf</A> | - <A HREF="mdoc.3.ps">postscript</A>) - </SPAN> - </TD> - </TR> - <TR> - <TD VALIGN="top"><A HREF="roff.3.html">roff(3)</A></TD> - <TD VALIGN="top"> - roff macro compiler library - <SPAN STYLE="font-size: smaller;"> - (<A HREF="roff.3.txt">text</A> | - <A HREF="roff.3.xhtml">xhtml</A> | - <A HREF="roff.3.pdf">pdf</A> | - <A HREF="roff.3.ps">postscript</A>) + (<A HREF="mandoc.3.txt">text</A> | + <A HREF="mandoc.3.xhtml">xhtml</A> | + <A HREF="mandoc.3.pdf">pdf</A> | + <A HREF="mandoc.3.ps">postscript</A>) </SPAN> </TD> </TR> @@ -216,6 +190,18 @@ <A HREF="man.7.xhtml">xhtml</A> | <A HREF="man.7.pdf">pdf</A> | <A HREF="man.7.ps">postscript</A>) + </SPAN> + </TD> + </TR> + <TR> + <TD VALIGN="top"><A HREF="eqn.7.html">eqn(7)</A></TD> + <TD VALIGN="top"> + eqn-mandoc language reference + <SPAN STYLE="font-size: smaller;"> + (<A 
HREF="eqn.7.txt">text</A> | + <A HREF="eqn.7.xhtml">xhtml</A> | + <A HREF="eqn.7.pdf">pdf</A> | + <A HREF="eqn.7.ps">postscript</A>) </SPAN> </TD> </TR> Index: libman.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/libman.h,v retrieving revision 1.47 diff -u -r1.47 libman.h --- libman.h 20 Mar 2011 16:02:05 -0000 1.47 +++ libman.h 21 Mar 2011 17:23:13 -0000 @@ -17,8 +17,6 @@ #ifndef LIBMAN_H #define LIBMAN_H -#include "man.h" - enum man_next { MAN_NEXT_SIBLING = 0, MAN_NEXT_CHILD Index: libmandoc.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/libmandoc.h,v retrieving revision 1.13 diff -u -r1.13 libmandoc.h --- libmandoc.h 20 Mar 2011 16:02:05 -0000 1.13 +++ libmandoc.h 21 Mar 2011 17:23:13 -0000 @@ -17,18 +17,58 @@ #ifndef LIBMANDOC_H #define LIBMANDOC_H +enum rofferr { + ROFF_CONT, /* continue processing line */ + ROFF_RERUN, /* re-run roff interpreter with offset */ + ROFF_APPEND, /* re-run main parser, appending next line */ + ROFF_REPARSE, /* re-run main parser on the result */ + ROFF_SO, /* include another file */ + ROFF_IGN, /* ignore current line */ + ROFF_TBL, /* a table row was successfully parsed */ + ROFF_EQN, /* an equation was successfully parsed */ + ROFF_ERR /* badness: puke and stop */ +}; + +struct roff; + __BEGIN_DECLS -void mandoc_msg(enum mandocerr, struct mparse *, - int, int, const char *); -void mandoc_vmsg(enum mandocerr, struct mparse *, - int, int, const char *, ...); -int mandoc_special(char *); -char *mandoc_strdup(const char *); -char *mandoc_getarg(struct mparse *, char **, int, int *); -char *mandoc_normdate(struct mparse *, char *, int, int); -int mandoc_eos(const char *, size_t, int); -int mandoc_hyph(const char *, const char *); +void mandoc_msg(enum mandocerr, struct mparse *, + int, int, const char *); +void mandoc_vmsg(enum mandocerr, struct mparse *, + int, int, const char *, ...); +int 
mandoc_special(char *); +char *mandoc_strdup(const char *); +char *mandoc_getarg(struct mparse *, char **, int, int *); +char *mandoc_normdate(struct mparse *, char *, int, int); +int mandoc_eos(const char *, size_t, int); +int mandoc_hyph(const char *, const char *); + +void mdoc_free(struct mdoc *); +struct mdoc *mdoc_alloc(struct regset *, struct mparse *); +void mdoc_reset(struct mdoc *); +int mdoc_parseln(struct mdoc *, int, char *, int); +int mdoc_endparse(struct mdoc *); +int mdoc_addspan(struct mdoc *, const struct tbl_span *); +int mdoc_addeqn(struct mdoc *, const struct eqn *); + +void man_free(struct man *); +struct man *man_alloc(struct regset *, struct mparse *); +void man_reset(struct man *); +int man_parseln(struct man *, int, char *, int); +int man_endparse(struct man *); +int man_addspan(struct man *, const struct tbl_span *); +int man_addeqn(struct man *, const struct eqn *); + +void roff_free(struct roff *); +struct roff *roff_alloc(struct regset *, struct mparse *); +void roff_reset(struct roff *); +enum rofferr roff_parseln(struct roff *, int, + char **, size_t *, int, int *); +void roff_endparse(struct roff *); + +const struct tbl_span *roff_span(const struct roff *); +const struct eqn *roff_eqn(const struct roff *); __END_DECLS Index: libmdoc.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/libmdoc.h,v retrieving revision 1.69 diff -u -r1.69 libmdoc.h --- libmdoc.h 20 Mar 2011 16:02:05 -0000 1.69 +++ libmdoc.h 21 Mar 2011 17:23:13 -0000 @@ -17,8 +17,6 @@ #ifndef LIBMDOC_H #define LIBMDOC_H -#include "mdoc.h" - enum mdoc_next { MDOC_NEXT_SIBLING = 0, MDOC_NEXT_CHILD Index: main.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/main.c,v retrieving revision 1.157 diff -u -r1.157 main.c --- main.c 21 Mar 2011 12:04:26 -0000 1.157 +++ main.c 21 Mar 2011 17:23:13 -0000 @@ -28,8 +28,6 @@ #include "mandoc.h" #include 
"main.h" -#include "mdoc.h" -#include "man.h" #if !defined(__GNUC__) || (__GNUC__ < 2) # if !defined(lint) Index: man.3 =================================================================== RCS file: man.3 diff -N man.3 --- man.3 9 Feb 2011 09:18:15 -0000 1.30 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,283 +0,0 @@ -.\" $Id: man.3,v 1.30 2011/02/09 09:18:15 kristaps Exp $ -.\" -.\" Copyright (c) 2009-2010 Kristaps Dzonsons <kristaps@bsd.lv> -.\" -.\" Permission to use, copy, modify, and distribute this software for any -.\" purpose with or without fee is hereby granted, provided that the above -.\" copyright notice and this permission notice appear in all copies. -.\" -.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES -.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF -.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR -.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES -.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN -.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF -.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
-.\" -.Dd $Mdocdate: February 9 2011 $ -.Dt MAN 3 -.Os -.Sh NAME -.Nm man , -.Nm man_addeqn , -.Nm man_addspan , -.Nm man_alloc , -.Nm man_endparse , -.Nm man_free , -.Nm man_meta , -.Nm man_node , -.Nm man_parseln , -.Nm man_reset -.Nd man macro compiler library -.Sh SYNOPSIS -.In mandoc.h -.In man.h -.Vt extern const char * const * man_macronames; -.Ft int -.Fo man_addeqn -.Fa "struct man *man" -.Fa "const struct eqn *eqn" -.Fc -.Ft int -.Fo man_addspan -.Fa "struct man *man" -.Fa "const struct tbl_span *span" -.Fc -.Ft "struct man *" -.Fo man_alloc -.Fa "struct regset *regs" -.Fa "void *data" -.Fa "mandocmsg msgs" -.Fc -.Ft int -.Fn man_endparse "struct man *man" -.Ft void -.Fn man_free "struct man *man" -.Ft "const struct man_meta *" -.Fn man_meta "const struct man *man" -.Ft "const struct man_node *" -.Fn man_node "const struct man *man" -.Ft int -.Fo man_parseln -.Fa "struct man *man" -.Fa "int line" -.Fa "char *buf" -.Fc -.Ft void -.Fn man_reset "struct man *man" -.Sh DESCRIPTION -The -.Nm -library parses lines of -.Xr man 7 -input into an abstract syntax tree (AST). -.Pp -In general, applications initiate a parsing sequence with -.Fn man_alloc , -parse each line in a document with -.Fn man_parseln , -close the parsing session with -.Fn man_endparse , -operate over the syntax tree returned by -.Fn man_node -and -.Fn man_meta , -then free all allocated memory with -.Fn man_free . -The -.Fn man_reset -function may be used in order to reset the parser for another input -sequence. -.Pp -Beyond the full set of macros defined in -.Xr man 7 , -the -.Nm -library also accepts the following macro: -.Pp -.Bl -tag -width Ds -compact -.It PD -Has no effect. -Handled as a current-scope line macro. -.El -.Ss Types -.Bl -ohang -.It Vt struct man -An opaque type. -Its values are only used privately within the library. -.It Vt struct man_node -A parsed node. -See -.Sx Abstract Syntax Tree -for details. 
-.El -.Ss Functions -If -.Fn man_addeqn , -.Fn man_addspan , -.Fn man_parseln , -or -.Fn man_endparse -return 0, calls to any function but -.Fn man_reset -or -.Fn man_free -will raise an assertion. -.Bl -ohang -.It Fn man_addeqn -Add an equation to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn man_addspan -Add a table span to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn man_alloc -Allocates a parsing structure. -The -.Fa data -pointer is passed to -.Fa msgs . -Always returns a valid pointer. -The pointer must be freed with -.Fn man_free . -.It Fn man_reset -Reset the parser for another parse routine. -After its use, -.Fn man_parseln -behaves as if invoked for the first time. -.It Fn man_free -Free all resources of a parser. -The pointer is no longer valid after invocation. -.It Fn man_parseln -Parse a nil-terminated line of input. -This line should not contain the trailing newline. -Returns 0 on failure, 1 on success. -The input buffer -.Fa buf -is modified by this function. -.It Fn man_endparse -Signals that the parse is complete. -Returns 0 on failure, 1 on success. -.It Fn man_node -Returns the first node of the parse. -.It Fn man_meta -Returns the document's parsed meta-data. -.El -.Ss Variables -The following variables are also defined: -.Bl -ohang -.It Va man_macronames -An array of string-ified token names. -.El -.Ss Abstract Syntax Tree -The -.Nm -functions produce an abstract syntax tree (AST) describing input in a -regular form. -It may be reviewed at any time with -.Fn man_nodes ; -however, if called before -.Fn man_endparse , -or after -.Fn man_endparse -or -.Fn man_parseln -fail, it may be incomplete. -.Pp -This AST is governed by the ontological rules dictated in -.Xr man 7 -and derives its terminology accordingly. -.Pp -The AST is composed of -.Vt struct man_node -nodes with element, root and text types as declared by the -.Va type -field. 
-Each node also provides its parse point (the -.Va line , -.Va sec , -and -.Va pos -fields), its position in the tree (the -.Va parent , -.Va child , -.Va next -and -.Va prev -fields) and some type-specific data. -.Pp -The tree itself is arranged according to the following normal form, -where capitalised non-terminals represent nodes. -.Pp -.Bl -tag -width "ELEMENTXX" -compact -.It ROOT -\(<- mnode+ -.It mnode -\(<- ELEMENT | TEXT | BLOCK -.It BLOCK -\(<- HEAD BODY -.It HEAD -\(<- mnode* -.It BODY -\(<- mnode* -.It ELEMENT -\(<- ELEMENT | TEXT* -.It TEXT -\(<- [[:alpha:]]* -.El -.Pp -The only elements capable of nesting other elements are those with -next-lint scope as documented in -.Xr man 7 . -.Sh EXAMPLES -The following example reads lines from stdin and parses them, operating -on the finished parse tree with -.Fn parsed . -This example does not error-check nor free memory upon failure. -.Bd -literal -offset indent -struct regset regs; -struct man *man; -struct man_node *node; -char *buf; -size_t len; -int line; - -bzero(&regs, sizeof(struct regset)); -line = 1; -man = man_alloc(&regs, NULL, NULL); -buf = NULL; -alloc_len = 0; - -while ((len = getline(&buf, &alloc_len, stdin)) >= 0) { - if (len && buflen[len - 1] = '\en') - buf[len - 1] = '\e0'; - if ( ! man_parseln(man, line, buf)) - errx(1, "man_parseln"); - line++; -} - -free(buf); - -if ( ! man_endparse(man)) - errx(1, "man_endparse"); -if (NULL == (node = man_node(man))) - errx(1, "man_node"); - -parsed(man, node); -man_free(man); -.Ed -.Pp -To compile this, execute -.Pp -.Dl % cc main.c libman.a libmandoc.a -.Pp -where -.Pa main.c -is the example file. -.Sh SEE ALSO -.Xr mandoc 1 , -.Xr man 7 -.Sh AUTHORS -The -.Nm -library was written by -.An Kristaps Dzonsons Aq kristaps@bsd.lv .
Index: man.h =================================================================== RCS file: man.h diff -N man.h --- man.h 20 Mar 2011 16:02:05 -0000 1.55 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,133 +0,0 @@ -/* $Id: man.h,v 1.55 2011/03/20 16:02:05 kristaps Exp $ */ -/* - * Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv> - * - * Permission to use, copy, modify, and distribute this software for any - * purpose with or without fee is hereby granted, provided that the above - * copyright notice and this permission notice appear in all copies. - * - * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES - * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF - * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR - * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES - * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN - * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF - * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. - */ -#ifndef MAN_H -#define MAN_H - -/* - * What follows is a list of ALL possible macros. - */ -enum mant { - MAN_br = 0, - MAN_TH, - MAN_SH, - MAN_SS, - MAN_TP, - MAN_LP, - MAN_PP, - MAN_P, - MAN_IP, - MAN_HP, - MAN_SM, - MAN_SB, - MAN_BI, - MAN_IB, - MAN_BR, - MAN_RB, - MAN_R, - MAN_B, - MAN_I, - MAN_IR, - MAN_RI, - MAN_na, - MAN_sp, - MAN_nf, - MAN_fi, - MAN_RE, - MAN_RS, - MAN_DT, - MAN_UC, - MAN_PD, - MAN_AT, - MAN_in, - MAN_ft, - MAN_MAX -}; - -/* - * Type of a syntax node. - */ -enum man_type { - MAN_TEXT, - MAN_ELEM, - MAN_ROOT, - MAN_BLOCK, - MAN_HEAD, - MAN_BODY, - MAN_TBL, - MAN_EQN -}; - -/* - * Information from prologue. - */ -struct man_meta { - char *msec; /* `TH' section (1, 3p, etc.) 
*/ - char *date; /* `TH' normalised date */ - char *vol; /* `TH' volume */ - char *title; /* `TH' title (e.g., FOO) */ - char *source; /* `TH' source (e.g., GNU) */ -}; - -/* - * Single node in tree-linked AST. - */ -struct man_node { - struct man_node *parent; /* parent AST node */ - struct man_node *child; /* first child AST node */ - struct man_node *next; /* sibling AST node */ - struct man_node *prev; /* prior sibling AST node */ - int nchild; /* number children */ - int line; - int pos; - enum mant tok; /* tok or MAN__MAX if none */ - int flags; -#define MAN_VALID (1 << 0) /* has been validated */ -#define MAN_EOS (1 << 2) /* at sentence boundary */ -#define MAN_LINE (1 << 3) /* first macro/text on line */ - enum man_type type; /* AST node type */ - char *string; /* TEXT node argument */ - struct man_node *head; /* BLOCK node HEAD ptr */ - struct man_node *body; /* BLOCK node BODY ptr */ - const struct tbl_span *span; /* TBL */ - const struct eqn *eqn; /* EQN */ -}; - -/* - * Names of macros. Index is enum mant. Indexing into this returns - * the normalised name, e.g., man_macronames[MAN_SH] -> "SH". 
- */ -extern const char *const *man_macronames; - -__BEGIN_DECLS - -struct man; - -void man_free(struct man *); -struct man *man_alloc(struct regset *, struct mparse *); -void man_reset(struct man *); -int man_parseln(struct man *, int, char *, int); -int man_endparse(struct man *); -int man_addspan(struct man *, - const struct tbl_span *); -int man_addeqn(struct man *, const struct eqn *); - -const struct man_node *man_node(const struct man *); -const struct man_meta *man_meta(const struct man *); - -__END_DECLS - -#endif /*!MAN_H*/ Index: man_html.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/man_html.c,v retrieving revision 1.70 diff -u -r1.70 man_html.c --- man_html.c 7 Mar 2011 01:35:51 -0000 1.70 +++ man_html.c 21 Mar 2011 17:23:13 -0000 @@ -29,7 +29,6 @@ #include "mandoc.h" #include "out.h" #include "html.h" -#include "man.h" #include "main.h" /* TODO: preserve ident widths. */ Index: man_term.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/man_term.c,v retrieving revision 1.104 diff -u -r1.104 man_term.c --- man_term.c 7 Mar 2011 01:35:51 -0000 1.104 +++ man_term.c 21 Mar 2011 17:23:13 -0000 @@ -29,7 +29,6 @@ #include "mandoc.h" #include "out.h" -#include "man.h" #include "term.h" #include "chars.h" #include "main.h" Index: mandoc.3 =================================================================== RCS file: mandoc.3 diff -N mandoc.3 --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ mandoc.3 21 Mar 2011 17:23:13 -0000 @@ -0,0 +1,321 @@ +.\" $Id: mdoc.3,v 1.57 2011/02/09 09:18:15 kristaps Exp $ +.\" +.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv> +.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org> +.\" +.\" Permission to use, copy, modify, and distribute this software for any +.\" purpose with or without fee is hereby granted, provided that the above +.\" copyright notice and this permission notice 
appear in all copies. +.\" +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR +.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. +.\" +.Dd $Mdocdate: February 9 2011 $ +.Dt MANDOC 3 +.Os +.Sh NAME +.Nm mandoc , +.Nm man_meta , +.Nm man_node , +.Nm mdoc_meta , +.Nm mdoc_node , +.Nm mparse_alloc , +.Nm mparse_free , +.Nm mparse_readfd , +.Nm mparse_reset , +.Nm mparse_result +.Nd mandoc macro compiler library +.Sh SYNOPSIS +.In mandoc.h +.Ft "const struct man_meta *" +.Fo man_meta +.Fa "const struct man *man" +.Fc +.Ft "const struct man_node *" +.Fo man_node +.Fa "const struct man *man" +.Fc +.Ft "const struct mdoc_meta *" +.Fo mdoc_meta +.Fa "const struct mdoc *mdoc" +.Fc +.Ft "const struct mdoc_node *" +.Fo mdoc_node +.Fa "const struct mdoc *mdoc" +.Fc +.Ft "struct mparse *" +.Fo mparse_alloc +.Fa "enum mparset type" +.Fa "enum mandoclevel wlevel" +.Fa "mandocmsg msg" +.Fa "void *msgarg" +.Fc +.Ft void +.Fo mparse_free +.Fa "struct mparse *parse" +.Fc +.Ft "enum mandoclevel" +.Fo mparse_readfd +.Fa "struct mparse *parse" +.Fa "int fd" +.Fa "const char *fname" +.Fc +.Ft void +.Fo mparse_reset +.Fa "struct mparse *parse" +.Fc +.Ft void +.Fo mparse_result +.Fa "struct mparse *parse" +.Fa "struct mdoc **mdoc" +.Fa "struct man **man" +.Fc +.Vt extern const char * const * man_macronames; +.Vt extern const char * const * mdoc_argnames; +.Vt extern const char * const * mdoc_macronames; +.Sh DESCRIPTION +The +.Nm mandoc +library parses a +.Ux +manual into an abstract syntax tree (AST).
+.Ux +manuals are composed of +.Xr mdoc 7 +or +.Xr man 7 , +and may be mixed with +.Xr roff 7 , +.Xr tbl 7 , +and +.Xr eqn 7 +invocations. +.Pp +The following describes a general parse sequence: +.Bl -enum +.It +initiate a parsing sequence with +.Fn mparse_alloc ; +.It +parse files or file descriptors with +.Fn mparse_readfd ; +.It +retrieve a parsed syntax tree, if the parse was successful, with +.Fn mparse_result ; +.It +iterate over parse nodes with +.Fn mdoc_node +or +.Fn man_node ; +.It +free all allocated memory with +.Fn mparse_free , +or invoke +.Fn mparse_reset +and parse new files. +.El +.Sh IMPLEMENTATION NOTES +This section consists of structural documentation for +.Xr mdoc 7 +and +.Xr man 7 +syntax trees. +.Ss Man Abstract Syntax Tree +This AST is governed by the ontological rules dictated in +.Xr man 7 +and derives its terminology accordingly. +.Pp +The AST is composed of +.Vt struct man_node +nodes with element, root and text types as declared by the +.Va type +field. +Each node also provides its parse point (the +.Va line , +.Va sec , +and +.Va pos +fields), its position in the tree (the +.Va parent , +.Va child , +.Va next +and +.Va prev +fields) and some type-specific data. +.Pp +The tree itself is arranged according to the following normal form, +where capitalised non-terminals represent nodes. +.Pp +.Bl -tag -width "ELEMENTXX" -compact +.It ROOT +\(<- mnode+ +.It mnode +\(<- ELEMENT | TEXT | BLOCK +.It BLOCK +\(<- HEAD BODY +.It HEAD +\(<- mnode* +.It BODY +\(<- mnode* +.It ELEMENT +\(<- ELEMENT | TEXT* +.It TEXT +\(<- [[:alpha:]]* +.El +.Pp +The only elements capable of nesting other elements are those with +next-line scope as documented in +.Xr man 7 . +.Ss Mdoc Abstract Syntax Tree +This AST is governed by the ontological +rules dictated in +.Xr mdoc 7 +and derives its terminology accordingly. +.Qq In-line +elements described in +.Xr mdoc 7 +are described simply as +.Qq elements .
+.Pp +The AST is composed of +.Vt struct mdoc_node +nodes with block, head, body, element, root and text types as declared +by the +.Va type +field. +Each node also provides its parse point (the +.Va line , +.Va sec , +and +.Va pos +fields), its position in the tree (the +.Va parent , +.Va child , +.Va nchild , +.Va next +and +.Va prev +fields) and some type-specific data, in particular, for nodes generated +from macros, the generating macro in the +.Va tok +field. +.Pp +The tree itself is arranged according to the following normal form, +where capitalised non-terminals represent nodes. +.Pp +.Bl -tag -width "ELEMENTXX" -compact +.It ROOT +\(<- mnode+ +.It mnode +\(<- BLOCK | ELEMENT | TEXT +.It BLOCK +\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]] +.It ELEMENT +\(<- TEXT* +.It HEAD +\(<- mnode* +.It BODY +\(<- mnode* [ENDBODY mnode*] +.It TAIL +\(<- mnode* +.It TEXT +\(<- [[:printable:],0x1e]* +.El +.Pp +Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of +the BLOCK production: these refer to punctuation marks. +Furthermore, although a TEXT node will generally have a non-zero-length +string, in the specific case of +.Sq \&.Bd \-literal , +an empty line will produce a zero-length string. +Multiple body parts are only found in invocations of +.Sq \&Bl \-column , +where a new body introduces a new phrase. +.Pp +The +.Xr mdoc 7 +syntax tree accommodates broken block structures as well. +The ENDBODY node is available to end the formatting associated +with a given block before the physical end of that block. +It has a non-null +.Va end +field, is of the BODY +.Va type , +has the same +.Va tok +as the BLOCK it is ending, and has a +.Va pending +field pointing to that BLOCK's BODY node. +It is an indirect child of that BODY node +and has no children of its own.
+.Pp +An ENDBODY node is generated when a block ends while one of its child +blocks is still open, like in the following example: +.Bd -literal -offset indent +\&.Ao ao +\&.Bo bo ac +\&.Ac bc +\&.Bc end +.Ed +.Pp +This example results in the following block structure: +.Bd -literal -offset indent +BLOCK Ao + HEAD Ao + BODY Ao + TEXT ao + BLOCK Bo, pending -> Ao + HEAD Bo + BODY Bo + TEXT bo + TEXT ac + ENDBODY Ao, pending -> Ao + TEXT bc +TEXT end +.Ed +.Pp +Here, the formatting of the +.Sq \&Ao +block extends from TEXT ao to TEXT ac, +while the formatting of the +.Sq \&Bo +block extends from TEXT bo to TEXT bc. +It renders as follows in +.Fl T Ns Cm ascii +mode: +.Pp +.Dl <ao [bo ac> bc] end +.Pp +Support for badly-nested blocks is only provided for backward +compatibility with some older +.Xr mdoc 7 +implementations. +Using badly-nested blocks is +.Em strongly discouraged ; +for example, the +.Fl T Ns Cm html +and +.Fl T Ns Cm xhtml +front-ends to +.Xr mandoc 1 +are unable to render them in any meaningful way. +Furthermore, behaviour when encountering badly-nested blocks is not +consistent across troff implementations, especially when using multiple +levels of badly-nested blocks. +.Sh SEE ALSO +.Xr mandoc 1 , +.Xr eqn 7 , +.Xr man 7 , +.Xr mdoc 7 , +.Xr roff 7 , +.Xr tbl 7 +.Sh AUTHORS +The +.Nm +library was written by +.An Kristaps Dzonsons Aq kristaps@bsd.lv . Index: mandoc.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc.h,v retrieving revision 1.64 diff -u -r1.64 mandoc.h --- mandoc.h 20 Mar 2011 16:05:21 -0000 1.64 +++ mandoc.h 21 Mar 2011 17:23:13 -0000 @@ -27,7 +27,7 @@ * threshold). */ enum mandoclevel { - MANDOCLEVEL_OK = 0, + MANDOCLEVEL_OK = 0, /* everything's ok */ MANDOCLEVEL_RESERVED, MANDOCLEVEL_WARNING, /* warnings: syntax, whitespace, etc. 
*/ MANDOCLEVEL_ERROR, /* input has been thrown away */ @@ -277,7 +277,7 @@ }; /* - * Available registers (set in libroff, accessed elsewhere). + * Available roff registers. */ enum regs { REG_nS = 0, @@ -328,6 +328,436 @@ DELIM_CLOSE }; +enum mdoct { + MDOC_Ap = 0, + MDOC_Dd, + MDOC_Dt, + MDOC_Os, + MDOC_Sh, + MDOC_Ss, + MDOC_Pp, + MDOC_D1, + MDOC_Dl, + MDOC_Bd, + MDOC_Ed, + MDOC_Bl, + MDOC_El, + MDOC_It, + MDOC_Ad, + MDOC_An, + MDOC_Ar, + MDOC_Cd, + MDOC_Cm, + MDOC_Dv, + MDOC_Er, + MDOC_Ev, + MDOC_Ex, + MDOC_Fa, + MDOC_Fd, + MDOC_Fl, + MDOC_Fn, + MDOC_Ft, + MDOC_Ic, + MDOC_In, + MDOC_Li, + MDOC_Nd, + MDOC_Nm, + MDOC_Op, + MDOC_Ot, + MDOC_Pa, + MDOC_Rv, + MDOC_St, + MDOC_Va, + MDOC_Vt, + MDOC_Xr, + MDOC__A, + MDOC__B, + MDOC__D, + MDOC__I, + MDOC__J, + MDOC__N, + MDOC__O, + MDOC__P, + MDOC__R, + MDOC__T, + MDOC__V, + MDOC_Ac, + MDOC_Ao, + MDOC_Aq, + MDOC_At, + MDOC_Bc, + MDOC_Bf, + MDOC_Bo, + MDOC_Bq, + MDOC_Bsx, + MDOC_Bx, + MDOC_Db, + MDOC_Dc, + MDOC_Do, + MDOC_Dq, + MDOC_Ec, + MDOC_Ef, + MDOC_Em, + MDOC_Eo, + MDOC_Fx, + MDOC_Ms, + MDOC_No, + MDOC_Ns, + MDOC_Nx, + MDOC_Ox, + MDOC_Pc, + MDOC_Pf, + MDOC_Po, + MDOC_Pq, + MDOC_Qc, + MDOC_Ql, + MDOC_Qo, + MDOC_Qq, + MDOC_Re, + MDOC_Rs, + MDOC_Sc, + MDOC_So, + MDOC_Sq, + MDOC_Sm, + MDOC_Sx, + MDOC_Sy, + MDOC_Tn, + MDOC_Ux, + MDOC_Xc, + MDOC_Xo, + MDOC_Fo, + MDOC_Fc, + MDOC_Oo, + MDOC_Oc, + MDOC_Bk, + MDOC_Ek, + MDOC_Bt, + MDOC_Hf, + MDOC_Fr, + MDOC_Ud, + MDOC_Lb, + MDOC_Lp, + MDOC_Lk, + MDOC_Mt, + MDOC_Brq, + MDOC_Bro, + MDOC_Brc, + MDOC__C, + MDOC_Es, + MDOC_En, + MDOC_Dx, + MDOC__Q, + MDOC_br, + MDOC_sp, + MDOC__U, + MDOC_Ta, + MDOC_MAX +}; + +enum mdocargt { + MDOC_Split, + MDOC_Nosplit, + MDOC_Ragged, + MDOC_Unfilled, + MDOC_Literal, + MDOC_File, + MDOC_Offset, + MDOC_Bullet, + MDOC_Dash, + MDOC_Hyphen, + MDOC_Item, + MDOC_Enum, + MDOC_Tag, + MDOC_Diag, + MDOC_Hang, + MDOC_Ohang, + MDOC_Inset, + MDOC_Column, + MDOC_Width, + MDOC_Compact, + MDOC_Std, + MDOC_Filled, + MDOC_Words, + MDOC_Emphasis, + MDOC_Symbolic, + 
MDOC_Nested, + MDOC_Centred, + MDOC_ARG_MAX +}; + +enum mdoc_type { + MDOC_TEXT, /* text */ + MDOC_ELEM, /* in-line element */ + MDOC_HEAD, /* block head */ + MDOC_TAIL, /* block tail */ + MDOC_BODY, /* block body */ + MDOC_BLOCK, /* block enclosure */ + MDOC_TBL, /* table */ + MDOC_EQN, /* equation */ + MDOC_ROOT /* root of document */ +}; + +/* + * Section (named/unnamed) of mdoc(7) `Sh'. Note that these appear in + * the conventional order imposed by mdoc(7). + */ +enum mdoc_sec { + SEC_NONE = 0, /* No section, yet. */ + SEC_NAME, + SEC_LIBRARY, + SEC_SYNOPSIS, + SEC_DESCRIPTION, + SEC_IMPLEMENTATION, + SEC_RETURN_VALUES, + SEC_ENVIRONMENT, + SEC_FILES, + SEC_EXIT_STATUS, + SEC_EXAMPLES, + SEC_DIAGNOSTICS, + SEC_COMPATIBILITY, + SEC_ERRORS, + SEC_SEE_ALSO, + SEC_STANDARDS, + SEC_HISTORY, + SEC_AUTHORS, + SEC_CAVEATS, + SEC_BUGS, + SEC_SECURITY, + SEC_CUSTOM, /* User-defined. */ + SEC__MAX +}; + +struct mdoc_meta { + char *msec; /* `Dt' section (1, 3p, etc.) */ + char *vol; /* `Dt' volume (implied) */ + char *arch; /* `Dt' arch (i386, etc.) */ + char *date; /* `Dd' normalised date */ + char *title; /* `Dt' title (FOO, etc.) */ + char *os; /* `Os' system (OpenBSD, etc.) */ + char *name; /* leading `Nm' name */ +}; + +/* + * An argument to a mdoc(7) macro (multiple values = `-column xxx yyy'). + */ +struct mdoc_argv { + enum mdocargt arg; /* type of argument */ + int line; + int pos; + size_t sz; /* elements in "value" */ + char **value; /* argument strings */ +}; + +/* + * Reference-counted macro arguments. These are refcounted because + * blocks have multiple instances of the same arguments spread across + * the HEAD, BODY, TAIL, and BLOCK node types. + */ +struct mdoc_arg { + size_t argc; + struct mdoc_argv *argv; + unsigned int refcnt; +}; + +/* + * Indicates that a BODY's formatting has ended, but the scope is still + * open. Used for syntax-broken blocks. 
+ */ +enum mdoc_endbody { + ENDBODY_NOT = 0, + ENDBODY_SPACE, /* is broken: append a space */ + ENDBODY_NOSPACE /* is broken: don't append a space */ +}; + +enum mdoc_list { + LIST__NONE = 0, + LIST_bullet, /* -bullet argument */ + LIST_column, /* -column argument */ + LIST_dash, /* -dash argument */ + LIST_diag, /* -diag argument */ + LIST_enum, /* -enum argument */ + LIST_hang, /* -hang argument */ + LIST_hyphen, /* -hyphen argument */ + LIST_inset, /* -inset argument */ + LIST_item, /* -item argument */ + LIST_ohang, /* -ohang argument */ + LIST_tag, /* -tag argument */ + LIST_MAX +}; + +enum mdoc_disp { + DISP__NONE = 0, + DISP_centred, /* -centred argument */ + DISP_ragged, /* -ragged argument */ + DISP_unfilled, /* -unfilled argument */ + DISP_filled, /* -filled argument */ + DISP_literal /* -literal argument */ +}; + +enum mdoc_auth { + AUTH__NONE = 0, + AUTH_split, /* -split argument */ + AUTH_nosplit /* -nosplit argument */ +}; + +enum mdoc_font { + FONT__NONE = 0, + FONT_Em, /* "Em" or -emphasis */ + FONT_Li, /* "Li" or -literal */ + FONT_Sy /* "Sy" or -symbolic */ +}; + +struct mdoc_bd { + const char *offs; /* -offset */ + enum mdoc_disp type; /* -ragged, etc. */ + int comp; /* -compact */ +}; + +struct mdoc_bl { + const char *width; /* -width */ + const char *offs; /* -offset */ + enum mdoc_list type; /* -tag, -enum, etc. */ + int comp; /* -compact */ + size_t ncols; /* -column arg count */ + const char **cols; /* -column val ptr */ +}; + +struct mdoc_bf { + enum mdoc_font font; /* font */ +}; + +struct mdoc_an { + enum mdoc_auth auth; /* -split, etc. */ +}; + +struct mdoc_rs { + int quote_T; /* whether to quote %T */ +}; + +/* + * Consists of normalised node arguments. These should be used instead + * of iterating through the mdoc_arg pointers of a node: defaults are + * provided, etc. 
+ */ +union mdoc_data { + struct mdoc_an An; /* An arguments */ + struct mdoc_bd Bd; /* Bd arguments */ + struct mdoc_bf Bf; /* Bf arguments */ + struct mdoc_bl Bl; /* Bl arguments */ + struct mdoc_rs Rs; /* Rs arguments */ +}; + +/* + * Single node in tree-linked AST. + */ +struct mdoc_node { + struct mdoc_node *parent; /* parent AST node */ + struct mdoc_node *child; /* first child AST node */ + struct mdoc_node *last; /* last child AST node */ + struct mdoc_node *next; /* sibling AST node */ + struct mdoc_node *prev; /* prior sibling AST node */ + int nchild; /* number children */ + int line; /* parse line */ + int pos; /* parse column */ + enum mdoct tok; /* tok or MDOC__MAX if none */ + int flags; +#define MDOC_VALID (1 << 0) /* has been validated */ +#define MDOC_EOS (1 << 2) /* at sentence boundary */ +#define MDOC_LINE (1 << 3) /* first macro/text on line */ +#define MDOC_SYNPRETTY (1 << 4) /* SYNOPSIS-style formatting */ +#define MDOC_ENDED (1 << 5) /* rendering has been ended */ + enum mdoc_type type; /* AST node type */ + enum mdoc_sec sec; /* current named section */ + union mdoc_data *norm; /* normalised args */ + /* FIXME: these can be union'd to shave a few bytes. 
*/ + struct mdoc_arg *args; /* BLOCK/ELEM */ + struct mdoc_node *pending; /* BLOCK */ + struct mdoc_node *head; /* BLOCK */ + struct mdoc_node *body; /* BLOCK */ + struct mdoc_node *tail; /* BLOCK */ + char *string; /* TEXT */ + const struct tbl_span *span; /* TBL */ + const struct eqn *eqn; /* EQN */ + enum mdoc_endbody end; /* BODY */ +}; + +enum mant { + MAN_br = 0, + MAN_TH, + MAN_SH, + MAN_SS, + MAN_TP, + MAN_LP, + MAN_PP, + MAN_P, + MAN_IP, + MAN_HP, + MAN_SM, + MAN_SB, + MAN_BI, + MAN_IB, + MAN_BR, + MAN_RB, + MAN_R, + MAN_B, + MAN_I, + MAN_IR, + MAN_RI, + MAN_na, + MAN_sp, + MAN_nf, + MAN_fi, + MAN_RE, + MAN_RS, + MAN_DT, + MAN_UC, + MAN_PD, + MAN_AT, + MAN_in, + MAN_ft, + MAN_MAX +}; + +enum man_type { + MAN_TEXT, + MAN_ELEM, + MAN_ROOT, + MAN_BLOCK, + MAN_HEAD, + MAN_BODY, + MAN_TBL, + MAN_EQN +}; + +struct man_meta { + char *msec; /* `TH' section (1, 3p, etc.) */ + char *date; /* `TH' normalised date */ + char *vol; /* `TH' volume */ + char *title; /* `TH' title (e.g., FOO) */ + char *source; /* `TH' source (e.g., GNU) */ +}; + +struct man_node { + struct man_node *parent; /* parent AST node */ + struct man_node *child; /* first child AST node */ + struct man_node *next; /* sibling AST node */ + struct man_node *prev; /* prior sibling AST node */ + int nchild; /* number children */ + int line; + int pos; + enum mant tok; /* tok or MAN__MAX if none */ + int flags; +#define MAN_VALID (1 << 0) /* has been validated */ +#define MAN_EOS (1 << 2) /* at sentence boundary */ +#define MAN_LINE (1 << 3) /* first macro/text on line */ + enum man_type type; /* AST node type */ + char *string; /* TEXT node argument */ + struct man_node *head; /* BLOCK node HEAD ptr */ + struct man_node *body; /* BLOCK node BODY ptr */ + const struct tbl_span *span; /* TBL */ + const struct eqn *eqn; /* EQN */ +}; + /* * The type of parse sequence. This value is usually passed via the * mandoc(1) command line of -man and -mdoc. 
It's almost exclusively @@ -342,12 +772,22 @@ typedef void (*mandocmsg)(enum mandocerr, enum mandoclevel, const char *, int, int, const char *); +/* Names of macros. Index is enum mdoct. */ +extern const char * const *mdoc_macronames; + +/* Names of macro args. Index is enum mdocargt. */ +extern const char * const *mdoc_argnames; + +/* Names of macros. Index is enum mant. */ +extern const char * const *man_macronames; + + +__BEGIN_DECLS + struct mparse; struct mdoc; struct man; -__BEGIN_DECLS - void mparse_free(struct mparse *); void mparse_reset(struct mparse *); struct mparse *mparse_alloc(enum mparset, @@ -360,6 +800,11 @@ void *mandoc_realloc(void *, size_t); #define DELIMSZ 6 /* hint: max possible size of a delimiter */ enum mdelim mandoc_isdelim(const char *); + +const struct man_node *man_node(const struct man *); +const struct man_meta *man_meta(const struct man *); +const struct mdoc_node *mdoc_node(const struct mdoc *); +const struct mdoc_meta *mdoc_meta(const struct mdoc *); __END_DECLS Index: mdoc.3 =================================================================== RCS file: mdoc.3 diff -N mdoc.3 --- mdoc.3 9 Feb 2011 09:18:15 -0000 1.57 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,359 +0,0 @@ -.\" $Id: mdoc.3,v 1.57 2011/02/09 09:18:15 kristaps Exp $ -.\" -.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv> -.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org> -.\" -.\" Permission to use, copy, modify, and distribute this software for any -.\" purpose with or without fee is hereby granted, provided that the above -.\" copyright notice and this permission notice appear in all copies. -.\" -.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES -.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF -.\" MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR -.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES -.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN -.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF -.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. -.\" -.Dd $Mdocdate: February 9 2011 $ -.Dt MDOC 3 -.Os -.Sh NAME -.Nm mdoc , -.Nm mdoc_addeqn , -.Nm mdoc_addspan , -.Nm mdoc_alloc , -.Nm mdoc_endparse , -.Nm mdoc_free , -.Nm mdoc_meta , -.Nm mdoc_node , -.Nm mdoc_parseln , -.Nm mdoc_reset -.Nd mdoc macro compiler library -.Sh SYNOPSIS -.In mandoc.h -.In mdoc.h -.Vt extern const char * const * mdoc_macronames; -.Vt extern const char * const * mdoc_argnames; -.Ft int -.Fo mdoc_addeqn -.Fa "struct mdoc *mdoc" -.Fa "const struct eqn *eqn" -.Fc -.Ft int -.Fo mdoc_addspan -.Fa "struct mdoc *mdoc" -.Fa "const struct tbl_span *span" -.Fc -.Ft "struct mdoc *" -.Fo mdoc_alloc -.Fa "struct regset *regs" -.Fa "void *data" -.Fa "mandocmsg msgs" -.Fc -.Ft int -.Fn mdoc_endparse "struct mdoc *mdoc" -.Ft void -.Fn mdoc_free "struct mdoc *mdoc" -.Ft "const struct mdoc_meta *" -.Fn mdoc_meta "const struct mdoc *mdoc" -.Ft "const struct mdoc_node *" -.Fn mdoc_node "const struct mdoc *mdoc" -.Ft int -.Fo mdoc_parseln -.Fa "struct mdoc *mdoc" -.Fa "int line" -.Fa "char *buf" -.Fc -.Ft int -.Fn mdoc_reset "struct mdoc *mdoc" -.Sh DESCRIPTION -The -.Nm mdoc -library parses lines of -.Xr mdoc 7 -input -into an abstract syntax tree (AST). -.Pp -In general, applications initiate a parsing sequence with -.Fn mdoc_alloc , -parse each line in a document with -.Fn mdoc_parseln , -close the parsing session with -.Fn mdoc_endparse , -operate over the syntax tree returned by -.Fn mdoc_node -and -.Fn mdoc_meta , -then free all allocated memory with -.Fn mdoc_free . -The -.Fn mdoc_reset -function may be used in order to reset the parser for another input -sequence. 
-.Ss Types -.Bl -ohang -.It Vt struct mdoc -An opaque type. -Its values are only used privately within the library. -.It Vt struct mdoc_node -A parsed node. -See -.Sx Abstract Syntax Tree -for details. -.El -.Ss Functions -If -.Fn mdoc_addeqn , -.Fn mdoc_addspan , -.Fn mdoc_parseln , -or -.Fn mdoc_endparse -return 0, calls to any function but -.Fn mdoc_reset -or -.Fn mdoc_free -will raise an assertion. -.Bl -ohang -.It Fn mdoc_addeqn -Add an equation to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn mdoc_addspan -Add a table span to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn mdoc_alloc -Allocates a parsing structure. -The -.Fa data -pointer is passed to -.Fa msgs . -Always returns a valid pointer. -The pointer must be freed with -.Fn mdoc_free . -.It Fn mdoc_reset -Reset the parser for another parse routine. -After its use, -.Fn mdoc_parseln -behaves as if invoked for the first time. -If it returns 0, memory could not be allocated. -.It Fn mdoc_free -Free all resources of a parser. -The pointer is no longer valid after invocation. -.It Fn mdoc_parseln -Parse a nil-terminated line of input. -This line should not contain the trailing newline. -Returns 0 on failure, 1 on success. -The input buffer -.Fa buf -is modified by this function. -.It Fn mdoc_endparse -Signals that the parse is complete. -Returns 0 on failure, 1 on success. -.It Fn mdoc_node -Returns the first node of the parse. -.It Fn mdoc_meta -Returns the document's parsed meta-data. -.El -.Ss Variables -.Bl -ohang -.It Va mdoc_macronames -An array of string-ified token names. -.It Va mdoc_argnames -An array of string-ified token argument names. -.El -.Ss Abstract Syntax Tree -The -.Nm -functions produce an abstract syntax tree (AST) describing input in a -regular form. -It may be reviewed at any time with -.Fn mdoc_nodes ; -however, if called before -.Fn mdoc_endparse , -or after -.Fn mdoc_endparse -or -.Fn mdoc_parseln -fail, it may be incomplete. 
-.Pp -This AST is governed by the ontological -rules dictated in -.Xr mdoc 7 -and derives its terminology accordingly. -.Qq In-line -elements described in -.Xr mdoc 7 -are described simply as -.Qq elements . -.Pp -The AST is composed of -.Vt struct mdoc_node -nodes with block, head, body, element, root and text types as declared -by the -.Va type -field. -Each node also provides its parse point (the -.Va line , -.Va sec , -and -.Va pos -fields), its position in the tree (the -.Va parent , -.Va child , -.Va nchild , -.Va next -and -.Va prev -fields) and some type-specific data, in particular, for nodes generated -from macros, the generating macro in the -.Va tok -field. -.Pp -The tree itself is arranged according to the following normal form, -where capitalised non-terminals represent nodes. -.Pp -.Bl -tag -width "ELEMENTXX" -compact -.It ROOT -\(<- mnode+ -.It mnode -\(<- BLOCK | ELEMENT | TEXT -.It BLOCK -\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]] -.It ELEMENT -\(<- TEXT* -.It HEAD -\(<- mnode* -.It BODY -\(<- mnode* [ENDBODY mnode*] -.It TAIL -\(<- mnode* -.It TEXT -\(<- [[:printable:],0x1e]* -.El -.Pp -Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of -the BLOCK production: these refer to punctuation marks. -Furthermore, although a TEXT node will generally have a non-zero-length -string, in the specific case of -.Sq \&.Bd \-literal , -an empty line will produce a zero-length string. -Multiple body parts are only found in invocations of -.Sq \&Bl \-column , -where a new body introduces a new phrase. -.Ss Badly-nested Blocks -The ENDBODY node is available to end the formatting associated -with a given block before the physical end of that block. -It has a non-null -.Va end -field, is of the BODY -.Va type , -has the same -.Va tok -as the BLOCK it is ending, and has a -.Va pending -field pointing to that BLOCK's BODY node. -It is an indirect child of that BODY node -and has no children of its own. 
-.Pp -An ENDBODY node is generated when a block ends while one of its child -blocks is still open, like in the following example: -.Bd -literal -offset indent -\&.Ao ao -\&.Bo bo ac -\&.Ac bc -\&.Bc end -.Ed -.Pp -This example results in the following block structure: -.Bd -literal -offset indent -BLOCK Ao - HEAD Ao - BODY Ao - TEXT ao - BLOCK Bo, pending -> Ao - HEAD Bo - BODY Bo - TEXT bo - TEXT ac - ENDBODY Ao, pending -> Ao - TEXT bc -TEXT end -.Ed -.Pp -Here, the formatting of the -.Sq \&Ao -block extends from TEXT ao to TEXT ac, -while the formatting of the -.Sq \&Bo -block extends from TEXT bo to TEXT bc. -It renders as follows in -.Fl T Ns Cm ascii -mode: -.Pp -.Dl <ao [bo ac> bc] end -.Pp -Support for badly-nested blocks is only provided for backward -compatibility with some older -.Xr mdoc 7 -implementations. -Using badly-nested blocks is -.Em strongly discouraged : -the -.Fl T Ns Cm html -and -.Fl T Ns Cm xhtml -front-ends are unable to render them in any meaningful way. -Furthermore, behaviour when encountering badly-nested blocks is not -consistent across troff implementations, especially when using multiple -levels of badly-nested blocks. -.Sh EXAMPLES -The following example reads lines from stdin and parses them, operating -on the finished parse tree with -.Fn parsed . -This example does not error-check nor free memory upon failure. -.Bd -literal -offset indent -struct regset regs; -struct mdoc *mdoc; -const struct mdoc_node *node; -char *buf; -size_t len; -int line; - -bzero(&regs, sizeof(struct regset)); -line = 1; -mdoc = mdoc_alloc(&regs, NULL, NULL); -buf = NULL; -alloc_len = 0; - -while ((len = getline(&buf, &alloc_len, stdin)) >= 0) { - if (len && buflen[len - 1] = '\en') - buf[len - 1] = '\e0'; - if ( ! mdoc_parseln(mdoc, line, buf)) - errx(1, "mdoc_parseln"); - line++; -} - -if ( !
mdoc_endparse(mdoc)) - errx(1, "mdoc_endparse"); -if (NULL == (node = mdoc_node(mdoc))) - errx(1, "mdoc_node"); - -parsed(mdoc, node); -mdoc_free(mdoc); -.Ed -.Pp -To compile this, execute -.Pp -.Dl % cc main.c libmdoc.a libmandoc.a -.Pp -where -.Pa main.c -is the example file. -.Sh SEE ALSO -.Xr mandoc 1 , -.Xr mdoc 7 -.Sh AUTHORS -The -.Nm -library was written by -.An Kristaps Dzonsons Aq kristaps@bsd.lv . Index: mdoc.h =================================================================== RCS file: mdoc.h diff -N mdoc.h --- mdoc.h 20 Mar 2011 16:02:05 -0000 1.119 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,440 +0,0 @@ -/* $Id: mdoc.h,v 1.119 2011/03/20 16:02:05 kristaps Exp $ */ -/* - * Copyright (c) 2008, 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv> - * - * Permission to use, copy, modify, and distribute this software for any - * purpose with or without fee is hereby granted, provided that the above - * copyright notice and this permission notice appear in all copies. - * - * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES - * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF - * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR - * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES - * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN - * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF - * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. - */ -#ifndef MDOC_H -#define MDOC_H - -/* - * What follows is a list of ALL possible macros. 
- */ -enum mdoct { - MDOC_Ap = 0, - MDOC_Dd, - MDOC_Dt, - MDOC_Os, - MDOC_Sh, - MDOC_Ss, - MDOC_Pp, - MDOC_D1, - MDOC_Dl, - MDOC_Bd, - MDOC_Ed, - MDOC_Bl, - MDOC_El, - MDOC_It, - MDOC_Ad, - MDOC_An, - MDOC_Ar, - MDOC_Cd, - MDOC_Cm, - MDOC_Dv, - MDOC_Er, - MDOC_Ev, - MDOC_Ex, - MDOC_Fa, - MDOC_Fd, - MDOC_Fl, - MDOC_Fn, - MDOC_Ft, - MDOC_Ic, - MDOC_In, - MDOC_Li, - MDOC_Nd, - MDOC_Nm, - MDOC_Op, - MDOC_Ot, - MDOC_Pa, - MDOC_Rv, - MDOC_St, - MDOC_Va, - MDOC_Vt, - MDOC_Xr, - MDOC__A, - MDOC__B, - MDOC__D, - MDOC__I, - MDOC__J, - MDOC__N, - MDOC__O, - MDOC__P, - MDOC__R, - MDOC__T, - MDOC__V, - MDOC_Ac, - MDOC_Ao, - MDOC_Aq, - MDOC_At, - MDOC_Bc, - MDOC_Bf, - MDOC_Bo, - MDOC_Bq, - MDOC_Bsx, - MDOC_Bx, - MDOC_Db, - MDOC_Dc, - MDOC_Do, - MDOC_Dq, - MDOC_Ec, - MDOC_Ef, - MDOC_Em, - MDOC_Eo, - MDOC_Fx, - MDOC_Ms, - MDOC_No, - MDOC_Ns, - MDOC_Nx, - MDOC_Ox, - MDOC_Pc, - MDOC_Pf, - MDOC_Po, - MDOC_Pq, - MDOC_Qc, - MDOC_Ql, - MDOC_Qo, - MDOC_Qq, - MDOC_Re, - MDOC_Rs, - MDOC_Sc, - MDOC_So, - MDOC_Sq, - MDOC_Sm, - MDOC_Sx, - MDOC_Sy, - MDOC_Tn, - MDOC_Ux, - MDOC_Xc, - MDOC_Xo, - MDOC_Fo, - MDOC_Fc, - MDOC_Oo, - MDOC_Oc, - MDOC_Bk, - MDOC_Ek, - MDOC_Bt, - MDOC_Hf, - MDOC_Fr, - MDOC_Ud, - MDOC_Lb, - MDOC_Lp, - MDOC_Lk, - MDOC_Mt, - MDOC_Brq, - MDOC_Bro, - MDOC_Brc, - MDOC__C, - MDOC_Es, - MDOC_En, - MDOC_Dx, - MDOC__Q, - MDOC_br, - MDOC_sp, - MDOC__U, - MDOC_Ta, - MDOC_MAX -}; - -/* - * What follows is a list of ALL possible macro arguments. - */ -enum mdocargt { - MDOC_Split, - MDOC_Nosplit, - MDOC_Ragged, - MDOC_Unfilled, - MDOC_Literal, - MDOC_File, - MDOC_Offset, - MDOC_Bullet, - MDOC_Dash, - MDOC_Hyphen, - MDOC_Item, - MDOC_Enum, - MDOC_Tag, - MDOC_Diag, - MDOC_Hang, - MDOC_Ohang, - MDOC_Inset, - MDOC_Column, - MDOC_Width, - MDOC_Compact, - MDOC_Std, - MDOC_Filled, - MDOC_Words, - MDOC_Emphasis, - MDOC_Symbolic, - MDOC_Nested, - MDOC_Centred, - MDOC_ARG_MAX -}; - -/* - * Type of a syntax node. 
- */ -enum mdoc_type { - MDOC_TEXT, - MDOC_ELEM, - MDOC_HEAD, - MDOC_TAIL, - MDOC_BODY, - MDOC_BLOCK, - MDOC_TBL, - MDOC_EQN, - MDOC_ROOT -}; - -/* - * Section (named/unnamed) of `Sh'. Note that these appear in the - * conventional order imposed by mdoc.7. - */ -enum mdoc_sec { - SEC_NONE = 0, /* No section, yet. */ - SEC_NAME, - SEC_LIBRARY, - SEC_SYNOPSIS, - SEC_DESCRIPTION, - SEC_IMPLEMENTATION, - SEC_RETURN_VALUES, - SEC_ENVIRONMENT, - SEC_FILES, - SEC_EXIT_STATUS, - SEC_EXAMPLES, - SEC_DIAGNOSTICS, - SEC_COMPATIBILITY, - SEC_ERRORS, - SEC_SEE_ALSO, - SEC_STANDARDS, - SEC_HISTORY, - SEC_AUTHORS, - SEC_CAVEATS, - SEC_BUGS, - SEC_SECURITY, - SEC_CUSTOM, /* User-defined. */ - SEC__MAX -}; - -/* - * Information from prologue. - */ -struct mdoc_meta { - char *msec; /* `Dt' section (1, 3p, etc.) */ - char *vol; /* `Dt' volume (implied) */ - char *arch; /* `Dt' arch (i386, etc.) */ - char *date; /* `Dd' normalised date */ - char *title; /* `Dt' title (FOO, etc.) */ - char *os; /* `Os' system (OpenBSD, etc.) */ - char *name; /* leading `Nm' name */ -}; - -/* - * An argument to a macro (multiple values = `-column xxx yyy'). - */ -struct mdoc_argv { - enum mdocargt arg; /* type of argument */ - int line; - int pos; - size_t sz; /* elements in "value" */ - char **value; /* argument strings */ -}; - -/* - * Reference-counted macro arguments. These are refcounted because - * blocks have multiple instances of the same arguments spread across - * the HEAD, BODY, TAIL, and BLOCK node types. - */ -struct mdoc_arg { - size_t argc; - struct mdoc_argv *argv; - unsigned int refcnt; -}; - -/* - * Indicates that a BODY's formatting has ended, but the scope is still - * open. Used for syntax-broken blocks. - */ -enum mdoc_endbody { - ENDBODY_NOT = 0, - ENDBODY_SPACE, /* is broken: append a space */ - ENDBODY_NOSPACE /* is broken: don't append a space */ -}; - -/* - * Normalised `Bl' list type. 
- */ -enum mdoc_list { - LIST__NONE = 0, - LIST_bullet, - LIST_column, - LIST_dash, - LIST_diag, - LIST_enum, - LIST_hang, - LIST_hyphen, - LIST_inset, - LIST_item, - LIST_ohang, - LIST_tag, - LIST_MAX -}; - -/* - * Normalised `Bd' display type. - */ -enum mdoc_disp { - DISP__NONE = 0, - DISP_centred, - DISP_ragged, - DISP_unfilled, - DISP_filled, - DISP_literal -}; - -/* - * Normalised `An' splitting argument. - */ -enum mdoc_auth { - AUTH__NONE = 0, - AUTH_split, - AUTH_nosplit -}; - -/* - * Normalised `Bf' font type. - */ -enum mdoc_font { - FONT__NONE = 0, - FONT_Em, - FONT_Li, - FONT_Sy -}; - -/* - * Normalised arguments for `Bd'. - */ -struct mdoc_bd { - const char *offs; /* -offset */ - enum mdoc_disp type; /* -ragged, etc. */ - int comp; /* -compact */ -}; - -/* - * Normalised arguments for `Bl'. - */ -struct mdoc_bl { - const char *width; /* -width */ - const char *offs; /* -offset */ - enum mdoc_list type; /* -tag, -enum, etc. */ - int comp; /* -compact */ - size_t ncols; /* -column arg count */ - const char **cols; /* -column val ptr */ -}; - -/* - * Normalised arguments for `Bf'. - */ -struct mdoc_bf { - enum mdoc_font font; /* font */ -}; - -/* - * Normalised arguments for `An'. - */ -struct mdoc_an { - enum mdoc_auth auth; /* -split, etc. */ -}; - -struct mdoc_rs { - int quote_T; /* whether to quote %T */ -}; - -/* - * Consists of normalised node arguments. These should be used instead - * of iterating through the mdoc_arg pointers of a node: defaults are - * provided, etc. - */ -union mdoc_data { - struct mdoc_an An; - struct mdoc_bd Bd; - struct mdoc_bf Bf; - struct mdoc_bl Bl; - struct mdoc_rs Rs; -}; - -/* - * Single node in tree-linked AST. 
- */ -struct mdoc_node { - struct mdoc_node *parent; /* parent AST node */ - struct mdoc_node *child; /* first child AST node */ - struct mdoc_node *last; /* last child AST node */ - struct mdoc_node *next; /* sibling AST node */ - struct mdoc_node *prev; /* prior sibling AST node */ - int nchild; /* number children */ - int line; /* parse line */ - int pos; /* parse column */ - enum mdoct tok; /* tok or MDOC__MAX if none */ - int flags; -#define MDOC_VALID (1 << 0) /* has been validated */ -#define MDOC_EOS (1 << 2) /* at sentence boundary */ -#define MDOC_LINE (1 << 3) /* first macro/text on line */ -#define MDOC_SYNPRETTY (1 << 4) /* SYNOPSIS-style formatting */ -#define MDOC_ENDED (1 << 5) /* rendering has been ended */ - enum mdoc_type type; /* AST node type */ - enum mdoc_sec sec; /* current named section */ - union mdoc_data *norm; /* normalised args */ - /* FIXME: these can be union'd to shave a few bytes. */ - struct mdoc_arg *args; /* BLOCK/ELEM */ - struct mdoc_node *pending; /* BLOCK */ - struct mdoc_node *head; /* BLOCK */ - struct mdoc_node *body; /* BLOCK */ - struct mdoc_node *tail; /* BLOCK */ - char *string; /* TEXT */ - const struct tbl_span *span; /* TBL */ - const struct eqn *eqn; /* EQN */ - enum mdoc_endbody end; /* BODY */ -}; - -/* - * Names of macros. Index is enum mdoct. Indexing into this returns - * the normalised name, e.g., mdoc_macronames[MDOC_Sh] -> "Sh". - */ -extern const char *const *mdoc_macronames; - -/* - * Names of macro args. Index is enum mdocargt. Indexing into this - * returns the normalised name, e.g., mdoc_argnames[MDOC_File] -> - * "file". 
- */ -extern const char *const *mdoc_argnames; - -__BEGIN_DECLS - -struct mdoc; - -void mdoc_free(struct mdoc *); -struct mdoc *mdoc_alloc(struct regset *, struct mparse *); -void mdoc_reset(struct mdoc *); -int mdoc_parseln(struct mdoc *, int, char *, int); -const struct mdoc_node *mdoc_node(const struct mdoc *); -const struct mdoc_meta *mdoc_meta(const struct mdoc *); -int mdoc_endparse(struct mdoc *); -int mdoc_addspan(struct mdoc *, - const struct tbl_span *); -int mdoc_addeqn(struct mdoc *, - const struct eqn *); - -__END_DECLS - -#endif /*!MDOC_H*/ Index: mdoc_html.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_html.c,v retrieving revision 1.154 diff -u -r1.154 mdoc_html.c --- mdoc_html.c 7 Mar 2011 01:35:51 -0000 1.154 +++ mdoc_html.c 21 Mar 2011 17:23:14 -0000 @@ -30,7 +30,6 @@ #include "mandoc.h" #include "out.h" #include "html.h" -#include "mdoc.h" #include "main.h" #define INDENT 5 Index: mdoc_term.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_term.c,v retrieving revision 1.220 diff -u -r1.220 mdoc_term.c --- mdoc_term.c 7 Mar 2011 01:35:51 -0000 1.220 +++ mdoc_term.c 21 Mar 2011 17:23:14 -0000 @@ -31,7 +31,6 @@ #include "mandoc.h" #include "out.h" #include "term.h" -#include "mdoc.h" #include "chars.h" #include "main.h" Index: read.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/read.c,v retrieving revision 1.4 diff -u -r1.4 read.c --- read.c 20 Mar 2011 16:05:21 -0000 1.4 +++ read.c 21 Mar 2011 17:23:14 -0000 @@ -29,9 +29,6 @@ #include "mandoc.h" #include "libmandoc.h" -#include "mdoc.h" -#include "man.h" -#include "roff.h" #ifndef MAP_FILE #define MAP_FILE 0 Index: roff.3 =================================================================== RCS file: roff.3 diff -N roff.3 --- roff.3 1 Jan 2011 16:18:39 -0000 1.10 +++ /dev/null 1 Jan 1970 
00:00:00 -0000 @@ -1,177 +0,0 @@ -.\" $Id: roff.3,v 1.10 2011/01/01 16:18:39 kristaps Exp $ -.\" -.\" Copyright (c) 2010 Kristaps Dzonsons <kristaps@bsd.lv> -.\" -.\" Permission to use, copy, modify, and distribute this software for any -.\" purpose with or without fee is hereby granted, provided that the above -.\" copyright notice and this permission notice appear in all copies. -.\" -.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES -.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF -.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR -.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES -.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN -.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF -.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. -.\" -.Dd $Mdocdate: January 1 2011 $ -.Dt ROFF 3 -.Os -.Sh NAME -.Nm roff , -.Nm roff_alloc , -.Nm roff_endparse , -.Nm roff_free , -.Nm roff_parseln , -.Nm roff_reset , -.Nm roff_span -.Nd roff macro compiler library -.Sh SYNOPSIS -.In mandoc.h -.In roff.h -.Ft "struct roff *" -.Fo roff_alloc -.Fa "struct regset *regs" -.Fa "void *data" -.Fa "mandocmsg msgs" -.Fc -.Ft void -.Fn roff_endparse "struct roff *roff" -.Ft void -.Fn roff_free "struct roff *roff" -.Ft "enum rofferr" -.Fo roff_parseln -.Fa "struct roff *roff" -.Fa "int line" -.Fa "char **bufp" -.Fa "size_t *bufsz" -.Fa "int pos" -.Fa "int *offs" -.Fc -.Ft void -.Fn roff_reset "struct roff *roff" -.Ft "const struct tbl_span *" -.Fn roff_span "const struct roff *roff" -.Sh DESCRIPTION -The -.Nm -library processes lines of -.Xr roff 7 -input. -.Pp -In general, applications initiate a parsing sequence with -.Fn roff_alloc , -parse each line in a document with -.Fn roff_parseln , -close the parsing session with -.Fn roff_endparse , -and finally free all allocated memory with -.Fn roff_free . 
-The -.Fn roff_reset -function may be used in order to reset the parser for another input -sequence. -.Pp -The -.Fn roff_parseln -function should be invoked before passing a line into the -.Xr mdoc 3 -or -.Xr man 3 -libraries. -.Pp -See the -.Sx EXAMPLES -section for a full example. -.Sh REFERENCE -This section further defines the -.Sx Types -and -.Sx Functions -available to programmers. -.Ss Types -Functions (see -.Sx Functions ) -may use the following types: -.Bl -ohang -.It Vt "enum rofferr" -Instructions for further processing to the caller of -.Fn roff_parseln . -.It Vt struct roff -An opaque type defined in -.Pa roff.c . -Its values are only used privately within the library. -.It Vt mandocmsg -A function callback type defined in -.Pa mandoc.h . -.El -.Ss Functions -Function descriptions follow: -.Bl -ohang -.It Fn roff_alloc -Allocates a parsing structure. -The -.Fa data -pointer is passed to -.Fa msgs . -Returns NULL on failure. -If non-NULL, the pointer must be freed with -.Fn roff_free . -.It Fn roff_reset -Reset the parser for another parse routine. -After its use, -.Fn roff_parseln -behaves as if invoked for the first time. -.It Fn roff_free -Free all resources of a parser. -The pointer is no longer valid after invocation. -.It Fn roff_parseln -Parse a nil-terminated line of input. -The character array -.Fa bufp -may be modified or reallocated within this function. -In the latter case, -.Fa bufsz -will be modified accordingly. -The -.Fa offs -pointer will be modified if the line start during subsequent processing -of the line is not at the zeroth index. -This line should not contain the trailing newline. -Returns 0 on failure, 1 on success. -.It Fn roff_endparse -Signals that the parse is complete. -.It Fn roff_span -If -.Fn roff_parseln -returned -.Va ROFF_TBL , -return the last parsed table row. -Returns NULL otherwise. -.El -.Sh EXAMPLES -See -.Pa main.c -in the source distribution for an example of usage. 
-.Sh SEE ALSO -.Xr mandoc 1 , -.Xr man 3 , -.Xr mdoc 3 , -.Xr roff 7 -.Sh AUTHORS -The -.Nm -library was written by -.An Kristaps Dzonsons Aq kristaps@bsd.lv . -.Sh BUGS -The implementation of user-defined strings needs improvement: -.Bl -dash -.It -String values are taken literally and are not interpreted. -.It -Parsing of quoted strings is incomplete. -.It -The stings are stored internally using a singly linked list, -which is fine for small numbers of strings, -but ineffient when handling many strings. -.El Index: roff.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/roff.c,v retrieving revision 1.128 diff -u -r1.128 roff.c --- roff.c 20 Mar 2011 16:02:05 -0000 1.128 +++ roff.c 21 Mar 2011 17:23:14 -0000 @@ -28,7 +28,6 @@ #include <stdio.h> #include "mandoc.h" -#include "roff.h" #include "libroff.h" #include "libmandoc.h" Index: roff.h =================================================================== RCS file: roff.h diff -N roff.h --- roff.h 20 Mar 2011 16:02:05 -0000 1.25 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,47 +0,0 @@ -/* $Id: roff.h,v 1.25 2011/03/20 16:02:05 kristaps Exp $ */ -/* - * Copyright (c) 2010 Kristaps Dzonsons <kristaps@bsd.lv> - * - * Permission to use, copy, modify, and distribute this software for any - * purpose with or without fee is hereby granted, provided that the above - * copyright notice and this permission notice appear in all copies. - * - * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES - * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF - * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR - * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES - * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN - * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF - * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
- */ -#ifndef ROFF_H -#define ROFF_H - -enum rofferr { - ROFF_CONT, /* continue processing line */ - ROFF_RERUN, /* re-run roff interpreter with offset */ - ROFF_APPEND, /* re-run main parser, appending next line */ - ROFF_REPARSE, /* re-run main parser on the result */ - ROFF_SO, /* include another file */ - ROFF_IGN, /* ignore current line */ - ROFF_TBL, /* a table row was successfully parsed */ - ROFF_EQN, /* an equation was successfully parsed */ - ROFF_ERR /* badness: puke and stop */ -}; - -__BEGIN_DECLS - -struct roff; - -void roff_free(struct roff *); -struct roff *roff_alloc(struct regset *, struct mparse *); -void roff_reset(struct roff *); -enum rofferr roff_parseln(struct roff *, int, - char **, size_t *, int, int *); -void roff_endparse(struct roff *); -const struct tbl_span *roff_span(const struct roff *); -const struct eqn *roff_eqn(const struct roff *); - -__END_DECLS - -#endif /*!ROFF_H*/ Index: tbl.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/tbl.c,v retrieving revision 1.23 diff -u -r1.23 tbl.c --- tbl.c 20 Mar 2011 16:02:05 -0000 1.23 +++ tbl.c 21 Mar 2011 17:23:14 -0000 @@ -22,7 +22,6 @@ #include <time.h> #include "mandoc.h" -#include "roff.h" #include "libmandoc.h" #include "libroff.h" Index: tree.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/tree.c,v retrieving revision 1.36 diff -u -r1.36 tree.c --- tree.c 9 Feb 2011 09:18:15 -0000 1.36 +++ tree.c 21 Mar 2011 17:23:14 -0000 @@ -24,8 +24,6 @@ #include <time.h> #include "mandoc.h" -#include "mdoc.h" -#include "man.h" #include "main.h" static void print_mdoc(const struct mdoc_node *, int); [-- Attachment #3: mdocml.tar.gz --] [-- Type: application/gzip, Size: 192329 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
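A note on the EXAMPLES section of the mdoc.3 manual removed by the patch above: as written, its read loop would not compile. The newline test assigns with `=` instead of comparing, `buflen` and `alloc_len` are never declared, and getline's signed return value is stored in a `size_t`, so the `>= 0` test can never fail. A minimal sketch of the same loop with those points addressed might look as follows. It is not part of the patch; `line_fn`, `chomp`, `read_lines` and `sum_lengths` are hypothetical names, and the callback only stands in for mdoc_parseln(), whose real signature is not reproduced here.

```c
#define _POSIX_C_SOURCE 200809L	/* for getline(3) */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical per-line callback standing in for a parser entry
 * point such as mdoc_parseln().  Returns 0 on failure, 1 on success.
 */
typedef int (*line_fn)(int line, char *buf, void *arg);

/* Strip one trailing newline, if present; return the new length. */
size_t
chomp(char *buf, size_t len)
{
	assert(buf != NULL);
	if (len > 0 && buf[len - 1] == '\n')	/* compare, do not assign */
		buf[--len] = '\0';
	return len;
}

/*
 * Hand each line of f to fn, without its trailing newline.
 * Returns the number of lines processed, or -1 if fn failed.
 */
int
read_lines(FILE *f, line_fn fn, void *arg)
{
	char	*buf = NULL;
	size_t	 alloc_len = 0;
	ssize_t	 len;		/* signed: getline returns -1 at EOF */
	int	 line = 0, ok = 1;

	while (ok && (len = getline(&buf, &alloc_len, f)) >= 0) {
		chomp(buf, (size_t)len);
		ok = fn(++line, buf, arg);
	}
	free(buf);		/* getline allocates; the caller frees */
	return ok ? line : -1;
}

/* Example callback: accumulate the total length of all lines seen. */
int
sum_lengths(int line, char *buf, void *arg)
{
	(void)line;
	*(size_t *)arg += strlen(buf);
	return 1;
}
```

A caller would pass stdin and a callback that forwards to the real parser; keeping the newline handling in `chomp` makes that one step easy to check in isolation.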
* Re: [PATCH] Massive restructuring into mandoc.h/libmandoc.a.
  2011-03-21 17:37 [PATCH] Massive restructuring into mandoc.h/libmandoc.a Kristaps Dzonsons
@ 2011-03-21 17:53 ` Kristaps Dzonsons
       [not found] ` <20110321215744.GA16603@iris.usta.de>
  1 sibling, 0 replies; 5+ messages in thread
From: Kristaps Dzonsons @ 2011-03-21 17:53 UTC (permalink / raw)
  To: tech

[-- Attachment #1: Type: text/plain, Size: 398 bytes --]

> Things that remain to be done in the immediate future:
>
> - get rid of mdoc_isdelim() (using cues somehow?)
> - merge chars.h into out.h
> - do something about out.h/main.h

This whacks chars.h, putting it into out.h.  Enclosed in the same way.

I noticed that chars.o was compiled into libmandoc by accident; this has been fixed, too.  But I add another TODO:

- re-add the "lint" target

K.

[-- Attachment #2: mdocml.tar.gz --]
[-- Type: application/gzip, Size: 192184 bytes --]

[-- Attachment #3: patch.txt --]
[-- Type: text/plain, Size: 87430 bytes --]

? bar.1 ? baz.1 ? config.h ? config.log ? foo.1 ? index.c ? mandoc ? patch.txt ?
regress/output Index: ChangeLog.xsl =================================================================== RCS file: ChangeLog.xsl diff -N ChangeLog.xsl --- ChangeLog.xsl 21 Sep 2009 15:12:03 -0000 1.4 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,43 +0,0 @@ -<?xml version='1.0' encoding="utf-8"?> -<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" > -<xsl:output encoding="utf-8" method="html" indent="yes" doctype-public="-//W3C//DTD HTML 4.01 Transitional//EN" /> -<xsl:template match="/changelog"> -<html> - <head> - <title>mdocml - CVS-ChangeLog</title> - <link rel="stylesheet" href="index.css" type="text/css" media="all" /> - </head> - <body> - <xsl:for-each select="entry"> - <div class="clhead"> - <xsl:text>Files modified by </xsl:text> - <xsl:value-of select="concat(author, ': ', date, ' (', time, ')')" /> - </div> - <div class="clbody"> - <strong> - <xsl:text>Note: </xsl:text> - </strong> - <xsl:value-of select="msg"/> - <ul class="clbody"> - <xsl:for-each select="file"> - <li> - <xsl:value-of select="name"/> - <span class="rev"> - <xsl:text> — Rev: </xsl:text> - <xsl:value-of select="revision"/> - <xsl:text>, Status: </xsl:text> - <xsl:value-of select="cvsstate"/> - <xsl:if test="tag"> - <xsl:text>, Tag: </xsl:text> - <xsl:value-of select="tag" /> - </xsl:if> - </span> - </li> - </xsl:for-each> - </ul> - </div> - </xsl:for-each> - </body> -</html> -</xsl:template> -</xsl:stylesheet> Index: Makefile =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/Makefile,v retrieving revision 1.315 diff -u -r1.315 Makefile --- Makefile 20 Mar 2011 11:41:24 -0000 1.315 +++ Makefile 21 Mar 2011 17:53:01 -0000 @@ -1,344 +1,304 @@ -.SUFFIXES: .html .xml .sgml .1 .3 .7 .md5 .tar.gz -.SUFFIXES: .1.txt .3.txt .7.txt -.SUFFIXES: .1.xhtml .3.xhtml .7.xhtml -.SUFFIXES: .1.sgml .3.sgml .7.sgml -.SUFFIXES: .h .h.html -.SUFFIXES: .1.ps .3.ps .7.ps -.SUFFIXES: .1.pdf .3.pdf .7.pdf - -PREFIX = 
/usr/local -BINDIR = $(PREFIX)/bin -INCLUDEDIR = $(PREFIX)/include -LIBDIR = $(PREFIX)/lib -MANDIR = $(PREFIX)/man -EXAMPLEDIR = $(PREFIX)/share/examples/mandoc -INSTALL = install -INSTALL_PROGRAM = $(INSTALL) -m 0755 -INSTALL_DATA = $(INSTALL) -m 0444 -INSTALL_LIB = $(INSTALL) -m 0644 -INSTALL_MAN = $(INSTALL_DATA) - -VERSION = 1.10.10 -VDATE = 20 March 2011 - -VFLAGS = -DVERSION="\"$(VERSION)\"" -WFLAGS = -W -Wall -Wstrict-prototypes -Wno-unused-parameter -Wwrite-strings -CFLAGS += -g $(WFLAGS) $(VFLAGS) -DHAVE_CONFIG_H +.PHONY: clean install +.SUFFIXES: .sgml .html .md5 .h .h.html +.SUFFIXES: .1 .3 .7 +.SUFFIXES: .1.txt .3.txt .7.txt +.SUFFIXES: .1.pdf .3.pdf .7.pdf +.SUFFIXES: .1.ps .3.ps .7.ps +.SUFFIXES: .1.html .3.html .7.html +.SUFFIXES: .1.xhtml .3.xhtml .7.xhtml # Specify this if you want to hard-code the operating system to appear # in the lower-left hand corner of -mdoc manuals. -# CFLAGS += -DOSNAME="\"OpenBSD 4.5\"" +# CFLAGS += -DOSNAME="\"OpenBSD 4.5\"" -LINTFLAGS += $(VFLAGS) +VERSION = 1.10.10 +VDATE = 20 March 2011 +CFLAGS += -g -DHAVE_CONFIG_H -DVERSION="\"$(VERSION)\"" +CFLAGS += -W -Wall -Wstrict-prototypes -Wno-unused-parameter -Wwrite-strings +PREFIX = /usr/local +BINDIR = $(PREFIX)/bin +INCLUDEDIR = $(PREFIX)/include +LIBDIR = $(PREFIX)/lib +MANDIR = $(PREFIX)/man +EXAMPLEDIR = $(PREFIX)/share/examples/mandoc +INSTALL = install +INSTALL_PROGRAM = $(INSTALL) -m 0755 +INSTALL_DATA = $(INSTALL) -m 0444 +INSTALL_LIB = $(INSTALL) -m 0644 +INSTALL_MAN = $(INSTALL_DATA) + +all: mandoc + +SRCS = Makefile \ + arch.c \ + arch.in \ + att.c \ + att.in \ + chars.c \ + chars.in \ + compat.c \ + config.h.post \ + config.h.pre \ + eqn.7 \ + eqn.c \ + example.style.css \ + external.png \ + html.c \ + html.h \ + index.c \ + index.css \ + index.sgml \ + lib.c \ + lib.in \ + libman.h \ + libmandoc.h \ + libmdoc.h \ + libroff.h \ + main.c \ + main.h \ + man.7 \ + man.c \ + man_argv.c \ + man_hash.c \ + man_html.c \ + man_macro.c \ + man_term.c \ + 
man_validate.c \ + mandoc.1 \ + mandoc.3 \ + mandoc.c \ + mandoc.h \ + mandoc_char.7 \ + mdoc.7 \ + mdoc.c \ + mdoc_argv.c \ + mdoc_hash.c \ + mdoc_html.c \ + mdoc_macro.c \ + mdoc_term.c \ + mdoc_validate.c \ + msec.c \ + msec.in \ + out.c \ + out.h \ + read.c \ + roff.7 \ + roff.c \ + st.c \ + st.in \ + style.css \ + tbl.7 \ + tbl.c \ + tbl_data.c \ + tbl_html.c \ + tbl_layout.c \ + tbl_opts.c \ + tbl_term.c \ + term.c \ + term.h \ + term_ascii.c \ + term_ps.c \ + test-strlcat.c \ + test-strlcpy.c \ + tree.c \ + vol.c \ + vol.in + +LIBMAN_OBJS = man.o \ + man_argv.o \ + man_hash.o \ + man_macro.o \ + man_validate.o +LIBMDOC_OBJS = arch.o \ + att.o \ + lib.o \ + mdoc.o \ + mdoc_argv.o \ + mdoc_hash.o \ + mdoc_macro.o \ + mdoc_validate.o \ + msec.o \ + st.o \ + vol.o +LIBROFF_OBJS = eqn.o \ + roff.o \ + tbl.o \ + tbl_data.o \ + tbl_layout.o \ + tbl_opts.o +LIBMANDOC_OBJS = $(LIBMAN_OBJS) \ + $(LIBMDOC_OBJS) \ + $(LIBROFF_OBJS) \ + mandoc.o \ + read.o + +arch.o: arch.in +att.o: att.in +lib.o: lib.in +msec.o: msec.in +st.o: st.in +vol.o: vol.in + +$(LIBMAN_OBJS): libmdoc.h +$(LIBMDOC_OBJS): libmdoc.h +$(LIBROFF_OBJS): libroff.h +$(LIBMANDOC_OBJS): mandoc.h libmandoc.h config.h + +MANDOC_HTML_OBJS = html.o \ + man_html.o \ + mdoc_html.o \ + tbl_html.o +MANDOC_TERM_OBJS = man_term.o \ + mdoc_term.o \ + term.o \ + term_ascii.o \ + term_ps.o \ + tbl_term.o +MANDOC_OBJS = $(MANDOC_HTML_OBJS) \ + $(MANDOC_TERM_OBJS) \ + chars.o \ + main.o \ + out.o \ + tree.o + +chars.o: chars.in + +$(MANDOC_HTML_OBJS): html.h +$(MANDOC_TERM_OBJS): term.h +$(MANDOC_OBJS): main.h mandoc.h config.h out.h + +compat.o: config.h + +INDEX_MANS = mandoc.1.html \ + mandoc.1.xhtml \ + mandoc.1.ps \ + mandoc.1.pdf \ + mandoc.1.txt \ + mandoc.3.html \ + mandoc.3.xhtml \ + mandoc.3.ps \ + mandoc.3.pdf \ + mandoc.3.txt \ + eqn.7.html \ + eqn.7.xhtml \ + eqn.7.ps \ + eqn.7.pdf \ + eqn.7.txt \ + man.7.html \ + man.7.xhtml \ + man.7.ps \ + man.7.pdf \ + man.7.txt \ + mandoc_char.7.html \ + 
mandoc_char.7.xhtml \ + mandoc_char.7.ps \ + mandoc_char.7.pdf \ + mandoc_char.7.txt \ + mdoc.7.html \ + mdoc.7.xhtml \ + mdoc.7.ps \ + mdoc.7.pdf \ + mdoc.7.txt \ + roff.7.html \ + roff.7.xhtml \ + roff.7.ps \ + roff.7.pdf \ + roff.7.txt \ + tbl.7.html \ + tbl.7.xhtml \ + tbl.7.ps \ + tbl.7.pdf \ + tbl.7.txt + +$(INDEX_MANS): mandoc + +INDEX_OBJS = $(INDEX_MANS) \ + mandoc.h.html \ + mdocml.tar.gz \ + mdocml.md5 -ROFFLNS = roff.ln tbl.ln tbl_opts.ln tbl_layout.ln tbl_data.ln eqn.ln - -ROFFSRCS = roff.c tbl.c tbl_opts.c tbl_layout.c tbl_data.c eqn.c - -ROFFOBJS = roff.o tbl.o tbl_opts.o tbl_layout.o tbl_data.o eqn.o - -MANDOCLNS = mandoc.ln - -MANDOCSRCS = mandoc.c - -MANDOCOBJS = mandoc.o - -MDOCLNS = mdoc_macro.ln mdoc.ln mdoc_hash.ln \ - mdoc_argv.ln mdoc_validate.ln \ - lib.ln att.ln arch.ln vol.ln msec.ln st.ln - -MDOCOBJS = mdoc_macro.o mdoc.o mdoc_hash.o \ - mdoc_argv.o mdoc_validate.o lib.o att.o \ - arch.o vol.o msec.o st.o - -MDOCSRCS = mdoc_macro.c mdoc.c mdoc_hash.c \ - mdoc_argv.c mdoc_validate.c lib.c att.c \ - arch.c vol.c msec.c st.c - -MANLNS = man_macro.ln man.ln man_hash.ln man_validate.ln \ - man_argv.ln - -MANOBJS = man_macro.o man.o man_hash.o man_validate.o \ - man_argv.o -MANSRCS = man_macro.c man.c man_hash.c man_validate.c \ - man_argv.c - -MAINLNS = main.ln mdoc_term.ln chars.ln term.ln tree.ln \ - compat.ln man_term.ln html.ln mdoc_html.ln \ - man_html.ln out.ln term_ps.ln term_ascii.ln \ - tbl_term.ln tbl_html.ln read.ln - -MAINOBJS = main.o mdoc_term.o chars.o term.o tree.o compat.o \ - man_term.o html.o mdoc_html.o man_html.o out.o \ - term_ps.o term_ascii.o tbl_term.o tbl_html.o read.o - -MAINSRCS = main.c mdoc_term.c chars.c term.c tree.c compat.c \ - man_term.c html.c mdoc_html.c man_html.c out.c \ - term_ps.c term_ascii.c tbl_term.c tbl_html.c read.c - -LLNS = llib-llibmdoc.ln llib-llibman.ln llib-lmandoc.ln \ - llib-llibmandoc.ln llib-llibroff.ln - -LNS = $(MAINLNS) $(MDOCLNS) $(MANLNS) \ - $(MANDOCLNS) $(ROFFLNS) - -LIBS = 
libmdoc.a libman.a libmandoc.a libroff.a - -OBJS = $(MDOCOBJS) $(MAINOBJS) $(MANOBJS) \ - $(MANDOCOBJS) $(ROFFOBJS) - -SRCS = $(MDOCSRCS) $(MAINSRCS) $(MANSRCS) \ - $(MANDOCSRCS) $(ROFFSRCS) - -DATAS = arch.in att.in lib.in msec.in st.in \ - vol.in chars.in - -HEADS = mdoc.h libmdoc.h man.h libman.h term.h \ - libmandoc.h html.h chars.h out.h main.h roff.h \ - mandoc.h libroff.h - -GSGMLS = mandoc.1.sgml mdoc.3.sgml mdoc.7.sgml \ - mandoc_char.7.sgml man.7.sgml man.3.sgml roff.7.sgml \ - roff.3.sgml tbl.7.sgml eqn.7.sgml - -SGMLS = index.sgml - -XHTMLS = mandoc.1.xhtml mdoc.3.xhtml \ - man.3.xhtml mdoc.7.xhtml man.7.xhtml mandoc_char.7.xhtml \ - roff.7.xhtml roff.3.xhtml tbl.7.xhtml eqn.7.xhtml - -HTMLS = ChangeLog.html index.html man.h.html mdoc.h.html \ - mandoc.h.html roff.h.html mandoc.1.html mdoc.3.html \ - man.3.html mdoc.7.html man.7.html mandoc_char.7.html \ - roff.7.html roff.3.html tbl.7.html eqn.7.html - -PSS = mandoc.1.ps mdoc.3.ps man.3.ps mdoc.7.ps man.7.ps \ - mandoc_char.7.ps roff.7.ps roff.3.ps tbl.7.ps eqn.7.ps - -PDFS = mandoc.1.pdf mdoc.3.pdf man.3.pdf mdoc.7.pdf man.7.pdf \ - mandoc_char.7.pdf roff.7.pdf roff.3.pdf tbl.7.pdf eqn.7.pdf - -XSLS = ChangeLog.xsl - -TEXTS = mandoc.1.txt mdoc.3.txt man.3.txt mdoc.7.txt man.7.txt \ - mandoc_char.7.txt ChangeLog.txt \ - roff.7.txt roff.3.txt tbl.7.txt eqn.7.txt - -EXAMPLES = example.style.css - -XMLS = ChangeLog.xml - -STATICS = index.css style.css external.png - -MD5S = mdocml-$(VERSION).md5 - -TARGZS = mdocml-$(VERSION).tar.gz - -MANS = mandoc.1 mdoc.3 mdoc.7 mandoc_char.7 man.7 \ - man.3 roff.7 roff.3 tbl.7 eqn.7 - -BINS = mandoc - -TESTS = test-strlcat.c test-strlcpy.c - -CONFIGS = config.h.pre config.h.post - -DOCLEAN = $(BINS) $(LNS) $(LLNS) $(LIBS) $(OBJS) $(HTMLS) \ - $(TARGZS) tags $(MD5S) $(XMLS) $(TEXTS) $(GSGMLS) \ - config.h config.log $(PSS) $(PDFS) $(XHTMLS) - -DOINSTALL = $(SRCS) $(HEADS) Makefile $(MANS) $(SGMLS) $(STATICS) \ - $(DATAS) $(XSLS) $(EXAMPLES) $(TESTS) $(CONFIGS) - -all: 
$(BINS) - -lint: $(LLNS) +www: index.html clean: - rm -f $(DOCLEAN) - -dist: mdocml-$(VERSION).tar.gz - -www: all $(GSGMLS) $(HTMLS) $(XHTMLS) $(TEXTS) $(MD5S) $(TARGZS) $(PSS) $(PDFS) - -ps: $(PSS) - -pdf: $(PDFS) + rm -f libmandoc.a $(LIBMANDOC_OBJS) + rm -f mandoc $(MANDOC_OBJS) + rm -f config.h compat.o config.log + rm -f mdocml.tar.gz + rm -f index.html $(INDEX_OBJS) -installwww: www - $(INSTALL_DATA) $(HTMLS) $(XHTMLS) $(PSS) $(PDFS) $(TEXTS) $(STATICS) $(DESTDIR)$(PREFIX)/ - $(INSTALL_DATA) mdocml-$(VERSION).tar.gz $(DESTDIR)$(PREFIX)/snapshots/ - $(INSTALL_DATA) mdocml-$(VERSION).md5 $(DESTDIR)$(PREFIX)/snapshots/ - $(INSTALL_DATA) mdocml-$(VERSION).tar.gz $(DESTDIR)$(PREFIX)/snapshots/mdocml.tar.gz - $(INSTALL_DATA) mdocml-$(VERSION).md5 $(DESTDIR)$(PREFIX)/snapshots/mdocml.md5 - -install: +install: all mkdir -p $(DESTDIR)$(BINDIR) mkdir -p $(DESTDIR)$(EXAMPLEDIR) mkdir -p $(DESTDIR)$(MANDIR)/man1 + mkdir -p $(DESTDIR)$(MANDIR)/man3 mkdir -p $(DESTDIR)$(MANDIR)/man7 $(INSTALL_PROGRAM) mandoc $(DESTDIR)$(BINDIR) + $(INSTALL_LIB) libmandoc.a $(DESTDIR)$(LIBDIR)/ $(INSTALL_MAN) mandoc.1 $(DESTDIR)$(MANDIR)/man1 + $(INSTALL_MAN) mandoc.3 $(DESTDIR)$(MANDIR)/man3 $(INSTALL_MAN) man.7 mdoc.7 roff.7 eqn.7 tbl.7 mandoc_char.7 $(DESTDIR)$(MANDIR)/man7 $(INSTALL_DATA) example.style.css $(DESTDIR)$(EXAMPLEDIR) -uninstall: - rm -f $(DESTDIR)$(BINDIR)/mandoc - rm -f $(DESTDIR)$(MANDIR)/man1/mandoc.1 - rm -f $(DESTDIR)$(MANDIR)/man7/mdoc.7 - rm -f $(DESTDIR)$(MANDIR)/man7/roff.7 - rm -f $(DESTDIR)$(MANDIR)/man7/eqn.7 - rm -f $(DESTDIR)$(MANDIR)/man7/tbl.7 - rm -f $(DESTDIR)$(MANDIR)/man7/man.7 - rm -f $(DESTDIR)$(MANDIR)/man7/mandoc_char.7 - rm -f $(DESTDIR)$(EXAMPLEDIR)/example.style.css - -$(OBJS): config.h - -$(LNS): config.h - -man_macro.ln man_macro.o: man_macro.c libman.h - -lib.ln lib.o: lib.c lib.in libmdoc.h - -att.ln att.o: att.c att.in libmdoc.h - -arch.ln arch.o: arch.c arch.in libmdoc.h - -vol.ln vol.o: vol.c vol.in libmdoc.h - -chars.ln chars.o: chars.c 
chars.in chars.h - -msec.ln msec.o: msec.c msec.in libmdoc.h - -st.ln st.o: st.c st.in libmdoc.h - -mdoc_macro.ln mdoc_macro.o: mdoc_macro.c libmdoc.h - -mdoc_term.ln mdoc_term.o: mdoc_term.c term.h mdoc.h - -man_hash.ln man_hash.o: man_hash.c libman.h - -mdoc_hash.ln mdoc_hash.o: mdoc_hash.c libmdoc.h - -mdoc.ln mdoc.o: mdoc.c libmdoc.h - -man.ln man.o: man.c libman.h - -main.ln main.o: main.c mdoc.h man.h roff.h - -compat.ln compat.o: compat.c - -term.ln term.o: term.c term.h man.h mdoc.h chars.h - -term_ps.ln term_ps.o: term_ps.c term.h main.h - -term_ascii.ln term_ascii.o: term_ascii.c term.h main.h - -html.ln html.o: html.c html.h chars.h - -mdoc_html.ln mdoc_html.o: mdoc_html.c html.h mdoc.h - -man_html.ln man_html.o: man_html.c html.h man.h out.h - -out.ln out.o: out.c out.h - -mandoc.ln mandoc.o: mandoc.c libmandoc.h - -tree.ln tree.o: tree.c man.h mdoc.h - -mdoc_argv.ln mdoc_argv.o: mdoc_argv.c libmdoc.h - -man_argv.ln man_argv.o: man_argv.c libman.h - -man_validate.ln man_validate.o: man_validate.c libman.h - -mdoc_validate.ln mdoc_validate.o: mdoc_validate.c libmdoc.h - -libmdoc.h: mdoc.h - -ChangeLog.xml: - cvs2cl --xml --xml-encoding iso-8859-15 -t --noxmlns -f $@ - -ChangeLog.txt: - cvs2cl -t -f $@ - -ChangeLog.html: ChangeLog.xml ChangeLog.xsl - xsltproc -o $@ ChangeLog.xsl ChangeLog.xml - -mdocml-$(VERSION).tar.gz: $(DOINSTALL) - mkdir -p .dist/mdocml/mdocml-$(VERSION)/ - cp -f $(DOINSTALL) .dist/mdocml/mdocml-$(VERSION)/ - ( cd .dist/mdocml/ && tar zcf ../../$@ mdocml-$(VERSION)/ ) +installwww: www + $(INSTALL_DATA) $(INDEX_MANS) $(PREFIX) + $(INSTALL_DATA) mandoc.h.html $(PREFIX) + $(INSTALL_DATA) external.png style.css index.css $(PREFIX) + $(INSTALL_DATA) mdocml.tar.gz $(PREFIX)/snapshots + $(INSTALL_DATA) mdocml.md5 $(PREFIX)/snapshots + $(INSTALL_DATA) mdocml.tar.gz $(PREFIX)/snapshots/mdocml-$(VERSION).tar.gz + $(INSTALL_DATA) mdocml.md5 $(PREFIX)/snapshots/mdocml-$(VERSION).md5 + +libmandoc.a: compat.o $(LIBMANDOC_OBJS) + $(AR) rs $@ 
compat.o $(LIBMANDOC_OBJS) + +mandoc: $(MANDOC_OBJS) libmandoc.a + $(CC) -o $@ $(MANDOC_OBJS) libmandoc.a + +mdocml.md5: mdocml.tar.gz + md5 mdocml.tar.gz >$@ + +mdocml.tar.gz: $(SRCS) + mkdir -p .dist/mdocml-$(VERSION)/ + $(INSTALL) -m 0444 $(SRCS) .dist/mdocml-$(VERSION) + ( cd .dist/ && tar zcf ../$@ ./ ) rm -rf .dist/ -llib-llibmdoc.ln: $(MDOCLNS) - $(LINT) -Clibmdoc $(MDOCLNS) - -llib-llibman.ln: $(MANLNS) - $(LINT) -Clibman $(MANLNS) - -llib-llibmandoc.ln: $(MANDOCLNS) - $(LINT) -Clibmandoc $(MANDOCLNS) - -llib-llibroff.ln: $(ROFFLNS) - $(LINT) -Clibroff $(ROFFLNS) - -llib-lmandoc.ln: $(MAINLNS) llib-llibmdoc.ln llib-llibman.ln llib-llibmandoc.ln llib-llibroff.ln - $(LINT) -Cmandoc $(MAINLNS) llib-llibmdoc.ln llib-llibman.ln llib-llibmandoc.ln llib-llibroff.ln - -libmdoc.a: $(MDOCOBJS) - $(AR) rs $@ $(MDOCOBJS) +index.html: $(INDEX_OBJS) -libman.a: $(MANOBJS) - $(AR) rs $@ $(MANOBJS) - -libmandoc.a: $(MANDOCOBJS) - $(AR) rs $@ $(MANDOCOBJS) - -libroff.a: $(ROFFOBJS) - $(AR) rs $@ $(ROFFOBJS) - -mandoc: $(MAINOBJS) libroff.a libmdoc.a libman.a libmandoc.a - $(CC) $(CFLAGS) -o $@ $(MAINOBJS) libroff.a libmdoc.a libman.a libmandoc.a +config.h: config.h.pre config.h.post + rm -f config.log + ( cat config.h.pre; \ + echo; \ + if $(CC) $(CFLAGS) -Werror -o test-strlcat test-strlcat.c >> config.log 2>&1; then \ + echo '#define HAVE_STRLCAT'; \ + rm test-strlcat; \ + fi; \ + if $(CC) $(CFLAGS) -Werror -o test-strlcpy test-strlcpy.c >> config.log 2>&1; then \ + echo '#define HAVE_STRLCPY'; \ + rm test-strlcpy; \ + fi; \ + echo; \ + cat config.h.post \ + ) > $@ -.sgml.html: - validate --warn $< - sed -e "s!@VERSION@!$(VERSION)!" -e "s!@VDATE@!$(VDATE)!" 
$< > $@ +.h.h.html: + highlight -I $< >$@ .1.1.txt .3.3.txt .7.7.txt: - ./mandoc -Tascii -Wall,stop $< | col -b > $@ + ./mandoc -Tascii -Wall,stop $< | col -b >$@ -.1.1.sgml .3.3.sgml .7.7.sgml: - ./mandoc -Thtml -Wall,stop -Ostyle=style.css,man=%N.%S.html,includes=%I.html $< > $@ +.1.1.html .3.3.html .7.7.html: + ./mandoc -Thtml -Wall,stop -Ostyle=style.css,man=%N.%S.html,includes=%I.html $< >$@ .1.1.ps .3.3.ps .7.7.ps: - ./mandoc -Tps -Wall,stop $< > $@ + ./mandoc -Tps -Wall,stop $< >$@ .1.1.xhtml .3.3.xhtml .7.7.xhtml: - ./mandoc -Txhtml -Wall,stop -Ostyle=style.css,man=%N.%S.xhtml,includes=%I.html $< > $@ + ./mandoc -Txhtml -Wall,stop -Ostyle=style.css,man=%N.%S.xhtml,includes=%I.html $< >$@ .1.1.pdf .3.3.pdf .7.7.pdf: - ./mandoc -Tpdf -Wall,stop $< > $@ + ./mandoc -Tpdf -Wall,stop $< >$@ -.tar.gz.md5: - md5 $< > $@ - -.h.h.html: - highlight -I $< >$@ - -config.h: config.h.pre config.h.post - rm -f config.log - ( cat config.h.pre; \ - echo; \ - if $(CC) $(CFLAGS) -Werror -o test-strlcat test-strlcat.c >> config.log 2>&1; then \ - echo '#define HAVE_STRLCAT'; \ - rm test-strlcat; \ - fi; \ - if $(CC) $(CFLAGS) -Werror -o test-strlcpy test-strlcpy.c >> config.log 2>&1; then \ - echo '#define HAVE_STRLCPY'; \ - rm test-strlcpy; \ - fi; \ - echo; \ - cat config.h.post \ - ) > $@ +.sgml.html: + validate --warn $< + sed -e "s!@VERSION@!$(VERSION)!" -e "s!@VDATE@!$(VDATE)!" 
$< >$@ Index: chars.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/chars.c,v retrieving revision 1.33 diff -u -r1.33 chars.c --- chars.c 17 Mar 2011 08:49:34 -0000 1.33 +++ chars.c 21 Mar 2011 17:53:01 -0000 @@ -25,7 +25,7 @@ #include <string.h> #include "mandoc.h" -#include "chars.h" +#include "out.h" #define PRINT_HI 126 #define PRINT_LO 32 Index: chars.h =================================================================== RCS file: chars.h diff -N chars.h --- chars.h 30 Jan 2011 16:05:37 -0000 1.7 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,38 +0,0 @@ -/* $Id: chars.h,v 1.7 2011/01/30 16:05:37 schwarze Exp $ */ -/* - * Copyright (c) 2008, 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv> - * Copyright (c) 2011 Ingo Schwarze <schwarze@openbsd.org> - * - * Permission to use, copy, modify, and distribute this software for any - * purpose with or without fee is hereby granted, provided that the above - * copyright notice and this permission notice appear in all copies. - * - * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES - * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF - * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR - * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES - * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN - * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF - * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
- */ -#ifndef CHARS_H -#define CHARS_H - -__BEGIN_DECLS - -enum chars { - CHARS_ASCII, - CHARS_HTML -}; - -void *chars_init(enum chars); -const char *chars_num2char(const char *, size_t); -const char *chars_spec2str(void *, const char *, size_t, size_t *); -int chars_spec2cp(void *, const char *, size_t); -const char *chars_res2str(void *, const char *, size_t, size_t *); -int chars_res2cp(void *, const char *, size_t); -void chars_free(void *); - -__END_DECLS - -#endif /*!CHARS_H*/ Index: eqn.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/eqn.c,v retrieving revision 1.3 diff -u -r1.3 eqn.c --- eqn.c 15 Mar 2011 16:23:51 -0000 1.3 +++ eqn.c 21 Mar 2011 17:53:02 -0000 @@ -25,7 +25,6 @@ #include <time.h> #include "mandoc.h" -#include "roff.h" #include "libmandoc.h" #include "libroff.h" Index: html.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/html.c,v retrieving revision 1.129 diff -u -r1.129 html.c --- html.c 17 Mar 2011 09:16:38 -0000 1.129 +++ html.c 21 Mar 2011 17:53:02 -0000 @@ -32,7 +32,6 @@ #include "mandoc.h" #include "out.h" -#include "chars.h" #include "html.h" #include "main.h" Index: index.sgml =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/index.sgml,v retrieving revision 1.105 diff -u -r1.105 index.sgml --- index.sgml 7 Jan 2011 15:22:21 -0000 1.105 +++ index.sgml 21 Mar 2011 17:53:02 -0000 @@ -39,10 +39,9 @@ </P> <P> - <SPAN CLASS="nm">mdocml</SPAN> consists of the <A HREF="mdoc.3.html">libmdoc</A>, <A - HREF="man.3.html">libman</A>, and <A HREF="roff.3.html">libroff</A> validating compilers; and <A - HREF="mandoc.1.html">mandoc</A>, which interfaces with the compiler libraries to format output for UNIX - terminals, XHTML, HTML, PostScript, and PDF. 
It is a <A CLASS="external" + <SPAN CLASS="nm">mdocml</SPAN> consists of the <A HREF="mandoc.3.html">libmandoc</A> validating + compilers and <A HREF="mandoc.1.html">mandoc</A>, which interfaces with the compiler library to format + output for UNIX terminals, XHTML, HTML, PostScript, and PDF. It is a <A CLASS="external" HREF="http://bsd.lv/">BSD.lv</A> project. </P> @@ -60,8 +59,7 @@ <P> <SPAN CLASS="nm">mdocml</SPAN> is architecture- and system-neutral, written in plain-old C. The most - current version is <SPAN CLASS="attn">@VERSION@</SPAN>, dated <SPAN class="attn">@VDATE@</SPAN>. A full - <A HREF="ChangeLog.html">ChangeLog</A> (<A HREF="ChangeLog.txt">txt</A>) is written with each release. + current version is <SPAN CLASS="attn">@VERSION@</SPAN>, dated <SPAN class="attn">@VDATE@</SPAN>. </P> <H2> @@ -172,38 +170,14 @@ </TD> </TR> <TR> - <TD VALIGN="top"><A HREF="man.3.html">man(3)</A></TD> + <TD VALIGN="top"><A HREF="mandoc.3.html">mandoc(3)</A></TD> <TD VALIGN="top"> - man macro compiler library + mandoc macro compiler library <SPAN STYLE="font-size: smaller;"> - (<A HREF="man.3.txt">text</A> | - <A HREF="man.3.xhtml">xhtml</A> | - <A HREF="man.3.pdf">pdf</A> | - <A HREF="man.3.ps">postscript</A>) - </SPAN> - </TD> - </TR> - <TR> - <TD VALIGN="top"><A HREF="mdoc.3.html">mdoc(3)</A></TD> - <TD VALIGN="top"> - mdoc macro compiler library - <SPAN STYLE="font-size: smaller;"> - (<A HREF="mdoc.3.txt">text</A> | - <A HREF="mdoc.3.xhtml">xhtml</A> | - <A HREF="mdoc.3.pdf">pdf</A> | - <A HREF="mdoc.3.ps">postscript</A>) - </SPAN> - </TD> - </TR> - <TR> - <TD VALIGN="top"><A HREF="roff.3.html">roff(3)</A></TD> - <TD VALIGN="top"> - roff macro compiler library - <SPAN STYLE="font-size: smaller;"> - (<A HREF="roff.3.txt">text</A> | - <A HREF="roff.3.xhtml">xhtml</A> | - <A HREF="roff.3.pdf">pdf</A> | - <A HREF="roff.3.ps">postscript</A>) + (<A HREF="mandoc.3.txt">text</A> | + <A HREF="mandoc.3.xhtml">xhtml</A> | + <A HREF="mandoc.3.pdf">pdf</A> | + <A 
HREF="mandoc.3.ps">postscript</A>) </SPAN> </TD> </TR> @@ -216,6 +190,18 @@ <A HREF="man.7.xhtml">xhtml</A> | <A HREF="man.7.pdf">pdf</A> | <A HREF="man.7.ps">postscript</A>) + </SPAN> + </TD> + </TR> + <TR> + <TD VALIGN="top"><A HREF="eqn.7.html">eqn(7)</A></TD> + <TD VALIGN="top"> + eqn-mandoc language reference + <SPAN STYLE="font-size: smaller;"> + (<A HREF="eqn.7.txt">text</A> | + <A HREF="eqn.7.xhtml">xhtml</A> | + <A HREF="eqn.7.pdf">pdf</A> | + <A HREF="eqn.7.ps">postscript</A>) </SPAN> </TD> </TR> Index: libman.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/libman.h,v retrieving revision 1.47 diff -u -r1.47 libman.h --- libman.h 20 Mar 2011 16:02:05 -0000 1.47 +++ libman.h 21 Mar 2011 17:53:02 -0000 @@ -17,8 +17,6 @@ #ifndef LIBMAN_H #define LIBMAN_H -#include "man.h" - enum man_next { MAN_NEXT_SIBLING = 0, MAN_NEXT_CHILD Index: libmandoc.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/libmandoc.h,v retrieving revision 1.13 diff -u -r1.13 libmandoc.h --- libmandoc.h 20 Mar 2011 16:02:05 -0000 1.13 +++ libmandoc.h 21 Mar 2011 17:53:02 -0000 @@ -17,18 +17,58 @@ #ifndef LIBMANDOC_H #define LIBMANDOC_H +enum rofferr { + ROFF_CONT, /* continue processing line */ + ROFF_RERUN, /* re-run roff interpreter with offset */ + ROFF_APPEND, /* re-run main parser, appending next line */ + ROFF_REPARSE, /* re-run main parser on the result */ + ROFF_SO, /* include another file */ + ROFF_IGN, /* ignore current line */ + ROFF_TBL, /* a table row was successfully parsed */ + ROFF_EQN, /* an equation was successfully parsed */ + ROFF_ERR /* badness: puke and stop */ +}; + +struct roff; + __BEGIN_DECLS -void mandoc_msg(enum mandocerr, struct mparse *, - int, int, const char *); -void mandoc_vmsg(enum mandocerr, struct mparse *, - int, int, const char *, ...); -int mandoc_special(char *); -char *mandoc_strdup(const char *); -char 
*mandoc_getarg(struct mparse *, char **, int, int *); -char *mandoc_normdate(struct mparse *, char *, int, int); -int mandoc_eos(const char *, size_t, int); -int mandoc_hyph(const char *, const char *); +void mandoc_msg(enum mandocerr, struct mparse *, + int, int, const char *); +void mandoc_vmsg(enum mandocerr, struct mparse *, + int, int, const char *, ...); +int mandoc_special(char *); +char *mandoc_strdup(const char *); +char *mandoc_getarg(struct mparse *, char **, int, int *); +char *mandoc_normdate(struct mparse *, char *, int, int); +int mandoc_eos(const char *, size_t, int); +int mandoc_hyph(const char *, const char *); + +void mdoc_free(struct mdoc *); +struct mdoc *mdoc_alloc(struct regset *, struct mparse *); +void mdoc_reset(struct mdoc *); +int mdoc_parseln(struct mdoc *, int, char *, int); +int mdoc_endparse(struct mdoc *); +int mdoc_addspan(struct mdoc *, const struct tbl_span *); +int mdoc_addeqn(struct mdoc *, const struct eqn *); + +void man_free(struct man *); +struct man *man_alloc(struct regset *, struct mparse *); +void man_reset(struct man *); +int man_parseln(struct man *, int, char *, int); +int man_endparse(struct man *); +int man_addspan(struct man *, const struct tbl_span *); +int man_addeqn(struct man *, const struct eqn *); + +void roff_free(struct roff *); +struct roff *roff_alloc(struct regset *, struct mparse *); +void roff_reset(struct roff *); +enum rofferr roff_parseln(struct roff *, int, + char **, size_t *, int, int *); +void roff_endparse(struct roff *); + +const struct tbl_span *roff_span(const struct roff *); +const struct eqn *roff_eqn(const struct roff *); __END_DECLS Index: libmdoc.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/libmdoc.h,v retrieving revision 1.69 diff -u -r1.69 libmdoc.h --- libmdoc.h 20 Mar 2011 16:02:05 -0000 1.69 +++ libmdoc.h 21 Mar 2011 17:53:02 -0000 @@ -17,8 +17,6 @@ #ifndef LIBMDOC_H #define LIBMDOC_H -#include "mdoc.h" - enum 
mdoc_next { MDOC_NEXT_SIBLING = 0, MDOC_NEXT_CHILD Index: main.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/main.c,v retrieving revision 1.157 diff -u -r1.157 main.c --- main.c 21 Mar 2011 12:04:26 -0000 1.157 +++ main.c 21 Mar 2011 17:53:02 -0000 @@ -28,8 +28,6 @@ #include "mandoc.h" #include "main.h" -#include "mdoc.h" -#include "man.h" #if !defined(__GNUC__) || (__GNUC__ < 2) # if !defined(lint) Index: man.3 =================================================================== RCS file: man.3 diff -N man.3 --- man.3 9 Feb 2011 09:18:15 -0000 1.30 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,283 +0,0 @@ -.\" $Id: man.3,v 1.30 2011/02/09 09:18:15 kristaps Exp $ -.\" -.\" Copyright (c) 2009-2010 Kristaps Dzonsons <kristaps@bsd.lv> -.\" -.\" Permission to use, copy, modify, and distribute this software for any -.\" purpose with or without fee is hereby granted, provided that the above -.\" copyright notice and this permission notice appear in all copies. -.\" -.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES -.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF -.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR -.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES -.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN -.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF -.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
-.\" -.Dd $Mdocdate: February 9 2011 $ -.Dt MAN 3 -.Os -.Sh NAME -.Nm man , -.Nm man_addeqn , -.Nm man_addspan , -.Nm man_alloc , -.Nm man_endparse , -.Nm man_free , -.Nm man_meta , -.Nm man_node , -.Nm man_parseln , -.Nm man_reset -.Nd man macro compiler library -.Sh SYNOPSIS -.In mandoc.h -.In man.h -.Vt extern const char * const * man_macronames; -.Ft int -.Fo man_addeqn -.Fa "struct man *man" -.Fa "const struct eqn *eqn" -.Fc -.Ft int -.Fo man_addspan -.Fa "struct man *man" -.Fa "const struct tbl_span *span" -.Fc -.Ft "struct man *" -.Fo man_alloc -.Fa "struct regset *regs" -.Fa "void *data" -.Fa "mandocmsg msgs" -.Fc -.Ft int -.Fn man_endparse "struct man *man" -.Ft void -.Fn man_free "struct man *man" -.Ft "const struct man_meta *" -.Fn man_meta "const struct man *man" -.Ft "const struct man_node *" -.Fn man_node "const struct man *man" -.Ft int -.Fo man_parseln -.Fa "struct man *man" -.Fa "int line" -.Fa "char *buf" -.Fc -.Ft void -.Fn man_reset "struct man *man" -.Sh DESCRIPTION -The -.Nm -library parses lines of -.Xr man 7 -input into an abstract syntax tree (AST). -.Pp -In general, applications initiate a parsing sequence with -.Fn man_alloc , -parse each line in a document with -.Fn man_parseln , -close the parsing session with -.Fn man_endparse , -operate over the syntax tree returned by -.Fn man_node -and -.Fn man_meta , -then free all allocated memory with -.Fn man_free . -The -.Fn man_reset -function may be used in order to reset the parser for another input -sequence. -.Pp -Beyond the full set of macros defined in -.Xr man 7 , -the -.Nm -library also accepts the following macro: -.Pp -.Bl -tag -width Ds -compact -.It PD -Has no effect. -Handled as a current-scope line macro. -.El -.Ss Types -.Bl -ohang -.It Vt struct man -An opaque type. -Its values are only used privately within the library. -.It Vt struct man_node -A parsed node. -See -.Sx Abstract Syntax Tree -for details. 
-.El -.Ss Functions -If -.Fn man_addeqn , -.Fn man_addspan , -.Fn man_parseln , -or -.Fn man_endparse -return 0, calls to any function but -.Fn man_reset -or -.Fn man_free -will raise an assertion. -.Bl -ohang -.It Fn man_addeqn -Add an equation to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn man_addspan -Add a table span to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn man_alloc -Allocates a parsing structure. -The -.Fa data -pointer is passed to -.Fa msgs . -Always returns a valid pointer. -The pointer must be freed with -.Fn man_free . -.It Fn man_reset -Reset the parser for another parse routine. -After its use, -.Fn man_parseln -behaves as if invoked for the first time. -.It Fn man_free -Free all resources of a parser. -The pointer is no longer valid after invocation. -.It Fn man_parseln -Parse a nil-terminated line of input. -This line should not contain the trailing newline. -Returns 0 on failure, 1 on success. -The input buffer -.Fa buf -is modified by this function. -.It Fn man_endparse -Signals that the parse is complete. -Returns 0 on failure, 1 on success. -.It Fn man_node -Returns the first node of the parse. -.It Fn man_meta -Returns the document's parsed meta-data. -.El -.Ss Variables -The following variables are also defined: -.Bl -ohang -.It Va man_macronames -An array of string-ified token names. -.El -.Ss Abstract Syntax Tree -The -.Nm -functions produce an abstract syntax tree (AST) describing input in a -regular form. -It may be reviewed at any time with -.Fn man_nodes ; -however, if called before -.Fn man_endparse , -or after -.Fn man_endparse -or -.Fn man_parseln -fail, it may be incomplete. -.Pp -This AST is governed by the ontological rules dictated in -.Xr man 7 -and derives its terminology accordingly. -.Pp -The AST is composed of -.Vt struct man_node -nodes with element, root and text types as declared by the -.Va type -field. 
-Each node also provides its parse point (the -.Va line , -.Va sec , -and -.Va pos -fields), its position in the tree (the -.Va parent , -.Va child , -.Va next -and -.Va prev -fields) and some type-specific data. -.Pp -The tree itself is arranged according to the following normal form, -where capitalised non-terminals represent nodes. -.Pp -.Bl -tag -width "ELEMENTXX" -compact -.It ROOT -\(<- mnode+ -.It mnode -\(<- ELEMENT | TEXT | BLOCK -.It BLOCK -\(<- HEAD BODY -.It HEAD -\(<- mnode* -.It BODY -\(<- mnode* -.It ELEMENT -\(<- ELEMENT | TEXT* -.It TEXT -\(<- [[:alpha:]]* -.El -.Pp -The only elements capable of nesting other elements are those with -next-lint scope as documented in -.Xr man 7 . -.Sh EXAMPLES -The following example reads lines from stdin and parses them, operating -on the finished parse tree with -.Fn parsed . -This example does not error-check nor free memory upon failure. -.Bd -literal -offset indent -struct regset regs; -struct man *man; -struct man_node *node; -char *buf; -size_t len; -int line; - -bzero(&regs, sizeof(struct regset)); -line = 1; -man = man_alloc(&regs, NULL, NULL); -buf = NULL; -alloc_len = 0; - -while ((len = getline(&buf, &alloc_len, stdin)) >= 0) { - if (len && buflen[len - 1] = '\en') - buf[len - 1] = '\e0'; - if ( ! man_parseln(man, line, buf)) - errx(1, "man_parseln"); - line++; -} - -free(buf); - -if ( ! man_endparse(man)) - errx(1, "man_endparse"); -if (NULL == (node = man_node(man))) - errx(1, "man_node"); - -parsed(man, node); -man_free(man); -.Ed -.Pp -To compile this, execute -.Pp -.Dl % cc main.c libman.a libmandoc.a -.Pp -where -.Pa main.c -is the example file. -.Sh SEE ALSO -.Xr mandoc 1 , -.Xr man 7 -.Sh AUTHORS -The -.Nm -library was written by -.An Kristaps Dzonsons Aq kristaps@bsd.lv . 
Index: man.h =================================================================== RCS file: man.h diff -N man.h --- man.h 20 Mar 2011 16:02:05 -0000 1.55 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,133 +0,0 @@ -/* $Id: man.h,v 1.55 2011/03/20 16:02:05 kristaps Exp $ */ -/* - * Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv> - * - * Permission to use, copy, modify, and distribute this software for any - * purpose with or without fee is hereby granted, provided that the above - * copyright notice and this permission notice appear in all copies. - * - * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES - * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF - * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR - * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES - * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN - * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF - * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. - */ -#ifndef MAN_H -#define MAN_H - -/* - * What follows is a list of ALL possible macros. - */ -enum mant { - MAN_br = 0, - MAN_TH, - MAN_SH, - MAN_SS, - MAN_TP, - MAN_LP, - MAN_PP, - MAN_P, - MAN_IP, - MAN_HP, - MAN_SM, - MAN_SB, - MAN_BI, - MAN_IB, - MAN_BR, - MAN_RB, - MAN_R, - MAN_B, - MAN_I, - MAN_IR, - MAN_RI, - MAN_na, - MAN_sp, - MAN_nf, - MAN_fi, - MAN_RE, - MAN_RS, - MAN_DT, - MAN_UC, - MAN_PD, - MAN_AT, - MAN_in, - MAN_ft, - MAN_MAX -}; - -/* - * Type of a syntax node. - */ -enum man_type { - MAN_TEXT, - MAN_ELEM, - MAN_ROOT, - MAN_BLOCK, - MAN_HEAD, - MAN_BODY, - MAN_TBL, - MAN_EQN -}; - -/* - * Information from prologue. - */ -struct man_meta { - char *msec; /* `TH' section (1, 3p, etc.) 
*/ - char *date; /* `TH' normalised date */ - char *vol; /* `TH' volume */ - char *title; /* `TH' title (e.g., FOO) */ - char *source; /* `TH' source (e.g., GNU) */ -}; - -/* - * Single node in tree-linked AST. - */ -struct man_node { - struct man_node *parent; /* parent AST node */ - struct man_node *child; /* first child AST node */ - struct man_node *next; /* sibling AST node */ - struct man_node *prev; /* prior sibling AST node */ - int nchild; /* number children */ - int line; - int pos; - enum mant tok; /* tok or MAN__MAX if none */ - int flags; -#define MAN_VALID (1 << 0) /* has been validated */ -#define MAN_EOS (1 << 2) /* at sentence boundary */ -#define MAN_LINE (1 << 3) /* first macro/text on line */ - enum man_type type; /* AST node type */ - char *string; /* TEXT node argument */ - struct man_node *head; /* BLOCK node HEAD ptr */ - struct man_node *body; /* BLOCK node BODY ptr */ - const struct tbl_span *span; /* TBL */ - const struct eqn *eqn; /* EQN */ -}; - -/* - * Names of macros. Index is enum mant. Indexing into this returns - * the normalised name, e.g., man_macronames[MAN_SH] -> "SH". 
- */ -extern const char *const *man_macronames; - -__BEGIN_DECLS - -struct man; - -void man_free(struct man *); -struct man *man_alloc(struct regset *, struct mparse *); -void man_reset(struct man *); -int man_parseln(struct man *, int, char *, int); -int man_endparse(struct man *); -int man_addspan(struct man *, - const struct tbl_span *); -int man_addeqn(struct man *, const struct eqn *); - -const struct man_node *man_node(const struct man *); -const struct man_meta *man_meta(const struct man *); - -__END_DECLS - -#endif /*!MAN_H*/ Index: man_html.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/man_html.c,v retrieving revision 1.70 diff -u -r1.70 man_html.c --- man_html.c 7 Mar 2011 01:35:51 -0000 1.70 +++ man_html.c 21 Mar 2011 17:53:02 -0000 @@ -29,7 +29,6 @@ #include "mandoc.h" #include "out.h" #include "html.h" -#include "man.h" #include "main.h" /* TODO: preserve ident widths. */ Index: man_term.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/man_term.c,v retrieving revision 1.104 diff -u -r1.104 man_term.c --- man_term.c 7 Mar 2011 01:35:51 -0000 1.104 +++ man_term.c 21 Mar 2011 17:53:02 -0000 @@ -29,9 +29,7 @@ #include "mandoc.h" #include "out.h" -#include "man.h" #include "term.h" -#include "chars.h" #include "main.h" #define INDENT 7 Index: mandoc.3 =================================================================== RCS file: mandoc.3 diff -N mandoc.3 --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ mandoc.3 21 Mar 2011 17:53:02 -0000 @@ -0,0 +1,321 @@ +.\" $Id: mdoc.3,v 1.57 2011/02/09 09:18:15 kristaps Exp $ +.\" +.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv> +.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org> +.\" +.\" Permission to use, copy, modify, and distribute this software for any +.\" purpose with or without fee is hereby granted, provided that the above +.\" copyright notice and this 
permission notice appear in all copies. +.\" +.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES +.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF +.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR +.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES +.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN +.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF +.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. +.\" +.Dd $Mdocdate: February 9 2011 $ +.Dt MANDOC 3 +.Os +.Sh NAME +.Nm mandoc , +.Nm man_meta , +.Nm man_node , +.Nm mdoc_meta , +.Nm mdoc_node , +.Nm mparse_alloc , +.Nm mparse_free , +.Nm mparse_readfd , +.Nm mparse_reset , +.Nm mparse_result +.Nd mandoc macro compiler library +.Sh SYNOPSIS +.In mandoc.h +.Ft "const struct man_meta *" +.Fo man_meta +.Fa "const struct man *man" +.Fc +.Ft "const struct man_node *" +.Fo man_node +.Fa "const struct man *man" +.Fc +.Ft "const struct mdoc_meta *" +.Fo mdoc_meta +.Fa "const struct mdoc *mdoc" +.Fc +.Ft "const struct mdoc_node *" +.Fo mdoc_node +.Fa "const struct mdoc *mdoc" +.Fc +.Ft void +.Fo mparse_alloc +.Fa "enum mparset type" +.Fa "enum mandoclevel wlevel" +.Fa "mandocmsg msg" +.Fa "void *msgarg" +.Fc +.Ft void +.Fo mparse_free +.Fa "struct mparse *parse" +.Fc +.Ft "enum mandoclevel" +.Fo mparse_readfd +.Fa "struct mparse *parse" +.Fa "int fd" +.Fa "const char *fname" +.Fc +.Ft void +.Fo mparse_reset +.Fa "struct mparse *parse" +.Fc +.Ft void +.Fo mparse_result +.Fa "struct mparse *parse" +.Fa "struct mdoc **mdoc" +.Fa "struct man **man" +.Fc +.Vt extern const char * const * man_macronames; +.Vt extern const char * const * mdoc_argnames; +.Vt extern const char * const * mdoc_macronames; +.Sh DESCRIPTION +The +.Nm mandoc +library parses a +.Ux +manual into an abstract syntax tree (AST). 
+.Ux +manuals are composed of +.Xr mdoc 7 +or +.Xr man 7 , +and may be mixed with +.Xr roff 7 , +.Xr tbl 7 , +and +.Xr eqn 7 +invocations. +.Pp +The following describes a general parse sequence: +.Bl -enum +.It +initiate a parsing sequence with +.Fn mparse_alloc ; +.It +parse files or file descriptors with +.Fn mparse_readfd ; +.It +retrieve a parsed syntax tree, if the parse was successful, with +.Fn mparse_result ; +.It +iterate over parse nodes with +.Fn mdoc_node +or +.Fn man_node ; +.It +free all allocated memory with +.Fn mparse_free , +or invoke +.Fn mparse_reset +and parse new files. +.El +.Sh IMPLEMENTATION NOTES +This section consists of structural documentation for +.Xr mdoc 7 +and +.Xr man 7 +syntax trees. +.Ss Man Abstract Syntax Tree +This AST is governed by the ontological rules dictated in +.Xr man 7 +and derives its terminology accordingly. +.Pp +The AST is composed of +.Vt struct man_node +nodes with element, root and text types as declared by the +.Va type +field. +Each node also provides its parse point (the +.Va line , +.Va sec , +and +.Va pos +fields), its position in the tree (the +.Va parent , +.Va child , +.Va next +and +.Va prev +fields) and some type-specific data. +.Pp +The tree itself is arranged according to the following normal form, +where capitalised non-terminals represent nodes. +.Pp +.Bl -tag -width "ELEMENTXX" -compact +.It ROOT +\(<- mnode+ +.It mnode +\(<- ELEMENT | TEXT | BLOCK +.It BLOCK +\(<- HEAD BODY +.It HEAD +\(<- mnode* +.It BODY +\(<- mnode* +.It ELEMENT +\(<- ELEMENT | TEXT* +.It TEXT +\(<- [[:alpha:]]* +.El +.Pp +The only elements capable of nesting other elements are those with +next-lint scope as documented in +.Xr man 7 . +.Ss Mdoc Abstract Syntax Tree +This AST is governed by the ontological +rules dictated in +.Xr mdoc 7 +and derives its terminology accordingly. +.Qq In-line +elements described in +.Xr mdoc 7 +are described simply as +.Qq elements . 
+.Pp +The AST is composed of +.Vt struct mdoc_node +nodes with block, head, body, element, root and text types as declared +by the +.Va type +field. +Each node also provides its parse point (the +.Va line , +.Va sec , +and +.Va pos +fields), its position in the tree (the +.Va parent , +.Va child , +.Va nchild , +.Va next +and +.Va prev +fields) and some type-specific data, in particular, for nodes generated +from macros, the generating macro in the +.Va tok +field. +.Pp +The tree itself is arranged according to the following normal form, +where capitalised non-terminals represent nodes. +.Pp +.Bl -tag -width "ELEMENTXX" -compact +.It ROOT +\(<- mnode+ +.It mnode +\(<- BLOCK | ELEMENT | TEXT +.It BLOCK +\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]] +.It ELEMENT +\(<- TEXT* +.It HEAD +\(<- mnode* +.It BODY +\(<- mnode* [ENDBODY mnode*] +.It TAIL +\(<- mnode* +.It TEXT +\(<- [[:printable:],0x1e]* +.El +.Pp +Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of +the BLOCK production: these refer to punctuation marks. +Furthermore, although a TEXT node will generally have a non-zero-length +string, in the specific case of +.Sq \&.Bd \-literal , +an empty line will produce a zero-length string. +Multiple body parts are only found in invocations of +.Sq \&Bl \-column , +where a new body introduces a new phrase. +.Pp +The +.Xr mdoc 7 +syntax tree accomodates for broken block structures as well. +The ENDBODY node is available to end the formatting associated +with a given block before the physical end of that block. +It has a non-null +.Va end +field, is of the BODY +.Va type , +has the same +.Va tok +as the BLOCK it is ending, and has a +.Va pending +field pointing to that BLOCK's BODY node. +It is an indirect child of that BODY node +and has no children of its own. 
+.Pp +An ENDBODY node is generated when a block ends while one of its child +blocks is still open, like in the following example: +.Bd -literal -offset indent +\&.Ao ao +\&.Bo bo ac +\&.Ac bc +\&.Bc end +.Ed +.Pp +This example results in the following block structure: +.Bd -literal -offset indent +BLOCK Ao + HEAD Ao + BODY Ao + TEXT ao + BLOCK Bo, pending -> Ao + HEAD Bo + BODY Bo + TEXT bo + TEXT ac + ENDBODY Ao, pending -> Ao + TEXT bc +TEXT end +.Ed +.Pp +Here, the formatting of the +.Sq \&Ao +block extends from TEXT ao to TEXT ac, +while the formatting of the +.Sq \&Bo +block extends from TEXT bo to TEXT bc. +It renders as follows in +.Fl T Ns Cm ascii +mode: +.Pp +.Dl <ao [bo ac> bc] end +.Pp +Support for badly-nested blocks is only provided for backward +compatibility with some older +.Xr mdoc 7 +implementations. +Using badly-nested blocks is +.Em strongly discouraged ; +for example, the +.Fl T Ns Cm html +and +.Fl T Ns Cm xhtml +front-ends to +.Xr mandoc 1 +are unable to render them in any meaningful way. +Furthermore, behaviour when encountering badly-nested blocks is not +consistent across troff implementations, especially when using multiple +levels of badly-nested blocks. +.Sh SEE ALSO +.Xr mandoc 1 , +.Xr eqn 7 , +.Xr man 7 , +.Xr mdoc 7 , +.Xr roff 7 , +.Xr tbl 7 +.Sh AUTHORS +The +.Nm +library was written by +.An Kristaps Dzonsons Aq kristaps@bsd.lv . Index: mandoc.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc.h,v retrieving revision 1.64 diff -u -r1.64 mandoc.h --- mandoc.h 20 Mar 2011 16:05:21 -0000 1.64 +++ mandoc.h 21 Mar 2011 17:53:02 -0000 @@ -27,7 +27,7 @@ * threshold). */ enum mandoclevel { - MANDOCLEVEL_OK = 0, + MANDOCLEVEL_OK = 0, /* everything's ok */ MANDOCLEVEL_RESERVED, MANDOCLEVEL_WARNING, /* warnings: syntax, whitespace, etc. 
*/ MANDOCLEVEL_ERROR, /* input has been thrown away */ @@ -277,7 +277,7 @@ }; /* - * Available registers (set in libroff, accessed elsewhere). + * Available roff registers. */ enum regs { REG_nS = 0, @@ -328,6 +328,436 @@ DELIM_CLOSE }; +enum mdoct { + MDOC_Ap = 0, + MDOC_Dd, + MDOC_Dt, + MDOC_Os, + MDOC_Sh, + MDOC_Ss, + MDOC_Pp, + MDOC_D1, + MDOC_Dl, + MDOC_Bd, + MDOC_Ed, + MDOC_Bl, + MDOC_El, + MDOC_It, + MDOC_Ad, + MDOC_An, + MDOC_Ar, + MDOC_Cd, + MDOC_Cm, + MDOC_Dv, + MDOC_Er, + MDOC_Ev, + MDOC_Ex, + MDOC_Fa, + MDOC_Fd, + MDOC_Fl, + MDOC_Fn, + MDOC_Ft, + MDOC_Ic, + MDOC_In, + MDOC_Li, + MDOC_Nd, + MDOC_Nm, + MDOC_Op, + MDOC_Ot, + MDOC_Pa, + MDOC_Rv, + MDOC_St, + MDOC_Va, + MDOC_Vt, + MDOC_Xr, + MDOC__A, + MDOC__B, + MDOC__D, + MDOC__I, + MDOC__J, + MDOC__N, + MDOC__O, + MDOC__P, + MDOC__R, + MDOC__T, + MDOC__V, + MDOC_Ac, + MDOC_Ao, + MDOC_Aq, + MDOC_At, + MDOC_Bc, + MDOC_Bf, + MDOC_Bo, + MDOC_Bq, + MDOC_Bsx, + MDOC_Bx, + MDOC_Db, + MDOC_Dc, + MDOC_Do, + MDOC_Dq, + MDOC_Ec, + MDOC_Ef, + MDOC_Em, + MDOC_Eo, + MDOC_Fx, + MDOC_Ms, + MDOC_No, + MDOC_Ns, + MDOC_Nx, + MDOC_Ox, + MDOC_Pc, + MDOC_Pf, + MDOC_Po, + MDOC_Pq, + MDOC_Qc, + MDOC_Ql, + MDOC_Qo, + MDOC_Qq, + MDOC_Re, + MDOC_Rs, + MDOC_Sc, + MDOC_So, + MDOC_Sq, + MDOC_Sm, + MDOC_Sx, + MDOC_Sy, + MDOC_Tn, + MDOC_Ux, + MDOC_Xc, + MDOC_Xo, + MDOC_Fo, + MDOC_Fc, + MDOC_Oo, + MDOC_Oc, + MDOC_Bk, + MDOC_Ek, + MDOC_Bt, + MDOC_Hf, + MDOC_Fr, + MDOC_Ud, + MDOC_Lb, + MDOC_Lp, + MDOC_Lk, + MDOC_Mt, + MDOC_Brq, + MDOC_Bro, + MDOC_Brc, + MDOC__C, + MDOC_Es, + MDOC_En, + MDOC_Dx, + MDOC__Q, + MDOC_br, + MDOC_sp, + MDOC__U, + MDOC_Ta, + MDOC_MAX +}; + +enum mdocargt { + MDOC_Split, + MDOC_Nosplit, + MDOC_Ragged, + MDOC_Unfilled, + MDOC_Literal, + MDOC_File, + MDOC_Offset, + MDOC_Bullet, + MDOC_Dash, + MDOC_Hyphen, + MDOC_Item, + MDOC_Enum, + MDOC_Tag, + MDOC_Diag, + MDOC_Hang, + MDOC_Ohang, + MDOC_Inset, + MDOC_Column, + MDOC_Width, + MDOC_Compact, + MDOC_Std, + MDOC_Filled, + MDOC_Words, + MDOC_Emphasis, + MDOC_Symbolic, + 
MDOC_Nested, + MDOC_Centred, + MDOC_ARG_MAX +}; + +enum mdoc_type { + MDOC_TEXT, /* text */ + MDOC_ELEM, /* in-line element */ + MDOC_HEAD, /* block head */ + MDOC_TAIL, /* block tail */ + MDOC_BODY, /* block body */ + MDOC_BLOCK, /* block enclosure */ + MDOC_TBL, /* table */ + MDOC_EQN, /* equation */ + MDOC_ROOT /* root of document */ +}; + +/* + * Section (named/unnamed) of mdoc(7) `Sh'. Note that these appear in + * the conventional order imposed by mdoc(7). + */ +enum mdoc_sec { + SEC_NONE = 0, /* No section, yet. */ + SEC_NAME, + SEC_LIBRARY, + SEC_SYNOPSIS, + SEC_DESCRIPTION, + SEC_IMPLEMENTATION, + SEC_RETURN_VALUES, + SEC_ENVIRONMENT, + SEC_FILES, + SEC_EXIT_STATUS, + SEC_EXAMPLES, + SEC_DIAGNOSTICS, + SEC_COMPATIBILITY, + SEC_ERRORS, + SEC_SEE_ALSO, + SEC_STANDARDS, + SEC_HISTORY, + SEC_AUTHORS, + SEC_CAVEATS, + SEC_BUGS, + SEC_SECURITY, + SEC_CUSTOM, /* User-defined. */ + SEC__MAX +}; + +struct mdoc_meta { + char *msec; /* `Dt' section (1, 3p, etc.) */ + char *vol; /* `Dt' volume (implied) */ + char *arch; /* `Dt' arch (i386, etc.) */ + char *date; /* `Dd' normalised date */ + char *title; /* `Dt' title (FOO, etc.) */ + char *os; /* `Os' system (OpenBSD, etc.) */ + char *name; /* leading `Nm' name */ +}; + +/* + * An argument to a mdoc(7) macro (multiple values = `-column xxx yyy'). + */ +struct mdoc_argv { + enum mdocargt arg; /* type of argument */ + int line; + int pos; + size_t sz; /* elements in "value" */ + char **value; /* argument strings */ +}; + +/* + * Reference-counted macro arguments. These are refcounted because + * blocks have multiple instances of the same arguments spread across + * the HEAD, BODY, TAIL, and BLOCK node types. + */ +struct mdoc_arg { + size_t argc; + struct mdoc_argv *argv; + unsigned int refcnt; +}; + +/* + * Indicates that a BODY's formatting has ended, but the scope is still + * open. Used for syntax-broken blocks. 
+ */ +enum mdoc_endbody { + ENDBODY_NOT = 0, + ENDBODY_SPACE, /* is broken: append a space */ + ENDBODY_NOSPACE /* is broken: don't append a space */ +}; + +enum mdoc_list { + LIST__NONE = 0, + LIST_bullet, /* -bullet argument */ + LIST_column, /* -column argument */ + LIST_dash, /* -dash argument */ + LIST_diag, /* -diag argument */ + LIST_enum, /* -enum argument */ + LIST_hang, /* -hang argument */ + LIST_hyphen, /* -hyphen argument */ + LIST_inset, /* -inset argument */ + LIST_item, /* -item argument */ + LIST_ohang, /* -ohang argument */ + LIST_tag, /* -tag argument */ + LIST_MAX +}; + +enum mdoc_disp { + DISP__NONE = 0, + DISP_centred, /* -centred argument */ + DISP_ragged, /* -ragged argument */ + DISP_unfilled, /* -unfilled argument */ + DISP_filled, /* -filled argument */ + DISP_literal /* -literal argument */ +}; + +enum mdoc_auth { + AUTH__NONE = 0, + AUTH_split, /* -split argument */ + AUTH_nosplit /* -nosplit argument */ +}; + +enum mdoc_font { + FONT__NONE = 0, + FONT_Em, /* "Em" or -emphasis */ + FONT_Li, /* "Li" or -literal */ + FONT_Sy /* "Sy" or -symbolic */ +}; + +struct mdoc_bd { + const char *offs; /* -offset */ + enum mdoc_disp type; /* -ragged, etc. */ + int comp; /* -compact */ +}; + +struct mdoc_bl { + const char *width; /* -width */ + const char *offs; /* -offset */ + enum mdoc_list type; /* -tag, -enum, etc. */ + int comp; /* -compact */ + size_t ncols; /* -column arg count */ + const char **cols; /* -column val ptr */ +}; + +struct mdoc_bf { + enum mdoc_font font; /* font */ +}; + +struct mdoc_an { + enum mdoc_auth auth; /* -split, etc. */ +}; + +struct mdoc_rs { + int quote_T; /* whether to quote %T */ +}; + +/* + * Consists of normalised node arguments. These should be used instead + * of iterating through the mdoc_arg pointers of a node: defaults are + * provided, etc. 
+ */ +union mdoc_data { + struct mdoc_an An; /* An arguments */ + struct mdoc_bd Bd; /* Bd arguments */ + struct mdoc_bf Bf; /* Bf arguments */ + struct mdoc_bl Bl; /* Bl arguments */ + struct mdoc_rs Rs; /* Rs arguments */ +}; + +/* + * Single node in tree-linked AST. + */ +struct mdoc_node { + struct mdoc_node *parent; /* parent AST node */ + struct mdoc_node *child; /* first child AST node */ + struct mdoc_node *last; /* last child AST node */ + struct mdoc_node *next; /* sibling AST node */ + struct mdoc_node *prev; /* prior sibling AST node */ + int nchild; /* number of children */ + int line; /* parse line */ + int pos; /* parse column */ + enum mdoct tok; /* tok or MDOC_MAX if none */ + int flags; +#define MDOC_VALID (1 << 0) /* has been validated */ +#define MDOC_EOS (1 << 2) /* at sentence boundary */ +#define MDOC_LINE (1 << 3) /* first macro/text on line */ +#define MDOC_SYNPRETTY (1 << 4) /* SYNOPSIS-style formatting */ +#define MDOC_ENDED (1 << 5) /* rendering has been ended */ + enum mdoc_type type; /* AST node type */ + enum mdoc_sec sec; /* current named section */ + union mdoc_data *norm; /* normalised args */ + /* FIXME: these can be union'd to shave a few bytes. 
*/ + struct mdoc_arg *args; /* BLOCK/ELEM */ + struct mdoc_node *pending; /* BLOCK */ + struct mdoc_node *head; /* BLOCK */ + struct mdoc_node *body; /* BLOCK */ + struct mdoc_node *tail; /* BLOCK */ + char *string; /* TEXT */ + const struct tbl_span *span; /* TBL */ + const struct eqn *eqn; /* EQN */ + enum mdoc_endbody end; /* BODY */ +}; + +enum mant { + MAN_br = 0, + MAN_TH, + MAN_SH, + MAN_SS, + MAN_TP, + MAN_LP, + MAN_PP, + MAN_P, + MAN_IP, + MAN_HP, + MAN_SM, + MAN_SB, + MAN_BI, + MAN_IB, + MAN_BR, + MAN_RB, + MAN_R, + MAN_B, + MAN_I, + MAN_IR, + MAN_RI, + MAN_na, + MAN_sp, + MAN_nf, + MAN_fi, + MAN_RE, + MAN_RS, + MAN_DT, + MAN_UC, + MAN_PD, + MAN_AT, + MAN_in, + MAN_ft, + MAN_MAX +}; + +enum man_type { + MAN_TEXT, + MAN_ELEM, + MAN_ROOT, + MAN_BLOCK, + MAN_HEAD, + MAN_BODY, + MAN_TBL, + MAN_EQN +}; + +struct man_meta { + char *msec; /* `TH' section (1, 3p, etc.) */ + char *date; /* `TH' normalised date */ + char *vol; /* `TH' volume */ + char *title; /* `TH' title (e.g., FOO) */ + char *source; /* `TH' source (e.g., GNU) */ +}; + +struct man_node { + struct man_node *parent; /* parent AST node */ + struct man_node *child; /* first child AST node */ + struct man_node *next; /* sibling AST node */ + struct man_node *prev; /* prior sibling AST node */ + int nchild; /* number of children */ + int line; + int pos; + enum mant tok; /* tok or MAN_MAX if none */ + int flags; +#define MAN_VALID (1 << 0) /* has been validated */ +#define MAN_EOS (1 << 2) /* at sentence boundary */ +#define MAN_LINE (1 << 3) /* first macro/text on line */ + enum man_type type; /* AST node type */ + char *string; /* TEXT node argument */ + struct man_node *head; /* BLOCK node HEAD ptr */ + struct man_node *body; /* BLOCK node BODY ptr */ + const struct tbl_span *span; /* TBL */ + const struct eqn *eqn; /* EQN */ +}; + /* * The type of parse sequence. This value is usually passed via the * mandoc(1) command line of -man and -mdoc. 
It's almost exclusively @@ -342,12 +772,22 @@ typedef void (*mandocmsg)(enum mandocerr, enum mandoclevel, const char *, int, int, const char *); +/* Names of macros. Index is enum mdoct. */ +extern const char * const *mdoc_macronames; + +/* Names of macro args. Index is enum mdocargt. */ +extern const char * const *mdoc_argnames; + +/* Names of macros. Index is enum mant. */ +extern const char * const *man_macronames; + + +__BEGIN_DECLS + struct mparse; struct mdoc; struct man; -__BEGIN_DECLS - void mparse_free(struct mparse *); void mparse_reset(struct mparse *); struct mparse *mparse_alloc(enum mparset, @@ -360,6 +800,11 @@ void *mandoc_realloc(void *, size_t); #define DELIMSZ 6 /* hint: max possible size of a delimiter */ enum mdelim mandoc_isdelim(const char *); + +const struct man_node *man_node(const struct man *); +const struct man_meta *man_meta(const struct man *); +const struct mdoc_node *mdoc_node(const struct mdoc *); +const struct mdoc_meta *mdoc_meta(const struct mdoc *); __END_DECLS Index: mdoc.3 =================================================================== RCS file: mdoc.3 diff -N mdoc.3 --- mdoc.3 9 Feb 2011 09:18:15 -0000 1.57 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,359 +0,0 @@ -.\" $Id: mdoc.3,v 1.57 2011/02/09 09:18:15 kristaps Exp $ -.\" -.\" Copyright (c) 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv> -.\" Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org> -.\" -.\" Permission to use, copy, modify, and distribute this software for any -.\" purpose with or without fee is hereby granted, provided that the above -.\" copyright notice and this permission notice appear in all copies. -.\" -.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES -.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF -.\" MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR -.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES -.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN -.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF -.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. -.\" -.Dd $Mdocdate: February 9 2011 $ -.Dt MDOC 3 -.Os -.Sh NAME -.Nm mdoc , -.Nm mdoc_addeqn , -.Nm mdoc_addspan , -.Nm mdoc_alloc , -.Nm mdoc_endparse , -.Nm mdoc_free , -.Nm mdoc_meta , -.Nm mdoc_node , -.Nm mdoc_parseln , -.Nm mdoc_reset -.Nd mdoc macro compiler library -.Sh SYNOPSIS -.In mandoc.h -.In mdoc.h -.Vt extern const char * const * mdoc_macronames; -.Vt extern const char * const * mdoc_argnames; -.Ft int -.Fo mdoc_addeqn -.Fa "struct mdoc *mdoc" -.Fa "const struct eqn *eqn" -.Fc -.Ft int -.Fo mdoc_addspan -.Fa "struct mdoc *mdoc" -.Fa "const struct tbl_span *span" -.Fc -.Ft "struct mdoc *" -.Fo mdoc_alloc -.Fa "struct regset *regs" -.Fa "void *data" -.Fa "mandocmsg msgs" -.Fc -.Ft int -.Fn mdoc_endparse "struct mdoc *mdoc" -.Ft void -.Fn mdoc_free "struct mdoc *mdoc" -.Ft "const struct mdoc_meta *" -.Fn mdoc_meta "const struct mdoc *mdoc" -.Ft "const struct mdoc_node *" -.Fn mdoc_node "const struct mdoc *mdoc" -.Ft int -.Fo mdoc_parseln -.Fa "struct mdoc *mdoc" -.Fa "int line" -.Fa "char *buf" -.Fc -.Ft int -.Fn mdoc_reset "struct mdoc *mdoc" -.Sh DESCRIPTION -The -.Nm mdoc -library parses lines of -.Xr mdoc 7 -input -into an abstract syntax tree (AST). -.Pp -In general, applications initiate a parsing sequence with -.Fn mdoc_alloc , -parse each line in a document with -.Fn mdoc_parseln , -close the parsing session with -.Fn mdoc_endparse , -operate over the syntax tree returned by -.Fn mdoc_node -and -.Fn mdoc_meta , -then free all allocated memory with -.Fn mdoc_free . -The -.Fn mdoc_reset -function may be used in order to reset the parser for another input -sequence. 
-.Ss Types -.Bl -ohang -.It Vt struct mdoc -An opaque type. -Its values are only used privately within the library. -.It Vt struct mdoc_node -A parsed node. -See -.Sx Abstract Syntax Tree -for details. -.El -.Ss Functions -If -.Fn mdoc_addeqn , -.Fn mdoc_addspan , -.Fn mdoc_parseln , -or -.Fn mdoc_endparse -return 0, calls to any function but -.Fn mdoc_reset -or -.Fn mdoc_free -will raise an assertion. -.Bl -ohang -.It Fn mdoc_addeqn -Add an equation to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn mdoc_addspan -Add a table span to the parsing stream. -Returns 0 on failure, 1 on success. -.It Fn mdoc_alloc -Allocates a parsing structure. -The -.Fa data -pointer is passed to -.Fa msgs . -Always returns a valid pointer. -The pointer must be freed with -.Fn mdoc_free . -.It Fn mdoc_reset -Reset the parser for another parse routine. -After its use, -.Fn mdoc_parseln -behaves as if invoked for the first time. -If it returns 0, memory could not be allocated. -.It Fn mdoc_free -Free all resources of a parser. -The pointer is no longer valid after invocation. -.It Fn mdoc_parseln -Parse a nil-terminated line of input. -This line should not contain the trailing newline. -Returns 0 on failure, 1 on success. -The input buffer -.Fa buf -is modified by this function. -.It Fn mdoc_endparse -Signals that the parse is complete. -Returns 0 on failure, 1 on success. -.It Fn mdoc_node -Returns the first node of the parse. -.It Fn mdoc_meta -Returns the document's parsed meta-data. -.El -.Ss Variables -.Bl -ohang -.It Va mdoc_macronames -An array of string-ified token names. -.It Va mdoc_argnames -An array of string-ified token argument names. -.El -.Ss Abstract Syntax Tree -The -.Nm -functions produce an abstract syntax tree (AST) describing input in a -regular form. -It may be reviewed at any time with -.Fn mdoc_nodes ; -however, if called before -.Fn mdoc_endparse , -or after -.Fn mdoc_endparse -or -.Fn mdoc_parseln -fail, it may be incomplete. 
-.Pp -This AST is governed by the ontological -rules dictated in -.Xr mdoc 7 -and derives its terminology accordingly. -.Qq In-line -elements described in -.Xr mdoc 7 -are described simply as -.Qq elements . -.Pp -The AST is composed of -.Vt struct mdoc_node -nodes with block, head, body, element, root and text types as declared -by the -.Va type -field. -Each node also provides its parse point (the -.Va line , -.Va sec , -and -.Va pos -fields), its position in the tree (the -.Va parent , -.Va child , -.Va nchild , -.Va next -and -.Va prev -fields) and some type-specific data, in particular, for nodes generated -from macros, the generating macro in the -.Va tok -field. -.Pp -The tree itself is arranged according to the following normal form, -where capitalised non-terminals represent nodes. -.Pp -.Bl -tag -width "ELEMENTXX" -compact -.It ROOT -\(<- mnode+ -.It mnode -\(<- BLOCK | ELEMENT | TEXT -.It BLOCK -\(<- HEAD [TEXT] (BODY [TEXT])+ [TAIL [TEXT]] -.It ELEMENT -\(<- TEXT* -.It HEAD -\(<- mnode* -.It BODY -\(<- mnode* [ENDBODY mnode*] -.It TAIL -\(<- mnode* -.It TEXT -\(<- [[:printable:],0x1e]* -.El -.Pp -Of note are the TEXT nodes following the HEAD, BODY and TAIL nodes of -the BLOCK production: these refer to punctuation marks. -Furthermore, although a TEXT node will generally have a non-zero-length -string, in the specific case of -.Sq \&.Bd \-literal , -an empty line will produce a zero-length string. -Multiple body parts are only found in invocations of -.Sq \&Bl \-column , -where a new body introduces a new phrase. -.Ss Badly-nested Blocks -The ENDBODY node is available to end the formatting associated -with a given block before the physical end of that block. -It has a non-null -.Va end -field, is of the BODY -.Va type , -has the same -.Va tok -as the BLOCK it is ending, and has a -.Va pending -field pointing to that BLOCK's BODY node. -It is an indirect child of that BODY node -and has no children of its own. 
-.Pp -An ENDBODY node is generated when a block ends while one of its child -blocks is still open, like in the following example: -.Bd -literal -offset indent -\&.Ao ao -\&.Bo bo ac -\&.Ac bc -\&.Bc end -.Ed -.Pp -This example results in the following block structure: -.Bd -literal -offset indent -BLOCK Ao - HEAD Ao - BODY Ao - TEXT ao - BLOCK Bo, pending -> Ao - HEAD Bo - BODY Bo - TEXT bo - TEXT ac - ENDBODY Ao, pending -> Ao - TEXT bc -TEXT end -.Ed -.Pp -Here, the formatting of the -.Sq \&Ao -block extends from TEXT ao to TEXT ac, -while the formatting of the -.Sq \&Bo -block extends from TEXT bo to TEXT bc. -It renders as follows in -.Fl T Ns Cm ascii -mode: -.Pp -.Dl <ao [bo ac> bc] end -.Pp -Support for badly-nested blocks is only provided for backward -compatibility with some older -.Xr mdoc 7 -implementations. -Using badly-nested blocks is -.Em strongly discouraged : -the -.Fl T Ns Cm html -and -.Fl T Ns Cm xhtml -front-ends are unable to render them in any meaningful way. -Furthermore, behaviour when encountering badly-nested blocks is not -consistent across troff implementations, especially when using multiple -levels of badly-nested blocks. -.Sh EXAMPLES -The following example reads lines from stdin and parses them, operating -on the finished parse tree with -.Fn parsed . -This example does not error-check nor free memory upon failure. -.Bd -literal -offset indent -struct regset regs; -struct mdoc *mdoc; -const struct mdoc_node *node; -char *buf; -size_t len; -int line; - -bzero(®s, sizeof(struct regset)); -line = 1; -mdoc = mdoc_alloc(®s, NULL, NULL); -buf = NULL; -alloc_len = 0; - -while ((len = getline(&buf, &alloc_len, stdin)) >= 0) { - if (len && buflen[len - 1] = '\en') - buf[len - 1] = '\e0'; - if ( ! mdoc_parseln(mdoc, line, buf)) - errx(1, "mdoc_parseln"); - line++; -} - -if ( ! 
mdoc_endparse(mdoc)) - errx(1, "mdoc_endparse"); -if (NULL == (node = mdoc_node(mdoc))) - errx(1, "mdoc_node"); - -parsed(mdoc, node); -mdoc_free(mdoc); -.Ed -.Pp -To compile this, execute -.Pp -.Dl % cc main.c libmdoc.a libmandoc.a -.Pp -where -.Pa main.c -is the example file. -.Sh SEE ALSO -.Xr mandoc 1 , -.Xr mdoc 7 -.Sh AUTHORS -The -.Nm -library was written by -.An Kristaps Dzonsons Aq kristaps@bsd.lv . Index: mdoc.h =================================================================== RCS file: mdoc.h diff -N mdoc.h --- mdoc.h 20 Mar 2011 16:02:05 -0000 1.119 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,440 +0,0 @@ -/* $Id: mdoc.h,v 1.119 2011/03/20 16:02:05 kristaps Exp $ */ -/* - * Copyright (c) 2008, 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv> - * - * Permission to use, copy, modify, and distribute this software for any - * purpose with or without fee is hereby granted, provided that the above - * copyright notice and this permission notice appear in all copies. - * - * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES - * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF - * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR - * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES - * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN - * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF - * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. - */ -#ifndef MDOC_H -#define MDOC_H - -/* - * What follows is a list of ALL possible macros. 
- */ -enum mdoct { - MDOC_Ap = 0, - MDOC_Dd, - MDOC_Dt, - MDOC_Os, - MDOC_Sh, - MDOC_Ss, - MDOC_Pp, - MDOC_D1, - MDOC_Dl, - MDOC_Bd, - MDOC_Ed, - MDOC_Bl, - MDOC_El, - MDOC_It, - MDOC_Ad, - MDOC_An, - MDOC_Ar, - MDOC_Cd, - MDOC_Cm, - MDOC_Dv, - MDOC_Er, - MDOC_Ev, - MDOC_Ex, - MDOC_Fa, - MDOC_Fd, - MDOC_Fl, - MDOC_Fn, - MDOC_Ft, - MDOC_Ic, - MDOC_In, - MDOC_Li, - MDOC_Nd, - MDOC_Nm, - MDOC_Op, - MDOC_Ot, - MDOC_Pa, - MDOC_Rv, - MDOC_St, - MDOC_Va, - MDOC_Vt, - MDOC_Xr, - MDOC__A, - MDOC__B, - MDOC__D, - MDOC__I, - MDOC__J, - MDOC__N, - MDOC__O, - MDOC__P, - MDOC__R, - MDOC__T, - MDOC__V, - MDOC_Ac, - MDOC_Ao, - MDOC_Aq, - MDOC_At, - MDOC_Bc, - MDOC_Bf, - MDOC_Bo, - MDOC_Bq, - MDOC_Bsx, - MDOC_Bx, - MDOC_Db, - MDOC_Dc, - MDOC_Do, - MDOC_Dq, - MDOC_Ec, - MDOC_Ef, - MDOC_Em, - MDOC_Eo, - MDOC_Fx, - MDOC_Ms, - MDOC_No, - MDOC_Ns, - MDOC_Nx, - MDOC_Ox, - MDOC_Pc, - MDOC_Pf, - MDOC_Po, - MDOC_Pq, - MDOC_Qc, - MDOC_Ql, - MDOC_Qo, - MDOC_Qq, - MDOC_Re, - MDOC_Rs, - MDOC_Sc, - MDOC_So, - MDOC_Sq, - MDOC_Sm, - MDOC_Sx, - MDOC_Sy, - MDOC_Tn, - MDOC_Ux, - MDOC_Xc, - MDOC_Xo, - MDOC_Fo, - MDOC_Fc, - MDOC_Oo, - MDOC_Oc, - MDOC_Bk, - MDOC_Ek, - MDOC_Bt, - MDOC_Hf, - MDOC_Fr, - MDOC_Ud, - MDOC_Lb, - MDOC_Lp, - MDOC_Lk, - MDOC_Mt, - MDOC_Brq, - MDOC_Bro, - MDOC_Brc, - MDOC__C, - MDOC_Es, - MDOC_En, - MDOC_Dx, - MDOC__Q, - MDOC_br, - MDOC_sp, - MDOC__U, - MDOC_Ta, - MDOC_MAX -}; - -/* - * What follows is a list of ALL possible macro arguments. - */ -enum mdocargt { - MDOC_Split, - MDOC_Nosplit, - MDOC_Ragged, - MDOC_Unfilled, - MDOC_Literal, - MDOC_File, - MDOC_Offset, - MDOC_Bullet, - MDOC_Dash, - MDOC_Hyphen, - MDOC_Item, - MDOC_Enum, - MDOC_Tag, - MDOC_Diag, - MDOC_Hang, - MDOC_Ohang, - MDOC_Inset, - MDOC_Column, - MDOC_Width, - MDOC_Compact, - MDOC_Std, - MDOC_Filled, - MDOC_Words, - MDOC_Emphasis, - MDOC_Symbolic, - MDOC_Nested, - MDOC_Centred, - MDOC_ARG_MAX -}; - -/* - * Type of a syntax node. 
- */ -enum mdoc_type { - MDOC_TEXT, - MDOC_ELEM, - MDOC_HEAD, - MDOC_TAIL, - MDOC_BODY, - MDOC_BLOCK, - MDOC_TBL, - MDOC_EQN, - MDOC_ROOT -}; - -/* - * Section (named/unnamed) of `Sh'. Note that these appear in the - * conventional order imposed by mdoc.7. - */ -enum mdoc_sec { - SEC_NONE = 0, /* No section, yet. */ - SEC_NAME, - SEC_LIBRARY, - SEC_SYNOPSIS, - SEC_DESCRIPTION, - SEC_IMPLEMENTATION, - SEC_RETURN_VALUES, - SEC_ENVIRONMENT, - SEC_FILES, - SEC_EXIT_STATUS, - SEC_EXAMPLES, - SEC_DIAGNOSTICS, - SEC_COMPATIBILITY, - SEC_ERRORS, - SEC_SEE_ALSO, - SEC_STANDARDS, - SEC_HISTORY, - SEC_AUTHORS, - SEC_CAVEATS, - SEC_BUGS, - SEC_SECURITY, - SEC_CUSTOM, /* User-defined. */ - SEC__MAX -}; - -/* - * Information from prologue. - */ -struct mdoc_meta { - char *msec; /* `Dt' section (1, 3p, etc.) */ - char *vol; /* `Dt' volume (implied) */ - char *arch; /* `Dt' arch (i386, etc.) */ - char *date; /* `Dd' normalised date */ - char *title; /* `Dt' title (FOO, etc.) */ - char *os; /* `Os' system (OpenBSD, etc.) */ - char *name; /* leading `Nm' name */ -}; - -/* - * An argument to a macro (multiple values = `-column xxx yyy'). - */ -struct mdoc_argv { - enum mdocargt arg; /* type of argument */ - int line; - int pos; - size_t sz; /* elements in "value" */ - char **value; /* argument strings */ -}; - -/* - * Reference-counted macro arguments. These are refcounted because - * blocks have multiple instances of the same arguments spread across - * the HEAD, BODY, TAIL, and BLOCK node types. - */ -struct mdoc_arg { - size_t argc; - struct mdoc_argv *argv; - unsigned int refcnt; -}; - -/* - * Indicates that a BODY's formatting has ended, but the scope is still - * open. Used for syntax-broken blocks. - */ -enum mdoc_endbody { - ENDBODY_NOT = 0, - ENDBODY_SPACE, /* is broken: append a space */ - ENDBODY_NOSPACE /* is broken: don't append a space */ -}; - -/* - * Normalised `Bl' list type. 
- */ -enum mdoc_list { - LIST__NONE = 0, - LIST_bullet, - LIST_column, - LIST_dash, - LIST_diag, - LIST_enum, - LIST_hang, - LIST_hyphen, - LIST_inset, - LIST_item, - LIST_ohang, - LIST_tag, - LIST_MAX -}; - -/* - * Normalised `Bd' display type. - */ -enum mdoc_disp { - DISP__NONE = 0, - DISP_centred, - DISP_ragged, - DISP_unfilled, - DISP_filled, - DISP_literal -}; - -/* - * Normalised `An' splitting argument. - */ -enum mdoc_auth { - AUTH__NONE = 0, - AUTH_split, - AUTH_nosplit -}; - -/* - * Normalised `Bf' font type. - */ -enum mdoc_font { - FONT__NONE = 0, - FONT_Em, - FONT_Li, - FONT_Sy -}; - -/* - * Normalised arguments for `Bd'. - */ -struct mdoc_bd { - const char *offs; /* -offset */ - enum mdoc_disp type; /* -ragged, etc. */ - int comp; /* -compact */ -}; - -/* - * Normalised arguments for `Bl'. - */ -struct mdoc_bl { - const char *width; /* -width */ - const char *offs; /* -offset */ - enum mdoc_list type; /* -tag, -enum, etc. */ - int comp; /* -compact */ - size_t ncols; /* -column arg count */ - const char **cols; /* -column val ptr */ -}; - -/* - * Normalised arguments for `Bf'. - */ -struct mdoc_bf { - enum mdoc_font font; /* font */ -}; - -/* - * Normalised arguments for `An'. - */ -struct mdoc_an { - enum mdoc_auth auth; /* -split, etc. */ -}; - -struct mdoc_rs { - int quote_T; /* whether to quote %T */ -}; - -/* - * Consists of normalised node arguments. These should be used instead - * of iterating through the mdoc_arg pointers of a node: defaults are - * provided, etc. - */ -union mdoc_data { - struct mdoc_an An; - struct mdoc_bd Bd; - struct mdoc_bf Bf; - struct mdoc_bl Bl; - struct mdoc_rs Rs; -}; - -/* - * Single node in tree-linked AST. 
- */ -struct mdoc_node { - struct mdoc_node *parent; /* parent AST node */ - struct mdoc_node *child; /* first child AST node */ - struct mdoc_node *last; /* last child AST node */ - struct mdoc_node *next; /* sibling AST node */ - struct mdoc_node *prev; /* prior sibling AST node */ - int nchild; /* number children */ - int line; /* parse line */ - int pos; /* parse column */ - enum mdoct tok; /* tok or MDOC__MAX if none */ - int flags; -#define MDOC_VALID (1 << 0) /* has been validated */ -#define MDOC_EOS (1 << 2) /* at sentence boundary */ -#define MDOC_LINE (1 << 3) /* first macro/text on line */ -#define MDOC_SYNPRETTY (1 << 4) /* SYNOPSIS-style formatting */ -#define MDOC_ENDED (1 << 5) /* rendering has been ended */ - enum mdoc_type type; /* AST node type */ - enum mdoc_sec sec; /* current named section */ - union mdoc_data *norm; /* normalised args */ - /* FIXME: these can be union'd to shave a few bytes. */ - struct mdoc_arg *args; /* BLOCK/ELEM */ - struct mdoc_node *pending; /* BLOCK */ - struct mdoc_node *head; /* BLOCK */ - struct mdoc_node *body; /* BLOCK */ - struct mdoc_node *tail; /* BLOCK */ - char *string; /* TEXT */ - const struct tbl_span *span; /* TBL */ - const struct eqn *eqn; /* EQN */ - enum mdoc_endbody end; /* BODY */ -}; - -/* - * Names of macros. Index is enum mdoct. Indexing into this returns - * the normalised name, e.g., mdoc_macronames[MDOC_Sh] -> "Sh". - */ -extern const char *const *mdoc_macronames; - -/* - * Names of macro args. Index is enum mdocargt. Indexing into this - * returns the normalised name, e.g., mdoc_argnames[MDOC_File] -> - * "file". 
- */ -extern const char *const *mdoc_argnames; - -__BEGIN_DECLS - -struct mdoc; - -void mdoc_free(struct mdoc *); -struct mdoc *mdoc_alloc(struct regset *, struct mparse *); -void mdoc_reset(struct mdoc *); -int mdoc_parseln(struct mdoc *, int, char *, int); -const struct mdoc_node *mdoc_node(const struct mdoc *); -const struct mdoc_meta *mdoc_meta(const struct mdoc *); -int mdoc_endparse(struct mdoc *); -int mdoc_addspan(struct mdoc *, - const struct tbl_span *); -int mdoc_addeqn(struct mdoc *, - const struct eqn *); - -__END_DECLS - -#endif /*!MDOC_H*/ Index: mdoc_html.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_html.c,v retrieving revision 1.154 diff -u -r1.154 mdoc_html.c --- mdoc_html.c 7 Mar 2011 01:35:51 -0000 1.154 +++ mdoc_html.c 21 Mar 2011 17:53:03 -0000 @@ -30,7 +30,6 @@ #include "mandoc.h" #include "out.h" #include "html.h" -#include "mdoc.h" #include "main.h" #define INDENT 5 Index: mdoc_term.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mdoc_term.c,v retrieving revision 1.220 diff -u -r1.220 mdoc_term.c --- mdoc_term.c 7 Mar 2011 01:35:51 -0000 1.220 +++ mdoc_term.c 21 Mar 2011 17:53:03 -0000 @@ -31,8 +31,6 @@ #include "mandoc.h" #include "out.h" #include "term.h" -#include "mdoc.h" -#include "chars.h" #include "main.h" #define INDENT 5 Index: out.h =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/out.h,v retrieving revision 1.17 diff -u -r1.17 out.h --- out.h 7 Mar 2011 01:35:51 -0000 1.17 +++ out.h 21 Mar 2011 17:53:03 -0000 @@ -17,8 +17,6 @@ #ifndef OUT_H #define OUT_H -__BEGIN_DECLS - struct roffcol { size_t width; /* width of cell */ size_t decimal; /* decimal position in cell */ @@ -79,10 +77,24 @@ (p)->scale = (v); } \ while (/* CONSTCOND */ 0) -int a2roffsu(const char *, struct roffsu *, enum roffscale); -int a2roffdeco(enum 
roffdeco *, const char **, size_t *); -void time2a(time_t, char *, size_t); -void tblcalc(struct rofftbl *tbl, const struct tbl_span *); +enum chars { + CHARS_ASCII, + CHARS_HTML +}; + +__BEGIN_DECLS + +int a2roffsu(const char *, struct roffsu *, enum roffscale); +int a2roffdeco(enum roffdeco *, const char **, size_t *); +void time2a(time_t, char *, size_t); +void tblcalc(struct rofftbl *tbl, const struct tbl_span *); +void *chars_init(enum chars); +const char *chars_num2char(const char *, size_t); +const char *chars_spec2str(void *, const char *, size_t, size_t *); +int chars_spec2cp(void *, const char *, size_t); +const char *chars_res2str(void *, const char *, size_t, size_t *); +int chars_res2cp(void *, const char *, size_t); +void chars_free(void *); __END_DECLS Index: read.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/read.c,v retrieving revision 1.4 diff -u -r1.4 read.c --- read.c 20 Mar 2011 16:05:21 -0000 1.4 +++ read.c 21 Mar 2011 17:53:03 -0000 @@ -29,9 +29,6 @@ #include "mandoc.h" #include "libmandoc.h" -#include "mdoc.h" -#include "man.h" -#include "roff.h" #ifndef MAP_FILE #define MAP_FILE 0 Index: roff.3 =================================================================== RCS file: roff.3 diff -N roff.3 --- roff.3 1 Jan 2011 16:18:39 -0000 1.10 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,177 +0,0 @@ -.\" $Id: roff.3,v 1.10 2011/01/01 16:18:39 kristaps Exp $ -.\" -.\" Copyright (c) 2010 Kristaps Dzonsons <kristaps@bsd.lv> -.\" -.\" Permission to use, copy, modify, and distribute this software for any -.\" purpose with or without fee is hereby granted, provided that the above -.\" copyright notice and this permission notice appear in all copies. -.\" -.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES -.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF -.\" MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR -.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES -.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN -.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF -.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. -.\" -.Dd $Mdocdate: January 1 2011 $ -.Dt ROFF 3 -.Os -.Sh NAME -.Nm roff , -.Nm roff_alloc , -.Nm roff_endparse , -.Nm roff_free , -.Nm roff_parseln , -.Nm roff_reset , -.Nm roff_span -.Nd roff macro compiler library -.Sh SYNOPSIS -.In mandoc.h -.In roff.h -.Ft "struct roff *" -.Fo roff_alloc -.Fa "struct regset *regs" -.Fa "void *data" -.Fa "mandocmsg msgs" -.Fc -.Ft void -.Fn roff_endparse "struct roff *roff" -.Ft void -.Fn roff_free "struct roff *roff" -.Ft "enum rofferr" -.Fo roff_parseln -.Fa "struct roff *roff" -.Fa "int line" -.Fa "char **bufp" -.Fa "size_t *bufsz" -.Fa "int pos" -.Fa "int *offs" -.Fc -.Ft void -.Fn roff_reset "struct roff *roff" -.Ft "const struct tbl_span *" -.Fn roff_span "const struct roff *roff" -.Sh DESCRIPTION -The -.Nm -library processes lines of -.Xr roff 7 -input. -.Pp -In general, applications initiate a parsing sequence with -.Fn roff_alloc , -parse each line in a document with -.Fn roff_parseln , -close the parsing session with -.Fn roff_endparse , -and finally free all allocated memory with -.Fn roff_free . -The -.Fn roff_reset -function may be used in order to reset the parser for another input -sequence. -.Pp -The -.Fn roff_parseln -function should be invoked before passing a line into the -.Xr mdoc 3 -or -.Xr man 3 -libraries. -.Pp -See the -.Sx EXAMPLES -section for a full example. -.Sh REFERENCE -This section further defines the -.Sx Types -and -.Sx Functions -available to programmers. -.Ss Types -Functions (see -.Sx Functions ) -may use the following types: -.Bl -ohang -.It Vt "enum rofferr" -Instructions for further processing to the caller of -.Fn roff_parseln . 
-.It Vt struct roff -An opaque type defined in -.Pa roff.c . -Its values are only used privately within the library. -.It Vt mandocmsg -A function callback type defined in -.Pa mandoc.h . -.El -.Ss Functions -Function descriptions follow: -.Bl -ohang -.It Fn roff_alloc -Allocates a parsing structure. -The -.Fa data -pointer is passed to -.Fa msgs . -Returns NULL on failure. -If non-NULL, the pointer must be freed with -.Fn roff_free . -.It Fn roff_reset -Reset the parser for another parse routine. -After its use, -.Fn roff_parseln -behaves as if invoked for the first time. -.It Fn roff_free -Free all resources of a parser. -The pointer is no longer valid after invocation. -.It Fn roff_parseln -Parse a nil-terminated line of input. -The character array -.Fa bufp -may be modified or reallocated within this function. -In the latter case, -.Fa bufsz -will be modified accordingly. -The -.Fa offs -pointer will be modified if the line start during subsequent processing -of the line is not at the zeroth index. -This line should not contain the trailing newline. -Returns 0 on failure, 1 on success. -.It Fn roff_endparse -Signals that the parse is complete. -.It Fn roff_span -If -.Fn roff_parseln -returned -.Va ROFF_TBL , -return the last parsed table row. -Returns NULL otherwise. -.El -.Sh EXAMPLES -See -.Pa main.c -in the source distribution for an example of usage. -.Sh SEE ALSO -.Xr mandoc 1 , -.Xr man 3 , -.Xr mdoc 3 , -.Xr roff 7 -.Sh AUTHORS -The -.Nm -library was written by -.An Kristaps Dzonsons Aq kristaps@bsd.lv . -.Sh BUGS -The implementation of user-defined strings needs improvement: -.Bl -dash -.It -String values are taken literally and are not interpreted. -.It -Parsing of quoted strings is incomplete. -.It -The stings are stored internally using a singly linked list, -which is fine for small numbers of strings, -but ineffient when handling many strings. 
-.El Index: roff.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/roff.c,v retrieving revision 1.128 diff -u -r1.128 roff.c --- roff.c 20 Mar 2011 16:02:05 -0000 1.128 +++ roff.c 21 Mar 2011 17:53:03 -0000 @@ -28,7 +28,6 @@ #include <stdio.h> #include "mandoc.h" -#include "roff.h" #include "libroff.h" #include "libmandoc.h" Index: roff.h =================================================================== RCS file: roff.h diff -N roff.h --- roff.h 20 Mar 2011 16:02:05 -0000 1.25 +++ /dev/null 1 Jan 1970 00:00:00 -0000 @@ -1,47 +0,0 @@ -/* $Id: roff.h,v 1.25 2011/03/20 16:02:05 kristaps Exp $ */ -/* - * Copyright (c) 2010 Kristaps Dzonsons <kristaps@bsd.lv> - * - * Permission to use, copy, modify, and distribute this software for any - * purpose with or without fee is hereby granted, provided that the above - * copyright notice and this permission notice appear in all copies. - * - * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES - * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF - * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR - * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES - * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN - * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF - * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. 
- */ -#ifndef ROFF_H -#define ROFF_H - -enum rofferr { - ROFF_CONT, /* continue processing line */ - ROFF_RERUN, /* re-run roff interpreter with offset */ - ROFF_APPEND, /* re-run main parser, appending next line */ - ROFF_REPARSE, /* re-run main parser on the result */ - ROFF_SO, /* include another file */ - ROFF_IGN, /* ignore current line */ - ROFF_TBL, /* a table row was successfully parsed */ - ROFF_EQN, /* an equation was successfully parsed */ - ROFF_ERR /* badness: puke and stop */ -}; - -__BEGIN_DECLS - -struct roff; - -void roff_free(struct roff *); -struct roff *roff_alloc(struct regset *, struct mparse *); -void roff_reset(struct roff *); -enum rofferr roff_parseln(struct roff *, int, - char **, size_t *, int, int *); -void roff_endparse(struct roff *); -const struct tbl_span *roff_span(const struct roff *); -const struct eqn *roff_eqn(const struct roff *); - -__END_DECLS - -#endif /*!ROFF_H*/ Index: tbl.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/tbl.c,v retrieving revision 1.23 diff -u -r1.23 tbl.c --- tbl.c 20 Mar 2011 16:02:05 -0000 1.23 +++ tbl.c 21 Mar 2011 17:53:03 -0000 @@ -22,7 +22,6 @@ #include <time.h> #include "mandoc.h" -#include "roff.h" #include "libmandoc.h" #include "libroff.h" Index: term.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/term.c,v retrieving revision 1.180 diff -u -r1.180 term.c --- term.c 17 Mar 2011 09:16:38 -0000 1.180 +++ term.c 21 Mar 2011 17:53:03 -0000 @@ -29,7 +29,6 @@ #include <string.h> #include "mandoc.h" -#include "chars.h" #include "out.h" #include "term.h" #include "main.h" Index: tree.c =================================================================== RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/tree.c,v retrieving revision 1.36 diff -u -r1.36 tree.c --- tree.c 9 Feb 2011 09:18:15 -0000 1.36 +++ tree.c 21 Mar 2011 17:53:03 -0000 @@ -24,8 +24,6 @@ #include <time.h> 
 #include "mandoc.h"
-#include "mdoc.h"
-#include "man.h"
 #include "main.h"
 
 static void print_mdoc(const struct mdoc_node *, int);
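The roff.h removed by the patch above documents, in its comments, what each rofferr code asks of the caller of roff_parseln(). As a reading aid only, here is a minimal sketch that copies that enum and maps each code to its comment text. The function name rofferr_action() is hypothetical and is not part of mandoc.

```c
#include <assert.h>

/* Copied from the roff.h deleted by the patch. */
enum rofferr {
	ROFF_CONT,	/* continue processing line */
	ROFF_RERUN,	/* re-run roff interpreter with offset */
	ROFF_APPEND,	/* re-run main parser, appending next line */
	ROFF_REPARSE,	/* re-run main parser on the result */
	ROFF_SO,	/* include another file */
	ROFF_IGN,	/* ignore current line */
	ROFF_TBL,	/* a table row was successfully parsed */
	ROFF_EQN,	/* an equation was successfully parsed */
	ROFF_ERR	/* badness: puke and stop */
};

/*
 * Hypothetical helper (not in mandoc): describe what a return
 * code asks of the caller, using the wording of the comments above.
 */
const char *
rofferr_action(enum rofferr e)
{
	assert(e >= ROFF_CONT && e <= ROFF_ERR);

	switch (e) {
	case ROFF_CONT:
		return "continue processing line";
	case ROFF_RERUN:
		return "re-run roff interpreter with offset";
	case ROFF_APPEND:
		return "re-run main parser, appending next line";
	case ROFF_REPARSE:
		return "re-run main parser on the result";
	case ROFF_SO:
		return "include another file";
	case ROFF_IGN:
		return "ignore current line";
	case ROFF_TBL:
		return "a table row was successfully parsed";
	case ROFF_EQN:
		return "an equation was successfully parsed";
	case ROFF_ERR:
		return "badness: puke and stop";
	}
	return "unknown";
}
```

A switch without a default case lets the compiler warn when a new code is added to the enum but not handled here.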
[parent not found: <20110321215744.GA16603@iris.usta.de>]
* Re: [PATCH] Massive restructuring into mandoc.h/libmandoc.a.
       [not found] ` <20110321215744.GA16603@iris.usta.de>
@ 2011-03-21 22:34   ` Kristaps Dzonsons
  2011-03-22  2:20     ` Ingo Schwarze
  0 siblings, 1 reply; 5+ messages in thread
From: Kristaps Dzonsons @ 2011-03-21 22:34 UTC
To: Ingo Schwarze; +Cc: tech

>> This considerably simplifies the spaghetti-mess of inclusions,
>
> Hmm, i'm looking at the details right now.
>
> I'm not sure yet, i have a suspicion this might be overdone;
> but maybe it is not.  Expect some real feedback shortly,
> hopefully tonight, or tomorrow night at the latest.

Ingo, keep me posted (I hope you don't mind my cross-posting back to
tech@, as this includes some motivations for the recent changes).

In terms of being overdone, consider:

(1) Collapsing libmdoc, libman, and libroff into libmandoc.

I'm motivated by the wrongness of the existing approach: maintaining
separable libmdoc, libman, and libroff compilers ignores the reality
that Real Manuals are always mixing them.  The more we work on
libroff, in particular, the more it's going to have to work with the
underlying compilers.

Consider, if you will, how to handle in-line equations without
calling into the EQN parser from libmdoc or libman
(".Qq $a+b$" comes to mind).  This will only get worse.

(2) Collapsing main.c parsing into read.c and thus libmandoc.

The notion of "libmdoc" and "libman" as standalone parsers is a
horrible lie.  A significant amount of parse complexity, such as
pasting together CPP-escaped lines and validating ASCIIness, occurred
in main.c.  Not even to mention (1), and the reality that mdoc and
man rarely occur on their own, making the roff_parseln() and
mdoc/man_parse dance part of the document syntax itself.

(3) Collapsing mdoc.h and man.h into libmandoc.h/mandoc.h.

Two things.  First off, see (2).  The parse() functions in both of
these headers were lies.  Second of all, and allowing for that, what
do we get by having both mdoc.h and man.h in terms of their type
definitions?  All this allows is for the two pairs, mdoc_XXXX.c and
man_XXXX.c, to have their own inclusions.  Big f. deal: everybody
else imports both anyway!  It's artificial to split them and,
although this isn't OpenBSD's problem, it puts an unnecessary burden
on distributing libmandoc.a (requiring mdoc.h and man.h hangers-on)
for use by other utilities.

That's pretty much all these patches accomplish.  The rest is the
Makefile being re-written, which I've wanted to do for ages.

Kristaps
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv
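Point (2) above mentions pasting together escaped lines and validating ASCIIness as work that happens before any mdoc or man parser sees the input. A minimal sketch of those two steps follows, assuming that "escaped" means a backslash immediately followed by a newline. The names paste_lines() and is_ascii_line() are hypothetical; this is not the code in read.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if every byte of s is 7-bit ASCII,
 * 0 otherwise.
 */
int
is_ascii_line(const char *s)
{
	assert(s != NULL);

	for (; *s != '\0'; s++)
		if ((unsigned char)*s > 127)
			return 0;
	return 1;
}

/*
 * Hypothetical helper: copy `in` into a new buffer, dropping every
 * backslash-newline pair so that continued lines are pasted together.
 * The caller frees the result.  Returns NULL if allocation fails.
 */
char *
paste_lines(const char *in)
{
	size_t	 i, o, len;
	char	*out;

	assert(in != NULL);

	len = strlen(in);
	if ((out = malloc(len + 1)) == NULL)
		return NULL;

	for (i = 0, o = 0; i < len; i++) {
		/* in[i + 1] is at worst the terminating NUL. */
		if (in[i] == '\\' && in[i + 1] == '\n') {
			i++;	/* skip the newline as well */
			continue;
		}
		out[o++] = in[i];
	}
	out[o] = '\0';
	return out;
}
```

Given the two input lines ".Nd a very long \" and "description", paste_lines() returns the single line ".Nd a very long description", which is what a macro parser would then receive. Doing this in one reader, rather than in every front-end, is the motivation the message gives for moving it into the library.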
* Re: [PATCH] Massive restructuring into mandoc.h/libmandoc.a.
  2011-03-21 22:34 ` Kristaps Dzonsons
@ 2011-03-22  2:20   ` Ingo Schwarze
  2011-03-22  9:26     ` Kristaps Dzonsons
  0 siblings, 1 reply; 5+ messages in thread
From: Ingo Schwarze @ 2011-03-22  2:20 UTC
To: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Mon, Mar 21, 2011 at 11:34:09PM +0100:

> Ingo, keep me posted (I hope you don't mind my cross-posting back to
> tech@, as this includes some motivations for the recent changes).

No, i don't mind, my mail wasn't private, it just hadn't enough
substance for posting it, but your reply has.

This is still not final feedback, i'm still trying to understand
the details of what is happening, so here i'm only trying to
describe my first impression of what might need to be looked at.

> In terms of being overdone, consider:
>
> (1) Collapsing libmdoc, libman, and libroff into libmandoc.
>
> I'm motivated by the wrongness of the existing approach: maintaining
> separable libmdoc, libman, and libroff compilers ignores the reality
> that Real Manuals are always mixing them.  The more we work on
> libroff, in particular, the more it's going to have to work with the
> underlying compilers.

Yes.  I agree with that.
I don't see much value in separable compilers.

What i'm wondering about is code layering, and how that is
expressed in headers.  So i'm talking about code organization,
and nomenclature, not about sub-libraries.

I think we have the following logical layers:

  1. The main program, parsing options and iterating files.
  2. The reader code, reading one document and dispatching to parsers.
  3. The high-level languages mdoc(7) and man(7).
  4. The low-level roff(7) language.
  5. The roff(7) plugins like tbl(7) and eqn(7).
  6. Specific output utilities.
  7. Generic output utilities.
  8. General purpose utilities.

Each higher level can use each lower level.

Each level can consist of one or more functional units,
maybe using the same lower level units,
maybe used by the same higher level units,
but not using each other.

Levels seem to be subdivided in this way:

  1. (main.c)
  2. (read.c)
  3. a) mdoc parser (mdoc*.c)
     b) man parser (man*.c)
     c) mdoc terminal renderer (mdoc_term.c)
     d) mdoc HTML renderer (mdoc_html.c)
     e) man terminal renderer (man_term.c)
     f) man HTML renderer (man_html.c)
     g) tree renderers (tree.c)
  4. (roff.c)
  5. a) tbl parser (tbl*.c)
     b) tbl terminal renderer (tbl_term.c)
     c) tbl HTML renderer (tbl_html.c)
     d) eqn parser (eqn.c)
  6. a) terminal output utilities (term*.c)
     b) HTML output utilities (html.c)
     c) tree output utilities (tree.c)
  7. (out.c, chars.c)
  8. (mandoc.c)

Each unit should define an interface that can be used at the
higher levels.  These interfaces should be defined in headers.
That doesn't mean they need to be packaged as libraries.
Besides, each unit can have an internal header, not for use
by higher levels (currently called lib*.h).

Here is an overview of the interfaces:

  3. acd)  mdoc.h, main.h (+ private libmdoc.h for a)
     bef)  man.h, main.h (+ private libman.h for b)
     g)    main.h
  4.       roff.h
           mandoc.h (registers)
  5. abc)  mandoc.h (tbl) (+ private libroff.h for a)
     d)    mandoc.h (eqn) (+ private libroff.h)
  6. a)    term.h, main.h
     b)    html.h, main.h
  7.       out.h, chars.h
  8.       libmandoc.h
           mandoc.h (errors)

So, the overall structure is not bad, except for two messy points:

 * main.h is a layering violation by its sheer existence;
   parts belong in mdoc.h, man.h, term.h, html.h.
 * mandoc.h is even worse;
   registers belong in roff.h;
   proper tbl.h and eqn.h would be cleaner;
   the rest is lowest layer, together with libmandoc.h

> Consider, if you will, how to handle in-line equations without
> calling into the EQN parser from libmdoc or libman
> (".Qq $a+b$" comes to mind).  This will only get worse.

No problem there, just use layer 5 from layer 3.

> (2) Collapsing main.c parsing into read.c and thus libmandoc.
>
> The notion of "libmdoc" and "libman" as standalone parsers is a
> horrible lie.  A significant amount of parse complexity, such as
> pasting together CPP-escaped lines and validating ASCIIness, occurred
> in main.c.  Not even to mention (1), and the reality that mdoc and
> man rarely occur on their own, making the roff_parseln() and
> mdoc/man_parse dance part of the document syntax itself.

Sure.  I tend to like the main.c / read.c split, whatever we call
the interface.  Probably mandoc.h is a good choice of name for
the level 2 interface, and the level 8 interface should just keep
the name libmandoc.h for now, even though having the top and bottom
layers form a [lib]*.h pair is a bit confusing when all the other
[lib]*.h pairs live on the same level.

> (3) Collapsing mdoc.h and man.h into libmandoc.h/mandoc.h.
>
> Two things.  First off, see (2).  The parse() functions in both of
> these headers were lies.  Second of all, and allowing for that, what
> do we get by having both mdoc.h and man.h in terms of their type
> definitions?  All this allows is for the two pairs, mdoc_XXXX.c and
> man_XXXX.c, to have their own inclusions.  Big f. deal: everybody
> else imports both anyway!  It's artificial to split them and,
> although this isn't OpenBSD's problem, it puts an unnecessary burden
> on distributing libmandoc.a (requiring mdoc.h and man.h hangers-on)
> for use by other utilities.

Hm.  I guess here is my gripe.  I still don't see the point of
clobbering everything into mandoc.h.  What's wrong with saying

  #include "man.h"
  #include "mdoc.h"
  #include "mandoc.h"

in a program using libmandoc, if it really uses both parsers
and the main reader?  It's neither better nor worse than just

  #include "mandoc.h"

which includes everything; the difference is just keeping code
organized by topic, keeping code for one topic in one file,
which is easier to read and maintain.  And potentially, not
having one header for everything also helps layering, doesn't it?

> That's pretty much all these patches accomplish.  The rest is the
> Makefile being re-written, which I've wanted to do for ages.

Well, i don't really worry about the Makefile, it is nice if
it becomes shorter, but it won't come down to OpenBSD shortness
any time soon.  ;-)

Good night for now,
  Ingo
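The layering rule stated in the message above (a unit may use any unit on a lower level, but units on the same level do not use each other) can be written down as a small check. This is a sketch only: the unit labels and the table are an illustrative simplification of the eight layers listed in the message, and unit_layer() and may_use() are hypothetical names, not mandoc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Layer numbers follow the message: 1 is the top, 8 is the bottom. */
struct unit {
	const char	*name;	/* illustrative label, not a file name */
	int		 layer;
};

static const struct unit units[] = {
	{ "main",   1 },	/* option parsing, file iteration */
	{ "read",   2 },	/* reader, dispatch to parsers */
	{ "mdoc",   3 },	/* high-level languages */
	{ "man",    3 },
	{ "roff",   4 },	/* low-level roff language */
	{ "tbl",    5 },	/* roff plugins */
	{ "eqn",    5 },
	{ "term",   6 },	/* specific output utilities */
	{ "html",   6 },
	{ "out",    7 },	/* generic output utilities */
	{ "mandoc", 8 }		/* general purpose utilities */
};

/* Return the layer of a unit, or -1 if the label is unknown. */
int
unit_layer(const char *name)
{
	size_t	 i;

	assert(name != NULL);

	for (i = 0; i < sizeof(units) / sizeof(units[0]); i++)
		if (strcmp(units[i].name, name) == 0)
			return units[i].layer;
	return -1;
}

/*
 * Return 1 if `from` may use `to` under the rule above: `to` must
 * sit on a strictly lower level (a larger number).  Units on the
 * same level, and unknown units, yield 0.
 */
int
may_use(const char *from, const char *to)
{
	int	 f, t;

	assert(from != NULL && to != NULL);

	f = unit_layer(from);
	t = unit_layer(to);
	if (f < 0 || t < 0)
		return 0;
	return f < t;
}
```

Under this table, may_use("mdoc", "eqn") is 1, which matches the "just use layer 5 from layer 3" answer to the in-line equation question, while may_use("mdoc", "man") and may_use("roff", "read") are both 0.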
* Re: [PATCH] Massive restructuring into mandoc.h/libmandoc.a.
  2011-03-22  2:20 ` Ingo Schwarze
@ 2011-03-22  9:26   ` Kristaps Dzonsons
  0 siblings, 0 replies; 5+ messages in thread
From: Kristaps Dzonsons @ 2011-03-22  9:26 UTC
To: tech

> This is still not final feedback, i'm still trying to understand
> the details of what is happening, so here i'm only trying to
> describe my first impression of what might need to be looked at.

Ingo,

Thanks for your input!  It seems, after a close read, that our goals
are very similar.  I'm going to check in what I have and then shape it
to the points below.  Don't worry---I can always roll back to the .10
tag, as all of these changes are structural.  Let's massage this
in-tree.

> So, the overall structure is not bad, except for two messy points:
>
>  * main.h is a layering violation by its sheer existence;
>    parts belong in mdoc.h, man.h, term.h, html.h.
>  * mandoc.h is even worse;
>    registers belong in roff.h;
>    proper tbl.h and eqn.h would be cleaner;
>    the rest is lowest layer, together with libmandoc.h

I agree absolutely with the first.  It was originally created (by
Joerg?) for a good reason: for the output engine prototypes.  Prior to
that, if I recall, these prototypes were extern'd in main.c.

And yes, the registers will have to go from mandoc.h---I hadn't even
noticed that (they were in the original mandoc.h).  I also agree with a
proper tbl.h and eqn.h, but that can be done later.

Note the mdelim stuff will also go away once I do the right cueing
(after all this mumbo jumbo).

>> Two things.  First off, see (2).  The parse() functions in both of
>> these headers were lies.  Second of all, and allowing for that, what
>> do we get by having both mdoc.h and man.h in terms of their type
>> definitions?  All this allows is for the two pairs, mdoc_XXXX.c and
>> man_XXXX.c, to have their own inclusions.  Big f. deal: everybody
>> else imports both anyway!  It's artificial to split them and,
>> although this isn't OpenBSD's problem, it puts an unnecessary burden
>> on distributing libmandoc.a (requiring mdoc.h and man.h hangers-on)
>> for use by other utilities.
>
> Hm.  I guess here is my gripe.  I still don't see the point of
> clobbering everything into mandoc.h.  What's wrong with saying
>
>   #include "man.h"
>   #include "mdoc.h"
>   #include "mandoc.h"
>
> in a program using libmandoc, if it really uses both parsers
> and the main reader?  It's neither better nor worse than just
>
>   #include "mandoc.h"
>
> which includes everything; the difference is just keeping code
> organized by topic, keeping code for one topic in one file,
> which is easier to read and maintain.  And potentially, not
> having one header for everything also helps layering, doesn't it?

I'm still on the fence about this, but for the time being, I'll keep it
as two (mdoc.h and man.h) and we can think about it some more.  My
reasoning is simple: if all systems include mandoc/mdoc/man, why not
just call it for what it is, and put them together?  If we run into
layering problems, we can always split them out later.

> Well, i don't really worry about the Makefile, it is nice if
> it becomes shorter, but it won't come down to OpenBSD shortness
> any time soon.  ;-)

True.  But I bet, once you factor in <bsd.prog.mk>, mine's MUCH
shorter.  :) :)  The rest is the web-site.

Thanks again,

Kristaps