From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from scc-mailout-kit-02.scc.kit.edu (scc-mailout-kit-02.scc.kit.edu [129.13.231.82])
	by fantadrom.bsd.lv (OpenSMTPD) with ESMTP id 29e420ed
	for ; Tue, 2 Apr 2019 08:16:46 -0500 (EST)
Received: from asta-nat.asta.uni-karlsruhe.de ([172.22.63.82] helo=hekate.usta.de)
	by scc-mailout-kit-02.scc.kit.edu with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
	(envelope-from ) id 1hBJHP-0004s2-RF; Tue, 02 Apr 2019 15:16:45 +0200
Received: from donnerwolke.usta.de ([172.24.96.3])
	by hekate.usta.de with esmtp (Exim 4.77)
	(envelope-from ) id 1hBJHO-0005JA-Em; Tue, 02 Apr 2019 15:16:42 +0200
Received: from athene.usta.de ([172.24.96.10])
	by donnerwolke.usta.de with esmtp (Exim 4.84_2)
	(envelope-from ) id 1hBJHO-0006kz-C5; Tue, 02 Apr 2019 15:16:42 +0200
Received: from localhost (athene.usta.de [local])
	by athene.usta.de (OpenSMTPD) with ESMTPA id 79d65bd4;
	Tue, 2 Apr 2019 15:16:42 +0200 (CEST)
Date: Tue, 2 Apr 2019 15:16:42 +0200
From: Ingo Schwarze
To: Stephen Gregoratto
Cc: tech@mandoc.bsd.lv
Subject: Re: Parsing errors, output regressions with new XML parser
Message-ID: <20190402131642.GC6369@athene.usta.de>
References: <20190330001919.rrbc2xxrx47upalg@BlackBox>
X-Mailinglist: mandoc-tech
Reply-To: tech@mandoc.bsd.lv
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20190330001919.rrbc2xxrx47upalg@BlackBox>
User-Agent: Mutt/1.8.0 (2017-02-23)

Hi Stephen,

Stephen Gregoratto wrote on Sat, Mar 30, 2019 at 11:19:19AM +1100:

> - XML comments aren't ignored.

Most comments were already ignored.  However...

> This leads to documents like these[1]
> being formatted as one loooong section under NAME.
> [1] https://gitlab.gnome.org/GNOME/gtk/blob/master/docs/reference/gtk/css-overview.xml#L20

... you have a point here.  If a comment contains a greater-than
sign, that was mistaken for the end of the comment.
Fixed with the commit below.
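To illustrate the idea, here is a minimal standalone sketch of skipping
a comment by searching for the full "-->" terminator rather than the
first '>'.  It is not the code from parse.c: the names skip_comment()
and struct pos are hypothetical, and it assumes the whole comment is
already in one buffer, which the real parser cannot assume because it
reads the file in chunks.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical position tracker; not the struct parse from parse.c. */
struct pos {
	int	 line;	/* 1-based line number */
	int	 col;	/* 1-based column number */
};

/*
 * If an XML comment starts at buf[off], return the offset just past
 * its closing "-->", or len if the comment is unterminated.
 * Return off unchanged if no comment starts there.
 * Every skipped byte updates the line and column counters.
 */
static size_t
skip_comment(const char *buf, size_t len, size_t off, struct pos *ps)
{
	size_t	 end, i;

	if (off > len || len - off < 4 ||
	    memcmp(buf + off, "<!--", 4) != 0)
		return off;

	/* Look for the three-byte terminator, not for a lone '>'. */
	end = len;
	for (i = off + 4; i + 3 <= len; i++) {
		if (memcmp(buf + i, "-->", 3) == 0) {
			end = i + 3;
			break;
		}
	}

	/* Keep the position in sync with the bytes skipped. */
	for (i = off; i < end; i++) {
		if (buf[i] == '\n') {
			ps->line++;
			ps->col = 1;
		} else
			ps->col++;
	}
	return end;
}
```

Called on the 15-byte string "<!-- a > b -->x" with off set to 0, this
returns 14, the offset of the 'x', so the '>' inside the comment no
longer ends it.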

Thanks for the report,
  Ingo


Log Message:
-----------
skip XML comments even if they contain greater-than characters;
issue reported by Stephen Gregoratto

Modified Files:
--------------
    docbook2mdoc:
        parse.c

Revision Data
-------------
Index: parse.c
===================================================================
RCS file: /home/cvs/mdocml/docbook2mdoc/parse.c,v
retrieving revision 1.7
retrieving revision 1.8
diff -Lparse.c -Lparse.c -u -p -r1.7 -r1.8
--- parse.c
+++ parse.c
@@ -522,6 +522,7 @@ struct ptree *
 parse_file(struct parse *p, int fd, const char *fname)
 {
 	char		 b[4096];
+	char		*cp;
 	ssize_t		 rsz;	/* Return value from read(2). */
 	size_t		 rlen;	/* Number of bytes in b[]. */
 	size_t		 poff;	/* Parse offset in b[]. */
@@ -647,6 +648,29 @@ parse_file(struct parse *p, int fd, cons
 			if (advance(p, b, rlen, &pend, " >") &&
 			    rsz > 0)
 				break;
+			if (pend > poff + 3 &&
+			    strncmp(b + poff, "");
+				if (cp == NULL) {
+					if (rsz > 0) {
+						pend = rlen;
+						break;
+					}
+					cp = b + rlen;
+				} else
+					cp += 3;
+				while (b + pend < cp) {
+					if (b[++pend] == '\n') {
+						p->nline++;
+						p->ncol = 1;
+					} else
+						p->ncol++;
+				}
+				continue;
+			}
 			elem_end = 0;
 			if (b[pend] != '>')
 				in_tag = 1;
--
 To unsubscribe send an email to source+unsubscribe@mandoc.bsd.lv

--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv