From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Anthony J. Bentley"
To: tech@mandoc.bsd.lv
Subject: docbook2mdoc(1) sometimes mishandles  
X-Mailinglist: mandoc-tech
Date: Mon, 20 May 2019 02:34:19 -0600
Message-ID: <3491.1558341259@desktop.ajb.soy>

Hi,

>From fonts.xml:

Standard Type 1 fonts

docbook2mdoc turns this into:

.Pp
.Sy Standard Type\e1 fonts

It happens again near the end of the document:

The IETF RFC documents, available from a number of sites throughout
the world, often provide interesting information about character set
issues; see for example RFC 373.

becomes:

.Pp
The IETF RFC documents, available from a number of sites throughout
the world, often provide interesting information about character set
issues; see for example
.Lk https://datatracker.ietf.org/doc/rfc373/ "RFC\e373" .

-- 
Anthony J. Bentley
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv

From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 20 May 2019 22:38:52 +0200
From: Ingo Schwarze
To: "Anthony J. Bentley"
Cc: tech@mandoc.bsd.lv
Subject: Re: docbook2mdoc(1) sometimes mishandles  
Message-ID: <20190520203852.GA10196@athene.usta.de>
References: <3491.1558341259@desktop.ajb.soy>
In-Reply-To: <3491.1558341259@desktop.ajb.soy>
X-Mailinglist: mandoc-tech

Hi Anthony,

Anthony J. Bentley wrote on Mon, May 20, 2019 at 02:34:19AM -0600:

> From fonts.xml:
>
> Standard Type 1 fonts
>
> docbook2mdoc turns this into:
>
> .Pp
> .Sy Standard Type\e1 fonts
>
> It happens again near the end of the document:
>
> The IETF RFC documents, available from a number of sites throughout
> the world, often provide interesting information about character set
> issues; see for example url="https://datatracker.ietf.org/doc/rfc373/">RFC 373.
>
> becomes:
>
> .Pp
> The IETF RFC documents, available from a number of sites throughout
> the world, often provide interesting information about character set
> issues; see for example
> .Lk https://datatracker.ietf.org/doc/rfc373/ "RFC\e373" .

Thanks for reporting, fixed with the following commit.
Strangely, the file fonts.7 in the Xenocara tree is already correct...

Yours,
  Ingo


Log Message:
-----------
When rendering XML entities, skip escaping in macro_addarg().
Fixing a bug which bentley@ found in fonts(7).

Modified Files:
--------------
    docbook2mdoc:
        macro.c
        macro.h

Revision Data
-------------
Index: macro.h
===================================================================
RCS file: /home/cvs/mdocml/docbook2mdoc/macro.h,v
retrieving revision 1.7
retrieving revision 1.8
diff -Lmacro.h -Lmacro.h -u -p -r1.7 -r1.8
--- macro.h
+++ macro.h
@@ -44,10 +44,11 @@ struct format {
 	enum parastate parastate;
 };
 
-#define	ARG_SPACE	1  /* Insert whitespace before this argument. */
-#define	ARG_SINGLE	2  /* Quote argument if it contains whitespace. */
-#define	ARG_QUOTED	4  /* We are already in a quoted argument. */
-#define	ARG_UPPER	8  /* Covert argument to upper case. */
+#define	ARG_SPACE	(1 << 0)  /* Insert whitespace before this argument. */
+#define	ARG_SINGLE	(1 << 1)  /* Quote arg if it contains whitespace. */
+#define	ARG_QUOTED	(1 << 2)  /* We are already in a quoted argument. */
+#define	ARG_RAW		(1 << 3)  /* Skip macro and backslash escaping. */
+#define	ARG_UPPER	(1 << 4)  /* Convert argument to upper case. */
 
 void	 macro_open(struct format *, const char *);
Index: macro.c
===================================================================
RCS file: /home/cvs/mdocml/docbook2mdoc/macro.c,v
retrieving revision 1.20
retrieving revision 1.21
diff -Lmacro.c -Lmacro.c -u -p -r1.20 -r1.21
--- macro.c
+++ macro.c
@@ -130,6 +130,13 @@ macro_addarg(struct format *f, const cha
 			flags &= ~ ARG_SPACE;
 		}
 
+		/* For XML entities, skip escaping. */
+
+		if (flags & ARG_RAW) {
+			fputs(arg, stdout);
+			break;
+		}
+
 		/* Escape us if we look like a macro. */
 
 		if ((flags & (ARG_QUOTED | ARG_UPPER)) == 0 &&
@@ -186,10 +193,16 @@ macro_addnode(struct format *f, struct p
 	    TAILQ_NEXT(nc, child) == NULL)
 		n = nc;
 
-	if (n->node == NODE_TEXT || n->node == NODE_ESCAPE) {
+	switch (n->node) {
+	case NODE_ESCAPE:
+		flags |= ARG_RAW;
+		/* FALLTHROUGH */
+	case NODE_TEXT:
 		macro_addarg(f, n->b, flags);
 		f->parastate = PARA_MID;
 		return;
+	default:
+		break;
 	}
 
 	/*
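
For readers who want to see the effect of the new flag in isolation, here is a minimal sketch in C. It is not code from docbook2mdoc: demo_addarg() and DEMO_RAW are hypothetical names, the function writes into a caller-supplied buffer instead of stdout, and it models only backslash escaping, not whitespace collapsing, quoting, or macro-name escaping. It also assumes, purely for illustration, that the entity had already been translated to the roff non-breaking-space escape \~ before reaching the argument writer; the report above shows only the mangled \e result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag, modeled on the ARG_* bit-flag style in macro.h. */
#define DEMO_RAW	(1 << 0)	/* Copy the argument through unescaped. */

/*
 * Append arg to the NUL-terminated string already in out, which has
 * room for outsz bytes in total.  Unless DEMO_RAW is set, each
 * backslash is written as the roff escape "\e" so that it prints as
 * a literal backslash.  With DEMO_RAW, the bytes are copied verbatim,
 * which is what an already-translated entity needs.
 * Returns 0 on success, or -1 if the result would not fit.
 */
static int
demo_addarg(char *out, size_t outsz, const char *arg, int flags)
{
	size_t	 len;

	assert(out != NULL && arg != NULL && outsz > 0);
	len = strlen(out);

	for (; *arg != '\0'; arg++) {
		if (*arg == '\\' && (flags & DEMO_RAW) == 0) {
			/* Two output bytes plus the terminating NUL. */
			if (len + 2 >= outsz)
				return -1;
			out[len++] = '\\';
			out[len++] = 'e';
		} else {
			/* One output byte plus the terminating NUL. */
			if (len + 1 >= outsz)
				return -1;
			out[len++] = *arg;
		}
		out[len] = '\0';
	}
	return 0;
}
```

Called with flags 0 on the C string "Type\\~1", the sketch appends Type\e~1, so the escape is destroyed in the same way as in the report. Called with DEMO_RAW, it appends Type\~1 unchanged. That is the distinction the commit draws by setting ARG_RAW for NODE_ESCAPE nodes only, so ordinary NODE_TEXT content is still escaped.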