tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
* Crash core dump when parsing krb5_openlog.3
@ 2023-10-13  7:06 Baptiste Daroussin
  2023-10-13 10:57 ` Ingo Schwarze
  0 siblings, 1 reply; 3+ messages in thread
From: Baptiste Daroussin @ 2023-10-13  7:06 UTC (permalink / raw)
  To: tech

Hello,

here is a report I got from FreeBSD and I can reproduce on latest CVS version of
mandoc:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266882

curl -sSf
https://people.freebsd.org/~wosch/tmp/mandoc/core-dumped/krb5_openlog.3 | mandoc
>/dev/null

I haven't had time yet to debug, do you actually need more information?

Best regards,
Bapt
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Crash core dump when parsing krb5_openlog.3
  2023-10-13  7:06 Crash core dump when parsing krb5_openlog.3 Baptiste Daroussin
@ 2023-10-13 10:57 ` Ingo Schwarze
  2023-10-21 17:54   ` Ingo Schwarze
  0 siblings, 1 reply; 3+ messages in thread
From: Ingo Schwarze @ 2023-10-13 10:57 UTC (permalink / raw)
  To: Baptiste Daroussin; +Cc: tech

Hi Bapt,


Baptiste Daroussin wrote on Fri, Oct 13, 2023 at 09:06:56AM +0200:

> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266882
> I haven't had time yet to debug, do you actually need more information?

No, the information in that report is perfect.
That's a very useful report, i replied to it here:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266882#c2

Pasting the reply here, too:

Actually, with mandoc-current, i get a different assertion, but in
the same function, and judging from what i know about the code,
i hold a strong suspicion that the root cause is still the same,
i.e. that the same invariant is still being violated in the code:

$ mandoc krb5_openlog.3 assertion "rval != ESCAPE_EXPAND" failed:
file "/usr/src/usr.bin/mandoc/roff_escape.c", line 46, function
"mandoc_escape" Abort trap (core dumped)

This is a very important bug report because this particular bug has
already been reported a few weeks ago, but the reporter was unable
to provide a reproducer.  I tried to construct a reproducer from
code inspection but unfortunately failed.  So having the reproducer
is very valuable.

Right now, we are in the middle of an OpenBSD release, so it will
take up to a week before i will find the time of looking into this.

Apparently, the bug is that some particular roff(7) escape sequence is
likely regarded as output-device-dependent by the roff(7) pre-parser
and hence not substituted but instead left for the formatters to
handle, similar to the \*(.T predefined string, but the formatters
regards that particular escape sequence as one that should have been
replaced by the pre-parser, hence dying because they cannot handle it.

Having a quick look at your reproducing input file, i suspect that
the following input confuses mandoc:

.\" ouch!
.ds xx \\*(fP\fR(\fP\\*(lI*\\*(fP

Notably, you already marked that line with "ouch" in the manual
page source code, presumably acknowledging that doing such low-level
gymnastics in a manual page is asking for trouble.  To not handle such
madness gracefully is still a bug in mandoc though, i do not deny that.

I suspect that using the extremely special escape sequence "\\"
inside .ds is confusing mandoc - that sequence is mostly intended
for being used inside .de.  The pre-parser still sees the "\\" and
consequently sees no user-defined string replacement escape sequences.
But when the "xx" string is later used, the "\\*" likely gets resolved
to "\*", i.e. to a string replacement request, but at the point, the
pre-parser has already been run so the string does not get replaced
and makes it through to the formatters, and those cannot handle it.
Or something like that, i will have to investigate in detail.

No wonder i was unable to construct a reproducer from scratch, given
how crazy the reproducer actually looks...

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Crash core dump when parsing krb5_openlog.3
  2023-10-13 10:57 ` Ingo Schwarze
@ 2023-10-21 17:54   ` Ingo Schwarze
  0 siblings, 0 replies; 3+ messages in thread
From: Ingo Schwarze @ 2023-10-21 17:54 UTC (permalink / raw)
  To: Baptiste Daroussin; +Cc: tech

Hi Bapt,

Ingo Schwarze wrote on Fri, Oct 13, 2023 at 12:57:37PM +0200:
> Baptiste Daroussin wrote on Fri, Oct 13, 2023 at 09:06:56AM +0200:

>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266882
>> I haven't had time yet to debug, do you actually need more information?

I believe this is fixed by the commit appended below.

I'm also going to update the FreeBSD bug report.

[...]
> Having a quick look at your reproducing input file, i suspect that
> the following input confuses mandoc:
> 
> .\" ouch!
> .ds xx \\*(fP\fR(\fP\\*(lI*\\*(fP
> 
> Notably, you already marked that line with "ouch" in the manual
> page source code, presumably acknowledging that doing such low-level
> gymnastics in a manual page is asking for trouble.  To not handle such
> madness gracefully is still a bug in mandoc though, i do not deny that.
> 
> I suspect that using the extremely special escape sequence "\\"
> inside .ds is confusing mandoc - that sequence is mostly intended
> for being used inside .de.  The pre-parser still sees the "\\" and
> consequently sees no user-defined string replacement escape sequences.
> But when the "xx" string is later used, the "\\*" likely gets resolved
> to "\*", i.e. to a string replacement request, but at the point, the
> pre-parser has already been run so the string does not get replaced
> and makes it through to the formatters, and those cannot handle it.
> Or something like that, i will have to investigate in detail.

That turned out to be true, as far as it goes.

Two aspects are missing from the explanation:

 1. Not only is the "xx" string later used, but it is used
    (1) with delayed expansion (i.e. \\*(xx rather than \*(xx)
    and (2) inside a macro argument.
 2. The macro argument parser already handled delayed expansion.
    What it failed to handle was recursive expansion where both
    the outer and the inner expansion were delayed.

To summarize, as far as i can see, the assertion only triggers
when a manual pages uses delayed expansion recursively inside
a macro argument.

Note that using expansion inside a manual page is already
bad style, even if it is neither recursive nor delayed nor
inside a macro argument...  :-(

Feedback whether the patch works for you is welcome.

Thanks again for the report!
  Ingo


Log Message:
-----------
When parsing a macro argument results in delayed escape sequence 
expansion, re-check for all contained escape sequences whether they 
need delayed expansion, not just for the particular escape sequences
that triggered delayed expansion in the first place.  This is needed
because delayed expansion can result in strings containing nested
escape sequences recursively needing delayed expansion, too.

This fixes an assertion failure in krb5_openlog(3), see:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=266882

Thanks to Wolfram Schneider <wosch at FreeBSD> for reporting the bug
and to Baptiste Daroussin <bapt at FreeBSD> for forwarding the report.

Modified Files:
--------------
    mandoc:
        mandoc.h
        roff.c

Revision Data
-------------
Index: mandoc.h
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc.h,v
retrieving revision 1.281
retrieving revision 1.282
diff -Lmandoc.h -Lmandoc.h -u -p -r1.281 -r1.282
--- mandoc.h
+++ mandoc.h
@@ -23,7 +23,6 @@
 #define ASCII_NBRZW	 30  /* non-breaking zero-width space */
 #define ASCII_BREAK	 29  /* breakable zero-width space */
 #define ASCII_HYPH	 28  /* breakable hyphen */
-#define ASCII_ESC	 27  /* escape sequence from copy-in processing */
 #define ASCII_TABREF	 26  /* reset tab reference position */
 
 /*
Index: roff.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff.c,v
retrieving revision 1.396
retrieving revision 1.397
diff -Lroff.c -Lroff.c -u -p -r1.396 -r1.397
--- roff.c
+++ roff.c
@@ -1387,7 +1387,7 @@ roff_expand(struct roff *r, struct buf *
 		 */
 
 		if (buf->buf[pos] != ec) {
-			if (ec != ASCII_ESC && buf->buf[pos] == '\\') {
+			if (buf->buf[pos] == '\\') {
 				roff_expand_patch(buf, pos, "\\e", pos + 1);
 				pos++;
 			}
@@ -1632,12 +1632,7 @@ roff_getarg(struct roff *r, char **cpp, 
 				cp++;
 				break;
 			case '\\':
-				/*
-				 * Signal to roff_expand() that an escape
-				 * sequence resulted from copy-in processing
-				 * and needs to be checked or interpolated.
-				 */
-				cp[-pairs] = ASCII_ESC;
+				cp[-pairs] = '\\';
 				newesc = 1;
 				pairs++;
 				cp++;
@@ -1694,7 +1689,7 @@ roff_getarg(struct roff *r, char **cpp, 
 	buf.buf = start;
 	buf.sz = strlen(start) + 1;
 	buf.next = NULL;
-	if (roff_expand(r, &buf, ln, 0, ASCII_ESC) & ROFF_IGN) {
+	if (roff_expand(r, &buf, ln, 0, '\\') & ROFF_IGN) {
 		free(buf.buf);
 		buf.buf = mandoc_strdup("");
 	}
--
 To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-10-21 17:54 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-13  7:06 Crash core dump when parsing krb5_openlog.3 Baptiste Daroussin
2023-10-13 10:57 ` Ingo Schwarze
2023-10-21 17:54   ` Ingo Schwarze

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).