Date: Mon, 20 May 2024 09:00:47 -0500
From: "G. Branden Robinson"
X-MailFrom: g.branden.robinson@gmail.com
To: tuhs@tuhs.org
Message-ID: <20240520140047.4x4lwzs6wmo34uge@illithid>
References: <202405201314.44KDE7rq170661@freefriends.org>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha256;
 protocol="application/pgp-signature"; boundary="ykviugjk6aprnpep"
Content-Disposition: inline
In-Reply-To: <202405201314.44KDE7rq170661@freefriends.org>
CC: douglas.mcilroy@dartmouth.edu, groff@gnu.org
Subject: [TUHS] Re: A fuzzy awk.
 (Was: The 'usage: ...' message.)
List-Id: The Unix Heritage Society mailing list

--ykviugjk6aprnpep
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi folks,

At 2024-05-20T07:14:07-0600, arnold@skeeve.com wrote:
> Douglas McIlroy wrote:
> > I'm surprised by nonchalance about bad inputs evoking bad program
> > behavior. That attitude may have been excusable 50 years ago. By
> > now, though, we have seen so much malicious exploitation of open
> > avenues of "undefined behavior" that we can no longer ignore bugs
> > that "can't happen when using the tool correctly". Mature software
> > should not brook incorrect usage.
>
> It's not nonchalance, not at all!
>
> The current behavior is to die on the first syntax error, instead of
> trying to be "helpful" by continuing to try to parse the program in
> the hope of reporting other errors.
[...]
> The crashes came because errors cascaded. I don't see a reason to
> spend valuable, *personal* time on adding defenses *where they aren't
> needed*.
>
> A steel door on your bedroom closet does no good if your front door is
> made of balsa wood. My change was to stop the badness at the front
> door.
>
> > I commend attention to the LangSec movement, which advocates for
> > rigorously enforced separation between legal and illegal inputs.
>
> Illegal input, in gawk, as far as I know, should always cause a syntax
> error report and an immediate exit.
>
> If it doesn't, that is a bug, and I'll be happy to try to fix it.
>
> I hope that clarifies things.

For grins, and for a data point from elsewhere in GNU-land, GNU troff is
pretty robust to this sort of thing. Much as I might like to boast of
having improved it in this area, it appears to have already come with
iron long johns courtesy of James Clark and/or Werner Lemberg.
I threw troff its own ELF executable as a crude fuzz test some years ago,
and I don't recall needing to fix anything except unhelpfully vague
diagnostic messages (a phenomenon I am predisposed to observe anyway).

I did notice today that in one case we were spewing back out unprintable
characters (newlines, character codes > 127) _in_ one (but only one) of
the diagnostic messages, and while that's ugly, it's not an obvious
exploitation vector to me.

Nevertheless I decided to fix it and it will be in my next push.

So here's the mess you get when feeding GNU troff to itself. No GNU
troff since before 1.22.3 core dumps on this sort of unprepossessing
input.

$ ./build/test-groff -Ww -z /usr/bin/troff 2>&1 | sed 's/:[0-9]\+:/:/' | sort | uniq -c
     17 troff:/usr/bin/troff: error: a backspace character is not allowed in an escape sequence parameter
     10 troff:/usr/bin/troff: error: a space character is not allowed in an escape sequence parameter
      1 troff:/usr/bin/troff: error: a space is not allowed as a starting delimiter
      1 troff:/usr/bin/troff: error: a special character is not allowed in an identifier
      1 troff:/usr/bin/troff: error: character '-' is not allowed as a starting delimiter
      1 troff:/usr/bin/troff: error: invalid argument ')' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'c' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'l' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'm' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid positional argument number ','
      3 troff:/usr/bin/troff: error: invalid positional argument number '<'
      3 troff:/usr/bin/troff: error: invalid positional argument number 'D'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'E'
     10 troff:/usr/bin/troff: error: invalid positional argument number 'H'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'Hi'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'I'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'I9'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'L'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'LD'
      2 troff:/usr/bin/troff: error: invalid positional argument number 'LL'
      5 troff:/usr/bin/troff: error: invalid positional argument number 'LT'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'M'
      4 troff:/usr/bin/troff: error: invalid positional argument number 'P'
      5 troff:/usr/bin/troff: error: invalid positional argument number 'X'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'dH'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'h'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'l'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'p'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'x'
      3 troff:/usr/bin/troff: error: invalid positional argument number '|'
     35 troff:/usr/bin/troff: error: invalid positional argument number (unprintable)
      3 troff:/usr/bin/troff: error: unterminated transparent embedding escape sequence

The second to last (and most frequent) message in the list above is the
"new" one. Here's the diff.

diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp
index 8d828a01e..596ecf6f9 100644
--- a/src/roff/troff/input.cpp
+++ b/src/roff/troff/input.cpp
@@ -4556,10 +4556,21 @@ static void interpolate_arg(symbol nm)
   }
   else {
     const char *p;
-    for (p = s; *p && csdigit(*p); p++)
-      ;
-    if (*p)
-      copy_mode_error("invalid positional argument number '%1'", s);
+    bool is_valid = true;
+    bool is_printable = true;
+    for (p = s; *p != 0 /* nullptr */; p++) {
+      if (!csdigit(*p))
+        is_valid = false;
+      if (!csprint(*p))
+        is_printable = false;
+    }
+    if (!is_valid) {
+      const char msg[] = "invalid positional argument number";
+      if (is_printable)
+        copy_mode_error("%1 '%2'", msg, s);
+      else
+        copy_mode_error("%1 (unprintable)", msg);
+    }
     else
       input_stack::push(input_stack::get_arg(atoi(s)));
   }

GNU troff may have started out with an easier task in this area than an
AWK or a shell had; its syntax is not block-structured in the same way,
so parser state recovery is easier, and it's _inherently_ a filter.

The only fruitful fuzz attack on groff I can recall was upon indexed
bibliographic database files, which are a binary format. This went
unresolved for several years[1] but I fixed it for groff 1.23.0.

https://bugs.debian.org/716109

Regards,
Branden

[1] I think I understand the low triage priority. Few groff users use
    the refer(1) preprocessor, and of those who do, even fewer find
    modern systems so poorly performant at text scanning that they
    desire the services of indxbib(1) to speed lookup of bibliographic
    entries.
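
The validation logic in the diff above can be tried outside the groff
tree with a minimal standalone sketch like the following. It is not
groff code: positional_arg_diagnostic() and check_examples() are
hypothetical names, the message is returned as a string instead of being
passed to copy_mode_error(), and std::isdigit()/std::isprint() stand in
for groff's csdigit()/csprint() on the assumption that they agree for
ASCII input in the "C" locale.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not part of groff.  Returns the diagnostic text
// that the patched logic would produce for the argument name `s`, or an
// empty string when `s` contains nothing but decimal digits.
// Assumption: std::isdigit/std::isprint approximate csdigit/csprint.
std::string positional_arg_diagnostic(const std::string &s)
{
  bool is_valid = true;      // every character is a decimal digit
  bool is_printable = true;  // every character is safe to echo back
  for (unsigned char c : s) {
    if (!std::isdigit(c))
      is_valid = false;
    if (!std::isprint(c))
      is_printable = false;
  }
  if (is_valid)
    return "";
  const std::string msg = "invalid positional argument number";
  if (is_printable)
    return msg + " '" + s + "'";
  // Do not echo control or non-ASCII bytes into the diagnostic.
  return msg + " (unprintable)";
}

// Small self-check mirroring two kinds of message seen in the run above.
void check_examples()
{
  assert(positional_arg_diagnostic("12").empty());
  assert(positional_arg_diagnostic("LT")
         == "invalid positional argument number 'LT'");
  assert(positional_arg_diagnostic("H\n")
         == "invalid positional argument number (unprintable)");
}
```

Like the patch, the sketch scans the whole string instead of stopping at
the first non-digit, because a single unprintable byte anywhere in the
name is enough to make echoing it back undesirable.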

--ykviugjk6aprnpep--