* [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.)
@ 2024-05-20 13:06 Douglas McIlroy
  2024-05-20 13:14 ` [TUHS] " arnold
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Douglas McIlroy @ 2024-05-20 13:06 UTC (permalink / raw)
  To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 616 bytes --]

I'm surprised by nonchalance about bad inputs evoking bad program behavior.
That attitude may have been excusable 50 years ago. By now, though, we have
seen so much malicious exploitation of open avenues of "undefined behavior"
that we can no longer ignore bugs that "can't happen when using the tool
correctly". Mature software should not brook incorrect usage.

"Bailing out near line 1" is a sign of defensive precautions. Crashes and
unjustified output betray their absence.

I commend attention to the LangSec movement, which advocates for rigorously
enforced separation between legal and illegal inputs.

Doug

[-- Attachment #2: Type: text/html, Size: 752 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread
* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
  2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
@ 2024-05-20 13:14 ` arnold
  2024-05-20 14:00   ` G. Branden Robinson
  2024-05-20 13:25 ` Chet Ramey
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: arnold @ 2024-05-20 13:14 UTC (permalink / raw)
  To: tuhs, douglas.mcilroy

Perhaps I should not respond to this immediately. But:

Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote:

> I'm surprised by nonchalance about bad inputs evoking bad program behavior.
> That attitude may have been excusable 50 years ago. By now, though, we have
> seen so much malicious exploitation of open avenues of "undefined behavior"
> that we can no longer ignore bugs that "can't happen when using the tool
> correctly". Mature software should not brook incorrect usage.

It's not nonchalance, not at all!

The current behavior is to die on the first syntax error, instead of
trying to be "helpful" by continuing to try to parse the program in the
hope of reporting other errors.

> "Bailing out near line 1" is a sign of defensive precautions. Crashes and
> unjustified output betray their absence.

The crashes came because errors cascaded. I don't see a reason to spend
valuable, *personal* time on adding defenses *where they aren't needed*.

A steel door on your bedroom closet does no good if your front door
is made of balsa wood. My change was to stop the badness at the
front door.

> I commend attention to the LangSec movement, which advocates for rigorously
> enforced separation between legal and illegal inputs.

Illegal input, in gawk, as far as I know, should always cause a syntax
error report and an immediate exit.

If it doesn't, that is a bug, and I'll be happy to try to fix it.

I hope that clarifies things.

Arnold

^ permalink raw reply	[flat|nested] 18+ messages in thread
* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.) 2024-05-20 13:14 ` [TUHS] " arnold @ 2024-05-20 14:00 ` G. Branden Robinson 0 siblings, 0 replies; 18+ messages in thread From: G. Branden Robinson @ 2024-05-20 14:00 UTC (permalink / raw) To: tuhs; +Cc: douglas.mcilroy, groff [-- Attachment #1: Type: text/plain, Size: 6974 bytes --] Hi folks, At 2024-05-20T07:14:07-0600, arnold@skeeve.com wrote: > Douglas McIlroy <douglas.mcilroy@dartmouth.edu> wrote: > > I'm surprised by nonchalance about bad inputs evoking bad program > > behavior. That attitude may have been excusable 50 years ago. By > > now, though, we have seen so much malicious exploitation of open > > avenues of "undefined behavior" that we can no longer ignore bugs > > that "can't happen when using the tool correctly". Mature software > > should not brook incorrect usage. > > It's not nonchalance, not at all! > > The current behavior is to die on the first syntax error, instead of > trying to be "helpful" by continuing to try to parse the program in > the hope of reporting other errors. [...] > The crashes came because errors cascaded. I don't see a reason to > spend valuable, *personal* time on adding defenses *where they aren't > needed*. > > A steel door on your bedroom closet does no good if your front door is > made of balsa wood. My change was to stop the badness at the front > door. > > > I commend attention to the LangSec movement, which advocates for > > rigorously enforced separation between legal and illegal inputs. > > Illegal input, in gawk, as far as I know, should always cause a syntax > error report and an immediate exit. > > If it doesn't, that is a bug, and I'll be happy to try to fix it. > > I hope that clarifies things. For grins, and for a data point from elsewhere in GNU-land, GNU troff is pretty robust to this sort of thing. 
Much as I might like to boast of having improved it in this area, it
appears to have already come with iron long johns courtesy of James
Clark and/or Werner Lemberg. I threw troff its own ELF executable as a
crude fuzz test some years ago, and I don't recall needing to fix
anything except unhelpfully vague diagnostic messages (a phenomenon I am
predisposed to observe anyway).

I did notice today that in one case we were spewing back out unprintable
characters (newlines, character codes > 127) _in_ one (but only one) of
the diagnostic messages, and while that's ugly, it's not an obvious
exploitation vector to me.

Nevertheless I decided to fix it and it will be in my next push.

So here's the mess you get when feeding GNU troff to itself. No GNU
troff since before 1.22.3 core dumps on this sort of unprepossessing
input.

$ ./build/test-groff -Ww -z /usr/bin/troff 2>&1 | sed 's/:[0-9]\+:/:/' | sort | uniq -c
     17 troff:/usr/bin/troff: error: a backspace character is not allowed in an escape sequence parameter
     10 troff:/usr/bin/troff: error: a space character is not allowed in an escape sequence parameter
      1 troff:/usr/bin/troff: error: a space is not allowed as a starting delimiter
      1 troff:/usr/bin/troff: error: a special character is not allowed in an identifier
      1 troff:/usr/bin/troff: error: character '-' is not allowed as a starting delimiter
      1 troff:/usr/bin/troff: error: invalid argument ')' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'c' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'l' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid argument 'm' to output suppression escape sequence
      1 troff:/usr/bin/troff: error: invalid positional argument number ','
      3 troff:/usr/bin/troff: error: invalid positional argument number '<'
      3 troff:/usr/bin/troff: error: invalid positional argument number 'D'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'E'
     10 troff:/usr/bin/troff: error: invalid positional argument number 'H'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'Hi'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'I'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'I9'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'L'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'LD'
      2 troff:/usr/bin/troff: error: invalid positional argument number 'LL'
      5 troff:/usr/bin/troff: error: invalid positional argument number 'LT'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'M'
      4 troff:/usr/bin/troff: error: invalid positional argument number 'P'
      5 troff:/usr/bin/troff: error: invalid positional argument number 'X'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'dH'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'h'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'l'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'p'
      1 troff:/usr/bin/troff: error: invalid positional argument number 'x'
      3 troff:/usr/bin/troff: error: invalid positional argument number '|'
     35 troff:/usr/bin/troff: error: invalid positional argument number (unprintable)
      3 troff:/usr/bin/troff: error: unterminated transparent embedding escape sequence

The second to last (and most frequent) message in the list above is the
"new" one. Here's the diff.

diff --git a/src/roff/troff/input.cpp b/src/roff/troff/input.cpp
index 8d828a01e..596ecf6f9 100644
--- a/src/roff/troff/input.cpp
+++ b/src/roff/troff/input.cpp
@@ -4556,10 +4556,21 @@ static void interpolate_arg(symbol nm)
   }
   else {
     const char *p;
-    for (p = s; *p && csdigit(*p); p++)
-      ;
-    if (*p)
-      copy_mode_error("invalid positional argument number '%1'", s);
+    bool is_valid = true;
+    bool is_printable = true;
+    for (p = s; *p != 0 /* nullptr */; p++) {
+      if (!csdigit(*p))
+        is_valid = false;
+      if (!csprint(*p))
+        is_printable = false;
+    }
+    if (!is_valid) {
+      const char msg[] = "invalid positional argument number";
+      if (is_printable)
+        copy_mode_error("%1 '%2'", msg, s);
+      else
+        copy_mode_error("%1 (unprintable)", msg);
+    }
     else
       input_stack::push(input_stack::get_arg(atoi(s)));
   }

GNU troff may have started out with an easier task in this area than an
AWK or a shell had; its syntax is not block-structured in the same way,
so parser state recovery is easier, and it's _inherently_ a filter.

The only fruitful fuzz attack on groff I can recall was upon indexed
bibliographic database files, which are a binary format. This went
unresolved for several years[1] but I fixed it for groff 1.23.0.

https://bugs.debian.org/716109

Regards,
Branden

[1] I think I understand the low triage priority. Few groff users use
    the refer(1) preprocessor, and of those who do, even fewer find
    modern systems so poorly performant at text scanning that they
    desire the services of indxbib(1) to speed lookup of bibliographic
    entries.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread
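
[Editorial sketch] The logic of the patch above can be tried outside groff
with something like the following. This is only a sketch under stated
assumptions: std::isdigit and std::isprint stand in for groff's csdigit and
csprint (which are locale-independent, unlike the standard functions), and
the names ArgCheck, classify_arg, and diagnostic are hypothetical and do not
exist in groff. As in the patch, an empty string counts as valid because the
loop body never runs.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical result type for illustration; not part of groff.
enum class ArgCheck { Valid, InvalidPrintable, InvalidUnprintable };

// Scan every byte, tracking "all digits" and "all printable" separately,
// in the same way as the patched loop.
ArgCheck classify_arg(const std::string &s)
{
  bool is_valid = true;
  bool is_printable = true;
  for (unsigned char c : s) {
    if (!std::isdigit(c))   // stand-in for csdigit()
      is_valid = false;
    if (!std::isprint(c))   // stand-in for csprint()
      is_printable = false;
  }
  if (is_valid)
    return ArgCheck::Valid;
  return is_printable ? ArgCheck::InvalidPrintable
                      : ArgCheck::InvalidUnprintable;
}

// Build the diagnostic text; unprintable input is never echoed back.
// Returns an empty string when the argument is acceptable.
std::string diagnostic(const std::string &s)
{
  const std::string msg = "invalid positional argument number";
  switch (classify_arg(s)) {
  case ArgCheck::Valid:
    return "";
  case ArgCheck::InvalidPrintable:
    return msg + " '" + s + "'";
  case ArgCheck::InvalidUnprintable:
    return msg + " (unprintable)";
  }
  return "";
}
```

The point of keeping two flags is that the decision "is this an error?" and
the decision "is it safe to quote the offending text?" are independent.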
* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.) 2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy 2024-05-20 13:14 ` [TUHS] " arnold @ 2024-05-20 13:25 ` Chet Ramey 2024-05-20 13:41 ` [TUHS] Re: A fuzzy awk Ralph Corderoy 2024-05-20 13:54 ` Ralph Corderoy 2024-05-20 16:06 ` [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.) Paul Winalski 3 siblings, 1 reply; 18+ messages in thread From: Chet Ramey @ 2024-05-20 13:25 UTC (permalink / raw) To: Douglas McIlroy, TUHS main list [-- Attachment #1.1: Type: text/plain, Size: 465 bytes --] On 5/20/24 9:06 AM, Douglas McIlroy wrote: > I'm surprised by nonchalance about bad inputs evoking bad program behavior. I think the claim is that it's better to stop immediately with an error on invalid input rather than guess at the user's intent and try to go on. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: A fuzzy awk. 2024-05-20 13:25 ` Chet Ramey @ 2024-05-20 13:41 ` Ralph Corderoy 2024-05-20 14:26 ` Chet Ramey 2024-05-22 13:44 ` arnold 0 siblings, 2 replies; 18+ messages in thread From: Ralph Corderoy @ 2024-05-20 13:41 UTC (permalink / raw) To: TUHS main list Hi Chet, > Doug wrote: > > I'm surprised by nonchalance about bad inputs evoking bad program > > behavior. > > I think the claim is that it's better to stop immediately with an > error on invalid input rather than guess at the user's intent and try > to go on. That aside, having made the decision to patch up the input so more punched cards are consumed, the patch should be bug free. Say it's inserting a semicolon token for pretence. It should have initialised source-file locations just as if it were real. Not an uninitialised pointer to a source filename so a later dereference failed. I can see an avalanche of errors in an earlier gawk caused problems, but each time there would have been a first patch of the input which made a mistake causing the pebble to start rolling. My understanding is that there was potentially a lot of these and rather than fix them it was more productive of the limited time to stop patching the input. Then the code which patched could be deleted, getting rid of the buggy bits along the way? -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: A fuzzy awk. 2024-05-20 13:41 ` [TUHS] Re: A fuzzy awk Ralph Corderoy @ 2024-05-20 14:26 ` Chet Ramey 2024-05-22 13:44 ` arnold 1 sibling, 0 replies; 18+ messages in thread From: Chet Ramey @ 2024-05-20 14:26 UTC (permalink / raw) To: Ralph Corderoy, TUHS main list [-- Attachment #1.1: Type: text/plain, Size: 2043 bytes --] On 5/20/24 9:41 AM, Ralph Corderoy wrote: > Hi Chet, > >> Doug wrote: >>> I'm surprised by nonchalance about bad inputs evoking bad program >>> behavior. >> >> I think the claim is that it's better to stop immediately with an >> error on invalid input rather than guess at the user's intent and try >> to go on. > > That aside, having made the decision to patch up the input so more > punched cards are consumed, the patch should be bug free. > > Say it's inserting a semicolon token for pretence. It should have > initialised source-file locations just as if it were real. Not an > uninitialised pointer to a source filename so a later dereference > failed. > > I can see an avalanche of errors in an earlier gawk caused problems, but > each time there would have been a first patch of the input which made > a mistake causing the pebble to start rolling. My understanding is that > there was potentially a lot of these and rather than fix them it was > more productive of the limited time to stop patching the input. Then > the code which patched could be deleted, getting rid of the buggy bits > along the way? Maybe we're talking about the same thing. My impression is that at each point there was more than one potential token to insert and go on, and gawk chose one (probably the most common one), in the hopes that it would be able to report as many errors as possible. There's always the chance you'll be wrong there. (I have no insight into the actual nature of these issues, or the actual corruption that caused the crashes, so take the next with skepticism.) 
And then rather than go back and modify other state after inserting this token -- which gawk did not do -- for the sole purpose of making this guessing more crash-resistant, Arnold chose a different approach: exit on invalid input. -- ``The lyf so short, the craft so long to lerne.'' - Chaucer ``Ars longa, vita brevis'' - Hippocrates Chet Ramey, UTech, CWRU chet@case.edu http://tiswww.cwru.edu/~chet/ [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 203 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: A fuzzy awk.
  2024-05-20 13:41   ` [TUHS] Re: A fuzzy awk Ralph Corderoy
  2024-05-20 14:26     ` Chet Ramey
@ 2024-05-22 13:44     ` arnold
  1 sibling, 0 replies; 18+ messages in thread
From: arnold @ 2024-05-22 13:44 UTC (permalink / raw)
  To: tuhs, ralph

I've been travelling, so I haven't been able to answer these mails
until now.

Ralph Corderoy <ralph@inputplus.co.uk> wrote:

> I can see an avalanche of errors in an earlier gawk caused problems, but
> each time there would have been a first patch of the input which made
> a mistake causing the pebble to start rolling. My understanding is that
> there was potentially a lot of these and rather than fix them it was
> more productive of the limited time to stop patching the input. Then
> the code which patched could be deleted, getting rid of the buggy bits
> along the way?

That's not the case. Gawk didn't try to patch the input. It simply
set a flag saying "don't try to run" but kept on parsing anyway, in
the hope of finding more errors.

That was a bad idea, because the representation of the program being
built was then not in the correct state to have more stuff parsed and
converted into byte code.

Very early on, the first parse error caused an exit. I changed it
to keep going to try to be helpful. But when that became a source for
essentially specious bug reports and a time sink for me, it became
time to go back to exiting on the first problem.

HTH,

Arnold

^ permalink raw reply	[flat|nested] 18+ messages in thread
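
[Editorial sketch] The "exit on the first problem" policy described above
can be illustrated with a toy parser. Everything here is an assumption made
for illustration: the "language" (one non-negative integer per line, at most
nine digits) has nothing to do with AWK, and ParseResult and parse_fail_fast
are hypothetical names, not gawk internals. The sketch only shows the shape
of the policy: on the first bad line the parser reports it and returns
without handing back a partially built program.

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Toy "program": one non-negative integer per line. Purely illustrative;
// this is not gawk's grammar or its internal representation.
struct ParseResult {
  std::vector<int> program;  // meaningful only when error is empty
  std::string error;         // first diagnostic, or "" on success
};

// Fail-fast parse: stop at the first bad line and report it, rather than
// continuing to parse into a representation that is no longer consistent.
ParseResult parse_fail_fast(const std::string &source)
{
  ParseResult result;
  std::istringstream in(source);
  std::string line;
  int lineno = 0;
  while (std::getline(in, line)) {
    ++lineno;
    // Cap the length at nine digits so std::stoi cannot overflow an int.
    bool ok = !line.empty() && line.size() <= 9;
    for (unsigned char c : line)
      if (!std::isdigit(c))
        ok = false;
    if (!ok) {
      result.program.clear();  // never hand back a half-built program
      result.error = "syntax error near line " + std::to_string(lineno);
      return result;
    }
    result.program.push_back(std::stoi(line));
  }
  return result;
}
```

With input "1\nx\n@@\n" only line 2 is reported; line 3 is never examined,
so no later error can cascade from the first one.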
* [TUHS] Re: A fuzzy awk. 2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy 2024-05-20 13:14 ` [TUHS] " arnold 2024-05-20 13:25 ` Chet Ramey @ 2024-05-20 13:54 ` Ralph Corderoy 2024-05-20 15:39 ` [TUHS] OT: LangSec (Re: A fuzzy awk.) Åke Nordin 2024-05-20 16:06 ` [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.) Paul Winalski 3 siblings, 1 reply; 18+ messages in thread From: Ralph Corderoy @ 2024-05-20 13:54 UTC (permalink / raw) To: TUHS main list Hi, Doug wrote: > I commend attention to the LangSec movement, which advocates for > rigorously enforced separation between legal and illegal inputs. https://langsec.org ‘The Language-theoretic approach (LangSec) regards the Internet insecurity epidemic as a consequence of ‘ad hoc’ programming of input handling at all layers of network stacks, and in other kinds of software stacks. LangSec posits that the only path to trustworthy software that takes untrusted inputs is treating all valid or expected inputs as a formal language, and the respective input-handling routines as a ‘recognizer’ for that language. The recognition must be feasible, and the recognizer must match the language in required computation power. ‘When input handling is done in ad hoc way, the ‘de facto’ recognizer, i.e. the input recognition and validation code ends up scattered throughout the program, does not match the programmers' assumptions about safety and validity of data, and thus provides ample opportunities for exploitation. Moreover, for complex input languages the problem of full recognition of valid or expected inputs may be *undecidable*, in which case no amount of input-checking code or testing will suffice to secure the program. Many popular protocols and formats fell into this trap, the empirical fact with which security practitioners are all too familiar. 
‘LangSec helps draw the boundary between protocols and API designs that can and cannot be secured and implemented securely, and charts a way to building truly trustworthy protocols and systems. A longer summary of LangSec in this USENIX Security BoF hand-out, and in the talks, articles, and papers below.’ That does look interesting; I'd not heard of it. -- Cheers, Ralph. ^ permalink raw reply [flat|nested] 18+ messages in thread
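
[Editorial sketch] The idea quoted above, that input handling should be a
single recognizer for a formally defined language, can be shown in
miniature. The input language below is hypothetical and chosen only for
illustration (message := key '=' value, key := [a-z]+, value := [0-9]+), and
recognize and handle are made-up names. Because the language is regular, a
small finite-state machine matches it in "required computation power"; the
character-range tests assume an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <string>

// Recognizer for a hypothetical input language, for illustration only:
//   message := key '=' value ;  key := [a-z]+ ;  value := [0-9]+
bool recognize(const std::string &input)
{
  enum State { KeyStart, Key, ValueStart, Value } state = KeyStart;
  for (char c : input) {
    const bool lower = c >= 'a' && c <= 'z';
    const bool digit = c >= '0' && c <= '9';
    switch (state) {
    case KeyStart:            // need at least one key character
      if (!lower) return false;
      state = Key;
      break;
    case Key:                 // more key characters, or the separator
      if (c == '=') state = ValueStart;
      else if (!lower) return false;
      break;
    case ValueStart:          // need at least one digit
      if (!digit) return false;
      state = Value;
      break;
    case Value:               // only digits may follow
      if (!digit) return false;
      break;
    }
  }
  return state == Value;      // accept only a complete message
}

// Processing code runs only on input the recognizer accepted in full.
bool handle(const std::string &input, std::string &key_out)
{
  if (!recognize(input))
    return false;             // reject outright; no partial processing
  key_out = input.substr(0, input.find('='));
  return true;
}
```

The design choice being illustrated is that handle() contains no validation
of its own: all acceptance decisions live in one place, so the processing
code can rely on the input's shape.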
* [TUHS] OT: LangSec (Re: A fuzzy awk.) 2024-05-20 13:54 ` Ralph Corderoy @ 2024-05-20 15:39 ` Åke Nordin 2024-05-20 16:09 ` [TUHS] " Ben Kallus 0 siblings, 1 reply; 18+ messages in thread From: Åke Nordin @ 2024-05-20 15:39 UTC (permalink / raw) To: tuhs On 2024-05-20 15:54, Ralph Corderoy wrote: > Doug wrote: >> I commend attention to the LangSec movement, which advocates for >> rigorously enforced separation between legal and illegal inputs. > https://langsec.org > > ‘The Language-theoretic approach (LangSec) regards the Internet > insecurity epidemic as a consequence of ‘ad hoc’ programming of > input handling at all layers of network stacks, and in other kinds > of software stacks. LangSec posits that the only path to > trustworthy software that takes untrusted inputs is treating all > valid or expected inputs as a formal language, and the respective > input-handling routines as a ‘recognizer’ for that language. . . . > ‘LangSec helps draw the boundary between protocols and API designs > that can and cannot be secured and implemented securely, and charts > a way to building truly trustworthy protocols and systems. A longer > summary of LangSec in this USENIX Security BoF hand-out, and in the > talks, articles, and papers below.’ Yes, it's an interesting concept. Those *n?x tools that have lex/yacc frontends are probably closer to this than the average hack. It may become hard to reconcile this with the robustness principle (Be conservative in what you send, be liberal in what you accept) that Jon Postel popularized. Maybe it becomes necessary, though. -- Åke Nordin <ake.nordin@netia.se>, resident Net/Lunix/telecom geek. Netia Data AB, Stockholm SWEDEN *46#7O466OI99# ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 15:39 ` [TUHS] OT: LangSec (Re: A fuzzy awk.) Åke Nordin @ 2024-05-20 16:09 ` Ben Kallus 2024-05-20 20:02 ` John Levine 0 siblings, 1 reply; 18+ messages in thread From: Ben Kallus @ 2024-05-20 16:09 UTC (permalink / raw) To: Åke Nordin; +Cc: tuhs > It may become hard to reconcile this with the robustness principle > (Be conservative in what you send, be liberal in what you accept) > that Jon Postel popularized. Maybe it becomes necessary, though. Yes; the LangSec people essentially reject the robustness principle. See https://langsec.org/papers/postel-patch.pdf -Ben ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 16:09 ` [TUHS] " Ben Kallus @ 2024-05-20 20:02 ` John Levine 2024-05-20 20:11 ` Larry McVoy 0 siblings, 1 reply; 18+ messages in thread From: John Levine @ 2024-05-20 20:02 UTC (permalink / raw) To: tuhs; +Cc: benjamin.p.kallus.gr It appears that Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu> said: >> It may become hard to reconcile this with the robustness principle >> (Be conservative in what you send, be liberal in what you accept) >> that Jon Postel popularized. Maybe it becomes necessary, though. > >Yes; the LangSec people essentially reject the robustness principle. > >See https://langsec.org/papers/postel-patch.pdf On the contrary, they actually understand it. Postel was widely misunderstood to say that you should try to accept arbitrary garbage. People who knew him tell me that he meant to be liberal when the spec is ambiguous, not to allow stuff that is just wrong. As their quote from RFC 1122 points out, he also said you should be prepared for arbitrary garbage so you can reject it. R's, John ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 20:02 ` John Levine @ 2024-05-20 20:11 ` Larry McVoy 2024-05-20 21:00 ` Ben Kallus 0 siblings, 1 reply; 18+ messages in thread From: Larry McVoy @ 2024-05-20 20:11 UTC (permalink / raw) To: John Levine; +Cc: tuhs, benjamin.p.kallus.gr On Mon, May 20, 2024 at 04:02:26PM -0400, John Levine wrote: > It appears that Ben Kallus <benjamin.p.kallus.gr@dartmouth.edu> said: > >> It may become hard to reconcile this with the robustness principle > >> (Be conservative in what you send, be liberal in what you accept) > >> that Jon Postel popularized. Maybe it becomes necessary, though. > > > >Yes; the LangSec people essentially reject the robustness principle. > > > >See https://langsec.org/papers/postel-patch.pdf > > On the contrary, they actually understand it. > > Postel was widely misunderstood to say that you should try to accept > arbitrary garbage. People who knew him tell me that he meant to be > liberal when the spec is ambiguous, not to allow stuff that is just > wrong. As their quote from RFC 1122 points out, he also said you > should be prepared for arbitrary garbage so you can reject it. Yeah, I read the pdf and I took away the same thing as John. -- --- Larry McVoy Retired to fishing http://www.mcvoy.com/lm/boat ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 20:11 ` Larry McVoy @ 2024-05-20 21:00 ` Ben Kallus 2024-05-20 21:03 ` John R Levine 2024-05-20 21:14 ` Larry McVoy 0 siblings, 2 replies; 18+ messages in thread From: Ben Kallus @ 2024-05-20 21:00 UTC (permalink / raw) To: Larry McVoy; +Cc: John Levine, tuhs What I meant was that the LangSec people reject the robustness principle as it is commonly understood (i.e., make a "reasonable" guess when receiving garbage), not necessarily that their view is incompatible with Postel's original vision. This interpretation of the principle is pretty widespread; take a look at the Nginx mailing list if you have any doubt. I attribute this to the same phenomenon that inverted the meaning of REST. -Ben ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 21:00 ` Ben Kallus @ 2024-05-20 21:03 ` John R Levine 2024-05-20 21:14 ` Larry McVoy 1 sibling, 0 replies; 18+ messages in thread From: John R Levine @ 2024-05-20 21:03 UTC (permalink / raw) To: Ben Kallus; +Cc: tuhs > What I meant was that the LangSec people reject the robustness > principle as it is commonly understood (i.e., make a "reasonable" > guess when receiving garbage), not necessarily that their view is > incompatible with Postel's original vision. This interpretation of the > principle is pretty widespread; take a look at the Nginx mailing list > if you have any doubt. I attribute this to the same phenomenon that > inverted the meaning of REST. Oh, OK, no disagreement there. I'm as tired as you are of people invoking Postel to excuse slovenly code. Regards, John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY Please consider the environment before reading this e-mail. https://jl.ly ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 21:00 ` Ben Kallus 2024-05-20 21:03 ` John R Levine @ 2024-05-20 21:14 ` Larry McVoy 2024-05-20 21:46 ` Ben Kallus 1 sibling, 1 reply; 18+ messages in thread From: Larry McVoy @ 2024-05-20 21:14 UTC (permalink / raw) To: Ben Kallus; +Cc: John Levine, tuhs On Mon, May 20, 2024 at 05:00:40PM -0400, Ben Kallus wrote: > What I meant was that the LangSec people reject the robustness > principle as it is commonly understood (i.e., make a "reasonable" > guess when receiving garbage) That most certainly is not what I took from what Postel said. And I say that as someone who designed a distributed system that had client and server sides and had to make that work across versions from last week to 10-20 years ago. I took it more as "Be more and more careful what you say, get that more correct with each release, but tolerate the less correct stuff you might get from earlier versions". In no way did I think he meant ``make a "reasonable" guess when receiving garbage''. Garbage is garbage, you error on that. -- --- Larry McVoy Retired to fishing http://www.mcvoy.com/lm/boat ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 21:14 ` Larry McVoy @ 2024-05-20 21:46 ` Ben Kallus 2024-05-20 21:57 ` Larry McVoy 0 siblings, 1 reply; 18+ messages in thread From: Ben Kallus @ 2024-05-20 21:46 UTC (permalink / raw) To: Larry McVoy; +Cc: John Levine, tuhs My point was that, regardless of Postel's original intent, many people have interpreted his principle to mean that accepting garbage is good. *This* interpretation is incompatible with LangSec. See RFC 9413 for an exploration of the many interpretations of Postel's principle. -Ben ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: OT: LangSec (Re: A fuzzy awk.) 2024-05-20 21:46 ` Ben Kallus @ 2024-05-20 21:57 ` Larry McVoy 0 siblings, 0 replies; 18+ messages in thread From: Larry McVoy @ 2024-05-20 21:57 UTC (permalink / raw) To: Ben Kallus; +Cc: John Levine, tuhs Those would be the stupid people and you can't fix stupid. Seriously, people can twist anything into anything. Just because dumb people didn't understand his principle doesn't mean it was a bad principle. On Mon, May 20, 2024 at 05:46:48PM -0400, Ben Kallus wrote: > My point was that, regardless of Postel's original intent, many people > have interpreted his principle to mean that accepting garbage is good. > *This* interpretation is incompatible with LangSec. > > See RFC 9413 for an exploration of the many interpretations of > Postel's principle. > > -Ben -- --- Larry McVoy Retired to fishing http://www.mcvoy.com/lm/boat ^ permalink raw reply [flat|nested] 18+ messages in thread
* [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
  2024-05-20 13:06 [TUHS] A fuzzy awk. (Was: The 'usage: ...' message.) Douglas McIlroy
                   ` (2 preceding siblings ...)
  2024-05-20 13:54 ` Ralph Corderoy
@ 2024-05-20 16:06 ` Paul Winalski
  3 siblings, 0 replies; 18+ messages in thread
From: Paul Winalski @ 2024-05-20 16:06 UTC (permalink / raw)
  To: TUHS main list

[-- Attachment #1: Type: text/plain, Size: 2182 bytes --]

On Mon, May 20, 2024 at 9:17 AM Douglas McIlroy <
douglas.mcilroy@dartmouth.edu> wrote:

> I'm surprised by nonchalance about bad inputs evoking bad program
> behavior. That attitude may have been excusable 50 years ago. By now,
> though, we have seen so much malicious exploitation of open avenues of
> "undefined behavior" that we can no longer ignore bugs that "can't happen
> when using the tool correctly". Mature software should not brook incorrect
> usage.
>
>
Accepting bad inputs can also lead to security issues. The data breaches
from SQL-based attacks are a modern case in point.

IMO, as a programmer you owe it to your users to do your best to detect bad
input and to handle it in a graceful fashion. Nothing is more frustrating
to a user than to have a program blow up in their face with a seg fault, or
even worse, simply exit silently.

As the DEC compiler team's expert on object files, I was called on to add
object file support to a compiler back end originally targeted to VMS
only. I inherited support of the object file generator for Unix COFF and
later wrote the support for Microsoft PECOFF and ELF. When our group was
bought by Intel I did the object file support for Apple OS X MACH-O in the
Intel compiler back end.

I found that the folks who write linkers are particularly lazy about error
checking and error handling. They assume that the compiler always
generates clean object files. That's OK I suppose if the compiler and
linker people are in the same organization. If the linker falls over you
can just go down the hall and have the linker developer debug the issue and
tell you where you went wrong. But that doesn't work when they work for
different companies and the compiler person doesn't have access to the
linker sources. I ran into a lot of cases where my buggy object file
caused the linker to seg fault or, even worse, simply exit without an error
message. I ended up writing a very thorough formatted dumper for each
object file format that did very thorough checking for proper syntax and as
many semantic errors (e.g., symbol table index number out of range) as I
could.

-Paul W.

[-- Attachment #2: Type: text/html, Size: 2610 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread
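
[Editorial sketch] The kind of semantic check mentioned above (a symbol
table index out of range) might look roughly like this. The data layout is
a deliberately simplified, hypothetical one: ObjectFile, Relocation, and
check_symbol_indices are invented for illustration and do not correspond to
the actual COFF, PE/COFF, ELF, or Mach-O structures, nor to the dumper
described in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified in-memory view of an object file; real formats
// carry many more fields and differ from one another.
struct Relocation {
  std::uint32_t symbol_index;  // index into ObjectFile::symbols
};

struct ObjectFile {
  std::vector<std::string> symbols;
  std::vector<Relocation> relocations;
};

// Return one diagnostic per relocation whose symbol index is out of range,
// instead of letting a later consumer index past the end of the table.
std::vector<std::string> check_symbol_indices(const ObjectFile &obj)
{
  std::vector<std::string> errors;
  for (std::size_t i = 0; i < obj.relocations.size(); ++i) {
    const std::uint32_t idx = obj.relocations[i].symbol_index;
    if (idx >= obj.symbols.size())
      errors.push_back("relocation " + std::to_string(i) +
                       ": symbol table index " + std::to_string(idx) +
                       " out of range (table has " +
                       std::to_string(obj.symbols.size()) + " entries)");
  }
  return errors;
}
```

Collecting every diagnostic rather than stopping at the first suits a
dumper, whose job is to describe a file; a linker consuming the same file
could reasonably reject it at the first such error.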