* [9front] seq: fix infinite loop @ 2021-08-15 22:02 Sean Hinchee 2021-08-16 21:45 ` Amavect ` (2 more replies) 0 siblings, 3 replies; 8+ messages in thread From: Sean Hinchee @ 2021-08-15 22:02 UTC (permalink / raw) To: 9front; +Cc: nabijaczleweli наб on twitter, then over e-mail (CC'd) pointed out that seq(1) in 9front with some very large inputs For example: ; seq 2e16 20000000000000002 would loop forever. After patch application, seq with the above example would print '2e+16' once and then exit. Patch also brings the usage message in line with the manual usage. Patch link: http://okturing.com/src/11872/body Inline: --- //.git/fs/object/2af46e406bbd443ae10025777247798a685afc3c/tree/sys/src/cmd/seq.c +++ sys/src/cmd/seq.c @@ -11,7 +11,7 @@ void usage(void) { - fprint(2, "usage: seq [-fformat] [-w] [first [incr]] last\n"); + fprint(2, "usage: seq [-w] [-fformat] [first [incr]] last\n"); exits("usage"); } @@ -67,6 +67,7 @@ default: goto out; }ARGEND + out: if(argc<1 || argc>3) usage(); @@ -81,6 +82,7 @@ } if(!format) buildfmt(); + if(incr > 0){ for(val = min; val <= max; val += incr){ n = sprint(buf, format, val); @@ -88,6 +90,8 @@ for(j=0; buf[j]==' '; j++) buf[j] ='0'; write(1, buf, n); + if(val == max) + break; } }else{ for(val = min; val >= max; val += incr){ @@ -96,7 +100,10 @@ for(j=0; buf[j]==' '; j++) buf[j] ='0'; write(1, buf, n); + if(val == max) + break; } } + exits(0); } ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [9front] seq: fix infinite loop 2021-08-15 22:02 [9front] seq: fix infinite loop Sean Hinchee @ 2021-08-16 21:45 ` Amavect 2021-08-18 22:53 ` Eckard Brauer 2021-08-16 22:24 ` [9front] " Anthony Martin 2021-08-18 2:49 ` [9front] " ori 2 siblings, 1 reply; 8+ messages in thread From: Amavect @ 2021-08-16 21:45 UTC (permalink / raw) To: 9front On Sun, 15 Aug 2021 15:02:38 -0700 Sean Hinchee <henesy.dev@gmail.com> wrote: > ; seq 2e16 20000000000000002 > would loop forever. At this range, 2e16 and 20000000000000002 are represented by the same double floating point value. > + if(val == max) > + break; We can cover more cases. For example, your patch does not fix the following: seq 9007199254740992 9007199254740996 The issue is with floating point precision. A double has a 52 bit mantissa, meaning that it allows 2^52 different values before the exponent is incremented (thus, floating point.) A double has a 11 bit exponent, and 1 sign bit. Exponents range from −1022 to +1023, with exponents −1023 and +1024 reserved for special numbers. At an exponent e, the mantissa represents a value between [2^e,2^(e+1)). The precision is range/buckets, so we can calculate the precision. At an exponent e and mantissa m, the precision is 2^(e-m) (2^(e+1)-2^e)/2^m = 2^e/2^m = 2^(e-m) Between [1, 2), the exponent is 0. The precision is 2^(0-52) ≈ 0.0000000000000002220446 Between [2^52, 2^53), the exponent is 52. The precision is 2^(52-52) = 1. 2e16 is between [2^54, 2^55), so the exponent is 54. The precision is 2^(54-52) = 4. Halfway ties round to the even bucket. This is demonstrated by the following. # infinite loop seq 2e16 19999999999999998 # 2^53-1, halts seq 9007199254740991 9007199254740991 # 2^53, infinite loop due to halfway tie rounding down to even bucket. seq 9007199254740992 9007199254740992 # halts, halfway tie rounds up to even bucket, 9007199254740996 seq 9007199254740994 9007199254740994 # different due to rounding error in loss of precision seq 5000000000000000 1.1 5000000000000002 seq 0 1.1 2 I don't believe that there is an ideal resolution to rounding error in that last example (libmp would be too complex), but we can cover more cases if we check that the previous val value is different from the current val. The correct thing to do is to throw an error warning about precision. Thanks, Amavect ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [9front] seq: fix infinite loop 2021-08-16 21:45 ` Amavect @ 2021-08-18 22:53 ` Eckard Brauer 0 siblings, 0 replies; 8+ messages in thread From: Eckard Brauer @ 2021-08-18 22:53 UTC (permalink / raw) To: 9front Am Mon, 16 Aug 2021 16:45:04 -0500 schrieb Amavect <amavect@gmail.com>: > On Sun, 15 Aug 2021 15:02:38 -0700 > Sean Hinchee <henesy.dev@gmail.com> wrote: > [...] > At this range, 2e16 and 20000000000000002 are represented by the same > double floating point value. > > [...] > We can cover more cases. > For example, your patch does not fix the following: > seq 9007199254740992 9007199254740996 > The issue is with floating point precision. > > A double has a 52 bit mantissa, meaning that it allows 2^52 different > values before the exponent is incremented (thus, floating point.) > A double has a 11 bit exponent, and 1 sign bit. > Exponents range from −1022 to +1023, with exponents −1023 and > +1024 reserved for special numbers. > > At an exponent e, the mantissa represents a value between > [2^e,2^(e+1)). The precision is range/buckets, so we can calculate > the precision. At an exponent e and mantissa m, the precision is > 2^(e-m) (2^(e+1)-2^e)/2^m = 2^e/2^m = 2^(e-m) > > Between [1, 2), the exponent is 0. > The precision is 2^(0-52) ≈ 0.0000000000000002220446 > Between [2^52, 2^53), the exponent is 52. > The precision is 2^(52-52) = 1. > > 2e16 is between [2^54, 2^55), so the exponent is 54. > The precision is 2^(54-52) = 4. > Halfway ties round to the even bucket. > > This is demonstrated by the following. > > # infinite loop > seq 2e16 19999999999999998 > > # 2^53-1, halts > seq 9007199254740991 9007199254740991 > > # 2^53, infinite loop due to halfway tie rounding down to even bucket. > seq 9007199254740992 9007199254740992 > # halts, halfway tie rounds up to even bucket, 9007199254740996 > seq 9007199254740994 9007199254740994 > > # different due to rounding error in loss of precision > seq 5000000000000000 1.1 5000000000000002 > seq 0 1.1 2 > > > I don't believe that there is an ideal resolution to rounding error in > that last example (libmp would be too complex), but we can cover more > cases if we check that the previous val value is different from the > current val. > The correct thing to do is to throw an error warning about precision. > > Thanks, > Amavect IMHO the args could easily be double, the internal counter can never. seq could be kind of that: if (argc == 3) delta = argv[2]; else delta = 1; if (!(argv[1] + delta > argv[1])) error(...); for (ulong i=0; i<MAXULONG && val <= argv[argc], i++) fprintf("%whatever\n", val = (argv[1]+i*delta)); or do i have too much beer tonight...? -- Corona? Ich desinfiziere regelmäßig mit Bier. What Corona? I regularly disinfect with beer. ^ permalink raw reply [flat|nested] 8+ messages in thread
* [9front] Re: seq: fix infinite loop 2021-08-15 22:02 [9front] seq: fix infinite loop Sean Hinchee 2021-08-16 21:45 ` Amavect @ 2021-08-16 22:24 ` Anthony Martin 2021-08-16 23:38 ` ori 2021-08-21 1:30 ` Amavect 2021-08-18 2:49 ` [9front] " ori 2 siblings, 2 replies; 8+ messages in thread From: Anthony Martin @ 2021-08-16 22:24 UTC (permalink / raw) To: 9front; +Cc: nabijaczleweli Sean Hinchee <henesy.dev@gmail.com> once said: > наб on twitter, then over e-mail (CC'd) pointed out that seq(1) in > 9front with some very large inputs > > For example: > > ; seq 2e16 20000000000000002 > > would loop forever. > > After patch application, seq with the above example would print > '2e+16' once and then exit. > > [...] > > @@ -88,6 +90,8 @@ > for(j=0; buf[j]==' '; j++) > buf[j] ='0'; > write(1, buf, n); > + if(val == max) > + break; This isn't quite right. There are many cases where the precision of any one of the values in {min, incr, max} will not be enough to make progress in the loop body. You need something like: /* ensure adequate precision of loop invariants */ if(incr > 0 && (max+incr <= max || min+incr <= min)) sysfatal("increment is too small"); if(incr < 0 && (max+incr >= max || min+incr >= min)) sysfatal("increment is too small"); before going into the loop. And that's just off the top of my head. I'd need to do more testing to be confident. Floating point can be nasty. But really, if you're playing games at the boundaries of floating point precision, you're bound to lose. I suggest we either a) do nothing and leave the loop exactly as it is described in the manual but add a note in the BUGS section, or b) go back to the v8 method that calculates the number of steps in the loop before executing it. Even with the latter method you can see odd results due to precision limitations but at least there will be no infinite loops. I vote for "a". Cheers, Anthony ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [9front] Re: seq: fix infinite loop 2021-08-16 22:24 ` [9front] " Anthony Martin @ 2021-08-16 23:38 ` ori 2021-08-17 2:15 ` hiro 2021-08-21 1:30 ` Amavect 1 sibling, 1 reply; 8+ messages in thread From: ori @ 2021-08-16 23:38 UTC (permalink / raw) To: 9front; +Cc: nabijaczleweli Quoth Anthony Martin <ality@pbrane.org>: > But really, if you're playing games at the boundaries > of floating point precision, you're bound to lose. I > suggest we either a) do nothing and leave the loop > exactly as it is described in the manual but add a note > in the BUGS section, or b) go back to the v8 method > that calculates the number of steps in the loop before > executing it. Even with the latter method you can see > odd results due to precision limitations but at least > there will be no infinite loops. > > I vote for "a". > I agree -- *BUT* I like the consolidation of the two loops. I'd take that. I also wonder if: for(val = min; val <= max; val += incr){ if(val == prev) sysfatal("insufficient precision") ... prev = val } covers the infinite loop case sufficiently. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [9front] Re: seq: fix infinite loop 2021-08-16 23:38 ` ori @ 2021-08-17 2:15 ` hiro 0 siblings, 0 replies; 8+ messages in thread From: hiro @ 2021-08-17 2:15 UTC (permalink / raw) To: 9front sounds like a good addition to the BUGS section of the man page On 8/17/21, ori@eigenstate.org <ori@eigenstate.org> wrote: > Quoth Anthony Martin <ality@pbrane.org>: >> But really, if you're playing games at the boundaries >> of floating point precision, you're bound to lose. I >> suggest we either a) do nothing and leave the loop >> exactly as it is described in the manual but add a note >> in the BUGS section, or b) go back to the v8 method >> that calculates the number of steps in the loop before >> executing it. Even with the latter method you can see >> odd results due to precision limitations but at least >> there will be no infinite loops. >> >> I vote for "a". >> > > I agree -- *BUT* I like the consolidation of the > two loops. I'd take that. > > I also wonder if: > > for(val = min; val <= max; val += incr){ > if(val == prev) > sysfatal("insufficient precision") > ... > prev = val > } > > covers the infinite loop case sufficiently. > > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [9front] Re: seq: fix infinite loop 2021-08-16 22:24 ` [9front] " Anthony Martin 2021-08-16 23:38 ` ori @ 2021-08-21 1:30 ` Amavect 1 sibling, 0 replies; 8+ messages in thread From: Amavect @ 2021-08-21 1:30 UTC (permalink / raw) To: 9front [-- Attachment #1: Type: text/plain, Size: 1470 bytes --] On Mon, 16 Aug 2021 15:24:29 -0700 Anthony Martin <ality@pbrane.org> wrote: > But really, if you're playing games at the boundaries > of floating point precision, you're bound to lose. I > suggest we either a) do nothing and leave the loop > exactly as it is described in the manual but add a note > in the BUGS section, or b) go back to the v8 method > that calculates the number of steps in the loop before > executing it. Even with the latter method you can see > odd results due to precision limitations but at least > there will be no infinite loops. > > I vote for "a". I vote for 'b', at the toss of a coin. I've spent some time bikeshedding this sickly program, and attached is the sickly result. It does neat things like `seq -w 003` printing out 001, 002, 003. It probably shouldn't do that, though. It also doesn't use exponential notation to print out numbers. This allows seq to not be limited from -999999 to 999999. Even still, the result doesn't feel quite right. Implementing a floating point number parser in seq is weird. Invalid number combinations cause it to crash. The following does not print out 0.3 in either implementation: seq 0.1 0.1 0.3 The only way to correctly print out decimals is by nesting seq: for(i in `{seq 0 9}) for(j in `{seq -w 0 9}) echo $i.$j Therefore, I propose replacing floating point with integers. If we're lucky, seq can count decimals using integers. We can also implement hexadecimal counting. Thanks, Amavect [-- Attachment #2: seq.c --] [-- Type: text/x-c++src, Size: 2691 bytes --] #include <u.h> #include <libc.h> #include <ctype.h> double first = 1.0; double last = 0.0; double step = 1.0; double nsteps = 0.0; double final = 0.0; int constant; int consigned; char *format; char *zarg; char *sstep; void usage(void) { fprint(2, "usage: %s [-w] [-fformat] [first [step]] last\n", argv0); exits("usage"); } /* * [+-]?(([0-9]+([.][0-9]*)?)|([.][0-9]+))([Ee][+-]?[0-9]+)? */ void getwidths(char *s, int *mh, int *mt) { int h, t, e, m; h = 0; t = 0; m = 0; if(*s == '+' || *s == '-'){ s++; m++; } if(isdigit(*s)){ s++; h++; while(isdigit(*s)){ s++; h++; } if(*s == '.'){ s++; while(isdigit(*s)){ s++; t++; } } }else if(*s == '.'){ s++; if(!isdigit(*s++)) goto out; while(isdigit(*s)){ s++; t++; } }else goto out; if(*s|32 == 'e'){ s++; e = atoi(s); if(e > 307) e = 307; if(e < -30) e = -30; h += e; t -= e; } out: if(h <= 0) h = 1 + m; if(t < 0) t = 0; if(h > *mh) *mh = h; if(t > *mt) *mt = t; return; } int headwidth(double d) { int i; i = 1; if(d < 0.0){ d = -d; i++; } if(d > 1.0) i += log10(d); return i; } void buildfmt(void) { int h, t, z; static char fmt[16]; h = 0; t = 0; /* determine head and tail widths from bounds and increment */ getwidths(zarg, &h, &t); if(sstep) getwidths(sstep, &z, &t); z = headwidth(final); if(z > h) h = z; if(constant){ if(t > 0) h += t + 1; if(first < 0.0 || last < 0.0){ consigned = 1; sprint(fmt, "%%+%d.%df\n", h, t); }else sprint(fmt, "%%%d.%df\n", h, t); }else sprint(fmt, "%%.%df\n", t); format = fmt; } void main(int argc, char *argv[]){ int j, n; char buf[512], ffmt[4096]; double val, i, preval; ARGBEGIN{ case 'w': if(format) sysfatal("-w incompatible with -f"); constant++; break; case 'f': if(constant) sysfatal("-f incompatible with -w"); format = EARGF(usage()); if(format[strlen(format)-1] != '\n'){ sprint(ffmt, "%s\n", format); format = ffmt; } break; default: goto out; }ARGEND out: if(argc<1 || argc>3) usage(); last = atof(argv[argc-1]); if(argc > 1) first = atof(argv[0]); if(argc > 2){ sstep = argv[1]; step = atof(argv[1]); } if(step == 0.0) sysfatal("zero increment"); nsteps = (last - first) / step; final = floor(nsteps) * step + first; zarg = argv[0]; if(!format) buildfmt(); preval = step + first; for(i = 0; i <= nsteps; i++){ val = i * step + first; n = sprint(buf, format, val); if(constant){ for(j = 0; buf[j] == ' '; j++) buf[j] = '0'; if(consigned && j > 0){ buf[0] = buf[j]; buf[j] = '0'; } } write(1, buf, n); preval = val; } exits(0); } ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [9front] seq: fix infinite loop 2021-08-15 22:02 [9front] seq: fix infinite loop Sean Hinchee 2021-08-16 21:45 ` Amavect 2021-08-16 22:24 ` [9front] " Anthony Martin @ 2021-08-18 2:49 ` ori 2 siblings, 0 replies; 8+ messages in thread From: ori @ 2021-08-18 2:49 UTC (permalink / raw) To: 9front; +Cc: nabijaczleweli Quoth Sean Hinchee <henesy.dev@gmail.com>: > наб on twitter, then over e-mail (CC'd) pointed out that seq(1) in > 9front with some very large inputs > > For example: > > ; seq 2e16 20000000000000002 > > would loop forever. > > After patch application, seq with the above example would print > '2e+16' once and then exit. > Another question: does seq even need to do floating point? Can we change it to simply do integers? Surprisingly, it seems like we don't even use seq much in our base system. If I had to guess at how often it was used, I'd have been off by an order of magnitude at least. % cd /rc/bin && g seq `{walk -f .} ./ape/libpng-config:46: --static revise subsequent outputs for static linking ./homespool:9: echo creating seqno file ./homespool:10: touch $home/spool/ctrl/seqno ./homespool:12: chmod 222 $home/spool/ctrl/seqno ./nietzsche:3: n=`{seq 1 638 | fortune /fd/0} ./newt:90: case ,*; seq 1 `{echo $* | sed 's/,//g'} ./newt:91: case *,; seq `{echo $* | sed 's/,//g'} $posts($#posts) ./newt:92: case *,*; seq `{echo $* | sed 's/,/ /g'} ./newt:167: posts=`{seq 1 $#rposts} ./newt:231: r=`{seq $r(1) `{echo $r(1)^+10|bc}} ./newt:233: r=`{seq $r(1) $posts($#posts)} Out of the places that I see 'seq', four are accidentally picked up, and the remainder would not be affected by this. While we're here: % seq -f '%s' 10 seq 4704701: suicide: sys: trap: general protection violation pc=0x20331f I wonder if it'd be worth scanning for unknown format verbs and barfing on them, or whether that's too much complexity. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-08-21 1:55 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-15 22:02 [9front] seq: fix infinite loop Sean Hinchee 2021-08-16 21:45 ` Amavect 2021-08-18 22:53 ` Eckard Brauer 2021-08-16 22:24 ` [9front] " Anthony Martin 2021-08-16 23:38 ` ori 2021-08-17 2:15 ` hiro 2021-08-21 1:30 ` Amavect 2021-08-18 2:49 ` [9front] " ori
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).