9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] Repeated Bungetc or Bungetrune
From: Joel Salomon @ 2007-02-15  0:39 UTC (permalink / raw)
  To: 9fans

Although bio(2) says:
	Bungetc and Bungetrune may back up a maximum of five bytes.
I’ve found that Bungetrune is idempotent.

The code in question is from my lexer (switching on the return value from Bgetrune):
	case 'l':	case 'L':
		switch(Bgetrune(bin)) {
		case '\'':
			return charlex(1);
		case '"':
			return strlex(1);
		default:
			Bungetrune(bin);
		}
		// fall through
	case 'a':	case 'b':	case 'c':	case 'd':	case 'e':
	case 'f':	case 'g':	case 'h':	case 'i':	case 'j':
	case 'k':		case 'm':	case 'n':	case 'o':
	case 'p':	case 'q':	case 'r':	case 's':	case 't':
	case 'u':	case 'v':	case 'w':	case 'x':	case 'y':
	case 'z':
	case 'A':	case 'B':	case 'C':	case 'D':	case 'E':
	case 'F':	case 'G':	case 'H':	case 'I':	case 'J':
	case 'K':		case 'M':	case 'N':	case 'O':
	case 'P':	case 'Q':	case 'R':	case 'S':	case 'T':
	case 'U':	case 'V':	case 'W':	case 'X':	case 'Y':
	case 'Z':
	case '_':
		Bungetrune(bin);
		return idlex();
This doesn’t work; identifiers beginning with ‘l’ lose their initial character.

From a cursory reading of /sys/src/libbio/bgetc.c and
/sys/src/libbio/bgetrune.c, it seems that Bungetc will allow itself to
be called more than once where Bungetrune sets a condition to prevent
that (bp->runesize = 0;).  (Actually, it seems that there is no
hard-coded limit to how many times Bungetc can be called.  Will
arbitrary back-up work, or will something break deep within the Biobuf
if I abuse it?)
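
The difference described above can be reproduced with a small stand-in, written in portable C so it runs anywhere. This is a toy model of what the post reports, not the libbio source: Toybuf and the t* functions are hypothetical names, the whole input sits in memory, and only ASCII is handled, so a rune is always one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { Teof = -1 };

/* Toy stand-in for a Biobuf; NOT the real libbio structure. */
typedef struct Toybuf Toybuf;
struct Toybuf {
	const unsigned char *data;	/* whole input, already in memory */
	size_t len;
	size_t pos;			/* index of the next unread byte */
	size_t runesize;		/* bytes in the last rune read; 0 once ungot */
};

void
tinit(Toybuf *b, const char *s)
{
	b->data = (const unsigned char*)s;
	b->len = strlen(s);
	b->pos = 0;
	b->runesize = 0;
}

/* Read one byte. */
int
tgetc(Toybuf *b)
{
	if(b->pos >= b->len)
		return Teof;
	return b->data[b->pos++];
}

/* Back up one byte; nothing here stops repeated calls. */
void
tungetc(Toybuf *b)
{
	assert(b->pos > 0);	/* the toy refuses to back up past the start */
	b->pos--;
}

/* Read one rune (ASCII only in this toy) and remember its size. */
long
tgetrune(Toybuf *b)
{
	int c;

	c = tgetc(b);
	b->runesize = (c == Teof) ? 0 : 1;
	return c;
}

/* Back up by the remembered size, then forget it. */
void
tungetrune(Toybuf *b)
{
	b->pos -= b->runesize;
	b->runesize = 0;	/* a second call in a row backs up 0 bytes */
}
```

With the input "long", two tgetrune calls followed by two tungetrune calls leave the toy positioned at 'o', which matches the lost initial character; two tgetc calls followed by two tungetc calls leave it back at 'l'.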

For this case anyway, where I know that the Rune read was a
single-byte character, is it safe to use Bungetc, and will that allow
me to back-up more than once?  I need (I think) three bytes of
back-up.

--Joel




* Re: [9fans] Repeated Bungetc or Bungetrune
From: Joel C. Salomon @ 2007-02-15  0:58 UTC (permalink / raw)
  To: 9fans

> For this case anyway, where I know that the Rune read was a
> single-byte character, is it safe to use Bungetc, and will that allow
> me to back-up more than once?  I need (I think) three bytes of
> back-up.

I've tried it, and (on a small test case) nothing broke; 'long' is
recognized as the keyword rather than the identifier 'ong'.  I just
would like some reassurance from someone who knows the bio code
better.

Also, after Beof is read, does Bunget* have any meaning?

--Joel



* Re: [9fans] Repeated Bungetc or Bungetrune
From: Joel C. Salomon @ 2007-02-15  2:37 UTC (permalink / raw)
  To: 9fans

On 2/14/07, Joel C. Salomon <joelcsalomon@gmail.com> wrote:
> > For this case anyway, where I know that the Rune read was a
> > single-byte character, is it safe to use Bungetc, and will that allow
> > me to back-up more than once?  I need (I think) three bytes of
> > back-up.
>
> I've tried it, and (on a small test case) nothing broke; 'long' is
> recognized as the keyword rather than the identifier 'ong'.  I just
> would like some reassurance from someone who knows the bio code
> better.

I've gone through the code again; the only places I have multiple
back-up are places where I know that the characters to be unread are
single-byte UTF sequences, so I use Bungetc.  Other back-ups are
single Rune of unknown UTF length, to be reread as part of the next
token, so I can (and must!) use Bungetrune.
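
A different technique from the one described above, shown only as a sketch: keep the pushback in the lexer itself, so the number of runes that can be unread does not depend on which library call is used. Src, srcget, srcunget, widecheck, and Maxpush are hypothetical names. The input here is a plain string standing in for Bgetrune(bin), and bytes stand in for runes.

```c
#include <assert.h>

enum {
	Seof = -1,
	Maxpush = 8	/* hypothetical limit; size it to what the grammar needs */
};

typedef struct Src Src;
struct Src {
	const char *p;		/* next unread input byte; stands in for the Biobuf */
	long push[Maxpush];	/* pushed-back runes; the last one pushed is read first */
	int npush;
};

/* Return the next rune, preferring anything that was pushed back. */
long
srcget(Src *s)
{
	if(s->npush > 0)
		return s->push[--s->npush];
	if(*s->p == '\0')
		return Seof;
	return (unsigned char)*s->p++;
}

/* Push a rune back; runes of any width cost one slot each. */
void
srcunget(Src *s, long r)
{
	assert(s->npush < Maxpush);
	s->push[s->npush++] = r;
}

/*
 * Mirrors the 'l'/'L' case of the lexer: returns 1 after consuming a
 * wide-literal prefix (l or L, then a quote), else 0 with input restored.
 */
int
widecheck(Src *s)
{
	long c, d;

	c = srcget(s);
	if(c != 'l' && c != 'L'){
		srcunget(s, c);
		return 0;
	}
	d = srcget(s);
	if(d == '\'' || d == '"')
		return 1;
	srcunget(s, d);	/* push in reverse order of reading */
	srcunget(s, c);
	return 0;
}
```

After widecheck returns 0 on "long", the next srcget calls yield 'l' and then 'o', so an identifier reader starting there sees the whole word.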

What started as a simple design is now hopelessly tangled.  Next time I use lex.

> Also, after Beof is read, does Bunget* have any meaning?

Apparently, yes: to unget the last (pre-Beof) character read.  A few
rides through an infinite loop cleared that up.
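
The loop is easy to recreate in a toy. The sketch below assumes, as reported above, that an unget after end of input steps back over the last real character. In, inget, inunget, and skipletters are hypothetical names, and this is not the bio code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { Eofc = -1 };

typedef struct In In;
struct In {
	const char *s;	/* input held in memory */
	size_t len;
	size_t pos;	/* index of the next unread byte */
};

/* Read one byte; at end of input the position does not move. */
int
inget(In *in)
{
	if(in->pos >= in->len)
		return Eofc;
	return (unsigned char)in->s[in->pos++];
}

/* Step back one byte, even if the last read returned Eofc. */
void
inunget(In *in)
{
	if(in->pos > 0)
		in->pos--;
}

/* Consume letters and leave the first non-letter unread; returns the count. */
int
skipletters(In *in)
{
	int c, n;

	n = 0;
	while((c = inget(in)) != Eofc && isalpha(c))
		n++;
	if(c != Eofc)	/* without this guard, Eofc would un-read the last letter */
		inunget(in);
	return n;
}
```

Without the guard, every call at the end of "ab" would step back over 'b' and report one more letter, so a caller reading tokens until end of input would never finish.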

--Joel



* Re: [9fans] Repeated Bungetc or Bungetrune
From: Charles Forsyth @ 2007-02-15  9:05 UTC (permalink / raw)
  To: 9fans

>What started as a simple design is now hopelessly tangled.  Next time I use lex.

i think i'd just untangle the design.  it's not completely unheard of that
an initial idea turns out in implementation to have been a little under-developed,
but now you know better.  in fact, i'd have supposed that having
that experience was part of the point of doing a fairly realistic project.


