[9fans] bison problem, not plan9 related

9fans - fans of the OS Plan 9 from Bell Labs
 help / color / mirror / Atom feed

* [9fans] bison problem, not plan9 related
@ 2009-10-21 17:52 Rudolf Sykora
  2009-10-21 18:03 ` Rudolf Sykora
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Rudolf Sykora @ 2009-10-21 17:52 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Hello,
sorry for an off-topic thing. But I guess somebody here could help me...
I have a problem with bison grammer

Having

%token	ATOM
%left	'+'
%left	REP

and a grammar:

block:	  ATOM
	| REP block
	| block '+' block
;

is ok. Having another grammer:

block:	  ATOM
	| REP block
	| block block %prec '+'
;

has 2 shift/reduce conflicts, similar to
state 7

    5 block: REP block .
    6      | block . block

    ATOM  shift, and go to state 3

    ATOM      [reduce using rule 5 (block)]
    $default  reduce using rule 5 (block)

    block  go to state 9

or
state 9

    6 block: block . block
    6      | block block .

    ATOM  shift, and go to state 3
    REP   shift, and go to state 4

    ATOM      [reduce using rule 6 (block)]
    $default  reduce using rule 6 (block)

    block  go to state 9

What I want is to have a parser that can read e.g. (the spaces are
left out by lex, they are not in what bison sees; I only write them
here for better readability)
12 Au 13 Cu 2 Ag
the former grammer (REP is for repetition) is able to read
12 Au + 13 Cu + 2 Ag
but I don't like those pluses, which are redundant.

Also important: I have those 'block' non-terminals there, since I want
to add another rule
block: '[' block ']'
so that I can use brackets and can parse things like
12 [ 2 Cu 3 Co]

Could anyone explain to me what goes wrong?
I can't figure it out...

Thanks a lot
Ruda

PS.: the grammer is actually identical to a grammer that can evaluate
expressions with +, *, and brackets, with usual operator precedence.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 17:52 [9fans] bison problem, not plan9 related Rudolf Sykora
@ 2009-10-21 18:03 ` Rudolf Sykora
  2009-10-21 18:46 ` Skip Tavakkolian
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Rudolf Sykora @ 2009-10-21 18:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

... anywhere where I wrote 'grammer' I meant 'grammar' ...



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 17:52 [9fans] bison problem, not plan9 related Rudolf Sykora
  2009-10-21 18:03 ` Rudolf Sykora
@ 2009-10-21 18:46 ` Skip Tavakkolian
  2009-10-21 19:44   ` Rudolf Sykora
  2009-10-21 18:47 ` Russ Cox
  2009-10-21 20:03 ` Bakul Shah
  3 siblings, 1 reply; 11+ messages in thread
From: Skip Tavakkolian @ 2009-10-21 18:46 UTC (permalink / raw)
  To: 9fans

i think this is what you want. untested:

pair: REP ATOM
	| REP '[' block ']'

block: pair
	| block pair

> Hello,
> sorry for an off-topic thing. But I guess somebody here could help me...
> I have a problem with bison grammer
>
> Having
>
> %token	ATOM
> %left	'+'
> %left	REP
>
> and a grammar:
>
> block:	  ATOM
> 	| REP block
> 	| block '+' block
> ;
>
> is ok. Having another grammer:
>
> block:	  ATOM
> 	| REP block
> 	| block block %prec '+'
> ;
>
> has 2 shift/reduce conflicts, similar to
> state 7
>
>     5 block: REP block .
>     6      | block . block
>
>     ATOM  shift, and go to state 3
>
>     ATOM      [reduce using rule 5 (block)]
>     $default  reduce using rule 5 (block)
>
>     block  go to state 9
>
> or
> state 9
>
>     6 block: block . block
>     6      | block block .
>
>     ATOM  shift, and go to state 3
>     REP   shift, and go to state 4
>
>     ATOM      [reduce using rule 6 (block)]
>     $default  reduce using rule 6 (block)
>
>     block  go to state 9
>
> What I want is to have a parser that can read e.g. (the spaces are
> left out by lex, they are not in what bison sees; I only write them
> here for better readability)
> 12 Au 13 Cu 2 Ag
> the former grammer (REP is for repetition) is able to read
> 12 Au + 13 Cu + 2 Ag
> but I don't like those pluses, which are redundant.
>
> Also important: I have those 'block' non-terminals there, since I want
> to add another rule
> block: '[' block ']'
> so that I can use brackets and can parse things like
> 12 [ 2 Cu 3 Co]
>
> Could anyone explain to me what goes wrong?
> I can't figure it out...
>
> Thanks a lot
> Ruda
>
> PS.: the grammer is actually identical to a grammer that can evaluate
> expressions with +, *, and brackets, with usual operator precedence.




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 17:52 [9fans] bison problem, not plan9 related Rudolf Sykora
  2009-10-21 18:03 ` Rudolf Sykora
  2009-10-21 18:46 ` Skip Tavakkolian
@ 2009-10-21 18:47 ` Russ Cox
  2009-10-21 19:41   ` Rudolf Sykora
  2009-10-21 20:03 ` Bakul Shah
  3 siblings, 1 reply; 11+ messages in thread
From: Russ Cox @ 2009-10-21 18:47 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> %token  ATOM
> %left   '+'
> %left   REP
>
> block:    ATOM
>        | REP block
>        | block block %prec '+'

>    5 block: REP block .
>    6      | block . block
>
>    ATOM  shift, and go to state 3
>
>    ATOM      [reduce using rule 5 (block)]

This says that yacc isn't sure how to decide,
having read REP block, between shifting ATOM
or applying the REP block reduction.

>    6 block: block . block
>    6      | block block .
>
>    ATOM  shift, and go to state 3
>
>    ATOM      [reduce using rule 6 (block)]

This says that yacc isn't sure how to decide,
having read block block, between shifting ATOM
or applying the block block reduction.

To know how to decide, yacc needs a precedence
for the thing being shifted and the rule.  You've
given precedences for each rule (REP block has
REP's precedence, and block block has +'s thanks
to the override) but not to ATOM.

Concretely, when yacc sees REP block ATOM
it isn't sure whether that's (REP block) ATOM or
REP (block ATOM).

Instead of

> %token  ATOM
> %left   '+'
> %left   REP

you probably want

%left '+'
%left REP
%nonassoc ATOM

This wasn't an issue in the first grammar because the shift/reduce
decision had to be made when looking at the '+' instead
of when looking at ATOM, and the '+' did have a precedence.

Russ

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 18:47 ` Russ Cox
@ 2009-10-21 19:41   ` Rudolf Sykora
  2009-10-21 19:48     ` Rudolf Sykora
  2009-10-21 20:21     ` Russ Cox
  0 siblings, 2 replies; 11+ messages in thread
From: Rudolf Sykora @ 2009-10-21 19:41 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2009/10/21 Russ Cox <rsc@swtch.com>:
> To know how to decide, yacc needs a precedence
> for the thing being shifted and the rule.  You've
> given precedences for each rule (REP block has
> REP's precedence, and block block has +'s thanks
> to the override) but not to ATOM.
>
> Concretely, when yacc sees REP block ATOM
> it isn't sure whether that's (REP block) ATOM or
> REP (block ATOM).
>
> Instead of
>
>> %token  ATOM
>> %left   '+'
>> %left   REP
>
> you probably want
>
> %left '+'
> %left REP
> %nonassoc ATOM
>
> Russ

Ok, thanks, this seems to have solved it.
So the %nonassoc says to the parser that
(REP block) ATOM
is the right decision as opposed to
REP (block ATOM)
right?

The sad point really is that I couldn't find a good explanation for
what %nonassoc really means. Everybody just says that it precludes a
situation where
'the same operator would be twice in a row...'
which isn't quite telling, when one realizes that the parser really
does not know what an 'operator' is... (both the documentation of
bison (Donnelly, Stallman) and of yacc (Johnson) are just like that; I
couldn't find any better)
So could you please say, what %nonassoc really assures?

Thanks
Ruda

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 18:46 ` Skip Tavakkolian
@ 2009-10-21 19:44   ` Rudolf Sykora
  0 siblings, 0 replies; 11+ messages in thread
From: Rudolf Sykora @ 2009-10-21 19:44 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

2009/10/21 Skip Tavakkolian <9nut@9netics.com>:
> i think this is what you want. untested:
>
> pair: REP ATOM
>        | REP '[' block ']'
>
> block: pair
>        | block pair

Thanks.
This would require there always be a REP before [ block ], which I
don't want (cf. ordinary expression (1+2) ).

Anyway, thanks.
Ruda



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 19:41   ` Rudolf Sykora
@ 2009-10-21 19:48     ` Rudolf Sykora
  2009-10-21 20:21     ` Russ Cox
  1 sibling, 0 replies; 11+ messages in thread
From: Rudolf Sykora @ 2009-10-21 19:48 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> So the %nonassoc says to the parser that
> (REP block) ATOM
> is the right decision as opposed to
> REP (block ATOM)
> right?

... probably not exactly. the highest priority of ATOM is probably
also important, as I think of it now...

Ruda



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 17:52 [9fans] bison problem, not plan9 related Rudolf Sykora
                   ` (2 preceding siblings ...)
  2009-10-21 18:47 ` Russ Cox
@ 2009-10-21 20:03 ` Bakul Shah
  2009-10-21 20:18   ` Rudolf Sykora
  3 siblings, 1 reply; 11+ messages in thread
From: Bakul Shah @ 2009-10-21 20:03 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

Is this what you are trying to do?

$ cat b.y <<'EOF'
%token ATOM REP
%%
blocks: block | block blocks;
block: ATOM | REP block | '[' blocks ']';
%%
EOF
$ bison b.y
$

On Wed, 21 Oct 2009 19:52:41 +0200 Rudolf Sykora <rudolf.sykora@gmail.com>  wrote:
> Hello,
> sorry for an off-topic thing. But I guess somebody here could help me...
> I have a problem with bison grammer
>
> Having
>
> %token	ATOM
> %left	'+'
> %left	REP
>
> and a grammar:
>
> block:	  ATOM
> 	| REP block
> 	| block '+' block
> ;
>
> is ok. Having another grammer:
>
> block:	  ATOM
> 	| REP block
> 	| block block %prec '+'
> ;
>
> has 2 shift/reduce conflicts, similar to
> state 7
>
>     5 block: REP block .
>     6      | block . block
>
>     ATOM  shift, and go to state 3
>
>     ATOM      [reduce using rule 5 (block)]
>     $default  reduce using rule 5 (block)
>
>     block  go to state 9
>
> or
> state 9
>
>     6 block: block . block
>     6      | block block .
>
>     ATOM  shift, and go to state 3
>     REP   shift, and go to state 4
>
>     ATOM      [reduce using rule 6 (block)]
>     $default  reduce using rule 6 (block)
>
>     block  go to state 9
>
> What I want is to have a parser that can read e.g. (the spaces are
> left out by lex, they are not in what bison sees; I only write them
> here for better readability)
> 12 Au 13 Cu 2 Ag
> the former grammer (REP is for repetition) is able to read
> 12 Au + 13 Cu + 2 Ag
> but I don't like those pluses, which are redundant.
>
> Also important: I have those 'block' non-terminals there, since I want
> to add another rule
> block: '[' block ']'
> so that I can use brackets and can parse things like
> 12 [ 2 Cu 3 Co]
>
> Could anyone explain to me what goes wrong?
> I can't figure it out...
>
> Thanks a lot
> Ruda
>
> PS.: the grammer is actually identical to a grammer that can evaluate
> expressions with +, *, and brackets, with usual operator precedence.
>



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 20:03 ` Bakul Shah
@ 2009-10-21 20:18   ` Rudolf Sykora
  0 siblings, 0 replies; 11+ messages in thread
From: Rudolf Sykora @ 2009-10-21 20:18 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Is this what you are trying to do?
>
> $ cat b.y <<'EOF'
> %token ATOM REP
> %%
> blocks: block | block blocks;
> block: ATOM | REP block | '[' blocks ']';
> %%
> EOF
> $ bison b.y
> $

My head is starting to spin, but I think that, possibly, yes.
Seems to be the simplest way how to do it. :)
Anyway, I also needed to know what R Cox explained, that was
important; still wondering what that %nonassoc really means.

Thanks
Ruda

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 19:41   ` Rudolf Sykora
  2009-10-21 19:48     ` Rudolf Sykora
@ 2009-10-21 20:21     ` Russ Cox
  2009-10-21 20:28       ` Rudolf Sykora
  1 sibling, 1 reply; 11+ messages in thread
From: Russ Cox @ 2009-10-21 20:21 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

> Ok, thanks, this seems to have solved it.
> So the %nonassoc says to the parser that
> (REP block) ATOM
> is the right decision as opposed to
> REP (block ATOM)
> right?

%token declares its arguments as tokens but
does not give them any precedence level.

%left, %right, and %nonassoc also declare their
arguments as tokens.  in addition, each such line
introduces a new precedence level stronger than
the ones introduced by previous lines.

if a shift/reduce conflict involves different precedences,
the stronger precedence always wins.

if a shift/reduce conflict is a tie between identical precedences,
the resolution depends on which of the three lines
(%left, %right, or %nonassoc) introduced the precedence.

precedences introduced by %left resolve the tie
by reducing; this creates left-to-right associativity (x-y-z).

precedences introduced by %right resolve the tie
by shifting; this creates right-to-left associativity (x^y^z in hoc).

precedences introduced by %nonassoc do not resolve
the tie.  they leave it as a conflict that gets reported.

if you're defining a precedence that is not intended
to be associative, much of the time it doesn't matter
which you pick, because your grammar is likely to
be such that ties never happen.  but %nonassoc is
still the safe choice.

in the running example, %nonassoc by itself
doesn't say which of those two is right.  it just
defines a precedence for ATOM, which is used
as the precedence for shifting ATOM.
because the REP block and block block rules
have precedences too, the ambiguity can be
resolved by comparing the precedences.
which way things get resolved depends on whether
the %nonassoc line comes before or after the
other precedences, not on the meaning of %nonassoc.

i said

> %left '+'
> %left REP
> %nonassoc ATOM

and that will give you REP block ATOM == REP (block ATOM)
which is probably not what you want.  to tweak it,
just move the %nonassoc line above the two %left lines.

russ

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [9fans] bison problem, not plan9 related
  2009-10-21 20:21     ` Russ Cox
@ 2009-10-21 20:28       ` Rudolf Sykora
  0 siblings, 0 replies; 11+ messages in thread
From: Rudolf Sykora @ 2009-10-21 20:28 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

I must say, many thanks!
Ruda

2009/10/21 Russ Cox <rsc@swtch.com>:
>> Ok, thanks, this seems to have solved it.
>> So the %nonassoc says to the parser that
>> (REP block) ATOM
>> is the right decision as opposed to
>> REP (block ATOM)
>> right?
>
> %token declares its arguments as tokens but
> does not give them any precedence level.
>
> %left, %right, and %nonassoc also declare their
> arguments as tokens.  in addition, each such line
> introduces a new precedence level stronger than
> the ones introduced by previous lines.
>
> if a shift/reduce conflict involves different precedences,
> the stronger precedence always wins.
>
> if a shift/reduce conflict is a tie between identical precedences,
> the resolution depends on which of the three lines
> (%left, %right, or %nonassoc) introduced the precedence.
>
> precedences introduced by %left resolve the tie
> by reducing; this creates left-to-right associativity (x-y-z).
>
> precedences introduced by %right resolve the tie
> by shifting; this creates right-to-left associativity (x^y^z in hoc).
>
> precedences introduced by %nonassoc do not resolve
> the tie.  they leave it as a conflict that gets reported.
>
> if you're defining a precedence that is not intended
> to be associative, much of the time it doesn't matter
> which you pick, because your grammar is likely to
> be such that ties never happen.  but %nonassoc is
> still the safe choice.
>
> in the running example, %nonassoc by itself
> doesn't say which of those two is right.  it just
> defines a precedence for ATOM, which is used
> as the precedence for shifting ATOM.
> because the REP block and block block rules
> have precedences too, the ambiguity can be
> resolved by comparing the precedences.
> which way things get resolved depends on whether
> the %nonassoc line comes before or after the
> other precedences, not on the meaning of %nonassoc.
>
> i said
>
>> %left '+'
>> %left REP
>> %nonassoc ATOM
>
> and that will give you REP block ATOM == REP (block ATOM)
> which is probably not what you want.  to tweak it,
> just move the %nonassoc line above the two %left lines.
>
> russ
>
>



^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2009-10-21 20:28 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-21 17:52 [9fans] bison problem, not plan9 related Rudolf Sykora
2009-10-21 18:03 ` Rudolf Sykora
2009-10-21 18:46 ` Skip Tavakkolian
2009-10-21 19:44   ` Rudolf Sykora
2009-10-21 18:47 ` Russ Cox
2009-10-21 19:41   ` Rudolf Sykora
2009-10-21 19:48     ` Rudolf Sykora
2009-10-21 20:21     ` Russ Cox
2009-10-21 20:28       ` Rudolf Sykora
2009-10-21 20:03 ` Bakul Shah
2009-10-21 20:18   ` Rudolf Sykora

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).