From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from Princeton.EDU ([128.112.128.1])
	by hawkwind.utcs.toronto.edu with SMTP id <2706>;
	Sat, 5 Dec 1992 15:52:46 -0500
Received: from phoenix.Princeton.EDU by Princeton.EDU
	(5.65b/2.95/princeton) id AA16809; Sat, 5 Dec 92 15:52:07 -0500
Received: from tan.Princeton.EDU by phoenix (5.65b/1.113)
	id AA11075; Sat, 5 Dec 92 15:52:04 -0500
From: Emin Gun Sirer
Received: by tan.Princeton.EDU (4.1/CS_101_Cluster_Client)
	id AA08118; Sat, 5 Dec 92 15:52:02 EST
Date: Sat, 5 Dec 1992 15:52:02 -0500
Message-Id: <9212052052.AA08118@tan.Princeton.EDU>
To: dmason@plg.uwaterloo.ca, rc@hawkwind.utcs.toronto.edu
Subject: Re: more wishes for chrismas

>From: Dave Mason
>Date: Sat, 5 Dec 1992 11:21:13 -0500
>
>> Date: Fri, 4 Dec 1992 09:02:51 -0500
>> From: Alan Watson
>>
>> There are a few characters which only make sense at certain places --
>> like "=" which has to be quoted in dd commands -- but at least the
>> current rules have a simplicity about them.
>
>This is the only spot where I get bitten occasionally and wouldn't
>mind a change. If local assignments were only allowed at the
>beginning of lines, then:
	dd count=1 bs=512
>wouldn't cause a problem. In fact, looking at the grammar in the rc
>manual, assignments *are* only allowed at the beginning of lines, and
>I don't see the rule that causes the problem.

Yes, promotion is inconsistent. For example:

;~ a sdfdsfg		[status 1]
;ls file1 ~ file3
;ls file1 = file3	[syntax error]
;in arg1
syntax error
;cmnd in asfd
;(elem1 in elem3)

So the rule in the latter part is that the keyword gets promoted to a
string unless it is the first word on a line. But the same thing is
not true of "=", which does not get promoted anywhere. "~", on the
other hand, does get promoted, but *in the lexer*, which is not a good
thing.

The solutions I see, in order of desirability as far as I'm concerned:

1. Fix dd and awk. (I was at Bell Labs when someone changed the Plan 9
   dd to take "-infile fname" and "-outfile" as opposed to
   "infile=fname". People were furious that a lot of their scripts had
   to be fixed. But it only happens once and saves a lot of headache
   later on.)

2. Promote '=' as if it were a keyword. Put '~' promotion in the
   grammar along with '='.

3. Promote every keyword construct that causes a syntax error to a
   string and retry (this'll require following multiple paths through
   the parser, but is doable).

Duff's rc paper praises sh for using recursive descent, but criticizes
it because a lot of the routines change their behaviour according to
some flags. The counterpart to this in a yacc-based program is the
special treatment of certain tokens in the lexer. Yes, it's nice that
rc has a yacc grammar that anyone can understand, but does everyone
know when a given character represents token TWIDDLE or CHAR, and is
this treatment uniform for all characters that are more or less the
same?

Gun.
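
The positional rule described above ("the keyword gets promoted to a
string unless it is the first word on a line") can be sketched as
follows. This is only an illustration in Python, not rc's actual
lexer or yacc grammar: the function name classify, the tags KEYWORD
and WORD, and the whitespace-only word splitting are all assumptions
made for the example. "=" is left out on purpose, since in an
assignment it follows the variable name rather than starting the
line, so it would need a different position test.

```python
# Hypothetical sketch of position-dependent keyword promotion.
# Not taken from rc's source; names and tags are illustrative only.

KEYWORDS = {"in", "~"}  # words the message shows being promoted


def classify(line):
    """Tag each whitespace-separated word of one command line.

    A keyword in first position keeps its special meaning (KEYWORD).
    Anywhere else it is promoted to an ordinary string (WORD).
    """
    tokens = []
    for position, word in enumerate(line.split()):
        if word in KEYWORDS and position == 0:
            tokens.append(("KEYWORD", word))
        else:
            tokens.append(("WORD", word))
    return tokens


if __name__ == "__main__":
    for example in ("in arg1", "cmnd in asfd", "ls file1 ~ file3"):
        print(example, "->", classify(example))
```

Keeping the decision in one place like this is the point of putting
promotion in the grammar: the same test applies to every keyword,
instead of some tokens being special-cased in the lexer.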