From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from euclid.skiles.gatech.edu (list@euclid.skiles.gatech.edu [130.207.146.50]) by melb.werple.net.au (8.7.5/8.7.3) with ESMTP id JAA20386 for ; Mon, 20 May 1996 09:09:14 +1000 (EST)
Received: (from list@localhost) by euclid.skiles.gatech.edu (8.7.3/8.7.3) id SAA19483; Sun, 19 May 1996 18:36:53 -0400 (EDT)
Resent-Date: Sun, 19 May 1996 18:36:53 -0400 (EDT)
From: Zoltan Hidvegi
Message-Id: <199605192234.AAA00866@hzoli.ppp.cs.elte.hu>
Subject: Re: 8-bit patch for zle_tricky.c
To: A.Main@dcs.warwick.ac.uk (Zefram)
Date: Mon, 20 May 1996 00:34:48 +0200 (MET DST)
Cc: zsh-workers@math.gatech.edu
In-Reply-To: <29384.199605181151@stone.dcs.warwick.ac.uk> from Zefram at "May 18, 96 12:51:33 pm"
X-Mailer: ELM [version 2.4ME+ PL11 (25)]
MIME-Version: 1.0
Content-Type: application/pgp; format=text; x-action=sign
Content-Transfer-Encoding: 7bit
Resent-Message-ID: <"IvNmD2.0.Lm4.5Awdn"@euclid>
Resent-From: zsh-workers@math.gatech.edu
X-Mailing-List: archive/latest/1097
X-Loop: zsh-workers@math.gatech.edu
Precedence: list
Resent-Sender: zsh-workers-request@math.gatech.edu

-----BEGIN PGP SIGNED MESSAGE-----

> There is also a very serious problem in tokenisation: the metafied
> line is tokenised as if it were literal text. For example:
>
> % echo foo^@bar
> foo  bar
>
> Note the double space. What happens is that foo^@bar is encoded as
> "foo\203 bar", and the tokeniser splits this into "foo\203" and
> "bar". The former of these strings is illegal as a metafied string, but
> if it happens to be terminated by two NULs then it unmetafies to "foo ".
> I think the lexer will need significant changes to support metafication.
> It's probably best to make the hgetc/hungetc history interface binary,
> but leave ingetc/inungetc metafied. I'll leave it to others that know
> that area of the code better.

Well, I do not think that this bug is serious. The biggest advantage of metafication is that it does not require significant changes in the lexer.
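To make the quoted example concrete, here is a minimal C sketch of the encoding it describes. It is an illustration only, not zsh's actual code. The names sketch_metafy and sketch_unmetafy are hypothetical. The sketch assumes the marker byte is \203 (0x83) and that the escaped byte is the original XORed with 32, which is consistent with NUL turning into a space in the example. It escapes only NUL and the marker itself, although the real set of metafied characters is larger.

```c
#include <stddef.h>

#define META 0x83 /* the \203 marker byte from the quoted example */

/* Assumption for this sketch: only NUL and the marker need escaping. */
static int needs_meta(unsigned char c)
{
    return c == 0 || c == META;
}

/* Encode len raw bytes from in into out (out must hold up to 2*len bytes).
 * Returns the number of bytes written. */
static size_t sketch_metafy(const unsigned char *in, size_t len,
                            unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (needs_meta(in[i])) {
            out[n++] = META;
            out[n++] = in[i] ^ 32; /* NUL becomes a space */
        } else {
            out[n++] = in[i];
        }
    }
    return n;
}

/* Decode a buffer produced by sketch_metafy. A trailing marker with no
 * following byte is copied through unchanged. */
static size_t sketch_unmetafy(const unsigned char *in, size_t len,
                              unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == META && i + 1 < len)
            out[n++] = in[++i] ^ 32;
        else
            out[n++] = in[i];
    }
    return n;
}
```

Encoding the seven bytes of foo^@bar this way gives the eight bytes "foo\203 bar". A tokeniser that then splits on the space sees "foo\203" and "bar", which is the failure described in the quoted text.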
I see that in your patch you added the IBLANK, INBLANK and ISPECIAL types to the null character. I still think that null only needs IMETA. Adding IBLANK just complicates things in the lexer and I do not think it is necessary. You may ask, then, why I added it to IFS. The main reason was that nulls cannot be passed in arguments to external commands. But iblank/inblank are used only on the shell input, which does not have this limitation. In this case only six lines have to be added to lex.c to handle the situation. hgetc/hungetc can remain unchanged, since hungetc is only used to check whether the next character in the input stream has a certain property or not. But a metafied character never has these properties, so when the lexer sees a Meta it knows that this character is not interesting without looking at the real character following the Meta.

Now here is a much bigger patch to the lexer and to a few other files. This hopefully makes substitution fully sh compatible when zsh is invoked as sh. I used ksh93 and bash 1.14.5 as a reference. Sometimes bash and ksh behave differently. In such cases I tried to implement the more logical or the simpler behaviour. Examples:

% echo "${foo:-"*"}"
*

That's how bash behaves. ksh globbed the files. I think bash is better here.

% foo=foobar
% bar=???
% echo "${foo%$bar}"
foo
% foo='foo$bar'
% echo "${foo%'$bar'}"
foo

As you see, $bar was not expanded in the second example. That's the way bash and ksh behave. The string between % and } is parsed again with the lexer as if it were an unquoted string. This is only done if the substitution is double quoted, since unquoted substitutions are correctly parsed by the lexer the first time and it is unnecessary to repeat this.

Another change I made is that (, |, ) and ~ lose their special meaning in ${...%...} substitution patterns if extendedglob is not set. It also affects the -m arguments of builtins and everything else which uses tokenize().
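Returning to the point about Meta and the lexer: the idea that a scanner can treat a Meta and the byte after it as one uninteresting unit can be sketched in C as follows. This is a hypothetical illustration, not the six lines added to lex.c. The function names are made up, the marker value 0x83 is taken from the "\203" in the quoted example, and a plain space stands in for whatever the real lexer treats as a blank.

```c
#include <stddef.h>

#define META 0x83 /* marker byte, as in the "\203" example */

/* Naive scan: stops at the first space byte, even when that space is
 * really the second half of an escaped pair. */
static size_t first_word_len_naive(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len && s[i] != ' ')
        i++;
    return i;
}

/* Meta-aware scan: on seeing the marker, skip it together with the
 * following byte without inspecting that byte at all. */
static size_t first_word_len(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        if (s[i] == META && i + 1 < len) {
            i += 2; /* an escaped pair is never a blank */
            continue;
        }
        if (s[i] == ' ')
            break;
        i++;
    }
    return i;
}
```

On the twelve bytes "foo\203 bar baz" the naive scan stops after four bytes, in the middle of the escaped pair, while the Meta-aware scan stops after eight bytes, at the real blank.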
I think that extendedglob should be set by default if zsh is not invoked as sh/ksh, but I'd like to hear others' opinions about that.

After this patch zsh should be very sh compatible if invoked as sh. I linked /bin/sh to zsh and I have no problems (I've been doing this for almost a year, but now I'm more confident that this does not break scripts).

The subst.c patch below is quite trivial and most of it deals with STOUC() and (unsigned char) casts. There are many places in zsh where (unsigned char) is used. I ask everyone to use the STOUC macro instead of the (unsigned char) cast, since the latter does not work with some compilers. I'll replace all (unsigned char) casts with STOUC() in beta18.

The patch to the lexer also fixes handling of the null character. The patch to utils.c removes the above discussed IBLANK, INBLANK and ISPECIAL types of null. If patch says that it is already applied, do not reverse it, just ignore it. It means that you did not apply Zefram's recent 8-bit patch.

I also added some cmdpushes to the lexer. After the patch the %_ prompt expansion will tell you if you are in a brace expansion or in a ${...} parameter substitution.

I also added some debugging tests. Please use configure --enable-zsh-debug or add -DDEBUG to DEFS in the Makefile if you try this patch. This may make zsh a little bit slower (I do not think that it will be noticeable) and it may catch some bugs.

The patch is gzipped and uuencoded due to its size.

Bye,

  Zoltan

begin 644 lexsubst_cleanup.gz
M'XL(`%2>GS$``^4\:7?3R+*?/;^B;1ALQW*0Y#T!Y@3P#+D3$H:$^^8]+O_38X?Q\/XL_+@Y_&EC8T,KU>OUK%3P>KWV?;=U'UYPO:U68\MK M%/!U?$G_H4$:#:?18U1D#/[=WV`PRG08S%B2!FG(-NYC0Y%*TR&;1BD;#19!
M'$:)`Z4W@S@8AF^@(CAC#YF[#;VQSRR,L.BP03+]#$]^JXU-PTD0LXW!(HVQ M1(!S"."I?GT(KC>;#?]FTVFY$G_\P>&FT7`>C;;-FB`)%T&J5V90U+6..D7T MWL$,$3`&3<;3.$EGD:5R.-DFZ@!\34$=&_B]EN.YG@[_+*D_XO`#'4Q$1!-' MA+>:2&$'B12TK^+'WU_AN!UA[,H1ABXFYC2*Q!P'6*6"UF$XR3I(BO2:3N\" MBGA>V_%\?Y4DP\F'>3Q"K,33=JZ1"Y%ZSG=8S+/WX3ECNTT)\&=X-DH6>L4T MBL/1-)95A(_G-1W/NPBCINMXS;8II(+!2-GA6\GQ;)J,R5H/D]D:HV6?O$"O M8)8-9F.Z8KCLMB+R&M/50#D5T!B_TDD*@-?H`#W\]03SVQW'[P@1P#?\=L/Q MV\K(W!F%XVD4LKT__3>/_WAU<-1G7GNE_LG!\^<[S.NL5!\N_9%[WIYI1 M_;Q_!)U[W(`MHV1Z&H4C;G[`E@7#U'L-)NG$$24_*Z7SM[RTUD+YG9;C=Y5` M3Z-IBN\%@Z3R?CX=5:'NBV8_4I"L(I6$O>1&'A\?EX]+?SGX"F&*/^-YS"JI M2R(-D[,BM[[P6*M5";*")$'JGD"WO3\]3D;.36`4,$OQYCNC=R%RVS;DOA=N M5E9VFX[?4YIL(FH1M/+',E63T1)@58!F M5=''G&-E;-\ZMK]^;%^.[>MCDV)V4]+?F+X7$?$R(M6,]L.C M@U=/*L_#-*C*?FB.;HF6-K%NN&VGX2D+A=X;:+%%?3.O1"U/T7PXC^-P",]" M>+1U*5M0PH]).E_DW`US;87EEB_1GJP#%*#8WW^ZN__BU9'J";(]39(PK?3_ M/(+&_M/?]@X>5U&CR2%U/:?A=J2HW"HZWQ=Z*W.:7:?1:F?+1PJ_3\,4!M=0 M*BH#.T#8T95>R(=!S)^VM5Y#AXW0_4\5R%G;(@S?CI$P=<^A`G9\MYRGH3*M M@_!T&D73Z'0K0S5)T2_:?[6WQS6VT83``'QSM3!<`CB"="$85Y[=2D9P(1J= MAC+=&%2@NLHW*WQEJ8(LS&;S824+1*HX;V&RC`#V867(B\CL93*I/#E\\WSG MZ%D5`2\,T>>+"B^@``S9UZ_X^R&;T+#`?K32T)]H0G8: M<4:#T/^S__+EMJP]G:=S8"E1HX#6KJ!)*(EC84,@5SYVRX(7K8[3:'>5]?Q1 MV#O,O0$!]%E%E8F:J(S#=!E'BE!7I(M52F"5:?2:2DIP=`U&-39?:8H8ZR+E M$A!8\GRF$/2"1L'#'"L81AHB_LV4'P:$$8-D0D2_=P\'%19A=_^H_W+GR='N MO_OHP/;WCPZK0"A"M%+\/`N)LO`.*X8?%T$T`B7`4J4(&A."74V1K.#`PEB' MSW;W#X_`]N`(X#_%TZA:K5:5S@'LSZ9@QN)/X+0#/!$$32,V">,0Y((MXO!] 
M"%H(RT@1SEDY"-N$#;HI^B'CAPV0Z M"UF%>#V-=&8?1V5"19A3X9$1YS\$HY$2-ZR!EV^%ZS"-;=`D&XI[E[(-\2ZRVV*:3>%:/=]INYE9%JHLS3WG'66/ M`/,/8?E]B)$`01]PVIPZL&A_P(6%4(G!!V+S,9NFN&C$X,6=01^21[!V833] M'!"G=0X+0WMX]')W_[?,!X[T1?.:%A-_R(/3P`AO%`:I\G!]9;-HF+>E83!IAUMSW7:?D=I!RUKR8OR6F.!$1E1E1\)!"Z*8M>"+5;GM.NZWR`K'/_Y&A.M%9YALKQNY)=7@BM<$R[GE^`E!O(F['[3D= MS2W^7F`.KP^.C8H=UW4ZC4XN3P#K!9F(BTSJ[OXECI6$US"V6"'S(:$HX[NA M$&(1W_8.7?6H]K%85%\I?LD0.22UQVX#`[DY<:1[KXLOLC@1,NL9)P^8!^A1( M;-MH]JSVJEB$%V%$&Y8&.MP)RJ&#X=0C*Q0XU9/YV5EP">5VM!5=R!_&7IE7 M8AO<'*Z@6&6*#^ENH^%T.IU<8N8:RG';7J)4L958+^_CR,I5U9/CKJI?X7JZ M9X8P/-CX$>I'F249D]BT4-!C=6D2UE.H9VT%PJOJZ440W5!A9=2D^U[_.`V6 M7OD/UV2#[5=3:.OJU_*<3LM<_<(U"W*Q,AU-3Z']&S4*0* MY18%$LJ1LZU[5S!1.3-^SLU3#Z#OF>;I[X7^(K98D+"RI-L"I#PS5#@WW%`M MEBV<2RGEGB+AFK-1'!G=FHG0%JV:Q&Z5/%L]_"("OK.@;&U=US.;1SH*RE?$YX7W!4_CLHJL"-K#O1^7?=.>..Q'L5A4[V.Q[.B>ET$?V3>2-+` M>'7;YK;1M=CPMU!'"O>/H))5>)M=I]O*-D85G9[F"?5T+:&>2DII^5HSB"^5 MS1=SB[C*H7P.X[A26D9G03J<0+AY7"HYM/_/MRR(U5W?Z7:S7>_;`UK+^]P8 MZ6V2_L,>+%MCCSM[5WM13#>>;[2F-J ME[-J_OUSW5W\%EW6UCK3YLM9Z,RPVW!ZGK8\W#;^%^O;5>GP+5I[*3UL\M%S MFT[/U9)3O4;7Z;64E4,3]U:>EC$/<^07+WX\35BJ`4^EU_0C'+EM""7BM.$& MYHQOO8G="'TS@N\9;M&6'!X.QBP10#/:S.U&Y#_JSLE])>J95" MGN?T_*9Q4X+V@/E>IJ'0M*L'DU`&ZR?S_&IN#]/)]C=Q.1V^Y<\`M'D\+&5# MO0#*CT>DQ2Q9%&G4G)3U5S[/0O$6_4'I22!^`/&NH7"$HX^L#AT'RS'/`XH= M2+GDY-8<.1.$KL),5>3`E9,K6L[+ M(H`G%O`85T(I5T]\J:2-\)?%3NNIB"R2SEE0C5$(:JF"\[)?]/)+'(L_T(2E^^+MZ,XE8A;\(2 MF5T5^^X;4ARS:VZG\9?6[TO.%^6;(1Q_)#IDN>T]<^,B-]9=3<^S MD>A2E.OZ>!6PF;'W-NBP?H/Z.Q.C9LD>9QN;WT0FJ_!T07BZK_K6H9XYMF=ST-3M_968'A_L#K8X=/N0H@N.:R5+B386R3@`SV#M< MZM?/5\H9T,SZ"TMI>9="!\I=Q>1=2*]`T*G5!<`Z9B+D!H!E]A?G*BKWN9AQ M0.LYX.DVN2TKH^]U@IM%V/+D!;VQ;:8!M=>]Z]##SC@?K^8V7%VT.FVHZC84 MR7;V]@[^1^1V5M<@9D9,F@MH"ZFR]Q54,(#P2Q15-5<1FN1;2GFH.^"^#J,& MBF+3N"TMPB*$C>81=*+3U*/Y M<@#X$>.2377"5O/T.16['0"CU\QBTUL!@\[`;BAHV-$$NL&_9;(,9K-/#(]L MXF&X<#R/0X;C@(!5JIM,_5@0L-.UY3JH3L:597EJE!=7#XZJV\OBQ"C&#NH. MCDT!(#!74D*4`%'0Y"8RQ`A$"(_]`EZ\SHC6,TDKTDR@XP@L.'4S&:UY("6. 
MYWE=G46W@)$#HGH%I"Y.-UB2##0##*BE&:)YRL*S1?KIFQ,--R&J76S:/C?` MFMB@Y!NS"K4FSC"1.>07"N!Q'AN)0S`8RT@<[`XKB2+D^7HEHQ/E_%H""V93 ML.D)PV?`-(S?@WKP2_>X+4!RX0/(7F8`+P,Y#U$ML\"/6,.G!0R>'S#/[U3E M^IM#DT4A6+V_?AZ6%<**?63XU[VX0I]UU*@1N^'?-YF<,4O8![R!$8"8Q6?! M#&Q/#6W/$'=PHU$=;T"P(#Y=GN$&S6"9LBEF'L`&S:,TP('QDD@`LF4:*J"_ M'`HW>`@MNN,P/9VD=;S\`79K%.+Q_KM?-C+1_D0S9\A% M@20]H3VK\5'?T#MO.+[B4'N"[/@BM8T?\)^AHJ"G506D>=467/>,&=>@C,91[,^2+\:MI7!M$`1%#]RF'`MD'%8Z MF"]`K!\O\2*3E4(_)T@0BD!PV`3GA*BT[G-"#?NJY'5::.+5JG1G M&@UG2S`RI<_IIT6X.2EQ#.37,V2,]V=5'LRE/8!'H(&4&J72`[H2`PS=%J&> M%.+7U%ZKG3R$`?""CC8JZHRN'^S!0\R`&F,L9#I9O@BAYJ\'+S.;YYIMX*SO M]66;)Q8UCK%R=GXPQN;=I3LL1X$<`=@Q/SNF6PIE%<@B&`9!=3>%650S%7Y+ M>N(U*1Y,68"YG!WF+:MOXXM5,GT7/PSD>5I`YGL-_%:0N_IA&!BQO[?[Z]&S M_CX.Z+?-MF?]E_VG!T]H,K]C;7M*;?IG8F020P#I]RQM+W8(Q8;+D>>Q@_7G M*JI].IL/@EFBJ7=6(U5*MU-!_":SR*D!4C&NBML>9T"#F)7 MVC;H;$=76:K(,B@%X92E\7#Q"6Q3[*/ADLEL](^GT3++TYWK)X.+[^CL5F)L M)N`I8)X&Y4>!5^(A>GGF\2E!V_!/(VOP=0?7.+T1@;>/IK6RF*W]5(?G-C"G MULR.R9"UG,_&M&/SGK96JG)47L1TVS0),%[ZA14W@O MXJ`7C1/,7GO\7@6:<+F]1U^B6;E^L<2S/_J5Q7+-W#$1,(F<+LW%,O>[5%+9 M:?R&F=M4.[@_%+/O@86=7Q2"M[M&AUPD+1+^EP3UZ2$?[&3A3Y.="E5-LZ24XT:O1Y.W#*H<:6):;8K3V;% MVO,PU^%[%YW7+/_,Y9`7[M@Q?S%?1B,3\W=I5?\B05A99ONQ9\M9BIG(>TN9 MN\A^<\W0!A*:Q\U?H)(N`K^-`&^_H^()+>:?*O0[%QU^7(\3(6+%@XX/YQ,) M2_G5BC59'8C?>-YBF$M6X!L@U+$5+QS_T4]0/!,I%[;WD#8EE] M,T^Z:-+OTA9+4=2]LK5+I>=;!<[W_*8#O[1#7**JT;B07_]9Y9>\2UZNZ(6O M>J&J%_YK&E/KK1IS'PW[JI"6)8&@]=E5CYA7Q^RW<=[._N_LZ]L=Y^>ME>Z':=7 M[!>)?K*Y:#:[]F'PB;X?"7\/7_2?[.[LK8QL?.$-1]"_[Y9]9RYG,^B;A0^, M!KY85N6'Z(`78@[\#AX.?72@`!(?SE,4U/S?OYF".OY_-XW^%_8&:P:R60`` ` end -----BEGIN PGP SIGNATURE----- Version: 2.6.3i Charset: noconv iQCVAwUBMZ+iAQupSCiLN749AQEg9gP/fmFOBCNPhfEycQB6fTOKxCLPuV5Ugwsu 0cTuOjXUNxYufl5yaWcAQEynB6jAvDGsvylsBFzdqC/Jea/YsQv9nkHv3ytX1ByJ dQmXSJz16NyAlwnrSgcSFSGqVSkdxJwC7YSo6RixazqZoyNAtmzmX4Lnja+yT3RF ++aEgL8B7s0= =RnxI -----END PGP SIGNATURE-----