From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from euclid.skiles.gatech.edu (list@euclid.skiles.gatech.edu [130.207.146.50])
    by melb.werple.net.au (8.7.5/8.7.3/2) with ESMTP id GAA02950
    for ; Fri, 14 Jun 1996 06:06:12 +1000 (EST)
Received: (from list@localhost)
    by euclid.skiles.gatech.edu (8.7.3/8.7.3) id PAA26786;
    Thu, 13 Jun 1996 15:56:20 -0400 (EDT)
Resent-Date: Thu, 13 Jun 1996 15:56:20 -0400 (EDT)
Message-Id:
Subject: New features, uniq(1) alike?
To: zsh-workers@math.gatech.edu
Date: Thu, 13 Jun 1996 21:56:47 +0200 (MET DST)
From: Thorsten Meinecke
Organization: none. Location: Berlin, Germany
X-Mailer: ELM [version 2.4 PL23]
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Resent-Message-ID: <"yQv_G3.0.SY6.a97mn"@euclid>
Resent-From: zsh-workers@math.gatech.edu
X-Mailing-List: archive/latest/1350
X-Loop: zsh-workers@math.gatech.edu
Precedence: list
Resent-Sender: zsh-workers-request@math.gatech.edu

This idea was prompted by the recent "Builtin append() and prepend() ..."
discussion. The patch below can do what the original poster requested and
a lot more, resembling the functionality of the external command `uniq'.

There are two places in zsh I thought these features might fit:

As new options -q, -d and -U to the print builtin, operating on its
argument list, which is straightforward and in one respect comparable to
the existing -m option (that they might throw away all the arguments).

Secondly, in parameter substitution, I invented the new flags q, d and u.
A little redecoration was required here in order to deal with the
a-perfectly-good-array-reduced-to-one-or-zero-elements problem, admittedly
an inconsistency with other flags.

Now, what is this all about:

  print -q  ... or ...  ${(q)foo}   print only the first of repeated words
        -d              ${(d)foo}   print only words that are repeated
        -U              ${(u)foo}   print only words that are not repeated
        -dq             ${(dq)foo}  the sole useful combination of the above.
                                    Throw away all words that are not
                                    repeated, then keep only one of all
                                    those that are; equivalent to
                                    `uniq -d|uniq'

There are the following advantages when compared to the external command:

1) of course there is no fork/exec necessary when something like
   `foo=$(print -lo $bar|uniq -u)' gets replaced by `foo=${(uo)bar}'
2) the word delimiter doesn't need to be a newline;
3) the order of elements can be preserved, if desired.

OTOH the uniquifying (sp?) can be done faster if the elements are sorted.
The latter is the major drawback with `typeset -U'. This is hopelessly
inefficient in case the array is bigger than a few elements and there is
no way to tell it that the array is sorted. I'm dealing with this by
making unique after sorting, and providing different functions depending
on the sort options -o and -O, or sort flags (oO), respectively.

A few examples and the patch are to follow below. Is it useful? Is the
shell the right place for it? Are the algorithms employed too crude? What
do you think?

Regards,
--Thorsten

Find the words that are only in one of the two arrays and not in the other

% foo=(a b c c) bar=(b c d d)
% echo ${(u)=:-${(q)foo} ${(q)bar}}
a d

Which commands are in more than one directory of $PATH?

% cmds=( $^path/*(N-x^/:t) ); whence -ap ${(dqo)cmds}
/usr/bin/compress
/bin/compress
/local/home/kaefer/bin/csh
/usr/bin/csh
/bin/csh
...

Prepend() and append(); could be done with `typeset -U' as well. Take the
first positional parameter - if present - as a name of an array and
prepend/append the other positional parameters to it, removing duplicate
words as appropriate.

% functions prepend append
prepend () {
        [[ -n $1 ]] && $1=($(eval print -q $@[2,-1] \$$1))
}
append () {
        local tmp
        tmp=($(eval print -q $@[2,-1] \$$1 ))
        [[ -n $1 ]] && $1=($tmp[$#,-1] $@[2,-1])
}

Word frequency count.
Zsh saves a few `tr's and one `sort -u' but still doesn't obviate the need for external commands :) % (IFS=$IFS'!"#$%&'\''()*+,-./:;<=>?@[\\]^_`{|}~'; print -l ${(Ldo):-$(< zsh-RCS/Etc/FAQ))})|uniq -c|sort -r|head -10|cat -n 1 482 the 2 228 to 3 203 is 4 173 of 5 173 in 6 170 a 7 169 zsh 8 151 you 9 149 and 10 113 for begin 664 b20-uniqpatch.gz M'XL(",1KP#$"`WIS:#(N-F(R,"UU;FEQ<&%T8V@`W5I;4]M*$GYV?L7`5IWX M(AM+EFV,P\/F4*<.55F@DE#[0%A*EL98%5D2ND!(PG_?[IX9:23+0$AJ+\=% MC&;4W=.7KWMZAG2[7784N7M?T]4B]X/,#]/!V@E;YFPVV1O"SX@-AP MM:R!:;WJ]_N/,I@6,\<'PQ'\O.I6/SAFD^&^,3%M1A.,)?[U*ALP=KQDV8JS M3\NW9^B#G M4Y_Y(;P%.C<*,_XE8T["67KGQ#'W!D`Q^'CV:@<7B!,_S%#T!7#!.$S2X&M\ MQ3D8 ME3*/IV[B+X!J<8_V<'<5P:*&$(Y,RR@(HCMP$?/\Y9(G/'1Y>D"A4^Z'Y_[_ MGOMOO/._=`0:$V(\-B:3H4@("A%.3$80 M243A2L3@%`=KYS-/49TUNXN2S\QU4HY"0H_''+["++A_U:/(]`3;#2[3$SZH M^28/_9N(()5"+$<0$.XC`065I0:R'M M#$K;;\A@KU@ZC71K0(YN#VB1<@V)M%C%CKQF1XWV*5.P"FN6;]BAYUE'+4SQ M?`>_?H_"6PZ1PX4#GF4\J2^810S*)90S#*X*)R+F0^+NK9QTE3F+@`]695LR M9:9]8-L'8PO;D@FA:1OQ4["T9D/#FDU42_(WP)6_A`?XP<^WD_-W[PRV&T>Q MMVNPH<$6?GCE>O1H&>SM\G9D<$$(7X_&+#959@Q`W>)^(^KL_?')Q]/ MSSY^$*+H'4GKFU(<$@#7^Z,SVAY/\J%IC>SQ9+H_B_WHU%WW=XN%:EKFZ6J[ MFKBYU_3<9`_"'U54,O?#K[N;$N\T=6!`S^)G-RFMP`"J0,AR\'\0"&I8_G*Q M4)DG&_R!6^;=A)G3`VMT8(\@[T:S(N\V2=N33W0<9\ MK\N<)90;VCBPGA6[GP&5!^J_FP=8`T'Z=;9*67=/,OK+=A2G%Z^CUY?L^W=& MSZ>O+SNHW3)*VB$[9,,Y"]D;%.G"4Z_74:J#M(OP$BA`+:4W3)!VY"]I^:@` M^J^UO#2^[V%[SOKGL)#>PQ35?V^C'=%8K_]QG9H MY$M:%(9.8OU#M">%5@?63Z6%:@FP8$?Z89/G$1;-V!MU6*D:BYM_V>8_EG8(U+8I(G1!+R/Z!:0)MQ=>X1<7&$)\D$$`L!(2FB9UKG:<86G&(,9V/&W@L5PWR]@-`#)RQ6.3<- MR!*20H+H"833&4L"3E2%;MJ!R6_P#S]R#K:);C>?RTE,#'@17H/&&%DY+7]! MQ%D[QX#.6;N;]WKP"$+G#(FSCJ0J*#-!V>WU,DH,@`&4HW87C:/`$T+,<8=]L?@:Y`119 ME4)E8.&4G)S2*132/X6XO-]'S^-C=F%")I%8U*O5*KPY;Y"P2+CSN?[BX1DN M?R_.GMRG"X7JZ2!*]&./`CH=/76H*W2"M!+U6]!^K&HQ.P2'L\^J@::+K)1,.#1T.R>93C,MD>^6S-0[5- M*?`9K+S!:`(B@H?`2"AJ!J3!W#Q)UDX&KJ1AX*19,=3@"MX..&Q:)*V`&6UD M.]+?_\+7,E/;NM@Z0CNX$10+=5!WJO:Y0)[8TN1TVC%M'6@UH;EQ*`H/E-#TD. 
MI%4(::E)%?L'D;,5AVWNI#6?E:6AZ-N`O4B;EO)*KGFJLCT)R%=1CAC/L`;E MNC.4H=U,6)G7C)*^+?WZO&+=4O@O'?%X,C"!:P53>/"BV/$\<(=$-E@CH!TG M'*J9?([2C`:"!EY!JU*,X*4^!/_F@?X2AIWF.VI[:$SL<=EH34=C8VI7&BU< M?AG`XB)1RDGJZI*-6>QP,XD?)_7AG*)U(]I5*5%@9W]3%P%LZ\C39Z5G`%.' M\AS<3>-R6"43]I>4T@/:!'FO2E!.S!L]-;5M<,RL]-0^N&Y_/"Z.>BV\XF)P M>#B@$:FCS#?GQ5RQFPLL$<\-\*A\+9QCECE<\"@&3V-0'GR,/F^@MS;H2R.X M9@2_=8)&$YJ\M&_/P"G%C1N14Z8P3)#J(1ZS48(%DECXBLY=Z\FI""*A?393W$@+!5#JK8JT7B*W<>&R5/'J!9/<)N2]3 M=D.J.)%#9FD[%<%$BU+K.2':S(P=E1KR+7[:MY'O=2JW+J2X3.Y#PFMMR][@ M?9SU09DE*M(/VO4,6UJE$>+.I,)7-!H%58VHT`^G+H:7))Z>8;\NE96J"9I2 M*TPKM2]M;-V5Q(/$%)'_,B^?[^D9]ZHO`)1M.;\_,\S1<%JMACSSG,QIAP:[ MIYK2TIJ%4-:G_M9D[V]-]GF-2R1X_^4)WF?(W9S@_9]*\*V21R^0[#XA]V7* MNOH%+,4:!,:!$\XD)E#B.S_\_,Z'8W"&P0CY70`3^,?:=D?L0J8],J'&JPMA M=6_V->!7T`^[G^^K-_#6$&_@+;R!M\N_?#52EY?PTRU_D;5,`[X*\&$6O6+T M7P?HS-4NE*<][)O<"^7AR8GQX+2@;S=6[16RG$0>'+(C3]L.H=U_F_N!5]X< M8!=-_Z,$CQ)&D['(A7/TQIU&<<5+ M!Q$'&UP7OQQI`)QYP`O,B>$X0NYI07S$91YZ9T<>.Q?$`X2+&,IAY08/0TNS M'7Q+IQK@HI46)$%OC;INV6*+F5`J(JH/)9&<*6]6R?`_G=`+A-'45-[Q!+_` EL!Q