From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from orpheus.amdahl.com ([129.212.11.6])
	by hawkwind.utcs.utoronto.ca with SMTP id <24623>;
	Fri, 21 Feb 1997 09:31:22 -0500
Received: from juts.ccc.amdahl.com by orpheus.amdahl.com with smtp
	(Smail3.1.29.1 #3) id m0vxw0H-00011dC; Fri, 21 Feb 97 06:31 PST
Received: by juts.ccc.amdahl.com (/\../\ Smail3.1.14.4 #14.6)
	id ; Fri, 21 Feb 97 06:31 PST
Received: by juno.ccc.amdahl.com (/\==/\ Smail #25.1)
	id ; Fri, 21 Feb 97 06:31 PST
Message-Id:
X-Mailer: exmh version 1.6.9 8/22/96
To: sam-fans@hawkwind.utcs.utoronto.ca
Subject: ssam-1.6 and libutf-2.7
X-Face: #XtQ?n%i%L2\|+cxl=,udz?jb=ZdVifqKtWh\j%[t%SpPO/J;r0V7jB2Q4[YOM6-\GQJf1-
	\}3/^-jzZl.WT^3-W\?aB::;?9B:FE53y

[This has already been posted to the wily list. At the risk of
offending those people who will thus see this message twice, I thought
some folks here might be interested. - agc]

I've made available new versions of libutf, some utf routines
including UTF-aware regular expressions, and ssam, a stream editor
using the sam command set.

Please note the name change, and the new URLs:

	http://www.westley.demon.co.uk/ssam-1.6.tar.gz
	http://www.westley.demon.co.uk/utf-2.7.tar.gz

A complete list of changes follows at the end of this mail. The
changes to ssam are mainly cosmetic changes and bug fixes, whilst in
libutf I have started implementing language-specific matching and
ordering, using a function called utflangcmp().

Many thanks to Bengt Kleberg (Bengt.Kleberg@uab.ericsson.se) for the
provision of Swedish, Finnish, Danish and Norwegian alphabets.

As usual, the correct way to install the software is:

	% tar xvzf utf-2.7.tar.gz
	% cd utf-2.7
	% ./configure
	% make tst
	% make install
	% cd ..
	% tar xvzf ssam-1.6.tar.gz
	% cd ssam-1.6
	% ./configure
	% make tst
	% make install

This release has been tested on UTS 4.3.2 (S390 mainframe), Solaris
2.4 (SS5), and NetBSD/i386 1.2C.

Take care,
Alistair

ssam-1.6 changes

+ tarted up explanation code, and added a test
+ moved stuff around in ssam()
+ moved ure match arrays from the program stack to within ssam_t. We
  now allocate space for the match offsets when we know how many we'll
  need. This removes the hardcoded limit on subexpressions.
+ implemented a saner way of introducing the default `p' command. We
  now do this when parsing, rather than on execution. This removes
  some cruft from the execution functions.
+ ran gcc -Wall again, and cleaned up miscellaneous warnings, changing
  configure.in etc on the way.
+ added code to free the match array, if requested via flags. Modified
  the existing free checks, so that de-allocation takes place if
  storage was allocated, not if it was used.
+ re-coded the 'x' and 'y' commands to take advantage of the improved
  ure ^ matching code. The 'g', 'v' and 's' commands are unaffected.
  This is actually a significant speedup, especially when searching
  for anchored matches in large strings/files.
+ split the file-writing part of ssam() out into ssamcommit(), and
  call it accordingly. This gives us more control over file writing.
+ changed Makefile to track the change to the name of the library
+ deleted "urelang.h", which doesn't exist anymore, and added "utf.h"

utf-2.7 changes

+ fixed a bug in ^ matching - anchored searches were only tried once,
  which didn't take into account the case where the string to be
  matched included newline characters.
+ re-arranged tests so that error tests are done at the end. Added a
  test for anchored beginning of line matching
+ added utflangcmp function, with a couple of supporting functions to
  get ordinal number of bits. Added findword test program, and one
  extra test case.
+ Swedish and Finnish alphabets from Bengt.Kleberg@uab.ericsson.se
  (Bengt Kleberg)
+ changed langcoll.utf file so that letters in brackets [] in an
  alphabet have the same collation ordering (e.g. v and w in Swedish).
  Modified all utf functions that use utfrune on the alphabets
  accordingly
+ bug fix for definition of ETCDIR - not incorporated in previous
  changes from Alan Watson (my mistake)
+ renamed library from libure to libutf (at suggestion of Alan
  Watson). Changed Makefile to make this possible.
+ fixed bug where v and w in Swedish weren't comparing as the same
  letter.
+ Norwegian and Danish alphabets from Bengt.Kleberg@uab.ericsson.se
  (Bengt Kleberg)
+ fixed a bug whereby language names were occasionally misconstrued
  (the old "English not found, using English" problem)
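
To illustrate the bracket convention mentioned in the utf-2.7 changes
(letters inside [] in an alphabet share one collation position), here
is a minimal sketch in C. It is not the libutf code: the names
decodeutf(), runeordinal() and lettercmp(), the one-line alphabet
string, and the lowercase-only Swedish table are all assumptions made
for illustration, and the real langcoll.utf layout may well differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical alphabet string, NOT copied from langcoll.utf:
 * lowercase Swedish with v and w bracketed, then a-ring, a-umlaut,
 * o-umlaut written as UTF-8 byte escapes. */
const char *const swedish =
	"abcdefghijklmnopqrstu[vw]xyz" "\xc3\xa5\xc3\xa4\xc3\xb6";

/* Decode one UTF-8 sequence at s into *rune and return its length in
 * bytes. Assumes well-formed input; a real decoder must validate. */
int decodeutf(const char *s, unsigned long *rune)
{
	const unsigned char *p = (const unsigned char *)s;

	if (p[0] < 0x80) {
		*rune = p[0];
		return 1;
	}
	if ((p[0] & 0xE0) == 0xC0) {
		*rune = ((unsigned long)(p[0] & 0x1F) << 6) | (p[1] & 0x3F);
		return 2;
	}
	if ((p[0] & 0xF0) == 0xE0) {
		*rune = ((unsigned long)(p[0] & 0x0F) << 12)
		      | ((unsigned long)(p[1] & 0x3F) << 6)
		      | (p[2] & 0x3F);
		return 3;
	}
	*rune = ((unsigned long)(p[0] & 0x07) << 18)
	      | ((unsigned long)(p[1] & 0x3F) << 12)
	      | ((unsigned long)(p[2] & 0x3F) << 6)
	      | (p[3] & 0x3F);
	return 4;
}

/* Return the collation position of `want' in `alphabet', or -1 if it
 * is not a letter of that alphabet. Every letter between '[' and ']'
 * gets the same position; the position advances at the ']'. */
int runeordinal(const char *alphabet, unsigned long want)
{
	int pos = 0;		/* position of current letter or group */
	int ingroup = 0;	/* non-zero between '[' and ']' */
	unsigned long r;

	assert(alphabet != NULL);
	while (*alphabet != '\0') {
		alphabet += decodeutf(alphabet, &r);
		if (r == '[') {
			ingroup = 1;
			continue;
		}
		if (r == ']') {
			ingroup = 0;
			pos++;
			continue;
		}
		if (r == want)
			return pos;
		if (!ingroup)
			pos++;
	}
	return -1;
}

/* Compare two letters by alphabet position: <0, 0 or >0. Letters not
 * in the alphabet have position -1 and so sort first. */
int lettercmp(const char *alphabet, unsigned long a, unsigned long b)
{
	return runeordinal(alphabet, a) - runeordinal(alphabet, b);
}
```

With this table, lettercmp(swedish, 'v', 'w') is 0, and a-ring
(U+00E5) orders before a-umlaut (U+00E4) even though its code point is
higher, because the order comes from the alphabet string rather than
from the code points.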