From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from oldp.astro.wisc.edu ([128.104.39.15]) by hawkwind.utcs.toronto.edu with SMTP id <2840>; Mon, 30 Nov 1992 22:08:16 -0500
Received: by oldp.astro.wisc.edu (5.65/DEC-Ultrix/4.3) id AA14736; Mon, 30 Nov 1992 18:40:45 -0600
Message-Id: <9212010040.AA14736@oldp.astro.wisc.edu>
To: sam-fans@hawkwind.utcs.toronto.edu
Subject: Lines and last lines.
Date: Mon, 30 Nov 1992 19:40:45 -0500
From: Alan Watson
X-Mts: smtp

I'm struggling with sam's supposed break from the line-oriented nature
of most Unix utilities. Relevant parts of the man page are

[1]
    \n   Match newline.
    ^    Match the null string immediately after a newline.
    $    Match the null string immediately before a newline.

and later

[2]
    (The peculiar properties of a last line without a newline are
    temporarily undefined.)

and later still

[3]
    x/regexp/ command
        For each match of the regular expression in the range, run the
        command with dot set to the match. Set dot to the last match.
        If the regular expression and its slashes are omitted, /.*\n/
        is assumed. Null string matches potentially occur before every
        character of the range and at the end of the range.

My first comment is that [2] is somewhat against sam's philosophy that
files are arbitrary byte streams; it does impose a structure on the
file (i.e., that the final character must be \n).

Secondly, let's investigate the interactions of [2] and [3]:

    ; sam -d
    a/abcd\n/
    a/efgh\n/
    a/ijkl/
    ,p
    abcd
    efgh
    ijkl <- cursor here
    ,x/.*\n/p
    abcd
    efgh
    <- cursor here
    ,x p
    abcd
    efgh
    ijkl <- cursor here

These last two might be expected to give the same output, given the
stated default for x.
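The difference between the last two commands can be reproduced outside
sam. What follows is only a rough analogue, assuming Python's re module
as a stand-in for sam's regular expressions (the two engines are not
the same, and the function name is made up for illustration). It loops
over matches of .*\n in the same three-line text and reports what the
loop never visits:

```python
import re


def newline_terminated_matches(text):
    """Return (matches of .*\\n, text left over after the last match).

    Illustrative analogue of ,x/.*\\n/ using Python's re, not sam's engine.
    """
    matches = list(re.finditer(r".*\n", text))
    # Everything after the end of the last match is never visited.
    covered = matches[-1].end() if matches else 0
    return [m.group() for m in matches], text[covered:]


if __name__ == "__main__":
    # Same contents as the sam session: the last line has no newline.
    found, leftover = newline_terminated_matches("abcd\nefgh\nijkl")
    print(found)      # ['abcd\n', 'efgh\n']
    print(leftover)   # ijkl
```

With an explicit /.*\n/ the unterminated "ijkl" falls outside every
match, which agrees with the ,x/.*\n/p output above; the bare ,x
evidently does something else for the last line.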
Finally, let's investigate the interaction of [1] and [2]:

    ,x /^.*/p
    efghijklabcd <- cursor here
    ,x /.*$/p
    efghabcdabcd <- cursor here

Okay, my point is that it might have been better to define different
semantics for ^ and $, namely that they match the start of a line
(i.e., the null string at the start of the file and the null string
after a \n not at the end of the file) and the end of a line (i.e.,
the null string before a \n not at the end of the file and the null
string at the end of the file). The default for x might then have been
/^.*$/, and this might have helped to define the semantics of a final
line without a newline.
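To make the proposed definitions concrete, here is a small sketch that
computes the offsets where the redefined ^ and $ would match, reading
the two parenthesised definitions above literally. It is Python, not
anything taken from sam, and the names line_starts, line_ends and
proposed_lines are hypothetical:

```python
def line_starts(text):
    """Offsets for the proposed ^: start of file, and just after any
    newline that is not the last character of the file."""
    last = len(text) - 1
    return [0] + [i + 1 for i, ch in enumerate(text)
                  if ch == "\n" and i != last]


def line_ends(text):
    """Offsets for the proposed $: just before any newline that is not
    the last character of the file, and the end of the file."""
    last = len(text) - 1
    return [i for i, ch in enumerate(text)
            if ch == "\n" and i != last] + [len(text)]


def proposed_lines(text):
    """Pair each start with the corresponding end (assumed pairing)."""
    return [text[s:e] for s, e in zip(line_starts(text), line_ends(text))]


if __name__ == "__main__":
    print(proposed_lines("abcd\nefgh\nijkl"))   # ['abcd', 'efgh', 'ijkl']
    print(proposed_lines("abcd\nefgh\n"))       # ['abcd', 'efgh\n']
```

Under this reading every start has exactly one end, so a final line
without a newline is an ordinary line. When the file does end in \n,
the last span runs to the end of the file and so includes that
newline; whether /^.*$/ should then match it is a detail the proposal
would still have to settle.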