* Re: Lines and last lines.
@ 1992-12-01 7:57 Byron Rakitzis
From: Byron Rakitzis @ 1992-12-01 7:57 UTC (permalink / raw)
To: alan, sam-fans
This is a feature, not a bug:
x<space>
is the same thing as
x/.*\n/
(modulo the RE weirdness that's being argued right now, which I don't
want to get tied up with). In short, x<space> splits the current
selection into lines. This *is* documented in the tutorial, if not in
the man page as well.
BTW, it is a *very* handy shortcut for emulating ed commands:
,x g/foo/d
is the same as ed's
g/foo/d
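For readers without sam or ed to hand, the effect of both commands can
be sketched in Python. This is only an illustration of the semantics
described above, not how either editor is implemented; the function
name delete_matching_lines is made up for the example.

```python
import re

def delete_matching_lines(text, pattern):
    """Drop every line of text that contains a match for pattern."""
    # Split into lines, keeping each newline and any unterminated last line.
    lines = re.findall(r"[^\n]*\n|[^\n]+", text)
    # Keep only the lines that do NOT match, as g/pattern/d would.
    return "".join(line for line in lines if not re.search(pattern, line))

if __name__ == "__main__":
    print(delete_matching_lines("a foo\nbar\nfoo", "foo"), end="")
```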
PS I just checked the man page and it says that
"If the regular expression and its slashes are omitted, /.*\n/
is assumed."
but of course this is not the whole story since sam barfs on
,xg/foo/d
(Another mini-gripe I have with sam: the lexical analyzer, such as it
is, sure is weird. What's the point of having delimiters between one-
letter commands like x and g anyway? Isn't that almost the point of
one-letter commands? I guess, he says sarcastically, the delimiters are
there to disambiguate "c d" and "cd" (the only multicharacter command
in sam that I know of).)
(BTW, to the religious: my negative comments about sam are meant as
constructive criticism. I don't think Rob Pike is God, and I don't
think he achieved perfection with sam. He did come fairly close though,
didn't he?)
* Re: Lines and last lines.
@ 1992-12-01 19:54 Byron Rakitzis
From: Byron Rakitzis @ 1992-12-01 19:54 UTC (permalink / raw)
To: alan, sam-fans
The treatment of whitespace in the sam lexical analyzer is
very idiosyncratic.
Here are the features:
B<space>
means "open a new file".
x<space>
is the RE hack we talked about. This applies to y, X and Y as
well. Also it means that either an RE or a space is *required*
between "x" and any other command. This is why I think the
comment in the man page about omitting the RE reads so poorly.
BTW,
x<nl>
also works, and seems to be equivalent to
x/(\n|.)*/p
Here are some more dubious features:
f<space>
sets the current filename to null.
w<space>
prints
?no file name
which is consistent with "f<space>" setting the filename to
null, at least.
Otherwise, whitespace *seems* to be optional. Did I cover all
the cases?
BTW, reading and writing an octal dump a binary editor does not
make. I can do exactly the same with vi (the editor that the
labs people detest), including the bit about piping the whole
file through an octal dumper & undumper.
* Re: Lines and last lines.
@ 1992-12-01 19:27 Alan Watson
From: Alan Watson @ 1992-12-01 19:27 UTC (permalink / raw)
To: sam-fans
Several points.
Yes, I should have said that the undefined semantics of a final line
without a newline went somewhat against sam's philosophy that files
are arbitrary streams of TEXT.
The default for x is more subtle than I at first supposed, and is
useful not just when the final line of the FILE is not terminated by a
newline, but also when the final line of the RANGE is a partial line.
As to the behaviour of "x /RE/", clearly I misread the somewhat
contradictory documentation. The sam paper states on page 6 that
"blanks in these examples are to improve readability; sam neither
requires nor interprets them." The sam tutorial states on page 11 "if
x is followed immediately by a space, the pattern .*\n is assumed."
The man page states on page 5 that "if the regular expression and its
slashes are omitted /.*\n/ is assumed." Is putting a space between
the x and the RE really an omission of the RE? Apparently it is.
I don't like white-space to have semantic content (other than to serve
as a separator and terminator). While I like the default for
x ...
I think I might have preferred
x /RE/...
to be equivalent to
x/RE/...
but I guess one might argue that the difference is needed to
disambiguate
x/RE/ ...
and
x /RE/...
although the latter could have been written without ambiguity as
x {
/RE/...
}
Can anyone think of a situation where one might want to use "x /RE/..."?
Many a time I too have blessed emacs for allowing me to edit binary
files. However, all is not lost with sam, due to its superbly
flexible shell escapes. All one needs is a filter which takes a binary
file and writes out octal, and vice versa; one then uses ",<" and ",>"
to read and write binary, whilst editing the octal representation.
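A filter pair of the kind described could look roughly like the Python
sketch below. The names to_octal and from_octal, and the layout of 16
three-digit octal bytes per line, are assumptions made for the
example; any reversible dump format would serve equally well.

```python
def to_octal(data: bytes, per_line: int = 16) -> str:
    """Render bytes as space-separated 3-digit octal, per_line bytes a row."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append(" ".join(f"{b:03o}" for b in chunk))
    # Terminate the last row with a newline unless there is no data at all.
    return "\n".join(lines) + ("\n" if lines else "")

def from_octal(text: str) -> bytes:
    """Inverse of to_octal: parse whitespace-separated octal byte values."""
    return bytes(int(token, 8) for token in text.split())

if __name__ == "__main__":
    sample = bytes([0, 7, 8, 255, 10])
    dumped = to_octal(sample)
    print(dumped, end="")
    # Round-tripping must give back exactly the original bytes.
    assert from_octal(dumped) == sample
```

Because from_octal splits on any whitespace, the octal text can be
rewrapped freely in the editor without affecting what is written back.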
Alan.
* Re: Lines and last lines.
@ 1992-12-01 7:11 Michael John Haertel
From: Michael John Haertel @ 1992-12-01 7:11 UTC (permalink / raw)
To: alan; +Cc: sam-fans
"x" followed by a space means "for all lines"; it is not a bug,
it is a documented (see sam.tut.ms) feature.
* Re: Lines and last lines.
@ 1992-12-01 7:01 Alan Watson
From: Alan Watson @ 1992-12-01 7:01 UTC (permalink / raw)
To: sam-fans
Messing around with this, I think I've found a bug:
; sam -d
a/abcd\nefgh\nijkl/
,x /.*$/p
efghabcdabcd
,x/.*$/p
abcdefgh
The only thing different between the two x commands is the space
between the x and the RE.
I would very much appreciate it if someone more familiar with the code
than I am would take it upon themselves to investigate this.
Alan.
* Re: Lines and last lines.
1992-12-01 0:40 Alan Watson
@ 1992-12-01 5:46 ` Chris Siebenmann
From: Chris Siebenmann @ 1992-12-01 5:46 UTC (permalink / raw)
To: sam-fans
sam is happy with lines that don't end in newlines; it warns when
you write them out, but otherwise leaves them alone. The 'peculiar
properties' are probably that '$' does not match the end of a line
not terminated in one.
The manual page is not quite correct as to x/y's default selection;
there is an explicit routine that loops over all lines, using the usual
uncommented code.
- cks
* Re: Lines and last lines.
@ 1992-12-01 5:08 Byron Rakitzis
From: Byron Rakitzis @ 1992-12-01 5:08 UTC (permalink / raw)
To: alan, sam-fans
>My first comment is that [2] is somewhat against sam's philosophy that
>files are arbitrary byte streams; it does impose a structure on the
>file (i.e., that the final character must be \n).
Unfortunately, sam's philosophy is most emphatically not that files are
arbitrary byte streams. e.g.:
; sam -d
B /bin/ls
-. /bin/ls
?warning: null characters elided
and this is an improvement(!) on the previous situation (pre-unicode?),
which was that any "non-ascii" characters got elided. I'm not sure if
that meant nul+any-characters-with-the-8th-bit-set or what.
Anyway, in case you haven't noticed, I think this is one of sam's more
serious design flaws. It's one of the few reasons why I still want
emacs available on the Unix machines I use. (other reasons: gdb support
and glass tty support, i.e., the Poor Man's Window System.)
* Lines and last lines.
@ 1992-12-01 0:40 Alan Watson
1992-12-01 5:46 ` Chris Siebenmann
From: Alan Watson @ 1992-12-01 0:40 UTC (permalink / raw)
To: sam-fans
I'm struggling with sam's supposed break from the line-oriented nature
of most Unix utilities.
Relevant parts of the man page are [1]
\n Match newline.
^ Match the null string immediately after a newline.
$ Match the null string immediately before a newline.
and later [2]
(The peculiar properties of a last line without a newline are
temporarily undefined.)
and later still [3]
x/regexp/ command
For each match of the regular expression in the range, run the
command with dot set to the match. Set dot to the last match. If the
regular expression and its slashes are omitted, /.*\n/ is assumed.
Null string matches potentially occur before every character of the
range and at the end of the range.
My first comment is that [2] is somewhat against sam's philosophy that
files are arbitrary byte streams; it does impose a structure on the
file (i.e., that the final character must be \n).
Secondly, let's investigate the interactions of [2] and [3]:
; sam -d
a/abcd\n/
a/efgh\n/
a/ijkl/
,p
abcd
efgh
ijkl <- cursor here
,x/.*\n/p
abcd
efgh
<- cursor here
,x p
abcd
efgh
ijkl <- cursor here
These last two might be expected to give the same output, given the
stated default for x.
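The difference between the two loops can be reproduced outside sam.
The Python sketch below is an assumption about the behaviour seen in
the transcript, not a description of sam's code: x_regexp and x_lines
are hypothetical names, and x_lines simply treats an unterminated
final line as a line.

```python
import re

def x_regexp(text, pattern):
    """Return every match of pattern in text, like a loop over x/pattern/."""
    return [m.group(0) for m in re.finditer(pattern, text)]

def x_lines(text):
    """Split text into lines, keeping a final line that lacks a newline."""
    pieces = []
    start = 0
    while start < len(text):
        nl = text.find("\n", start)
        # No newline left: the rest of the text is the (partial) last line.
        end = len(text) if nl == -1 else nl + 1
        pieces.append(text[start:end])
        start = end
    return pieces

if __name__ == "__main__":
    buf = "abcd\nefgh\nijkl"
    print(x_regexp(buf, r".*\n"))  # the stated default: misses "ijkl"
    print(x_lines(buf))            # the observed default: includes "ijkl"
```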
Finally, let's investigate the interaction of [1] and [2]:
,x /^.*/p
efghijklabcd <- cursor here
,x /.*$/p
efghabcdabcd <- cursor here
Okay, my point is that it might have been better to define different
semantics for ^ and $, namely that they match the start of a line
(i.e., the null string at the start of the file and the null string
after a \n not at the end of the file) and the end of a line (i.e.,
the null string before a \n not at the end of the file and the null
string at the end of the file).
The default for x might then have been /^.*$/, and this might have
helped to define the semantics of a final line without a newline.
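Python's re module happens to offer anchors close to the ones proposed
here when re.MULTILINE is set, at least for text with no trailing
newline. The sketch below only illustrates the proposal on the
three-line example from this message; it says nothing about how sam's
regular expressions behave, and the name proposed_lines is made up.

```python
import re

def proposed_lines(text):
    """Match /^.*$/ with ^ and $ anchored at line starts and line ends."""
    # With re.MULTILINE, ^ matches at the start of the text and after each
    # newline; $ matches before each newline and at the end of the text.
    return re.findall(r"^.*$", text, flags=re.MULTILINE)

if __name__ == "__main__":
    # The final line has no newline, yet it is still found.
    print(proposed_lines("abcd\nefgh\nijkl"))
```

Note that the matches exclude the newlines themselves, unlike the
documented default of /.*\n/.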