sam-fans - fans of the sam editor
* Re: Lines and last lines.
@ 1992-12-01 19:54 Byron Rakitzis
  0 siblings, 0 replies; 8+ messages in thread
From: Byron Rakitzis @ 1992-12-01 19:54 UTC (permalink / raw)
  To: alan, sam-fans

The treatment of whitespace in the sam lexical analyzer is
very idiosyncratic.

Here are the features:

	B<space>

means "open a new file".

	x<space>

is the RE hack we talked about. This applies to y, X and Y as
well. Also it means that either an RE or a space is *required*
between "x" and any other command. This is why I think the
comment in the man page about omitting the RE reads so poorly.

BTW,

	x<nl>

also works, and seems to be equivalent to

	x/(\n|.)*/p


Here are some more dubious features:

	f<space>

sets the current filename to null.

	w<space>

prints

	?no file name

which is consistent with "f<space>" setting the filename to
null, at least.

Otherwise, whitespace *seems* to be optional. Did I cover all
the cases?
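
As a rough illustration of the rules listed above, here is a toy Python model of how the text after "x" might be classified. It is only a sketch under stated assumptions: `parse_x` and `DEFAULT_LINES` are made-up names, a leading address such as "," is not handled, escaped slashes inside the RE are ignored, and none of this is taken from sam's source.

```python
# Documented default pattern, as quoted from the man page in this thread.
DEFAULT_LINES = r".*\n"


def parse_x(cmd):
    """Toy model: split an x command into (pattern, remaining command)."""
    if not cmd.startswith("x"):
        raise ValueError("not an x command")
    tail = cmd[1:]
    if tail == "" or tail.startswith("\n"):
        # x<nl>: reported above to behave like x/(\n|.)*/p
        return (r"(\n|.)*", "p")
    if tail.startswith(" "):
        # x<space>: the RE counts as omitted, so everything after the
        # blanks is the command, even if it starts with a slash.
        return (DEFAULT_LINES, tail.lstrip(" "))
    if tail.startswith("/"):
        end = tail.index("/", 1)  # assumption: no escaped slashes in the RE
        return (tail[1:end], tail[end + 1:].lstrip(" "))
    # Anything else, e.g. "xg/foo/d", is rejected.
    raise ValueError("an RE or a space is required after x")
```

Under this model "x /.*$/p" yields the default pattern with "/.*$/p" as the command, whereas "x/.*$/p" yields the pattern ".*$" with "p" as the command, which is the distinction the thread is about.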

BTW, reading and writing an octal dump a binary editor does not
make. I can do exactly the same with vi (the editor that the
labs people detest), including the bit about piping the whole
file through an octal dumper & undumper.



* Re: Lines and last lines.
@ 1992-12-01 19:27 Alan Watson
  0 siblings, 0 replies; 8+ messages in thread
From: Alan Watson @ 1992-12-01 19:27 UTC (permalink / raw)
  To: sam-fans

Several points.

Yes, I should have said that the undefined semantics of a final line
without a newline went somewhat against sam's philosophy that files
are arbitrary streams of TEXT.

The default for x is more subtle than I at first supposed, and is
useful not just when the final line of the FILE is not terminated by a
newline, but also when the final line of the RANGE is a partial line.

As to the behaviour of "x /RE/", clearly I misread the somewhat
contradictory documentation.  The sam paper states on page 6 that
"blanks in these examples are to improve readability; sam neither
requires nor interprets them."  The sam tutorial states on page 11 "if
x is followed immediately by a space, the pattern .*\n is assumed."
The man page states on page 5 that "if the regular expression and its
slashes are omitted /.*\n/ is assumed."  Is putting a space between
the x and the RE really an omission of the RE?  Apparently it is.

I don't like white-space to have semantic content (other than to serve
as a separator and terminator).  While I like the default for

	x ...

I think I might have preferred

	x /RE/...

to be equivalent to

	x/RE/...

but I guess one might argue that the difference is needed to
disambiguate

	x/RE/ ...

and
	x /RE/...

although the latter could have been written without ambiguity as

	x {
	/RE/...
	}

Can anyone think of a situation where one might want to use "x /RE/..."?

Many a time I too have blessed emacs for allowing me to edit binary
files.  However, all is not lost with sam, due to its superbly
flexible shell escapes.  All one needs is a filter which takes a binary
file and writes out octal, and vice versa; one then uses ",<" and ",>"
to read and write binary, whilst editing the octal representation.
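
A minimal sketch of such a pair of filters, written in Python purely for illustration. The function names `to_octal` and `from_octal`, and the layout of three octal digits per byte with 16 bytes per line, are assumptions and not anything sam prescribes.

```python
def to_octal(data: bytes) -> str:
    """Render each byte as three octal digits, 16 bytes per line."""
    words = [format(b, "03o") for b in data]
    lines = [" ".join(words[i:i + 16]) for i in range(0, len(words), 16)]
    return "\n".join(lines) + ("\n" if lines else "")


def from_octal(text: str) -> bytes:
    """Inverse of to_octal: whitespace-separated octal words back to bytes."""
    return bytes(int(word, 8) for word in text.split())
```

Because `from_octal` splits on any whitespace, the octal text can be re-wrapped freely while editing; only the digits themselves matter. Wrapped as two small commands that read standard input and write standard output, these could serve as the filters used with ",<" and ",>".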

Alan.



* Re: Lines and last lines.
@ 1992-12-01  7:57 Byron Rakitzis
  0 siblings, 0 replies; 8+ messages in thread
From: Byron Rakitzis @ 1992-12-01  7:57 UTC (permalink / raw)
  To: alan, sam-fans

This is a feature, not a bug:

	x<space>

is the same thing as

	x/.*\n/

(modulo the RE weirdness that's being argued right now which I don't
want to get tied up with --- i.e., in short, x<space> splits the
current selection into lines. This *is* documented in the tutorial if
not in the man page as well.)

BTW, it is a *very* handy shortcut for emulating ed commands:

	,x g/foo/d

is the same as ed's

	g/foo/d
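
For readers who know neither editor, the effect of that command can be modelled in a few lines of Python. This is a sketch only: `delete_matching_lines` is a made-up name, the file is assumed to be held in one string, and Python's regular expressions stand in for sam's, which differ in detail.

```python
import re


def delete_matching_lines(text, pattern):
    """Model of ',x g/pattern/d': drop every line containing a match."""
    # Split on "\n" only, keeping each newline with its line; a final
    # line without a newline is kept as a partial line.
    lines = re.findall(r"[^\n]*\n|[^\n]+", text)
    return "".join(line for line in lines if not re.search(pattern, line))
```

For example, applying it to "one\nfoo bar\ntwo\n" with the pattern "foo" leaves "one\ntwo\n".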

PS I just checked the man page and it says that

	"If the regular expression and its slashes are omitted, /.*\n/
	is assumed."

but of course this is not the whole story since sam barfs on

	,xg/foo/d

(Another mini-gripe I have with sam: the lexical analyzer, such as it
is, sure is weird. What's the point of having delimiters between one-
letter commands like x and g anyway? Isn't that almost the point of
one-letter commands? I guess, he says sarcastically, the delimiters are
there to disambiguate "c d" and "cd" (the only multicharacter command
in sam that I know of).)

(BTW, to the religious: my negative comments about sam are meant as
constructive criticism. I don't think Rob Pike is God, and I don't
think he achieved perfection with sam. He did come fairly close though,
didn't he?)



* Re: Lines and last lines.
@ 1992-12-01  7:11 Michael John Haertel
  0 siblings, 0 replies; 8+ messages in thread
From: Michael John Haertel @ 1992-12-01  7:11 UTC (permalink / raw)
  To: alan; +Cc: sam-fans

"x" followed by a space means "for all lines"; it is not a bug,
it is a documented (see sam.tut.ms) feature.



* Re: Lines and last lines.
@ 1992-12-01  7:01 Alan Watson
  0 siblings, 0 replies; 8+ messages in thread
From: Alan Watson @ 1992-12-01  7:01 UTC (permalink / raw)
  To: sam-fans

Messing around with this, I think I've found a bug:

; sam -d
a/abcd\nefgh\nijkl/
,x /.*$/p
efghabcdabcd,x/.*$/p
abcdefgh

The only thing different between the two x commands is the space
between the x and the RE.

I would very much appreciate it if someone more familiar with the code
than I am would take it upon themselves to investigate this.

Alan.



* Re: Lines and last lines.
  1992-12-01  0:40 Alan Watson
@ 1992-12-01  5:46 ` Chris Siebenmann
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Siebenmann @ 1992-12-01  5:46 UTC (permalink / raw)
  To: sam-fans

 sam is happy with lines that don't end in newlines; it warns when
you write them out, but otherwise leaves them alone. The 'peculiar
properties' are probably that '$' does not match the end of a line
not terminated in one.

 The manual page is not quite correct as to x/y's default selection;
there is an explicit routine that loops over all lines, using the usual
uncommented code.

	- cks



* Re:  Lines and last lines.
@ 1992-12-01  5:08 Byron Rakitzis
  0 siblings, 0 replies; 8+ messages in thread
From: Byron Rakitzis @ 1992-12-01  5:08 UTC (permalink / raw)
  To: alan, sam-fans

>My first comment is that [2] is somewhat against sam's philosophy that
>files are arbitrary byte streams; it does impose a structure on the
>file (i.e., that the final character must be \n).

Unfortunately, sam's philosophy is most emphatically not that files are
arbitrary byte streams. e.g.:

; sam -d
B /bin/ls
 -. /bin/ls
?warning: null characters elided

and this is an improvement(!) on the previous situation (pre-unicode?),
which was that any "non-ascii" characters got elided. I'm not sure if
that meant nul+any-characters-with-the-8th-bit-set or what.

Anyway, in case you haven't noticed, I think this is one of sam's more
serious design flaws. It's one of the few reasons why I still want
emacs available on the Unix machines I use. (other reasons: gdb support
and glass tty support, i.e., the Poor Man's Window System.)



* Lines and last lines.
@ 1992-12-01  0:40 Alan Watson
  1992-12-01  5:46 ` Chris Siebenmann
  0 siblings, 1 reply; 8+ messages in thread
From: Alan Watson @ 1992-12-01  0:40 UTC (permalink / raw)
  To: sam-fans

I'm struggling with sam's supposed break from the line-oriented nature
of most Unix utilities.

Relevant parts of the man page are [1]

     \n         Match newline.
     ^          Match the null string immediately after a newline.
     $          Match the null string immediately before a newline.

and later [2]

     (The peculiar properties of a last line without a newline are
     temporarily undefined.)

and later still [3]

     x/regexp/ command
          For each match of the regular expression in the range, run the
          command with dot set to the match.  Set dot to the last match.  If the
          regular expression and its slashes are omitted, /.*\n/ is assumed.
          Null string matches potentially occur before every character of the
          range and at the end of the range.

My first comment is that [2] is somewhat against sam's philosophy that
files are arbitrary byte streams; it does impose a structure on the
file (i.e., that the final character must be \n).

Secondly, let's investigate the interactions of [2] and [3]:

	; sam -d
	a/abcd\n/
	a/efgh\n/
	a/ijkl/

	,p
	abcd
	efgh
	ijkl	<- cursor here

	,x/.*\n/p
	abcd
	efgh
		<- cursor here

	,x p
	abcd
	efgh
	ijkl	<- cursor here

These last two might be expected to give the same output, given the
stated default for x.
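
The difference between the two outputs can be reproduced outside sam. The Python sketch below is an illustration, not sam's implementation: it uses Python's `re` module, and the second pattern is just one assumed way to model a loop over lines that also yields a trailing partial line.

```python
import re

text = "abcd\nefgh\nijkl"  # the final line has no newline

# ,x/.*\n/p : one iteration per match of the explicit pattern.
explicit = re.findall(r".*\n", text)

# ,x p : modelled as a loop over lines that also yields the final
# partial line (an assumption about the behaviour observed above).
looped = re.findall(r"[^\n]*\n|[^\n]+", text)

print(explicit)  # ['abcd\n', 'efgh\n']
print(looped)    # ['abcd\n', 'efgh\n', 'ijkl']
```

The explicit pattern can never match "ijkl" because it demands a newline, so the stated default /.*\n/ and the behaviour of "x p" cannot be the same thing on this input.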

Finally, let's investigate the interaction of [1] and [2]:

	,x /^.*/p
	efghijklabcd	<- cursor here

	,x /.*$/p
	efghabcdabcd	<- cursor here
	
Okay, my point is that it might have been better to define different
semantics for ^ and $, namely that they match the start of a line
(i.e., the null string at the start of the file and the null string
after a \n not at the end of the file) and the end of a line (i.e.,
the null string before a \n not at the end of the file and the null
string at the end of the file).

The default for x might then have been /^.*$/, and this might have
helped to define the semantics of a final line without a newline.
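
To make the proposal concrete, here is a small Python comparison on the same three-line buffer. It is a sketch under assumptions: the documented "$" is modelled as the lookahead `(?=\n)`, and Python's own multiline "$" and "^" stand in for the proposed anchors. Python's anchors are not identical to the proposal when the text does end in a newline, so this only illustrates the unterminated case.

```python
import re

text = "abcd\nefgh\nijkl"  # the final line has no newline

# Documented "$": the null string immediately before a newline.
documented_ends = [m.start() for m in re.finditer(r"(?=\n)", text)]

# Proposed "$": additionally matches at the end of the file.
proposed_ends = [m.start() for m in re.finditer(r"$", text, flags=re.MULTILINE)]

# With anchors like the proposed ones, /^.*$/ picks out every line,
# including the unterminated last one.
lines = re.findall(r"^.*$", text, flags=re.MULTILINE)

print(documented_ends)  # [4, 9]
print(proposed_ends)    # [4, 9, 14]
print(lines)            # ['abcd', 'efgh', 'ijkl']
```

The extra match at offset 14 is what would let a default of /^.*$/ cover a final line without a newline.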


