Re: PATCH: parser (was: Re: PATCH: Improved

* Re: PATCH: parser (was: Re: PATCH: Improved _mailboxes)
@ 2000-02-25  8:41 Sven Wischnowsky
  2000-02-25  9:44 ` Precompiled wordcode zsh functions Bart Schaefer
  2000-02-25  9:55 ` PATCH: parser (was: Re: PATCH: Improved _mailboxes) Andrej Borsenkow
  0 siblings, 2 replies; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-25  8:41 UTC (permalink / raw)
  To: zsh-workers

Bart Schaefer wrote:

> On Feb 24, 10:07am, Sven Wischnowsky wrote:
> } Subject: RE: PATCH: parser (was: Re: PATCH: Improved _mailboxes)
> }
> } 
> } Andrej Borsenkow wrote:
> } 
> } > zcodeload file
> 
> Let's not do that, shall we?  Let's stick with autoload and have a file
> suffix convention, like emacs' .el and .elc, or something.  Heck, there
> could even be separate fpath and compiled_fpath or ...

I was wondering what to do when the directory isn't writable... but a
$COMPILED_FPATH containing one directory would be enough. Hm. Do you
want to say that you actually like the idea? Making everything ready
for the mmap would be quite simple. The only problem I can see is that 
we would need to have a wordcode-verifier (but, of course, that can be 
done). That's yet another reason for having only a scalar containing
only one directory name (so $COMPILED_FDIR might be a better name) --
save compiled functions only if that is set and names an existing,
writable directory. Users would set it to a directory in their account 
so that others can't trick them into using evil code.

> } All this also makes me think about a way to allow multiple zsh's to
> } share other memory bits (like the command table and so on). How
> } portable is anonymous shared mmap or shared mmap on /dev/null?
> 
> Do we really want to go down the road of having e.g. zmodload in one
> zsh suddenly make new builtins available to another zsh?  I don't want
> the behavior of a script that's running in the background to change
> because of something I loaded into my foreground shell ...

Should be configurable, of course. And to be turned on explicitly. If
at all...

Bye
 Sven

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Precompiled wordcode zsh functions
  2000-02-25  8:41 PATCH: parser (was: Re: PATCH: Improved _mailboxes) Sven Wischnowsky
@ 2000-02-25  9:44 ` Bart Schaefer
  2000-02-25  9:55 ` PATCH: parser (was: Re: PATCH: Improved _mailboxes) Andrej Borsenkow
  1 sibling, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2000-02-25  9:44 UTC (permalink / raw)
  To: zsh-workers

On Feb 25,  9:41am, Sven Wischnowsky wrote:
} Subject: Re: PATCH: parser (was: Re: PATCH: Improved _mailboxes)
}
} > Let's stick with autoload and have a file suffix convention, like
} > emacs' .el and .elc, or something. Heck, there could even be
} > separate fpath and compiled_fpath or ...
} 
} I was wondering what to do when the directory isn't writable... but a
} $COMPILED_FPATH containing one directory would be enough.

There need to be at least two directories, one for the users' personal
functions and one for the .../share/zsh/$ZSH_VERSION/functions/... set.
You can't expect everyone to keep their own compiled copies of the base
function library, surely?

} Hm. Do you want to say that you actually like the idea?

Me?  I don't really care one way or the other, except that I want to see
it done right if it's going to be done.

} [...] we would need to have a wordcode-verifier [...]

How does emacs assure the integrity of .elc files?  Or does it?

} That's yet another reason for having only a scalar containing
} only one directory name (so $COMPILED_FDIR might be a better name) --
} save compiled functions only if that is set and names an existing,
} writable directory. Users would set it to a directory in their account 
} so that others can't trick them into using evil code.

Zsh should probably already be more paranoid than it is about loading
modules or functions from widely-writable directories or files.  But
that has nothing to do with how many such directories or files are
involved.  Where does "save compiled functions" come in?  I'd think
we'd want an explicit "zcompile" builtin so functions can selectively
be compiled or not.  I don't want it just automatically writing out
wordcode for every function it ever loads.

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

* RE: PATCH: parser (was: Re: PATCH: Improved _mailboxes)
  2000-02-25  8:41 PATCH: parser (was: Re: PATCH: Improved _mailboxes) Sven Wischnowsky
  2000-02-25  9:44 ` Precompiled wordcode zsh functions Bart Schaefer
@ 2000-02-25  9:55 ` Andrej Borsenkow
  1 sibling, 0 replies; 15+ messages in thread
From: Andrej Borsenkow @ 2000-02-25  9:55 UTC (permalink / raw)
  To: Sven Wischnowsky, zsh-workers

>
> Bart Schaefer wrote:
>
> > On Feb 24, 10:07am, Sven Wischnowsky wrote:
> > } Subject: RE: PATCH: parser (was: Re: PATCH: Improved _mailboxes)
> > }
> > }
> > } Andrej Borsenkow wrote:
> > }
> > } > zcodeload file
> >
> > Let's not do that, shall we?  Let's stick with autoload and have a file
> > suffix convention, like emacs' .el and .elc, or something.  Heck, there
> > could even be separate fpath and compiled_fpath or ...
>
> I was wondering what to do when the directory isn't writable... but a
> $COMPILED_FPATH containing one directory would be enough. Hm. Do you
> want to say that you actually like the idea? Making everything ready
> for the mmap would be quite simple. The only problem I can see is that
> we would need to have a wordcode-verifier (but, of course, that can be
> done). That's yet another reason for having only a scalar containing
> only one directory name (so $COMPILED_FDIR might be a better name) --
> save compiled functions only if that is set and names an existing,
> writable directory. Users would set it to a directory in their account
> so that others can't trick them into using evil code.
>

Ehh ... I am not sure I really meant all of that :-) Do you suggest compiling
functions on the fly and storing byte code externally? And mmaping every
single function? Just some points.

- at least on my system (and it is pretty much standard SVR4) process memory
is a list of segments. Every mmap adds an address segment, and to resolve a
virtual address the system needs to search this list. mmaping 30-40
functions will add a corresponding number of segments - not only is that
slow, but due to alignment restrictions it is going to waste virtual memory.

- I actually meant that precompiled (standard) functions are installed in
the default system location. This was all intended only for those functions
that come with the zsh distribution and can be considered "read only". Doing
this for an arbitrary function (or any piece of code) has the obvious problem
of keeping the two in sync. I am not sure it is worth the trouble.

- in other words, my intention was to have a single file with byte code for
the distributed functions (that can be included in the distribution or
generated as part of the build). This will basically predefine all functions,
making them "part" of the zsh binary - without any need to autoload them. If
the user wants to override them, he can always define the function again.
(Actually, a step further is to load them into memory and dump the
executable. Sounds familiar, doesn't it :-)
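
The alignment cost raised in the first point can be estimated with a little
arithmetic. This is only a sketch in Python, not anything from zsh itself: the
4096-byte page size and the three sample file sizes are assumptions, and
`mapping_cost` is a hypothetical helper name.

```python
def mapping_cost(sizes, page=4096):
    """Return (address space reserved, bytes lost to page rounding)."""
    # Each separately mapped file occupies a whole number of pages.
    reserved = sum(-(-size // page) * page for size in sizes)
    return reserved, reserved - sum(sizes)


# Three hypothetical compiled functions of 100, 5000 and 4096 bytes.
print(mapping_cost([100, 5000, 4096]))  # prints (16384, 7188)
```

With many small functions the rounding loss grows with the number of
mappings, which is the argument for one larger file over 30-40 small ones.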

/andrej


* Re: Precompiled wordcode zsh functions
@ 2000-02-25 10:42 Sven Wischnowsky
  2000-02-25 17:35 ` Bart Schaefer
  0 siblings, 1 reply; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-25 10:42 UTC (permalink / raw)
  To: zsh-workers

Bart Schaefer wrote:

> ...
>
> } [...] we would need to have a wordcode-verifier [...]
> 
> How does emacs assure the integrity of .elc files?  Or does it?

Dunno. What I'm worried about is that the parser catches wrong shell
code, but for wordcode... (of course, modules have the same problem,
probably even worse).

> } That's yet another reason for having only a scalar containing
> } only one directory name (so $COMPILED_FDIR might be a better name) --
> } save compiled functions only if that is set and names an existing,
> } writable directory. Users would set it to a directory in their account 
> } so that others can't trick them into using evil code.
> 
> Zsh should probably already be more paranoid than it is about loading
> modules or functions from widely-writable directories or files.  But
> that has nothing to do with how many such directories or files are
> involved.  Where does "save compiled functions" come in?  I'd think
> we'd want an explicit "zcompile" builtin so functions can selectively
> be compiled or not.  I don't want it just automatically writing out
> wordcode for every function it ever loads.

In the light of Andrej's last comments, how about:

Add a builtin (`zcompile' if you wish), that gets a list of
filenames. The first one is used as the file to write the code for all 
functions named by the other filenames into. These have to name
existing function files (not necessarily in $fpath). So the generated
file is a kind of digest containing the code for multiple functions.

Then: $fpath may also contain names of such digest files. In
getfpfunc() (that's where we load autoloaded functions), if the name
of a digest file in $fpath is found, the file is searched for the
definition of the function we are seeking. If it contains this
function, the thing is mapped and the Eprog is set up. We would keep a 
list of already mapped files, of course, and if all functions used in
such a file are re-defined or unfunction'ed, we unmap it.

One problem: should there be some warning if the digest file is older
than the function file (if that is reachable through $fpath)? I.e. do
we have to test that?

Second problem: functions like _cvs that essentially just define lots
of functions and re-define themselves[1]. The mapped function would of 
course be the short lived function-defining one.

Bye
 Sven

[1] I was always against doing it that way ;-)

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de

* Re: Precompiled wordcode zsh functions
  2000-02-25 10:42 Precompiled wordcode zsh functions Sven Wischnowsky
@ 2000-02-25 17:35 ` Bart Schaefer
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2000-02-25 17:35 UTC (permalink / raw)
  To: zsh-workers

On Feb 25, 11:42am, Sven Wischnowsky wrote:
} Subject: Re: Precompiled wordcode zsh functions
}
} Add a builtin (`zcompile' if you wish), that gets a list of
} filenames. The first one is used as the file to write the code for all 
} functions named by the other filenames into. These have to name
} existing function files (not necessarily in $fpath). So the generated
} file is a kind of digest containing the code for multiple functions.
} 
} Then: $fpath may also contain names of such digest files.

So far so good, though I'd still prefer if such a file could just sit
inside a directory in $fpath (or some other searched path) and be loaded
like any other autoloaded function.  (Which means there needs to be some
sort of convention for choosing the compiled file if both compiled and
uncompiled functions are present.)

} In getfpfunc() (that's where we load autoloaded functions), if the
} name of a digest file in $fpath is found, the file is searched for
} the definition of the function we are seeking. If it contains this
} function, the thing is mapped and the Eprog is set up.

Hmm.  Probably there'd have to be a "directory" at the top of the file
with the names and offsets (or some such) of all the functions therein.
That header could also contain some flags determined at compile time,
such as whether the file should be mmap'd or merely read.  Such a flag
would normally be computed by the compiler based on the size or some
such criteria, but could be overridden by an option to the "zcompile"
(or whatever) builtin.  Thus if one wanted to have a lot of small files
with only one function each, the result would not be a zillion mmaps.
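
A header of that kind might look roughly like the following. This is a sketch
in Python under loud assumptions, not the format zsh ended up with: the magic
number, the `FLAG_MAP` bit, the little-endian field layout and the names
`build_digest` and `find_function` are all hypothetical.

```python
import struct

MAGIC = 0x5A574331  # hypothetical magic number
FLAG_MAP = 1        # hypothetical flag: loader should mmap instead of read


def build_digest(functions, flags=0):
    """Pack {name: wordcode bytes} as header + directory + bodies."""
    names = sorted(functions)
    # Directory entry: name length, body offset, body length, then the name.
    dir_size = sum(12 + len(name.encode()) for name in names)
    offset = 12 + dir_size  # bodies start right after header and directory
    directory, bodies = b"", b""
    for name in names:
        raw, body = name.encode(), functions[name]
        directory += struct.pack("<III", len(raw), offset, len(body)) + raw
        bodies += body
        offset += len(body)
    return struct.pack("<III", MAGIC, flags, len(names)) + directory + bodies


def find_function(digest, wanted):
    """Scan only the directory; return the body bytes or None."""
    magic, _flags, count = struct.unpack_from("<III", digest, 0)
    if magic != MAGIC:
        raise ValueError("not a digest file")
    pos = 12
    for _ in range(count):
        name_len, offset, length = struct.unpack_from("<III", digest, pos)
        name = digest[pos + 12:pos + 12 + name_len].decode()
        if name == wanted:
            return digest[offset:offset + length]
        pos += 12 + name_len
    return None
```

The point of the layout is that a lookup touches only the directory at the
front, so a loader can decide whether a function is present (and whether to
map or read) without looking at any function body.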

} One problem: should there be some warning if the digest file is older
} than the function file (if that is reachable through $fpath)? I.e. do
} we have to test that?

I *think* emacs detects that condition only when the .el and .elc are
in the same directory.  Certainly we shouldn't go searching the entire
fpath to verify every compiled function, particularly if there is more
than one function in each wordcode file.

} Second problem: functions like _cvs that essentially just define lots
} of functions and re-define themselves[1].

I saw your follow-up, but one remark:  That technique would no longer be
necessary because loading the wordcode file would immediately define all
the functions therein without having to execute one of them first.

Random thought: How about .zwc (zsh word code) for the file extension?

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

* Re: Precompiled wordcode zsh functions
@ 2000-02-25 11:31 Sven Wischnowsky
  0 siblings, 0 replies; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-25 11:31 UTC (permalink / raw)
  To: zsh-workers


I wrote:

> Second problem: functions like _cvs that essentially just define lots
> of functions and re-define themselves[1]. The mapped function would of 
> course be the short lived function-defining one.

Forget that. Function definitions are stored in a way that allows us
to use them directly for the function Eprog (without allocating
separate memory). So...

Bye
 Sven


--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de


* Re: Precompiled wordcode zsh functions
@ 2000-02-28 10:07 Sven Wischnowsky
  2000-02-28 14:50 ` Sven Wischnowsky
  0 siblings, 1 reply; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-28 10:07 UTC (permalink / raw)
  To: zsh-workers

[ I implemented my ideas at the weekend and got to read this mail
  today, so I'm withholding the patch for now... ]

Bart Schaefer wrote:

> On Feb 25, 11:42am, Sven Wischnowsky wrote:
> } Subject: Re: Precompiled wordcode zsh functions
> }
> } Add a builtin (`zcompile' if you wish), that gets a list of
> } filenames. The first one is used as the file to write the code for all 
> } functions named by the other filenames into. These have to name
> } existing function files (not necessarily in $fpath). So the generated
> } file is a kind of digest containing the code for multiple functions.
> } 
> } Then: $fpath may also contain names of such digest files.
> 
> So far so good, though I'd still prefer if such a file could just sit
> inside a directory in $fpath (or some other searched path) and be loaded
> like any other autoloaded function.  (Which means there needs to be some
> sort of convention for choosing the compiled file if both compiled and
> uncompiled functions are present.)

Hm. If we think about one file per function, we should certainly make
them be found in the directories in $fpath. But that would basically
give us two types of dump-files, unless we make them detectable
(e.g. by the .zwc extension you suggest) and make getfpfunc() search
all .zwc files for the function we are trying to load. Hm, or maybe two
different kinds of lookup: if getfpfunc() finds out that one of the
strings in $fpath isn't a directory containing a file with the name
searched, it tries to use it as a dump-file containing multiple
functions and checks if it contains the definition for the function
searched (that's basically how the stuff I wrote works). But if the
directory from $fpath just being handled is a directory and it
contains a file <name>.zwc, we use that (at this time we could compare 
the modification times for <name> and <name>.zwc, of course).

> } In getfpfunc() (that's where we load autoloaded functions), if the
> } name of a digest file in $fpath is found, the file is searched for
> } the definition of the function we are seeking. If it contains this
> } function, the thing is mapped and the Eprog is set up.
> 
> Hmm.  Probably there'd have to be a "directory" at the top of the file
> with the names and offsets (or some such) of all the functions therein.

That's what my implementation does. Since this is currently only
intended for files containing lots of functions, they are always
mapped. For now they are even mapped completely; that could probably be
changed to map them step by step as more and more functions from them
are used. Although I'm really not that concerned about memory usage
here. The completion functions (only the _* files) take up somewhat
less than 300KB, btw.

> That header could also contain some flags determined at compile time,
> such as whether the file should be mmap'd or merely read.  Such a flag
> would normally be computed by the compiler based on the size or some
> such criteria, but could be overridden by an option to the "zcompile"
> (or whatever) builtin.  Thus if one wanted to have a lot of small files
> with only one function each, the result would not be a zillion mmaps.

Hm, yes, hadn't thought about that. I'm not so sure about the
automatic detection of the flag since it would involve some kind of
threshold. It's always so difficult to find a good value (a page size?
per function or for the whole file if it contains more than one
function?).

> } One problem: should there be some warning if the digest file is older
> } than the function file (if that is reachable through $fpath)? I.e. do
> } we have to test that?
> 
> I *think* emacs detects that condition only when the .el and .elc are
> in the same directory.  Certainly we shouldn't go searching the entire
> fpath to verify every compiled function, particularly if there is more
> than one function in each wordcode file.

Yep. The implementation I have now does nothing about this, because it 
only thinks about `digest' files.

> } Second problem: functions like _cvs that essentially just define lots
> } of functions and re-define themselves[1].
> 
> I saw your follow-up, but one remark:  That technique would no longer be
> necessary because loading the wordcode file would immediately define all
> the functions therein without having to execute one of them first.

But it doesn't do any harm either -- it is very fast (with such a
dump-file in your fpath those initial completions where the functions
were loaded and parsed become, of course, a lot faster, and defining
functions in functions from a dump-file is very fast, too).

Bye
 Sven

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de

* Re: Precompiled wordcode zsh functions
@ 2000-02-28 14:50 ` Sven Wischnowsky
  2000-02-28 18:18   ` Zefram
  0 siblings, 1 reply; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-28 14:50 UTC (permalink / raw)
  To: zsh-workers

I wrote:

> Hm. If we think about one file per function, we should certainly make
> them be found in the directories in $fpath.

I forgot: there is a problem with this which I remembered at the
weekend. The word-code isn't really machine-independent, it depends on 
the endian-ness. For wordcode-files that are not to be mapped, it
would be possible to walk through the code and shuffle the bytes
around if need be, but I hope you all agree that the mapped files
should be mapped read-only, so...

With that a standard installation could:

- install only a digest file in a per-machine (machine-type)
  directory, i.e. not shared by all hosts
- install .zwc files for the functions in a directory different from
  the one where the (shared) functions files are (so that the test for 
  which-one-is-newer couldn't be done for them, which is probably not
  too big a problem)
- install the functions in a per-machine directory along with the
  wordcode-files for them

Of course, the first one could be combined with the other two. With
such digest files it would be up to the user to decide if he puts them 
into $fpath (at least that's how I think of them: as a different kind
of `function directory').

Since we can detect the endian-ness used in the wordcode file (this is 
already done in my implementation), we could also allow installing two 
wordcode-files, one for each endian-ness. Or we could make the
wordcode-files contain both versions (there are only two ways unsigned
integers are stored nowadays, right?). As long as they are properly
separated, so that only one of the two is read/mapped, this shouldn't do
much harm, should it? But still, quite ugly, I think.
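
The both-versions-in-one-file idea could be sketched as follows. This is
Python for illustration only, not the implementation mentioned above: the
marker value `MAGIC`, the layout (two equal halves, each starting with the
marker) and the function names are assumptions.

```python
import struct
import sys

MAGIC = 0x04050607  # hypothetical marker; its bytes differ when reversed
NATIVE = "<" if sys.byteorder == "little" else ">"


def pack_half(words, order):
    """Pack the marker plus 32-bit words in byte order "<" or ">"."""
    return struct.pack(f"{order}{len(words) + 1}I", MAGIC, *words)


def build_dual(words):
    """Little-endian copy followed by a big-endian copy of the same words."""
    return pack_half(words, "<") + pack_half(words, ">")


def read_native_half(data, order=NATIVE):
    """Return the words from whichever half matches the given byte order."""
    half = len(data) // 2
    for start in (0, half):
        (magic,) = struct.unpack_from(order + "I", data, start)
        if magic == MAGIC:
            count = half // 4 - 1
            return list(struct.unpack_from(f"{order}{count}I", data, start + 4))
    raise ValueError("no half matches this byte order")
```

A reader only ever touches the half whose marker reads back correctly, so the
other half costs file size but can be left unread (or unmapped) at run time.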

Bye
 Sven

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de

* Re: Precompiled wordcode zsh functions
  2000-02-28 14:50 ` Sven Wischnowsky
@ 2000-02-28 18:18   ` Zefram
  2000-02-29  4:22     ` Bart Schaefer
  0 siblings, 1 reply; 15+ messages in thread
From: Zefram @ 2000-02-28 18:18 UTC (permalink / raw)
  To: Sven Wischnowsky

Sven Wischnowsky wrote:
>I forgot: there is a problem with this which I remembered at the
>weekend. The word-code isn't really machine-independent, it depends on 
>the endian-ness.

Does it depend on the length of int too?

>With that a standard installation could:

The obvious clean solution is to define an architecture-independent
wordcode format so that the saved wordcode is shareable.  However,
I expect that would have an undesirable amount of overhead.

Next possibility is to have saved wordcode files contain wordcode for
both endiannesses.  Wordcode files get twice as big, but the unneeded half
can be so completely ignored that it never gets paged in at all.  I don't
think that address space usage is a significant concern at this level.

Next possibility: separate wordcode files for different endiannesses
(and possibly different int sizes).  Have .zwcb and .zwcl suffixes.
When looking for the file for an individual function, only look for the
appropriate suffix; digest files are ignored if they are of the wrong
endianness (determined internally).

On the issue of digest files versus individual function files, I think we
should have both.  A .zwc file in a directory in $fpath acts exactly like
a normal textual function definition file, except that it is in wordcode
instead of text; it should take precedence over any file (of either type)
further down $fpath, but we may want to do a date comparison if both
textual and wordcode files exist in the same directory.  A digest file
should actually be listed in $fpath; its definitions take precedence
over directories (and digest files) further down $fpath.
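
The lookup order described here can be sketched like this. It is Python for
illustration, not getfpfunc() itself: `find_function_file` and the
`digest_contains` callback are hypothetical names, and falling back to the
text file when the .zwc is older is only one possible policy for the date
comparison, assumed here for the sake of the example.

```python
import os


def find_function_file(name, fpath, digest_contains):
    """Return (kind, path) for the first $fpath element defining name."""
    for entry in fpath:
        if os.path.isdir(entry):
            text = os.path.join(entry, name)
            compiled = text + ".zwc"
            has_text = os.path.isfile(text)
            has_zwc = os.path.isfile(compiled)
            # Assumed policy: a .zwc older than its text file is ignored.
            if has_zwc and (not has_text or
                            os.path.getmtime(compiled) >= os.path.getmtime(text)):
                return ("wordcode", compiled)
            if has_text:
                return ("text", text)
        elif os.path.isfile(entry) and digest_contains(entry, name):
            # A digest file listed directly in $fpath acts like a directory.
            return ("digest", entry)
    return None
```

Because the first matching element wins, a user's own definition in a
directory earlier in the path shadows the same name in a digest further down.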

We should probably have some size threshold to switch between reading and
mapping wordcode files; e.g., any file over two pages long gets mmapped.
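
Such a threshold could be sketched as below. This is Python for illustration
and not zsh code; `load_wordcode` is a hypothetical name, and the two-page
default simply mirrors the example figure given above.

```python
import mmap
import os


def load_wordcode(path, threshold=2 * mmap.PAGESIZE):
    """Return ("mapped", mmap object) or ("read", bytes) depending on size."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        if size > threshold:
            # Read-only mapping; pages are brought in only when touched.
            return "mapped", mmap.mmap(handle.fileno(), 0,
                                       access=mmap.ACCESS_READ)
        return "read", handle.read()
```

Small files are copied into ordinary memory, so a directory full of
one-function files does not turn into one mapping per function.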

-zefram

* Re: Precompiled wordcode zsh functions
  2000-02-28 18:18   ` Zefram
@ 2000-02-29  4:22     ` Bart Schaefer
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2000-02-29  4:22 UTC (permalink / raw)
  To: zsh-workers

On Feb 28, 11:07am, Sven Wischnowsky wrote:
} Subject: Re: Precompiled wordcode zsh functions
}
} Hm. If we think about one file per function, we should certainly make
} them be found in the directories in $fpath. [...] if getfpfunc() finds
} out that one of the strings in $fpath isn't a directory containing
} a file with the name searched, it tries to use it as a dump-file
} containing multiple functions and checks if it contains the definition
} [... b]ut if the directory from $fpath just being handled is a
} directory and it contains a file <name>.zwc, we use that (at this time
} we could compare the modification times for <name> and <name>.zwc, of
} course).

Yes.  Note that I think the files should have the .zwc extension in both
cases; the only difference is whether the loading code opens the file and
searches its internal "directory," or simply matches on the file name.

On Feb 28,  6:18pm, Zefram wrote:
} Subject: Re: Precompiled wordcode zsh functions
}
} Sven Wischnowsky wrote:
} >I forgot: there is a problem with this which I remembered at the
} >weekend. The word-code isn't really machine-independent, it depends on 
} >the endian-ness.
} 
} [...] have saved wordcode files contain wordcode for
} both endiannesses.  Wordcode files get twice as big, but the unneeded half
} can be so completely ignored that it never gets paged in at all.  I don't
} think that address space usage is a significant concern at this level.

Sven suggested that, too, and I think it's the best idea.

} Have .zwcb and .zwcl suffixes.

We should be friendly to those who compile zsh under Cygwin, or to Amol
if he decides to update his NT port, and use only three-letter suffixes.
Perhaps .zbw and .zlw for 32-bit ints and .zbl and .zll for 64-bit?
Though IMO it'd be better if we could stick to 32 bits and one suffix.

} A .zwc file in a directory in $fpath acts exactly like a normal
} textual function definition file, except that it is in wordcode
} instead of text; it should take precedence over any file (of either
} type) further down $fpath, but we may want to do a date comparison
} if both textual and wordcode files exist in the same directory. A
} digest file should actually be listed in $fpath; its definitions take
} precedence over directories (and digest files) further down $fpath.

I'm a bit worried about functions getting redefined -- and about
functions that *need* to get redefined, e.g. a .zwc file representing
a "package" may contain a function whose name clashes with one that
the user defined earlier in $fpath.  In the current state of the world
(without wordcode files) the package clobbers the user's function
unless the package author has made an effort to avoid it (as in
Completion/User/_cvs).  Emacs .el and .elc have that same behavior.
What Zefram has suggested for function digest files would behave more
like standard path hashing.

Do we need some way to express at compile time whether a digest is a
package with internal dependencies vs. a mere collection of otherwise
unrelated functions?

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com

* Re: Precompiled wordcode zsh functions
@ 2000-02-29  7:45 Sven Wischnowsky
  2000-02-29  8:15 ` Andrej Borsenkow
  2000-02-29  8:21 ` Bart Schaefer
  0 siblings, 2 replies; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-29  7:45 UTC (permalink / raw)
  To: zsh-workers

Bart Schaefer wrote:

> On Feb 28, 11:07am, Sven Wischnowsky wrote:
> } Subject: Re: Precompiled wordcode zsh functions
> }
> } Hm. If we think about one file per function, we should certainly make
> } them be found in the directories in $fpath. [...] if getfpfunc() finds
> } out that one of the strings in $fpath isn't a directory containing
> } a file with the name searched, it tries to use it as a dump-file
> } containing multiple functions and checks if it contains the definition
> } [... b]ut if the directory from $fpath just being handled is a
> } directory and it contains a file <name>.zwc, we use that (at this time
> } we could compare the modification times for <name> and <name>.zwc, of
> } course).
> 
> Yes.  Note that I think the files should have the .zwc extension in both
> cases; the only difference is whether the loading code opens the file and
> searches its internal "directory," or simply matches on the file name.

I've hacked more yesterday, reaching the state Zefram talked about,
sans the endian-ness-independence. Like you two I favour the approach
with files containing two versions. I hope to find the time for that
this evening.

Oh, and currently the shell does not check the extension of a digest
file and it doesn't compare the file times for compiled/non-compiled
functions... yet.

> ...
> 
> } Have .zwcb and .zwcl suffixes.
> 
> We should be friendly to those who compile zsh under Cygwin, or to Amol
> if he decides to update his NT port, and use only three-letter suffixes.
> Perhaps .zbw and .zlw for 32-bit ints and .zbl and .zll for 64-bit?
> Though IMO it'd be better if we could stick to 32 bits and one suffix.

I definitely want to stay with 32 bits. Although currently it is
dependent on the size of integers, I hope to make that architecture
independent and took care to always use the type `wordcode' instead of 
`int'. I still have to check -- do we have a configure test for the
size of ints?

> } A .zwc file in a directory in $fpath acts exactly like a normal
> } textual function definition file, except that it is in wordcode
> } instead of text; it should take precedence over any file (of either
> } type) further down $fpath, but we may want to do a date comparison
> } if both textual and wordcode files exist in the same directory. A
> } digest file should actually be listed in $fpath; its definitions take
> } precedence over directories (and digest files) further down $fpath.
> 
> I'm a bit worried about functions getting redefined -- and about
> functions that *need* to get redefined, e.g. a .zwc file representing
> a "package" may contain a function whose name clashes with one that
> the user defined earlier in $fpath.  In the current state of the world
> (without wordcode files) the package clobbers the user's function
> unless the package author has made an effort to avoid it (as in
> Completion/User/_cvs).  Emacs .el and .elc have that same behavior.
> What Zefram has suggested for function digest files would behave more
> like standard path hashing.

Yep. In my implementation digest files are really only one-file-
directories. I.e. they are searched like normal directories by
getfpfunc() (more precisely a utility function used by it). It will
not define all functions in the digest file immediately. I really
prefer that behaviour because a user has to worry about nothing when,
for example, he wants to override one of the functions with his own
definition in a directory earlier in $fpath.

> Do we need some way to express at compile time whether a digest is a
> package with internal dependencies vs. a mere collection of otherwise
> unrelated functions?

I don't think so, if we keep the current behaviour.

Oh, and, btw, for testing purposes I set the threshold (when a
function gets mapped instead of being read) to 4096 bytes. The result
was that only very few functions (around ten) would be mapped. If we
use two pages as the threshold (or one page on a box with page-size == 
8192), no function will be mapped. I don't really have an opinion
about this, because I'll use it with one big wordcode file for the
whole completion system (and other functions I have)... so I won't do
much testing there, leaving it to all of you to decide (once I have
the patch in presentable shape, so that you can play with it).

Bye
 Sven

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de

* RE: Precompiled wordcode zsh functions
  2000-02-29  7:45 Sven Wischnowsky
@ 2000-02-29  8:15 ` Andrej Borsenkow
  2000-02-29  8:21 ` Bart Schaefer
  1 sibling, 0 replies; 15+ messages in thread
From: Andrej Borsenkow @ 2000-02-29  8:15 UTC (permalink / raw)
  To: Sven Wischnowsky, zsh-workers

>
> Yep. In my implementation digest files are really only one-file-
> directories. I.e. they are searched like normal directories by
> getfpfunc() (more precisely a utility function used by it). It will
> not define all functions in the digest file immediately. I really
> prefer that behaviour because a user has to worry about nothing when,
> for example, he wants to override one of the functions with his own
> definition in a directory earlier in $fpath.
>

Yep, after thinking about it a bit more I believe this is the "least
confusing" case. Unfortunately, here is where kshautoload cuts in :-) Does
your digest handle it currently? What happens with

autoload foo

contents of foo:

some prolog code
bar1() { }
bar2() { }
foo() { }

I'd expect that bar1, bar2 and foo were (re-)defined as just references to
already compiled code (actually, true for every function possibly defined
in the zwc file).


/andrej



* Re: Precompiled wordcode zsh functions
  2000-02-29  7:45 Sven Wischnowsky
  2000-02-29  8:15 ` Andrej Borsenkow
@ 2000-02-29  8:21 ` Bart Schaefer
  1 sibling, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2000-02-29  8:21 UTC (permalink / raw)
  To: zsh-workers

On Feb 29,  8:45am, Sven Wischnowsky wrote:
} Subject: Re: Precompiled wordcode zsh functions
}
} [...]  In my implementation digest files are really only one-file-
} directories. I.e. they are searched like normal directories by
} getfpfunc() (more precisely a utility function used by it). It will
} not define all functions in the digest file immediately. I really
} prefer that behaviour because a user has to worry about nothing when,
} for example, he wants to override one of the functions with his own
} definition in a directory earlier in $fpath.

I'm concerned that we should at least have a way to produce a warning
about it.  I mean, if I were to invent a function named `_files' that
had nothing to do with completion, and put it in a directory early in
my $fpath -- even PWS's guide recommends putting your own functions
before distributed ones -- three-quarters of the completion system
would be mysteriously broken for me.  If the whole completion system
has been hidden inside one giant file, how do I find out what has gone
wrong?

And lest you think this is farfetched, please note that I've had the
following in my .zshenv for many years now[*]:

    alias calc="noglob _calc"
    _calc() { awk "BEGIN {print $*}" < /dev/null }

So existing user functions with leading underscores are not out of the
question.  

Oh, and what's the handling with respect to kshautoload vs. a function
like _cvs that wants to define other functions and then call itself?

[*] Predating floating point support in zsh ...  I never learned "bc".

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


* Re: Precompiled wordcode zsh functions
@ 2000-02-29  7:52 Sven Wischnowsky
  0 siblings, 0 replies; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-29  7:52 UTC (permalink / raw)
  To: zsh-workers


I wrote:

> I definitely want to stay with 32 bits. Although currently it is
> dependent on the size of integers, I hope to make that architecture
> independent and took care to always use the type `wordcode' instead of 
> `int'. I still have to check -- do we have a configure test for the
> size of ints?

I forgot to ask: *are* there any machines with sizeof(int) == 8? And
if yes, do they have sizeof(short) == 4?


And about the threshold: when we make the wordcode files architecture
independent, we probably shouldn't make it relative to the page size.

Bye
 Sven


--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de



* Re: Precompiled wordcode zsh functions
@ 2000-02-29 11:42 Sven Wischnowsky
  0 siblings, 0 replies; 15+ messages in thread
From: Sven Wischnowsky @ 2000-02-29 11:42 UTC (permalink / raw)
  To: zsh-workers


Bart Schaefer wrote:

> On Feb 29,  8:45am, Sven Wischnowsky wrote:
> } Subject: Re: Precompiled wordcode zsh functions
> }
> } [...]  In my implementation digest files are really only one-file-
> } directories. I.e. they are searched like normal directories by
> } getfpfunc() (more precisely a utility function used by it). It will
> } not define all functions in the digest file immediately. I really
> } prefer that behaviour because a user has to worry about nothing when,
> } for example, he wants to override one of the functions with his own
> } definition in a directory earlier in $fpath.
> 
> I'm concerned that we should at least have a way to produce a warning
> about it.  I mean, if I were to invent a function named `_files' that
> had nothing to do with completion, and put it in a directory early in
> my $fpath -- even PWS's guide recommends putting your own functions
> before distributed ones -- three-quarters of the completion system
> would be mysteriously broken for me.  If the whole completion system
> has been hidden inside one giant file, how do I find out what has gone
> wrong?

That's one of the reasons why I'm not too happy with the thought of
installing such a digest file by default. I mean, maybe we should
just leave it to the user to create his/her own digest files
containing the stuff (s)he really wants. The 400KB (yes, it's 400, the 
300 was a typo, sorry) isn't that much, is it? With that we would have 
the same situation as now. Also, the functions in the digest can, of
course, be listed, so it's the same problem as looking into the
directories in $fpath. Hm, maybe a function that checks everything in
$fpath to see which names are defined more than once? [1]
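
The duplicate check suggested above could look roughly like this. It is a
sketch only, written for a POSIX sh rather than as a zsh function: the name
duplicate_function_names is hypothetical, the directories are passed as
arguments instead of being read from $fpath, and digest files are not
inspected at all.

```shell
# Hypothetical helper: print every file name that occurs in more than one
# of the directories given as arguments, one name per line.
duplicate_function_names() {
    for dir in "$@"; do
        # Skip entries that are not directories (for example digest files).
        [ -d "$dir" ] || continue
        for file in "$dir"/*; do
            # In an empty directory the pattern stays literal and is skipped.
            [ -f "$file" ] && printf '%s\n' "${file##*/}"
        done
    done | sort | uniq -d
}
```

Each name it prints is defined in at least two of the directories, so the
copy in the earlier directory is the one that would be picked up.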

> And lest you think this is farfetched, please note that I've had the
> following in my .zshenv for many years now[*]:
> 
>     alias calc="noglob _calc"
>     _calc() { awk "BEGIN {print $*}" < /dev/null }
> 
> So existing user functions with leading underscores are not out of the
> question.  

I had functions beginning with an underscore myself...

> Oh, and what's the handling with respect to kshautoload vs. a function
> like _cvs that wants to define other functions and then call itself?

Currently zcompile just puts the contents of the files into the
wordcode files. I.e. functions in them behave exactly like the files.


Bye
 Sven

[1] There are other interesting possibilities for functions wrt
    compilation: a `recompile' function that checks file dates and
    digest files. A function for syntax-checking a file -- that's
    possible because zcompile reports parse errors as usual and
    one can use /dev/null as the name of the target wordcode file.
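
    The date check behind such a `recompile' function might be sketched
    as below. Assumptions: the helper name needs_recompile is
    hypothetical, the compiled file is taken to be the source path plus
    ".zwc", and the shell's test supports -nt (not required by POSIX,
    though widely available). It only reports stale files; it does not
    call zcompile.

```shell
# Hypothetical helper: print each source file whose compiled counterpart
# (assumed here to be the same path plus ".zwc") is missing or older.
needs_recompile() {
    for src in "$@"; do
        zwc=$src.zwc
        if [ ! -e "$zwc" ] || [ "$src" -nt "$zwc" ]; then
            printf '%s\n' "$src"
        fi
    done
}
```

    The printed names could then be handed to whatever compile step is
    in use.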

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de


