zsh-workers
* Question about ingetc() vs. word-code
@ 2017-07-05 11:37 Sebastian Gniazdowski
  2017-07-08 21:42 ` Bart Schaefer
  0 siblings, 1 reply; 3+ messages in thread
From: Sebastian Gniazdowski @ 2017-07-05 11:37 UTC
  To: zsh-workers

Hello,
I noticed quite a large number of ingetc() calls:

10,588,088  Src/input.c:ingetc [Src/zsh]

This is for:
- zplugin.zsh compiled
- all plugins compiled
- all functions in zwc, loaded via autoload -w

For zplugin uncompiled, it's:

15,315,810  Src/input.c:ingetc [Src/zsh]

So compilation does indeed help. However, when I did:


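int
ingetc(void)
{
    int lastc = ' ';

    if (lexstop)
        return ' ';

    /* Debug instrumentation: append the next buffered character to a log. */
    FILE *f = fopen("/tmp/reply", "a+");
    int loop = 0;

    for (;;) {
        if (f) {
            /* 'x' is logged when inbufptr is NULL. */
            fprintf(f, "%c\n", inbufptr ? *inbufptr : 'x');
            fflush(f);
        }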


then I could find in the output:

xsource"$HHOME/.zplugin/bin/zplugin.zsh"BIN_DIR]BIN_DIR]BIN_DIR]ZERO]ZERO]BIN_DIR]ZERO]BIN_DIR]BIN_DIR]BIN_DIR]BIN_
DIR]HOME_DIR]HOME_DIR]HOME_DIR]PLUGINS_DIR]HOME_DIR]PLUGINS_DIR]PLUGINS_DIR]COMPLETIONS_DIR]HOME_DIR]COMPLETIONS_DI
R]COMPLETIONS_DIR]SNIPPETS_DIR]HOME_DIR]SNIPPETS_DIR]SNIPPETS_DIR]LEX_DIR]HOME_DIR]LEX_DIR]LEX_DIR]


This source command is fine; it comes from the uncompiled ~/.zshrc. However, BIN_DIR, ZERO, HOME_DIR, PLUGINS_DIR, SNIPPETS_DIR and LEX_DIR are ZPLGM hash fields that are declared at the beginning of the (compiled) zplugin.zsh.

Or the following:

CUR_USPL2]*^@keyword]rst]*^@$uuspl2]$uuspl2]$uuspl2^@DTRACE]CUR_USPL2]*^@keyword]rst]*^@$uuspl2]$uuspl2]$uuspl2^@DTRACE]CUR_USPL2]*^@keyword]rst]*^@$uuspl2]$uuspl2]$uuspl2^@

ZPLGM[CUR_USPL2] is a zplugin hash field, $uspl2 is a local variable, and ZPLGM[DTRACE] is a hash field.


Why does the compiled, non-eval source still show up in chunks in the ingetc() input, and so many times? The eval'd code also appears, but that is probably expected.

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org



* Re: Question about ingetc() vs. word-code
  2017-07-05 11:37 Question about ingetc() vs. word-code Sebastian Gniazdowski
@ 2017-07-08 21:42 ` Bart Schaefer
  2017-07-09  7:25   ` Sebastian Gniazdowski
  0 siblings, 1 reply; 3+ messages in thread
From: Bart Schaefer @ 2017-07-08 21:42 UTC
  To: zsh-workers

On Wed, Jul 5, 2017 at 4:37 AM, Sebastian Gniazdowski
<psprint@zdharma.org> wrote:
> Hello,
> I noticed quite a large number of ingetc() calls

ingetc() is the central function used for reading any shell input that
has to undergo alias expansion or any other sort of lookahead -- stdio
only provides one byte of input "put-back" [ungetc()], but in order to
properly manage aliases and to differentiate things like "((..." [as
either two subshells or one math expression], the shell lexer may need
to read, put back, and then re-read an arbitrary amount of the input
stream.
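
The put-back behaviour described above can be sketched with a toy reader. This is only an illustration under stated assumptions, not zsh's implementation: the `toy_*` names and the "scan for `))`" heuristic are hypothetical, and the real lexer's decision logic is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy input reader: the whole input stays in memory, so "putting back"
 * is just moving the read position backwards. All names are hypothetical. */
struct toy_input {
    const char *buf;      /* start of the input text */
    size_t len;           /* total number of bytes */
    size_t pos;           /* index of the next byte to read */
    unsigned long calls;  /* how many times toy_getc() has run */
};

void toy_init(struct toy_input *in, const char *text)
{
    in->buf = text;
    in->len = strlen(text);
    in->pos = 0;
    in->calls = 0;
}

/* Return the next byte, or -1 at end of input. Every call is counted. */
int toy_getc(struct toy_input *in)
{
    in->calls++;
    if (in->pos >= in->len)
        return -1;
    return (unsigned char)in->buf[in->pos++];
}

/* Put back the n most recently read bytes (any n up to pos),
 * which a single-byte ungetc() could not do. */
void toy_ungetn(struct toy_input *in, size_t n)
{
    assert(n <= in->pos);
    in->pos -= n;
}

/* Look ahead for "))" to guess whether the input is a math expression,
 * then rewind so that the real parse re-reads the same bytes. */
int toy_looks_like_math(struct toy_input *in)
{
    size_t start = in->pos;
    int prev = 0, c, found = 0;

    while ((c = toy_getc(in)) != -1) {
        if (prev == ')' && c == ')') {
            found = 1;
            break;
        }
        prev = c;
    }
    toy_ungetn(in, in->pos - start);
    return found;
}
```

In this sketch every byte consumed during lookahead is read a second time by the real parse, so the call counter grows faster than the input length. That is one plausible contributor to a large call count for a reader of this kind.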

> Why does the compiled, non-eval source still show up in chunks in the ingetc() input, and so many times? The eval'd code also appears, but that is probably expected.

The compiled wordcode includes all the original text of most strings
and identifiers, so that XTRACE and VERBOSE output can be properly
reproduced.  Only shell lexical tokens are turned into numeric codes.
Identifiers that are referenced as well as assigned will appear at
each $NAME expansion or function name call.
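
As a rough illustration of that split, here is a toy "compiler" that turns a few reserved words into numeric codes and stores every other word verbatim. The layout, the `toy_*` names and the fixed-size arrays are assumptions made for the sketch; zsh's actual wordcode format is different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical codes for a toy wordcode; not zsh's real values. */
enum { TOY_IF = 1, TOY_THEN, TOY_FI, TOY_STRING };

struct toy_prog {
    int codes[32];    /* one numeric code per word */
    size_t offs[32];  /* for TOY_STRING: offset of the word in text[] */
    size_t ncodes;    /* number of codes emitted so far */
    char text[256];   /* verbatim, NUL-terminated copies of the words */
    size_t textlen;   /* bytes used in text[] */
};

/* Map a reserved word to its code, or return 0 if it is not reserved. */
int toy_token(const char *w)
{
    if (strcmp(w, "if") == 0)   return TOY_IF;
    if (strcmp(w, "then") == 0) return TOY_THEN;
    if (strcmp(w, "fi") == 0)   return TOY_FI;
    return 0;
}

/* Append one word to the program. Returns 0 on success, -1 if full. */
int toy_emit(struct toy_prog *p, const char *w)
{
    int code;
    size_t n;

    assert(p != NULL && w != NULL);
    code = toy_token(w);
    n = strlen(w) + 1;
    if (p->ncodes >= 32)
        return -1;
    p->offs[p->ncodes] = 0;
    if (code == 0) {
        /* Not a lexical token: keep the original text, every time. */
        if (p->textlen + n > sizeof p->text)
            return -1;
        memcpy(p->text + p->textlen, w, n);
        p->offs[p->ncodes] = p->textlen;
        p->textlen += n;
        code = TOY_STRING;
    }
    p->codes[p->ncodes++] = code;
    return 0;
}
```

Emitting the words `if mycmd then mycmd fi` into a zero-initialized `struct toy_prog` stores `mycmd` twice in `text[]`, once per reference, while `if`, `then` and `fi` leave no text behind.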

A possible optimization for compiling whole digests of related
functions would be to build an identifier dictionary and refer to the
identifiers by a wordcode value followed by an offset into the
dictionary, but this would be wasteful for most small/single-function
compilations and would complicate the XTRACE playback.
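
A minimal sketch of such an identifier dictionary, assuming a flat buffer of NUL-terminated names addressed by byte offset. The `toy_dict` structure and `toy_intern()` are hypothetical and only illustrate the idea; nothing like them is claimed to exist in zsh.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical identifier dictionary: names stored back to back. */
struct toy_dict {
    char text[256];  /* NUL-terminated identifiers */
    size_t len;      /* bytes used in text[] */
};

/* Return the offset of name in the dictionary, adding it if missing.
 * Returns (size_t)-1 when the dictionary is full. */
size_t toy_intern(struct toy_dict *d, const char *name)
{
    size_t off = 0, n;

    assert(d != NULL && name != NULL);
    n = strlen(name) + 1;
    while (off < d->len) {
        if (strcmp(d->text + off, name) == 0)
            return off;                    /* already stored: reuse it */
        off += strlen(d->text + off) + 1;  /* skip to the next entry */
    }
    if (d->len + n > sizeof d->text)
        return (size_t)-1;
    memcpy(d->text + d->len, name, n);     /* off == d->len at this point */
    d->len += n;
    return off;
}
```

With this scheme a repeated identifier costs one offset per reference instead of one full copy of its text. The linear search and the extra indirection hint at the overhead that makes it less attractive for small, single-function compilations.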



* Re: Question about ingetc() vs. word-code
  2017-07-08 21:42 ` Bart Schaefer
@ 2017-07-09  7:25   ` Sebastian Gniazdowski
  0 siblings, 0 replies; 3+ messages in thread
From: Sebastian Gniazdowski @ 2017-07-09  7:25 UTC
  To: Bart Schaefer, zsh-workers

On 08.07.2017 at 23:42:59, Bart Schaefer (schaefer@brasslantern.com) wrote:
> > Why does the compiled, non-eval source still show up in chunks in the ingetc() input, and so many times?
> > The eval'd code also appears, but that is probably expected.
>  
> The compiled wordcode includes all the original text of most strings
> and identifiers, so that XTRACE and VERBOSE output can be properly
> reproduced. Only shell lexical tokens are turned into numeric codes.
> Identifiers that are referenced as well as assigned will appear at
> each $NAME expansion or function name call.

OK, got it. I also think I understand why stringsubst(), prefork(), etc. are called so often: to decode flags, perform substitution, and obtain the actual value.

> A possible optimization for compiling whole digests of related
> functions would be to build an identifier dictionary and refer to the
> identifiers by a wordcode value followed by an offset into the
> dictionary, but this would be wasteful for most small/single-function
> compilations and would complicate the XTRACE playback.

I'm thinking about converting WC_SUBLIST to WC_SUBLIST_SIMPLE. A comment in parse.c says that *_SIMPLE lists are executed faster. The topic is difficult, though, and I don't have much time, so my progress on this is slow. I think this might be related to has_token(), and to using simple actions (I'm not sure whether I can write "simple lists" here) instead of prefork(), etc., but I'm unsure whether the current token detection is over-positive. If I could simplify the word-code, the result would be that normal Zsh compilation does things in a stable way, while Zplugin's compilation does things in a hackish way, possibly for specific functions only. Both approaches would be meaningful and could exist concurrently.
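
The general idea of skipping expansion for plain words can be sketched as below. This is a toy check, not zsh's has_token(): the character set and the `toy_*` names are assumptions chosen for illustration, and a real shell has many more cases (quoting, options, aliases).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy check: does the word contain a character that might need
 * expansion? The character set here is an illustrative assumption. */
int toy_needs_expansion(const char *word)
{
    assert(word != NULL);
    return strpbrk(word, "$`*?[~{") != NULL;
}

/* Count how many of the n words could take a hypothetical fast path
 * that skips the expansion step entirely. */
size_t toy_count_plain(const char *const *words, size_t n)
{
    size_t i, plain = 0;

    for (i = 0; i < n; i++)
        if (!toy_needs_expansion(words[i]))
            plain++;
    return plain;
}
```

A check that flags too many words sends plain words down the slow path, which is the kind of over-detection the paragraph above is unsure about.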

--  
Sebastian Gniazdowski
psprint /at/ zdharma.org

