zsh-users
* Source mangling in $functions_source and typeset -f
From: Zach Riggle @ 2021-11-27  8:01 UTC
  To: Zsh Users

Hello again!

I've been playing around with some things regarding $fpath and autoloadable
functions.

Ultimately, I've got a nice wrapper which will print out the source of a
function (and autoload it if necessary) and then pass it to `bat` for
syntax highlighting.

Unfortunately, "$functions_source[foo]" and "typeset -f foo" both seem to
remove all comments and rewrite the source so that there are no blank
lines, even if they are added explicitly with line-continuation
backslashes or spaced out manually and meticulously in an array.

With a little bit of grep-fu, it's possible to use "$functions_source[foo]"
and search the file for the function declaration.  With this little trick,
it's easy to open the editor at the correct line in the file where
the function is declared.

This makes it easy to display the path/to/file:linenum on which a given
function is declared, and open it easily in the editor of your choice with
a ⌘-Click or ⌃-Click depending on your chosen editor.

https://i.imgur.com/oPSiPWB.png


However, there's no easy way to determine the LAST line in the original
file which corresponds to the function -- due to aforementioned newline-
and comment-stripping.

https://i.imgur.com/nvzuFEm.png


Is there a convenient way, from within zsh, to get either:

   - The original, unmodified source of a function (autoloaded or otherwise)
   - The line offsets in the file where the function is defined (if any)?

Getting the starting offset is easy-ish (thanks grep!) but finding the
function end is less easy.  I expect there are Zsh internals that could
track this if desired, but it simply isn't tracked.

Are there any easy fixes to this?  My best path forward for detecting the
[start, end] of a function, with its original comments, will rely on
finding a closing '}' with the same indentation as the 'function foo()'
definition.
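
The indentation heuristic described above can be sketched with awk.  This
is only a sketch under assumptions that are not in the original message:
func_range is a hypothetical helper name, the function header is assumed
to sit on its own line as "function NAME", "function NAME()" or "NAME()"
(optionally followed by "{"), and the closing "}" is assumed to be alone on
a line at the same indentation as the header.  It does not parse shell
syntax, so one-line functions, braceless bodies and names containing regex
metacharacters are not handled.

```shell
# func_range FILE NAME  (hypothetical helper, heuristic only)
# Prints "START:END" for the first definition of NAME in FILE, where END is
# the first line after the header that is exactly <header indentation> + "}".
# Returns non-zero if no such pair of lines is found.
func_range() {
  awk -v name="$2" '
    BEGIN {
      ws   = "[ \t]*"
      tail = ws "[{]?" ws "$"
      # "function NAME", "function NAME()", each optionally followed by "{"
      hdr1 = "^function[ \t]+" name ws "([(][)])?" tail
      # "NAME()", optionally followed by "{"
      hdr2 = "^" name ws "[(][)]" tail
    }
    # Still searching for the header line.
    !start {
      line = $0
      sub(/^[ \t]+/, "", line)            # header text without indentation
      if (line ~ hdr1 || line ~ hdr2) {
        start  = NR
        indent = substr($0, 1, length($0) - length(line))
      }
      next
    }
    # Header seen: a lone "}" at the same indentation closes the function.
    $0 == (indent "}") { print start ":" NR; found = 1; exit }
    END { exit(found ? 0 : 1) }
  ' "$1"
}
```

In zsh this could be called as func_range "$functions_source[foo]" foo,
assuming the zsh/parameter module reports the defining file for foo there.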

*Zach Riggle*

* Re: Source mangling in $functions_source and typeset -f
From: Bart Schaefer @ 2021-11-27 17:47 UTC
  To: Zach Riggle; +Cc: Zsh Users

On Sat, Nov 27, 2021 at 12:02 AM Zach Riggle <zachriggle@gmail.com> wrote:
>
> Unfortunately, "$functions_source[foo]" and "typeset -f foo" both seem to remove all comments, and rewrite the source such that there's no empty newlines

With the exception of the contents of strings (including
here-documents), the original source of a function is not kept in
shell memory.  Instead a parse tree is stored and used to regenerate
the function definition by "typeset -f" et al.  Comments and
semantically-meaningless whitespace are discarded during parsing,
hence they're not available later.
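
A small demonstration of the regeneration described above.  This is a
sketch meant to be run as a script (in an interactive zsh, "#" only starts
a comment if INTERACTIVE_COMMENTS is set); demo is a hypothetical function
name.

```shell
demo() {
  # this comment is discarded by the parser

  echo "text inside # a string is kept"
}

# Prints the definition regenerated from the parsed form: the comment and
# the blank line are gone, the string contents are intact.
typeset -f demo
```

The exact layout of the regenerated text (indentation, brace placement)
depends on the shell, so it is not shown here.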

> Is there a convenient way, from within zsh, to get either:
>
> The original, unmodified source of a function (autoloaded or otherwise)
> The line offsets in the file where the function is defined (if any)?

The parse tree only tracks the line numbers of executable code, so as
to be able to update the LINENO variable and print line numbers in
debug traces and prompts.  The line number of the closing brace isn't
recorded (in fact it's possible to define a function without any open
or close brace if the body is a single expression).

So, strictly speaking, no, neither of those.


* Re: Source mangling in $functions_source and typeset -f
From: Ray Andrews @ 2021-11-27 19:25 UTC
  To: zsh-users

On 2021-11-27 9:47 a.m., Bart Schaefer wrote:
>
> With the exception of the contents of strings (including
> here-documents), the original source of a function is not kept in
> shell memory.  Instead a parse tree is stored and used to regenerate
> the function definition by "typeset -f" et al.
That's most interesting.  It seems circular, so there must be a good
reason for it, but why take the source as written, parse it down to
'clean code', construct whatever internal representation zsh uses, and
then reconstruct clean code from that, when one could just repeat the
first step?  Is it perhaps faster to perform the third step than to
repeat the first step?  But repeating the first step would surely
preserve more?




* Re: Source mangling in $functions_source and typeset -f
From: Lawrence Velázquez @ 2021-11-27 19:40 UTC
  To: Ray Andrews; +Cc: zsh-users

On Sat, Nov 27, 2021, at 2:25 PM, Ray Andrews wrote:
> On 2021-11-27 9:47 a.m., Bart Schaefer wrote:
>>
>> With the exception of the contents of strings (including
>> here-documents), the original source of a function is not kept in
>> shell memory.  Instead a parse tree is stored and used to regenerate
>> the function definition by "typeset -f" et al.
> That's most interesting, it seems circular so there must be a good 
> reason for it, but why take the source as written, then parse it down to 
> 'clean code' and then construct whatever internal representations zsh 
> uses and then reconstruct clean code from that when one could just 
> repeat the first step?

Are you asking why "typeset -f" and its ilk don't reread the
original source code?

> Is it perhaps faster to perform the third step than to
> repeat the first step?

Surely it is, if the first step involves disk I/O.

> But repeating the first step would surely preserve more?

Don't assume code comes from a file that can be read again.  What
if the original file is no longer available?  What if the function
was defined using standard input and didn't originate from a file
at all?

-- 
vq


* Re: Source mangling in $functions_source and typeset -f
From: Ray Andrews @ 2021-11-27 20:09 UTC
  To: zsh-users

On 2021-11-27 11:40 a.m., Lawrence Velázquez wrote:
> Don't assume code comes from a file that can be read again. What
> if the original file is no longer available?  What if the function
> was defined using standard input and didn't originate from a file
> at all?
>
That alone could be reason enough, thanks.  The method would want to be 
universal, so, as you say, if no file is involved, then obviously only 
what's internal is available anyway.



* Re: Source mangling in $functions_source and typeset -f
From: Bart Schaefer @ 2021-11-27 20:14 UTC
  To: Ray Andrews; +Cc: Zsh Users

On Sat, Nov 27, 2021 at 11:26 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> On 2021-11-27 9:47 a.m., Bart Schaefer wrote:
> >
> > With the exception of the contents of strings (including
> > here-documents), the original source of a function is not kept in
> > shell memory.
> That's most interesting, it seems circular so there must be a good
> reason for it

The parse is stored in a bytecode format (the same one used when
writing a file with zcompile) so:

It's much smaller than the source (usually), and takes up less memory.

It's much closer to being directly executable than the source, and
therefore faster.

Zach's use case is unusual; there's typically not any reason to "preserve more".


* Re: Source mangling in $functions_source and typeset -f
From: Ray Andrews @ 2021-11-27 21:56 UTC
  To: zsh-users

On 2021-11-27 12:14 p.m., Bart Schaefer wrote:
>
> It's much smaller than the source (usually), and takes up less memory.
Yeah, so reverse engineering it back to source code wouldn't be a big 
deal.  Yup, I get it.



* Re: Source mangling in $functions_source and typeset -f
From: Peter Stephenson @ 2021-11-29 16:00 UTC
  To: Zsh Users


> On 27 November 2021 at 17:47 Bart Schaefer <schaefer@brasslantern.com> wrote:
> > Is there a convenient way, from within zsh, to get either:
> >
> > The original, unmodified source of a function (autoloaded or otherwise)
> > The line offsets in the file where the function is defined (if any)?
> 
> The parse tree only tracks the line numbers of executable code, so as
> to be able to update the LINENO variable and print line numbers in
> debug traces and prompts.  The line number of the closing brace isn't
> recorded (in fact it's possible to define a function without any open
> or close brace if the body is a single expression).
> 
> So, strictly speaking, no, neither of those.

Actually, the original line number is remembered, and can be output
using a prompt --- typically PS4.  See the zshmisc manual and
the description of prompt escapes.

%i     The line number currently being executed in the script, sourced file, or
       shell function given by %N.  This is most useful for debugging as part of
       $PS4.

%I     The line number currently being executed in the file %x.  This is similar
       to %i, but the line number is always a line number in the file where the
       code was defined, even if the code is a shell function.

So, for example, my precmd is defined in .zshrc from line 297 and if I edit PS4 to

%e:%N:%i(%I)> 

then I get output lines with "set -x" like

+1:precmd:2(299)> integer new_status=0

I don't think there's any way of getting this information without actually executing
the code, though.
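
The PS4 approach can be wrapped so that the trace is captured rather than
just printed.  A minimal sketch, assuming zsh: %x and %I are zsh prompt
escapes, and trace_lines and demo are hypothetical names.  It has the
limitation noted above: the function really runs, side effects included,
and only lines that execute appear in the trace, so the closing brace is
still not reported.

```shell
# Hypothetical helper: run a command with xtrace enabled and print only the
# trace.  The subshell keeps PS4 and xtrace from leaking into the caller.
trace_lines() {
  (
    PS4='+%x:%I> '   # zsh: %x = file where the code was defined, %I = line in it
    set -x
    "$@"
  ) 2>&1 >/dev/null  # trace (stderr) goes to stdout, the command's stdout is dropped
}

# Example function to trace.
demo() {
  new_status=0
  echo "status is $new_status"
}

trace_lines demo
```

Under zsh each trace line starts with the defining file and the line
number in that file, so the largest number seen is the last executed line
of the function, not necessarily its last source line.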

pws


* Re: Source mangling in $functions_source and typeset -f
From: Bart Schaefer @ 2021-11-29 17:55 UTC
  To: Peter Stephenson; +Cc: Zsh Users

On Mon, Nov 29, 2021 at 8:00 AM Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
>
> > On 27 November 2021 at 17:47 Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > The parse tree only tracks the line numbers of executable code, so as
> > to be able to update the LINENO variable and print line numbers in
> > debug traces and prompts.
>
> Actually, the original line number is remembered, and can be output
> using a prompt

That's what I thought I said ...

> I don't think there's any way of getting this information without actually executing
> the code, though.

... which is exactly the problem here.


* Re: Source mangling in $functions_source and typeset -f
From: Daniel Shahaf @ 2021-12-01  6:53 UTC
  To: zsh-users

Bart Schaefer wrote on Mon, Nov 29, 2021 at 09:55:07 -0800:
> On Mon, Nov 29, 2021 at 8:00 AM Peter Stephenson
> <p.w.stephenson@ntlworld.com> wrote:
> >
> > > On 27 November 2021 at 17:47 Bart Schaefer <schaefer@brasslantern.com> wrote:
> > >
> > > The parse tree only tracks the line numbers of executable code, so as
> > > to be able to update the LINENO variable and print line numbers in
> > > debug traces and prompts.
> >
> > Actually, the original line number is remembered, and can be output
> > using a prompt
> 
> That's what I thought I said ...
> 
> > I don't think there's any way of getting this information without actually executing
> > the code, though.
> 
> ... which is exactly the problem here.

We _could_ teach zsh/parameter to emit this information, so, say,
${functions_source2[foo]} might expand to "/some/file.zsh:42", if that's
the filename and line number foo() was last defined at.

Another approach would be to use tools such as ctags.

