zsh-users
* zregexparse
@ 2017-03-29  8:46 Sebastian Gniazdowski
  2017-03-31  5:19 ` zregexparse Bart Schaefer
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Gniazdowski @ 2017-03-29  8:46 UTC (permalink / raw)
  To: zsh-users

Hello,
I've stumbled upon zregexparse and verified that it doesn't auto-load zsh/regex. The manual says:

   zregexparse
      This implements some internals of the _regex_arguments function.

Test V02 suggests this is a very capable tool. How is it compiled, using the LGPL GNU regex? I wonder what use cases it might have. This test:

  zregexparse p1 p2 abcdef \(  '/c?|?/' '{print $match[1]}' \) \#
  print $? $p1 $p2
0:abcdef ( /c?|?/ {M1} ) #
>a
>b
>cd
>e
>f
>0 6 6

reveals that it can run code for every match. The meaning of p1 and p2 isn't that clear. Has anyone come across a practical use case for this call? It looks safe to use, as it is tested, is not in a module, and even 4.3.17 has it..?

--
Sebastian Gniazdowski
psprint /at/ zdharma.org



* Re: zregexparse
  2017-03-29  8:46 zregexparse Sebastian Gniazdowski
@ 2017-03-31  5:19 ` Bart Schaefer
  2017-03-31  5:53   ` zregexparse Sebastian Gniazdowski
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Schaefer @ 2017-03-31  5:19 UTC (permalink / raw)
  To: zsh-users

On Mar 29, 10:46am, Sebastian Gniazdowski wrote:
}
} I've stumbled upon zregexparse. Verified that it doesn't auto-load
} zsh/regex. Manual says:
}
}    zregexparse
}       This implements some internals of the _regex_arguments function.
} 
} Test V02 suggest this is a very capable tool. How it is compiled, with
} use of LGPL Gnu regex?

It has its own simple regular expression matcher, towards the end of
the zsh/zutil module.  There's no borrowed code, except maybe the
algorithm from a textbook.

This was invented during the time when it had been decided that there
should be separate documentation for developers and users, so the yodl
doc was deliberately sparse on things only developers were supposed to
need to know about.  Probably a poor decision in hindsight, as in many
cases the doc for developers never got written.

} I wonder what use cases might it have

It's used in the following completions:

Completion/X/Command/_xset
Completion/X/Command/_xwit
Completion/Unix/Command/_ip
Completion/Zsh/Command/_ztodo
Completion/Debian/Command/_apt

The Completion/Base/Utility/_regex_arguments file contains what little
doc there is for the syntax.

There are probably several other completions that could do better on
context-sensitive arguments than they do with _arguments, if only the
use of _regex_arguments were a bit less impenetrable.



* Re: zregexparse
  2017-03-31  5:19 ` zregexparse Bart Schaefer
@ 2017-03-31  5:53   ` Sebastian Gniazdowski
  2017-03-31  6:58     ` zregexparse Sebastian Gniazdowski
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Gniazdowski @ 2017-03-31  5:53 UTC (permalink / raw)
  To: Bart Schaefer, zsh-users

On 31 March 2017 at 07:20:56, Bart Schaefer (schaefer@brasslantern.com) wrote:
> On Mar 29, 10:46am, Sebastian Gniazdowski wrote:
> }
> } I've stumbled upon zregexparse. Verified that it doesn't auto-load
> } zsh/regex. Manual says:
> }
> } zregexparse
> } This implements some internals of the _regex_arguments function.
> }
> } Test V02 suggest this is a very capable tool. How it is compiled, with
> } use of LGPL Gnu regex?
>  
> It has its own simple regular expression matcher, towards the end of
> the zsh/zutil module. There's no borrowed code, except maybe the
> algorithm from a textbook.
>  
> This was invented during the time when it had been decided that there
> should be separate documentation for developers and users, so the yodl
> doc was deliberately sparse on things only developers were supposed to
> need to know about. Probably a poor decision in hindsight, as in many
> cases the doc for developers never got written.

I suspect one reason it was left undocumented is that it doesn't integrate with the shell syntax. Writing such functions is rather a pain from one point of view. For example, it would be easy to add a Levenshtein distance function to zsh/util or to a new module that would match and sort according to the distance, so that the current fuzzy-finding fever ;) would move to the Zsh side. However, the function would be a small multi-argument hog, not something integrated like (o)/(O) or "= *glob*" (though zregexparse doesn't appear to be a hog). Coming up with some syntax for this would be a big discovery. Hmm, I think the problem may lie in the one feature that I'd expect: an additional array created with the corresponding distances. Without it, the integration with syntax is maybe even doable, hmm..

In history-search-multi-word I do:

    __hsmw_region_highlight_data=( )
    : "${text//(#mi)(${~colsearch_pattern})/$(( hsmw_append(MBEGIN,MEND) ))}"
    region_highlight+=( $__hsmw_region_highlight_data )

to colorize search-query matches. Apparently zregexparse would work here too, and who knows, maybe it would be faster. It's just about running code for every match, and hopefully zregexparse has MBEGIN and MEND too. It might, however, be slower if it's written as a completion backend: nice, unproblematic code to serve completions, i.e. rather unoptimized. PS. zregexparse also seems to have no array support, i.e. no equivalent of ${lines[@]//…}.

-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org



* Re: zregexparse
  2017-03-31  5:53   ` zregexparse Sebastian Gniazdowski
@ 2017-03-31  6:58     ` Sebastian Gniazdowski
  2017-03-31 21:13       ` zregexparse Bart Schaefer
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Gniazdowski @ 2017-03-31  6:58 UTC (permalink / raw)
  To: Bart Schaefer, zsh-users

PS. I've examined whether zregexparse could be used to fill region_highlight, and it seems it can't:

zregexparse p1 p2 abc \( '/bc|?/' '{print "$p1, $p2, $mbegin[1], $mend[1], $match[1]"}' \) \#
1, 1, 1, 1, a
3, 3, 1, 2, bc

The "|?" is needed, otherwise "bc" will not be matched: there is no substring matching. $p1 and $p2 are quite weird values, not really usable as mbegin/mend (unless they are mend, in which case mbegin could be computed using $#match[1]). And mbegin and mend are set locally, i.e. they always start from "1".

--
Sebastian Gniazdowski
psprint /at/ zdharma.org



* Re: zregexparse
  2017-03-31  6:58     ` zregexparse Sebastian Gniazdowski
@ 2017-03-31 21:13       ` Bart Schaefer
  2017-04-01 10:21         ` zregexparse Sebastian Gniazdowski
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Schaefer @ 2017-03-31 21:13 UTC (permalink / raw)
  To: zsh-users

On Mar 31,  8:58am, Sebastian Gniazdowski wrote:
} Subject: Re: zregexparse
}
} PS. I've examined if zregexparse could be use to fill
} region_highlight, and it seems it can't:
} 
} zregexparse p1 p2 abc \( '/bc|?/' '{print "$p1, $p2, $mbegin[1], $mend[1], $match[1]"}' \) \#
} 1, 1, 1, 1, a
} 3, 3, 1, 2, bc

The return values from zregexparse are pretty strongly tailored to be
used with compset and compadd, and it does nothing useful with $match
and friends -- it's not tied into glob pattern referencing at all.

The values stored in $p1 and $p2 here are intended to calculate offsets
into "abc" for a call to compset -p, which isn't going to help you with
ranges for highlighting.  Also (though not significant in this example),
zregexparse expects words joined with $'\0'.



* Re: zregexparse
  2017-03-31 21:13       ` zregexparse Bart Schaefer
@ 2017-04-01 10:21         ` Sebastian Gniazdowski
  2017-04-01 16:43           ` zregexparse Bart Schaefer
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Gniazdowski @ 2017-04-01 10:21 UTC (permalink / raw)
  To: Bart Schaefer, zsh-users

On 31 march 2017 at 23:15:31, Bart Schaefer (schaefer@brasslantern.com) wrote:
> } 1, 1, 1, 1, a
> } 3, 3, 1, 2, bc
>  
> The return values from zregexparse are pretty strongly tailored to be
> used with compset and compadd, and it does nothing useful with $match
> and friends -- it's not tied into glob pattern referencing at all.
>  
> The values stored in $p1 and $p2 here are intended to calculate offsets
> into "abc" for a call to compset -p, which isn't going to help you with
> ranges for highlighting. Also (though not significant in this example),
> zregexparse expects words joined with $'\0'.

Let me just express regret about this outcome. Redundancy is a cool thing. I was able to filter $history with both (M)/:# and (R), and (R) turned out to be faster. zregexparse could prove its value in e.g. a static build without zsh/regex. An IRC user once reported problems with some completions because he had configured the BSD port as a static build (no zsh/regex by default). I then quickly removed the regex usage from _hosts, but the patch wasn't accepted, maybe because the problem wasn't highlighted (that said, I may be naively equating zsh/regex with zregexparse, but it's just that shipping own regex engine is extremely cool, at least for me). I once wrote simple parsing of ANSI color codes using (#b), and I wonder whether zregexparse could do something here, but it would have to be able to progress across the text, and the lack of mend seems to be a problem.

-- 
Sebastian Gniazdowski
psprint /at/ zdharma.org



* Re: zregexparse
  2017-04-01 10:21         ` zregexparse Sebastian Gniazdowski
@ 2017-04-01 16:43           ` Bart Schaefer
  2017-04-01 23:21             ` zregexparse Bart Schaefer
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Schaefer @ 2017-04-01 16:43 UTC (permalink / raw)
  To: zsh-users

On Apr 1, 12:21pm, Sebastian Gniazdowski wrote:
} Subject: Re: zregexparse
}
} On 31 march 2017 at 23:15:31, Bart Schaefer (schaefer@brasslantern.com) wrote:
} > The return values from zregexparse are pretty strongly tailored to be
} > used with compset and compadd, and it does nothing useful with $match
} > and friends -- it's not tied into glob pattern referencing at all.
}
} Let me just express regret about this outcome.

It's not so much an "outcome" as a statement of fact.  zregexparse was
written to solve a particular problem.  At the time there was no reason
to consider having a generalized regex library; the =~ operator (and
the ability to have modules define operators) had not yet been added.
This is just one of the corners of the code that has not been paid
any attention in a very long time (it's essentially unchanged since
2001 or so).

} One IRC user once reported problems
} with some completions, because he sincerely configured BSD port as
} static build (no zsh/regex by default). I then quickly removed regex
} usage from _hosts, but the patch wasn't accepted

If you're referring to your post from March 8th, this appears to have
been a case of something missed because no one had time to commit it
right then, not an intentional rejection.



* Re: zregexparse
  2017-04-01 16:43           ` zregexparse Bart Schaefer
@ 2017-04-01 23:21             ` Bart Schaefer
  0 siblings, 0 replies; 8+ messages in thread
From: Bart Schaefer @ 2017-04-01 23:21 UTC (permalink / raw)
  To: zsh-users

On Apr 1,  9:43am, Bart Schaefer wrote:
}
} This is just one of the corners of the code that has not been paid
} any attention in a very long time (it's essentially unchanged since
} 2001 or so).

Also, I just looked a little more closely at the code to find out why
Sebastian's first example with '{print $match[1]}' seemed to output
something useful, and realized that zregexparse is built on top of
the globbing pattern matcher -- it builds a state machine for the
regular expression semantics that weren't yet part of extendedglob at
the time, and then steps through the state machine calling back for
each subexpression found; but otherwise it's globbing.

That's how/why $match[1] gets set (and reset) for each callback.

Oh -- Sebastian also remarked "just that shipping own regex engine is
extremely cool, at least for me" -- Src/pattern.c is explicitly (for
licensing reasons!) derived from Henry Spencer's regular expression
package, so zsh *is* "shipping own regex engine", it merely calls it
extendedglob instead.

The equivalent (though perhaps not easily a syntactic duplication) of
zregexparse probably could now be rewritten entirely with extendedglob.
E.g. here's a fragment of it:

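    zrp() {
      setopt localoptions extendedglob
      local var1=$1 var2=$2 string=$3 pattern=$4 callback=$5
      # Prologue run before each callback: store the match end in both variables.
      local _zrp_cb='typeset -g $var1=$mend[1] $var2=$mend[1]'
      {
        # Build a shell function from the prologue plus the callback and
        # expose it as a math function so it can be called from $(( ... )).
        functions[_zrp_cb]="$_zrp_cb;$callback;return 0"
        functions -M _zrp_cb
        # (#b) sets $match, $mbegin and $mend for every match of the pattern.
        : ${string//(#b)($~pattern)/$((_zrp_cb()))}
      } always {
        # Remove the temporary math function and shell function again.
        functions +M _zrp_cb
        unfunction _zrp_cb
      }
    }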

torch% zrp p1 p2 abc 'bc|?' '{print "$p1, $p2, $mbegin[1], $mend[1], $match[1]"}' 
1, 1, 1, 1, a
3, 3, 2, 3, bc

