COMMENT(!MOD!zsh/re2
Interface to the RE2 regular expression library.
!MOD!)
cindex(regular expressions)
cindex(re2)
The tt(zsh/re2) module provides regular expression handling using the
RE2 library.
This engine assumes UTF-8 strings by default and zsh never disables this.
Canonical documentation for the syntax accepted by this regular expression
engine can be found at:
uref(https://github.com/google/re2/wiki/Syntax)

The tt(zsh/re2) module provides builtin commands and test conditions.

Regular expressions can be pre-compiled and given explicit names; these
are not shell variables and do not share a namespace with them.  There
is currently no mechanism to enumerate them.

The supported commands are:

startitem()
findex(re2_compile)
item(tt(re2_compile) COMMENT(TODO: [ tt(-R) var(NAME) ]) [ tt(-acilwLP) ] var(REGEX))(
Compiles an RE2-syntax regular expression, defaulting to case-sensitive.

COMMENT(TODO: Option tt(-R) stores the regular expression with the given name,
instead of in anonymous global state.)
Option tt(-L) will interpret the pattern as a literal, not a regex.
Option tt(-P) will enable POSIX syntax instead of the full language.
Option tt(-a) will force the pattern to be anchored.
Option tt(-c) will re-enable Perl class support in POSIX mode.
Option tt(-i) will compile a case-insensitive pattern.
Option tt(-l) will use a longest-match rather than a first-match algorithm
for selecting which branch matches.
Option tt(-w) will re-enable Perl word-boundary support in POSIX mode.
)
enditem()

startitem()
findex(re2_match)
item(tt(re2_match) [ tt(-v) var(var) ] [ tt(-a) var(arr) ] \
COMMENT(TODO:[ tt(-R) var(REGNAME) ]|)[ tt(-P) var(PATTERN) ] var(string))(
Matches a regular expression against the supplied string, storing matches in
variables.
Returns success if var(string) matches the tested regular expression.

Without option+COMMENT(TODO: s tt(-R) or) tt(-P), the match is made against
the implicit current regular expression object, which must have been compiled
with tt(re2_compile).
COMMENT(TODO: Option tt(-R) will use the regular expression with the given name.)
Option tt(-P) will take a regular expression as a parameter and compile and
use it, without changing the implicit current regular expression object as
set by calling tt(re2_compile).

Without a successful match, no variables are modified, even those explicitly
specified.

Upon successful match: the entire matched portion of var(string) is stored in
the var(var) of option tt(-v) if given, else in tt(MATCH); any captured
sub-expressions are stored in the array var(arr) of option tt(-a) if given,
else in tt(match).

No offset variables are currently mutated; this may change in a future release
of Zsh.
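
As an illustration only, the two builtins might be combined as follows.
The pattern, the input strings and the parameter names tt(whole) and
tt(parts) are invented for this sketch and are not part of the module.

example(zmodload zsh/re2
# Compile once, case-insensitively; this becomes the implicit current regex.
re2_compile -i '^([a-z]+)@([a-z]+)$'
if re2_match -v whole -a parts 'User@Example'; then
  print -r -- "$whole"     # the entire matched text
  print -rl -- $parts      # one captured sub-expression per line
fi
# A one-off pattern given with -P leaves the compiled regex above untouched.
re2_match -P '^v[0-9]' 'v5.3' && print -r -- "$MATCH")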
)
enditem()

The supported test conditions are:

startitem()
findex(re2-match)
item(var(expr) tt(-re2-match) var(regex))(
Matches a string against an RE2 regular expression.
Upon successful match, the
matched portion of the string will normally be placed in the tt(MATCH)
variable.  If there are any capturing parentheses within the regex, then
the tt(match) array variable will contain those.
If the match is not successful, then the variables will not be altered.

In addition, the tt(MBEGIN) and tt(MEND) variables are updated to point
to the offsets within var(expr) for the beginning and end of the matched
text, with the tt(mbegin) and tt(mend) arrays holding the beginning and
end of each substring matched.

If the option tt(BASH_REMATCH) is set, then the array tt(BASH_REMATCH) will
be set instead of all of the other variables.

The tt(NO_CASE_MATCH) option may be used to make matching case-insensitive.

For finer-grained control, use the tt(re2_match) builtin.
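
A minimal sketch of the test condition, assuming the module is loaded; the
input string and the pattern are made up for the example:

example(zmodload zsh/re2
if [[ "release-4.2" -re2-match '([0-9]+)\.([0-9]+)' ]]; then
  print -r -- "$MATCH"                # text matched by the whole regex
  print -r -- "$match[1] $match[2]"   # the two captured groups
fi)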
)
enditem()

startitem()
findex(re2-match-posix)
item(var(expr) tt(-re2-match-posix) var(regex))(
Matches as per tt(-re2-match) but configuring the RE2 engine to use
POSIX syntax.
)
enditem()

startitem()
findex(re2-match-posixperl)
item(var(expr) tt(-re2-match-posixperl) var(regex))(
Matches as per tt(-re2-match) but configuring the RE2 engine to use
POSIX syntax, with the Perl classes and word-boundary extensions re-enabled
too.

This thus adds support for:
tt(\d), tt(\s), tt(\w), tt(\D), tt(\S), tt(\W), tt(\b), and tt(\B).
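
For instance, a pattern such as the following (an invented example) relies
on those extensions and is only expected to be accepted in this mode, not
with tt(-re2-match-posix):

example(zmodload zsh/re2
[[ "id 42" -re2-match-posixperl '\bid\s\d+$' ]] && print -r -- "$MATCH")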
)
enditem()

startitem()
findex(re2-match-longest)
item(var(expr) tt(-re2-match-longest) var(regex))(
Matches as per tt(-re2-match) but configuring the RE2 engine to find
the longest match, instead of the first.

For example, given

example([[ abb -re2-match-longest ^a+LPAR()b|bb+RPAR() ]])

This will match the right-hand branch, thus tt(abb), where tt(-re2-match)
would instead match only tt(ab).
)
enditem()