zsh-workers
From: Jon Strait <jstrait@moonloop.net>
Cc: zsh workers <zsh-workers@sunsite.dk>
Subject: Re: PATCH: New options for the PCRE module (to replace my previous)
Date: Wed, 25 Mar 2009 04:21:36 -0700
Message-ID: <49CA13C0.1070308@moonloop.net>
In-Reply-To: <20090325100256.33b614d3@news01>



Peter Stephenson wrote:
> On Wed, 25 Mar 2009 02:20:02 -0700
> Jon Strait <jstrait@moonloop.net> wrote:
>   
>> A few adjustments since last time, with documentation.
>>
>> No reset of the special variables is done on a match failure.
>>     
>
> Er, I can't remember what Phil said (I haven't been following this in any
> detail), but the documentation now says variables aren't altered on a
> failure, so presumably that is now incorrect.  I don't think this is
> crucial as long as it's documented correctly.
>
> Could you in any case send a documentation patch against the current source
> and with lines wrapped to 80 columns?
>
> Thanks.
>
>   
No, I ended up keeping the original behavior: on a match failure, none of
the special variables are modified or reset.

Here is the updated doc patch.

Please let me know if anything I added isn't clear enough.  Thanks.


[-- Attachment #2: mod_pcre.yo.patch --]
[-- Type: text/x-patch, Size: 2654 bytes --]

--- mod_pcre-old.yo	2009-01-15 01:49:06.000000000 -0800
+++ mod_pcre.yo	2009-03-25 03:55:58.000000000 -0700
@@ -6,7 +6,7 @@
 
 startitem()
 findex(pcre_compile)
-item(tt(pcre_compile) [ tt(-aimx) ] var(PCRE))(
+item(tt(pcre_compile) [ tt(-aimxs) ] var(PCRE))(
 Compiles a perl-compatible regular expression.
 
 Option tt(-a) will force the pattern to be anchored.
@@ -15,6 +15,8 @@
 tt(^) and tt($) will match newlines within the pattern.
 Option tt(-x) will compile an extended pattern, wherein
 whitespace and tt(#) comments are ignored.
+Option tt(-s) makes the dot metacharacter match all characters, 
+including those that indicate newline.
 )
 findex(pcre_study)
 item(tt(pcre_study))(
@@ -22,7 +24,8 @@
 matching.
 )
 findex(pcre_match)
-item(tt(pcre_match) [ tt(-v) var(var) ] [ tt(-a) var(arr) ] var(string))(
+item(tt(pcre_match) [ tt(-v) var(var) ] [ tt(-a) var(arr) ] \
+[ tt(-n) var(offset) ] [ tt(-b) ] var(string))(
 Returns successfully if tt(string) matches the previously-compiled
 PCRE.
 
@@ -33,8 +36,38 @@
 case it will set the array var(arr).  Similarly, the variable
 var(MATCH) will be set to the entire matched portion of the
 string, unless the tt(-v) option is given, in which case the variable
-var(var) will be set.
-No variables are altered if there is no successful match.
+var(var) will be set. 
+No variables are altered if there is no successful match. 
+The tt(-n) option starts searching for a match from the
+byte var(offset) position in var(string).  If the tt(-b) option is given, 
+the variable var(ZPCRE_OP) will be set to an offset pair string, 
+representing the byte offset positions of the entire matched portion 
+within the var(string).  For example, a var(ZPCRE_OP) set to "32 45" indicates
+that the matched portion began on byte offset 32 and ended on byte offset 44. 
+Here, byte offset position 45 is the position directly after the matched
+portion.  Keep in mind that the byte position isn't necessarily the same
+as the character position when multibyte UTF-8 characters are involved.
+Consequently, the byte offsets should only be relied on as arguments to
+the tt(-n) option in subsequent searches on the same var(string).  This is
+mostly useful for implementing "find all non-overlapping matches"
+functionality.
+
+A simple example of "find all non-overlapping matches":
+
+example(
+string="The following zip codes: 78884 90210 99513"
+pcre_compile -m "\d{5}"
+accum=()
+pcre_match -b -- $string
+while [[ $? -eq 0 ]] do
+    b=($=ZPCRE_OP)
+    accum+=$MATCH
+    pcre_match -b -n $b[2] -- $string
+done
+print -l $accum
+
+
+)
 )
 enditem()
 

