zsh-workers
From: Phil Pennock <zsh-workers+phil.pennock@spodhuis.org>
To: Peter Stephenson <p.w.stephenson@ntlworld.com>
Cc: zsh-workers@zsh.org
Subject: PATCH: zsh/regex meta fixes for widechar (Re: Possible 4.3.18?)
Date: Fri, 15 Jun 2012 20:44:11 -0400
Message-ID: <20120616004411.GA16035@redoubt.spodhuis.org>
In-Reply-To: <20120615194210.33ab9abc@pws-pc.ntlworld.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160

On 2012-06-15 at 19:42 +0100, Peter Stephenson wrote:
> Phil was suggesting he might need to make a change before big release,
> is that still the case?

Oh right, sorry.  The zsh/regex module needs to handle wide characters
without getting upset, which means unmetafying the strings before they
reach regcomp()/regexec() and metafying the matches on the way back out.

Here's the patch, I'll commit shortly.

AFAICT, I'm already using the correct character-length counting for
$mbegin and $mend, so this patch is sufficient.  It seems suspiciously
simpler than the pcre.c revision 1.19 change I wrote.

With patch:
- ----------------------------8< cut here >8------------------------------
% unsetopt rematchpcre
% [[ 'aei→bx' =~ ^([aeiou]+)(.)(.) ]] && print -l $match === $mbegin === $mend === $MATCH
aei
→
b
===
1
4
5
===
3
4
5
===
aei→b
- ----------------------------8< cut here >8------------------------------

Without patch:
- ----------------------------8< cut here >8------------------------------
% [[ 'aei→bx' =~ ^([aeiou]+)(.)(.) ]] && print -l $match === $mbegin === $mend === $MATCH
ae
i
?
===
1
3
4
===
2
3
4
===
aei?
- ----------------------------8< cut here >8------------------------------
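
For anyone unfamiliar with the internal encoding, here is a minimal
sketch of what unmetafying does; it is not the actual code from
Src/utils.c.  It assumes the Meta marker byte is 0x83 and that the byte
following it is stored XORed with 32; sketch_unmetafy and SKETCH_META
are hypothetical names used only for illustration.  Under those
assumptions the UTF-8 bytes of '→' (E2 86 92) are held internally as
E2 83 A6 83 B2, which would explain why regexec() no longer sees a
single character there unless the string is decoded first.

```c
#include <stddef.h>

/* Assumption: value of zsh's Meta marker byte. */
#define SKETCH_META 0x83

/*
 * Decode a NUL-terminated metafied byte string in place.
 * Each Meta byte is dropped and the byte after it is XORed with 32.
 * Returns the decoded length in bytes.
 * Hypothetical helper for illustration; the real unmetafy() differs.
 */
size_t sketch_unmetafy(unsigned char *s)
{
    unsigned char *in = s, *out = s;

    while (*in) {
        if (*in == SKETCH_META && in[1]) {
            in++;                   /* skip the Meta marker */
            *out++ = *in++ ^ 32;    /* restore the original byte */
        } else {
            *out++ = *in++;         /* ordinary byte, copy as-is */
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Decoding the metafied form of 'aei→bx' with this helper shrinks it from
10 bytes to 8 and yields the plain UTF-8 string again.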

Index: Src/Modules/regex.c
===================================================================
RCS file: /home/cvsroot/remote-repos/zsh-repo/zsh/Src/Modules/regex.c,v
retrieving revision 1.7
diff -a -u -p -r1.7 regex.c
- --- Src/Modules/regex.c	20 Jan 2010 11:17:11 -0000	1.7
+++ Src/Modules/regex.c	16 Jun 2012 00:30:08 -0000
@@ -3,7 +3,7 @@
  *
  * This file is part of zsh, the Z shell.
  *
- - * Copyright (c) 2007 Phil Pennock
+ * Copyright (c) 2007,2012 Phil Pennock
  * All Rights Reserved.
  *
  * Permission is hereby granted, without written agreement and without
@@ -56,14 +56,19 @@ zcond_regex_match(char **a, int id)
     regex_t re;
     regmatch_t *m, *matches = NULL;
     size_t matchessz = 0;
- -    char *lhstr, *rhre, *s, **arr, **x;
+    char *lhstr, *lhstr_zshmeta, *rhre, *rhre_zshmeta, *s, **arr, **x;
     int r, n, return_value, rcflags, reflags, nelem, start;
 
- -    lhstr = cond_str(a,0,0);
- -    rhre = cond_str(a,1,0);
+    lhstr_zshmeta = cond_str(a,0,0);
+    rhre_zshmeta = cond_str(a,1,0);
     rcflags = reflags = 0;
     return_value = 0; /* 1 => matched successfully */
 
+    lhstr = ztrdup(lhstr_zshmeta);
+    unmetafy(lhstr, NULL);
+    rhre = ztrdup(rhre_zshmeta);
+    unmetafy(rhre, NULL);
+
     switch(id) {
     case ZREGEX_EXTENDED:
 	rcflags |= REG_EXTENDED;
@@ -101,7 +106,7 @@ zcond_regex_match(char **a, int id)
 	    if (nelem) {
 		arr = x = (char **) zalloc(sizeof(char *) * (nelem + 1));
 		for (m = matches + start, n = start; n <= (int)re.re_nsub; ++n, ++m, ++x) {
- -		    *x = ztrduppfx(lhstr + m->rm_so, m->rm_eo - m->rm_so);
+		    *x = metafy(lhstr + m->rm_so, m->rm_eo - m->rm_so, META_DUP);
 		}
 		*x = NULL;
 	    }
@@ -112,7 +117,7 @@ zcond_regex_match(char **a, int id)
 		char *ptr;
 
 		m = matches;
- -		s = ztrduppfx(lhstr + m->rm_so, m->rm_eo - m->rm_so);
+		s = metafy(lhstr + m->rm_so, m->rm_eo - m->rm_so, META_DUP);
 		setsparam("MATCH", s);
 		/*
 		 * Count the characters before the match.
@@ -174,12 +179,16 @@ zcond_regex_match(char **a, int id)
 	break;
     default:
 	DPUTS(1, "bad regex option");
- -	return 0; /* nothing to cleanup, especially not "re". */
+	return_value = 0;
+	goto CLEAN_BASEMETA;
     }
 
     if (matches)
 	zfree(matches, matchessz);
     regfree(&re);
+CLEAN_BASEMETA:
+    free(lhstr);
+    free(rhre);
     return return_value;
 }
 
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk/b1tIACgkQQDBDFTkDY39n9gCeLnbvIM3fpndE0GaNOGdN338s
u1cAn064nG9fOcKq80Zf2Dg5IwL/O5R7
=Q7/2
-----END PGP SIGNATURE-----

