X-Seq: 23320
Date: Tue, 24 Apr 2007 23:22:20 -0700
From: Bart Schaefer
Subject: Re: rfc2396 url encoding
To: zsh-workers@sunsite.dk
Message-id: <070424232221.ZM23158@torch.brasslantern.com>
In-reply-to: <20070425034745.GA11136@scowler.net>
References: <20051106185713.GA11612@scowler.net> <1051106194911.ZM32000@candle.brasslantern.com> <20070425034745.GA11136@scowler.net>

On Apr 24, 11:47pm, Clint Adams wrote:
}
} >     input=( ${(s::)1} )
} >     print ${(j::)input/(#b)([^A-Za-z0-9_.!~*\'\(\)-])/%$(([##16]#match))}
}
} This changed behavior in that ó (UTF-8) becomes %F3 instead of %C3%B3 .
} Thoughts?

(Is that actually wrong?  When I cut and paste from your email message,
there's only one byte at ó.)

I see the point, though, I think.  (s::) is now splitting between wide
characters rather than between raw bytes.  What happens if you simply
unsetopt multibyte within the function?

urlencode() {
    setopt localoptions extendedglob nomultibyte
    input=( ${(s::)1} )
    print ${(j::)input/(#b)([^A-Za-z0-9_.!~*\'\(\)-])/%$(([##16]#match))}
}

If that doesn't work, we'll have to come up with some way to force $1 to
be re-interpreted as a raw byte string rather than a wide character string
before splitting it up.
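
As a rough illustration of what "raw byte string" could mean in practice, here is a sketch that avoids the shell's character handling altogether: it dumps the argument through od and decides byte by byte whether to escape. This is a swapped-in technique using portable sh, not the (s::) approach above and not something proposed in this thread. The function name urlencode_bytes is hypothetical, and the unescaped set is taken from the character class in the function above.

```shell
# Hypothetical helper (an assumption, not part of zsh): percent-encode $1
# one byte at a time, independent of the shell's multibyte setting.
urlencode_bytes() {
    # od prints every byte as two lowercase hex digits; tr puts one per line.
    printf '%s' "$1" | od -An -v -tx1 | tr -s ' ' '\n' |
    while read -r hex; do
        [ -n "$hex" ] || continue
        case $hex in
            # 0-9, A-Z, a-z, then - . _ ~ ! * ' ( ) pass through unescaped
            3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|2d|2e|5f|7e|21|2a|27|28|29)
                # turn the hex value back into the literal byte via an octal escape
                printf "\\$(printf '%03o' "0x$hex")" ;;
            *)
                # everything else becomes %XX with uppercase hex digits
                printf '%%%s' "$hex" | tr a-f A-F ;;
        esac
    done
    echo
}
```

With this, urlencode_bytes 'a b' prints a%20b. An argument holding the two bytes C3 B3 (ó in UTF-8) comes out as %C3%B3, while the single ISO-8859-1 byte F3 comes out as %F3, whatever the multibyte option is set to. The cost is a few extra processes per call, so it is probably only worth considering if nomultibyte turns out not to be enough.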