From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <22fc94a82c16f8b347bc45dd539b5fc6@coraid.com>
References: <71b1e3b728efbd1b2a2ae2b5b4e2b1d0@coraid.com>
	<3aaafc130911300754i7f244f02j7d161d907d7a8bed@mail.gmail.com>
	<22fc94a82c16f8b347bc45dd539b5fc6@coraid.com>
Date: Mon, 30 Nov 2009 14:43:32 -0500
Message-ID: <3aaafc130911301143h1e788401t7741a55b7fbb18cb@mail.gmail.com>
From: Jorden Mauro
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Subject: Re: [9fans] grëp (rhymes with creep) and cptmp
Topicbox-Message-UUID: a59408ee-ead5-11e9-9d60-3106f5b1d025

On Mon, Nov 30, 2009 at 11:00 AM, erik quanstrom wrote:
>> ``unfold turns a character, say ë into the set of
>> characters that can be folded to the same base
>> character.  so
>>         ; unfold ë
>>         [eèéêëēĕėęěȅȇȩḕḗḙḛḝẹẻẽếềểễệ]''
>>
>> To me, that sounds like [e-f] should be
>>
>> [eèéêëēĕėęěȅȇȩḕḗḙḛḝẹẻẽếềểễệfƒ]
>>
>> iff e unfolds to the same set as ë. If e only unfolds to [e], then
>> [e-f] would unfold to [ef].
>
> i don't think that works.  consider [e-g].  normally
> this would match 'f', but under your algorithm it wouldn't.
> the problem is that [a-z] works because ascii is arranged
> in alphabetical order.  all the various accented characters
> are not.

It would work if the algorithm didn't expand the class just by
enumerating ASCII letters, but for every letter also added the
accented chars.
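
A rough sketch of that expansion, in Python rather than anything from
the Plan 9 tree. It assumes that "folds to the same base character" can
be approximated by taking the first code point of a character's
canonical decomposition (NFD). That is a stand-in, not erik's actual
unfold table, so the sets will differ in places (ƒ, for one, has no
decomposition and would not be grouped with f here). The names base,
unfold and expand_range and the 0x2000 scan limit are made up for the
illustration.

```python
import unicodedata

def base(ch):
    """Fold ch to a base character: the first code point of its NFD form.
    Assumption: this approximates the folding discussed in the thread."""
    return unicodedata.normalize("NFD", ch)[0]

def unfold(ch, limit=0x2000):
    """Return every character below `limit` that folds to the same base as ch."""
    b = base(ch)
    return "".join(c for c in map(chr, range(limit)) if base(c) == b)

def expand_range(lo, hi):
    """Expand the class range lo-hi: enumerate the code points in order,
    as usual, and add the unfolded variants of each one along the way."""
    return "".join(unfold(chr(cp)) for cp in range(ord(lo), ord(hi) + 1))
```

Because the range is still walked in code point order, expand_range("e", "g")
keeps the plain f between e and g and only adds the accented variants on
top of it.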
>
> that's why the folding approach has an advantage [a-z]
> will work and will do the Right Thing.
>
> - erik
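
For comparison, the folding approach quoted above can be sketched the
other way around: leave the pattern alone and fold the input before
matching. This is Python with the standard re module, not the grëp
source, and it assumes again that folding can be approximated by the
first code point of the NFD form. The names fold and folded_search are
hypothetical.

```python
import re
import unicodedata

def fold(text):
    """Replace each character by the first code point of its NFD form.
    Assumption: a stand-in for the real folding table, which is not shown."""
    return "".join(unicodedata.normalize("NFD", ch)[0] for ch in text)

def folded_search(pattern, line):
    """Match an unmodified pattern against the folded copy of the line."""
    return re.search(pattern, fold(line)) is not None
```

With this, folded_search("gr[a-z]p", "grëp") succeeds with the class left
untouched, since ë has already been folded to e by the time [a-z] sees it.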