From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <13277e55555fc0e249f7ce04f144a19f@felloff.net>
Date: Sun, 30 Mar 2014 03:44:22 +0200
From: cinap_lenrek@felloff.net
To: 9fans@9fans.net
In-Reply-To: <938abc1c40e15468aa034d34b07a2d49@brasstown.quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] a strange bug in grep
Topicbox-Message-UUID: d2ba0122-ead8-11e9-9d60-3106f5b1d025

very good. one question about:

-	x = re2or(x, rclass(ov, Runemask));
+	x = re2or(x, rclass(ov, 0xffff));

this seems wrong for 21 bit runes (the old is also wrong i think).
shouldn't that be:

+	x = re2or(x, rclass(ov, Runemax));

as Runemask (0x1fffff) is not a valid rune for 21-bit runes, since
it is > Runemax.

as i understand it, the tab1[] array contains the last valid rune in
each range that shares the same utf8 encoding length. basically:
0-0x7f -> 1 byte, 0x80-0x7ff -> 2 bytes, etc. so adding 0xffff is
right. the next would be 0x10ffff for 21 bit runes, but there
shouldn't be any runes above 0x10ffff.

does that make sense?

--
cinap
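
to illustrate the range boundaries described above, here is a minimal sketch in plain c. it is not the tab1[] code from grep: the table name lastrune and the function utf8len are hypothetical, and it assumes Runemax is 0x10ffff as stated above. surrogates are not treated specially.

```c
#include <stddef.h>

/*
 * last rune encodable at each utf8 length, 1 to 4 bytes.
 * hypothetical table for illustration; not grep's tab1[].
 * assumes Runemax is 0x10ffff.
 */
static const unsigned long lastrune[] = { 0x7f, 0x7ff, 0xffff, 0x10ffff };

/* return the utf8 length of r in bytes, or 0 if r is above 0x10ffff */
int
utf8len(unsigned long r)
{
	size_t i;

	for(i = 0; i < sizeof lastrune / sizeof lastrune[0]; i++)
		if(r <= lastrune[i])
			return (int)i + 1;
	return 0;
}
```

with this table, 0xffff is the last 3 byte rune and 0x10ffff the last 4 byte rune, while 0x1fffff (Runemask) falls outside every range and utf8len returns 0 for it.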