From: Yichao Yu
Date: Mon, 28 Oct 2013 11:42:22 -0400
Subject: zsh bug: isearch doesn't support unicode properly
To: zsh-workers@zsh.org
List-Id: Zsh Workers List
X-Seq: 31921

Hi,

I've run into a problem when typing Unicode characters in zsh's history search. When using Unicode characters in isearch (the default bindings for ^S and ^R), some of the characters are not recognized correctly.
For example (with UTF-8 encoding), when typing "重新安装" (UTF-8 bytes: \xe9\x87\x8d\xe6\x96\xb0\xe5\xae\x89\xe8\xa3\x85), what zsh actually recognizes is "駭涰宩裥" (UTF-8 bytes: \xe9\xa7\xad\xe6\xb6\xb0\xe5\xae\xa9\xe8\xa3\xa5e).

It seems that the fifth bit (0x20) of a byte is flipped in some cases. I'm not yet sure what's wrong with these characters, but at least the meta character processing looks suspicious.

Yichao Yu
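
To make the bit flip easier to see, here is a minimal Python sketch that compares the two byte strings quoted above position by position. The names `typed`, `seen`, and `byte_diffs` are made up for this illustration and are not part of zsh; the stray trailing "e" in the second string is left out.

```python
# Byte strings copied from the report above (trailing "e" omitted).
typed = bytes.fromhex("e9878de696b0e5ae89e8a385")  # what was typed
seen = bytes.fromhex("e9a7ade6b6b0e5aea9e8a3a5")   # what zsh recognized


def byte_diffs(a, b):
    """Return (index, a_byte, b_byte, xor_mask) for each position where a and b differ."""
    return [(i, x, y, x ^ y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = byte_diffs(typed, seen)

print("typed:", typed.decode("utf-8"))
print("seen: ", seen.decode("utf-8"))
for i, x, y, mask in diffs:
    print(f"byte {i}: 0x{x:02x} -> 0x{y:02x} (xor 0x{mask:02x})")
```

In this sample every differing byte differs by exactly 0x20, and every byte that was changed lies in the range 0x80-0x9f, while bytes at 0xa0 and above are untouched. That is only an observation from these twelve bytes, but it may help narrow down where the meta character handling goes wrong.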