From mboxrd@z Thu Jan  1 00:00:00 1970
From: erik quanstrom <quanstro@quanstro.net>
To: 9fans@cse.psu.edu, uriel@cat-v.org
References: <0658d9ebf605b525d017007cadbc2e51@cat-v.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
In-Reply-To: <0658d9ebf605b525d017007cadbc2e51@cat-v.org>
Subject: Re: [9fans] ports from GPL
Message-Id: <20060320115059.DC5B311FC1@dexter-peak.quanstro.net>
Date: Mon, 20 Mar 2006 05:50:59 -0600
Cc:
Topicbox-Message-UUID: 198db6fa-ead1-11e9-9d60-3106f5b1d025

uriel@cat-v.org writes

|
| > the gnu awk folks are doing a pretty good job, given their constraints.
| >
| > i have not read the sed code (for a while, anyway), but i could imagine
| > that it may have the same character set problems as newer versions of gnu grep.
| > gnu grep calls mbtowc for each input character, even when not required.
| >
| > have you tried your test with LC_ALL=C?
|
| I have seen GNU awk produce different matches with LC_ALL=UTF-8 than
| with LC_ALL=C when input was plain ASCII (only digits!)

can you give an example script and example input that exhibit this?