From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
To: 9fans@cse.psu.edu, uriel@cat-v.org
References: <0658d9ebf605b525d017007cadbc2e51@cat-v.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
In-Reply-To: <0658d9ebf605b525d017007cadbc2e51@cat-v.org>
Subject: Re: [9fans] ports from GPL
Message-Id: <20060320115059.DC5B311FC1@dexter-peak.quanstro.net>
Date: Mon, 20 Mar 2006 05:50:59 -0600
Cc:
Topicbox-Message-UUID: 198db6fa-ead1-11e9-9d60-3106f5b1d025

uriel@cat-v.org writes
|
| > the gnu awk folks are doing a pretty good job, given their constraints.
| >
| > i have not read the sed code (for a while, anyway), but i could imagine
| > that it may have the same character set problems as newer versions of gnu grep.
| > gnu grep calls mbtowc for each input character, even when not required.
| >
| > have you tried your test with LC_LANG=C?
|
| I have seen GNU awk produce different matches with LC_ALL=UTF-8 than
| with LC_ALL=C when input was plain ASCII (only digits!)

can you give an example script and example input that exhibits this?