9fans - fans of the OS Plan 9 from Bell Labs
From: uriel@cat-v.org
To: 9fans@cse.psu.edu
Subject: Re: [9fans] ports from GPL
Date: Mon, 20 Mar 2006 04:39:43 +0100
Message-ID: <0658d9ebf605b525d017007cadbc2e51@cat-v.org>
In-Reply-To: <20060320021808.91DE411FC1@dexter-peak.quanstro.net>

> the gnu awk folks are doing a pretty good job, given their constraints.
>
> i have not read the sed code (for a while, anyway), but i could imagine
> that it may have the same character set problems as newer versions of gnu grep.
> gnu grep calls mbtowc for each input character, even when not required.
>
> have you tried your test with LC_LANG=C?

I have seen GNU awk produce different matches with LC_ALL=UTF-8 than
with LC_ALL=C when the input was plain ASCII (only digits!).

Since then I add LC_ALL=C at the top of all my Unix shell scripts, not
for performance reasons, but because otherwise things often break in
subtle and very hard-to-debug ways. Really sad.
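
A minimal sketch of that habit, assuming a POSIX sh and any awk. It sets
LC_ALL, the standard variable that overrides LANG and every other LC_*
setting; digits_only is a made-up helper for illustration, not something
from this thread.

```sh
#!/bin/sh
# Force the POSIX "C" locale for everything this script runs, so that
# regex character classes, collation, and case mapping are byte-based
# and do not depend on the caller's environment.
LC_ALL=C
export LC_ALL

# Hypothetical helper: keep only lines made up entirely of ASCII digits.
digits_only() {
    awk '/^[0-9]+$/'
}

printf '123\n12a\n456\n' | digits_only
```

Exporting the variable matters: a plain assignment would affect only the
shell itself, not the awk, sed, or grep processes it starts.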

I wonder how many more years we will have to wait until any unix
system supports UTF-8 properly.

The only thing that excuses GNU is that the locale system is not entirely
their fault; locales are probably one of the worst ideas in the
history of Unix, if not the worst.

I will ignore the subject of UTF-8 support in terminal
emulators; many books could be written about the various kinds of
braindamage in this area.  Thank God for 9term.

> | I wonder who spent so much time speeding up awk and ignoring sed? :)

A program that produces incorrect results twice as fast is infinitely slower.
    -- John Ousterhout

I wonder how many thousands of man-years have been wasted due to
locale-related braindamage.

uriel