9fans - fans of the OS Plan 9 from Bell Labs
From: Uriel <uriel99@gmail.com>
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@cse.psu.edu>
Subject: Re: [9fans] simplicity
Date: Tue, 18 Sep 2007 17:38:48 +0200
Message-ID: <5d375e920709180838t4070c23al11bc0eb5cc7280c9@mail.gmail.com>
In-Reply-To: <7359f0490709180827h6978ae52re27825646a091ec8@mail.gmail.com>

Don't complain; at least it is not producing random behaviour. I have
seen versions of gnu awk that, when fed plain ASCII input with the
locale set to UTF-8, would match rules against random lines of input.
The fix? Set the locale to 'C' at the top of all your scripts (and don't
even think of dealing with files which actually contain non-ASCII UTF-8).

This was some years ago and it might be fixed by now, but it demonstrates
how the locale insanity makes life so much more fun.

And talking of simplicity, don't forget to mention X. By chance I just
found this gem in one of the many X headers:

#define NBBY    8       /* number of bits in a byte */

uriel


On 9/18/07, Rob Pike <robpike@gmail.com> wrote:
> On 9/17/07, Douglas A. Gwyn <DAGwyn@null.net> wrote:
> > erik quanstrom wrote:
> > > i think the devolution of gnu grep is quite instructive.  ...
> > > it gets to the heart of why plan9's invention and use (thanks rob, ken) of
> > > utf-8 is so great.
> >
> > If the problem is that Gnu grep converts any non-8-bit character set
> > to wchar_t (the equivalent of Plan 9 "rune"), then it's not really a
> > fair criticism of the software.  The conversion approach handles a
> > wide variety of character encoding schemes, whereas grepping the
> > encodings directly (the fast approach) doesn't work well for many
> > non-UTF-8 encodings.
>
> Well, on a 2GHz x86, gnu grep ran for me at about 9600 baud on an
> ASCII file if I set my locale to the UTF-8 locale.  UTF-8 is ASCII
> compatible - explicitly, publicly, and on purpose - so there is no
> excuse for this sort of performance penalty.  To be specific, in
> the UTF-8 locale it should take just a few instructions to convert
> any character to wchar_t, ASCII or not, but gnu grep was calling
> malloc for this, even for an ASCII byte.
>
> It is a fair criticism to say this is unacceptable, whatever the
> intentions of the authors may be.
>
> -rob
>
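
To make the "just a few instructions" point in the quoted message concrete, here is a minimal sketch of a UTF-8 decoder with an ASCII fast path and no allocation. It is an illustration only: decode_rune is a hypothetical helper, not code from gnu grep, glibc, or Plan 9, and the choice to report malformed input as U+FFFD while consuming one byte is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not taken from gnu grep or Plan 9).
 * Decodes one UTF-8 sequence from s, where n bytes are available,
 * and stores the code point in *r.
 * Returns the number of bytes consumed, or 0 when n is 0.
 * Malformed input stores U+FFFD and consumes a single byte.
 */
static size_t decode_rune(const unsigned char *s, size_t n, uint32_t *r)
{
    assert(s != NULL && r != NULL);
    if (n == 0)
        return 0;

    uint32_t c = s[0];
    if (c < 0x80) {             /* ASCII fast path: compare, store, done */
        *r = c;
        return 1;
    }

    size_t len;                 /* total length of the sequence */
    uint32_t min;               /* smallest value allowed at this length */
    if ((c & 0xE0) == 0xC0) {
        len = 2; min = 0x80; c &= 0x1F;
    } else if ((c & 0xF0) == 0xE0) {
        len = 3; min = 0x800; c &= 0x0F;
    } else if ((c & 0xF8) == 0xF0) {
        len = 4; min = 0x10000; c &= 0x07;
    } else {
        goto bad;               /* stray continuation byte or invalid lead */
    }

    if (n < len)
        goto bad;               /* truncated sequence */
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            goto bad;           /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        goto bad;

    *r = c;
    return len;

bad:
    *r = 0xFFFD;
    return 1;
}
```

A caller would walk a buffer by repeatedly calling decode_rune and advancing by the returned count. For pure ASCII input each character costs one comparison and one store, which is the behaviour the quoted message argues a UTF-8 locale should allow.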

