9fans - fans of the OS Plan 9 from Bell Labs
From: erik quanstrom <quanstro@coraid.com>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] simplicity
Date: Mon, 17 Sep 2007 11:55:04 -0400
Message-ID: <7ba10925935da3080b62c7cb6e2649d5@coraid.com>
In-Reply-To: <46EE9A41.7DD78E60@null.net>

> erik quanstrom wrote:
> > i think the devolution of gnu grep is quite instructive.  ...
> > it gets to the heart of why plan9's invention and use (thanks rob, ken) of
> > utf-8 is so great.
>
> If the problem is that Gnu grep converts any non-8-bit character set
> to wchar_t (the equivalent of Plan 9 "rune"), then it's not really a
> fair criticism of the software.  The conversion approach handles a
> wide variety of character encoding schemes, whereas grepping the
> encodings directly (the fast approach) doesn't work well for many
> non-UTF-8 encodings.

performance may suck, but that's just a symptom of a bigger problem.

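wchar_t is not the equivalent of Rune.  a Rune always holds a unicode
code point, and the text it was decoded from is always utf-8.  wchar_t
can be whatever the implementation and locale make it.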

this is not a feature.  it is a bug.

suppose Linux user a and user b grep the same "text" file for the same string.
results will depend on the users' locales.

contrast plan 9.  any two users grepping the same file for the same string
will get the same results.
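
a minimal sketch of the difference, in python rather than anything grep
actually does.  the sample word, the pattern and the helper name matches
are all assumptions made up for illustration; the charset argument stands
in for a user's locale.

```python
import re

# the same four-character word, stored on disk as utf-8 bytes
data = "café".encode("utf-8")      # b'caf\xc3\xa9'

# "caf" followed by exactly one character, roughly grep -x 'caf.'
pattern = re.compile(r"caf.")

def matches(raw: bytes, charset: str) -> bool:
    """Decode raw as a reader using charset would, then match the whole line."""
    return pattern.fullmatch(raw.decode(charset)) is not None

utf8_user = matches(data, "utf-8")      # sees 4 characters: c a f é
latin1_user = matches(data, "latin-1")  # sees 5 characters: c a f Ã ©

print(utf8_user, latin1_user)           # True False
```

same bytes, same pattern, different answers.  if every reader decodes as
utf-8, the second case cannot happen.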

in either case a character set conversion might be necessary to match
the locale.  but in the plan 9 case, one conversion to utf-8 will fix
things for every plan 9 user.  in the Linux case, there is no single
conversion that will fix things for every Linux user.
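
the one-conversion point can be sketched the same way.  this is an
illustration under assumptions, not a description of any plan 9 tool: the
latin-1 sample line and the helper name to_utf8 are hypothetical.

```python
# hypothetical file contents written on a latin-1 system
legacy = "café\n".encode("latin-1")       # b'caf\xe9\n'

def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Convert raw from source_charset to utf-8, once, for everybody."""
    return raw.decode(source_charset).encode("utf-8")

converted = to_utf8(legacy, "latin-1")    # b'caf\xc3\xa9\n'

# after the conversion a plain byte search for the utf-8 needle works,
# and it works the same way for every reader
needle = "é".encode("utf-8")              # b'\xc3\xa9'
found_before = needle in legacy
found_after = needle in converted

print(found_before, found_after)          # False True
```

where readers use several locales, the same conversion helps only the
ones whose locale happens to be utf-8.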

- erik

p.s. gnu grep does special-case utf-8 and avoids wchar_t conversions there.



