9fans - fans of the OS Plan 9 from Bell Labs
From: erik quanstrom <quanstro@quanstro.net>
To: 9fans@9fans.net
Subject: Re: [9fans] Lex, Yacc, Unicode Plane 1
Date: Thu, 28 Jan 2010 15:05:03 -0500
Message-ID: <6277a4dcc738c2eee17e029efeb1b324@ladd.quanstro.net>
In-Reply-To: <4B61A280020000CC0001D4A1@wlgw07.wlu.ca>

> A colleague put me on to Plan9, some of whose online documentation I
> have read with interest, in particular the "Hello World" discussion as
> it relates to Unicode/UTF-8.
>
> I'm one of the authors of the Cuneiform proposal now encoded under
> Unicode (see block U+12000), and I'm interested in lex/yacc-like
> parsing of Unicode input to produce (among other things) Cuneiform
> output.
>
> I realize some of the documentation was written long ago... so I'm
> unclear as to whether or not (or how easily) Plan9 (and specifically its
> lex/yacc software, etc.) handles such things? (this sparked by the
> references to four hex digits etc.)

that's interesting stuff.

lex(1) is generally not used, and doesn't support
unicode.  yacc(1) does a fine job with unicode,
though, to be fair, most of that job falls on the
lexer.  however, a lexer is not hard to write by hand;
there are many good examples in the distribution.
the bio(2) buffered io library provides a Bgetrune
function, which returns the next rune from the input
and is generally what is desired.

(i have some patches, partially stolen from russ,
that should support extended plane runes at the
cost of double the storage.)
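
to make the hand-written approach concrete, here is a minimal sketch in
portable c of the decoding step a lexer needs before it can classify
plane 1 input. it is not the plan 9 library code and not the patches
mentioned above: decoderune, iscuneiform, countcuneiform and the
Runeerror constant are hypothetical names chosen for illustration, and
the range test assumes only the cuneiform block U+12000-U+123FF. on
plan 9 itself, Bgetrune would take the place of decoderune.

```c
#include <stddef.h>

/* hypothetical names for illustration; not part of any plan 9 library */
enum { Runeerror = 0xFFFD };

/*
 * decode one utf-8 sequence from s (n bytes available) into *r.
 * returns the number of bytes consumed, or 0 if n is 0.
 * malformed or truncated input yields Runeerror and consumes one byte.
 */
size_t
decoderune(const unsigned char *s, size_t n, unsigned long *r)
{
	size_t i, len;
	unsigned long c, min;

	if(n == 0)
		return 0;
	c = s[0];
	if(c < 0x80){
		*r = c;
		return 1;
	}
	if((c & 0xE0) == 0xC0){ len = 2; c &= 0x1F; min = 0x80; }
	else if((c & 0xF0) == 0xE0){ len = 3; c &= 0x0F; min = 0x800; }
	else if((c & 0xF8) == 0xF0){ len = 4; c &= 0x07; min = 0x10000; }
	else
		goto bad;
	if(n < len)
		goto bad;
	for(i = 1; i < len; i++){
		if((s[i] & 0xC0) != 0x80)
			goto bad;
		c = c<<6 | (s[i] & 0x3F);
	}
	/* reject overlong forms, surrogates, and values past U+10FFFF */
	if(c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
		goto bad;
	*r = c;
	return len;
bad:
	*r = Runeerror;
	return 1;
}

/* assumption: only the cuneiform block U+12000-U+123FF counts */
int
iscuneiform(unsigned long r)
{
	return r >= 0x12000 && r <= 0x123FF;
}

/* count the cuneiform code points in an n-byte utf-8 buffer */
size_t
countcuneiform(const unsigned char *s, size_t n)
{
	size_t k, cnt;
	unsigned long r;

	cnt = 0;
	while(n > 0){
		k = decoderune(s, n, &r);
		if(iscuneiform(r))
			cnt++;
		s += k;
		n -= k;
	}
	return cnt;
}
```

a real lexer would switch on the decoded value and return token numbers
to yacc rather than count. the decoded value is kept in an unsigned
long here because a plane 1 code point such as U+12000 does not fit in
16 bits.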

- erik


