From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA15116; Mon, 11 Jun 2001 17:58:04 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA15223 for ; Mon, 11 Jun 2001 17:58:03 +0200 (MET DST) Received: from speakeasy.org (shawnw-0.dsl.speakeasy.net [64.81.17.141]) by nez-perce.inria.fr (8.11.1/8.10.0) with ESMTP id f5BFw2v19706 for ; Mon, 11 Jun 2001 17:58:02 +0200 (MET DST) Received: (from shawnw@localhost) by speakeasy.org (8.10.1/8.10.1) id f5BG07Q19765 for caml-list@inria.fr; Mon, 11 Jun 2001 09:00:07 -0700 Date: Mon, 11 Jun 2001 09:00:07 -0700 From: Shawn Wagner To: caml-list@inria.fr Subject: Re: [Caml-list] Ocaml interface to ctype.h functions Message-ID: <20010611090007.C13396@speakeasy.org> Mail-Followup-To: caml-list@inria.fr References: <20010601232433.A22189@speakeasy.org> <20010605182909.A16268@pauillac.inria.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010605182909.A16268@pauillac.inria.fr>; from Xavier.Leroy@inria.fr on Tue, Jun 05, 2001 at 06:29:09PM +0200 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Tue, Jun 05, 2001 at 06:29:09PM +0200, Xavier Leroy wrote: > > I've been working on some projects recently where it would be nice to have > > access to the ctype.h character classification functions (isalpha(), > > isspace(), etc.) in Ocaml, and couldn't find anything like them in a search > > through the standard library. It's easy to whip up a library for this, but > > before doing so, I thought I'd ask if there's any plans to put them in the > > Character module or some other place it makes sense to have them. > > It would make sense to have classification functions in the Char > module. The main issue is: what is a letter?, or: how to deal with > character sets. > > If only one, fixed character set is supported (e.g. US-ASCII or > Latin-1), it's truly easy, but will not satisfy everyone. OCaml has > already been criticized for supporting ISO Latin-1 accented letters in > identifiers! (Look at the caml-list archives if you don't believe me.) > > Building on the C functions isalpha(), etc, is a bit of a cop-out, > because then we're dependent on what these functions actually do on a > variety of Unix, Windows and Macintosh systems. In particular, we > become dependent on the ISO C internationalization framework ("locales"), > which I think is a mess because it relies too much on a global state > (the current locale). Okay, I've done the isFOO() and setlocale() interface as a seperate library for now, and will release it soon (Like, tonight). Am I correct in assuming that it's not likely to make it into the standard library based on the above, though? I've discovered that setlocale() of LC_CTYPE is done already by the runtime, by a function used in Char.escaped... so if locales are a mess, they're a mess ocaml is already stuck with. Also, someone else is asking about a way to set LC_NUMERIC from in ocaml, so I'm not alone in having a need for setlocale, at least. > > To give an example of the kind of problems I fear, just doing > setlocale(LC_ALL, "fr_FR") in an OCaml program causes > float_of_string "3.14" to return 0.0. Guess why? float_of_string > relies on the C function atof(), which is internationalized, and > doesn't recognize "." as a decimal point -- French uses a "," instead... This is why LC_ALL is bad, and why it's better to just use the specific locale categories you want. -- Shawn Wagner shawnw@speakeasy.org ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr