From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id LAA15786; Fri, 4 Oct 2002 11:07:16 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id LAA16003 for ; Fri, 4 Oct 2002 11:07:15 +0200 (MET DST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id g9497E500933; Fri, 4 Oct 2002 11:07:14 +0200 (MET DST) Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id LAA15958; Fri, 4 Oct 2002 11:07:14 +0200 (MET DST) From: Pierre Weis Message-Id: <200210040907.LAA15958@pauillac.inria.fr> Subject: Re: [Caml-list] Pattern matching and strings (and a mini-bug in Scanf) In-Reply-To: <3D9B1CF4.5030706@baretta.com> from Alessandro Baretta at "Oct 2, 102 06:21:08 pm" To: alex@baretta.com (Alessandro Baretta) Date: Fri, 4 Oct 2002 11:07:13 +0200 (MET DST) Cc: luc.maranget@inria.fr, caml-list@inria.fr X-Mailer: ELM [version 2.4ME+ PL28 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk [...] > I meant what I wrote. The %s conversion stops reading at the > first whitespace character. However, ocaml does not like > the "%[^]" which, in my opinion, is to be considered a > mini-bug. "%[^]" should be interpreted as "the set of all > characters except none", which is "the set of all > characters", which can also be expressed, more verbosely, as > "%[\000-\255]". By the same standards, "%[]" is rejected, > when it should be interpreted as "the set containing no > characters", or more verbosely "%[^\000-\255]" [...] > Alex This is not a mini-bug, this is a carefully crafted feature and thoroughly considered design decision :( "%[]" is not allowed because we need to allow the matching of the closing bracket. Hence, if ']' just follows the opening bracket (or the ^ character, in case of negative range) it is considered as a plain ']' (or ascii code 93) to be matched. So "%[]]" means matching ']' (well, more precisely zero or more ']'); more generally, "%[]range]" means matching ']' or range; "%[^]range]" means matching any character different from ']' and not belonging to range. Admitedly, we could have made a different decision, such as a special escape for ']' in a range that would have made possible your interpretation of "%[]". However, I considered that the matching of an empty range of characters (or conversely a full range of characters) was way less useful and frequent than matching a ']'. Hence, the more complex expressions "%[^\000-\255]" and "%[\000-\255]" for the seldom used ranges. To end with, I want also to elaborate a bit on the last sentence of the Scanf documentation because many people misunderstand it: Note: the [scanf] facility is not intended for heavy duty lexical analysis and parsing. As you said ``I meant what I wrote''. I did not mean that [scanf] is slow (it is not). I did not mean that it is not powerful (since it is much more powerful than the corresponding C facility). I meant that it is perfectly ok to write complex input reading functions (up to, say, a polymorphic list scanner for instance), but [scanf] is not the right tool to write a full-fledge Caml parser. On the other hand, writing a polymorphic list scanner using lex and yacc seems to me way more difficult than the 10 or 15 lines of code you need to implement the polymorphic list scanner, if you use the [Scanf] module facilities :) May be this note should be removed, since, at last, it just means ``Use the right tool for the work at hand'', which evidently goes without saying when you are talking to people that already chose Objective Caml to write their programs ! Pierre Weis INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/ ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners