From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 3D6E27EE48 for ; Sat, 21 Feb 2015 21:08:37 +0100 (CET) Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of benoit.vaugon@gmail.com) identity=pra; client-ip=209.85.212.178; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="benoit.vaugon@gmail.com"; x-sender="benoit.vaugon@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of benoit.vaugon@gmail.com designates 209.85.212.178 as permitted sender) identity=mailfrom; client-ip=209.85.212.178; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="benoit.vaugon@gmail.com"; x-sender="benoit.vaugon@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-wi0-f178.google.com) identity=helo; client-ip=209.85.212.178; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="benoit.vaugon@gmail.com"; x-sender="postmaster@mail-wi0-f178.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0B0AQB95OhUlLLUVdFag1hagwiwJ49ghXECgRlDAQEBAQEBEAEBAQEHCwsJEjCEEAEBBBIRBBkBGxIKAgMMBgULDQkEEggDAgIJAwIBAgEPAhEBBQEKARETBgIBAQ4Qh3gBAxEEAQiuMT4xiy6Ba4J3jSMKGScDClSEWAEBAQEBBQEBAQEBAQEVAQUOiwWCRIIxgmiBQwWEXQqIeoNQgX+EH4FGgVOFC4ZOA4Q6NYEVgiEfgVFugkMBAQE X-IPAS-Result: A0B0AQB95OhUlLLUVdFag1hagwiwJ49ghXECgRlDAQEBAQEBEAEBAQEHCwsJEjCEEAEBBBIRBBkBGxIKAgMMBgULDQkEEggDAgIJAwIBAgEPAhEBBQEKARETBgIBAQ4Qh3gBAxEEAQiuMT4xiy6Ba4J3jSMKGScDClSEWAEBAQEBBQEBAQEBAQEVAQUOiwWCRIIxgmiBQwWEXQqIeoNQgX+EH4FGgVOFC4ZOA4Q6NYEVgiEfgVFugkMBAQE X-IronPort-AV: E=Sophos;i="5.09,621,1418079600"; d="diff'?scan'208,217";a="122821315" Received: from mail-wi0-f178.google.com ([209.85.212.178]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 21 Feb 2015 21:08:36 +0100 Received: by mail-wi0-f178.google.com with SMTP id em10so9275893wid.5 for ; Sat, 21 Feb 2015 12:08:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=message-id:date:from:user-agent:mime-version:to:subject:references :in-reply-to:content-type; bh=Yn0uMSHM/BqvxJrndr6Go7uv9gX5q3SHkdJF3soer1M=; b=JWcJoet0DDju5fCxTZ8oCSvFphi5TZhkFUF9u053Jb1n7YS667yIikjEtx4exuerLY MCOPu/FoSPujeRPIPfU7mQG1OCGk2aN9BXbQrFyHrHaDgTo+YmOOKNJfl6Lm1/atbjFI MH2EAmvM7aSAnX2nXW0mmcQFc08e6iRee+9YMaPv5pUjLwWiF2HYkBwlf8fYNufysm76 NcyYked1xMAHYemD242NhH20+aB47fcrLssJEOCB2VpBC1yusDljY38WlQg5netVbiNU pwy/8MAgW2SKjTGKzs52cCU8RrUjwYSEFPpzGli8u8FiTx3mtp7RI2NhPku2e38uLGEQ HD/w== X-Received: by 10.194.133.101 with SMTP id pb5mr7741790wjb.40.1424549316366; Sat, 21 Feb 2015 12:08:36 -0800 (PST) Received: from [192.168.1.20] ([80.12.39.155]) by mx.google.com with ESMTPSA id d10sm47974952wjn.45.2015.02.21.12.08.35 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 21 Feb 2015 12:08:35 -0800 (PST) Message-ID: <54E8E5C0.8070208@gmail.com> Date: Sat, 21 Feb 2015 21:08:32 +0100 From: Benoit Vaugon User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: caml-list@inria.fr References: In-Reply-To: Content-Type: multipart/mixed; boundary="------------070407040500040908000905" Subject: Re: [Caml-list] scanf question This is a multi-part message in MIME format. --------------070407040500040908000905 Content-Type: multipart/alternative; boundary="------------080808090001050608090102" --------------080808090001050608090102 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Oups, I confirm it is a bug appearing when we have separated formatting_gen to formatting_lit in the implementation of formats to handle Format constructions like "@{<%s>" and "@[" that collides with scanf constructions like "%s@{" and "%s@[". I attach a patch (+ 6 lines) that fix the problem, if somebody have time to integrate it... Sorry. Benoît. Le 21/02/2015 10:55, Gabriel Scherer a écrit : > This seems to be a regression caused by the reimplementation of format > strings in 4.02; some "stop after this character" indications are > supported, but some other fail (when they correspond to valid Format > specifications). Would you mind opening a but report ( > http://caml.inria.fr/mantis/ )? > > On Fri, Feb 20, 2015 at 8:28 PM, Milan Stanojević > wrote: > > I want to parse a bash like sequence string, for example > "some-thing{1..20}", and extract the string before first { and then > two numbers inside. > > I tried this first > utop # Scanf.sscanf "some-thing{1..3}" "%s{%d..%d}" (fun s n1 n2 > -> s,n1,n2);; > Exception: End_of_file. > > Then I realized %s is special and needs scanning indication > (character @). > > So I tried that but I get the same error > utop # Scanf.sscanf "some-thing{1..3}" "%s@{%d..%d}" (fun s n1 n2 > -> s,n1,n2);; > Exception: End_of_file. > > But if I use some other character as end of string indication then > it works. > utop # Scanf.sscanf "some-thingI{1..3}" "%s@I{%d..%d}" (fun s n1 n2 -> > s,n1,n2);; > - : bytes * int * int = ("some-thing", 1, 3) > > It seems '{' is somehow special but I can't find anything in the docs > about this. > Can someone explain what is going on here? > > Thanks! > > > p.s. At the end I solved my thing by using a range > > utop # Scanf.sscanf "some-thing{1..3}" "%[a-z-]{%d..%d}" (fun s n1 n2 > -> s,n1,n2);; > - : bytes * int * int = ("some-thing", 1, 3) > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > > --------------080808090001050608090102 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: 8bit Oups, I confirm it is a bug appearing when we have separated formatting_gen to formatting_lit in the implementation of formats to handle Format constructions like "@{<%s>" and "@[<hov %d>" that collides with scanf constructions like "%s@{" and "%s@[". I attach a patch (+ 6 lines) that fix the problem, if somebody have time to integrate it...

Sorry.
Benoît.



Le 21/02/2015 10:55, Gabriel Scherer a écrit :
This seems to be a regression caused by the reimplementation of format strings in 4.02; some "stop after this character" indications are supported, but some other fail (when they correspond to valid Format specifications). Would you mind opening a but report ( http://caml.inria.fr/mantis/ )?

On Fri, Feb 20, 2015 at 8:28 PM, Milan Stanojević <milanst@gmail.com> wrote:
I want to parse a bash like sequence string, for example
"some-thing{1..20}", and extract the string before first { and then
two numbers inside.

I tried this first
utop # Scanf.sscanf "some-thing{1..3}" "%s{%d..%d}" (fun s n1 n2 -> s,n1,n2);;
Exception: End_of_file.

Then I realized %s is special and needs scanning indication (character @).

So I tried that but I get the same error
utop # Scanf.sscanf "some-thing{1..3}" "%s@{%d..%d}" (fun s n1 n2 -> s,n1,n2);;
Exception: End_of_file.

But if I use some other character as end of string indication then it works.
utop # Scanf.sscanf "some-thingI{1..3}" "%s@I{%d..%d}" (fun s n1 n2 ->
s,n1,n2);;
- : bytes * int * int = ("some-thing", 1, 3)

It seems '{' is somehow special but I can't find anything in the docs
about this.
Can someone explain what is going on here?

Thanks!


p.s. At the end I solved my thing by using a range

utop # Scanf.sscanf "some-thing{1..3}" "%[a-z-]{%d..%d}" (fun s n1 n2
-> s,n1,n2);;
- : bytes * int * int = ("some-thing", 1, 3)

--
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


--------------080808090001050608090102-- --------------070407040500040908000905 Content-Type: text/x-patch; name="fix-scanf-open-box-tag-patch.diff" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="fix-scanf-open-box-tag-patch.diff" diff -Naur old/stdlib/scanf.ml new/stdlib/scanf.ml --- old/stdlib/scanf.ml 2015-02-21 20:38:02.250636479 +0100 +++ new/stdlib/scanf.ml 2015-02-21 20:53:51.486677851 +0100 @@ -1125,6 +1125,12 @@ let scan width _ ib = scan_string (Some stp) width ib in let str_rest = String_literal (str, rest) in pad_prec_scanf ib str_rest readers pad No_precision scan token_string + | String (pad, Formatting_gen (Open_tag (Format (fmt', _)), rest)) -> + let scan width _ ib = scan_string (Some '{') width ib in + pad_prec_scanf ib (concat_fmt fmt' rest) readers pad No_precision scan token_string + | String (pad, Formatting_gen (Open_box (Format (fmt', _)), rest)) -> + let scan width _ ib = scan_string (Some '[') width ib in + pad_prec_scanf ib (concat_fmt fmt' rest) readers pad No_precision scan token_string | String (pad, rest) -> let scan width _ ib = scan_string None width ib in pad_prec_scanf ib rest readers pad No_precision scan token_string --------------070407040500040908000905--