caml-list - the Caml user's mailing list
* Re: [Caml-list] pcre
  2002-12-08 21:09 ` [Caml-list] pcre onlyclimb
@ 2002-12-08 13:51   ` Markus Mottl
  0 siblings, 0 replies; 5+ messages in thread
From: Markus Mottl @ 2002-12-08 13:51 UTC
  To: onlyclimb; +Cc: caml-list

On Sun, 08 Dec 2002, onlyclimb wrote:
> let s ="abcdbcd" ;;
> let t  = pcre_exec ~pat:"bc"  s ;;
> 
> t = [| 1; 3; 0 |]
> 
> what does the last zero mean?

The last zero is extra workspace that the matching engine needs; it is
not part of the match. This wasn't mentioned in the documentation (added
now). You'll rarely need this function anyway: "exec", "exec_all",
"extract" or "extract_all" are better suited to your purposes.

On Sun, 08 Dec 2002, onlyclimb wrote:
> Does the int array returned by pcre_exec contain the offsets of all
> the matches, or only of the first match from pos?

It contains the offsets of the first, whole match followed by the
offsets of matched subpatterns (introduced with parentheses in the
pattern string).

> However, when I tried it, it seemed to return only the offsets of the
> first match. Why, then, does it return an int array rather than a tuple
> of int*int giving (from, to)?

Because there can be arbitrarily many subpatterns. E.g. try this:

  let t = pcre_exec ~pat:"a(bc)"  s ;;

  t = [|0; 3; 1; 3; 0; 0|]

The whole match ranges from character 0 to 3 (exclusive), the first
subgroup from 1 to 3. The remaining zeroes belong to the extra workspace.
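
To make the layout concrete, here is a minimal sketch that decodes such
an offset array by hand, using only the standard library. The helper
name "decode_offsets" and its "n_groups" parameter are made up for this
illustration and are not part of PCRE-OCaml; the sketch also assumes
that every subgroup took part in the match.

```ocaml
(* Decode a raw offset array: pairs of (start, stop) offsets, the whole
   match first, then one pair per subgroup. [n_groups] is the number of
   parenthesised subgroups in the pattern; everything after the first
   (n_groups + 1) pairs is workspace and is ignored.
   Hypothetical helper, not part of pcre-ocaml. Assumes all groups matched. *)
let decode_offsets ~n_groups subject offsets =
  List.init (n_groups + 1) (fun i ->
    let start = offsets.(2 * i) and stop = offsets.(2 * i + 1) in
    (* stop is exclusive, so the length is stop - start *)
    String.sub subject start (stop - start))

let () =
  let s = "abcdbcd" in
  decode_offsets ~n_groups:1 s [|0; 3; 1; 3; 0; 0|]
  |> List.iteri (fun i sub -> Printf.printf "group %d: %S\n" i sub)
```

For the array above this yields "abc" for the whole match and "bc" for
the first subgroup; the trailing zeroes are never looked at.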

Using "extract" will make things clearer. Then the result is:

  [|"abc"; "bc"|]

With "exec" you can extract strings of matched (sub)patterns more
efficiently if you do not want to access all of them.

Regards,
Markus Mottl


-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* [Caml-list] pcre
@ 2002-12-08 21:09 ` onlyclimb
  2002-12-08 13:51   ` Markus Mottl
  0 siblings, 1 reply; 5+ messages in thread
From: onlyclimb @ 2002-12-08 21:09 UTC
  To: caml-list

Dear Caml list,
I am just beginning to learn OCaml PCRE, and
I am a little confused about the result of the function pcre_exec:

let s ="abcdbcd" ;;
let t  = pcre_exec ~pat:"bc"  s ;;

t = [| 1; 3; 0 |]

what does the last zero mean?
thanks



* [Caml-list] another question about pcre_exec
@ 2002-12-08 21:22 onlyclimb
  2002-12-08 21:09 ` [Caml-list] pcre onlyclimb
  0 siblings, 1 reply; 5+ messages in thread
From: onlyclimb @ 2002-12-08 21:22 UTC
  To: caml-list

Does the int array returned by pcre_exec contain the offsets of all the
matches, or only of the first match from pos?

However, when I tried it, it seemed to return only the offsets of the
first match. Why, then, does it return an int array rather than a tuple
of int*int giving (from, to)?


thanks




* Re: [Caml-list] PCRE
  2002-12-29 15:42 [Caml-list] PCRE Oleg
@ 2002-12-29 18:41 ` Markus Mottl
  0 siblings, 0 replies; 5+ messages in thread
From: Markus Mottl @ 2002-12-29 18:41 UTC
  To: Oleg; +Cc: caml-list

On Sun, 29 Dec 2002, Oleg wrote:
> I'm new to PCRE. Can anyone explain to me why the output of
> 
> # open Pcre;;
> # version;;
> - : string = "3.4 22-Aug-2000"
> # full_split ~pat:"^(\\w+)(,(\\w+))*$" "a,b,c,d";;
> - : Pcre.split_result list =
> [Delim "a,b,c,d"; Group (1, "a"); Group (2, ",d"); Group (3, "d")]
> 
> does not contain Group(3, "b") and Group(3, "c") ?

Note that using "split" instead of "full_split" produces this list:

  - : string list = [""; "a"; ",d"; "d"]

This is absolutely correct Perl behaviour.

"full_split" behaves the same way, but it also lets you access the
matched substrings. A grouped subpattern captures only one substring per
match: if the group matches several times, only the last one is kept!
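
If the goal is to collect every item rather than only the last one, one
possible workaround (not from the PCRE-OCaml documentation, just a
standard-library sketch) is to stop relying on the repeated group and
split on the delimiter instead. The name "fields" is made up for this
example, and it assumes the delimiter is a plain comma.

```ocaml
(* Collect every comma-separated field instead of relying on a repeated
   capturing group, which keeps only its last iteration.
   [fields] is a hypothetical helper; it assumes ',' is the only delimiter. *)
let fields s = String.split_on_char ',' s

let () =
  fields "a,b,c,d" |> List.iter print_endline
```

This gives all four fields of "a,b,c,d", at the cost of no longer
checking that each field consists of word characters only.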

> Similarly, I expected 
> 
> # full_split ~pat:"S(a\\d)+" "Sa1a2";;
> - : Pcre.split_result list = [Delim "Sa1a2"; Group (1, "a2")]
> 
> to produce [Delim "Sa1a2"; Group(1, "a1"); Group (1, "a2")]
> 
> The above uses the latest pcre-ocaml-4.31.0.

The same applies here: the behaviour of PCRE-OCaml is correct with
respect to Perl semantics.
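
As a rough illustration of why only "a2" shows up, here is a toy model
of the pattern S(a\d)+ written by hand. It is not how PCRE works
internally, and the function name is made up; it only mimics the
observable behaviour, namely that group 1 has a single slot which every
iteration of the repetition overwrites.

```ocaml
(* Toy model of S(a\d)+ anchored at the start of the string.
   Group 1 has ONE slot, overwritten on every iteration of the "+".
   Hypothetical illustration only, not PCRE's actual implementation. *)
let match_s_a_digit_plus s =
  let n = String.length s in
  if n = 0 || s.[0] <> 'S' then None
  else begin
    let slot = ref None in   (* the single capture slot for group 1 *)
    let all = ref [] in      (* every iteration, kept only for comparison *)
    let pos = ref 1 in
    while !pos + 1 < n
          && s.[!pos] = 'a'
          && s.[!pos + 1] >= '0' && s.[!pos + 1] <= '9' do
      let piece = String.sub s !pos 2 in
      slot := Some piece;    (* overwrite: earlier iterations are lost *)
      all := piece :: !all;
      pos := !pos + 2
    done;
    match !slot with
    | None -> None           (* "+" needs at least one iteration *)
    | Some last -> Some (last, List.rev !all)
  end

let () =
  match match_s_a_digit_plus "Sa1a2" with
  | Some (last, all) ->
    Printf.printf "group 1: %S, iterations seen: %s\n"
      last (String.concat ", " all)
  | None -> print_endline "no match"
```

On "Sa1a2" the slot ends up holding "a2" even though both "a1" and "a2"
were visited, which is what the Group (1, "a2") result reflects.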

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus

* [Caml-list] PCRE
@ 2002-12-29 15:42 Oleg
  2002-12-29 18:41 ` Markus Mottl
  0 siblings, 1 reply; 5+ messages in thread
From: Oleg @ 2002-12-29 15:42 UTC
  To: caml-list

Hi

I'm new to PCRE. Can anyone explain to me why the output of

# open Pcre;;
# version;;
- : string = "3.4 22-Aug-2000"
# full_split ~pat:"^(\\w+)(,(\\w+))*$" "a,b,c,d";;
- : Pcre.split_result list =
[Delim "a,b,c,d"; Group (1, "a"); Group (2, ",d"); Group (3, "d")]

does not contain Group(3, "b") and Group(3, "c") ?

Similarly, I expected 

# full_split ~pat:"S(a\\d)+" "Sa1a2";;
- : Pcre.split_result list = [Delim "Sa1a2"; Group (1, "a2")]

to produce [Delim "Sa1a2"; Group(1, "a1"); Group (1, "a2")]

The above uses the latest pcre-ocaml-4.31.0.

Thanks
Oleg


Thread overview: 5+ messages
2002-12-08 21:22 [Caml-list] another question about pcre_exec onlyclimb
2002-12-08 21:09 ` [Caml-list] pcre onlyclimb
2002-12-08 13:51   ` Markus Mottl
2002-12-29 15:42 [Caml-list] PCRE Oleg
2002-12-29 18:41 ` Markus Mottl
