caml-list - the Caml user's mailing list
* ocamlagrep anybody ?
From: Ingo Bormuth @ 2006-05-08  7:57 UTC
  To: caml-list


Hi list,

ocamlagrep produces strange results in my hands:

  let p =  Agrep.pattern "test" ;;
  let s = "Hello test world." ;;
  let l = String.length s ;;

  Agrep.errors_substring_match p s ~numerrs:0 ~pos:0 ~len:l

    ==> returns 0   ( as expected )

  Agrep.errors_substring_match p s ~numerrs:3 ~pos:0 ~len:l 

    ==> returns 3   ( why ??? Should be 0 !!! )

I tried many other combinations and do not get what's going on.
The library has been around for ages. What did I miss?

Thanks
        Ingo


-- 
Ingo Bormuth, voicebox & telefax: +49-12125-10226517       '(~o-o~)'
public key 86326EC9, http://ibormuth.efil.de/contact   --ooO--(.)--Ooo--



* Re: [Caml-list] ocamlagrep anybody ?
From: Xavier Leroy @ 2006-05-08 12:10 UTC
  To: Ingo Bormuth; +Cc: caml-list

> ocamlagrep produces strange results in my hands:
>
>   let p =  Agrep.pattern "test" ;;
>   let s = "Hello test world." ;;
>   let l = String.length s ;;
>
>   Agrep.errors_substring_match p s ~numerrs:0 ~pos:0 ~len:l
>
>     ==> returns 0   ( as expected )
>
>   Agrep.errors_substring_match p s ~numerrs:3 ~pos:0 ~len:l
>
>     ==> returns 3   ( why ??? Should be 0 !!! )
>
> I tried many other combinations and do not get what's going on.

It's been a long time since I wrote this library, but AFAIK
Agrep stops at the first (approximate) match found.
So, in your example with numerrs=0 it scans s all the way to "test"
and reports success; and in your example with numerrs=3 it stops
at "Hell" (a 3-error match) and reports success.

In other words, the integer returned by errors_substring_match
is not the minimal number of errors for a match over the whole text.
If that's what you want, you can obtain that number by repeated calls
to errors_substring_match using binary search on the value of numerrs.
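A minimal sketch of that binary search, written against an abstract
predicate so it runs without the library. The names min_errors, max_errs
and matches are hypothetical, not part of Agrep. The sketch assumes the
predicate is monotone: if a match exists with at most k errors, one also
exists with at most k+1.

```ocaml
(* Smallest k in [0, max_errs] such that [matches k] holds, or None if
   even [max_errs] errors are not enough.
   Assumption: [matches] is monotone (true for k implies true for k+1). *)
let min_errors ~max_errs matches =
  if max_errs < 0 || not (matches max_errs) then None
  else begin
    (* Invariant: [matches !hi] is true and every k below !lo fails. *)
    let lo = ref 0 and hi = ref max_errs in
    while !lo < !hi do
      let mid = !lo + (!hi - !lo) / 2 in
      if matches mid then hi := mid else lo := mid + 1
    done;
    Some !lo
  end

(* Stand-in oracle for demonstration only: it pretends that a match
   first becomes possible at 2 errors. *)
let () =
  match min_errors ~max_errs:8 (fun k -> k >= 2) with
  | Some k -> Printf.printf "minimal numerrs: %d\n" k
  | None -> print_endline "no match within bound"

(* With ocamlagrep the oracle could presumably be written as below,
   assuming errors_substring_match returns a value larger than k when
   there is no match within k errors:
     let p = Agrep.pattern "test"
     let s = "Hello test world."
     let l = String.length s
     let ok k = Agrep.errors_substring_match p s ~numerrs:k ~pos:0 ~len:l <= k
     let best = min_errors ~max_errs:3 ok
*)
```

This needs roughly log2(max_errs + 1) + 1 calls to the matcher per string,
instead of one call per candidate value of numerrs.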

- Xavier Leroy



* Re: [Caml-list] ocamlagrep anybody ?
From: Ingo Bormuth @ 2006-05-08 15:35 UTC
  To: caml-list


Thanks for the prompt reply! And thanks for the language as well.

On 2006-05-08 14:10, Xavier Leroy wrote:
> It's been a long time since I wrote this library, but AFAIK
> Agrep stops at the first (approximate) match found.

Okay, that seems reasonable. It's just that the manual is a bit misleading:

  val errors_substring_match ... Same as Agrep.substring_match, but 
  return the smallest number of errors such that the substring matches 
  the pattern.

> If that's what you want, you can obtain that number by repeated calls
> to errors_substring_match using binary search on the value of numerrs.

That's what I did, but performance on millions of strings sucks :)
I will probably tokenize the strings and match individual tokens instead.
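A rough sketch of what token-level matching could look like, using only
the standard library. It is not Agrep: plain Levenshtein distance stands
in for the approximate matcher, tokens are split on non-alphanumeric ASCII
characters, and the pattern is compared against whole tokens rather than
substrings. All function names here are hypothetical.

```ocaml
(* Levenshtein distance (substitutions, insertions, deletions),
   computed with two rows of the dynamic-programming table. *)
let edit_distance a b =
  let la = String.length a and lb = String.length b in
  let prev = Array.init (lb + 1) (fun j -> j) in
  let cur = Array.make (lb + 1) 0 in
  for i = 1 to la do
    cur.(0) <- i;
    for j = 1 to lb do
      let cost = if a.[i - 1] = b.[j - 1] then 0 else 1 in
      cur.(j) <-
        min (min (prev.(j) + 1) (cur.(j - 1) + 1)) (prev.(j - 1) + cost)
    done;
    Array.blit cur 0 prev 0 (lb + 1)
  done;
  prev.(lb)

(* Assumption: a token is a maximal run of ASCII letters and digits. *)
let is_word_char c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')

let tokens s =
  let buf = Buffer.create 16 in
  let acc = ref [] in
  let flush () =
    if Buffer.length buf > 0 then begin
      acc := Buffer.contents buf :: !acc;
      Buffer.clear buf
    end
  in
  String.iter
    (fun c -> if is_word_char c then Buffer.add_char buf c else flush ())
    s;
  flush ();
  List.rev !acc

(* Minimal number of errors over all tokens of [s];
   max_int when [s] contains no token at all. *)
let min_token_errors pat s =
  List.fold_left
    (fun best tok -> min best (edit_distance pat tok))
    max_int (tokens s)

let () =
  Printf.printf "%d\n" (min_token_errors "test" "Hello test world.")
```

Each string is scanned once and the result is the true minimum over its
tokens, so no search over numerrs is needed. The trade-off is that matches
spanning several tokens, or covering only part of a token, are not found.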

- Ingo

-- 
Ingo Bormuth, voicebox & telefax: +49-12125-10226517       '(~o-o~)'
public key 86326EC9, http://ibormuth.efil.de/contact   --ooO--(.)--Ooo--


